Diffstat (limited to 'docs/gemma.ipynb')
-rw-r--r-- docs/gemma.ipynb 310
1 files changed, 292 insertions, 18 deletions
diff --git a/docs/gemma.ipynb b/docs/gemma.ipynb
index 4c5f9a6..4fbaab2 100644
--- a/docs/gemma.ipynb
+++ b/docs/gemma.ipynb
@@ -6,25 +6,41 @@
    "source": [
     "# Running GEMMA from GN3\n",
     "\n",
-    "This document outlines how to use gemma from Genenetwork3."
+    "This document outlines how to use gemma from Genenetwork3.\n",
+    "\n",
+    "GEMMA currently runs as follows: when you call one of the endpoints that runs GEMMA itself, the endpoint constructs the command and pushes it onto a Redis queue, which avoids endpoint time-outs for long-running processes. A worker (sheepdog) then processes the queued commands.\n",
+    "\n",
+    "If you are running gn3 through a development environment, ensure that it is up and running by running the command:\n",
+    "\n",
+    "```sh\n",
+    "env FLASK_APP=\"main.py\" flask run --port 8080\n",
+    "```\n",
+    "\n",
+    "**PS: These examples assume that you provide your own genotype files. This note will be edited out once they are replaced by the files provided by IPFS.**"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "### All imports go here!\n",
+    "import requests"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "If you are running gn3 through a development environment, ensure that it is up and running by running the command:\n",
-    "\n",
-    "```sh\n",
-    "env FLASK_APP=\"main.py\" flask run --port 8080\n",
-    "```\n",
+    "##### GET api/gemma/version\n",
     "\n",
     "To ensure that things are working, run:"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 7,
    "metadata": {
     "scrolled": true
    },
@@ -38,8 +54,7 @@
     }
    ],
    "source": [
-    "import requests\n",
-    "r = requests.get(\"http://127.0.0.1:8080/api/gemma/version\")\n",
+    "r = requests.get(\"http://gn3-test.genenetwork.org/api/gemma/version\")\n",
     "print(r.json())"
    ]
   },
@@ -49,9 +64,70 @@
    "source": [
     "## Uploading data\n",
     "\n",
-    "Before you perform any computation, you need to ensure you have your data uploaded\n",
+    "Before you perform any computation, ensure that your data has been uploaded. These examples use the data provided [here](https://github.com/genetics-statistics/gemma-wrapper/tree/master/test/data/input).\n",
+    "\n",
+    "##### POST /api/metadata/upload/:token\n",
+    "\n",
+    "If a token is provided, the user directory for that token is overwritten with the newly uploaded data!\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 21,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{'status': 0, 'token': 'VROD4N-XVW1L0'}\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Intentionally upload the data archive without the metadata file so that the metadata can be uploaded later.\n",
+    "r = requests.post(\"http://gn3-test.genenetwork.org/api/metadata/upload/\",\n",
+    "                  files=[('file', ('file.tar.gz',\n",
+    "                                   open('/tmp/file.tar.gz',\n",
+    "                                        'rb'), 'application/octet-stream'))])\n",
     "\n",
-    "TODO"
+    "print(r.json())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now we override the upload with the relevant metadata file:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 25,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{'status': 0, 'token': 'VROD4N-XVW1L0'}\n"
+     ]
+    }
+   ],
+   "source": [
+    "r = requests.post(\"http://gn3-test.genenetwork.org/api/metadata/upload/VROD4N-XVW1L0\",\n",
+    "                  files=[('file', ('file-with-metadata.tar.gz',\n",
+    "                                   open('/tmp/file-with-metadata.tar.gz',\n",
+    "                                        'rb'), 'application/octet-stream'))])\n",
+    "\n",
+    "print(r.json())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We will use the above token for computations!"
    ]
   },
   {
@@ -68,7 +144,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Computing K-Values (no loco, no cavariates): \n",
+    "#### Computing K-Values (no loco): \n",
     "##### POST /gemma/k-compute/:token\n",
     "The end command will look something like:\n",
     "\n",
@@ -81,13 +157,76 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 66,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1005-1727-1727-7006dfef-8e60-4bb2-8390-0c174d15a17c'}\n"
+     ]
+    }
+   ],
+   "source": [
+    "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/k-compute/VROD4N-XVW1L0\")\n",
+    "print(r.json())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This generates the command:\n",
+    "\n",
+    "```\n",
+    "gemma-wrapper --json -- -g /tmp/VROD4N-XVW1L0/BXD_geno.txt.gz -p /tmp/VROD4N-XVW1L0/BXD_pheno.txt -a /tmp/VROD4N-XVW1L0/BXD_snps.txt -gk > /tmp/VROD4N-XVW1L0/8f4906862459e59dcb452fd8162d2cc1-output.json\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Check the status of the command:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 35,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{'status': 'queued'}\n"
+     ]
+    }
+   ],
+   "source": [
+    "r = requests.get(\"http://gn3-test.genenetwork.org/api/gemma/status/cmd%3A%3A2021-03-1005-1727-1727-7006dfef-8e60-4bb2-8390-0c174d15a17c\")\n",
+    "print(r.json())"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 68,
+   "metadata": {
+    "scrolled": true
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{'status': 'success'}\n"
+     ]
+    }
+   ],
    "source": [
-    "# Demo\n",
-    "import requests\n",
-    "r = requests.post(\"localhost:8080/api/gemma/k-compute/abcde-abcde\")\n",
+    "# After a short while:\n",
+    "r = requests.get(\"http://gn3-test.genenetwork.org/api/gemma/status/cmd%3A%3A2021-03-1005-1727-1727-7006dfef-8e60-4bb2-8390-0c174d15a17c\")\n",
     "print(r.json())"
    ]
   },
@@ -95,7 +234,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### POST /gemma/k-compute/:chromosomes/:token\n",
+    "#### POST /api/gemma/k-compute/loco/:chromosomes/:token\n",
     "\n",
     "Compute K values with chromosomes. The end command will look similar to:\n",
     "\n",
@@ -105,6 +244,141 @@
     "                -a genotype-snps -gk > k_output_filename.json\n",
     "```"
    ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 78,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{'output_file': '8f4906862459e59dcb452fd8162d2cc1-+O9bus-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-1926-1926-60d0aed2-1645-44e0-ba21-28f37bb4e688'}\n"
+     ]
+    }
+   ],
+   "source": [
+    "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/k-compute/loco/1%2C2%2C3%2C4/VROD4N-XVW1L0\")\n",
+    "print(r.json())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## GWA Computation\n",
+    "##### POST /api/gemma/gwa-compute/:k-inputfile/:token\n",
+    "(No loco; no covars.)\n",
+    "This example uses the previously generated k-inputfile.\n",
+    "The k-inputfile can also be any file you added during the data upload!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 81,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3257-3257-30669a04-dc3d-4cce-a622-8be2103a864f'}\n"
+     ]
+    }
+   ],
+   "source": [
+    "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/gwa-compute/8f4906862459e59dcb452fd8162d2cc1-output.json/VROD4N-XVW1L0\")\n",
+    "print(r.json())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### POST /api/gemma/gwa-compute/covars/:k-inputfile/:token\n",
+    "\n",
+    "The covars file is looked up by the name defined in the metadata file."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 83,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3518-3518-70d057d3-cb07-4171-be07-e1dafe1fb278'}\n"
+     ]
+    }
+   ],
+   "source": [
+    "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/gwa-compute/covars/8f4906862459e59dcb452fd8162d2cc1-output.json/VROD4N-XVW1L0\")\n",
+    "print(r.json())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### POST /api/gemma/gwa-compute/:k-inputfile/loco/maf/:maf/:token\n",
+    "\n",
+    "Compute GWA with loco (maf has to be given) and no covars.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 84,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3848-3848-c66924be-fbf9-494a-9123-eb5941aca912'}\n"
+     ]
+    }
+   ],
+   "source": [
+    "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/gwa-compute/8f4906862459e59dcb452fd8162d2cc1-output.json/loco/maf/9/VROD4N-XVW1L0\")\n",
+    "print(r.json())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### POST /api/gemma/gwa-compute/:k-inputfile/loco/covars/maf/:maf/:token\n",
+    "\n",
+    "The covariate file is looked up by the name defined in the metadata JSON file."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 87,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-5255-5255-46765ebe-bbca-4402-86fa-a4c145ad4f71'}\n"
+     ]
+    }
+   ],
+   "source": [
+    "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/gwa-compute/8f4906862459e59dcb452fd8162d2cc1-output.json/loco/covars/maf/9/VROD4N-XVW1L0\")\n",
+    "print(r.json())"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
   }
  ],
  "metadata": {