Diffstat (limited to 'docs/gemma.ipynb')
-rw-r--r-- docs/gemma.ipynb 310
1 files changed, 292 insertions, 18 deletions
diff --git a/docs/gemma.ipynb b/docs/gemma.ipynb
index 4c5f9a6..4fbaab2 100644
--- a/docs/gemma.ipynb
+++ b/docs/gemma.ipynb
@@ -6,25 +6,41 @@
"source": [
"# Running GEMMA from GN3\n",
"\n",
- "This document outlines how to use gemma from Genenetwork3."
+ "This document outlines how to use gemma from Genenetwork3.\n",
+ "\n",
+ "When you call one of the endpoints that runs GEMMA, GN3 constructs the command and pushes it onto a Redis queue, which avoids time-out issues with the endpoint for long-running processes. A worker (sheepdog) then processes the queued commands.\n",
+ "\n",
+ "If you are running gn3 through a development environment, ensure that it is up and running by running the command:\n",
+ "\n",
+ "```sh\n",
+ "env FLASK_APP=\"main.py\" flask run --port 8080\n",
+ "```\n",
+ "\n",
+ "**PS: These examples assume you provide your own genotype files. This note will be removed once those files are replaced by ones provided through IPFS.**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "### All imports go here!\n",
+ "import requests"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "If you are running gn3 through a development environment, ensure that it is up and running by running the command:\n",
- "\n",
- "```sh\n",
- "env FLASK_APP=\"main.py\" flask run --port 8080\n",
- "```\n",
+ "##### GET api/gemma/version\n",
"\n",
"To ensure that things are working, run:"
]
},
{
"cell_type": "code",
- "execution_count": 2,
+ "execution_count": 7,
"metadata": {
"scrolled": true
},
@@ -38,8 +54,7 @@
}
],
"source": [
- "import requests\n",
- "r = requests.get(\"http://127.0.0.1:8080/api/gemma/version\")\n",
+ "r = requests.get(\"http://gn3-test.genenetwork.org/api/gemma/version\")\n",
"print(r.json())"
]
},
@@ -49,9 +64,70 @@
"source": [
"## Uploading data\n",
"\n",
- "Before you perform any computation, you need to ensure you have your data uploaded\n",
+ "Before you perform any computation, you need to ensure you have your data uploaded. For these examples, I'll use data provided [here](https://github.com/genetics-statistics/gemma-wrapper/tree/master/test/data/input)\n",
+ "\n",
+ "##### POST /api/metadata/upload/:token\n",
+ "\n",
+ "If a token is provided, your user directory will be overridden with the new upload data!\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 21,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "{'status': 0, 'token': 'VROD4N-XVW1L0'}\n"
+ ]
+ }
+ ],
+ "source": [
+ "# I intentionally upload the data file without having the metadata file so that I can upload it later.\n",
+ "r = requests.post(\"http://gn3-test.genenetwork.org/api/metadata/upload/\",\n",
+ " files=[('file', ('file.tar.gz',\n",
+ " open('/tmp/file.tar.gz',\n",
+ " 'rb'), 'application/octet-stream'))])\n",
"\n",
- "TODO"
+ "print(r.json())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Now we override the upload with the relevant metadata file:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 25,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "{'status': 0, 'token': 'VROD4N-XVW1L0'}\n"
+ ]
+ }
+ ],
+ "source": [
+ "r = requests.post(\"http://gn3-test.genenetwork.org/api/metadata/upload/VROD4N-XVW1L0\",\n",
+ " files=[('file', ('file-with-metadata.tar.gz',\n",
+ " open('/tmp/file-with-metadata.tar.gz',\n",
+ " 'rb'), 'application/octet-stream'))])\n",
+ "\n",
+ "print(r.json())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We will use the above token for computations!"
]
},
{
@@ -68,7 +144,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "#### Computing K-Values (no loco, no cavariates): \n",
+ "#### Computing K-Values (no loco): \n",
"##### POST /gemma/k-compute/:token\n",
"The end command will look something like:\n",
"\n",
@@ -81,13 +157,76 @@
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 66,
"metadata": {},
- "outputs": [],
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1005-1727-1727-7006dfef-8e60-4bb2-8390-0c174d15a17c'}\n"
+ ]
+ }
+ ],
+ "source": [
+ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/k-compute/VROD4N-XVW1L0\")\n",
+ "print(r.json())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "This generates the command:\n",
+ "\n",
+ "```\n",
+ "gemma-wrapper --json -- -g /tmp/VROD4N-XVW1L0/BXD_geno.txt.gz -p /tmp/VROD4N-XVW1L0/BXD_pheno.txt -a /tmp/VROD4N-XVW1L0/BXD_snps.txt -gk > /tmp/VROD4N-XVW1L0/8f4906862459e59dcb452fd8162d2cc1-output.json\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Check the status of the command:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 35,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "{'status': 'queued'}\n"
+ ]
+ }
+ ],
+ "source": [
+ "r = requests.get(\"http://gn3-test.genenetwork.org/api/gemma/status/cmd%3A%3A2021-03-1005-1727-1727-7006dfef-8e60-4bb2-8390-0c174d15a17c\")\n",
+ "print(r.json())"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 68,
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "{'status': 'success'}\n"
+ ]
+ }
+ ],
"source": [
- "# Demo\n",
- "import requests\n",
- "r = requests.post(\"localhost:8080/api/gemma/k-compute/abcde-abcde\")\n",
+ "# After a short while:\n",
+ "r = requests.get(\"http://gn3-test.genenetwork.org/api/gemma/status/cmd%3A%3A2021-03-1005-1727-1727-7006dfef-8e60-4bb2-8390-0c174d15a17c\")\n",
"print(r.json())"
]
},
@@ -95,7 +234,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "#### POST /gemma/k-compute/:chromosomes/:token\n",
+ "#### POST /api/gemma/k-compute/loco/:chromosomes/:token\n",
"\n",
"Compute K values with chromosomes. The end command will look similar to:\n",
"\n",
@@ -105,6 +244,141 @@
" -a genotype-snps -gk > k_output_filename.json\n",
"```"
]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 78,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "{'output_file': '8f4906862459e59dcb452fd8162d2cc1-+O9bus-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-1926-1926-60d0aed2-1645-44e0-ba21-28f37bb4e688'}\n"
+ ]
+ }
+ ],
+ "source": [
+ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/k-compute/loco/1%2C2%2C3%2C4/VROD4N-XVW1L0\")\n",
+ "print(r.json())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## GWA Computation\n",
+ "##### POST /api/gemma/gwa-compute/:k-inputfile/:token\n",
+ "(No loco; no covars)\n",
+ "This example assumes we use the previously generated k-inputfile.\n",
+ "The k-inputfile can also be any file you added during the data upload!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 81,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3257-3257-30669a04-dc3d-4cce-a622-8be2103a864f'}\n"
+ ]
+ }
+ ],
+ "source": [
+ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/gwa-compute/8f4906862459e59dcb452fd8162d2cc1-output.json/VROD4N-XVW1L0\")\n",
+ "print(r.json())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "##### POST /api/gemma/gwa-compute/covars/:k-inputfile/:token\n",
+ "\n",
+ "The covars file is fetched from the metadata file"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 83,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3518-3518-70d057d3-cb07-4171-be07-e1dafe1fb278'}\n"
+ ]
+ }
+ ],
+ "source": [
+ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/gwa-compute/covars/8f4906862459e59dcb452fd8162d2cc1-output.json/VROD4N-XVW1L0\")\n",
+ "print(r.json())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "##### POST /api/gemma/gwa-compute/:k-inputfile/loco/maf/:maf/:token\n",
+ "\n",
+ "Compute GWA with loco (maf has to be given), no covars.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 84,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3848-3848-c66924be-fbf9-494a-9123-eb5941aca912'}\n"
+ ]
+ }
+ ],
+ "source": [
+ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/gwa-compute/8f4906862459e59dcb452fd8162d2cc1-output.json/loco/maf/9/VROD4N-XVW1L0\")\n",
+ "print(r.json())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "##### POST /api/gemma/gwa-compute/:k-inputfile/loco/covars/maf/:maf/:token\n",
+ "\n",
+ "The covariate file is looked up by the name defined in the metadata JSON file."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 87,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-5255-5255-46765ebe-bbca-4402-86fa-a4c145ad4f71'}\n"
+ ]
+ }
+ ],
+ "source": [
+ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/gwa-compute/8f4906862459e59dcb452fd8162d2cc1-output.json/loco/covars/maf/9/VROD4N-XVW1L0\")\n",
+ "print(r.json())"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
}
],
"metadata": {