{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Running GEMMA from GN3\n", "\n", "This document outlines how to use GEMMA from GeneNetwork3 (GN3).\n", "\n", "When you call one of the endpoints that runs GEMMA, GN3 constructs the command and puts it on a Redis queue, which avoids endpoint time-outs for long-running processes. A worker (sheepdog) processes the queued commands.\n", "\n", "If you are running GN3 in a development environment, make sure it is up by running:\n", "\n", "```sh\n", "env FLASK_APP=\"main.py\" flask run --port 8080\n", "```\n", "\n", "**PS: For these examples, I'm assuming you provide your own genotype files. This note will be removed once they are replaced by files provided through IPFS.**" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "### All imports go here!\n", "import requests" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### GET /api/gemma/version\n", "\n", "To ensure that things are working, run:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'code': 0, 'output': 'gemma-wrapper 0.98.1 (Ruby 2.6.5) by Pjotr Prins 2017,2018\\n'}\n" ] } ], "source": [ "r = requests.get(\"http://gn3-test.genenetwork.org/api/gemma/version\")\n", "print(r.json())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Uploading data\n", "\n", "Before you perform any computation, you need to ensure you have your data uploaded. 
For these examples, I'll use the data provided [here](https://github.com/genetics-statistics/gemma-wrapper/tree/master/test/data/input).\n", "\n", "##### POST /api/metadata/upload/:token\n", "\n", "If a token is provided, your user directory will be overwritten with the newly uploaded data! Without a token, the response returns a new one, as in the first example below.\n" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'status': 0, 'token': 'VROD4N-XVW1L0'}\n" ] } ], "source": [ "# Intentionally upload the data without the metadata file, so that the metadata can be uploaded later.\n", "r = requests.post(\"http://gn3-test.genenetwork.org/api/metadata/upload/\",\n", "                  files=[('file', ('file.tar.gz',\n", "                                   open('/tmp/file.tar.gz', 'rb'),\n", "                                   'application/octet-stream'))])\n", "\n", "print(r.json())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we override the upload with an archive that includes the relevant metadata file:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'status': 0, 'token': 'VROD4N-XVW1L0'}\n" ] } ], "source": [ "r = requests.post(\"http://gn3-test.genenetwork.org/api/metadata/upload/VROD4N-XVW1L0\",\n", "                  files=[('file', ('file-with-metadata.tar.gz',\n", "                                   open('/tmp/file-with-metadata.tar.gz', 'rb'),\n", "                                   'application/octet-stream'))])\n", "\n", "print(r.json())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will use the above token for computations!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## K-Computation\n", "\n", "Make sure that your metadata file is up to date!\n", "You need the genofile, traitfile, and snpsfile. You also need the token you received when you first uploaded your data." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Computing K-Values (no LOCO)\n", "##### POST /api/gemma/k-compute/:token\n", "\n", "The resulting command will look something like:\n", "\n", "```\n", "gemma-wrapper --json -- -debug \\\n", "    -g genotype-file -p traitfile \\\n", "    -a genotype-snps -gk > input-hash-k-output-filename\n", "```" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1005-1727-1727-7006dfef-8e60-4bb2-8390-0c174d15a17c'}\n" ] } ], "source": [ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/k-compute/VROD4N-XVW1L0\")\n", "print(r.json())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This generates the command:\n", "\n", "```\n", "gemma-wrapper --json -- -g /tmp/VROD4N-XVW1L0/BXD_geno.txt.gz -p /tmp/VROD4N-XVW1L0/BXD_pheno.txt -a /tmp/VROD4N-XVW1L0/BXD_snps.txt -gk > /tmp/VROD4N-XVW1L0/8f4906862459e59dcb452fd8162d2cc1-output.json\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check the status of the command with `GET /api/gemma/status/:unique_id`, passing the URL-encoded `unique_id` from the response above:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'status': 'queued'}\n" ] } ], "source": [ "r = requests.get(\"http://gn3-test.genenetwork.org/api/gemma/status/cmd%3A%3A2021-03-1005-1727-1727-7006dfef-8e60-4bb2-8390-0c174d15a17c\")\n", "print(r.json())" ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'status': 'success'}\n" ] } ], "source": [ "# After a short while:\n", "r = requests.get(\"http://gn3-test.genenetwork.org/api/gemma/status/cmd%3A%3A2021-03-1005-1727-1727-7006dfef-8e60-4bb2-8390-0c174d15a17c\")\n", "print(r.json())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 
POST /api/gemma/k-compute/loco/:chromosomes/:token\n", "\n", "Compute K values with LOCO for the given chromosomes, passed as a URL-encoded, comma-separated list. The resulting command will look similar to:\n", "\n", "```\n", "gemma-wrapper --json --loco 1,2,3,4 -- \\\n", "    -debug -g genotype-file -p traitfile \\\n", "    -a genotype-snps -gk > k_output_filename.json\n", "```" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'output_file': '8f4906862459e59dcb452fd8162d2cc1-+O9bus-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-1926-1926-60d0aed2-1645-44e0-ba21-28f37bb4e688'}\n" ] } ], "source": [ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/k-compute/loco/1%2C2%2C3%2C4/VROD4N-XVW1L0\")\n", "print(r.json())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## GWA Computation\n", "##### POST /api/gemma/gwa-compute/:k-inputfile/:token\n", "\n", "(No LOCO; no covars)\n", "\n", "Here we use the previously generated K output file as the k-inputfile.\n", "The k-inputfile can also be any file you added during the data upload!" 
] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3257-3257-30669a04-dc3d-4cce-a622-8be2103a864f'}\n" ] } ], "source": [ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/gwa-compute/8f4906862459e59dcb452fd8162d2cc1-output.json/VROD4N-XVW1L0\")\n", "print(r.json())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### POST /api/gemma/gwa-compute/covars/:k-inputfile/:token\n", "\n", "The name of the covars file is read from the metadata file." ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3518-3518-70d057d3-cb07-4171-be07-e1dafe1fb278'}\n" ] } ], "source": [ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/gwa-compute/covars/8f4906862459e59dcb452fd8162d2cc1-output.json/VROD4N-XVW1L0\")\n", "print(r.json())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### POST /api/gemma/gwa-compute/:k-inputfile/loco/maf/:maf/:token\n", "\n", "Compute GWA with LOCO (maf has to be given), no covars.\n" ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3848-3848-c66924be-fbf9-494a-9123-eb5941aca912'}\n" ] } ], "source": [ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/gwa-compute/8f4906862459e59dcb452fd8162d2cc1-output.json/loco/maf/9/VROD4N-XVW1L0\")\n", "print(r.json())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### POST /api/gemma/gwa-compute/:k-inputfile/loco/covars/maf/:maf/:token\n", "\n", "The 
covariate file is looked up by the name defined in the metadata JSON file." ] }, { "cell_type": "code", "execution_count": 87, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-5255-5255-46765ebe-bbca-4402-86fa-a4c145ad4f71'}\n" ] } ], "source": [ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/gwa-compute/8f4906862459e59dcb452fd8162d2cc1-output.json/loco/covars/maf/9/VROD4N-XVW1L0\")\n", "print(r.json())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## K-GWA computation\n", "\n", "Compute K and GWA in one fell swoop. This matters because GN2 runs GEMMA this way, as a single step." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### POST /api/gemma/k-gwa-compute/covars/:token\n" ] }, { "cell_type": "code", "execution_count": 89, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1010-1138-1138-558d7849-793f-4625-aac9-73d6bbc6dfdb'}\n" ] } ], "source": [ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/k-gwa-compute/covars/VROD4N-XVW1L0\")\n", "print(r.json())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### POST /api/gemma/k-gwa-compute/loco/:chromosomes/maf/:maf/:token" ] }, { "cell_type": "code", "execution_count": 92, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1010-1402-1402-1fe13423-e4f6-4f4f-9c1d-c855e3ab55b5'}\n" ] } ], "source": [ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/k-gwa-compute/loco/1%2C2%2C3/maf/9/VROD4N-XVW1L0\")\n", "print(r.json())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 
"##### POST /api/gemma/k-gwa-compute/covars/loco/:chromosomes/maf/:maf/:token" ] }, { "cell_type": "code", "execution_count": 93, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1010-1947-1947-c07c85e7-1355-4fab-bf9c-6c5e68f91a36'}\n" ] } ], "source": [ "r = requests.post(\"http://gn3-test.genenetwork.org/api/gemma/k-gwa-compute/covars/loco/1%2C2%2C3/maf/9/VROD4N-XVW1L0\")\n", "print(r.json())" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "/gnu/store/bvd09gb8ka642jzgxd2lpqlpdp160gn0-python-wrapper-3.8.2/bin/python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.2" } }, "nbformat": 4, "nbformat_minor": 2 }