# Running GEMMA from GN3

This document outlines how to run GEMMA through the GeneNetwork3 (GN3) API.

When you call one of the endpoints that runs GEMMA, GN3 constructs the command and pushes it onto a Redis queue instead of running it inside the request. This bypasses any time-out issues the endpoint would otherwise hit with a long-running process. A worker (sheepdog) then picks up the queued commands and runs them.
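
To make the flow concrete, here is a minimal sketch of the queue-and-worker pattern described above. It is not GN3's actual code: a plain `deque` stands in for Redis so the example runs without a server, and the names `enqueue_command` and `work_once` as well as the `running` and `error` statuses are my own assumptions (only `queued` and `success` appear in the responses in this document).

```python
import json
import uuid
from collections import deque

# Stand-in for the Redis queue (assumption: a real setup would use a Redis list).
queue = deque()
# Job id -> status, so a status endpoint could look it up later.
statuses = {}


def enqueue_command(cmd):
    """Queue a shell command string and return its job id straight away."""
    job_id = f"cmd::{uuid.uuid4()}"
    statuses[job_id] = "queued"
    queue.append(json.dumps({"id": job_id, "cmd": cmd}))
    return job_id


def work_once(run):
    """Pop one queued job, run it with `run`, and record the outcome."""
    if not queue:
        return None
    job = json.loads(queue.popleft())
    statuses[job["id"]] = "running"  # hypothetical intermediate status
    try:
        run(job["cmd"])
        statuses[job["id"]] = "success"
    except Exception:
        statuses[job["id"]] = "error"  # hypothetical failure status
    return job["id"]
```

The endpoint only has to call `enqueue_command` and return the job id, which is why it can respond immediately even when GEMMA itself takes a long time.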

If you are running GN3 in a development environment, make sure it is up by running:

```sh
env FLASK_APP="main.py" flask run --port 8080
```

**PS: For these examples, I'm assuming you provide your Genotype files. This will be edited out since they will be replaced by those provided by IPFS.**

In [6]:
### All imports go here!
import requests

##### GET api/gemma/version

To ensure that things are working, run:

In [7]:
r = requests.get("http://gn3-test.genenetwork.org/api/gemma/version")
print(r.json())

{'code': 0, 'output': 'gemma-wrapper 0.98.1 (Ruby 2.6.5) by Pjotr Prins 2017,2018\n'}
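
If you want to check the version from a script rather than by eye, something like the following could work. It's a sketch that assumes `code` is the exit code of the command (0 meaning success) and that the output contains `gemma-wrapper <version>`; `gemma_wrapper_version` is a hypothetical helper, not part of GN3.

```python
import re


def gemma_wrapper_version(payload):
    """Return the version string from a version payload, or None."""
    # Assumption: a non-zero `code` means the command failed.
    if payload.get("code") != 0:
        return None
    match = re.search(r"gemma-wrapper\s+(\d+(?:\.\d+)*)", payload.get("output", ""))
    return match.group(1) if match else None
```

With the response shown above, `gemma_wrapper_version(r.json())` gives `'0.98.1'`.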


## Uploading data

Before you perform any computation, you need to ensure your data is uploaded. For these examples, I'll use the data provided [here](https://github.com/genetics-statistics/gemma-wrapper/tree/master/test/data/input).

##### POST /api/metadata/upload/:token

If a token is provided, the user directory for that token will be overwritten with the newly uploaded data!


In [21]:
# I intentionally upload the data file without having the metadata file so that I can upload it later.
r = requests.post("http://gn3-test.genenetwork.org/api/metadata/upload/",
                  files=[('file', ('file.tar.gz',
                                   open('/tmp/file.tar.gz',
                                        'rb'), 'application/octet-stream'))])

print(r.json())

{'status': 0, 'token': 'VROD4N-XVW1L0'}


Now we override the upload with the relevant metadata file:

In [25]:
r = requests.post("http://gn3-test.genenetwork.org/api/metadata/upload/VROD4N-XVW1L0",
                  files=[('file', ('file-with-metadata.tar.gz',
                                   open('/tmp/file-with-metadata.tar.gz',
                                        'rb'), 'application/octet-stream'))])

print(r.json())

{'status': 0, 'token': 'VROD4N-XVW1L0'}


We will use the above token for computations!
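
The two uploads above differ only in whether a token is appended to the URL, so they can be wrapped in one helper. This is a sketch: `upload_archive` is a hypothetical name, the `post` argument is expected to be `requests.post` (it's passed in so the helper can be tried without a network), and the `with` block closes the file handle that the inline `open(...)` calls above leave open.

```python
import os

BASE_URL = "http://gn3-test.genenetwork.org/api"


def upload_archive(path, post, token=""):
    """Upload a tar.gz archive and return the token from the response.

    Pass the token from an earlier upload to overwrite that upload.
    """
    url = f"{BASE_URL}/metadata/upload/{token}"
    with open(path, "rb") as handle:
        files = [("file", (os.path.basename(path), handle,
                           "application/octet-stream"))]
        response = post(url, files=files)
    return response.json()["token"]
```

For example, `token = upload_archive('/tmp/file.tar.gz', requests.post)` followed by `upload_archive('/tmp/file-with-metadata.tar.gz', requests.post, token=token)` mirrors the two cells above.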

## K-Computation

Make sure that your metadata file is up to date!
You need the genofile, traitfile, and snpsfile. You also need the token that was returned when you first uploaded your data.

#### Computing K-Values (no loco):
##### POST /api/gemma/k-compute/:token
The end command will look something like:

```
gemma-wrapper --json -- -debug \
                -g genotype-file -p traitfile \
                -a genotype-snps -gk > input-hash-k-output-filename
```

In [66]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/k-compute/VROD4N-XVW1L0")
print(r.json())

{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1005-1727-1727-7006dfef-8e60-4bb2-8390-0c174d15a17c'}


This generates the command:

```
gemma-wrapper --json -- -g /tmp/VROD4N-XVW1L0/BXD_geno.txt.gz -p /tmp/VROD4N-XVW1L0/BXD_pheno.txt -a /tmp/VROD4N-XVW1L0/BXD_snps.txt -gk > /tmp/VROD4N-XVW1L0/8f4906862459e59dcb452fd8162d2cc1-output.json
```
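
For illustration, the command above can be reproduced from the token directory and the file names with a few lines of Python. This is only a sketch of the string assembly, not the code GN3 uses; `k_compute_command` and its parameters are hypothetical, and the output file name is passed in because this document doesn't describe how the hash is derived.

```python
import posixpath
import shlex


def k_compute_command(workdir, genofile, traitfile, snpsfile, output_file):
    """Assemble a (no loco) K-computation command line as one string."""
    def path(name):
        # Files live in the per-token directory; quote them for the shell.
        return shlex.quote(posixpath.join(workdir, name))

    return (
        f"gemma-wrapper --json -- -g {path(genofile)} -p {path(traitfile)} "
        f"-a {path(snpsfile)} -gk > {path(output_file)}"
    )
```

Called with `/tmp/VROD4N-XVW1L0` and the BXD file names, this yields the same string as the generated command shown above.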

Check the status of the command:

In [35]:
r = requests.get("http://gn3-test.genenetwork.org/api/gemma/status/cmd%3A%3A2021-03-1005-1727-1727-7006dfef-8e60-4bb2-8390-0c174d15a17c")
print(r.json())

{'status': 'queued'}


In [68]:
# After a short while:
r = requests.get("http://gn3-test.genenetwork.org/api/gemma/status/cmd%3A%3A2021-03-1005-1727-1727-7006dfef-8e60-4bb2-8390-0c174d15a17c")
print(r.json())

{'status': 'success'}
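
Rather than re-running the status cell by hand, the check can be put in a loop. The sketch below is a helper of my own, not part of GN3: `wait_for_job` and its parameters are hypothetical, `get` is expected to be `requests.get`, and since only `queued` and `success` show up in this document, any status other than `queued` is simply handed back to the caller.

```python
import time
from urllib.parse import quote

BASE_URL = "http://gn3-test.genenetwork.org/api"


def status_url(unique_id):
    """Build the status URL; the '::' in the id becomes '%3A%3A'."""
    return f"{BASE_URL}/gemma/status/{quote(unique_id, safe='')}"


def wait_for_job(unique_id, get, interval=5.0, max_polls=60, sleep=time.sleep):
    """Poll the status endpoint until the job is no longer 'queued'."""
    url = status_url(unique_id)
    for _ in range(max_polls):
        status = get(url).json()["status"]
        if status != "queued":
            return status
        sleep(interval)
    raise TimeoutError(f"{unique_id} still queued after {max_polls} polls")
```

Typical use would be `wait_for_job(r.json()['unique_id'], requests.get)` right after one of the compute requests.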


#### POST /api/gemma/k-compute/loco/:chromosomes/:token

Compute K values with LOCO for the given chromosomes. The end command will look similar to:

```
  gemma-wrapper --json --loco 1,2,3,4 \
                -debug -g genotypefile -p traitfile \
                -a genotype-snps -gk > k_output_filename.json
```

In [78]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/k-compute/loco/1%2C2%2C3%2C4/VROD4N-XVW1L0")
print(r.json())

{'output_file': '8f4906862459e59dcb452fd8162d2cc1-+O9bus-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-1926-1926-60d0aed2-1645-44e0-ba21-28f37bb4e688'}
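
The `1%2C2%2C3%2C4` in the URL is just `1,2,3,4` with the commas percent-encoded. Instead of typing that by hand, the URL can be built from a list of chromosomes; `k_compute_loco_url` below is a hypothetical helper I'm using for illustration, not something GN3 provides.

```python
from urllib.parse import quote

BASE_URL = "http://gn3-test.genenetwork.org/api"


def k_compute_loco_url(chromosomes, token):
    """Build the k-compute LOCO URL from an iterable of chromosomes."""
    joined = ",".join(str(chromosome) for chromosome in chromosomes)
    # safe="" makes quote() encode the commas as %2C.
    return f"{BASE_URL}/gemma/k-compute/loco/{quote(joined, safe='')}/{token}"
```

`k_compute_loco_url([1, 2, 3, 4], 'VROD4N-XVW1L0')` reproduces the URL used in the cell above.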


## GWA Computation
##### POST /api/gemma/gwa-compute/:k-inputfile/:token
No LOCO and no covars. This example uses the previously generated k-inputfile, but the k-inputfile can be any file you added during the data upload!

In [81]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/gwa-compute/8f4906862459e59dcb452fd8162d2cc1-output.json/VROD4N-XVW1L0")
print(r.json())

{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3257-3257-30669a04-dc3d-4cce-a622-8be2103a864f'}


##### POST /api/gemma/gwa-compute/covars/:k-inputfile/:token

The covars file is taken from the name given in the metadata file.

In [83]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/gwa-compute/covars/8f4906862459e59dcb452fd8162d2cc1-output.json/VROD4N-XVW1L0")
print(r.json())

{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3518-3518-70d057d3-cb07-4171-be07-e1dafe1fb278'}


##### POST /api/gemma/gwa-compute/:k-inputfile/loco/maf/:maf/:token

Compute GWA with LOCO and no covars. A maf value has to be given.


In [84]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/gwa-compute/8f4906862459e59dcb452fd8162d2cc1-output.json/loco/maf/9/VROD4N-XVW1L0")
print(r.json())

{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3848-3848-c66924be-fbf9-494a-9123-eb5941aca912'}


##### POST /api/gemma/gwa-compute/:k-inputfile/loco/covars/maf/:maf/:token

The covariate file is fetched from the name defined in the metadata json file.

In [87]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/gwa-compute/8f4906862459e59dcb452fd8162d2cc1-output.json/loco/covars/maf/9/VROD4N-XVW1L0")
print(r.json())

{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-5255-5255-46765ebe-bbca-4402-86fa-a4c145ad4f71'}
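
The four GWA routes above differ only in where `covars`, `loco`, and `maf` sit in the path, which is easy to get wrong by hand. As a sketch, they can be generated from two options; `gwa_compute_url` and its parameters are hypothetical names, and I'm assuming (as in the examples above) that giving a maf value is what selects the LOCO routes.

```python
BASE_URL = "http://gn3-test.genenetwork.org/api"


def gwa_compute_url(k_inputfile, token, covars=False, maf=None):
    """Return the gwa-compute URL; giving `maf` selects the LOCO routes."""
    parts = [BASE_URL, "gemma", "gwa-compute"]
    if maf is None:
        # No LOCO: "covars" comes before the k-inputfile.
        parts += (["covars"] if covars else []) + [k_inputfile]
    else:
        # LOCO: "covars" sits between "loco" and "maf".
        parts += [k_inputfile, "loco"]
        parts += (["covars"] if covars else []) + ["maf", str(maf)]
    parts.append(token)
    return "/".join(parts)
```

For instance, `gwa_compute_url(k_file, token, covars=True, maf=9)` gives the LOCO-with-covars URL from the last cell.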


## K-GWA computation

Compute K and GWA in one fell swoop. This is important because GN2 runs GEMMA this way, as a single step.

##### POST /api/gemma/k-gwa-compute/covars/:token


In [89]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/k-gwa-compute/covars/VROD4N-XVW1L0")
print(r.json())

{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1010-1138-1138-558d7849-793f-4625-aac9-73d6bbc6dfdb'}


##### POST /api/gemma/k-gwa-compute/loco/:chromosomes/maf/:maf/:token

In [92]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/k-gwa-compute/loco/1%2C2%2C3/maf/9/VROD4N-XVW1L0")
print(r.json())

{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1010-1402-1402-1fe13423-e4f6-4f4f-9c1d-c855e3ab55b5'}


##### POST /api/gemma/k-gwa-compute/covars/loco/:chromosomes/maf/:maf/:token

In [93]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/k-gwa-compute/covars/loco/1%2C2%2C3/maf/9/VROD4N-XVW1L0")
print(r.json())

{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1010-1947-1947-c07c85e7-1355-4fab-bf9c-6c5e68f91a36'}
