# Running GEMMA from GN3

This document outlines how to run GEMMA through the GeneNetwork3 (GN3) REST API.

Currently, when you call one of the endpoints that runs GEMMA, GN3 constructs the command and pushes it onto a Redis queue, which avoids endpoint time-outs for long-running processes. A worker (sheepdog) then processes the queued commands.
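
The queue-and-poll pattern described above can be illustrated with a toy in-memory model. This is only a sketch: a `deque` stands in for the Redis queue, a dict stands in for the stored job status, and the names `enqueue`, `work_once` and the `"success"` status are assumptions made for illustration, not GN3 or sheepdog code.

```python
import uuid
from collections import deque

# Toy stand-ins (assumptions): GN3 uses a Redis queue, not these objects.
job_queue = deque()
job_status = {}


def enqueue(command):
    """Queue a command and return at once with an id the caller can poll."""
    unique_id = f"cmd::{uuid.uuid4()}"
    job_status[unique_id] = "queued"
    job_queue.append((unique_id, command))
    return unique_id


def work_once(run):
    """One worker step: take the oldest queued command and run it."""
    if not job_queue:
        return None
    unique_id, command = job_queue.popleft()
    try:
        run(command)
        job_status[unique_id] = "success"  # hypothetical status name
    except Exception:
        job_status[unique_id] = "error"
    return unique_id
```

The point of the design is that `enqueue` returns immediately, so the HTTP request never waits for GEMMA itself; the caller checks the status later using the returned id.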

If you are running GN3 in a development environment, make sure it is up by starting it with:

```sh
env FLASK_APP="main.py" flask run --port 8080
```

**PS: These examples assume you provide your own genotype files. This note will be edited out once they are replaced by files provided through IPFS.**

In [1]:
### All imports go here!
import requests

##### GET /api/gemma/version

To ensure that things are working, run:

In [2]:
r = requests.get("http://gn3-test.genenetwork.org/api/gemma/version")
print(r.json())

{'code': 0, 'output': 'gemma-wrapper 0.98.1 (Ruby 2.6.5) by Pjotr Prins 2017,2018\n'}


## Uploading data

Before you run any computation, make sure your data is uploaded. These examples use the data provided [here](https://github.com/genetics-statistics/gemma-wrapper/tree/master/test/data/input).

An example of a metadata.json file:

```
{
    "title": "This is my dataset for testing the REST API",
    "description": "Longer description",
    "date": "20210127",
    "authors": [
        "R. W. Williams"
    ],
    "cross": "BXD",
    "geno": "/ipfs/QmakcPHuxKouUvuNZ5Gna1pyXSAPB5fFSkqFt5pDydd9A4/GN638/BXH.geno",
    "pheno": "BXD_pheno.txt",
    "snps": "BXD_snps.txt",
    "covar": "BXD_covariates.txt"
}

```

Note that the genotype file is given as a valid IPFS address.
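
Before uploading, it can help to sanity-check the metadata file locally. The sketch below is an assumption-laden helper, not part of GN3: it treats `geno`, `pheno` and `snps` as required (the K-computation section asks for the genotype, trait and SNP files), treats `covar` as optional, and assumes an IPFS address always starts with `/ipfs/` as in the example above. `check_metadata` is a hypothetical name.

```python
import json

# Assumption: these are the keys the K computation needs; "covar" is optional.
REQUIRED_KEYS = ("geno", "pheno", "snps")


def check_metadata(text):
    """Return a list of problems found in a metadata.json document."""
    metadata = json.loads(text)
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS
                if not metadata.get(key)]
    geno = metadata.get("geno", "")
    # Assumption: an IPFS address looks like "/ipfs/<hash>/...".
    if geno and not str(geno).startswith("/ipfs/"):
        problems.append("geno is not an /ipfs/ path")
    return problems
```

An empty list means the file passed these (deliberately shallow) checks; the server may still reject it for other reasons.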

##### POST /api/metadata/upload/:token

If a token is provided, your user directory will be overwritten with the newly uploaded data!


In [22]:
# I intentionally upload the data file without having the metadata file so that I can upload it later.
r = requests.post("http://gn3-test.genenetwork.org/api/metadata/upload/",
                  files=[('file', ('file.tar.gz',
                                   open('/tmp/file-no-metadata.tar.gz',
                                        'rb'), 'application/octet-stream'))])

token_no_metadata = r.json()["token"]
print(token_no_metadata)

PEERO1-WE4BBX


Now we can overwrite the upload with an archive that includes the relevant metadata file if we want. If the contents are the same, we get the same token!

In [40]:
r = requests.post("http://gn3-test.genenetwork.org/api/metadata/upload/VROD4N-XVW1L0",
                  files=[('file', ('file.tar.gz',
                                   open('/tmp/file.tar.gz',
                                        'rb'), 'application/octet-stream'))])

token = r.json()["token"]

In [41]:
print(token)

VROD4N-XVW1L0


We will use the above token for computations!
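
The two upload calls above differ only in the archive path and the optional token, so they can be wrapped in one helper. This is a sketch under assumptions: `upload_archive` is a hypothetical name, and `post` is expected to be `requests.post` or anything with the same call shape, which keeps the helper easy to try without a server.

```python
def upload_archive(post, base_url, archive_path, token=""):
    """Upload a .tar.gz archive and return the token from the response.

    `post` is any callable that accepts (url, files=...) like requests.post.
    An empty token requests a fresh upload; a token overwrites that upload.
    """
    url = f"{base_url.rstrip('/')}/api/metadata/upload/{token}"
    # The context manager closes the file even if the request fails.
    with open(archive_path, "rb") as handle:
        response = post(
            url,
            files=[("file", ("file.tar.gz", handle,
                             "application/octet-stream"))],
        )
    return response.json()["token"]
```

With `requests` imported as in the first cell, a call would look like `upload_archive(requests.post, "http://gn3-test.genenetwork.org", "/tmp/file.tar.gz")`.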

## K-Computation

Make sure that your metadata file is up to date!
You need the genofile, traitfile, and snpsfile. You also need the token you received when you first uploaded your data.

#### Computing K-Values (no loco):
##### POST /api/gemma/k-compute/:token
The end command will look something like:

```
gemma-wrapper --json -- -debug \
                -g genotype-file -p traitfile \
                -a genotype-snps -gk > input-hash-k-output-filename
```

In [43]:
r = requests.post("http://gn3-test.genenetwork.org/api/"
                  f"gemma/k-compute/{token}")
print(r.json())

{'output_file': '5e9b1e32053f3d456418af2119a0a9e0-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-2209-1901-1901-aef4973e-6d07-49f3-8216-ea883c6442b3'}


This generates the command:

```
gemma-wrapper --json -- -g /tmp/cache/QmRZpiWCPxn6d1sCaPLpFQBpDmaPaPygvZ165oK4B9myjy/GN602/BXD.geno -p /tmp/VROD4N-XVW1L0/BXD_pheno.txt -a /tmp/VROD4N-XVW1L0/BXD_snps.txt -gk > /tmp/VROD4N-XVW1L0/5e9b1e32053f3d456418af2119a0a9e0-output.json
```

Check the status of the command:

In [44]:
r = requests.get("http://gn3-test.genenetwork.org/api/gemma/"
                 "status/cmd%3A%3A2021-03-2209-1901-1901-aef4973e-6d07-"
                 "49f3-8216-ea883c6442b3")
print(r.json())

{'status': 'error'}


This is the expected result here: GEMMA could not parse the genotype file, as the error log shows:

```
Using GEMMA 0.98.4 (2020-12-15) by Xiang Zhou and team (C) 2012-2020

Found 0.98.4, comparing against expected v0.98.0

GEMMA 0.98.4 (2020-12-15) by Xiang Zhou and team (C) 2012-2020
Reading Files ...
**** FAILED: Parsing input file '/tmp/cache/QmRZpiWCPxn6d1sCaPLpFQBpDmaPaPygvZ165oK4B9myjy/GN602/BXD.geno' failed in function ReadFile_geno in src/gemma_io.cpp at line 744                                                              
{"warnings":[],"errno":2,"debug":[],"type":"K","files":[[null,"/tmp/605220a4d963542392fb44c15060cf6a8cee659b.log.txt","/tmp/605220a4d963542392fb44c15060cf6a8cee659b.cXX.txt"]],"gemma_command":"/gnu/store/yyrfmg0i18w190l4lb21cv86fqclalsk-gemma-gn2-git-0.98.3-47221d6/bin/gemma -g /tmp/cache/QmRZpiWCPxn6d1sCaPLpFQBpDmaPaPygvZ165oK4B9myjy/GN602/BXD.geno -p /tmp/VROD4N-XVW1L0/BXD_pheno.txt -a /tmp/VROD4N-XVW1L0/BXD_snps.txt -gk -outdir /tmp -o 605220a4d963542392fb44c15060cf6a8cee659b"}Traceback (most recent call last):                          
        6: from /gnu/store/86df7mjr3y1mrz62k4zipm6bznj10faj-profile/bin/gemma-wrapper:4:in `<main>'                                                        
        5: from /gnu/store/86df7mjr3y1mrz62k4zipm6bznj10faj-profile/bin/gemma-wrapper:4:in `load'                                                          
        4: from /gnu/store/p1alkfcsw3m24vlgxfb6gk24zn67h8n2-gemma-wrapper-0.98.1/bin/.real/gemma-wrapper:23:in `<top (required)>'                          
        3: from /gnu/store/p1alkfcsw3m24vlgxfb6gk24zn67h8n2-gemma-wrapper-0.98.1/bin/.real/gemma-wrapper:23:in `load'                                      
        2: from /gnu/store/p1alkfcsw3m24vlgxfb6gk24zn67h8n2-gemma-wrapper-0.98.1/lib/ruby/vendor_ruby/gems/bio-gemma-wrapper-0.98.1/bin/gemma-wrapper:345:in `<top (required)>'                                                          
        1: from /gnu/store/p1alkfcsw3m24vlgxfb6gk24zn67h8n2-gemma-wrapper-0.98.1/lib/ruby/vendor_ruby/gems/bio-gemma-wrapper-0.98.1/bin/gemma-wrapper:316:in `block in <top (required)>'                                                 
/gnu/store/p1alkfcsw3m24vlgxfb6gk24zn67h8n2-gemma-wrapper-0.98.1/lib/ruby/vendor_ruby/gems/bio-gemma-wrapper-0.98.1/bin/gemma-wrapper:278:in `block in <top (required)>': exit on GEMMA error 2 (RuntimeError) 
```
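
In the log above, the JSON record written by gemma-wrapper and the Ruby traceback end up on the same line. A small sketch for pulling the record out, assuming the line starts with the JSON object as it does here; `split_wrapper_record` is a hypothetical helper name:

```python
import json


def split_wrapper_record(line):
    """Split a log line into its leading JSON record and the trailing text."""
    text = line.lstrip()
    # raw_decode parses one JSON value and reports where it ended,
    # so anything after the closing brace is returned untouched.
    record, end = json.JSONDecoder().raw_decode(text)
    return record, text[end:]
```

For the line shown above, `record["errno"]` is `2` and the remainder begins with the traceback.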

In [31]:
# After a short while:
# The command reports an error because I didn't know where the right
# genotype files for BXD are :(
r = requests.get("http://gn3-test.genenetwork.org/api/gemma/status/"
                 "cmd%3A%3A2021-03-2209-0015-"
                 "0015-bb5c765f-93a3-45d1-a50d-b817cd7192e7")
print(r.json())

{'status': 'error'}
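
The status checks above hand-encode the `::` in the id as `%3A%3A` and are repeated by hand. Both steps can be sketched as helpers. Assumptions: `status_url` and `wait_for_result` are hypothetical names, `fetch_status` is a callable you supply that returns the `status` string for an id, and any status other than `queued` is treated as final (this document only shows `queued` and `error`).

```python
import time
from urllib.parse import quote


def status_url(base_url, unique_id):
    """Build the status URL, percent-encoding the '::' in the id."""
    return (f"{base_url.rstrip('/')}/api/gemma/status/"
            f"{quote(unique_id, safe='')}")


def wait_for_result(fetch_status, unique_id, interval=5.0, timeout=600.0,
                    sleep=time.sleep):
    """Poll until the status is no longer 'queued' or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(unique_id)
        if status != "queued":  # assumption: every other status is final
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{unique_id} still queued after {timeout}s")
        sleep(interval)
```

With `requests`, `fetch_status` could be `lambda uid: requests.get(status_url(base_url, uid)).json()["status"]`.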


#### POST /api/gemma/k-compute/loco/:chromosomes/:token

Compute K values with chromosomes (loco). The end command will look similar to:

```
  gemma-wrapper --json --loco 1,2,3,4 -- \
                -debug -g genotypefile -p traitfile \
                -a genotype-snps -gk > k_output_filename.json
```

In [78]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/"
                  "k-compute/loco/1%2C2%2C3%2C4/VROD4N-XVW1L0")
print(r.json())

{'output_file': '8f4906862459e59dcb452fd8162d2cc1-+O9bus-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-1926-1926-60d0aed2-1645-44e0-ba21-28f37bb4e688'}
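
The `1%2C2%2C3%2C4` segment in the URL above is just `1,2,3,4` with the commas percent-encoded. A sketch of building that URL from a list of chromosomes; `k_compute_loco_url` is a hypothetical helper, not a GN3 function:

```python
from urllib.parse import quote


def k_compute_loco_url(base_url, chromosomes, token):
    """Build the loco K-compute URL from an iterable of chromosome names."""
    joined = ",".join(str(chromosome) for chromosome in chromosomes)
    # safe="" makes quote() encode the commas as %2C.
    return (f"{base_url.rstrip('/')}/api/gemma/k-compute/loco/"
            f"{quote(joined, safe='')}/{token}")
```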


## GWA Computation
##### POST /api/gemma/gwa-compute/:k-inputfile/:token

No loco and no covars. This example assumes we use the previously generated K-input file. The K-input file can also be any file you added during the data upload!

In [81]:
r = requests.post("http://gn3-test.genenetwork.org/api/"
                  "gemma/gwa-compute/8f4906862459e59dcb"
                  "452fd8162d2cc1-output.json/VROD4N-XVW1L0")
print(r.json())

{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3257-3257-30669a04-dc3d-4cce-a622-8be2103a864f'}


##### POST /api/gemma/gwa-compute/covars/:k-inputfile/:token

The covars file name is read from the metadata file.

In [83]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/"
                  "gwa-compute/covars/"
                  "8f4906862459e59dcb452fd8162d2cc1-output.json/"
                  "VROD4N-XVW1L0")
print(r.json())

{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3518-3518-70d057d3-cb07-4171-be07-e1dafe1fb278'}


##### POST /api/gemma/gwa-compute/:k-inputfile/loco/maf/:maf/:token

Compute GWA with loco (a maf value has to be given) and no covars.


In [84]:
r = requests.post("http://gn3-test.genenetwork.org/api/"
                  "gemma/gwa-compute/"
                  "8f4906862459e59dcb452fd8162d2cc1-output.json/"
                  "loco/maf/9/VROD4N-XVW1L0")
print(r.json())

{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-3848-3848-c66924be-fbf9-494a-9123-eb5941aca912'}


##### POST /api/gemma/gwa-compute/:k-inputfile/loco/covars/maf/:maf/:token

The covariate file is fetched from the name defined in the metadata json file.

In [87]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/"
                  "gwa-compute/8f4906862459e59dcb452fd8162d2cc1"
                  "-output.json/loco/covars/maf/9/VROD4N-XVW1L0")
print(r.json())

{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1006-5255-5255-46765ebe-bbca-4402-86fa-a4c145ad4f71'}
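
The four GWA routes in this section differ only in where `covars`, `loco` and `maf` appear in the path. A sketch that selects the route from two options; `gwa_compute_url` and its parameter names are hypothetical, and the path layout is taken from the route headings above:

```python
def gwa_compute_url(base_url, k_file, token, covars=False, loco_maf=None):
    """Build a gwa-compute URL.

    loco_maf=None selects the non-loco routes; any other value selects the
    loco routes and is used as the maf path segment.
    """
    parts = ["api", "gemma", "gwa-compute"]
    if loco_maf is None:
        # .../gwa-compute[/covars]/:k-inputfile/:token
        parts += (["covars"] if covars else []) + [k_file]
    else:
        # .../gwa-compute/:k-inputfile/loco[/covars]/maf/:maf/:token
        parts += [k_file, "loco"] + (["covars"] if covars else [])
        parts += ["maf", str(loco_maf)]
    parts.append(token)
    return f"{base_url.rstrip('/')}/" + "/".join(parts)
```

The `k_file` argument is typically the `output_file` value returned by an earlier K-compute request.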


## K-GWA computation

Compute K and GWA in one fell swoop. This is important because in GN2, GEMMA does both in a single step.

##### POST /api/gemma/k-gwa-compute/covars/maf/:maf/:token


In [89]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/"
                  "k-gwa-compute/covars/VROD4N-XVW1L0")
print(r.json())

{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1010-1138-1138-558d7849-793f-4625-aac9-73d6bbc6dfdb'}


##### POST /api/gemma/k-gwa-compute/loco/:chromosomes/maf/:maf/:token

In [92]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/"
                  "k-gwa-compute/loco/1%2C2%2C3/maf/9/VROD4N-XVW1L0")
print(r.json())

{'output_file': '8f4906862459e59dcb452fd8162d2cc1-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1010-1402-1402-1fe13423-e4f6-4f4f-9c1d-c855e3ab55b5'}


##### POST /api/gemma/k-gwa-compute/covars/loco/:chromosomes/maf/:maf/:token

In [93]:
r = requests.post("http://gn3-test.genenetwork.org/api/gemma/"
                  "k-gwa-compute/covars/loco/1%2C2%2C3/maf/9/VROD4N-XVW1L0")
print(r.json())

{'output_file': 'c718773b04935405258315b9588d13e6-output.json', 'status': 'queued', 'unique_id': 'cmd::2021-03-1010-1947-1947-c07c85e7-1355-4fab-bf9c-6c5e68f91a36'}
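
The three combined routes in this section can be built the same way. This sketch follows the route headings of this section; `k_gwa_compute_url` is a hypothetical helper, and it refuses the combination without covars and without loco because no such route is listed in this document.

```python
from urllib.parse import quote


def k_gwa_compute_url(base_url, token, maf, covars=False, chromosomes=None):
    """Build a k-gwa-compute URL for the routes listed in this section."""
    if not covars and chromosomes is None:
        raise ValueError("no k-gwa-compute route without covars or loco "
                         "is documented here")
    parts = ["api", "gemma", "k-gwa-compute"]
    if covars:
        parts.append("covars")
    if chromosomes is not None:
        joined = ",".join(str(chromosome) for chromosome in chromosomes)
        parts += ["loco", quote(joined, safe="")]  # commas become %2C
    parts += ["maf", str(maf), token]
    return f"{base_url.rstrip('/')}/" + "/".join(parts)
```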
