Diffstat (limited to 'doc')
-rw-r--r-- doc/API_readme.md | 168
-rw-r--r-- doc/README.org | 4
-rw-r--r-- doc/elasticsearch.org | 247
-rw-r--r-- doc/heatmap-generation.org | 34
-rw-r--r-- doc/images/gn2_header_collections.png | bin 0 -> 7890 bytes
-rw-r--r-- doc/images/heatmap_form.png | bin 0 -> 9363 bytes
-rw-r--r-- doc/images/heatmap_with_hover_tools.png | bin 0 -> 42578 bytes
-rw-r--r-- doc/joss/2016/2020.12.23.424047v1.full.pdf | bin 0 -> 3804818 bytes
8 files changed, 36 insertions, 417 deletions
diff --git a/doc/API_readme.md b/doc/API_readme.md
index be6668dc..17d10e44 100644
--- a/doc/API_readme.md
+++ b/doc/API_readme.md
@@ -1,169 +1,3 @@
# API Query Documentation #
----
-# Fetching Dataset/Trait info/data #
----
-## Fetch Species List ##
-To get a list of species with data available in GN (and their associated names and ids):
-```
-curl http://genenetwork.org/api/v_pre1/species
-[ { "FullName": "Mus musculus", "Id": 1, "Name": "mouse", "TaxonomyId": 10090 }, ... { "FullName": "Populus trichocarpa", "Id": 10, "Name": "poplar", "TaxonomyId": 3689 } ]
-```
-
-Or to get a single species info:
-```
-curl http://genenetwork.org/api/v_pre1/species/mouse
-```
-OR
-```
-curl http://genenetwork.org/api/v_pre1/species/mouse.json
-```
-
-*For all queries where the last field is a user-specified name/ID, there will be the option to append a file format type. Currently there is only JSON (and it will default to JSON if none is provided), but other formats will be added later*
-
-## Fetch Groups/RISets ##
-
-This query can optionally filter by species:
-
-```
-curl http://genenetwork.org/api/v_pre1/groups (for all species)
-```
-OR
-```
-curl http://genenetwork.org/api/v_pre1/groups/mouse (for just mouse groups/RISets)
-[ { "DisplayName": "BXD", "FullName": "BXD RI Family", "GeneticType": "riset", "Id": 1, "MappingMethodId": "1", "Name": "BXD", "SpeciesId": 1, "public": 2 }, ... { "DisplayName": "AIL LGSM F34 and F39-43 (GBS)", "FullName": "AIL LGSM F34 and F39-43 (GBS)", "GeneticType": "intercross", "Id": 72, "MappingMethodId": "2", "Name": "AIL-LGSM-F34-F39-43-GBS", "SpeciesId": 1, "public": 2 } ]
-```
-
-## Fetch Genotypes for Group/RISet ##
-```
-curl http://genenetwork.org/api/v_pre1/genotypes/bimbam/BXD
-curl http://genenetwork.org/api/v_pre1/genotypes/BXD.bimbam
-```
-Returns a group's genotypes in one of several formats - bimbam, rqtl2, or geno (a format used by qtlreaper which is just a CSV file consisting of marker positions and genotypes)
-
-Rqtl2 genotype queries can also include the dataset name and will return a zip of the genotypes, phenotypes, and gene map (marker names/positions). For example:
-```
-curl http://genenetwork.org/api/v_pre1/genotypes/rqtl2/BXD/HC_M2_0606_P.zip
-```
-
-## Fetch Datasets ##
-```
-curl http://genenetwork.org/api/v_pre1/datasets/bxd
-```
-OR
-```
-curl http://genenetwork.org/api/v_pre1/datasets/mouse/bxd
-[ { "AvgID": 1, "CreateTime": "Fri, 01 Aug 2003 00:00:00 GMT", "DataScale": "log2", "FullName": "UTHSC/ETHZ/EPFL BXD Liver Polar Metabolites Extraction A, CD Cohorts (Mar 2017) log2", "Id": 1, "Long_Abbreviation": "BXDMicroArray_ProbeSet_August03", "ProbeFreezeId": 3, "ShortName": "Brain U74Av2 08/03 MAS5", "Short_Abbreviation": "Br_U_0803_M", "confidentiality": 0, "public": 0 }, ... { "AvgID": 3, "CreateTime": "Tue, 14 Aug 2018 00:00:00 GMT", "DataScale": "log2", "FullName": "EPFL/LISP BXD CD Liver Affy Mouse Gene 1.0 ST (Aug18) RMA", "Id": 859, "Long_Abbreviation": "EPFLMouseLiverCDRMAApr18", "ProbeFreezeId": 181, "ShortName": "EPFL/LISP BXD CD Liver Affy Mouse Gene 1.0 ST (Aug18) RMA", "Short_Abbreviation": "EPFLMouseLiverCDRMA0818", "confidentiality": 0, "public": 1 } ]
-```
-(I added the option to specify species just in case we end up with the same group name across multiple species at some point, though it's currently unnecessary)
-
-## Fetch Individual Dataset Info ##
-### For mRNA Assay/"ProbeSet" ###
-
-```
-curl http://genenetwork.org/api/v_pre1/dataset/HC_M2_0606_P
-```
-OR
-```
-curl http://genenetwork.org/api/v_pre1/dataset/bxd/HC_M2_0606_P```
-{ "confidential": 0, "data_scale": "log2", "dataset_type": "mRNA expression", "full_name": "Hippocampus Consortium M430v2 (Jun06) PDNN", "id": 112, "name": "HC_M2_0606_P", "public": 2, "short_name": "Hippocampus M430v2 BXD 06/06 PDNN", "tissue": "Hippocampus mRNA", "tissue_id": 9 }
-```
-(This also has the option to specify group/riset)
-
-### For "Phenotypes" (basically non-mRNA Expression; stuff like weight, sex, etc) ###
-For these traits, the query fetches publication info and takes the group and phenotype 'ID' as input. For example:
-```
-curl http://genenetwork.org/api/v_pre1/dataset/bxd/10001
-{ "dataset_type": "phenotype", "description": "Central nervous system, morphology: Cerebellum weight, whole, bilateral in adults of both sexes [mg]", "id": 10001, "name": "CBLWT2", "pubmed_id": 11438585, "title": "Genetic control of the mouse cerebellum: identification of quantitative trait loci modulating size and architecture", "year": "2001" }
-```
-
-## Fetch Sample Data for Dataset ##
-```
-curl http://genenetwork.org/api/v_pre1/sample_data/HSNIH-PalmerPublish.csv
-```
-
-Returns a CSV file with sample/strain names as the columns and trait IDs as rows
-
-## Fetch Sample Data for Single Trait ##
-```
-curl http://genenetwork.org/api/v_pre1/sample_data/HC_M2_0606_P/1436869_at
-[ { "data_id": 23415463, "sample_name": "129S1/SvImJ", "sample_name_2": "129S1/SvImJ", "se": 0.123, "value": 8.201 }, { "data_id": 23415463, "sample_name": "A/J", "sample_name_2": "A/J", "se": 0.046, "value": 8.413 }, { "data_id": 23415463, "sample_name": "AKR/J", "sample_name_2": "AKR/J", "se": 0.134, "value": 8.856 }, ... ]
-```
-
-## Fetch Trait List for Dataset ##
-```
-curl http://genenetwork.org/api/v_pre1/traits/HXBBXHPublish.json
-[ { "Additive": 0.0499967532467532, "Id": 10001, "LRS": 16.2831307029479, "Locus": "rs106114574", "PhenotypeId": 1449, "PublicationId": 319, "Sequence": 1 }, ... ]
-```
-
-Both JSON and CSV formats can be specified, with JSON as default. There is also an optional "ids_only" and "names_only" parameter that will only return a list of trait IDs or names, respectively.
-
-## Fetch Trait Info (Name, Description, Location, etc) ##
-### For mRNA Expression/"ProbeSet" ###
-```
-curl http://genenetwork.org/api/v_pre1/trait/HC_M2_0606_P/1436869_at
-{ "additive": -0.214087568058076, "alias": "HHG1; HLP3; HPE3; SMMCI; Dsh; Hhg1", "chr": "5", "description": "sonic hedgehog (hedgehog)", "id": 99602, "locus": "rs8253327", "lrs": 12.7711275309832, "mb": 28.457155, "mean": 9.27909090909091, "name": "1436869_at", "p_value": 0.306, "se": null, "symbol": "Shh" }
-```
-
-### For "Phenotypes" ###
-For phenotypes this just gets the max LRS, its location, and additive effect (as calculated by qtlreaper)
-
-Since each group/riset only has one phenotype "dataset", this query takes either the group/riset name or the group/riset name + "Publish" (for example "BXDPublish", which is the dataset name in the DB) as input
-```
-curl http://genenetwork.org/api/v_pre1/trait/BXD/10001
-{ "additive": 2.39444435069444, "id": 4, "locus": "rs48756159", "lrs": 13.4974911471087 }
-```
-
----
-
-# Analyses #
----
-## Mapping ##
-Currently two mapping tools can be used - GEMMA and R/qtl. qtlreaper will be added later with Christian Fischer's RUST implementation - https://github.com/chfi/rust-qtlreaper
-
-Each method's query takes the following parameters respectively (more will be added):
-### GEMMA ###
-* trait_id (*required*) - ID for trait being mapped
-* db (*required*) - DB name for trait above (Short_Abbreviation listed when you query for datasets)
-* use_loco - Whether to use LOCO (leave one chromosome out) method (default = false)
-* maf - minor allele frequency (default = 0.01)
-
-Example query:
-```
-curl http://genenetwork.org/api/v_pre1/mapping?trait_id=10015&db=BXDPublish&method=gemma&use_loco=true
-```
-
-### R/qtl ###
-(See the R/qtl guide for information on some of these options - http://www.rqtl.org/manual/qtl-manual.pdf)
-* trait_id (*required*) - ID for trait being mapped
-* db (*required*) - DB name for trait above (Short_Abbreviation listed when you query for datasets)
-* rqtl_method - hk (default) | ehk | em | imp | mr | mr-imp | mr-argmax ; Corresponds to the "method" option for the R/qtl scanone function.
-* rqtl_model - normal (default) | binary | 2-part | np ; corresponds to the "model" option for the R/qtl scanone function
-* num_perm - number of permutations; 0 by default
-* control_marker - Name of marker to use as control; this relies on the user knowing the name of the marker they want to use as a covariate
-* interval_mapping - Whether to use interval mapping; "false" by default
-* pair_scan - *NYI*
-
-Example query:
-```
-curl http://genenetwork.org/api/v_pre1/mapping?trait_id=1418701_at&db=HC_M2_0606_P&method=rqtl&num_perm=100
-```
-
-Some combinations of methods/models may not make sense. The R/qtl manual should be referred to for any questions on its use (specifically the scanone function in this case)
-
-## Calculate Correlation ##
-Currently only Sample and Tissue correlations are implemented
-
-This query currently takes the following parameters (though more will be added):
-* trait_id (*required*) - ID for trait used for correlation
-* db (*required*) - DB name for the trait above (this is the Short_Abbreviation listed when you query for datasets)
-* target_db (*required*) - Target DB name to be correlated against
-* type - sample (default) | tissue
-* method - pearson (default) | spearman
-* return - Number of results to return (default = 500)
-
-Example query:
-```
-curl http://genenetwork.org/api/v_pre1/correlation?trait_id=1427571_at&db=HC_M2_0606_P&target_db=BXDPublish&type=sample&return_count=100
-[ { "#_strains": 6, "p_value": 0.004804664723032055, "sample_r": -0.942857142857143, "trait": 20511 }, { "#_strains": 6, "p_value": 0.004804664723032055, "sample_r": -0.942857142857143, "trait": 20724 }, { "#_strains": 12, "p_value": 1.8288943424888848e-05, "sample_r": -0.9233615170820528, "trait": 13536 }, { "#_strains": 7, "p_value": 0.006807187408935392, "sample_r": 0.8928571428571429, "trait": 10157 }, { "#_strains": 7, "p_value": 0.006807187408935392, "sample_r": -0.8928571428571429, "trait": 20392 }, ... ]
-```
+This document has moved to [gn-docs](https://github.com/genenetwork/gn-docs/blob/master/api/GN2-REST-API.md)!
diff --git a/doc/README.org b/doc/README.org
index 1236016e..e1c6b614 100644
--- a/doc/README.org
+++ b/doc/README.org
@@ -26,7 +26,7 @@
* Introduction
-Large system deployments can get very [[http://biogems.info/contrib/genenetwork/gn2.svg ][complex]]. In this document we
+Large system deployments can get very [[http://genenetwork.org/environments/][complex]]. In this document we
explain the GeneNetwork version 2 (GN2) reproducible deployment system
which is based on GNU Guix (see also [[https://github.com/pjotrp/guix-notes/blob/master/README.md][Guix-notes]]). The Guix
system can be used to install GN with all its files and dependencies.
@@ -81,14 +81,12 @@ GeneNetwork2 with
: source ~/opt/guix-pull/etc/profile
: git clone https://git.genenetwork.org/guix-bioinformatics/guix-bioinformatics.git ~/guix-bioinformatics
: cd ~/guix-bioinformatics
-: git pull
: env GUIX_PACKAGE_PATH=$HOME/guix-bioinformatics guix package -i genenetwork2 -p ~/opt/genenetwork2
you probably also need guix-past (the upstream channel for older packages):
: git clone https://gitlab.inria.fr/guix-hpc/guix-past.git ~/guix-past
: cd ~/guix-past
-: git pull
: env GUIX_PACKAGE_PATH=$HOME/guix-bioinformatics:$HOME/guix-past/modules ~/opt/guix-pull/bin/guix package -i genenetwork2 -p ~/opt/genenetwork2
ignore the warnings. Guix should install the software without trying
diff --git a/doc/elasticsearch.org b/doc/elasticsearch.org
deleted file mode 100644
index 864a8363..00000000
--- a/doc/elasticsearch.org
+++ /dev/null
@@ -1,247 +0,0 @@
-* Elasticsearch
-
-** Introduction
-
-GeneNetwork uses elasticsearch (ES) for all things considered
-'state'. One example is user collections, another is user management.
-
-** Example
-
-To get the right environment, first you can get a python REPL with something like
-
-: env GN2_PROFILE=~/opt/gn-latest ./bin/genenetwork2 ../etc/default_settings.py -cli python
-
-(make sure to use the correct GN2_PROFILE!)
-
-Next try
-
-#+BEGIN_SRC python
-
-from elasticsearch import Elasticsearch, TransportError
-
-es = Elasticsearch([{ "host": 'localhost', "port": '9200' }])
-
-# Dump all data
-
-es.search("*")
-
-# To fetch an E-mail record from the users index
-
-record = es.search(
- index = 'users', doc_type = 'local', body = {
- "query": { "match": { "email_address": "myname@email.com" } }
- })
-
-# It is also possible to do wild card matching
-
-q = { "query": { "wildcard" : { "full_name" : "pjot*" } }}
-es.search(index = 'users', doc_type = 'local', body = q)
-
-# To get elements from that record:
-
-record['hits']['hits'][0][u'_source']['full_name']
-u'Pjotr'
-
-record['hits']['hits'][0][u'_source']['email_address']
-u"myname@email.com"
-
-#+END_SRC
-
-** Health
-
-ES provides support for checking its health:
-
-: curl -XGET http://localhost:9200/_cluster/health?pretty=true
-
-#+BEGIN_SRC json
-
-
- {
- "cluster_name" : "asgard",
- "status" : "yellow",
- "timed_out" : false,
- "number_of_nodes" : 1,
- "number_of_data_nodes" : 1,
- "active_primary_shards" : 5,
- "active_shards" : 5,
- "relocating_shards" : 0,
- "initializing_shards" : 0,
- "unassigned_shards" : 5
- }
-
-#+END_SRC
-
-Yellow means just one instance is running (no worries).
-
-To get full cluster info
-
-: curl -XGET "localhost:9200/_cluster/stats?human&pretty"
-
-#+BEGIN_SRC json
-{
- "_nodes" : {
- "total" : 1,
- "successful" : 1,
- "failed" : 0
- },
- "cluster_name" : "elasticsearch",
- "timestamp" : 1529050366452,
- "status" : "yellow",
- "indices" : {
- "count" : 3,
- "shards" : {
- "total" : 15,
- "primaries" : 15,
- "replication" : 0.0,
- "index" : {
- "shards" : {
- "min" : 5,
- "max" : 5,
- "avg" : 5.0
- },
- "primaries" : {
- "min" : 5,
- "max" : 5,
- "avg" : 5.0
- },
- "replication" : {
- "min" : 0.0,
- "max" : 0.0,
- "avg" : 0.0
- }
- }
- },
- "docs" : {
- "count" : 14579,
- "deleted" : 0
- },
- "store" : {
- "size" : "44.7mb",
- "size_in_bytes" : 46892794
- },
- "fielddata" : {
- "memory_size" : "0b",
- "memory_size_in_bytes" : 0,
- "evictions" : 0
- },
- "query_cache" : {
- "memory_size" : "0b",
- "memory_size_in_bytes" : 0,
- "total_count" : 0,
- "hit_count" : 0,
- "miss_count" : 0,
- "cache_size" : 0,
- "cache_count" : 0,
- "evictions" : 0
- },
- "completion" : {
- "size" : "0b",
- "size_in_bytes" : 0
- },
- "segments" : {
- "count" : 24,
- "memory" : "157.3kb",
- "memory_in_bytes" : 161112,
- "terms_memory" : "122.6kb",
- "terms_memory_in_bytes" : 125569,
- "stored_fields_memory" : "15.3kb",
- "stored_fields_memory_in_bytes" : 15728,
- "term_vectors_memory" : "0b",
- "term_vectors_memory_in_bytes" : 0,
- "norms_memory" : "10.8kb",
- "norms_memory_in_bytes" : 11136,
- "points_memory" : "111b",
- "points_memory_in_bytes" : 111,
- "doc_values_memory" : "8.3kb",
- "doc_values_memory_in_bytes" : 8568,
- "index_writer_memory" : "0b",
- "index_writer_memory_in_bytes" : 0,
- "version_map_memory" : "0b",
- "version_map_memory_in_bytes" : 0,
- "fixed_bit_set" : "0b",
- "fixed_bit_set_memory_in_bytes" : 0,
- "max_unsafe_auto_id_timestamp" : -1,
- "file_sizes" : { }
- }
- },
- "nodes" : {
- "count" : {
- "total" : 1,
- "data" : 1,
- "coordinating_only" : 0,
- "master" : 1,
- "ingest" : 1
- },
- "versions" : [
- "6.2.1"
- ],
- "os" : {
- "available_processors" : 16,
- "allocated_processors" : 16,
- "names" : [
- {
- "name" : "Linux",
- "count" : 1
- }
- ],
- "mem" : {
- "total" : "125.9gb",
- "total_in_bytes" : 135189286912,
- "free" : "48.3gb",
- "free_in_bytes" : 51922628608,
- "used" : "77.5gb",
- "used_in_bytes" : 83266658304,
- "free_percent" : 38,
- "used_percent" : 62
- }
- },
- "process" : {
- "cpu" : {
- "percent" : 0
- },
- "open_file_descriptors" : {
- "min" : 415,
- "max" : 415,
- "avg" : 415
- }
- },
- "jvm" : {
- "max_uptime" : "1.9d",
- "max_uptime_in_millis" : 165800616,
- "versions" : [
- {
- "version" : "9.0.4",
- "vm_name" : "OpenJDK 64-Bit Server VM",
- "vm_version" : "9.0.4+11",
- "vm_vendor" : "Oracle Corporation",
- "count" : 1
- }
- ],
- "mem" : {
- "heap_used" : "1.1gb",
- "heap_used_in_bytes" : 1214872032,
- "heap_max" : "23.8gb",
- "heap_max_in_bytes" : 25656426496
- },
- "threads" : 110
- },
- "fs" : {
- "total" : "786.4gb",
- "total_in_bytes" : 844400918528,
- "free" : "246.5gb",
- "free_in_bytes" : 264688160768,
- "available" : "206.5gb",
- "available_in_bytes" : 221771468800
- },
- "plugins" : [ ],
- "network_types" : {
- "transport_types" : {
- "netty4" : 1
- },
- "http_types" : {
- "netty4" : 1
- }
- }
- }
-}
-#+BEGIN_SRC json
diff --git a/doc/heatmap-generation.org b/doc/heatmap-generation.org
new file mode 100644
index 00000000..a697c70b
--- /dev/null
+++ b/doc/heatmap-generation.org
@@ -0,0 +1,34 @@
+#+STARTUP: inlineimages
+#+TITLE: Heatmap Generation
+#+AUTHOR: Muriithi Frederick Muriuki
+
+* Generating Heatmaps
+
+Like many other features, heatmap generation requires an existing collection. If none exists, see [[][Creating a new collection]] for how to create one.
+
+Once you have a collection, you can navigate to the collections page by clicking on the "Collections" link in the header:
+
+
+[[./images/gn2_header_collections.png]]
+
+From that page, pick the collection that you want to work with by clicking on its name in the collections table.
+
+That takes you to that collection's page, where you can select the data that you want to use to generate the heatmap.
+
+** Selecting Orientation
+
+Once you have selected the data, select the orientation of the heatmap you want generated. You do this by selecting either *"vertical"* or *"horizontal"* in the heatmaps form:
+
+[[./images/heatmap_form.png]]
+
+Once you have selected the orientation, click on the "Generate Heatmap" button as in the image above.
+
+The heatmap generation might take a while, but once it is done, an image shows up above the data table.
+
+** Downloading the PNG copy of the Heatmap
+
+Once the heatmap image is shown, hovering over it displays some tools for interacting with the image.
+
+To download, hover over the heatmap image, and click on the "Download plot as png" icon as shown.
+
+[[./images/heatmap_with_hover_tools.png]]
diff --git a/doc/images/gn2_header_collections.png b/doc/images/gn2_header_collections.png
new file mode 100644
index 00000000..ac23f9c1
--- /dev/null
+++ b/doc/images/gn2_header_collections.png
Binary files differ
diff --git a/doc/images/heatmap_form.png b/doc/images/heatmap_form.png
new file mode 100644
index 00000000..163fbb60
--- /dev/null
+++ b/doc/images/heatmap_form.png
Binary files differ
diff --git a/doc/images/heatmap_with_hover_tools.png b/doc/images/heatmap_with_hover_tools.png
new file mode 100644
index 00000000..4ab79f99
--- /dev/null
+++ b/doc/images/heatmap_with_hover_tools.png
Binary files differ
diff --git a/doc/joss/2016/2020.12.23.424047v1.full.pdf b/doc/joss/2016/2020.12.23.424047v1.full.pdf
new file mode 100644
index 00000000..491dddf3
--- /dev/null
+++ b/doc/joss/2016/2020.12.23.424047v1.full.pdf
Binary files differ