From ba98ef026544d4437e65a7bd248ff9591296b48e Mon Sep 17 00:00:00 2001 From: Frederick Muriuki Muriithi Date: Wed, 13 Oct 2021 09:37:19 +0300 Subject: Add some documentation for generating heatmaps Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/non-clustered-heatmaps-and-flipping.gmi * Add some documentation on generating the heatmaps, that would be useful for the end user. --- doc/heatmap-generation.org | 34 ++++++++++++++++++++++++++++++++++ doc/images/gn2_header_collections.png | Bin 0 -> 7890 bytes doc/images/heatmap_form.png | Bin 0 -> 9363 bytes doc/images/heatmap_with_hover_tools.png | Bin 0 -> 42578 bytes 4 files changed, 34 insertions(+) create mode 100644 doc/heatmap-generation.org create mode 100644 doc/images/gn2_header_collections.png create mode 100644 doc/images/heatmap_form.png create mode 100644 doc/images/heatmap_with_hover_tools.png (limited to 'doc') diff --git a/doc/heatmap-generation.org b/doc/heatmap-generation.org new file mode 100644 index 00000000..a697c70b --- /dev/null +++ b/doc/heatmap-generation.org @@ -0,0 +1,34 @@ +#+STARTUP: inlineimages +#+TITLE: Heatmap Generation +#+AUTHOR: Muriithi Frederick Muriuki + +* Generating Heatmaps + +Like many other features, heatmap generation requires an existing collection. If you do not have one yet, see [[Creating a new collection]] for how to create one. + +Once you have a collection, you can navigate to the collections page by clicking on the "Collections" link in the header. + + +[[./images/gn2_header_collections.png]] + +From that page, pick the collection that you want to work with by clicking on its name in the collections table. + +That takes you to that collection's page, where you can select the data that you want to use to generate the heatmap. + +** Selecting Orientation + +Once you have selected the data, select the orientation of the heatmap you want generated. You do this by selecting either *"vertical"* or *"horizontal"* in the heatmap form: + +[[./images/heatmap_form.png]] + +Once you have selected the orientation, click on the "Generate Heatmap" button as in the image above. + +The heatmap generation might take a while, but once it is done, an image shows up above the data table. + +** Downloading a PNG Copy of the Heatmap + +Once the heatmap image is shown, hovering over it displays some tools to interact with the image. + +To download it, hover over the heatmap image and click on the "Download plot as png" icon, as shown below.
+ +[[./images/heatmap_with_hover_tools.png]] diff --git a/doc/images/gn2_header_collections.png b/doc/images/gn2_header_collections.png new file mode 100644 index 00000000..ac23f9c1 Binary files /dev/null and b/doc/images/gn2_header_collections.png differ diff --git a/doc/images/heatmap_form.png b/doc/images/heatmap_form.png new file mode 100644 index 00000000..163fbb60 Binary files /dev/null and b/doc/images/heatmap_form.png differ diff --git a/doc/images/heatmap_with_hover_tools.png b/doc/images/heatmap_with_hover_tools.png new file mode 100644 index 00000000..4ab79f99 Binary files /dev/null and b/doc/images/heatmap_with_hover_tools.png differ -- cgit 1.4.1 From 6151faa9ea67af4bf4ea95fb681a9dc4319474b6 Mon Sep 17 00:00:00 2001 From: Arthur Centeno Date: Mon, 25 Oct 2021 20:51:16 +0000 Subject: Updated version of tutorials by ACenteno on 10-25-21 --- doc/joss/2016/2020.12.23.424047v1.full.pdf | Bin 0 -> 3804818 bytes wqflask/wqflask/templates/tutorials.html | 256 +++++++++++++++++++++++++++-- 2 files changed, 245 insertions(+), 11 deletions(-) create mode 100644 doc/joss/2016/2020.12.23.424047v1.full.pdf (limited to 'doc') diff --git a/doc/joss/2016/2020.12.23.424047v1.full.pdf b/doc/joss/2016/2020.12.23.424047v1.full.pdf new file mode 100644 index 00000000..491dddf3 Binary files /dev/null and b/doc/joss/2016/2020.12.23.424047v1.full.pdf differ diff --git a/wqflask/wqflask/templates/tutorials.html b/wqflask/wqflask/templates/tutorials.html index 3e6ef01c..dbda8d6f 100644 --- a/wqflask/wqflask/templates/tutorials.html +++ b/wqflask/wqflask/templates/tutorials.html @@ -2,16 +2,250 @@ {% block title %}Tutorials/Primers{% endblock %} {% block content %} - - - -
-

Tutorials/Primers

- -

-
+ + GeneNetwork Webinar Series, Tutorials and Short Video Tours + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Title/DescriptionPresentation

Webinar #01 - Introduction to Quantitative Trait Loci (QTL) Analysis

+

Friday, May 8th, 2020
+ 10am PDT/ 11am MDT/ 12pm CDT/ 1pm EDT

+

Goals of this webinar (trait variance to QTL):

+
    +
  • Define quantitative trait locus (QTL)
  • +
  • Explain how genome scans can help find QTL
  • +
+

Presented by:
+Dr. Saunak Sen
+Professor and Chief of Biostatistics
+Department of Preventive Medicine
+University of Tennessee Health Science Center +

+

Link to course material +

+

Webinar #02 - Mapping Addiction and Behavioral Traits and Getting at Causal Gene Variants with GeneNetwork

+

Friday, May 22nd, 2020
 + 10am PDT/ 11am MDT/ 12pm CDT/ 1pm EDT
 +

+

Goals of this webinar (QTL to gene variant):

+
    +
  • Demonstrate mapping a quantitative trait using GeneNetwork (GN)
  • +
  • Explore GN tools to identify genes and genetic variants related to a QTL
  • +
+

Presented by:
+Dr. Rob Williams
+Professor and Chair
+Department of Genetics, Genomics, and Informatics
+University of Tennessee Health Science Center +

Link to course material +

Data structure, disease risk, GXE, and causal modeling

+

Friday, November 20th at 9am PDT/ 11am CDT/ 12pm EDT
+ 1-hour presentation followed by 30 minutes of discussion
+ +

Human disease is mainly due to complex interactions between genetic and environmental factors (GXE). We need to acquire the right "smart" data types—coherent and multiplicative data—required to make accurate predictions about risk and outcome for n = 1 individuals—a daunting task. We have developed large families of fully sequenced mice that mirror the genetic complexity of humans. We are using these Reference Populations to generate multiplicatively useful data and to build and test causal quantitative models of disease mechanisms with a special focus on diseases of aging, addiction, and neurological and psychiatric disease. + +

Speaker Bio: Robert (Rob) W. Williams received a BA in neuroscience from UC Santa Cruz (1975) and a Ph.D. in system physiology at UC Davis with Leo M. Chalupa (1983). He did postdoctoral work in developmental neurobiology at Yale School of Medicine with Pasko Rakic where he developed novel stereological methods to estimate cell populations in brain. In 2013 Williams established the Department of Genetics, Genomics and Informatics at UTHSC. He holds the UT Oak Ridge National Laboratory Governor’s Chair in Computational Genomics. Williams is director of the Complex Trait Community (www.complextrait.org) and editor-in-chief of Frontiers in Neurogenomics. One of Williams’ more notable contributions is in the field of systems neurogenetics and experimental precision medicine. He and his research collaborators have built GeneNetwork (www.genenetwork.org), an online resource of data and analysis code that is used as a platform for experimental precision medicine.

+ +

Presented by:
+Dr. Rob Williams
+Professor and Chair
+Department of Genetics, Genomics, and Informatics
+University of Tennessee Health Science Center +

+ + +
+
+ + +
+
+ + + + + + + + + + + + + + + + + + + + + + + +
Title/DescriptionPresentation

#01 - Introduction to GeneNetwork

+

Please note that this tutorial is based on GeneNetwork v1 + +

GeneNetwork is a group of linked data sets and tools used to study complex networks of genes, molecules, and higher order gene function and phenotypes. GeneNetwork combines more than 25 years of legacy data generated by hundreds of scientists together with sequence data (SNPs) and massive transcriptome data sets (expression genetic or eQTL data sets). The quantitative trait locus (QTL) mapping module that is built into GN is optimized for fast on-line analysis of traits that are controlled by combinations of gene variants and environmental factors. GeneNetwork can be used to study humans, mice (BXD, AXB, LXS, etc.), rats (HXB), Drosophila, and plant species (barley and Arabidopsis). Most of these population data sets are linked with dense genetic maps (genotypes) that can be used to locate the genetic modifiers that cause differences in expression and phenotypes, including disease susceptibility. + +

Users are welcome to enter their own private data directly into GeneNetwork to exploit the full range of analytic tools and to map modulators in a powerful environment. This combination of data and fast analytic functions enables users to study relations between sequence variants, molecular networks, and function.

+ +

Presented by:
+Dr. Rob Williams
+Professor and Chair
+Department of Genetics, Genomics, and Informatics
+University of Tennessee Health Science Center +

+ + +
+

#02 - How to search in GeneNetwork

+
+
+ + +
+
+ + + + + + + + + + + + + + + + +
TitleSpeakerVideo link
Diallel Crosses, Artificial Intelligence, and Mouse Models of Alzheimer’s DiseaseDavid G. Ashbrook
Assistant Professor
University of Tennessee Health Science Center
YouTube link
+ +
+ + +
+ +
+
+ + + +
+ + + + + + + + + + + + + + {% endblock %} + -- cgit 1.4.1 From 3181bd6261b09a5e9b027256057c21c49792bd32 Mon Sep 17 00:00:00 2001 From: jgart Date: Fri, 10 Sep 2021 00:29:28 -0400 Subject: Remove unnecessary git pull commands from installation instructions --- doc/README.org | 2 -- 1 file changed, 2 deletions(-) (limited to 'doc') diff --git a/doc/README.org b/doc/README.org index 1236016e..8839aefc 100644 --- a/doc/README.org +++ b/doc/README.org @@ -81,14 +81,12 @@ GeneNetwork2 with : source ~/opt/guix-pull/etc/profile : git clone https://git.genenetwork.org/guix-bioinformatics/guix-bioinformatics.git ~/guix-bioinformatics : cd ~/guix-bioinformatics -: git pull : env GUIX_PACKAGE_PATH=$HOME/guix-bioinformatics guix package -i genenetwork2 -p ~/opt/genenetwork2 you probably also need guix-past (the upstream channel for older packages): : git clone https://gitlab.inria.fr/guix-hpc/guix-past.git ~/guix-past : cd ~/guix-past -: git pull : env GUIX_PACKAGE_PATH=$HOME/guix-bioinformatics:$HOME/guix-past/modules ~/opt/guix-pull/bin/guix package -i genenetwork2 -p ~/opt/genenetwork2 ignore the warnings. Guix should install the software without trying -- cgit 1.4.1 From 03caa57ad209f3bdd135be9d6516b94261c9b8de Mon Sep 17 00:00:00 2001 From: BonfaceKilz Date: Thu, 28 Oct 2021 11:05:19 +0300 Subject: Remove all elasticsearch references in gn2 --- bin/genenetwork2 | 7 - doc/elasticsearch.org | 247 ------------------------------ test/requests/parametrized_test.py | 32 ---- test/requests/test-website.py | 1 - test/requests/test_forgot_password.py | 29 +--- test/requests/test_registration.py | 36 ++--- wqflask/maintenance/quantile_normalize.py | 18 --- wqflask/utility/elasticsearch_tools.py | 121 --------------- wqflask/utility/tools.py | 5 +- wqflask/wqflask/user_session.py | 1 - 10 files changed, 23 insertions(+), 474 deletions(-) delete mode 100644 doc/elasticsearch.org delete mode 100644 test/requests/parametrized_test.py delete mode 100644 wqflask/utility/elasticsearch_tools.py (limited to 'doc') diff --git a/bin/genenetwork2 b/bin/genenetwork2 index 2b94b2a2..5f714d2e 100755 --- a/bin/genenetwork2 +++ b/bin/genenetwork2 @@ -101,13 +101,6 @@ fi export GN2_SETTINGS=$settings # Python echo GN2_SETTINGS=$settings -# This is a temporary hack to inject ES - should have added python2-elasticsearch package to guix instead -# if [ -z $ELASTICSEARCH_PROFILE ]; then -# echo -e "WARNING: Elastic Search profile has not been set - use ELASTICSEARCH_PROFILE"; -# else -# PYTHONPATH="$PYTHONPATH${PYTHONPATH:+:}$ELASTICSEARCH_PROFILE/lib/python3.8/site-packages" -# fi - if [ -z $GN2_PROFILE ] ; then echo "WARNING: GN2_PROFILE has not been set - you need the environment, so I hope you know what you are doing!" export GN2_PROFILE=$(dirname $(dirname $(which genenetwork2))) diff --git a/doc/elasticsearch.org b/doc/elasticsearch.org deleted file mode 100644 index 864a8363..00000000 --- a/doc/elasticsearch.org +++ /dev/null @@ -1,247 +0,0 @@ -* Elasticsearch - -** Introduction - -GeneNetwork uses elasticsearch (ES) for all things considered -'state'. One example is user collections, another is user management. - -** Example - -To get the right environment, first you can get a python REPL with something like - -: env GN2_PROFILE=~/opt/gn-latest ./bin/genenetwork2 ../etc/default_settings.py -cli python - -(make sure to use the correct GN2_PROFILE!) 
- -Next try - -#+BEGIN_SRC python - -from elasticsearch import Elasticsearch, TransportError - -es = Elasticsearch([{ "host": 'localhost', "port": '9200' }]) - -# Dump all data - -es.search("*") - -# To fetch an E-mail record from the users index - -record = es.search( - index = 'users', doc_type = 'local', body = { - "query": { "match": { "email_address": "myname@email.com" } } - }) - -# It is also possible to do wild card matching - -q = { "query": { "wildcard" : { "full_name" : "pjot*" } }} -es.search(index = 'users', doc_type = 'local', body = q) - -# To get elements from that record: - -record['hits']['hits'][0][u'_source']['full_name'] -u'Pjotr' - -record['hits']['hits'][0][u'_source']['email_address'] -u"myname@email.com" - -#+END_SRC - -** Health - -ES provides support for checking its health: - -: curl -XGET http://localhost:9200/_cluster/health?pretty=true - -#+BEGIN_SRC json - - - { - "cluster_name" : "asgard", - "status" : "yellow", - "timed_out" : false, - "number_of_nodes" : 1, - "number_of_data_nodes" : 1, - "active_primary_shards" : 5, - "active_shards" : 5, - "relocating_shards" : 0, - "initializing_shards" : 0, - "unassigned_shards" : 5 - } - -#+END_SRC - -Yellow means just one instance is running (no worries). - -To get full cluster info - -: curl -XGET "localhost:9200/_cluster/stats?human&pretty" - -#+BEGIN_SRC json -{ - "_nodes" : { - "total" : 1, - "successful" : 1, - "failed" : 0 - }, - "cluster_name" : "elasticsearch", - "timestamp" : 1529050366452, - "status" : "yellow", - "indices" : { - "count" : 3, - "shards" : { - "total" : 15, - "primaries" : 15, - "replication" : 0.0, - "index" : { - "shards" : { - "min" : 5, - "max" : 5, - "avg" : 5.0 - }, - "primaries" : { - "min" : 5, - "max" : 5, - "avg" : 5.0 - }, - "replication" : { - "min" : 0.0, - "max" : 0.0, - "avg" : 0.0 - } - } - }, - "docs" : { - "count" : 14579, - "deleted" : 0 - }, - "store" : { - "size" : "44.7mb", - "size_in_bytes" : 46892794 - }, - "fielddata" : { - "memory_size" : "0b", - "memory_size_in_bytes" : 0, - "evictions" : 0 - }, - "query_cache" : { - "memory_size" : "0b", - "memory_size_in_bytes" : 0, - "total_count" : 0, - "hit_count" : 0, - "miss_count" : 0, - "cache_size" : 0, - "cache_count" : 0, - "evictions" : 0 - }, - "completion" : { - "size" : "0b", - "size_in_bytes" : 0 - }, - "segments" : { - "count" : 24, - "memory" : "157.3kb", - "memory_in_bytes" : 161112, - "terms_memory" : "122.6kb", - "terms_memory_in_bytes" : 125569, - "stored_fields_memory" : "15.3kb", - "stored_fields_memory_in_bytes" : 15728, - "term_vectors_memory" : "0b", - "term_vectors_memory_in_bytes" : 0, - "norms_memory" : "10.8kb", - "norms_memory_in_bytes" : 11136, - "points_memory" : "111b", - "points_memory_in_bytes" : 111, - "doc_values_memory" : "8.3kb", - "doc_values_memory_in_bytes" : 8568, - "index_writer_memory" : "0b", - "index_writer_memory_in_bytes" : 0, - "version_map_memory" : "0b", - "version_map_memory_in_bytes" : 0, - "fixed_bit_set" : "0b", - "fixed_bit_set_memory_in_bytes" : 0, - "max_unsafe_auto_id_timestamp" : -1, - "file_sizes" : { } - } - }, - "nodes" : { - "count" : { - "total" : 1, - "data" : 1, - "coordinating_only" : 0, - "master" : 1, - "ingest" : 1 - }, - "versions" : [ - "6.2.1" - ], - "os" : { - "available_processors" : 16, - "allocated_processors" : 16, - "names" : [ - { - "name" : "Linux", - "count" : 1 - } - ], - "mem" : { - "total" : "125.9gb", - "total_in_bytes" : 135189286912, - "free" : "48.3gb", - "free_in_bytes" : 51922628608, - "used" : "77.5gb", - "used_in_bytes" : 83266658304, 
- "free_percent" : 38, - "used_percent" : 62 - } - }, - "process" : { - "cpu" : { - "percent" : 0 - }, - "open_file_descriptors" : { - "min" : 415, - "max" : 415, - "avg" : 415 - } - }, - "jvm" : { - "max_uptime" : "1.9d", - "max_uptime_in_millis" : 165800616, - "versions" : [ - { - "version" : "9.0.4", - "vm_name" : "OpenJDK 64-Bit Server VM", - "vm_version" : "9.0.4+11", - "vm_vendor" : "Oracle Corporation", - "count" : 1 - } - ], - "mem" : { - "heap_used" : "1.1gb", - "heap_used_in_bytes" : 1214872032, - "heap_max" : "23.8gb", - "heap_max_in_bytes" : 25656426496 - }, - "threads" : 110 - }, - "fs" : { - "total" : "786.4gb", - "total_in_bytes" : 844400918528, - "free" : "246.5gb", - "free_in_bytes" : 264688160768, - "available" : "206.5gb", - "available_in_bytes" : 221771468800 - }, - "plugins" : [ ], - "network_types" : { - "transport_types" : { - "netty4" : 1 - }, - "http_types" : { - "netty4" : 1 - } - } - } -} -#+BEGIN_SRC json diff --git a/test/requests/parametrized_test.py b/test/requests/parametrized_test.py deleted file mode 100644 index 50003850..00000000 --- a/test/requests/parametrized_test.py +++ /dev/null @@ -1,32 +0,0 @@ -import logging -import unittest -from wqflask import app -from utility.elasticsearch_tools import get_elasticsearch_connection, get_user_by_unique_column -from elasticsearch import Elasticsearch, TransportError - -class ParametrizedTest(unittest.TestCase): - - def __init__(self, methodName='runTest', gn2_url="http://localhost:5003", es_url="localhost:9200"): - super(ParametrizedTest, self).__init__(methodName=methodName) - self.gn2_url = gn2_url - self.es_url = es_url - - def setUp(self): - self.es = get_elasticsearch_connection() - self.es_cleanup = [] - - es_logger = logging.getLogger("elasticsearch") - es_logger.setLevel(app.config.get("LOG_LEVEL")) - es_logger.addHandler( - logging.FileHandler("/tmp/es_TestRegistrationInfo.log")) - es_trace_logger = logging.getLogger("elasticsearch.trace") - es_trace_logger.addHandler( - logging.FileHandler("/tmp/es_TestRegistrationTrace.log")) - - def tearDown(self): - from time import sleep - self.es.delete_by_query( - index="users" - , doc_type="local" - , body={"query":{"match":{"email_address":"test@user.com"}}}) - sleep(1) diff --git a/test/requests/test-website.py b/test/requests/test-website.py index 8bfb47c2..d619a7d5 100755 --- a/test/requests/test-website.py +++ b/test/requests/test-website.py @@ -43,7 +43,6 @@ def dummy(args_obj, parser): def integration_tests(args_obj, parser): gn2_url = args_obj.host - es_url = app.config.get("ELASTICSEARCH_HOST")+":"+str(app.config.get("ELASTICSEARCH_PORT")) run_integration_tests(gn2_url, es_url) def initTest(klass, gn2_url, es_url): diff --git a/test/requests/test_forgot_password.py b/test/requests/test_forgot_password.py index 346524bc..65b061f8 100644 --- a/test/requests/test_forgot_password.py +++ b/test/requests/test_forgot_password.py @@ -1,25 +1,22 @@ import requests -from utility.elasticsearch_tools import get_user_by_unique_column from parameterized import parameterized from parametrized_test import ParametrizedTest passwork_reset_link = '' forgot_password_page = None -class TestForgotPassword(ParametrizedTest): +class TestForgotPassword(ParametrizedTest): def setUp(self): super(TestForgotPassword, self).setUp() self.forgot_password_url = self.gn2_url+"/n/forgot_password_submit" + def send_email(to_addr, msg, fromaddr="no-reply@genenetwork.org"): print("CALLING: send_email_mock()") email_data = { - "to_addr": to_addr - , "msg": msg - , "fromaddr": from_addr} + 
"to_addr": to_addr, "msg": msg, "fromaddr": from_addr} data = { - "es_connection": self.es, "email_address": "test@user.com", "full_name": "Test User", "organization": "Test Organisation", @@ -27,24 +24,12 @@ class TestForgotPassword(ParametrizedTest): "password_confirm": "test_password" } - def testWithoutEmail(self): data = {"email_address": ""} - error_notification = '
You MUST provide an email
' + error_notification = ('
' + 'You MUST provide an email
') result = requests.post(self.forgot_password_url, data=data) self.assertEqual(result.url, self.gn2_url+"/n/forgot_password") self.assertTrue( - result.content.find(error_notification) >= 0 - , "Error message should be displayed but was not") - - def testWithNonExistingEmail(self): - # Monkey patching doesn't work, so simply test that getting by email - # returns the correct data - user = get_user_by_unique_column(self.es, "email_address", "non-existent@domain.com") - self.assertTrue(user is None, "Should not find non-existent user") - - def testWithExistingEmail(self): - # Monkey patching doesn't work, so simply test that getting by email - # returns the correct data - user = get_user_by_unique_column(self.es, "email_address", "test@user.com") - self.assertTrue(user is not None, "Should find user") + result.content.find(error_notification) >= 0, + "Error message should be displayed but was not") diff --git a/test/requests/test_registration.py b/test/requests/test_registration.py index 0047e8a6..5d08bf58 100644 --- a/test/requests/test_registration.py +++ b/test/requests/test_registration.py @@ -1,31 +1,25 @@ import sys import requests -from parametrized_test import ParametrizedTest class TestRegistration(ParametrizedTest): - def tearDown(self): - for item in self.es_cleanup: - self.es.delete(index="users", doc_type="local", id=item["_id"]) def testRegistrationPage(self): - if self.es.ping(): - data = { - "email_address": "test@user.com", - "full_name": "Test User", - "organization": "Test Organisation", - "password": "test_password", - "password_confirm": "test_password" - } - requests.post(self.gn2_url+"/n/register", data) - response = self.es.search( - index="users" - , doc_type="local" - , body={ - "query": {"match": {"email_address": "test@user.com"}}}) - self.assertEqual(len(response["hits"]["hits"]), 1) - else: - self.skipTest("The elasticsearch server is down") + data = { + "email_address": "test@user.com", + "full_name": "Test User", + "organization": "Test Organisation", + "password": "test_password", + "password_confirm": "test_password" + } + requests.post(self.gn2_url+"/n/register", data) + response = self.es.search( + index="users" + , doc_type="local" + , body={ + "query": {"match": {"email_address": "test@user.com"}}}) + self.assertEqual(len(response["hits"]["hits"]), 1) + def main(gn2, es): import unittest diff --git a/wqflask/maintenance/quantile_normalize.py b/wqflask/maintenance/quantile_normalize.py index 0cc963e5..32780ca6 100644 --- a/wqflask/maintenance/quantile_normalize.py +++ b/wqflask/maintenance/quantile_normalize.py @@ -5,14 +5,10 @@ import urllib.parse import numpy as np import pandas as pd -from elasticsearch import Elasticsearch, TransportError -from elasticsearch.helpers import bulk from flask import Flask, g, request from wqflask import app -from utility.elasticsearch_tools import get_elasticsearch_connection -from utility.tools import ELASTICSEARCH_HOST, ELASTICSEARCH_PORT, SQL_URI def parse_db_uri(): @@ -106,20 +102,6 @@ if __name__ == '__main__': Conn = MySQLdb.Connect(**parse_db_uri()) Cursor = Conn.cursor() - # es = Elasticsearch([{ - # "host": ELASTICSEARCH_HOST, "port": ELASTICSEARCH_PORT - # }], timeout=60) if (ELASTICSEARCH_HOST and ELASTICSEARCH_PORT) else None - - es = get_elasticsearch_connection(for_user=False) - - #input_filename = "/home/zas1024/cfw_data/" + sys.argv[1] + ".txt" - #input_df = create_dataframe(input_filename) - #output_df = quantileNormalize(input_df) - - #output_df.to_csv('quant_norm.csv', sep='\t') - - #out_filename = 
sys.argv[1][:-4] + '_quantnorm.txt' - success, _ = bulk(es, set_data(sys.argv[1])) response = es.search( diff --git a/wqflask/utility/elasticsearch_tools.py b/wqflask/utility/elasticsearch_tools.py deleted file mode 100644 index eae3ba03..00000000 --- a/wqflask/utility/elasticsearch_tools.py +++ /dev/null @@ -1,121 +0,0 @@ -# Elasticsearch support -# -# Some helpful commands to view the database: -# -# You can test the server being up with -# -# curl -H 'Content-Type: application/json' http://localhost:9200 -# -# List all indices -# -# curl -H 'Content-Type: application/json' 'localhost:9200/_cat/indices?v' -# -# To see the users index 'table' -# -# curl http://localhost:9200/users -# -# To list all user ids -# -# curl -H 'Content-Type: application/json' http://localhost:9200/users/local/_search?pretty=true -d ' -# { -# "query" : { -# "match_all" : {} -# }, -# "stored_fields": [] -# }' -# -# To view a record -# -# curl -H 'Content-Type: application/json' http://localhost:9200/users/local/_search?pretty=true -d ' -# { -# "query" : { -# "match" : { "email_address": "pjotr2017@thebird.nl"} -# } -# }' -# -# -# To delete the users index and data (dangerous!) -# -# curl -XDELETE -H 'Content-Type: application/json' 'localhost:9200/users' - - -from elasticsearch import Elasticsearch, TransportError -import logging - -from utility.logger import getLogger -logger = getLogger(__name__) - -from utility.tools import ELASTICSEARCH_HOST, ELASTICSEARCH_PORT - - -def test_elasticsearch_connection(): - es = Elasticsearch(['http://' + ELASTICSEARCH_HOST + \ - ":" + str(ELASTICSEARCH_PORT) + '/'], verify_certs=True) - if not es.ping(): - logger.warning("Elasticsearch is DOWN") - - -def get_elasticsearch_connection(for_user=True): - """Return a connection to ES. Returns None on failure""" - logger.info("get_elasticsearch_connection") - es = None - try: - assert(ELASTICSEARCH_HOST) - assert(ELASTICSEARCH_PORT) - logger.info("ES HOST", ELASTICSEARCH_HOST) - - es = Elasticsearch([{ - "host": ELASTICSEARCH_HOST, "port": ELASTICSEARCH_PORT - }], timeout=30, retry_on_timeout=True) if (ELASTICSEARCH_HOST and ELASTICSEARCH_PORT) else None - - if for_user: - setup_users_index(es) - - es_logger = logging.getLogger("elasticsearch") - es_logger.setLevel(logging.INFO) - es_logger.addHandler(logging.NullHandler()) - except Exception as e: - logger.error("Failed to get elasticsearch connection", e) - es = None - - return es - - -def setup_users_index(es_connection): - if es_connection: - index_settings = { - "properties": { - "email_address": { - "type": "keyword"}}} - - es_connection.indices.create(index='users', ignore=400) - es_connection.indices.put_mapping( - body=index_settings, index="users", doc_type="local") - - -def get_user_by_unique_column(es, column_name, column_value, index="users", doc_type="local"): - return get_item_by_unique_column(es, column_name, column_value, index=index, doc_type=doc_type) - - -def save_user(es, user, user_id): - es_save_data(es, "users", "local", user, user_id) - - -def get_item_by_unique_column(es, column_name, column_value, index, doc_type): - item_details = None - try: - response = es.search( - index=index, doc_type=doc_type, body={ - "query": {"match": {column_name: column_value}} - }) - if len(response["hits"]["hits"]) > 0: - item_details = response["hits"]["hits"][0]["_source"] - except TransportError as te: - pass - return item_details - - -def es_save_data(es, index, doc_type, data_item, data_id,): - from time import sleep - es.create(index, doc_type, body=data_item, 
id=data_id) - sleep(1) # Delay 1 second to allow indexing diff --git a/wqflask/utility/tools.py b/wqflask/utility/tools.py index 0efe8ca9..f28961ec 100644 --- a/wqflask/utility/tools.py +++ b/wqflask/utility/tools.py @@ -287,6 +287,7 @@ JS_GN_PATH = get_setting('JS_GN_PATH') GITHUB_CLIENT_ID = get_setting('GITHUB_CLIENT_ID') GITHUB_CLIENT_SECRET = get_setting('GITHUB_CLIENT_SECRET') +GITHUB_AUTH_URL = "" if GITHUB_CLIENT_ID != 'UNKNOWN' and GITHUB_CLIENT_SECRET: GITHUB_AUTH_URL = "https://github.com/login/oauth/authorize?client_id=" + \ GITHUB_CLIENT_ID + "&client_secret=" + GITHUB_CLIENT_SECRET @@ -301,10 +302,6 @@ if ORCID_CLIENT_ID != 'UNKNOWN' and ORCID_CLIENT_SECRET: "&redirect_uri=" + GN2_BRANCH_URL + "n/login/orcid_oauth2" ORCID_TOKEN_URL = get_setting('ORCID_TOKEN_URL') -ELASTICSEARCH_HOST = get_setting('ELASTICSEARCH_HOST') -ELASTICSEARCH_PORT = get_setting('ELASTICSEARCH_PORT') -# import utility.elasticsearch_tools as es -# es.test_elasticsearch_connection() SMTP_CONNECT = get_setting('SMTP_CONNECT') SMTP_USERNAME = get_setting('SMTP_USERNAME') diff --git a/wqflask/wqflask/user_session.py b/wqflask/wqflask/user_session.py index 67e2e158..d3c4a62f 100644 --- a/wqflask/wqflask/user_session.py +++ b/wqflask/wqflask/user_session.py @@ -10,7 +10,6 @@ from flask import (Flask, g, render_template, url_for, request, make_response, from wqflask import app from utility import hmac -#from utility.elasticsearch_tools import get_elasticsearch_connection from utility.redis_tools import get_redis_conn, get_user_id, get_user_by_unique_column, set_user_attribute, get_user_collections, save_collections Redis = get_redis_conn() -- cgit 1.4.1 From 9c17856e0aa047408d91b0fe0bd6591d305cca42 Mon Sep 17 00:00:00 2001 From: Pjotr Prins Date: Wed, 17 Nov 2021 04:07:08 -0600 Subject: Moven GN API doc to gn-docs repo --- doc/API_readme.md | 168 +----------------------------------------------------- 1 file changed, 1 insertion(+), 167 deletions(-) (limited to 'doc') diff --git a/doc/API_readme.md b/doc/API_readme.md index be6668dc..17d10e44 100644 --- a/doc/API_readme.md +++ b/doc/API_readme.md @@ -1,169 +1,3 @@ # API Query Documentation # ---- -# Fetching Dataset/Trait info/data # ---- -## Fetch Species List ## -To get a list of species with data available in GN (and their associated names and ids): -``` -curl http://genenetwork.org/api/v_pre1/species -[ { "FullName": "Mus musculus", "Id": 1, "Name": "mouse", "TaxonomyId": 10090 }, ... { "FullName": "Populus trichocarpa", "Id": 10, "Name": "poplar", "TaxonomyId": 3689 } ] -``` - -Or to get a single species info: -``` -curl http://genenetwork.org/api/v_pre1/species/mouse -``` -OR -``` -curl http://genenetwork.org/api/v_pre1/species/mouse.json -``` - -*For all queries where the last field is a user-specified name/ID, there will be the option to append a file format type. Currently there is only JSON (and it will default to JSON if none is provided), but other formats will be added later* - -## Fetch Groups/RISets ## - -This query can optionally filter by species: - -``` -curl http://genenetwork.org/api/v_pre1/groups (for all species) -``` -OR -``` -curl http://genenetwork.org/api/v_pre1/groups/mouse (for just mouse groups/RISets) -[ { "DisplayName": "BXD", "FullName": "BXD RI Family", "GeneticType": "riset", "Id": 1, "MappingMethodId": "1", "Name": "BXD", "SpeciesId": 1, "public": 2 }, ... 
{ "DisplayName": "AIL LGSM F34 and F39-43 (GBS)", "FullName": "AIL LGSM F34 and F39-43 (GBS)", "GeneticType": "intercross", "Id": 72, "MappingMethodId": "2", "Name": "AIL-LGSM-F34-F39-43-GBS", "SpeciesId": 1, "public": 2 } ] -``` - -## Fetch Genotypes for Group/RISet ## -``` -curl http://genenetwork.org/api/v_pre1/genotypes/bimbam/BXD -curl http://genenetwork.org/api/v_pre1/genotypes/BXD.bimbam -``` -Returns a group's genotypes in one of several formats - bimbam, rqtl2, or geno (a format used by qtlreaper which is just a CSV file consisting of marker positions and genotypes) - -Rqtl2 genotype queries can also include the dataset name and will return a zip of the genotypes, phenotypes, and gene map (marker names/positions). For example: -``` -curl http://genenetwork.org/api/v_pre1/genotypes/rqtl2/BXD/HC_M2_0606_P.zip -``` - -## Fetch Datasets ## -``` -curl http://genenetwork.org/api/v_pre1/datasets/bxd -``` -OR -``` -curl http://genenetwork.org/api/v_pre1/datasets/mouse/bxd -[ { "AvgID": 1, "CreateTime": "Fri, 01 Aug 2003 00:00:00 GMT", "DataScale": "log2", "FullName": "UTHSC/ETHZ/EPFL BXD Liver Polar Metabolites Extraction A, CD Cohorts (Mar 2017) log2", "Id": 1, "Long_Abbreviation": "BXDMicroArray_ProbeSet_August03", "ProbeFreezeId": 3, "ShortName": "Brain U74Av2 08/03 MAS5", "Short_Abbreviation": "Br_U_0803_M", "confidentiality": 0, "public": 0 }, ... { "AvgID": 3, "CreateTime": "Tue, 14 Aug 2018 00:00:00 GMT", "DataScale": "log2", "FullName": "EPFL/LISP BXD CD Liver Affy Mouse Gene 1.0 ST (Aug18) RMA", "Id": 859, "Long_Abbreviation": "EPFLMouseLiverCDRMAApr18", "ProbeFreezeId": 181, "ShortName": "EPFL/LISP BXD CD Liver Affy Mouse Gene 1.0 ST (Aug18) RMA", "Short_Abbreviation": "EPFLMouseLiverCDRMA0818", "confidentiality": 0, "public": 1 } ] -``` -(I added the option to specify species just in case we end up with the same group name across multiple species at some point, though it's currently unnecessary) - -## Fetch Individual Dataset Info ## -### For mRNA Assay/"ProbeSet" ### - -``` -curl http://genenetwork.org/api/v_pre1/dataset/HC_M2_0606_P -``` -OR -``` -curl http://genenetwork.org/api/v_pre1/dataset/bxd/HC_M2_0606_P``` -{ "confidential": 0, "data_scale": "log2", "dataset_type": "mRNA expression", "full_name": "Hippocampus Consortium M430v2 (Jun06) PDNN", "id": 112, "name": "HC_M2_0606_P", "public": 2, "short_name": "Hippocampus M430v2 BXD 06/06 PDNN", "tissue": "Hippocampus mRNA", "tissue_id": 9 } -``` -(This also has the option to specify group/riset) - -### For "Phenotypes" (basically non-mRNA Expression; stuff like weight, sex, etc) ### -For these traits, the query fetches publication info and takes the group and phenotype 'ID' as input. 
For example: -``` -curl http://genenetwork.org/api/v_pre1/dataset/bxd/10001 -{ "dataset_type": "phenotype", "description": "Central nervous system, morphology: Cerebellum weight, whole, bilateral in adults of both sexes [mg]", "id": 10001, "name": "CBLWT2", "pubmed_id": 11438585, "title": "Genetic control of the mouse cerebellum: identification of quantitative trait loci modulating size and architecture", "year": "2001" } -``` - -## Fetch Sample Data for Dataset ## -``` -curl http://genenetwork.org/api/v_pre1/sample_data/HSNIH-PalmerPublish.csv -``` - -Returns a CSV file with sample/strain names as the columns and trait IDs as rows - -## Fetch Sample Data for Single Trait ## -``` -curl http://genenetwork.org/api/v_pre1/sample_data/HC_M2_0606_P/1436869_at -[ { "data_id": 23415463, "sample_name": "129S1/SvImJ", "sample_name_2": "129S1/SvImJ", "se": 0.123, "value": 8.201 }, { "data_id": 23415463, "sample_name": "A/J", "sample_name_2": "A/J", "se": 0.046, "value": 8.413 }, { "data_id": 23415463, "sample_name": "AKR/J", "sample_name_2": "AKR/J", "se": 0.134, "value": 8.856 }, ... ] -``` - -## Fetch Trait List for Dataset ## -``` -curl http://genenetwork.org/api/v_pre1/traits/HXBBXHPublish.json -[ { "Additive": 0.0499967532467532, "Id": 10001, "LRS": 16.2831307029479, "Locus": "rs106114574", "PhenotypeId": 1449, "PublicationId": 319, "Sequence": 1 }, ... ] -``` - -Both JSON and CSV formats can be specified, with JSON as default. There is also an optional "ids_only" and "names_only" parameter that will only return a list of trait IDs or names, respectively. - -## Fetch Trait Info (Name, Description, Location, etc) ## -### For mRNA Expression/"ProbeSet" ### -``` -curl http://genenetwork.org/api/v_pre1/trait/HC_M2_0606_P/1436869_at -{ "additive": -0.214087568058076, "alias": "HHG1; HLP3; HPE3; SMMCI; Dsh; Hhg1", "chr": "5", "description": "sonic hedgehog (hedgehog)", "id": 99602, "locus": "rs8253327", "lrs": 12.7711275309832, "mb": 28.457155, "mean": 9.27909090909091, "name": "1436869_at", "p_value": 0.306, "se": null, "symbol": "Shh" } -``` - -### For "Phenotypes" ### -For phenotypes this just gets the max LRS, its location, and additive effect (as calculated by qtlreaper) - -Since each group/riset only has one phenotype "dataset", this query takes either the group/riset name or the group/riset name + "Publish" (for example "BXDPublish", which is the dataset name in the DB) as input -``` -curl http://genenetwork.org/api/v_pre1/trait/BXD/10001 -{ "additive": 2.39444435069444, "id": 4, "locus": "rs48756159", "lrs": 13.4974911471087 } -``` - ---- - -# Analyses # ---- -## Mapping ## -Currently two mapping tools can be used - GEMMA and R/qtl. 
qtlreaper will be added later with Christian Fischer's RUST implementation - https://github.com/chfi/rust-qtlreaper - -Each method's query takes the following parameters respectively (more will be added): -### GEMMA ### -* trait_id (*required*) - ID for trait being mapped -* db (*required*) - DB name for trait above (Short_Abbreviation listed when you query for datasets) -* use_loco - Whether to use LOCO (leave one chromosome out) method (default = false) -* maf - minor allele frequency (default = 0.01) - -Example query: -``` -curl http://genenetwork.org/api/v_pre1/mapping?trait_id=10015&db=BXDPublish&method=gemma&use_loco=true -``` - -### R/qtl ### -(See the R/qtl guide for information on some of these options - http://www.rqtl.org/manual/qtl-manual.pdf) -* trait_id (*required*) - ID for trait being mapped -* db (*required*) - DB name for trait above (Short_Abbreviation listed when you query for datasets) -* rqtl_method - hk (default) | ehk | em | imp | mr | mr-imp | mr-argmax ; Corresponds to the "method" option for the R/qtl scanone function. -* rqtl_model - normal (default) | binary | 2-part | np ; corresponds to the "model" option for the R/qtl scanone function -* num_perm - number of permutations; 0 by default -* control_marker - Name of marker to use as control; this relies on the user knowing the name of the marker they want to use as a covariate -* interval_mapping - Whether to use interval mapping; "false" by default -* pair_scan - *NYI* - -Example query: -``` -curl http://genenetwork.org/api/v_pre1/mapping?trait_id=1418701_at&db=HC_M2_0606_P&method=rqtl&num_perm=100 -``` - -Some combinations of methods/models may not make sense. The R/qtl manual should be referred to for any questions on its use (specifically the scanone function in this case) - -## Calculate Correlation ## -Currently only Sample and Tissue correlations are implemented - -This query currently takes the following parameters (though more will be added): -* trait_id (*required*) - ID for trait used for correlation -* db (*required*) - DB name for the trait above (this is the Short_Abbreviation listed when you query for datasets) -* target_db (*required*) - Target DB name to be correlated against -* type - sample (default) | tissue -* method - pearson (default) | spearman -* return - Number of results to return (default = 500) - -Example query: -``` -curl http://genenetwork.org/api/v_pre1/correlation?trait_id=1427571_at&db=HC_M2_0606_P&target_db=BXDPublish&type=sample&return_count=100 -[ { "#_strains": 6, "p_value": 0.004804664723032055, "sample_r": -0.942857142857143, "trait": 20511 }, { "#_strains": 6, "p_value": 0.004804664723032055, "sample_r": -0.942857142857143, "trait": 20724 }, { "#_strains": 12, "p_value": 1.8288943424888848e-05, "sample_r": -0.9233615170820528, "trait": 13536 }, { "#_strains": 7, "p_value": 0.006807187408935392, "sample_r": 0.8928571428571429, "trait": 10157 }, { "#_strains": 7, "p_value": 0.006807187408935392, "sample_r": -0.8928571428571429, "trait": 20392 }, ... ] -``` +This document has moved to [gn-docs](https://github.com/genenetwork/gn-docs/blob/master/api/GN2-REST-API.md)! 
-- cgit 1.4.1 From 83719648e505747f16281d7e8e14f1003297be5c Mon Sep 17 00:00:00 2001 From: BonfaceKilz Date: Wed, 17 Nov 2021 13:42:26 +0300 Subject: README.org: Replace broken link with environments page --- doc/README.org | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'doc') diff --git a/doc/README.org b/doc/README.org index 8839aefc..e1c6b614 100644 --- a/doc/README.org +++ b/doc/README.org @@ -26,7 +26,7 @@ * Introduction -Large system deployments can get very [[http://biogems.info/contrib/genenetwork/gn2.svg ][complex]]. In this document we +Large system deployments can get very [[http://genenetwork.org/environments/][complex]]. In this document we explain the GeneNetwork version 2 (GN2) reproducible deployment system which is based on GNU Guix (see also [[https://github.com/pjotrp/guix-notes/blob/master/README.md][Guix-notes]]). The Guix system can be used to install GN with all its files and dependencies. -- cgit 1.4.1
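The API_readme.md retired in one of the patches above now lives on in gn-docs, but the curl examples it quoted remain a handy reference. The sketch below re-expresses a few of those queries in Python with the requests library (the same library the test suite above uses); it is an illustration only, not an official client. The /api/v_pre1/... paths and JSON field names are copied verbatim from the removed file, so check the gn-docs copy for the current API before relying on them.

```python
# Rough sketch of the GN REST queries quoted in the removed API_readme.md.
# The endpoints and field names below are copied from that file and may have
# changed since; consult the gn-docs API documentation before relying on them.
import requests

BASE_URL = "http://genenetwork.org/api/v_pre1"


def get_json(path):
    """GET one API path and return the decoded JSON payload."""
    response = requests.get(f"{BASE_URL}/{path}", timeout=60)
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    # Species with data in GeneNetwork (equivalent of `curl .../species`).
    species = get_json("species")
    print([s["Name"] for s in species])

    # Info for one mRNA assay dataset (`curl .../dataset/bxd/HC_M2_0606_P`).
    print(get_json("dataset/bxd/HC_M2_0606_P"))

    # Sample data for a single trait (`curl .../sample_data/HC_M2_0606_P/1436869_at`);
    # the API returns a JSON array, so show only the first few entries.
    print(get_json("sample_data/HC_M2_0606_P/1436869_at")[:3])
```

If the v_pre1 prefix has been renamed since these patches were made, only BASE_URL should need to change.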