Diffstat (limited to 'issues/genenetwork3')
-rw-r--r-- issues/genenetwork3/01828928-26e6-4cad-bbc8-59fd7a7977de.json.zip | bin 0 -> 143152 bytes
-rw-r--r-- issues/genenetwork3/broken-aliases.gmi | 188
-rw-r--r-- issues/genenetwork3/check-for-mandatory-settings.gmi | 40
-rw-r--r-- issues/genenetwork3/ctl-maps-error.gmi | 46
-rw-r--r-- issues/genenetwork3/genenetwork3_configuration.gmi | 19
-rw-r--r-- issues/genenetwork3/generate-heatmaps-failing.gmi | 64
-rw-r--r-- issues/genenetwork3/rqtl2-mapping-error.gmi | 46
7 files changed, 403 insertions, 0 deletions
diff --git a/issues/genenetwork3/01828928-26e6-4cad-bbc8-59fd7a7977de.json.zip b/issues/genenetwork3/01828928-26e6-4cad-bbc8-59fd7a7977de.json.zip
new file mode 100644
index 0000000..7681b88
--- /dev/null
+++ b/issues/genenetwork3/01828928-26e6-4cad-bbc8-59fd7a7977de.json.zip
Binary files differ
diff --git a/issues/genenetwork3/broken-aliases.gmi b/issues/genenetwork3/broken-aliases.gmi
new file mode 100644
index 0000000..2bfbdae
--- /dev/null
+++ b/issues/genenetwork3/broken-aliases.gmi
@@ -0,0 +1,188 @@
+# Broken Aliases
+
+## Tags
+
+* type: bug
+* status: open
+* priority: high
+* assigned: pjotrp
+* interested: pjotrp
+* keywords: aliases, aliases server
+
+## Tasks
+
+* [X] Rewrite server in gn-guile
+* [X] Fix menu search
+* [X] Fix global search aliases
+* [ ] Deploy and test aliases in GN2
+
+## Repository
+
+=> https://github.com/genenetwork/gn3
+
+moved to
+
+gn-guile repo.
+
+## Bug Report
+
+### Actual
+
+* Go to https://genenetwork.org/gn3/gene/aliases2/Shh,Brca2
+* Note that an exception is raised, with a "404 Not Found" message
+
+### Expected
+
+* We expected a list of aliases to be returned for the given symbols as is done in https://fallback.genenetwork.org/gn3/gene/aliases2/Shh,Brca2
+
+## Resolution
+
+Actually the server is up, but it is not part of the main deployment because it is written in Racket, and we don't have much Racket support in Guix. I wrote the code in the days after my bike accident:
+
+=> https://github.com/genenetwork/gn3/blob/master/gn3/web/wikidata.rkt
+
+and it is probably easiest to move it to gn-guile. Guile is another Scheme after all ;). It is only fitting that I spent days in hospital again recently (for a different reason). gn-guile already has its own web server and provides a REST API for our markdown editor, for example. On tux04 it responds with
+
+```
+curl http://127.0.0.1:8091/version
+"4.0.0"
+```
+
+What we want is to add the aliases server that should respond to
+
+```
+curl http://localhost:8000/gene/aliases/Shh # direct on tux01
+["9530036O11Rik","Dsh","Hhg1","Hx","Hxl3","M100081","ShhNC","ShhNC"]
+curl https://genenetwork.org/gn3/gene/aliases2/Shh,Brca2
+[["Shh",["9530036O11Rik","Dsh","Hhg1","Hx","Hxl3","M100081","ShhNC","ShhNC"]],["Brca2",["Fancd1","RAB163"]]]
+```
+
+Note this is used by search functionality in GN, as well as the gene aliases list on the mapping page. In principle we cache it for the duration of the running server so as not to overload wikidata. No one uses aliases2, that I can tell, so we only implement the first 'aliases'.
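The caching idea above can be sketched as follows. This is a minimal Python illustration, not the gn-guile implementation (which is written in Guile): `make_cached_alias_lookup` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the real wikidata query.

```python
from functools import lru_cache


def make_cached_alias_lookup(fetch_aliases):
    """Wrap `fetch_aliases` so each symbol is fetched at most once per process."""
    @lru_cache(maxsize=None)
    def cached(symbol):
        # Store a tuple so callers cannot mutate the cached result.
        return tuple(fetch_aliases(symbol))
    return cached


# Hypothetical stand-in for the wikidata query; it records every upstream call.
calls = []


def fake_fetch(symbol):
    calls.append(symbol)
    return ["Fancd1", "RAB163"] if symbol == "Brca2" else []


lookup = make_cached_alias_lookup(fake_fetch)
lookup("Brca2")
lookup("Brca2")  # answered from the cache, no second upstream call
```

The cache lives only as long as the process, which matches "for the duration of the running server"; a restart simply refills it on demand.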
+
+Note the wikidata interface has been stable all this time. That is good.
+
+Turns out we already use wikidata in the gn-guile implementation for fetching the wikidata id for a species (as part of metadata retrieval). I wrote that about two years ago as part of the REST API expansion.
+
+Unfortunately
+
+```
+(sparql-scm (wd-sparql-endpoint-url)  (wikidata-gene-alias "Q24420953"))
+```
+
+throws a 403 forbidden error.
+
+This however works:
+
+```
+scheme@(gn db sparql) [15]> (sparql-wd-species-info "Q83310")
+;;; ("https://query.wikidata.org/sparql?query=%0ASELECT%20DISTINCT%20%3Ftaxon%20%3Fncbi%20%3Fdescr%20where%20%7B%0A%20%20%20%20wd%3AQ83310%20wdt%3AP225%20%3Ftaxon%20%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20wdt%3AP685%20%3Fncbi%20%3B%0A%20%20%20%20%20%20schema%3Adescription%20%3Fdescr%20.%0A%20%20%20%20%3Fspecies%20wdt%3AP685%20%3Fncbi%20.%0A%20%20%20%20FILTER%20%28lang%28%3Fdescr%29%3D%27en%27%29%0A%7D%20limit%205%0A%0A")
+$11 = "?taxon\t?ncbi\t?descr\n\"Mus musculus\"\t\"10090\"\t\"species of mammal\"@en\n"
+```
+
+(if you can see the mouse ;).
+
+Ah, this works
+
+```
+scheme@(gn db sparql) [17]> (sparql-tsv (wd-sparql-endpoint-url) (wikidata-query-geneids "Shh" ))
+;;; ("https://query.wikidata.org/sparql?query=SELECT%20DISTINCT%20%3Fwikidata_id%0A%20%20%20%20%20%20%20%20%20%20%20%20WHERE%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%3Fwikidata_id%20wdt%3AP31%20wd%3AQ7187%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20wdt%3AP703%20%3Fspecies%20.%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20VALUES%20%28%3Fspecies%29%20%7B%20%28wd%3AQ15978631%20%29%20%28%20wd%3AQ83310%20%29%20%28%20wd%3AQ184224%20%29%20%7D%20.%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%3Fwikidata_id%20rdfs%3Alabel%20%22Shh%22%40en%20.%0A%20%20%20%20%20%20%20%20%7D%0A")
+$12 = "?wikidata_id\n<http://www.wikidata.org/entity/Q14860079>\n<http://www.wikidata.org/entity/Q24420953>\n"
+```
+
+But this does not
+
+```
+scheme@(gn db sparql) [17]> (sparql-scm (wd-sparql-endpoint-url) (wikidata-query-geneids "Shh" ))
+ice-9/boot-9.scm:1685:16: In procedure raise-exception:
+In procedure utf8->string: Wrong type argument in position 1 (expecting bytevector): "<html>\r\n<head><title>403 Forbidden</title></head>\r\n<body>\r\n<center><h1>403 Forbidden</h1></center>\r\n<hr><center>nginx/1.18.0</center>\r\n</body>\r\n</html>\r\n"
+```
+
+Going via tsv does work
+
+```
+scheme@(gn db sparql) [18]> (tsv->scm (sparql-tsv (wd-sparql-endpoint-url) (wikidata-query-geneids "Shh" )))
+
+;;; ("https://query.wikidata.org/sparql?query=SELECT%20DISTINCT%20%3Fwikidata_id%0A%20%20%20%20%20%20%20%20%20%20%20%20WHERE%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%3Fwikidata_id%20wdt%3AP31%20wd%3AQ7187%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20wdt%3AP703%20%3Fspecies%20.%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20VALUES%20%28%3Fspecies%29%20%7B%20%28wd%3AQ15978631%20%29%20%28%20wd%3AQ83310%20%29%20%28%20wd%3AQ184224%20%29%20%7D%20.%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%3Fwikidata_id%20rdfs%3Alabel%20%22Shh%22%40en%20.%0A%20%20%20%20%20%20%20%20%7D%0A")
+$13 = ("?wikidata_id")
+$14 = (("<http://www.wikidata.org/entity/Q14860079>") ("<http://www.wikidata.org/entity/Q24420953>"))
+```
+
+that is nice enough.
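For readers more at home in Python, the TSV route can be sketched like this. It is an illustration only, assuming the response is plain tab-separated text with one header line; `tsv_to_rows` is a hypothetical name and not the gn-guile `tsv->scm` procedure.

```python
def tsv_to_rows(tsv_text):
    """Split a SPARQL TSV response into a header list and a list of rows."""
    lines = [line for line in tsv_text.split("\n") if line]
    header = lines[0].split("\t")
    rows = [line.split("\t") for line in lines[1:]]
    return header, rows


# The response string shown in the REPL session above.
response = ("?wikidata_id\n"
            "<http://www.wikidata.org/entity/Q14860079>\n"
            "<http://www.wikidata.org/entity/Q24420953>\n")
header, rows = tsv_to_rows(response)
```

Here `header` holds the single column name and `rows` holds one single-element list per entity, the same shape as the two values printed by the Guile session.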
+
+We now have a working alias server that is part of gn-guile. E.g.
+
+```
+curl http://127.0.0.1:8091/gene/aliases/Brca2
+["breast cancer 2","breast cancer 2, early onset","Fancd1","RAB163","BRCA2, DNA repair associated"]
+```
+
+gn-guile also has the 'commit/' handler by Alex, documented as
+'curl -X POST http://127.0.0.1:8091/commit' in git-markdown-editor.md. Let's see how that is wired up. The web interface is at, for example,
+https://genenetwork.org/editor/edit?file-path=general/help/facilities.md. Part of gn2's
+
+```
+gn2/wqflask/views.py
+398:@app.route("/editor/edit", methods=["GET"])
+408:@app.route("/editor/settings", methods=["GET"])
+414:@app.route("/editor/commit", methods=["GET", "POST"])
+```
+
+which has the code
+
+```
+@app.route("/editor/edit", methods=["GET"])
+@require_oauth2
+def edit_gn_doc_file():
+    file_path = urllib.parse.urlencode(
+        {"file_path": request.args.get("file-path", "")})
+    response = requests.get(f"http://localhost:8091/edit?{file_path}")
+    response.raise_for_status()
+    return render_template("gn_editor.html", **response.json())
+```
+
+Running over localhost. This is unfortunately hard-coded, and we should change that! In the Guix system
+configuration it is already a variable as 'genenetwork-configuration-gn-guile-port 8091'. gn-guile should also be visible from outside, so that is a separate configuration.
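One possible way to get rid of the hard-coded address is to build every gn-guile URL from a single configured base. A minimal sketch, assuming a setting named `GN_GUILE_SERVER_URL` (the name appears in the gn2 traceback; the value here and the helper `gn_guile_url` are made up for illustration):

```python
from urllib.parse import urlencode, urljoin


def gn_guile_url(base_url, endpoint, **params):
    """Build a gn-guile URL from a configured base URL instead of a literal."""
    # A trailing slash stops urljoin from dropping the last path segment.
    base = base_url if base_url.endswith("/") else base_url + "/"
    url = urljoin(base, endpoint.lstrip("/"))
    if params:
        url += "?" + urlencode(params)
    return url


# Assumed value; in the application this would come from the loaded config.
GN_GUILE_SERVER_URL = "http://localhost:8091"

edit_url = gn_guile_url(GN_GUILE_SERVER_URL, "edit",
                        file_path="general/help/facilities.md")
alias_url = gn_guile_url(GN_GUILE_SERVER_URL, "gene/aliases/BRCA2")
```

Normalising the trailing slash in one place means the setting works whether or not the deployment writes it with a slash, which plain string concatenation does not guarantee.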
+
+Also I note that the mapping page does three requests to wikidata (for mouse, rat and human). That could really be one.
+
+# Search
+
+Aliases are also used in search. When GN search returns too few results, that is a sign that aliases are not being used. When aliases work we expect to list '2310010I16Rik' with
+
+=> https://genenetwork.org/search?species=mouse&group=BXD&type=Hippocampus+mRNA&dataset=HC_M2_0606_P&search_terms_or=sh*&search_terms_and=&FormID=searchResult
+
+Sheepdog tests for that and it has been failing for a while.
+
+Global search finds way more results, but also lacks that alias! Meanwhile GN1 does find that alias for record 1431728_at. GN2 finds it with hippocampus mRNA
+
+=> https://genenetwork.org/search?species=mouse&group=BXD&type=Hippocampus+mRNA&dataset=HC_M2_0606_P&search_terms_or=1431728_at%0D%0A&search_terms_and=&accession_id=None&FormID=searchResult
+
+in standard search.
+But neither 1431728_at nor '2310010I16Rik' has a hit in *global* search and the result for Ssh should include the record in both search systems.
+
+# Deploy
+
+We introduced a new environment variable that does not show up on CD, part of the mapping page:
+
+=>
+
+In the logs on /export2:
+
+```
+root@tux02:/export2/guix-containers/genenetwork-development/var/log/cd# tail -100 genenetwork2.log
+2025-07-20 04:19:43   File "/genenetwork2/gn2/base/trait.py", line 157, in wikidata_alias_fmt
+2025-07-20 04:19:43     GN_GUILE_SERVER_URL + "gene/aliases/" + self.symbol.upper())
+2025-07-20 04:19:43 NameError: name 'GN_GUILE_SERVER_URL' is not defined
+```
+
+One thing I ran into is http://genenetwork.org/gn3-proxy/ - what is that for?
+
+## Deploy Updates: 2025-08-15
+=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=269f99f1e1f0c253ecdd99f04bc7c6697012b0aa Update commit of gn-guile used on production
+
+This does not fix the issue on https://gn2-fred.genenetwork.org/show_trait?trait_id=1427571_at&dataset=HC_M2_0606_P; instead we get
+
+```
+fredm@tux04:~$ curl http://localhost:8091/gene/aliases/Brca2
+Resource not found: /gene/aliases/Brca2
+```
diff --git a/issues/genenetwork3/check-for-mandatory-settings.gmi b/issues/genenetwork3/check-for-mandatory-settings.gmi
new file mode 100644
index 0000000..16a2f8a
--- /dev/null
+++ b/issues/genenetwork3/check-for-mandatory-settings.gmi
@@ -0,0 +1,40 @@
+# Check for Mandatory Settings
+
+## Tags
+
+* status: open
+* priority: high
+* type: bug, improvement
+* interested: fredm, bonz
+* assigned: jnduli, rookie101
+* keywords: GN3, gn3, genenetwork3, settings, config, configs, configurations
+
+## Explanation
+
+Giving defaults to some important settings leads to situations where the configuration is not set up correctly, leading at best to outright failure and at worst to subtle failures that are difficult to debug, e.g. when a default URI to a server points to an active domain, just not the correct one.
+
+We want to make such (arguably, sensitive) configurations explicit, and avoid giving them defaults. We want to check that they are set up before allowing the application to run, and fail loudly and obnoxiously if they are not provided.
+
+Examples of configuration variables that should be checked for:
+
+* All external URIs (external to app/repo under consideration)
+* All secrets (secret keys, salts, tokens, etc)
+
+We should also eliminate from the defaults:
+
+* Computed values
+* Calls to get values from ENVVARs (`os.environ.get(…)` calls)
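The "fail loudly" check described above could look roughly like this. It is a sketch under assumptions: `MissingSettingsError`, `check_mandatory_settings` and the setting name `AUTH_SERVER_URL` are hypothetical and not taken from the GN3 code.

```python
class MissingSettingsError(RuntimeError):
    """Raised at startup when mandatory settings are absent or empty."""


def check_mandatory_settings(config, mandatory):
    """Return `config` unchanged, or raise naming every missing setting."""
    missing = sorted(
        name for name in mandatory
        if config.get(name) in (None, ""))
    if missing:
        raise MissingSettingsError(
            "Missing mandatory settings: " + ", ".join(missing))
    return config


# Hypothetical list: external URIs and secrets get no defaults.
MANDATORY = ["SECRET_KEY", "AUTH_SERVER_URL"]
```

Calling this once, before the application starts serving requests, turns a silent misconfiguration into an immediate startup error that lists everything missing in one go.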
+
+### Note on ENVVARs
+
+Environment variables should be used for overriding values under specific conditions; therefore, loading them should be both explicit and the last step, to ensure they actually override settings.
+
+=> https://git.genenetwork.org/gn-auth/tree/gn_auth/__init__.py?id=3a276642bea934f0a7ef8f581d8639e617357a2a#n70 See this example for a possible way of allowing ENVVARs to override settings.
+
+The example above could be improved by maybe checking for environment variables starting with a specific value, e.g. the envvar `GNAUTH_SECRET_KEY` would override the `SECRET_KEY` configuration. This allows us to override settings without having to change the code.
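The prefix idea can be sketched as below. This is an illustration of the suggestion, not the gn-auth code linked above; the function name `override_from_environ` is hypothetical, and whether unknown keys should be accepted is a design choice left open here.

```python
def override_from_environ(config, environ, prefix="GNAUTH_"):
    """Return a copy of `config` with `<prefix>NAME` envvars overriding NAME."""
    overrides = {
        name[len(prefix):]: value
        for name, value in environ.items()
        # Skip a bare prefix, which would map to an empty setting name.
        if name.startswith(prefix) and len(name) > len(prefix)
    }
    # Applied last, so the environment wins over file-based settings.
    return {**config, **overrides}


settings = override_from_environ(
    {"SECRET_KEY": "from-file", "DEBUG": "0"},
    {"GNAUTH_SECRET_KEY": "from-env", "HOME": "/root"})
```

In the application one would pass `os.environ` as the second argument; taking it as a parameter keeps the function easy to test.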
+
+## Tasks
+
+* [ ] Explicitly check configs for ALL external URIs
+* [ ] Explicitly check configs for ALL secrets
+* [ ] Explicitly load ENVVARs last to override settings
diff --git a/issues/genenetwork3/ctl-maps-error.gmi b/issues/genenetwork3/ctl-maps-error.gmi
new file mode 100644
index 0000000..6726357
--- /dev/null
+++ b/issues/genenetwork3/ctl-maps-error.gmi
@@ -0,0 +1,46 @@
+# CTL Maps Error
+
+## Tags
+
+* type: bug
+* status: open
+* priority: high
+* assigned: alexm, zachs, fredm
+* keywords: CTL, CTL Maps, gn3, genenetwork3, genenetwork 3
+
+## Description
+
+Trying to run the CTL Maps feature in the collections page as described in
+=> /issues/genenetwork2/broken-collections-feature
+
+We get an error in the results page of the form:
+
+```
+{'error': '{\'code\': 1, \'output\': \'Loading required package: MASS\\nLoading required package: parallel\\nLoading required package: qtl\\nThere were 13 warnings (use warnings() to see them)\\nError in xspline(x, y, shape = 0, lwd = lwd, border = col, lty = lty, : \\n invalid value specified for graphical parameter "lwd"\\nCalls: ctl.lineplot -> draw.spline -> xspline\\nExecution halted\\n\'}'}
+```
+
+On the CLI, the same error is rendered:
+```
+Loading required package: MASS
+Loading required package: parallel
+Loading required package: qtl
+There were 13 warnings (use warnings() to see them)
+Error in xspline(x, y, shape = 0, lwd = lwd, border = col, lty = lty,  : 
+  invalid value specified for graphical parameter "lwd"
+Calls: ctl.lineplot -> draw.spline -> xspline
+Execution halted
+```
+
+On my local development machine, the command run was
+```
+Rscript /home/frederick/genenetwork/genenetwork3/scripts/ctl_analysis.R /tmp/01828928-26e6-4cad-bbc8-59fd7a7977de.json
+```
+
+Here is a zipped version of the json file (follow the link and click download):
+=> https://github.com/genenetwork/gn-gemtext-threads/blob/main/issues/genenetwork3/01828928-26e6-4cad-bbc8-59fd7a7977de.json.zip
+
+After troubleshooting for a while, I suspect
+=> https://github.com/genenetwork/genenetwork3/blob/27d9c9d6ef7f37066fc63af3d6585bf18aeec925/scripts/ctl_analysis.R#L79-L80 this is the offending code.
+
+=> https://cran.r-project.org/web/packages/ctl/ctl.pdf The manual for the ctl library
+indicates that our call above might be okay, which might mean something changed in the dependencies that the ctl library uses.
diff --git a/issues/genenetwork3/genenetwork3_configuration.gmi b/issues/genenetwork3/genenetwork3_configuration.gmi
new file mode 100644
index 0000000..cdd7c15
--- /dev/null
+++ b/issues/genenetwork3/genenetwork3_configuration.gmi
@@ -0,0 +1,19 @@
+# Genenetwork3 Configurations
+
+## Tags
+
+* assigned: fredm
+* priority: normal
+* status: closed, completed
+* keywords: configuration, config, gn2, genenetwork, genenetwork2
+* type: bug
+
+## Description
+
+The configuration file should only ever contain settings, and no code. Remove all code from the default settings file.
+
+Eschew executable formats (*.py) for configuration files and prefer non-executable formats e.g. *.cfg, *.json, *.conf etc
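As an illustration of a non-executable settings file, here is a minimal sketch that reads settings from JSON. The function name `load_settings` is hypothetical, and the commit referenced below may well use a different format or loader.

```python
import json


def load_settings(path):
    """Read settings from a JSON file; the file is parsed, never executed."""
    with open(path, encoding="utf-8") as handle:
        settings = json.load(handle)
    if not isinstance(settings, dict):
        raise ValueError(f"{path}: expected a JSON object of settings")
    return settings
```

Unlike a `*.py` settings file, a JSON file cannot run code or compute values at load time, which is the point of the rule above.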
+
+## Closed as Completed
+
+See commit https://github.com/genenetwork/genenetwork3/commit/977efbb54da284fb3e8476f200206d00cb8e64cd
diff --git a/issues/genenetwork3/generate-heatmaps-failing.gmi b/issues/genenetwork3/generate-heatmaps-failing.gmi
new file mode 100644
index 0000000..522dc27
--- /dev/null
+++ b/issues/genenetwork3/generate-heatmaps-failing.gmi
@@ -0,0 +1,64 @@
+# Generate Heatmaps Failing
+
+## Tags
+
+* type: bug
+* status: open
+* priority: medium
+* assigned: fredm, zachs, zsloan
+* keywords: genenetwork3, gn3, GN3, heatmaps
+
+## Reproduce
+
+* Go to https://genenetwork.org/
+* Under "Select and Search" menu, enter "synap*" for the "Get Any" field
+* Click "Search"
+* In search results page, select first 10 traits
+* Click "Add"
+* Under "Create a new collection" enter the name "newcoll" and click "Create collection"
+* In the collections page that shows up, click "Select All" once
+* Ensure all the traits are selected
+* Click "Generate Heatmap" and wait
+* Note how the system fails silently with no heatmap presented
+
+### Notes
+
+On https://gn2-fred.genenetwork.org the heatmap generation fails with a note ("ERROR: undefined"). In the logs, I see "Module 'scipy' has no attribute 'array'" which seems to be due to a change in numpy.
+=> https://github.com/MaartenGr/BERTopic/issues/1791
+=> https://github.com/scipy/scipy/issues/19972
+
+This issue should not be present with python-plotly@5.20.0 but since guix-bioinformatics pins the guix version to `b0b988c41c9e0e591274495a1b2d6f27fcdae15a`, we are not able to pull in newer versions of packages from guix.
+
+
+### Update 2025-04-08T10:59CDT
+
+Got the following error when I ran the background command manually:
+
+```
+$ export RUST_BACKTRACE=full
+$ /gnu/store/dp4zq4xiap6rp7h6vslwl1n52bd8gnwm-profile/bin/qtlreaper --geno /home/frederick/genotype_files/genotype/genotype/BXD.geno --n_permutations 1000 --traits /tmp/traits_test_file_n2E7V06Cx7.txt --main_output /tmp/qtlreaper/main_output_NGVW4sfYha.txt --permu_output /tmp/qtlreaper/permu_output_MJnzLbrsrC.txt
+thread 'main' panicked at src/regression.rs:216:25:
+index out of bounds: the len is 20 but the index is 20
+stack backtrace:
+   0:     0x61399d77d46d - <unknown>
+   1:     0x61399d7b5e13 - <unknown>
+   2:     0x61399d78b649 - <unknown>
+   3:     0x61399d78f26f - <unknown>
+   4:     0x61399d78ee98 - <unknown>
+   5:     0x61399d78f815 - <unknown>
+   6:     0x61399d77d859 - <unknown>
+   7:     0x61399d77d679 - <unknown>
+   8:     0x61399d78f3f4 - <unknown>
+   9:     0x61399d6f4063 - <unknown>
+  10:     0x61399d6f41f7 - <unknown>
+  11:     0x61399d708f18 - <unknown>
+  12:     0x61399d6f6e4e - <unknown>
+  13:     0x61399d6f9e93 - <unknown>
+  14:     0x61399d6f9e89 - <unknown>
+  15:     0x61399d78e505 - <unknown>
+  16:     0x61399d6f8d55 - <unknown>
+  17:     0x75ee2b945bf7 - __libc_start_call_main
+  18:     0x75ee2b945cac - __libc_start_main@GLIBC_2.2.5
+  19:     0x61399d6f4861 - <unknown>
+  20:                0x0 - <unknown>
+```
diff --git a/issues/genenetwork3/rqtl2-mapping-error.gmi b/issues/genenetwork3/rqtl2-mapping-error.gmi
new file mode 100644
index 0000000..b43d66f
--- /dev/null
+++ b/issues/genenetwork3/rqtl2-mapping-error.gmi
@@ -0,0 +1,46 @@
+# R/qtl2 Maps Error
+
+## Tags
+
+* type: bug
+* status: closed, completed
+* priority: high
+* assigned: alexm, zachs, fredm
+* keywords: R/qtl2, R/qtl2 Maps, gn3, genenetwork3, genenetwork 3
+
+## Reproduce
+
+* Go to https://genenetwork.org/
+* In the "Get Any" field, enter "synap*" and press the "Enter" key
+* In the search results, click on the "1435464_at" trait
+* Expand the "Mapping Tools" accordion section
+* Select the "R/qtl2" option
+* Click "Compute"
+* In the "Computing the Maps" page that results, click on "Display System Log"
+
+### Observed
+
+A traceback is observed, with an error of the following form:
+
+```
+⋮
+FileNotFoundError: [Errno 2] No such file or directory: '/opt/gn/tmp/gn3-tmpdir/JL9PvKm3OyKk.txt'
+```
+
+### Expected
+
+The mapping runs successfully and the results are presented in the form of a mapping chart/graph and a table of values.
+
+### Debug Notes
+
+The directory "/opt/gn/tmp/gn3-tmpdir/" exists, and is actually used by other mappings (i.e. the "R/qtl" and "Pair Scan" mappings) successfully.
+
+This might imply a code issue: Perhaps
+* a path is hardcoded, or
+* the wrong path value is passed
+
+The same error occurs on https://cd.genenetwork.org but does not seem to prevent CD from running the mapping to completion. Maybe something is missing on production — what, though?
+
+## Closed as Completed
+
+This seems fixed now.