Diffstat (limited to 'issues')
110 files changed, 4070 insertions, 131 deletions
diff --git a/issues/CI-CD/configurations.gmi b/issues/CI-CD/configurations.gmi index 54cea47..acd2512 100644 --- a/issues/CI-CD/configurations.gmi +++ b/issues/CI-CD/configurations.gmi @@ -4,7 +4,7 @@ * assigned: aruni, fredm * priority: normal -* status: open +* status: closed, completed * keywords: CI, CD, configuration, config * type: bug @@ -38,3 +38,7 @@ and at least one of the values other than "localhost" is used to determine the c The secrets (e.g. SECRET_KEY, OAUTH_CLIENT_ID, OAUTH_CLIENT_SECRET, etc.) can be encrypted and stored in some secrets management system (e.g. Pass [https://www.passwordstore.org/] etc.) set up on each relevant host: better yet, have all configurations (secret or otherwise) encrypted and stored in such a secrets management system and fetch them from there. This reduces the mental overhead of dealing with multiple places to fetch the configs. From these, the CI/CD system can then build and intern the configurations into the store with guix functions like "plain-file", "local-file", etc. + +## Notes + +This idea was mostly rejected — it seems — in favour of using external settings files that are shared with the running container, and separate build scripts for the different environments. This mostly covers all the bases necessary to get the settings correct. diff --git a/issues/add-documentation-and-data-retrieval-for-AI-repo.gmi b/issues/add-documentation-and-data-retrieval-for-AI-repo.gmi index 11f8f30..a96c18d 100644 --- a/issues/add-documentation-and-data-retrieval-for-AI-repo.gmi +++ b/issues/add-documentation-and-data-retrieval-for-AI-repo.gmi @@ -6,7 +6,6 @@ * priority: high * type: ui * keywords: phenotypes -* status: stalled ## Description @@ -15,3 +14,4 @@ * Share alternate way of getting sparql json-ld data from public endpoint outside isql. * Share json-ld gotchas. 
+* closed diff --git a/issues/add-genotype-files-to-rdf.gmi b/issues/add-genotype-files-to-rdf.gmi index 85ac39c..856c070 100644 --- a/issues/add-genotype-files-to-rdf.gmi +++ b/issues/add-genotype-files-to-rdf.gmi @@ -3,7 +3,7 @@ ## Tags * assigned: bonfacem * type: bug -* status: open, in progress +* status: stalled In Penguin2, genotype files are located in: /export/data/genenetwork/genotype_files/genotype. Each genotype file has an identifier for the dataset it refers to: diff --git a/issues/add-unique-identifiers-for-case-attributes.gmi b/issues/add-unique-identifiers-for-case-attributes.gmi new file mode 100644 index 0000000..0c3123d --- /dev/null +++ b/issues/add-unique-identifiers-for-case-attributes.gmi @@ -0,0 +1,11 @@ +# Add Case Attributes to RDF + +## Tags + +* assigned: bonfacem +* priority: high +* status: open + +## Description + +Add case attributes and their metadata into RDF. diff --git a/issues/assorted-ui-issues.gmi b/issues/assorted-ui-issues.gmi new file mode 100644 index 0000000..5fbacea --- /dev/null +++ b/issues/assorted-ui-issues.gmi @@ -0,0 +1,36 @@ +# Various UI issues raised by Rob (8/19/2024) + +# Tags + +* assigned: zsloan +* keywords: user-interface +* priority: medium +* status: open + +## Tasks + +* [X] Fix collection encoding issue + +* [X] Don't import empty collections (like the Default Collection) + +* [X] Update/Creation dates aren't listed for collections + +* [X] Remove in-between ticks for Effect Size Plot (from mapping page) so it's just -1/0/1 + +* [X] Also make Effect Size Plot more narrow + +* [X] Prevent X/Y-axis summary text from extending beyond the graph width + +* [X] Longer tick markers as well + +* [X] Remove triangle for phenotype mapping + +* [X] Remove ProbeSetPosition from mapping for traits with no position + +* [X] Make Haplotype legend image thicker + change text to Haplotypes (Mat, Pat, Het, Unknown) + +* [X] Change "Sequence Site" in legend to "Gene Location" + +* [X] When adding genotype marker as covariate (for 
scatter-plot, maybe also mapping), change description to Position instead of "undefined" + +* [ ] Check Add Covariation colorbox popup on Apple laptop (it shows up weird for Rob, but normal for me) diff --git a/issues/auth/reset-password-feature.gmi b/issues/auth/reset-password-feature.gmi index 8eaaa6a..299f915 100644 --- a/issues/auth/reset-password-feature.gmi +++ b/issues/auth/reset-password-feature.gmi @@ -1,6 +1,16 @@ # Reset/Forgot Password Feature for GN2 +# Tags + * assigned: fredm -* tags: critical +* priority: critical +* status: closed +* keywords: gn-auth, auth, reset password +* type: feature-request + +## Description Should a user forget his/her password, there's no clear way to reset the password. + +This issue is +=> https://git.genenetwork.org/gn-auth/tree/gn_auth/auth/authorisation/users/views.py?id=e829074e99fd5bec033765d18d5efa55e1edce44#n454 implemented with the latest code. diff --git a/issues/cleanup-base-file-gn2.gmi b/issues/cleanup-base-file-gn2.gmi new file mode 100644 index 0000000..8a05323 --- /dev/null +++ b/issues/cleanup-base-file-gn2.gmi @@ -0,0 +1,30 @@ +# Cleanup GN2 Base HTML File + +## Tags + +* Assigned: alexm +* Keywords: base, HTML, JavaScript, cleanup +* type: Refactoring +* Status: closed, completed, done + +## Description + +The base file should contain no custom JavaScript since it is inherited in almost all files in GN2. It should only include what is necessary. As a result, we need to move the global search from the base file to the index page, which renders the GN2 home. 
+ +## Tasks + +* [x] Remove global search code from the base file and move it to the index page +* [x] Fix formatting and linting issues in the base file (e.g., tags) +* [x] Inherit from index page for all gn2 templates + + +## Notes + +See the PR that seeks to fix this: +=> https://github.com/genenetwork/genenetwork2/pull/877 + +## Notes 26/09/2024 + +It was agreed that global search should be a feature on all pages. +As such, all files need to inherit from the template that +defines the global search.
\ No newline at end of file diff --git a/issues/create-custom-rif-xapian-index.gmi b/issues/create-custom-rif-xapian-index.gmi new file mode 100644 index 0000000..a0b9039 --- /dev/null +++ b/issues/create-custom-rif-xapian-index.gmi @@ -0,0 +1,16 @@ +# Create Custom RIF XAPIAN Index + +## Tags + +* assigned: bonfacem +* priority: medium +* status: in-progress +* deadline: 2024-10-23 Wed + +## Description + +Given the GN Wiki search page: + +=> https://cd.genenetwork.org/genewiki GeneWiki Entries Search + +We only search by symbol. Add custom XAPIAN index to perform more powerful search. diff --git a/issues/edit-rif-metadata.gmi b/issues/edit-rif-metadata.gmi new file mode 100644 index 0000000..546dc80 --- /dev/null +++ b/issues/edit-rif-metadata.gmi @@ -0,0 +1,121 @@ +# Edit RIF Metadata in GN2 + +## Tags + +* assigned: bonfacem, jnduli +* priority: high +* status: closed + +## Tasks + +### Viewing +* [X] API: Get WIKI/RIF by symbol from rdf. + +> GET /wiki/<symbol> + +``` +[{ + "symbol": "XXXX", + "reason": "XXXX", + "species": "XXXX", + "pubmed_ids": ["XXXX", "XXXX"], // empty array when non-existent + "web_url": "XXXX" // Optional + "comment": "XXXX", + "email": "XXXX", + "categories": ["XXXX", "XXXX"], // Enumeration + "version": "XXXX", + "initial": "XXXX", // Optional user or project code or your initials. +}] +``` + +* [X] UI: Modify traits page to have "GN2 (GeneWiki)" +* [X] UI: Integrate with API + +### Editing + +* [X] API: Edit comment by id in mysql/rdf: modifies GeneRIF and GeneRIFXRef tables. +* [X] API: Modify edit comments by id to include RDF changes. + +> POST /wiki/<comment-id>/edit + +``` +{ + "symbol": "XXXX", + "reason": "XXXX", + "species": "XXXX", + "pubmed_ids": ["XXXX", "XXXX"], // Optional + "web_url": "XXXX" // Optional + "comment": "XXXX", + "email": "XXXX", + "categories": ["XXXX", "XXXX"], // Enumeration + "initial": "XXXX", // Optional user or project code or your initials. 
+} +``` +* [X] UI: Add buttons that edit various relevant sections. +* [X] UI: Edit page needs to fetch categories from GeneCategory table. When comment write fails, alert with error. When comment write success, update the comment on the page, and alert with success. +* [X] API: Modify edit comments by id to include RDF changes. +* [X] GN auth integration + +### History + +* [X] API: End-point to fetch all the historical data +* [X] UI: Page that contains history for how comments changes. + +> GET /wiki/<comment-id>/history + +``` +[{ + "symbol": "XXXX", + "reason": "XXXX", + "species": "XXXX", + "pubmed_ids": ["XXXX", "XXXX"], // Optional + "web_url": "XXXX" // Optional + "comment": "XXXX", + "email": "XXXX", + "categories": ["XXXX", "XXXX"], // Enumeration + "version": "XXXX", + "initial": "XXXX", // Optional user or project code or your initials. +}] +``` + +### Misc ToDos: + +* [X] Review performance of query used in 72d9a24e8e65 [Genenetwork3] + +### Ops + +* [X] RDF synchronization with SQL (gn-machines). +* [X] Update RDF in tux02. +* [X] UI: Add "edit" button after testing. + +### Resolution + +Genenetwork2: +=> https://github.com/genenetwork/genenetwork2/pull/858 UI/fetch rif using recent apis #858 +=> https://github.com/genenetwork/genenetwork2/pull/864 Add comment history page. #864 +=> https://github.com/genenetwork/genenetwork2/pull/865 Add support for auth in Rif Edit #865 +=> https://github.com/genenetwork/genenetwork2/pull/866 Add a page for searching GeneWiki by symbol. #866 +=> https://github.com/genenetwork/genenetwork2/pull/881 Add display page for NCBI RIF metadata. #881 +=> https://github.com/genenetwork/genenetwork2/pull/881 Add display page for NCBI RIF metadata. 
#881 +=> https://github.com/genenetwork/genenetwork2/pull/882 GN editting UI improvements #882 + + +GeneNetwork3: +=> https://github.com/genenetwork/genenetwork3/pull/180 Update script that updates Generif_BASIC table #180 +=> https://github.com/genenetwork/genenetwork3/pull/181 Add case insensitive prefixes for rif wiki #181 +=> https://github.com/genenetwork/genenetwork3/pull/184 Api/get wiki from rdf #184 +=> https://github.com/genenetwork/genenetwork3/pull/185 feat: add api calls to get categories and last comment #185 +=> https://github.com/genenetwork/genenetwork3/pull/186 Api/fetch the latest wiki by versionid #186 +=> https://github.com/genenetwork/genenetwork3/pull/187 Api/get end point to fetch all historical data #187 +=> https://github.com/genenetwork/genenetwork3/pull/189 Add auth to edit RIF api call #189 +=> https://github.com/genenetwork/genenetwork3/pull/190 Api/update rif queries #190 +=> https://github.com/genenetwork/genenetwork3/pull/193 Api/edit rif endpoint #193 +=> https://github.com/genenetwork/genenetwork3/pull/194 Fix C0411/C0412 pylint errors in gn3.api.metadata.api.wiki. #194 +=> https://github.com/genenetwork/genenetwork3/pull/195 Add rif tests #195 +=> https://github.com/genenetwork/genenetwork3/pull/196 Handle missing GN3_SECRETS for CI testing. #196 +=> https://github.com/genenetwork/genenetwork3/pull/197 Rif edit atomicity #197 +=> https://github.com/genenetwork/genenetwork3/pull/198 Run tests against Virtuoso that is spun locally. #198 +=> https://github.com/genenetwork/genenetwork3/pull/199 Add rdf-tests after the check phase. 
#199 +=> https://github.com/genenetwork/genenetwork3/pull/200 Api/ncbi metadata #200 + +* closed diff --git a/issues/editing-dataset-metadata.gmi b/issues/editing-dataset-metadata.gmi index 17d1693..70876e0 100644 --- a/issues/editing-dataset-metadata.gmi +++ b/issues/editing-dataset-metadata.gmi @@ -5,7 +5,7 @@ * assigned: bonfacem * priority: high * type: editing -* status: in-progress +* status: stalled * keywords: metadata editing ## Description diff --git a/issues/error-handling-external-errors.gmi b/issues/error-handling-external-errors.gmi index d1707de..640e1d1 100644 --- a/issues/error-handling-external-errors.gmi +++ b/issues/error-handling-external-errors.gmi @@ -3,7 +3,7 @@ ## Tags * assigned: fredm -* status: open +* status: closed * type: bug * priority: high * keywords: error handling diff --git a/issues/fix-global-search-ui.gmi b/issues/fix-global-search-ui.gmi new file mode 100644 index 0000000..2979d99 --- /dev/null +++ b/issues/fix-global-search-ui.gmi @@ -0,0 +1,24 @@ +# Fix Broken Global Search UI + +## Tags + +* Assigned: alexm, zsloan +* Priority: high +* status: in progress +* Keyword : search, UI, bug, Refactor +* Type: UI, bug + +## Description + +The Global search UI layout is broken on certain browser versions. +This issue was reported to occur for **Firefox Version 128.3.1** ESR Version. +The root cause of the problem is unclear, +but after reviewing the global search UI code, +the following changes need to be implemented (see tasks below): + + + +## Tasks + +* [ ] Remove custom layout CSS and replace it with the Bootstrap layout for better uniformity and easier debugging. +* [ ] Modify the navbar to extend across the full width of the page on medium and small devices. 
diff --git a/issues/fix-pairscan-mapping.gmi b/issues/fix-pairscan-mapping.gmi new file mode 100644 index 0000000..1b48fee --- /dev/null +++ b/issues/fix-pairscan-mapping.gmi @@ -0,0 +1,28 @@ +# Fix Pairscan Mapping + +## Tags + +* assigned: alexm +* priority: medium +* type: bug +* keywords: pairscan, debug, fix, mapping + +## Description +Pairscan mapping is currently not working: + +Error: + +``` +GeneNetwork 3.12-rc1 https://genenetwork.org/run_mapping ( 1:01PM UTC Jan 13, 2025) +Traceback (most recent call last): + File "/gnu/store/cxawl32jm0fgavc9ahcr3g0j66zdan30-profile/lib/python3.10/site-packages/flask/app.py", line 1523, in full_dispatch_request + rv = self.dispatch_request() + File "/gnu/store/cxawl32jm0fgavc9ahcr3g0j66zdan30-profile/lib/python3.10/site-packages/flask/app.py", line 1509, in dispatch_request + return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args) + File "/gnu/store/cxawl32jm0fgavc9ahcr3g0j66zdan30-profile/lib/python3.10/site-packages/gn2/wqflask/views.py", line 1035, in mapping_results_page + template_vars = run_mapping.RunMapping(start_vars, + File "/gnu/store/cxawl32jm0fgavc9ahcr3g0j66zdan30-profile/lib/python3.10/site-packages/gn2/wqflask/marker_regression/run_mapping.py", line 312, in __init__ + self.geno_db_exists = geno_db_exists(self.dataset, results[0]['name']) + KeyError: 'name' + +```
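The `KeyError` above comes from indexing `results[0]['name']` without checking that the key exists. A minimal sketch of a defensive lookup — the function name and the shape of the result rows are assumptions for illustration, not the actual GN2 code:

```python
def first_marker_name(results):
    """Return the 'name' of the first mapping result, or None when the
    result rows lack that key (the situation in the traceback above)."""
    if results and "name" in results[0]:
        return results[0]["name"]
    return None

print(first_marker_name([{"name": "rs123", "lod": 3.2}]))  # rs123
print(first_marker_name([{"chr": "1"}]))                   # None
```

Whether returning `None` is the right behaviour (versus raising a clearer error) depends on what `geno_db_exists` expects downstream.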
\ No newline at end of file diff --git a/issues/fix-rqtl-rm-bug.gmi b/issues/fix-rqtl-rm-bug.gmi new file mode 100644 index 0000000..de71487 --- /dev/null +++ b/issues/fix-rqtl-rm-bug.gmi @@ -0,0 +1,95 @@ +# Investigate and Fix `rm` Command in `rqtl` Logs + +## Tags + +* assigned: alex, bonfacem +* type: Bug +* status: in progress +* keywords: external, qtl, rqtl, bug, logs + +## Description + +For QTL analysis, we invoke the `rqtl` script as an external process through Python's `subprocess` module. +For reference, see the `rqtl_wrapper.R` script: +=> https://github.com/genenetwork/genenetwork3/blob/main/scripts/rqtl_wrapper.R + +The issue is that, upon analyzing the logs for `rqtl`, we see that an `rm` command is unexpectedly invoked: + +``` +sh: line 1: rm: command not found +``` + +This command cannot be traced to its origin, and it does not appear to be part of the expected behavior. + +The issue is currently observed only in the CD environment. The only way I have attempted to reproduce this locally is by invoking the command in a shell environment with string injection, which is not the case for GeneNetwork3, where all strings are parsed and passed as a list argument. + +Here’s an example of the above attempt: + +```python +def run_process(cmd, output_file, run_id): + """Function to execute an external process and capture the stdout in a file. + + Args: + cmd: The command to execute, provided as a list of arguments. + output_file: Absolute file path to write the stdout. + run_id: Unique ID to identify the process. + + Returns: + A dictionary with the results, indicating success or failure. 
+ """ + cmd.append(" && rm") # Injecting potentially problematic command + cmd = " ".join(cmd) # The command is passed as a string + + try: + # Phase: Execute the command in a shell environment + with subprocess.Popen( + cmd, + shell=True, + stdout=subprocess.PIPE, + stderr=subprocess.STDOUT, + ) as process: + # Process output handling goes here +``` + +The error generated at the end of the `rqtl` if the rm run does not exists inside the container is: + +``` +sh: line 1: rm: command not found +``` + +The actual code for GeneNetwork3 is: + +```python +def run_process(cmd, output_file, run_id): + """Function to execute an external process and capture the stdout in a file. + + Args: + cmd: The command to execute, provided as a list of arguments. + output_file: Absolute file path to write the stdout. + run_id: Unique ID to identify the process. + + Returns: + A dictionary with the results, indicating success or failure. + """ + try: + # Phase: Execute the command in a shell environment + with subprocess.Popen( + cmd, + stdout=subprocess.PIPE, + stderr=subprocess.STDOUT, + ) as process: + # Process output handling goes here +``` + +## Investigated and Excluded Possibilities + +* [x] The `rm` command is not explicitly invoked within the `rqtl` script. +* [x] The `rqtl` command is passed as a list of parsed arguments (i.e., no direct string injection). +* [x] The subprocess is not invoked within a shell environment, which would otherwise result in string injection. +* [x] We simulated invoking a system command within the `rqtl` script, but the error does not match the observed issue. + +## TODO + +* [ ] Test in a similar environment to the CD environment to replicate the issue. + +* [ ] Investigate the internals of the QTL library for any unintended `rm` invocation. 
diff --git a/issues/gemma/gemma2-has-different-output-from-rqtl2.gmi b/issues/gemma/gemma2-has-different-output-from-rqtl2.gmi new file mode 100644 index 0000000..a0b2c5c --- /dev/null +++ b/issues/gemma/gemma2-has-different-output-from-rqtl2.gmi @@ -0,0 +1,80 @@ +# GEMMA output differs from R/qtl2 + +# Tags + +* assigned: pjotrp, davea +* priority: high +* type: bug, enhancement +* status: closed +* keywords: database, gemma, reaper, rqtl2 + +# Description + +When running trait BXD_21526, results differ significantly. + +=> https://genenetwork.org/show_trait?trait_id=21526&dataset=BXDPublish +=> https://genenetwork.org/show_trait?trait_id=21529&dataset=BXDPublish + +So I confirm I am getting the same results as Dave in GN for GEMMA (see Conclusion below). + +# Tasks + +## GeneNetwork + +I ran GEMMA for precompute on the command line and confirmed it to +be the same as what we see in the browser. This suggests either the data +or the method differs from Dave's approach. + +I confirmed that gemma in GN matches Dave's results. It is interesting +to see that running without LOCO has some impact, but not as bad as +the R/qtl2 difference. First we should check the genotype files to see +if they match. I checked that the phenotypes match. + +Our inputs are different if I count genotypes (first yours, the other +on production): + +``` + 1 2184941 B + 2 2132744 D + 3 628980 H + 1 2195662 B + 2 2142959 D + 3 650168 H +``` + +The number of rows/markers is the same. So we probably added some +genotypes, but if we miss one that would matter. Dave, you can find +the file in /home/wrk/BXD.geno on tux02 if you want to look. + +I notice that we don't use H in the R/qtl2 control file. That +might make a difference though it probably won't explain what we see +now. BTW I also correlated the LOD scores from GEMMA and R/qtl2 in +the spreadsheet and at 0.7 that is too low. So it is probably not +just a magnitude problem. The results differ a lot in your +spreadsheet. 
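Genotype tallies like the ones above can be reproduced with a few lines. The `.geno` layout used here (comment/metadata lines, a header row, then one row per marker with B/D/H/U calls) is a simplified assumption, not the exact production format:

```python
from collections import Counter

# Simplified stand-in for a .geno file: '#' and '@' lines are metadata,
# the header row starts with 'Chr', and calls are B/D/H/U codes.
geno_text = """\
#comment
@type:riset
Chr Locus cM Mb BXD1 BXD2 BXD3
1 rs1 1.0 3.0 B D H
1 rs2 2.0 5.0 B B D
"""

counts = Counter()
for line in geno_text.splitlines():
    if line.startswith(("#", "@", "Chr")):
        continue  # skip metadata and the header
    counts.update(tok for tok in line.split() if tok in {"B", "D", "H", "U"})

for code in ("B", "D", "H"):
    print(code, counts[code])
# B 3
# D 2
# H 1
```

Running the same count over both genotype files and diffing per-marker rows would show exactly which calls changed between the two inputs.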
+ +Next step is that I need to run R/qtl2 using the script in your +dropbox and see what Karl's code does. The exercise does not hurt +because it will help us bring R/qtl2 to GN. + +## R/qtl2 + +R/qtl2 is packaged in guix and can be run in a shell with + +``` +guix shell -C r r-qtl2 +> library(qtl2) +> bxd <- read_cross2(file = "bxd_cancer_new_GN_July_2024.json") +Warning messages: +1: In recode_geno(sheet, genotypes) : + 630519 genotypes treated as missing: "H", "U" +2: In matrix(as.numeric(unlist(pheno)), ncol = nc) : + NAs introduced by coercion +3: In check_cross2(output) : Physical map out of order on chr 1, 2, 11, 19 +``` + +The first warning matches above. If data is missing it may be filtered out. We'll have to check for that. The third warning I am not sure about. Probably a ranking of markers. + +# Conclusion + +It turned out that R/qtl was running HK - so it was a QTL mapping rather than an LMM. diff --git a/issues/genenetwork/cannot-connect-to-mariadb.gmi b/issues/genenetwork/cannot-connect-to-mariadb.gmi new file mode 100644 index 0000000..3dfe1bc --- /dev/null +++ b/issues/genenetwork/cannot-connect-to-mariadb.gmi @@ -0,0 +1,121 @@ +# Cannot Connect to MariaDB + + +## Description + +GeneNetwork3 is failing to connect to mariadb with the error: + +``` +⋮ +2024-11-05 14:49:00 Traceback (most recent call last): +2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/flask/app.py", line 1523, in full_dispatch_request +2024-11-05 14:49:00 rv = self.dispatch_request() +2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/flask/app.py", line 1509, in dispatch_request +2024-11-05 14:49:00 return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args) +2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/gn3/api/menu.py", line 13, in generate_json +2024-11-05 14:49:00 with 
database_connection(current_app.config["SQL_URI"], logger=current_app.logger) as conn: +2024-11-05 14:49:00 File "/gnu/store/lzw93sik90d780n09svjx5la1bb8g3df-python-3.10.7/lib/python3.10/contextlib.py", line 135, in __enter__ +2024-11-05 14:49:00 return next(self.gen) +2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/gn3/db_utils.py", line 34, in database_connection +2024-11-05 14:49:00 connection = mdb.connect(db=db_name, +2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/MySQLdb/__init__.py", line 121, in Connect +2024-11-05 14:49:00 return Connection(*args, **kwargs) +2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/MySQLdb/connections.py", line 195, in __init__ +2024-11-05 14:49:00 super().__init__(*args, **kwargs2) +2024-11-05 14:49:00 MySQLdb.OperationalError: (2002, "Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)") +``` + +We have previously defined the default socket file[^1][^2] as "/run/mysqld/mysqld.sock". + +## Troubleshooting Logs + +### 2024-11-05 + +I attempted to just bind `/run/mysqld/mysqld.sock` to `/tmp/mysql.sock` by adding the following mapping in GN3's `gunicorn-app` definition: + +``` +(file-system-mapping + (source "/run/mysqld/mysqld.sock") + (target "/tmp/mysql.sock") + (writable? #t)) +``` + +but that does not fix things. + +I had tried to change the mysql URI to use IP addresses, i.e. + +``` +SQL_URI="mysql://webqtlout:webqtlout@128.169.5.119:3306/db_webqtl" +``` + +but that simply changes the error from the above to the one below: + +``` +2024-11-05 15:27:12 MySQLdb.OperationalError: (2002, "Can't connect to MySQL server on '128.169.5.119' (115)") +``` + +I tried with both `127.0.0.1` and `128.169.5.119`. 
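The behaviour seen above matches the MySQL client convention: a host of `localhost` means "use the unix socket", while an IP address forces a TCP connection (which the container's network namespace may then block). A small sketch of that dispatch logic — the helper name and default socket path are assumptions for illustration:

```python
from urllib.parse import urlparse

def connection_target(sql_uri, socket_path="/run/mysqld/mysqld.sock"):
    """Mimic the client-side choice between unix socket and TCP:
    'localhost' (or no host) selects the socket, anything else TCP."""
    parts = urlparse(sql_uri)
    if parts.hostname in (None, "localhost"):
        return ("socket", socket_path)
    return ("tcp", f"{parts.hostname}:{parts.port or 3306}")

print(connection_target("mysql://webqtlout:webqtlout@localhost/db_webqtl"))
# ('socket', '/run/mysqld/mysqld.sock')
print(connection_target("mysql://webqtlout:webqtlout@128.169.5.119:3306/db_webqtl"))
# ('tcp', '128.169.5.119:3306')
```

This is why switching the URI from `localhost` to an IP only trades the socket error (2002, socket path) for a TCP error (2002/115): the socket path and the network route are failing independently.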
+ +My hail-mary was to attempt to expose the `my.cnf` file generated by the `mysql-service-type` definition to the "pola-wrapper", but that is proving tricky, seeing as the file is generated elsewhere[^4] and we do not have a way of figuring out the actual final path of the file. + +I tried: + +``` +(file-system-mapping + (source (mixed-text-file "my.cnf" + (string-append "[client]\n" + "socket=/run/mysqld/mysqld.sock"))) + (target "/etc/mysql/my.cnf")) +``` + +but that did not work either. + +### 2024-11-07 + +Start digging into how GNU Guix services are defined[^5] to try and understand why the file mapping attempt did not work. + +=> http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/system/file-systems.scm?id=2394a7f5fbf60dd6adc0a870366adb57166b6d8b#n575 +Looking at the code linked above specifically at lines 575 to 588, and 166, it seems, to me, that the mappings attempt should have worked. + +Try it again, taking care to verify that the paths are correct, with: + +``` +(file-system-mapping + (source (mixed-text-file "my.cnf" + (string-append "[client-server]\n" + "socket=/run/mysqld/mysqld.sock"))) + (target "/etc/my.cnf")) +``` + +Try rebuilding on tux04: started getting `Segmentation fault` errors out of the blue for many guix commands 🤦🏿. +Try building container on local dev machine: this took a long time - quit and continue later. + +### 2024-11-08 + +After guix broke, causing the `Segmentation fault` errors above, I did some troubleshooting and was able to finally fix that by pinning guix to version b0b988c41c9e0e591274495a1b2d6f27fcdae15a as shown in the troubleshooting transcript[^6]. + +Now the fixes I did to make python requests work with the newer guix (defined in guix-bioinformatics[^7]) seem to be leading to failures in the older guix version. + +Let me attempt rebasing to reorder the commits, to make the python requests commit come last, to more easily do a `git reset` before rebuilding the container — not successful. 
+=> https://git.genenetwork.org/gn-machines/commit/?h=production-container&id=610049b2bfa32cae5d3f992b95aac711290efa2a Manually "undo" the changes in a new commit, + +then rebuild the container. This exposes a bug in gn-auth. + +=> https://git.genenetwork.org/gn-auth/commit/?id=4c21d0e43cf0de1084d0e0a243e441c6e72236eb Fix that. + +and update the `public-jwks-uri` value for the client in the admin dashboard, and voila!!! Now the system works. + +Attempt pulling guix "2394a7f5fbf60dd6adc0a870366adb57166b6d8b" into a profile locally: went through without a hitch + +Upgrade guix daemon, and restart it. Delete profile and run `guix gc`, then try pulling guix "2394a7f5fbf60dd6adc0a870366adb57166b6d8b" again. It also went through without a problem. This eliminates the daemon being the culprit: Running `sudo -i guix pull --list-generations` on both tux04 and my local dev machine gives both daemon commits as `2a6d96425eea57dc6dd48a2bec16743046e32e06`. + + +### Footnotes + +=> https://git.genenetwork.org/gn-machines/tree/production.scm?id=46a1c4c8d01198799e6ac3b99998dca40d2c7094#n47 [^1] Lines 47 to 49 of production.scm +=> https://guix.gnu.org/manual/en/html_node/Database-Services.html#index-mysql_002dconfiguration [^2] Guix's mysql-service-type configurations +=> https://mariadb.com/kb/en/server-system-variables/#socket [^3] MariaDB configuration variables: socket +=> https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/services/databases.scm?id=4c56d0cccdc44e12484b26332715f54768738c5f#n576 [^4] Guix: mysql-service-type configuration code +=> https://guix.gnu.org/manual/en/html_node/Defining-Services.html [^5] Guix documentation: Defining Services +=> https://github.com/genenetwork/gn-gemtext-threads/blob/d785b06643b5e5a2470fd0da075dcf77bda82d16/miscellaneous/broken-guix-on-tux04-20241108.org [^6] Broken guix on tux04: Troubleshooting transcript +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=eb7beb340a9731775e8ad177e47b70dba2f2a84f [^7] guix-bioinformatics: 
Upgrade guix channel to 2394a7f diff --git a/issues/genenetwork/containerising-production-issues.gmi b/issues/genenetwork/containerising-production-issues.gmi new file mode 100644 index 0000000..ed5702a --- /dev/null +++ b/issues/genenetwork/containerising-production-issues.gmi @@ -0,0 +1,33 @@ +# Containerising Production: Issues + +## Tags + +* type: bug +* assigned: fredm +* priority: critical +* status: closed, completed +* keywords: production, container, tux04 +* interested: alexk, aruni, bonfacem, fredm, pjotrp, soloshelby, zsloan, jnduli + +## Description + +We recently got production into a container and deployed it. It has come up, however, that some services useful for running a full-featured GeneNetwork system are not part of the container. + +This is, therefore, a meta-issue, tracking all issues that relate to the deployment of the disparate services that make up GeneNetwork. + +## Documentation + +=> https://issues.genenetwork.org/topics/genenetwork/genenetwork-services + +The link above documents the various services that make up the GeneNetwork service. 
+ +## Issues + +* [x] Move user directories to a large partition +=> ./handle-tmp-dirs-in-container [x] Link TMPDIR in container to a directory on a large partition +=> ./markdown-editing-service-not-deployed [ ] Define and deploy Markdown Editing service +=> ./umhet3-samples-timing-slow [ ] Figure out and fix UM-HET3 Samples mappings on Tux04 +=> ./setup-mailing-on-tux04 [x] Setting up email service on Tux04 +=> ./virtuoso-shutdown-clears-data [x] Virtuoso seems to lose data on restart +=> ./python-requests-error-in-container [x] Fix python's requests library certificates error +=> ./cannot-connect-to-mariadb [ ] GN3 cannot connect to mariadb server diff --git a/issues/genenetwork/handle-tmp-dirs-in-container.gmi b/issues/genenetwork/handle-tmp-dirs-in-container.gmi new file mode 100644 index 0000000..5f6eb92 --- /dev/null +++ b/issues/genenetwork/handle-tmp-dirs-in-container.gmi @@ -0,0 +1,22 @@ +# Handle Temporary Directories in the Container + +## Tags + +* type: feature +* assigned: fredm +* priority: critical +* status: closed, completed +* keywords: production, container, tux04 +* interested: alexk, aruni, bonfacem, pjotrp, zsloan + +## Description + +The container's temporary directories should be in a large partition on the host to avoid a scenario where the writes fill up one of the smaller drives. + +Currently, we use the `/tmp` directory by default, but we should look into transitioning away from that — `/tmp` is world readable and world writable and therefore needs careful consideration to keep safe. + +Thankfully, we are running our systems within a container, and can bind the container's `/tmp` directory to a non-world-accessible directory, keeping things at least contained. 
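Python's `tempfile` module honours `TMPDIR`, so pointing it at a bind-mounted directory on a large partition is enough to redirect scratch files. A sketch of that mechanism — the helper is illustrative, not the container code, which achieves the same effect by binding `/tmp` on the host side:

```python
import os
import tempfile

def redirect_tmpdir(path):
    """Point tempfile at `path` via TMPDIR (illustrative sketch)."""
    os.environ["TMPDIR"] = path
    tempfile.tempdir = None  # drop the cached default so TMPDIR is re-read
    return tempfile.gettempdir()

# Stand-in for a directory on the large partition.
big = tempfile.mkdtemp(prefix="big-partition-")
print(redirect_tmpdir(big) == big)  # True
```

Note that `tempfile` caches the chosen directory, hence the reset of `tempfile.tempdir`; processes started before `TMPDIR` is set will not pick up the change.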
+ +### Fixes + +=> https://git.genenetwork.org/gn-machines/commit/?id=7306f1127df9d4193adfbfa51295615f13d32b55 diff --git a/issues/genenetwork/markdown-editing-service-not-deployed.gmi b/issues/genenetwork/markdown-editing-service-not-deployed.gmi new file mode 100644 index 0000000..e7a1717 --- /dev/null +++ b/issues/genenetwork/markdown-editing-service-not-deployed.gmi @@ -0,0 +1,34 @@ +# Markdown Editing Service: Not Deployed + +## Tags + +* type: bug +* status: open +* assigned: fredm +* priority: critical +* keywords: production, container, tux04 +* interested: alexk, aruni, bonfacem, fredm, pjotrp, zsloan + +## Description + +The Markdown Editing service is not working on production. + +* Link: https://genenetwork.org/facilities/ +* Repository: https://git.genenetwork.org/gn-guile + +Currently, the code is being run directly on the host, rather than inside the container. + +Some important things to note: + +* The service requires access to a checkout of https://github.com/genenetwork/gn-docs +* Currently, the service is hard-coded to use a specific port: we should probably fix that. + +## Reopened: 2024-11-01 + +While the service was deployed, the edit functionality is not working right; specifically, pushing the edits upstream to the remote seems to fail. + +If you do an edit and refresh the page, the edit will show up in the system, but it will not be pushed up to the remote. + +Setting `CGIT_REPO_PATH="https://git.genenetwork.org/gn-guile"` seems to allow the commit to work, but we do not actually get the changes pushed to the remote in any useful sense. + +It seems to me that we need to configure the environment so that it is able to push the changes to the remote. 
diff --git a/issues/genenetwork/python-requests-error-in-container.gmi b/issues/genenetwork/python-requests-error-in-container.gmi new file mode 100644 index 0000000..0289762 --- /dev/null +++ b/issues/genenetwork/python-requests-error-in-container.gmi @@ -0,0 +1,174 @@ +# Python Requests Error in Container + +## Tags + +* type: bug +* assigned: fredm +* priority: critical +* status: closed, completed, fixed +* interested: alexk, aruni, bonfacem, pjotrp, zsloan +* keywords: production, container, tux04, python, requests + +## Description + +Building the container with the +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=eb7beb340a9731775e8ad177e47b70dba2f2a84f upgraded guix definition +leads to python's requests library failing. + +``` +2024-10-30 16:04:13 OSError: Could not find a suitable TLS CA certificate bundle, invalid path: /etc/ssl/certs/ca-certificates.crt +``` + +If you login to the container itself, however, you find that the file `/etc/ssl/certs/ca-certificates.crt` actually exists and has content. + +Possible fixes suggested are to set up correct envvars for the requests library, such as `REQUESTS_CA_BUNDLE` + +See +=> https://requests.readthedocs.io/en/latest/user/advanced/#ssl-cert-verification + +### Troubleshooting Logs + +Try reproducing the issue locally: + +``` +$ guix --version +hint: Consider installing the `glibc-locales' package and defining `GUIX_LOCPATH', along these lines: + + guix install glibc-locales + export GUIX_LOCPATH="$HOME/.guix-profile/lib/locale" + +See the "Application Setup" section in the manual, for more info. + +guix (GNU Guix) 2394a7f5fbf60dd6adc0a870366adb57166b6d8b +Copyright (C) 2024 the Guix authors +License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> +This is free software: you are free to change and redistribute it. +There is NO WARRANTY, to the extent permitted by law. 
+$ +$ guix shell --container --network python python-requests coreutils +[env]$ ls "${GUIX_ENVIRONMENT}/etc" +ld.so.cache profile +``` + +We see from the above that there are no certificates in the environment with just python and python-requests. + +Okay. Now let's write a simple python script to test things out with: + +``` +import requests + +resp = requests.get("https://github.com") +print(resp) +``` + +and run it! + +``` +$ guix shell --container --network python python-requests coreutils -- python3 test.py +Traceback (most recent call last): + File "/tmp/test.py", line 1, in <module> + import requests + File "/gnu/store/b6ny4p29f32rrnnvgx7zz1nhsms2zmqk-profile/lib/python3.10/site-packages/requests/__init__.py", line 164, in <module> + from .api import delete, get, head, options, patch, post, put, request + File "/gnu/store/b6ny4p29f32rrnnvgx7zz1nhsms2zmqk-profile/lib/python3.10/site-packages/requests/api.py", line 11, in <module> + from . import sessions + File "/gnu/store/b6ny4p29f32rrnnvgx7zz1nhsms2zmqk-profile/lib/python3.10/site-packages/requests/sessions.py", line 15, in <module> + from .adapters import HTTPAdapter + File "/gnu/store/b6ny4p29f32rrnnvgx7zz1nhsms2zmqk-profile/lib/python3.10/site-packages/requests/adapters.py", line 81, in <module> + _preloaded_ssl_context.load_verify_locations( +FileNotFoundError: [Errno 2] No such file or directory +``` + +Uhmm, what is this new error? + +Add `nss-certs` and try again. 
+ +``` +$ guix shell --container --network python python-requests nss-certs coreutils +[env]$ ls ${GUIX_ENVIRONMENT}/etc/ssl/ +certs +[env]$ python3 test.py +Traceback (most recent call last): + File "/tmp/test.py", line 1, in <module> + import requests + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/__init__.py", line 164, in <module> + from .api import delete, get, head, options, patch, post, put, request + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/api.py", line 11, in <module> + from . import sessions + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/sessions.py", line 15, in <module> + from .adapters import HTTPAdapter + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/adapters.py", line 81, in <module> + _preloaded_ssl_context.load_verify_locations( +FileNotFoundError: [Errno 2] No such file or directory +[env]$ +[env]$ export REQUESTS_CA_BUNDLE="${GUIX_ENVIRONMENT}/etc/ssl/certs/ca-certificates.crt" +[env]$ python3 test.py +Traceback (most recent call last): + File "/tmp/test.py", line 1, in <module> + import requests + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/__init__.py", line 164, in <module> + from .api import delete, get, head, options, patch, post, put, request + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/api.py", line 11, in <module> + from .
import sessions + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/sessions.py", line 15, in <module> + from .adapters import HTTPAdapter + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/adapters.py", line 81, in <module> + _preloaded_ssl_context.load_verify_locations( +FileNotFoundError: [Errno 2] No such file or directory +``` + +Welp! Looks like this error is a whole different thing. + +Let us try with the genenetwork2 package. + +``` +$ guix shell --container --network genenetwork2 coreutils +[env]$ ls "${GUIX_ENVIRONMENT}/etc" +bash_completion.d jupyter ld.so.cache profile +``` + +This does not seem to have the certificates in place either, so let's add nss-certs + +``` +$ guix shell --container --network genenetwork2 coreutils nss-certs +[env]$ ls "${GUIX_ENVIRONMENT}/etc" +bash_completion.d jupyter ld.so.cache profile ssl +[env]$ python3 test.py +Traceback (most recent call last): + File "/tmp/test.py", line 3, in <module> + resp = requests.get("https://github.com") + File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/api.py", line 73, in get + return request("get", url, params=params, **kwargs) + File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/api.py", line 59, in request + return session.request(method=method, url=url, **kwargs) + File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/sessions.py", line 587, in request + resp = self.send(prep, **send_kwargs) + File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/sessions.py", line 701, in send + r = adapter.send(request, **kwargs) + File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/adapters.py", line 460, in send + self.cert_verify(conn, request.url, verify, cert) + File 
"/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/adapters.py", line 263, in cert_verify + raise OSError( +OSError: Could not find a suitable TLS CA certificate bundle, invalid path: /etc/ssl/certs/ca-certificates.crt +``` + +We get the expected certificates error! This is good. Now define the envvar and try again. + +``` +[env]$ export REQUESTS_CA_BUNDLE="${GUIX_ENVIRONMENT}/etc/ssl/certs/ca-certificates.crt" +[env]$ python3 test.py +<Response [200]> +``` + +Success!!! + +Adding nss-certs and setting the `REQUESTS_CA_BUNDLE` fixes things. We'll need to do the same for the container, for both the genenetwork2 and genenetwork3 packages (and any other packages that use requests library). + +### Fixes + +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=fec68c4ca87eeca4eb9e69e71fc27e0eae4dd728 +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=c3bb784c8c70857904ef97ecd7d36ec98772413d +The two commits above add nss-certs package to all the flask apps, which make use of the python-requests library, which requires a valid CA certificates bundle in each application's environment. + +=> https://git.genenetwork.org/gn-machines/commit/?h=production-container&id=04506c4496e5ca8b3bc38e28ed70945a145fb036 +The commit above defines the "REQUESTS_CA_BUNDLE" environment variable for all the flask applications that make use of python's requests library. diff --git a/issues/genenetwork/setup-mailing-on-tux04.gmi b/issues/genenetwork/setup-mailing-on-tux04.gmi new file mode 100644 index 0000000..45605d9 --- /dev/null +++ b/issues/genenetwork/setup-mailing-on-tux04.gmi @@ -0,0 +1,16 @@ +# Setup Mailing on Tux04 + +## Tags + +* type: bug +* status: closed +* assigned: fredm +* priority: critical +* interested: pjotrp, zsloan +* keywords: production, container, tux04 + +## Description + +We use emails to verify user accounts and allow changing of user passwords. 
We therefore need to set up a way to send emails from the system. + +I updated the configurations to use UTHSC's mail server. diff --git a/issues/genenetwork/umhet3-samples-timing-slow.gmi b/issues/genenetwork/umhet3-samples-timing-slow.gmi new file mode 100644 index 0000000..a3a33a7 --- /dev/null +++ b/issues/genenetwork/umhet3-samples-timing-slow.gmi @@ -0,0 +1,72 @@ +# UM-HET3 Timing: Slow + +## Tags + +* type: bug +* status: open +* assigned: fredm +* priority: critical +* interested: fredm, pjotrp, zsloan +* keywords: production, container, tux04, UM-HET3 + +## Description + +In an email from @robw: + +``` +> > Not sure why. Am I testing the wrong way? +> > Are we using memory and RAM in the same way on the two machines? +> > Here are data on the loading time improvement for Tux2: +> > I tested this using a "worst case" trait that we know when—the 25,000 +> > UM-HET3 samples: +> > [1]https://genenetwork.org/show_trait?trait_id=10004&dataset=HET3-ITPPu +> > blish +> > Tux02: 15.6, 15.6, 15.3 sec +> > Fallback: 37.8, 38.7, 38.5 sec +> > Here are data on Gemma speed/latency performance: +> > Also tested "worst case" performance using three large BXD data sets +> > tested in this order: +> > [2]https://genenetwork.org/show_trait?trait_id=10004&dataset=BXD-Longev +> > ityPublish +> > [3]https://genenetwork.org/show_trait?trait_id=10003&dataset=BXD-Longev +> > ityPublish +> > [4]https://genenetwork.org/show_trait?trait_id=10002&dataset=BXD-Longev +> > ityPublish +> > Tux02: 107.2, 329.9 (ouch), 360.0 sec (double ouch) for 1004, 1003, and +> > 1002 respectively. On recompute (from cache) 19.9, 19.9 and 20.0—still +> > too slow. +> > Fallback: 154.1, 115.9 for the first two traits (trait 10002 already in +> > the cache) +> > On recompute (from cache) 59.6, 59.0 and 59.7. Too slow from cache. +> > PROBLEM 2: Tux02 is unable to map UM-HET3. I still get an nginx 413 +> > error: Entity Too Large. +> +> Yeah, Fred should fix that one.
It is an nginx setting - we run 2x +> nginx. It was reported earlier. +> +> > I need this to work asap. Now mapping our amazing UM-HET3 data. I can +> > use Fallback, but it is painfully slow and takes about 214 sec. I hope +> > Tux02 gets that down to a still intolerable slow 86 sec. +> > Can we please fix and confirm by testing. The Trait is above for your +> > testing pleasure. +> > Even 86 secs is really too slow and should motivate us (or users like +> > me) to think about how we are using all of those 24 ultra-fast cores on +> > the AMD 9274F. Why not put them all to use for us and users. It is not +> > good enough just to have "it work". It has to work in about 5–10 +> > seconds. +> > Here are my questions for you guys: Are we able to use all 24 cores +> > for any one user? How does each user interact with the CPU? Can we +> > handle a class of 24 students with 24 cores, or is it "complicated"? +> > PROBLEM 3: Zach, Fred. Are we computing render time or transport +> > latency correctly? Ideally the printout at the bottom of mapping pages +> > would be true latency as experienced by the user. As far as I can tell +> > with a stop watch our estimates of time are incorrect by as much as 3 +> > secs. And note that the link +> > to [5]http://joss.theoj.org/papers/10.21105/joss.00025 is not working +> > correctly in the footer (see image below). Oddly enough it works fine +> > on Tux02 +> +> Fred, take a note. +``` + +Figure out what this is about and fix it. 
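
On PROBLEM 2: the 413 "Entity Too Large" response is nginx rejecting the request body, since its `client_max_body_size` directive defaults to 1m, and the UM-HET3 mapping POST easily exceeds that. Since we run two nginx instances, the limit has to be raised in both. A sketch follows; the 50m figure is an assumption, not a tested value:

```
# In the http, server, or location block of BOTH nginx configurations:
client_max_body_size 50m;   # default is 1m; nginx returns 413 above this
```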
diff --git a/issues/genenetwork/virtuoso-shutdown-clears-data.gmi b/issues/genenetwork/virtuoso-shutdown-clears-data.gmi new file mode 100644 index 0000000..2e01238 --- /dev/null +++ b/issues/genenetwork/virtuoso-shutdown-clears-data.gmi @@ -0,0 +1,98 @@ +# Virtuoso: Shutdown Clears Data + +## Tags + +* type: bug +* assigned: fredm +* priority: critical +* status: closed, completed +* interested: bonfacem, pjotrp, zsloan +* keywords: production, container, tux04, virtuoso + +## Description + +It seems that virtuoso has the bad habit of clearing data whenever it is stopped/restarted. + +This issue will track the work necessary to get the service behaving correctly. + +According to the documentation on +=> https://vos.openlinksw.com/owiki/wiki/VOS/VirtBulkRDFLoader the bulk loading process + +``` +The bulk loader also disables checkpointing and the scheduler, which also need to be re-enabled post bulk load +``` + +That needs to be handled. + +### Notes + +After having a look at +=> https://docs.openlinksw.com/virtuoso/ch-server/#databaseadmsrv the configuration documentation +it occurs to me that the reason virtuoso supposedly clears the data is that the `DatabaseFile` value is not set, so it defaults to a new database file every time the server is restarted (See also the `Striping` setting). + +### Troubleshooting + +Reproduce locally: + +We begin by getting a look at the settings for the remote virtuoso: +``` +$ ssh tux04 +fredm@tux04:~$ cat /gnu/store/bg6i4x96nm32gjp4qhphqmxqc5vggk3h-virtuoso.ini +[Parameters] +ServerPort = localhost:8981 +DirsAllowed = /var/lib/data +NumberOfBuffers = 4000000 +MaxDirtyBuffers = 3000000 +[HTTPServer] +ServerPort = localhost:8982 +``` + +Copy these into a file locally, and adjust the `NumberOfBuffers` and `MaxDirtyBuffers` for a smaller local dev environment. Also update `DirsAllowed`.
+ +We end up with our local configuration in `~/tmp/virtuoso/etc/virtuoso.ini` with the content: + +``` +[Parameters] +ServerPort = localhost:8981 +DirsAllowed = /var/lib/data +NumberOfBuffers = 10000 +MaxDirtyBuffers = 6000 +[HTTPServer] +ServerPort = localhost:8982 +``` + +Run virtuoso! +``` +$ cd ~/tmp/virtuoso/var/lib/virtuoso/ +$ ls +$ ~/opt/virtuoso/bin/virtuoso-t +foreground +configfile ~/tmp/virtuoso/etc/virtuoso.ini +``` + +Here we start by changing into the `~/tmp/virtuoso/var/lib/virtuoso/` directory, which is where virtuoso will put its state. Now, in a different terminal, list the files created in the state directory: + +``` +$ ls ~/tmp/virtuoso/var/lib/virtuoso +virtuoso.db virtuoso.lck virtuoso.log virtuoso.pxa virtuoso.tdb virtuoso.trx +``` + +That creates the database file (and other files) with the documented default values, i.e. `virtuoso.*`. + +We cannot quite reproduce the issue locally, since every restart locally uses exactly the same file names. + +Checking the state directory for virtuoso on tux04, however: + +``` +fredm@tux04:~$ sudo ls -al /export2/guix-containers/genenetwork/var/lib/virtuoso/ | grep '\.db$' +-rw-r--r-- 1 986 980 3787456512 Oct 28 14:16 js1b7qjpimdhfj870kg5b2dml640hryx-virtuoso.db +-rw-r--r-- 1 986 980 4152360960 Oct 28 17:11 rf8v0c6m6kn5yhf00zlrklhp5lmgpr4x-virtuoso.db
``` + +We see that there are multiple db files, each created when virtuoso was restarted. There is an extra (possibly) random string prepended to the `virtuoso.db` part. This happens for our service if we do not actually provide the `DatabaseFile` configuration.
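
The fix, then, is to name the files explicitly so every restart reuses the same database. A sketch of such a `virtuoso.ini` fragment; the paths are illustrative, and the exact keys should be checked against the Virtuoso configuration documentation:

```
[Database]
DatabaseFile    = /var/lib/virtuoso/virtuoso.db
TransactionFile = /var/lib/virtuoso/virtuoso.trx
```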
+ + +## Fixes + +=> https://github.com/genenetwork/gn-gemtext-threads/commit/8211c1e49498ba2f3b578ed5b11b15c52299aa08 Document how to restart checkpointing and the scheduler after bulk loading +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=2dc335ca84ea7f26c6977e6b432f3420b113f0aa Add configs for scheduler and checkpointing +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=7d793603189f9d41c8ee87f8bb4c876440a1fce2 Set up virtuoso database configurations +=> https://git.genenetwork.org/gn-machines/commit/?id=46a1c4c8d01198799e6ac3b99998dca40d2c7094 Explicitly name virtuoso database files. diff --git a/issues/genenetwork2-account-registration-error.gmi b/issues/genenetwork2-account-registration-error.gmi index d617f93..14b6322 100644 --- a/issues/genenetwork2-account-registration-error.gmi +++ b/issues/genenetwork2-account-registration-error.gmi @@ -5,7 +5,7 @@ * type: bug * priority: critical * assigned: zachs, zsloan, fredm -* status: open +* status: closed, completed * keywords: genenetwork2, account management, user, registration ## Description diff --git a/issues/genenetwork2-cd-sometimes-fails-to-restart.gmi b/issues/genenetwork2-cd-sometimes-fails-to-restart.gmi index d2d2013..603de59 100644 --- a/issues/genenetwork2-cd-sometimes-fails-to-restart.gmi +++ b/issues/genenetwork2-cd-sometimes-fails-to-restart.gmi @@ -10,4 +10,7 @@ A reminder that CD logs are publicly accessible on tux02. => /topics/cd-logs ## Resolution + This issue has been re-opened. Originally, we believed that the restart failures were due to occasional breakage in GN code, and were not a problem with the CI/CD system itself. This will need further investigation to figure out what the root cause is. 
+ +* closed diff --git a/issues/genenetwork2/broken-collections-features.gmi b/issues/genenetwork2/broken-collections-features.gmi new file mode 100644 index 0000000..4239929 --- /dev/null +++ b/issues/genenetwork2/broken-collections-features.gmi @@ -0,0 +1,44 @@ +# Broken Collections Features + +## Tags + +* type: bug +* status: open +* priority: high +* assigned: zachs, fredm +* keywords: gn2, genenetwork2, genenetwork 2, collections + +## Description + +There are some features in the search results page, and/or the collections page that are broken — these are: + +* "CTL" feature +* "MultiMap" feature +* "Partial Correlations" feature +* "Generate Heatmap" feature + +### Reproduce Issue + +* Go to https://genenetwork.org +* Select "Mouse (Mus musculus, mm10)" for "Species" +* Select "BXD Family" for "Group" +* Select "Traits and Cofactors" for "Type" +* Select "BXD Published Phenotypes" for "Dataset" +* Type "locomotion" in the "Get Any" field (without the quotes) +* Click "Search" +* In the results page, select the traits with the following "Record" values: "BXD_10050", "BXD_10051", "BXD_10088", "BXD_10091", "BXD_10092", "BXD_10455", "BXD_10569", "BXD_10570", "BXD_11316", "BXD_11317" +* Click the "Add" button and add them to a new collection +* In the resulting collections page, click the button for any of the listed failing features above + +### Failure modes + +* The "CTL" and "WGCNA" features have a failure mode that might have been caused by recent changes making use of AJAX calls, rather than submitting the form manually. +* The "MultiMap" and "Generate Heatmap" features raise exceptions that need to be investigated and resolved +* The "Partial Correlations" feature seems to run forever + +## Break-out Issues + +We break out the issues above into separate pages to track the progress of the fixes for each feature separately.
+ +=> /issues/genenetwork3/ctl-maps-error +=> /issues/genenetwork3/generate-heatmaps-failing diff --git a/issues/genenetwork2/fix-display-for-time-consumed-for-correlations.gmi b/issues/genenetwork2/fix-display-for-time-consumed-for-correlations.gmi new file mode 100644 index 0000000..0c8e9c8 --- /dev/null +++ b/issues/genenetwork2/fix-display-for-time-consumed-for-correlations.gmi @@ -0,0 +1,15 @@ +# Fix Display for the Time Consumed for Correlations + +## Tags + +* type: bug +* status: closed, completed +* priority: low +* assigned: @alexm, @bonz +* keywords: gn2, genenetwork2, genenetwork 2, gn3, genenetwork3 genenetwork 3, correlations, time display + +## Description + +The breakdown of the time consumed for the correlations computations, displayed at the bottom of the page, is not representative of reality. The time that GeneNetwork3 (or background process) takes for the computations is not actually represented in the breakdown, leading to wildly inaccurate displays of total time. + +This will need to be fixed. 
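
A minimal sketch of the shape of the fix: time the GN3/background call explicitly so it appears in the breakdown. The names here are hypothetical, not GN2's actual API:

```
import time

def timed(label, timings, func, *args, **kwargs):
    """Run func, recording its wall-clock duration (seconds) under label."""
    start = time.monotonic()
    result = func(*args, **kwargs)
    timings[label] = time.monotonic() - start
    return result

timings = {}
# Stand-in for the GN3 computation the current breakdown leaves out:
timed("gn3_compute", timings, lambda: time.sleep(0.05))
timings["total"] = sum(timings.values())
print(timings)
```

Summing the recorded labels into the total keeps the displayed breakdown consistent with the time the user actually waited.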
diff --git a/issues/genenetwork2/haley-knott-regression-mapping-error.gmi b/issues/genenetwork2/haley-knott-regression-mapping-error.gmi new file mode 100644 index 0000000..25bb221 --- /dev/null +++ b/issues/genenetwork2/haley-knott-regression-mapping-error.gmi @@ -0,0 +1,80 @@ +# Haley-Knott Regression Mapping Error + +## Tags + +* type: bug +* status: closed, completed +* priority: high +* assigned: fredm +* keywords: gn2, genenetwork2, genenetwork 2, mapping, haley-knott + +## Description + +To run the mapping: + +* Do a search +* Click on any trait in the results +* On the trait page, expand the "Mapping Tools" section +* Select the "Haley-Knott Regression" option under "Mapping Tools" +* Click "Compute" + +On running the mapping as above, we got the following error: + +``` + GeneNetwork 2.11-rc2 https://gn2-fred.genenetwork.org/run_mapping ( 6:14AM UTC Sep 11, 2024) +Traceback (most recent call last): + File "/gnu/store/hgcvlkn4bjl0f9wqiakpk5w66brbfxk6-profile/lib/python3.10/site-packages/flask/app.py", line 1523, in full_dispatch_request + rv = self.dispatch_request() + File "/gnu/store/hgcvlkn4bjl0f9wqiakpk5w66brbfxk6-profile/lib/python3.10/site-packages/flask/app.py", line 1509, in dispatch_request + return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args) + File "/gnu/store/hgcvlkn4bjl0f9wqiakpk5w66brbfxk6-profile/lib/python3.10/site-packages/gn2/wqflask/views.py", line 1004, in mapping_results_page + gn1_template_vars = display_mapping_results.DisplayMappingResults( + File "/gnu/store/hgcvlkn4bjl0f9wqiakpk5w66brbfxk6-profile/lib/python3.10/site-packages/gn2/wqflask/marker_regression/display_mapping_results.py", line 651, in __init__ + self.perm_filename = self.drawPermutationHistogram() + File "/gnu/store/hgcvlkn4bjl0f9wqiakpk5w66brbfxk6-profile/lib/python3.10/site-packages/gn2/wqflask/marker_regression/display_mapping_results.py", line 3056, in drawPermutationHistogram + Plot.plotBar(myCanvas, perm_output, XLabel=self.LRS_LOD, + File 
"/gnu/store/hgcvlkn4bjl0f9wqiakpk5w66brbfxk6-profile/lib/python3.10/site-packages/gn2/utility/Plot.py", line 184, in plotBar + scaleFont = ImageFont.truetype(font=COUR_FILE, size=11) + File "/gnu/store/hgcvlkn4bjl0f9wqiakpk5w66brbfxk6-profile/lib/python3.10/site-packages/PIL/ImageFont.py", line 959, in truetype + return freetype(font) + File "/gnu/store/hgcvlkn4bjl0f9wqiakpk5w66brbfxk6-profile/lib/python3.10/site-packages/PIL/ImageFont.py", line 956, in freetype + return FreeTypeFont(font, size, index, encoding, layout_engine) + File "/gnu/store/hgcvlkn4bjl0f9wqiakpk5w66brbfxk6-profile/lib/python3.10/site-packages/PIL/ImageFont.py", line 247, in __init__ + self.font = core.getfont( +OSError: cannot open resource +``` + +### Hypothesis + +My hypothesis is that the use of relative paths[fn:1] is the cause of the failure. + +When running the application with the working directory being the root of the GeneNetwork2 repository, use of the relative paths works well. Unfortunately, that assumption breaks quickly if the application is ever run outside of the root of the GN2 repo. + +Verification: + +*Question*: Does the application run on root of GN2 repository/package? + +* Log out the path of the font file and use the results to answer the question +* https://github.com/genenetwork/genenetwork2/commit/ca8018a61f2e014b4aee4da2cbd00d7b591b2f6a +* https://github.com/genenetwork/genenetwork2/commit/01d56903ba01a91841d199fe393f9b307a7596a2 + +*Answer*: No! 
The application does not run with the working directory on the root of the GN2 repository/package, as evidenced by this snippet from the logs: + +``` +2024-09-11 07:41:13 [2024-09-11 07:41:13 +0000] [494] [DEBUG] POST /run_mapping +2024-09-11 07:41:18 [2024-09-11 07:41:18 +0000] [494] [DEBUG] Font file path: /gn2/wqflask/static/fonts/courbd.ttf +2024-09-11 07:41:18 DEBUG:gn2.wqflask:Font file path: /gn2/wqflask/static/fonts/courbd.ttf +2024-09-11 07:41:18 [2024-09-11 07:41:18 +0000] [494] [ERROR] https://gn2-fred.genenetwork.org/run_mapping ( 7:41AM UTC Sep 11, 2024) +2024-09-11 07:41:18 Traceback (most recent call last): +``` + +We see from this that the application seems to be running with the working directory being "/" rather than the root for the application's package files. + +### Fixes + +* https://github.com/genenetwork/genenetwork2/commit/d001c1e7cae8f69435545b8715038b1d0fc1ee62 +* https://git.genenetwork.org/guix-bioinformatics/commit/?id=7a1bf5bc1c3de67f01eabd23e1ddc0150f81b22b + +# Footnotes + +[fn:1] https://github.com/genenetwork/genenetwork2/blob/50fc0b4bc4106164745afc7e1099bb150f6e635f/gn2/utility/Plot.py#L44-L46 diff --git a/issues/genenetwork2/handle-oauth-errors-better.gmi b/issues/genenetwork2/handle-oauth-errors-better.gmi new file mode 100644 index 0000000..462ded5 --- /dev/null +++ b/issues/genenetwork2/handle-oauth-errors-better.gmi @@ -0,0 +1,17 @@ +# Handle OAuth Errors Better + +## Tags + +* type: bug +* status: open +* priority: high +* assigned: fredm +* interested: zachs, robw +* keywords: gn2, genenetwork2, ui, user interface, oauth, oauth errors + +## Description + +When a session expires, for whatever reason, a notification is displayed to the user as shown in the image below: +=> ./session_expiry_oauth_error.png + +The message is a little jarring to the end user. Make it gentler, and probably more informative, so the user is not as surprised. 
diff --git a/issues/genenetwork2/mapping-error.gmi b/issues/genenetwork2/mapping-error.gmi new file mode 100644 index 0000000..2e28491 --- /dev/null +++ b/issues/genenetwork2/mapping-error.gmi @@ -0,0 +1,51 @@ +# Mapping Error + +## Tags + +* type: bug +* status: open +* priority: medium +* assigned: zachs, fredm, flisso +* keywords: gn2, genenetwork2, genenetwork 2, mapping + +## Reproduction + +* Go to https://staging.genenetwork.org/ +* For 'Species' select "Arabidopsis (Arabidopsis thaliana, araTha1)" +* For 'Group' select "BayXSha(RIL by sib-mating)" +* For 'Type' select "arabidopsis seeds" +* For 'Dataset' select "Arabidopsis BayXShaXRIL_expr_reg _ATH1" +* Leave 'Get Any' blank +* Enter "*" for "Combined" +* Click "Search" +* On the search results page, click on "AT1G01010" +* Expand the "Mapping Tools" section +* For 'Chromosome' select "All" +* For 'Minor Allele ≥' enter "0.05" +* For 'Use LOCO' select "Yes" +* Ignore covariates +* Click "Compute" + +### Expected + +The system would compute the maps and display the mapping diagram(s) and data. + +### Actual + +The computation fails with: + +``` + GeneNetwork 2.11-rc2 https://staging.genenetwork.org/loading ( 6:50PM UTC Jul 03, 2024) +Traceback (most recent call last): + File "/gnu/store/jsvqai0gz6fn40k7kx3r12yq4hzfini6-profile/lib/python3.10/site-packages/flask/app.py", line 1523, in full_dispatch_request + rv = self.dispatch_request() + File "/gnu/store/jsvqai0gz6fn40k7kx3r12yq4hzfini6-profile/lib/python3.10/site-packages/flask/app.py", line 1509, in dispatch_request + return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args) + File "/gnu/store/jsvqai0gz6fn40k7kx3r12yq4hzfini6-profile/lib/python3.10/site-packages/gn2/wqflask/views.py", line 812, in loading_page + for sample in samples: +TypeError: 'NoneType' object is not iterable +``` + +### Updates + +This is likely just because the genotype file doesn't exist in the necessary format (BIMBAM). 
We probably need to convert the R/qtl2 genotypes to BIMBAM. diff --git a/issues/genenetwork2/refresh-token-failure.gmi b/issues/genenetwork2/refresh-token-failure.gmi new file mode 100644 index 0000000..dd33341 --- /dev/null +++ b/issues/genenetwork2/refresh-token-failure.gmi @@ -0,0 +1,108 @@ +# Refresh Token Failure + +## Tags + +* status: open +* priority: high +* type: bug +* assigned: fredm, zsloan, zachs +* keywords: gn2, genenetwork2 + +## Description + +* Go to https://genenetwork.org +* Click "Sign in" and sign in to the application +* Wait 15 minutes +* Close the entire browser +* Open the browser and go to https://genenetwork.org +* Observe the "ERROR" message at the "Collections" link's badge + +The expectation is that the Collections badge would list the number of collection the user has, rather than the error message. + +The logs fail with an 'invalid_client' error: + +``` +2025-01-08 20:48:56 raise self.oauth_error_class( +2025-01-08 20:48:56 authlib.integrations.base_client.errors.OAuthError: invalid_client: +2025-01-08 20:48:56 ERROR:gn2.wqflask:Error loading number of collections +2025-01-08 20:48:56 Traceback (most recent call last): +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/gn2/wqflask/__init__.py", +line 55, in numcoll +2025-01-08 20:48:56 return num_collections() +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/gn2/wqflask/oauth2/collect +ions.py", line 13, in num_collections +2025-01-08 20:48:56 all_collections = all_collections + oauth2_get( +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/gn2/wqflask/oauth2/client. 
+py", line 168, in oauth2_get +2025-01-08 20:48:56 resp = oauth2_client().get( +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/requests/sessions.py", lin +e 600, in get +2025-01-08 20:48:56 return self.request("GET", url, **kwargs) +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/authlib/integrations/reque +sts_client/oauth2_session.py", line 109, in request +2025-01-08 20:48:56 return super(OAuth2Session, self).request( +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/requests/sessions.py", lin +e 573, in request +2025-01-08 20:48:56 prep = self.prepare_request(req) +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/requests/sessions.py", lin +e 484, in prepare_request +2025-01-08 20:48:56 p.prepare( +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/requests/models.py", line +372, in prepare +2025-01-08 20:48:56 self.prepare_auth(auth, url) +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/requests/models.py", line +603, in prepare_auth +2025-01-08 20:48:56 r = auth(self) +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/authlib/integrations/reque +sts_client/oauth2_session.py", line 24, in __call__ +2025-01-08 20:48:56 self.ensure_active_token() +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/authlib/integrations/reque +sts_client/oauth2_session.py", line 20, in ensure_active_token +2025-01-08 20:48:56 if self.client and not self.client.ensure_active_token(self.token): +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/authlib/oauth2/client.py", + 
line 262, in ensure_active_token +2025-01-08 20:48:56 self.refresh_token(url, refresh_token=refresh_token) +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/authlib/oauth2/client.py", + line 252, in refresh_token +2025-01-08 20:48:56 return self._refresh_token( +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/authlib/oauth2/client.py", + line 373, in _refresh_token +2025-01-08 20:48:56 token = self.parse_response_token(resp) +2025-01-08 20:48:56 File "/gnu/store/3n1cl5cxal3qk7p9q363qgm2ag45a177-profile/lib/python3.10/site-packages/authlib/oauth2/client.py", + line 340, in parse_response_token +2025-01-08 20:48:56 raise self.oauth_error_class( +2025-01-08 20:48:56 authlib.integrations.base_client.errors.OAuthError: invalid_client: +``` + + +### Troubleshooting + +The following commits were done as part of the troubleshooting: + +=> https://github.com/genenetwork/genenetwork2/commit/55da5809d851a3c8bfa13637947b019a2c02cc93 +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=d1cada0f0933732eb68b7786fb04ea541d8c51c9 +=> https://github.com/genenetwork/genenetwork2/commit/93dd7f7583af4e0bdd3c7b9c88d375fdc4b40039 +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=5fe04ca1545f740cbb91474576891c7fd1dff13a +=> https://github.com/genenetwork/genenetwork2/commit/2031da216f3b62c23dca64eb6d1c533c07dc81f1 +=> https://github.com/genenetwork/genenetwork2/commit/125c436f5310b194c10385ce9d81135518ac0adf +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=758e6f0fbf6af4af5b94b9aa5a9264c31f050153 +=> https://github.com/genenetwork/genenetwork2/commit/8bf483a3ab23ebf25d73380e78271c368ff06b2d +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=f1ee97a17e670b12112d48bea8969e2ee162f808 +=> https://github.com/genenetwork/genenetwork2/commit/de01f83090184fc56dce2f9887d2dc910edc60fe +=> 
https://github.com/genenetwork/genenetwork2/commit/91017b97ee346e73bed9b77e3f3f72daa4acbacd +=> https://github.com/genenetwork/genenetwork2/commit/7e6bfe48167c70d26e27b043eb567608bc1fda84 +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=1f71a1e78af87266e7a4170ace8860111a1569d6 +=> https://github.com/genenetwork/genenetwork2/commit/9bdc8ca0b17739c1df9dc504f8cd978296b987dd +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=02a9a99e7e3c308157f7d740a244876ab4196337 +=> https://github.com/genenetwork/genenetwork2/commit/236a48835dc6557ba0ece6aef6014f496ddb163e +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=f928be361d2e331d72448416300c331e47341807 +=> https://github.com/genenetwork/genenetwork2/commit/5fb56c51ad4eaff13a7e24b6022dffb7d82aa41d +=> https://github.com/genenetwork/genenetwork2/commit/c6c9ef71718d650f9c19ae459d6d4e25e72de00a +=> https://github.com/genenetwork/genenetwork2/commit/dc606f39fb4aad74004959a6a15e481fa74d52ff +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=4ab597b734968916af5bae6332756af8168783b3 +=> https://github.com/genenetwork/genenetwork2/commit/854639bd46293b6791c629591fd934d1f34038ac +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=7e0083555150d151e566cebed4bd82d69e347eb6 +=> https://github.com/genenetwork/genenetwork2/commit/c4508901027a2d3ea98e1e9b3f8767a455cad02f +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=955e4ce9370be9811262d7c73fa5398385cc04d8 + + diff --git a/issues/genenetwork2/session_expiry_oauth_error.png b/issues/genenetwork2/session_expiry_oauth_error.png Binary files differnew file mode 100644 index 0000000..34e2dda --- /dev/null +++ b/issues/genenetwork2/session_expiry_oauth_error.png diff --git a/issues/genenetwork3/01828928-26e6-4cad-bbc8-59fd7a7977de.json.zip b/issues/genenetwork3/01828928-26e6-4cad-bbc8-59fd7a7977de.json.zip Binary files differnew file mode 100644 index 0000000..7681b88 --- /dev/null +++ 
b/issues/genenetwork3/01828928-26e6-4cad-bbc8-59fd7a7977de.json.zip diff --git a/issues/genenetwork3/broken-aliases.gmi new file mode 100644 index 0000000..5735a1c --- /dev/null +++ b/issues/genenetwork3/broken-aliases.gmi @@ -0,0 +1,27 @@ +# Broken Aliases + +## Tags + +* type: bug +* status: open +* priority: high +* assigned: fredm +* interested: pjotrp +* keywords: aliases, aliases server + + +## Repository + +=> https://github.com/genenetwork/gn3 + +## Bug Report + +### Actual + +* Go to https://genenetwork.org/gn3/gene/aliases2/Shh,Brca2 +* Note that an exception is raised, with a "404 Not Found" message + +### Expected + +* We expect a list of aliases to be returned for the given symbols, as is done in https://fallback.genenetwork.org/gn3/gene/aliases2/Shh,Brca2 + diff --git a/issues/genenetwork3/check-for-mandatory-settings.gmi new file mode 100644 index 0000000..16a2f8a --- /dev/null +++ b/issues/genenetwork3/check-for-mandatory-settings.gmi @@ -0,0 +1,40 @@ +# Check for Mandatory Settings + +## Tags + +* status: open +* priority: high +* type: bug, improvement +* interested: fredm, bonz +* assigned: jnduli, rookie101 +* keywords: GN3, gn3, genenetwork3, settings, config, configs, configurations + +## Explanation + +Giving defaults to some important settings leads to situations where the configuration is not set up correctly, leading at best to outright failure and, at worst, to subtle failures that can be difficult to debug: e.g. when a default URI to a server points to an active domain, just not the correct one. + +We want to make such (arguably, sensitive) configurations explicit, and avoid giving them defaults. We want to check that they are set up before allowing the application to run, and fail loudly and obnoxiously if they are not provided.
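+As a non-authoritative sketch of what such a loud start-up check could look like (the setting names below are hypothetical examples, not the actual GN3 configuration):

```python
# Minimal sketch: fail loudly at start-up if mandatory settings are missing.
# The setting names here are hypothetical, not the real GN3 config keys.
MANDATORY_SETTINGS = ("SECRET_KEY", "AUTH_SERVER_URL", "SPARQL_ENDPOINT")

def assert_mandatory_settings(config, keys=MANDATORY_SETTINGS):
    """Raise immediately if any mandatory setting is unset or empty."""
    missing = [key for key in keys if not config.get(key)]
    if missing:
        raise RuntimeError(
            "FATAL: undefined mandatory settings: " + ", ".join(missing))
    return config
```

Running this before the application serves any request makes a missing secret or URI an immediate, obvious crash rather than a subtle runtime failure.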
+ +Examples of configuration variables that should be checked for: + +* All external URIs (external to app/repo under consideration) +* All secrets (secret keys, salts, tokens, etc) + +We should also eliminate from the defaults: + +* Computed values +* Calls to get values from ENVVARs (`os.environ.get(…)` calls) + +### Note on ENVVARs + +Environment variables should be used to override values under specific conditions; therefore, they should be loaded both explicitly and last, to ensure they actually override settings. + +=> https://git.genenetwork.org/gn-auth/tree/gn_auth/__init__.py?id=3a276642bea934f0a7ef8f581d8639e617357a2a#n70 See this example for a possible way of allowing ENVVARs to override settings. + +The example above could be improved by checking for environment variables starting with a specific prefix, e.g. the envvar `GNAUTH_SECRET_KEY` would override the `SECRET_KEY` configuration. This allows us to override settings without having to change the code. + +## Tasks + +* [ ] Explicitly check configs for ALL external URIs +* [ ] Explicitly check configs for ALL secrets +* [ ] Explicitly load ENVVARs last to override settings diff --git a/issues/genenetwork3/ctl-maps-error.gmi new file mode 100644 index 0000000..6726357 --- /dev/null +++ b/issues/genenetwork3/ctl-maps-error.gmi @@ -0,0 +1,46 @@ +# CTL Maps Error + +## Tags + +* type: bug +* status: open +* priority: high +* assigned: alexm, zachs, fredm +* keywords: CTL, CTL Maps, gn3, genenetwork3, genenetwork 3 + +## Description + +Trying to run the CTL Maps feature in the collections page as described in +=> /issues/genenetwork2/broken-collections-feature + +We get an error in the results page of the form: + +``` +{'error': '{\'code\': 1, \'output\': \'Loading required package: MASS\\nLoading required package: parallel\\nLoading required package: qtl\\nThere were 13 warnings (use warnings() to see them)\\nError in xspline(x, y, shape = 0, lwd =
lwd, border = col, lty = lty, : \\n invalid value specified for graphical parameter "lwd"\\nCalls: ctl.lineplot -> draw.spline -> xspline\\nExecution halted\\n\'}'} +``` + +On the CLI the same error is rendered: +``` +Loading required package: MASS +Loading required package: parallel +Loading required package: qtl +There were 13 warnings (use warnings() to see them) +Error in xspline(x, y, shape = 0, lwd = lwd, border = col, lty = lty, : + invalid value specified for graphical parameter "lwd" +Calls: ctl.lineplot -> draw.spline -> xspline +Execution halted +``` + +On my local development machine, the command run was +``` +Rscript /home/frederick/genenetwork/genenetwork3/scripts/ctl_analysis.R /tmp/01828928-26e6-4cad-bbc8-59fd7a7977de.json +``` + +Here is a zipped version of the json file (follow the link and click download): +=> https://github.com/genenetwork/gn-gemtext-threads/blob/main/issues/genenetwork3/01828928-26e6-4cad-bbc8-59fd7a7977de.json.zip + +After troubleshooting for a while, I suspect +=> https://github.com/genenetwork/genenetwork3/blob/27d9c9d6ef7f37066fc63af3d6585bf18aeec925/scripts/ctl_analysis.R#L79-L80 this is the offending code. + +=> https://cran.r-project.org/web/packages/ctl/ctl.pdf The ctl library manual +The manual indicates that our call above might be okay, which might mean something changed in the dependencies that the ctl library uses.
diff --git a/issues/genenetwork3/generate-heatmaps-failing.gmi new file mode 100644 index 0000000..522dc27 --- /dev/null +++ b/issues/genenetwork3/generate-heatmaps-failing.gmi @@ -0,0 +1,64 @@ +# Generate Heatmaps Failing + +## Tags + +* type: bug +* status: open +* priority: medium +* assigned: fredm, zachs, zsloan +* keywords: genenetwork3, gn3, GN3, heatmaps + +## Reproduce + +* Go to https://genenetwork.org/ +* Under the "Select and Search" menu, enter "synap*" for the "Get Any" field +* Click "Search" +* In the search results page, select the first 10 traits +* Click "Add" +* Under "Create a new collection" enter the name "newcoll" and click "Create collection" +* In the collections page that shows up, click "Select All" once +* Ensure all the traits are selected +* Click "Generate Heatmap" and wait +* Note how the system fails silently with no heatmap presented + +### Notes + +On https://gn2-fred.genenetwork.org heatmap generation fails with a note ("ERROR: undefined"). In the logs, I see "Module 'scipy' has no attribute 'array'", which seems to be due to newer SciPy releases dropping the deprecated `scipy.array` alias for `numpy.array`. +=> https://github.com/MaartenGr/BERTopic/issues/1791 +=> https://github.com/scipy/scipy/issues/19972 + +This issue should not be present with python-plotly@5.20.0 but since guix-bioinformatics pins the guix version to `b0b988c41c9e0e591274495a1b2d6f27fcdae15a`, we are not able to pull in newer versions of packages from guix.
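+For reference, `scipy.array` was a re-export of `numpy.array` that recent SciPy releases removed, so affected call sites need to call NumPy directly. A minimal sketch of the kind of change required (the actual heatmap call sites may differ):

```python
import numpy as np

# Before (breaks on newer SciPy, which dropped its numpy aliases):
#     import scipy
#     values = scipy.array([1.0, 2.0, 3.0])

# After -- call numpy directly:
values = np.array([1.0, 2.0, 3.0])
```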
+ + +### Update 2025-04-08T10:59CDT + +Got the following error when I ran the background command manually: + +``` +$ export RUST_BACKTRACE=full +$ /gnu/store/dp4zq4xiap6rp7h6vslwl1n52bd8gnwm-profile/bin/qtlreaper --geno /home/frederick/genotype_files/genotype/genotype/BXD.geno --n_permutations 1000 --traits /tmp/traits_test_file_n2E7V06Cx7.txt --main_output /tmp/qtlreaper/main_output_NGVW4sfYha.txt --permu_output /tmp/qtlreaper/permu_output_MJnzLbrsrC.txt +thread 'main' panicked at src/regression.rs:216:25: +index out of bounds: the len is 20 but the index is 20 +stack backtrace: + 0: 0x61399d77d46d - <unknown> + 1: 0x61399d7b5e13 - <unknown> + 2: 0x61399d78b649 - <unknown> + 3: 0x61399d78f26f - <unknown> + 4: 0x61399d78ee98 - <unknown> + 5: 0x61399d78f815 - <unknown> + 6: 0x61399d77d859 - <unknown> + 7: 0x61399d77d679 - <unknown> + 8: 0x61399d78f3f4 - <unknown> + 9: 0x61399d6f4063 - <unknown> + 10: 0x61399d6f41f7 - <unknown> + 11: 0x61399d708f18 - <unknown> + 12: 0x61399d6f6e4e - <unknown> + 13: 0x61399d6f9e93 - <unknown> + 14: 0x61399d6f9e89 - <unknown> + 15: 0x61399d78e505 - <unknown> + 16: 0x61399d6f8d55 - <unknown> + 17: 0x75ee2b945bf7 - __libc_start_call_main + 18: 0x75ee2b945cac - __libc_start_main@GLIBC_2.2.5 + 19: 0x61399d6f4861 - <unknown> + 20: 0x0 - <unknown> +``` diff --git a/issues/genenetwork3/rqtl2-mapping-error.gmi b/issues/genenetwork3/rqtl2-mapping-error.gmi new file mode 100644 index 0000000..480c7c6 --- /dev/null +++ b/issues/genenetwork3/rqtl2-mapping-error.gmi @@ -0,0 +1,42 @@ +# R/qtl2 Maps Error + +## Tags + +* type: bug +* status: open +* priority: high +* assigned: alexm, zachs, fredm +* keywords: R/qtl2, R/qtl2 Maps, gn3, genetwork3, genenetwork 3 + +## Reproduce + +* Go to https://genenetwork.org/ +* In the "Get Any" field, enter "synap*" and press the "Enter" key +* In the search results, click on the "1435464_at" trait +* Expand the "Mapping Tools" accordion section +* Select the "R/qtl2" option +* Click "Compute" +* In the "Computing 
the Maps" page that results, click on "Display System Log" + +### Observed + +A traceback is observed, with an error of the following form: + +``` +⋮ +FileNotFoundError: [Errno 2] No such file or directory: '/opt/gn/tmp/gn3-tmpdir/JL9PvKm3OyKk.txt' +``` + +### Expected + +The mapping runs successfully and the results are presented in the form of a mapping chart/graph and a table of values. + +### Debug Notes + +The directory "/opt/gn/tmp/gn3-tmpdir/" exists, and is actually used by other mappings (i.e. The "R/qtl" and "Pair Scan" mappings) successfully. + +This might imply a code issue: Perhaps +* a path is hardcoded, or +* the wrong path value is passed + +The same error occurs on https://cd.genenetwork.org but does not seem to prevent CD from running the mapping to completion. Maybe something is missing on production — what, though? diff --git a/issues/genotype_search_bug.gmi b/issues/genotype_search_bug.gmi new file mode 100644 index 0000000..0f05f4e --- /dev/null +++ b/issues/genotype_search_bug.gmi @@ -0,0 +1,13 @@ +# The * Search for Genotypes Not Working + +## Tags + +* type: bug +* priority: medium +* status: closed +* assigned: zsloan +* keywords: bug, search + +## Description + +Currently * searches for genotypes return no results, even when data exists. 
diff --git a/issues/global-search-results.gmi b/issues/global-search-results.gmi deleted file mode 100644 index 9cd773a..0000000 --- a/issues/global-search-results.gmi +++ /dev/null @@ -1,32 +0,0 @@ -# Global search does not return results - -## Tags - -* priority: critical -* type: bug -* assigned: zsloan, pjotrp -* status: unclear -* keywords: global search, from github - -## Description - -=> https://github.com/genenetwork/genenetwork2/issues/629 From GitHub - -> Try a search for Brca2 -> -> I am trying to add an example to this storyboard: -> -> => https://github.com/genenetwork/gn-docs/blob/master/story-boards/starting-from-known-gene/starting-from-known-gene.md#use-the-search-page -> -> -> Interestingly luna does no better: -> -> => http://luna.genenetwork.org/gsearch?type=gene&terms=brca2 - -@pjotr @zsloan, it seems to me this might be fixed, but please have a look and fix it in case it is not - -## Resolution - -With the new xapian search, this issue is no more. - -* closed diff --git a/issues/global-search-unhandled-error.gmi b/issues/global-search-unhandled-error.gmi index b2f6ba8..7626280 100644 --- a/issues/global-search-unhandled-error.gmi +++ b/issues/global-search-unhandled-error.gmi @@ -5,7 +5,7 @@ * assigned: aruni, fredm * priority: high * type: bug -* status: open +* status: closed * keywords: global search, gn2, genenetwork2 ## Description @@ -15,3 +15,7 @@ assume the request will always be successful. This is not always the case, as ca => https://test3.genenetwork.org/gsearch?type=gene&terms=Priscilla here (as of 2024-03-04T11:25+03:00UTC). Possible errors should be checked for and handled before attempting to read and/or process expected data. + +## Closing Comments + +This issue is closed as obsoleted. The issue is really old (>=7 months). Closing it for now. To be reopened if the issue happens again. 
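+A hedged sketch of the kind of defensive fetch the description above calls for, i.e. checking for errors before reading expected data (the endpoint and the `results` field are illustrative, not the actual GN2 code):

```python
import json
from urllib.request import urlopen
from urllib.error import URLError

def fetch_search_results(uri: str) -> list:
    """Fetch search results, checking for errors instead of assuming success."""
    try:
        with urlopen(uri, timeout=10) as response:
            payload = json.load(response)
    except (URLError, ValueError):  # ValueError also covers malformed JSON
        # Handle the failure explicitly rather than crashing downstream.
        return []
    # Only read the expected data if the response has the expected shape.
    return payload.get("results", []) if isinstance(payload, dict) else []
```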
diff --git a/issues/gn-auth/email_verification.gmi b/issues/gn-auth/email_verification.gmi index 8147bb5..fff3d54 100644 --- a/issues/gn-auth/email_verification.gmi +++ b/issues/gn-auth/email_verification.gmi @@ -2,7 +2,7 @@ ## Tags -* status: open +* status: closed, completed * priority: medium * type: enhancement * assigned: fredm, zsloan @@ -17,3 +17,5 @@ SMTP_PORT = 25 (not 587, which is what we first tried) SMTP_TIMEOUT = 200 # seconds Not sure about username/password yet. We tried UNKNOWN/UNKNOWN and my own (Zach's) username/password + +Note that this host is only visible on the internal network of UTHSC. It won't work for tux02. diff --git a/issues/gn-auth/example-privileges-script.gmi b/issues/gn-auth/example-privileges-script.gmi new file mode 100644 index 0000000..afda1a1 --- /dev/null +++ b/issues/gn-auth/example-privileges-script.gmi @@ -0,0 +1,36 @@ +# Example Python script for setting privileges for user/group + +## Description + +This is just an example of a python script for setting user/group privileges, for potential future reference + +Before running this script, stop the crontab job that automatically sets unlinked resource privileges + +```python +import uuid +import sqlite3 + +group_id = '0510dc91-0eb6-4d9d-97e5-405acc84ba2b' +resource_id = 'e5cc773d-ca28-44e2-b2a7-1c2901794238' + +publishxrefs = 
('10955','10957','10960','10961','10964','10966','10969','10970','10973','10975','10978','10979','10982','10984','10987','10988','12486','12487','12489','12490','12491','12492','12493','12494','12495','12496','12497','12498','12499','12500','12501','12502','12503','12504','12505','12506','12507','12508','12509','12510','12511','12512','12513','12514','12515','12516','12517','12518','12519','12520','12521','12522','12523','12524','12525','12526','12527','12528','12529','12530','12531','12532','12533','12534','12535','12536','12537','12538','12539','12540','12541','12542','12543','12544','12545','12546','12547','12548','12549','12550','12551','12566','12567','12568','12569','12574','12575','12576','12577','12578','12579','12580','12621','12735','12737','12741','12742','12743','12744','12745','12780','12781','12782','12783','12784','12785','12786','12787','12788','12789','12790','12791','12792','12793','12794','12795','12796','12797','12798','12799','12800','12801','12803','12804','12805','12806','12807','12808','12809','12810','12812','12813','12816','12817','12961','12962','12963','12964','12965','12966','12967','12970','13029','14803','14804','14805','14806','15572','15573','16197','16375','17329','17330','17331','17332','17333','17334','17335','17336','17337','17338','17339','17340','17341','17342') + +# I generated these separatedly with uuid.uuid4(); I probably could have just done this in the script itself, but wanted to make sure they stayed the same +data_link_ids = ('3041366d-1ffd-45fb-9617-043772b285c8', 'da41fc30-3cd6-4b41-83b5-8fedc4ccd65f', '364a4010-e3fe-470f-a8c9-2a9fd359a4e3', '4e878c0a-cc92-4b21-8152-310266291967', 'ab50a999-e9bb-4bb6-91c0-9828b804156e', 'd50d30e9-15f9-4578-8b48-2bcb0d7a8afb', 'd42d2ef5-278f-4b5e-ae57-10f49f48c2e9', '78c022d7-390b-4688-96c6-c1afadd45877', '17fca9ae-8e71-4c55-b035-15d04f96d936', '4f9893de-fccf-4d6a-845d-df2f83e4d06c', '8a660b03-786a-4143-9fb3-9d00e888f3a2', '3965417a-e47a-47c8-81f6-991eef8c4152', 
'e27707f7-5832-4e3f-9391-849e964bbaf6', 'bf9f6ff0-a131-46ef-8a2e-c37d8b66f992', '1ee744c4-95e1-4a66-958c-e785dc937563', '0fa79294-bbdc-4701-861d-9bb91ea72588', '38665214-7cdd-4b01-81dc-d1b78e63a0b0', '82a237df-96ce-404e-b052-8dbe45e793ee', 'ec4c1848-d326-462b-9c0d-f5e5c76e92f6', '46bee64b-8ce7-4910-80ec-211063725b1a', '7f489875-38b6-4cff-a05e-f11a7957b9b8', 'f39744a1-d673-406f-a2f1-c45082bb1975', '5f53a9e9-e40c-4a01-bf9d-430d7c2fd5ef', '1f0a4f2d-cd1c-41e5-a185-2ea2b2b05cd3', 'e282651c-7dc3-40e9-bb52-14e73c3a4ef7', '3c492e6d-e807-427b-acca-44afa4862894', '38e0df6c-3f44-4acb-9965-f0d3f0278150', '35e5ae63-3a32-49ac-93ed-b39d02ab5f5c', '0e6bfa4a-4fee-4b54-80c6-209f9b0ecd00', 'eb85e71a-8b4b-4f3f-9168-59b4ebc090a1', '3eb0325c-4dce-481e-bce7-46c37031da76', '7bc5ce49-4150-4d87-bfbf-d3a1cd20ad67', '03c0cba7-8712-4a27-9b79-e38818805b1f', '07d787ec-e0f9-4b7c-b368-d1f56ce030dc', '51d9e601-31c7-4643-b896-79d90bdc4105', '3cee3754-2822-4f0a-87ad-96bdfe2f0232', 'a7e9eb54-63bd-4ca9-a1f8-1aeac02a76db', '3ff132e5-7fb6-4763-943e-1efbe5f8000e', 'c685f0c9-084d-44d2-882e-ce66cdccef6d', 'ea062e07-1f59-4312-bfd9-6560e652c878', '75d33621-b5a4-447d-a094-7480d1d57a47', 'bb3dbd16-0c73-47d8-8e21-f095d3398b61', '0211177b-a92c-4215-a622-0cba5e8e2866', 'e2139b64-e74a-4263-9785-314e73b102df', '0426f12b-c223-487b-8ab7-baea5995c480', '4a467a72-174c-4ec7-9557-859656ad2c71', '38ab978e-e78f-4c0a-8af3-449b636fe5e6', 'a45c8d42-14d3-464d-8395-8a574148da78', 'e4171cc1-4a03-4311-a287-cee1b8084227', '75d70308-6f1a-49e4-9199-97ec8f60778e', 'efb5c834-b88a-4ee9-b09d-91913fddb546', '23866a00-a729-4ba9-af22-ee83ec164d34', '3feb1154-0613-464b-b758-aad308550a74', '7019d0f1-a590-46ce-a30e-4c21541b6ea8', '6e803182-71d2-4427-a5df-ad84651e5d11', 'fe1bf3f6-818b-4fae-9880-8ae2c1bdcff6', '66d480f7-da41-49ed-a222-8724b493313a', 'c908d2a3-8378-4574-83be-3bf8bdeff5fb', '96b36360-7258-43ab-bdda-23e93f15b0ac', 'daf90aca-6ee6-4c3c-9a60-1e7ae2e29cd2', '43800347-1fe1-40f7-9013-408f0b0740e9', 'e9350a78-a62f-4a08-8881-e6e51450d120', 
'bda9a217-d605-4a18-9c3f-5139679ae413', 'cbd8f79a-4992-43c9-8391-994e221b73e1', 'c6b64d90-63ff-482d-b205-f58f3cf656df', '3ecbf267-3655-42a6-a8f9-2751439efb27', '808ae753-a255-43a6-96d4-0ed02b14aefe', '1a5424df-49b3-4274-8281-a1eed838ffda', '89e6d278-e643-43a2-8a61-746cbf446109', 'b4940ece-80a0-4382-ba57-eaad1d35e83e', 'f46cd643-fccb-4037-b642-9a4a329e84e2', '497a235c-4253-4e94-a69c-4b2f200976dd', '02aa8e3a-f9ac-459b-8e35-7081f2849f48', 'da5018e2-38af-415a-ad43-8caf8d82290d', '574ee482-f534-475e-9e7a-0a14e05f4495', 'b90b3a02-fa8d-4393-9dbb-087224a80b40', 'd68370ec-f569-42f3-9c07-a3118aa73ad5', '4b6b099b-3a7c-46c2-a2fc-92c01463b698', 'c9f5608f-3301-4835-b6dc-b1891fe81c36', 'eead972c-0fc4-4c5e-b1ad-63db4d1e9409', 'd8b295eb-6d07-4abe-8b8a-8cfef066a32e', 'a89f3944-be64-42d0-aa66-d2501021760d', '02f42124-bc38-4a14-9400-bbc8e8bf41b7', 'abbcb901-da42-4ef1-bc2c-55b95d584461', 'e28b0cef-eddb-41f2-9479-722365c0b2e0', '9135c304-1dd3-4eb5-82d4-91a86e39068a', '0bbd5f1d-eef3-4c35-84ab-484165a4240d', '08ad9a25-b20d-4ad8-a5e0-a886edc4a7aa', '7e05bdf8-51f5-49dc-9ff6-fbbc6aa20c9f', 'c82d4943-dc6f-4ec8-b76f-1309290183fe', '6a8d76bc-156b-4925-823c-b4585a847efc', '2604e9a8-a4ee-49be-a754-126b1705516e', '8c32b69b-e796-418d-b254-104a179a84ba', '532dca31-c38e-4b77-a84c-563407e9ae00', '954cacda-179e-42a9-8c1f-987e6fae1079', 'bcfced8a-bd50-48e6-9edb-4776a1e95bf5', '66308324-1747-46df-8ddf-41e5bff1cd1a', 'f797e23c-7cb6-4869-97f5-3a79b685c6a3', '0869bb57-0133-4e57-9655-2b6eb1906f5e', 'fc0dddfa-e683-4a8d-9f57-82fb368f8a84', '35b7ffc1-6782-4c85-9bf8-d51629cab2d0', '232850b6-5a53-45e0-8668-7773b9cb39c2', 'af20291c-2be6-40e1-9576-b78df5d56774', 'f52f5c1a-1f8a-4b8a-8e00-fc2bdc6edc5b', '90819230-f372-4e48-96fc-6fb97199fa07', 'b31aefbf-fb67-49dc-b357-f8f0cd76cea9', '5d695f24-674a-4dc5-9e02-7817b77ab06b', '064d5972-f636-4771-95fe-3f6260fd550f', 'c2254f71-98dc-4303-bc26-9b9640582be1', '6eac9495-a366-4e65-90d2-d63472937925', '119398e3-b8cc-4ae5-addb-ec13db9834fa', '6cce7b35-fe2a-4348-9e42-5179ea9f42f1', 
'65940929-c9fc-47e9-b1cf-c9c9688f7871', '73ffdb1a-f70d-4e8e-88b7-0e22cfd1916e', 'c1b25581-7d28-4535-bcdc-44dc3bc7e438', '6e03a5f7-f200-439a-a465-97056d3c9f71', '4d270b71-2e06-4cfb-a60d-258ccbc7860a', '8b82e29f-a901-454f-a9ad-2f96be9d6c44', '7d699b76-f554-44db-9c68-6ff985cd6388', '3417b2dc-a88a-4cb6-a446-9e90063731f9', '18760f59-4b50-48d5-9814-8117490ab972', '4aaebf37-9529-4365-bdb8-dd53b0ac2499', '95ecdf43-12a5-4b3c-993a-ff03b58cee93', '2b5dd4e6-2310-417e-82bb-b16e96c7346b', '92ee883a-646d-44dd-b2c6-1bffb7b0d2cb', '979038e4-9392-4836-ad04-f125cf19eafa', '1220629d-000f-4508-8a41-3706eebeb812', '42abca44-8eb3-4aa7-adae-16afc211dff4', '82fe9559-718e-4424-9465-033204e1ec03', '8353fe08-e6c8-4f87-b0d8-412ab4a41d19', '1c6bebcf-c125-42a3-9d5b-4fae3113b62b', 'ba54b2ba-fee3-4f1d-a903-18edc7c694bd', '0ea0d40d-3204-4b9b-bae2-54355dce2b5c', '5ee4857c-00b4-46d6-880c-44dbae021b45', '2caa4c03-78ce-456d-8e20-edb531bdd45a', 'e2536a5e-357d-4f6d-a764-ac85a40a2f3f', 'e6341996-80bb-42f9-8842-92062680e957', '3612e03e-430d-4da3-ac87-93a310a3d780', '88c600d2-cefd-4a99-a904-bf2260554ac6', 'f1a6af16-2525-4650-b729-cbec60ad276c', '4b854252-9e87-4d7c-99d9-84ae9297d26e', 'be580989-3ccd-48bd-8c85-a750a800afbd', '5fd675fe-e765-4bf0-8e0f-8f81107a0bb8', 'cf852032-6399-4bf8-a8e7-474c84030430', 'eef27f8a-32d2-4add-a018-ff2d34208a11', '3aca3b1d-4589-4b4c-90de-588fd43fe835', 'd6187213-5a39-4089-ac50-eb144be2a3a5', '5bf60cda-b6b9-4992-91ac-c022e523202a', '4c4395ca-2f2e-4a85-93df-37d2c7f3d1d6', 'b8f9d837-2bd6-447c-9ad8-f581f84f36c1', '029a88bb-3850-4e85-87ab-8ecb3ad59538', '39ead890-0e1a-43df-9bbc-459a3ea0a016', '4b559ad2-c4d8-4763-bc08-90cb63fc79d0', '8361884a-248b-4dac-a9f9-d56f31ab477e', 'd79e2e00-9ea6-4d43-addc-3b1955bc7e5f', '4c0a35ac-c549-4c1a-9fc8-a2e93ba1c632', '50f558d0-c7b1-4204-8ebb-5855e7588998', 'be061746-1b34-4c04-a752-ab5c8d78fdef', 'f8edfb50-c572-4025-87c6-b34e88d8fb90', '0a799ff1-df2c-4c85-9b7e-4fe4885ab5cd', 'db373aa1-8ab9-4257-8d48-11dc92448344', '1e2b9de8-74a4-446a-970e-b47c662760b2', 
'ac09ffdf-9cb5-49be-8f52-b681598453f6', 'ae4a55af-a1bb-4698-b2e7-ffbed8760635', '7989ff1f-a9da-439a-bb8b-14482b15dd2e') + +# delete_query deletes from the AutoAdminGroup +delete_query = 'delete from linked_phenotype_data where group_id="5ea09f67-5426-4b66-9ea2-12bdd78350e8" and SpeciesId="1" and InbredSetId="1" and PublishFreezeId="1" and PublishXRefId=?' +resource_query = "insert into phenotype_resources values ('e5cc773d-ca28-44e2-b2a7-1c2901794238', ?)" +link_query = 'insert into linked_phenotype_data (data_link_id, group_id, SpeciesId, InbredSetId, PublishFreezeId, dataset_name, dataset_fullname, dataset_shortname, PublishXRefId) values (?,?,?,?,?,?,?,?,?)' + +db_path = '/home/gn2/auth.db' +conn = sqlite3.connect(db_path) +cursor = conn.cursor() + +the_data = tuple((dlid, group_id, 1, 1, 1, 'BXDPublish', 'BXD Phenotypes', 'BXD Publish', pxrid) for (dlid, pxrid) in zip(data_link_ids, publishxrefs)) + +cursor.executemany(delete_query, tuple((item,) for item in publishxrefs)) +cursor.executemany(link_query, the_data) +cursor.executemany(resource_query, tuple((item,) for item in data_link_ids)) +conn.commit() +``` diff --git a/issues/gn-auth/feature-request-create-test-accounts.gmi b/issues/gn-auth/feature-request-create-test-accounts.gmi new file mode 100644 index 0000000..9e8aa45 --- /dev/null +++ b/issues/gn-auth/feature-request-create-test-accounts.gmi @@ -0,0 +1,51 @@ +# Feature Request: Create Test Accounts + +## Tags + +* assigned: fredm, alex +* status: open +* type: feature request, feature-request +* priority: medium +* keywords: gn-auth, auth, test accounts + +## Description + +From the requests on Matrix: + +@alexm +``` +fredmanglis +: Can we create a generic, verified email for CD to make it easier for people to test our services that requires login? +``` + +and from @pjotrp + +``` +yes, please. Let it expire after a few weeks, or something, if possible. So we can hand out test accounts. 
+``` + +We thus want a feature that allows the system administrator, or some other user with the appropriate privileges, to create a bunch of test accounts that have the following properties: + +* The accounts are pre-verified +* The accounts are temporary and are deleted after a set amount of time + +This feature will need a corresponding UI, say on GN2, to enable users with the appropriate privileges to create the accounts easily. + +### Implementation Considerations + +Only system-admin level users will be able to create the test accounts. + +We'll probably need to track the plain-text passwords for these accounts. + +Information to collect might include: +* Start of test period (automatic on test account creation: mandatory) +* End of test period (entered at creation time: mandatory) +* A pattern of sorts to follow when creating the accounts — this brings up the question, is there a specific domain (e.g. …@uthsc.edu, …@genenetwork.org etc.) that these test accounts should use? +* Extra details on the event/conference necessitating creation of the test account(s) (optional) + + +Interactions with the rest of the system that we need to consider and handle are: +* Assign public-read for all public data: mostly easy. +* Forgot Password: If such users request a password change, what happens? Password changes require emails to be sent out with a time-sensitive token. The emails in the test accounts are not meant to be actual existing emails and thus cannot reliably receive such emails. This needs to be considered. Probably just prevent these users from changing their passwords. +* What group to assign to these test accounts? I'm thinking probably a new group that is also temporary - deleted when the users are deleted. +* What happens to any data uploaded by these accounts? They should probably not upload data meant to be permanent. All their data might need to be deleted along with the temporary accounts.
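+A rough sketch of the expiry clean-up half of this feature, using sqlite3 as the rest of gn-auth does; the `test_accounts` table and its columns are made-up names for illustration only:

```python
import sqlite3
from datetime import datetime, timezone

def delete_expired_test_accounts(conn: sqlite3.Connection) -> int:
    """Delete test accounts whose test period has ended.

    Assumes a hypothetical `test_accounts` table with a `test_period_end`
    column holding a UTC timestamp; the real gn-auth schema will differ.
    """
    now = datetime.now(timezone.utc).timestamp()
    cursor = conn.execute(
        "DELETE FROM test_accounts WHERE test_period_end < ?", (now,))
    conn.commit()
    return cursor.rowcount  # number of accounts removed
```

Such a function could be run from the same crontab machinery that already handles other periodic auth jobs.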
diff --git a/issues/gn-auth/fix-refresh-token.gmi b/issues/gn-auth/fix-refresh-token.gmi new file mode 100644 index 0000000..1a6a825 --- /dev/null +++ b/issues/gn-auth/fix-refresh-token.gmi @@ -0,0 +1,53 @@ +# Fix Refresh Token + +## Tags + +* status: open +* priority: high +* assigned: fredm +* type: feature-request, bug +* keywords: gn-auth, token, refresh token, jwt + +## Description + +The way we currently provide the refresh token is wrong, and complicated, and +leads to subtle bugs in the clients. + +The refresh tokens should be sent back together with the access token in the +same response with the following important considerations: + +* The access token is sent back as the body of the response +* The refresh token is sent back as a httpOnly cookie +* The refresh token should be opaque to the client — if it is a JWT, encrypt it + +### Server-Side Changes + +The following changes will be necessary at the generation of the access token: + +* Generate the refresh token (possibly in the `create_token_response()` function in `gn_auth.auth.authentication.oauth2.grants.JWTBearerGrant`). Put the user ID, and expiration in the refresh token. Expiration can be provided as part of initial request. +* Encrypt the refresh token (maybe use the auth-server's public key for this) +* Save refresh token to DB with link to access token ID perhaps? +* Attach the token to the response as a httpOnly cookie + +at the refreshing of the access token, we'll need to: + +* Fetch the refresh token from the cookies +* Decrypt it +* Compare the user ID in the refresh token with that in the access token provided +* Verify refresh token has not expired +* Check that the refresh token is not revoked (revocation will happen when user logs out, on manual sys-admin revocation) +* Generate new access token +* Do we attach the same refresh token or generate a new one? + +#### Gotchas + +Since there are multiple workers, you could get a flurry of refresh requests using the same refresh token. 
We might need to handle that — maybe save the refresh request to DB with the ID of the access token used and the new access token, and simply return the same new access token generated by the first successful refresh worker. + +This actually kills two birds with one stone: +* The refresh completes successfully if the refresh token is not expired and the access token is valid +* In case the access token and refresh token are somehow compromised, the system returns the same, possibly expired access token, rendering the compromise moot. + +### Client-Side Changes + +* Get the refresh token from the cookies rather than from the body +* Maybe: make refreshing the access token unaware of threads/workers diff --git a/issues/gn-auth/implement-redirect-on-login.gmi new file mode 100644 index 0000000..342b2e6 --- /dev/null +++ b/issues/gn-auth/implement-redirect-on-login.gmi @@ -0,0 +1,22 @@ +# Redirect Users to the Correct URL on Login for GN2 + +## Tags + +* assigned: alexm +* priority: medium +* status: in progress +* keywords: gn-auth, auth, redirect, login, completed, closed, done +* type: feature-request + +## Description + +The goal is to redirect users to the login page for services that require authentication, and then return them to the page they were trying to access before logging in, rather than sending them to the homepage. Additionally, display the message "You are required to log in" on the current page instead of on the homepage. + +## Tasks + +* [x] Redirect users to the login page if they are not logged in. +* [x] Implement a redirect to the correct resource after users log in.
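+The redirect-back pattern described above is commonly implemented in Flask (which GN2 uses) roughly as follows. This is an illustrative sketch, not the actual GN2 code: `require_login`, the routes, and the session-based login are all made up for the example.

```python
from functools import wraps
from flask import Flask, flash, redirect, request, session, url_for

app = Flask(__name__)
app.secret_key = "dev-only"  # illustrative; never hard-code in real code

def require_login(view):
    """Send anonymous users to login, remembering where they were going."""
    @wraps(view)
    def wrapper(*args, **kwargs):
        if "user" not in session:
            session["next"] = request.url  # remember the original target
            flash("You are required to log in")
            return redirect(url_for("login"))
        return view(*args, **kwargs)
    return wrapper

@app.route("/login")
def login():
    session["user"] = "demo"  # stand-in for the real OAuth2 login flow
    # Return the user to the page they wanted, not the homepage.
    return redirect(session.pop("next", url_for("home")))

@app.route("/")
def home():
    return "home"

@app.route("/collections")
@require_login
def collections():
    return "collections"
```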
+ +## Notes +See this PR for the commits that fix this: +=> https://github.com/genenetwork/genenetwork2/pull/875 diff --git a/issues/gn-auth/implement-refresh-token.gmi index 6b697eb..0dc63f3 100644 --- a/issues/gn-auth/implement-refresh-token.gmi +++ b/issues/gn-auth/implement-refresh-token.gmi @@ -2,7 +2,7 @@ ## Tags -* status: open +* status: closed, completed, fixed * priority: high * assigned: fredm, bonfacem * type: feature-request, bug diff --git a/issues/gn-auth/new-privilegs-samples-ordering.gmi new file mode 100644 index 0000000..be9cfe9 --- /dev/null +++ b/issues/gn-auth/new-privilegs-samples-ordering.gmi @@ -0,0 +1,32 @@ +# New Privileges: Samples Ordering + +## Tags + +* status: open +* assigned: fredm +* interested: @zachs, @jnduli, @flisso +* priority: medium +* type: feature-request, feature request +* keywords: gn-auth, auth, privileges, samples, ordering + +## Description + +From the email thread: + +``` +Regarding the order of samples, it can basically be whatever we decide it is. It just needs to stay consistent (like if there are multiple genotype files). It only really affects how it's displayed, and any other genotype files we use for mapping needs to share the same order. +``` + +Since this has nothing to do with the data analysis, this could be considered a system-level privilege. I propose + +``` +system:species:samples:ordering +``` + +or something similar. + +This can be added into some sort of generic GN2 curator role (as opposed to a data curator role). + +This allows us to have users that are "data curators", to whom we can offload some of the data curation work (e.g. @flisso, @suheeta etc.). + +We would then restrict the UI and display "curation" to users like @acenteno, @robw and @zachs. This second set of users would thus have both the "data curation" roles, and still have the "UI curation" roles.
diff --git a/issues/gn-auth/problems-with-roles.gmi b/issues/gn-auth/problems-with-roles.gmi index 46f3c52..2778b61 100644 --- a/issues/gn-auth/problems-with-roles.gmi +++ b/issues/gn-auth/problems-with-roles.gmi @@ -3,9 +3,9 @@ ## Tags * type: bug -* status: open * priority: critical * assigned: fredm, zachs +* status: closed, completed, fixed * keywords: gn-auth, authorisation, authorization, roles, privileges ## Description @@ -29,8 +29,8 @@ The implementation should instead, tie the roles to the specific resource, rathe * [x] migration: Add `resource:role:[create|delete|edit]-role` privileges to `resource-owner` role * [x] migration: Create new `resource_roles` db table linking each resource to roles that can act on it, and the user that created the role * [x] migration: Drop table `group_roles` deleting all data in the table: data here could already have privilege escalation in place -* [ ] Create a new "Roles" section on the "Resource-View" page, or a separate "Resource-Roles" page to handle the management of that resource's roles -* [ ] Ensure user can only assign roles they have created - maybe? +* [x] Create a new "Roles" section on the "Resource-View" page, or a separate "Resource-Roles" page to handle the management of that resource's roles +* [x] Ensure user can only assign roles they have created - maybe? 
### Fixes @@ -39,3 +39,4 @@ The implementation should instead, tie the roles to the specific resource, rathe => https://git.genenetwork.org/gn-auth/commit/?h=handle-role-privilege-escalation&id=5d34332f356164ce539044f538ed74b983fcc706 => https://git.genenetwork.org/gn-auth/commit/?h=handle-role-privilege-escalation&id=f691603a8e7a1700783b2be6f855f30d30f645f1 => https://git.genenetwork.org/gn-auth/commit/?h=handle-role-privilege-escalation&id=2363842cc81132a2592d5cda98e6ebf1305e8482 +=> https://github.com/genenetwork/genenetwork2/commit/a7a8754a57594e5705fea8e5bbea391a09e8f64c diff --git a/issues/gn-auth/registration.gmi b/issues/gn-auth/registration.gmi index 6558a6d..61ea94a 100644 --- a/issues/gn-auth/registration.gmi +++ b/issues/gn-auth/registration.gmi @@ -2,8 +2,11 @@ # Tags +* type: bug * assigned: fredm * priority: critical +* status: closed, completed, fixed +* keywords: gn-auth, auth, authorisation, authentication, registration # Issues diff --git a/issues/gn-auth/resources-duplicates-in-resources-list.gmi b/issues/gn-auth/resources-duplicates-in-resources-list.gmi new file mode 100644 index 0000000..379c1eb --- /dev/null +++ b/issues/gn-auth/resources-duplicates-in-resources-list.gmi @@ -0,0 +1,29 @@ +# Resources: Duplicates in Resources List + +## Tags + +* type: bug +* status: closed +* priority: medium +* assigned: fredm, zachs, zsloan +* keywords: gn-auth, auth, authorisation, resources + +## Reproduce + +* Go to https://genenetwork.org/ +* Sign in to the system +* Click on "Profile" at the top to go to your profile page +* Click on "Resources" on your profile page to see the resources you have access to + +## Expected + +Each resource appears on the list only one time + +## Actual + +Some resources appear more than once on the list + + +## Fix + +=> https://git.genenetwork.org/gn-auth/commit/?id=00f863b3dcb76f5fdca8e139e903e2f7edb861fc diff --git a/issues/send-out-confirmation-emails-on-registration.gmi 
b/issues/gn-auth/send-out-confirmation-emails-on-registration.gmi index c85e26b..e32c7c0 100644 --- a/issues/send-out-confirmation-emails-on-registration.gmi +++ b/issues/gn-auth/send-out-confirmation-emails-on-registration.gmi @@ -2,11 +2,11 @@ ## Tags -* status: open +* status: closed, completed * assigned: fredm * priority: medium -* keywords: email, user registration * type: feature request, feature-request +* keywords: gn-auth, email, user registration, email confirmation ## Description diff --git a/issues/gn-auth/test1-deployment-cant-find-templates.gmi b/issues/gn-auth/test1-deployment-cant-find-templates.gmi index bd2f57e..ca3bfad 100644 --- a/issues/gn-auth/test1-deployment-cant-find-templates.gmi +++ b/issues/gn-auth/test1-deployment-cant-find-templates.gmi @@ -4,7 +4,7 @@ * assigned: fredm, aruni * priority: critical -* status: open +* status: closed, completed, fixed * type: bug * keywords: gn-auth, deployment, test1 diff --git a/issues/gn-guile/Configurations.gmi b/issues/gn-guile/Configurations.gmi new file mode 100644 index 0000000..f1ae06e --- /dev/null +++ b/issues/gn-guile/Configurations.gmi @@ -0,0 +1,60 @@ +# gn-guile Configurations + +## Tags + +* type: bug +* assigned: +* priority: high +* status: open +* keywords: gn-guile, markdown editing +* interested: alexk, bonfacem, fredm, pjotrp + +## Description + +=> https://git.genenetwork.org/gn-guile/ The gn-guile service +is used to enable markdown editing in GeneNetwork. + +There are configurations that are needed to get the system to work as expected: + +* CURRENT_REPO_PATH: The local path to the cloned repository +* CGIT_REPO_PATH: path to the bare repo (according to docs [gn-guile-docs]) + +With these settings, we should be able to make edits. These edits, however, do not get pushed upstream.
+ +Looking at the code +=> https://git.genenetwork.org/gn-guile/tree/web/webserver.scm?id=4623225b0adb0846a4c2e879a33b31884d2e5f05#n212 +we see both the settings above being used, and we can further have a look at +=> https://git.genenetwork.org/gn-guile/tree/web/view/markdown.scm?id=4623225b0adb0846a4c2e879a33b31884d2e5f05#n78 the definition of git-invoke. + +With the above, we could, hypothetically, do a command like: + +``` +git -C ${CURRENT_REPO_PATH} push ${REMOTE_REPO_URI} master +``` + +where REMOTE_REPO_URI can be something like "appuser@git.genenetwork.org:/home/git/public/gn-guile" + +That means we change the (git-invoke …) call seen previously to something like: + +``` +(git-invoke +current-repo-path+ "push" +remote-repo-url+ "master") +``` + +and make sure that the "+remote-repo-url+" value is something along the URI above. + +### Gotchas + +We need to fetch and rebase with every push, to avoid conflicts. That means we'll need a sequence such as the following: + +``` +(git-invoke +current-repo-path+ "fetch" +remote-repo-url+ "master") +(git-invoke +current-repo-path+ "rebase" "origin/master") +(git-invoke +current-repo-path+ "push" +remote-repo-url+ "master") +``` + +The tests above work with a normal user. We'll be running this code within a container, so we do need to expose a specific private ssh key for the user to use to push to remote. This also means that the corresponding public key should be registered with the repository server. 
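The fetch/rebase/push sequence above can be mirrored in a small Python sketch (illustrative only, not gn-guile's implementation; the function names here just echo the Scheme ones). One detail worth noting: rebasing onto FETCH_HEAD sidesteps the assumption in the earlier snippet that the remote URL is also the configured "origin" remote.

```python
import subprocess

def git_invoke(repo_path: str, *args: str) -> None:
    """Run one git subcommand against repo_path, raising on failure,
    roughly what gn-guile's (git-invoke ...) does."""
    subprocess.run(["git", "-C", repo_path, *args],
                   check=True, capture_output=True, text=True)

def push_edits(repo_path: str, remote_url: str, branch: str = "master") -> None:
    # Fetch and rebase first, so pushing from a stale clone does not conflict.
    git_invoke(repo_path, "fetch", remote_url, branch)
    # FETCH_HEAD names the branch tip we just fetched from remote_url.
    git_invoke(repo_path, "rebase", "FETCH_HEAD")
    git_invoke(repo_path, "push", remote_url, branch)
```

Inside the container, remote_url would be the ssh URI mentioned above, with git picking up the exposed private key via the usual ssh configuration.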
+ +## References + +* [gn-guile-docs] https://git.genenetwork.org/gn-guile/tree/doc/git-markdown-editor.md?id=4623225b0adb0846a4c2e879a33b31884d2e5f05 + diff --git a/issues/gn-guile/rendering-images-within-markdown-documents.gmi b/issues/gn-guile/rendering-images-within-markdown-documents.gmi new file mode 100644 index 0000000..fe3ed39 --- /dev/null +++ b/issues/gn-guile/rendering-images-within-markdown-documents.gmi @@ -0,0 +1,22 @@ +# Rendering Images Linked in Markdown Documents + +## Tags + +* status: open +* priority: high +* type: bug +* assigned: alexm, bonfacem, fredm +* keywords: gn-guile, images, markdown + +## Description + +Rendering images linked within markdown documents does not work as expected — we cannot render images if they have a relative path. +As an example, see the commit below: +=> https://github.com/genenetwork/gn-docs/commit/783e7d20368e370fb497974f843f985b51606d00 + +In that commit, we are forced to use the full GitHub URI to get the images to load correctly when rendered via gn-guile. This has two unfortunate consequences: + +* It makes editing more difficult, since the user has to remember to find and use the full GitHub URL for their images. +* It ties the data and code to GitHub. + +This needs to be fixed, such that any and all paths relative to the markdown file are resolved at render time automatically.
diff --git a/issues/gn-guile/rework-hard-dependence-on-github.gmi b/issues/gn-guile/rework-hard-dependence-on-github.gmi new file mode 100644 index 0000000..751e9fe --- /dev/null +++ b/issues/gn-guile/rework-hard-dependence-on-github.gmi @@ -0,0 +1,21 @@ +# Rework Hard Dependence on Github + +## Tags + +* status: open +* priority: medium +* type: bug +* assigned: alexm +* assigned: bonfacem +* assigned: fredm +* keywords: gn-guile, github + +## Description + +Currently, we have a hard-dependence on Github for our source repository — you can see this in lines 31, 41, 55 and 59 of the code linked below: + +=> https://git.genenetwork.org/gn-guile/tree/web/view/markdown.scm?id=0ebf6926db0c69e4c444a6f95907e0971ae9bf40 + +The most likely reason is that the "edit online" functionality might not exist in a lot of other popular source forges. + +This is rendered moot, however, since we do provide a means to edit the data on Genenetwork itself. We might as well get rid of this option, and only allow the "edit online" feature on Genenetwork and stop relying on its presence in the forges we use. 
diff --git a/issues/gn-uploader/AuthorisationError-gn-uploader.gmi b/issues/gn-uploader/AuthorisationError-gn-uploader.gmi new file mode 100644 index 0000000..50a236d --- /dev/null +++ b/issues/gn-uploader/AuthorisationError-gn-uploader.gmi @@ -0,0 +1,66 @@ +# AuthorisationError in gn uploader + +## Tags +* assigned: fredm +* status: open +* priority: critical +* type: error +* key words: authorisation, permission + +## Description + +Trying to create population for Kilifish dataset in the gn-uploader webpage, +then encountered the following error: +```sh +Traceback (most recent call last): + File "/gnu/store/wxb6rqf7125sb6xqd4kng44zf9yzsm5p-profile/lib/python3.10/site-packages/flask/app.py", line 917, in full_dispatch_request + rv = self.dispatch_request() + File "/gnu/store/wxb6rqf7125sb6xqd4kng44zf9yzsm5p-profile/lib/python3.10/site-packages/flask/app.py", line 902, in dispatch_request + return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return] + File "/gnu/store/wxb6rqf7125sb6xqd4kng44zf9yzsm5p-profile/lib/python3.10/site-packages/uploader/authorisation.py", line 23, in __is_session_valid__ + return session.user_token().either( + File "/gnu/store/wxb6rqf7125sb6xqd4kng44zf9yzsm5p-profile/lib/python3.10/site-packages/pymonad/either.py", line 89, in either + return right_function(self.value) + File "/gnu/store/wxb6rqf7125sb6xqd4kng44zf9yzsm5p-profile/lib/python3.10/site-packages/uploader/authorisation.py", line 25, in <lambda> + lambda token: function(*args, **kwargs)) + File "/gnu/store/wxb6rqf7125sb6xqd4kng44zf9yzsm5p-profile/lib/python3.10/site-packages/uploader/population/views.py", line 185, in create_population + ).either( + File "/gnu/store/wxb6rqf7125sb6xqd4kng44zf9yzsm5p-profile/lib/python3.10/site-packages/pymonad/either.py", line 91, in either + return left_function(self.monoid[0]) + File "/gnu/store/wxb6rqf7125sb6xqd4kng44zf9yzsm5p-profile/lib/python3.10/site-packages/uploader/monadic_requests.py", line 99, 
in __fail__ + raise Exception(_data) +Exception: {'error': 'AuthorisationError', 'error-trace': 'Traceback (most recent call last): + File "/gnu/store/38iayxz7dgm86f2x76kfaa6gwicnnjg4-profile/lib/python3.10/site-packages/flask/app.py", line 917, in full_dispatch_request + rv = self.dispatch_request() + File "/gnu/store/38iayxz7dgm86f2x76kfaa6gwicnnjg4-profile/lib/python3.10/site-packages/flask/app.py", line 902, in dispatch_request + return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return] + File "/gnu/store/38iayxz7dgm86f2x76kfaa6gwicnnjg4-profile/lib/python3.10/site-packages/authlib/integrations/flask_oauth2/resource_protector.py", line 110, in decorated + return f(*args, **kwargs) + File "/gnu/store/38iayxz7dgm86f2x76kfaa6gwicnnjg4-profile/lib/python3.10/site-packages/gn_auth/auth/authorisation/resources/inbredset/views.py", line 95, in create_population_resource + ).then( + File "/gnu/store/38iayxz7dgm86f2x76kfaa6gwicnnjg4-profile/lib/python3.10/site-packages/pymonad/monad.py", line 152, in then + result = self.map(function) + File "/gnu/store/38iayxz7dgm86f2x76kfaa6gwicnnjg4-profile/lib/python3.10/site-packages/pymonad/either.py", line 106, in map + return self.__class__(function(self.value), (None, True)) + File "/gnu/store/38iayxz7dgm86f2x76kfaa6gwicnnjg4-profile/lib/python3.10/site-packages/gn_auth/auth/authorisation/resources/inbredset/views.py", line 98, in <lambda> + "resource": create_resource( + File "/gnu/store/38iayxz7dgm86f2x76kfaa6gwicnnjg4-profile/lib/python3.10/site-packages/gn_auth/auth/authorisation/resources/inbredset/models.py", line 25, in create_resource + return _create_resource(cursor, + File "/gnu/store/38iayxz7dgm86f2x76kfaa6gwicnnjg4-profile/lib/python3.10/site-packages/gn_auth/auth/authorisation/checks.py", line 56, in __authoriser__ + raise AuthorisationError(error_description) +gn_auth.auth.errors.AuthorisationError: Insufficient privileges to create a resource +', 'error_description': 
'Insufficient privileges to create a resource'} + +``` +The error above resulted from an attempt to upload the following information in the gn-uploader `create population` section. +Input details are as follows: +Full Name: Kilifish F2 Intercross Lines +Name: KF2_Lines +Population code: KF2 +Description: Kilifish second generation population +Family: Crosses, AIL, HS +Mapping Methods: GEMMA, QTLReaper, R/qtl +Genetic type: intercross + +Pressing the `Create Population` button then led to the error above. + diff --git a/issues/gn-uploader/check-genotypes-in-database-too.gmi b/issues/gn-uploader/check-genotypes-in-database-too.gmi new file mode 100644 index 0000000..4e034b7 --- /dev/null +++ b/issues/gn-uploader/check-genotypes-in-database-too.gmi @@ -0,0 +1,22 @@ +# Check Genotypes in the Database for R/qtl2 Uploads + +## Tags + +* type: bug +* assigned: fredm +* priority: high +* status: closed, completed, fixed +* keywords: gn-uploader, uploader, upload, genotypes, geno + +## Description + +Currently, the uploader expects that an R/qtl2 bundle be self-contained, i.e. it contains all the genotypes and other data that fully describe the data in that bundle. + +This is unnecessary in a lot of situations, seeing as GeneNetwork might already have the appropriate genotypes in its database. + +This issue tracks the implementation of the check of the genotypes against both the genotypes provided in the bundle, and those already in the database.
+ +### Updates + +Fixed in +=> https://git.genenetwork.org/gn-uploader/commit/?id=0e74a1589db9f367cdbc3dce232b1b6168e3aca1 this commit diff --git a/issues/gn-uploader/gn-uploader-container-running-wrong-gn2.gmi b/issues/gn-uploader/gn-uploader-container-running-wrong-gn2.gmi index d2c33e8..5a5cdfa 100644 --- a/issues/gn-uploader/gn-uploader-container-running-wrong-gn2.gmi +++ b/issues/gn-uploader/gn-uploader-container-running-wrong-gn2.gmi @@ -3,7 +3,7 @@ ## Tags * assigned: fredm, aruni -* status: open +* status: closed, completed * priority: high * type: bug * keywords: guix, gn-uploader diff --git a/issues/gn-uploader/link-authentication-authorisation.gmi b/issues/gn-uploader/link-authentication-authorisation.gmi new file mode 100644 index 0000000..90b8e5e --- /dev/null +++ b/issues/gn-uploader/link-authentication-authorisation.gmi @@ -0,0 +1,15 @@ +# Link Authentication/Authorisation + +## Tags + +* status: open +* assigned: fredm +* priority: critical +* type: feature request, feature-request +* keywords: gn-uploader, gn-auth, authorisation, authentication, uploader, upload + +## Description + +The last link in the chain for uploads is authentication/authorisation. Once the user uploads their data, they need access to it. The auth system, by default, will deny everyone access to any data that is not linked to a resource and for which no user has roles allowing access. + +We currently assign such data to the user manually, but that is not a sustainable way of working, especially as the uploader is exposed to more and more users.
diff --git a/issues/quality-control/move-uploader-to-tux02.gmi b/issues/gn-uploader/move-uploader-to-tux02.gmi index 4459433..20c5b24 100644 --- a/issues/quality-control/move-uploader-to-tux02.gmi +++ b/issues/gn-uploader/move-uploader-to-tux02.gmi @@ -5,7 +5,7 @@ * type: migration * assigned: fredm * priority: high -* status: open +* status: closed, completed, fixed * keywords: gn-uploader, guix, container, deploy ## Databases @@ -17,13 +17,13 @@ This implies separate configurations, and separate startup. Some of the things to do to enable this, then, are: -- [x] Provide separate configs and run db server on separate port +* [x] Provide separate configs and run db server on separate port - Configs put in /etc/mysql3307 - Selected port 3307 - datadir in /var/lib/mysql3307 -> /export5 -- [x] Provide separate data directory for the content +* [x] Provide separate data directory for the content - extract backup -- [x] Maybe suffix the files with the port number, e.g. +* [x] Maybe suffix the files with the port number, e.g. ``` datadir = /var/lib/mysql3307 socket = /var/run/mysqld/mysqld3307.sock diff --git a/issues/gn-uploader/provide-page-for-uploaded-data.gmi b/issues/gn-uploader/provide-page-for-uploaded-data.gmi new file mode 100644 index 0000000..60b154b --- /dev/null +++ b/issues/gn-uploader/provide-page-for-uploaded-data.gmi @@ -0,0 +1,22 @@ +# Provide Page/Link for/to Uploaded Data + +## Tags + +* status: open +* assigned: fredm +* priority: medium +* type: feature, feature request, feature-request +* keywords: gn-uploader, uploader, data dashboard + +## Description + +Once a user has uploaded their data, provide them with a landing page/dashboard for the data they have uploaded, with details on what that data is. + +* Should we provide a means to edit the data here (mostly to add metadata and the like)? +* Maybe the page should actually be shown on GN2? 
+ +## Blockers + +Depends on + +=> /issues/gn-uploader/link-authentication-authorisation diff --git a/issues/gn-uploader/replace-redis-with-sqlite3.gmi b/issues/gn-uploader/replace-redis-with-sqlite3.gmi new file mode 100644 index 0000000..3e5020a --- /dev/null +++ b/issues/gn-uploader/replace-redis-with-sqlite3.gmi @@ -0,0 +1,17 @@ +# Replace Redis with SQL + +## Tags + +* status: open +* priority: low +* assigned: fredm +* type: feature, feature-request, feature request +* keywords: gn-uploader, uploader, redis, sqlite, sqlite3 + +## Description + +We currently (as of 2024-06-27) use Redis for tracking any asynchronous jobs (e.g. QC on uploaded files). + +A lot of what we use redis for, we can do in one of the many SQL databases (we'll probably use SQLite3 anyway), which are more standardised, and easier to migrate data from and to. It has the added advantage that we can open multiple connections to the database, enabling the different processes to update the status and metadata of the same job consistently. + +Changes done here can then be migrated to the other systems, i.e. GN2, GN3, and gn-auth, as necessary. diff --git a/issues/gn-uploader/resume-upload.gmi b/issues/gn-uploader/resume-upload.gmi new file mode 100644 index 0000000..0f9ba30 --- /dev/null +++ b/issues/gn-uploader/resume-upload.gmi @@ -0,0 +1,41 @@ +# gn-uploader: Resume Upload + +## Tags + +* status: closed, completed, fixed +* priority: medium +* assigned: fredm, flisso +* type: feature request, feature-request +* keywords: gn-uploader, uploader, upload, resume upload + +## Description + +If a user is uploading a particularly large file, we might need to provide a way for the user to resume their upload of the file. + +Maybe this can wait until we have +=> /issues/gn-uploader/link-authentication-authorisation linked authentication/authorisation to gn-uploader. +In this way, each upload can be linked to a specific user. 
+ +### TODOs + +* [x] Build UI to allow uploads +* [x] Build back-end to handle uploads +* [x] Handle upload failures/errors +* [x] Deploy to staging + +### Updates + +=> https://git.genenetwork.org/gn-uploader/commit/?id=9a8dddab072748a70d43416ac8e6db69ad6fb0cb +=> https://git.genenetwork.org/gn-uploader/commit/?id=df9da3d5b5e4382976ede1b54eb1aeb04c4c45e5 +=> https://git.genenetwork.org/gn-uploader/commit/?id=47c2ea64682064d7cb609e5459d7bd2e49efa17e +=> https://git.genenetwork.org/gn-uploader/commit/?id=a68fe177ae41f2e58a64b3f8dcf3f825d004eeca + +### Possible Resources + +=> https://javascript.info/resume-upload +=> https://github.com/23/resumable.js/ +=> https://www.dropzone.dev/ +=> https://stackoverflow.com/questions/69339582/what-hash-python-3-hashlib-yields-a-portable-hash-of-file-contents + + +This is mostly fixed. Any arising bugs can be tracked in separate issues. diff --git a/issues/gn-uploader/samplelist-details.gmi b/issues/gn-uploader/samplelist-details.gmi new file mode 100644 index 0000000..2e64d8a --- /dev/null +++ b/issues/gn-uploader/samplelist-details.gmi @@ -0,0 +1,17 @@ +# Explanation of how Sample Lists are handled in GN2 (and may be handled moving forward) + +## Tags + +* status: open +* assigned: fredm, zsloan +* priority: medium +* type: documentation +* keywords: strains, gn-uploader + +## Description + +Regarding the order of samples/strains, it can basically be whatever we decide it is. It just needs to stay consistent (like if there are multiple genotype files). It only really affects how the strains are displayed, and any other genotype files we use for mapping need to share the same order. + +I think this is the case regardless of whether it's strains or individuals (and both the code and files make no distinction).
Sometimes it just logically makes sense to sort them in a particular way for display purposes (like BXD1, BXD2, etc), but technically everything would still work the same if you swapped those columns across all genotype files. Users would be confused about why BXD2 is before BXD1, but everything would still work and all calculations would give the same results. + +zsloan's proposal for handling sample lists in the future is to just store them in a JSON file in the genotype_files/genotype directory. diff --git a/issues/gn-uploader/speed-up-rqtl2-qc.gmi b/issues/gn-uploader/speed-up-rqtl2-qc.gmi new file mode 100644 index 0000000..43e6d49 --- /dev/null +++ b/issues/gn-uploader/speed-up-rqtl2-qc.gmi @@ -0,0 +1,30 @@ +# Speed Up QC on R/qtl2 Bundles + +## Tags + +## Description + +The default format for the CSV files in an R/qtl2 bundle is: + +``` +matrix of individuals × (markers/phenotypes/covariates/phenotype covariates/etc.) +``` + +Files in the R/qtl2 bundle could, however, +=> https://kbroman.org/qtl2/assets/vignettes/input_files.html#csv-files be transposed, +which means the system needs to "un-transpose" the file(s) before processing. + +Currently, the system does this by reading all the files of a particular type, and then "un-transposing" the entire thing. This leads to a very slow system. + +This issue proposes to do the quality control/assurance processing on each file in isolation, where possible - this will allow parallelisation/multiprocessing of the QC checks. + +The main considerations that need to be handled are as follows: + +* Do QC on (founder) genotype files (when present) before any of the other files +* Genetic and physical maps (if present) can have QC run on them after the genotype files +* Do QC on phenotype files (when present) after genotype files but before any other files +* Covariate and phenotype covariate files come after the phenotype files +* Cross information files … ? +* Sex information files … ?
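For a single file, the "un-transposing" described above is a plain matrix transpose of the parsed CSV rows; a per-file helper along these lines (a sketch, assuming the file fits in memory; a hypothetical function, not the uploader's actual code) is what would make it possible to QC each file in isolation:

```python
import csv
import io

def untranspose_csv(text: str) -> str:
    """Transpose CSV text, e.g. turning a transposed geno file
    (markers x individuals) back into individuals x markers."""
    rows = list(csv.reader(io.StringIO(text)))
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(zip(*rows))
    return out.getvalue()
```

Since each file is then handled independently, such helpers can run in a process pool, one worker per file.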
+ +We should probably detail the type of QC checks done for each type of file. diff --git a/issues/gn-uploader/uploading-samples.gmi b/issues/gn-uploader/uploading-samples.gmi new file mode 100644 index 0000000..11842b9 --- /dev/null +++ b/issues/gn-uploader/uploading-samples.gmi @@ -0,0 +1,51 @@ +# Uploading Samples + +## Tags + +* status: open +* assigned: fredm +* interested: acenteno, zachs, flisso +* priority: high +* type: feature-request +* keywords: gn-uploader, uploader, samples, strains + +## Description + +This will track the various notes regarding the upload of samples onto GeneNetwork. + +### Sample Lists + +From the email thread(s) with @zachs, @flisso and @acenteno: + +``` +When there's a new set of individuals, it generally needs to be added as a new group. In the absence of genotype data, a "dummy" .geno file currently needs to be generated* in order to define the sample list (if you look at the list of .geno files in genotype_files/genotype you'll find some really small files that just have either a single marker or a bunch of fake markers calls "Marker1, Marker2, etc" - these are solely just used to get the samplelist from the columns). So in theory such a file could be generated as a part of the upload process in the absence of genotypes +``` + +We note, however, that as @zachs mentions: + +``` +This is really goofy and should probably change. I've brought up the idea of just replacing these with JSON files containing group metadata (including samplelist), but we've never actually gone through with making any change to this. I already did something sorta similar to this with the existing JSON files (in genotype_files/genotype), but those are currently only used in situations where there are either multiple genotype files, or a genotype file only contains a subset of samples/strains from a group (so the JSON file tells mapping to only use those samples/strains).
+``` + +We need to explore whether such a change might need updates to the GN2/GN3 code to ensure code that depends on these dummy files can also use the new format JSON files too. + +Regarding the order of the samples, from the email thread: + +``` +Regarding the order of samples, it can basically be whatever we decide it is. It just needs to stay consistent (like if there are multiple genotype files). It only really affects how it's displayed, and any other genotype files we use for mapping needs to share the same order. +``` + +The ordering of the samples has no bearing on the analysis of the data, i.e. it does not affect the results of computations. + + +### Curation + +``` +But any time new samples are involved, there probably needs to be some explicit confirmation by a curator like Rob (since we want to avoid a situation where a sample/strain just has a typo or somethin and we treat it like a new sample/strain). +``` + +also + +``` +When there's a mix of existing individuals, I think it's usually the case that it's the same group (that is being expanded with new individuals), but anything that involves adding new samples should probably involve some sort of direct/explicit confirmation from a curator like Rob or something. +``` diff --git a/issues/gnqa/GNQA-for-evaluation.gmi b/issues/gnqa/GNQA-for-evaluation.gmi index 9f4a861..0b2e352 100644 --- a/issues/gnqa/GNQA-for-evaluation.gmi +++ b/issues/gnqa/GNQA-for-evaluation.gmi @@ -5,7 +5,7 @@ * Assigned: alexm, shelbys * Keywords: UI, GNQA, evaluation * Type: immediate -* Status: In Progress +* Status: completed ## Description @@ -13,5 +13,5 @@ We need to publish a paper on GeneNetwork Question & Answering system. 
To that e ## Tasks -* [ ] Add a thumbs up and down for rating the answer to a question -* [ ] Ensure to log the questions, respones, and ratings of each questions +* [X] Add a thumbs up and down for rating the answer to a question +* [X] Ensure to log the questions, responses, and ratings of each question diff --git a/issues/gnqna/rating-system-has-no-indication-for-login-requirement.gmi b/issues/gnqa/Login_no-indicator-for-req.gmi index 7ed713a..7ed713a 100644 --- a/issues/gnqna/rating-system-has-no-indication-for-login-requirement.gmi +++ b/issues/gnqa/Login_no-indicator-for-req.gmi diff --git a/issues/fetch-pubmed-references-to-gnqa.gmi b/issues/gnqa/fetch-pubmed-references-to-gnqa.gmi index 63351d1..43c45cf 100644 --- a/issues/fetch-pubmed-references-to-gnqa.gmi +++ b/issues/gnqa/fetch-pubmed-references-to-gnqa.gmi @@ -5,7 +5,7 @@ * assigned: alexm * keywords: llm, pubmed, api, references * type: enhancements -* status: in progress +* status: completed, closed ## Description @@ -18,13 +18,13 @@ The task is to integrate PubMed references into the GNQA system by querying the * [x] Query the API with the publication titles. -* [] Display the PubMed information as reference information on the GN2 user interface. +* [x] Display the PubMed information as reference information on the GN2 user interface. -* [] dump the results to a DB e.g sqlite,lmdb +* [x] Dump the results to a DB, e.g. sqlite, lmdb * [x] If references are not found, perform a lossy search or list the closest three papers.
-* [] reimplement the reference ui to render the references as modal objects +* [x] Reimplement the reference UI to render the references as modal objects For lossy search, see: diff --git a/issues/gn_llm_db_cache_integration.gmi b/issues/gnqa/gn_llm_db_cache_integration.gmi index 86f7c80..86f7c80 100644 --- a/issues/gn_llm_db_cache_integration.gmi +++ b/issues/gnqa/gn_llm_db_cache_integration.gmi diff --git a/issues/gnqa/gn_llm_integration_using_cached_searches.gmi b/issues/gnqa/gn_llm_integration_using_cached_searches.gmi new file mode 100644 index 0000000..e20b5a3 --- /dev/null +++ b/issues/gnqa/gn_llm_integration_using_cached_searches.gmi @@ -0,0 +1,43 @@ +# GN2 Integration with LLM search using cached results + +## Tags + +* assigned: jnduli, alexm, bmunyoki +* keywords: llm, genenetwork2 +* type: enhancement +* status: open + +## Description + +We'd like to integrate LLM searches into our GN searches: when someone attempts a Xapian search, e.g. `wiki:rif group:mouse nicotine`, we'd do a corresponding search for `rif mouse nicotine` on LLMs, and show the results on the main page. + +Another example: + +xapian search: rif:glioma species:human group:gtex_v8 +llm search: glioma human gtex_v8 + + +This can be phased as follows: + +* [ ] 1. UI integration, where we modify the search page to include a dummy content box +* [ ] 2. LLM search integration, where we perform a search and modify the UI to show the results. This can either be async (i.e. the search results page waits for the LLM search results) or sync (i.e. we load the search results page after we've got the LLM results) +* [x] 2.1 create a copy branch for the gnqa-api branch +* [x] 2.2 create a PR containing all the branches +* [ ] 2.3 how much would it take to get the qa_*** branch merged into main? +* [ ] 3. Cache design and integration: we already have some
+ + +Let's use flag: `LLM_SEARCH_ENABLED` to enable/disable this feature during development to make sure we don't release this before it's ready. + + +## Notes + +The branch for merging to gn2: + +https://github.com/genenetwork/genenetwork2/pull/863 + +The branch for merging to gn3: + +https://github.com/genenetwork/genenetwork3/pull/188
\ No newline at end of file diff --git a/issues/gnqa/gnqa_integration_to_global_search_Design.gmi b/issues/gnqa/gnqa_integration_to_global_search_Design.gmi new file mode 100644 index 0000000..0d5afd0 --- /dev/null +++ b/issues/gnqa/gnqa_integration_to_global_search_Design.gmi @@ -0,0 +1,74 @@ +# GNQA Integration to Global Search Design Proposal + +## Tags +* assigned: jnduli, alexm +* keywords: llm, genenetwork2 +* type: feature +* status: complete, closed, done + +## Description +This document outlines the design proposal for integrating GNQA into the Global Search feature. + +## High-Level Design + +### UI Design +When the GN2 Global Search page loads: +1. A request is initiated via HTMX to the GNQA search page with the search query. +2. Based on the results, a page or subsection is rendered, displaying the query and the answer, and providing links to references. + +For more details on the UI design, refer to the pull request: +=> https://github.com/genenetwork/genenetwork2/pull/862 + +### Backend Design +The API handles requests to the Fahamu API and manages result caching. Once a request to the Fahamu API is successful, the results are cached using SQLite for future queries. Additionally, a separate API is provided to query cached results. + +## Deep Dive + +### Caching Implementation +For caching, we will use SQLite3 since it is already implemented for search history. Based on our study, this approach will require minimal space: + +*Statistical Estimation:* +We calculated that this caching solution would require approximately 79MB annually for an estimated 20 users, each querying the system 5 times a day. + +Why this average request size per user, and how did we determine it? +The average request size was an upper-bound calculation for documents returned from the Fahamu API. + +Why are we assuming 20 users making 5 requests per day?
+ +We’re assuming 20 users making 5 requests per day to estimate typical usage of GN2 services. + +### Error Handling +* Handle cases where users are not logged in, as GNQA requires authentication. +* Handle scenarios where there is no response from Fahamu. +* Handle general errors. + +### Passing Questions to Fahamu +We can choose to either pass the entire query from the user to Fahamu or parse the query to search for keywords. + +### Generating Possible Questions +It is possible to generate potential questions based on the user's search and send those to Fahamu, which would then return possible related queries. + +## Related Issues +=> https://issues.genenetwork.org/issues/gn_llm_integration_using_cached_searches + +## Tasks + +* [x] Initiate a background task from HTMX to Fahamu once the search page loads. +* [x] Query Fahamu for data. +* [x] Cache results from Fahamu. +* [x] Render the UI page with the query and answer. +* [x] For "See more," render the entire GNQA page with the query, answer, references, and PubMed data. +* [x] Implement parsing for Xapian queries to normal queries. +* [x] Implement error handling. +* [x] Reimplement how GNQA uses GN-AUTH in gn3. +* [x] Query Fahamu to generate possible questions based on certain keywords. + + +## Notes +From the latest Fahamu API docs, they have implemented a way to include subquestions by setting `amplify=True` for the POST request. We also have our own implementation for parsing text to extract questions. + +## PRs Merged Related to This + +=> https://github.com/genenetwork/genenetwork2/pull/868 +=> https://github.com/genenetwork/genenetwork2/pull/862 +=> https://github.com/genenetwork/genenetwork2/pull/867 +=> https://github.com/genenetwork/genenetwork3/pull/191
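The "Statistical Estimation" above can be cross-checked with quick arithmetic; the per-response size below is inferred from the stated totals, not a measured figure:

```python
# Back-of-envelope check of the ~79MB/year caching estimate.
users, queries_per_day, days_per_year = 20, 5, 365
queries_per_year = users * queries_per_day * days_per_year    # 36,500 queries
annual_budget_bytes = 79 * 10**6                              # ~79 MB (stated)
avg_cached_response = annual_budget_bytes / queries_per_year  # ~2.2 KB/entry
print(queries_per_year, round(avg_cached_response))
```

So the estimate implies roughly 2 KB of cached text per query, which is plausible for a short answer plus reference links.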
\ No newline at end of file diff --git a/issues/implement-auth-to-gn-llm.gmi b/issues/gnqa/implement-auth-to-gn-llm.gmi index 496a7cb..2a5456b 100644 --- a/issues/implement-auth-to-gn-llm.gmi +++ b/issues/gnqa/implement-auth-to-gn-llm.gmi @@ -6,7 +6,7 @@ * keywords: llm, auth * type: feature * priority: high -* status: done, completed +* status: done, completed, closed ## Description diff --git a/issues/gnqa/implement-no-login-requirement-for-gnqa.gmi b/issues/gnqa/implement-no-login-requirement-for-gnqa.gmi new file mode 100644 index 0000000..9dcef53 --- /dev/null +++ b/issues/gnqa/implement-no-login-requirement-for-gnqa.gmi @@ -0,0 +1,20 @@ +# Implement No-Login Requirement for GNQA + +## Tags + +* type: feature +* status: progress +* priority: medium +* assigned: alexm, +* keywords: gnqa, user experience, authentication, login, llm + +## Description +This feature will allow usage of LLM/GNQA features without requiring user authentication, while implementing measures to filter out bots + + +## Tasks + +* [x] If logged in: perform AI search with zero penalty +* [ ] Add caching lifetime to save on token usage +* [ ] Routes: check for referrer headers — if the previous search was not from the homepage, perform AI search +* [ ] If global search returns more than *n* results (*n = number*), perform an AI search diff --git a/issues/implement-reference-rating-gn-llm.gmi b/issues/gnqa/implement-reference-rating-gn-llm.gmi index f646a6f..f646a6f 100644 --- a/issues/implement-reference-rating-gn-llm.gmi +++ b/issues/gnqa/implement-reference-rating-gn-llm.gmi diff --git a/issues/integrate_gn_llm_search.gmi b/issues/gnqa/integrate_gn_llm_search.gmi index 5dfd9da..5dfd9da 100644 --- a/issues/integrate_gn_llm_search.gmi +++ b/issues/gnqa/integrate_gn_llm_search.gmi diff --git a/issues/merge-gnqa-to-production.gmi b/issues/gnqa/merge-gnqa-to-production.gmi index 3d34bb1..3d34bb1 100644 --- a/issues/merge-gnqa-to-production.gmi +++ b/issues/gnqa/merge-gnqa-to-production.gmi diff 
--git a/issues/refactor-gn-llm-code.gmi b/issues/gnqa/refactor-gn-llm-code.gmi index 6e33737..64c43c4 100644 --- a/issues/refactor-gn-llm-code.gmi +++ b/issues/gnqa/refactor-gn-llm-code.gmi @@ -5,7 +5,7 @@ * assigned:alexm,shelby * keywords:refactoring,llm,tests * type: enhancements -* status: in progress +* status: completed, closed ## Description diff --git a/issues/implement_xapian_to_text_transformer.gmi b/issues/implement_xapian_to_text_transformer.gmi new file mode 100644 index 0000000..a3c3dc8 --- /dev/null +++ b/issues/implement_xapian_to_text_transformer.gmi @@ -0,0 +1,15 @@ +# Xapian to Text Transformer + +## Tags +* assigned: alexm, jnduli +* keywords: llm, genenetwork2, xapian, transform +* type: feature +* status: in-progress + +## Description + +Given a Xapian search query, e.g., "CYTOCHROME AND P450" or "CYTOCHROME NEAR P450", we need to convert the text to a format with no Xapian keywords. In this case, the transformed text would be "CYTOCHROME P450". + + +This issue is part of the main issue below. +=> https://issues.genenetwork.org/issues/gn_llm_integration_using_cached_searches diff --git a/issues/inspect-discrepancies-between-xapian-and-sql-search.gmi b/issues/inspect-discrepancies-between-xapian-and-sql-search.gmi new file mode 100644 index 0000000..98b46b6 --- /dev/null +++ b/issues/inspect-discrepancies-between-xapian-and-sql-search.gmi @@ -0,0 +1,135 @@ +# Inspect Discrepancies Between Xapian and SQL Search. + +* assigned: bonfacem, rookie101 + +## Description + +When doing a Xapian search, we miss some data that is available from the SQL search.
The searches we tested: + +=> https://cd.genenetwork.org/search?species=mouse&group=BXD&type=Hippocampus+mRNA&dataset=HC_M2_0606_P&search_terms_or=WIKI%3Dglioma&search_terms_and=&accession_id=None&FormID=searchResulto SQL search for dataset=HC_M2_0606_P species=mouse group=BXD WIKI=glioma (31 results) + +=> https://cd.genenetwork.org/gsearch?type=gene&terms=species%3Amouse+group%3Abxd+dataset%3Ahc_m2_0606_p+wiki%3Aglioma species:mouse group:bxd dataset:hc_m2_0606_p wiki:glioma (26 results) + +We miss the following entries from the Xapian search: + +``` +15 1423803_s_at Gltscr2 glioma tumor suppressor candidate region gene 2 +16 1451121_a_at Gltscr2 glioma tumor suppressor candidate region 2; exons 8 and 9 +17 1452409_at Gltscr2 glioma tumor suppressor candidate region gene 2 +25 1416556_at Sas sarcoma amplified sequence +26 1430029_a_at Sas sarcoma amplified sequence +``` + +We want to figure out why there is a discrepancy between the 2 searches above. + +## Resolution + +Use "quest" to search for one of the symbols that don't appear in the Xapian search to get the exact document id: + +``` +quest --msize=2 -s en --boolean-prefix="iden:Qgene:" "iden:"1423803_s_at:hc_m2_0606_p"" \ +--db=/export/data/genenetwork-xapian/ + +Parsed Query: Query(0 * Qgene:1423803_s_at:hc_m2_0606_p) +Exactly 1 matches +MSet: +9665867: [0] +{ + "name": "1423803_s_at", + "symbol": "Gltscr2", + "description": "glioma tumor suppressor candidate region gene 2", + "chr": "1", + "mb": 4.687986, + "dataset": "HC_M2_0606_P", + "dataset_fullname": "Hippocampus Consortium M430v2 (Jun06) PDNN", + "species": "mouse", + "group": "BXD", + "tissue": "Hippocampus mRNA", + "mean": 11.749030303030299, + "lrs": 11.3847971289981, + "additive": -0.0650828877005346, + "geno_chr": "5", + "geno_mb": 137.010795 +} +``` + +From the retrieved document-id, use "xapian-delve" to inspect the terms inside the index: + +``` +xapian-delve -r 9665867 -d /export/data/genenetwork-xapian/ + +Data for record #9665867: +{ + 
"name": "1423803_s_at", + "symbol": "Gltscr2", + "description": "glioma tumor suppressor candidate region gene 2", + "chr": "1", + "mb": 4.687986, + "dataset": "HC_M2_0606_P", + "dataset_fullname": "Hippocampus Consortium M430v2 (Jun06) PDNN", + "species": "mouse", + "group": "BXD", + "tissue": "Hippocampus mRNA", + "mean": 11.749030303030299, + "lrs": 11.3847971289981, + "additive": -0.0650828877005346, + "geno_chr": "5", + "geno_mb": 137.010795 +} +Term List for record #9665867: 1423803_s_at 2 5330430h08rik +9430097c02rik Qgene:1423803_s_at:hc_m2_0606_p +XC1 XDShc_m2_0606_p XGbxd XIhippocampus XImrna XPC5 +XSmouse XTgene XYgltscr2 ZXDShc_m2_0606_p ZXGbxd +ZXIhippocampus ZXImrna ZXSmous ZXYgltscr2 Zbc017637 +Zbxd Zcandid Zgene Zglioma Zgltscr2 Zhc_m2_0606_p +Zhippocampus Zmous Zmrna Zregion Zsuppressor Ztumor +bc017637 bxd candidate gene glioma gltscr2 +hc_m2_0606_p hippocampus mouse mrna +region suppressor tumor +``` + +We have no wiki (XWK) entries from the above. When transforming to TTL files from SQL, we have symbols that exist in the GeneRIF table that do not exist in the GeneRIF_BASIC table: + +``` +SELECT COUNT(symbol) FROM GeneRIF WHERE +symbol NOT IN (SELECT symbol FROM GeneRIF_BASIC) +GROUP BY BINARY symbol; +``` + +Consequently, after transforming to TTL files, we have some missing RDF entries that map a symbol (subject) to its real name (object). When building the RDF cache, we thereby have some missing RIF/WIKI entries, and some entries are not indexed. This patch fixes the aforementioned error with missing symbols: + +=> https://git.genenetwork.org/gn-transform-databases/commit/?id=d95501bd2bd41ef8cf3584118382e83cbbbe0c87 [gn-transform-databases] Add missing RIF symbols.
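The coverage check above can be reproduced on toy data. This sketch uses Python's built-in SQLite rather than MariaDB, with a heavily simplified schema (the `GROUP BY BINARY` case-sensitive grouping is MariaDB-specific and omitted); `Gltscr2` and `Sas` stand in for the symbols that were missing from the index:

```python
import sqlite3

# Toy re-creation of the GeneRIF coverage check (schema heavily simplified).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GeneRIF (symbol TEXT);
    CREATE TABLE GeneRIF_BASIC (symbol TEXT);
    INSERT INTO GeneRIF VALUES ('Gltscr2'), ('Sas'), ('Shh');
    INSERT INTO GeneRIF_BASIC VALUES ('Shh');
""")
missing = sorted(row[0] for row in conn.execute(
    "SELECT DISTINCT symbol FROM GeneRIF "
    "WHERE symbol NOT IN (SELECT symbol FROM GeneRIF_BASIC)"))
# Symbols with RIF text but no GeneRIF_BASIC row never reach the TTL files,
# so their WIKI entries are never indexed by Xapian.
print(missing)
```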
+ +Now these 2 queries return exactly the same results: + +=> https://cd.genenetwork.org/search?species=mouse&group=BXD&type=Hippocampus+mRNA&dataset=HC_M2_0606_P&search_terms_or=WIKI%3Dglioma&search_terms_and=&accession_id=None&FormID=searchResulto SQL search for dataset=HC_M2_0606_P species=mouse group=BXD WIKI=glioma (31 results) + +=> https://cd.genenetwork.org/gsearch?type=gene&terms=species%3Amouse+group%3Abxd+dataset%3Ahc_m2_0606_p+wiki%3Aglioma species:mouse group:bxd dataset:hc_m2_0606_p wiki:glioma (31 results) + +However, the Xapian search is case-insensitive while the SQL search is case-sensitive: + +=> https://cd.genenetwork.org/gsearch?type=gene&terms=species%3Amouse+group%3Abxd+dataset%3Ahc_m2_0606_p+wiki%3Acancer species:mouse group:bxd dataset:hc_m2_0606_p wiki:cancer (72 results) + +=> https://cd.genenetwork.org/search?species=mouse&group=BXD&type=Hippocampus+mRNA&dataset=HC_M2_0606_P&search_terms_or=WIKI%3Dcancer&search_terms_and=&accession_id=None&FormID=searchResulto SQL search for dataset=HC_M2_0606_P species=mouse group=BXD WIKI=cancer (70 results) + +=> https://cd.genenetwork.org/search?species=mouse&group=BXD&type=Hippocampus+mRNA&dataset=HC_M2_0606_P&search_terms_or=WIKI%3DCancer&search_terms_and=&accession_id=None&FormID=searchResulto SQL search for dataset=HC_M2_0606_P species=mouse group=BXD WIKI=Cancer (Note the change in the case "Cancer": 13 results) + +Another reason for discrepancies between search results is that Xapian performs stemming on the search terms: + +=> https://cd.genenetwork.org/gsearch?type=gene&terms=species%3Amouse+group%3Abxd+dataset%3Ahc_m2_0606_p+wiki%3Adiabetes species:mouse group:bxd dataset:hc_m2_0606_p wiki:diabetes (59 results) + +=> https://cd.genenetwork.org/search?species=mouse&group=BXD&type=Hippocampus+mRNA&dataset=HC_M2_0606_P&search_terms_or=WIKI%3Ddiabetes&search_terms_and=&accession_id=None&FormID=searchResulto SQL search for dataset=HC_M2_0606_P species=mouse group=BXD WIKI=diabetes (52 results)
For example, in the above wiki search for "diabetes", Xapian will stem "diabetes" to "diabet", thereby matching "diabetic", "diabetes", or any other variation of the word. + +## Ordering of Results + +The ordering in the Xapian search and the SQL search is different. By default, the SQL search orders by symbol: + +``` +[...] ORDER BY ProbeSet.symbol ASC +``` + +However, Xapian orders search results by decreasing relevance score. This is configurable. + +* closed diff --git a/issues/inspect-discrepancies-between-xapian-and-sql-search2.gmi b/issues/inspect-discrepancies-between-xapian-and-sql-search2.gmi new file mode 100644 index 0000000..451d5c3 --- /dev/null +++ b/issues/inspect-discrepancies-between-xapian-and-sql-search2.gmi @@ -0,0 +1,11 @@ +# Inspect Discrepancies Between Xapian and SQL Search. + +* assigned: bonfacem, rookie101 + +## Description + +When we type BXD_21526 in the Xapian search, we should find: + +=> https://genenetwork.org/search?species=mouse&group=BXD&type=Phenotypes&dataset=BXDPublish&search_terms_or=BXD_21526&search_terms_and=&accession_id=None&FormID=searchResult + +This is not the case right now. diff --git a/issues/integrate-markdown-editor-to-gn2.gmi b/issues/integrate-markdown-editor-to-gn2.gmi index 98c170b..5904eac 100644 --- a/issues/integrate-markdown-editor-to-gn2.gmi +++ b/issues/integrate-markdown-editor-to-gn2.gmi @@ -1,3 +1,4 @@ + # GN Markdown Editor Integration ## Tags @@ -5,26 +6,168 @@ * assigned: alexm * status: in progress * priority: high +* tags: markdown, integration, guile ## Notes -This is a to-do list to integrate the GN Markdown editor into GN2. + +This is a to-do list to integrate the GN Markdown editor into GN2.
To see the implementation, see: -=> https://github.com/Alexanderlacuna/geditor +=> https://git.genenetwork.org/gn-guile/ ## Tasks -* [ ] Implement APIs to fetch file for edit -* [ ] Add verification for the repository -* [ ] Implement API to edit and commit changes -* [ ] Replace JS with HTMX -* [ ] Support external links and image rendering -* [ ] Package dependencies -* [ ] Handle errors +* [x] Implement APIs to fetch files for editing +* [x] Add verification for the repository +* [x] Implement API to edit and commit changes +* [x] Replace JS with HTMX +* [x] Support external links and image rendering +* [x] Package dependencies +* [x] Show diff for files +* [x] Handle errors * [ ] Review by users -* [ ] Integrate auth to the system. +* [x] Integrate authentication into the system + + +## API Documentation + +These API endpoints are implemented in Guile; see the repo: + +=> https://git.genenetwork.org/gn-guile/ + +The main endpoints are `/edit` and `/commit`. + +### Edit (GET) + +This is a `GET` request to retrieve file content. Make sure you pass a valid `file_path` as a query parameter (the path should be relative to the repository).
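From Python, the same `/edit` request can be built with only the standard library (a sketch; the base URL matches the curl examples in this section, and error handling is omitted):

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Sketch of a gn-guile /edit client; base URL as in the curl examples.
def edit_url(base: str, file_path: str) -> str:
    return f"{base}/edit?{urlencode({'file_path': file_path})}"

url = edit_url("http://localhost:8091", "test.md")
# data = json.load(urlopen(url))  # -> {"path": ..., "content": ..., "hash": ...}
```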
+ +**Edit Request Example:** + +```bash + +curl -G -d "file_path=test.md" localhost:8091/edit +``` + +In case of a successful response, the expected result is: + + +```json +{ +"path": "<file_path>", +"content": "Test for new user\n test 2 for line\n test 3 for new line\n ## real markdown two test\n", +"hash": "<commit_sha>" +} +``` + +In case of an error, the expected response is: + +```json +{ +"error": "<error_type>", +"msg": "<error_reason>" +} +``` + +### Commit (POST) + +**Endpoint:** + +``` +localhost:8091/commit +``` + + +```bash + +curl -X POST http://127.0.0.1:8091/commit \ +-H 'Content-Type: application/json' \ +-d '{ +"content": "make test commit", +"filename": "test.md", +"email": "test@gmail.com", +"username": "test", +"commit_message": "init commit", +"prev_commit": "7cbfc40d98b49a64e98e7cd562f373053d0325bd" +}' + +``` -Related issues: +It expects the following data in JSON format: + +* `content` (the data you want to commit to the file, *valid markdown*) +* `prev_commit` (required for integrity) +* `filename` (file path to the file you are modifying) +* `username` (identifier for the user, in our case from auth) +* `email` (identifier email from the user, in our case from auth) +* `commit_message` + +If the request succeeds, the response should be: + +```json +{ +"status": "201", +"message": "Committed file successfully", +"content": "Test for new user\n test 2 for line\n test 3 for new line\n ## real markdown two test\n", +"commit_sha": "47df3b7f13a935d50cc8b40e98ca9e513cba104c", +"commit_message": "commit by genetics" +} +``` + +If there are no changes to the file: + +```json +{ +"status": "200", +"message": "Nothing to commit, working tree clean", +"commit_sha": "ecd96f27c45301279150fbda411544687db1aa45" +} +``` + +If the request fails, the expected results are: + +```json +{ +"error": "<error_type>", +"msg": "Commits do not match. 
Please pull in the latest changes for the current commit *ecd96f27c45301279150fbda411544687db1aa45* and previous commits." +} +``` + +## Related Issues => https://issues.genenetwork.org/issues/implement-gn-markdown-editor-in-guile -=> https://issues.genenetwork.org/issues/implement-gn-markdown-editor
\ No newline at end of file +=> https://issues.genenetwork.org/issues/implement-gn-markdown-editor + +## Notes on Gn-Editor UI + +Here is the link to the PR for integrating the GN-Editor, including screenshots: + +=> https://github.com/genenetwork/genenetwork2/pull/854 + +Genenetwork2 consumes the endpoint for the GN-Editor. Authentication is required to prevent access by malicious users and bots. + +The main endpoint to fetch and edit a file is: + +``` +genenetwork.org/editor/edit?file-path=<relative file path> +``` + +This loads the editor with the content for editing. + +### Modifying Editor Settings + +You can modify editor settings, such as font size and keyboard bindings. To do this, navigate to: + +``` +genenetwork.org/editor/settings +``` + +Be sure to save your changes for them to take effect. + +### Showing Diff for Editor + +The editor also provides a diff functionality to show you the changes made to the file. Use the "Diff" button in the navigation to view these changes. + +### Committing Changes + +To commit your changes, use the "Commit" button. A commit message is required in the text area for the commit to be processed. + diff --git a/issues/mgamma/mgamma-design.gmi b/issues/mgamma/mgamma-design.gmi index 23e02d5..ed4c061 100644 --- a/issues/mgamma/mgamma-design.gmi +++ b/issues/mgamma/mgamma-design.gmi @@ -7,3 +7,31 @@ We have a lot of experience running and hacking the GEMMA tool in GeneNetwork.or GEMMA proves to give great GWA results and has a decent speed for a single threaded implementation - even though the matrix calls to openblas use multiple threads. The source code base of GEMMA, however, proves hard to build on. This is why we are creating a next generation tool that has a focus on *performance and hackability*. After several attempts using R, D, Julia, python, Ruby we have in 2023 settled on Guile+C+Zig. Guile provides a REPL and great hackabability. C+Zig we'll use for performance. 
The other languages are all great, but we think we can work faster in this setup. + +Well, it is the end of 2024 and we have ditched that effort. Who said life was easy! The guile interface proved problematic - and Zig went out of favour because of its bootstrap story, which prevents it from becoming part of Guix, Debian etc. Also, I discovered that new tensor MPUs support f64 - so we may want to support vector and matrix computations on these cores. + +To write a gemma replacement I now favour chunking up the existing gemma and making sure its components can talk with alternative implementations. We may use a propagator network approach. It is critical to keep the data in RAM, so it may need some message passing interface with memory that can be shared. The chunking into CELLs (read: propagator network, PN) is a requirement because we kept tripping over state in GEMMA. A PN should make sure we can run two implementations of the same CELL and compare outcomes for testing. It will also allow us to test AVX, tensor and (say) MKL or CUDA implementations down the line, and it should allow us to start using new functionality on GN faster. It would also be fun to have an implementation run on the RISC-V manycore. + +So, what do we want out of our languages: + +* Nice matrix interface (Julia) +* Support for AVX (Julia) +* Possibility to drop to low level C programming (Julia+prescheme+C?) +* High level -- PN -- glue (Julia+Guile?) + +Julia looks like a great candidate, even though it has notable downsides, including the big 'server' blob deployment and the garbage collector (the latter also being a strength, mind). Alternatives could be Rust and Prescheme, which have no such concerns but lack the nice matrix notation. + +The approach will be to start with Julia, reimplementing GEMMA functions so they can be called from Julia and/or guile. + +Oh, I just found out that Julia, like zig, is no longer up-to-date on Debian. And the Guix version is 2 years old. That is really bad.
If these languages don't get supported on major distros, it is a dead end! + +=> https://mastodon.social/@pjotrprins/113379842047170785 + +What now? + +* Nice matrix interface (?) +* Support for AVX (?) +* Possibility to drop to low level C programming (?+prescheme+C?) +* High level -- PN -- glue (?+Guile?) + +Current candidates for ? are Nim and Rust. Neither has a really nice matrix interface - though Nim's is probably what I prefer, and it is close to python. Chicken may work too if I get fed up with the two languages mentioned. diff --git a/issues/mgamma/mgamma-lmm.gmi b/issues/mgamma/mgamma-lmm.gmi new file mode 100644 index 0000000..61481c2 --- /dev/null +++ b/issues/mgamma/mgamma-lmm.gmi @@ -0,0 +1,17 @@ +# MGAMMA LMM + +MGamma does GWAS, which means it has to do Linear Mixed Models, both univariate and multivariate. + +# Tags + +* assigned: pjotrp, artyom +* type: feature +* priority: high + +# Tasks + +* [X] Kinship matrix computation. +* [X] Univariate LMM. +* [ ] Multivariate LMM. +* [X] Export data from GEMMA. +* [ ] Compare and ensure data match between MGamma and GEMMA.
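For reference, the univariate model that GEMMA fits (and that the univariate LMM task above reimplements) is the standard linear mixed model from the GEMMA paper (Zhou & Stephens, 2012); this is the textbook formulation, not MGamma-specific notation:

```
y = W\alpha + x\beta + u + \epsilon
u ~ N(0, \lambda \tau^{-1} K)
\epsilon ~ N(0, \tau^{-1} I_n)
```

where y is the n-vector of phenotypes, W the covariate matrix, x the genotype vector of the tested marker, K the kinship matrix (the first completed task above), and λ the ratio of genetic to residual variance. The multivariate case generalises y and β to multiple phenotypes.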
\ No newline at end of file diff --git a/issues/move-racket-gn-rest-api-to-guile.gmi b/issues/move-racket-gn-rest-api-to-guile.gmi index 185e7de..659c586 100644 --- a/issues/move-racket-gn-rest-api-to-guile.gmi +++ b/issues/move-racket-gn-rest-api-to-guile.gmi @@ -6,7 +6,7 @@ * priority: medium * type: API, metadata * keywords: API -* status: open +* status: stalled ## Description diff --git a/issues/move-search-to-xapian.gmi b/issues/move-search-to-xapian.gmi index 57612e7..d98be9b 100644 --- a/issues/move-search-to-xapian.gmi +++ b/issues/move-search-to-xapian.gmi @@ -18,3 +18,5 @@ As a work around---to make search work with Python3.10, an inefficient hack was => https://github.com/genenetwork/genenetwork2/pull/805/commits/9a6ddf9f1560b3bc1611f50bf2b94f0dc44652a2 Replace escape with conn.escape_string To get rid of this inheritance, I propose rewriting the search functionality in a more straightforward and functional manner. In doing so, we can also transition to Xapian search, a faster and more efficient search system. 
+ +* closed diff --git a/issues/old_session_bug.gmi b/issues/old_session_bug.gmi index 649ea46..925b9f6 100644 --- a/issues/old_session_bug.gmi +++ b/issues/old_session_bug.gmi @@ -2,7 +2,7 @@ ## Tags -* status: open +* status: closed * priority: medium * type: bug * assigned: zsloan, fredm diff --git a/issues/production-container-mechanical-rob-failure.gmi b/issues/production-container-mechanical-rob-failure.gmi new file mode 100644 index 0000000..ae6bae8 --- /dev/null +++ b/issues/production-container-mechanical-rob-failure.gmi @@ -0,0 +1,224 @@ +# Production Container: `mechanical-rob` Failure + +## Tags + +* status: closed, completed, fixed +* priority: high +* type: bug +* assigned: fredm +* keywords: genenetwork, production, mechanical-rob + +## Description + +After deploying to https://gn2-fred.genenetwork.org on 2025-02-19 (UTC-0600) with the following commits: + +* genenetwork2: 2a3df8cfba6b29dddbe40910c69283a1afbc8e51 +* genenetwork3: 99fd5070a84f37f91993f329f9cc8dd82a4b9339 +* gn-auth: 073395ff331042a5c686a46fa124f9cc6e10dd2f +* gn-libs: 72a95f8ffa5401649f70978e863dd3f21900a611 + +I had the (not so) bright idea to run the `mechanical-rob` tests against it before pushing it to production proper. Here's where I ran into problems: some of the `mechanical-rob` tests failed, specifically the correlation tests. + +Meanwhile, a run of the same tests against https://cd.genenetwork.org with the same commits was successful: + +=> https://ci.genenetwork.org/jobs/genenetwork2-mechanical-rob/1531 See this. + +This points to a possible problem with the setup of the production container that leads to failures where none should be. This needs investigation and fixing. + +### Update 2025-02-20 + +The MariaDB server is crashing.
To reproduce: + +* Go to https://gn2-fred.genenetwork.org/show_trait?trait_id=1435464_at&dataset=HC_M2_0606_P +* Click on "Calculate Correlations" to expand +* Click "Compute" + +Observe that after a little while, the system fails with the following errors: + +* `MySQLdb.OperationalError: (2013, 'Lost connection to MySQL server during query')` +* `MySQLdb.OperationalError: (2006, 'MySQL server has gone away')` + +I attempted updating the configuration for MariaDB, setting the `max_allowed_packet` to 16M and then 64M, but that did not resolve the problem. + +The log files indicate the following: + +``` +2025-02-20 7:46:07 0 [Note] Recovering after a crash using /var/lib/mysql/gn0-binary-log +2025-02-20 7:46:07 0 [Note] Starting crash recovery... +2025-02-20 7:46:07 0 [Note] Crash recovery finished. +2025-02-20 7:46:07 0 [Note] Server socket created on IP: '0.0.0.0'. +2025-02-20 7:46:07 0 [Warning] 'user' entry 'webqtlout@tux01' ignored in --skip-name-resolve mode. +2025-02-20 7:46:07 0 [Warning] 'db' entry 'db_webqtl webqtlout@tux01' ignored in --skip-name-resolve mode. +2025-02-20 7:46:07 0 [Note] Reading of all Master_info entries succeeded +2025-02-20 7:46:07 0 [Note] Added new Master_info '' to hash table +2025-02-20 7:46:07 0 [Note] /usr/sbin/mariadbd: ready for connections. +Version: '10.5.23-MariaDB-0+deb11u1-log' socket: '/run/mysqld/mysqld.sock' port: 3306 Debian 11 +2025-02-20 7:46:07 4 [Warning] Access denied for user 'root'@'localhost' (using password: NO) +2025-02-20 7:46:07 5 [Warning] Access denied for user 'root'@'localhost' (using password: NO) +2025-02-20 7:46:07 0 [Note] InnoDB: Buffer pool(s) load completed at 250220 7:46:07 +250220 7:50:12 [ERROR] mysqld got signal 11 ; +Sorry, we probably made a mistake, and this is a bug. + +Your assistance in bug reporting will enable us to fix this for the next release. 
+To report this bug, see https://mariadb.com/kb/en/reporting-bugs + +We will try our best to scrape up some info that will hopefully help +diagnose the problem, but since we have already crashed, +something is definitely wrong and this may fail. + +Server version: 10.5.23-MariaDB-0+deb11u1-log source revision: 6cfd2ba397b0ca689d8ff1bdb9fc4a4dc516a5eb +key_buffer_size=10485760 +read_buffer_size=131072 +max_used_connections=1 +max_threads=2050 +thread_count=1 +It is possible that mysqld could use up to +key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4523497 K bytes of memory +Hope that's ok; if not, decrease some variables in the equation. + +Thread pointer: 0x7f599c000c58 +Attempting backtrace. You can use the following information to find out +where mysqld died. If you see no messages after this, something went +terribly wrong... +stack_bottom = 0x7f6150282d78 thread_stack 0x49000 +/usr/sbin/mariadbd(my_print_stacktrace+0x2e)[0x55f43330c14e] +/usr/sbin/mariadbd(handle_fatal_signal+0x475)[0x55f432e013b5] +sigaction.c:0(__restore_rt)[0x7f615a1cb140] +/usr/sbin/mariadbd(+0xcbffbe)[0x55f43314efbe] +/usr/sbin/mariadbd(+0xd730ec)[0x55f4332020ec] +/usr/sbin/mariadbd(+0xd1b36b)[0x55f4331aa36b] +/usr/sbin/mariadbd(+0xd1cd8e)[0x55f4331abd8e] +/usr/sbin/mariadbd(+0xc596f3)[0x55f4330e86f3] +/usr/sbin/mariadbd(_ZN7handler18ha_index_next_sameEPhPKhj+0x2a5)[0x55f432e092b5] +/usr/sbin/mariadbd(+0x7b54d1)[0x55f432c444d1] +/usr/sbin/mariadbd(_Z10sub_selectP4JOINP13st_join_tableb+0x1f8)[0x55f432c37da8] +/usr/sbin/mariadbd(_ZN10JOIN_CACHE24generate_full_extensionsEPh+0x134)[0x55f432d24224] +/usr/sbin/mariadbd(_ZN10JOIN_CACHE21join_matching_recordsEb+0x206)[0x55f432d245d6] +/usr/sbin/mariadbd(_ZN10JOIN_CACHE12join_recordsEb+0x1cf)[0x55f432d23eff] +/usr/sbin/mariadbd(_Z16sub_select_cacheP4JOINP13st_join_tableb+0x8a)[0x55f432c382fa] +/usr/sbin/mariadbd(_ZN4JOIN10exec_innerEv+0xd16)[0x55f432c63826] +/usr/sbin/mariadbd(_ZN4JOIN4execEv+0x35)[0x55f432c63cc5] 
+/usr/sbin/mariadbd(_Z12mysql_selectP3THDP10TABLE_LISTR4ListI4ItemEPS4_jP8st_orderS9_S7_S9_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x106)[0x55f432c61c26] +/usr/sbin/mariadbd(_Z13handle_selectP3THDP3LEXP13select_resultm+0x138)[0x55f432c62698] +/usr/sbin/mariadbd(+0x762121)[0x55f432bf1121] +/usr/sbin/mariadbd(_Z21mysql_execute_commandP3THD+0x3d6c)[0x55f432bfdd1c] +/usr/sbin/mariadbd(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x20b)[0x55f432bff17b] +/usr/sbin/mariadbd(_Z16dispatch_command19enum_server_commandP3THDPcjbb+0xdb5)[0x55f432c00f55] +/usr/sbin/mariadbd(_Z10do_commandP3THD+0x120)[0x55f432c02da0] +/usr/sbin/mariadbd(_Z24do_handle_one_connectionP7CONNECTb+0x2f2)[0x55f432cf8b32] +/usr/sbin/mariadbd(handle_one_connection+0x5d)[0x55f432cf8dad] +/usr/sbin/mariadbd(+0xbb4ceb)[0x55f433043ceb] +nptl/pthread_create.c:478(start_thread)[0x7f615a1bfea7] +x86_64/clone.S:97(__GI___clone)[0x7f6159dc6acf] + +Trying to get some variables. +Some pointers may be invalid and cause the dump to abort. 
+Query (0x7f599c012c50): SELECT ProbeSet.Name,ProbeSet.Chr,ProbeSet.Mb, + ProbeSet.Symbol,ProbeSetXRef.mean, + CONCAT_WS('; ', ProbeSet.description, ProbeSet.Probe_Target_Description) AS description, + ProbeSetXRef.additive,ProbeSetXRef.LRS,Geno.Chr, Geno.Mb + FROM ProbeSet INNER JOIN ProbeSetXRef + ON ProbeSet.Id=ProbeSetXRef.ProbeSetId + INNER JOIN Geno + ON ProbeSetXRef.Locus = Geno.Name + INNER JOIN Species + ON Geno.SpeciesId = Species.Id + WHERE ProbeSet.Name in ('1447591_x_at', '1422809_at', '1428917_at', '1438096_a_at', '1416474_at', '1453271_at', '1441725_at', '1452952_at', '1456774_at', '1438413_at', '1431110_at', '1453723_x_at', '1424124_at', '1448706_at', '1448762_at', '1428332_at', '1438389_x_at', '1455508_at', '1455805_x_at', '1433276_at', '1454989_at', '1427467_a_at', '1447448_s_at', '1438695_at', '1456795_at', '1454874_at', '1455189_at', '1448631_a_at', '1422697_s_at', '1423717_at', '1439484_at', '1419123_a_at', '1435286_at', '1439886_at', '1436348_at', '1437475_at', '1447667_x_at', '1421046_a_at', '1448296_x_at', '1460577_at', 'AFFX-GapdhMur/M32599_M_at', '1424393_s_at', '1426190_at', '1434749_at', '1455706_at', '1448584_at', '1434093_at', '1434461_at', '1419401_at', '1433957_at', '1419453_at', '1416500_at', '1439436_x_at', '1451413_at', '1455696_a_at', '1457190_at', '1455521_at', '1434842_s_at', '1442525_at', '1452331_s_at', '1428862_at', '1436463_at', '1438535_at', 'AFFX-GapdhMur/M32599_3_at', '1424012_at', '1440027_at', '1435846_x_at', '1443282_at', '1435567_at', '1450112_a_at', '1428251_at', '1429063_s_at', '1433781_a_at', '1436698_x_at', '1436175_at', '1435668_at', '1424683_at', '1442743_at', '1416944_a_at', '1437511_x_at', '1451254_at', '1423083_at', '1440158_x_at', '1424324_at', '1426382_at', '1420142_s_at', '1434553_at', '1428772_at', '1424094_at', '1435900_at', '1455322_at', '1453283_at', '1428551_at', '1453078_at', '1444602_at', '1443836_x_at', '1435590_at', '1434283_at', '1435240_at', '1434659_at', '1427032_at', '1455278_at', 
'1448104_at', '1421247_at', 'AFFX-MURINE_b1_at', '1460216_at', '1433969_at', '1419171_at', '1456699_s_at', '1456901_at', '1442139_at', '1421849_at', '1419824_a_at', '1460588_at', '1420131_s_at', '1446138_at', '1435829_at', '1434462_at', '1435059_at', '1415949_at', '1460624_at', '1426707_at', '1417250_at', '1434956_at', '1438018_at', '1454846_at', '1435298_at', '1442077_at', '1424074_at', '1428883_at', '1454149_a_at', '1423925_at', '1457060_at', '1433821_at', '1447923_at', '1460670_at', '1434468_at', '1454980_at', '1426913_at', '1456741_s_at', '1449278_at', '1443534_at', '1417941_at', '1433167_at', '1434401_at', '1456516_x_at', '1451360_at', 'AFFX-GapdhMur/M32599_5_at', '1417827_at', '1434161_at', '1448979_at', '1435797_at', '1419807_at', '1418330_at', '1426304_x_at', '1425492_at', '1437873_at', '1435734_x_at', '1420622_a_at', '1456019_at', '1449200_at', '1455314_at', '1428419_at', '1426349_s_at', '1426743_at', '1436073_at', '1452306_at', '1436735_at', '1439529_at', '1459347_at', '1429642_at', '1438930_s_at', '1437380_x_at', '1459861_s_at', '1424243_at', '1430503_at', '1434474_at', '1417962_s_at', '1440187_at', '1446809_at', '1436234_at', '1415906_at', 'AFFX-MURINE_B2_at', '1434836_at', '1426002_a_at', '1448111_at', '1452882_at', '1436597_at', '1455915_at', '1421846_at', '1428693_at', '1422624_at', '1423755_at', '1460367_at', '1433746_at', '1454872_at', '1429194_at', '1424652_at', '1440795_x_at', '1458690_at', '1434355_at', '1456324_at', '1457867_at', '1429698_at', '1423104_at', '1437585_x_at', '1437739_a_at', '1445605_s_at', '1436313_at', '1449738_s_at', '1437525_a_at', '1454937_at', '1429043_at', '1440091_at', '1422820_at', '1437456_x_at', '1427322_at', '1446649_at', '1433568_at', '1441114_at', '1456541_x_at', '1426985_s_at', '1454764_s_at', '1424071_s_at', '1429251_at', '1429155_at', '1433946_at', '1448771_a_at', '1458664_at', '1438320_s_at', '1449616_s_at', '1435445_at', '1433872_at', '1429273_at', '1420880_a_at', '1448645_at', '1449646_s_at', '1428341_at', 
'1431299_a_at', '1433427_at', '1418530_at', '1436247_at', '1454350_at', '1455860_at', '1417145_at', '1454952_s_at', '1435977_at', '1434807_s_at', '1428715_at', '1418117_at', '1447947_at', '1431781_at', '1428915_at', '1427197_at', '1427208_at', '1455460_at', '1423899_at', '1441944_s_at', '1455429_at', '1452266_at', '1454409_at', '1426384_a_at', '1428725_at', '1419181_at', '1454862_at', '1452907_at', '1433794_at', '1435492_at', '1424839_a_at', '1416214_at', '1449312_at', '1436678_at', '1426253_at', '1438859_x_at', '1448189_a_at', '1442557_at', '1446174_at', '1459718_x_at', '1437613_s_at', '1456509_at', '1455267_at', '1440480_at', '1417296_at', '1460050_x_at', '1433585_at', '1436771_x_at', '1424294_at', '1448648_at', '1417753_at', '1436139_at', '1425642_at', '1418553_at', '1415747_s_at', '1445984_at', '1440024_at', '1448720_at', '1429459_at', '1451459_at', '1428853_at', '1433856_at', '1426248_at', '1417765_a_at', '1439459_x_at', '1447023_at', '1426088_at', '1440825_s_at', '1417390_at', '1444744_at', '1435618_at', '1424635_at', '1443727_x_at', '1421096_at', '1427410_at', '1416860_s_at', '1442773_at', '1442030_at', '1452281_at', '1434774_at', '1416891_at', '1447915_x_at', '1429129_at', '1418850_at', '1416308_at', '1422858_at', '1447679_s_at', '1440903_at', '1417321_at', '1452342_at', '1453510_s_at', '1454923_at', '1454611_a_at', '1457532_at', '1438440_at', '1434232_a_at', '1455878_at', '1455571_x_at', '1436401_at', '1453289_at', '1457365_at', '1436708_x_at', '1434494_at', '1419588_at', '1433679_at', '1455159_at', '1428982_at', '1446510_at', '1434131_at', '1418066_at', '1435346_at', '1449415_at', '1455384_x_at', '1418817_at', '1442073_at', '1457265_at', '1447361_at', '1418039_at', '1428467_at', '1452224_at', '1417538_at', '1434529_x_at', '1442149_at', '1437379_x_at', '1416473_a_at', '1432750_at', '1428389_s_at', '1433823_at', '1451889_at', '1438178_x_at', '1441807_s_at', '1416799_at', '1420623_x_at', '1453245_at', '1434037_s_at', '1443012_at', '1443172_at', '1455321_at', 
'1438396_at', '1440823_x_at', '1436278_at', '1457543_at', '1452908_at', '1417483_at', '1418397_at', '1446589_at', '1450966_at', '1447877_x_at', '1446524_at', '1438592_at', '1455589_at', '1428629_at', '1429585_s_at', '1440020_at', '1417365_a_at', '1426442_at', '1427151_at', '1437377_a_at', '1433995_s_at', '1435464_at', '1417007_a_at', '1429690_at', '1427999_at', '1426819_at', '1454905_at', '1439516_at', '1434509_at', '1428707_at', '1416793_at', '1440822_x_at', '1437327_x_at', '1428682_at', '1435004_at', '1434238_at', '1417581_at', '1434699_at', '1455597_at', '1458613_at', '1456485_at', '1435122_x_at', '1452864_at', '1453122_at', '1435254_at', '1451221_at', '1460168_at', '1455336_at', '1427965_at', '1432576_at', '1455425_at', '1428762_at', '1455459_at', '1419317_x_at', '1434691_at', '1437950_at', '1426401_at', '1457261_at', '1433824_x_at', '1435235_at', '1437343_x_at', '1439964_at', '1444280_at', '1455434_a_at', '1424431_at', '1421519_a_at', '1428412_at', '1434010_at', '1419976_s_at', '1418887_a_at', '1428498_at', '1446883_at', '1435675_at', '1422599_s_at', '1457410_at', '1444437_at', '1421050_at', '1437885_at', '1459754_x_at', '1423807_a_at', '1435490_at', '1426760_at', '1449459_s_at', '1432098_a_at', '1437067_at', '1435574_at', '1433999_at', '1431289_at', '1428919_at', '1425678_a_at', '1434924_at', '1421640_a_at', '1440191_s_at', '1460082_at', '1449913_at', '1439830_at', '1425020_at', '1443790_x_at', '1436931_at', '1454214_a_at', '1455854_a_at', '1437061_at', '1436125_at', '1426385_x_at', '1431893_a_at', '1417140_a_at', '1435333_at', '1427907_at', '1434446_at', '1417594_at', '1426518_at', '1437345_a_at', '1420091_s_at', '1450058_at', '1435161_at', '1430348_at', '1455778_at', '1422653_at', '1447942_x_at', '1434843_at', '1454956_at', '1454998_at', '1427384_at', '1439828_at') AND + Species.Name = 'mouse' AND + ProbeSetXRef.ProbeSetFreezeId IN ( + SELECT ProbeSetFreeze.Id + FROM ProbeSetFreeze WHERE ProbeSetFreeze.Name = 'HC_M2_0606_P') + +Connection ID (thread ID): 41 
+Status: NOT_KILLED + +Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off + +The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mariadbd/ contains +information that should help you find out what is causing the crash. +Writing a core file... 
+Working directory at /export/mysql/var/lib/mysql +Resource Limits: +Limit Soft Limit Hard Limit Units +Max cpu time unlimited unlimited seconds +Max file size unlimited unlimited bytes +Max data size unlimited unlimited bytes +Max stack size 8388608 unlimited bytes +Max core file size 0 unlimited bytes +Max resident set unlimited unlimited bytes +Max processes 3094157 3094157 processes +Max open files 64000 64000 files +Max locked memory 65536 65536 bytes +Max address space unlimited unlimited bytes +Max file locks unlimited unlimited locks +Max pending signals 3094157 3094157 signals +Max msgqueue size 819200 819200 bytes +Max nice priority 0 0 +Max realtime priority 0 0 +Max realtime timeout unlimited unlimited us +Core pattern: core + +Kernel version: Linux version 5.10.0-22-amd64 (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.178-3 (2023-04-22) + +2025-02-20 7:50:17 0 [Note] Starting MariaDB 10.5.23-MariaDB-0+deb11u1-log source revision 6cfd2ba397b0ca689d8ff1bdb9fc4a4dc516a5eb as process 3086167 +2025-02-20 7:50:17 0 [Note] InnoDB: !!! innodb_force_recovery is set to 1 !!! +2025-02-20 7:50:17 0 [Note] InnoDB: Uses event mutexes +2025-02-20 7:50:17 0 [Note] InnoDB: Compressed tables use zlib 1.2.11 +2025-02-20 7:50:17 0 [Note] InnoDB: Number of pools: 1 +2025-02-20 7:50:17 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions +2025-02-20 7:50:17 0 [Note] InnoDB: Using Linux native AIO +2025-02-20 7:50:17 0 [Note] InnoDB: Initializing buffer pool, total size = 17179869184, chunk size = 134217728 +2025-02-20 7:50:17 0 [Note] InnoDB: Completed initialization of buffer pool +2025-02-20 7:50:17 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1537379110991,1537379110991 +2025-02-20 7:50:17 0 [Note] InnoDB: Last binlog file '/var/lib/mysql/gn0-binary-log.000134', position 82843148 +2025-02-20 7:50:17 0 [Note] InnoDB: 128 rollback segments are active. 
+2025-02-20 7:50:17 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1" +2025-02-20 7:50:17 0 [Note] InnoDB: Creating shared tablespace for temporary tables +2025-02-20 7:50:17 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ... +2025-02-20 7:50:17 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB. +2025-02-20 7:50:17 0 [Note] InnoDB: 10.5.23 started; log sequence number 1537379111003; transaction id 3459549902 +2025-02-20 7:50:17 0 [Note] Plugin 'FEEDBACK' is disabled. +2025-02-20 7:50:17 0 [Note] InnoDB: Loading buffer pool(s) from /export/mysql/var/lib/mysql/ib_buffer_pool +2025-02-20 7:50:17 0 [Note] Loaded 'locales.so' with offset 0x7f9551bc0000 +2025-02-20 7:50:17 0 [Note] Recovering after a crash using /var/lib/mysql/gn0-binary-log +2025-02-20 7:50:17 0 [Note] Starting crash recovery... +2025-02-20 7:50:17 0 [Note] Crash recovery finished. +2025-02-20 7:50:17 0 [Note] Server socket created on IP: '0.0.0.0'. +2025-02-20 7:50:17 0 [Warning] 'user' entry 'webqtlout@tux01' ignored in --skip-name-resolve mode. +2025-02-20 7:50:17 0 [Warning] 'db' entry 'db_webqtl webqtlout@tux01' ignored in --skip-name-resolve mode. +2025-02-20 7:50:17 0 [Note] Reading of all Master_info entries succeeded +2025-02-20 7:50:17 0 [Note] Added new Master_info '' to hash table +2025-02-20 7:50:17 0 [Note] /usr/sbin/mariadbd: ready for connections. 
+Version: '10.5.23-MariaDB-0+deb11u1-log' socket: '/run/mysqld/mysqld.sock' port: 3306 Debian 11
+2025-02-20 7:50:17 4 [Warning] Access denied for user 'root'@'localhost' (using password: NO)
+2025-02-20 7:50:17 5 [Warning] Access denied for user 'root'@'localhost' (using password: NO)
+2025-02-20 7:50:17 0 [Note] InnoDB: Buffer pool(s) load completed at 250220 7:50:17
+```
+
+A possible issue is the use of the environment variable SQL_URI at this point:
+
+=> https://github.com/genenetwork/genenetwork2/blob/testing/gn2/wqflask/correlation/rust_correlation.py#L34
+
+which is read from the environment here:
+
+=> https://github.com/genenetwork/genenetwork2/blob/testing/gn2/wqflask/correlation/rust_correlation.py#L7
+
+I tried setting an environment variable "SQL_URI" with the same value as the config and rebuilt the container. That did not fix the problem.
+
+Running the query directly in the default mysql client also fails with:
+
+```
+ERROR 2013 (HY000): Lost connection to MySQL server during query
+```
+
+Huh, so this was not a code problem.
+
+Configured the database to allow table upgrades if necessary and restarted mariadbd.
+
+The problem still persists.
+
+Note Pjotr: this is likely a mariadb bug in 10.5.23, the most recent mariadbd we use (both tux01 and tux02 run older versions). The dump shows it balks on creating a new thread: pthread_create.c:478. Looks similar to https://jira.mariadb.org/browse/MDEV-32262
+
+10.5, 10.6 and 10.11 are affected. So running correlations on production crashes mysqld? I am not trying for obvious reasons ;) The threading issues of mariadb look scary - I wonder how deep it goes.
+
+We'll test a different version of mariadb, combined with a Debian update, because Debian on tux04 is broken.
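The suspicion above comes down to configuration precedence: the correlation code reads SQL_URI from the environment, falling back to the application config, so the two can silently diverge. A minimal sketch of that lookup order (function, key, and host names are illustrative, not the actual GN2 code):

```python
import os

def effective_sql_uri(app_config: dict) -> str:
    """Resolve the database URI: an SQL_URI environment variable
    wins over the application config (illustrative, not GN2 code)."""
    return os.environ.get("SQL_URI", app_config.get("SQL_URI", ""))

# The env var silently overrides the config entry:
config = {"SQL_URI": "mysql://user:pass@localhost/db_webqtl"}
os.environ["SQL_URI"] = "mysql://user:pass@remote-host/db_webqtl"
```

If the environment variable and the config value are identical, as in the experiment above, swapping one for the other changes nothing, which is consistent with the crash being a server-side mariadb bug rather than a connection-string problem.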
diff --git a/issues/quality-control/fix-flash-messages.gmi b/issues/quality-control/fix-flash-messages.gmi
index da54c52..e65c0f6 100644
--- a/issues/quality-control/fix-flash-messages.gmi
+++ b/issues/quality-control/fix-flash-messages.gmi
@@ -5,7 +5,7 @@
 * assigned: fredm
 * priority: low
 * type: bug
-* status: open
+* status: closed, completed, fixed
 * keywords: flask, flash

 ## Description
diff --git a/issues/quality-control/qc-r-qtl2-bundles.gmi b/issues/quality-control/qc-r-qtl2-bundles.gmi
index 9cc1452..6560594 100644
--- a/issues/quality-control/qc-r-qtl2-bundles.gmi
+++ b/issues/quality-control/qc-r-qtl2-bundles.gmi
@@ -3,7 +3,7 @@
 ## Tags

 * assigned: fredm, acenteno
-* status: open
+* status: closed, completed
 * type: feature request
 * priority: medium
 * keywords: quality control, QC, R/qtl2 bundle
diff --git a/issues/rdf/automate-rdf-generation-and-ingress.gmi b/issues/rdf/automate-rdf-generation-and-ingress.gmi
new file mode 100644
index 0000000..ef4ba9f
--- /dev/null
+++ b/issues/rdf/automate-rdf-generation-and-ingress.gmi
@@ -0,0 +1,37 @@
+# Update RDF Generation and Ingress to Virtuoso
+
+## Tags
+
+* assigned: bonfacem
+* priority: high
+* status: in-progress
+* deadline: 2024-10-23 Wed
+
+We need to update Virtuoso in production. At the moment this is done manually.
For the current set-up, we need to update the recently modified RIF+WIKI models:
+
+
+```
+# Generate the RDF triples
+time guix shell guile-dbi guile-hashing -m manifest.scm -- ./pre-inst-env ./examples/generif.scm --settings conf.scm --output /home/bonfacem/ttl-files/generif-metadata-new.ttl --documentation ./docs/generif-metadata.md
+
+# Make sure they are valid
+guix shell -m manifest.scm -- rapper --input turtle --count /home/bonfacem/ttl-files/generif-metadata-new.ttl
+
+# Copy the files over to the exposed virtuoso path
+cp /home/bonfacem/ttl-files/generif-metadata-new.ttl </some/dir/>
+
+# Get into Virtuoso (with a password)
+guix shell virtuoso-ose -- isql <port-number>
+
+# Register the files to be loaded
+# Assuming that '/var/lib/data' is where the files are
+ld_dir('/var/lib/data', 'generif-metadata-new.ttl', 'http://genenetwork.org');
+
+# Load the registered files
+rdf_loader_run();
+CHECKPOINT;
+```
+
+The above steps should be automated and tested in CD before roll-out to production. Key considerations:
+
+- Pick up the latest changes from git, so that we only regenerate the affected TTL files instead of rebuilding all of them every time.
diff --git a/issues/rdf/hash-rdf-graph.gmi b/issues/rdf/hash-rdf-graph.gmi
index c896218..2863108 100644
--- a/issues/rdf/hash-rdf-graph.gmi
+++ b/issues/rdf/hash-rdf-graph.gmi
@@ -5,3 +5,12 @@
 ## Description

 Building the index is an expensive operation. Hash the graph and store the metadata in xapian, and similarly in the RDF store. The mcron-job should check whether this has changed, and if there's any difference, go ahead and re-build the index.
+
+Resolution:
+
+=> https://github.com/genenetwork/genenetwork3/pull/171 Improve Sharing Memory Across Processes.
+=> https://github.com/genenetwork/genenetwork3/pull/172 Check whether table names were stored in xapian.
+=> https://github.com/genenetwork/genenetwork3/pull/174 Wikidata index.
+=> https://github.com/genenetwork/genenetwork3/pull/175 Refactor how the generif md5 sum is calculated and stored in XAPIAN. + +* closed diff --git a/issues/redesign-global-search-design.gmi b/issues/redesign-global-search-design.gmi new file mode 100644 index 0000000..df63791 --- /dev/null +++ b/issues/redesign-global-search-design.gmi @@ -0,0 +1,23 @@ +# Redesign Global Search Design + +## Tags +* assigned: alexm, zac +* keywords: global search, design, HTML +* type: enhancement +* status: closed, completed, done + +## Description +Rob suggested we model the global search on the NCBI PubMed interface. We should remove the `?` button, which seems to be confusing for users, and have a better user guide. + +## Tasks + +* [x] Redesign the global search to fit the NCBI PubMed model. +* [x] Replace the "?" button that acts as a user guide + +## Related issues: + +=> https://issues.genenetwork.org/issues/cleanup-base-file-gn2 + +## Notes +PR that seeks to address this issue: +=> https://github.com/genenetwork/genenetwork2/pull/880
\ No newline at end of file diff --git a/issues/remove-custom-bootstrap-css.gmi b/issues/remove-custom-bootstrap-css.gmi index 7fa6f24..14c1c35 100644 --- a/issues/remove-custom-bootstrap-css.gmi +++ b/issues/remove-custom-bootstrap-css.gmi @@ -1,7 +1,7 @@ # Remove overrides to bootstrap classes in bootstrap-custom.css * assigned: zachs, bonfacem, alexm - +* status: stalled We have a "bootstrap-custom.css" in GeneNetwork. Consider this snippet: diff --git a/issues/remove-references-to-old-gn-auth-code.gmi b/issues/remove-references-to-old-gn-auth-code.gmi index 1a03c25..8c110aa 100644 --- a/issues/remove-references-to-old-gn-auth-code.gmi +++ b/issues/remove-references-to-old-gn-auth-code.gmi @@ -4,7 +4,7 @@ * assigned: bonfacem * keywords: auth -* status: open +* status: stalled ## Description diff --git a/issues/replace-neo4j-with-virtuoso.gmi b/issues/replace-neo4j-with-virtuoso.gmi new file mode 100644 index 0000000..450fb70 --- /dev/null +++ b/issues/replace-neo4j-with-virtuoso.gmi @@ -0,0 +1,8 @@ +# Replace Neo4J with Virtuoso + +## Tags + +* assigned: bonfacem, soloshelby +* deadline: 2024-10-25 Fri + +Currently, the RAG ingests TTL files into Neo4J. Replace this with Virtuoso. diff --git a/issues/reset-password-on-container-rebuild.gmi b/issues/reset-password-on-container-rebuild.gmi index b0e4dbb..6c0ad1e 100644 --- a/issues/reset-password-on-container-rebuild.gmi +++ b/issues/reset-password-on-container-rebuild.gmi @@ -2,5 +2,6 @@ ## Tags * assigned: bonfacem +* status: stalled Whenever the virtuoso container is rebuilt, we manually have to reset the password. We should fix this by modifying the virtuoso service so that things are set automatically. 
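One way to fix the virtuoso password reset is to have the service run a small idempotent step after the container starts that pushes the desired password into Virtuoso via isql. A sketch of the idea in Python (the isql invocation and the use of Virtuoso's user_set_password procedure are assumptions about the virtuoso-ose tooling, not code from our service definition):

```python
import subprocess

def password_script(new_password: str) -> str:
    # Virtuoso's built-in user_set_password procedure changes the dba
    # password. Naive quoting: fine for a generated secret, not for
    # arbitrary input.
    return f"user_set_password('dba', '{new_password}');\nexit;\n"

def reset_dba_password(port: int, current: str, new: str) -> None:
    # Assumption: isql (from virtuoso-ose) is on PATH inside the
    # container; authenticate with the current password, set the new one.
    subprocess.run(["isql", str(port), "dba", current],
                   input=password_script(new), text=True, check=True)
```

On first boot the current password is the Virtuoso default; on later rebuilds resetting to the same secret is harmless, so the step can run unconditionally from the service's start-up script.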
diff --git a/issues/search-for-brca.gmi b/issues/search-for-brca.gmi
index c42c745..05c6fd0 100644
--- a/issues/search-for-brca.gmi
+++ b/issues/search-for-brca.gmi
@@ -1,10 +1,31 @@
-# Search for brca
+# Search Improvements: case-insensitive search for RIF+WIKI; Examples

-* assigned: arun
+## Tags

-Search for brca does not return results for brca1 and brca2. It should.
-=> https://cd.genenetwork.org/gsearch?type=gene&terms=brca
+* assigned: bonfacem, rookie101
+* priority: high
+* type: ops
+* keywords: virtuoso

-The xapian stemmer does not stem brca1 to brca. That's why when one searches for brca, results for brca1 are not returned.
+## Description
+
+RIF search is finally working on production:
+
+> rif:Brca2 and group:BXD
+
+and case-insensitive search too for the BXD. See:
+
+=> https://github.com/genenetwork/genenetwork3/commit/4b2e9f3fb3383421d7a55df5399aab71e0cc3b4f Stem group field regardless of case.
+=> https://github.com/genenetwork/genenetwork3/commit/a37622b466f9f045db06a6f07e88fcf81b176f91 Stem all the time.
+
+## Questions:
+
+* How do we search genewiki data?
+
+* rif:Brca2 should also be RIF:Brca2 (prefer the latter if we have to
+choose as that is what people will try)
+
+* Can we continue giving examples at
+
+=> https://genenetwork.org/search-syntax search syntax
-Perhaps we should write a custom stemmer that stems brca1 to brca. But, at the same time, we should be wary of stemming terms like p450 to p. Pjotr suggests the heuristic that we look for at least 2 or 3 alphabetic characters at the beginning. Another approach is to hard-code a list of candidates to look for.
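Pjotr's heuristic is easy to prototype: strip a trailing run of digits only when the term starts with at least three alphabetic characters, so brca1 stems to brca while p450 is left alone. A sketch of the idea (not the actual GN3/Xapian stemmer):

```python
import re

def stem_gene_symbol(term: str, min_alpha: int = 3) -> str:
    """Strip a trailing digit run, but only when the term begins with
    at least `min_alpha` letters; otherwise return the term unchanged."""
    match = re.fullmatch(r"([A-Za-z]+)\d+", term)
    if match and len(match.group(1)) >= min_alpha:
        return match.group(1)
    return term
```

Indexing both the raw and the stemmed form would let a search for brca match brca1 and brca2 without breaking exact-term queries like p450.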
diff --git a/issues/set-up-gn-guile-in-tux02.gmi b/issues/set-up-gn-guile-in-tux02.gmi new file mode 100644 index 0000000..29eca68 --- /dev/null +++ b/issues/set-up-gn-guile-in-tux02.gmi @@ -0,0 +1,15 @@ +# Set Up gn-guile in tux02 + +## Tags + +* assigned: bonfacem +* priority: high +* status: in-progress +* deadline: 2024-10-23 Wed + +## Tasks + +* [-] Create gn-guile container. +* [X] Merge gn2 UI PR. +=> https://github.com/genenetwork/genenetwork2/pull/854 Feature/gn editor UI +* [-] Test out auth editing in CD. diff --git a/issues/set-up-virtuoso-on-production.gmi b/issues/set-up-virtuoso-on-production.gmi index 88c04f7..614565a 100644 --- a/issues/set-up-virtuoso-on-production.gmi +++ b/issues/set-up-virtuoso-on-production.gmi @@ -1,8 +1,8 @@ -# Set-up Virtuoso on Production +# Set-up Virtuoso+Xapian on Production ## Tags -* assigned: bonfacem +* assigned: bonfacem, zachs, fredm * priority: high * type: ops * keywords: virtuoso @@ -11,5 +11,121 @@ We already have virtuoso set-up in tux02. Right now, to be able to interact with RDF, we need to have virtuoso set-up. 
This issue will unblock:
+* Global Search in Production
+  => https://github.com/genenetwork/genenetwork3/pull/137 Update RDF endpoints
+  => https://github.com/genenetwork/genenetwork2/pull/808 UI/RDF frontend
+
+
+## HOWTO: Updating Virtuoso in Production (Tux01)
+
+
+Note where the virtuoso data directory is mapped from the "production.sh" script, as you will use it in the subsequent steps:
+
+> --share=/export2/guix-containers/genenetwork/var/lib/virtuoso=/var/lib/virtuoso
+
+### Generating the TTL Files
+
+=> https://git.genenetwork.org/gn-transform-databases/tree/generate-ttl-files.scm Run "generate-ttl-files" to generate the TTL files:
+
+```
+time guix shell guile-dbi -m manifest.scm -- \
+./generate-ttl-files.scm --settings conn-dev.scm --output \
+/export2/guix-containers/genenetwork-development/var/lib/virtuoso \
+--documentation /tmp/doc-directory
+```
+
+* [Recommended] Alternatively, copy over the TTL files (in Tux01) to the correct shared directory in the container:
+
+```
+cp /home/bonfacem/ttl-files/*ttl /export2/guix-containers/genenetwork/var/lib/virtuoso/
+```
+
+### Loading the TTL Files
+
+* Make sure that the virtuoso service type has the "dirs-allowed" variable set correctly:
+
+```
+(service virtuoso-service-type
+         (virtuoso-configuration
+          (server-port 7892)
+          (http-server-port 7893)
+          (dirs-allowed "/var/lib/virtuoso")))
+```
+
+* Get into isql:
+
+```
+guix shell virtuoso-ose -- isql 7892
+```
+* Make sure that no pre-existing TTL files exist in "DB.DBA.LOAD_LIST":
+
+```
+SQL> select * from DB.DBA.LOAD_LIST;
+SQL> delete from DB.DBA.load_list;
+```
+* Delete the genenetwork graph:
+
+```
+SQL> DELETE FROM rdf_quad WHERE g = iri_to_id('http://genenetwork.org');
+```
+
+* Load all the TTL files (this takes some time):
+
+```
+SQL> ld_dir('/var/lib/virtuoso', '*.ttl', 'http://genenetwork.org');
+SQL> rdf_loader_run();
+SQL> CHECKPOINT;
+SQL> checkpoint_interval(60);
+SQL> scheduler_interval(10);
+```
+* Verify you have some RDF data by
running:
+
+```
+SQL> SPARQL
+PREFIX gn: <http://genenetwork.org/id/>
+PREFIX gnc: <http://genenetwork.org/category/>
+PREFIX owl: <http://www.w3.org/2002/07/owl#>
+PREFIX gnt: <http://genenetwork.org/term/>
+PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
+PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
+PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
+PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
+
+SELECT * WHERE {
+  ?s skos:member gn:Mus_musculus .
+  ?s ?p ?o .
+};
+```
+
+* Update GN3 configurations to point to the correct Virtuoso instance:
+
+> SPARQL_ENDPOINT="http://localhost:7893/sparql"
+
+## HOWTO: Generating the Xapian Index
+
+* Make sure you are using the correct guix profile or that you have the "PYTHONPATH" pointing to the GN3 repository.
+
+* Generate the Xapian index using "genenetwork3/scripts/create-xapian-index" against the correct output directory (the build takes around 71 minutes on an SSD drive):
+
+```
+time python index-genenetwork create-xapian-index \
+/export/data/genenetwork-xapian/ \
+mysql://<user>:<password>@localhost/db_webqtl \
+http://localhost:7893/sparql
+```
+* After the build, you can verify that the index works with:
+
+```
+guix shell xapian -- xapian-delve /export/data/genenetwork-xapian/
+```
+* Update GN3 configuration files to point to the right Xapian path:
+
+> XAPIAN_DB_PATH="/export/data/genenetwork-xapian/"
+
+## Resolution
+
+@fredm updated virtuoso and @zachs updated the xapian index in production.
+
+* closed
diff --git a/issues/systems/apps.gmi b/issues/systems/apps.gmi
new file mode 100644
index 0000000..b9d4155
--- /dev/null
+++ b/issues/systems/apps.gmi
@@ -0,0 +1,207 @@
+# Apps
+
+GeneNetwork.org hosts a number of apps. Currently they are managed by shepherd as `guix shell` services, but we should really move them to system containers.
+ +# Tags + +* assigned: pjotrp +* type: enhancement +* status: in progress +* priority: medium +* keywords: system, sheepdog, shepherd + +# Tasks + +* [ ] Get services running +* [ ] Move guix shell into containers +* [ ] Make sure the container starts up on reboot and/or migrate to a new host + +# List of apps + +Current apps managed by shepherd/systemd on tux02/balg01 are + +=> https://genecup.org/ +* [+] genecup [shell] (hao) +* [X] - fire up service +* [X] - add sheepdog monitor +* [ ] - add link in GN2 +* [X] - add banner for GeneNetwork +* [ ] - create system container +* [X] - create guix root +* [ ] - make sure it works on reboot (systemd) +=> https://bnw.genenetwork.org/ +* [+] bnw [container] (yan cui and rob) +* [X] - fire up service +* [X] - add sheepdog monitor +* [X] - add link in GN2 +* [ ] - add banner for GeneNetwork +* [ ] - update system container +* [X] - create guix root +* [ ] - make sure it works on reboot (systemd) +=> http://hrdp.genenetwork.org +* [+] hrdp-project (hao?) +* [X] - fire up service +* [X] - add sheepdog monitor +* [ ] - https +* [ ] - add link in GN2 +* [ ] - add banner for GeneNetwork +* [ ] - create system container +* [ ] - create guix root +* [ ] - make sure it works on reboot (systemd) +=> https://pluto.genenetwork.org/ +* [+] pluto (saunak) +* [X] - fire up service +* [X] - add sheepdog monitor +* [ ] - add link in GN2 +* [ ] - add banner for GeneNetwork +* [ ] - create system container +* [ ] - create guix root +* [ ] - make sure it works on reboot (systemd) +=> https://power.genenetwork.org/ +* [+] power app (dave) +* [X] - fire up service +* [X] - add sheepdog monitor +* [ ] - add link in GN2 +* [ ] - add banner for GeneNetwork +* [ ] - create system container +* [X] - create guix root +* [ ] - make sure it works on reboot (systemd) +* [ ] root? 
+=> http://longevity-explorer.genenetwork.org/ +* [+] Longevity explorer [container balg01] (dave) +* [X] - fire up service +* [X] - add sheepdog monitor +* [ ] - https +* [ ] - add link in GN2 +* [ ] - add banner for GeneNetwork +* [ ] - create system container +* [ ] - create guix root +* [ ] - make sure it works on reboot (systemd) +=> http://jumpshiny.genenetwork.org/ +* [+] jumpshiny app (xusheng) +* [+] - fire up service (still some dependencies) +* [X] - add sheepdog monitor +* [ ] - https +* [ ] - add link in GN2 +* [ ] - add banner for GeneNetwork +* [ ] - create system container +* [ ] - create guix root +* [ ] - make sure it works on reboot (systemd) +=> https://hegp.genenetwork.org/ +* [+] hegp (pjotr) +* [X] - fire up service +* [X] - add sheepdog monitor +* [ ] - add link in GN2 +* [ ] - add banner for GeneNetwork +* [ ] - create system container +* [ ] - create guix root +* [X] - make sure it works on reboot (systemd) + +* [-] singlecell (siamak) +* [-] rn6app (hao - remove) +* [-] genome-browser (defunct) + +To fix them we need to validate the sheepdog monitor and make sure they are working in either shepherd (+), or as a system container (X). + +Sheepdog monitor is at + +=> http://sheepdog.genenetwork.org/sheepdog/status.html + +# Info + +## BNW + +The app is already a Guix system container! 
To make it part of the startup I had to move it away from shepherd (which runs in userland) and:
+
+```
+/home/shepherd/guix-profiles/bnw/bin/guix system container /home/shepherd/guix-bioinformatics/gn/services/bnw-container.scm --share=/home/shepherd/logs/bnw-server=/var/log --network
+ln -s /gnu/store/0hnfb9ynnxsig3yyprwxmg5h6c9g8mry-run-container /usr/local/bin/bnw-app-container
+```
+
+systemd service:
+
+```
+root@tux02:/etc/systemd/system# cat bnw-app-container.service
+[Unit]
+Description = Run genenetwork BNW app container
+[Service]
+ExecStart = /usr/local/bin/bnw-app-container
+[Install]
+WantedBy = multi-user.target
+```
+
+We need to make sure the garbage collector does not destroy the container, so add the --root switch:
+
+```
+/home/shepherd/guix-profiles/bnw/bin/guix system container /home/shepherd/guix-bioinformatics/gn/services/bnw-container.scm --share=/home/shepherd/logs/bnw-server=/var/log --network --root=/usr/local/bin/bnw-app-container
+```
+
+Check with
+
+```
+root@tux02:/home/shepherd# /home/shepherd/guix-profiles/bnw/bin/guix gc --list-roots |grep bnw
+ /usr/local/bin/bnw-app-container
+```
+
+## R/shiny apps
+
+The R/shiny apps were showing a tarball mismatch:
+
+```
+building /gnu/store/rjnw7k56z955v4bl07flm9pjwxx5vs0r-r-minimal-4.0.2.drv...
+downloading from http://cran.r-project.org/src/contrib/Archive/KernSmooth/KernSmooth_2.23-17.tar.gz ...
+- 'configure' phasesha256 hash mismatch for /gnu/store/n05zjfhxl0iqx1jbw8i6vv1174zkj7ja-KernSmooth_2.23-17.tar.gz:
+  expected hash: 11g6b0q67vasxag6v9m4px33qqxpmnx47c73yv1dninv2pz76g9b
+  actual hash:   1ciaycyp79l5aj78gpmwsyx164zi5jc60mh84vxxzq4j7vlcdb5p
+hash mismatch for store item '/gnu/store/n05zjfhxl0iqx1jbw8i6vv1174zkj7ja-KernSmooth_2.23-17.tar.gz'
+```
+
+Guix checks the hashes, and it is not great that CRAN allows changing tarballs without bumping the version number!! Luckily building with a more recent version of Guix just worked (TM).
+Now we create a root too:
+
+```
+/home/wrk/opt/guix-pull/bin/guix pull -p ~/guix-profiles/guix-for-r-shiny
+```
+
+Note that I did not have to pull in the guix-bioinformatics channel.
+
+## Singlecell
+
+Singlecell is an R/shiny app. It starts with an error after the above upgrade:
+
+```
+no slot of name "counts" for this object of class
+```
+
+and the code needs to be updated:
+
+=> https://github.com/satijalab/seurat/issues/8804
+
+The four-year-old code lives at
+
+=> https://github.com/genenetwork/singleCellRshiny
+
+and it looks like lines like these need to be updated:
+
+=> https://github.com/genenetwork/singleCellRshiny/blob/6b2a344dd0d02f65228ad8c350bac0ced5850d05/app.R#L167
+
+Let me ask the author Siamak Yousefi. I think we'll drop it.
+
+## longevity
+
+Package definition is at
+
+=> https://git.genenetwork.org/guix-bioinformatics/tree/gn/packages/mouse-longevity.scm
+
+Container is at
+
+=> https://git.genenetwork.org/guix-bioinformatics/tree/gn/services/bxd-power-container.scm
+
+## jumpshiny
+
+Jumpshiny is hosted on balg01. Scripts are in tux02 git.
+
+```
+root@balg01:/home/j*/gn-machines# . /usr/local/guix-profiles/guix-pull/etc/profile
+guix system container --network -L . -L ../guix-forge/guix/ -L ../guix-bioinformatics/ -L ../guix-past/modules/ --substitute-urls='https://ci.guix.gnu.org https://bordeaux.guix.gnu.org https://cuirass.genenetwork.org' test-r-container.scm -L ../guix-forge/guix/
+/gnu/store/xyks73sf6pk78rvrwf45ik181v0zw8rx-run-container
+/gnu/store/6y65x5jk3lxy4yckssnl32yayjx9nwl5-run-container
+```
For years we have been making backups on Amazon - both S3 and a running virtual machine. The latter was expensive, so I replaced it with a bare metal server which earns itself (if it hadn't been down for months, but that is a different story). +A revisit to previous work on backups etc. The sheepdog hosts are no longer responding and we should really run sheepdog on a machine that is not physically with the other machines. In time sheepdog should also move away from redis and run in a system container, but that is for later. I did most of the work late 2021 when I wrote: + +> As a hurricane is barreling towards our machine room in Memphis we are checking our fallbacks and backups for GeneNetwork. For years we have been making backups on Amazon - both S3 and a running virtual machine. The latter was expensive, so I replaced it with a bare metal server which earns itself (if it hadn't been down for months, but that is a different story). + +As we are introducing an external sheepdog server we may give it a DNS entry as sheepdog.genenetwork.org. + +=> http://sheepdog.genenetwork.org/sheepdog/index.html See also @@ -16,13 +22,15 @@ See also ## Tasks -* [.] backup ratspub, r/shiny, bnw, covid19, hegp, pluto services -* [X] /etc /home/shepherd backups for Octopus -* [X] /etc /home/shepherd backups for P2 -* [X] Get backups running again on fallback -* [ ] fix redis queue for P2 - needs to be on rabbit +* [X] fix redis queue and sheepdog server +* [X] check backups on tux01 +* [ ] drop tux02 backups off-site +* [ ] backup ratspub, r/shiny, bnw, covid19, hegp, pluto services +* [ ] /etc /home/shepherd backups for Octopus +* [ ] /etc /home/shepherd /home/git CI-CD GN-QA backups on Tux02 +* [ ] Get backups running again on fallback * [ ] fix bacchus large backups -* [ ] backup octopus01:/lizardfs/backup-pangenome on bacchus +* [ ] mount bacchus on HPC ## Backup and restore @@ -52,22 +60,21 @@ Recently epysode was reinstated after hardware failure. 
I took the opportunity t As epysode was one of the main sheepdog messaging servers I need to reinstate: * [X] scripts for sheepdog -* [X] enable trim -* [X] reinstate monitoring web services -* [X] reinstate daily backup from penguin2 -* [X] CRON -* [X] make sure messaging works through redis -* [X] fix and propagate GN1 backup -* [X] fix and propagate IPFS and gitea backups -* [X] add GN1 backup -* [X] add IPFS backup -* [X] other backups +* [ ] Check tunnel on tux01 is reinstated +* [ ] enable trim +* [ ] reinstate monitoring web services +* [ ] reinstate daily backups +* [ ] CRON +* [ ] make sure messaging works through redis +* [ ] fix and propagate GN1 backup +* [ ] fix and propagate fileserver and git backups +* [ ] add GN1 backup +* [ ] other backups * [ ] email on fail Tux01 is backed up now. Need to make sure it propagates to -* [X] P2 -* [X] epysode -* [X] rabbit -* [X] Tux02 +* [ ] rabbit +* [ ] Tux02 +* [ ] balg01 * [ ] bacchus diff --git a/issues/systems/machine-room.gmi b/issues/systems/machine-room.gmi deleted file mode 100644 index 28d9921..0000000 --- a/issues/systems/machine-room.gmi +++ /dev/null @@ -1,19 +0,0 @@ -# Machine room - -## Tags - -* assign: pjotrp, dana -* type: system administration -* priority: high -* keywords: systems -* status: unclear - -## Tasks - -* [X] Make tux02e visible from outside -* [ ] Network switch 10Gbs - add hosts -* [ ] Add disks to tux01 and tux02 - need to reboot -* [ ] Set up E-mail relay for tux01 and tux02 smtp.uthsc.edu, port 25 - -=> tux02-production.gmi setup new production machine -=> decommission-machines.gmi Decommission machines diff --git a/issues/systems/octoraid-storage.gmi b/issues/systems/octoraid-storage.gmi new file mode 100644 index 0000000..97e0e55 --- /dev/null +++ b/issues/systems/octoraid-storage.gmi @@ -0,0 +1,18 @@ +# OctoRAID + +We are building machines that can handle cheap drives. 
+
+# octoraid01
+
+This is a Jetson with four 22TB Seagate IronWolf Pro (ST22000NT001) enterprise NAS drives (7200 rpm).
+
+Unfortunately the stock kernel has no RAID support, so we simply mount the 4 drives (hosted on a USB-SATA bridge).
+
+Stress testing:
+
+```
+cd /export/nfs/lair01
+stress -v -d 1
+```
+
+Running on multiple disks the Jetson is holding up well!
diff --git a/issues/systems/penguin2-raid5.gmi b/issues/systems/penguin2-raid5.gmi
new file mode 100644
index 0000000..f03075d
--- /dev/null
+++ b/issues/systems/penguin2-raid5.gmi
@@ -0,0 +1,61 @@
+# Penguin2 RAID 5
+
+# Tags
+
+* assigned: @fredm, @pjotrp
+* status: in progress
+
+# Description
+
+The current RAID contains 3 disks:
+
+```
+root@penguin2:~# cat /proc/mdstat
+md0 : active raid5 sdb1[1] sda1[0] sdg1[4]
+/dev/md0 33T 27T 4.2T 87% /export
+```
+
+using /dev/sda,sdb,sdg
+
+The current root and swap are on
+
+```
+# root
+/dev/sdd1 393G 121G 252G 33% /
+# swap
+/dev/sdd5 partition 976M 76.5M -2
+```
+
+We can therefore add four new disks in slots /dev/sdc,sde,sdf,sdh.
+
+penguin2 has no out-of-band management and no serial connector right now. That means any work needs to be done at the terminal.
+
+Boot loader menu:
+
+```
+menuentry 'Debian GNU/Linux' --class debian --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-7ff268df-cb90-4cbc-9d76-7fd6677b4964' {
+ load_video
+ insmod gzio
+ if [ x$grub_platform = xxen ]; then insmod xzio; insmod lzopio; fi
+ insmod part_msdos
+ insmod ext2
+ set root='hd2,msdos1'
+ if [ x$feature_platform_search_hint = xy ]; then
+ search --no-floppy --fs-uuid --set=root --hint-bios=hd2,msdos1 --hint-efi=hd2,msdos1 --hint-baremetal=ahci2,msdos1 7ff268df-cb90-4cbc-9d76-7fd6677b4964
+ else
+ search --no-floppy --fs-uuid --set=root 7ff268df-cb90-4cbc-9d76-7fd6677b4964
+ fi
+ echo 'Loading Linux 5.10.0-18-amd64 ...'
+ linux /boot/vmlinuz-5.10.0-18-amd64 root=UUID=7ff268df-cb90-4cbc-9d76-7fd6677b4964 ro quiet
+ echo 'Loading initial ramdisk ...'
+ initrd /boot/initrd.img-5.10.0-18-amd64 +} +``` + +Added to sdd MBR + +``` +root@penguin2:~# grub-install /dev/sdd +Installing for i386-pc platform. +Installation finished. No error reported. +``` diff --git a/issues/systems/tux02-production.gmi b/issues/systems/tux02-production.gmi index 7de911f..d811c5e 100644 --- a/issues/systems/tux02-production.gmi +++ b/issues/systems/tux02-production.gmi @@ -14,9 +14,9 @@ We are going to move production to tux02 - tux01 will be the staging machine. Th * [X] update guix guix-1.3.0-9.f743f20 * [X] set up nginx (Debian) -* [X] test ipmi console (172.23.30.40) +* [X] test ipmi console * [X] test ports (nginx) -* [?] set up network for external tux02e.uthsc.edu (128.169.4.52) +* [?] set up network for external tux02 * [X] set up deployment evironment * [X] sheepdog copy database backup from tux01 on a daily basis using ibackup user * [X] same for GN2 production environment diff --git a/issues/systems/tux04-disk-issues.gmi b/issues/systems/tux04-disk-issues.gmi index cea5a59..bc6e1db 100644 --- a/issues/systems/tux04-disk-issues.gmi +++ b/issues/systems/tux04-disk-issues.gmi @@ -1,4 +1,4 @@ -# Tux04 disk issues +# Tux04/Tux05 disk issues We are facing some disk issues with Tux04: @@ -6,6 +6,10 @@ We are facing some disk issues with Tux04: May 02 20:57:42 tux04 kernel: Buffer I/O error on device sdf1, logical block 859240457 ``` +and the same happened to tux05 (same batch). Basically the controllers report no issues. Just to be sure we added +a copy of the boot partition. + +=> topics/system/linux/add-boot-partition # Tags @@ -52,6 +56,8 @@ Download megacli from => https://hwraid.le-vert.net/wiki/DebianPackages ``` +apt-get update +apt-get install megacli megacli -LDInfo -L5 -a0 ``` @@ -95,3 +101,280 @@ and nothing ;). 
Megacli is actually the tool to use

```
megacli -AdpAllInfo -aAll
```
+
+# Database
+
+During a backup the DB shows this error:
+
+```
+2025-03-02 06:28:33 Database page corruption detected at page 1079428, retrying...\n[01] 2025-03-02 06:29:33 Database page corruption detected at page 1103108, retrying...
+```
+
+Interestingly the DB recovered on a second backup.
+
+The database is hosted on a solid-state drive, /dev/sde (Dell Ent NVMe FI). The log says
+
+```
+kernel: I/O error, dev sde, sector 2136655448 op 0x0:(READ) flags 0x80700 phys_seg 40 prio class 2
+```
+
+This suggests:
+
+=> https://stackoverflow.com/questions/50312219/blk-update-request-i-o-error-dev-sda-sector-xxxxxxxxxxx
+
+> The errors that you see are interface errors, they are not coming from the disk itself but rather from the connection to it. It can be the cable or any of the ports in the connection.
+> Since the CRC errors on the drive do not increase I can only assume that the problem is on the receive side of the machine you use. You should check the cable and try a different SATA port on the server.
+
+and someone wrote
+
+> analyzed that most of the reasons are caused by intensive reading and writing. This is a CDN cache node. Type reading NVME temperature is relatively high, if it continues, it will start to throttle and then slowly collapse.
+
+and the temperature on that drive has been 70 C.
+
+The MariaDB log is showing errors:
+
+```
+2025-03-02 6:54:47 0 [ERROR] InnoDB: Failed to read page 449925 from file './db_webqtl/SnpAll.ibd': Page read from tablespace is corrupted.
+2025-03-02 7:01:43 489015 [ERROR] Got error 180 when reading table './db_webqtl/ProbeSetXRef'
+2025-03-02 8:10:32 489143 [ERROR] Got error 180 when reading table './db_webqtl/ProbeSetXRef'
+```
+
+Let's try and dump those tables when the backup is done.
+ +``` +mariadb-dump -uwebqtlout db_webqtl SnpAll +mariadb-dump: Error 1030: Got error 1877 "Unknown error 1877" from storage engine InnoDB when dumping table `SnpAll` at row: 0 +mariadb-dump -uwebqtlout db_webqtl ProbeSetXRef > ProbeSetXRef.sql +``` + +Eeep: + +``` +tux04:/etc$ mariadb-check -uwebqtlout -c db_webqtl ProbeSetXRef +db_webqtl.ProbeSetXRef +Warning : InnoDB: Index ProbeSetFreezeId is marked as corrupted +Warning : InnoDB: Index ProbeSetId is marked as corrupted +error : Corrupt +tux04:/etc$ mariadb-check -uwebqtlout -c db_webqtl SnpAll +db_webqtl.SnpAll +Warning : InnoDB: Index PRIMARY is marked as corrupted +Warning : InnoDB: Index SnpName is marked as corrupted +Warning : InnoDB: Index Rs is marked as corrupted +Warning : InnoDB: Index Position is marked as corrupted +Warning : InnoDB: Index Source is marked as corrupted +error : Corrupt +``` + +On tux01 we have a working database, we can test with + +``` +mysqldump --no-data --all-databases > table_schema.sql +mysqldump -uwebqtlout db_webqtl SnpAll > SnpAll.sql +``` + +Running the backup with rate limiting from: + +``` +Mar 02 17:09:59 tux04 sudo[548058]: pam_unix(sudo:session): session opened for user root(uid=0) by wrk(uid=1000) +Mar 02 17:09:59 tux04 sudo[548058]: wrk : TTY=pts/3 ; PWD=/export3/local/home/wrk/iwrk/deploy/gn-deploy-servers/scripts/tux04 ; USER=roo> +Mar 02 17:09:55 tux04 sudo[548058]: pam_unix(sudo:auth): authentication failure; logname=wrk uid=1000 euid=0 tty=/dev/pts/3 ruser=wrk rhost= > +Mar 02 17:04:26 tux04 su[548006]: pam_unix(su:session): session opened for user ibackup(uid=1003) by wrk(uid=0) +``` + +Oh oh + +Tux04 is showing errors on all disks. We have to bail out. I am copying the potentially corrupted files to tux01 right now. We have backups, so nothing serious I hope. 
I am only worried about the myisam files we have because they have no strong internal validation: + +``` +2025-03-04 8:32:45 502 [ERROR] db_webqtl.ProbeSetData: Record-count is not ok; is 5264578601 Should be: 5264580806 +2025-03-04 8:32:45 502 [Warning] db_webqtl.ProbeSetData: Found 28665 deleted space. Should be 0 +2025-03-04 8:32:45 502 [Warning] db_webqtl.ProbeSetData: Found 2205 deleted blocks Should be: 0 +2025-03-04 8:32:45 502 [ERROR] Got an error from thread_id=502, ./storage/myisam/ha_myisam.cc:1120 +2025-03-04 8:32:45 502 [ERROR] MariaDB thread id 502, OS thread handle 139625162532544, query id 837999 localhost webqtlout Checking table +CHECK TABLE ProbeSetData +2025-03-04 8:34:02 79695 [ERROR] mariadbd: Table './db_webqtl/ProbeSetData' is marked as crashed and should be repaired +``` + +See also + +=> https://dev.mysql.com/doc/refman/8.4/en/myisam-check.html + +Tux04 will require open heart 'disk controller' surgery and some severe testing before we move back. We'll also look at tux05-8 to see if they have similar problems. + +## Recovery + +According to the logs tux04 started showing serious errors on March 2nd - when I introduced sanitizing the mariadb backup: + +``` +Mar 02 05:00:42 tux04 kernel: I/O error, dev sde, sector 2071078320 op 0x0:(READ) flags 0x80700 phys_seg 16 prio class 2 +Mar 02 05:00:58 tux04 kernel: I/O error, dev sde, sector 2083650928 op 0x0:(READ) flags 0x80700 phys_seg 59 prio class 2 +... +``` + +The log started on Feb 23 when we had our last reboot. It probably is a good idea to turn on persistent logging! Anyway, it is likely files were fine until March 2nd. Similarly the mariadb logs also show + +``` +2025-03-02 6:53:52 489007 [ERROR] mariadbd: Index for table './db_webqtl/ProbeSetData.MYI' is corrupt; try to repair it +2025-03-02 6:53:52 489007 [ERROR] db_webqtl.ProbeSetData: Can't read key from filepos: 2269659136 +``` + +So, if we can restore a backup from March 1st we should be reasonably confident it is sane. 
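Since the errors started on March 2nd, one way to scope the damage is to list which database files were modified on or after that date. A minimal sketch using GNU find's -newermt test (the scratch directory and file names below are illustrative only; on tux04 this would run against the mariadb datadir):

```shell
# Stand-in for the mariadb datadir (illustrative only)
mkdir -p /tmp/datadir-demo && cd /tmp/datadir-demo
touch -d "2025-02-28 12:00" ProbeSetData.MYD   # last written before the errors
touch -d "2025-03-03 09:00" SnpAll.ibd         # written after the errors began
# Files modified on or after March 2nd are the ones we cannot trust:
find . -type f -newermt "2025-03-02"
# -> ./SnpAll.ibd
```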
+
+The first step is to back up the existing database(!). Next, restore the new DB by changing the DB location (the symlink in /var/lib/mysql; also check /etc/mysql/mariadb.cnf).
+
+When upgrading it is a good idea to switch these on in mariadb.cnf:
+
+```
+# forcing recovery with these two lines:
+innodb_force_recovery=3
+innodb_purge_threads=0
+```
+
+Make sure to disable them (and restart) once it is up and running!
+
+So the steps are:
+
+* [X] install updated guix version of mariadb in /usr/local/guix-profiles (don't use Debian!!)
+* [X] repair borg backup
+* [X] Stop old mariadb (on new host tux02)
+* [X] backup old mariadb database
+* [X] restore 'sane' version of DB from borg March 1st
+* [X] point to new DB in /var/lib/mysql and cnf file
+* [X] update systemd settings
+* [X] start mariadb new version with recovery setting in cnf
+* [X] check logs
+* [X] once running, revert the recovery setting in cnf and restart
+
+OK, looks like we are in business again. In the next phase we need to validate files.
Normal files can be checked with + +``` +find -type f \( -not -name "md5sum.txt" \) -exec md5sum '{}' \; > md5sum.txt +``` + +and compared with another set on a different server with + +``` +md5sum -c md5sum.txt +``` + +* [X] check genotype file directory - some MAGIC files missing on tux01 + +gn-docs is a git repo, so that is easily checked + +* [X] check gn-docs and sync with master repo + + +## Other servers + +``` +journalctl -r|grep -i "I/O error"|less +# tux05 +Nov 18 02:19:55 tux05 kernel: XFS (sdc2): metadata I/O error in "xfs_da_read_buf+0xd9/0x130 [xfs]" at daddr 0x78 len 8 error 74 +Nov 05 14:36:32 tux05 kernel: blk_update_request: I/O error, dev sdb, sector 1993616 op 0x1:(WRITE) flags +0x0 phys_seg 35 prio class 0 +Jul 27 11:56:22 tux05 kernel: blk_update_request: I/O error, dev sdc, sector 55676616 op 0x0:(READ) flags +0x80700 phys_seg 26 prio class 0 +Jul 27 11:56:22 tux05 kernel: blk_update_request: I/O error, dev sdc, sector 55676616 op 0x0:(READ) flags +0x80700 phys_seg 26 prio class 0 +# tux06 +Apr 15 08:10:57 tux06 kernel: I/O error, dev sda, sector 21740352 op 0x1:(WRITE) flags 0x1000 phys_seg 4 prio class 2 +Dec 13 12:56:14 tux06 kernel: I/O error, dev sdb, sector 3910157327 op 0x9:(WRITE_ZEROES) flags 0x8000000 phys_seg 0 prio class 2 +# tux07 +Mar 27 08:00:11 tux07 mfschunkserver[1927469]: replication error: failed to create chunk (No space left) +# tux08 +Mar 27 08:12:11 tux08 mfschunkserver[464794]: replication error: failed to create chunk (No space left) +``` + +Tux04, 05 and 06 show disk errors. Tux07 and Tux08 are overloaded with a full disk, but no other errors. We need to babysit Lizard more! 
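A minimal babysitting check along these lines could go into sheepdog/CRON: count kernel I/O errors over the last day and shout when any show up. The journalctl invocation is the intended real use; below the filter is fed sample text so the sketch runs anywhere:

```shell
# Count lines mentioning I/O errors (case-insensitive); 0 means healthy.
count_io_errors() { grep -ci "i/o error" || true; }
# Real use on a host: journalctl -k --since "1 day ago" | count_io_errors
printf 'kernel: I/O error, dev sdc, sector 55676616\nclean boot line\n' \
  | count_io_errors
# -> 1
```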
+
+```
+stress -v -d 1
+```
+
+Write test:
+
+```
+dd if=/dev/zero of=./test bs=512k count=2048 oflag=direct
+```
+
+Read test:
+
+```
+/sbin/sysctl -w vm.drop_caches=3
+dd if=./test of=/dev/zero bs=512k count=2048
+```
+
+SMART info for a disk behind the RAID controller (in SL 3: Dell PERC H755N Front):
+
+```
+smartctl -a /dev/sdd -d megaraid,0
+```
+
+# The story continues
+
+I don't know what happened but the server gave a hard error in the logs:
+
+```
+racadm getsel # get system log
+Record: 340
+Date/Time: 05/31/2025 09:25:17
+Source: system
+Severity: Critical
+Description: A high-severity issue has occurred at the Power-On
+Self-Test (POST) phase which has resulted in the system BIOS to
+abruptly stop functioning.
+```
+
+Woops! I fixed it by resetting idrac and rebooting remotely. Nasty.
+
+Looking around I found this link
+
+=> https://tomaskalabis.com/wordpress/a-high-severity-issue-has-occurred-at-the-power-on-self-test-post-phase-which-has-resulted-in-the-system-bios-to-abruptly-stop-functioning/
+
+suggesting we should upgrade the idrac firmware. I am not going to do that without backups and a fully up-to-date fallback online. It may fix the other hardware issues we have been seeing (who knows?).
+
+Fred, the boot sequence is not perfect yet. It turned out the network interfaces do not come up in the right order, and nginx failed because of a missing /var/run/nginx. The container would not restart because - with that missing - it could not check the certificates.
+
+## A week later
+
+```
+[SMM] APIC 0x00 S00:C00:T00 > ASSERT [AmdPlatformRasRsSmm] u:\EDK2\MdePkg\Library\BasePciSegmentLibPci\PciSegmentLib.c(766): ((Address) & (0xfffffffff0000000ULL | (3))) == 0 !!!! X64 Exception Type - 03(#BP - Breakpoint) CPU Apic ID - 00000000 !!!!
+RIP - 0000000076DA4343, CS - 0000000000000038, RFLAGS - 0000000000000002 +RAX - 0000000000000010, RCX - 00000000770D5B58, RDX - 00000000000002F8 +RBX - 0000000000000000, RSP - 0000000077773278, RBP - 0000000000000000 +RSI - 0000000000000087, RDI - 00000000777733E0 R8 - 00000000777731F8, R9 - 0000000000000000, R10 - 0000000000000000 +R11 - 00000000000000A0, R12 - 0000000000000000, R13 - 0000000000000000 +R14 - FFFFFFFFA0C1A118, R15 - 000000000005B000 +DS - 0000000000000020, ES - 0000000000000020, FS - 0000000000000020 +GS - 0000000000000020, SS - 0000000000000020 +CR0 - 0000000080010033, CR2 - 0000000015502000, CR3 - 0000000077749000 +CR4 - 0000000000001668, CR8 - 0000000000000001 +DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000 DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400 +GDTR - 000000007773C000 000000000000004F, LDTR - 0000000000000000 IDTR - 0000000077761000 00000000000001FF, TR - 0000000000000040 +FXSAVE_STATE - 0000000077772ED0 +!!!! Find image based on IP(0x76DA4343) u:\Build_Genoa\DellBrazosPkg\DEBUG_MYTOOLS\X64\DellPkgs\DellChipsetPkgs\AmdGenoaModulePkg\Override\AmdCpmPkg\Features\PlatformRas\Rs\Smm\AmdPlatformRasRsSmm\DEBUG\AmdPlatformRasRsSmm.pdb (ImageBase=0000000076D3E000, EntryPoint=0000000076D3E6C0) !!!! +``` + +New error in system log: + +``` +Record: 341 Date/Time: 06/04/2025 19:47:08 +Source: system +Severity: Critical Description: A high-severity issue has occurred at the Power-On Self-Test (POST) phase which has resulted in the system BIOS to abruptly stop functioning. +``` + +The error appears to relate to AMD Brazos which is probably part of the on board APU/GPU. + +The code where it segfaulted is online at: + +=> https://github.com/tianocore/edk2/blame/master/MdePkg/Library/BasePciSegmentLibPci/PciSegmentLib.c + +and has to do with PCI registers and that can actually be caused by the new PCIe card we hosted. 
diff --git a/issues/systems/tux04-production.gmi b/issues/systems/tux04-production.gmi new file mode 100644 index 0000000..58ff8c1 --- /dev/null +++ b/issues/systems/tux04-production.gmi @@ -0,0 +1,279 @@ +# Production on tux04 + +Lately we have been running production on tux04. Unfortunately Debian got broken and I don't see a way to fix it (something with python versions that break apt!). Also mariadb is giving problems: + +=> issues/production-container-mechanical-rob-failure.gmi + +and that is alarming. We might as well try an upgrade. I created a new partition on /dev/sda4 using debootstrap. + +The hardware RAID has proven unreliable on this machine (and perhaps others). + +We added a drive on a PCIe raiser outside the RAID. Use this for bulk data copying. We still bootstrap from the RAID. + +Luckily not too much is running on this machine and if we mount things again, most should work. + +# Tasks + +* [X] cleanly shut down mariadb +* [X] reboot into new partition /dev/sda4 +* [X] git in /etc +* [X] make sure serial boot works (/etc/default/grub) +* [X] fix groups and users +* [X] get guix going +* [X] get mariadb going +* [X] fire up GN2 service +* [X] fire up SPARQL service +* [X] sheepdog +* [ ] fix CRON jobs and backups +* [ ] test full reboots + + +# Boot in new partition + +``` +blkid /dev/sda4 +/dev/sda4: UUID="4aca24fe-3ece-485c-b04b-e2451e226bf7" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="2e3d569f-6024-46ea-8ef6-15b26725f811" +``` + +After debootstrap there are two things to take care of: the /dev directory and grub. For good measure +I also capture some state + +``` +cd ~ +ps xau > cron.log +systemctl > systemctl.txt +cp /etc/network/interfaces . +cp /boot/grub/grub.cfg . +``` + +we should still have access to the old root partition, so I don't need to capture everything. + +## /dev + +I ran MAKEDEV and that may not be needed with udev. + +## grub + +We need to tell grub to boot into the new partition. 
The old root is on UUID=8e874576-a167-4fa1-948f-2031e8c3809f (/dev/sda2).
+
+Next I ran
+
+```
+tux04:~$ update-grub2 /dev/sda
+Generating grub configuration file ...
+Found linux image: /boot/vmlinuz-5.10.0-32-amd64
+Found initrd image: /boot/initrd.img-5.10.0-32-amd64
+Found linux image: /boot/vmlinuz-5.10.0-22-amd64
+Found initrd image: /boot/initrd.img-5.10.0-22-amd64
+Warning: os-prober will be executed to detect other bootable partitions.
+Its output will be used to detect bootable binaries on them and create new boot entries.
+Found Debian GNU/Linux 12 (bookworm) on /dev/sda4
+Found Windows Boot Manager on /dev/sdd1@/efi/Microsoft/Boot/bootmgfw.efi
+Found Debian GNU/Linux 11 (bullseye) on /dev/sdf2
+```
+
+Very good. Do a diff on grub.cfg and you see it even picked up the serial configuration. It only shows it added menu entries for the new boot. Very nice.
+
+At this point I feel safe to boot as we should be able to get back into the old partition.
+
+# /etc/fstab
+
+The old fstab looked like
+
+```
+UUID=8e874576-a167-4fa1-948f-2031e8c3809f / ext4 errors=remount-ro 0 1
+# /boot/efi was on /dev/sdc1 during installation
+UUID=998E-68AF /boot/efi vfat umask=0077 0 1
+# swap was on /dev/sdc3 during installation
+UUID=cbfcd84e-73f8-4cec-98ee-40cad404735f none swap sw 0 0
+UUID="783e3bd6-5610-47be-be82-ac92fdd8c8b8" /export2 ext4 auto 0 2
+UUID="9e6a9d88-66e7-4a2e-a12c-f80705c16f4f" /export ext4 auto 0 2
+UUID="f006dd4a-2365-454d-a3a2-9a42518d6286" /export3 auto auto 0 2
+/export2/gnu /gnu none defaults,bind 0 0
+# /dev/sdd1: PARTLABEL="bulk" PARTUUID="b1a820fe-cb1f-425e-b984-914ee648097e"
+# /dev/sdb4 /export ext4 auto 0 2
+# /dev/sdd1 /export2 ext4 auto 0 2
+```
+
+# reboot
+
+Next we are going to reboot; we need a serial connection to the Dell out-of-band using racadm:
+
+```
+ssh IP
+console com2
+racadm getsel
+racadm serveraction powercycle
+racadm serveraction powerstatus
+```
+
+The main trick is to hit ESC, wait 2 seconds, then press 2 when you want the BIOS boot
menu. Ctrl-\ escapes the console. Otherwise ESC (wait) ! gets you to the boot menu.
+
+# First boot
+
+It still boots by default into the old root. That gave an error:
+
+[FAILED] Failed to start File Syste…a-2365-454d-a3a2-9a42518d6286
+
+This is /export3. We can fix that later.
+
+When I booted into the proper partition the console clapped out. Also the racadm password did not work in tmux -- I had to switch to a standard console to log in again. Not sure why that is, but next I got:
+
+```
+Give root password for maintenance
+(or press Control-D to continue):
+```
+
+and giving the root password I was in maintenance mode on the correct partition!
+
+To rerun grub I had to add `GRUB_DISABLE_OS_PROBER=false`.
+
+Once booted up it is a matter of mounting partitions and ticking the check boxes above.
+
+The following contained errors:
+
+```
+/dev/sdd1 3.6T 1.8T 1.7T 52% /export2
+```
+
+# Guix
+
+Getting guix going is a bit tricky because we want to keep the store!
+
+```
+cp -vau /mnt/old-root/var/guix/ /var/
+cp -vau /mnt/old-root/usr/local/guix-profiles /usr/local/
+cp -vau /mnt/old-root/usr/local/bin/* /usr/local/bin/
+cp -vau /mnt/old-root/etc/systemd/system/guix-daemon.service* /etc/systemd/system/
+cp -vau /mnt/old-root/etc/systemd/system/gnu-store.mount* /etc/systemd/system/
+```
+
+I also had to add the guixbuild users and group by hand.
+
+# nginx
+
+We use the streaming facility. Check that
+
+```
+nginx -V
+```
+
+lists --with-stream=static, see
+
+=> https://serverfault.com/questions/858067/unknown-directive-stream-in-etc-nginx-nginx-conf86/858074#858074
+
+and load at the start of nginx.conf:
+
+```
+load_module /usr/lib/nginx/modules/ngx_stream_module.so;
+```
+
+and check that
+
+```
+nginx -t
+```
+
+passes.
+
+Now the container responds to the browser with `Internal Server Error`.
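For reference, the stream facility we rely on looks something like this in nginx.conf - a sketch only, with placeholder address and ports rather than our production values:

```
load_module /usr/lib/nginx/modules/ngx_stream_module.so;

stream {
    server {
        # pass raw TLS traffic straight through to the container,
        # which terminates TLS (and does ACME) itself
        listen 443;
        proxy_pass 127.0.0.1:9443;
    }
}
```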
+ +# container web server + +Visit the container with something like + +``` +nsenter -at 2838 /run/current-system/profile/bin/bash --login +``` + +The nginx log in the container has many + +``` +2025/02/22 17:23:48 [error] 136#0: *166916 connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: genenetwork.org, request: "GET /gn3/gene/aliases/st%2029:1;o;s HTTP/1.1", upstream: "http://127.0.0.1:9800/gene/aliases/st%2029:1;o;s", host: "genenetwork.org" +``` + +that is interesting. Acme/https is working because GN2 is working: + +``` +curl https://genenetwork.org/api3/version +"1.0" +``` + +Looking at the logs it appears it is a redis problem first for GN2. + +Fred builds the container with `/home/fredm/opt/guix-production/bin/guix`. Machines are defined in + +``` +fredm@tux04:/export3/local/home/fredm/gn-machines +``` + +The shared dir for redis is at + +--share=/export2/guix-containers/genenetwork/var/lib/redis=/var/lib/redis + +with + +``` +root@genenetwork-production /var# ls lib/redis/ -l +-rw-r--r-- 1 redis redis 629328484 Feb 22 17:25 dump.rdb +``` + +In production.scm it is defined as + +``` +(service redis-service-type + (redis-configuration + (bind "127.0.0.1") + (port 6379) + (working-directory "/var/lib/redis"))) +``` + +The defaults are the same as the definition of redis-service-type (in guix). Not sure why we are duplicating. + +After starting redis by hand I get another error `500 DatabaseError: The following exception was raised while attempting to access http://auth.genenetwork.org/auth/data/authorisation: database disk image is malformed`. The problem is it created +a DB in the wrong place. 
Alright, the logs in the container say: + +``` +Feb 23 14:04:31 genenetwork-production shepherd[1]: [redis-server] 3977:C 23 Feb 2025 14:04:31.040 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo +Feb 23 14:04:31 genenetwork-production shepherd[1]: [redis-server] 3977:C 23 Feb 2025 14:04:31.040 # Redis version=7.0.12, bits=64, commit=00000000, modified=0, pid=3977, just started +Feb 23 14:04:31 genenetwork-production shepherd[1]: [redis-server] 3977:C 23 Feb 2025 14:04:31.040 # Configuration loaded +Feb 23 14:04:31 genenetwork-production shepherd[1]: [redis-server] 3977:M 23 Feb 2025 14:04:31.041 * Increased maximum number of open files to 10032 (it was originally set to 1024). +Feb 23 14:04:31 genenetwork-production shepherd[1]: [redis-server] 3977:M 23 Feb 2025 14:04:31.041 * monotonic clock: POSIX clock_gettime +Feb 23 14:04:31 genenetwork-production shepherd[1]: [redis-server] 3977:M 23 Feb 2025 14:04:31.041 * Running mode=standalone, port=6379. +Feb 23 14:04:31 genenetwork-production shepherd[1]: [redis-server] 3977:M 23 Feb 2025 14:04:31.042 # Server initialized +Feb 23 14:04:31 genenetwork-production shepherd[1]: [redis-server] 3977:M 23 Feb 2025 14:04:31.042 # WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect. +Feb 23 14:04:31 genenetwork-production shepherd[1]: [redis-server] 3977:M 23 Feb 2025 14:04:31.042 # Wrong signature trying to load DB from file +Feb 23 14:04:31 genenetwork-production shepherd[1]: [redis-server] 3977:M 23 Feb 2025 14:04:31.042 # Fatal error loading the DB: Invalid argument. Exiting. +Feb 23 14:04:31 genenetwork-production shepherd[1]: Service redis (PID 3977) exited with 1. +``` + +This is caused by a newer version of redis. This is odd because we are using the same version from the container?! 
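The "Wrong signature" error means redis rejected the header of dump.rdb before even considering version compatibility. A quick sketch of what a healthy header looks like - the fake file below is a stand-in; on the real host one would inspect /var/lib/redis/dump.rdb, and redis-check-rdb (shipped with redis) does a full scan:

```shell
# A valid RDB file starts with the magic "REDIS" plus a 4-digit format
# version (e.g. 0011). Fake it here just to show the header:
printf 'REDIS0011' > /tmp/fake-dump.rdb      # stand-in for a real dump.rdb
head -c 9 /tmp/fake-dump.rdb; echo
# -> REDIS0011
# Real checks: head -c 9 /var/lib/redis/dump.rdb
#              redis-check-rdb /var/lib/redis/dump.rdb
```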
+ +Actually it turned out the redis DB was corrupted on the SSD! Same for some other databases (ugh). + +Fred copied all data to an enterprise level storage, and we rolled back to some older DBs, so hopefully we'll be OK for now. + +# Reinstating backups + +In the next step we need to restore backups as described in + +=> /topics/systems/backups-with-borg + +I already created an ibackup user. Next we test the backup script for mariadb. + +One important step is to check the database: + +``` +/usr/bin/mariadb-check -c -u user -p* db_webqtl +``` + +A successful mariadb backup consists of multiple steps + +``` +2025-02-27 11:48:28 +0000 (ibackup@tux04) SUCCESS 0 <32m43s> mariabackup-dump +2025-02-27 11:48:29 +0000 (ibackup@tux04) SUCCESS 0 <00m00s> mariabackup-make-consistent +2025-02-27 12:16:37 +0000 (ibackup@tux04) SUCCESS 0 <28m08s> borg-tux04-sql-backup +2025-02-27 12:16:46 +0000 (ibackup@tux04) SUCCESS 0 <00m07s> drop-rsync-balg01 +``` diff --git a/issues/xapian_bug.gmi b/issues/xapian_bug.gmi index f11b604..068d8eb 100644 --- a/issues/xapian_bug.gmi +++ b/issues/xapian_bug.gmi @@ -5,6 +5,7 @@ * assigned: zsloan * priority: high * type: search +* status: closed * keywords: xapian, gn2, gn3 ## Description |