Diffstat (limited to 'issues/genenetwork')
-rw-r--r-- issues/genenetwork/cannot-connect-to-mariadb.gmi 99
-rw-r--r-- issues/genenetwork/containerising-production-issues.gmi 33
-rw-r--r-- issues/genenetwork/handle-tmp-dirs-in-container.gmi 22
-rw-r--r-- issues/genenetwork/markdown-editing-service-not-deployed.gmi 34
-rw-r--r-- issues/genenetwork/python-requests-error-in-container.gmi 174
-rw-r--r-- issues/genenetwork/setup-mailing-on-tux04.gmi 16
-rw-r--r-- issues/genenetwork/umhet3-samples-timing-slow.gmi 72
-rw-r--r-- issues/genenetwork/virtuoso-shutdown-clears-data.gmi 98
8 files changed, 548 insertions, 0 deletions
diff --git a/issues/genenetwork/cannot-connect-to-mariadb.gmi b/issues/genenetwork/cannot-connect-to-mariadb.gmi
new file mode 100644
index 0000000..ca4bd9f
--- /dev/null
+++ b/issues/genenetwork/cannot-connect-to-mariadb.gmi
@@ -0,0 +1,99 @@
+# Cannot Connect to MariaDB
+
+
+## Description
+
+GeneNetwork3 is failing to connect to mariadb with the error:
+
+```
+⋮
+2024-11-05 14:49:00 Traceback (most recent call last):
+2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/flask/app.py", line 1523, in full_dispatch_request
+2024-11-05 14:49:00 rv = self.dispatch_request()
+2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/flask/app.py", line 1509, in dispatch_request
+2024-11-05 14:49:00 return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
+2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/gn3/api/menu.py", line 13, in generate_json
+2024-11-05 14:49:00 with database_connection(current_app.config["SQL_URI"], logger=current_app.logger) as conn:
+2024-11-05 14:49:00 File "/gnu/store/lzw93sik90d780n09svjx5la1bb8g3df-python-3.10.7/lib/python3.10/contextlib.py", line 135, in __enter__
+2024-11-05 14:49:00 return next(self.gen)
+2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/gn3/db_utils.py", line 34, in database_connection
+2024-11-05 14:49:00 connection = mdb.connect(db=db_name,
+2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/MySQLdb/__init__.py", line 121, in Connect
+2024-11-05 14:49:00 return Connection(*args, **kwargs)
+2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/MySQLdb/connections.py", line 195, in __init__
+2024-11-05 14:49:00 super().__init__(*args, **kwargs2)
+2024-11-05 14:49:00 MySQLdb.OperationalError: (2002, "Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)")
+```
+
+We have previously configured the server's socket file[^1][^2] as "/run/mysqld/mysqld.sock". The client, however, is looking for "/tmp/mysql.sock" (the compiled-in default MySQL clients use when the host is "localhost"), and errno 2 (ENOENT) tells us that file does not exist.
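Another angle that might be worth trying, instead of bind-mounting socket paths: make the client ask for the right socket explicitly. MySQLdb's `connect` accepts a `unix_socket` keyword, so a small helper could translate `SQL_URI` into connect arguments that point at the known socket whenever the host is localhost. This is a hypothetical sketch; `connect_kwargs` is not part of gn3's actual code:

```python
from urllib.parse import urlparse

def connect_kwargs(sql_uri, socket_path="/run/mysqld/mysqld.sock"):
    """Build MySQLdb.connect() keyword arguments from an SQL_URI.

    When the host is "localhost", MySQL client libraries skip TCP and
    use a unix socket, defaulting to /tmp/mysql.sock unless an explicit
    `unix_socket` is passed, which is what bites us here.
    """
    parts = urlparse(sql_uri)
    kwargs = {
        "user": parts.username,
        "passwd": parts.password,
        "host": parts.hostname,
        "db": parts.path.lstrip("/"),
    }
    if parts.port:
        kwargs["port"] = parts.port
    if parts.hostname in (None, "localhost"):
        # Point the client at the socket mysql-service-type creates.
        kwargs["unix_socket"] = socket_path
    return kwargs

kwargs = connect_kwargs("mysql://webqtlout:webqtlout@localhost/db_webqtl")
# mdb.connect(**kwargs) would then use /run/mysqld/mysqld.sock
```

If gn3's `database_connection` built its kwargs this way, the client would stop falling back to `/tmp/mysql.sock`.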
+
+## Troubleshooting Logs
+
+### 2024-11-05
+
+I attempted to just bind `/run/mysqld/mysqld.sock` to `/tmp/mysql.sock` by adding the following mapping in GN3's `gunicorn-app` definition:
+
+```
+(file-system-mapping
+ (source "/run/mysqld/mysqld.sock")
+ (target "/tmp/mysql.sock")
+ (writable? #t))
+```
+
+but that does not fix things.
+
+I also tried changing the MySQL URI to use an IP address, i.e.
+
+```
+SQL_URI="mysql://webqtlout:webqtlout@128.169.5.119:3306/db_webqtl"
+```
+
+but that simply changes the error from the above to the one below:
+
+```
+2024-11-05 15:27:12 MySQLdb.OperationalError: (2002, "Can't connect to MySQL server on '128.169.5.119' (115)")
+```
+
+I tried with both `127.0.0.1` and `128.169.5.119`.
+
+My hail-mary was to attempt to expose the `my.cnf` file generated by the `mysql-service-type` definition to the "pola-wrapper", but that is proving tricky, seeing as the file is generated elsewhere[^4] and we do not have a way of figuring out the actual final path of the file.
+
+I tried:
+
+```
+(file-system-mapping
+ (source (mixed-text-file "my.cnf"
+ (string-append "[client]\n"
+ "socket=/run/mysqld/mysqld.sock")))
+ (target "/etc/mysql/my.cnf"))
+```
+
+but that did not work either.
+
+### 2024-11-07
+
+Start digging into how GNU Guix services are defined[^5] to try and understand why the file mapping attempt did not work.
+
+=> http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/system/file-systems.scm?id=2394a7f5fbf60dd6adc0a870366adb57166b6d8b#n575
+Looking at the code linked above, specifically lines 575 to 588 and line 166, it seems to me that the mapping attempt should have worked.
+
+Try it again, taking care to verify that the paths are correct, with:
+
+```
+(file-system-mapping
+ (source (mixed-text-file "my.cnf"
+ (string-append "[client-server]\n"
+ "socket=/run/mysqld/mysqld.sock")))
+ (target "/etc/my.cnf"))
+```
+
+Try rebuilding on tux04: started getting `Segmentation fault` errors out of the blue for many guix commands 🤦🏿.
+Try building container on local dev machine: this took a long time - quit and continue later.
+
+### Footnotes
+
+[^1] Lines 47 to 49 of https://git.genenetwork.org/gn-machines/tree/production.scm?id=46a1c4c8d01198799e6ac3b99998dca40d2c7094#n47
+[^2] Guix's mysql-service-type configurations https://guix.gnu.org/manual/en/html_node/Database-Services.html#index-mysql_002dconfiguration
+[^3] https://mariadb.com/kb/en/server-system-variables/#socket
+[^4] https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/services/databases.scm?id=4c56d0cccdc44e12484b26332715f54768738c5f#n576
+[^5] https://guix.gnu.org/manual/en/html_node/Defining-Services.html
diff --git a/issues/genenetwork/containerising-production-issues.gmi b/issues/genenetwork/containerising-production-issues.gmi
new file mode 100644
index 0000000..ed5702a
--- /dev/null
+++ b/issues/genenetwork/containerising-production-issues.gmi
@@ -0,0 +1,33 @@
+# Containerising Production: Issues
+
+## Tags
+
+* type: bug
+* assigned: fredm
+* priority: critical
+* status: closed, completed
+* keywords: production, container, tux04
+* interested: alexk, aruni, bonfacem, fredm, pjotrp, soloshelby, zsloan, jnduli
+
+## Description
+
+We recently got production into a container and deployed it. It has become apparent, however, that some services needed for a full-featured GeneNetwork system are not part of the container.
+
+This is, therefore, a meta-issue, tracking all issues that relate to the deployment of the disparate services that make up GeneNetwork.
+
+## Documentation
+
+=> https://issues.genenetwork.org/topics/genenetwork/genenetwork-services
+
+The link above documents the various services that make up the GeneNetwork system.
+
+## Issues
+
+* [x] Move user directories to a large partition
+=> ./handle-tmp-dirs-in-container [x] Link TMPDIR in container to a directory on a large partition
+=> ./markdown-editing-service-not-deployed [ ] Define and deploy Markdown Editing service
+=> ./umhet3-samples-timing-slow [ ] Figure out and fix UM-HET3 Samples mappings on Tux04
+=> ./setup-mailing-on-tux04 [x] Setting up email service on Tux04
+=> ./virtuoso-shutdown-clears-data [x] Virtuoso seems to lose data on restart
+=> ./python-requests-error-in-container [x] Fix python's requests library certificates error
+=> ./cannot-connect-to-mariadb [ ] GN3 cannot connect to mariadb server
diff --git a/issues/genenetwork/handle-tmp-dirs-in-container.gmi b/issues/genenetwork/handle-tmp-dirs-in-container.gmi
new file mode 100644
index 0000000..5f6eb92
--- /dev/null
+++ b/issues/genenetwork/handle-tmp-dirs-in-container.gmi
@@ -0,0 +1,22 @@
+# Handle Temporary Directories in the Container
+
+## Tags
+
+* type: feature
+* assigned: fredm
+* priority: critical
+* status: closed, completed
+* keywords: production, container, tux04
+* interested: alexk, aruni, bonfacem, pjotrp, zsloan
+
+## Description
+
+The container's temporary directories should be in a large partition on the host to avoid a scenario where the writes fill up one of the smaller drives.
+
+Currently, we use the `/tmp` directory by default, but we should look into transitioning away from that — `/tmp` is world readable and world writable and therefore needs careful consideration to keep safe.
+
+Thankfully, we are running our systems within a container, and can bind the container's `/tmp` directory to a non-world-accessible directory, keeping things at least contained.
+
+### Fixes
+
+=> https://git.genenetwork.org/gn-machines/commit/?id=7306f1127df9d4193adfbfa51295615f13d32b55
diff --git a/issues/genenetwork/markdown-editing-service-not-deployed.gmi b/issues/genenetwork/markdown-editing-service-not-deployed.gmi
new file mode 100644
index 0000000..e7a1717
--- /dev/null
+++ b/issues/genenetwork/markdown-editing-service-not-deployed.gmi
@@ -0,0 +1,34 @@
+# Markdown Editing Service: Not Deployed
+
+## Tags
+
+* type: bug
+* status: open
+* assigned: fredm
+* priority: critical
+* keywords: production, container, tux04
+* interested: alexk, aruni, bonfacem, fredm, pjotrp, zsloan
+
+## Description
+
+The Markdown Editing service is not working on production.
+
+* Link: https://genenetwork.org/facilities/
+* Repository: https://git.genenetwork.org/gn-guile
+
+Currently, the code is being run directly on the host, rather than inside the container.
+
+Some important things to note:
+
+* The service requires access to a checkout of https://github.com/genenetwork/gn-docs
+* Currently, the service is hard-coded to use a specific port: we should probably fix that.
+
+## Reopened: 2024-11-01
+
+While the service was deployed, the edit functionality is not working right: specifically, pushing the edits upstream to the remote seems to fail.
+
+If you do an edit and refresh the page, it will show up in the system, but it will not be pushed up to the remote.
+
+Setting `CGIT_REPO_PATH="https://git.genenetwork.org/gn-guile"` seems to allow the commit to work, but the changes are never actually pushed to the remote in any useful sense.
+
+It seems to me that we need to configure the environment in such a way that the service is able to push the changes to the remote.
diff --git a/issues/genenetwork/python-requests-error-in-container.gmi b/issues/genenetwork/python-requests-error-in-container.gmi
new file mode 100644
index 0000000..0289762
--- /dev/null
+++ b/issues/genenetwork/python-requests-error-in-container.gmi
@@ -0,0 +1,174 @@
+# Python Requests Error in Container
+
+## Tags
+
+* type: bug
+* assigned: fredm
+* priority: critical
+* status: closed, completed, fixed
+* interested: alexk, aruni, bonfacem, pjotrp, zsloan
+* keywords: production, container, tux04, python, requests
+
+## Description
+
+Building the container with the
+=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=eb7beb340a9731775e8ad177e47b70dba2f2a84f upgraded guix definition
+leads to python's requests library failing.
+
+```
+2024-10-30 16:04:13 OSError: Could not find a suitable TLS CA certificate bundle, invalid path: /etc/ssl/certs/ca-certificates.crt
+```
+
+If you log in to the container itself, however, you find that the file `/etc/ssl/certs/ca-certificates.crt` actually exists and has content.
+
+Suggested fixes include setting the correct environment variables for the requests library, such as `REQUESTS_CA_BUNDLE`.
+
+See
+=> https://requests.readthedocs.io/en/latest/user/advanced/#ssl-cert-verification
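For reference, requests resolves the verify path roughly in this order: the `REQUESTS_CA_BUNDLE` environment variable, then `CURL_CA_BUNDLE`, then its bundled default (certifi's bundle). A simplified sketch of that lookup order (not requests' actual code; the default path here is just an illustration):

```python
import os

def resolve_ca_bundle(default_bundle="/etc/ssl/certs/ca-certificates.crt"):
    # Mirrors requests' environment lookup order for the CA bundle;
    # requests itself falls back to certifi's bundle, not this path.
    return (os.environ.get("REQUESTS_CA_BUNDLE")
            or os.environ.get("CURL_CA_BUNDLE")
            or default_bundle)
```

So pointing `REQUESTS_CA_BUNDLE` at a bundle that actually exists inside the container should take precedence over the default path baked in at build time.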
+
+### Troubleshooting Logs
+
+Try reproducing the issue locally:
+
+```
+$ guix --version
+hint: Consider installing the `glibc-locales' package and defining `GUIX_LOCPATH', along these lines:
+
+ guix install glibc-locales
+ export GUIX_LOCPATH="$HOME/.guix-profile/lib/locale"
+
+See the "Application Setup" section in the manual, for more info.
+
+guix (GNU Guix) 2394a7f5fbf60dd6adc0a870366adb57166b6d8b
+Copyright (C) 2024 the Guix authors
+License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
+This is free software: you are free to change and redistribute it.
+There is NO WARRANTY, to the extent permitted by law.
+$
+$ guix shell --container --network python python-requests coreutils
+[env]$ ls "${GUIX_ENVIRONMENT}/etc"
+ld.so.cache profile
+```
+
+We see from the above that there are no certificates in the environment with just python and python-requests.
+
+Okay. Now let's write a simple python script to test things out with:
+
+```
+import requests
+
+resp = requests.get("https://github.com")
+print(resp)
+```
+
+and run it!
+
+```
+$ guix shell --container --network python python-requests coreutils -- python3 test.py
+Traceback (most recent call last):
+ File "/tmp/test.py", line 1, in <module>
+ import requests
+ File "/gnu/store/b6ny4p29f32rrnnvgx7zz1nhsms2zmqk-profile/lib/python3.10/site-packages/requests/__init__.py", line 164, in <module>
+ from .api import delete, get, head, options, patch, post, put, request
+ File "/gnu/store/b6ny4p29f32rrnnvgx7zz1nhsms2zmqk-profile/lib/python3.10/site-packages/requests/api.py", line 11, in <module>
+ from . import sessions
+ File "/gnu/store/b6ny4p29f32rrnnvgx7zz1nhsms2zmqk-profile/lib/python3.10/site-packages/requests/sessions.py", line 15, in <module>
+ from .adapters import HTTPAdapter
+ File "/gnu/store/b6ny4p29f32rrnnvgx7zz1nhsms2zmqk-profile/lib/python3.10/site-packages/requests/adapters.py", line 81, in <module>
+ _preloaded_ssl_context.load_verify_locations(
+FileNotFoundError: [Errno 2] No such file or directory
+```
+
+Uhmm, what is this new error?
+
+Add `nss-certs` and try again.
+
+```
+$ guix shell --container --network python python-requests nss-certs coreutils
+[env]$ ls ${GUIX_ENVIRONMENT}/etc/ssl/
+certs
+[env]$ python3 test.py
+Traceback (most recent call last):
+ File "/tmp/test.py", line 1, in <module>
+ import requests
+ File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/__init__.py", line 164, in <module>
+ from .api import delete, get, head, options, patch, post, put, request
+ File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/api.py", line 11, in <module>
+ from . import sessions
+ File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/sessions.py", line 15, in <module>
+ from .adapters import HTTPAdapter
+ File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/adapters.py", line 81, in <module>
+ _preloaded_ssl_context.load_verify_locations(
+FileNotFoundError: [Errno 2] No such file or directory
+[env]$
+[env]$ export REQUESTS_CA_BUNDLE="${GUIX_ENVIRONMENT}/etc/ssl/certs/ca-certificates.crt"
+[env]$ python3 test.py
+Traceback (most recent call last):
+ File "/tmp/test.py", line 1, in <module>
+ import requests
+ File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/__init__.py", line 164, in <module>
+ from .api import delete, get, head, options, patch, post, put, request
+ File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/api.py", line 11, in <module>
+ from . import sessions
+ File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/sessions.py", line 15, in <module>
+ from .adapters import HTTPAdapter
+ File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/adapters.py", line 81, in <module>
+ _preloaded_ssl_context.load_verify_locations(
+FileNotFoundError: [Errno 2] No such file or directory
+```
+
+Welp! Looks like this error is a whole different thing.
+
+Let us try with the genenetwork2 package.
+
+```
+$ guix shell --container --network genenetwork2 coreutils
+[env]$ ls "${GUIX_ENVIRONMENT}/etc"
+bash_completion.d jupyter ld.so.cache profile
+```
+
+This does not seem to have the certificates in place either, so let's add `nss-certs`:
+
+```
+$ guix shell --container --network genenetwork2 coreutils nss-certs
+[env]$ ls "${GUIX_ENVIRONMENT}/etc"
+bash_completion.d jupyter ld.so.cache profile ssl
+[env]$ python3 test.py
+Traceback (most recent call last):
+ File "/tmp/test.py", line 3, in <module>
+ resp = requests.get("https://github.com")
+ File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/api.py", line 73, in get
+ return request("get", url, params=params, **kwargs)
+ File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/api.py", line 59, in request
+ return session.request(method=method, url=url, **kwargs)
+ File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
+ resp = self.send(prep, **send_kwargs)
+ File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
+ r = adapter.send(request, **kwargs)
+ File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/adapters.py", line 460, in send
+ self.cert_verify(conn, request.url, verify, cert)
+ File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/adapters.py", line 263, in cert_verify
+ raise OSError(
+OSError: Could not find a suitable TLS CA certificate bundle, invalid path: /etc/ssl/certs/ca-certificates.crt
+```
+
+We get the expected certificates error! This is good. Now define the envvar and try again.
+
+```
+[env]$ export REQUESTS_CA_BUNDLE="${GUIX_ENVIRONMENT}/etc/ssl/certs/ca-certificates.crt"
+[env]$ python3 test.py
+<Response [200]>
+```
+
+Success!!!
+
+Adding nss-certs and setting `REQUESTS_CA_BUNDLE` fixes things. We'll need to do the same for the container, for both the genenetwork2 and genenetwork3 packages (and any other packages that use the requests library).
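To catch regressions of this early, one could add a fail-fast guard at application startup that validates the variable before the first outbound HTTPS request fails confusingly. A hypothetical sketch, not part of gn2/gn3:

```python
import os
from pathlib import Path

def check_ca_bundle():
    """Raise early if REQUESTS_CA_BUNDLE points at a missing file."""
    bundle = os.environ.get("REQUESTS_CA_BUNDLE")
    if bundle and not Path(bundle).is_file():
        raise RuntimeError(
            f"REQUESTS_CA_BUNDLE points at a missing file: {bundle}")
    return bundle
```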
+
+### Fixes
+
+=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=fec68c4ca87eeca4eb9e69e71fc27e0eae4dd728
+=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=c3bb784c8c70857904ef97ecd7d36ec98772413d
+The two commits above add the nss-certs package to all the Flask apps that make use of the python-requests library, which requires a valid CA certificate bundle in each application's environment.
+
+=> https://git.genenetwork.org/gn-machines/commit/?h=production-container&id=04506c4496e5ca8b3bc38e28ed70945a145fb036
+The commit above defines the "REQUESTS_CA_BUNDLE" environment variable for all the flask applications that make use of python's requests library.
diff --git a/issues/genenetwork/setup-mailing-on-tux04.gmi b/issues/genenetwork/setup-mailing-on-tux04.gmi
new file mode 100644
index 0000000..45605d9
--- /dev/null
+++ b/issues/genenetwork/setup-mailing-on-tux04.gmi
@@ -0,0 +1,16 @@
+# Setup Mailing on Tux04
+
+## Tags
+
+* type: bug
+* status: closed
+* assigned: fredm
+* priority: critical
+* interested: pjotrp, zsloan
+* keywords: production, container, tux04
+
+## Description
+
+We use emails to verify user accounts and to allow users to change their passwords. We therefore need to set up a way to send emails from the system.
+
+I updated the configurations to use UTHSC's mail server.
diff --git a/issues/genenetwork/umhet3-samples-timing-slow.gmi b/issues/genenetwork/umhet3-samples-timing-slow.gmi
new file mode 100644
index 0000000..a3a33a7
--- /dev/null
+++ b/issues/genenetwork/umhet3-samples-timing-slow.gmi
@@ -0,0 +1,72 @@
+# UM-HET3 Timing: Slow
+
+## Tags
+
+* type: bug
+* status: open
+* assigned: fredm
+* priority: critical
+* interested: fredm, pjotrp, zsloan
+* keywords: production, container, tux04, UM-HET3
+
+## Description
+
+In email from @robw:
+
+```
+> > Not sure why. Am I testing the wrong way?
+> > Are we using memory and RAM in the same way on the two machines?
+> > Here are data on the loading time improvement for Tux2:
+> > I tested this using a "worst case" trait that we know when—the 25,000
+> > UM-HET3 samples:
+> > [1]https://genenetwork.org/show_trait?trait_id=10004&dataset=HET3-ITPPublish
+> > Tux02: 15.6, 15.6, 15.3 sec
+> > Fallback: 37.8, 38.7, 38.5 sec
+> > Here are data on Gemma speed/latency performance:
+> > Also tested "worst case" performance using three large BXD data sets
+> > tested in this order:
+> > [2]https://genenetwork.org/show_trait?trait_id=10004&dataset=BXD-LongevityPublish
+> > [3]https://genenetwork.org/show_trait?trait_id=10003&dataset=BXD-LongevityPublish
+> > [4]https://genenetwork.org/show_trait?trait_id=10002&dataset=BXD-LongevityPublish
+> > Tux02: 107.2, 329.9 (ouch), 360.0 sec (double ouch) for 1004, 1003, and
+> > 1002 respectively. On recompute (from cache) 19.9, 19.9 and 20.0—still
+> > too slow.
+> > Fallback: 154.1, 115.9 for the first two traits (trait 10002 already in
+> > the cache)
+> > On recompute (from cache) 59.6, 59.0 and 59.7. Too slow from cache.
+> > PROBLEM 2: Tux02 is unable to map UM-HET3. I still get an nginx 413
+> > error: Entity Too Large.
+>
+> Yeah, Fred should fix that one. It is an nginx setting - we run 2x
+> nginx. It was reported earlier.
+>
+> > I need this to work asap. Now mapping our amazing UM-HET3 data. I can
+> > use Fallback, but it is painfully slow and takes about 214 sec. I hope
+> > Tux02 gets that down to a still intolerable slow 86 sec.
+> > Can we please fix and confirm by testing. The Trait is above for your
+> > testing pleasure.
+> > Even 86 secs is really too slow and should motivate us (or users like
+> > me) to think about how we are using all of those 24 ultra-fast cores on
+> > the AMD 9274F. Why not put them all to use for us and users. It is not
+> > good enough just to have "it work". It has to work in about 5–10
+> > seconds.
+> > Here are my questions for you guys: Are we able to use all 24 cores
+> > for any one user? How does each user interact with the CPU? Can we
+> > handle a class of 24 students with 24 cores, or is it "complicated"?
+> > PROBLEM 3: Zach, Fred. Are we computing render time or transport
+> > latency correctly? Ideally the printout at the bottom of mapping pages
+> > would be true latency as experienced by the user. As far as I can tell
+> > with a stop watch our estimates of time are incorrect by as much as 3
+> > secs. And note that the link
+> > to [5]http://joss.theoj.org/papers/10.21105/joss.00025 is not working
+> > correctly in the footer (see image below). Oddly enough it works fine
+> > on Tux02
+>
+> Fred, take a note.
+```
+
+Figure out what this is about and fix it.
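One concrete item from the email that is actionable now: the 413 "Entity Too Large" on mapping UM-HET3 is nginx rejecting a request body larger than its limit, which is controlled by the `client_max_body_size` directive. Since we run two nginx instances, both would need the change. A sketch, where the value is an example to be tuned rather than a measured requirement:

```
server {
    # Raise the request-body limit so large UM-HET3 mapping
    # submissions are not rejected with HTTP 413.
    client_max_body_size 100m;
}
```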
diff --git a/issues/genenetwork/virtuoso-shutdown-clears-data.gmi b/issues/genenetwork/virtuoso-shutdown-clears-data.gmi
new file mode 100644
index 0000000..2e01238
--- /dev/null
+++ b/issues/genenetwork/virtuoso-shutdown-clears-data.gmi
@@ -0,0 +1,98 @@
+# Virtuoso: Shutdown Clears Data
+
+## Tags
+
+* type: bug
+* assigned: fredm
+* priority: critical
+* status: closed, completed
+* interested: bonfacem, pjotrp, zsloan
+* keywords: production, container, tux04, virtuoso
+
+## Description
+
+It seems that virtuoso has the bad habit of clearing data whenever it is stopped/restarted.
+
+This issue will track the work necessary to get the service behaving correctly.
+
+According to the documentation on
+=> https://vos.openlinksw.com/owiki/wiki/VOS/VirtBulkRDFLoader the bulk loading process
+
+```
+The bulk loader also disables checkpointing and the scheduler, which also need to be re-enabled post bulk load
+```
+
+That needs to be handled.
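Per that documentation, re-enabling would be done from `isql` after the bulk load finishes, along these lines (the interval values shown are the commonly documented ones, not values we have verified for our setup):

```
-- run in isql after the bulk load completes
checkpoint;
checkpoint_interval(60);
scheduler_interval(10);
```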
+
+### Notes
+
+After having a look at
+=> https://docs.openlinksw.com/virtuoso/ch-server/#databaseadmsrv the configuration documentation
+it occurs to me that the reason virtuoso supposedly clears the data is that the `DatabaseFile` value is not set, so it defaults to a new database file every time the server is restarted (See also the `Striping` setting).
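If that hypothesis holds, explicitly pinning the database files in the ini file's `[Database]` section should stop virtuoso creating fresh files on restart. A sketch, with example paths:

```
[Database]
DatabaseFile = /var/lib/virtuoso/virtuoso.db
ErrorLogFile = /var/lib/virtuoso/virtuoso.log
TransactionFile = /var/lib/virtuoso/virtuoso.trx
```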
+
+### Troubleshooting
+
+Reproduce locally:
+
+We begin by taking a look at the settings for the remote virtuoso:
+```
+$ ssh tux04
+fredm@tux04:~$ cat /gnu/store/bg6i4x96nm32gjp4qhphqmxqc5vggk3h-virtuoso.ini
+[Parameters]
+ServerPort = localhost:8981
+DirsAllowed = /var/lib/data
+NumberOfBuffers = 4000000
+MaxDirtyBuffers = 3000000
+[HTTPServer]
+ServerPort = localhost:8982
+```
+
+Copy these into a file locally, and adjust `NumberOfBuffers` and `MaxDirtyBuffers` down for the smaller local dev environment. Also update `DirsAllowed`.
+
+We end up with our local configuration in `~/tmp/virtuoso/etc/virtuoso.ini` with the content:
+
+```
+[Parameters]
+ServerPort = localhost:8981
+DirsAllowed = /var/lib/data
+NumberOfBuffers = 10000
+MaxDirtyBuffers = 6000
+[HTTPServer]
+ServerPort = localhost:8982
+```
+
+Run virtuoso!
+```
+$ cd ~/tmp/virtuoso/var/lib/virtuoso/
+$ ls
+$ ~/opt/virtuoso/bin/virtuoso-t +foreground +configfile ~/tmp/virtuoso/etc/virtuoso.ini
+```
+
+Here we start by changing into the `~/tmp/virtuoso/var/lib/virtuoso/` directory, which is where virtuoso will put its state. Now, in a different terminal, list the files created in the state directory:
+
+```
+$ ls ~/tmp/virtuoso/var/lib/virtuoso
+virtuoso.db virtuoso.lck virtuoso.log virtuoso.pxa virtuoso.tdb virtuoso.trx
+```
+
+That creates the database file (and other files) with the documented default values, i.e. `virtuoso.*`.
+
+We cannot quite reproduce the issue locally, since every restart will use exactly the same file names locally.
+
+Checking the state directory for virtuoso on tux04, however:
+
+```
+fredm@tux04:~$ sudo ls -al /export2/guix-containers/genenetwork/var/lib/virtuoso/ | grep '\.db$'
+-rw-r--r-- 1 986 980 3787456512 Oct 28 14:16 js1b7qjpimdhfj870kg5b2dml640hryx-virtuoso.db
+-rw-r--r-- 1 986 980 4152360960 Oct 28 17:11 rf8v0c6m6kn5yhf00zlrklhp5lmgpr4x-virtuoso.db
+```
+
+We see that there are multiple db files, each apparently created when virtuoso was restarted. There is an extra (possibly random) string prepended to the `virtuoso.db` part. This happens for our service if we do not actually provide the `DatabaseFile` configuration.
+
+
+## Fixes
+
+=> https://github.com/genenetwork/gn-gemtext-threads/commit/8211c1e49498ba2f3b578ed5b11b15c52299aa08 Document how to restart checkpointing and the scheduler after bulk loading
+=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=2dc335ca84ea7f26c6977e6b432f3420b113f0aa Add configs for scheduler and checkpointing
+=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=7d793603189f9d41c8ee87f8bb4c876440a1fce2 Set up virtuoso database configurations
+=> https://git.genenetwork.org/gn-machines/commit/?id=46a1c4c8d01198799e6ac3b99998dca40d2c7094 Explicitly name virtuoso database files.