Diffstat (limited to 'issues/genenetwork')
8 files changed, 548 insertions, 0 deletions
diff --git a/issues/genenetwork/cannot-connect-to-mariadb.gmi b/issues/genenetwork/cannot-connect-to-mariadb.gmi new file mode 100644 index 0000000..ca4bd9f --- /dev/null +++ b/issues/genenetwork/cannot-connect-to-mariadb.gmi @@ -0,0 +1,99 @@ +# Cannot Connect to MariaDB + + +## Description + +GeneNetwork3 is failing to connect to mariadb with the error: + +``` +⋮ +2024-11-05 14:49:00 Traceback (most recent call last): +2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/flask/app.py", line 1523, in full_dispatch_request +2024-11-05 14:49:00 rv = self.dispatch_request() +2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/flask/app.py", line 1509, in dispatch_request +2024-11-05 14:49:00 return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args) +2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/gn3/api/menu.py", line 13, in generate_json +2024-11-05 14:49:00 with database_connection(current_app.config["SQL_URI"], logger=current_app.logger) as conn: +2024-11-05 14:49:00 File "/gnu/store/lzw93sik90d780n09svjx5la1bb8g3df-python-3.10.7/lib/python3.10/contextlib.py", line 135, in __enter__ +2024-11-05 14:49:00 return next(self.gen) +2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/gn3/db_utils.py", line 34, in database_connection +2024-11-05 14:49:00 connection = mdb.connect(db=db_name, +2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/MySQLdb/__init__.py", line 121, in Connect +2024-11-05 14:49:00 return Connection(*args, **kwargs) +2024-11-05 14:49:00 File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/MySQLdb/connections.py", line 195, in __init__ +2024-11-05 14:49:00 super().__init__(*args, **kwargs2) +2024-11-05 14:49:00 MySQLdb.OperationalError: (2002, "Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)") +``` + +We have previously defined the default socket file[^1][^2] as "/run/mysqld/mysqld.sock". + +## Troubleshooting Logs + +### 2024-11-05 + +I attempted to just bind `/run/mysqld/mysqld.sock` to `/tmp/mysql.sock` by adding the following mapping in GN3's `gunicorn-app` definition: + +``` +(file-system-mapping + (source "/run/mysqld/mysqld.sock") + (target "/tmp/mysql.sock") + (writable? #t)) +``` + +but that does not fix things. + +I had tried to change the mysql URI to use IP addresses, i.e. + +``` +SQL_URI="mysql://webqtlout:webqtlout@128.169.5.119:3306/db_webqtl" +``` + +but that simply changes the error from the above to the one below: + +``` +2024-11-05 15:27:12 MySQLdb.OperationalError: (2002, "Can't connect to MySQL server on '128.169.5.119' (115)") +``` + +I tried with both `127.0.0.1` and `128.169.5.119`. + +My hail-mary was to attempt to expose the `my.cnf` file generated by the `mysql-service-type` definition to the "pola-wrapper", but that is proving tricky, seeing as the file is generated elsewhere[^4] and we do not have a way of figuring out the actual final path of the file. + +I tried: + +``` +(file-system-mapping + (source (mixed-text-file "my.cnf" + (string-append "[client]\n" + "socket=/run/mysqld/mysqld.sock"))) + (target "/etc/mysql/my.cnf")) +``` + +but that did not work either. + +### 2024-11-07 + +Start digging into how GNU Guix services are defined[^5] to try and understand why the file mapping attempt did not work. + +=> http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/system/file-systems.scm?id=2394a7f5fbf60dd6adc0a870366adb57166b6d8b#n575 +Looking at the code linked above, specifically lines 575 to 588 and line 166, it seems to me that the mapping attempt should have worked.
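+ +As an aside on the IP-address attempts above: MySQL client libraries treat a host of "localhost" specially and connect over the library's compiled-in default unix socket (often `/tmp/mysql.sock`) instead of TCP, which is exactly the path named in the first error. A small stand-alone sketch of the URI handling involved; `parse_sql_uri` here is an illustrative helper, not the actual gn3 code: + +```python
from urllib.parse import urlparse

def parse_sql_uri(uri):
    """Illustrative stand-in for gn3's SQL_URI parsing (not the real code)."""
    parsed = urlparse(uri)
    return {
        "user": parsed.username,
        "passwd": parsed.password,
        "host": parsed.hostname,
        "port": parsed.port or 3306,  # MySQL's default port
        "db": parsed.path.lstrip("/"),
    }

params = parse_sql_uri("mysql://webqtlout:webqtlout@localhost:3306/db_webqtl")
# With host == "localhost", the underlying client library ignores TCP and
# falls back to its compiled-in default unix socket (often /tmp/mysql.sock),
# which is where the original (2002) error message comes from.
print(params["host"], params["db"])  # localhost db_webqtl
``` + +If pointing the client at the right socket is the goal, MySQLdb's `connect` also accepts a `unix_socket` keyword argument, which would sidestep the compiled-in default without any file-system mapping.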
+ +Try it again, taking care to verify that the paths are correct, with: + +``` +(file-system-mapping + (source (mixed-text-file "my.cnf" + (string-append "[client-server]\n" + "socket=/run/mysqld/mysqld.sock"))) + (target "/etc/my.cnf")) +``` + +Try rebuilding on tux04: started getting `Segmentation fault` errors out of the blue for many guix commands 🤦🏿. +Try building the container on the local dev machine: this took a long time - quit and continue later. + +### Footnotes + +[^1] Lines 47 to 49 of https://git.genenetwork.org/gn-machines/tree/production.scm?id=46a1c4c8d01198799e6ac3b99998dca40d2c7094#n47 +[^2] Guix's mysql-service-type configurations https://guix.gnu.org/manual/en/html_node/Database-Services.html#index-mysql_002dconfiguration +[^3] https://mariadb.com/kb/en/server-system-variables/#socket +[^4] https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/services/databases.scm?id=4c56d0cccdc44e12484b26332715f54768738c5f#n576 +[^5] https://guix.gnu.org/manual/en/html_node/Defining-Services.html diff --git a/issues/genenetwork/containerising-production-issues.gmi b/issues/genenetwork/containerising-production-issues.gmi new file mode 100644 index 0000000..ed5702a --- /dev/null +++ b/issues/genenetwork/containerising-production-issues.gmi @@ -0,0 +1,33 @@ +# Containerising Production: Issues + +## Tags + +* type: bug +* assigned: fredm +* priority: critical +* status: closed, completed +* keywords: production, container, tux04 +* interested: alexk, aruni, bonfacem, fredm, pjotrp, soloshelby, zsloan, jnduli + +## Description + +We recently got production into a container and deployed it. It has come up, however, that some services needed for a full-featured GeneNetwork system are not part of the container. + +This is, therefore, a meta-issue, tracking all issues that relate to the deployment of the disparate services that make up GeneNetwork.
+ +## Documentation + +=> https://issues.genenetwork.org/topics/genenetwork/genenetwork-services + +The link above documents the various services that make up GeneNetwork. + +## Issues + +* [x] Move user directories to a large partition +=> ./handle-tmp-dirs-in-container [x] Link TMPDIR in container to a directory on a large partition +=> ./markdown-editing-service-not-deployed [ ] Define and deploy Markdown Editing service +=> ./umhet3-samples-timing-slow [ ] Figure out and fix UM-HET3 Samples mappings on Tux04 +=> ./setup-mailing-on-tux04 [x] Setting up email service on Tux04 +=> ./virtuoso-shutdown-clears-data [x] Virtuoso seems to lose data on restart +=> ./python-requests-error-in-container [x] Fix python's requests library certificates error +=> ./cannot-connect-to-mariadb [ ] GN3 cannot connect to mariadb server diff --git a/issues/genenetwork/handle-tmp-dirs-in-container.gmi b/issues/genenetwork/handle-tmp-dirs-in-container.gmi new file mode 100644 index 0000000..5f6eb92 --- /dev/null +++ b/issues/genenetwork/handle-tmp-dirs-in-container.gmi @@ -0,0 +1,22 @@ +# Handle Temporary Directories in the Container + +## Tags + +* type: feature +* assigned: fredm +* priority: critical +* status: closed, completed +* keywords: production, container, tux04 +* interested: alexk, aruni, bonfacem, pjotrp, zsloan + +## Description + +The container's temporary directories should be on a large partition on the host, to avoid a scenario where the writes fill up one of the smaller drives. + +Currently, we use the `/tmp` directory by default, but we should look into transitioning away from that — `/tmp` is world-readable and world-writable and therefore needs careful consideration to keep safe. + +Thankfully, we are running our systems within a container, and can bind the container's `/tmp` directory to a non-world-accessible directory, keeping things at least contained.
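+ +The bind can be sketched with the same `file-system-mapping` construct used elsewhere in our container definitions. The host-side path below is a placeholder, not necessarily what the fix commit uses: + +``` +(file-system-mapping + ;; placeholder host directory on a large partition + (source "/export/genenetwork-tmp") + (target "/tmp") + (writable? #t)) +``` + +With a mapping like this, anything the services write to `/tmp` inside the container lands in the host directory named by `source`, which only the container's processes (and root on the host) can reach.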
+ +### Fixes + +=> https://git.genenetwork.org/gn-machines/commit/?id=7306f1127df9d4193adfbfa51295615f13d32b55 diff --git a/issues/genenetwork/markdown-editing-service-not-deployed.gmi b/issues/genenetwork/markdown-editing-service-not-deployed.gmi new file mode 100644 index 0000000..e7a1717 --- /dev/null +++ b/issues/genenetwork/markdown-editing-service-not-deployed.gmi @@ -0,0 +1,34 @@ +# Markdown Editing Service: Not Deployed + +## Tags + +* type: bug +* status: open +* assigned: fredm +* priority: critical +* keywords: production, container, tux04 +* interested: alexk, aruni, bonfacem, fredm, pjotrp, zsloan + +## Description + +The Markdown Editing service is not working on production. + +* Link: https://genenetwork.org/facilities/ +* Repository: https://git.genenetwork.org/gn-guile + +Currently, the code is being run directly on the host, rather than inside the container. + +Some important things to note: + +* The service requires access to a checkout of https://github.com/genenetwork/gn-docs +* Currently, the service is hard-coded to use a specific port: we should probably fix that. + +## Reopened: 2024-11-01 + +While the service was deployed, the edit functionality is not working right; specifically, pushing the edits upstream to the remote fails. + +If you make an edit and refresh the page, the edit shows up in the system, but it is never pushed to the remote. + +Setting `CGIT_REPO_PATH="https://git.genenetwork.org/gn-guile"` seems to allow the commit to work, but the changes are still not pushed to the remote in any useful sense. + +It seems to me that we need to configure the environment so that it can push the changes to the remote.
diff --git a/issues/genenetwork/python-requests-error-in-container.gmi b/issues/genenetwork/python-requests-error-in-container.gmi new file mode 100644 index 0000000..0289762 --- /dev/null +++ b/issues/genenetwork/python-requests-error-in-container.gmi @@ -0,0 +1,174 @@ +# Python Requests Error in Container + +## Tags + +* type: bug +* assigned: fredm +* priority: critical +* status: closed, completed, fixed +* interested: alexk, aruni, bonfacem, pjotrp, zsloan +* keywords: production, container, tux04, python, requests + +## Description + +Building the container with the +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=eb7beb340a9731775e8ad177e47b70dba2f2a84f upgraded guix definition +leads to python's requests library failing. + +``` +2024-10-30 16:04:13 OSError: Could not find a suitable TLS CA certificate bundle, invalid path: /etc/ssl/certs/ca-certificates.crt +``` + +If you log in to the container itself, however, you find that the file `/etc/ssl/certs/ca-certificates.crt` actually exists and has content. + +Suggested fixes include setting up the correct environment variables for the requests library, such as `REQUESTS_CA_BUNDLE`. + +See +=> https://requests.readthedocs.io/en/latest/user/advanced/#ssl-cert-verification + +### Troubleshooting Logs + +Try reproducing the issue locally: + +``` +$ guix --version +hint: Consider installing the `glibc-locales' package and defining `GUIX_LOCPATH', along these lines: + + guix install glibc-locales + export GUIX_LOCPATH="$HOME/.guix-profile/lib/locale" + +See the "Application Setup" section in the manual, for more info. + +guix (GNU Guix) 2394a7f5fbf60dd6adc0a870366adb57166b6d8b +Copyright (C) 2024 the Guix authors +License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> +This is free software: you are free to change and redistribute it. +There is NO WARRANTY, to the extent permitted by law.
+$ +$ guix shell --container --network python python-requests coreutils +[env]$ ls "${GUIX_ENVIRONMENT}/etc" +ld.so.cache profile +``` + +We see from the above that there are no certificates in the environment with just python and python-requests. + +Okay. Now let's write a simple python script to test things out with: + +``` +import requests + +resp = requests.get("https://github.com") +print(resp) +``` + +and run it! + +``` +$ guix shell --container --network python python-requests coreutils -- python3 test.py +Traceback (most recent call last): + File "/tmp/test.py", line 1, in <module> + import requests + File "/gnu/store/b6ny4p29f32rrnnvgx7zz1nhsms2zmqk-profile/lib/python3.10/site-packages/requests/__init__.py", line 164, in <module> + from .api import delete, get, head, options, patch, post, put, request + File "/gnu/store/b6ny4p29f32rrnnvgx7zz1nhsms2zmqk-profile/lib/python3.10/site-packages/requests/api.py", line 11, in <module> + from . import sessions + File "/gnu/store/b6ny4p29f32rrnnvgx7zz1nhsms2zmqk-profile/lib/python3.10/site-packages/requests/sessions.py", line 15, in <module> + from .adapters import HTTPAdapter + File "/gnu/store/b6ny4p29f32rrnnvgx7zz1nhsms2zmqk-profile/lib/python3.10/site-packages/requests/adapters.py", line 81, in <module> + _preloaded_ssl_context.load_verify_locations( +FileNotFoundError: [Errno 2] No such file or directory +``` + +Uhmm, what is this new error? + +Add `nss-certs` and try again. 
+ +``` +$ guix shell --container --network python python-requests nss-certs coreutils +[env]$ ls ${GUIX_ENVIRONMENT}/etc/ssl/ +certs +[env]$ python3 test.py +Traceback (most recent call last): + File "/tmp/test.py", line 1, in <module> + import requests + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/__init__.py", line 164, in <module> + from .api import delete, get, head, options, patch, post, put, request + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/api.py", line 11, in <module> + from . import sessions + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/sessions.py", line 15, in <module> + from .adapters import HTTPAdapter + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/adapters.py", line 81, in <module> + _preloaded_ssl_context.load_verify_locations( +FileNotFoundError: [Errno 2] No such file or directory +[env]$ +[env]$ export REQUESTS_CA_BUNDLE="${GUIX_ENVIRONMENT}/etc/ssl/certs/ca-certificates.crt" +[env]$ python3 test.py +Traceback (most recent call last): + File "/tmp/test.py", line 1, in <module> + import requests + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/__init__.py", line 164, in <module> + from .api import delete, get, head, options, patch, post, put, request + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/api.py", line 11, in <module> + from .
import sessions + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/sessions.py", line 15, in <module> + from .adapters import HTTPAdapter + File "/gnu/store/17dw8qczqqz9fmj2kxzsbfqn730frqd7-profile/lib/python3.10/site-packages/requests/adapters.py", line 81, in <module> + _preloaded_ssl_context.load_verify_locations( +FileNotFoundError: [Errno 2] No such file or directory +``` + +Welp! Looks like this error is a whole different thing. + +Let us try with the genenetwork2 package. + +``` +$ guix shell --container --network genenetwork2 coreutils +[env]$ ls "${GUIX_ENVIRONMENT}/etc" +bash_completion.d jupyter ld.so.cache profile +``` + +This does not seem to have the certificates in place either, so let's add nss-certs + +``` +$ guix shell --container --network genenetwork2 coreutils nss-certs +[env]$ ls "${GUIX_ENVIRONMENT}/etc" +bash_completion.d jupyter ld.so.cache profile ssl +[env]$ python3 test.py +Traceback (most recent call last): + File "/tmp/test.py", line 3, in <module> + resp = requests.get("https://github.com") + File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/api.py", line 73, in get + return request("get", url, params=params, **kwargs) + File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/api.py", line 59, in request + return session.request(method=method, url=url, **kwargs) + File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/sessions.py", line 587, in request + resp = self.send(prep, **send_kwargs) + File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/sessions.py", line 701, in send + r = adapter.send(request, **kwargs) + File "/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/adapters.py", line 460, in send + self.cert_verify(conn, request.url, verify, cert) + File 
"/gnu/store/qigjz4i0dckbsjbd2has0md2dxwsa7ry-profile/lib/python3.10/site-packages/requests/adapters.py", line 263, in cert_verify + raise OSError( +OSError: Could not find a suitable TLS CA certificate bundle, invalid path: /etc/ssl/certs/ca-certificates.crt +``` + +We get the expected certificates error! This is good. Now define the envvar and try again. + +``` +[env]$ export REQUESTS_CA_BUNDLE="${GUIX_ENVIRONMENT}/etc/ssl/certs/ca-certificates.crt" +[env]$ python3 test.py +<Response [200]> +``` + +Success!!! + +Adding nss-certs and setting the `REQUESTS_CA_BUNDLE` fixes things. We'll need to do the same for the container, for both the genenetwork2 and genenetwork3 packages (and any other packages that use requests library). + +### Fixes + +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=fec68c4ca87eeca4eb9e69e71fc27e0eae4dd728 +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=c3bb784c8c70857904ef97ecd7d36ec98772413d +The two commits above add nss-certs package to all the flask apps, which make use of the python-requests library, which requires a valid CA certificates bundle in each application's environment. + +=> https://git.genenetwork.org/gn-machines/commit/?h=production-container&id=04506c4496e5ca8b3bc38e28ed70945a145fb036 +The commit above defines the "REQUESTS_CA_BUNDLE" environment variable for all the flask applications that make use of python's requests library. diff --git a/issues/genenetwork/setup-mailing-on-tux04.gmi b/issues/genenetwork/setup-mailing-on-tux04.gmi new file mode 100644 index 0000000..45605d9 --- /dev/null +++ b/issues/genenetwork/setup-mailing-on-tux04.gmi @@ -0,0 +1,16 @@ +# Setup Mailing on Tux04 + +## Tags + +* type: bug +* status: closed +* assigned: fredm +* priority: critical +* interested: pjotrp, zsloan +* keywords: production, container, tux04 + +## Description + +We use emails to verify user accounts and allow changing of user passwords. 
We therefore need to set up a way to send emails from the system. + +I updated the configurations to use UTHSC's mail server. diff --git a/issues/genenetwork/umhet3-samples-timing-slow.gmi b/issues/genenetwork/umhet3-samples-timing-slow.gmi new file mode 100644 index 0000000..a3a33a7 --- /dev/null +++ b/issues/genenetwork/umhet3-samples-timing-slow.gmi @@ -0,0 +1,72 @@ +# UM-HET3 Timing: Slow + +## Tags + +* type: bug +* status: open +* assigned: fredm +* priority: critical +* interested: fredm, pjotrp, zsloan +* keywords: production, container, tux04, UM-HET3 + +## Description + +In an email from @robw: + +``` +> > Not sure why. Am I testing the wrong way? +> > Are we using memory and RAM in the same way on the two machines? +> > Here are data on the loading time improvement for Tux2: +> > I tested this using a "worst case" trait that we know when—the 25,000 +> > UM-HET3 samples: +> > [1]https://genenetwork.org/show_trait?trait_id=10004&dataset=HET3-ITPPu +> > blish +> > Tux02: 15.6, 15.6, 15.3 sec +> > Fallback: 37.8, 38.7, 38.5 sec +> > Here are data on Gemma speed/latency performance: +> > Also tested "worst case" performance using three large BXD data sets +> > tested in this order: +> > [2]https://genenetwork.org/show_trait?trait_id=10004&dataset=BXD-Longev +> > ityPublish +> > [3]https://genenetwork.org/show_trait?trait_id=10003&dataset=BXD-Longev +> > ityPublish +> > [4]https://genenetwork.org/show_trait?trait_id=10002&dataset=BXD-Longev +> > ityPublish +> > Tux02: 107.2, 329.9 (ouch), 360.0 sec (double ouch) for 1004, 1003, and +> > 1002 respectively. On recompute (from cache) 19.9, 19.9 and 20.0—still +> > too slow. +> > Fallback: 154.1, 115.9 for the first two traits (trait 10002 already in +> > the cache) +> > On recompute (from cache) 59.6, 59.0 and 59.7. Too slow from cache. +> > PROBLEM 2: Tux02 is unable to map UM-HET3. I still get an nginx 413 +> > error: Entity Too Large. +> +> Yeah, Fred should fix that one.
It is an nginx setting - we run 2x +> nginx. It was reported earlier. +> +> > I need this to work asap. Now mapping our amazing UM-HET3 data. I can +> > use Fallback, but it is painfully slow and takes about 214 sec. I hope +> > Tux02 gets that down to a still intolerable slow 86 sec. +> > Can we please fix and confirm by testing. The Trait is above for your +> > testing pleasure. +> > Even 86 secs is really too slow and should motivate us (or users like +> > me) to think about how we are using all of those 24 ultra-fast cores on +> > the AMD 9274F. Why not put them all to use for us and users. It is not +> > good enough just to have "it work". It has to work in about 5–10 +> > seconds. +> > Here are my questions for you guys: Are we able to use all 24 cores +> > for any one user? How does each user interact with the CPU? Can we +> > handle a class of 24 students with 24 cores, or is it "complicated"? +> > PROBLEM 3: Zach, Fred. Are we computing render time or transport +> > latency correctly? Ideally the printout at the bottom of mapping pages +> > would be true latency as experienced by the user. As far as I can tell +> > with a stop watch our estimates of time are incorrect by as much as 3 +> > secs. And note that the link +> > to [5]http://joss.theoj.org/papers/10.21105/joss.00025 is not working +> > correctly in the footer (see image below). Oddly enough it works fine +> > on Tux02 +> +> Fred, take a note. +``` + +Figure out what this is about and fix it. 
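+ +One item from the email is actionable already: the nginx 413 "Entity Too Large" error under PROBLEM 2 is the stock response when a request body exceeds nginx's `client_max_body_size` limit (which defaults to 1m). Since we run two nginx layers, the directive has to be raised in both. A sketch; the 100m figure is a guess, not a measured requirement: + +``` +# In the http, server, or location context of BOTH nginx configs +# (the host-level reverse proxy and the one inside the container): +client_max_body_size 100m; +``` + +Whichever layer still has the lower limit will keep returning 413, so both configs need the change before the UM-HET3 mapping requests can get through.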
diff --git a/issues/genenetwork/virtuoso-shutdown-clears-data.gmi b/issues/genenetwork/virtuoso-shutdown-clears-data.gmi new file mode 100644 index 0000000..2e01238 --- /dev/null +++ b/issues/genenetwork/virtuoso-shutdown-clears-data.gmi @@ -0,0 +1,98 @@ +# Virtuoso: Shutdown Clears Data + +## Tags + +* type: bug +* assigned: fredm +* priority: critical +* status: closed, completed +* interested: bonfacem, pjotrp, zsloan +* keywords: production, container, tux04, virtuoso + +## Description + +It seems that virtuoso has the bad habit of clearing data whenever it is stopped/restarted. + +This issue will track the work necessary to get the service behaving correctly. + +According to the documentation on +=> https://vos.openlinksw.com/owiki/wiki/VOS/VirtBulkRDFLoader the bulk loading process + +``` +The bulk loader also disables checkpointing and the scheduler, which also need to be re-enabled post bulk load +``` + +That needs to be handled. + +### Notes + +After having a look at +=> https://docs.openlinksw.com/virtuoso/ch-server/#databaseadmsrv the configuration documentation +it occurs to me that the reason virtuoso supposedly clears the data is that the `DatabaseFile` value is not set, so it defaults to a new database file every time the server is restarted (see also the `Striping` setting). + +### Troubleshooting + +Reproduce locally: + +We begin by taking a look at the settings for the remote virtuoso: +``` +$ ssh tux04 +fredm@tux04:~$ cat /gnu/store/bg6i4x96nm32gjp4qhphqmxqc5vggk3h-virtuoso.ini +[Parameters] +ServerPort = localhost:8981 +DirsAllowed = /var/lib/data +NumberOfBuffers = 4000000 +MaxDirtyBuffers = 3000000 +[HTTPServer] +ServerPort = localhost:8982 +``` + +Copy these into a file locally, and adjust the `NumberOfBuffers` and `MaxDirtyBuffers` for a smaller local dev environment. Also update `DirsAllowed`.
+ +We end up with our local configuration in `~/tmp/virtuoso/etc/virtuoso.ini` with the content: + +``` +[Parameters] +ServerPort = localhost:8981 +DirsAllowed = /var/lib/data +NumberOfBuffers = 10000 +MaxDirtyBuffers = 6000 +[HTTPServer] +ServerPort = localhost:8982 +``` + +Run virtuoso! +``` +$ cd ~/tmp/virtuoso/var/lib/virtuoso/ +$ ls +$ ~/opt/virtuoso/bin/virtuoso-t +foreground +configfile ~/tmp/virtuoso/etc/virtuoso.ini +``` + +Here we start by changing into the `~/tmp/virtuoso/var/lib/virtuoso/` directory, which is where virtuoso will put its state. Now, in a different terminal, list the files created in the state directory: + +``` +$ ls ~/tmp/virtuoso/var/lib/virtuoso +virtuoso.db virtuoso.lck virtuoso.log virtuoso.pxa virtuoso.tdb virtuoso.trx +``` + +That creates the database file (and other files) with the documented default names, i.e. `virtuoso.*`. + +We cannot quite reproduce the issue locally, since every restart reuses exactly the same file names. + +Checking the state directory for virtuoso on tux04, however: + +``` +fredm@tux04:~$ sudo ls -al /export2/guix-containers/genenetwork/var/lib/virtuoso/ | grep '\.db$' +-rw-r--r-- 1 986 980 3787456512 Oct 28 14:16 js1b7qjpimdhfj870kg5b2dml640hryx-virtuoso.db +-rw-r--r-- 1 986 980 4152360960 Oct 28 17:11 rf8v0c6m6kn5yhf00zlrklhp5lmgpr4x-virtuoso.db +``` + +We see that there are multiple db files, each created when virtuoso was restarted. There is an extra (possibly random) string prepended to the `virtuoso.db` part. This happens for our service if we do not actually provide the `DatabaseFile` configuration.
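+ +To pin the file names down, the generated `virtuoso.ini` needs an explicit `[Database]` section along these lines; the paths here are assumptions for illustration, and would need to match wherever the container keeps virtuoso's state: + +``` +[Database] +DatabaseFile = /var/lib/virtuoso/virtuoso.db +ErrorLogFile = /var/lib/virtuoso/virtuoso.log +LockFile = /var/lib/virtuoso/virtuoso.lck +TransactionFile = /var/lib/virtuoso/virtuoso.trx +xa_persistent_file = /var/lib/virtuoso/virtuoso.pxa +``` + +With the file names fixed, a restart reopens the existing database instead of generating a fresh, hash-prefixed one.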
+ + +## Fixes + +=> https://github.com/genenetwork/gn-gemtext-threads/commit/8211c1e49498ba2f3b578ed5b11b15c52299aa08 Document how to restart checkpointing and the scheduler after bulk loading +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=2dc335ca84ea7f26c6977e6b432f3420b113f0aa Add configs for scheduler and checkpointing +=> https://git.genenetwork.org/guix-bioinformatics/commit/?id=7d793603189f9d41c8ee87f8bb4c876440a1fce2 Set up virtuoso database configurations +=> https://git.genenetwork.org/gn-machines/commit/?id=46a1c4c8d01198799e6ac3b99998dca40d2c7094 Explicitly name virtuoso database files.