# Installation
This document is WIP and still a mixture of old and new docs.
Large system deployments can get very complex. In this document we explain the GeneNetwork reproducible deployment system, which is based on GNU Guix. The Guix system can be used to install GN with all its files and dependencies.
Note that the official deployment works through a Guix VM. This is described in
=> ./deployment
# Check list
To run GeneNetwork the following services need to function:
* [ ] GNU Guix with a guix profile for genenetwork2
* [ ] A path to the (static) genotype files
* [?] Gn-proxy for authentication
* [ ] The genenetwork3 service
* [ ] Redis
* [ ] Mariadb
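Part of the check list above can be verified from a shell. The following is a minimal sketch and not part of GeneNetwork itself: the helper names `check_cmd` and `check_dir` are made up for illustration, the command names are assumptions about what is on your PATH, and the genotype path is the example path used later in this document. It does not cover gn-proxy or genenetwork3.

```shell
#!/bin/sh
# Hypothetical helpers for ticking off the check list; not part of GN2.

# Report whether a command is available on the PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

# Report whether a directory exists.
check_dir() {
    if [ -d "$1" ]; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

check_cmd guix
check_cmd redis-server
check_cmd mariadbd
# Example path from this document; adjust it to your own storage.
check_dir "${GENENETWORK_FILES:-/export/data/genenetwork/genotype_files}"
```

A line starting with `missing:` points at the item that still needs attention.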
# Installing Guix packages
Make sure to install GNU Guix using the binary download instructions on the main website. Note that the download amounts to several GBs of data. Debian-derived distros may support
```
apt-get install guix
```
# Creating a GNU Guix profile
We run a GNU Guix channel with packages at
=> https://gitlab.com/genenetwork/guix-bioinformatics
The README has instructions for hosting a channel (recommended!), but sometimes we use the GUIX_PACKAGE_PATH instead. First upgrade to a recent guix with
```
mkdir ~/opt
guix pull -p ~/opt/guix-pull
```
The upgrade should succeed (you can ignore the locale warnings). You can optionally pin a specific git commit of guix with
```
guix pull -p ~/opt/guix-pull --commit=f04883d
```
which is useful when you need to roll back to an earlier version (sometimes our channel goes out of sync). Next, to install GeneNetwork2, source the new profile and clone our channel:
```
source ~/opt/guix-pull/etc/profile
git clone https://git.genenetwork.org/guix-bioinformatics/guix-bioinformatics.git ~/guix-bioinformatics
```
You probably also need guix-past (the upstream channel for older packages). Clone it and install genenetwork2 into its own profile:
```
git clone https://gitlab.inria.fr/guix-hpc/guix-past.git ~/guix-past
cd ~/guix-past
env GUIX_PACKAGE_PATH=$HOME/guix-bioinformatics:$HOME/guix-past/modules ~/opt/guix-pull/bin/guix package -i genenetwork2 -p ~/opt/genenetwork2
```
Ignore the warnings. Guix should install the software without trying to build everything. If your system insists on building all packages, try the `--dry-run` switch and fix the [[https://guix.gnu.org/manual/en/html_node/Substitute-Server-Authorization.html][substitutes]]. You may add the `--substitute-urls="http://guix.genenetwork.org https://ci.guix.gnu.org https://mirror.hydra.gnu.org"` switch.
The guix.genenetwork.org server has most of our packages pre-built. To use it on your own machine, authorize its public key:
```
(public-key
(ecc
(curve Ed25519)
(q #9F56EAB5CE37AA15693C31F451140588240F259676C137E31C0CA70EC4D1B534#)
)
)
```
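For the Guix daemon to accept substitutes from that server, the key has to be authorized. A minimal sketch, assuming the key above is saved to a file first (the file name `gn-substitutes.pub` and the temporary directory are arbitrary choices) and that the final, commented-out step is run as root:

```shell
#!/bin/sh
# Save the public key shown above to a file (the file name is an arbitrary choice).
keydir=$(mktemp -d)
keyfile="$keydir/gn-substitutes.pub"
cat > "$keyfile" <<'EOF'
(public-key
(ecc
(curve Ed25519)
(q #9F56EAB5CE37AA15693C31F451140588240F259676C137E31C0CA70EC4D1B534#)
)
)
EOF
echo "key written to $keyfile"

# Then, as root, authorize the key for the Guix daemon:
#   guix archive --authorize < "$keyfile"
```

After authorizing the key, rerun the install with `--dry-run` to see whether packages are now downloaded rather than built.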
Once we have a GNU Guix profile, a running database (see below) and the file storage,
we should be ready to fire up GeneNetwork:
# Running GN2
Check out the source with git:
```
git clone git@github.com:genenetwork/genenetwork2.git
cd genenetwork2
```
You may want to use the testing branch.
Run GN2 with the Guix profile created earlier:
```
export GN2_PROFILE=$HOME/opt/genenetwork2
env TMPDIR=$HOME/tmp WEBSERVER_MODE=DEBUG LOG_LEVEL=DEBUG SERVER_PORT=5012 GENENETWORK_FILES=/export/data/genenetwork/genotype_files SQL_URI=mysql://webqtlout:webqtlout@localhost/db_webqtl ./bin/genenetwork2 etc/default_settings.py -gunicorn-dev
```
The script comes with debug and logging switches that can be particularly useful when
developing GN2. The locations and files shown are examples.
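Because the command line above is long, it can help to assemble the settings in a small wrapper. The following is only a sketch: the function name `gn2_sql_uri` is hypothetical, the values are the example values used above, and special characters in the password are not escaped.

```shell
#!/bin/sh
# Hypothetical helper: build a SQL_URI of the form used above,
# mysql://USER:PASSWORD@HOST/DATABASE
gn2_sql_uri() {
    printf 'mysql://%s:%s@%s/%s\n' "$1" "$2" "$3" "$4"
}

# Example values from this document; replace them with your own.
GN2_PROFILE=$HOME/opt/genenetwork2
GENENETWORK_FILES=/export/data/genenetwork/genotype_files
SQL_URI=$(gn2_sql_uri webqtlout webqtlout localhost db_webqtl)
export GN2_PROFILE GENENETWORK_FILES SQL_URI

echo "SQL_URI=$SQL_URI"
# From the genenetwork2 checkout the server would then be started with:
#   env TMPDIR=$HOME/tmp SERVER_PORT=5012 ./bin/genenetwork2 etc/default_settings.py -gunicorn-dev
```

Keeping the settings in exported variables makes it easier to switch between a test database and the full one without retyping the whole command.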
It may be useful to reach the web server from your local browser through an ssh tunnel.
## Testing on an ssh tunnel
If you want to test a service running on the server on a certain port (say 8202) use
ssh -L 8202:127.0.0.1:8202 -f -N myname@penguin2.genenetwork.org
And browse on your local machine to http://localhost:8202/
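If you tunnel to several ports regularly, the command can be wrapped in a small function. This is a sketch: `gn_tunnel_cmd` is a made-up name, and it only prints the ssh command so that you can inspect it before running it.

```shell
#!/bin/sh
# Hypothetical helper: print the ssh command that forwards a local port
# to the same port on the server. Usage: gn_tunnel_cmd PORT USER@HOST
gn_tunnel_cmd() {
    port=$1
    target=$2
    case $port in
        ''|*[!0-9]*)
            echo "port must be a number: $port" >&2
            return 1
            ;;
    esac
    echo "ssh -L $port:127.0.0.1:$port -f -N $target"
}

gn_tunnel_cmd 8202 myname@penguin2.genenetwork.org
```

The printed line is the same command as above, with the port filled in on both sides of the forward.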
# BELOW INFORMATION NEEDS TO BE UPDATED
* Run gn-proxy
GeneNetwork requires a separate gn-proxy server which handles
authorisation and access control. For instructions see the
[[https://github.com/genenetwork/gn-proxy][README]]. Note it may already be running on our servers!
* Run Redis
Redis is part of the GN2 deployment and will be started by the ./bin/genenetwork2
startup script.
* Run MariaDB server
** Install MariaDB with GNU Guix
These are the steps you can take to set up a fresh installation of
MariaDB (which comes as part of the GNU Guix genenetwork2 install).
As root, configure the Guix profile. Previously that was
```
. ~/opt/genenetwork2/etc/profile
```
But now we use the recommended
```
/usr/local/guix-profiles/guix-pull/bin/guix install mariadb borg -p /usr/local/guix-profiles/gn-latest
. /usr/local/guix-profiles/gn-latest/etc/profile
```
Extract the database (this takes a while too)
```
/usr/local/guix-profiles/gn-latest/bin/borg extract /export2/data/wrk/tux01/borg-tux01::borg-backup-mariadb-20240218-06:16-Sun --progress
```
and run for example
```
adduser mariadb
addgroup mariadb   # and add the mariadb user to this group
mkdir -p /export2/mariadb/database
chown -R mariadb:mariadb /export2/mariadb/
mkdir -p /var/run/mysqld
chown mariadb:mariadb /var/run/mysqld
su mariadb
. /usr/local/guix-profiles/gn-latest/etc/profile
mariadb --version
# mariadb Ver 15.1 Distrib 10.10.2-MariaDB, for Linux (x86_64) using readline 5.1
mariadb-install-db --user=mariadb --datadir=/export2/mariadb/database
# The datadir must be the directory that was initialized in the previous step
mariadbd -u mariadb --datadir=/export2/mariadb/database --explicit_defaults_for_timestamp -P 12048
```
If you want to run as root you may have to set the following in /etc/my.cnf
```
[mariadbd]
user=root
```
You also need to set
```
ft_min_word_len = 3
```
to make sure text searches for short words (such as shh) work. Rebuild the
tables if required.
To check error output in a file on start-up run with something like
```
mariadbd -u mariadb --console --explicit_defaults_for_timestamp --datadir=/export/mariadb/tux01_mariadb/latest --log-error=$HOME/test.log
/usr/local/guix-profiles/gn-latest/bin/mariadb -uwebqtlout -pwebqtlout db_webqtl -e 'show tables'
```
When you get errors like:
```
sqlalchemy.exc.IntegrityError: (_mariadb_exceptions.IntegrityError) (1215, 'Cannot add foreign key constraint')
```
you may need to set
```
set foreign_key_checks=0
```
The current production my.cnf is
```
[mysqld]
# innodb_empty_free_list_algorithm=backoff
innodb_buffer_pool_size=16G
# innodb_ibuf_max_size=2G
innodb_ft_min_token_size=3
# innodb_use_sys_malloc=0
innodb_file_per_table=ON
key_buffer_size=10M
ft_min_word_len = 3
# main = 1 for active master: server A
gtid-domain-id=1
tmpdir=/export/tmp
wait_timeout=180
lc_time_names=en_US
lc_messages=en_US
max_connections=2048
thread_cache_size=16
open_files_limit = 16384
query_cache_type=1
query_cache_min_res_unit = 1k
query_cache_limit = 1M
query_cache_size=128M
log_error=/var/log/mysql/error.log
skip-name-resolve
# Skip recovery for now:
innodb_force_recovery=1
# Only when listening to the network!
# bind-address = 0.0.0.0
# port = 3306
slow_query_log=1
slow_query_log_file=/var/log/mysql/mysql-slow.log
long_query_time=60.0
log_queries_not_using_indexes=0
log_warnings=4
log_slow_admin_statements=ON
ft_min_word_len=3
log-bin=/var/lib/mysql/gn0-binary-log
expire-logs-days=120
server-id=1
# Domain = 1 for active master: server A
gtid-domain-id=1
[myisamchk]
sort_buffer_size=4M
ft_min_word_len = 3
```
Note that we handle IP restrictions through the nftables firewall.
The systemd config is
```
[Unit]
Description=MariaDB database server
Documentation=man:mysqld(8)
Documentation=https://mariadb.com/kb/en/library/systemd/
After=network.target
[Install]
WantedBy=multi-user.target
Alias=mysqld.service
[Service]
TimeoutStartSec=infinity
TimeoutStopSec=infinity
LimitNOFILE=infinity
LimitMEMLOCK=infinity
Type=simple
PrivateNetwork=false
User=mariadb
Group=mariadb
CapabilityBoundingSet=CAP_IPC_LOCK
# Prevent writes to /usr, /boot, and /etc
ProtectSystem=true
PrivateDevices=true
# Prevent accessing /home, /root and /run/user
ProtectHome=false
# Execute pre and post scripts as root, otherwise it does it as User=
PermissionsStartOnly=true
ExecStartPre=/usr/bin/install -m 755 -o mariadb -g root -d /var/run/mysqld
ExecStart=/usr/local/guix-profiles/gn-latest/bin/mariadbd --datadir=/export/mariadb/tux01_mariadb/latest $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION -W
ExecStartPost=/bin/sh -c "systemctl unset-environment _WSREP_START_POSITION"
KillSignal=SIGTERM
SendSIGKILL=no
Restart=on-abort
RestartSec=15s
UMask=007
PrivateTmp=false
```
** Load the small database in MySQL
Currently we have two databases for deployment:
'db_webqtl_s' is the small testing database containing experiments
from BXD mice, and 'db_webqtl_plant' contains all plant-related
material.
Download one database from
http://ipfs.genenetwork.org/ipfs/QmRUmYu6ogxEdzZeE8PuXMGCDa8M3y2uFcfo4zqQRbpxtk
After installation, unpack the database binary in the MySQL directory
#+BEGIN_SRC sh
cd ~/mysql
p7zip -d db_webqtl_s.7z
chown -R mysql:mysql db_webqtl_s/
chmod 700 db_webqtl_s/
chmod 660 db_webqtl_s/*
#+END_SRC
Restart the MySQL service (mysqld). Log in as root
: mysql_upgrade -u root --force
: mysql -u root
and
: mysql> show databases;
: +--------------------+
: | Database |
: +--------------------+
: | information_schema |
: | db_webqtl_s |
: | mysql |
: | performance_schema |
: +--------------------+
Set permissions and match password in your settings file below:
: mysql> grant all privileges on db_webqtl_s.* to gn2@"localhost" identified by 'webqtl';
You may need to change "localhost" to whatever domain you are
connecting from (mysql will give an error).
Note that if the mysql connection is not working, try connecting to
the IP address and check server firewall, hosts.allow and mysql IP
configuration (see below).
Note for the plant database you can rename it to db_webqtl_s, or
change the settings in etc/default_settings.py to match your path.
* Get genotype files
The script looks for genotype files. You can find them in
http://ipfs.genenetwork.org/ipfs/QmXQy3DAUWJuYxubLHLkPMNCEVq1oV7844xWG2d1GSPFPL
#+BEGIN_SRC sh
mkdir -p $HOME/genotype_files
cd $HOME/genotype_files
#+END_SRC
* GN2 Dependency Graph
List of all runtime dependencies for GN2 as installed by GNU Guix.
https://genenetwork.org/environments/
* Working with the GN2 source code
See [[development.org]].
* Read more
If you want to understand the architecture of GN2 read
[[Architecture.org]]. The rest of this document is mostly on deployment
of GN2.
* Troubleshooting
** ImportError: No module named jinja2
If you have all the Guix packages installed, this error points out that
the environment variables are not set. Copy-paste the paths into your
terminal (mainly so PYTHONPATH and R_LIBS_SITE are set) from the
information given by guix:
: guix package --search-paths
On one system:
: export PYTHONPATH="$HOME/.guix-profile/lib/python3.8/site-packages"
: export R_LIBS_SITE="$HOME/.guix-profile/site-library/"
: export GEM_PATH="$HOME/.guix-profile/lib/ruby/gems/2.2.0"
and perhaps a few more.
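A quick way to see which of these variables are still unset is a small check like the one below. It is a sketch: `report_unset` is a made-up helper, and the variable list is just the three named above.

```shell
#!/bin/sh
# Hypothetical helper: print the names of the given variables that are
# unset or empty in the current shell.
report_unset() {
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
        fi
    done
}

report_unset PYTHONPATH R_LIBS_SITE GEM_PATH
# To set them from a profile, evaluate the output of guix, for example:
#   eval "$(guix package --search-paths -p "$HOME/opt/genenetwork2")"
```

Any name that is printed still needs to be exported before starting GN2. The profile path in the comment is the example profile used earlier in this document.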
** ERROR: 'can not find directory $HOME/gn2_data' or 'can not find directory $HOME/genotype_files/genotype'
The default settings file looks in your $HOME/gn2_data. Since these
files come with a Guix installation you should take a hint from the
values in the installed version of default_settings.py (see above in
this document).
You can use the GENENETWORK_FILES switch to set the datadir, for example
: env GN2_PROFILE=~/opt/gn-latest GENENETWORK_FILES=/gnu/data/gn2_data ./bin/genenetwork2
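Before starting the server you can check that the directory is laid out as expected. The sketch below assumes, based only on the error message above, that a `genotype` subdirectory is required; `check_genotype_dir` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: verify that a data directory exists and contains
# the "genotype" subdirectory mentioned in the error message.
check_genotype_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "can not find directory $dir" >&2
        return 1
    fi
    if [ ! -d "$dir/genotype" ]; then
        echo "can not find directory $dir/genotype" >&2
        return 1
    fi
    echo "found $dir/genotype"
}

# Falls back to the example location used earlier in this document.
check_genotype_dir "${GENENETWORK_FILES:-$HOME/genotype_files}" || true
```

If the check fails, point GENENETWORK_FILES at the directory that actually holds the files and rerun it.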
** Can't run a module
In rare cases, development modules are not brought in with Guix
because no source code is available. This can lead to missing modules
on a running server. Please check with the authors when a module
is missing.
** Rpy2 error 'show' not found
This error
: __show = rpy2.rinterface.baseenv.get("show")
: LookupError: 'show' not found
means that R was updated in your path, and that Rpy2 needs to be
recompiled against this R - don't you love informative messages?
In our case it means that GN's PYTHONPATH is not in sync with
R_LIBS_SITE. Please check your GNU Guix GN2 installation paths;
you may need to reinstall. Note that this may be the point at which
you want to start using profiles (see profile section).
** Mysql can't connect server through socket ERROR
The following error
: sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (2002, 'Can\'t connect to local MySQL server through socket \'/run/mysqld/mysqld.sock\' (2 "No such file or directory")')
means that the MySQL client is trying to connect locally to a non-existent MySQL
server socket, something you may see in a container. It can typically be reproduced with something like
: mysql -h localhost
Try to connect over the network interface instead, e.g.
: mysql -h 127.0.0.1
If that works, run genenetwork after setting SQL_URI to something like
: export SQL_URI=mysql://gn2:mysql_password@127.0.0.1/db_webqtl_s
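The switch from the socket to the network interface can also be scripted. A minimal sketch, assuming the URI has the `mysql://user:password@host/database` form shown above; `sql_uri_use_tcp` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: replace the host "localhost" in a SQL_URI by
# 127.0.0.1, so that the client connects over TCP instead of the Unix socket.
sql_uri_use_tcp() {
    printf '%s\n' "$1" | sed 's|@localhost\([:/]\)|@127.0.0.1\1|'
}

sql_uri_use_tcp "mysql://gn2:mysql_password@localhost/db_webqtl_s"
```

A URI that already names an IP address or another host is passed through unchanged.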
* NOTES
** Deploying GN2 official
Let's see how fast we can deploy a second copy of GN2.
- [ ] Base install
+ [ ] First install a Debian server with GNU Guix on board
+ [ ] Get Guix build going
- [ ] Build the correct version of Guix
- [ ] Check out the correct gn-stable version of guix-bioinformatics http://git.genenetwork.org/pjotrp/guix-bioinformatics
- [ ] guix package -i genenetwork2 -p /usr/local/guix-profiles/gn2-stable
+ [ ] Create a gn2 user and home with space
+ [ ] Install redis
- [ ] add to systemd
- [ ] update redis.cnf
- [ ] update database
+ [ ] Install mariadb (currently debian mariadb-server)
- [ ] add to systemd
- [ ] systemctl stop mysql
- [ ] update mysql.cnf
- [ ] update database (see gn-services/services/mariadb.md)
- [ ] check tables
+ [ ] run gn2
+ [ ] update nginx
+ [ ] install genenetwork3
- [ ] add to systemd