#+TITLE: Installing GeneNetwork services

* Table of Contents                                                     :TOC:
 - [[#introduction][Introduction]]
 - [[#install][Install]]
   - [[#tarball][Tarball]]
   - [[#docker][Docker]]
   - [[#with-source][With source]]
 - [[#running-gn2][Running GN2]]
 - [[#run-mysql-server][Run MySQL server]]
   - [[#install-mysql-with-gnu-guix][Install MySQL with GNU Guix]]
   - [[#load-the-small-database-in-mysql][Load the small database in MySQL]]
 - [[#gn2-dependency-graph][GN2 Dependency Graph]]
 - [[#working-with-the-gn2-source-code][Working with the GN2 source code]]
 - [[#trouble-shooting][Trouble shooting]]
   - [[#importerror-no-module-named-jinja2][ImportError: No module named jinja2]]
   - [[#error-can-not-find-directory-homegn2_data][ERROR: can not find directory $HOME/gn2_data]]
   - [[#cant-run-a-module][Can't run a module]]
   - [[#rpy2-error-show-now-found][Rpy2 error 'show' now found]]
   - [[#mysql-cant-connect-server-through-socket-error][Mysql can't connect server through socket ERROR]]
 - [[#read-more][Read more]]
 - [[#irc-session][IRC session]]

* Introduction

Large system deployments can get very [[http://biogems.info/contrib/genenetwork/gn2.svg][complex]]. In this document we
explain the GeneNetwork version 2 (GN2) reproducible deployment system,
which is based on GNU Guix (see also [[https://github.com/pjotrp/guix-notes/blob/master/README.md][Guix-notes]]). The Guix system can
be used to install GN with all its files and dependencies.

The official installation path is from a checked-out version of the
main Guix package tree and that of the GeneNetwork package tree. The
currently supported versions are the SHA values of the 'gn-latest'
branches of [[https://gitlab.com/genenetwork/guix-bioinformatics][Guix bioinformatics]] and [[https://gitlab.com/genenetwork/guix][GNU Guix]].

For a full view of runtime dependencies as defined by GNU Guix, see
an example of the [[#gn2-dependency-graph][GN2 Dependency Graph]].
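
As a rough sketch of how to look up those SHA values, 'git ls-remote'
can print the commit a branch points at without cloning the
repository. The helper name 'branch_sha' is made up for this example,
and it assumes the 'gn-latest' branches still exist under the URLs
given above.

#+begin_src shell
# Hypothetical helper (not part of GN2): print the commit SHA that a
# branch of a remote (or local) git repository currently points at.
branch_sha() {
    # $1 = repository URL or path, $2 = branch name
    # ls-remote prints "<sha><TAB><ref>"; keep only the first field.
    git ls-remote "$1" "refs/heads/$2" | cut -f1
}

# Example (needs network access):
#   branch_sha https://gitlab.com/genenetwork/guix-bioinformatics gn-latest
#   branch_sha https://gitlab.com/genenetwork/guix gn-latest
#+end_src

An empty result means the branch was not found in that repository.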

* Install

The quickest way to install GN2 is by using a binary installation
(tarball or Docker image). These installations are bundled by GNU
Guix and include all dependencies. You can install GeneNetwork on
most Linux distributions, including Debian, Ubuntu, Fedora and CentOS,
provided you have administrator privileges (root). The alternative is
a Docker installation.

** Tarball

Download the roughly 800 MB tarball from
[[http://files.genenetwork.org/software/binary_tarball/]]. Validate the
checksum and unpack to root, for example

: tar xvzf genenetwork2-2.10rc3-1538ffd-tarball-pack.tar.gz
: mv gnu /
: mv opt/genenetwork2 /opt/

Now you should be able to start the server with

: /opt/genenetwork2/bin/genenetwork2

If the server stops with a MySQL error, [[#run-mysql-server][run a MySQL server]] and set
SQL_URI to point at it. For example:

: export SQL_URI=mysql://gn2:mysql_password@127.0.0.1/db_webqtl_s

See also [[#mysql-cant-connect-server-through-socket-error][Mysql can't connect server through socket ERROR]].

** Docker

Docker images are also available through
[[http://files.genenetwork.org/software/]]. Validate the checksum and
load the image with [[https://docs.docker.com/engine/reference/commandline/load/][Docker load]].

** With source

For more elaborate installation instructions on deploying GeneNetwork from
source see [[#source-deployment][Source deployment]].

* Running GN2

Default settings for GN2 are listed in a file called
[[../etc/default_settings.py][default_settings.py]]. You can copy this file and pass it as a new
parameter to the genenetwork2 command, e.g.

: genenetwork2 mysettings.py

or you can set environment variables to override individual parameters, e.g.

: env SERVER_PORT=5004 SQL_URI=mysql://user:pwd@dbhostname/db_webqtl genenetwork2

The debug and logging switches can be particularly useful when
developing GN2.
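
The environment override above can also be prepared in a small wrapper
script. The following is only a sketch: the function name
'make_sql_uri' is made up for this example, and the user, password,
host and database are the placeholder values used elsewhere in this
document.

#+begin_src shell
# Hypothetical helper: assemble a SQL_URI from its parts.
# Note: a password containing characters such as '@' or '/' would need
# URL-encoding first; that is not handled here.
make_sql_uri() {
    # $1 = user, $2 = password, $3 = host, $4 = database
    printf 'mysql://%s:%s@%s/%s\n' "$1" "$2" "$3" "$4"
}

SQL_URI=$(make_sql_uri gn2 mysql_password 127.0.0.1 db_webqtl_s)
SERVER_PORT=5004
export SQL_URI SERVER_PORT

# Show what the server would see; start it afterwards with: genenetwork2
echo "SQL_URI=$SQL_URI"
echo "SERVER_PORT=$SERVER_PORT"
#+end_src

Exporting the variables rather than prefixing them with 'env' keeps
them available for repeated runs in the same shell.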

* Run MySQL server
** Install MySQL with GNU Guix

These are the steps you can take to set up a fresh installation of
MySQL (which comes as part of the GNU Guix genenetwork2 install).

As root configure and run

: adduser mysql && addgroup mysql
: mysqld --datadir=/var/mysql --initialize-insecure
: mkdir -p /var/run/mysqld
: chown mysql.mysql ~/mysql /var/run/mysqld
: mysqld -u mysql --datadir=/var/mysql --explicit_defaults_for_timestamp -P 12048

If you want to run as root you may have to set

: /etc/my.cnf
: [mysqld]
: user=root

To write error output to a file on start-up, run with something like

: mysqld -u mysql --console --explicit_defaults_for_timestamp --datadir=/gnu/mysql --log-error=$HOME/test.log

Another tip: Guix installs mysqld in your profile, so this may work

: /home/user/.guix-profile/bin/mysqld -u mysql --explicit_defaults_for_timestamp --datadir=/gnu/mysql

When you get errors like:

: sqlalchemy.exc.IntegrityError: (_mysql_exceptions.IntegrityError) (1215, 'Cannot add foreign key constraint')

you may need to set

: set foreign_key_checks=0

** Load the small database in MySQL

At this point we require the underlying distribution to install and
run mysqld (see the previous section for GNU Guix). Currently we have
two databases for deployment: 'db_webqtl_s' is the small testing
database containing experiments from BXD mice, and 'db_webqtl_plant'
contains all plant related material.

Download one database from

[[http://files.genenetwork.org/raw_database/]]

[[https://s3.amazonaws.com/genenetwork2/db_webqtl_s.zip]]

Check the md5sum.

After installation inflate the database binary in the MySQL
directory

: cd ~/mysql
: chown -R mysql:mysql db_webqtl_s/
: chmod 700 db_webqtl_s/
: chmod 660 db_webqtl_s/*

and restart the MySQL service (mysqld).
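
Before inflating the download, the 'Check the md5sum' step above can
be sketched as a small helper. The function name 'verify_md5' and the
EXPECTED placeholder are assumptions for illustration; use the
checksum published next to the download.

#+begin_src shell
# Hypothetical helper: succeed only if the MD5 checksum of a file
# matches the expected value. Relies on md5sum from GNU coreutils.
verify_md5() {
    # $1 = file, $2 = expected checksum (hex digits)
    actual=$(md5sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}

# Example; replace EXPECTED with the published checksum:
#   verify_md5 db_webqtl_s.zip "$EXPECTED" && echo "checksum OK"
#+end_src

Only unpack the archive when the helper succeeds; a mismatch usually
means an incomplete or corrupted download.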

Log in as root

: mysql -u root

and

: mysql> show databases;
: +--------------------+
: | Database           |
: +--------------------+
: | information_schema |
: | db_webqtl_s        |
: | mysql              |
: | performance_schema |
: +--------------------+

Set permissions and match the password in your settings file below:

: mysql> grant all privileges on db_webqtl_s.* to gn2@"localhost" identified by 'mysql_password';

You may need to change "localhost" to whatever domain you are
connecting from (mysql will give an error).

Note that if the mysql connection is not working, try connecting to
the IP address and check the server firewall, hosts.allow and the
mysql IP configuration (see below).

Note that you can rename the plant database to db_webqtl_s, or
change the settings in etc/default_settings.py to match your path.

* GN2 Dependency Graph

Graph of all runtime dependencies as installed by GNU Guix.

#+ATTR_HTML: :title GN2_graph
http://biogems.info/contrib/genenetwork/gn2.svg

* Working with the GN2 source code

See [[development.org]].

* Trouble shooting

** ImportError: No module named jinja2

If you have all the Guix packages installed, this error means that
the environment variables are not set. Copy-paste the paths into
your terminal (mainly so PYTHONPATH and R_LIBS_SITE are set) from
the information given by guix:

: guix package --search-paths

On one system:

: export PYTHONPATH="$HOME/.guix-profile/lib/python2.7/site-packages"
: export R_LIBS_SITE="$HOME/.guix-profile/site-library/"
: export GEM_PATH="$HOME/.guix-profile/lib/ruby/gems/2.2.0"

and perhaps a few more.

** ERROR: can not find directory $HOME/gn2_data

The default settings file looks in your $HOME/gn2_data. Since these
files come with a Guix installation you should take a hint from the
values in the installed version of default_settings.py (see above in
this document).

** Can't run a module

In rare cases, development modules are not brought in with Guix
because no source code is available.
This can lead to missing modules on a running server. Please check
with the authors when a module is missing.

** Rpy2 error 'show' now found

This error

: __show = rpy2.rinterface.baseenv.get("show")
: LookupError: 'show' not found

means that R was updated in your path, and that Rpy2 needs to be
recompiled against this R - don't you love informative messages?

In our case it means that GN's PYTHONPATH is not in sync with
R_LIBS_SITE. Please check your GNU Guix GN2 installation paths;
you may need to reinstall. Note that this may be the point where you
want to start using profiles (see profile section).

** Mysql can't connect server through socket ERROR

The following error

: sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (2002, 'Can\'t connect to local MySQL server through socket \'/run/mysqld/mysqld.sock\' (2 "No such file or directory")')

means that MySQL is trying to connect locally to a non-existent MySQL
server, something you may see in a container. It can typically be
reproduced with something like

: mysql -h localhost

Try to connect over the network interface instead, e.g.

: mysql -h 127.0.0.1

If that works, run genenetwork after setting SQL_URI to something like

: export SQL_URI=mysql://gn2:mysql_password@127.0.0.1/db_webqtl_s

* Running ElasticSearch

In order to start up elasticsearch:

 - Penguin: change user to "elasticsearch" and use the following command
   : env JAVA_HOME=/opt/jdk-9.0.4 /opt/elasticsearch-6.2.1/bin/elasticsearch
 - New server: as root run
   : systemctl restart elasticsearch

* Read more

If you want to understand the architecture of GN2 read
[[Architecture.org]]. The rest of this document is mostly on deployment
of GN2.

* IRC session

Here is an IRC session in which we installed GN2 from scratch using
GNU Guix and a download of the test database.

#+begin_src
time to get binary install sorted :)  [07:03]
Guix is designed for distributed installation servers
we have one on guix.genenetwork.org
it contains all the prebuild packages for GN
okay  [07:04]
let's step back however  [07:05]
I presume the environment is set with all guix package --search-paths
right?
yep
set to the ones in ~/.guix-profile/
good, and you are in gn-deploy-guix repo  [07:06]
yep  [07:07]
git log shows
Author: David Thompson
Date: Sun Mar 27 21:20:19 2016 -0400
yes
env GUIX_PACKAGE_PATH=../guix-bioinformatics ./pre-inst-env guix package -A genenetwork2  [07:08]
shows
genenetwork2 2.0-a8fcff4 out ../guix-bioinformatics/gn/packages/genenetwork.scm:144:2
genenetwork2-database-small 1.0 out ../guix-bioinformatics/gn/packages/genenetwork.scm:270:4
genenetwork2-files-small 1.0 out ../guix-bioinformatics/gn/packages/genenetwork.scm:228:4
yeah  [07:09]
OK, we are in sync. This means we should be able to install the exact same software
I need to start up my guix daemon - I usually run it in a screen
screen -S guix-daemon
hah, I don't have screen installed yet  [07:11]
comes with guix ;)  [07:12]
no worries, you can run it any way you want
$HOME/.guix-profile/bin/guix-daemon --build-users-group=guixbuild
then something's weird, because it says I don't have it
oh, you need to install it first  [07:13]
guix package -A screen
screen 4.3.1 out gnu/packages/screen.scm:34:2
but you can skip this install, for now
alright  [07:14]
env GUIX_PACKAGE_PATH=../guix-bioinformatics ./pre-inst-env guix package -i genenetwork2 --dry-run
substitute: updating list of substitutes from 'https://mirror.hydra.gnu.org'... 79.1%
you see that? followed by  [07:15]
substitute: updating list of substitutes from 'https://hydra.gnu.org'...
100.0%
The following derivations would be built:
/gnu/store/rk7nw0rjqqsha958m649wrykadx6mmhl-profile.drv
/gnu/store/7b0qjybvfx8syzvfs7p5rdablwhbkbvs-module-import-compiled.drv
/gnu/store/cy9zahbbf23d3cqyy404lk9f50z192kp-module-import.drv
/gnu/store/ibdn603i8grf0jziy5gjsly34wx82lmk-gtk-icon-themes.drv
which should have the same HASH values /gnu/store/7b0qjybvf... etc.  [07:16]
profile has a different hash
but the next ones?
they're the same
not sure why profile differs. Do you see the contact with mirror.hydra.org?  [07:17]
yeah
OK, that means you set the key correctly for that one :)
alright
we are at the same state now. You can see most packages need to be rebuild
because they are no longer cached as binaries on hydra  [07:18]
things move fast...
hehe
let me also do the same on my laptop - which I have staged before  [07:19]
btw, to set the path I often do  [07:20]
export PATH="/home/wrk/.guix-profile/bin:/home/wrk/.guix-profile/sbin":$PATH
to keep things like 'screen' from Debian
Once past building guix itself that is normally OK  [07:21]
ah, okay
will do that
the guix build requires certain versions of tools, so you don't want to mix foreign tools in  [07:23]
makes sense  [07:24]
On my laptop I am trying the main
updating list of substitutes from 'http://hydra.gnu.org'... 10.5%  [07:27]
it is a bit slow, but let's see if there is a difference with the mirror
you can see there are two servers here. Actually with recent daemons, if the mirror fails it will try the main server  [07:28]
I documented the use of a caching server here  [07:29]
https://github.com/pjotrp/guix-notes/blob/master/REPRODUCIBLE.org
this is exactly what we are doing now
alrighty  [07:35]
To see if a remote server has a guix server running it should respond  [07:36]
lynx http://guix.genenetwork.org:8080 --dump
Resource not found: /
you see that?
yes  [07:37]
good. The main hydra server is too slow.
So on my laptop I forced using the mirror with  [07:38]
env GUIX_PACKAGE_PATH=../guix-bioinformatics/ ./pre-inst-env guix package -i genenetwork2 --dry-run --substitute-urls="http://mirror.hydra.gnu.org"
the list looks the same to me  [07:40]
me too
note that some packages will be built and some downloaded, right?  [07:41]
yes
atlas is actually a binary on my system  [07:43]
I mean in that list
so, it should not build. Same as yours?
yeah, atlas and r-gtable are the ones to be downloaded
You should not have seen that error ;)
we should try and install it this way, try  [07:44]
env GUIX_PACKAGE_PATH=../guix-bioinformatics ./pre-inst-env guix package -i genenetwork2 --cores=4 --max-jobs=4 --keep-going  [07:46]
set CPUs and max-jobs to something sensible
Does your VM have multiple cores?
note you can always press Ctrl-C during install
it doesn't, I'll reboot it and give it another core  [07:47]
Hey  [07:48]
I'm here
Will be stepping away for some breakfast
Can you do the same as us
Can you see the irc log
Alright
Yes, I can
Please email me a copy in five minutes
user01: so when I use the GN server  [07:56]
env GUIX_PACKAGE_PATH=../guix-bioinformatics ./pre-inst-env guix package -i genenetwork2 --dry-run --substitute-urls=http://guix.genenetwork.org:8080
I don't need to build anything  [07:57]
(this won't work for you, yet)
to get it to work you need to 'trust' it  [07:58]
but, first get the build going
I'll have a coffee while you and get building
yeah it's doing its thing now  [08:01]
cool  [08:02]
in a separate terminal you can try and install with the gn mirror  [08:05]
I'll send you the public key and you can paste it as said
https://github.com/pjotrp/guix-notes/blob/master/REPRODUCIBLE.org  [08:06]
alright
should be in the E-mail  [08:09]
getting it working it kinda nasty since the server gives no feedback
it works when you see no more in the build list ;)  [08:11]
btw, you can install software in parallel. Guix does that.
even the same packages
so keep building ;)
try and do this with Debian...
coffee for me  [08:12]
the first build failed  [08:15]
OK, Dennis fixed that one yesterday  [08:27]
the problem is that sometime source tarballs disappear  [08:28]
R is notorious for that
haha, that's inconvenient..
well, it is good that Guix catches them
but we do not cache sources
binaries are cached - to some degree - so we don't have to rebuild those  [08:29]
time to use the guix cache at guix.genenetwork.org
try and install the key (it is in the E-mail) and see what this lists  [08:31]
env GUIX_PACKAGE_PATH=../guix-bioinformatics ./pre-inst-env guix package -i genenetwork2 --substitute-urls=http://guix.genenetwork.org --dry-run
should be all binary installs
it's not..  [08:32]
if I remove --substitute-urls, the list changes, does that mean I have the key set up correctly at least?  [08:33]
dunno  [08:35]
how many packages does it want to build?
should be zero
four
Ah, that is OK - those are default profile things
genenetwork2 is among the ones to be downloaded
so  [08:36]
remove --dry-run
yeah, good sign :)
we'll still hit a snag, but run it
should be fast
doing it  [08:37]
it worked!  [08:38]
I think  [08:39]
heh  [08:40]
you mean it is finished?
yep
type genenetwork2
complains about not being able to connect to the database  [08:41]
last snag :)
no database
well, we succeeded in installing a same-byte install of a very complex system :)  [08:42]
(always take time to congratulate yourself)
now we need to install mysql
hehe :)
this can be done throug guix or through debian  [08:43]
the latter is a bit easier here, so let's do that
fun note: you can mix debian and guix
Follow instructions on  [08:44]
https://github.com/genenetwork/genenetwork2/tree/staging/doc#run-mysql-server
apt-get install mysql-common  [08:45]
may do it
You can also install with guix, but I need to document that
btw your internet must be fast :)  [08:46]
hehe it is ;)
when the database is installed  [08:48]
be sure to set the password as instructed  [08:50]
when mysql is set the genenetwork2 command should fire up the web server on localhost:5003  [08:58]
btw my internet is way slower :)  [09:00]
I'm back  [09:04]
fixed router firmware upgrade problem
unbricking
tssk  [09:07]
I'll never leave routers to update themselves again  [09:08]
self-brick highway
Resuming  [09:09]
auto-updates are evil
always switch them off
user02: can you install genenetwork like user has done?  [09:10]
pretty well documented here now :)
Yes I can  [09:11]
Already installed key
user02: you are getting binary packages only now?  [09:13]
That's the sanest way to go now
seriously
everything should be pre-built from guix.genenetwork.org
you are downloading?
yes  [09:15]
cool. Maybe an idea to set up a server for your own use
Stuck at downloading preprocesscore
should not  [09:24]
what does
env GUIX_PACKAGE_PATH=../guix-bioinformatics/ ./pre-inst-env guix package -i genenetwork2 --substitute-urls="http://guix.genenetwork.org" --dry-run  [09:25]
say for r-prepocesscore
download or build?
mine says download  [09:26]
it only lists the derivatives to be built
nothing else happens  [09:27]
OK, so there is a problem
your key may not be working
everything should be listed as 'to be download'  [09:28]
Hmm
Ah
I know where I messed up
where?
I did add the key
However
(I am documenting)
I did not tell guix to trust it
yes
and there is another potential problem
Remember the documentation on installing guix?
You have to tell guix to trust the default key  [09:29]
Right?
So in this case
read the IRC log
That step is mandatory
user01: how are you doing?
user02: https://github.com/pjotrp/guix-notes/blob/master/REPRODUCIBLE.org#using-gnu-guix-archive  [09:30]
a little bit left on the db download
user02: you should see no more building
user02: another issue may be that you updated r-preprocesscore package in guix-buinformatics  [09:32]
all downstream packages will want to rebuild
no, not really
It's not even installed
checkout a branch of the the old version - make sure we are in synch
should be at /gnu/store/y1f3r2xs3fhyadd46nd2aqbr2p9qv2ra-r-biocpreprocesscore-1.32.0  [09:33]
pjotrp: Possibly we should use the archive utility of Guix to do deployment to avoid such out-of-sync differences :)  [09:34]
maybe. I did not get archive to update profiles properly  [09:37]
Also it is good that they get to understand guix this way
carved in stone, eh  [09:38]
Yeah, all good  [09:39]
My mistake was skipping the guix archive part
Can we begin with the install?
It's telling me of derivatives that will be downloaded  [09:40]
So we're good
Here goes
yeeha  [09:42]
pjotrp, where is this guix.genenetwork.org located at?
Tennessee
It's...it's....sloooooooowwwwwwwwwwwwww
not from Europe
is it downloading at all? It should be extended
Yes...like at 100KB/s  [09:43]
tear-jerker
Verizon problems
who's the host?
I am getting 500Kb/s
UT
Guix's servers can run off more than one server, right?
I'd like to host that particular server here
For speed
yes
Sooner or later
It will be a necessity  [09:45]
exactly what I am doing - this is our server
guix.genenetwork.org:8080
All done installing  [09:46]
what?
Now the databases
what do you mean by slow exactly?
Yes, it's installed
can you run genenetwork2
setting variables
If I try running it now, it will fail as I don't have the DBs  [09:47]
cool - you had a lot of prebuilt packages already
OK, follow the instructions I wrote above
now everything seems to be working for me :)
OK
user01: excellent!
you see a webserver?
yep, can connect to localhost:5003  [09:48]
So now you are running a guix copy of GN2
you can see where it lives with `which genenetwork2` or
ls -l ~/.guix-profile/bin/genenetwork2  [09:49]
/gnu/store/1kma5xszvzsvmbb4k699h7gvdncw901i-genenetwork2-2.0-a8fcff4/bin/genenetwork2
it is a script written by guix, open it  [09:50]
inside it points to paths and our script at
/gnu/store/1kma5xszvzsvmbb4k699h7gvdncw901i-genenetwork2-2.0-a8fcff4/bin/.genenetwork2-real
if you open that you can see how the webserver is started  [09:51]
next step is to run a recent version of GN2
okay  [09:52]
See https://github.com/genenetwork/genenetwork2/tree/staging/doc#run-your-own-copy-of-gn2
but do not checkout that genetwork2_diet
we reverted to the main tree
clone git@github.com:genenetwork/genenetwork2.git  [09:53]
instead
and checkout the staging branch
that is effectively my branch  [09:54]
when that is done you should be able to fire up the webserver from there  [09:55]
using ./bin/genenetwork2
now installing DBs
Downloading
annoyingly the source tree is ~700Mb  [09:56]
Can it also be done by installing the guix package genenetwork2-database-small?
I changed it in the diet version to 8Mb, but I had to revert
I need to make my VM bigger...
user02: not ready  [09:57]
ok
user01: sorry
user01: you could mount a local dir inside the VM for development
that would allow you to use MAC tools for editing
just an idea
yeah, I figure I'll do something like that
do you use emacs?  [09:58]
yep
that can also run on remote files over ssh
that's an alternative
kudos for using emacs :), wdyt user03
79 minutes to go downloading the db
user02: sorry about that  [09:59]
it is 2GB
user, you can also mount the directory via sshfs
Mac OSX runs OpenSSH
user02: sopa
You can therefore mount a directory outside the VM to the VM via sshfs  [10:00]
yes, 3 options now
That way, you can set up a VM only for it's logic
Apps + the OS it runs  [10:01]
For data, let it reside on physical host accessible via sshfs
Use this Arch wiki reference: https://wiki.archlinux.org/index.php/SSHFS
I edited that last somewhere in 2015, may have been updated since then
alright, cool!  [10:04]
user01: you are almost done  [10:06]
I wrote an elixir package for guix :)
env GUIX_PACKAGE_PATH=../guix-bioinformatics/ ./pre-inst-env guix package -A elixir --substitute-urls="http://guix.genenetwork.org"  [10:08]
elixir 1.2.3 out ../guix-bioinformatics/gn/packages/elixir.scm:31:2
I am building it on guix.genenetwork.org right now  [10:09]
nice  [10:10]
#+end_src
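
The session above probes the substitute server with lynx and treats
the 'Resource not found: /' reply as a sign that it is up. A minimal
sketch of the same check with curl follows, assuming curl is
installed; the function name 'substitute_server_up' is made up for
this example. Any HTTP reply, including an error page, counts as
'up'.

#+begin_src shell
# Hypothetical helper: succeed if an HTTP server answers at the URL.
# curl prints the status code "000" when no HTTP reply was received.
substitute_server_up() {
    # $1 = base URL of the substitute server
    code=$(curl --silent --output /dev/null --max-time 10 \
                --write-out '%{http_code}' "$1") || code=000
    [ "$code" != "000" ]
}

# Example (needs network access):
#   substitute_server_up http://guix.genenetwork.org:8080 && echo "server responds"
#+end_src

A responding server is not enough to get binary downloads: as the
session shows, Guix must also be told to trust the server's key.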