#+TITLE: Installing GeneNetwork services

* Table of Contents                                                     :TOC:
 - [[#introduction][Introduction]]
 - [[#install][Install]]
 - [[#running-gn2][Running GN2]]
 - [[#run-gn-proxy][Run gn-proxy]]
 - [[#run-redis][Run Redis]]
 - [[#run-mariadb-server][Run MariaDB server]]
   - [[#install-mariadb-with-gnu-guix][Install MariaDB with GNU GUIx]]
   - [[#load-the-small-database-in-mysql][Load the small database in MySQL]]
 - [[#get-genotype-files][Get genotype files]]
 - [[#gn2-dependency-graph][GN2 Dependency Graph]]
 - [[#working-with-the-gn2-source-code][Working with the GN2 source code]]
 - [[#read-more][Read more]]
 - [[#trouble-shooting][Trouble shooting]]
   - [[#importerror-no-module-named-jinja2][ImportError: No module named jinja2]]
   - [[#error-can-not-find-directory-homegn2_data-or-can-not-find-directory-homegenotype_filesgenotype][ERROR: 'can not find directory $HOME/gn2_data' or 'can not find directory $HOME/genotype_files/genotype']]
   - [[#cant-run-a-module][Can't run a module]]
   - [[#rpy2-error-show-now-found][Rpy2 error 'show' now found]]
   - [[#mysql-cant-connect-server-through-socket-error][Mysql can't connect server through socket ERROR]]
 - [[#notes][NOTES]]
   - [[#deploying-gn2-official][Deploying GN2 official]]

* Introduction

Large system deployments can get very [[http://biogems.info/contrib/genenetwork/gn2.svg ][complex]]. In this document we
explain the GeneNetwork version 2 (GN2) reproducible deployment system
which is based on GNU Guix (see also [[https://github.com/pjotrp/guix-notes/blob/master/README.md][Guix-notes]]). The Guix
system can be used to install GN with all its files and dependencies.

The official installation path is from a checked out version of the
main Guix package tree and that of the Genenetwork package
tree. Current supported versions can be found as the SHA values of
'gn-latest' branches of [[https://gitlab.com/genenetwork/guix-bioinformatics][Guix bioinformatics]] and [[https://gitlab.com/genenetwork/guix][GNU Guix]].

For a full view of runtime dependencies as defined by GNU Guix, see
an example of the [[#gn2-dependency-graph][GN2 Dependency Graph]].

* Install

Make sure to install GNU Guix using the binary download instructions
on the main website. Follow the instructions on
[[GUIX-Reproducible-from-source.org]] to download pre-built binaries. Note
the download amounts to several GBs of data.

* Running GN2

Default settings for GN2 are listed in a file called
[[../etc/default_settings.py][default_settings.py]]. You can copy this file and pass it as a new
parameter to the genenetwork2 command, e.g.

: genenetwork2 mysettings.py

or you can set environment variables to override individual parameters, e.g.

: env SERVER_PORT=5004 SQL_URI=mysql://user:pwd@dbhostname/db_webqtl genenetwork2

the debug and logging switches can be particularly useful when
developing GN2.

* Run gn-proxy

GeneNetwork requires a separate gn-proxy server which handles
authorisation and access control. For instructions see the [[https://github.com/genenetwork/gn-proxy][README]].

* Run Redis

Redis part of GN2 deployment and will be started by the ./bin/genenetwork2
startup script.

* Run MariaDB server
** Install MariaDB with GNU GUIx

These are the steps you can take to install a fresh installation of
mariadb (which comes as part of the GNU Guix genenetwork2 install).

As root configure the Guix profile

: . ~/opt/genenetwork2/etc/profile

and run for example

#+BEGIN_SRC bash
adduser mariadb && addgroup mariadb
mkdir -p /export2/mariadb/database
chown mariadb.mariadb -R /export2/mariadb/
mkdir -p /var/run/mysqld
chown mariadb.mariadb /var/run/mysqld
su mariadb
mysql --version
  mysql  Ver 15.1 Distrib 10.1.45-MariaDB, for Linux (x86_64) using readline 5.1
mysql_install_db --user=mariadb --datadir=/export2/mariadb/database
mysqld -u mariadb --datadir=/exportdb/mariadb/database/mariadb --explicit_defaults_for_timestamp -P 12048"
#+END_SRC

If you want to run as root you may have to set

: /etc/my.cnf
: [mariadbd]
: user=root

You also need to set

: ft_min_word_len = 3

To make sure word text searches (shh) work and rebuild the tables if
required.

To check error output in a file on start-up run with something like

: mariadbd -u mariadb --console  --explicit_defaults_for_timestamp  --datadir=/gnu/mariadb --log-error=~/test.log

Other tips are that Guix installs mariadbd in your profile, so this may work

: /home/user/.guix-profile/bin/mariadbd -u mariadb --explicit_defaults_for_timestamp  --datadir=/gnu/mariadb

When you get errors like:

: qlalchemy.exc.IntegrityError: (_mariadb_exceptions.IntegrityError) (1215, 'Cannot add foreign key constraint')

you may need to set

: set foreign_key_checks=0

** Load the small database in MySQL

At this point we require the underlying distribution to install and
run mysqld (see next section for GNU Guix). Currently we have two databases for deployment,
'db_webqtl_s' is the small testing database containing experiments
from BXD mice and 'db_webqtl_plant' which contains all plant related
material.

Download one database from

http://ipfs.genenetwork.org/ipfs/QmRUmYu6ogxEdzZeE8PuXMGCDa8M3y2uFcfo4zqQRbpxtk

After installation unzip the database binary in the MySQL directory

#+BEGIN_SRC sh
cd ~/mysql
p7zip -d db_webqtl_s.7z
chown -R mysql:mysql db_webqtl_s/
chmod 700 db_webqtl_s/
chmod 660 db_webqtl_s/*
#+END_SRC

restart MySQL service (mysqld). Login as root

: mysql_upgrade -u root --force

: myslq -u root

and

: mysql> show databases;
: +--------------------+
: | Database           |
: +--------------------+
: | information_schema |
: | db_webqtl_s        |
: | mysql              |
: | performance_schema |
: +--------------------+

Set permissions and match password in your settings file below:

: mysql> grant all privileges on db_webqtl_s.* to gn2@"localhost" identified by 'webqtl';

You may need to change "localhost" to whatever domain you are
connecting from (mysql will give an error).

Note that if the mysql connection is not working, try connecting to
the IP address and check server firewall, hosts.allow and mysql IP
configuration (see below).

Note for the plant database you can rename it to db_webqtl_s, or
change the settings in etc/default_settings.py to match your path.

* Get genotype files

The script looks for genotype files. You can find them in
http://ipfs.genenetwork.org/ipfs/QmXQy3DAUWJuYxubLHLkPMNCEVq1oV7844xWG2d1GSPFPL

#+BEGIN_SRC sh
mkdir -p $HOME/genotype_files
cd $HOME/genotype_files

#+END_SRC

* GN2 Dependency Graph

Graph of all runtime dependencies as installed by GNU Guix.

#+ATTR_HTML: :title GN2_graph
http://biogems.info/contrib/genenetwork/gn2.svg

* Working with the GN2 source code

See [[development.org]].

* Read more

If you want to understand the architecture of GN2 read
[[Architecture.org]].  The rest of this document is mostly on deployment
of GN2.

* Trouble shooting

** ImportError: No module named jinja2

If you have all the Guix packages installed this error points out that
the environment variables are not set. Copy-paste the paths into your
terminal (mainly so PYTHON_PATH and R_LIBS_SITE are set) from the
information given by guix:

: guix package --search-paths

On one system:

: export PYTHONPATH="$HOME/.guix-profile/lib/python3.8/site-packages"
: export R_LIBS_SITE="$HOME/.guix-profile/site-library/"
: export GEM_PATH="$HOME/.guix-profile/lib/ruby/gems/2.2.0"

and perhaps a few more.
** ERROR: 'can not find directory $HOME/gn2_data' or 'can not find directory $HOME/genotype_files/genotype'

The default settings file looks in your $HOME/gn2_data. Since these
files come with a Guix installation you should take a hint from the
values in the installed version of default_settings.py (see above in
this document).

You can use the GENENETWORK_FILES switch to set the datadir, for example

: env GN2_PROFILE=~/opt/gn-latest GENENETWORK_FILES=/gnu/data/gn2_data ./bin/genenetwork2

** Can't run a module

In rare cases, development modules are not brought in with Guix
because no source code is available. This can lead to missing modules
on a running server. Please check with the authors when a module
is missing.
** Rpy2 error 'show' now found

This error

: __show = rpy2.rinterface.baseenv.get("show")
: LookupError: 'show' not found

means that R was updated in your path, and that Rpy2 needs to be
recompiled against this R - don't you love informative messages?

In our case it means that GN's PYTHONPATH is not in sync with
R_LIBS_SITE. Please check your GNU Guix GN2 installation paths,
you man need to reinstall. Note that this may be the point you
may want to start using profiles (see profile section).

** Mysql can't connect server through socket ERROR

The following error

: sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (2002, 'Can\'t connect to local MySQL server through socket \'/run/mysqld/mysqld.sock\' (2 "No such file or directory")')

means that MySQL is trying to connect locally to a non-existent MySQL
server, something you may see in a container. Typically replicated with something like

: mysql -h localhost

try to connect over the network interface instead, e.g.

: mysql -h 127.0.0.1

if that works run genenetwork after setting SQL_URI to something like

: export SQL_URI=mysql://gn2:mysql_password@127.0.0.1/db_webqtl_s

* NOTES

** Deploying GN2 official

Let's see how fast we can deploy a second copy of GN2.

- [ ] Base install
  + [ ] First install a Debian server with GNU Guix on board
  + [ ] Get Guix build going
    - [ ] Build the correct version of Guix
    - [ ] Check out the correct gn-stable version of guix-bioinformatics http://git.genenetwork.org/pjotrp/guix-bioinformatics
    - [ ] guix package -i genenetwork2 -p /usr/local/guix-profiles/gn2-stable
  + [ ] Create a gn2 user and home with space
  + [ ] Install redis
    - [ ] add to systemd
    - [ ] update redis.cnf
    - [ ] update database
  + [ ] Install mariadb (currently debian mariadb-server)
    - [ ] add to systemd
    - [ ] system stop mysql
    - [ ] update mysql.cnf
    - [ ] update database (see gn-services/services/mariadb.md)
    - [ ] check tables
  + [ ] run gn2
  + [ ] update nginx
  + [ ] install genenetwork3
    - [ ] add to systemd