Diffstat (limited to 'doc/README.org')
-rw-r--r-- doc/README.org | 462
1 files changed, 142 insertions, 320 deletions
diff --git a/doc/README.org b/doc/README.org
index b38ea664..620c946c 100644
--- a/doc/README.org
+++ b/doc/README.org
@@ -2,33 +2,29 @@
* Table of Contents :TOC:
- [[#introduction][Introduction]]
- - [[#quick-installation-recipe][Quick installation recipe]]
- - [[#step-1-install-gnu-guix][Step 1: Install GNU Guix]]
- - [[#step-2-checkout-the-gn2-git-repositories][Step 2: Checkout the GN2 git repositories]]
- - [[#step-3-authorize-the-gn-guix-server][Step 3: Authorize the GN Guix server]]
- - [[#step-4-install-and-run-gn2][Step 4: Install and run GN2]]
+ - [[#install][Install]]
+ - [[#tarball][Tarball]]
+ - [[#docker][Docker]]
+ - [[#with-source][With source]]
+ - [[#running-gn2][Running GN2]]
- [[#run-mysql-server][Run MySQL server]]
+ - [[#install-mysql-with-gnu-guix][Install MySQL with GNU Guix]]
+ - [[#load-the-small-database-in-mysql][Load the small database in MySQL]]
- [[#gn2-dependency-graph][GN2 Dependency Graph]]
- - [[#source-deployment][Source deployment]]
- - [[#run-your-own-copy-of-gn2][Run your own copy of GN2]]
- - [[#set-up-nginx-port-forwarding][Set up nginx port forwarding]]
- - [[#source-deployment-and-other-information-on-reproducibility][Source deployment and other information on reproducibility]]
- - [[#update-to-recent-guix][Update to recent guix]]
- - [[#install-gn2][Install GN2]]
- - [[#run-gn2][Run GN2]]
+ - [[#working-with-the-gn2-source-code][Working with the GN2 source code]]
+ - [[#running-elasticsearch][Running ElasticSearch]]
+ - [[#systemd][SystemD]]
+ - [[#read-more][Read more]]
- [[#trouble-shooting][Trouble shooting]]
- [[#importerror-no-module-named-jinja2][ImportError: No module named jinja2]]
- - [[#error-can-not-find-directory-homegn2_data][ERROR: can not find directory $HOME/gn2_data]]
+ - [[#error-can-not-find-directory-homegn2_data-or-can-not-find-directory-homegenotype_filesgenotype][ERROR: 'can not find directory $HOME/gn2_data' or 'can not find directory $HOME/genotype_files/genotype']]
- [[#cant-run-a-module][Can't run a module]]
- [[#rpy2-error-show-now-found][Rpy2 error 'show' now found]]
+ - [[#mysql-cant-connect-server-through-socket-error][Mysql can't connect server through socket ERROR]]
- [[#irc-session][IRC session]]
* Introduction
-If you want to understand the architecture of GN2 read
-[[Architecture.org]]. The rest of this document is mostly on deployment
-of GN2.
-
Large system deployments can get very [[http://biogems.info/contrib/genenetwork/gn2.svg ][complex]]. In this document we
explain the GeneNetwork version 2 (GN2) reproducible deployment system
which is based on GNU Guix (see also [[https://github.com/pjotrp/guix-notes/blob/master/README.md][Guix-notes]]). The Guix
@@ -37,195 +33,133 @@ system can be used to install GN with all its files and dependencies.
The official installation path is from a checked out version of the
main Guix package tree and that of the Genenetwork package
tree. Current supported versions can be found as the SHA values of
-'gn-latest' branches of [[https://github.com/genenetwork/guix-bioinformatics/tree/gn-latest][Guix bioinformatics]] and [[https://github.com/genenetwork/guix/tree/gn-latest][GNU Guix main]].
+'gn-latest' branches of [[https://gitlab.com/genenetwork/guix-bioinformatics][Guix bioinformatics]] and [[https://gitlab.com/genenetwork/guix][GNU Guix]].
For a full view of runtime dependencies as defined by GNU Guix, see
-the [[#gn2-dependency-graph][GN2 Dependency Graph]].
-
-* Quick installation recipe
-
-This is a recipe for quick and dirty installation of GN2. For
-convenience everything is installed as root, though in reality only
-GNU Guix has to be installed as root. I tested this recipe on a fresh
-install of Debian 8.3.0 (in KVM) though it should work on any modern
-Linux distribution (including CentOS). For more elaborate installation
-instructions see [[#source-deployment][Source deployment]].
+an example of the [[#gn2-dependency-graph][GN2 Dependency Graph]].
-Note that GN2 consists of an approx. 5 GB installation including
-database. If you use a virtual machine we recommend to use at least
-double.
+* Install
-** Step 1: Install GNU Guix
+The quickest way to install GN2 is by using a binary installation
+(tarball or Docker image). These installations are bundled by GNU
+Guix and include all dependencies. You can install GeneNetwork on most
+Linux distributions, including Debian, Ubuntu, Fedora and CentOS,
+provided you have administrator privileges (root). The alternative is
+a Docker installation.
-Fetch the GNU Guix binary from [[https://www.gnu.org/software/guix/download/][here]] (middle panel) and follow
-[[https://www.gnu.org/software/guix/manual/html_node/Binary-Installation.html][instructions]]. Essentially, download and unpack the tar ball (which
-creates directories in /gnu and /var/guix), add build users and group
-(Guix builds software as unpriviliged users) and run the Guix daemon
-after fixing the paths (also known as the 'profile').
+** Tarball
-Once you have succeeded, you have to [[https://github.com/pjotrp/guix-notes/blob/master/INSTALL.org#set-the-key][set the key]] (getting permission
-to download binaries from the GNU server) and you should be able to
-install the hello package using binary packages (no building)
+Download the ~800 MB tarball from
+[[http://files.genenetwork.org/software/binary_tarball/]]. Validate the checksum and
+unpack to root, for example
-#+begin_src bash
-export PATH=~/.guix-profile/bin:$PATH
-guix pull
-guix package -i hello --dry-run
-#+end_src
-
-Which should show something like
+: tar xvzf genenetwork2-2.10rc3-1538ffd-tarball-pack.tar.gz
+: mv gnu /
+: mv opt/genenetwork2 /opt/
-: The following files would be downloaded:
-: /gnu/store/zby49aqfbd9w9br4l52mvb3y6f9vfv22-hello-2.10
-: ...
-#+end_src
+Now you should be able to start the server with
-means binary installs. The actual installation command of 'hello' is
+: /opt/genenetwork2/bin/genenetwork2
-#+begin_src bash
-guix package -i hello
-hello
- Hello, world!
-#+end_src
+When the server stops with a MySQL error, see [[#run-mysql-server][Run MySQL server]]
+and set SQL_URI to point at the running server. For example:
-If you actually see things building it means that Guix is not yet
-properly installed and up-to-date, i.e., the key is missing or you
-need to do a 'guix pull'. Press Ctrl-C to interrupt.
+: export SQL_URI=mysql://gn2:mysql_password@127.0.0.1/db_webqtl_s
-If you need more help we have another writeup in [[https://github.com/pjotrp/guix-notes/blob/master/INSTALL.org#binary-installation][guix-notes]]. To get
-rid of the locale warning see [[https://github.com/pjotrp/guix-notes/blob/master/INSTALL.org#set-locale][set-locale]].
+See also [[#mysql-cant-connect-server-through-socket-error][Mysql can't connect server through socket ERROR]].
-** Step 2: Checkout the GN2 git repositories
+** Docker
-To fixate the software dependency graph GN2 uses git repositories of
-Guix packages. First install git if it is missing
+Docker images are also available through
+[[http://files.genenetwork.org/software/]]. Validate the checksum and run
+with [[https://docs.docker.com/engine/reference/commandline/load/][Docker load]].
-#+begin_src bash
-guix package -i git
-export GIT_SSL_CAINFO=/etc/ssl/certs/ca-certificates.crt
-#+end_src
+** With source
-check out the git repositories (gn-deploy branch)
+For more elaborate instructions on deploying GeneNetwork from
+source see [[#working-with-the-gn2-source-code][Working with the GN2 source code]].
-#+begin_src bash
-cd ~
-mkdir genenetwork
-cd genenetwork
-git clone --branch gn-deploy https://github.com/genenetwork/guix-bioinformatics
-git clone --branch gn-deploy --recursive https://github.com/genenetwork/guix guix-gn-deploy
-cd guix-gn-deploy
-#+end_src bash
+* Running GN2
-To test whether this is working try:
+Default settings for GN2 are listed in a file called
+[[../etc/default_settings.py][default_settings.py]]. You can copy this file and pass it as a new
+parameter to the genenetwork2 command, e.g.
-#+begin_src bash
-#+end_src bash
+: genenetwork2 mysettings.py
+or you can set environment variables to override individual parameters, e.g.
-** Step 3: Authorize the GN Guix server
+: env SERVER_PORT=5004 SQL_URI=mysql://user:pwd@dbhostname/db_webqtl genenetwork2
-GN2 has its own GNU Guix binary distribution server. To trust it you have
-to add the following key
+The debug and logging switches can be particularly useful when
+developing GN2.
-#+begin_src scheme
-(public-key
- (ecc
- (curve Ed25519)
- (q #11217788B41ADC8D5B8E71BD87EF699C65312EC387752899FE9C888856F5C769#)
- )
-)
-#+end_src
-
-by pasting it into the command
-
-#+begin_src bash
-guix archive --authorize
-#+end_src
-
-and hit Ctrl-D.
-
-Now you can use the substitute server to install GN2 binaries.
-
-** Step 4: Install and run GN2
-
-Since this is a quick and dirty install we are going to override the
-GNU Guix package path by pointing the package path to our repository:
-
-#+begin_src bash
-rm /root/.config/guix/latest
-ln -s ~/genenetwork/guix-gn-deploy/ /root/.config/guix/latest
-#+end_src
+* Run MySQL server
+** Install MySQL with GNU Guix
-Now check whether you can find the GN2 package with
+These are the steps you can take to set up a fresh installation of
+MySQL (which comes as part of the GNU Guix genenetwork2 install).
-#+begin_src bash
-env GUIX_PACKAGE_PATH=~/genenetwork/guix-bioinformatics/ guix package -A genenetwork2
- genenetwork2 2.0-a8fcff4 out gn/packages/genenetwork.scm:144:2
-#+end_src
+As root configure and run
-(ignore the source file newer then ... messages, this is caused by the
-/root/.config/guix/latest override).
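+#+BEGIN_SRC bash
+# Create the mysql user and group
+adduser mysql && addgroup mysql
+# Initialize an empty data directory (root account without a password)
+mysqld --datadir=/var/mysql --initialize-insecure
+mkdir -p /var/run/mysqld
+chown mysql:mysql /var/mysql /var/run/mysqld
+# Start the server as user mysql, listening on port 12048
+mysqld -u mysql --datadir=/var/mysql --explicit_defaults_for_timestamp -P 12048
+#+END_SRC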
-And install with
+If you want to run as root you may have to set
-#+begin_src bash
-env GUIX_PACKAGE_PATH=~/genenetwork/guix-bioinformatics/ \
- guix package -i genenetwork2 \
- --substitute-urls="http://guix.genenetwork.org"
-#+end_src
+: /etc/my.cnf
+: [mysqld]
+: user=root
-Note: the order of the substitute url's may make a difference in speed
-(put the one first that is fastest for your location and time of day).
+To check error output in a file on start-up run with something like
-Note: if your system starts building or gives an error it may well be
-Step 3 did not succeed. The installation should actually be smooth at
-this point and only do binary installs (no compiling).
+: mysqld -u mysql --console --explicit_defaults_for_timestamp --datadir=/gnu/mysql --log-error=$HOME/test.log
-After installation you should be able to run genenetwork2 after updating
-the Guix suggested environment vars. Check the output of
+Another tip: Guix installs mysqld in your profile, so this may work
-#+begin_src bash
-guix package --search-paths
-export PYTHONPATH="/root/.guix-profile/lib/python2.7/site-packages"
-export R_LIBS_SITE="/root/.guix-profile/site-library/"
-#+end_src
+: /home/user/.guix-profile/bin/mysqld -u mysql --explicit_defaults_for_timestamp --datadir=/gnu/mysql
-and copy-paste the listed exports into the terminal before running:
+When you get errors like:
-#+begin_src bash
-genenetwork2
-#+end_src
+: sqlalchemy.exc.IntegrityError: (_mysql_exceptions.IntegrityError) (1215, 'Cannot add foreign key constraint')
-It will complain that the database is missing. See the next section on
-running MySQL server for downloading and installing a MySQL GN2
-database. After installing the database restart genenetwork2 and point
-your browser at [[http://localhost:5003/]].
+you may need to set
-End of the GN2 installation recipe!
+: set foreign_key_checks=0
-* Run MySQL server
+** Load the small database in MySQL
At this point we require the underlying distribution to install and
-run mysqld. Currently we have two databases for deployment,
+run mysqld (see the previous section for GNU Guix). Currently we have two databases for deployment,
'db_webqtl_s' is the small testing database containing experiments
from BXD mice and 'db_webqtl_plant' which contains all plant related
material.
Download one database from
-http://files.genenetwork.org/raw_database/
-https://s3.amazonaws.com/genenetwork2/db_webqtl_s.zip
+[[http://files.genenetwork.org/raw_database/]]
+
+[[https://s3.amazonaws.com/genenetwork2/db_webqtl_s.zip]]
Check the md5sum.
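
A minimal sketch of that check, assuming the expected MD5 value is published alongside the download (the text above does not say where). The helper name =verify_md5= is hypothetical and only used for illustration.

```shell
#!/bin/sh
# verify_md5 FILE EXPECTED_MD5  (hypothetical helper, not part of GN2)
# Succeeds when the MD5 digest of FILE equals EXPECTED_MD5, fails otherwise.
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<digest>  <filename>"; keep only the digest
    actual=$(md5sum "$file" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}
```

It could be used as follows, replacing the second argument with the published value:

: verify_md5 db_webqtl_s.zip <published-md5>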
After installation inflate the database binary in the MySQL directory
-(this installation path is subject to change soon)
+: cd ~/mysql
: chown -R mysql:mysql db_webqtl_s/
: chmod 700 db_webqtl_s/
: chmod 660 db_webqtl_s/*
-restart MySQL service (mysqld). Login as root and
+restart MySQL service (mysqld). Login as root
+
+: mysql -u root
+
+and
: mysql> show databases;
: +--------------------+
@@ -241,9 +175,12 @@ Set permissions and match password in your settings file below:
: mysql> grant all privileges on db_webqtl_s.* to gn2@"localhost" identified by 'mysql_password';
+You may need to change "localhost" to whatever domain you are
+connecting from (mysql will give an error).
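
On MySQL 8.0 and later, GRANT no longer accepts an IDENTIFIED BY clause, so the account has to be created first. The following is a sketch only: =grant_sql= is a hypothetical helper that prints the two statements for a given user, host, password and database, and it does no escaping, so it assumes none of the values contains a single quote.

```shell
#!/bin/sh
# grant_sql USER HOST PASSWORD DATABASE  (hypothetical helper, not part of GN2)
# Prints SQL that creates the account and grants it all privileges on
# DATABASE. Values are inserted verbatim: no quoting or escaping is done.
grant_sql() {
    user=$1
    host=$2
    password=$3
    database=$4
    cat <<EOF
CREATE USER IF NOT EXISTS '$user'@'$host' IDENTIFIED BY '$password';
GRANT ALL PRIVILEGES ON $database.* TO '$user'@'$host';
EOF
}
```

The output can be piped into the client, with "localhost" replaced by the domain you are connecting from where needed:

: grant_sql gn2 localhost mysql_password db_webqtl_s | mysql -u root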
+
Note that if the mysql connection is not working, try connecting to
the IP address and check server firewall, hosts.allow and mysql IP
-configuration.
+configuration (see below).
Note for the plant database you can rename it to db_webqtl_s, or
change the settings in etc/default_settings.py to match your path.
@@ -255,183 +192,44 @@ Graph of all runtime dependencies as installed by GNU Guix.
#+ATTR_HTML: :title GN2_graph
http://biogems.info/contrib/genenetwork/gn2.svg
-* Source deployment
-
-This section gives a more elaborate instruction for installing GN2
-from source.
-
-First execute above 4 steps:
-
- - [[#step-1-install-gnu-guix][Step 1: Install GNU Guix]]
- - [[#step-2-checkout-the-gn2-git-repositories][Step 2: Checkout the GN2 git repositories]]
- - [[#step-3-authorize-the-gn-guix-server][Step 3: Authorize the GN Guix server]]
- - [[#step-4-install-and-run-gn2-][Step 4: Install and run GN2 ]]
-
-
-** Run your own copy of GN2
-
-At some point you may want to fix the source code. Assuming you have
-Guix and Genenetwork2 installed (as described above) clone the GN2
-repository from https://github.com/genenetwork/genenetwork2.
-
-Copy-paste the paths into your terminal (mainly so PYTHON_PATH and
-R_LIBS_SITE are set) from the information given by guix:
-
-: guix package --search-paths
-
-Inside the repository:
-
-: cd genenetwork2
-: ./bin/genenetwork2
-
-Will fire up your local repo http://localhost:5003/ using the
-settings in ./etc/default_settings.py. These settings may
-not reflect your system. To override settings create your own from a copy of
-default_settings.py and pass it into GN2 with
-
-: ./bin/genenetwork2 $HOME/my_settings.py
-
-and everything *should* work (note the full path to the settings
-file). This way we develop against the exact same dependency graph of
-software.
-
-If something is not working, take a hint from the settings file
-that comes in the Guix installation. It sits in something like
+* Working with the GN2 source code
-: cat ~/.guix-profile/lib/python2.7/site-packages/genenetwork2-2.0-py2.7.egg/etc/default_settings.py
+See [[development.org]].
-** Set up nginx port forwarding
+* Running ElasticSearch
-nginx can be used as a reverse proxy for GN2. For example, we want to
-expose GN2 on port 80 while it is running on port 5003. Essentially
-the configuration looks like
+In order to start up elasticsearch:
+On Penguin, change user to "elasticsearch" and use the following command:
+
+: env JAVA_HOME=/opt/jdk-9.0.4 /opt/elasticsearch-6.2.1/bin/elasticsearch
-#+begin_src js
- server {
- listen 80;
- server_name test-gn2.genenetwork.org;
- access_log logs/test-gn2.access.log;
- proxy_connect_timeout 3000;
- proxy_send_timeout 3000;
- proxy_read_timeout 3000;
- send_timeout 3000;
+** SystemD
- location / {
- proxy_set_header Host $http_host;
- proxy_set_header Connection keep-alive;
- proxy_set_header X-Real-IP $remote_addr;
- proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
- proxy_set_header X-Forwarded-Host $server_name;
- proxy_pass http://127.0.0.1:5003;
- }
-}
-#+end_src js
+On the new server, run as root:
+
+: systemctl restart elasticsearch
+
+The service is defined as follows:
-Install the nginx webserver (as root)
+#+BEGIN_SRC
+tux01:/etc/systemd/system# cat elasticsearch.service
+[Unit]
+Description=Run Elasticsearch
-: guix package -i nginx
+[Service]
+ExecStart=/opt/elasticsearch-6.2.1/bin/elasticsearch
+Environment=JAVA_HOME=/opt/jdk-9.0.4
+Environment="ES_JAVA_OPTS=-Xms1g -Xmx8g"
+Environment="PATH=/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/opt/jdk-9.0.4/bin"
+LimitNOFILE=65536
+StandardOutput=syslog
+StandardError=syslog
+User=elasticsearch
-The nginx example configuration examples can be found in the Guix
-store through
+[Install]
+WantedBy=multi-user.target
+#+END_SRC
-: ls -l /root/.guix-profile/sbin/nginx
-: lrwxrwxrwx 3 root guixbuild 66 Dec 31 1969 /root/.guix-profile/sbin/nginx -> /gnu/store/g0wrcl5z27rmk5b52rldzvk1bzzbnz2l-nginx-1.8.1/sbin/nginx
+* Read more
-Use that path
-
-: ls /gnu/store/g0wrcl5z27rmk5b52rldzvk1bzzbnz2l-nginx-1.8.1/share/nginx/conf/
-: fastcgi.conf koi-win scgi_params
-: fastcgi.conf.default mime.types scgi_params.default
-: fastcgi_params mime.types.default uwsgi_params
-: fastcgi_params.default nginx.conf uwsgi_params.default
-: koi-utf nginx.conf.default win-utf
-
-And copy any relevant files to /etc/nginx. A configuration file for
-GeneNetwork (reverse proxy) port forwarding can be found in the source
-repository under ./etc/nginx-genenetwork.conf. Copy this file to /etc
-(still as root)
-: cp ./etc/nginx-genenetwork.conf /etc/nginx/
-
-Make dirs
-
-: mkdir -p /var/spool/nginx/logs
-
-Add users
-
-: adduser nobody ; addgroup nobody
-
-Run nginx
-
-: /root/.guix-profile/sbin/nginx -c /etc/nginx/nginx-genenetwork.conf -p /var/spool/nginx
-
-* Source deployment and other information on reproducibility
-
-See the document [[GUIX-Reproducible-from-source.org]].
-
-** Update to recent guix
-
-We now compile Guix from scratch.
-
-Create, install and run a recent version of the guix-daemon by
-compiling the guix repository you have installed with git in
-step 2. Follow [[https://github.com/pjotrp/guix-notes/blob/master/INSTALL.org#building-gnu-guix-from-source-using-guix][these]] steps carefully after
-
-: cd ~/genenetwork/guix-gn-deploy
-
-Make sure to restart the guix daemon and run guix client from this
-directory.
-
-** Install GN2
-
-Reinstall genenetwork2 using the new tree
-
-#+begin_src bash
-env GUIX_PACKAGE_PATH=~/genenetwork/guix-bioinformatics/ ./pre-inst-env guix package -i genenetwork2 --substitute-urls="http://guix.genenetwork.org https://mirror.guixsd.org"
-#+end_src bash
-
-Note the use of ./pre-inst-env here!
-
-Actually, it should be the same installation as in step 4, so nothing
-gets downloaded.
-
-** Run GN2
-
-Make a note of the paths with
-
-#+begin_src bash
-./pre-inst-env guix package --search-paths
-#+end_src bash
-
-or this should also work if guix is installed
-
-#+begin_src bash
-guix package --search-paths
-#+end_src bash
-
-After setting the paths for the server
-
-#+begin_src bash
-export PATH=~/.guix-profile/bin:$PATH
-export PYTHONPATH="$HOME/.guix-profile/lib/python2.7/site-packages"
-export R_LIBS_SITE="$HOME/.guix-profile/site-library/"
-export GUIX_GTK3_PATH="$HOME/.guix-profile/lib/gtk-3.0"
-export GI_TYPELIB_PATH="$HOME/.guix-profile/lib/girepository-1.0"
-export XDG_DATA_DIRS="$HOME/.guix-profile/share"
-export GIO_EXTRA_MODULES="$HOME/.guix-profile/lib/gio/modules"
-#+end_src bash
-
-run the main script (in ~/.guix-profile/bin)
-
-#+begin_src bash
-genenetwork2
-#+end_src bash
-
-will start the default server which listens on port 5003, i.e.,
-http://localhost:5003/.
-
-OK, we are where we were before with step 4. Only difference is that we
-used our own compiled guix server.
+If you want to understand the architecture of GN2 read
+[[Architecture.org]]. The rest of this document is mostly on deployment
+of GN2.
* Trouble shooting
@@ -451,13 +249,17 @@ On one system:
: export GEM_PATH="$HOME/.guix-profile/lib/ruby/gems/2.2.0"
and perhaps a few more.
-** ERROR: can not find directory $HOME/gn2_data
+** ERROR: 'can not find directory $HOME/gn2_data' or 'can not find directory $HOME/genotype_files/genotype'
The default settings file looks in your $HOME/gn2_data. Since these
files come with a Guix installation you should take a hint from the
values in the installed version of default_settings.py (see above in
this document).
+You can use the GENENETWORK_FILES switch to set the datadir, for example
+
+: env GN2_PROFILE=~/opt/gn-latest GENENETWORK_FILES=/gnu/data/gn2_data ./bin/genenetwork2
+
** Can't run a module
In rare cases, development modules are not brought in with Guix
@@ -479,6 +281,26 @@ R_LIBS_SITE. Please check your GNU Guix GN2 installation paths,
you may need to reinstall. Note that this may be the point you
may want to start using profiles (see profile section).
+** Mysql can't connect server through socket ERROR
+
+The following error
+
+: sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (2002, 'Can\'t connect to local MySQL server through socket \'/run/mysqld/mysqld.sock\' (2 "No such file or directory")')
+
+means that the MySQL client is trying to connect through a local Unix
+socket that does not exist, something you may see in a container. It can typically be reproduced with something like
+
+: mysql -h localhost
+
+Try to connect over the network interface instead, e.g.
+
+: mysql -h 127.0.0.1
+
+If that works, run genenetwork2 after setting SQL_URI to something like
+
+: export SQL_URI=mysql://gn2:mysql_password@127.0.0.1/db_webqtl_s
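
The workaround above can be sketched as two small helpers. On Unix the MySQL client treats the host name "localhost" as a request for the socket file, so mapping it to 127.0.0.1 forces a TCP connection. The function names =force_tcp_host= and =sql_uri= are hypothetical; only the SQL_URI format is taken from this document.

```shell
#!/bin/sh
# force_tcp_host HOST  (hypothetical helper)
# Maps "localhost" to 127.0.0.1 so the client connects over TCP instead
# of the Unix socket; any other host name is printed unchanged.
force_tcp_host() {
    case $1 in
        localhost) echo 127.0.0.1 ;;
        *) echo "$1" ;;
    esac
}

# sql_uri USER PASSWORD HOST DATABASE  (hypothetical helper)
# Prints a URI in the form used for SQL_URI in this document.
sql_uri() {
    printf 'mysql://%s:%s@%s/%s\n' "$1" "$2" "$(force_tcp_host "$3")" "$4"
}
```

For example:

: export SQL_URI=$(sql_uri gn2 mysql_password localhost db_webqtl_s)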
+
+
* IRC session
Here is an IRC session where we installed GN2 from scratch using GNU Guix