Diffstat (limited to 'doc/README.org')
-rw-r--r-- | doc/README.org | 462 |
1 files changed, 142 insertions, 320 deletions
diff --git a/doc/README.org b/doc/README.org
index b38ea664..620c946c 100644
--- a/doc/README.org
+++ b/doc/README.org
@@ -2,33 +2,29 @@
 * Table of Contents :TOC:
  - [[#introduction][Introduction]]
- - [[#quick-installation-recipe][Quick installation recipe]]
-   - [[#step-1-install-gnu-guix][Step 1: Install GNU Guix]]
-   - [[#step-2-checkout-the-gn2-git-repositories][Step 2: Checkout the GN2 git repositories]]
-   - [[#step-3-authorize-the-gn-guix-server][Step 3: Authorize the GN Guix server]]
-   - [[#step-4-install-and-run-gn2][Step 4: Install and run GN2]]
+ - [[#install][Install]]
+   - [[#tarball][Tarball]]
+   - [[#docker][Docker]]
+   - [[#with-source][With source]]
+ - [[#running-gn2][Running GN2]]
  - [[#run-mysql-server][Run MySQL server]]
+   - [[#install-mysql-with-gnu-guix][Install MySQL with GNU Guix]]
+   - [[#load-the-small-database-in-mysql][Load the small database in MySQL]]
  - [[#gn2-dependency-graph][GN2 Dependency Graph]]
- - [[#source-deployment][Source deployment]]
-   - [[#run-your-own-copy-of-gn2][Run your own copy of GN2]]
-   - [[#set-up-nginx-port-forwarding][Set up nginx port forwarding]]
- - [[#source-deployment-and-other-information-on-reproducibility][Source deployment and other information on reproducibility]]
-   - [[#update-to-recent-guix][Update to recent guix]]
-   - [[#install-gn2][Install GN2]]
-   - [[#run-gn2][Run GN2]]
+ - [[#working-with-the-gn2-source-code][Working with the GN2 source code]]
+ - [[#running-elasticsearch][Running ElasticSearch]]
+   - [[#systemd][SystemD]]
+ - [[#read-more][Read more]]
  - [[#trouble-shooting][Trouble shooting]]
    - [[#importerror-no-module-named-jinja2][ImportError: No module named jinja2]]
-   - [[#error-can-not-find-directory-homegn2_data][ERROR: can not find directory $HOME/gn2_data]]
+   - [[#error-can-not-find-directory-homegn2_data-or-can-not-find-directory-homegenotype_filesgenotype][ERROR: 'can not find directory $HOME/gn2_data' or 'can not find directory $HOME/genotype_files/genotype']]
    - [[#cant-run-a-module][Can't run a module]]
    - [[#rpy2-error-show-now-found][Rpy2 error 'show' now found]]
+   - [[#mysql-cant-connect-server-through-socket-error][Mysql can't connect server through socket ERROR]]
  - [[#irc-session][IRC session]]
 
 * Introduction
 
-If you want to understand the architecture of GN2 read
-[[Architecture.org]]. The rest of this document is mostly on deployment
-of GN2.
-
 Large system deployments can get very [[http://biogems.info/contrib/genenetwork/gn2.svg ][complex]]. In this
 document we explain the GeneNetwork version 2 (GN2) reproducible
 deployment system which is based on GNU Guix (see also
 [[https://github.com/pjotrp/guix-notes/blob/master/README.md][Guix-notes]]). The Guix
@@ -37,195 +33,133 @@ system can be used to install GN with all its files and dependencies.
 The official installation path is from a checked out version of the
 main Guix package tree and that of the Genenetwork package tree. Current
 supported versions can be found as the SHA values of
-'gn-latest' branches of [[https://github.com/genenetwork/guix-bioinformatics/tree/gn-latest][Guix bioinformatics]] and [[https://github.com/genenetwork/guix/tree/gn-latest][GNU Guix main]].
+'gn-latest' branches of [[https://gitlab.com/genenetwork/guix-bioinformatics][Guix bioinformatics]] and [[https://gitlab.com/genenetwork/guix][GNU Guix]].
 
 For a full view of runtime dependencies as defined by GNU Guix, see
-the [[#gn2-dependency-graph][GN2 Dependency Graph]].
-
-* Quick installation recipe
-
-This is a recipe for quick and dirty installation of GN2. For
-convenience everything is installed as root, though in reality only
-GNU Guix has to be installed as root. I tested this recipe on a fresh
-install of Debian 8.3.0 (in KVM) though it should work on any modern
-Linux distribution (including CentOS). For more elaborate installation
-instructions see [[#source-deployment][Source deployment]].
+an example of the [[#gn2-dependency-graph][GN2 Dependency Graph]].
 
-Note that GN2 consists of an approx. 5 GB installation including
-database. If you use a virtual machine we recommend to use at least
-double.
+* Install
 
-** Step 1: Install GNU Guix
+The quickest way to install GN2 is by using a binary installation
+(tarball or Docker image). These installations are bundled by GNU
+Guix and include all dependencies. You can install GeneNetwork on most
+Linux distributions, including Debian, Ubuntu, Fedora and CentOS,
+provided you have administrator privileges (root). The alternative is
+a Docker installation.
 
-Fetch the GNU Guix binary from [[https://www.gnu.org/software/guix/download/][here]] (middle panel) and follow
-[[https://www.gnu.org/software/guix/manual/html_node/Binary-Installation.html][instructions]]. Essentially, download and unpack the tar ball (which
-creates directories in /gnu and /var/guix), add build users and group
-(Guix builds software as unpriviliged users) and run the Guix daemon
-after fixing the paths (also known as the 'profile').
+** Tarball
 
-Once you have succeeded, you have to [[https://github.com/pjotrp/guix-notes/blob/master/INSTALL.org#set-the-key][set the key]] (getting permission
-to download binaries from the GNU server) and you should be able to
-install the hello package using binary packages (no building)
+Download the ~800 MB tarball from
+[[http://files.genenetwork.org/software/binary_tarball/]]. Validate the checksum and
+unpack to root, for example
 
-#+begin_src bash
-export PATH=~/.guix-profile/bin:$PATH
-guix pull
-guix package -i hello --dry-run
-#+end_src
-
-Which should show something like
+: tar xvzf genenetwork2-2.10rc3-1538ffd-tarball-pack.tar.gz
+: mv /gnu /
+: mv /opt/genenetwork2 /opt/
 
-: The following files would be downloaded:
-: /gnu/store/zby49aqfbd9w9br4l52mvb3y6f9vfv22-hello-2.10
-: ...
-#+end_src
+Now you should be able to start the server with
 
-means binary installs. The actual installation command of 'hello' is
+: /opt/genenetwork2/bin/genenetwork2
 
-#+begin_src bash
-guix package -i hello
-hello
-  Hello, world!
-#+end_src
+When the server stops with a MySQL error, [[#run-mysql-server][Run MySQL server]]
+and set SQL_URI to point at it. For example:
 
-If you actually see things building it means that Guix is not yet
-properly installed and up-to-date, i.e., the key is missing or you
-need to do a 'guix pull'. Press Ctrl-C to interrupt.
+: export SQL_URI=mysql://gn2:mysql_password@127.0.0.1/db_webqtl_s
 
-If you need more help we have another writeup in [[https://github.com/pjotrp/guix-notes/blob/master/INSTALL.org#binary-installation][guix-notes]]. To get
-rid of the locale warning see [[https://github.com/pjotrp/guix-notes/blob/master/INSTALL.org#set-locale][set-locale]].
+See also [[#mysql-cant-connect-server-through-socket-error][Mysql can't connect server through socket ERROR]].
 
-** Step 2: Checkout the GN2 git repositories
+** Docker
 
-To fixate the software dependency graph GN2 uses git repositories of
-Guix packages. First install git if it is missing
+Docker images are also available through
+[[http://files.genenetwork.org/software/]]. Validate the checksum and run
+with [[https://docs.docker.com/engine/reference/commandline/load/][Docker load]].
 
-#+begin_src bash
-guix package -i git
-export GIT_SSL_CAINFO=/etc/ssl/certs/ca-certificates.crt
-#+end_src
+** With source
 
-check out the git repositories (gn-deploy branch)
+For more elaborate installation instructions on deploying GeneNetwork from
+source see [[#source-deployment][Source deployment]].
-#+begin_src bash
-cd ~
-mkdir genenetwork
-cd genenetwork
-git clone --branch gn-deploy https://github.com/genenetwork/guix-bioinformatics
-git clone --branch gn-deploy --recursive https://github.com/genenetwork/guix guix-gn-deploy
-cd guix-gn-deploy
-#+end_src bash
+* Running GN2
 
-To test whether this is working try:
+Default settings for GN2 are listed in a file called
+[[../etc/default_settings.py][default_settings.py]]. You can copy this file and pass it as a new
+parameter to the genenetwork2 command, e.g.
 
-#+begin_src bash
-#+end_src bash
+: genenetwork2 mysettings.py
 
+or you can set environment variables to override individual parameters, e.g.
 
-** Step 3: Authorize the GN Guix server
+: env SERVER_PORT=5004 SQL_URI=mysql://user:pwd@dbhostname/db_webqtl genenetwork2
 
-GN2 has its own GNU Guix binary distribution server. To trust it you have
-to add the following key
+The debug and logging switches can be particularly useful when
+developing GN2.
 
-#+begin_src scheme
-(public-key
- (ecc
-  (curve Ed25519)
-  (q #11217788B41ADC8D5B8E71BD87EF699C65312EC387752899FE9C888856F5C769#)
-  )
-)
-#+end_src
-
-by pasting it into the command
-
-#+begin_src bash
-guix archive --authorize
-#+end_src
-
-and hit Ctrl-D.
-
-Now you can use the substitute server to install GN2 binaries.
-
-** Step 4: Install and run GN2
-
-Since this is a quick and dirty install we are going to override the
-GNU Guix package path by pointing the package path to our repository:
-
-#+begin_src bash
-rm /root/.config/guix/latest
-ln -s ~/genenetwork/guix-gn-deploy/ /root/.config/guix/latest
-#+end_src
+* Run MySQL server
 
+** Install MySQL with GNU Guix
 
-Now check whether you can find the GN2 package with
+These are the steps you can take to install a fresh installation of
+mysql (which comes as part of the GNU Guix genenetwork2 install).
-#+begin_src bash
-env GUIX_PACKAGE_PATH=~/genenetwork/guix-bioinformatics/ guix package -A genenetwork2
-  genenetwork2 2.0-a8fcff4 out gn/packages/genenetwork.scm:144:2
-#+end_src
+As root configure and run
 
-(ignore the source file newer then ... messages, this is caused by the
-/root/.config/guix/latest override).
+#+BEGIN_SRC bash
+adduser mysql && addgroup mysql
+mysqld --datadir=/var/mysql --initialize-insecure
+mkdir -p /var/run/mysqld
+chown mysql.mysql ~/mysql /var/run/mysqld
+mysqld -u mysql --datadir=/var/mysql --explicit_defaults_for_timestamp -P 12048
+#+END_SRC
 
-And install with
+If you want to run as root you may have to set
 
-#+begin_src bash
-env GUIX_PACKAGE_PATH=~/genenetwork/guix-bioinformatics/ \
-  guix package -i genenetwork2 \
-  --substitute-urls="http://guix.genenetwork.org"
-#+end_src
+: /etc/my.cnf
+: [mysqld]
+: user=root
 
-Note: the order of the substitute url's may make a difference in speed
-(put the one first that is fastest for your location and time of day).
+To check error output in a file on start-up run with something like
 
-Note: if your system starts building or gives an error it may well be
-Step 3 did not succeed. The installation should actually be smooth at
-this point and only do binary installs (no compiling).
+: mysqld -u mysql --console --explicit_defaults_for_timestamp --datadir=/gnu/mysql --log-error=~/test.log
 
-After installation you should be able to run genenetwork2 after updating
-the Guix suggested environment vars. Check the output of
+Other tips are that Guix installs mysqld in your profile, so this may work
 
-#+begin_src bash
-guix package --search-paths
-export PYTHONPATH="/root/.guix-profile/lib/python2.7/site-packages"
-export R_LIBS_SITE="/root/.guix-profile/site-library/"
-#+end_src
+: /home/user/.guix-profile/bin/mysqld -u mysql --explicit_defaults_for_timestamp --datadir=/gnu/mysql
 
-and copy-paste the listed exports into the terminal before running:
+When you get errors like:
 
-#+begin_src bash
-genenetwork2
-#+end_src
+: sqlalchemy.exc.IntegrityError: (_mysql_exceptions.IntegrityError) (1215, 'Cannot add foreign key constraint')
 
-It will complain that the database is missing. See the next section on
-running MySQL server for downloading and installing a MySQL GN2
-database. After installing the database restart genenetwork2 and point
-your browser at [[http://localhost:5003/]].
+you may need to set
 
-End of the GN2 installation recipe!
+: set foreign_key_checks=0
 
-* Run MySQL server
+** Load the small database in MySQL
 
 At this point we require the underlying distribution to install and
-run mysqld. Currently we have two databases for deployment,
+run mysqld (see the previous section for GNU Guix). Currently we have two databases for deployment,
 'db_webqtl_s' is the small testing database containing experiments
 from BXD mice and 'db_webqtl_plant' which contains all plant related
 material.
 
 Download one database from
 
-http://files.genenetwork.org/raw_database/
-https://s3.amazonaws.com/genenetwork2/db_webqtl_s.zip
+[[http://files.genenetwork.org/raw_database/]]
+
+[[https://s3.amazonaws.com/genenetwork2/db_webqtl_s.zip]]
 
 Check the md5sum.
 
 After installation inflate the database binary in the MySQL directory
-(this installation path is subject to change soon)
 
+: cd ~/mysql
 : chown -R mysql:mysql db_webqtl_s/
 : chmod 700 db_webqtl_s/
 : chmod 660 db_webqtl_s/*
 
-restart MySQL service (mysqld). Login as root and
+restart MySQL service (mysqld). Login as root
+
+: mysql -u root
+
+and
 
 : mysql> show databases;
 : +--------------------+
@@ -241,9 +175,12 @@ Set permissions and match password in your settings file below:
 
 : mysql> grant all privileges on db_webqtl_s.* to gn2@"localhost" identified by 'mysql_password';
 
+You may need to change "localhost" to whatever domain you are
+connecting from (mysql will give an error).
+
 Note that if the mysql connection is not working, try connecting to
 the IP address and check server firewall, hosts.allow and mysql IP
-configuration.
+configuration (see below).
 
 Note for the plant database you can rename it to db_webqtl_s, or
 change the settings in etc/default_settings.py to match your path.
@@ -255,183 +192,44 @@ Graph of all runtime dependencies as installed by GNU Guix.
 #+ATTR_HTML: :title GN2_graph
 http://biogems.info/contrib/genenetwork/gn2.svg
 
-* Source deployment
-
-This section gives a more elaborate instruction for installing GN2
-from source.
-
-First execute above 4 steps:
-
-  - [[#step-1-install-gnu-guix][Step 1: Install GNU Guix]]
-  - [[#step-2-checkout-the-gn2-git-repositories][Step 2: Checkout the GN2 git repositories]]
-  - [[#step-3-authorize-the-gn-guix-server][Step 3: Authorize the GN Guix server]]
-  - [[#step-4-install-and-run-gn2-][Step 4: Install and run GN2 ]]
-
-
-** Run your own copy of GN2
-
-At some point you may want to fix the source code. Assuming you have
-Guix and Genenetwork2 installed (as described above) clone the GN2
-repository from https://github.com/genenetwork/genenetwork2.
-
-Copy-paste the paths into your terminal (mainly so PYTHON_PATH and
-R_LIBS_SITE are set) from the information given by guix:
-
-: guix package --search-paths
-
-Inside the repository:
-
-: cd genenetwork2
-: ./bin/genenetwork2
-
-Will fire up your local repo http://localhost:5003/ using the
-settings in ./etc/default_settings.py. These settings may
-not reflect your system. To override settings create your own from a copy of
-default_settings.py and pass it into GN2 with
-
-: ./bin/genenetwork2 $HOME/my_settings.py
-
-and everything *should* work (note the full path to the settings
-file). This way we develop against the exact same dependency graph of
-software.
-
-If something is not working, take a hint from the settings file
-that comes in the Guix installation. It sits in something like
+* Working with the GN2 source code
 
-: cat ~/.guix-profile/lib/python2.7/site-packages/genenetwork2-2.0-py2.7.egg/etc/default_settings.py
+See [[development.org]].
 
-** Set up nginx port forwarding
+* Running ElasticSearch
 
-nginx can be used as a reverse proxy for GN2. For example, we want to
-expose GN2 on port 80 while it is running on port 5003. Essentially
-the configuration looks like
+In order to start up elasticsearch:
+Penguin - change user to "elasticsearch" and use the following command: "env JAVA_HOME=/opt/jdk-9.0.4 /opt/elasticsearch-6.2.1/bin/elasticsearch"
 
-#+begin_src js
-    server {
-        listen 80;
-        server_name test-gn2.genenetwork.org;
-        access_log logs/test-gn2.access.log;
-        proxy_connect_timeout 3000;
-        proxy_send_timeout 3000;
-        proxy_read_timeout 3000;
-        send_timeout 3000;
+** SystemD
 
-        location / {
-            proxy_set_header Host $http_host;
-            proxy_set_header Connection keep-alive;
-            proxy_set_header X-Real-IP $remote_addr;
-            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
-            proxy_set_header X-Forwarded-Host $server_name;
-            proxy_pass http://127.0.0.1:5003;
-        }
-}
-#+end_src js
+New server - as root run "systemctl restart elasticsearch"
 
-Install the nginx webserver (as root)
+#+BEGIN_SRC
+tux01:/etc/systemd/system# cat elasticsearch.service
+[Unit]
+Description=Run Elasticsearch
 
-: guix package -i nginx
+[Service]
+ExecStart=/opt/elasticsearch-6.2.1/bin/elasticsearch
+Environment=JAVA_HOME=/opt/jdk-9.0.4
+Environment="ES_JAVA_OPTS=-Xms1g -Xmx8g"
+Environment="PATH=/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/opt/jdk-9.0.4/bin"
+LimitNOFILE=65536
+StandardOutput=syslog
+StandardError=syslog
+User=elasticsearch
 
-The nginx example configuration examples can be found in the Guix
-store through
+[Install]
+WantedBy=multi-user.target
+#+END_SRC
 
-: ls -l /root/.guix-profile/sbin/nginx
-: lrwxrwxrwx 3 root guixbuild 66 Dec 31 1969 /root/.guix-profile/sbin/nginx -> /gnu/store/g0wrcl5z27rmk5b52rldzvk1bzzbnz2l-nginx-1.8.1/sbin/nginx
+* Read more
 
-Use that path
-
-: ls /gnu/store/g0wrcl5z27rmk5b52rldzvk1bzzbnz2l-nginx-1.8.1/share/nginx/conf/
-: fastcgi.conf            koi-win             scgi_params
-: fastcgi.conf.default    mime.types          scgi_params.default
-: fastcgi_params          mime.types.default  uwsgi_params
-: fastcgi_params.default  nginx.conf          uwsgi_params.default
-: koi-utf                 nginx.conf.default  win-utf
-
-And copy any relevant files to /etc/nginx. A configuration file for
-GeneNetwork (reverse proxy) port forwarding can be found in the source
-repository under ./etc/nginx-genenetwork.conf. Copy this file to /etc
-(still as root)
-: cp ./etc/nginx-genenetwork.conf /etc/nginx/
-
-Make dirs
-
-: mkdir -p /var/spool/nginx/logs
-
-Add users
-
-: adduser nobody ; addgroup nobody
-
-Run nginx
-
-: /root/.guix-profile/sbin/nginx -c /etc/nginx/nginx-genenetwork.conf -p /var/spool/nginx
-
-* Source deployment and other information on reproducibility
-
-See the document [[GUIX-Reproducible-from-source.org]].
-
-** Update to recent guix
-
-We now compile Guix from scratch.
-
-Create, install and run a recent version of the guix-daemon by
-compiling the guix repository you have installed with git in
-step 2. Follow [[https://github.com/pjotrp/guix-notes/blob/master/INSTALL.org#building-gnu-guix-from-source-using-guix][these]] steps carefully after
-
-: cd ~/genenetwork/guix-gn-deploy
-
-Make sure to restart the guix daemon and run guix client from this
-directory.
-
-** Install GN2
-
-Reinstall genenetwork2 using the new tree
-
-#+begin_src bash
-env GUIX_PACKAGE_PATH=~/genenetwork/guix-bioinformatics/ ./pre-inst-env guix package -i genenetwork2 --substitute-urls="http://guix.genenetwork.org https://mirror.guixsd.org"
-#+end_src bash
-
-Note the use of ./pre-inst-env here!
-
-Actually, it should be the same installation as in step 4, so nothing
-gets downloaded.
-
-** Run GN2
-
-Make a note of the paths with
-
-#+begin_src bash
-./pre-inst-env guix package --search-paths
-#+end_src bash
-
-or this should also work if guix is installed
-
-#+begin_src bash
-guix package --search-paths
-#+end_src bash
-
-After setting the paths for the server
-
-#+begin_src bash
-export PATH=~/.guix-profile/bin:$PATH
-export PYTHONPATH="$HOME/.guix-profile/lib/python2.7/site-packages"
-export R_LIBS_SITE="$HOME/.guix-profile/site-library/"
-export GUIX_GTK3_PATH="$HOME/.guix-profile/lib/gtk-3.0"
-export GI_TYPELIB_PATH="$HOME/.guix-profile/lib/girepository-1.0"
-export XDG_DATA_DIRS="$HOME/.guix-profile/share"
-export GIO_EXTRA_MODULES="$HOME/.guix-profile/lib/gio/modules"
-#+end_src bash
-
-run the main script (in ~/.guix-profile/bin)
-
-#+begin_src bash
-genenetwork2
-#+end_src bash
-
-will start the default server which listens on port 5003, i.e.,
-http://localhost:5003/.
-
-OK, we are where we were before with step 4. Only difference is that we
-used our own compiled guix server.
+If you want to understand the architecture of GN2 read
+[[Architecture.org]]. The rest of this document is mostly on deployment
+of GN2.
 
 * Trouble shooting
 
@@ -451,13 +249,17 @@ On one system:
 : export GEM_PATH="$HOME/.guix-profile/lib/ruby/gems/2.2.0"
 
 and perhaps a few more.
-** ERROR: can not find directory $HOME/gn2_data
+** ERROR: 'can not find directory $HOME/gn2_data' or 'can not find directory $HOME/genotype_files/genotype'
 
 The default settings file looks in your $HOME/gn2_data. Since these
 files come with a Guix installation you should take a hint from the
 values in the installed version of default_settings.py (see above in
 this document).
 
+You can use the GENENETWORK_FILES switch to set the datadir, for example
+
+: env GN2_PROFILE=~/opt/gn-latest GENENETWORK_FILES=/gnu/data/gn2_data ./bin/genenetwork2
+
 ** Can't run a module
 
 In rare cases, development modules are not brought in with Guix
@@ -479,6 +281,26 @@ R_LIBS_SITE. Please check your GNU Guix GN2 installation paths,
 you man need to reinstall. Note that this may be the point you may
 want to start using profiles (see profile section).
 
+** Mysql can't connect server through socket ERROR
+
+The following error
+
+: sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (2002, 'Can\'t connect to local MySQL server through socket \'/run/mysqld/mysqld.sock\' (2 "No such file or directory")')
+
+means that MySQL is trying to connect locally to a non-existent MySQL
+server, something you may see in a container. Typically replicated with something like
+
+: mysql -h localhost
+
+try to connect over the network interface instead, e.g.
+
+: mysql -h 127.0.0.1
+
+if that works run genenetwork after setting SQL_URI to something like
+
+: export SQL_URI=mysql://gn2:mysql_password@127.0.0.1/db_webqtl_s
+
+
 * IRC session
 
 Here an IRC session where we installed GN2 from scratch using GNU Guix