Diffstat (limited to 'doc')
-rw-r--r--  doc/database.org          150
-rw-r--r--  doc/docker-container.org   12
-rw-r--r--  doc/rpy2-performance.org  182
3 files changed, 337 insertions, 7 deletions
diff --git a/doc/database.org b/doc/database.org
index 5107b660..d5462d4e 100644
--- a/doc/database.org
+++ b/doc/database.org
@@ -1339,7 +1339,8 @@ The SNP count info for the BXD is calculated like this
startMb += stepMb
#+end_src
-select * from BXDSnpPosition limit 5;
+: select * from BXDSnpPosition limit 5;
+
+------+-----------+-----------+----------+
| Chr | StrainId1 | StrainId2 | Mb |
+------+-----------+-----------+----------+
@@ -1368,3 +1369,150 @@ mysql> select * from SnpSource limit 5;
Empty set (0.00 sec)
Hmmm. This is the test database. Then there are the plink files and VCF files.
+
+* Optimize SQL?
+
+We were facing some issues with slow queries. A query
+was really slow on Penguin2:
+
+: time mysql -u webqtlout -pwebqtlout db_webqtl < ~/chunk.sql > /dev/null
+: real 0m13.082s
+: user 0m0.292s
+: sys 0m0.032s
+
+The query runs in 1s on Tux01 but took 13s on Penguin2. Why is that?
+The gist of the fix was increasing an InnoDB cache size(!)
+
+Interestingly, Penguin2 runs InnoDB on much slower storage, and it
+has more indices than Tux01(?!), probably left over from earlier
+attempts to make the data tables faster.
+
+Meanwhile the query is one with many joins:
+
+#+begin_src sql
+SELECT ProbeSet.Name,ProbeSetXRef.DataId, T4.value, T5.value, T6.value, T7.value, T8.value, T9.value, T10.value, T11.value, T12.value, T14.value, T15.value, T17.value, T18.value, T19.value, T20.value, T21.value, T22.value, T24.value, T25.value, T26.value, T28.value, T29.value, T30.value, T31.value, T35.value, T36.value, T37.value, T39.value, T98.value, T99.value, T100.value, T103.value, T487.value, T105.value, T106.value, T110.value FROM (ProbeSet, ProbeSetXRef, ProbeSetFreeze)
+ left join ProbeSetData as T4 on T4.Id = ProbeSetXRef.DataId
+ and T4.StrainId=4
+ (...)
+ left join ProbeSetData as T110 on T110.Id = ProbeSetXRef.DataId
+ and T110.StrainId=110
+ WHERE ProbeSetXRef.ProbeSetFreezeId = ProbeSetFreeze.Id
+ and ProbeSetFreeze.Name = 'HC_M2_0606_P'
+ and ProbeSet.Id = ProbeSetXRef.ProbeSetId
+ order by ProbeSet.Id
+#+end_src
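The `(...)` above elides one LEFT JOIN per strain. A query this repetitive is typically generated programmatically; as a sketch (a hypothetical helper for illustration, not GN2's actual code), the query for a list of strain ids could be built like this:

```python
def build_chunk_query(strain_ids, freeze_name="HC_M2_0606_P"):
    """Build the per-strain LEFT JOIN query shown above.

    Hypothetical helper for illustration; the real code that
    produced ~/chunk.sql may differ.
    """
    cols = ", ".join(f"T{s}.value" for s in strain_ids)
    joins = "\n".join(
        f" left join ProbeSetData as T{s} on T{s}.Id = ProbeSetXRef.DataId"
        f" and T{s}.StrainId={s}"
        for s in strain_ids)
    return (
        f"SELECT ProbeSet.Name, ProbeSetXRef.DataId, {cols}\n"
        " FROM (ProbeSet, ProbeSetXRef, ProbeSetFreeze)\n"
        f"{joins}\n"
        " WHERE ProbeSetXRef.ProbeSetFreezeId = ProbeSetFreeze.Id\n"
        f" and ProbeSetFreeze.Name = '{freeze_name}'\n"
        " and ProbeSet.Id = ProbeSetXRef.ProbeSetId\n"
        " order by ProbeSet.Id")

sql = build_chunk_query([4, 5, 110])
```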
+
+And is blazingly fast on Tux01 and (now) fast enough on Penguin2.
+
+First I checked the tables for indices and storage type. Next I
+checked the difference in configuration.
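The checks below boil down to a few standard MariaDB statements, run on both hosts (stock SHOW commands, listed here for convenience):

```sql
-- Compare the indices on a table (repeat for each table involved)
SHOW INDEX FROM ProbeSetData;
-- The Engine column tells MyISAM and InnoDB apart
SHOW TABLE STATUS WHERE Name = 'ProbeSetData';
-- The InnoDB cache setting that turned out to matter
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
```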
+
+** Check tables
+
+Tables (ProbeSetData, ProbeSet, ProbeSetXRef, ProbeSetFreeze)
+
+*** ProbeSetData
+
+Same on Tux01 and P2:
+
+: show indexes from ProbeSetData ;
+
+| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+|--------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------|
+| ProbeSetData | 0 | DataId | 1 | Id | A | 47769944 | NULL | NULL | | BTREE | | |
+| ProbeSetData | 0 | DataId | 2 | StrainId | A | 5111384047 | NULL | NULL | | BTREE | | |
+
+*** ProbeSetFreeze
+
+Tux01 has fewer indexes than P2(!). On Tux01:
+
+| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+|----------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------|
+| ProbeSetFreeze | 0 | PRIMARY | 1 | Id | A | 911 | NULL | NULL | | BTREE | | |
+| ProbeSetFreeze | 0 | FullName | 1 | FullName | A | 911 | NULL | NULL | | BTREE | | |
+| ProbeSetFreeze | 0 | Name | 1 | Name | A | 911 | NULL | NULL | YES | BTREE | | |
+| ProbeSetFreeze | 1 | NameIndex | 1 | Name2 | A | 911 | NULL | NULL | | BTREE | | |
+: 4 rows in set (0.000 sec)
+
+And on Penguin2:
+
+| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+|----------------+------------+-----------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------|
+| ProbeSetFreeze | 0 | PRIMARY | 1 | Id | A | 883 | NULL | NULL | | BTREE | | |
+| ProbeSetFreeze | 0 | FullName | 1 | FullName | A | 883 | NULL | NULL | | BTREE | | |
+| ProbeSetFreeze | 0 | Name | 1 | Name | A | 883 | NULL | NULL | YES | BTREE | | |
+| ProbeSetFreeze | 1 | NameIndex | 1 | Name2 | A | 883 | NULL | NULL | | BTREE | | |
+| ProbeSetFreeze | 1 | ShortName | 1 | ShortName | A | 883 | NULL | NULL | | BTREE | | |
+| ProbeSetFreeze | 1 | ProbeFreezeId | 1 | ProbeFreezeId | A | 441 | NULL | NULL | | BTREE | | |
+| ProbeSetFreeze | 1 | conf_and_public | 1 | confidentiality | A | 3 | NULL | NULL | | BTREE | | |
+| ProbeSetFreeze | 1 | conf_and_public | 2 | public | A | 4 | NULL | NULL | | BTREE | | |
+: 8 rows in set (0.00 sec)
+
+*** ProbeSet
+
+Identical indexes on Tux01 and P2.
+
+*** ProbeSetXRef
+
+Tux01 has fewer indexes than P2(!). On Tux01:
+
+: MariaDB [db_webqtl]> show indexes from ProbeSetXRef ;
+| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+|--------------+------------+------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------|
+| ProbeSetXRef | 0 | ProbeSetId | 1 | ProbeSetFreezeId | A | 885 | NULL | NULL | | BTREE | | |
+| ProbeSetXRef | 0 | ProbeSetId | 2 | ProbeSetId | A | 47713039 | NULL | NULL | | BTREE | | |
+| ProbeSetXRef | 0 | DataId_IDX | 1 | DataId | A | 47713039 | NULL | NULL | | BTREE | | |
+| ProbeSetXRef | 1 | Locus_IDX | 1 | Locus | A | 15904346 | NULL | NULL | YES | BTREE | | |
+: 4 rows in set (0.000 sec)
+
+And on Penguin2:
+
+: MariaDB [db_webqtl]> show indexes from ProbeSetXRef ;
+| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+|--------------+------------+-------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------|
+| ProbeSetXRef | 0 | ProbeSetId | 1 | ProbeSetFreezeId | A | 856 | NULL | NULL | | BTREE | | |
+| ProbeSetXRef | 0 | ProbeSetId | 2 | ProbeSetId | A | 46412145 | NULL | NULL | | BTREE | | |
+| ProbeSetXRef | 0 | DataId_IDX | 1 | DataId | A | 46412145 | NULL | NULL | | BTREE | | |
+| ProbeSetXRef | 1 | ProbeSetId1 | 1 | ProbeSetId | A | 5156905 | NULL | NULL | | BTREE | | |
+| ProbeSetXRef | 1 | Locus | 1 | Locus | A | 23206072 | NULL | NULL | YES | BTREE | | |
+: 5 rows in set (0.00 sec)
+
+** Check storage
+
+The database on Tux01 is mounted on NVMe storage. On Penguin2 it
+sits on slower SATA disks in a RAID5 array.
+
+Also, on Penguin2 the following tables use InnoDB instead of
+MyISAM:
+
+#+begin_src sh
+-rw-rw---- 1 mysql mysql 79691776 Oct 15 2019 AccessLog.ibd
+-rw-rw---- 1 mysql mysql 196608 Oct 24 2019 Docs.ibd
+-rw-rw---- 1 mysql mysql 63673729024 Jul 10 2020 GenoData.ibd
+-rw-rw---- 1 mysql mysql 34787557376 Jul 9 2020 ProbeData.ibd
+-rw-rw---- 1 mysql mysql 254690721792 Jul 10 2020 ProbeSetData.ibd
+-rw-rw---- 1 mysql mysql 32103202816 Jul 9 2020 SnpAll.ibd
+-rw-rw---- 1 mysql mysql 98304 May 6 2020 TraitMetadata.ibd
+#+end_src
+
+This [[https://www.liquidweb.com/kb/mysql-performance-myisam-vs-innodb/][article]] suggests that MyISAM will be faster for our use case.
+
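If we follow that advice, converting a table back to MyISAM is standard MariaDB DDL (note that this rewrites the whole table; for a ~250 GB ProbeSetData.ibd it will take a long time and lock the table while it runs):

```sql
ALTER TABLE ProbeSetData ENGINE=MyISAM;
```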
+** Configuration
+
+There was one setting present on Tux01 but missing on P2:
+
+: +innodb_buffer_pool_size=1024M
+
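In the MariaDB server configuration that looks something like the fragment below (the file path varies per install and is an assumption here):

```ini
# e.g. /etc/mysql/mariadb.conf.d/50-server.cnf (path is an assumption)
[mysqld]
innodb_buffer_pool_size = 1024M
```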
+Running the same query twice (so you can see the warm-up after a
+restart of MariaDB):
+
+#+begin_src sh
+penguin2:/etc$ time mysql -u webqtlout -pwebqtlout db_webqtl < ~/chunk.sql > ~/test.out
+real 0m4.253s
+user 0m0.276s
+sys 0m0.040s
+penguin2:/etc$ time mysql -u webqtlout -pwebqtlout db_webqtl < ~/chunk.sql > ~/test.out
+real 0m2.633s
+user 0m0.296s
+sys 0m0.028s
+#+end_src
+
+That is much better :)
diff --git a/doc/docker-container.org b/doc/docker-container.org
index ec91824a..ef0d71fc 100644
--- a/doc/docker-container.org
+++ b/doc/docker-container.org
@@ -32,13 +32,13 @@ First create the gn2 tar archive by running:
env GUIX_PACKAGE_PATH="/home/bonface/projects/guix-bioinformatics:/home/bonface/projects/guix-past/modules" \
./pre-inst-env guix pack --no-grafts\
-S /gn2-profile=/ \
- screen genenetwork2
+ screen python2-genenetwork2
# For the Python 3 version:
env GUIX_PACKAGE_PATH="/home/bonface/projects/guix-bioinformatics:/home/bonface/projects/guix-past/modules" \
./pre-inst-env guix pack --no-grafts\
-S /gn2-profile=/ \
- screen python3-genenetwork2
+ screen genenetwork2
#+end_src
The output will look something similar to:
@@ -59,16 +59,16 @@ RUN tar -xzf /tmp/gn2.tar.gz -C / && rm -f /tmp/gn2.tar.gz && \
Build the image(Note the fullstop at the end):
-: sudo docker build -t python2-genenetwork2:latest -f Dockerfile .
+: sudo docker build -t genenetwork2:latest -f Dockerfile .
To load the image interactively you've just created:
-: docker run -ti "python2-genenetwork2:latest" bash
+: docker run -ti "genenetwork2:latest" bash
Assuming you have a docker instance running, you could always run
commands in it e.g:
-: docker run "python2-genenetwork2:latest" python --version
+: docker run "genenetwork2:latest" python --version
* Pushing to DockerHub
@@ -78,7 +78,7 @@ CI environment using Github Actions.
To push to dockerhub, first get the image name by running =docker
images=. Push to dockerhub using a command similar to:
-: docker push bonfacekilz/python2-genenetwork2:latest
+: docker push bonfacekilz/genenetwork2:latest
Right now, we have 2 images on DockerHub:
diff --git a/doc/rpy2-performance.org b/doc/rpy2-performance.org
new file mode 100644
index 00000000..8f917ca0
--- /dev/null
+++ b/doc/rpy2-performance.org
@@ -0,0 +1,182 @@
+* Python-Rpy2 performance issues with genenetwork2
+
+At one point, genenetwork2 was down. A possible cause was that, due
+to rpy2 (v3.4.4), it wrote to the log file in an infinite loop, so
+the immediate fix was to empty the log. Currently, as a workaround,
+rpy2 is disabled by removing its imports. This affects the WGCNA and
+CTL features and required commenting out the Biweight Midcorrelation
+option on the trait page. See:
+
+- [[https://github.com/genenetwork/genenetwork2/commit/1baf5f7611909c651483208184c5fbf7d4a7a088][1baf5f7]]
+- [[https://github.com/genenetwork/genenetwork2/commit/afee4d625248565857df98d3510f680ae6204864][afee4d6]]
+- [[https://github.com/genenetwork/genenetwork2/commit/c458bf0ad731e5e5fd9cbd0686936b3a441bae63][c458bf0]]
+- [[https://github.com/genenetwork/genenetwork2/commit/d31f3f763471b19559ca74e73b52b3cb5e7153ce][d31f3f7]]
+
+** Reproducing the problem
+
+I went back to commit #b8408cea. As for the logs, I never
+experienced any log issue; perhaps that is because of how I start my
+server:
+
+: env SERVER_PORT=5004 TMPDIR=/home/bonface/tmp WEBSERVER_MODE=DEBUG LOG_LEVEL=DEBUG GENENETWORK_FILES=/home/bonface/data/genotype_files/ GN2_PROFILE=/home/bonface/opt/python3-genenetwork2 ./scripts/run_debug.sh
+
+However, when loading the homepage, I occasionally ran into this trace:
+
+#+begin_src
+DEBUG:wqflask.views:.check_access_permissions: @app.before_request check_access_permissions
+DEBUG:wqflask.views:.shutdown_session: remove db_session
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Error: ignoring SIGPIPE signal
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: In addition:
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Warning messages:
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: 1:
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: In (function (package, help, pos = 2, lib.loc = NULL, character.only = FALSE, :
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]:
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: library '/home/bonface/R/x86_64-unknown-linux-gnu-library/4.0' contains no packages
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: 2:
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: In (function (package, help, pos = 2, lib.loc = NULL, character.only = FALSE, :
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]:
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: library '/home/bonface/R/x86_64-unknown-linux-gnu-library/4.0' contains no packages
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Fatal error: unable to initialize the JIT
+
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]:
+ *** caught segfault ***
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: address (nil), cause 'memory not mapped'
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]:
+Possible actions:
+1: abort (with core dump, if enabled)
+2: normal R exit
+3: exit R without saving workspace
+4: exit R saving workspace
+
+Selection:
+
+#+end_src
+
+This blocks the Flask service. It seems related to [[https://github.com/rpy2/rpy2/issues/769][rpy2-issue#769]]
+and [[https://github.com/rpy2/rpy2/issues/809][rpy2-issue#809]]. I tried to reproduce the problem using a test endpoint:
+
+#+begin_src python
+@app.route("/test")
+def test():
+    from rpy2 import robjects as ro
+    from rpy2 import rinterface
+    from threading import Thread
+
+    def rpy2_init_simple():
+        rinterface.initr_simple()
+
+    thread = Thread(target=rpy2_init_simple)
+    thread.start()
+    return "This is a test after importing rpy2"
+#+end_src
+
+which generates this trace:
+
+#+begin_src
+/home/bonface/opt/python3-genenetwork2/lib/python3.8/site-packages/rpy2/rinterface.py:955: UserWarning: R is not initialized by the main thread.
+ Its taking over SIGINT cannot be reversed here, and as a
+ consequence the embedded R cannot be interrupted with Ctrl-C.
+ Consider (re)setting the signal handler of your choice from
+ the main thread.
+warnings.warn(
+DEBUG:wqflask.views:.shutdown_session: remove db_session
+
+#+end_src
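That warning exists because CPython only allows signal handlers to be installed from the main thread. A stdlib-only, Python-level analogue (no rpy2 needed; embedded R installs its handlers at the C level, but the restriction it trips over is the same idea):

```python
import signal
import threading

result = {}

def install_handler():
    # Like rpy2's initr(), an embedded library may want to install
    # its own signal handlers at init time; CPython forbids doing
    # this from anywhere but the main thread.
    try:
        signal.signal(signal.SIGINT, signal.SIG_IGN)
        result["installed"] = True
    except ValueError as exc:
        result["error"] = str(exc)

t = threading.Thread(target=install_handler)
t.start()
t.join()
```

After the join, `result` holds an error rather than a handler: the worker thread's `signal.signal` call raises `ValueError`.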
+
+Modifying the endpoint to:
+
+#+begin_src python
+@app.route("/test")
+def test():
+    import wqflask.correlation.show_corr_results
+    import wqflask.ctl.ctl_analysis
+    import time
+    from wqflask.correlation.correlation_functions import cal_zero_order_corr_for_tiss
+
+    print("Sleeping for 3 seconds")
+    time.sleep(3)
+    return "This is a test after importing rpy2"
+#+end_src
+
+and refreshing the page a couple of times, I get:
+
+#+begin_src
+DEBUG:wqflask.views:.check_access_permissions: @app.before_request check_access_permissions
+Sleeping for 3 seconds
+DEBUG:wqflask.views:.shutdown_session: remove db_session
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Error: ignoring SIGPIPE signal
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: In addition:
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Warning messages:
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: 1:
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: In (function (package, help, pos = 2, lib.loc = NULL, character.only = FALSE, :
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]:
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: library '/home/bonface/R/x86_64-unknown-linux-gnu-library/4.0' contains no packages
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: 2:
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: In (function (package, help, pos = 2, lib.loc = NULL, character.only = FALSE, :
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]:
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: library '/home/bonface/R/x86_64-unknown-linux-gnu-library/4.0' contains no packages
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]:
+ *** caught segfault ***
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: address (nil), cause 'memory not mapped'
+
+WARNING:rpy2.rinterface_lib.callbacks:R[write to console]:
+Possible actions:
+1: abort (with core dump, if enabled)
+2: normal R exit
+3: exit R without saving workspace
+4: exit R saving workspace
+
+Selection: [2021-06-16 13:11:00 +0300] [18657] [INFO] Handling signal: winch
+[2021-06-16 13:11:00 +0300] [18657] [INFO] Handling signal: winch
+[2021-06-16 13:13:02 +0300] [18657] [INFO] Handling signal: winch
+#+end_src
+
+However, this seems to be non-deterministic, in the sense that I
+can't really pin down what causes the above. I've tried to write a
+Locust test that simulates users hitting that endpoint:
+
+#+begin_src python
+"""Load test a single trait page"""
+from locust import HttpUser, task, between
+
+
+class LoadTest(HttpUser):
+    wait_time = between(1, 2.5)
+
+    @task
+    def fetch_trait(self):
+        """Fetch a single trait"""
+        self.client.get("/test")
+#+end_src
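With the server running on port 5004 (as in the run_debug.sh invocation earlier), the test can be driven headless with Locust's standard CLI; the user count, spawn rate and duration below are arbitrary examples:

```shell
# 10 simulated users, spawning 2 per second, for one minute
locust -f locustfile.py --headless -u 10 -r 2 --run-time 1m \
       --host http://localhost:5004
```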
+
+
+** A possible solution
+
+From this [[https://github.com/rpy2/rpy2/issues/809#issuecomment-845923975][comment]], a possible reason for the above traces is that,
+from Flask's end, a [[https://tldp.org/LDP/lpg/node20.html][SIGPIPE]] is somehow generated by our Python
+code. However, at that particular point the R thread just happens to
+be running, and R can't handle the signal correctly. This seems to
+have been fixed in this [[https://github.com/rpy2/rpy2/pull/810][PR]], with this [[https://github.com/rpy2/rpy2/issues/809#issuecomment-851618215][explanation]]. To pick up
+these changes we have to update our python-rpy2 version.
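The Flask-side SIGPIPE is easy to provoke with the stdlib alone: write to a pipe whose reader has gone away, which is what happens when an HTTP client disconnects mid-response. CPython ignores SIGPIPE at interpreter startup and surfaces the condition as EPIPE/BrokenPipeError, but the raw signal is exactly what a concurrently running embedded R mishandles:

```python
import errno
import os

r, w = os.pipe()
os.close(r)  # the reader disappears, like a dropped HTTP client

got_epipe = False
try:
    # The kernel raises SIGPIPE here; CPython installed SIG_IGN for
    # it at startup, so the write fails with EPIPE instead of
    # killing the process.
    os.write(w, b"response body")
except BrokenPipeError as exc:
    got_epipe = exc.errno == errno.EPIPE
finally:
    os.close(w)
```

In plain Python the condition is handled cleanly; the trouble starts when the signal is delivered while embedded R's own handlers are active.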