author: Pjotr Prins, 2021-08-16 10:46:34 +0200
committer: Pjotr Prins, 2021-08-16 10:46:41 +0200
commit: bece7b48c12379f3ac2ea9c6f67a145f47d17c8d
tree: 19567360abbc672fe7946fc74a320d02726b4f67 /issues
parent: d1b00db615b786a8248851ba9ca9f8ac0da2352e
gemma: testing slow gemma with profiler
Diffstat (limited to 'issues')
-rw-r--r-- issues/gemma/multivariate_gemma_hangs-issue243.gmi | 35
1 files changed, 33 insertions, 2 deletions
diff --git a/issues/gemma/multivariate_gemma_hangs-issue243.gmi b/issues/gemma/multivariate_gemma_hangs-issue243.gmi
index 255de8b..b5f07b2 100644
--- a/issues/gemma/multivariate_gemma_hangs-issue243.gmi
+++ b/issues/gemma/multivariate_gemma_hangs-issue243.gmi
@@ -1,6 +1,8 @@
# Multivariate GEMMA hangs (issue 243)
-The following command does finish
+=> https://github.com/genetics-statistics/GEMMA/issues/243
+
+The following command finishes:
```
time ../../bin/gemma -bfile multivariate_2traits -k output/multivariate.cXX.txt -maf 0.0000001 -lmm 4 -n 1 2 -o gemma.polygenic.result
@@ -122,4 +124,33 @@ at src/gemma.cpp:2830
eval=eval@entry=0x5233c0, UtW=UtW@entry=0x5233f0, UtY=UtY@entry=0x523430) at src/mvlmm.cpp:3806
```
-A profiler should help pinpoint where time is spent. The issue submitter admits that this is a contrived edge-case so I am not going to work on it until I start optimizing mvlmm.
+The backtrace shows that time is spent mostly in one place.
+
+A profiler should help pinpoint where time is spent. The issue submitter admits that this is a contrived edge case, so I am not going to work on it until I start optimizing mvlmm. For now, just a quick check with gperftools (formerly the Google profiler), which is packaged in GNU Guix:
+
+Profiling the gemma run above gives:
+
+```
+ 120 7.3% 7.3% 155 9.4% CalcQi
+ 96 5.8% 13.1% 96 5.8% dgemm_kernel_ZEN
+ 88 5.3% 18.4% 88 5.3% __sched_yield
+ 81 4.9% 23.3% 109 6.6% blas_memory_free
+ 77 4.7% 28.0% 77 4.7% __pthread_mutex_unlock_usercnt
+ 71 4.3% 32.3% 94 5.7% CalcXHiY
+ 69 4.2% 36.5% 87 5.3% dgemm_nn
+ 66 4.0% 40.5% 66 4.0% __pthread_mutex_lock
+ 63 3.8% 44.3% 63 3.8% __ieee754_log_fma
+ 59 3.6% 47.9% 179 10.9% blas_memory_alloc
+ 58 3.5% 51.4% 58 3.5% gsl_vector_get
+ 57 3.5% 54.9% 88 5.3% dgemm_nt
+ 56 3.4% 58.3% 80 4.9% CalcOmega
+ 56 3.4% 61.7% 421 25.5% cblas_dgemm
+ 54 3.3% 64.9% 54 3.3% dgemm_beta_ZEN
+ 51 3.1% 68.0% 537 32.6% CalcSigma
+ 51 3.1% 71.1% 76 4.6% dsyr_thread_L
+ 43 2.6% 73.7% 43 2.6% gsl_matrix_get
+ 41 2.5% 76.2% 41 2.5% gsl_matrix_set
+ 37 2.2% 78.5% 130 7.9% MphCalcLogL
+```
+
+CalcSigma and CalcQi are worth looking into. Also cblas_dgemm may need some attention.
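
As a rough aid for reading the table: in gperftools `pprof --text` output the columns are flat samples, flat percentage, running sum of flat percentages, cumulative samples (the function plus its callees), cumulative percentage, and function name. The Python sketch below is not part of GEMMA or gperftools. It re-ranks a handful of rows copied from the profile by cumulative samples, and the helper names `parse_rows` and `rank_by_cumulative` are made up for illustration.

```python
# Rows copied from the profile above (a subset only). Columns are assumed to
# follow the gperftools `pprof --text` layout:
# flat samples, flat %, running sum %, cumulative samples, cumulative %, name
PROFILE = """\
     120   7.3%   7.3%      155   9.4% CalcQi
      59   3.6%  47.9%      179  10.9% blas_memory_alloc
      56   3.4%  61.7%      421  25.5% cblas_dgemm
      51   3.1%  68.0%      537  32.6% CalcSigma
      37   2.2%  78.5%      130   7.9% MphCalcLogL
"""


def parse_rows(text):
    """Return (name, flat_samples, cumulative_samples) for each 6-field row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        flat, _flat_pct, _sum_pct, cum, _cum_pct, name = fields
        rows.append((name, int(flat), int(cum)))
    return rows


def rank_by_cumulative(rows):
    """Sort rows so the functions with the most cumulative samples come first."""
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for name, flat, cum in rank_by_cumulative(parse_rows(PROFILE)):
        print(f"{cum:6d} {flat:6d} {name}")
```

Ranked this way, CalcSigma (537 cumulative samples, 32.6%) and cblas_dgemm (421, 25.5%) come out on top, while CalcQi has the largest flat count (120), which is consistent with the note above.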