author | Pjotr Prins | 2021-08-16 10:46:34 +0200 |
---|---|---|
committer | Pjotr Prins | 2021-08-16 10:46:41 +0200 |
commit | bece7b48c12379f3ac2ea9c6f67a145f47d17c8d (patch) | |
tree | 19567360abbc672fe7946fc74a320d02726b4f67 /issues/gemma/multivariate_gemma_hangs-issue243.gmi | |
parent | d1b00db615b786a8248851ba9ca9f8ac0da2352e (diff) | |
download | gn-gemtext-bece7b48c12379f3ac2ea9c6f67a145f47d17c8d.tar.gz |
gemma: testing slow gemma with profiler
Diffstat (limited to 'issues/gemma/multivariate_gemma_hangs-issue243.gmi')
-rw-r--r-- | issues/gemma/multivariate_gemma_hangs-issue243.gmi | 35 |
1 files changed, 33 insertions, 2 deletions
diff --git a/issues/gemma/multivariate_gemma_hangs-issue243.gmi b/issues/gemma/multivariate_gemma_hangs-issue243.gmi
index 255de8b..b5f07b2 100644
--- a/issues/gemma/multivariate_gemma_hangs-issue243.gmi
+++ b/issues/gemma/multivariate_gemma_hangs-issue243.gmi
@@ -1,6 +1,8 @@
 # Multivariate GEMMA hangs (issue 243)
 
-The following command does finish
+=> https://github.com/genetics-statistics/GEMMA/issues/243
+
+The following command finishes:
 
 ```
 time ../../bin/gemma -bfile multivariate_2traits -k output/multivariate.cXX.txt -maf 0.0000001 -lmm 4 -n 1 2 -o gemma.polygenic.result
@@ -122,4 +124,33 @@ at src/gemma.cpp:2830
 eval=eval@entry=0x5233c0, UtW=UtW@entry=0x5233f0, UtY=UtY@entry=0x523430) at src/mvlmm.cpp:3806
 ```
 
-A profiler should help pinpoint where time is spent. The issue submitter admits that this is a contrived edge-case so I am not going to work on it until I start optimizing mvlmm.
+Shows time is spent mostly in one place.
+
+A profiler should help pinpoint where time is spent. The issue submitter admits that this is a contrived edge-case so I am not going to work on it until I start optimizing mvlmm. Just a quick check with gperftools (formerly the Google profiler) which is packaged in GNU Guix:
+
+Profiling above gemma dataset
+
+```
+ 120 7.3% 7.3% 155 9.4% CalcQi
+ 96 5.8% 13.1% 96 5.8% dgemm_kernel_ZEN
+ 88 5.3% 18.4% 88 5.3% __sched_yield
+ 81 4.9% 23.3% 109 6.6% blas_memory_free
+ 77 4.7% 28.0% 77 4.7% __pthread_mutex_unlock_usercnt
+ 71 4.3% 32.3% 94 5.7% CalcXHiY
+ 69 4.2% 36.5% 87 5.3% dgemm_nn
+ 66 4.0% 40.5% 66 4.0% __pthread_mutex_lock
+ 63 3.8% 44.3% 63 3.8% __ieee754_log_fma
+ 59 3.6% 47.9% 179 10.9% blas_memory_alloc
+ 58 3.5% 51.4% 58 3.5% gsl_vector_get
+ 57 3.5% 54.9% 88 5.3% dgemm_nt
+ 56 3.4% 58.3% 80 4.9% CalcOmega
+ 56 3.4% 61.7% 421 25.5% cblas_dgemm
+ 54 3.3% 64.9% 54 3.3% dgemm_beta_ZEN
+ 51 3.1% 68.0% 537 32.6% CalcSigma
+ 51 3.1% 71.1% 76 4.6% dsyr_thread_L
+ 43 2.6% 73.7% 43 2.6% gsl_matrix_get
+ 41 2.5% 76.2% 41 2.5% gsl_matrix_set
+ 37 2.2% 78.5% 130 7.9% MphCalcLogL
+```
+
+CalcSigma and CalcQi are worth looking into. Also cblas_dgemm may need some attention.
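
Note on reading the profile: the six columns follow the usual pprof text layout (flat samples, flat %, running sum %, cumulative samples, cumulative %, symbol). The sketch below is not part of the commit; it is a minimal illustration, using a hand-copied subset of the rows above and the hypothetical helper names `parse_rows` and `rank_by_cumulative`, of how sorting by cumulative samples surfaces the functions named in the closing remark.

```python
# A subset of rows copied from the profile in the diff above.
# Columns: flat samples, flat %, running sum %, cumulative samples,
# cumulative %, symbol.
PROFILE = """\
 120 7.3% 7.3% 155 9.4% CalcQi
 96 5.8% 13.1% 96 5.8% dgemm_kernel_ZEN
 59 3.6% 47.9% 179 10.9% blas_memory_alloc
 56 3.4% 61.7% 421 25.5% cblas_dgemm
 51 3.1% 68.0% 537 32.6% CalcSigma
 37 2.2% 78.5% 130 7.9% MphCalcLogL
"""


def parse_rows(text):
    """Parse profile rows into (symbol, flat_samples, cumulative_samples)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        flat, _flat_pct, _sum_pct, cum, _cum_pct, symbol = fields
        rows.append((symbol, int(flat), int(cum)))
    return rows


def rank_by_cumulative(rows):
    """Order rows by cumulative samples (function plus callees), highest first."""
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for symbol, flat, cum in rank_by_cumulative(parse_rows(PROFILE)):
        print(f"{symbol:20s} flat={flat:4d} cumulative={cum:4d}")
```

Ranked this way, CalcSigma and cblas_dgemm come first because most of their time is spent in callees, while CalcQi leads on flat samples, i.e. time spent in its own body.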