summaryrefslogtreecommitdiff
path: root/issues/gemma/multivariate_gemma_hangs-issue243.gmi
blob: 7eea489780bb20f88dbde26d623c74d592449731 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
# Multivariate GEMMA hangs (issue 243)

=> https://github.com/genetics-statistics/GEMMA/issues/243

The simulated dataset was generated.  Benjamin Chu writes: I think the main problem might be how I did the simulation. I believe it was something like

```
b ~ N(0, 1) # for k position of b only, others are 0
yi ~ N(xi^T * b, 1)
```

where yi is length 2 vector of phenotype, xi is length 10000 genotype vector (entries 0, 1 or 2), and b is length 10000 effect size vector but only k position of b is drawn from standard Normal. This might be a poor simulation model for GEMMA. Later I changed the simulation model and the divergence problem went away.

The following command finishes:

```
time ../../bin/gemma -bfile multivariate_2traits -k output/multivariate.cXX.txt -maf 0.0000001 -lmm 4 -n 1 2 -o gemma.polygenic.result
real    133m29.533s
user    133m31.746s
sys     0m5.064s
```

But is slow. A print statement shows close to a second is spent on every SNP:

```
0.83 9994,10000
0.82 9995,10000
0.81 9996,10000
0.80 9997,10000
0.80 9998,10000
0.80 9999,10000
```

Running the debugger on and hitting Ctrl-C repeatedly stops in the following places

```
#0  0x00000000004807f9 in MphCalcLogL (eval=eval@entry=0x5233c0, xHiy=xHiy@entry=0x59e5d0,
    D_l=D_l@entry=0x5ac840, UltVehiY=UltVehiY@entry=0x53e500, Qi=Qi@entry=0x59e2e0) at src/mvlmm.cpp:579
#1  0x000000000048104b in MphEM (func_name=func_name@entry=82 'R', max_iter=1000, max_prec=0.001,
    eval=eval@entry=0x5233c0, X=X@entry=0x540930, Y=Y@entry=0x5408d0, U_hat=U_hat@entry=0x53e380,
    E_hat=0x53e3e0, OmegaU=0x53e440, OmegaE=0x53e4a0, UltVehiY=0x53e500, UltVehiBX=0x53e560,
    UltVehiU=0x53e5c0, UltVehiE=0x53e620, V_g=0x540990, V_e=0x540a20, B=0x540ab0) at src/mvlmm.cpp:662
#2  0x000000000048f924 in MVLMM::AnalyzePlink (this=this@entry=0x7fffffffcf20, U=U@entry=0x523380,
    eval=eval@entry=0x5233c0, UtW=UtW@entry=0x5233f0, UtY=UtY@entry=0x523430) at src/mvlmm.cpp:3803
#3  0x0000000000435c2e in GEMMA::BatchRun (this=this@entry=0x7fffffffe4a0, cPar=...)
    at src/gemma.cpp:2830

#0  0x00007ffff7df8690 in gsl_matrix_sub ()
   from /gnu/store/pmmbmrbizz668plrrvfyvqh5ymvjy5p4-profile/lib/libgsl.so.25
#1  0x000000000047fc91 in UpdateU (OmegaE=OmegaE@entry=0x53e4a0, UltVehiY=UltVehiY@entry=0x53e500,
    UltVehiBX=UltVehiBX@entry=0x53e560, UltVehiU=UltVehiU@entry=0x53e5c0) at src/mvlmm.cpp:387
#2  0x0000000000480e76 in MphEM (func_name=func_name@entry=76 'L', max_iter=1000, max_prec=0.001,
    eval=eval@entry=0x5233c0, X=X@entry=0x540930, Y=Y@entry=0x5408d0, U_hat=U_hat@entry=0x53e380,
    E_hat=0x53e3e0, OmegaU=0x53e440, OmegaE=0x53e4a0, UltVehiY=0x53e500, UltVehiBX=0x53e560,
    UltVehiU=0x53e5c0, UltVehiE=0x53e620, V_g=0x540990, V_e=0x540a20, B=0x540ab0) at src/mvlmm.cpp:686
#3  0x000000000048faed in MVLMM::AnalyzePlink (this=this@entry=0x7fffffffcf20, U=U@entry=0x523380,
    eval=eval@entry=0x5233c0, UtW=UtW@entry=0x5233f0, UtY=UtY@entry=0x523430) at src/mvlmm.cpp:3778
#4  0x0000000000435c2e in GEMMA::BatchRun (this=this@entry=0x7fffffffe4a0, cPar=...)

at src/gemma.cpp:2830

#0  0x00007ffff6300718 in dgemm_nt ()
   from /gnu/store/pmmbmrbizz668plrrvfyvqh5ymvjy5p4-profile/lib/libopenblas.so.0
#1  0x00007ffff622e6f3 in cblas_dgemm ()
   from /gnu/store/pmmbmrbizz668plrrvfyvqh5ymvjy5p4-profile/lib/libopenblas.so.0
#2  0x00007ffff7d6af44 in gsl_blas_dgemm ()
   from /gnu/store/pmmbmrbizz668plrrvfyvqh5ymvjy5p4-profile/lib/libgsl.so.25
#3  0x00000000004806a4 in CalcSigma (func_name=func_name@entry=82 'R', eval=eval@entry=0x5233c0,
    D_l=D_l@entry=0x59e5d0, X=X@entry=0x540930, OmegaU=OmegaU@entry=0x53e440,
    OmegaE=OmegaE@entry=0x53e4a0, UltVeh=0x5abd30, Qi=0x5abe50, Sigma_uu=0x540fb0, Sigma_ee=0x5ac7e0)
    at src/mvlmm.cpp:542
#4  0x0000000000480f62 in MphEM (func_name=func_name@entry=82 'R', max_iter=1000, max_prec=0.001,
    eval=eval@entry=0x5233c0, X=X@entry=0x540930, Y=Y@entry=0x5408d0, U_hat=U_hat@entry=0x53e380,
    E_hat=0x53e3e0, OmegaU=0x53e440, OmegaE=0x53e4a0, UltVehiY=0x53e500, UltVehiBX=0x53e560,
    UltVehiU=0x53e5c0, UltVehiE=0x53e620, V_g=0x540990, V_e=0x540a20, B=0x540ab0) at src/mvlmm.cpp:704
#5  0x000000000048f924 in MVLMM::AnalyzePlink (this=this@entry=0x7fffffffcf20, U=U@entry=0x523380,
    eval=eval@entry=0x5233c0, UtW=UtW@entry=0x5233f0, UtY=UtY@entry=0x523430) at src/mvlmm.cpp:3803
#6  0x0000000000435c2e in GEMMA::BatchRun (this=this@entry=0x7fffffffe4a0, cPar=...)
    at src/gemma.cpp:2830

#0  0x00007ffff5e16f00 in __pthread_mutex_unlock_usercnt ()
   from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libpthread.so.0
#1  0x00007ffff622e6fb in cblas_dgemm ()
   from /gnu/store/pmmbmrbizz668plrrvfyvqh5ymvjy5p4-profile/lib/libopenblas.so.0
#2  0x00007ffff7d6af44 in gsl_blas_dgemm ()
   from /gnu/store/pmmbmrbizz668plrrvfyvqh5ymvjy5p4-profile/lib/libgsl.so.25
#3  0x000000000048067b in CalcSigma (func_name=func_name@entry=82 'R', eval=eval@entry=0x5233c0,
    D_l=D_l@entry=0x5ac970, X=X@entry=0x540930, OmegaU=OmegaU@entry=0x53e440,
    OmegaE=OmegaE@entry=0x53e4a0, UltVeh=0x5abdb0, Qi=0x59eb10, Sigma_uu=0x5ac3e0, Sigma_ee=0x5ab290)
    at src/mvlmm.cpp:541

#0  0x00007ffff5e16f19 in __pthread_mutex_unlock_usercnt ()
   from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libpthread.so.0
#1  0x00007ffff643d667 in blas_memory_alloc ()
   from /gnu/store/pmmbmrbizz668plrrvfyvqh5ymvjy5p4-profile/lib/libopenblas.so.0
#2  0x00007ffff622e648 in cblas_dgemm ()
   from /gnu/store/pmmbmrbizz668plrrvfyvqh5ymvjy5p4-profile/lib/libopenblas.so.0
#3  0x00007ffff7d6af44 in gsl_blas_dgemm ()
   from /gnu/store/pmmbmrbizz668plrrvfyvqh5ymvjy5p4-profile/lib/libgsl.so.25
#4  0x0000000000480652 in CalcSigma (func_name=func_name@entry=82 'R', eval=eval@entry=0x5233c0,
    D_l=D_l@entry=0x5ab1e0, X=X@entry=0x540930, OmegaU=OmegaU@entry=0x53e440,
    OmegaE=OmegaE@entry=0x53e4a0, UltVeh=0x5a0990, Qi=0x59e570, Sigma_uu=0x59e2e0, Sigma_ee=0x5abe30)
    at src/mvlmm.cpp:539

#0  0x000000000047f8e7 in gsl_matrix_get (j=389, i=0, m=0x53e500)
    at /gnu/store/pmmbmrbizz668plrrvfyvqh5ymvjy5p4-profile/include/gsl/gsl_matrix_double.h:286
#1  CalcXHiY (eval=eval@entry=0x5233c0, D_l=D_l@entry=0x59f980, X=X@entry=0x540930,
    UltVehiY=UltVehiY@entry=0x53e500, xHiy=xHiy@entry=0x5aaf40) at src/mvlmm.cpp:349
#2  0x0000000000481033 in MphEM (func_name=func_name@entry=82 'R', max_iter=1000, max_prec=0.001,
    eval=eval@entry=0x5233c0, X=X@entry=0x540930, Y=Y@entry=0x5408d0, U_hat=U_hat@entry=0x53e380,
    E_hat=0x53e3e0, OmegaU=0x53e440, OmegaE=0x53e4a0, UltVehiY=0x53e500, UltVehiBX=0x53e560,
    UltVehiU=0x53e5c0, UltVehiE=0x53e620, V_g=0x540990, V_e=0x540a20, B=0x540ab0) at src/mvlmm.cpp:658
#3  0x000000000048f9da in MVLMM::AnalyzePlink (this=this@entry=0x7fffffffcf20, U=U@entry=0x523380,
    eval=eval@entry=0x5233c0, UtW=UtW@entry=0x5233f0, UtY=UtY@entry=0x523430) at src/mvlmm.cpp:3805
#4  0x0000000000435c2e in GEMMA::BatchRun (this=this@entry=0x7fffffffe4a0, cPar=...)
    at src/gemma.cpp:2830

#0  0x00007ffff62ffd0b in dgemm_nn ()
   from /gnu/store/pmmbmrbizz668plrrvfyvqh5ymvjy5p4-profile/lib/libopenblas.so.0
#1  0x00007ffff622e6f3 in cblas_dgemm ()
   from /gnu/store/pmmbmrbizz668plrrvfyvqh5ymvjy5p4-profile/lib/libopenblas.so.0
#2  0x00007ffff7d6af44 in gsl_blas_dgemm ()
   from /gnu/store/pmmbmrbizz668plrrvfyvqh5ymvjy5p4-profile/lib/libgsl.so.25
#3  0x000000000048067b in CalcSigma (func_name=func_name@entry=82 'R', eval=eval@entry=0x5233c0,
    D_l=D_l@entry=0x5a13f0, X=X@entry=0x540930, OmegaU=OmegaU@entry=0x53e440,
    OmegaE=OmegaE@entry=0x53e4a0, UltVeh=0x5a4410, Qi=0x5a8320, Sigma_uu=0x59fec0, Sigma_ee=0x5acca0)
    at src/mvlmm.cpp:541
#4  0x0000000000480f62 in MphEM (func_name=func_name@entry=82 'R', max_iter=1000, max_prec=0.001,
    eval=eval@entry=0x5233c0, X=X@entry=0x540930, Y=Y@entry=0x5408d0, U_hat=U_hat@entry=0x53e380,
    E_hat=0x53e3e0, OmegaU=0x53e440, OmegaE=0x53e4a0, UltVehiY=0x53e500, UltVehiBX=0x53e560,
    UltVehiU=0x53e5c0, UltVehiE=0x53e620, V_g=0x540990, V_e=0x540a20, B=0x540ab0) at src/mvlmm.cpp:704
#5  0x000000000048fa1d in MVLMM::AnalyzePlink (this=this@entry=0x7fffffffcf20, U=U@entry=0x523380,
    eval=eval@entry=0x5233c0, UtW=UtW@entry=0x5233f0, UtY=UtY@entry=0x523430) at src/mvlmm.cpp:3806
```

Shows time is spent mostly in one place.

A profiler should help pinpoint where time is spent. The issue submitter admits that this is a contrived edge-case so I am not going to work on it until I start optimizing mvlmm. Just a quick check with gperftools (formerly the Google profiler) which is packaged in GNU Guix:

Profiling above gemma dataset

```
     120   7.3%   7.3%      155   9.4% CalcQi
      96   5.8%  13.1%       96   5.8% dgemm_kernel_ZEN
      88   5.3%  18.4%       88   5.3% __sched_yield
      81   4.9%  23.3%      109   6.6% blas_memory_free
      77   4.7%  28.0%       77   4.7% __pthread_mutex_unlock_usercnt
      71   4.3%  32.3%       94   5.7% CalcXHiY
      69   4.2%  36.5%       87   5.3% dgemm_nn
      66   4.0%  40.5%       66   4.0% __pthread_mutex_lock
      63   3.8%  44.3%       63   3.8% __ieee754_log_fma
      59   3.6%  47.9%      179  10.9% blas_memory_alloc
      58   3.5%  51.4%       58   3.5% gsl_vector_get
      57   3.5%  54.9%       88   5.3% dgemm_nt
      56   3.4%  58.3%       80   4.9% CalcOmega
      56   3.4%  61.7%      421  25.5% cblas_dgemm
      54   3.3%  64.9%       54   3.3% dgemm_beta_ZEN
      51   3.1%  68.0%      537  32.6% CalcSigma
      51   3.1%  71.1%       76   4.6% dsyr_thread_L
      43   2.6%  73.7%       43   2.6% gsl_matrix_get
      41   2.5%  76.2%       41   2.5% gsl_matrix_set
      37   2.2%  78.5%      130   7.9% MphCalcLogL
```

CalcSigma and CalcQi are worth looking into. Also cblas_dgemm may need some attention.

For now I am dropping this until I pick up the mvlmm speedups.