* GEMMA performance stats

The measurements below were taken on a 4-core Intel(R) Core(TM)
i7-6770HQ CPU @ 2.60GHz with hyperthreading and 16 GB RAM, with
warmed-up memory buffers.

Between 0.96 and 0.97 a speed regression was [[https://github.com/genetics-statistics/GEMMA/issues/136][reported]], which is why
performance is now tracked here. The comparison is interesting because
0.96 is a single-core Eigenlib version, whereas 0.97 went multi-core
with OpenBLAS. Unfortunately, I also linked in LAPACK and an older
BLAS, which slowed things down. In 0.98 OpenBLAS is used for most
operations and is faster.
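
A rough way to see the single-core versus multi-core difference in the
timings recorded further down is to divide the user time by the real
(wall-clock) time: a ratio near 1 suggests one busy core, a larger
ratio suggests work spread over several cores. The following is a
minimal sketch, not part of the GEMMA test suite. The helper names
to_seconds and cpu_ratio are made up for illustration, and the inputs
are the kinship timings listed below for 0.96 and 0.97.

```shell
# Hypothetical helpers, for illustration only.

# to_seconds: convert a value printed by `time`, e.g. 0m21.389s, to seconds.
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# cpu_ratio REAL USER: user CPU time divided by wall-clock time.
cpu_ratio() {
  awk -v r="$(to_seconds "$1")" -v u="$(to_seconds "$2")" \
    'BEGIN { printf "%.2f\n", u / r }'
}

cpu_ratio 0m16.347s 0m16.204s   # GEMMA 0.96 kinship -> 0.99
cpu_ratio 0m21.389s 0m34.980s   # GEMMA 0.97 kinship -> 1.64
```

The ratio ignores system time and I/O waits, so it is only a coarse
indicator of how many cores were kept busy.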

Note also that the recent statically linked builds are slower than the
dynamically linked ones. The cause is not yet known.

The test commands are:

#+BEGIN_SRC
# kinship
time ./bin/gemma -g ./example/mouse_hs1940.geno.txt.gz -p ./example/mouse_hs1940.pheno.txt -a ./example/mouse_hs1940.anno.txt -gk -no-check
# univariate LMM
time ./bin/gemma -g ./example/mouse_hs1940.geno.txt.gz -p ./example/mouse_hs1940.pheno.txt -n 1 -a ./example/mouse_hs1940.anno.txt -k ./output/result.cXX.txt -lmm -no-check
# multivariate LMM
time ./bin/gemma -g ./example/mouse_hs1940.geno.txt.gz -p ./example/mouse_hs1940.pheno.txt -n 1 2 -a ./example/mouse_hs1940.anno.txt -k ./output/result.cXX.txt -lmm -no-check
#+END_SRC
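
One way to obtain warmed-up buffers is to run each command once
untimed before the timed run. The sketch below illustrates that idea
under stated assumptions: bench is a hypothetical helper (it is not
part of GEMMA), and it assumes GNU date, which supports the %N
nanoseconds format.

```shell
# bench LABEL CMD...: hypothetical helper, not part of GEMMA.
# Runs CMD once untimed to warm the file cache, then times a second run
# and prints "LABEL<TAB>seconds". Assumes GNU date (%N = nanoseconds).
bench() {
  label=$1
  shift
  "$@" >/dev/null 2>&1 || return 1   # warm-up run, output discarded
  start=$(date +%s.%N)
  "$@" >/dev/null 2>&1 || return 1   # timed run
  end=$(date +%s.%N)
  awk -v l="$label" -v s="$start" -v e="$end" \
    'BEGIN { printf "%s\t%.3f\n", l, e - s }'
}

# Example with the kinship command from above (needs a built ./bin/gemma):
# bench kinship ./bin/gemma -g ./example/mouse_hs1940.geno.txt.gz \
#   -p ./example/mouse_hs1940.pheno.txt -a ./example/mouse_hs1940.anno.txt \
#   -gk -no-check
```

Unlike the shell's time keyword, this reports wall-clock time only, so
it complements rather than replaces the real/user/sys figures recorded
in this file.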

Currently, on my laptop, there is no substantial difference between
gcc and clang builds when running these tests:

#+BEGIN_SRC
Clang:

real    0m25.758s
user    0m46.380s
sys     0m0.852s

real    0m22.173s
user    0m29.420s
sys     0m1.540s

GNU C:

real    0m24.540s
user    0m43.948s
sys     0m1.276s

real    0m22.504s
user    0m29.768s
sys     0m1.544s
#+END_SRC
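
To put a number on the compiler comparison, the relative difference in
wall-clock time can be computed from the real values above. This is a
small sketch with a made-up helper name (rel_diff); the inputs are the
Clang and GNU C timings exactly as recorded.

```shell
# rel_diff A B: hypothetical helper; percentage by which A differs from B.
rel_diff() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%+.1f%%\n", (a - b) / b * 100 }'
}

rel_diff 25.758 24.540   # first pair of runs, Clang vs GNU C  -> +5.0%
rel_diff 22.173 22.504   # second pair of runs, Clang vs GNU C -> -1.5%
```

The sign flips between the two pairs, and only single runs are
recorded here, so these figures do not show a consistent advantage for
either compiler.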

Running the GNU profiler (gprof) on the multivariate LMM (MVLMM) test
produced the following flat profile:

#+BEGIN_SRC
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 22.73      0.90     0.90    41121     0.00     0.00  CalcQi(gsl_vector const*, gsl_vector const*, gsl_matrix const*, gsl_matrix*)
 13.64      1.44     0.54    30313     0.00     0.00  CalcXHiY(gsl_vector const*, gsl_vector const*, gsl_matrix const*, gsl_matrix const*, gsl_vector*)
 11.87      1.91     0.47    19536     0.00     0.00  CalcSigma(char, gsl_vector const*, gsl_vector const*, gsl_matrix const*, gsl_matrix const*, gsl_matrix const*, gsl_matrix const*, gsl_matrix const*, gsl_matrix*, gsl_matrix*)
 10.86      2.34     0.43    38621     0.00     0.00  safeGetline(std::istream&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)
  8.33      2.67     0.33    10805     0.00     0.00  MphCalcP(gsl_vector const*, gsl_vector const*, gsl_matrix const*, gsl_matrix const*, gsl_matrix const*, gsl_matrix const*, gsl_matrix*, gsl_vector*, gsl_matrix*)
  6.06      2.91     0.24        1     0.24     0.43  ReadFile_geno
  5.30      3.12     0.21    19536     0.00     0.00  UpdateV(gsl_vector const*, gsl_matrix const*, gsl_matrix const*, gsl_matrix const*, gsl_matrix const*, gsl_matrix*, gsl_matrix*)
  5.30      3.33     0.21        1     0.21     3.27  MVLMM::AnalyzeBimbam(gsl_matrix const*, gsl_vector const*, gsl_matrix const*, gsl_matrix const*)
#+END_SRC

* GEMMA 0.98-beta1

#+BEGIN_SRC bash
        linux-vdso.so.1 (0x00007ffe475d2000)
        libgsl.so.23 => /home/wrk/opt/gemma-dev-env/lib/libgsl.so.23 (0x00007f95a21e3000)
        libopenblas.so.0 => /home/wrk/opt/gemma-dev-env/lib/libopenblas.so.0 (0x00007f959fc45000)
        libz.so.1 => /home/wrk/opt/gemma-dev-env/lib/libz.so.1 (0x00007f959fa2a000)
        libgfortran.so.3 => /home/wrk/opt/gemma-dev-env/lib/libgfortran.so.3 (0x00007f959f709000)
        libquadmath.so.0 => /home/wrk/opt/gemma-dev-env/lib/libquadmath.so.0 (0x00007f959f4c8000)
        libstdc++.so.6 => /home/wrk/opt/gemma-dev-env/lib/libstdc++.so.6 (0x00007f959f14d000)
        libm.so.6 => /home/wrk/opt/gemma-dev-env/lib/libm.so.6 (0x00007f959ee01000)
        libgcc_s.so.1 => /home/wrk/opt/gemma-dev-env/lib/libgcc_s.so.1 (0x00007f959ebea000)
        libpthread.so.0 => /home/wrk/opt/gemma-dev-env/lib/libpthread.so.0 (0x00007f959e9cc000)
        libc.so.6 => /home/wrk/opt/gemma-dev-env/lib/libc.so.6 (0x00007f959e61a000)
#+END_SRC

#+BEGIN_SRC bash

time ./bin/gemma -g ~/tmp/mouse_hs1940/mouse_hs1940.geno.txt.gz -p ~/tmp/mouse_hs1940/mouse_hs1940.pheno.txt -a ~/tmp/mouse_hs1940/mouse_hs1940.anno.txt -gk -no-check
GEMMA 0.98-beta1 (2018-09-06) by Xiang Zhou and team (C) 2012-2018
Reading Files ...
## number of total individuals = 1940
## number of analyzed individuals = 1410
## number of covariates = 1
## number of phenotypes = 1
## number of total SNPs/var        =    12226
## number of analyzed SNPs         =    10768
Calculating Relatedness Matrix ...
================================================== 100%

real    0m16.875s
user    0m25.180s
sys     0m1.740s
#+END_SRC

#+BEGIN_SRC bash
lario:~/izip/git/opensource/genenetwork/gemma$ time bin/gemma -g ~/tmp/mouse_hs1940/mouse_hs1940.geno.txt.gz -p ~/tmp/mouse_hs1940/mouse_hs1940.pheno.txt -n 1 -a ~/tmp/mouse_hs1940/mouse_hs1940.anno.txt -k ./output/result.cXX.txt -lmm -no-check
GEMMA 0.98-beta1 (2018-09-06) by Xiang Zhou and team (C) 2012-2018
Reading Files ...
## number of total individuals = 1940
## number of analyzed individuals = 1410
## number of covariates = 1
## number of phenotypes = 1
## number of total SNPs/var        =    12226
## number of analyzed SNPs         =    10768
Start Eigen-Decomposition...
pve estimate =0.608801
se(pve) =0.032774
================================================== 100%

real    0m13.255s
user    0m18.272s
sys     0m3.324s
#+END_SRC

* GEMMA 0.98-pre

#+BEGIN_SRC bash
/gnu/store/icz3hd36aqpjz5slyp4hhr8wsfbgiml1-bash-minimal-4.4.12/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_GB.UTF-8)
        linux-vdso.so.1 (0x00007ffe2abe1000)
        libgsl.so.23 => /home/wrk/opt/gemma-dev-env/lib/libgsl.so.23 (0x00007f685a9c0000)
        libopenblas.so.0 => /home/wrk/opt/gemma-dev-env/lib/libopenblas.so.0 (0x00007f6858422000)
        libz.so.1 => /home/wrk/opt/gemma-dev-env/lib/libz.so.1 (0x00007f6858207000)
        libgfortran.so.3 => /home/wrk/opt/gemma-dev-env/lib/libgfortran.so.3 (0x00007f6857ee6000)
        libquadmath.so.0 => /home/wrk/opt/gemma-dev-env/lib/libquadmath.so.0 (0x00007f6857ca5000)
        libstdc++.so.6 => /home/wrk/opt/gemma-dev-env/lib/libstdc++.so.6 (0x00007f685792a000)
        libm.so.6 => /home/wrk/opt/gemma-dev-env/lib/libm.so.6 (0x00007f68575de000)
        libgcc_s.so.1 => /home/wrk/opt/gemma-dev-env/lib/libgcc_s.so.1 (0x00007f68573c7000)
        libpthread.so.0 => /home/wrk/opt/gemma-dev-env/lib/libpthread.so.0 (0x00007f68571a9000)
        libc.so.6 => /home/wrk/opt/gemma-dev-env/lib/libc.so.6 (0x00007f6856df7000)
        /gnu/store/n6acaivs0jwiwpidjr551dhdni5kgpcr-glibc-2.26.105-g0890d5379c/lib/ld-linux-x86-64.so.2 => /gnu/store/gf30mz7cfx4fyj4cckgxfxwlsc3c7a8r-glibc-2.26.105-g0890d5379c/lib/ld-linux-x86-64.so.2 (0x000055ae91968000)
#+END_SRC

#+BEGIN_SRC bash
lario:~/izip/git/opensource/genenetwork/gemma$ time ./bin/gemma -g ~/tmp/mouse_hs1940/mouse_hs1940.geno.txt.gz -p ~/tmp/mouse_hs1940/mouse_hs1940.pheno.txt -a ~/tmp/mouse_hs1940/mouse_hs1940.anno.txt -gk
GEMMA 0.98-pre1 (2018/02/10) by Xiang Zhou and team (C) 2012-2018
Reading Files ...
## number of total individuals = 1940
## number of analyzed individuals = 1410
## number of covariates = 1
## number of phenotypes = 1
## number of total SNPs/var        =    12226
## number of analyzed SNPs         =    10768
Calculating Relatedness Matrix ...
================================================== 100%

real    0m15.995s
user    0m31.884s
sys     0m4.680s
#+END_SRC

#+BEGIN_SRC bash
lario:~/izip/git/opensource/genenetwork/gemma$ time bin/gemma -g ~/tmp/mouse_hs1940/mouse_hs1940.geno.txt.gz -p ~/tmp/mouse_hs1940/mouse_hs1940.pheno.txt -n 1 -a ~/tmp/mouse_hs1940/mouse_hs1940.anno.txt -k ./output/result.cXX.txt -lmm
GEMMA 0.98-pre1 (2018/02/10) by Xiang Zhou and team (C) 2012-2018
Reading Files ...
## number of total individuals = 1940
## number of analyzed individuals = 1410
## number of covariates = 1
## number of phenotypes = 1
## number of total SNPs/var        =    12226
## number of analyzed SNPs         =    10768
Start Eigen-Decomposition...
pve estimate =0.608801
se(pve) =0.032774
================================================== 100%

real    0m13.440s
user    0m20.528s
sys     0m4.324s
#+END_SRC

* GEMMA 0.97

#+BEGIN_SRC bash
lario:~/tmp/gemma-release-0.97$ ldd gemma-gn2-0.97-c760aa0-xqhsidq7h5/bin/gemma
        linux-vdso.so.1 (0x00007ffc237a8000)
        libgsl.so.23 => /home/wrk/tmp/gemma-release-0.97/gsl-2.4-as8vm64028/lib/libgsl.so.23 (0x00007f8b415f5000)
        libopenblas.so.0 => /home/wrk/tmp/gemma-release-0.97/openblas-0.2.19-f7j1vq0ncc/lib/libopenblas.so.0 (0x00007f8b3fbc3000)
        libz.so.1 => /home/wrk/tmp/gemma-release-0.97/zlib-1.2.11-sfx1wh27i6/lib/libz.so.1 (0x00007f8b3f9a8000)
        libgfortran.so.3 => /home/wrk/tmp/gemma-release-0.97/gfortran-5.4.0-lib-15plffwjdv/lib/libgfortran.so.3 (0x00007f8b3f687000)
        libquadmath.so.0 => /home/wrk/tmp/gemma-release-0.97/gcc-5.4.0-lib-3x53yv4v14/lib/libquadmath.so.0 (0x00007f8b3f448000)
        liblapack.so.3 => /home/wrk/tmp/gemma-release-0.97/lapack-3.7.1-nyd19c9ccy/lib/liblapack.so.3 (0x00007f8b3eb83000)
        libstdc++.so.6 => /home/wrk/tmp/gemma-release-0.97/gcc-5.4.0-lib-3x53yv4v14/lib/libstdc++.so.6 (0x00007f8b3e809000)
        libm.so.6 => /home/wrk/tmp/gemma-release-0.97/glibc-2.25-n6nvxlk2j8/lib/libm.so.6 (0x00007f8b3e4f7000)
        libgcc_s.so.1 => /home/wrk/tmp/gemma-release-0.97/gcc-5.4.0-lib-3x53yv4v14/lib/libgcc_s.so.1 (0x00007f8b3e2e0000)
        libpthread.so.0 => /home/wrk/tmp/gemma-release-0.97/glibc-2.25-n6nvxlk2j8/lib/libpthread.so.0 (0x00007f8b3e0c2000)
        libc.so.6 => /home/wrk/tmp/gemma-release-0.97/glibc-2.25-n6nvxlk2j8/lib/libc.so.6 (0x00007f8b3dd23000)
        libblas.so.3 => /home/wrk/tmp/gemma-release-0.97/lapack-3.7.1-nyd19c9ccy/lib/libblas.so.3 (0x00007f8b3dacb000)
        /home/wrk/tmp/gemma-release-0.97/glibc-2.25-n6nvxlk2j8/lib/ld-linux-x86-64.so.2 (0x00007f8b41a5c000)
#+END_SRC

#+BEGIN_SRC bash
lario:~/tmp/gemma-release-0.97$ time ./gemma-gn2-0.97-c760aa0-xqhsidq7h5/bin/gemma -g ~/tmp/mouse_hs1940/mouse_hs1940.geno.txt.gz -p ~/tmp/mouse_hs1940/mouse_hs1940.pheno.txt -a ~/tmp/mouse_hs1940/mouse_hs1940.anno.txt -gk
GEMMA 0.97 (2017/12/27) by Xiang Zhou and team (C) 2012-2017
Reading Files ...
## number of total individuals = 1940
## number of analyzed individuals = 1410
## number of covariates = 1
## number of phenotypes = 1
## number of total SNPs/var        =    12226
## number of analyzed SNPs         =    10768
Calculating Relatedness Matrix ...
================================================== 100%

real    0m21.389s
user    0m34.980s
sys     0m4.560s
#+END_SRC

#+BEGIN_SRC bash
lario:~/tmp/gemma-release-0.97$ time ./gemma-gn2-0.97-c760aa0-xqhsidq7h5/bin/gemma -g ~/tmp/mouse_hs1940/mouse_hs1940.geno.txt.gz -p ~/tmp/mouse_hs1940/mouse_hs1940.pheno.txt -n 1 -a ~/tmp/mouse_hs1940/mouse_hs1940.anno.txt -k ./output/result.cXX.txt -lmm
GEMMA 0.97 (2017/12/27) by Xiang Zhou and team (C) 2012-2017
Reading Files ...
## number of total individuals = 1940
## number of analyzed individuals = 1410
## number of covariates = 1
## number of phenotypes = 1
## number of total SNPs/var        =    12226
## number of analyzed SNPs         =    10768
Start Eigen-Decomposition...
pve estimate =0.608801
se(pve) =0.032774
================================================== 100%

real    0m13.296s
user    0m18.332s
sys     0m5.020s
#+END_SRC

* GEMMA 0.96

#+BEGIN_SRC bash
lario:~/tmp/gemma-release-0.96$ ldd gemma.linux
        linux-vdso.so.1 (0x00007ffd9ee8f000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fc2a94a1000)
        libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00007fc2a9183000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fc2a8e01000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fc2a8afd000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fc2a88e6000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fc2a86c9000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc2a832b000)
        libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007fc2a80ec000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fc2a96bb000)
#+END_SRC

#+BEGIN_SRC bash
lario:~/tmp/gemma-release-0.96$ time ./gemma.linux -g ~/tmp/mouse_hs1940/mouse_hs1940.geno.txt.gz -p ~/tmp/mouse_hs1940/mouse_hs1940.pheno.txt -a ~/tmp/mouse_hs1940/mouse_hs1940.anno.txt -gk
Reading Files ...
## number of total individuals = 1940
## number of analyzed individuals = 1410
## number of covariates = 1
## number of phenotypes = 1
## number of total SNPs = 12226
## number of analyzed SNPs = 10768
Calculating Relatedness Matrix ...
Reading SNPs  ==================================================100.00%

real    0m16.347s
user    0m16.204s
sys     0m0.116s
#+END_SRC


#+BEGIN_SRC bash
lario:~/tmp/gemma-release-0.96$ time ./gemma.linux -g ~/tmp/mouse_hs1940/mouse_hs1940.geno.txt.gz -p ~/tmp/mouse_hs1940/mouse_hs1940.pheno.txt -n 1 -a ~/tmp/mouse_hs1940/mouse_hs1940.anno.txt -k ./output/result.cXX.txt -lmm
Reading Files ...
## number of total individuals = 1940
## number of analyzed individuals = 1410
## number of covariates = 1
## number of phenotypes = 1
## number of total SNPs = 12226
## number of analyzed SNPs = 10768
Start Eigen-Decomposition...
pve estimate =0.608801
se(pve) =0.032774
Reading SNPs  ==================================================100.00%

real    0m20.377s
user    0m20.240s
sys     0m0.132s
#+END_SRC