Diffstat (limited to 'test/performance')
| -rw-r--r-- | test/performance/releases.org | 12 |
1 files changed, 7 insertions, 5 deletions
diff --git a/test/performance/releases.org b/test/performance/releases.org
index c973607..af0cbb7 100644
--- a/test/performance/releases.org
+++ b/test/performance/releases.org
@@ -29,16 +29,18 @@ sys 0m0.901s
 The output looks the same. Good. So far the first difference is a much later openblas 0.3.30 (over 0.3.9). In the source code we added checkpoints and more debugging, particularly write statements. I disabled the latter, but still no dice.
-When compiled with the profile library prefix the gemma run with
+When compiled with the profiler library prefix the gemma run with
 #+begin_src sh
+premake5 gmake2 && make verbose=1 config=debug -j 8 gemma && time CPUPROFILE=gemma.prof LD_LIBRARY_PATH=$GUIX_ENVIRONMENT/lib ./build/bin/Debug/gemma -g ./example/mouse_hs1940.geno.txt.gz -p ./example/mouse_hs1940.pheno.txt -n 1 -a ./example/mouse_hs1940.anno.txt -k ./output/result.cXX.txt -lmm -no-check -debug
 CPUPROFILE=gemma.prof pprof --text build/bin/Debug/gemma gemma.prof
-    1024  50.7%  50.7%     1024  50.7% dcopy_k_ZEN
-      99   4.9%  55.6%       99   4.9% openblas_read_env
-      67   3.3%  58.9%      107   5.3% ____strtod_l_internal
-      67   3.3%  62.3%       67   3.3% gsl_vector_div
+    1007  49.2%  49.2%     1015  49.6% dot_compute
+      94   4.6%  53.8%       94   4.6% rpcc
+      74   3.6%  57.5%       74   3.6% gsl_vector_div
+      62   3.0%  60.5%       92   4.5% ____strtod_l_internal
+      42   2.1%  62.5%       42   2.1% dgemm_kernel_ZEN
 #+end_src sh
 this led me to try the newer openblas on the older gemma - and indeed, the regression is coming from the openblas version. Even though it says 'OpenBLAS 0.3.30 DYNAMIC_ARCH NO_AFFINITY Zen MAX_THREADS=128' I suspect the dynamic arch is not really optimizing.
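The two `pprof --text` listings in this hunk are easier to compare side by side per symbol. Below is a minimal sketch of that comparison in Python, not part of the commit. It assumes the usual six-column text layout (flat samples, flat %, cumulative-sum %, cumulative samples, cumulative %, symbol). The names `parse_profile`, `compare`, `REMOVED` and `ADDED` are made up for illustration, and the sample data is copied from the removed and added lines of the diff.

```python
import re

# One row of `pprof --text` output (assumed layout):
# flat  flat%  sum%  cum  cum%  symbol
ROW = re.compile(
    r"^\s*(\d+)\s+[\d.]+%\s+[\d.]+%\s+(\d+)\s+[\d.]+%\s+(\S.*)$"
)


def parse_profile(text):
    """Return {symbol: flat sample count} for every recognisable row."""
    flat = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            flat[m.group(3).strip()] = int(m.group(1))
    return flat


def compare(removed_text, added_text):
    """Return (symbol, removed_flat, added_flat), largest change first.

    A count of 0 only means the symbol is absent from that listing;
    it may still exist below the rows that were pasted into the notes.
    """
    removed = parse_profile(removed_text)
    added = parse_profile(added_text)
    rows = [
        (sym, removed.get(sym, 0), added.get(sym, 0))
        for sym in set(removed) | set(added)
    ]
    return sorted(rows, key=lambda r: (-abs(r[2] - r[1]), r[0]))


# Listing deleted by the commit ("-" lines).
REMOVED = """
    1024  50.7%  50.7%     1024  50.7% dcopy_k_ZEN
      99   4.9%  55.6%       99   4.9% openblas_read_env
      67   3.3%  58.9%      107   5.3% ____strtod_l_internal
      67   3.3%  62.3%       67   3.3% gsl_vector_div
"""

# Listing introduced by the commit ("+" lines).
ADDED = """
    1007  49.2%  49.2%     1015  49.6% dot_compute
      94   4.6%  53.8%       94   4.6% rpcc
      74   3.6%  57.5%       74   3.6% gsl_vector_div
      62   3.0%  60.5%       92   4.5% ____strtod_l_internal
      42   2.1%  62.5%       42   2.1% dgemm_kernel_ZEN
"""

if __name__ == "__main__":
    for sym, before, after in compare(REMOVED, ADDED):
        print(f"{sym:24s} {before:6d} -> {after:6d}")
```

Sorting by absolute change puts the symbols that appear in only one listing (`dcopy_k_ZEN`, `dot_compute`) at the top, which is where the two profiles differ most. The symbols shared by both listings (`gsl_vector_div`, `____strtod_l_internal`) barely move.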
