# CD is slow
The pages are slow and some are broken.
# Tags
* type: bug
* status: open
* priority: high
* assigned: pjotrp
* interested: pjotrp, bonfacem
* keywords: deployment, server
# Measuring
bonfacekilz:
I'm currently instrumenting the requests to see what hogs up the time. Loading the landing page takes 32 seconds!
Something's off. From outside the container:
```
bonfacem@tux02 ~ $ guix shell python-wrapper python-requests -- python time.py
Status: 200
Time taken: 32.989222288131714 seconds
```
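For reference, a minimal sketch of what a timing script like time.py could look like (the actual script isn't shown here), runnable in the same guix shell:
```
# Hypothetical sketch of time.py: time a single request to the CD landing page.
import time
import requests

url = "https://cd.genenetwork.org"  # assumption: the CD landing page

start = time.time()
response = requests.get(url, timeout=60)
elapsed = time.time() - start

print(f"Status: {response.status_code}")
print(f"Time taken: {elapsed} seconds")
```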
From inside the container:
```
2025-07-18 14:46:36 INFO:gn2.wqflask:Landing page rendered in 8.12 seconds
```
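That 8.12 seconds comes from GN2's own render-time logging. As a rough illustration only (not GN2's actual code), this is how such a log line can be produced in a Flask view:
```
# Illustrative sketch: logging how long a Flask view takes to render.
import logging
import time

from flask import Flask, render_template_string

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("gn2.wqflask")  # logger name taken from the log line above

app = Flask(__name__)

@app.route("/")
def landing_page():
    start = time.time()
    page = render_template_string("<h1>GeneNetwork</h1>")
    logger.info("Landing page rendered in %.2f seconds", time.time() - start)
    return page
```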
And I see:
## CD
```
> curl -w @- -o /dev/null -s https://cd.genenetwork.org <<EOF
\n
DNS lookup: %{time_namelookup}s\n
Connect time: %{time_connect}s\n
TLS handshake: %{time_appconnect}s\n
Pre-transfer: %{time_pretransfer}s\n
Start transfer: %{time_starttransfer}s\n
Total time: %{time_total}s\n
EOF
DNS lookup: 8.117543s
Connect time: 8.117757s
TLS handshake: 8.197767s
Pre-transfer: 8.197861s
Start transfer: 33.096467s
Total time: 33.096601s
```
## Production
```
> curl -w @- -o /dev/null -s https://genenetwork.org <<EOF
\n
DNS lookup: %{time_namelookup}s\n
Connect time: %{time_connect}s\n
TLS handshake: %{time_appconnect}s\n
Pre-transfer: %{time_pretransfer}s\n
Start transfer: %{time_starttransfer}s\n
Total time: %{time_total}s\n
EOF
DNS lookup: 8.075794s
Connect time: 8.076402s
TLS handshake: 8.147322s
Pre-transfer: 8.147370s
Start transfer: 8.797107s
Total time: 8.797299s
```
## On tux02 (outside CD container)
```
> curl -w @- -o /dev/null -s http://localhost:9092 <<EOF
\n
DNS lookup: %{time_namelookup}s\n
Connect time: %{time_connect}s\n
TLS handshake: %{time_appconnect}s\n
Pre-transfer: %{time_pretransfer}s\n
Start transfer: %{time_starttransfer}s\n
Total time: %{time_total}s\n
EOF
DNS lookup: 0.000068s
Connect time: 0.000543s
TLS handshake: 0.000000s
Pre-transfer: 0.000606s
Start transfer: 24.851069s
Total time: 24.851166s
```
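The same per-phase breakdown can be scripted against all three endpoints; a sketch, assuming python-pycurl is available:
```
# Sketch: reproduce curl's -w timing breakdown for the endpoints measured above.
import io
import pycurl

def timings(url):
    buf = io.BytesIO()
    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.WRITEDATA, buf)
    c.perform()
    phases = [
        ("DNS lookup", pycurl.NAMELOOKUP_TIME),
        ("Connect time", pycurl.CONNECT_TIME),
        ("TLS handshake", pycurl.APPCONNECT_TIME),
        ("Pre-transfer", pycurl.PRETRANSFER_TIME),
        ("Start transfer", pycurl.STARTTRANSFER_TIME),
        ("Total time", pycurl.TOTAL_TIME),
    ]
    print(url)
    for label, info in phases:
        print(f"  {label}: {c.getinfo(info):.6f}s")
    c.close()

for url in ("https://cd.genenetwork.org",
            "https://genenetwork.org",
            "http://localhost:9092"):
    timings(url)
```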
This does not look like an nginx problem (at least not on tux02 itself), and the nginx configuration was not really changed; ditto for the mysql configuration.
I can still test both, but it looks like the problem is inside the system container.
The container logs are at
```
root@tux02:/export2/guix-containers/genenetwork-development/var/log/cd# tail -100 genenetwork2.log
```
Some interesting errors there that need resolving, such as
```
tail gn-guile.log
2025-07-20 04:49:49 X.509 certificate of 'sparql.genenetwork.org' could not be verified:
2025-07-20 04:49:49 signer-not-found invalid
```
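To confirm whether the certificate chain of 'sparql.genenetwork.org' really fails verification from inside the container, here is a quick check; a sketch, assuming Python is available there:
```
# Sketch: verify the TLS certificate chain of sparql.genenetwork.org.
import socket
import ssl

host = "sparql.genenetwork.org"
ctx = ssl.create_default_context()
try:
    with socket.create_connection((host, 443), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            issuer = dict(item[0] for item in tls.getpeercert()["issuer"])
            print("Verified OK; issuer:", issuer)
except ssl.SSLCertVerificationError as err:
    print("Verification failed:", err)
```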
Let's try to get inside the container with nsenter:
```
ps xau|grep genenetwork-development-container
root 115940 0.0 0.0 163692 26296 ? Ssl Jul18 0:00 /gnu/store/ylwk2vn18dkzkj0nxq2h4vjzhz17bm7c-guile-3.0.9/bin/guile --no-auto-compile /usr/local/bin/genenetwork-development-container
pgrep -P 115940
115961
```
Use this PID and a recent nsenter:
```
/gnu/store/w7a3frdmffpw3hvxpvvxwxgzfhyqdm6n-profile/bin/nsenter -m -p -t 115961 /run/current-system/profile/bin/bash -login
```
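The two lookups above can be wrapped in a small helper; a sketch using the nsenter and bash paths from this host (they may differ on another deployment):
```
# Sketch: find the container's child PID and enter its mount and PID namespaces.
import os
import subprocess

NSENTER = "/gnu/store/w7a3frdmffpw3hvxpvvxwxgzfhyqdm6n-profile/bin/nsenter"
SHELL = "/run/current-system/profile/bin/bash"

# PID of the guile process supervising the container
parent = subprocess.check_output(
    ["pgrep", "-f", "genenetwork-development-container"], text=True).split()[0]
# Its child runs inside the container's namespaces
child = subprocess.check_output(["pgrep", "-P", parent], text=True).split()[0]

os.execvp(NSENTER, [NSENTER, "-m", "-p", "-t", child, SHELL, "-login"])
```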
System tools are in '/run/current-system/profile/bin/'.
We can pick up curl from the store (I note the system container has full access to the /gnu/store on the host):
```
root@tux02 /# /gnu/store/vdaspmq10c3zmqhp38lfqy812w6r4xg3-curl-8.6.0/bin/curl -w @- -o /dev/null -s http://localhost:9092 <<EOF
\n
DNS lookup: %{time_namelookup}s\n
Connect time: %{time_connect}s\n
TLS handshake: %{time_appconnect}s\n
Pre-transfer: %{time_pretransfer}s\n
Start transfer: %{time_starttransfer}s\n
Total time: %{time_total}s\n
EOF
DNS lookup: 0.000064s
Connect time: 0.000478s
TLS handshake: 0.000000s
Pre-transfer: 0.000551s
Start transfer: 24.792926s
Total time: 24.793015s
```
That rules out container and nginx streaming issues.
So the problem is with GN and its DBs. The gn-machines repo is used from /home/aruni and its checkout dates from March. Has CD been slow since then? I don't think so. The changes to the actual scripts are even older, and the guix-bioinformatics repo shows no changes either. The remaining culprits I suspect are:
* [*] MySQL
* [ ] Interaction of gn-auth with gn2
* [ ] Interaction of gnqa with gn2
Running a standard test on mysql shows it is fine:
```
time mysql -u webqtlout -pwebqtlout db_webqtl < $rundir/../shared/sql/test02.sql
Name FullName Name Symbol CAST(ProbeSet."description" AS BINARY) CAST(ProbeSet."Probe_Target_Description" AS BINARY) Chr Mb Mean LRS Locus pValue additive geno_chr geno_mb
HC_M2_0606_P Hippocampus Consortium M430v2 (Jun06) PDNN 1457545_at 9530036O11Rik long non-coding RNA, expressed sequence tag (EST) AK035474 with high bladder expression antisense EST 14 Kb upstream of Shh 5 28.480441 6.7419292929293 15.2845189682605 rsm10000001525 0.055 0.0434848484848485 3 9.671673
HC_M2_0606_P Hippocampus Consortium M430v2 (Jun06) PDNN 1427571_at Shh sonic hedgehog (hedgehog) last exon 5 28.457886 6.50113131313131 9.58158655605723 rs8253327 0.697 0.0494097096188748 1 191.908118
HC_M2_0606_P Hippocampus Consortium M430v2 (Jun06) PDNN 1436869_at Shh sonic hedgehog (hedgehog) mid distal 3' UTR 5 28.457155 9.279090909090911 12.7711275309832 rs8253327 0.306 -0.214087568058076 1 191.908118
real 0m0.010s
user 0m0.004s
sys 0m0.000s
```
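A follow-up check could be to time the connection and a query from Python, the way gn2 talks to the database; a sketch using pymysql with the credentials from the test above (the query here is just a placeholder, not the contents of test02.sql):
```
# Sketch: time MySQL connection setup and a simple query with the webqtlout account.
import time
import pymysql

start = time.time()
conn = pymysql.connect(user="webqtlout", password="webqtlout",
                       host="localhost", database="db_webqtl")
print(f"Connect: {time.time() - start:.3f}s")

start = time.time()
with conn.cursor() as cur:
    cur.execute("SELECT COUNT(*) FROM ProbeSet")  # placeholder query, not test02.sql
    print(cur.fetchone())
print(f"Query: {time.time() - start:.3f}s")
conn.close()
```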