# Investigate and fix "qtl2::calc_genoprob()" failing with "negative length vectors are not allowed"
## Tags
* assigned: Flisso
* type: bug
* status: in progress
* interested: alexm
* keywords: cross, qtl2, calc_genoprob, bugs
## Description
Running qtl2 on a subset of the genotype and founder CSV files to generate founder-aware smoothed genotypes.
The script crashes with the following error message:
```sh
calc_genoprob failing with negative length vectors are not allowed
```
For reference, see the "qtl2_hmm_pipeline.R" script:
=> https://github.com/fetche-lab/HS-rats-2026/blob/main/genotypes/new_processing/tests/chr1/codes/qtl2_hmm_pipeline.R
Key findings from the run and the error:
* Map and IDs were consistent:
* - 50,000 markers
* - no duplicate marker IDs
* - monotonically increasing cM positions
* Genotype dimensions:
* - HS genotypes: 1499 x 50000
* - Founder genotypes: 8 x 50000
* The error is consistent with an integer overflow in the object length:
* The original workflow tried to allocate a genotype-probability object of roughly 1499 * 50000 * 36 = 2,698,200,000 elements, which exceeds the 32-bit signed integer maximum (2,147,483,647, R's `.Machine$integer.max`) and triggers "negative length vectors are not allowed".
* The workaround was to chunk the files into 5000 lines each; the remaining problem still lies in the calc_genoprob() runtime.
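
The findings above can be reproduced with a few lines of base R. This is a minimal sketch, not part of "qtl2_hmm_pipeline.R": the helper names `check_map()` and `genoprob_cells()` are hypothetical, and the reading of 36 as 8 homozygous plus 28 heterozygous founder-pair states is an assumption based on the 8 founders listed above.

```r
# Hypothetical helper: TRUE when marker IDs are unique and cM positions
# are strictly increasing (the map checks listed above).
check_map <- function(marker_ids, cm) {
  anyDuplicated(marker_ids) == 0L && !is.unsorted(cm, strictly = TRUE)
}

# Hypothetical helper: number of cells in an
# individuals x genotypes x markers probability array.
# as.numeric() keeps the product in double precision, so the count
# itself cannot overflow the way 32-bit integer arithmetic does.
genoprob_cells <- function(n_ind, n_geno, n_markers) {
  as.numeric(n_ind) * as.numeric(n_geno) * as.numeric(n_markers)
}

n_ind     <- 1499   # HS individuals (from the dimensions above)
n_geno    <- 36     # assumed: 8 homozygous + 28 heterozygous states
n_markers <- 50000  # markers in the test subset

full  <- genoprob_cells(n_ind, n_geno, n_markers)  # whole marker set
chunk <- genoprob_cells(n_ind, n_geno, 5000)       # one 5000-marker chunk

full  > .Machine$integer.max   # does the full array exceed the limit?
chunk > .Machine$integer.max   # does a single chunk exceed the limit?

# Largest marker count whose array still fits under the integer limit
max_markers <- floor(.Machine$integer.max / (n_ind * n_geno))
```

With these numbers a 5000-marker chunk holds 269,820,000 cells, well under the limit, so the chunk size leaves headroom. It does not by itself say anything about the calc_genoprob() runtime.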
## Tasks
* [x] Diagnose the error: "calc_genoprob failing with negative length vectors are not allowed"
* [ ] Re-run the script per specified chunks
* [ ] Evaluate the smoothed output for validity and interpretability
* [ ] Use the proximal/distal founder-aware markers to extract SNPs from the original genotype file
* [ ] Or extend the script with a function that performs this extraction
* [ ] Test the results with GEMMA and R/qtl2 mapping
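
For the per-chunk re-run, one possible way to derive the chunk boundaries is sketched below in base R. The helper name `chunk_indices()` is hypothetical and does not come from the pipeline script. It assumes that chunks are consecutive blocks of markers in map order, and it only produces the index sets. How each chunk is then passed to calc_genoprob() is left to the script.

```r
# Hypothetical helper: split marker indices 1..n_markers into consecutive
# chunks of at most chunk_size markers, returned as a list of index vectors.
chunk_indices <- function(n_markers, chunk_size = 5000L) {
  idx <- seq_len(n_markers)
  # ceiling(idx / chunk_size) labels each index with its chunk number
  unname(split(idx, ceiling(idx / chunk_size)))
}

chunks <- chunk_indices(50000L, 5000L)

length(chunks)        # number of chunks
range(chunks[[1]])    # first and last marker index of the first chunk
range(chunks[[10]])   # first and last marker index of the last chunk
```

Because each chunk is processed on its own, markers near a chunk boundary see less flanking information than they would in a whole-chromosome run, which is worth checking when evaluating the smoothed output.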