# Implementing parallel correlation with Rust

### Notes

In an attempt to speed up the current GN2 correlation,
we are reimplementing it in Rust to support
parallel computation.


## Tags

* assigned:alem,Fred
* type: upgrade
* priority: medium
* status: closed, completed
* keywords: correlation,rust,parallel


## Tasks

* [X] implementation of basic Pearson and Spearman correlation in Rust

* [X] add unit tests and benchmarks

* [X] loading datasets; format

* [X] package the library as a Guix package

* [x] GN2 integration

* [x] parsing input datasets

* [x] benchmark all datasets: Rust vs GN2 version

* [ ] figure out how to tell Cargo to use the declared dependencies rather than downloading them

* [*] migrate db queries to GN3, or use the partial correlation queries

* [x] add parallel computation

* [x] code optimization and minor fixes

* [x] LMDB parser

* [x] implement biweight midcorrelation in Rust

* [ ] fix lmdb-rust packaging

* [ ] production deployment
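
For reference, the first task (basic Pearson and Spearman correlation) can be sketched with the standard library alone. This is a minimal illustration, not the code from the correlation_rust repository: the function names pearson, ranks and spearman are assumptions, and Spearman is computed here as Pearson on average ranks.

```rust
/// Pearson correlation coefficient of two equal-length samples.
/// Returns None when the lengths differ, there are fewer than two
/// points, or either sample has zero variance.
pub fn pearson(x: &[f64], y: &[f64]) -> Option<f64> {
    if x.len() != y.len() || x.len() < 2 {
        return None;
    }
    let n = x.len() as f64;
    let mean_x = x.iter().sum::<f64>() / n;
    let mean_y = y.iter().sum::<f64>() / n;

    let (mut cov, mut var_x, mut var_y) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (a, b) in x.iter().zip(y) {
        let (dx, dy) = (a - mean_x, b - mean_y);
        cov += dx * dy;
        var_x += dx * dx;
        var_y += dy * dy;
    }
    if var_x == 0.0 || var_y == 0.0 {
        return None;
    }
    Some(cov / (var_x.sqrt() * var_y.sqrt()))
}

/// 1-based average ranks: tied values share the mean of their positions.
fn ranks(values: &[f64]) -> Vec<f64> {
    let mut order: Vec<usize> = (0..values.len()).collect();
    order.sort_by(|&i, &j| values[i].total_cmp(&values[j]));

    let mut out = vec![0.0; values.len()];
    let mut start = 0;
    while start < order.len() {
        // Extend the run while the next sorted value ties with this one.
        let mut end = start;
        while end + 1 < order.len() && values[order[end + 1]] == values[order[start]] {
            end += 1;
        }
        // 0-based positions start..=end correspond to ranks start+1..=end+1.
        let avg = (start + end) as f64 / 2.0 + 1.0;
        for &idx in &order[start..=end] {
            out[idx] = avg;
        }
        start = end + 1;
    }
    out
}

/// Spearman rank correlation: Pearson correlation of the average ranks.
pub fn spearman(x: &[f64], y: &[f64]) -> Option<f64> {
    if x.len() != y.len() {
        return None;
    }
    pearson(&ranks(x), &ranks(y))
}
```

Returning Option keeps undefined cases (constant input, mismatched lengths) explicit instead of silently producing NaN.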






## Notes

You can call the library as an external process from any language, or use it
directly as a dependency in a Rust (Cargo) project.
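
As a rough sketch of the external-process route, a caller written in Rust could wrap the binary with std::process::Command. The binary name and the single file-path argument used here are hypothetical placeholders; check the README linked below for the real command line.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Builds (without running) the command for a correlation binary.
/// ASSUMPTION: the binary takes one positional argument, the path to
/// an input file. The real CLI may differ.
pub fn correlation_command(binary: &str, input: &Path) -> Command {
    let mut cmd = Command::new(binary);
    cmd.arg(input);
    cmd
}

/// Runs the command and returns whatever it wrote to stdout.
pub fn run_correlation(binary: &str, input: &Path) -> io::Result<String> {
    let output = correlation_command(binary, input).output()?;
    if !output.status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("{binary} exited with {}", output.status),
        ));
    }
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

The same pattern applies from Python or any other language with a subprocess API: write the inputs to a file, spawn the binary, then read its output.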



The code can be found here:

=> https://github.com/Alexanderlacuna/correlation_rust.git


Documentation for correlation_rust has been added:

=> https://github.com/Alexanderlacuna/correlation_rust/blob/main/README.md


PR for integrating into GeneNetwork:

=> https://github.com/genenetwork/genenetwork3/pull/93



## 8 July 2022

Worked on input data parsing and enhancements.

PR for parsing datasets:

=> https://github.com/genenetwork/genenetwork3/pull/96


## 17 Aug 2022

Integration of correlation_rust is ongoing, working with @fred.
This also includes optimizing slow queries and code refactoring.


## 5 Sept 2022

Use text files similar to GN1; see this example:

=> https://github.com/genenetwork/genenetwork1/blob/d2381dd5bb35cb906032f7d530e9d14984e1de21/web/webqtl/correlation/CorrelationPage.py#L271

LMDB is also something to consider.


## 20 Sept 2022

Integrated the codebase to store database data in text files for probeset types:
=> https://github.com/genenetwork/genenetwork2/pull/736
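
To illustrate what reading such a text file might involve, here is a minimal row parser. The layout is an assumption made for this sketch (one trait per line, trait name first, then comma-separated sample values, with anything non-numeric such as NA treated as missing); the files written by the PR above may use a different layout. TraitRow and parse_row are hypothetical names.

```rust
/// One parsed row: a trait name plus one value per sample.
/// None marks a missing or non-numeric value.
#[derive(Debug, PartialEq)]
pub struct TraitRow {
    pub name: String,
    pub values: Vec<Option<f64>>,
}

/// Parses a line such as "1415670_at,8.1,NA,7.5".
/// Returns None for blank lines or lines without a trait name.
pub fn parse_row(line: &str) -> Option<TraitRow> {
    let mut fields = line.trim().split(',');
    let name = fields.next()?.trim();
    if name.is_empty() {
        return None;
    }
    let values = fields
        // Rust parses "NaN" and "inf" as floats, so keep only finite numbers.
        .map(|f| f.trim().parse::<f64>().ok().filter(|v| v.is_finite()))
        .collect();
    Some(TraitRow {
        name: name.to_string(),
        values,
    })
}
```

Keeping missing values as None lets the correlation step drop incomplete sample pairs explicitly rather than letting NaN propagate through the sums.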



## 11 July 2023

* completed the parallel computation implementation in Rust

* implemented both LMDB and CSV parsers in Rust

=> https://github.com/Alexanderlacuna/correlation_rust/pull/3/files#diff-87f0275ed65ae8ffe36df25091c9b975defbc920da7b5ec3c2dadb7d57ad4d67
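
The general idea behind the parallel computation can be sketched with scoped threads from the standard library: split the traits into chunks, correlate each chunk against the target on its own thread, and concatenate the results in order. This is an illustration only and assumes nothing about how the linked PR does it (it may well use a different mechanism or crate); correlate_all and its parameters are hypothetical names.

```rust
use std::thread;

/// Pearson correlation; yields NaN when undefined (length mismatch,
/// fewer than two points, or zero variance).
fn pearson(x: &[f64], y: &[f64]) -> f64 {
    if x.len() != y.len() || x.len() < 2 {
        return f64::NAN;
    }
    let n = x.len() as f64;
    let mean_x = x.iter().sum::<f64>() / n;
    let mean_y = y.iter().sum::<f64>() / n;
    let (mut cov, mut var_x, mut var_y) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (a, b) in x.iter().zip(y) {
        let (dx, dy) = (a - mean_x, b - mean_y);
        cov += dx * dy;
        var_x += dx * dx;
        var_y += dy * dy;
    }
    cov / (var_x * var_y).sqrt()
}

/// Correlates `target` against every trait using up to `workers` threads.
/// Results come back in the same order as `traits`.
pub fn correlate_all(target: &[f64], traits: &[Vec<f64>], workers: usize) -> Vec<f64> {
    if traits.is_empty() {
        return Vec::new();
    }
    // Ceiling division so every trait lands in exactly one chunk.
    let workers = workers.max(1);
    let chunk_size = (traits.len() + workers - 1) / workers;

    thread::scope(|scope| {
        // Spawn all workers first, then join them in spawn order.
        let handles: Vec<_> = traits
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|t| pearson(target, t))
                        .collect::<Vec<f64>>()
                })
            })
            .collect();

        handles
            .into_iter()
            .flat_map(|h| h.join().expect("correlation worker panicked"))
            .collect()
    })
}
```

Because each trait is correlated independently of the others, the work splits cleanly with no shared mutable state, which is why this step parallelizes well.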