| author | Munyoki Kilyungi | 2026-02-09 15:19:56 +0300 |
|---|---|---|
| committer | Munyoki Kilyungi | 2026-02-09 15:20:57 +0300 |
| commit | 3f5b36124bc2ade33aaf205f5b7b179334fd39c0 (patch) | |
| tree | 5264f86d1b949a1f1545b4a9329e205c807e1c21 /issues | |
| parent | 838c4f5595abce433f534839a262d6634610b910 (diff) | |
| download | gn-ai-3f5b36124bc2ade33aaf205f5b7b179334fd39c0.tar.gz | |
Update issues.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
Diffstat (limited to 'issues')
| -rw-r--r-- | issues/rdf/expose-lmdb-view-in-rdf.gmi | 7 |
| -rw-r--r-- | issues/rdf/rdf-refinement.gmi | 40 |
2 files changed, 45 insertions, 2 deletions
diff --git a/issues/rdf/expose-lmdb-view-in-rdf.gmi b/issues/rdf/expose-lmdb-view-in-rdf.gmi
index d58f7a7c..12b5d868 100644
--- a/issues/rdf/expose-lmdb-view-in-rdf.gmi
+++ b/issues/rdf/expose-lmdb-view-in-rdf.gmi
@@ -57,12 +57,17 @@ ORDER BY
 > Comments: PR in genenetwork3
 => https://github.com/genenetwork/genenetwork3/pull/240
-* [-] Deploy functionality to tux02
+* [X] Deploy functionality to tux02
 Prototype:
 => https://github.com/Alexanderlacuna/gn_lmdb_rdf_interface/tree/master
+* [X] (bonfacem) Modify endpoint to use json extension:
+```
+curl http://127.0.0.1:8091/dataset/bxd-publish/values/10002.json
+```
+* [X] (alexm) Remove null values from data end-point.
 * [ ] (bonfacem, pjotrp, alexm) How to work with case-attributes metadata.
 * [ ] (bonfacem) Add above link to RDF.
diff --git a/issues/rdf/rdf-refinement.gmi b/issues/rdf/rdf-refinement.gmi
index 9199b656..a841c011 100644
--- a/issues/rdf/rdf-refinement.gmi
+++ b/issues/rdf/rdf-refinement.gmi
@@ -275,7 +275,7 @@ Genotypes and markers are different but related. Different Species can have dif
 * [X] Link geno-files to the correct data (ref gn2 code on how this is done)
 => https://files.genenetwork.org/current/ Genotype files. The dir reps the InfoPages.AccesionId.
 * [X] Create global namespace for geno-files.
-* [ ] probesets
+* [X] probesets
 All probesets should have a name:
 ```
@@ -283,8 +283,45 @@ SELECT * FROM ProbeSet
 WHERE ProbeSet.Name IS NULL
 OR TRIM(ProbeSet.Name) = ''\G
+```
+Number of probesets we have:
+```
+MariaDB [db_webqtl]> select count(*) from ProbeSet;
++----------+
+| count(*) |
++----------+
+|  6436251 |
++----------+
+```
+Number of experiment that use probesets:
+
+```
+MariaDB [db_webqtl]> select count(*) from ProbeSetXRef;
++----------+
+| count(*) |
++----------+
+| 49131499 |
++----------+
+```
+
+We can get away with tx'ing ProbeSet in one go. However, file size gets too big and rapper complains about it. Instead, figure out a way to tx ProbeSetXRef in chunks. Note: total transform times averages at about ~21 mins. With probesets/probesetxref, that will balloon upto >1hr. Not worried about optimising things now. That can be worked out for later.
+* [ ] ProbeSetXRef
+* [ ] (w/ Johannesm/pjotrp/rob) What columns to put into RDF. We have 72 rows ATM:
+
+```
+MariaDB [db_webqtl]> SELECT COUNT(*) AS column_count
+    -> FROM INFORMATION_SCHEMA.COLUMNS
+    -> WHERE TABLE_SCHEMA = DATABASE()
+    -> AND TABLE_NAME = 'ProbeSet';
++--------------+
+| column_count |
++--------------+
+|           72 |
++--------------+
+1 row in set (0.00 sec)
 ```
+* [ ] (w/ Alex) ProbeSetData
 * [X] RIF
 * [-] (cancelled) Gene Symbols
@@ -295,6 +332,7 @@ WHERE ProbeSet.Name IS NULL
 ## Post Mark-up
 * [ ] ! Generate a list of data older than 2020 and ping Rob/Pjotr.
+* [ ] (w/Alex) Revisit data privacy in the LMDB view.
 * [-] (Cancelled) Re-visit how we store all HTML metadata. Clean this up.
 * [ ] Sync mariadb tux01 with tux02; have rdf.genenetwork.org be the latest.
 * [ ] Make sure that the rdf.genenetwork.org named graph is available on public end-point (mention to Fred about the nuance of moving to a new graph without breaking CD/Prod from old code that used the old genenetwork.org graph).
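
The issue text in this commit leaves open how to transform ProbeSetXRef in chunks. As a rough illustration only, and not the project's actual transform code, the sketch below splits a row count into (offset, limit) windows that could each feed one smaller output file. The function name `chunk_ranges` and the chunk size of 1,000,000 are assumptions made for the example; only the row count 49131499 comes from the query output in the diff.

```python
from typing import Iterator, Tuple


def chunk_ranges(total_rows: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, limit) pairs that together cover total_rows rows.

    Hypothetical helper: each pair could drive one
    `... LIMIT <limit> OFFSET <offset>` query, or one output file.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, total_rows, chunk_size):
        # The final window is shortened so it never runs past the last row.
        yield offset, min(chunk_size, total_rows - offset)


if __name__ == "__main__":
    # Row count taken from the ProbeSetXRef query in the diff;
    # the chunk size is an arbitrary assumption.
    windows = list(chunk_ranges(49131499, 1_000_000))
    print(len(windows), windows[0], windows[-1])
```

With large offsets, `LIMIT ... OFFSET` queries tend to slow down because the server still scans the skipped rows, so paging on an indexed key range may be a better fit. Which key to use depends on the table's indexes, which this commit does not describe.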
