author Munyoki Kilyungi 2023-12-12 13:35:39 +0300
committer Munyoki Kilyungi 2023-12-12 13:37:29 +0300
commit 4590637f76d1355fe7ab8b161bbf3588ae5a6e76 (patch)
tree 0a7ca558cc3cec81c27eed202576e2d21018a01a
parent b8f4efb22461fd0005eec0ae7147484eafb3380e (diff)
Add notes on fetching GeneList Metadata.
GeneList metadata makes the probeset RDF transformation more nuanced and complicated.

Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
-rw-r--r-- issues/handling-resource-links-in-probeset-page.gmi 28
1 files changed, 28 insertions, 0 deletions
diff --git a/issues/handling-resource-links-in-probeset-page.gmi b/issues/handling-resource-links-in-probeset-page.gmi
index 5071a85..d2b4514 100644
--- a/issues/handling-resource-links-in-probeset-page.gmi
+++ b/issues/handling-resource-links-in-probeset-page.gmi
@@ -28,3 +28,31 @@ gn:probeset1435395_s_at gnt:hasGeneManiaResource <https://genemania.org/search/m
```
The straightforward approach would be to construct this structure in the front-end. However, the problem lies in the fact that these resource links are inferred, making it challenging to discern their connection within GN without visiting the website. Therefore, it's preferable to store this information in RDF despite the ease of constructing it in the front-end.
+
+
+### GeneList Metadata
+
+Consider GN's approach for fetching GeneList entries for a specific trait.
+
+=> https://github.com/genenetwork/genenetwork2/blob/371cbaeb1b05a062d7f75083aa4ff7209e4e06b3/wqflask/wqflask/show_trait/show_trait.py#L398 Fetching GeneList for a given trait
+
+GeneSymbol and GeneId values are not unique in the GeneList table, as the following query illustrates:
+
+```
+SELECT * FROM GeneList WHERE SpeciesId = 1 AND GeneSymbol = "Sp3" AND GeneId = 20687 AND Chromosome = "2"\G
+```
+
+Duplicate entry examples:
+
+```
+SELECT * FROM GeneList WHERE GeneSymbol = "AB102723" AND
+GeneId=3070 AND SpeciesId = 4 \G
+
+SELECT * FROM GeneList WHERE SpeciesId = 1 AND GeneSymbol = "Sp3" AND GeneId = 20687 AND Chromosome = "2"\G
+```
+
+Identifying duplicates:
+
+```
+SELECT GeneSymbol, GeneId, SpeciesId, COUNT(CONCAT(GeneSymbol, "_", GeneId, "_", SpeciesId)) AS `count` FROM GeneList GROUP BY BINARY GeneSymbol, GeneId, chromosome, txStart, txEnd HAVING COUNT(CONCAT(GeneSymbol, "_", GeneId, "_", SpeciesId)) > 1;
+```
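
The duplicate check in the patch can be tried without access to the GN database. The following is a minimal sketch, not part of the commit: it builds a toy `GeneList` table in an in-memory SQLite database and runs a comparable `GROUP BY ... HAVING` query. The column types, the coordinate values, the `GeneX` row, and the helper names `build_demo_db` and `find_duplicates` are all assumptions made for illustration; only the column names and the `Sp3` and `AB102723` identifiers come from the queries above. Unlike the MySQL query, this version also groups by `SpeciesId` and counts with `COUNT(*)`. SQLite compares text case-sensitively by default, which plays the role of `BINARY` here.

```python
import sqlite3

# Hypothetical rows: coordinates are made up, and "GeneX" is a placeholder.
# Column order: SpeciesId, GeneSymbol, GeneId, Chromosome, txStart, txEnd
ROWS = [
    (1, "Sp3", 20687, "2", 72.9, 73.0),
    (1, "Sp3", 20687, "2", 72.9, 73.0),    # exact repeat of the previous row
    (1, "Sp3", 20687, "2", 72.8, 73.0),    # same gene, different txStart
    (4, "AB102723", 3070, "1", 10.0, 10.5),
    (4, "AB102723", 3070, "1", 10.0, 10.5),  # exact repeat
    (1, "GeneX", 99999, "5", 28.4, 28.5),  # appears only once
]


def build_demo_db():
    """Create an in-memory SQLite database with a toy GeneList table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE GeneList (
            SpeciesId  INTEGER,
            GeneSymbol TEXT,
            GeneId     INTEGER,
            Chromosome TEXT,
            txStart    REAL,
            txEnd      REAL
        )
        """
    )
    conn.executemany("INSERT INTO GeneList VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def find_duplicates(conn):
    """Return (GeneSymbol, GeneId, SpeciesId, n) for rows stored more than once."""
    query = """
        SELECT GeneSymbol, GeneId, SpeciesId, COUNT(*) AS n
        FROM GeneList
        GROUP BY GeneSymbol, GeneId, SpeciesId, Chromosome, txStart, txEnd
        HAVING COUNT(*) > 1
        ORDER BY GeneSymbol
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in find_duplicates(build_demo_db()):
        print(row)
```

Because the grouping includes `Chromosome`, `txStart`, and `txEnd`, the third `Sp3` row is not reported as a duplicate: it shares the symbol and ID but differs in its start coordinate. Dropping the coordinate columns from the `GROUP BY` would instead count every row that shares a symbol, ID, and species.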