From 4590637f76d1355fe7ab8b161bbf3588ae5a6e76 Mon Sep 17 00:00:00 2001
From: Munyoki Kilyungi
Date: Tue, 12 Dec 2023 13:35:39 +0300
Subject: Add notes on fetching GeneList Metadata.

GeneList metadata makes the probeset RDF transformation more nuanced and
complicated.

Signed-off-by: Munyoki Kilyungi
---
 .../handling-resource-links-in-probeset-page.gmi | 28 ++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/issues/handling-resource-links-in-probeset-page.gmi b/issues/handling-resource-links-in-probeset-page.gmi
index 5071a85..d2b4514 100644
--- a/issues/handling-resource-links-in-probeset-page.gmi
+++ b/issues/handling-resource-links-in-probeset-page.gmi
@@ -28,3 +28,31 @@ gn:probeset1435395_s_at gnt:hasGeneManiaResource
 https://github.com/genenetwork/genenetwork2/blob/371cbaeb1b05a062d7f75083aa4ff7209e4e06b3/wqflask/wqflask/show_trait/show_trait.py#L398
 Fetching GeneList for a given trait
+
+The GeneList table lacks unique GeneSymbols and GeneIds, as illustrated in the following examples:
+
+```
+SELECT * FROM GeneList WHERE SpeciesId = 1 AND GeneSymbol = "Sp3" AND GeneId = 20687 AND Chromosome = "2"\G
+```
+
+Duplicate entry examples:
+
+```
+SELECT * FROM GeneList WHERE GeneSymbol = "AB102723" AND
+GeneId=3070 AND SpeciesId = 4 \G
+
+SELECT * FROM GeneList WHERE SpeciesId = 1 AND GeneSymbol = "Sp3" AND GeneId = 20687 AND Chromosome = "2"\G
+```
+
+Identifying duplicates:
+
+```
+SELECT GeneSymbol, GeneId, SpeciesId, COUNT(CONCAT(GeneSymbol, "_", GeneId, "_", SpeciesId)) AS `count` FROM GeneList GROUP BY BINARY GeneSymbol, GeneId, chromosome, txStart, txEnd HAVING COUNT(CONCAT(GeneSymbol, "_", GeneId, "_", SpeciesId)) > 1;
+```
-- 
cgit v1.2.3