diff options
author | Munyoki Kilyungi | 2023-12-14 15:21:44 +0300 |
---|---|---|
committer | Munyoki Kilyungi | 2023-12-14 15:22:27 +0300 |
commit | 0474c1eab3e20037cb50e901fb21ef4482914900 (patch) | |
tree | d3d8a6243bd9555ebb4b0d1ca1aa61a4b06db362 | |
parent | 34668330eed37069394015bc54ef57c1b18f5d8a (diff) | |
download | gn-gemtext-0474c1eab3e20037cb50e901fb21ef4482914900.tar.gz |
Move genelist info to the genelist issue and close it.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
-rw-r--r-- | issues/handling-resource-links-in-probeset-page.gmi | 24 | ||||
-rw-r--r-- | issues/transform-genelist-to-rdf.gmi | 29 |
2 files changed, 28 insertions, 25 deletions
diff --git a/issues/handling-resource-links-in-probeset-page.gmi b/issues/handling-resource-links-in-probeset-page.gmi index 4d0c052..de214be 100644 --- a/issues/handling-resource-links-in-probeset-page.gmi +++ b/issues/handling-resource-links-in-probeset-page.gmi @@ -29,32 +29,8 @@ gn:probeset1435395_s_at gnt:hasGeneManiaResource <https://genemania.org/search/m The straightforward approach would be to construct this structure in the front-end. However, the problem lies in the fact that these resource links are inferred, making it challenging to discern their connection within GN without visiting the website. Therefore, it's preferable to store this information in RDF despite the ease of constructing it in the front-end. -### GeneList Metadata -Consider GN's approach for fetching GeneList entries for a specific trait. -=> https://github.com/genenetwork/genenetwork2/blob/371cbaeb1b05a062d7f75083aa4ff7209e4e06b3/wqflask/wqflask/show_trait/show_trait.py#L398 Fetching GeneList for a given trait - -The GeneList table lacks unique GeneSymbols and GeneIds, as illustrated in the following examples: - -``` -SELECT * FROM GeneList WHERE SpeciesId = 1 AND GeneSymbol = "Sp3" AND GeneId = 20687 AND Chromosome = "2"\G -``` - -Duplicate entry examples: - -``` -SELECT * FROM GeneList WHERE GeneSymbol = "AB102723" AND -GeneId=3070 AND SpeciesId = 4 \G - -SELECT * FROM GeneList WHERE SpeciesId = 1 AND GeneSymbol = "Sp3" AND GeneId = 20687 AND Chromosome = "2"\G -``` - -Identifying duplicates: - -``` -SELECT GeneSymbol, GeneId, SpeciesId, COUNT(CONCAT(GeneSymbol, "_", GeneId, "_", SpeciesId)) AS `count` FROM GeneList GROUP BY BINARY GeneSymbol, GeneId, chromosome, txStart, txEnd HAVING COUNT(CONCAT(GeneSymbol, "_", GeneId, "_", SpeciesId)) > 1; -``` Transforming ProbeSet metadata takes long. The exact command: diff --git a/issues/transform-genelist-to-rdf.gmi b/issues/transform-genelist-to-rdf.gmi index bfa213d..3c20b5e 100644 --- a/issues/transform-genelist-to-rdf.gmi +++ b/issues/transform-genelist-to-rdf.gmi @@ -18,7 +18,34 @@ Example: => https://genenetwork.org/show_trait?trait_id=1460303_at&dataset=HC_M2_0606_P Trait Data and Analysis for 1460303_at -When dumping data, it's unclear how resources (GeneMANIA, STRING, PANTHER, etc.) become links---they are manually constructed in GN's source code. This transformation is crucial when converting data to RDF. +When transforming data, it's unclear how resources (GeneMANIA, STRING, PANTHER, etc.) become links---they are manually constructed in GN's source code. This transformation is crucial when converting data to RDF. + +## GeneList Metadata + +Consider GN's approach for fetching GeneList entries for a specific trait. + +=> https://github.com/genenetwork/genenetwork2/blob/371cbaeb1b05a062d7f75083aa4ff7209e4e06b3/wqflask/wqflask/show_trait/show_trait.py#L398 Fetching GeneList for a given trait + +The GeneList table lacks unique GeneSymbols and GeneIds, as illustrated in the following examples: + +``` +SELECT * FROM GeneList WHERE SpeciesId = 1 AND GeneSymbol = "Sp3" AND GeneId = 20687 AND Chromosome = "2"\G +``` + +Duplicate entry examples: + +``` +SELECT * FROM GeneList WHERE GeneSymbol = "AB102723" AND +GeneId=3070 AND SpeciesId = 4 \G + +SELECT * FROM GeneList WHERE SpeciesId = 1 AND GeneSymbol = "Sp3" AND GeneId = 20687 AND Chromosome = "2"\G +``` + +Identifying duplicates: + +``` +SELECT GeneSymbol, GeneId, SpeciesId, COUNT(CONCAT(GeneSymbol, "_", GeneId, "_", SpeciesId)) AS `count` FROM GeneList GROUP BY BINARY GeneSymbol, GeneId, chromosome, txStart, txEnd HAVING COUNT(CONCAT(GeneSymbol, "_", GeneId, "_", SpeciesId)) > 1; +``` ## Resolution |