From 3d6fec1bc267cfa0a64dfc57318372ee6133f034 Mon Sep 17 00:00:00 2001
From: Munyoki Kilyungi
Date: Thu, 14 Dec 2023 01:33:41 +0300
Subject: Add basic performance analysis for ProbeSet RDF dump.

Signed-off-by: Munyoki Kilyungi
---
 issues/handling-resource-links-in-probeset-page.gmi | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/issues/handling-resource-links-in-probeset-page.gmi b/issues/handling-resource-links-in-probeset-page.gmi
index bcd50ed..2f18c19 100644
--- a/issues/handling-resource-links-in-probeset-page.gmi
+++ b/issues/handling-resource-links-in-probeset-page.gmi
@@ -55,3 +55,18 @@ Identifying duplicates:
 ```
 SELECT GeneSymbol, GeneId, SpeciesId, COUNT(CONCAT(GeneSymbol, "_", GeneId, "_", SpeciesId)) AS `count` FROM GeneList GROUP BY BINARY GeneSymbol, GeneId, chromosome, txStart, txEnd HAVING COUNT(CONCAT(GeneSymbol, "_", GeneId, "_", SpeciesId)) > 1;
 ```
+
+Transforming ProbeSet metadata takes long. The exact command:
+
+```shell
+time guix shell guile-dbi \
+guile-hashing -m manifest.scm -- ./pre-inst-env ./examples/probeset.scm --settings conn.scm --output /export/data/genenetwork-virtuoso/probeset-metadata.ttl --documentation ./docs/probeset-metadata.md
+```
+
+The aforementioned command takes:
+
+* real: 89m1.715s
+* user: 175m47.684s
+* sys: 6m15.076s
+
+Optimisations---perhaps using guile-fibers---can be considered later.
-- 
cgit v1.2.3
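
A rough way to read the timings recorded in this patch is to compare CPU time (user + sys) with wall-clock time (real). The sketch below is only an illustration: the helper names `to_seconds` and `cpu_ratio` are hypothetical and not part of the repository, and it assumes durations in the `NmS.SSSs` form printed by the shell's `time`.

```shell
# Hypothetical helper: convert a `time`-style duration such as
# "89m1.715s" into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Hypothetical helper: ratio of CPU time (user + sys) to wall-clock
# time (real). Arguments: real, user, sys.
cpu_ratio() {
    real=$(to_seconds "$1")
    user=$(to_seconds "$2")
    sys=$(to_seconds "$3")
    awk -v r="$real" -v u="$user" -v s="$sys" \
        'BEGIN { printf "%.2f\n", (u + s) / r }'
}

# Figures taken from the patch above.
cpu_ratio 89m1.715s 175m47.684s 6m15.076s
```

With these figures the ratio comes out at about 2.04, which suggests that roughly two cores' worth of CPU were busy on average during the run. That alone does not say where the time goes, so profiling would likely be needed before picking an optimisation.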