author | Munyoki Kilyungi | 2023-12-15 21:38:24 +0300 |
---|---|---|
committer | Munyoki Kilyungi | 2023-12-15 21:45:03 +0300 |
commit | 1ea6e2dd7655788e198dc13695c829287132498f (patch) | |
tree | 3ef484f9bd0010a40e2781826f11e9d673e887b6 /examples/genelist.scm | |
parent | 4a62e17816928e271ba982038ac36fcaf72783d2 (diff) | |
download | gn-transform-databases-1ea6e2dd7655788e198dc13695c829287132498f.tar.gz |
Preserve gene symbol case when used as an identifier.
By default, `string->identifier` capitalizes the first letter of the
string it is given. For gene symbols whose casing varies (e.g., Shh,
SHH), this creates inconsistencies in gene symbols, so the same entity
ends up with different predicates and objects, which introduces errors.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
Diffstat (limited to 'examples/genelist.scm')
-rwxr-xr-x | examples/genelist.scm | 10 |
1 files changed, 6 insertions, 4 deletions
diff --git a/examples/genelist.scm b/examples/genelist.scm
index fbd39c1..b19b30f 100755
--- a/examples/genelist.scm
+++ b/examples/genelist.scm
@@ -78,10 +78,12 @@
    (gnt:hasTargetSeq rdfs:domain gnc:Probeset))
   (triples
       (string->identifier
-       "gene" (regexp-substitute/global #f "[^A-Za-z0-9:]"
-                                         (string-trim-both
-                                          (field GeneList GeneSymbol))
-                                         'pre "_" 'post))
+       "gene" (regexp-substitute/global
+               #f "[^A-Za-z0-9:]"
+               (string-trim-both
+                (field GeneList GeneSymbol))
+               'pre "_" 'post)
+       #:proc (lambda (x) x))
     (set rdf:type 'gnc:GeneSymbol)
     (set rdfs:label (field GeneList GeneSymbol))
     (set dct:description (sanitize-rdf-string (field GeneList GeneDescription)))
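
The effect of passing an identity function as `#:proc` can be illustrated with a small Guile sketch. The real definition of `string->identifier` is not part of this diff, so `sketch-identifier`, `capitalize-first`, and `sanitize-symbol` below are hypothetical stand-ins, and the output format (prefix, underscore, symbol) is an assumption made only for illustration. The one thing taken from the commit is that the default transformation capitalizes the first letter and that `#:proc` overrides it.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical default transformation: upcase only the first character.
(define (capitalize-first str)
  (if (string-null? str)
      str
      (string-append (string (char-upcase (string-ref str 0)))
                     (substring str 1))))

;; Hypothetical stand-in for string->identifier. The real procedure may
;; build its result differently; only the #:proc hook mirrors the diff.
(define* (sketch-identifier prefix str #:key (proc capitalize-first))
  (string-append prefix "_" (proc str)))

;; Same sanitizing step as in the diff: trim whitespace, then replace
;; every character outside [A-Za-z0-9:] with an underscore.
(define (sanitize-symbol str)
  (regexp-substitute/global #f "[^A-Za-z0-9:]"
                            (string-trim-both str)
                            'pre "_" 'post))

;; Default: the first letter is capitalized, so the stored case is lost.
(display (sketch-identifier "gene" (sanitize-symbol " shh ")))
(newline)                               ; prints gene_Shh

;; Identity #:proc: the symbol is used exactly as it was stored.
(display (sketch-identifier "gene" (sanitize-symbol " shh ")
                            #:proc (lambda (x) x)))
(newline)                               ; prints gene_shh
```

With the identity procedure, the identifier always matches the casing in the `GeneSymbol` field, which is the behaviour this commit aims for.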