path: root/dump.scm
Age | Commit message | Author
2023-04-05 | Add ability to have operations such as GROUP_CONCAT in SELECT clause | Munyoki Kilyungi
This change enables having: "... GROUP_CONCAT(GeneRIF_BASIC.PubMedId) AS alias ..." as part of the query.
* dump.scm (field->key, field->assoc-ref): Add new syntax-rule.
* dump/sql.scm (select-query): Ditto.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-03-08 | Dump GeneWiki metadata | Munyoki Kilyungi
* dump.scm (dump-generif): New data dump.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-03-06 | Dump relevant metadata about phenotypes | Munyoki Kilyungi
* dump.scm (dump-publishfreeze, dump-published-phenotypes): New dumps.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-03-06 | Delete phenotype and publish_xref metadata | Munyoki Kilyungi
* dump.scm (phenotype-id->id, dump-phenotype): Delete.
(dump-publish-xref): Delete.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-03-06 | Add mapping method and species info to inbredsets. | Munyoki Kilyungi
* dump.scm (dump-inbred-set): Add mapping method and species as extra metadata for inbredsets.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-02-15 | Remove dump-case-attributes | Munyoki Kilyungi
This information is already stored in LMDB.
* dump.scm (dump-case-attributes): Delete.
(main)(<dump-case-attributes>): Ditto.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2022-12-06 | Use InfoPageName as a dataset's name. | Munyoki Kilyungi
* dump.scm (dump-info-files): Set a dataset's name to InfoPageName.
2022-11-04 | Add comment with URI to GeneRIF data. | Arun Isaac
* dump.scm: Add comment with URI to GeneRIF data.
2022-11-04 | Unite importing GeneRIF with dumping SQL data. | Arun Isaac
* README.md: Document generif-data-file parameter in connection settings.
* dump.scm: Import (srfi srfi-171), (ice-9 regex) and (zlib).
(decode-html-entities, import-generif): New functions.
(main): Call import-generif.
* import-generif.scm: Delete file.
2022-10-30 | Move triple utilities to new module. | Arun Isaac
* dump.scm (string->identifier, string-blank?, triple, prefix): Move to ...
* dump/triples.scm: ... new file.
* dump.scm: Import (dump triples).
2022-10-30 | Move string-blank? to (dump utils). | Arun Isaac
* dump.scm (string-blank?): Move to ...
* dump/utils.scm (string-blank?): ... here.
2022-10-30 | Special case Yohan Bossé's last name. | Arun Isaac
* dump.scm (dump-investigators): Special case Yohan Bossé's last name.
2022-10-30 | Do not deduplicate the AvgMethod table. | Arun Isaac
The AvgMethod table no longer has duplicate "N/A" records.
* dump.scm (dump-avg-method): Do not deduplicate the AvgMethod table.
2022-08-20 | Add gn:traitId and gn:publicationId. | Munyoki Kilyungi
In GeneNetwork, a phenotype is currently identified by its ID (the primary key of its table in MariaDB). The only way to relate it to a publication is through a publication ID. This is important because some publications have a NULL value for "pubmed ID"; without the publication ID, some data is lost, as there is no way to point to a publication with a NULL "pubmed ID".
* dump.scm (dump-publish-xref): Define gn:traitId and gn:publicationId.
Signed-off-by: Arun Isaac <arunisaac@systemreboot.net>
2022-06-24 | Delete vertical tab character in publication abstracts. | Arun Isaac
* dump.scm (dump-publication): Delete vertical tab character in abstracts.
2022-06-23 | Dump groups. | BonfaceKilz
* dump.scm (dump-groups): New dump.
(main): Call dump-groups.
Signed-off-by: Arun Isaac <arunisaac@systemreboot.net>
2022-06-23 | Dump case-attributes. | BonfaceKilz
* dump.scm (dump-case-attributes): New dump.
(main): Call dump-case-attributes.
Signed-off-by: Arun Isaac <arunisaac@systemreboot.net>
2022-06-23 | Remove "." if it occurs at the end of a turtle identifier. | BonfaceKilz
A "." at the end of a turtle identifier (for example "gn:caseAttribute_ethn.") generates an error when trying to validate the generated RDF.
* dump.scm (string->identifier): Remove trailing "." if it occurs in the identifier.
Signed-off-by: Arun Isaac <arunisaac@systemreboot.net>
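The trailing-"." fix described in this commit can be illustrated with a minimal Guile sketch. The procedure name strip-trailing-dot is hypothetical and is not taken from dump.scm; the real string->identifier may well handle this differently.

```scheme
;; Hypothetical helper (an assumption, not the project's code): drop a
;; single trailing "." so that the result can serve as the local part
;; of a Turtle prefixed name.
(define (strip-trailing-dot str)
  (if (string-suffix? "." str)
      (string-drop-right str 1)
      str))

(display (strip-trailing-dot "gn:caseAttribute_ethn."))
(newline)
```

Strings without a trailing "." pass through unchanged, so the helper is safe to apply to every identifier.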
2022-05-05 | Prefix SQL connection parameters with sql-. | Arun Isaac
This differentiates it from virtuoso and SPARQL connection parameters.
* dump.scm (call-with-genenetwork-database, dump-data-table): Prefix SQL connection parameters with sql-.
* README.md (Using): Update documentation of SQL connection parameters.
2022-05-04 | Special case investigator ID for "Yohan Bossé". | Arun Isaac
* dump.scm (investigator-attributes->id): Add special case for investigator "Yohan Bossé".
2022-03-10 | Ignore 0th command-line argument. | Arun Isaac
* dump.scm (%connection-settings): Use 1st command-line argument.
(%dump-directory): Use 2nd command-line argument.
2022-03-10 | Accept connection parameters and dump directory as arguments. | Arun Isaac
* dump.scm: Import (rnrs programs).
(%connection-settings): New variable.
(call-with-database): Use %connection-settings.
(%database-name): Delete variable.
(%dump-directory): Set from command-line arguments.
(dump-data-table): Use %connection-settings instead of %database-name.
* README.org (Using): Add command-line arguments to usage instructions.
2022-01-04 | Eval macro helper functions at macro expansion time. | Arun Isaac
If these macro helper functions are not evaluated at macro expansion time, the dependent macros will fail to compile.
* dump.scm (string->identifier, field->key, field->assoc-ref, collect-fields, find-clause, remove-namespace, column-id, dump-id): Eval at macro expansion time.
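One way to make a helper available at macro expansion time in Guile is to wrap its definition in eval-when. The sketch below only illustrates that idiom. The names symbol-upcase and define-upcased are hypothetical and do not appear in dump.scm, and the commit may achieve the same effect differently.

```scheme
;; Hypothetical helper: wrapped in eval-when so that it exists when the
;; macro below is expanded during compilation, not only at run time.
(eval-when (expand load eval)
  (define (symbol-upcase sym)
    (string->symbol (string-upcase (symbol->string sym)))))

;; Hypothetical macro that calls the helper while it is being expanded.
(define-syntax define-upcased
  (lambda (x)
    (syntax-case x ()
      ((_ name value)
       (identifier? #'name)
       (with-syntax ((upcased (datum->syntax
                               #'name
                               (symbol-upcase (syntax->datum #'name)))))
         #'(define upcased value))))))

(define-upcased answer 42)   ; defines the variable ANSWER
(display ANSWER)
(newline)
```

Without the eval-when wrapper, compiling a file containing this code would fail because symbol-upcase is not yet bound when define-upcased is expanded.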
2021-12-24 | Do not add to load path in script. | Arun Isaac
* dump.scm: Do not add to load path.
2021-12-24 | Document define-dump. | Arun Isaac
* dump.scm (define-dump): Add docstring.
2021-12-24 | Introduce syntax-let abstraction. | Arun Isaac
* dump.scm (syntax-let): New macro.
(define-dump): Use syntax-let.
* .dir-locals.el (scheme-mode): Indent syntax-let correctly.
2021-12-23 | Automatically create domain triples for predicates. | Arun Isaac
* dump.scm (define-dump): Automatically create domain triples for predicates.
2021-12-23 | Dump metadata about the dump itself. | Arun Isaac
* dump.scm (remove-namespace, dump-id): New functions.
(define-dump): Dump metadata about the dump itself.
2021-12-23 | Disambiguate user and User tables in RDF identifier. | Arun Isaac
* dump.scm (dump-schema, column-id): Disambiguate user and User tables in RDF identifier.
2021-12-23 | Abstract column id generation to separate function. | Arun Isaac
* dump.scm (column-id): New function.
(dump-schema): Use column-id.
2021-12-23 | Do not register dumped tables and columns to %dumped. | Arun Isaac
* dump.scm (%dumped): Delete variable.
(define-dump): Do not register dumped tables and columns to %dumped.
(dumped-table?): Delete function.
2021-12-23 | Specify range of gn:inbredSetOfSpecies. | Arun Isaac
* dump.scm (dump-inbred-set): Set range of gn:inbredSetOfSpecies to gn:species.
2021-12-23 | Use foaf:Person instead of gn:investigator. | Arun Isaac
* dump.scm (dump-info-files): Set range of gn:datasetOfInvestigator to foaf:Person instead of gn:investigator.
2021-12-23 | Add Literal range triples. | Arun Isaac
* dump.scm (dump-species, dump-strain, dump-inbred-set, dump-phenotype, dump-publication, dump-tissue, dump-investigator, dump-avg-method, dump-gene-chip, dump-info-files): Add Literal range triples.
2021-12-23 | Remove duplicate predicates for gn:phenotype. | Arun Isaac
* dump.scm (dump-phenotype): Remove duplicate predicates gn:prePublicationDescription and gn:postPublicationDescription.
2021-12-23 | Rename gn:authors predicate to gn:author. | Arun Isaac
gn:authors is a typo.
* dump.scm (dump-publication): Rename gn:authors predicate to gn:author.
2021-12-23 | Add runtime type checking for triple. | Arun Isaac
* dump.scm (triple): Add runtime type checking.
2021-12-20 | Move schema visualization to separate script. | Arun Isaac
* dump.scm: Do not import (sxml simple) and (dump string-similarity).
(string-remove-suffix-ci, floor-log1024, human-units, human-units-color, sxml->xml-string, sxml->graphviz-html, table-label, table->graphviz-node, column->foreign-table, tables->graphviz-edges): Move to ...
(dump-schema): Dump schema to RDF.
(main): Call dump-schema without setting schema.dot as the output file.
* visualize-schema.scm: ... here.
2021-12-20 | Capture full column type. | Arun Isaac
Capture full column type instead of just whether it is an integer.
* dump.scm (dump-data-table): Capture full column type in <column> object.
* dump/table.scm (<column>)[int?]: Delete member.
[type]: New member.
Export column-type instead of column-int?.
2021-12-20 | Move <table> and <column> types to separate module. | Arun Isaac
* dump.scm (<table>, <column>): Move to ...
* dump/table.scm: ... here.
2021-12-17 | Indent define-dump better. | Arun Isaac
* dump.scm (define-dump): Indent better.
2021-12-17 | Document RDF schema during dumping. | Arun Isaac
* dump.scm (define-dump): Support schema-triples clause.
(dump-strain, dump-publish-xref, dump-info-files): Add schema-triples clause.
(main): Output rdfs: prefix.
2021-12-17 | Make order of clauses in define-dump unspecified. | Arun Isaac
* dump.scm (find-clause): New function.
(define-dump): Make order of clauses unspecified.
2021-12-16 | Make define-dump syntax more concise. | Arun Isaac
* dump.scm (field->key, field->assoc-ref, collect-fields): New functions.
(define-dump): Redefine with more concise syntax.
* dump.scm (dump-species, dump-strain, dump-mapping-method, dump-inbred-set, dump-phenotype, dump-publication, dump-publish-xref, dump-tissue, dump-investigators, dump-avg-method, dump-gene-chip, dump-info-files): Use new define-dump syntax.
(default-metadata-proc): Delete function.
* .dir-locals.el (scheme-mode): Indent triples form correctly.
2021-12-15 | Move string similarity functions to separate module. | Arun Isaac
* dump.scm: Use (dump string-similarity).
(trigrams, jaccard-index, jaccard-string-similarity, jaccard-string-similar?): Move to ...
* dump/string-similarity.scm: ... here.
2021-12-14 | Camel case gn:binomialName. | Arun Isaac
* dump.scm (dump-species): Change gn:binomialname to gn:binomialName.
2021-12-14 | Use node ports to indicate foreign key relations precisely. | Arun Isaac
* dump.scm (table-label): Set port attributes on <td> tag.
(tables->graphviz-edges): Specify ports on edges.
2021-12-14 | Specify appearance using HTML table. | Arun Isaac
* dump.scm (table-label): Unset border attribute of <table> tag. Set cellborder and bgcolor attributes of <table> tag.
(table->graphviz-node): Unset style and fillcolor node attributes. Set shape node attribute to none.
2021-12-14 | Take advantage of bug fixes in bleeding edge (ccwl graphviz). | Arun Isaac
* dump.scm (graph->dot): Delete function.
(sxml->graphviz-html): Return a <html-string> object.
(dump-schema): Use graph->dot from (ccwl graphviz).
2021-12-13 | Abstract out table to graphviz edge conversion. | Arun Isaac
* dump.scm (column->foreign-table, tables->graphviz-edges): New functions.
(dump-schema): Use tables->graphviz-edges.