Age | Commit message | Author |
|
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
* dump.scm (define-dump): Add optional table-metadata? flag that is #f
by default. If this flag is #t, dump metadata about the SQL table
itself.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
* dump.scm (annotate-field): New function.
* dump/triples.scm (triple): Print strings that contain "\"" as they
appear with DISPLAY, thus enabling a triple that looks like:
gn:species_mus_musculus gn:name "Mouse"^^xsd:string
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
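
As an illustration only, the quoting rule described above might be sketched in Python as follows. This is not the project's Guile code: format_triple is a hypothetical name, only string objects are modelled, and the trailing " ." terminator is an assumption taken from Turtle syntax rather than from this commit.

```python
def format_triple(subject: str, predicate: str, obj: str) -> str:
    """Render one Turtle-style triple line (hypothetical helper)."""
    if '"' in obj:
        # The object already contains a double quote, e.g. a typed literal
        # such as "Mouse"^^xsd:string, so emit it verbatim (the analogue of
        # printing with DISPLAY in Scheme).
        object_text = obj
    else:
        # A plain string: wrap it in double quotes and escape backslashes
        # (roughly the analogue of printing with WRITE in Scheme).
        object_text = '"' + obj.replace("\\", "\\\\") + '"'
    # Assumption: each triple is terminated with " ." as Turtle requires.
    return f"{subject} {predicate} {object_text} ."


print(format_triple("gn:species_mus_musculus", "gn:name", '"Mouse"^^xsd:string'))
```

Passing a string without any double quote, such as Mouse, would instead yield a quoted plain literal.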
|
|
* dump.scm (dump-generif-basic): Annotate createTime field with xsd.
* dump.scm (dump-generif): New dump.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
This change enables having:
"... GROUP_CONCAT(GeneRIF_BASIC.PubMedId) AS alias ..."
as part of the query.
* dump.scm (field->key, field->assoc-ref): Add new syntax-rule.
* dump/sql.scm (select-query): Ditto.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
* dump.scm (dump-generif): New data dump.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
* dump.scm (dump-publishfreeze, dump-published-phenotypes): New dumps.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
* dump.scm (phenotype-id->id, dump-phenotype): Delete.
(dump-publish-xref): Delete.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
* dump.scm (dump-inbred-set): Add mapping method and species as extra
metadata for inbredsets.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
This information is already stored in LMDB.
* dump.scm (dump-case-attributes): Delete.
(main)(<dump-case-attributes>): Ditto.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
* dump.scm (dump-info-files): Set a dataset's name to InfoPageName.
|
|
* dump.scm: Add comment with URI to GeneRIF data.
|
|
* README.md: Document generif-data-file parameter in connection
settings.
* dump.scm: Import (srfi srfi-171), (ice-9 regex) and (zlib).
(decode-html-entities, import-generif): New functions.
(main): Call import-generif.
* import-generif.scm: Delete file.
|
|
* dump.scm (string->identifier, string-blank?, triple, prefix): Move
to ...
* dump/triples.scm: ... new file.
* dump.scm: Import (dump triples).
|
|
* dump.scm (string-blank?): Move to ...
* dump/utils.scm (string-blank?): ... here.
|
|
* dump.scm (dump-investigators): Special case Yohan Bossé's last name.
|
|
The AvgMethod table no longer has duplicate "N/A" records.
* dump.scm (dump-avg-method): Do not deduplicate the AvgMethod table.
|
|
In GeneNetwork, a phenotype is currently identified by its
ID (primary key of the table from MariaDB). The only way to relate it
to a publication is through a publication ID. This is important
because some publications have a NULL value for "pubmed ID",
and without the publication ID, some data is lost as there is
no way to point to a publication with a NULL "pubmed ID".
* dump.scm (dump-publish-xref): Define gn:traitId and
gn:publicationId.
Signed-off-by: Arun Isaac <arunisaac@systemreboot.net>
|
|
* dump.scm (dump-publication): Delete vertical tab character in
abstracts.
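
A minimal sketch of this kind of cleanup, written in Python for illustration rather than taken from dump.scm. The function name delete_vertical_tabs is hypothetical, and the sketch assumes the abstract is already available as a string.

```python
def delete_vertical_tabs(abstract: str) -> str:
    """Return the abstract with every vertical tab (U+000B) removed."""
    # "\v" is the vertical tab character; all other whitespace is kept.
    return abstract.replace("\v", "")


cleaned = delete_vertical_tabs("Line one\vLine two")
print(repr(cleaned))
```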
|
|
* dump.scm (dump-groups): New dump.
(main): Call dump-groups.
Signed-off-by: Arun Isaac <arunisaac@systemreboot.net>
|
|
* dump.scm (dump-case-attributes): New dump.
(main): Call dump-case-attributes.
Signed-off-by: Arun Isaac <arunisaac@systemreboot.net>
|
|
A "." at the end of a turtle identifier---for example
"gn:caseAttribute_ethn."---generates an error when trying to validate
the generated RDF.
* dump.scm (string->identifier): Remove trailing "." if it occurs in
the identifier.
Signed-off-by: Arun Isaac <arunisaac@systemreboot.net>
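
The fix above can be sketched as follows. This is an illustrative Python version, not the Guile implementation: strip_trailing_dot is a hypothetical name, and whatever other normalisation string->identifier performs is not modelled here.

```python
def strip_trailing_dot(identifier: str) -> str:
    """Drop a single trailing "." from an identifier, if present."""
    # Per the description above, an identifier such as
    # gn:caseAttribute_ethn. fails validation because of its trailing ".".
    if identifier.endswith("."):
        return identifier[:-1]
    return identifier


print(strip_trailing_dot("gn:caseAttribute_ethn."))
```

Identifiers that do not end in "." pass through unchanged, and a "." elsewhere in the identifier is left alone.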
|
|
This differentiates them from Virtuoso and SPARQL connection parameters.
* dump.scm (call-with-genenetwork-database, dump-data-table): Prefix
SQL connection parameters with sql-.
* README.md (Using): Update documentation of SQL connection
parameters.
|
|
* dump.scm (investigator-attributes->id): Add special case for
investigator "Yohan Bossé".
|
|
* dump.scm (%connection-settings): Use 1st command-line argument.
(%dump-directory): Use 2nd command-line argument.
|
|
* dump.scm: Import (rnrs programs).
(%connection-settings): New variable.
(call-with-database): Use %connection-settings.
(%database-name): Delete variable.
(%dump-directory): Set from command-line arguments.
(dump-data-table): Use %connection-settings instead of %database-name.
* README.org (Using): Add command-line arguments to usage
instructions.
|
|
If these macro helper functions are not evaluated at macro expansion
time, the dependent macros will fail to compile.
* dump.scm (string->identifier, field->key, field->assoc-ref,
collect-fields, find-clause, remove-namespace, column-id, dump-id):
Eval at macro expansion time.
|
|
* dump.scm: Do not add to load path.
|
|
* dump.scm (define-dump): Add docstring.
|
|
* dump.scm (syntax-let): New macro.
(define-dump): Use syntax-let.
* .dir-locals.el (scheme-mode): Indent syntax-let correctly.
|
|
* dump.scm (define-dump): Automatically create domain triples for
predicates.
|
|
* dump.scm (remove-namespace, dump-id): New functions.
(define-dump): Dump metadata about the dump itself.
|
|
* dump.scm (dump-schema, column-id): Disambiguate user and User tables
in RDF identifier.
|
|
* dump.scm (column-id): New function.
(dump-schema): Use column-id.
|
|
* dump.scm (%dumped): Delete variable.
(define-dump): Do not register dumped tables and columns to %dumped.
(dumped-table?): Delete function.
|
|
* dump.scm (dump-inbred-set): Set range of gn:inbredSetOfSpecies to
gn:species.
|
|
* dump.scm (dump-info-files): Set range of gn:datasetOfInvestigator to
foaf:Person instead of gn:investigator.
|
|
* dump.scm (dump-species, dump-strain, dump-inbred-set,
dump-phenotype, dump-publication, dump-tissue, dump-investigator,
dump-avg-method, dump-gene-chip, dump-info-files): Add Literal range
triples.
|
|
* dump.scm (dump-phenotype): Remove duplicate predicates
gn:prePublicationDescription and gn:postPublicationDescription.
|
|
gn:authors is a typo.
* dump.scm (dump-publication): Rename gn:authors predicate to
gn:author.
|
|
* dump.scm (triple): Add runtime type checking.
|
|
* dump.scm: Do not import (sxml simple) and (dump string-similarity).
(dump-schema): Dump schema to RDF.
(main): Call dump-schema without setting schema.dot as the output
file.
(string-remove-suffix-ci, floor-log1024, human-units,
human-units-color, sxml->xml-string, sxml->graphviz-html, table-label,
table->graphviz-node, column->foreign-table, tables->graphviz-edges):
Move to ...
* visualize-schema.scm: ... here.
|
|
Capture full column type instead of just whether it is an integer.
* dump.scm (dump-data-table): Capture full column type in <column>
object.
* dump/table.scm (<column>)[int?]: Delete member.
[type]: New member.
Export column-type instead of column-int?.
|
|
* dump.scm (<table>, <column>): Move to ...
* dump/table.scm: ... here.
|
|
* dump.scm (define-dump): Indent better.
|
|
* dump.scm (define-dump): Support schema-triples clause.
(dump-strain, dump-publish-xref, dump-info-files): Add schema-triples
clause.
(main): Output rdfs: prefix.
|
|
* dump.scm (find-clause): New function.
(define-dump): Make order of clauses unspecified.
|
|
* dump.scm (field->key, field->assoc-ref, collect-fields): New
functions.
(define-dump): Redefine with more concise syntax.
* dump.scm (dump-species, dump-strain, dump-mapping-method,
dump-inbred-set, dump-phenotype, dump-publication, dump-publish-xref,
dump-tissue, dump-investigators, dump-avg-method, dump-gene-chip,
dump-info-files): Use new define-dump syntax.
(default-metadata-proc): Delete function.
* .dir-locals.el (scheme-mode): Indent triples form correctly.
|