Age | Commit message | Author |
2021-12-23 | Add runtime type checking for triple.
* dump.scm (triple): Add runtime type checking.
| Arun Isaac |
2021-12-20 | Move schema visualization to separate script.
* dump.scm: Do not import (sxml simple) and (dump string-similarity).
(string-remove-suffix-ci, floor-log1024, human-units,
human-units-color, sxml->xml-string, sxml->graphviz-html, table-label,
table->graphviz-node, column->foreign-table, tables->graphviz-edges):
Move to ...
(dump-schema): Dump schema to RDF.
(main): Call dump-schema without setting schema.dot as the output
file.
* visualize-schema.scm: ... here.
| Arun Isaac |
2021-12-20 | Add guile-sparql to Guix manifest.
* guix.scm: Add guile-sparql to manifest.
| Arun Isaac |
2021-12-20 | Upgrade ccwl to latest commit.
* guix.scm (ccwl): Upgrade to commit
51c12b7e58685b70e7cfd9612dac403cf9ee845c.
| Arun Isaac |
2021-12-20 | Capture full column type.
Capture full column type instead of just whether it is an integer.
* dump.scm (dump-data-table): Capture full column type in <column>
object.
* dump/table.scm (<column>)[int?]: Delete member.
[type]: New member.
Export column-type instead of column-int?.
| Arun Isaac |
2021-12-20 | Move <table> and <column> types to separate module.
* dump.scm (<table>, <column>): Move to ...
* dump/table.scm: ... here.
| Arun Isaac |
2021-12-17 | Indent define-dump better.
* dump.scm (define-dump): Indent better.
| Arun Isaac |
2021-12-17 | Document RDF schema during dumping.
* dump.scm (define-dump): Support schema-triples clause.
(dump-strain, dump-publish-xref, dump-info-files): Add schema-triples
clause.
(main): Output rdfs: prefix.
| Arun Isaac |
2021-12-17 | Make order of clauses in define-dump unspecified.
* dump.scm (find-clause): New function.
(define-dump): Make order of clauses unspecified.
| Arun Isaac |
2021-12-16 | Make define-dump syntax more concise.
* dump.scm (field->key, field->assoc-ref, collect-fields): New
functions.
(define-dump): Redefine with more concise syntax.
* dump.scm (dump-species, dump-strain, dump-mapping-method,
dump-inbred-set, dump-phenotype, dump-publication, dump-publish-xref,
dump-tissue, dump-investigators, dump-avg-method, dump-gene-chip,
dump-info-files): Use new define-dump syntax.
(default-metadata-proc): Delete function.
* .dir-locals.el (scheme-mode): Indent triples form correctly.
| Arun Isaac |
2021-12-16 | Add tests.
* tests.scm: New file.
| Arun Isaac |
2021-12-16 | Specify map-alist behaviour for multiple set verbs.
* dump/utils.scm (map-alist): Specify behaviour for multiple set
verbs.
| Arun Isaac |
2021-12-16 | Generalize collect-keys and key->assoc-ref.
The generalized versions, collect-forms and translate-forms, will be
required by other macros.
* dump/utils.scm (collect-forms, translate-forms): New public
functions.
(collect-keys): Rewrite in terms of collect-forms.
(key->assoc-ref): Rewrite in terms of translate-forms.
| Arun Isaac |
2021-12-16 | Rename away delete from (srfi srfi-1).
delete from (srfi srfi-1) somehow interferes with the delete verb of
map-alist. It is not clear why.
* dump/utils.scm (dump): Rename delete to srfi:delete while importing.
| Arun Isaac |
2021-12-15 | Move string similarity functions to separate module.
* dump.scm: Use (dump string-similarity).
(trigrams, jaccard-index, jaccard-string-similarity,
jaccard-string-similar?): Move to ...
* dump/string-similarity.scm: ... here.
| Arun Isaac |
2021-12-14 | Camel case gn:binomialName.
* dump.scm (dump-species): Change gn:binomialname to gn:binomialName.
| Arun Isaac |
2021-12-14 | Use node ports to indicate foreign key relations precisely.
* dump.scm (table-label): Set port attributes on <td> tag.
(tables->graphviz-edges): Specify ports on edges.
| Arun Isaac |
2021-12-14 | Specify appearance using HTML table.
* dump.scm (table-label): Unset border attribute of <table> tag. Set
cellborder and bgcolor attributes of <table> tag.
(table->graphviz-node): Unset style and fillcolor node attributes. Set
shape node attribute to none.
| Arun Isaac |
2021-12-14 | Take advantage of bug fixes in bleeding edge (ccwl graphviz).
* dump.scm (graph->dot): Delete function.
(sxml->graphviz-html): Return a <html-string> object.
(dump-schema): Use graph->dot from (ccwl graphviz).
| Arun Isaac |
2021-12-13 | Upgrade to bleeding edge (ccwl graphviz).
This fixes a few bugs and brings in new features from (ccwl graphviz).
* guix.scm: Import (gnu packages autotools), (guix git-download)
and (guix packages). Prefix (gnu packages bioinformatics) imports with
guix:.
(ccwl): New variable.
| Arun Isaac |
2021-12-13 | Abstract out table to graphviz edge conversion.
* dump.scm (column->foreign-table, tables->graphviz-edges): New
functions.
(dump-schema): Use tables->graphviz-edges.
| Arun Isaac |
2021-12-13 | Abstract out table to graphviz node conversion.
* dump.scm (dumped-table?, table-label, table->graphviz-node): New
functions.
(dump-schema): Use table->graphviz-node.
| Arun Isaac |
2021-12-13 | Color table headers by size.
* dump.scm (human-units-color): New function.
(dump-schema): Use human-units-color.
| Arun Isaac |
2021-12-13 | Implement human units conversion in terms of log1024.
This generalizes better and is mathematically cleaner.
* dump.scm (floor-log1024): New function.
(human-units): Use floor-log1024.
| Arun Isaac |
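The log1024 idea in the entry above can be illustrated with a short sketch. It is written in Python rather than the Guile Scheme of dump.scm, and it is not the commit's code: the function bodies, the unit names, and the one-decimal formatting are assumptions made for illustration.

```python
# Illustrative sketch only; names mirror the commit message, bodies are assumed.

UNITS = ["B", "KiB", "MiB", "GiB", "TiB"]  # assumed unit labels


def floor_log1024(n: int) -> int:
    """Return floor(log base 1024 of n) for an integer n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # 1024 is 2**10, so every full 10 bits above the first adds one to the result.
    return (n.bit_length() - 1) // 10


def human_units(n_bytes: int) -> str:
    """Format a byte count using the largest unit that keeps the value >= 1."""
    if n_bytes == 0:
        return "0 B"
    # Cap the exponent so very large values fall back to the biggest known unit.
    exponent = min(floor_log1024(n_bytes), len(UNITS) - 1)
    value = n_bytes / 1024 ** exponent
    return f"{value:.1f} {UNITS[exponent]}"
```

With these assumptions, `human_units(1536)` returns `"1.5 KiB"`. Picking the unit from a single logarithm avoids a chain of threshold comparisons, which is one plausible reading of "generalizes better" in the commit message.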
2021-12-13 | Use sxml to construct graphviz HTML strings.
Using sxml allows us to stay in the world of S-expressions.
* dump.scm (sxml->xml-string, sxml->graphviz-html): New functions.
(dump-schema): Construct graphviz HTML string using sxml.
| Arun Isaac |
2021-12-11 | Highlight dumped tables and columns.
* dump.scm (dump-schema): Highlight tables and columns.
| Arun Isaac |
2021-12-11 | Fix HTML string handling in dot output.
* dump.scm (replace-substrings): New function.
(graph->dot): Fix HTML string handling.
| Arun Isaac |
2021-12-11 | Log dumped tables and columns.
* dump.scm (%dumped): New variable.
(define-dump): Append to %dumped when a new table dumping function is
defined.
| Arun Isaac |
2021-12-11 | Abstract out definition of table dumping functions.
* dump.scm (define-dump): New macro.
(dump-species, dump-strain, dump-mapping-method, dump-inbred-set,
dump-phenotype, dump-publication, dump-publish-xref, dump-tissue,
dump-investigators, dump-avg-method, dump-gene-chip, dump-info-files):
Redefine using define-dump.
| Arun Isaac |
2021-12-11 | Use string similarity and check if foreign key is an integer.
* dump.scm (<column>): New type.
(tables): Use <column> objects to represent columns.
(trigrams, jaccard-index, jaccard-string-similarity): New functions.
(dump-schema): Use string similarity and check if foreign key is an
integer.
| Arun Isaac |
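The trigram-based Jaccard similarity named in the entry above can be sketched as follows. This is a Python illustration, not the Guile Scheme code in dump.scm: the function bodies, the handling of strings shorter than three characters, and the 0.5 threshold are assumptions. How dump-schema applies the similarity to foreign keys is not stated in the log and is not modelled here.

```python
# Illustrative sketch only; names mirror the commit message, bodies are assumed.


def trigrams(s: str) -> set:
    """Return the set of all 3-character substrings of s."""
    return {s[i:i + 3] for i in range(len(s) - 2)}


def jaccard_index(a: set, b: set) -> float:
    """Return |a & b| / |a | b|, or 0.0 when both sets are empty (assumed)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def jaccard_string_similarity(s1: str, s2: str) -> float:
    """Jaccard index of the trigram sets of two strings."""
    return jaccard_index(trigrams(s1), trigrams(s2))


def jaccard_string_similar(s1: str, s2: str, threshold: float = 0.5) -> bool:
    """Hypothetical predicate: similar when the index exceeds the threshold."""
    return jaccard_string_similarity(s1, s2) > threshold
```

For example, "abcd" and "bcde" share one trigram ("bcd") out of three distinct ones, so their similarity under this sketch is 1/3.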
2021-12-11 | Remove rdflib python code.
* rdf.py: Delete file.
| Arun Isaac |
2021-12-11 | Visualize schema.
* .dir-locals.el (scheme-mode): Indent set-table-columns correctly.
* dump.scm: Import (srfi srfi-9 gnu).
(%database-name): New variable.
(<table>): New type.
(tables, string-remove-suffix-ci, human-units, graph->dot,
dump-schema): New functions.
Invoke dump-schema.
* guix.scm: Import (gnu packages bioinformatics). Add ccwl, graphviz
and guile-libyaml to the manifest.
| Arun Isaac |
2021-12-11 | Use select-query.
* dump.scm (get-tables-from-comments, dump-table-fields, dump-species,
dump-strain, dump-mapping-method, dump-inbred-set, dump-phenotype,
dump-publication, dump-publish-xref, dump-tissue, dump-investigators,
dump-avg-method, dump-gene-chip, dump-info-files): Use select-query.
| Arun Isaac |
2021-12-11 | Implement S-expression like SQL select query.
* dump/sql.scm: Import (srfi srfi-1). Export select-query.
(select-query): New macro.
| Arun Isaac |
2021-12-04 | Add emacs directory local variables.
* .dir-locals.el: New file.
| Arun Isaac |
2021-12-04 | Remove redundant camel->lower-camel function.
* dump.scm (camel->lower-camel): Delete function.
(default-metadata-proc): Do not use camel->lower-camel.
| Arun Isaac |
2021-12-04 | Build subjects exclusively with string->identifier.
* dump.scm (dump-mapping-method, dump-publication, dump-info-files):
Use string->identifier to build subjects.
| Arun Isaac |
2021-12-04 | Append an underscore to the identifier prefix.
This is slightly more readable.
* dump.scm (string->identifier): Append an underscore to the
identifier prefix.
| Arun Isaac |
2021-12-04 | Fix indentation.
* dump.scm (get-tables-from-comments, dump-table-fields): Fix
indentation.
| Arun Isaac |
2021-12-04 | Use the map-alist DSL.
* dump.scm: Import (dump utils).
(string-blank?): New function.
(scm->triples): Filter out triples with #f or blank string objects.
(process-metadata-alist): Delete function.
(default-metadata-proc): New function.
(dump-species, dump-strain, mapping-method-name->id, dump-inbred-set,
dump-phenotype, dump-publication, dump-publish-xref, dump-tissue,
dump-investigators, dump-avg-method, dump-gene-chip, dump-info-files):
Use map-alist.
| Arun Isaac |
2021-12-04 | Implement the map-alist DSL.
map-alist is a DSL to transform one association list into
another. These transformations are frequently required when dumping
tables, especially metadata tables.
* dump/utils.scm: New file.
| Arun Isaac |
2021-12-02 | Construct investigator ID using first and last names too.
* dump.scm (investigator-email->id): Rename to
investigator-attributes->id. Use first and last names in addition to
the email ID.
(dump-investigators): Use investigator-attributes->id. Include records
that have no email ID.
(dump-info-files): Use investigator-attributes->id. Include records
that have no email ID.
| Arun Isaac |
2021-12-02 | Use string-delete instead of string-replace-substring.
For the simple task of removing spaces, string-delete is
sufficient. string-replace-substring is overkill.
* dump.scm (fix-email-id): Use string-delete instead of
string-replace-substring.
| Arun Isaac |
2021-12-02 | Abstract out string->identifier.
Building a turtle identifier from a string after removing illegal
characters and prefixing is an extremely common operation. Abstract
it. Also, mandate identifier prefixes. It is better to play it safe.
* dump.scm (string->identifier): New function.
(binomial-name->species-id, dump-strain, mapping-method-name->id,
inbred-set-name->id, aphenotype-id->id, tissue-short-name->id,
investigator-email->id, avg-method-name->id, gene-chip-name->id): Use
string->identifier.
| Arun Isaac |
2021-12-02 | Document delete-substrings.
* dump.scm (delete-substring): Add docstring.
| Arun Isaac |
2021-12-01 | Deal with AvgMethodId = 0.
* dump.scm (dump-info-files): Deal with AvgMethodId.
| Arun Isaac |
2021-12-01 | Use InfoFileTitle instead of InfoPageTitle for dataset name.
Not all datasets have a non-NULL InfoPageTitle field.
* dump.scm (dump-info-files): Use InfoFileTitle instead of
InfoPageTitle for dataset name.
| Arun Isaac |
2021-12-01 | Extract name of dataset group.
* dump.scm (dump-info-files): Extract name of dataset group.
| Arun Isaac |
2021-12-01 | Do not link inbred-set to mapping-method.
Not all inbred sets have a mapping method, and the mapping method of
the inbred set has, so far, not been used anywhere.
* dump.scm (mapping-method-name->id, dump-mapping-method): Mark as
unused.
(dump-inbred-set): Do not link inbred-set to mapping-method.
| Arun Isaac |
2021-12-01 | Allow N/A avg method.
* dump.scm (dump-avg-method): Allow N/A in name.
(dump-info-files): Allow N/A in avg-method-name.
(avg-method-name->id): Replace / with _.
| Arun Isaac |