Age  Commit message  Author
2023-06-19  Chunk probeset dump  Munyoki Kilyungi
The probeset table has many columns and about 5 million rows, so the dump can be huge. One problem with a dump that large is that rapper fails with an out-of-memory error. This commit chunks the data to make linting and uploading it more manageable.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
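The chunking idea can be sketched as follows. This is a minimal Python illustration, not the repository's Scheme code: the helper names `chunked` and `write_chunks`, the `dump-probeset` file stem, and the `.ttl` extension are all assumptions made for the example.

```python
from itertools import islice
from pathlib import Path
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def write_chunks(lines: Iterable[str], size: int, out_dir: str,
                 stem: str = "dump-probeset") -> List[Path]:
    """Write each batch of lines to its own numbered file (hypothetical naming)."""
    paths = []
    for index, batch in enumerate(chunked(lines, size), start=1):
        path = Path(out_dir) / f"{stem}-{index:04d}.ttl"
        path.write_text("\n".join(batch) + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

Each resulting file is small enough to be linted and uploaded on its own, so no single tool invocation has to hold the whole dump in memory.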
2023-06-15  Change the transaction log mode to 'autocommit' when deleting graph  Munyoki Kilyungi
During typical server operations, deleting one or more graphs that contain a large number of triples can exhaust available memory before the operation completes, so the graph cannot be deleted. Such large graphs can be cleared by changing the transaction log mode to autocommit. For more, see: https://vos.openlinksw.com/owiki/wiki/VOS/VirtTipsAndTricksGuideDeleteLargeGraphs
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
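As a rough illustration, the statements sent to Virtuoso's isql could be assembled like this. The sketch assumes, based on the linked guide, that `log_enable(3, 1)` switches the session to row-by-row autocommit and that the graph is removed with `SPARQL CLEAR GRAPH`. The helper name `delete_graph_statements` is hypothetical and is not taken from the repository.

```python
def delete_graph_statements(graph_uri: str) -> str:
    """Build isql statements that clear a large graph in autocommit mode."""
    # Reject characters that cannot appear inside a SPARQL IRI reference.
    if not graph_uri or any(ch in graph_uri for ch in '<>" \n\t'):
        raise ValueError(f"not a usable graph URI: {graph_uri!r}")
    return "\n".join([
        # Assumption from the Virtuoso guide: 3 = logging on + row-by-row autocommit.
        "log_enable(3, 1);",
        f"SPARQL CLEAR GRAPH <{graph_uri}>;",
    ])
```

The returned text is meant to be piped to an isql session. How the repository actually issues these statements is not shown in this log.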
2023-06-12  Add a dataset's full name when dumping  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Use PublishFreeze Name as a fallback for InfoPageName  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Annotate a phenotype dataset with 'dataset:  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Replace "Unknown" in Publication fields with an empty string  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Annotate publications without a pmid with "unpublished"  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Use "publication:" as an identifier for publications  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Remove check for confidentiality/publicity during dataset dump  Munyoki Kilyungi
Should a dataset not be confidential/public, it's marked as private.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Add geoPlatform to dataset dump  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Add missing Species join when dumping datasets  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Use the correct title for a dataset  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Rename AboutProcessing alias to AboutDataProcessing  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Replace citations to source from Datasets instead of InfoFiles  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Rename sql-alias of GN notes during dataset dump  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Add rdf defs for datasetOfSpecies  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Add publicationTitle to dataset dump  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Make geoSeries a link during dataset dump  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Minor indentation fix  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Add organization to dataset dump  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Use NCBI taxonomy browser instead of uniprot  Munyoki Kilyungi
NCBI presents data in a user-friendly way directly from the taxid. To get the same data from uniprot, you need a unique identifier, which requires an extra query to retrieve.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
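To make the point concrete, a taxonomy link can be derived from the taxid alone, with no lookup step. The sketch below assumes the NCBI Taxonomy Browser's `wwwtax.cgi?id=` URL pattern. The function name `taxonomy_url` is hypothetical, and the URL the repository actually emits is not shown in this log.

```python
from urllib.parse import urlencode

# Assumed base URL of the NCBI Taxonomy Browser.
NCBI_TAXONOMY_BROWSER = "https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi"


def taxonomy_url(taxid: int) -> str:
    """Build a taxonomy browser link directly from an NCBI taxid."""
    if taxid <= 0:
        raise ValueError("taxid must be a positive integer")
    return f"{NCBI_TAXONOMY_BROWSER}?{urlencode({'id': taxid})}"
```

For example, `taxonomy_url(10090)` builds the link for Mus musculus without any additional query.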
2023-06-12  Cast longtext fields to fix broken utf-8 characters  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Remove dead comments  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Remove "^[Nn]one$" from some fields from the InfoFiles table  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Default to InfoFiles with the Dataset being a fallback for citations  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-06-12  Make ((g/ph)enotype/probeset)Dataset subset of Dataset  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-30  Dump OMIM as a normal string without any annotation  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-30  Fix broken utf-8 strings in abstract field  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-30  Add probeset as an extra prefix  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-30  Delete unused prefixes  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-30  Replace species with inbredSet metadata when dumping info-files  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-30  Only dump public data  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-30  Allow uploading all files in a directory to virtuoso  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-30  Replace PUT with a POST when uploading data in virtuoso  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-30  Sanitize a probeset's description  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-30  Update the probeset's identifier(object)  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-30  Make a probeset's tissue a multiset  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-30  Make a probeset's alias a multiset  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-30  Move probeset metadata used for a given experiment to its own dump  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-26  Use a phenotype's abbreviation as its name  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-26  Make tissue a multi-set during the probeset dump  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-26  Remove some metadata from the probeset dump  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-26  Create a new probesetfreeze dump  Munyoki Kilyungi
This way, the probeset dump will become smaller.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-26  Allow load-rdf script to read in data from a dir  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-26  Return an empty string if a dataset doesn't have a name  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-26  Remove unnecessary fields from probeset dump  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-26  Sanitize a generif entry from GN  Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-26  Add genotype dump  Munyoki Kilyungi
* examples/dump-genotype.scm: New dump for genotypes and their associated datasets (that were not dumped from the info-files table).
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-26  Fix broken utf-8 chars when dumping an investigator's names  Munyoki Kilyungi
* examples/dump-dataset-metadata.scm (dump-investigators) <foaf:name, foaf:givenName>: Binary convert fields first to latin1 then utf8.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
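The latin1-then-utf8 round trip can be illustrated outside the database. The following is a Python analogue of the idea, not the SQL expression used in the repository (which this log does not show). The function name `repair_mojibake` is hypothetical, and the sketch assumes the stored text is UTF-8 that was mis-decoded as Latin-1.

```python
def repair_mojibake(text: str) -> str:
    """Undo UTF-8 bytes that were mis-decoded as Latin-1.

    Re-encoding as Latin-1 recovers the original bytes, which are then
    decoded as the UTF-8 they were all along. Text that does not fit
    this pattern is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside Latin-1, or the bytes
        # are not valid UTF-8, so it was probably never mojibake.
        return text
```

Plain ASCII and already-correct accented names pass through untouched, so the repair is safe to apply to a whole column. Note that MySQL's `latin1` is closer to Windows-1252, so a few characters may behave differently there.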
2023-05-26  Remove an investigator's email/phone from dump for privacy reasons  Munyoki Kilyungi
* examples/dump-dataset-metadata.scm (dump-investigators) <foaf:phone, foaf:mbox>: Delete.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>