-rw-r--r--  topics/biohackathon/GNGSoC2023.gmi  26
-rw-r--r--  topics/systems/virtuoso.gmi         18
2 files changed, 39 insertions, 5 deletions
diff --git a/topics/biohackathon/GNGSoC2023.gmi b/topics/biohackathon/GNGSoC2023.gmi
index 623a6c6..a5f9019 100644
--- a/topics/biohackathon/GNGSoC2023.gmi
+++ b/topics/biohackathon/GNGSoC2023.gmi
@@ -59,6 +59,10 @@ git repo genenetwork-machines, guix-bioinformatics
* Added Klaus server for git
* Fixed channels to work with Python 3.9 instead of 3.10
+### Week 6
+
+* Towards a new workbench with cgit support
+
### More
=> https://ci.genenetwork.org/jobs/guix-bioinformatics
@@ -112,7 +116,11 @@ git repo genenetwork3
* Metadata - renamed prefixes
* Short names gn: gnt:
* Updated virtuoso
-* parsing geno files - lmdb support
+* parsing geno files - lmdb support
+
+### Week 6
+
+* Improvements to RDF
## LLMs & metadata (RDF)
@@ -155,6 +163,10 @@ git repo genenetwork3
* Expose container to Rupert
* Add a password
+### Week 6
+
+* Very close to a working Flask app Rupert can try next week
+
## API to access data from GN
* lead: Rupert
@@ -203,6 +215,11 @@ git repo gn-docs & genenetwork3 & SPARQL
* Added back-end support for wikidata - finding inconsistencies
+### Week 6
+
+* API ready for running in a production environment
+* Using latest RDF
+
## Editing data
* lead: Fred
@@ -229,6 +246,9 @@ git repo gn-docs & genenetwork3 & SPARQL
* Editing has gone on production - fixing issues
* Discussion on REST API
+### Week 6
+
+* Progress on authorization and editing
## Guix parametrization
@@ -259,6 +279,10 @@ git repo gn-docs & genenetwork3 & SPARQL
* Proposed DSL for parameters
+### Week 6
+
+* Posted blog post and started implementation
+
## Links
* Matrix room is GNSoC2023
diff --git a/topics/systems/virtuoso.gmi b/topics/systems/virtuoso.gmi
index 2b2526c..c6a42cf 100644
--- a/topics/systems/virtuoso.gmi
+++ b/topics/systems/virtuoso.gmi
@@ -143,7 +143,13 @@ curl -v -X PUT --digest -u 'dba:password' -T data.ttl -G http://localhost:8892/s
```
where http://genenetwork.org is the name of the graph. Note that single-quoting the password is good practice, especially when the password contains special characters.
-The PUT method deletes the existing data in the graph before loading the new one. So, there is no need to manually delete old data before loading new data. However, virtuoso is slow at deleting millions of triples, resulting in an apparent freeze-up. So, it is preferable to handle such deletes manually using a lower-level SQL statement issued via the isql client.
+The PUT method deletes the existing data in the graph before loading the new data, so there is usually no need to delete old data manually before loading new data. A POST method can be used instead to append data without replacing the graph. Note, however, that Virtuoso is slow at deleting millions of triples, resulting in an apparent freeze-up, so it is preferable to handle such large deletes manually using a lower-level SQL statement issued via the isql client.
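+As a sketch, a POST that appends data.ttl to the same graph could look like the following. The endpoint path and graph parameter here are assumptions based on Virtuoso's SPARQL 1.1 Graph Store Protocol support; adjust them to match the PUT command above.
+
+```
+# Append (rather than replace) triples in the named graph.
+# Endpoint path is an assumption; match it to your PUT invocation.
+curl -v -X POST --digest -u 'dba:password' -T data.ttl 'http://localhost:8892/sparql-graph-crud-auth?graph=http://genenetwork.org'
+```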
+
+Start isql with something like
+
+```
+guix shell --expose=verified-data=/var/lib/data virtuoso-ose -- isql -U dba -P password 8981
+```
To delete a graph:
@@ -152,10 +158,12 @@ $ isql
SQL> DELETE FROM rdf_quad WHERE g = iri_to_id('http://genenetwork.org');
```
-To add ttl files:
+To add ttl files through isql:
```
-ld_dir('/dir', '*.ttl', 'http://genenetwork.org'); rdf_loader_run();
+ld_dir('/dir', '*.ttl', 'http://genenetwork.org');
+rdf_loader_run();
+checkpoint;
```
=> http://vos.openlinksw.com/owiki/wiki/VOS/VirtTipsAndTricksGuideDeleteLargeGraphs How can I delete graphs containing large numbers of triples from the Virtuoso Quad Store?
@@ -197,6 +205,8 @@ Also, make sure that the load list is empty before registering your turtle files
DELETE FROM DB.DBA.load_list;
```
+Note that the directory may be mapped to a different location by the service. On tux02 it is `/export/data/genenetwork-virtuoso/`.
+
Use isql to register all the turtle files:
```
@@ -215,7 +225,7 @@ Check the table DB.DBA.load_list to see the list of registered files that will b
SQL> SELECT * FROM DB.DBA.load_list;
```
-Perform the bulk load of all data by running:
+Complete the actual bulk load of all data by running:
```
SQL> rdf_loader_run();