author Arun Isaac 2022-05-05 16:37:29 +0530
committer Arun Isaac 2022-05-05 17:17:05 +0530
commit 55b722aa459639666bdb42218f7bfab81717fd35 (patch)
tree d7f6bac7f6057ca9e6191d7ccd1395b7495b861e /topics
parent fce8ae5156c852fd448fc3909c0dddd47332d6bc (diff)
download gn-gemtext-55b722aa459639666bdb42218f7bfab81717fd35.tar.gz
topics: Document RDF validation, virtuoso graph loading and deletion.
* topics/systems/virtuoso.gmi (Loading data into virtuoso): Document RDF validation, virtuoso graph loading and deletion.
Diffstat (limited to 'topics')
-rw-r--r-- topics/systems/virtuoso.gmi 22
1 files changed, 13 insertions, 9 deletions
diff --git a/topics/systems/virtuoso.gmi b/topics/systems/virtuoso.gmi
index 7d12331..60cb45f 100644
--- a/topics/systems/virtuoso.gmi
+++ b/topics/systems/virtuoso.gmi
@@ -113,20 +113,24 @@ For ease of implementation, SPARQL 1.1 also specifies an additional REST-like AP
=> https://www.w3.org/TR/sparql11-http-rdf-update/ SPARQL 1.1 Graph Store HTTP Protocol
The virtuoso documentation shows examples of using this protocol with cURL.
=> http://vos.openlinksw.com/owiki/wiki/VOS/VirtGraphProtocolCURLExamples Virtuoso SPARQL 1.1 Graph Store HTTP Protocol examples using cURL
-We recap the same here. First delete the existing graph with something like
+We recap the same here.
+
+When uploading data, the virtuoso server often does not report errors properly. It simply freezes up. So, it is very helpful to validate your RDF before uploading. For this, use rapper from the raptor2 package. To validate data.ttl, a turtle file, run
```
-curl -v --digest --user dba:password --verbose --url -G http://localhost:28890/sparql-graph-crud-auth --data-urlencode graph=https://BioHackrXiv.org/graph -X DELETE
+rapper --input turtle --count data.ttl
```
-Next update the graph with
+Then, upload it to a virtuoso SPARQL endpoint running at port 8892
```
-curl -v -X PUT --digest -u dba:password -H Content-Type:text/turtle -T test/data/biohackrxiv.ttl -G http://localhost:28890/sparql-graph-crud-auth --data-urlencode graph=https://BioHackrXiv.org/graph
+curl -v -X PUT --digest -u dba:password -T data.ttl -G http://localhost:8892/sparql-graph-crud-auth --data-urlencode graph=http://example.org
```
-where https://BioHackrXiv.org/graph is the name of the graph (in this example). A python version can be found in
-=> https://github.com/pubseq/bh20-seq-resource/blob/master/scripts/update_virtuoso/check_for_updates.py
-
-## Validate data using rapper
+where http://example.org is the name of the graph.
-TODO
+The PUT method deletes the existing data in the graph before loading the new one. So, there is no need to manually delete old data before loading new data. However, virtuoso is slow at deleting millions of triples, resulting in an apparent freeze-up. So, it is preferable to handle such deletes manually using a lower-level SQL statement issued via the isql client.
+```
+$ isql
+SQL> DELETE FROM rdf_quad WHERE g = iri_to_id('http://example.org');
+```
+=> http://vos.openlinksw.com/owiki/wiki/VOS/VirtTipsAndTricksGuideDeleteLargeGraphs How can I delete graphs containing large numbers of triples from the Virtuoso Quad Store?
## Virtuoso.ini