author Munyoki Kilyungi 2022-12-06 16:48:41 +0300
committer Munyoki Kilyungi 2022-12-06 16:49:12 +0300
commit 30f048daa357cbf21e94f89ebc6fdd0dc3903048 (patch)
tree 4245db55b56e13d3aec6ffc0815d15a61d1b0eca /topics/systems
parent 58fb126c2e861b7fcf19700878bd8e6a70b8d0d3 (diff)
Update instructions on dumping data to a ttl file
Diffstat (limited to 'topics/systems')
-rw-r--r-- topics/systems/virtuoso.gmi 49
1 files changed, 49 insertions, 0 deletions
diff --git a/topics/systems/virtuoso.gmi b/topics/systems/virtuoso.gmi
index 1a6b8b9..e9d841d 100644
--- a/topics/systems/virtuoso.gmi
+++ b/topics/systems/virtuoso.gmi
@@ -153,3 +153,52 @@ When virtuoso has just been started up with a clean state (that is, the virtuoso
=>https://github.com/genenetwork/dump-genenetwork-database/commit/8f60fde7f5499e5ffe352d7ae98a2de34a91b89f
Retry uploading to virtuoso (commit from dump-genenetwork-database repo)
formerly (https://git.genenetwork.org/arunisaac/dump-genenetwork-database/commit/8f60fde7f5499e5ffe352d7ae98a2de34a91b89f)
+
+## Dumping data from a MySQL database
+
+To dump data into a ttl file, first make sure that you are in the "dump-genenetwork-database" repository:
+
+=> https://github.com/genenetwork/dump-genenetwork-database/ Dump Genenetwork Database
+
+Next, drop into a development environment with:
+
+```
+$ guix shell
+```
+
+Build the sources:
+
+```
+$ make
+```
+
+Describe the database connection parameters in a file named *conn.scm*, as shown below. Take care to replace the example values with the ones appropriate for your setup.
+
+```
+((sql-username . "root")
+ (sql-password . "root")
+ (sql-database . "db_webqtl_s")
+ (sql-host . "localhost")
+ (sql-port . 3306)
+ (virtuoso-port . 8891)
+ (virtuoso-username . "dba")
+ (virtuoso-password . "dba")
+ (sparql-scheme . http)
+ (sparql-host . "localhost")
+ (sparql-port . 8892))
+```
+
+Then, to dump the database to ~/data/dump, run:
+
+```
+$ ./pre-inst-env ./dump.scm conn.scm ~/data/dump
+```
+
+Make sure there is enough free space! It's best to dump the database on penguin2 where disk space and bandwidth are not significant constraints.
+
+Then, validate the dumped RDF using `rapper` and load it into virtuoso. This will load the dumped RDF into the `http://genenetwork.org` graph, and will delete all pre-existing data in that graph.
+
+```
+$ rapper --input turtle --count ~/data/dump/dump.ttl
+$ ./pre-inst-env ./load-rdf.scm conn.scm ~/data/dump/dump.ttl
+```