GENENETWORK SPARQL endpoint
SPARQL is the query language for our RDF database. This endpoint can export results as HTML, JSON, and TSV.
Note that we also created a REST API that mirrors this endpoint and executes similar queries. See the REST API.
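To choose between the export formats from a script, a client can set the Accept header on the request. The following is a minimal Python sketch, assuming the endpoint follows the standard SPARQL 1.1 Protocol (query passed as a `query` GET parameter). The ENDPOINT URL is a hypothetical placeholder and build_request is an illustrative helper, not part of GeneNetwork.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholder: substitute the real endpoint URL.
ENDPOINT = "https://example.org/sparql"

# Standard media types for the three result formats mentioned above.
MEDIA_TYPES = {
    "json": "application/sparql-results+json",
    "tsv": "text/tab-separated-values",
    "html": "text/html",
}


def build_request(query, fmt="json", endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL Protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


req = build_request("SELECT * WHERE { ?s ?p ?o } LIMIT 5", fmt="tsv")
print(req.full_url)
print(req.get_header("Accept"))
```

To actually run the query, pass the request to urllib.request.urlopen(req) and read the response body.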
Example SPARQL queries:
Get species info
- list_species() - List available species.
PREFIX gn: <http://genenetwork.org/id/>
PREFIX gnc: <http://genenetwork.org/category/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX gnt: <http://genenetwork.org/term/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
SELECT DISTINCT * WHERE {
?s rdf:type gnc:species .
?s ?p ?o .
}
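Most of the later queries on this page omit the PREFIX lines. When running them from a script, the shared prefixes can be prepended automatically. A minimal Python sketch; with_prefixes is a hypothetical helper name, and the prefix table is copied from the queries on this page.

```python
# Namespace prefixes copied from the queries on this page.
PREFIXES = {
    "gn": "http://genenetwork.org/id/",
    "gnc": "http://genenetwork.org/category/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "gnt": "http://genenetwork.org/term/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "taxon": "http://purl.uniprot.org/taxonomy/",
    "fabio": "http://purl.org/spar/fabio/",
    "dct": "http://purl.org/dc/terms/",
}


def with_prefixes(body, prefixes=None):
    """Prepend one PREFIX line per namespace to a SPARQL query body."""
    prefixes = PREFIXES if prefixes is None else prefixes
    header = "\n".join(f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items())
    return f"{header}\n{body.strip()}\n"


query = with_prefixes("""
SELECT DISTINCT * WHERE {
  ?s rdf:type gnc:species .
  ?s ?p ?o .
}
""")
print(query)
```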
Get 'group' or population info
- list_groups("drosophila") - List available groups of datasets
PREFIX gn: <http://genenetwork.org/id/>
PREFIX gnc: <http://genenetwork.org/category/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX gnt: <http://genenetwork.org/term/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
SELECT ?inbredSet WHERE {
?species rdf:type gnc:species .
?species skos:altLabel "drosophila" .
?inbredSet rdf:type gnc:inbredSet .
?inbredSet gnt:belongsToSpecies ?species .
}
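The species label is the only part of this query that changes between calls, so a list_groups-style wrapper can fill it in. A minimal Python sketch under that assumption; sparql_string and list_groups_query are hypothetical helper names, and the PREFIX lines are left out of the generated body for brevity.

```python
def sparql_string(value):
    """Quote a Python string as a SPARQL string literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def list_groups_query(species):
    """Return the group-listing query body for a species label such as "drosophila"."""
    return f"""SELECT ?inbredSet WHERE {{
  ?species rdf:type gnc:species .
  ?species skos:altLabel {sparql_string(species)} .
  ?inbredSet rdf:type gnc:inbredSet .
  ?inbredSet gnt:belongsToSpecies ?species .
}}"""


print(list_groups_query("drosophila"))
```

Escaping the label before it is placed in the query keeps a stray quote in user input from breaking or altering the query.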
List all sets with species and description:
SELECT DISTINCT ?set ?species ?descr WHERE {
?set rdf:type gnc:inbredSet ;
gnt:belongsToSpecies ?species .
OPTIONAL {?set rdfs:label ?descr } .
}
And list all 50+ sets for Mouse:
SELECT DISTINCT * WHERE {
?inbredSet rdf:type gnc:inbredSet ;
gnt:belongsToSpecies gn:Mus_musculus .
OPTIONAL {?inbredSet rdfs:label ?descr }.
}
Show set info for one 'group' without tissue info:
SELECT DISTINCT * WHERE {
gn:inbredSetHsnih-palmer ?p ?o .
FILTER ( !EXISTS{ gn:inbredSetHsnih-palmer gnt:hasTissue ?o }) .
}
List all datasets for a group/population:
- list_datasets("BXD") - List available datasets for a given group (here, "BXD").
SELECT DISTINCT * WHERE {
?dataset gnt:belongsToInbredSet gn:inbredSetBxd ;
rdfs:label ?descr .
}
Pick one, e.g. http://genenetwork.org/id/Devneocortex_ilm6_2p14rinv_1111 or gn:Devneocortex_ilm6_2p14rinv_1111
SELECT DISTINCT * WHERE {
gn:Devneocortex_ilm6_2p14rinv_1111 ?p ?o .
}
This will show something like:
http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://genenetwork.org/category/probesetDataset
http://purl.org/dc/terms/created "2011-11-18"
http://www.w3.org/2004/02/skos/core#prefLabel "BIDMC/UTHSC Dev Neocortex P14 ILMv6.2 (Nov10)"
http://genenetwork.org/term/belongsToInbredSet http://genenetwork.org/id/inbredSetBxd
http://vocab.fairdatacollective.org/gdmt/hasCreatorAffiliation "Beth Israel Deaconess Medical Center"
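When the same query is requested as JSON, the predicate/object pairs arrive as bindings rather than as a two-column listing. A minimal Python sketch of flattening them, assuming the endpoint returns the standard W3C SPARQL JSON results layout. SAMPLE is hand-written from two of the rows above (it is not captured endpoint output), and bindings_to_rows is a hypothetical helper name.

```python
import json

# Hand-written sample in the W3C SPARQL JSON results layout,
# built from two of the rows listed above (not real endpoint output).
SAMPLE = """
{
  "head": {"vars": ["p", "o"]},
  "results": {"bindings": [
    {"p": {"type": "uri", "value": "http://purl.org/dc/terms/created"},
     "o": {"type": "literal", "value": "2011-11-18"}},
    {"p": {"type": "uri", "value": "http://genenetwork.org/term/belongsToInbredSet"},
     "o": {"type": "uri", "value": "http://genenetwork.org/id/inbredSetBxd"}}
  ]}
}
"""


def bindings_to_rows(payload):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    data = json.loads(payload)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in data["results"]["bindings"]
    ]


rows = bindings_to_rows(SAMPLE)
for row in rows:
    print(row["p"], row["o"], sep="\t")
```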
Another way to list datasets with the name that is used in GN:
SELECT DISTINCT ?dataset ?datasetName WHERE {
?dataset rdf:type/rdfs:subClassOf gnc:dataset .
?dataset rdfs:label ?datasetName .
?dataset gnt:belongsToInbredSet ?inbredSet .
?inbredSet skos:altLabel "BXD" .
}
To list all datasets:
SELECT DISTINCT ?dataset ?datasetName WHERE {
?dataset rdf:type/rdfs:subClassOf gnc:dataset .
?dataset rdfs:label ?datasetName .
}
And count them!
SELECT (COUNT(?dataset) AS ?count) WHERE {
?dataset rdf:type/rdfs:subClassOf gnc:dataset .
}
893 at last count(!)
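A count query returns a single row with a single column, so a script can read the number without knowing what the result variable is called. A minimal Python sketch, assuming JSON output in the W3C SPARQL results layout; SAMPLE is hand-written using the figure quoted above, and single_count is a hypothetical helper name.

```python
import json

# Hand-written sample (not real endpoint output); the variable name may differ.
SAMPLE = """
{
  "head": {"vars": ["count"]},
  "results": {"bindings": [
    {"count": {"type": "literal",
               "datatype": "http://www.w3.org/2001/XMLSchema#integer",
               "value": "893"}}
  ]}
}
"""


def single_count(payload):
    """Return the integer from a one-row, one-column SPARQL JSON result."""
    data = json.loads(payload)
    (binding,) = data["results"]["bindings"]  # exactly one row expected
    (cell,) = binding.values()  # exactly one column expected
    return int(cell["value"])


print(single_count(SAMPLE))
```

The tuple unpacking raises a ValueError if the result has more or fewer than one row or column, which catches a query that was not an aggregate.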
- info_dataset("CB_M_1004_P") - Get meta information about a data set using the GN name:
SELECT DISTINCT * WHERE {
?s rdfs:label "CB_M_1004_P" .
?s ?p ?o .
}
(You should be using the dataset identifier here rather than the label.)
- info_datasets("B6D2F2") - Get meta information about all data sets for a group.
SELECT DISTINCT * WHERE {
?s rdf:type/rdfs:subClassOf gnc:dataset .
?s gnt:belongsToInbredSet ?inbredSet .
?inbredSet skos:altLabel "B6D2F2" .
?s ?p ?o .
}
- info_pheno("BXD", "10038") - Get summary information for a phenotype
The following works if you change the gnt prefix to terms. This is a bug.
PREFIX gn: <http://genenetwork.org/id/>
PREFIX gnc: <http://genenetwork.org/category/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX gnt: <http://genenetwork.org/term/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX fabio: <http://purl.org/spar/fabio/>
PREFIX dct: <http://purl.org/dc/terms/>
SELECT DISTINCT * WHERE {
?s rdf:type gnc:phenotype .
?inbredSet skos:altLabel "BXD" .
?s gnt:belongsToInbredSet ?inbredSet.
?s gnt:traitName "10001" .
?s ?p ?o .
OPTIONAL {
?pub fabio:hasPubMedId ?pmid .
?s dct:isReferencedBy ?pmid .
?pub ?pubTerms ?pubResult .
}
}
- get_pheno("BXD", "10646") - Get phenotype values for a classical trait.
Use LMDB instead.
- get_geno("BXD") - Get genotypes for a group.
Use LMDB instead.
- run_gemma("BXDPublish", "10015") - Perform a genome scan with gemma
- run_rqtl("BXDPublish", "10015") - Perform a genome scan with R/qtl
- run_correlation("HC_M2_0606_P", "BXDPublish", "1427571_at") - Finds traits that are correlated with a given trait.
These are not available in SPARQL.