author Munyoki Kilyungi 2023-08-29 13:22:54 +0300
committer Munyoki Kilyungi 2023-08-29 13:22:54 +0300
commit 7bab9517d30a70fe4be1239bce737baa5ed8233d
tree 3848b77a64a7a6fc2671eadc12de6d0b72eeacd0 /topics/next-gen-databases
parent 9dfd7c24b9f3f6e0d43e56678f0249e795d0c1bc
Document proposed classification scheme
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
Diffstat (limited to 'topics/next-gen-databases')
-rw-r--r-- topics/next-gen-databases/gn-classification-scheme.gmi | 45
1 files changed, 45 insertions, 0 deletions
diff --git a/topics/next-gen-databases/gn-classification-scheme.gmi b/topics/next-gen-databases/gn-classification-scheme.gmi
new file mode 100644
index 0000000..2d66f3c
--- /dev/null
+++ b/topics/next-gen-databases/gn-classification-scheme.gmi
@@ -0,0 +1,45 @@
+# Understanding GN's Classification Scheme
+
+In GeneNetwork (GN), data is grouped into specific discrete categories. Let's dive into the currently implemented scheme:
+
+* Species: This category groups data based on different species, like humans, mice, or plants.
+* Set/Group (InbredSet): This groups data according to a group of genetically similar organisms.
+* DatasetType: This classifies data according to its type. There are three main types: Genotypes (genetic makeup), Molecular Traits (molecular characteristics), and Phenotypes (observable traits).
+
+This scheme is inspired by GN's menu structure, which forms the skeleton of the proposed classification. You can query metadata about the classification under "gn:ResourceClassificationScheme". The scheme has 3 levels:
+
+* gnc:DatasetType
+* gnc:Set
+* gn:Species
+
+Here's a deeper look at each level:
+
+* gnc:DatasetType: This level encompasses subcategories like "gnc:Probeset," "gnc:Genotype," and "gnc:Phenotype."
+* gnc:Set: This level includes all the members listed in the InbredSet table.
+* gn:Species: This level consists of all the members from the Species table.
+
+Most of the resources in GN can be categorized accurately with this scheme. Instead of defining specific properties like "gnt:belongsToSpecies" or "gnt:belongsToSet," we can use XKOS⁰ and link a resource to each of its categories with the "xkos:classifiedUnder" relationship. Here's an example of a resource that has been classified:
+
+```
+gn:Gtexv8_sto_0220 xkos:classifiedUnder gnc:Probeset .
+gn:Gtexv8_sto_0220 xkos:classifiedUnder gn:setGtex_v8 .
+```
+
+This means that the resource "gn:Gtexv8_sto_0220" is classified under the category "gnc:Probeset" and also under the set "gn:setGtex_v8."
+
+To query this classification using SPARQL, we can use the following code snippet:
+
+```
+PREFIX xkos: <http://rdf-vocabulary.ddialliance.org/xkos#>
+PREFIX gnc: <http://genenetwork.org/category/>
+PREFIX gn: <http://genenetwork.org/id/>
+PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
+SELECT * WHERE {
+ gn:Gtexv8_sto_0220 xkos:classifiedUnder ?datasetType .
+ gn:Gtexv8_sto_0220 xkos:classifiedUnder ?set .
+ gnc:DatasetType skos:member ?datasetType .
+ gnc:Set skos:member ?set .
+}
+```
+
+=> https://rdf-vocabulary.ddialliance.org/xkos.html ⁰ XKOS: An SKOS extension for representing statistical classifications.
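
The SPARQL query in the patch above joins two kinds of triples: the resource's "xkos:classifiedUnder" links and the "skos:member" links of each level. As a rough illustration of that join, here is a minimal Python sketch over a toy in-memory triple set. It is not GN's code or API. The two "xkos:classifiedUnder" triples come from the example in the document; the two "skos:member" triples are assumptions inferred from the query pattern, and the helper names `objects` and `classify` are hypothetical.

```python
# Toy in-memory triple set; an illustration only, not GN's data or API.
TRIPLES = {
    # Taken from the example in the document:
    ("gn:Gtexv8_sto_0220", "xkos:classifiedUnder", "gnc:Probeset"),
    ("gn:Gtexv8_sto_0220", "xkos:classifiedUnder", "gn:setGtex_v8"),
    # Assumed membership triples, implied by the query's skos:member patterns:
    ("gnc:DatasetType", "skos:member", "gnc:Probeset"),
    ("gnc:Set", "skos:member", "gn:setGtex_v8"),
}


def objects(subject, predicate):
    """Return every object o such that (subject, predicate, o) is stored."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def classify(resource):
    """Mimic the query's join: pair each matching dataset type with each set."""
    classes = objects(resource, "xkos:classifiedUnder")
    # Keep only the classes that are members of the corresponding level.
    dataset_types = classes & objects("gnc:DatasetType", "skos:member")
    sets = classes & objects("gnc:Set", "skos:member")
    return [
        {"datasetType": d, "set": s}
        for d in sorted(dataset_types)
        for s in sorted(sets)
    ]


print(classify("gn:Gtexv8_sto_0220"))
```

Because both levels are reached through the same "xkos:classifiedUnder" property, it is the "skos:member" lookups that tell a dataset type apart from a set. A resource with no classification triples yields an empty result.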