author Pjotr Prins 2022-12-05 09:18:00 -0600
committer Pjotr Prins 2022-12-05 09:18:00 -0600
commit e808ddf723dfb7d7a2094eec4d85e96e63ee7a3a
tree d938c2a6686e92438fb3e43ba8c33f0f543c9ab7
parent 938d50706127d1d12d5015c41be892261075e5e5
Rename
Diffstat (limited to 'features/search.md')
-rw-r--r-- features/search.md 44
1 files changed, 44 insertions, 0 deletions
diff --git a/features/search.md b/features/search.md
new file mode 100644
index 0000000..dad4373
--- /dev/null
+++ b/features/search.md
@@ -0,0 +1,44 @@
+# Search
+
+## Overview
+
+One of the key features of GN is its powerful search functionality. For most users it is the entry point for using the GeneNetwork web service. On the front page, a menu allows selecting species, e.g. mouse or rat, and relevant datasets grouped by family, e.g. BXD, and type, e.g. Hippocampus mRNA.
+Recently we introduced the Xapian search engine that allows for fast lookups and powerful search queries.
+An example search for BRCA2 has GN look for the term "BRCA2" in 754 datasets and 39,765,944 traits across 10 species, returning 7998 results that match the query.
+The search URL looks like 'https://genenetwork.org/gsearch?type=gene&terms=BRCA2' and can be copy-pasted and shared with other users.
+
+More powerful queries narrow the search to specific fields in the result table. For example, to get only mouse results, the query "species:mouse BRCA2" found 5916 results.
+
+Example search terms:
+
+```
+species:mouse BRCA2 - looks good, has Liver
+species:mouse Tissue:Liver - 0 results
+species:mouse tissue:Liver - 0 results
+brca - only renders 2 results, why not BRCA2?
+```
+
+Keywords like tissue appear to be case sensitive. This should not be the case.
+
+GeneNetwork search functionality is used in the publication `Integrative Functional Genomics for Systems Genetics in GeneWeaver.org`. The authors present their web service by searching for a "nociception" QTL in a region on chromosome 4 found in GN. They export the details and use them for further analysis in GeneWeaver\cite{bubier2016}.
+
+In the publication `Systems Genetics of Obesity`, GeneNetwork search is used to find genes contributing to collagen content in adipose tissue in the BXD and to find similar cis-eQTL connected with adipose tissue in other studies hosted on GN\cite{brockmann2016}.
+
+In the paper `New Insights on Gene by Environmental Effects of Drugs of Abuse in Animal Models Using GeneNetwork`, the authors reanalyzed older experimental data in the GeneNetwork database.
+They discovered QTLs on mouse chromosomes 3, 5, 9, 11, and 14 that were not found in the original study.
+New candidate genes included Slitrk6 and Cdk14. Slitrk6, in a chromosome 14 QTL for locomotion, was found to be part of a co-expression network involved in voluntary movement and associated with neuropsychiatric phenotypes. Cdk14, one of only three genes in a chromosome 5 QTL, is associated with handling-induced convulsions after ethanol treatment and is regulated by the anticonvulsant drug valproic acid\cite{PMC9024903}.
+
+## Future
+
+* Add human SNP search and synteny
+* Support more sources, e.g. geneweaver
+
+## Methods
+
+GN search is built on
+Xapian, an Open Source Search Engine Library, released under the GPL v2+. It's written in C++, with bindings to allow use from Python, Guile and other languages.
+Xapian is actively maintained and current Xapian users include, for example, the Debian website and the notmuch E-mail indexer.
+Xapian is a highly adaptable toolkit which allows developers to easily add advanced indexing and search facilities to their own applications. It has built-in support for several families of weighting models and also supports a rich set of boolean query operators\cite{Xapian}.
+We build the Xapian index from SQL data and RDF data in the GN databases.
+
+=> topics/xapian-indexing.gmi indexing optimizations
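
The shareable search URL described in the overview can also be built and read back programmatically. The following is a minimal sketch in Python, assuming the `gsearch` endpoint takes exactly the `type` and `terms` query parameters shown in the example URL. The function names `build_search_url` and `parse_search_url` are hypothetical helpers, not part of GeneNetwork, and whether the server accepts percent-encoded field queries such as `species:mouse BRCA2` in this form is an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base endpoint, taken from the example URL in the overview text.
GSEARCH_URL = "https://genenetwork.org/gsearch"


def build_search_url(terms, search_type="gene"):
    """Return a shareable search URL for a query string (hypothetical helper)."""
    # urlencode percent-encodes characters such as ':' and turns spaces into '+'.
    query = urlencode({"type": search_type, "terms": terms})
    return f"{GSEARCH_URL}?{query}"


def parse_search_url(url):
    """Recover the (type, terms) pair from a shared search URL (hypothetical helper)."""
    params = parse_qs(urlsplit(url).query)
    return params["type"][0], params["terms"][0]


if __name__ == "__main__":
    print(build_search_url("BRCA2"))
    print(build_search_url("species:mouse BRCA2"))
```

Keeping the query in the URL rather than in session state is what makes a search easy to copy, paste and share, as the overview notes.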