summaryrefslogtreecommitdiff
path: root/issues/annotation.gmi
diff options
context:
space:
mode:
Diffstat (limited to 'issues/annotation.gmi')
-rw-r--r--issues/annotation.gmi34
1 files changed, 34 insertions, 0 deletions
diff --git a/issues/annotation.gmi b/issues/annotation.gmi
new file mode 100644
index 0000000..c5688dd
--- /dev/null
+++ b/issues/annotation.gmi
@@ -0,0 +1,34 @@
+# Annotation
+
+Annotation, assembly and liftover are recurring themes because they are tied to (updating) reference genomes. Here we track a next generation annotation system which will support versions of annotation with a matching reference genome. In the future we'll build this out with pangenome support.
+
+# Tags
+
+* assigned: arun,pjotrp,zsloan,arthurc,robw
+* priority: medium
+* type: enhancement
+* status: open, in progress
+* keywords: mapping, architecture
+
+
+# TODO
+
+- [ ] Point out where annotation is used in code (@zsloan)
+- [ ] Come up with a storage model to replace the existing Mariadb tables (pjotrp, aruni)
+- [ ] Implement storage backend
+- [ ] Create examples to replace existing code
+- [ ] Replace all code to allow for selecting versions
+
+# Notes
+
+=> https://github.com/genenetwork/genenetwork2/blob/testing/wqflask/wqflask/interval_analyst/GeneUtil.py GeneList table
+
+is only used here, with all position fields being the ones that would ideally account for assembly (txStart, txEnd, cdsStart, cdsEnd, exonStarts, exonEnds - no idea why this last one has an "s" after Start/End)
+
+=> https://github.com/genenetwork/genenetwork2/blob/df36cc808e21202075d32559c9f295087bd8aee2/wqflask/base/trait.py#L528-L597
+
+where max LRS locus position is fetched for traits
+
+=> https://github.com/genenetwork/genenetwork2/blob/df36cc808e21202075d32559c9f295087bd8aee2/wqflask/wqflask/do_search.py#L819-L857 - Search queries
+
+notably the Position Search (the most "ambitious" version of this would allow for querying based off of different assemblies, but I can't think of a good way to do this with SQL)