summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--issues/add-homology-track.gmi2
-rw-r--r--issues/annotation.gmi34
2 files changed, 35 insertions, 1 deletions
diff --git a/issues/add-homology-track.gmi b/issues/add-homology-track.gmi
index 89aa88d..70422f0 100644
--- a/issues/add-homology-track.gmi
+++ b/issues/add-homology-track.gmi
@@ -17,7 +17,7 @@ I currently have it displaying regions in mouse with text and a UCSC genome brow
# Description
-The idea is to add a new track to mouse/rat mapping figures that provides links to the corresponding region for humans in the UCSC Genome Browser.
+The idea is to add a new track to mouse/rat mapping figures that provides links to the corresponding region for humans in the UCSC Genome Browser.
This can be done using these UCSC chain files. Links to the files and an explanation about their format are below:
- Help file: https://genome.ucsc.edu/goldenPath/help/chain.html
diff --git a/issues/annotation.gmi b/issues/annotation.gmi
new file mode 100644
index 0000000..c5688dd
--- /dev/null
+++ b/issues/annotation.gmi
@@ -0,0 +1,34 @@
+# Annotation
+
+Annotation, assembly and liftover are recurring themes because they are tied to (updating) reference genomes. Here we track a next generation annotation system which will support versions of annotation with a matching reference genome. In the future we'll build this out with pangenome support.
+
+# Tags
+
+* assigned: arun,pjotrp,zsloan,arthurc,robw
+* priority: medium
+* type: enhancement
+* status: open, in progress
+* keywords: mapping, architecture
+
+
+# TODO
+
+- [ ] Point out where annotation is used in code (@zsloan)
+- [ ] Come up with a storage model to replace the existing Mariadb tables (pjotrp, aruni)
+- [ ] Implement storage backend
+- [ ] Create examples to replace existing code
+- [ ] Replace all code to allow for selecting versions
+
+# Notes
+
+=> https://github.com/genenetwork/genenetwork2/blob/testing/wqflask/wqflask/interval_analyst/GeneUtil.py GeneList table
+
+is only used here, with all position fields being the ones that would ideally account for assembly (txStart, txEnd, cdsStart, cdsEnd, exonStarts, exonEnds - no idea why this last one has an "s" after Start/End)
+
+=> https://github.com/genenetwork/genenetwork2/blob/df36cc808e21202075d32559c9f295087bd8aee2/wqflask/base/trait.py#L528-L597
+
+where max LRS locus position is fetched for traits
+
+=> https://github.com/genenetwork/genenetwork2/blob/df36cc808e21202075d32559c9f295087bd8aee2/wqflask/wqflask/do_search.py#L819-L857 - Search queries
+
+notably the Position Search (the most "ambitious" version of this would allow for querying based off of different assemblies, but I can't think of a good way to do this with SQL)