# Annotation Annotation, assembly and liftover are recurring themes because they are tied to (updating) reference genomes. Here we track a next generation annotation system which will support versions of annotation with a matching reference genome. In the future we'll build this out with pangenome support. # Tags * assigned: pjotrp,zsloan,arthurc,robw * priority: medium * type: enhancement * status: open, in progress * keywords: mapping, architecture # TODO * [ ] Point out where annotation is used in code (@zsloan) * [ ] Come up with a storage model to replace the existing Mariadb tables (pjotrp, aruni) * [ ] Implement storage backend * [ ] Create examples to replace existing code * [ ] Replace all code to allow for selecting versions # Notes => https://github.com/genenetwork/genenetwork2/blob/testing/wqflask/wqflask/interval_analyst/GeneUtil.py GeneList table is only used here, with all position fields being the ones that would ideally account for assembly (txStart, txEnd, cdsStart, cdsEnd, exonStarts, exonEnds - no idea why this last one has an "s" after Start/End) => https://github.com/genenetwork/genenetwork2/blob/df36cc808e21202075d32559c9f295087bd8aee2/wqflask/base/trait.py#L528-L597 where max LRS locus position is fetched for traits => https://github.com/genenetwork/genenetwork2/blob/df36cc808e21202075d32559c9f295087bd8aee2/wqflask/wqflask/do_search.py#L819-L857 - Search queries notably the Position Search (the most "ambitious" version of this would allow for querying based off of different assemblies, but I can't think of a good way to do this with SQL)