Diffstat (limited to 'tasks')
-rw-r--r-- | tasks/alexm.gmi | 86
-rw-r--r-- | tasks/bonfacem.gmi | 103
-rw-r--r-- | tasks/felixl.gmi | 128
-rw-r--r-- | tasks/fredm.gmi | 16
-rw-r--r-- | tasks/machine-room.gmi | 8
-rw-r--r-- | tasks/octopus.gmi | 3
-rw-r--r-- | tasks/pjotrp.gmi | 87
-rw-r--r-- | tasks/programmer-team/meetings.gmi | 82
-rw-r--r-- | tasks/roadmap.gmi | 65
-rw-r--r-- | tasks/zachs.gmi | 7
10 files changed, 545 insertions, 40 deletions
diff --git a/tasks/alexm.gmi b/tasks/alexm.gmi
index 88d3927..7ec8e87 100644
--- a/tasks/alexm.gmi
+++ b/tasks/alexm.gmi
@@ -1,4 +1,4 @@
-# Tasks for Fred
+# Tasks for Alex
 
 ## Description
 
@@ -16,11 +16,83 @@ You can refine the search by constraining the checks some more, e.g. to get high
 
 # Tasks
 
-* [ ] Make GNQA reliable (with @fahamu)
-* [ ] Improve UX for GNQA (with @shelbys)
-* [ ] GNQA add abstracts pubmed (with @shelbys)
+## This week
+
+* [ ] Start application - Pwani
+* - [X] Got all transcripts
+* [+] Correlations - Fred is having issues - Rust updated on Guix
+* - also take a look at long running SQL statement and large LIMIT value (check prod!)
+* [ ] Friend of UTHSC - Pjotr needs to send forms
+* [+] Disable spinner on production (check prod!)
+* [+] Rqtl2 - BXD output work on CD
+* - [ ] should go to production w. fredm
+ Disable for Production
+* - [X] DO mice family file - children are heterozygous - family file contains parents->child
+* - [X] DO GN2 compatible by generating .geno files
+ Test on CD
+* [+ ] Minor refactorings - Rqtl2 is hacky
+* [ ] Work in development system container and document
+=> https://git.genenetwork.org/gn-machines/commit/?h=gn-local-development-container&id=589dcf32be90f5ec827cb6976d3cb5838d500ac0
+* [+] Create terminal output for external processes on *PRODUCTION* (Rqtl1, Rqtl2, GEMMA, pair-scan are done --- WGCNA as a pilot, with @bonfacem and @pjotrp)
+
+
+## (14/4/25)
+
+* [x] Debug DO results for for genenetwork2
+ * [x] inspect results from gn3 and display mapping results
+ * [x] Debug db tunneling connection
+ * [x] Debug rendering huge datatables
+
+## (21/4/25)
+* [x] QTL computation for the DO dataset
+ * [x] Debug rendering large datasets using datatables
+ * [x] fix issue with qtl2 plot for DO dataset
+ * [x] Caching for qtl2 computations
+
+* [] Pwani Campus Application
+
+## 28/4/25
+
+* [x] Push changes to CD/Production
+* [x] Enable RQTL2 only for DO/bxd dataset
+* [] look at integrating QTL for HS dataset
+* [x] setup local container with bons
+
+## 5/05/25
+
+* [] Integrate hsrat dataset for rqtl2 mapping.
+* [] Pwani campus application.
+* [] Look at caching for genotype probabilities (rqtl2).
+* [] Add full logs on the mapping results page.
+* [x] Add test feature flag for rqtl2.
+
+## 2/06/2025
+
+* work onsubset for hs dataset;; define founder genotype files??
+* script to dump genotypes to db with bons
+* experiment with caching for Genotypic probabilities rds objects
+* work on genenetwork llms how to make search without login
+
+* masters ; submit documents
+
+## Next week(s)
+
+* [ ] Accelerate Xapian functionality - needs Aider key from Pjotr
+* Check and fix CTL?
+* [+] Create terminal output for external processes (Rqtl1, Rqtl2, pair-scan are done --- WGCNA as a pilot, with @bonfacem and @pjotrp)
+* [X] GNQA says there are no results, but has them
+* [X] Correlations are slow
+
+## Done
+
+* [X] Rqtl1 - ITP output - 3K individuals - family file
+* [X] When bonz is ready wire up GNQA
+* + balg-qa.genenetwork.org
+* [X] Don't support new PIL - stick to the old one in guix-bioninformatics
+* [X] Make GNQA reliable (with @fahamu)
+* [X] Improve UX for GNQA (with @shelbys) -- Adrian wants to use our AI UX for their setup
+* [X] GNQA add abstracts pubmed (with @shelbys)
 => ../issues/fetch-pubmed-references-to-gnqa
 
+* [X] Edit markdown/gemtext pages through web UI (with @bonfacem)
+
-* [ ] Edit markdown/gemtext pages through web UI (with @bonfacem)
-* [ ] GNQA add GN metadata with @bonfacem
-* [ ] Create terminal output for external processes (WGCNA as a pilot, with @bonfacem and @pjotrp)
diff --git a/tasks/bonfacem.gmi b/tasks/bonfacem.gmi
index 52f4027..03848f1 100644
--- a/tasks/bonfacem.gmi
+++ b/tasks/bonfacem.gmi
@@ -8,9 +8,62 @@
 
 ## Tasks
 
-* [X] Indexing generif data / Improve Local Search
-* [ ] Add hashes to RDF metadata
-* [-] Brain Data (To be spec'ed further)
+### Note
+- GN-auth dashboard fixes. Follow up with Fred.
+- Case-attributes used in co-variates.
+- Encourage FahamuAI to be open.
+
+### This week
+* [+] Case Attributes (Do a diagnostic and delegate)
+* - Git blame. Add tests.
+* - Error when checking the history.
+* - Reach out to Zach.
+* - Disable diff in the UI.
+* [ ] Distinct admin and dev user.
+* [ ] Adapter to LMDB into a cross object.
+* - Try computations with R/qtl2.
+* - Look at R LMDB libraries.
+* - Look at functions that read the files.
+* - PJ: LMDB adapter in R and cross-type files.
+* [ ] Send Arun an e-mail on how to go about upgrading shepherd.
+* [ ] Dump all genotypes from production to LMDB.
+* - PJ sync tux01 genotypes with tux02/04.
+* [+] Correlations hash.
+* - Add dataset count to RDF.
+* [ ] Spam + LLMs
+* - RateLimiting for Rif Editing.
+* - Honep Pot approach.
+* [+] Help Alex with SSL certification container error.
+* - Put the changes in the actual scm files.
+* [X] Python Fahamu.
+* [X] Memvid - brief look.
+
+### Later
+* [ ] Dockerise GN container. For Harm.
+* [ ] Send emails when job fail.
+* [ ] Look at updating gn-auth/gn-libs to PYTHONPATH for gn2/3.
+* [ ] Sample/individual/strain/genometype counts for PublishData only - ProbeSetData? https://github.com/genenetwork/genenetwork2/blob/testing/scripts/sample_count.py - mirror in RDF and use global search
+* - search for all traits that have more than X samples
+* [ ] Add case attributes to RDF and share with Felix (depends on @felixl)
+* [ ] xapian search, add dataset size keys, as well as GN accession id, trait id, and date/year
+* - Improve xapian markdown docs to show all used fields/keys with examples
+* - genewiki search (link in table? check with Rob)
+* - base line with GN1 search - add tests
+* - Fix missing search term for sh* - both menu search and global search
+* - Use GN1 as a benchmark for search results (mechanical Rob?)
+* - Xapian ranges for markers
+
+### Even later
+
+* [ ] Rest API for precompute output (mapping with GEMMA)
+* [ ] GNQA add GN metadata (to RAG)
+* - Focus on RIF
+* - triple -> plain text
+* - bob :fatherof nancy -> Bob is the father of Nancy.
+
+## Later
+
+* [ ] AI improvements
 
 ### On going tasks
 
@@ -34,3 +87,47 @@ Should something in one of these closed issues be amiss, we can always and shoul
 Currently closed issues are:
 
 => https://issues.genenetwork.org/search?type=closed-issue&query=assigned%3ABonfaceKilz%20AND%20type%3Aissue%20AND%20is%3Aclosed Closed Issues
+
+* [X] Indexing generif data / Improve Local Search
+* [X] lmdb publishdata output and share with Pjotr and Johannes
+
+## Done
+
+* [X] Add lmdb output hashes with index and export LMDB_DATA_DIRECTORY
+* [X] Share small database with @pjotrp and @felixl
+* [X] With Alex get rqtl2 demo going in CD (for BXD)
+* [X] Set up meeting with ILRI
+* - Zasper https://news.ycombinator.com/item?id=42572057 - Alan
+* [X] Migrate fahamuai RAG to VPS and switch tokens to GGI OpenAI account
+* 1. Running AI server using (our) VPS and our tokens
+* + Pjotr gives API key - OpenAI - model?
+* 2. Read the code base - Elixir is plumbing incl. authentication, Python processing text etc.
+* 3. Try ingestion and prompt (REST API) - check out postgres tables
+* 4. Backup state from production Elixir
+* 5. Assess porting it to Guix (don't do any work) - minimum version Elixir
+* 6. Get docs from Shelby/Brian
+* [X] Set-up grobit on balg01
+* - guix docker/native
+* - recent breaking changes
+* [X] GeneRIF
+* - Merge recent changes first. Ping Rob.
+* - Brainstorm ideas around log-in.
+* - Unlimited tokens that don't expire.
+* - Sync prod with CD -- sqlite.
+* - Add deletion
+* [X] Describe Generif/wikidata access for Rob in an email with test account on CD
+* 1. Send email to Rob
+* 2. Work on production w. Fred
+* [X] Distinguish CD from production -- banners/buttons/colors.
+* [X] Use aider - give a presentation in the coming weeks
+* [X] gn-auth fixes
+* [X] Assess Brian's repo for deployment.
+* [X] Finish container work
+* - View diffs in BXD: Edit case attributes throws an error.
+* [X] Check small db from: https://files.genenetwork.org/database/
+* [X] Changes to Production + (Alex)
+* [X] File issue with syslog
+* [X] LMDB database.
+* - Simplify (focus on small files). Don't over-rely on Numpy.
+* [X] Assess adding GeneRIF to LLM.
+* [X] Referrer headers -- a way of preventing bots beyond rate-limiting.
diff --git a/tasks/felixl.gmi b/tasks/felixl.gmi
index 209e8c9..347f387 100644
--- a/tasks/felixl.gmi
+++ b/tasks/felixl.gmi
@@ -1,4 +1,4 @@
-# Tasks for Munyoki
+# Tasks for Felix
 
 ## Tags
 
@@ -6,12 +6,134 @@
 * assigned: felixl
 * status: in progress
 
-## October
+## Tasks
 
+### Goals
+
+1. Write papers for PhD
+2. Load data into GN - serve the communities
+3. Get comfortable with programming
+
+#### Previous week(s)
+
+* [x] Restless Legs Syndrome (RLS) - 'Traditional Phewas' - AI aspect - Johannes
+* [+] Finalize the slide deck - so it can be read on its own
+* [.] Review paper: one-liners for @pjotrp - why is this important for GN and/or thesis
+* - [ ] list of relevant papers with one-liners - the WHY
+=> https://pmc.ncbi.nlm.nih.gov/articles/PMC3294237/
+* [+] Analyse and discuss BXD case attributes with Rob --- both group level and dataset level
+* [ ] Sane representation of case attributes in RDF with @bonfacem
+* [X] Present C.elegans protocol and example mappings with GEMMA/Rqtl
+* [ ] Uploader - setting up code with @fredm
+* - [ ] Concrete improvement to work on
+* - [X] run small database mysql locally
+* - [X] aider with Sonnet + code fixes
+* - [ ] document - add to code base - merge with Fred's tree - share changes with Pjotr & team
+* [ ] Sort @alexm application with Pwani = this week
+
+### This week (07-04-2025 onwards)
+
+* GN2 tasks
+ * [ X ] Progress on Kilifish
+ - meet with Dennis (send him an email with all the queries needed)
+ - progress to format and upload data to gn2 (to be ready by latest Friday!)
+ * [ X ] Make a milestone with genotype smoothing
+
+* PhD tasks
+ * [ X ] Complete and share concept note and timeline to supervisors, have a meeting for progress
+ * [ ] Make a milestone on chapter one manuscript (deep dive into the selected papers){THE BIG PICTURE; a complete draft by early May}
+
+* Programming
+ * [ ] Make a milestone with the uploader (really push and learn!)
+ - documentation (use ai); add to the code base of the uploader
+ - utilise the hurdles to learn programming priniciples in action
+
+### This week (14-04-2025 onwards)
+
+* gn-uploader programming
+ * [X] - Resolve the config file issue with your local uploader
+ * [ ] - Run the uploader locally, then break the system, see how components connect to each other
+ * [ ] - document your findings
+
+* genotype smoothing
+ * [ ] - resolve errors with plotting, document your findings
+
+### This week (21-04-Onwards)
+
+* genotype smoothing
+ * [ ] - haplotyping tools for smoothing (plink,., etc)
+ - see what it can offer with smoothing. See what others say about this.
+
+* gn-uploader programming
+ * [ ] - Run the uploader locally, then break the system, see how components connect to each other (ask help from Bonz)
+ * [ ] - document your findings
+
+### This week (28-04-Onwards)
+* gn-uploader programming
+ * [X] - Run the uploader locally, then break the system, see how components connect to each other (ask help from Bonz)
+ * [X] - document your findings
+ {Get help from your teammates/AI to jump start this!, swallow your pride! :(}
+
+* genotype smoothing
+ * [X] Keep refining the following:
+ * [X] filtering power adapted from plink
+ * [X] the xsomes mix up in the plot (probably the phenotype data?)
+ * [X] Update findings and push to github
+
+### This week (05-05-Onwards)
+* programming (gn-uploader)
+ * [ ] - pick one file each day, review it, understand it
+ * [ ] - pair programming with Alex on test runs
+
+* HS rats scripts
+ * [ ] - prepare/refine scripts to quickly process HS rats file
+ * [ ] - assist alex with hs rats cross info
+
+* AOBs
+ * [ X ] Weekly meetings
+ * [ X ] follow up with Paul on his progress
+ * [ X ] follow up on the MSc bioinformatics project
+ * [ X ] follow up on Alex's application with Pwani
+
+### (12-05-onwards)
+ * [X] - HS genotypes scripting
+
+### (19-05-onwards)
+ * [X] - HS genotypes debugging (memory issue)
+ * [X] - pair programming with Bonz to improve the script
+
+### this week (26-05-onwards)
+ * [X] - process the genotype file for hs rats
+ * [X] - approach by tissues categories
+ * [X] - adipose and liver
+ - test by Xsomes for memory capture
+ - run the working commands
+ * [X] - the rest 10 other tissues (in progress)
+ * [X] - *.bed file vs the updated vcf files from the website?
+
+### this week (02-06-onwards)
+* [X] - process the genotypes for the rest of the 10 tissues for HS rats
+* [X] - document the new findings about smoothing using bcftools and plink
+
+* ## this week (09-06-onwards)
+* [ ] - identify start and end points for haplotypes in hs genotype files
+* [ ] - upload the final updates to gn2, test and see the results
+* [ ] - gn-uploader/uploader folder, explore
+
+### Later weeks (non-programming tasks)
+
+* [ ] Kilifish into GN
+* [ ] Review paper on genotyping
+* [ ] HS Rat
+* [ ] Prepare others for C.elegans
 * [ ] Upload Arabidopsis dataset
 * [ ] Upload Medaka dataset
+* [ ] Work on improved DO and Ce genotyping
+
+### Done
+
+
 
-## Tasks
 
 ### On going tasks
 => https://issues.genenetwork.org/search?query=assigned%3Afelixl+AND+is%3Aopen&type=open-issue All in-progress tasks
diff --git a/tasks/fredm.gmi b/tasks/fredm.gmi
index 5e7e71d..1cd3125 100644
--- a/tasks/fredm.gmi
+++ b/tasks/fredm.gmi
@@ -1,5 +1,21 @@
 # Tasks for Fred
 
+# Tags
+
+* kanban: fredm
+* assigned: @fredm
+* status: in progress
+
+# Tasks
+
+* [ ] Add drives to Penguin2, see issues/systems/penguin2-raid5
+* [X] Move production files from sdc to sde
+* [ ] Fix password weakness
+* [ ] Fix gn-docs and editing, e.g. facilities page by gn-guile in container
+* [ ] Unifiy container dirs
+* [ ] Fix wikidata gene aliases (see mapping page) with @pjotrp
+* [ ] Public SPARQL container?
+
 ## Description
 
 These are the tasks and issues to be handled by Fred.
diff --git a/tasks/machine-room.gmi b/tasks/machine-room.gmi
index badac82..77f7b8e 100644
--- a/tasks/machine-room.gmi
+++ b/tasks/machine-room.gmi
@@ -11,17 +11,19 @@
 
 ## GN
 
+* [ ] penguin2 has 90TB of space we can use on NFS/backups
+* [ ] Script to replace reaper with GEMMA
 * [ ] Transfer nervenet.org to dnsimple
-* [ ] Trait vectors for Johannes
+* [+] Trait vectors for Johannes
+* [X] grub on tux04
 * [ ] nft on tux04
 * [ ] !!Organize pluto, update Julia and add apps to GN menu Jupyter notebooks
-* [ ] !!Xusheng jumpshiny services
+* [+] !!Xusheng jumpshiny services
 * [ ] Fix apps and create system containers for herd services - see issues/systems/apps
 * [ ] Slurm+ravanan on production for GEMMA speedup
 * [ ] Embed R/qtl2 (Alex)
 * [ ] Hoot in GN2 (Andrew)
 * [ ] tux02 certbot failing (manual now)
-* [ ] penguin2 has 32TB of space we can use on NFS/backups
 
 ## Octopus:
 
diff --git a/tasks/octopus.gmi b/tasks/octopus.gmi
index 27232ec..61955ec 100644
--- a/tasks/octopus.gmi
+++ b/tasks/octopus.gmi
@@ -2,6 +2,9 @@
 
 In this file we track tasks that need to be done.
 
+Tuxes still have some 30x 2.5" slots.
+Lambda has 18x 2.5" slots.
+
 # Tasks
 
 * [X] get lizardfs and NFS going on tuxes tux06-09
diff --git a/tasks/pjotrp.gmi b/tasks/pjotrp.gmi
index b284c46..57620aa 100644
--- a/tasks/pjotrp.gmi
+++ b/tasks/pjotrp.gmi
@@ -6,35 +6,69 @@
 * assigned: pjotrp
 * status: in progress
 
-# Notes
-
-The tasks here should probably be broken out into appropriately tagged issues, where they have not - they can be found and filtered out with tissue (formerly gnbug).
+# Current
 
-=> https://issues.genenetwork.org
+## 1U01HG013760
 
-Generally work applies to NIH/R073237482 and other grants.
+* Prefix-Free Parsing Compressed Suffix Tree (PFP) for tokenization
+* Mempang
 
-# Current
+* [+] create backup server with @fredm
+* [+] RAG with Shelby and Bonz
+* [+] Moni builds 1U01HG013760
+* [+] test framework wfmash - vertebrate tree and HPC compute?
+* - wfmash - wgatools -> PAF + FASTA to VCF
+* - wfmash arch=native build
+* [ ] gbam - data compression with Nick and Hasithak
+* [X] accelerate wfmash with @santiago and team
+* [+] package wfmash and Rust wfa2-lib
+* [ ] add Ceph for distributed network storage 1U01HG013760
+* [ ] Work on pangenome genotyping 1U01HG013760
+* [ ] update freebayes into Debian (version #)
+* - [ ] static build and prepare for conda
+* [ ] update vcflib into Debian (version #)
+* - [ ] static build and prepare for conda
+* [ ] pangenome as a 1st class input for GEMMA
+* kilifish pangenome with Paul and Dario
 
 ## Systems
 
+* [+] jumpshiny
+* [ ] pluto
+* [ ] Backup production databases on Tux04
+* - [+] Dump containers w. databases
+* - [X] Dump mariadb
+* - [ ] backup remote
+* - [ ] borg-borg
+* - [ ] fix root scripts
 * [ ] make sure production is up to scratch (see stable below)
-* [ ] backup tux04
-* [ ] add Ceph for distributed network storage 1U01HG013760
+* [ ] synchronize git repos for public, CD, fallback and production using sheepdog and document
 * [ ] drop tux02 backups on balg01
-* [ ] drop backups NL
-* [ ] reintroduce borg-borg
+* [X] Small database public
 
 ## Ongoing tasks (current/urgent)
 
-* [ ] Precompute
-* [+] Set up stable GeneNetwork server instance with new hardware (see below)
-=> /topics/systems/fire-up-genenetwork-system-container.gmi
+* [ ] ~Felix, Alex, Rahul as friends of UTHSC
+* [ ] Precompute with GEMMA
+ + [ ] Store N
+ + [ ] Store significance levels
+ + [ ] Check genotype input data
+ + [ ] Imputation
+ + [ ] Do same with bulkLMM
+ + [ ] Generate lmdb output
+ + [ ] Hook into Xapian
+ + [ ] Hook into correlations
+
 * [ ] Check email setup tux04
-* [+] Julia as part of GN3 deployment
+* [ ] jbrowse plugin code - https://genenetwork.trop.in/mm10
+* [+] bulklmm Julia as part of GN3 deployment
+ - precompute & Julia
+=> https://github.com/GregFa/TestSysimage
+ Here the repo with BulkLMMSysimage:
+=> https://github.com/GregFa/BulkLMMSysimage
 => /topics/deploy/julia.gmi
-* [ ] Work on pangenome genotyping 1U01HG013760
-* [+] Moni builds 1U01HG013760
+* [X] Set up stable GeneNetwork server instance with new hardware (see below)
+=> /topics/systems/fire-up-genenetwork-system-container.gmi
 
 # Tasks
 
@@ -51,11 +85,11 @@ Now (X=done +=WIP _=kickoff ?=?)
 * [+] Build leadership team
 * [+] gBAM
 * [ ] p-value global search
-* [+] Xapian search add tags, notmuch style (with @zachs)
+* [+] Xapian search add tags, notmuch style (with @bonfacem and @zachs)
 
 => ../issues/systems/octopus
 
-* [ ] Add R/qtl2 and multi-parent support with Karl (DO and Magic populations)
+* [+] Add R/qtl2 and multi-parent support with Karl (DO and Magic populations)
 * [+] Fix slow search on Mariadb? Moving to xapian
 * [.] GeneNetwork paper
 * + [ ] add FAIR statement
@@ -70,7 +104,7 @@ Longer term
 Later
 
 * [ ] Mempang25 1U01HG013760
- + [ ] Invites
+ + [X] Invites
  + [ ] Payments
  + [ ] Rooms
  + [ ] Catering
@@ -86,11 +120,7 @@ Later
 ### Set up stable server instance with new hardware
 
 * [ ] ssh-shell access for git markdown
-* [ ] R/qtl2 with Karl and Alex
-* [+] Set up opensmtpd as a service
- + [ ] Add package dependency
- + [X] Test on open port 25
- + [ ] Add public-inbox (Arun)
+* [+] R/qtl2 with Karl and Alex, see [alex.gmi]
 
 => ./machine-room.gmi machine room
 
@@ -118,3 +148,12 @@ Later
 * [X] Fix mariadb index search - need to upgrade mariadb to convert final utf8mb4, see
 => ../issues/slow-sql-query-for-xapian-indexing.gmi
 * [X] Debian/free software issues incl. vcflib work in Zig and release
+* [X] Set up opensmtpd as a service
+
+# Notes
+
+The tasks here should probably be broken out into appropriately tagged issues, where they have not - they can be found and filtered out with tissue (formerly gnbug).
+
+=> https://issues.genenetwork.org
+
+Generally work applies to NIH/R073237482 and other grants.
diff --git a/tasks/programmer-team/meetings.gmi b/tasks/programmer-team/meetings.gmi
new file mode 100644
index 0000000..d972b3b
--- /dev/null
+++ b/tasks/programmer-team/meetings.gmi
@@ -0,0 +1,82 @@
+# Weekly meetings
+
+In this document we will track tasks based of our weekly meetings. This list sets the agenda
+on progress for the next week's meeting.
+
+## 02-10-2024
+## @felixm
+* [ ] Use Aider to contribute and cover to Fred's coding. Share useful prompts.
+* [ ] Feed relevant papers to GPT and find similar summary for other datasets. Start with C-Elegans.
+
+
+## @bonfacem
+* [ ] Share values with PJ.
+* [ ] Assume LMDB files are transient. When hash doesn't exist, generate the hash for that dataset. Use LMDB to store key value pairs of hashes.
+* [ ] Add dump script to gn-guile.
+* [ ] Add Case Attributes in Virtuoso.
+
+## @alex
+* [ ] Push R/QTL2 to production
+* [ ] Have R/QTL2 work for ITP
+
+Nice to have:
+* Think about editing publish data and consequent updates to LMDB.
+
+## @pjotr
+* Kickstart UTHSC VPN access for Felix and Alex.
+
+## 01-20-2024
+### @bonfacem
+
+* [ ] Report: OpenAI on Aider - use AI for programming - discuss with @alexm
+
+=> https://issues.genenetwork.org/topics/ai/aider
+
+* [-] Metadata: Provide list of case attributes for BXD to @flisso
+* [-] Code UI: GeneRIF and GenWiki should work from the mapping page - encourage people to use
+ - anyone logged in can edit
+ - If RIF does not exist point to GeneWiki
+ - If GeneWiki does not exist provide edit page
+* [ ] Code export: Exporting traits to lmdb PublishData - @alexm helps with SQL
+ - missing data should not be an X
+ - run lmdb design (first code) by @pjotrp
+ - start exporting traits for Johannes (he will need to write a python reader)
+* Later: Improve the work/dev container for @alexm
+
+### @flisso
+
+* [ ] Write: Uploader protocol. NOTES: Finished with C-elegans. Yet to test with other datasets.
+* [ ] Script: Run Reaper
+* [ ] Data: Case attributes - with @bonfacem
+* [ ] Write: Create protocol to upload case attributes
+
+### @alexm
+
+* [ ] Code: Rqtl2 match Rqtl1: match scan changes. Notes: PR out and added tests.
+* [ ] Bug: Fix pair scan. NOTES: Fixed it. But can't test it now since CD is down.
+* Later: AI changes
+
+### @Pjotr
+
+* [ ] Code: Work on precompute with GEMMA (w. Jameson)
+* [ ] Code: Take Bonface's trait files when they become available
+
+
+## 01-27-2024
+
+Last week's error with CD and production downtime:
+* [ level 1] Container: Error messages when data not loaded in Virtuoso, Indexing.
+* [ level 2] Sheepdog: Check services --- sheepdog. Health checkpoints.
+* [ level 3] User feedback. Escalate errors correctly to the users, so they can report to coders
+
+### @bonfacem
+* [ ] Troubleshoot CD.
+* [ ] Export files in lmdb. Yohannes read file in Python example
+* [ ] Metadata: Provide list of case attributes for BXD to @flisso
+* [ ] Aider: See if it can generate some guile and python. Give an example.
+
+### @alexm
+* [ ] UI for R/Qtl2.
+
+### @flisso
+* [ ] Look at Fred Python code for the uploader and report on this.
diff --git a/tasks/roadmap.gmi b/tasks/roadmap.gmi
new file mode 100644
index 0000000..9bed63d
--- /dev/null
+++ b/tasks/roadmap.gmi
@@ -0,0 +1,65 @@
+# GN Road map
+
+GN is a web service for complex traits. The main version is currently deployed in Memphis TN, mostly targetting mouse and rat.
+Here we define a road map to bring GN to more communities by providing federated services.
+The aim is to have plant.genenetwork.org, nematode.genenetwork.org, big.genenetwork.org running in the coming years.
+
+# Getting an instance up (step 1)
+
+## Deploy a new instance
+
+To test things we can use an existing database or a new one. We can deploy that as a (new) Guix service container.
+
+We'll need to run a few services including:
+
+* GN3
+* GN2
+* Auth (if required)
+* Uploader (if required)
+
+## Get database ready
+
+In the first step we have to upload data for the target community. This can be done by updating the databases with some example datasets. Care has to be taken that search etc. works and that we can do the mapping.
+
+* Add traits
+* Add genotype files
+* Add metadata
+
+# Branding and hosting (Step 2)
+
+Once we have a working database with a number of example use cases we can start rebranding the service and, ideally, host it on location.
+
+# Synchronization (Step 3)
+
+## Move traits into lmdb
+
+This is WIP. We need to adapt the GN3 code to work with lmdb when available.
+
+## Move genotypes into lmdb
+
+This is WIP. We need to adapt the GN3 code to work with lmdb when available.
+
+# Federated metadata (Step 4)
+
+## Move all metadata into RDF
+
+This is WIP and happening. We will need to document.
+
+# LLM Integration (Step 5)
+
+Provide an LLM that integrates well with the gn eco-system. Goals for the LLM:
+
+* Flexible data ingestion
+* Plug and play LLMS (local, OpenAI, Claude etc.)
+
+This is still a WIP.
+
+# Community (Step 6)
+
+## Uploading data examples
+
+## GN3 examples
+
+## UI examples
+
+## Provide programming examples
diff --git a/tasks/zachs.gmi b/tasks/zachs.gmi
new file mode 100644
index 0000000..6ae3df1
--- /dev/null
+++ b/tasks/zachs.gmi
@@ -0,0 +1,7 @@
+# Tasks for Zach
+
+# Tasks
+
+* [ ] Move non-ephemeral data out of redis into sqlite DB - see JSON dump
+* - [ ] Collections
+* - [ ] permanent URIs(?)