summary refs log tree commit diff
path: root/tasks/bonfacem.gmi
blob: e0396e44a64a2131910fb134aa18925ad1d1ec40 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
# Tasks for Munyoki

## Tags

* kanban: bonfacem
* assigned: bonfacem

## Tasks

### Note
* Don't lose metadata.   Have an array of disabled snips.
* Store by snip (rows).  Storage by marker.  2 different files.
* gn-auth:
  have wrappers around gn-auth (draw-back: folk may forget).
  use Nginx as a MTM (re-routing calls).  How to add handler in Nginx and to work with tokens.
* GN-auth dashboard fixes.  Follow up with Fred.
* Case-attributes used in co-variates.

### PhD Work

* Concept note/ideas: Add all metadata in GN to an LLM that enriches GnQA.
* Use mapping output as full vectors for gpt/transformers.  Integrate this work into GN.
* Share PhD concept note with PJ for polishing

### This week
* [ ] Generate RDF docs using AI.
* [~] Dump all genotypes from tux02 to LMDB.
*     - PJ sync tux01 genotypes with tux02/04.
*     - Yet to set-up 2FA on new device
* [~] Look at deep-seek/anthropic (also really doc deployment in balg01).  Run in debian machine.
* [~] Set-up wolfshead.
* [~] Adapter to LMDB into a cross object.
*     - Try computations with R/qtl2.
*     - Look at R LMDB libraries.
*     - Look at functions that read the files.
*     - PJ: LMDB adapter in R and cross-type files.
* [~] gn-guile webhook.


### Later
* [ ] Editing genotype metadata
* [+] Correlations hash.
*     - Add dataset count to RDF.
* [ ] Spam + LLMs
*     - RateLimiting for Rif Editing.
*     - Honepot approach.
* [ ] Dockerise GN container.   For Harm.
* [ ] Send emails when job fail.
* [ ] Look at updating gn-auth/gn-libs to PYTHONPATH for gn2/3.
* [ ] Sample/individual/strain/genometype counts for PublishData only - ProbeSetData? https://github.com/genenetwork/genenetwork2/blob/testing/scripts/sample_count.py - mirror in RDF and use global search
*     - search for all traits that have more than X samples
* [ ] Add case attributes to RDF and share with Felix (depends on @felixl)
* [ ] xapian search, add dataset size keys, as well as GN accession id, trait id, and date/year
*     - Improve xapian markdown docs to show all used fields/keys with examples
*     - genewiki search (link in table? check with Rob)
*     - base line with GN1 search - add tests
*     - Fix missing search term for sh* - both menu search and global search
*     - Use GN1 as a benchmark for search results (mechanical Rob?)
*     - Xapian ranges for markers

### Even later

* [ ] Rest API for precompute output (mapping with GEMMA)
* [ ] GNQA add GN metadata (to RAG)
*     - Focus on RIF
*     - triple -> plain text
*     - bob :fatherof nancy -> Bob is the father of Nancy.

## Later

* [ ] AI improvements

### On going tasks

=> https://issues.genenetwork.org/search?query=assigned%3ABonfaceKilz+AND+is%3Aopen+AND+status%3Ain-progress&type=all All in-progress tasks

### Stalled (To Be Done/Completed)

=> https://issues.genenetwork.org/search?query=assigned%3ABonfaceKilz+AND+tag%3Astalled+AND+is%3Aopen&type=open-issueo All stalled taskse that are to be promoted to in-progress

### Unclear Issues

Ad-hoc issues that were picked some where some how:

=> https://issues.genenetwork.org/search?type=open-issue&query=assigned%3ABonfaceKilz%20AND%20NOT%20tag%3Astalled%20AND%20NOT%20tag%3Ain-progress%20AND%20status%3Aunclear%20OR%20priority%3Aunclear Unclear Issues


### Closed Issues

Should something in one of these closed issues be amiss, we can always and should re-open the offending issue.

Currently closed issues are:

=> https://issues.genenetwork.org/search?type=closed-issue&query=assigned%3ABonfaceKilz%20AND%20type%3Aissue%20AND%20is%3Aclosed Closed Issues

* [X] Indexing generif data / Improve Local Search
* [X] lmdb publishdata output and share with Pjotr and Johannes

## Done

* [X] Add lmdb output hashes with index and export LMDB_DATA_DIRECTORY
* [X] Share small database with @pjotrp and @felixl
* [X] With Alex get rqtl2 demo going in CD (for BXD)
* [X] Set up meeting with ILRI
*     - Zasper https://news.ycombinator.com/item?id=42572057 - Alan
* [X] Migrate fahamuai RAG to VPS and switch tokens to GGI OpenAI account
*     1. Running AI server using (our) VPS and our tokens
*        + Pjotr gives API key - OpenAI - model?
*     2. Read the code base - Elixir is plumbing incl. authentication, Python processing text etc.
*     3. Try ingestion and prompt (REST API) - check out postgres tables
*     4. Backup state from production Elixir
*     5. Assess porting it to Guix (don't do any work) - minimum version Elixir
*     6. Get docs from Shelby/Brian
* [X] Set-up grobit on balg01
*     - guix docker/native
*     - recent breaking changes
* [X] GeneRIF
*     - Merge recent changes first.  Ping Rob.
*     - Brainstorm ideas around log-in.
*     - Unlimited tokens that don't expire.
*     - Sync prod with CD -- sqlite.
*     - Add deletion
* [X] Describe Generif/wikidata access for Rob in an email with test account on CD
*     1. Send email to Rob
*     2. Work on production w. Fred
* [X] Distinguish CD from production -- banners/buttons/colors.
* [X] Use aider - give a presentation in the coming weeks
* [X] gn-auth fixes
* [X] Assess Brian's repo for deployment.
* [X] Finish container work
* - View diffs in BXD: Edit case attributes throws an error.
* [X] Check small db from: https://files.genenetwork.org/database/
* [X] Changes to Production + (Alex)
* [X] File issue with syslog
* [X] LMDB database.
*     - Simplify (focus on small files).  Don't over-rely on Numpy.
* [X] Assess adding GeneRIF to LLM.
* [X] Referrer headers -- a way of preventing bots beyond rate-limiting. 
* [X] Python Fahamu.
* [X] Memvid - brief look.

* [X] Encourage FahamuAI to be open.
*     Another paper with his group should be out this month
* [X] Help Alex with SSL certification container error.
*     - Fix SSL issues in local container.
* [X] Send Arun an e-mail on how to go about upgrading shepherd.
* [X] Case Attributes.
*     - Git blame.  Add tests.  Fred.
*     - NOTE: Fixed the diffs.  But there's an edge-case with BXD longevity (I haven't checked.  Shared scripts)
*     - NOTE: Elpy broke.  Eglot/lspemacs doesn't work.
*     - NOTE: Moved away from storing diffs in files to LMDB.
*     - Error when checking the history.  Fixed by fixing the diffs.
*     - Reach out to Zach.  NOTE: Timing differences.
*     - Disable diff in the UI - unnecessary.
* [X] Added LMDB_PATH to dev container.   Updated old commits.
* [X] Merged no-login AI work that Alex did.
* [X] Talk to Fred and hand over case-attributes.
* [X] Distinct admin and dev user. [w/ Fred]
*     - Extra fluff to grant dev user access to everything.
* [X] Merged rate-limiter.
* [X] Look at slow running CD (look at issue tracker and be systematic).
* [X] Fix CD.  Build guix against a recent pinned profile.