Diffstat (limited to 'gnqa/README.md')
-rw-r--r-- gnqa/README.md | 40
1 file changed, 35 insertions(+), 5 deletions(-)
diff --git a/gnqa/README.md b/gnqa/README.md
index 4114964..577c7b0 100644
--- a/gnqa/README.md
+++ b/gnqa/README.md
@@ -1,7 +1,37 @@
-# GeneNetwork Question Answer sytem
+# GeneNetwork Question Answer (GNQA) Study Evaluation
-## paper 1 evaluation
-FahamuAI is used as the RAG engine
-## paper 2 evaluation
-R2R is being used as the RAG engine
+This directory contains the code created to evaluate questions submitted to GNQA.
+Unlike the evaluation in paper 1, this work uses different LLMs and a different RAG engine.
+RAGAS is still used to evaluate the queries.
+
+The RAG engine being used is [R2R](https://github.com/SciPhi-AI/R2R). It is open source and performs comparably to the engine we used for our first GNQA paper.
+
+The evaluation workflow is organized around reading questions grouped along two sets of categories, e.g. category 1: who asked the question; category 2: the field to which the question belongs.
+In our initial work, category 1 consists of citizen scientists and domain experts,
+while category 2 consists of three fields or specializations: GeneNetwork.org systems genetics, the genetics of diabetes, and the genetics of aging.
+
+We will make the code more configurable by pulling the categories out of the source code and keeping them strictly in settings files.
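As a minimal sketch of that direction, the categories could be read from a settings document at startup. The settings text, key names, and helper below are all assumptions for illustration, not the repository's actual configuration.

```python
import json

# Hypothetical settings document; in practice this would live in a
# settings file rather than the source code. Key names are assumptions.
SETTINGS = """
{
  "category_1": ["citizen_scientist", "domain_expert"],
  "category_2": ["systems_genetics", "diabetes", "aging"]
}
"""

def load_categories(text):
    """Parse both category lists from a settings document instead of hard-coding them."""
    settings = json.loads(text)
    return settings["category_1"], settings["category_2"]

cat1, cat2 = load_categories(SETTINGS)
print(cat1, cat2)
```

Keeping the categories in data rather than code means adding a new field or audience type requires no source change.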
+
+It is best to define a structure for your different types of data: sets, lists, responses, and scores.
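One way to pin down such a structure is with small dataclasses for each stage of the pipeline. The field names below are illustrative assumptions and do not mirror the repository's actual JSON schemas.

```python
from dataclasses import dataclass, field

@dataclass
class Question:
    asker: str   # category 1: e.g. citizen scientist or domain expert
    domain: str  # category 2: e.g. systems genetics, diabetes, aging
    text: str

@dataclass
class Response:
    question: Question
    answer: str
    contexts: list[str] = field(default_factory=list)  # retrieved passages

@dataclass
class ScoredResponse:
    response: Response
    scores: dict[str, float] = field(default_factory=dict)  # metric name -> value

q = Question(asker="domain_expert", domain="aging", text="Which genes affect lifespan?")
r = Response(question=q, answer="...")
print(r.question.domain)
```

Explicit containers like these make it harder to mix up, say, a raw response file with a dataset built from it.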
+
+## Tasks
+
+1. Create list(s) of questions (not automated)
+1. Run question list through RAG (automated)
+1. Save responses (automated)
+1. Create datasets from responses (automated)
+1. Run datasets through evaluator to get scores (not automated)
+1. Create plots of scores (not automated)
+
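The automated steps (tasks 2-4) could be chained by a small driver that assembles the commands in order. This is a sketch only: the file-name pattern follows the command table in the next section, and `catA`/`catB` are the placeholder category names used there, not real files.

```python
def pipeline_commands(cat_a: str, cat_b: str) -> list[list[str]]:
    """Return the shell commands for the automated tasks (2-4), in order.

    File-name patterns are taken from the command table; adjust them to
    your actual list and response file names.
    """
    resp = f"responses/resp_{cat_a}_{cat_b}.json"
    return [
        ["python", "run_questions.py", f"lists/{cat_a}_question_list.json", resp],
        ["python", "parse_r2r_result.py", resp,
         f"datasets/intermediate_files/{cat_a}_{cat_b}_.json"],
        ["python", "create_dataset.py", f"lists/list_{cat_a}_{cat_b}.json",
         f"datasets/{cat_a}_{cat_b}.json"],
    ]

# Print the commands; each could instead be executed with
# subprocess.run(cmd, check=True) to stop on the first failure.
for cmd in pipeline_commands("catA", "catB"):
    print(" ".join(cmd))
```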
+## Covering the tasks
+
+*ID refers to the task number from the previous section*
+
+| ID | File Operator | From directory | To directory | Command |
+|:--|:---:|---:|---:|:--|
+| 2 | run_questions | lists | responses | `python run_questions.py lists/catA_question_list.json responses/resp_catA_catB.json` |
+| 3 | parse_r2r_result | responses | datasets | `python parse_r2r_result.py responses/resp_catA_catB.json datasets/intermediate_files/catA_catB_.json` |
+| 4 | create_dataset | lists | datasets | `python create_dataset.py lists/list_catA_catB.json datasets/catA_catB.json` |
+| 5 | ragas_eval | datasets | scores | `python3 ragas_eval.py datasets/catA/catB_1.json scores/catA/catB_1.json 3` |
+
+The final argument to `ragas_eval.py` (here `3`) is the number of times to run the evaluation.