README contains howto for using the evaluation code

author: Nyeusi D. Shebes 2025-02-28 10:11:32 -0600
committer: Nyeusi D. Shebes 2025-02-28 10:11:32 -0600
commit: 9c7412ff3920010f294093bcb7df143c00321c29 (patch)
tree: d1f9c022e1c685a762c07185c19875a4d5e14e12 /gnqa/paper2_eval/README.md
parent: 835e229909e9bdb6e084c5112672065886517adb (diff)
download: gn-ai-9c7412ff3920010f294093bcb7df143c00321c29.tar.gz
1 files changed, 28 insertions, 15 deletions
diff --git a/gnqa/paper2_eval/README.md b/gnqa/paper2_eval/README.md
index 8dff6f64..3ce1a6e7 100644
--- a/gnqa/paper2_eval/README.md
+++ b/gnqa/paper2_eval/README.md
@@ -15,19 +15,32 @@ We will have make the code more configurable by pulling the categories out of th
 
 It is best to define a structure for your different types of data: sets, lists, responses, and scores.
 
-| File Operator | From directory | To directory | command |
-|:---:|---:|---:|:--|
-| create_dataset | list | dataset | python create_dataset.py \
-| | | | &nbsp;&nbsp;&nbsp; ../data/lists/list_catA_catB.json \ |
-| | | | &nbsp;&nbsp;&nbsp; ../data/dataset/catA_catB.json |
-| run_questions | list | responses |
-| | | | &nbsp;&nbsp;&nbsp; ../data/list/catA_question_list.json \ |
-| | | | &nbsp;&nbsp;&nbsp; ../data/responses/resp_catA_catB.json |
-| parse_r2r_result | responses | dataset | |
-| | | | &nbsp;&nbsp;&nbsp; ../data/responses/resp_catA_catB.json \ |
-| | | | &nbsp;&nbsp;&nbsp; ../data/dataset/intermediate_files/catA_catB_.json |
-| ragas_eval | dataset | scores | python3 ragas_eval.py \ |
-| | | | &nbsp;&nbsp;&nbsp; ../data/datasets/catA/catB_1.json \ |
-| | | | &nbsp;&nbsp;&nbsp; ../data/scores/catA/catB_1.json \ |
-| | | | &nbsp;&nbsp;&nbsp; 3 # run evaluation 3 times |
+## Tasks
+
+1. Create list(s) of questions (not automated)
+1. Run question list through RAG (automated)
+1. Save responses (automated)
+1. Create datasets from responses (automated)
+1. Run datasets through evaluator to get scores (not automated)
+1. Create plots of scores (not automated)
+
+## Covering the tasks
+
+*ID refers to the task number from the previous section*
+
+| ID | File Operator | From directory | To directory | command |
+|:--|:---:|---:|---:|:--|
+| 2 | run_questions | list | responses | python run_questions.py \ |
+| | | | | &nbsp;&nbsp;&nbsp; ../data/list/catA_question_list.json \ |
+| | | | | &nbsp;&nbsp;&nbsp; ../data/responses/resp_catA_catB.json |
+| 3 | parse_r2r_result | responses | dataset | |
+| | | | | &nbsp;&nbsp;&nbsp; ../data/responses/resp_catA_catB.json \ |
+| | | | | &nbsp;&nbsp;&nbsp; ../data/dataset/intermediate_files/catA_catB_.json |
+| 4 | create_dataset | list | dataset | python create_dataset.py \ |
+| | | | | &nbsp;&nbsp;&nbsp; ../data/lists/list_catA_catB.json \ |
+| | | | | &nbsp;&nbsp;&nbsp; ../data/dataset/catA_catB.json |
+| 5 | ragas_eval | dataset | scores | python3 ragas_eval.py \ |
+| | | | | &nbsp;&nbsp;&nbsp; ../data/datasets/catA/catB_1.json \ |
+| | | | | &nbsp;&nbsp;&nbsp; ../data/scores/catA/catB_1.json \ |
+| | | | | &nbsp;&nbsp;&nbsp; 3 # run evaluation 3 times |
  
 \ No newline at end of file
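
The scripted rows of the new table could be chained from a single driver. The following is a minimal sketch under stated assumptions, not part of this commit: `STEPS`, `run_steps`, and `dry_run` are hypothetical names, the argument lists are copied verbatim from the table (including its `list`/`lists` and `dataset`/`datasets` path spellings), and `parse_r2r_result` (ID 3) is left out because the table does not show its invocation.

```python
import subprocess

# Argument lists copied from the README table in this patch.
# ID 3 (parse_r2r_result) is omitted: the table gives its paths but no command.
STEPS = {
    2: ["python", "run_questions.py",
        "../data/list/catA_question_list.json",
        "../data/responses/resp_catA_catB.json"],
    4: ["python", "create_dataset.py",
        "../data/lists/list_catA_catB.json",
        "../data/dataset/catA_catB.json"],
    5: ["python3", "ragas_eval.py",
        "../data/datasets/catA/catB_1.json",
        "../data/scores/catA/catB_1.json",
        "3"],  # run evaluation 3 times
}


def run_steps(ids, dry_run=True):
    """Run the selected table rows in ascending ID order.

    With dry_run=True the commands are only printed, which is a safe way
    to check paths before spending time on the RAG or the evaluator.
    Returns the argument lists in the order they were handled.
    """
    handled = []
    for step_id in sorted(ids):
        argv = STEPS[step_id]
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True stops the chain at the first failing step.
            subprocess.run(argv, check=True)
        handled.append(argv)
    return handled


if __name__ == "__main__":
    run_steps([2, 4, 5])
```

Calling `run_steps([2, 4, 5], dry_run=False)` from the directory that holds the scripts would execute the rows in order. The dry run is the default so that nothing is launched by accident.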