Diffstat (limited to 'gnqa/paper2_eval/README.md')
-rw-r--r-- | gnqa/paper2_eval/README.md | 43 |
1 files changed, 28 insertions, 15 deletions
diff --git a/gnqa/paper2_eval/README.md b/gnqa/paper2_eval/README.md
index 8dff6f6..3ce1a6e 100644
--- a/gnqa/paper2_eval/README.md
+++ b/gnqa/paper2_eval/README.md
@@ -15,19 +15,32 @@ We will have make the code more configurable by pulling the categories out of th
 It is best to define a structure for your different types of data: sets, lists, responses, and scores.
 
-| File Operator | From directory | To directory | command |
-|:---:|---:|---:|:--|
-| create_dataset | list | dataset | python create_dataset.py \
-| | | | ../data/lists/list_catA_catB.json \ |
-| | | | ../data/dataset/catA_catB.json |
-| run_questions | list | responses |
-| | | | ../data/list/catA_question_list.json \ |
-| | | | ../data/responses/resp_catA_catB.json |
-| parse_r2r_result | responses | dataset | |
-| | | | ../data/responses/resp_catA_catB.json \ |
-| | | | ../data/dataset/intermediate_files/catA_catB_.json |
-| ragas_eval | dataset | scores | python3 ragas_eval.py \ |
-| | | | ../data/datasets/catA/catB_1.json \ |
-| | | | ../data/scores/catA/catB_1.json \ |
-| | | | 3 # run evaluation 3 times |
+## Tasks
+
+1. Create list(s) of questions (not automated)
+1. Run question list through RAG (automated)
+1. Save responses (automated)
+1. Create datasets from responses (automated)
+1. Run datasets through evaluator to get scores (not automated)
+1. Create plots of scores (not automated)
+
+## Covering the tasks
+
+*ID refers to the task number from the previous section*
+
+| ID | File Operator | From directory | To directory | command |
+|:--|:---:|---:|---:|:--|
+| 2 | run_questions | list | responses | python run_questions.py \ |
+| | | | | ../data/list/catA_question_list.json \ |
+| | | | | ../data/responses/resp_catA_catB.json |
+| 3 | parse_r2r_result | responses | dataset | |
+| | | | | ../data/responses/resp_catA_catB.json \ |
+| | | | | ../data/dataset/intermediate_files/catA_catB_.json |
+| 4 | create_dataset | list | dataset | python create_dataset.py \
+| | | | | ../data/lists/list_catA_catB.json \ |
+| | | | | ../data/dataset/catA_catB.json |
+| 5 | ragas_eval | dataset | scores | python3 ragas_eval.py \ |
+| | | | | ../data/datasets/catA/catB_1.json \ |
+| | | | | ../data/scores/catA/catB_1.json \ |
+| | | | | 3 # run evaluation 3 times |
\ No newline at end of file
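
The table added in this diff spreads each command across several rows, so it may help to see tasks 2 to 5 assembled into single command lines. The following is a minimal dry-run sketch, not part of the repository: it only prints the commands and does not execute them. Script names, arguments, and placeholder paths (`catA`, `catB`) are copied from the table exactly as written, including its inconsistent directory names (`list` vs `lists`, `dataset` vs `datasets`). `PipelineStep`, `STEPS`, and `build_commands` are hypothetical names introduced for illustration. The table lists no command for `parse_r2r_result`, so that step is emitted as a comment rather than guessed.

```python
import shlex
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PipelineStep:
    """One row group of the README table (hypothetical helper, not in the repo)."""
    task_id: int
    operator: str
    command: Optional[List[str]]  # None when the table shows no command
    args: List[str]


# Values copied from the table in the diff; placeholder paths are left as written.
STEPS = [
    PipelineStep(2, "run_questions", ["python", "run_questions.py"],
                 ["../data/list/catA_question_list.json",
                  "../data/responses/resp_catA_catB.json"]),
    # The table gives arguments for this step but no command.
    PipelineStep(3, "parse_r2r_result", None,
                 ["../data/responses/resp_catA_catB.json",
                  "../data/dataset/intermediate_files/catA_catB_.json"]),
    PipelineStep(4, "create_dataset", ["python", "create_dataset.py"],
                 ["../data/lists/list_catA_catB.json",
                  "../data/dataset/catA_catB.json"]),
    # The trailing "3" is the number of evaluation runs, per the table comment.
    PipelineStep(5, "ragas_eval", ["python3", "ragas_eval.py"],
                 ["../data/datasets/catA/catB_1.json",
                  "../data/scores/catA/catB_1.json",
                  "3"]),
]


def build_commands(steps: List[PipelineStep]) -> List[str]:
    """Return one printable line per step; steps without a command become comments."""
    lines = []
    for step in steps:
        if step.command is None:
            lines.append(
                f"# task {step.task_id} ({step.operator}): "
                f"no command listed; arguments: {' '.join(step.args)}"
            )
        else:
            lines.append(shlex.join(step.command + step.args))
    return lines


if __name__ == "__main__":
    for line in build_commands(STEPS):
        print(line)
```

Keeping the steps as data rather than hard-coded strings means the category placeholders could later be substituted in one place, which is in line with the README's stated aim of making the code more configurable.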