From 48826bd04c4539619dc46f414fe565574e62f5d8 Mon Sep 17 00:00:00 2001 From: Alexander_Kabui Date: Tue, 15 Oct 2024 10:39:29 +0300 Subject: Add docs for rqtl refactoring design. --- .../lmms/rqtl2/gn-rqtl-design-implementation.gmi | 90 ++++++++++++++++++++++ 1 file changed, 90 insertions(+) create mode 100644 topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi diff --git a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi new file mode 100644 index 0000000..76c52dd --- /dev/null +++ b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi @@ -0,0 +1,90 @@ +# RQTL Implementation for GeneNetwork Design Proposal + +## Tags + +* Assigned: alexm, +* Keywords: RQTL, GeneNetwork2, Design +* Type: Enhancements, +* Status: In Progress + + + +## Description + +This document outlines the design proposal for the re-implementation of the RQTL feature in GeneNetwork providing also a console view to track the external process. + +### Problem Definition +The current RQTL implementation requires enhancements. The core functionality remains the same: making API calls from GeneNetwork 3 (GN3). The system needs a cleaner architecture for future use, emphasizing improved error handling and a clear separation of concerns between GeneNetwork 2 (GN2) and GN3. This will involve eliminating file transfers between GN2 and GN3. Additionally, a console should be provided to users for tracking the progress of ongoing tasks. + +## High-Level Design +This is divided into two major components: +* Refactor the RQTL API +* Implement a console view to track the external process + +### Refactor the RQTL API +This component involves rewriting the current API to replace file transfers between GN2 and GN3 with JSON data. We will also implement tests and enhance error handling. 
+ +**Subtasks:** + +- [ ] Rewrite the RQTL API endpoints +- [ ] Add unit tests for this module +- [ ] Implement better error handling for the API + +### Implement a Console View to Track the External Process +This component involves creating a monitoring system to track the state of the external process and output relevant information to the user. + +## Deep Dive + +### Running the External Script +The RQTL implementation is in R, and we need a strategy for executing this script as an external process. This can be subdivided into several key steps: + +- **Task Queue Integration**: + - We will utilize a task queue system (currently implemented in GN3) to manage script execution. + +- **Job Submission**: + - Each API call will create a new job in the task queue, which will handle the execution of the R script. + +- **Script Execution**: + - This stage involves executing the R script in a controlled environment, ensuring all necessary dependencies are loaded. + +- **Monitoring and Logging**: + - The system will have monitoring tools to track the status of each job. Users will receive real-time updates on job progress and logs for the current task. + +- **Result Retrieval**: + - Once the R script completes (either successfully or with an error), results will be returned to the API call. + +- **Error Handling**: + - Better error handling will be implemented to manage potential issues during script execution. This includes capturing errors from the R script and providing meaningful feedback to users through the application. + +### Additional Error Handling Considerations +This will involve: +* API error handling +* Error handling within the R script + +## Additional UI Considerations +We need to rethink where to output the external process logs in the UI. Currently, we can add flags to the URL to enable this functionality, e.g., `URL/page&flags&console=1`. 
+ +### Current Design Suggestions: +#### With HTMX, offer a split screen +This will include an output page and a monitoring system page. + +#### Popup button for preview +A button that allows users to preview and hide the console output. + +## Long-Term Goals +We aim to run computations on clusters rather than locally. This project will serve as a pioneer for that approach. + +## Related Issues +=> https://issues.genenetwork.org/topics/lmms/rqtl2/using-rqtl2 + +### Tasks +- [ ] Rewrite the RQTL API endpoints +- [ ] Add unit tests for this module +- [ ] Implement state-of-the-art error handling +- [ ] Make improvements to the current R script if possible +- [ ] Task queue integration (refer to the Deep Dive section) +- [ ] Implement a monitoring and logging system for job execution (refer to the deep dive section) +- [ ] Fetch results from running jobs and display them to the user +- [ ] Implement a console preview UI for user feedback +- [ ] Refactor the GN2 UI +- [ ] Run this computation on clusters \ No newline at end of file -- cgit 1.4.1 From b89795b37d845ef4a15c24c53784b4cf51e6a565 Mon Sep 17 00:00:00 2001 From: Alexander_Kabui Date: Tue, 15 Oct 2024 10:52:40 +0300 Subject: Update docs on using terminal emulator: --- topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi | 3 +++ 1 file changed, 3 insertions(+) diff --git a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi index 76c52dd..ce9c0dc 100644 --- a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi +++ b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi @@ -63,6 +63,9 @@ This will involve: ## Additional UI Considerations We need to rethink where to output the external process logs in the UI. Currently, we can add flags to the URL to enable this functionality, e.g., `URL/page&flags&console=1`. 
+Also the design suggestion is to out the results in a terminal emulator for +example xterm +See more: https://xtermjs.org/ ### Current Design Suggestions: #### With HTMX, offer a split screen -- cgit 1.4.1 From e48a34504d4a1e2f8ca5b4070f3b441aff9cfcf6 Mon Sep 17 00:00:00 2001 From: Alexander_Kabui Date: Tue, 15 Oct 2024 11:04:58 +0300 Subject: Add link to Terminal emulator implementation in gn2. --- topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi index ce9c0dc..966b44c 100644 --- a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi +++ b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi @@ -35,6 +35,7 @@ This component involves creating a monitoring system to track the state of the e ## Deep Dive + ### Running the External Script The RQTL implementation is in R, and we need a strategy for executing this script as an external process. This can be subdivided into several key steps: @@ -64,8 +65,9 @@ This will involve: ## Additional UI Considerations We need to rethink where to output the external process logs in the UI. Currently, we can add flags to the URL to enable this functionality, e.g., `URL/page&flags&console=1`. Also the design suggestion is to out the results in a terminal emulator for -example xterm -See more: https://xtermjs.org/ +example xterm ,See more: https://xtermjs.org/, A current implementation already exists +for gn3 see +=> https://github.com/genenetwork/genenetwork2/blob/abe324888fc3942d4b3469ec8d1ce2c7dcbd8a93/gn2/wqflask/templates/wgcna_setup.html#L89 ### Current Design Suggestions: #### With HTMX, offer a split screen @@ -74,6 +76,8 @@ This will include an output page and a monitoring system page. #### Popup button for preview A button that allows users to preview and hide the console output. 
+ + ## Long-Term Goals We aim to run computations on clusters rather than locally. This project will serve as a pioneer for that approach. -- cgit 1.4.1 From 6488b2ff492e66bd023f0050d5556795f86bc588 Mon Sep 17 00:00:00 2001 From: Alexander_Kabui Date: Tue, 15 Oct 2024 11:12:19 +0300 Subject: Minor fixes for the high level design. --- topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi index 966b44c..74ae6b9 100644 --- a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi +++ b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi @@ -18,11 +18,14 @@ The current RQTL implementation requires enhancements. The core functionality re ## High-Level Design This is divided into two major components: -* Refactor the RQTL API -* Implement a console view to track the external process +* RQTL Api + +* Monitoring system for the rqtl script + +### RQTL Api +This component serves as the entry point to the gn3 rqtl system and involves rewriting the current API to replace file transfers between GN2 and GN3 with JSON data. +We will also implement tests and enhance error handling. -### Refactor the RQTL API -This component involves rewriting the current API to replace file transfers between GN2 and GN3 with JSON data. We will also implement tests and enhance error handling. **Subtasks:** @@ -30,9 +33,12 @@ This component involves rewriting the current API to replace file transfers betw - [ ] Add unit tests for this module - [ ] Implement better error handling for the API -### Implement a Console View to Track the External Process +### Monitoring system for the rqtl script + This component involves creating a monitoring system to track the state of the external process and output relevant information to the user. 
+ + ## Deep Dive -- cgit 1.4.1 From af065cc2968379404a8976a80ff08f7bd0216a49 Mon Sep 17 00:00:00 2001 From: Alexander_Kabui Date: Tue, 15 Oct 2024 11:17:02 +0300 Subject: Add new task for rqtl issue: --- topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi index 74ae6b9..11f5f15 100644 --- a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi +++ b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi @@ -23,13 +23,15 @@ This is divided into two major components: * Monitoring system for the rqtl script ### RQTL Api -This component serves as the entry point to the gn3 rqtl system and involves rewriting the current API to replace file transfers between GN2 and GN3 with JSON data. +This component serves as the entry point to the gn3 rqtl system and involves +rewriting the current API to replace file transfers between GN2 and GN3 with JSON data We will also implement tests and enhance error handling. **Subtasks:** - [ ] Rewrite the RQTL API endpoints +- [ ] add validation for data submitted by user - [ ] Add unit tests for this module - [ ] Implement better error handling for the API @@ -92,7 +94,8 @@ We aim to run computations on clusters rather than locally. This project will se ### Tasks - [ ] Rewrite the RQTL API endpoints -- [ ] Add unit tests for this module +- [ ] Minor: validation for data from the client +- [ ] Add unit tests for the rqtl api module - [ ] Implement state-of-the-art error handling - [ ] Make improvements to the current R script if possible - [ ] Task queue integration (refer to the Deep Dive section) -- cgit 1.4.1 From 23876d632e51c2cce9836e727d0ecc476985e694 Mon Sep 17 00:00:00 2001 From: Alexander_Kabui Date: Tue, 15 Oct 2024 11:49:17 +0300 Subject: Update docs. 
--- .../lmms/rqtl2/gn-rqtl-design-implementation.gmi | 30 +++++++++++++++------- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi index 11f5f15..8a32255 100644 --- a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi +++ b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi @@ -23,15 +23,16 @@ This is divided into two major components: * Monitoring system for the rqtl script ### RQTL Api -This component serves as the entry point to the gn3 rqtl system and involves -rewriting the current API to replace file transfers between GN2 and GN3 with JSON data -We will also implement tests and enhance error handling. +This component serves as the entry point to the gn3 rqtl system and can be subdivided into +data validation and preprocessing stage, data computation stage, output processing stage. +In the component the task mainly involves rewriting the current API to replace file transfers between GN2 and GN3 with JSON data We will also implement tests and enhance error handling. **Subtasks:** - [ ] Rewrite the RQTL API endpoints -- [ ] add validation for data submitted by user +- [ ] add validation and preprocessing for data submitted from the +- [ ] Processing output from the external script - [ ] Add unit tests for this module - [ ] Implement better error handling for the API @@ -93,14 +94,25 @@ We aim to run computations on clusters rather than locally. 
This project will se => https://issues.genenetwork.org/topics/lmms/rqtl2/using-rqtl2 ### Tasks -- [ ] Rewrite the RQTL API endpoints -- [ ] Minor: validation for data from the client -- [ ] Add unit tests for the rqtl api module +* stage 1 * + +- [ ] Implement the RQTL API endpoints +- [ ] validation and preprocessing for data from the client - [ ] Implement state-of-the-art error handling +- [ ] Add unit tests for the rqtl api module - [ ] Make improvements to the current R script if possible + +* stage 2 * + - [ ] Task queue integration (refer to the Deep Dive section) -- [ ] Implement a monitoring and logging system for job execution (refer to the deep dive section) -- [ ] Fetch results from running jobs and display them to the user +- [ ] Implement a monitoring and logging system for job execution (refer to the deep dive section +- [ ] Fetch results from running jobs +- [ ] Processing output from the external script + +* stage 3 * - [ ] Implement a console preview UI for user feedback - [ ] Refactor the GN2 UI + +* stage 4 * + - [ ] Run this computation on clusters \ No newline at end of file -- cgit 1.4.1 From 2637f2051c28c6e04a1789cd0dde0d707c0e8423 Mon Sep 17 00:00:00 2001 From: Alexander_Kabui Date: Thu, 17 Oct 2024 12:42:13 +0300 Subject: split up the problem definition to challenges and features we want to introduce. --- .../lmms/rqtl2/gn-rqtl-design-implementation.gmi | 25 +++++++++++++++++++++- 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi index 8a32255..f853509 100644 --- a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi +++ b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi @@ -14,7 +14,30 @@ This document outlines the design proposal for the re-implementation of the RQTL feature in GeneNetwork providing also a console view to track the external process. 
### Problem Definition -The current RQTL implementation requires enhancements. The core functionality remains the same: making API calls from GeneNetwork 3 (GN3). The system needs a cleaner architecture for future use, emphasizing improved error handling and a clear separation of concerns between GeneNetwork 2 (GN2) and GN3. This will involve eliminating file transfers between GN2 and GN3. Additionally, a console should be provided to users for tracking the progress of ongoing tasks. + +The current RQTL implementation faces the following challenges: + +- Lack of adequate error handling for the API and scripts. + +- Insufficient separation of concerns between GN2 and GN3. + +- The system requires a cleaner architecture for future use. + +- lack way for user to track the progress of the r-qtl script being executed + +We will address these challenges and add enhancements by: + + +- Rewriting the R script using r-qtl2 instead of r-qtl. + +- Improving the overall design and architecture of the system. + +- Establishing clear separation of concerns between GN2 and GN3, eliminating file path transfers between the two. + +- Implementing better error handling for both the API and the RQTL script. + +- Piping stdout from the script to the browser through a console for real-time monitoring. + ## High-Level Design This is divided into two major components: -- cgit 1.4.1 From 57a22e4e3f646dc95ab82df748c7526bd51676fd Mon Sep 17 00:00:00 2001 From: Alexander_Kabui Date: Thu, 17 Oct 2024 13:18:06 +0300 Subject: Improve on the documentation for the rqtl api design stage. 
--- .../lmms/rqtl2/gn-rqtl-design-implementation.gmi | 38 ++++++++++++++++------ 1 file changed, 28 insertions(+), 10 deletions(-) diff --git a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi index f853509..a9ea520 100644 --- a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi +++ b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi @@ -1,4 +1,4 @@ -# RQTL Implementation for GeneNetwork Design Proposal +># RQTL Implementation for GeneNetwork Design Proposal ## Tags @@ -21,23 +21,26 @@ The current RQTL implementation faces the following challenges: - Insufficient separation of concerns between GN2 and GN3. -- The system requires a cleaner architecture for future use. - - lack way for user to track the progress of the r-qtl script being executed -We will address these challenges and add enhancements by: +- There is lack of a clear way in which the r-qtl script is executed +We will address these challenges and add enhancements by: - Rewriting the R script using r-qtl2 instead of r-qtl. -- Improving the overall design and architecture of the system. - - Establishing clear separation of concerns between GN2 and GN3, eliminating file path transfers between the two. - Implementing better error handling for both the API and the RQTL script. +- run the script as a job in a task queue + - Piping stdout from the script to the browser through a console for real-time monitoring. +- Improving the overall design and architecture of the system. + +- The system requires a cleaner architecture for future use. + ## High-Level Design This is divided into two major components: @@ -45,10 +48,25 @@ This is divided into two major components: * Monitoring system for the rqtl script -### RQTL Api -This component serves as the entry point to the gn3 rqtl system and can be subdivided into -data validation and preprocessing stage, data computation stage, output processing stage. 
-In the component the task mainly involves rewriting the current API to replace file transfers between GN2 and GN3 with JSON data We will also implement tests and enhance error handling. +### RQTL Api + + +This component will serve as the entry point for running RQTL in GN3. At this stage, we need to improve the overall architecture and error handling. This process will be divided into the following steps: + +- Data Validation +In this step, we must validate that all required data to run RQTL is provided in the JSON format. This includes the mapping method, genotype file, phenotype file, etc. Please refer to the r-qtl2 documentation for an overview on the requirements : +=> https://rqtl.org/ + +- Data Preprocessing +During this stage, we will transform the data into a format that R can understand. This includes converting boolean values to the appropriate representations, preparing the RQTL command with all required values, and adding defaults where necessary. + +- Data Computation +In this stage, we will pass the RQTL script command to the task queue to run as a job. + +- Output Data Processing +In this step, we need to retrieve the results outputted from the script in a specified format, such as JSON or CSV. This may include outputs like RQTL pair scans and generated diagrams. Please refer to the documentation for an overview: +=> https://rqtl.org/ + **Subtasks:** -- cgit 1.4.1 From 534c9613c36524f77c4a834ccc61548c7771549e Mon Sep 17 00:00:00 2001 From: Alexander_Kabui Date: Thu, 17 Oct 2024 13:26:53 +0300 Subject: Refactor subtask to match rqtl api design steps and add status for each to determine progress. 
--- topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi index a9ea520..5a88cdc 100644 --- a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi +++ b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi @@ -71,11 +71,13 @@ In this step, we need to retrieve the results outputted from the script in a spe **Subtasks:** -- [ ] Rewrite the RQTL API endpoints -- [ ] add validation and preprocessing for data submitted from the -- [ ] Processing output from the external script -- [ ] Add unit tests for this module -- [ ] Implement better error handling for the API +- [ ] add the rqtl api endpoint (10%) +- [ ] Input Data validation (15%) +- [ ] Input data processing (20%) +- [ ] Passing data to r-script for the computation (40%) +- [ ] output data processing (80%) + -[ ] add unittests for this module (100%) + ### Monitoring system for the rqtl script -- cgit 1.4.1 From 7cd1496f192ac712b0461504b95340f85005cd73 Mon Sep 17 00:00:00 2001 From: Alexander_Kabui Date: Thu, 17 Oct 2024 13:52:07 +0300 Subject: Add link for the task queue implementation for gn3. --- topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi index 5a88cdc..3df19cc 100644 --- a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi +++ b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi @@ -82,7 +82,8 @@ In this step, we need to retrieve the results outputted from the script in a spe ### Monitoring system for the rqtl script This component involves creating a monitoring system to track the state of the external process and output relevant information to the user. 
- +We need a way to determine the status for the current job for example +QUEUED, STARTED, INPROGRESS, COMPLETED ## Deep Dive @@ -91,8 +92,13 @@ This component involves creating a monitoring system to track the state of the e ### Running the External Script The RQTL implementation is in R, and we need a strategy for executing this script as an external process. This can be subdivided into several key steps: -- **Task Queue Integration**: - - We will utilize a task queue system (currently implemented in GN3) to manage script execution. +- **Task Queue Integration**: + + - We will utilize a task queue system , + We already have an implementation in gn3 + to manage script execution + +- https://github.com/genenetwork/genenetwork3/blob/0820295202c2fe747c05b93ce0f1c5a604442f69/gn3/commands.py#L101 - **Job Submission**: - Each API call will create a new job in the task queue, which will handle the execution of the R script. -- cgit 1.4.1 From e436c1893e0a2da8e647342002da4d3da6e2b6e4 Mon Sep 17 00:00:00 2001 From: Alexander_Kabui Date: Thu, 17 Oct 2024 14:39:57 +0300 Subject: Add more notes for the monitoring and logging step. --- topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi index 3df19cc..2920247 100644 --- a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi +++ b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi @@ -107,7 +107,23 @@ The RQTL implementation is in R, and we need a strategy for executing this scrip - This stage involves executing the R script in a controlled environment, ensuring all necessary dependencies are loaded. - **Monitoring and Logging**: - - The system will have monitoring tools to track the status of each job. Users will receive real-time updates on job progress and logs for the current task. 
+ +- The system will include monitoring tools to track the status of each job. Users will receive real-time updates on job progress and logs for the current task. + +In this stage, we can have different states for the current job, such as QUEUED, IN PROGRESS, and COMPLETED. + +We need to output to the user which stage of computation we are currently on during the script +execution. + +- During the QUEUED state, the standard output (stdout) should display the command being executed along with all its arguments. + +- During the STARTED stage, the stdout should notify the user that execution has begun. + +- In the IN PROGRESS stage, we need to fetch logs from the script being executed at each computation step. Please refer to this documentation for an overview of the different computation: +=> https://rqtl.org/ + +- During the DONE step, the system should output the results from the R/qtl script to the user. + - **Result Retrieval**: - Once the R script completes (either successfully or with an error), results will be returned to the API call. -- cgit 1.4.1 From 8a0d2f780987ae483f6b5640440f44045c64ff23 Mon Sep 17 00:00:00 2001 From: Alexander_Kabui Date: Thu, 17 Oct 2024 15:01:17 +0300 Subject: Add new step for writing rqtl script in rqtl-2. 
--- topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi index 2920247..ca30aef 100644 --- a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi +++ b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi @@ -43,11 +43,23 @@ We will address these challenges and add enhancements by: ## High-Level Design -This is divided into two major components: -* RQTL Api +This is divided into three major components: +* GN3 RQTL-2 Script implementation +* RQTL Api * Monitoring system for the rqtl script + +### GN3 RQTL-2 Script implementation +We currently have an rqtl script written in rqtl https://github.com/genenetwork/genenetwork3/blob/main/scripts/rqtl_wrapper.R +There is a newer rqtl implementation (rqtl-2) which is +a ) is a reimplementation of the QTL analysis software R/qtl, to better handle high-dimensional data and complex cross designs. +To see the difference between the two: +=> oWe aim to implement a seperate script using this while maintaining the one +eimplemented using rqtl1. +(TODO) This probably needs to be split to a new issue(with enough knowledge) , to capture +each computation step . + ### RQTL Api -- cgit 1.4.1 From b5ebfd7878d8353101f4bc2cf75ad2a828814e31 Mon Sep 17 00:00:00 2001 From: Alexander_Kabui Date: Fri, 18 Oct 2024 12:00:36 +0300 Subject: Make improvements to the rqtl design documentation. 
--- .../lmms/rqtl2/gn-rqtl-design-implementation.gmi | 43 +++++++++++++--------- 1 file changed, 25 insertions(+), 18 deletions(-) diff --git a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi index ca30aef..d39ba0a 100644 --- a/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi +++ b/topics/lmms/rqtl2/gn-rqtl-design-implementation.gmi @@ -11,7 +11,7 @@ ## Description -This document outlines the design proposal for the re-implementation of the RQTL feature in GeneNetwork providing also a console view to track the external process. +This document outlines the design proposal for the re-implementation of the RQTL feature in GeneNetwork providing also a console view to track the stdout from the external process. ### Problem Definition @@ -39,7 +39,6 @@ We will address these challenges and add enhancements by: - Improving the overall design and architecture of the system. -- The system requires a cleaner architecture for future use. ## High-Level Design @@ -53,12 +52,13 @@ This is divided into three major components: ### GN3 RQTL-2 Script implementation We currently have an rqtl script written in rqtl https://github.com/genenetwork/genenetwork3/blob/main/scripts/rqtl_wrapper.R There is a newer rqtl implementation (rqtl-2) which is -a ) is a reimplementation of the QTL analysis software R/qtl, to better handle high-dimensional data and complex cross designs. -To see the difference between the two: -=> oWe aim to implement a seperate script using this while maintaining the one -eimplemented using rqtl1. +a reimplementation of the QTL analysis software R/qtl, to better handle high-dimensional data and complex cross designs. +To see the difference between the two see documentation: +=> https://kbroman.org/qtl2/assets/vignettes/rqtl_diff.html +We aim to implement a seperate script using this while maintaining the one +implemented using rqtl1 (rqtl) . 
(TODO) This probably needs to be split to a new issue(with enough knowledge) , to capture -each computation step . +each computation step in the r script. ### RQTL Api @@ -76,7 +76,7 @@ During this stage, we will transform the data into a format that R can understan In this stage, we will pass the RQTL script command to the task queue to run as a job. - Output Data Processing -In this step, we need to retrieve the results outputted from the script in a specified format, such as JSON or CSV. This may include outputs like RQTL pair scans and generated diagrams. Please refer to the documentation for an overview: +In this step, we need to retrieve the results outputted from the script in a specified format, such as JSON or CSV and process the data. This may include outputs like RQTL pair scans and generated diagrams. Please refer to the documentation for an overview: => https://rqtl.org/ @@ -95,7 +95,7 @@ In this step, we need to retrieve the results outputted from the script in a spe This component involves creating a monitoring system to track the state of the external process and output relevant information to the user. We need a way to determine the status for the current job for example -QUEUED, STARTED, INPROGRESS, COMPLETED +QUEUED, STARTED, INPROGRESS, COMPLETED (see deep dive for more on this) ## Deep Dive @@ -127,11 +127,12 @@ In this stage, we can have different states for the current job, such as QUEUED, We need to output to the user which stage of computation we are currently on during the script execution. -- During the QUEUED state, the standard output (stdout) should display the command being executed along with all its arguments. +- During the QUEUED state, the standard output (stdout) should display the command to be executed along with all its arguments. - During the STARTED stage, the stdout should notify the user that execution has begun. -- In the IN PROGRESS stage, we need to fetch logs from the script being executed at each computation step. 
Please refer to this documentation for an overview of the different computation: +- In the IN PROGRESS stage, we need to fetch logs from the script being executed at each computation step. Please refer to this documentation for an overview of the different computations we +shall have : => https://rqtl.org/ - During the DONE step, the system should output the results from the R/qtl script to the user. @@ -149,13 +150,13 @@ This will involve: * Error handling within the R script ## Additional UI Considerations -We need to rethink where to output the external process logs in the UI. Currently, we can add flags to the URL to enable this functionality, e.g., `URL/page&flags&console=1`. -Also the design suggestion is to out the results in a terminal emulator for +We need to rethink where to output the external process stdout in the UI. Currently, we can add flags to the URL to enable this functionality, e.g., `URL/page&flags&console=1`. +Also the design suggestion is to output the results in a terminal emulator for example xterm ,See more: https://xtermjs.org/, A current implementation already exists for gn3 see => https://github.com/genenetwork/genenetwork2/blob/abe324888fc3942d4b3469ec8d1ce2c7dcbd8a93/gn2/wqflask/templates/wgcna_setup.html#L89 -### Current Design Suggestions: +### Design Suggestions: #### With HTMX, offer a split screen This will include an output page and a monitoring system page. @@ -164,6 +165,8 @@ A button that allows users to preview and hide the console output. + + ## Long-Term Goals We aim to run computations on clusters rather than locally. This project will serve as a pioneer for that approach. @@ -171,7 +174,11 @@ We aim to run computations on clusters rather than locally. 
This project will se => https://issues.genenetwork.org/topics/lmms/rqtl2/using-rqtl2 ### Tasks -* stage 1 * + +* stage 1 (20%) * + - [ ] implement the rqtl script using rqtl2 + +* stage 2 (40%) * - [ ] Implement the RQTL API endpoints - [ ] validation and preprocessing for data from the client @@ -179,17 +186,17 @@ We aim to run computations on clusters rather than locally. This project will se - [ ] Add unit tests for the rqtl api module - [ ] Make improvements to the current R script if possible -* stage 2 * +* stage 3 (60%)* - [ ] Task queue integration (refer to the Deep Dive section) - [ ] Implement a monitoring and logging system for job execution (refer to the deep dive section - [ ] Fetch results from running jobs - [ ] Processing output from the external script -* stage 3 * +* stage 4 (80%) * - [ ] Implement a console preview UI for user feedback - [ ] Refactor the GN2 UI -* stage 4 * +* stage 5 (100%) * - [ ] Run this computation on clusters \ No newline at end of file -- cgit 1.4.1
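
Note on the monitoring step: the design document added by this series names the job states QUEUED, STARTED, IN PROGRESS and COMPLETED (also called DONE in one place) and says what the console should show in each. The following is a minimal sketch of that idea in Python, not the GN3 implementation. The names `JobState`, `Job`, `advance` and `log` are hypothetical, the command arguments are placeholders, and the real task queue in gn3/commands.py may track state quite differently.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class JobState(Enum):
    """Job states as named in the design document."""
    QUEUED = "QUEUED"
    STARTED = "STARTED"
    IN_PROGRESS = "IN PROGRESS"
    COMPLETED = "COMPLETED"


# Assumption: a job only ever moves forward, one state at a time.
_NEXT = {
    JobState.QUEUED: JobState.STARTED,
    JobState.STARTED: JobState.IN_PROGRESS,
    JobState.IN_PROGRESS: JobState.COMPLETED,
}


@dataclass
class Job:
    """Hypothetical in-memory record of one script run and its console lines."""
    command: List[str]
    state: JobState = JobState.QUEUED
    console: List[str] = field(default_factory=list)

    def __post_init__(self):
        # QUEUED: display the command to be executed with all its arguments.
        self.console.append("QUEUED: " + " ".join(self.command))

    def advance(self, message=""):
        """Move to the next state and record a console line for it."""
        if self.state not in _NEXT:
            raise ValueError("job is already " + self.state.value)
        self.state = _NEXT[self.state]
        if message:
            self.console.append(self.state.value + ": " + message)
        else:
            self.console.append(self.state.value)

    def log(self, line):
        """Record one line of script output; only valid while IN PROGRESS."""
        if self.state is not JobState.IN_PROGRESS:
            raise ValueError("script output is only expected while IN PROGRESS")
        self.console.append(line)


if __name__ == "__main__":
    # "placeholder-arg" is illustrative, not a real flag of rqtl_wrapper.R.
    job = Job(["Rscript", "rqtl_wrapper.R", "placeholder-arg"])
    job.advance("execution has begun")
    job.advance()
    job.log("log line fetched from the running script")
    job.advance("results are ready")
    print("\n".join(job.console))
```

Keeping the console lines on the job record means the same list could be handed to whichever front end is chosen (split screen or popup preview) without the script itself knowing about the UI. Persisting the record, and streaming it to the browser, are left out of this sketch.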