path: root/gn3/api/correlation.py
Age | Commit message | Author
2023-04-06 | Remove deprecated `gn3.db_utils.database_connector` function | Frederick Muriuki Muriithi
Remove the deprecated function and fix a myriad of bugs that arise from removing the function.
Issue: https://issues.genenetwork.org/issues/bugfix_coupling_current_app_and_db_utils
2022-07-28 | Add command to run the sample correlations in an external process | Frederick Muriuki Muriithi
2022-05-24 | Run partial correlations with external script | Frederick Muriuki Muriithi
Use new external script to run the partial correlations for both cases, i.e.
- against an entire dataset, or
- against selected traits
2022-05-24 | New script to compute partial correlations | Frederick Muriuki Muriithi
* Add a new script to compute the partial correlations against:
  - a select list of traits, or
  - an entire dataset
  depending on the specified subcommand. This new script is meant to supersede the `scripts/partial_correlations.py` script.
* Fix the check for errors
* Reorganise the order of arguments for the `partial_correlations_with_target_traits` function: move the `method` argument before the `target_trait_names` argument so that the common arguments in the partial correlation computation functions share the same order.
2022-05-21 | Fix linting errors | Frederick Muriuki Muriithi
2022-05-16 | Run computation in one-shot asynchronous process | Frederick Muriuki Muriithi
After reworking the worker/runner to have a one-shot mode, add a function that queues up the task and then runs the worker in the one-shot mode to process the computation in the background.
2022-05-06 | Fix linting and typing errors | Frederick Muriuki Muriithi
2022-05-06 | Hook up pcorrs with target traits computations | Frederick Muriuki Muriithi
Enable the endpoint to actually compute partial correlations with selected target traits rather than against an entire dataset. Fix some issues caused by a recent refactor that broke pcorrs against a dataset.
2022-03-30 | Revert "Run json.loads on request.get_json, since request.get_json was just returning a string" | Frederick Muriuki Muriithi
This reverts commit b93b22386056347d8002dd2e403425beeb4657cd. The appropriate fix should have been in GN2. The original statement `args = request.get_json()` was correct, since `request.get_json()` should return a python object parsed from the JSON string in the request. Unfortunately, GN2 was encoding the request data two times, which led to the call returning a JSON-encoded string instead of the expected object. The issue has been fixed in GN2 and therefore, the "fix" here can be reverted.
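The double-encoding problem described in this revert can be reproduced with the standard `json` module alone. The following is a minimal sketch, not the GN2 or GN3 code: the `payload` contents are hypothetical, and `json.loads` stands in for the parsing that `request.get_json()` performs on the request body.

```python
import json

# Hypothetical request data; the real GN2 payload is not shown in this log.
payload = {"primary_trait": "trait-1", "method": "pearson"}

encoded_once = json.dumps(payload)        # what the client should send
encoded_twice = json.dumps(encoded_once)  # the double encoding described above

# Parsing a singly-encoded body yields the expected python object.
parsed_once = json.loads(encoded_once)

# Parsing a doubly-encoded body yields a str that still contains JSON,
# which is why an extra json.loads appeared to "fix" the endpoint.
parsed_twice = json.loads(encoded_twice)

print(type(parsed_once).__name__, type(parsed_twice).__name__)
```

Under these assumptions the second parse returns a string rather than a dict, so the cleaner repair is to encode only once on the sending side, which is what the commit message says was done in GN2.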
2022-03-28 | Run json.loads on request.get_json, since request.get_json was just returning a string | zsloan
2022-03-11 | Fix some linting issues | Frederick Muriuki Muriithi
2022-03-08 | Fix tests, and issues caught by tests | Frederick Muriuki Muriithi
* Fix some issues caught by tests due to changes introducing the hand-off of the partial correlations computations to an external process
* Fix some issues due to the changes that introduce context managers for database connections
* Update some tests to take the above two changes into consideration
2022-03-08 | Create database connections within context managers | Frederick Muriuki Muriithi
Use the `with` context manager to open database connections, so as to ensure that those connections are closed once the call is completed. This hopefully avoids the 'too many connections' error.
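The pattern this commit describes can be sketched as follows. This is an illustration under stated assumptions, not the GN3 implementation: `open_connection` is a hypothetical helper, and the standard-library `sqlite3` module stands in for the MySQL driver so that the example runs without a database server.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def open_connection(uri):
    """Hypothetical helper: yield a connection and always close it on exit."""
    conn = sqlite3.connect(uri)  # sqlite3 is a stand-in for the real driver
    try:
        yield conn
    finally:
        # Runs on normal exit and on exceptions, so connections do not
        # accumulate on the server.
        conn.close()


with open_connection(":memory:") as conn:
    conn.execute("CREATE TABLE trait (name TEXT)")
    conn.execute("INSERT INTO trait VALUES ('t1')")
    rows = conn.execute("SELECT name FROM trait").fetchall()

print(rows)
```

Because the `finally` clause closes the connection even when the body raises, each request releases its connection promptly instead of waiting for garbage collection.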
2022-03-03 | Run partial correlations in an external process | Frederick Muriuki Muriithi
Run the partial correlations code in an external python process, decoupling it from the server and making it asynchronous.
Summary of changes:
* gn3/api/correlation.py:
  - Remove response processing code
  - Queue partial corrs processing
  - Create new endpoint to get results
* gn3/commands.py:
  - Compose the pcorrs command to be run in an external process
  - Enable running of subprocess commands with list args
* gn3/responses/__init__.py: new module indicator file
* gn3/responses/pcorrs_responses.py: Hold response processing code extracted from ~gn3.api.correlations.py~ file
* scripts/partial_correlations.py: CLI script to process the pcorrs
* sheepdog/worker.py:
  - Add the *genenetwork3* path at the beginning of the ~sys.path~ list to override any GN3 in the site-packages
  - Add any environment variables to be set for the command to be run
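The general idea of handing a computation to an external python process, with the command given as a list of arguments, might look like the sketch below. It uses only the standard `subprocess` module; the inline `-c` script is a hypothetical placeholder for `scripts/partial_correlations.py`, whose actual arguments are not shown in this log, and the queueing done by the worker is not modelled.

```python
import subprocess
import sys

# Command as a list of arguments: no shell is involved, so no quoting or
# escaping of individual arguments is needed.
# The inline script is a placeholder for the real CLI script.
command = [sys.executable, "-c", "print('pcorrs done')"]

# Popen returns immediately, so the caller is not blocked while the
# computation runs in the child process.
process = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)

# ... the server could respond to the client here ...

# Collect the output once the child has finished.
output, _ = process.communicate()
print(process.returncode, output.strip())
```

A separate endpoint for fetching results, as listed in the summary above, fits this design because the request that starts the process returns before the output exists.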
2022-02-21 | Fix a myriad of linter issues | Frederick Muriuki Muriithi
* Use `with` in place of plain `open`
* Use f-strings in place of `str.format()`
* Remove string interpolation from queries - provide data as query parameters
* other minor fixes
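The third item above, passing data as query parameters instead of interpolating it into the SQL string, can be illustrated with a short sketch. It assumes the standard-library `sqlite3` module as a stand-in for the project's MySQL driver (which typically uses `%s` placeholders rather than `?`); the table and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trait (name TEXT)")
conn.executemany("INSERT INTO trait VALUES (?)", [("t1",), ("t2",)])

# Input that would change the meaning of an interpolated query.
hostile_name = "t1' OR '1'='1"

# Parameterised query: the driver treats the value purely as data.
hostile_rows = conn.execute(
    "SELECT name FROM trait WHERE name = ?", (hostile_name,)
).fetchall()

# A normal lookup works the same way.
normal_rows = conn.execute(
    "SELECT name FROM trait WHERE name = ?", ("t1",)
).fetchall()

conn.close()
print(hostile_rows, normal_rows)
```

With string interpolation the hostile value would have matched every row; as a parameter it matches none, since no trait has that literal name.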
2022-02-18 | Test partial correlations endpoint with non-existent primary traits | Frederick Muriuki Muriithi
Test that the partial correlations endpoint responds with an appropriate "not-found" message and the corresponding 404 status code in the case where a request is made and the primary trait requested for does not exist in the database.
Summary of the changes in each file:
* gn3/api/correlation.py: generalise the building of the response
* gn3/computations/partial_correlations.py: return with a "not-found" if the primary trait does not exist in the database
* gn3/db/partial_correlations.py: Fix a number of bugs that led to exceptions in the case that the primary trait did not exist
* pytest.ini: register a `slow` pytest marker
* tests/integration/test_partial_correlations.py: Add a new test to check for an appropriate 404 response in case of a primary trait that does not exist in the database.
2022-02-17 | Test partial correlations endpoint with missing data in POST request | Frederick Muriuki Muriithi
Add a test for the partial correlations endpoint, with:
- no data in the request
- missing items in the data
Fix the bugs caught by the test.
2022-01-10 | Convert NaN to None | Frederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
Comment: https://github.com/genenetwork/genenetwork3/pull/67#issuecomment-1000828159
* Convert NaN values to None to avoid possible bugs with the string replace method used before.
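Converting NaN values before encoding, rather than replacing the text `NaN` in the encoded string, could be sketched as below. The helper `nan_to_none` and the sample `results` are hypothetical and are not taken from the GN3 code; the point is only that a value-level conversion cannot touch strings that merely contain the letters "NaN".

```python
import json
import math


def nan_to_none(value):
    """Hypothetical helper: recursively replace float NaN values with None."""
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, dict):
        return {key: nan_to_none(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [nan_to_none(item) for item in value]
    return value


# Made-up results, including a name that contains the text "NaN".
results = {"name": "NaNtrait", "corr": float("nan"), "p_values": [0.05, float("nan")]}

# json.dumps would otherwise emit the bare token NaN, which is not valid JSON.
encoded = json.dumps(nan_to_none(results))
print(encoded)
```

A textual replace of `NaN` with `null` on the encoded string would also have rewritten the `name` field here, which is the kind of bug the commit message alludes to.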
2021-12-24 | Replace `NaN` with `null` in JSON string | Frederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* `NaN` is not a valid JSON value, and leads to errors in the code. This commit replaces all `NaN` values with `null`.
2021-12-24 | Encode the data to JSON and set the status code | Frederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Encode bytes objects to string
* Encode NaN values to "null"
* gn3/api/correlation.py:
2021-12-24 | Add API endpoint for partial correlations | Frederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Add an API endpoint for the partial correlation.
* gn3/api/correlation.py:
2021-12-17 | Add API endpoint for partial correlations | Frederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Add an API endpoint for the partial correlation.
2021-08-18 | Fix obvious linting errors | Muriithi Frederick Muriuki
* Fix linting errors that do not change the function of the code.
2021-08-11 | use normal function for correlation (#34) | Alexander Kabui
* use normal function for correlation + rename functions
* update test for sample correlation
* use normal function for tissue correlation + rename functions
2021-04-16 | benchmark normal function for sample r | Alexander Kabui
2021-04-15 | optimization for sample correlation | Alexander Kabui
2021-04-12 | fix merge conflict | Alexander Kabui
2021-04-12 | Integrate correlation API | Alexander Kabui
- add new api for gn2-gn3 sample r integration
- delete map for sample list to values
- add db util file
- add python msql-client dependency
- add db for fetching lit correlation results
- add unittests for db utils
- add tests for db_utils
- modify api for fetching lit correlation results
- refactor Mock Database Connector and unittests
- add sql url parser
- add SQL URI env variable
- refactor code for db utils
- modify return data for lit correlation
- refactor tissue correlation endpoint
- replace db_instance with conn
2021-04-06 | fix Docstrings | Alexander Kabui
2021-03-16 | delete unwanted correlation stuff (#5) | Alexander Kabui
* delete unwanted correlation stuff
* Refactor/clean up correlations (#4)
* initial commit for Refactor/clean-up-correlation
* add python scipy dependency
* initial commit for sample correlation
* initial commit for sample correlation endpoint
* initial commit for integration and unittest
* initial commit for registering correlation blueprint
* add and modify unittest and integration tests for correlation
* Add compute_all_sample_corr method for correlation
* add scipy to requirement txt file
* add tissue correlation for trait list
* add unittest for tissue correlation
* add lit correlation for trait list
* add unittests for lit correlation for trait list
* modify lit correlation for trait list
* add unittests for lit correlation for trait list
* add correlation method in dynamic url
* add file format for expected structure input while doing sample correlation
* modify input data structure -> add trait id
* update tests for sample r correlation
* add compute all lit correlation method
* add endpoint for computing lit_corr
* add unit and integration tests for computing lit corr
* add /api/correlation/tissue_corr/{corr_method} endpoint for tissue correlation
* add unittest and integration tests for tissue correlation
Co-authored-by: BonfaceKilz <bonfacemunyoki@gmail.com>
* update guix scm file
* fix pylint error for correlations api
Co-authored-by: BonfaceKilz <bonfacemunyoki@gmail.com>
2021-03-16 | Refactor/clean up correlations (#4) | Alexander Kabui
* initial commit for Refactor/clean-up-correlation
* add python scipy dependency
* initial commit for sample correlation
* initial commit for sample correlation endpoint
* initial commit for integration and unittest
* initial commit for registering correlation blueprint
* add and modify unittest and integration tests for correlation
* Add compute_all_sample_corr method for correlation
* add scipy to requirement txt file
* add tissue correlation for trait list
* add unittest for tissue correlation
* add lit correlation for trait list
* add unittests for lit correlation for trait list
* modify lit correlation for trait list
* add unittests for lit correlation for trait list
* add correlation method in dynamic url
* add file format for expected structure input while doing sample correlation
* modify input data structure -> add trait id
* update tests for sample r correlation
* add compute all lit correlation method
* add endpoint for computing lit_corr
* add unit and integration tests for computing lit corr
* add /api/correlation/tissue_corr/{corr_method} endpoint for tissue correlation
* add unittest and integration tests for tissue correlation
Co-authored-by: BonfaceKilz <bonfacemunyoki@gmail.com>
2021-03-15 | Apply pep-8 formatting | BonfaceKilz
2021-03-13 | Correlation api (#2) | Alexander Kabui
* add file for correlation api * register initial correlation api * add correlation package * add function for getting page data * delete loading page api * modify code for correlation * add tests folder for correlations * fix error in correlation api * add tests for correlation * add tests for correlation loading data * add module for correlation computations * modify api to return json when computing correlation * add tests for computing correlation * modify code for loading correlation data * modify tests for correlation computation * test loading correlation data using api endpoint * add tests for asserting error in creating Correlation object * add do correlation method * add dummy tests for do_correlation method * delete unused modules * add tests for creating trait and dataset * add intergration test for correlation api * add tests for correlation api * edit docorrelation method * modify integration tests for correlation api * modify tests for show_corr_results * add create dataset function * pep8 formatting and fix return value for api * add more test data for doing correlation * modify tests for correlation * pep8 formatting * add getting formatted corr type method * import json library add process samples method for correlation * fix issue with sample_vals key_error * create utility module for correlation * refactor endpoint for /corr_compute * add test and mocks for compute_correlation function * add compute correlation function and pep8 formatting * move get genofile samplelist to utility module * refactor code for CorrelationResults object * pep8 formatting for module * remove CorrelationResults from Api * add base package initialize data_set module with create_dataset,redis and Dataset_Getter * set dataset_structure if redis is empty * add callable for DatsetType * add set_dataset_key method If name is not in the object's dataset dictionary * add Dataset object and MrnaAssayDataSet * add db_tools * add mysql client * add DatasetGroup object * add 
species module * get mapping method * import helper functions and new dataset * add connection to db before request * add helper functions * add logger module * add get_group_samplelists module * add logger for debug * add code for adding sample_data * pep8 formatting * Add chunks module * add correlation helper module * add get_sample_r_and_p_values method add get_header_fields function * add generate corr json method * add function to retrieve_trait_info * remove comments and clean up code in show_corr_results * remove comments and clean up code for data_set module * pep8 formatting for helper_functions module * pep8 formatting for trait module * add module for species * add Temp Dataset Object * add Phenotype Dataset * add Genotype Dataset * add rettrieve sample_sample_data method * add webqtlUtil module * add do lit correlation for all traits * add webqtlCaseData:Settings not ported * return the_trait for create trait method * add correlation_test json data * add tests fore show corr results * add dictfier package * add tests for show_corr_results * add assertion for trait_id * refactor code for show_corr_results * add test file for compute_corr intergration tests * add scipy dependency * refactor show_corr_results object add do lit correlation for trait_list * add hmac module * add bunch module:Dictionary using object notation * add correlation functions * add rpy2 dependency * add hmac module * add MrnaAssayTissueData object and get_symbol_values_pairs function * add config module * add get json_results method * pep8 formatting remove comments * add config file * add db package * refactor correlatio compuatation module * add do tissue correlation for trait list * add do lit correlation for all traits * add do tissue correlation for all traits * add do_bicor for bicor method * raise error for when initital start vars is None * add support for both form and json data when for correlation input * remove print statement and pep8 formatting * add default settings 
file * add tools module for locate_ignore_error * refactor code remove comments for trait module * Add new test data for computing correlation * pep8 formatting and use pickle * refactor function for filtering form/json data * remove unused imports * remove mock functions in correlation_utility module * refactor tests for compute correlation and pep8 formatting * add tests for show_correlation results * modify tests for show_corr_results * add json files for tests * pep8 formatting for show_corr_results * Todo:Lint base files * pylint for intergration tests * add test module for test_corr_helpers * Add test chunk module * lint utility package * refactoring and pep8 formatting * implement simple metric for correlation * add hmac utility file * add correlation prefix * fix merge conflict * minor fixes for endpoints * import:python-scipy,python-sqlalchemy from guix * add python mysqlclient * remove pkg-resources from requirements * add python-rpy3 from guix * refactor code for species module * pep8 formatting and refactor code * add tests for genereating correlation results * lint correlation functions * fix failing tests for show_corr_results * add new correlation test data fix errors * fix issues related to getting group samplelists * refactor intergration tests for correlation * add todo for refactoring_wanted_inputs * replace custom Attribute setter with SimpleNamespace * comparison of sample r correlation results btwn genenenetwork2 and genenetwork3 * delete AttributeSetter * test request for /api/correlation/compute_correlation took 18.55710196495056 Seconds * refactor tests and show_correlation results * remove unneccessary comments and print statements * edit requirement txt file * api/correlation took 114.29814600944519 Seconds for correlation resullts:20000 - corr-type:lit - corr-method:pearson corr-dataset:corr_dataset:HC_M2_0606_P * capture SQL_URI and GENENETWORK FILES path * pep8 formatting edit && remove print statements * delete filter_input function 
update test and data for correlation * add docstring for required correlation_input * /api/correlation took 12.905632972717285 Seconds * pearson * lit *dataset:HX_M2_0606_P trait_id :1444666 p_range:(lower->-0.60,uppper->0.74) corr_return_results: 100 * update integration and unittest for correlation * add simple markdown docs for correlation * update docs * add tests and catch for invalid correlation_input * minor fix for api * Remove jupyter from deps * guix.scm: Remove duplicate entry * guix.scm: Add extra action items as comments * Trim requirements.txt file Co-authored-by: BonfaceKilz <me@bonfacemunyoki.com>