Age | Commit message | Author |
|
|
|
Use new external script to run the partial correlations for both cases,
i.e.
- against an entire dataset, or
- against selected traits
|
|
* Add a new script to compute the partial correlations against:
- a select list of traits, or
- an entire dataset
depending on the specified subcommand. This new script is meant to supersede
the `scripts/partial_correlations.py` script.
* Fix the check for errors
* Reorganise the order of arguments for the
`partial_correlations_with_target_traits` function: move the `method`
argument before the `target_trait_names` argument so that the common
arguments in the partial correlation computation functions share the same
order.
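
A minimal sketch of how a single script can expose both modes as subcommands,
using `argparse`. The subcommand names (`against-traits`, `against-db`) and the
option names are hypothetical and are not taken from the actual script; only
the idea of declaring the shared arguments in the same order on both
subcommands comes from the message above.

```python
import argparse


def build_parser():
    """Build a CLI parser with one subcommand per computation mode."""
    parser = argparse.ArgumentParser(description="Partial correlations (sketch)")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Hypothetical subcommand: correlate against a select list of traits.
    against_traits = subparsers.add_parser("against-traits")
    against_traits.add_argument("--target-traits", nargs="+", required=True)

    # Hypothetical subcommand: correlate against an entire dataset.
    against_db = subparsers.add_parser("against-db")
    against_db.add_argument("--target-database", required=True)

    # Arguments common to both modes are added the same way to each subcommand.
    for sub in (against_traits, against_db):
        sub.add_argument("--method", default="pearson")
    return parser


parser = build_parser()
traits_args = parser.parse_args(["against-traits", "--target-traits", "t1", "t2"])
db_args = parser.parse_args(
    ["against-db", "--target-database", "db", "--method", "spearman"])
```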
|
|
|
|
After reworking the worker/runner to have a one-shot mode, add a function that
queues up the task and then runs the worker in the one-shot mode to process
the computation in the background.
|
|
|
|
Enable the endpoint to actually compute partial correlations with selected
target traits rather than against an entire dataset.
Fix some issues caused by the recent refactor that broke partial correlations
against a dataset.
|
|
returning a string"
This reverts commit b93b22386056347d8002dd2e403425beeb4657cd.
The appropriate fix should have been in GN2. The original statement
args = request.get_json()
was correct, since `request.get_json()` should return a python object parsed
from the JSON string in the request. Unfortunately, GN2 was encoding the
request data two times, which led to the call returning a JSON-encoded string
instead of the expected object.
The issue has been fixed in GN2 and therefore, the "fix" here can be reverted.
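
The effect of encoding the request data twice can be reproduced with the
standard `json` module alone. This is only an illustration of the symptom
described above, not the GN2 or GN3 code; the payload is made up.

```python
import json

# Hypothetical request payload, for illustration only.
payload = {"primary_trait": "example", "method": "pearson"}

encoded_once = json.dumps(payload)         # what the client should send
encoded_twice = json.dumps(encoded_once)   # what a double encoding produces

# Decoding a singly-encoded body gives back the Python object ...
decoded_ok = json.loads(encoded_once)
# ... but decoding a doubly-encoded body gives back a JSON *string*.
decoded_bad = json.loads(encoded_twice)
```

Here `decoded_ok` is a `dict`, while `decoded_bad` is a `str` that would need a
second `json.loads` call, which is why the fix belongs on the sending side.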
|
|
returning a string
|
|
|
|
Fix some issues caught by tests due to changes introducing the hand-off of the
partial correlations computations to an external process
Fix some issues due to the changes that introduce context managers for
database connections
Update some tests to take the above two changes into consideration
|
|
Use the `with` context manager to open database connections, so as to ensure
that those connections are closed once the call is completed. This hopefully
avoids the 'too many connections' error
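
A sketch of the pattern, with `sqlite3` standing in for the real database so
that the example is self-contained. The `database_connection` helper name and
the table are assumptions made for this illustration.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def database_connection(uri):
    """Yield a connection and always close it when the block exits."""
    conn = sqlite3.connect(uri)
    try:
        yield conn
    finally:
        # Runs on normal exit and on exceptions, so connections do not leak.
        conn.close()


with database_connection(":memory:") as conn:
    conn.execute("CREATE TABLE traits (name TEXT)")
    conn.execute("INSERT INTO traits VALUES (?)", ("trait_a",))
    rows = conn.execute("SELECT name FROM traits").fetchall()
```

Once the `with` block exits, any further use of `conn` raises an error because
the connection has been closed.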
|
|
Run the partial correlations code in an external Python process, decoupling it
from the server and making it asynchronous.
Summary of changes:
* gn3/api/correlation.py:
- Remove response processing code
- Queue partial corrs processing
- Create new endpoint to get results
* gn3/commands.py
- Compose the pcorrs command to be run in an external process
- Enable running of subprocess commands with list args
* gn3/responses/__init__.py: new module indicator file
* gn3/responses/pcorrs_responses.py: Hold response processing code extracted
from the ~gn3/api/correlation.py~ file
* scripts/partial_correlations.py: CLI script to process the pcorrs
* sheepdog/worker.py:
- Add the *genenetwork3* path at the beginning of the ~sys.path~ list to
override any GN3 in the site-packages
- Add any environment variables to be set for the command to be run
|
|
* Use `with` in place of plain `open`
* Use f-strings in place of `str.format()`
* Remove string interpolation from queries - provide data as query parameters
* other minor fixes
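
An illustration of providing data as query parameters instead of interpolating
it into the SQL string. `sqlite3` is used so the sketch runs on its own; its
placeholder is `?`, whereas MySQL drivers typically use `%s`. The table and
values are made up.

```python
import sqlite3
from contextlib import closing

with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute("CREATE TABLE traits (name TEXT, value REAL)")
    conn.execute("INSERT INTO traits VALUES (?, ?)", ("trait_a", 1.5))

    # A value that would change the query's meaning if it were interpolated.
    hostile = "trait_a' OR '1'='1"

    # As a parameter it is treated purely as data, so nothing matches.
    safe_rows = conn.execute(
        "SELECT value FROM traits WHERE name = ?", (hostile,)).fetchall()
    good_rows = conn.execute(
        "SELECT value FROM traits WHERE name = ?", ("trait_a",)).fetchall()
```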
|
|
Test that the partial correlations endpoint responds with an appropriate
"not-found" message and the corresponding 404 status code in the case where a
request is made and the primary trait requested for does not exist in the
database.
Summary of the changes in each file:
* gn3/api/correlation.py: generalise the building of the response
* gn3/computations/partial_correlations.py: return with a "not-found" if the
primary trait does not exist in the database
* gn3/db/partial_correlations.py: Fix a number of bugs that led to exceptions
in the case that the primary trait did not exist
* pytest.ini: register a `slow` pytest marker
* tests/integration/test_partial_correlations.py: Add a new test to check for
an appropriate 404 response in case of a primary trait that does not exist
in the database.
|
|
Add a test for the partial correlations endpoint, with:
- no data in the request
- missing items in the data
Fix the bugs caught by the test
|
|
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
Comment:
https://github.com/genenetwork/genenetwork3/pull/67#issuecomment-1000828159
* Convert NaN values to None to avoid possible bugs with the string replace
method used before.
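
One possible way to do such a conversion, sketched with the standard library
only. The `nan_to_none` helper and the sample results are assumptions made for
this example and are not the project's code.

```python
import json
import math


def nan_to_none(value):
    """Recursively replace float NaN values with None."""
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, dict):
        return {key: nan_to_none(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [nan_to_none(item) for item in value]
    return value


# Made-up results containing NaN values.
results = {"corr": float("nan"), "p_values": [0.05, float("nan")]}

# allow_nan=False makes json.dumps raise if any NaN slipped through.
encoded = json.dumps(nan_to_none(results), allow_nan=False)
```

Converting the values before serialisation means every `None` is emitted as a
valid JSON `null`, with no string replacement on the encoded output.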
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* `NaN` is not a valid JSON value, and leads to errors in the code. This
commit replaces all `NaN` values with `null`.
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Encode bytes objects to string
* Encode NaN values to "null"
* gn3/api/correlation.py:
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Add an API endpoint for the partial correlation.
* gn3/api/correlation.py:
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Add an API endpoint for the partial correlation.
|
|
* Fix linting errors that do not change the function of the code.
|
|
* use normal function for correlation + rename functions
* update test for sample correlation
* use normal function for tissue correlation + rename functions
|
|
|
|
|
|
|
|
- add new api for gn2-gn3 sample r integration
- delete map for sample list to values
- add db util file
- add python mysql-client dependency
- add db for fetching lit correlation results
- add unittests for db utils
- add tests for db_utils
- modify api for fetching lit correlation results
- refactor Mock Database Connector and unittests
- add sql url parser
- add SQL URI env variable
- refactor code for db utils
- modify return data for lit correlation
- refactor tissue correlation endpoint
- replace db_instance with conn
|
|
|
|
* delete unwanted correlation stuff
* Refactor/clean up correlations (#4)
* initial commit for Refactor/clean-up-correlation
* add python scipy dependency
* initial commit for sample correlation
* initial commit for sample correlation endpoint
* initial commit for integration and unittest
* initial commit for registering correlation blueprint
* add and modify unittest and integration tests for correlation
* Add compute_all_sample_corr method for correlation
* add scipy to requirement txt file
* add tissue correlation for trait list
* add unittest for tissue correlation
* add lit correlation for trait list
* add unittests for lit correlation for trait list
* modify lit correlation for trait list
* add unittests for lit correlation for trait list
* add correlation method in dynamic url
* add file format for expected structure input while doing sample correlation
* modify input data structure -> add trait id
* update tests for sample r correlation
* add compute all lit correlation method
* add endpoint for computing lit_corr
* add unit and integration tests for computing lit corr
* add /api/correlation/tissue_corr/{corr_method} endpoint for tissue correlation
* add unittest and integration tests for tissue correlation
Co-authored-by: BonfaceKilz <bonfacemunyoki@gmail.com>
* update guix scm file
* fix pylint error for correlations api
Co-authored-by: BonfaceKilz <bonfacemunyoki@gmail.com>
|
|
* initial commit for Refactor/clean-up-correlation
* add python scipy dependency
* initial commit for sample correlation
* initial commit for sample correlation endpoint
* initial commit for integration and unittest
* initial commit for registering correlation blueprint
* add and modify unittest and integration tests for correlation
* Add compute_all_sample_corr method for correlation
* add scipy to requirement txt file
* add tissue correlation for trait list
* add unittest for tissue correlation
* add lit correlation for trait list
* add unittests for lit correlation for trait list
* modify lit correlation for trait list
* add unittests for lit correlation for trait list
* add correlation method in dynamic url
* add file format for expected structure input while doing sample correlation
* modify input data structure -> add trait id
* update tests for sample r correlation
* add compute all lit correlation method
* add endpoint for computing lit_corr
* add unit and integration tests for computing lit corr
* add /api/correlation/tissue_corr/{corr_method} endpoint for tissue correlation
* add unittest and integration tests for tissue correlation
Co-authored-by: BonfaceKilz <bonfacemunyoki@gmail.com>
|
|
|
|
* add file for correlation api
* register initial correlation api
* add correlation package
* add function for getting page data
* delete loading page api
* modify code for correlation
* add tests folder for correlations
* fix error in correlation api
* add tests for correlation
* add tests for correlation loading data
* add module for correlation computations
* modify api to return json when computing correlation
* add tests for computing correlation
* modify code for loading correlation data
* modify tests for correlation computation
* test loading correlation data using api endpoint
* add tests for asserting error in creating Correlation object
* add do correlation method
* add dummy tests for do_correlation method
* delete unused modules
* add tests for creating trait and dataset
* add integration test for correlation api
* add tests for correlation api
* edit docorrelation method
* modify integration tests for correlation api
* modify tests for show_corr_results
* add create dataset function
* pep8 formatting and fix return value for api
* add more test data for doing correlation
* modify tests for correlation
* pep8 formatting
* add getting formatted corr type method
* import json library
add process samples method for correlation
* fix issue with sample_vals key_error
* create utility module for correlation
* refactor endpoint for /corr_compute
* add test and mocks for compute_correlation function
* add compute correlation function and pep8 formatting
* move get genofile samplelist to utility module
* refactor code for CorrelationResults object
* pep8 formatting for module
* remove CorrelationResults from Api
* add base package
initialize data_set module with create_dataset,redis and Dataset_Getter
* set dataset_structure if redis is empty
* add callable for DatsetType
* add set_dataset_key method If name is not in the object's dataset dictionary
* add Dataset object and MrnaAssayDataSet
* add db_tools
* add mysql client
* add DatasetGroup object
* add species module
* get mapping method
* import helper functions and new dataset
* add connection to db before request
* add helper functions
* add logger module
* add get_group_samplelists module
* add logger for debug
* add code for adding sample_data
* pep8 formatting
* Add chunks module
* add correlation helper module
* add get_sample_r_and_p_values method
add get_header_fields function
* add generate corr json method
* add function to retrieve_trait_info
* remove comments and clean up code in show_corr_results
* remove comments and clean up code for data_set module
* pep8 formatting for helper_functions module
* pep8 formatting for trait module
* add module for species
* add Temp Dataset Object
* add Phenotype Dataset
* add Genotype Dataset
* add retrieve sample_sample_data method
* add webqtlUtil module
* add do lit correlation for all traits
* add webqtlCaseData:Settings not ported
* return the_trait for create trait method
* add correlation_test json data
* add tests for show corr results
* add dictfier package
* add tests for show_corr_results
* add assertion for trait_id
* refactor code for show_corr_results
* add test file for compute_corr integration tests
* add scipy dependency
* refactor show_corr_results object
add do lit correlation for trait_list
* add hmac module
* add bunch module:Dictionary using object notation
* add correlation functions
* add rpy2 dependency
* add hmac module
* add MrnaAssayTissueData object and get_symbol_values_pairs function
* add config module
* add get json_results method
* pep8 formatting remove comments
* add config file
* add db package
* refactor correlation computation module
* add do tissue correlation for trait list
* add do lit correlation for all traits
* add do tissue correlation for all traits
* add do_bicor for bicor method
* raise error when initial start vars is None
* add support for both form and json data for correlation input
* remove print statement and pep8 formatting
* add default settings file
* add tools module for locate_ignore_error
* refactor code remove comments for trait module
* Add new test data for computing correlation
* pep8 formatting and use pickle
* refactor function for filtering form/json data
* remove unused imports
* remove mock functions in correlation_utility module
* refactor tests for compute correlation and pep8 formatting
* add tests for show_correlation results
* modify tests for show_corr_results
* add json files for tests
* pep8 formatting for show_corr_results
* Todo:Lint base files
* pylint for integration tests
* add test module for test_corr_helpers
* Add test chunk module
* lint utility package
* refactoring and pep8 formatting
* implement simple metric for correlation
* add hmac utility file
* add correlation prefix
* fix merge conflict
* minor fixes for endpoints
* import:python-scipy,python-sqlalchemy from guix
* add python mysqlclient
* remove pkg-resources from requirements
* add python-rpy3 from guix
* refactor code for species module
* pep8 formatting and refactor code
* add tests for generating correlation results
* lint correlation functions
* fix failing tests for show_corr_results
* add new correlation test data fix errors
* fix issues related to getting group samplelists
* refactor integration tests for correlation
* add todo for refactoring_wanted_inputs
* replace custom Attribute setter with SimpleNamespace
* comparison of sample r correlation results between genenetwork2 and genenetwork3
* delete AttributeSetter
* test request for /api/correlation/compute_correlation took 18.55710196495056 Seconds
* refactor tests and show_correlation results
* remove unnecessary comments and print statements
* edit requirement txt file
* api/correlation took 114.29814600944519 Seconds for correlation results: 20000
- corr-type:lit
- corr-method:pearson
corr-dataset:corr_dataset:HC_M2_0606_P
* capture SQL_URI and GENENETWORK FILES path
* pep8 formatting edit && remove print statements
* delete filter_input function
update test and data for correlation
* add docstring for required correlation_input
* /api/correlation took 12.905632972717285 Seconds
* pearson
* lit
* dataset: HX_M2_0606_P
trait_id: 1444666
p_range: (lower -> -0.60, upper -> 0.74)
corr_return_results: 100
* update integration and unittest for correlation
* add simple markdown docs for correlation
* update docs
* add tests and catch for invalid correlation_input
* minor fix for api
* Remove jupyter from deps
* guix.scm: Remove duplicate entry
* guix.scm: Add extra action items as comments
* Trim requirements.txt file
Co-authored-by: BonfaceKilz <me@bonfacemunyoki.com>
|