genenetwork3 - GeneNetwork3 REST API for data science and machine learning

Age	Commit message	Author
2022-05-31	Extract utility functions from `fetch_all_database_data`	Frederick Muriuki Muriithi
	Extract the utility functions to make it easier to understand what the `fetch_all_database_data` function is doing. This helps with maintenance.
2022-05-30	Pass trait data as args to `fix_strains` and fix some bugs	Frederick Muriuki Muriithi
	The `fix_strains` function works on the trait data, not the basic trait info. This commit fixes the arguments passed to the function, and also some bugs in the function.
2022-05-27	Move sql for CRUD operations on case-attrs from gn2 to gn3	BonfaceKilz

2022-05-27	Move sql for modifying case-attributes from gn2 to gn3	BonfaceKilz

2022-05-27	Return all the results from CaseAttributes column as is	BonfaceKilz
	* gn3/db/sample_data.py: Remove "collections" import. Add "Optional" import. (get_case_attributes): Return the results of "fetchall" from the case attributes. * tests/unit/db/test_sample_data.py (test_get_case_attributes): Update failing test.
2022-05-26	Add Endpoint to get menu items for use in UI	Frederick Muriuki Muriithi

2022-05-24	Run partial correlations with external script	Frederick Muriuki Muriithi
	Use new external script to run the partial correlations for both cases, i.e. - against an entire dataset, or - against selected traits
2022-05-24	New script to compute partial correlations	Frederick Muriuki Muriithi
	* Add a new script to compute the partial correlations against: - a select list of traits, or - an entire dataset depending on the specified subcommand. This new script is meant to supersede the `scripts/partial_correlations.py` script. * Fix the check for errors * Reorganise the order of arguments for the `partial_correlations_with_target_traits` function: move the `method` argument before the `target_trait_names` argument so that the common arguments in the partial correlation computation functions share the same order.
2022-05-21	Fix linting errors	Frederick Muriuki Muriithi

2022-05-21	Use multiprocessing to improve performance	Frederick Muriuki Muriithi

2022-05-21	Process primary, target and control traits in a single iteration	Frederick Muriuki Muriithi
	Rework the code to process the traits in a single iteration to improve performance.
2022-05-21	Return generator object rather than tuples	Frederick Muriuki Muriithi
	Return generator objects rather than pre-computed tuples to reduce the number of iterations needed to process the data, and thus improve the performance of the system somewhat.
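The difference between returning a pre-computed tuple and returning a generator can be sketched as follows. The function names and data here are hypothetical and are not taken from the gn3 code; the sketch only illustrates why chaining generators avoids extra passes over the data.

```python
from typing import Iterable, Iterator, Tuple


def scaled_eager(values: Iterable[float], factor: float) -> Tuple[float, ...]:
    """Build the whole result up front: one full pass before the caller starts."""
    return tuple(v * factor for v in values)


def scaled_lazy(values: Iterable[float], factor: float) -> Iterator[float]:
    """Return a generator: each item is computed only when the caller asks for it."""
    return (v * factor for v in values)


# Chaining lazy steps walks the input once and builds no intermediate tuples.
doubled = scaled_lazy([1.0, 2.0, 3.0], 2.0)
total = sum(scaled_lazy(doubled, 0.5))
```

One trade-off to keep in mind: a generator can be consumed only once, so callers that need to iterate the results several times still have to materialise them.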
2022-05-16	Run computation in one-shot asynchronous process	Frederick Muriuki Muriithi
	After reworking the worker/runner to have a one-shot mode, add a function that queues up the task and then runs the worker in the one-shot mode to process the computation in the background.
2022-05-06	Fix linting and typing errors	Frederick Muriuki Muriithi

2022-05-06	Hook up pcorrs with target traits computations	Frederick Muriuki Muriithi
	Enable the endpoint to actually compute partial correlations with selected target traits rather than against an entire dataset. Fix some issues caused by a recent refactor that broke pcorrs against a dataset.
2022-05-05	Compute partial correlation with selected traits	Frederick Muriuki Muriithi
	Compute partial correlations against a selection of traits rather than against an entire dataset.
2022-05-05	Extract common error checking. Rename function.	Frederick Muriuki Muriithi
	* Extract the common error checking code into a separate function * Rename the function to make its use clearer
2022-05-03	Refactor: Remove unnecessary loop	Frederick Muriuki Muriithi
	Remove an unnecessary looping construct to help with speeding up the partial correlations somewhat.
2022-04-29	Replace whole header with the longest one, instead of just the	zsloan
	non-CaseAttribute headers (before this caused issues if someone was adding case attributes to a file that already contained some case attributes)
2022-04-29	Get max string length instead when comparing headers	zsloan
	Apparently max(string1, string2) in Python gets the string that is highest alphabetically, but I'm pretty sure this line was intended to get the header with the most items (which this commit doesn't fully address; you could still end up with a situation where some case attributes were removed while others were added, though that should be rare)
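The behaviour described above can be reproduced in a few lines. The header strings below are made up for illustration; only the built-in `max` and its `key` argument are relied on.

```python
# Two hypothetical CSV header lines; the second has more columns.
header_a = "Strain,Value,Sex"
header_b = "Strain,Value,Age,Weight"

# Without a key, max() compares the strings lexicographically, so the
# shorter header wins here because "S" sorts after "A".
lexicographic = max(header_a, header_b)

# Compare by string length instead.
by_length = max(header_a, header_b, key=len)

# Compare by number of comma-separated items, which is closer to
# "the header with the most items".
most_items = max(header_a, header_b, key=lambda h: len(h.split(",")))
```

Comparing by item count is arguably the more direct reading of the intent, since a header with fewer but longer column names could still be the longer string.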
2022-04-12	Delete "get_allowable_sampledata_headers"	BonfaceKilz
	* gn3/csvcmp.py (get_allowable_sampledata_headers): Delete it. * tests/unit/test_csvcmp.py: Remove "get_allowable_sampledata_headers" import. (test_get_allowable_csv_headers): Delete it.
2022-04-12	Strip any newline, tab or carriage-return chars from sample data	BonfaceKilz
	* gn3/db/sample_data.py (get_trait_csv_sample_data): Strip out "\n", "\t", or "\r" from the sample data. See: <https://issues.genenetwork.org/issues/csv-error-ITP_10001-longevity-data-set.html>
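A minimal sketch of this kind of clean-up, assuming the characters are removed wherever they occur in a value. The helper name is hypothetical and this is not the code in `get_trait_csv_sample_data`.

```python
# Translation table that deletes newline, tab and carriage-return characters.
_STRIP_TABLE = {ord(char): None for char in "\n\t\r"}


def remove_line_breaks_and_tabs(value: str) -> str:
    """Return `value` with every newline, tab and carriage return removed."""
    return value.translate(_STRIP_TABLE)


cleaned = remove_line_breaks_and_tabs("12.5\r\n")
```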
2022-04-07	Fix pylint errors	BonfaceKilz

2022-04-07	Fix mypy error	BonfaceKilz

2022-04-07	Use case attribute id inside brackets if present during deletions	BonfaceKilz
	* gn3/db/sample_data.py (delete_sample_data): If an id is present in the column header, use it. * tests/unit/db/test_sample_data.py (test_delete_sample_data): Update tests to capture the above.
2022-04-07	Use case attribute id inside brackets if present during insertions	BonfaceKilz
	* gn3/db/sample_data.py (insert_sample_data): If an id is present in the column header, use it. * tests/unit/db/test_sample_data.py (test_insert_sample_data): Update tests to capture the above.
2022-04-07	Use case attribute id inside brackets if present during updates	BonfaceKilz
	* gn3/db/sample_data.py: Import "parse_csv_column". (update_sample_data): If an id is present in the column header, use it. * tests/unit/db/test_sample_data.py (test_update_sample_data): Update tests to capture the above.
2022-04-07	Add method for fetching the case_attributes	BonfaceKilz
	* gn3/db/sample_data.py (get_case_attributes): New function. * tests/unit/db/test_sample_data.py (test_get_case_attributes): Test case for the above.
2022-04-07	Run python black on file	BonfaceKilz
	* gn3/db/sample_data.py: Run "python black -l 79 ..."
2022-04-07	Add method for parsing a csv header from uploaded sample-data file	BonfaceKilz
	* gn3/csvcmp.py (parse_csv_column): New function. * tests/unit/test_csvcmp.py: Test case for the above.
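As an illustration of parsing a column header that carries an id in brackets, here is a sketch under stated assumptions: the header is assumed to look like `Sex (2)`, with a numeric id in trailing parentheses, and the function name and return shape are hypothetical. This is not the actual `parse_csv_column` implementation.

```python
import re
from typing import Optional, Tuple

# Assumed header shape: "<name> (<numeric id>)", e.g. "Sex (2)".
_ID_SUFFIX = re.compile(r"^(?P<name>.*?)\s*\((?P<id>\d+)\)\s*$")


def split_header_id(column: str) -> Tuple[str, Optional[int]]:
    """Split a header into (name, id); id is None when no "(<digits>)" suffix exists."""
    match = _ID_SUFFIX.match(column)
    if match:
        return match.group("name"), int(match.group("id"))
    return column.strip(), None


with_id = split_header_id("Sex (2)")
without_id = split_header_id("Age")
```

Returning `None` for a missing id lets callers fall back to looking the attribute up by name.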
2022-04-01	Run python-black in file	BonfaceKilz
	* gn3/csvcmp.py: Run "black -l 79 ..." * tests/unit/db/test_sample_data.py: Ditto. * tests/unit/test_csvcmp.py: Ditto.
2022-03-30	Revert "Run json.loads on request.get_json, since request.get_json was just ↵	Frederick Muriuki Muriithi
	returning a string" This reverts commit b93b22386056347d8002dd2e403425beeb4657cd. The appropriate fix should have been in GN2. The original statement args = request.get_json() was correct, since `request.get_json()` should return a python object parsed from the JSON string in the request. Unfortunately, GN2 was encoding the request data two times, which led to the call returning a JSON-encoded string instead of the expected object. The issue has been fixed in GN2 and therefore, the "fix" here can be reverted.
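The double-encoding problem described above can be reproduced with the standard `json` module alone. The payload below is a made-up stand-in for the request data.

```python
import json

# Hypothetical request body, for illustration only.
payload = {"primary_trait": "trait-1", "method": "pearson"}

encoded_once = json.dumps(payload)        # what the client should send
encoded_twice = json.dumps(encoded_once)  # what a double-encoding client sends

# Decoding a singly encoded body gives back the original object.
decoded_once = json.loads(encoded_once)

# Decoding a doubly encoded body gives back a JSON string, not a dict,
# so the receiver would need a second json.loads() to reach the object.
decoded_twice = json.loads(encoded_twice)
recovered = json.loads(decoded_twice)
```

This is why the extra `json.loads` appeared to fix things on the receiving side, while the actual defect was the second encoding step on the sending side.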
2022-03-29	Remove unused module	Frederick Muriuki Muriithi
	* Remove a module that is no longer in use
2022-03-28	Run json.loads on request.get_json, since request.get_json was just ↵	zsloan
	returning a string
2022-03-23	Run python-black on file and fix other pylint issues.	BonfaceKilz
	See: <https://ci.genenetwork.org/jobs/genenetwork3-pylint/126> * gn3/computations/rqtl.py: Run `black gn3/computations/rqtl.py`. Also, manually fix other pylint issues.
2022-03-22	Fixes pylint errors	zsloan

2022-03-22	Fixed mypy typing errors	zsloan

2022-03-22	Tried to make the docstrings more consistent	zsloan

2022-03-22	Add typing to some functions	zsloan

2022-03-22	Add functions for getting proximal/distal markers for each pseudomarker ↵	zsloan
	position in pair-scan results + return only the sorted top 500 results
2022-03-22	Fix issue that causes R/qtl to always run pair-scan even if pair-scan isn't ↵	zsloan
	selected
2022-03-22	Added genofile name to inputs for processing R/qtl pair-scan results, since ↵	zsloan
	it's needed to store the proximal/distal markers for each position
2022-03-22	Removed quotes from beginning and end of chromosome string	zsloan

2022-03-22	Fixed a couple function calls to use the updated function names	zsloan

2022-03-22	Create pairscan_for_figure and pairscan_for_table functions that return the ↵	zsloan
	Dict and List respectively used for the pair scan figure and the table showing the results
2022-03-22	Fix imports to import both process_rqtl_mapping and process_rqtl_pairscan in ↵	zsloan
	api/rqtl.py
2022-03-22	Added pairscan boolean kwarg and process_rqtl_pairscan function for reading ↵	zsloan
	in pairscan results + renamed process_rqtl_output to process_rqtl_mapping to distinguish between that and pairscan
2022-03-18	Clean all csv fields before diffing	BonfaceKilz
	There was a subtle bug where "csvdiff" generated an error related to "different column headings" caused by something akin to diffing: "a, b \n, ..." with "a, b\n, ...". * gn3/csvcmp.py (csv_diff): Clean csv texts before any diffing. * tests/unit/test_csvcmp.py (test_csv_diff_same_columns): Modify test case to capture aforementioned bug.
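The kind of mismatch described above, and one way of normalising it away, can be sketched as follows. The helper is hypothetical and deliberately naive (it splits on commas and does not handle quoted fields); it is not the `clean_csv_text` function from `gn3/csvcmp.py`.

```python
def normalise_csv_text(text: str) -> str:
    """Strip surrounding whitespace from every field on every line."""
    cleaned_lines = []
    for line in text.strip().split("\n"):
        fields = [field.strip() for field in line.split(",")]
        cleaned_lines.append(",".join(fields))
    return "\n".join(cleaned_lines)


# The same data, differing only by a trailing space before each newline.
base = "a, b \n1, 2 \n"
delta = "a, b\n1, 2\n"

headers_differ_raw = base.split("\n")[0] != delta.split("\n")[0]
same_after_cleaning = normalise_csv_text(base) == normalise_csv_text(delta)
```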
2022-03-18	Create new function for cleaning individual fields in csv text	BonfaceKilz
	* gn3/csvcmp.py (clean_csv_text): New function. * tests/unit/test_csvcmp.py: Import "csv_text". (test_clean_csv_text): Test case for the above.
2022-03-15	Feature/refactored pca (#79)	Alexander Kabui
	* compute zscore function * test case for computing zscore * function to compute pca * generate scree plot data * generate new pca trait data from zscores and eigen_vec * remove redundant functions * generate factor loading table data * generate pca temp dataset dict * variable naming and error fixes * unit test for processing factor loadings * minor fixes for generating temp pca dataset * pass datetime as argument to generate_pca temp dataset function * add unittest for caching pca datasets * cache temp datasets * ignore missing imports for sklearn * mypy fixes * pylint fixes * refactor tests for pca * remove unused imports * fix for generating pca traits vals * mypy and code refactoring * pep8 formatting and add docstrings * remove comments /pep8 formatting * sort eigen vectors based on eigen values * minor fix for zscores * fix for rounding variance ratios * refactor tests * rename module to pca * rename datasets to traits * fix failing tests * fix caching function * fixes return x and y coordinates for scree plot * expand exception scope * fix for deprecated numpy.matrix function * fix for failing tests * pep8 fixes * remove comments * fix merge conflict * pylint fixes * rename module name to test_pca
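For readers unfamiliar with the z-score step listed above, a minimal standard-library sketch follows. The function name is hypothetical, and the use of the population standard deviation is an assumption; the gn3 implementation may standardise differently.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def zscores(values: Sequence[float]) -> List[float]:
    """Standardise values to zero mean and unit (population) standard deviation."""
    centre = mean(values)
    spread = pstdev(values)  # assumption: population, not sample, deviation
    if spread == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(value - centre) / spread for value in values]


standardised = zscores([2.0, 4.0, 6.0])
```

Standardising each trait this way before PCA keeps traits measured on larger scales from dominating the components.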