genenetwork3 - GeneNetwork3 REST API for data science and machine learning

Age	Commit message	Author
2022-03-18	Clean all csv fields before diffing	BonfaceKilz
	There was a subtle bug where "csvdiff" generated an error related to "different column headings", caused by something akin to diffing: "a, b \n, ..." with "a, b\n, ...". * gn3/csvcmp.py (csv_diff): Clean csv texts before any diffing. * tests/unit/test_csvcmp.py (test_csv_diff_same_columns): Modify test case to capture aforementioned bug.
2022-03-18	Create new function for cleaning individual fields in csv text	BonfaceKilz
	* gn3/csvcmp.py (clean_csv_text): New function. * tests/unit/test_csvcmp.py: Import "csv_text". (test_clean_csv_text): Test case for the above.
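The field cleaning described in the entry above can be sketched roughly as follows. This is an illustration only, not the code in gn3/csvcmp.py: the name `clean_csv_text_sketch` is hypothetical, and it assumes plain comma-separated text with no quoted fields.

```python
def clean_csv_text_sketch(csv_text: str) -> str:
    """Strip surrounding whitespace from every field of every line.

    Hypothetical sketch: assumes comma-separated text without quoted
    fields, so a plain split on "," is enough.
    """
    cleaned_lines = []
    for line in csv_text.strip().split("\n"):
        fields = [field.strip() for field in line.split(",")]
        cleaned_lines.append(",".join(fields))
    return "\n".join(cleaned_lines)
```

With this, "a, b \n1, 2" and "a, b\n1, 2" both normalise to the same text, so a header comparison no longer trips on stray spaces.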
2022-03-15	Feature/refactored pca (#79)	Alexander Kabui
	* compute zscore function * test case for computing zscore * function to compute pca * generate scree plot data * generate new pca trait data from zscores and eigen_vec * remove redundant functions * generate factor loading table data * generate pca temp dataset dict * variable naming and error fixes * unit test for processing factor loadings * minor fixes for generating temp pca dataset * pass datetime as argument to generate_pca temp dataset function * add unittest for caching pca datasets * cache temp datasets * ignore missing imports for sklearn * mypy fixes * pylint fixes * refactor tests for pca * remove unused imports * fix for generating pca traits vals * mypy and code refactoring * pep8 formatting and add docstrings * remove comments / pep8 formatting * sort eigen vectors based on eigen values * minor fix for zscores * fix for rounding variance ratios * refactor tests * rename module to pca * rename datasets to traits * fix failing tests * fix caching function * fixes return x and y coordinates for scree plot * expand exception scope * fix for deprecated numpy.matrix function * fix for failing tests * pep8 fixes * remove comments * fix merge conflict * pylint fixes * rename module name to test_pca
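Two of the steps listed above, computing z-scores and sorting eigenvectors by their eigenvalues, can be sketched with the standard library alone. The function names are hypothetical, and the use of the population standard deviation is an assumption; the commit does not say which variant the pca module uses.

```python
from statistics import mean, pstdev


def zscores_sketch(values):
    """Return (value - mean) / standard deviation for each value.

    Assumption: population standard deviation. Raises ZeroDivisionError
    if all values are identical.
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [(value - mu) / sigma for value in values]


def sort_eigen_pairs_sketch(eigen_values, eigen_vectors):
    """Order eigenvectors by descending eigenvalue, keeping pairs aligned."""
    pairs = sorted(zip(eigen_values, eigen_vectors),
                   key=lambda pair: pair[0], reverse=True)
    return [pair[0] for pair in pairs], [pair[1] for pair in pairs]
```

Sorting by descending eigenvalue puts the component that explains the most variance first, which is the order a scree plot expects.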
2022-03-14	Dummy White Space commit to fix laminar	BonfaceKilz

2022-03-14	Only loop through the diff's modifications if it exists	BonfaceKilz

2022-03-14	Given a csv text and permissible headers, extract invalid headers	BonfaceKilz
	* gn3/csvcmp.py (extract_invalid_csv_headers): New function. * tests/unit/test_csvcmp.py: Import "extract_invalid_csv_headers". (test_extract_invalid_csv_headers_with_some_wrong_headers): Test case for the above.
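A minimal sketch of the header check described above, assuming the headers sit on the first line of the csv text and are comma-separated. The name `extract_invalid_headers_sketch` and the sample header names in the usage note are illustrative, not taken from the repository.

```python
def extract_invalid_headers_sketch(allowed_headers, csv_text):
    """Return headers from the first csv line that are not permissible."""
    if not csv_text:
        return []
    header_line = csv_text.split("\n")[0]
    headers = [header.strip() for header in header_line.split(",")]
    return [header for header in headers if header not in allowed_headers]
```

For example, with allowed headers `["Strain Name", "Value", "SE", "Count"]`, a csv text whose first line is `Strain Name,Value,SE,Colour` would yield `["Colour"]`.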
2022-03-14	Get all permissible column data	BonfaceKilz
	* gn3/csvcmp.py: Import "Any" and "List". (get_allowable_sampledata_headers): New function. * tests/unit/test_csvcmp: Import "get_allowable_sampledata_headers". (test_get_allowable_csv_headers): Test case for the above.
2022-03-12	Remove unused imports	BonfaceKilz

2022-03-12	Fix mypy issues	BonfaceKilz

2022-03-12	Fix pylint issues	BonfaceKilz

2022-03-12	Compose csv-diff command within single quotes	BonfaceKilz
	* gn3/csvcmp.py (csv_diff): Use single quotes. There was a change in 6d39c92 that broke this.
2022-03-12	Delete noisy "print" statement	BonfaceKilz

2022-03-12	Store the first element as strain_id	BonfaceKilz

2022-03-12	Append the strain name when extracting "actions"	BonfaceKilz
	* gn3/db/sample_data.py (__extract_actions): During updates, make sure that the strain name is part of the returned string when extracting "actions". * tests/unit/db/test_sample_data.py: Add test cases for the above.
2022-03-12	Apply auto-pep8 to sample_data.py and its test file	BonfaceKilz

2022-03-12	Add missing return type-annotations	BonfaceKilz
	* tests/unit/db/test_sample_data.py (delete_sample_data): Add missing return type for type annotations. (insert_sample_data): Ditto.
2022-03-12	Update how data is updated by re-using existing functions	BonfaceKilz
	* gn3/db/sample_data.py (get_sample_data_ids): Re-use "delete_sample_data" and "insert_sample_data" when updating data; and also add logic for updating modified data. * tests/unit/db/test_sample_data.py: Add tests for the above.
2022-03-12	Create action dict that's created when updating data	BonfaceKilz
	* gn3/db/sample_data.py (__extract_actions): An update on a vector of data can contain: inserts, deletes and updates. This function extracts these actions during an update. * tests/unit/db/test_sample_data.py (test_extract_actions): Add test-case for the above.
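One way to picture the split into inserts, deletes and updates is the sketch below. It is not the `__extract_actions` implementation: the function name, the returned dict layout, and the convention that "x" marks an empty value are all assumptions made for illustration.

```python
def extract_actions_sketch(original, updated, csv_header, empty="x"):
    """Classify per-column changes between two csv rows.

    Assumptions: rows and header are comma-separated, and `empty`
    marks a missing value.
    """
    actions = {"delete": [], "insert": [], "update": []}
    columns = zip(csv_header.split(","), original.split(","), updated.split(","))
    for header, old, new in columns:
        if old == new:
            continue  # unchanged column, nothing to do
        if old == empty:
            actions["insert"].append((header, new))
        elif new == empty:
            actions["delete"].append((header, old))
        else:
            actions["update"].append((header, new))
    return actions
```

A single edited row can therefore produce all three kinds of action at once, which is why the update path needs them separated.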
2022-03-12	Remove check for inserted data when inserting individual data	BonfaceKilz
	* gn3/db/sample_data.py (insert_sample_data)[__insert_data]: Move check to the main body. With this check here, you have 3 redundant checks. For a successful insert, it will insert the first value to the `PublishData` table and ignore the rest of the inserts.
2022-03-12	Make `_map` a constant	BonfaceKilz
	* gn3/db/sample_data.py: Now constant, `_MAP`. (delete_sample_data)[__delete_data]: Replace `_map` with `_MAP`. (insert_sample_data)[__insert_data]: Ditto.
2022-03-12	Fix faulty SQL query string when deleting case-attributes	BonfaceKilz

2022-03-12	Explicitly get CaseAttributeId and fix broken sql query	BonfaceKilz
	* gn3/db/sample_data.py (insert_sample_data): Use correct query string. Also, use CaseAttributeId to determine whether case-attributes were inserted. If so, do not attempt an insert.
2022-03-12	Remove duplicate params	BonfaceKilz
	* gn3/db/sample_data.py (insert_sample_data)[__insert_case_attribute]: Remove extra parameters.
2022-03-12	Remove dead code	BonfaceKilz

2022-03-12	Check whether publish data already exists before inserting	BonfaceKilz
	* gn3/db/sample_data.py (insert_sample_data): If data already exists in the table, do not attempt an insert; otherwise, an error will be generated.
2022-03-12	Fill in empty values in csv text with: "x"	BonfaceKilz
	* gn3/csvcmp.py (fill_csv): Update this function to allow empty lists to be filled with the default value(set in the args). * tests/unit/test_csvcmp.py (test_fill_csv): Update test to capture above.
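The filling behaviour can be sketched as below, assuming comma-separated text without quoted fields. `fill_csv_sketch` is a hypothetical name and its signature is an assumption; only the default value "x" comes from the commit message.

```python
def fill_csv_sketch(csv_text, width, value="x"):
    """Pad short rows to `width` fields and replace empty fields with `value`."""
    filled_lines = []
    for line in csv_text.strip().split("\n"):
        # Replace empty (or whitespace-only) fields with the default value.
        fields = [field.strip() or value for field in line.split(",")]
        # Pad rows that have fewer than `width` fields; no-op otherwise.
        fields += [value] * (width - len(fields))
        filled_lines.append(",".join(fields))
    return "\n".join(filled_lines)
```

Under these assumptions, `fill_csv_sketch("a,b,c\n1,,3\n4", 3)` gives every row three fields, with "x" standing in for each gap.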
2022-03-12	Fetch id's separately for the insertion edge-case	BonfaceKilz
	* gn3/db/sample_data (get_sample_data_ids): Add an extra condition that caters for inserts: during inserts, joins won't work when fetching the strain_id, publishdata_id, and strain_name. In this case, just create 2 separate queries to do that work.
2022-03-12	Extract a strain name given a csv string and its header	BonfaceKilz
	* gn3/csvcmp.py (extract_strain_name): New function. * gn3/db/sample_data (delete_sample_data): Use the aforementioned function. (insert_sample_data): Ditto. * tests/unit/test_csvcmp: Test cases for above.
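A rough sketch of looking up the strain name by its column header follows. The function name and the default column label "Strain Name" are assumptions for illustration; the commit does not state the label used.

```python
def extract_strain_name_sketch(csv_header, data, field="Strain Name"):
    """Return the value under `field`, or "" if the column is absent."""
    for column, value in zip(csv_header.split(","), data.split(",")):
        if column.strip() == field:
            return value.strip()
    return ""
```

Looking the column up by name rather than by position keeps the lookup working if the columns are reordered.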
2022-03-12	Allow deleting case-attribute data during deletion	BonfaceKilz
	* gn3/db/sample_data.py (delete_sample_data): Modify this function to allow deleting case-attribute values.
2022-03-12	Allow inserting case-attribute data during inserts	BonfaceKilz
	* gn3/db/sample_data.py (insert_sample_data): Modify this function to allow inserting case-attribute values.
2022-03-12	Fetch InbredSetId	BonfaceKilz
	* gn3/db/sample_data.py (get_sample_data_ids): Extend to also fetch InbredSetId. (update_sample_data): Discard the returned value of InbredSetId. (delete_sample_data): Ditto.
2022-03-12	Create a new function for retrieving strain_id and publishdata_id	BonfaceKilz
	* gn3/db/sample_data.py: Import Any, Tuple. (get_sample_data_ids): New function that fetches the strain_id and publishdata_id of a given data point. (update_sample_data): Use `get_sample_data_ids`. (delete_sample_data): Ditto. (insert_sample_data): Ditto.
2022-03-12	Move operations on sample_data to its own module	BonfaceKilz

2022-03-12	Don't add extra key "Column" to dict if there are no changes	BonfaceKilz
	gn3/csvcmp.py (csv_diff): If the diff is empty, don't add an extra key "Column" to the dictionary. tests/unit/test_csvcmp (test_csv_diff_only_column_change): Add test-case for the above.
2022-03-12	Fill CSV text if there are non-even rows	BonfaceKilz
	Should you try to run `csvdiff` against 2 csv files where either file has rows with an uneven number of columns, there will be an error. As such, the csv files need to be "filled" before running `csvdiff`. * gn3/csvcmp (csv_diff): For non-even rows in the csv files, fill the csv rows.
2022-03-12	Create new method for filling csv with a default value	BonfaceKilz
	* gn3/csvcmp.py (fill_csv): Given a csv text with uneven or incomplete rows whose length is less than width, fill them with a value which defaults to "x". * tests/unit/test_csvcmp.py (test_fill_csv): Test cases for the above.
2022-03-12	Replace "all" with "and"	BonfaceKilz
	* gn3/csvcmp.py (remove_insignificant_edits): "all" evaluates all elements and throws an error when `abs(float(x) - float(y)) < epsilon` is processed. Use "and" instead because of its short-circuiting behaviour.
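The difference can be reproduced in isolation. In the sketch below the helper names and the `is_number` guard are hypothetical; the point is only that a list passed to `all` is built in full before `all` runs, so every element is evaluated, whereas `and` stops at the first false operand.

```python
def is_number(text):
    """Return True if `text` can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def close_eager(x, y, epsilon=0.001):
    # The list is fully built first, so float(x) runs even when
    # is_number(x) is False, and a non-numeric value raises ValueError.
    return all([is_number(x), is_number(y),
                abs(float(x) - float(y)) < epsilon])


def close_short_circuit(x, y, epsilon=0.001):
    # "and" stops at the first False, so float() is never reached
    # for non-numeric input.
    return (is_number(x) and is_number(y)
            and abs(float(x) - float(y)) < epsilon)
```

Calling `close_eager("x", "1.0")` raises `ValueError`, while `close_short_circuit("x", "1.0")` simply returns `False`.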
2022-03-12	Store columns in the output dict	BonfaceKilz
	When inserting, deleting, or editing case-attributes, we need the column headers in order to identify the attribute of interest. * gn3/csvcmp.py (csv_diff): Add extra "Column" key in returned dict.
2022-03-12	Add methods for working with csv data	BonfaceKilz
	gn3/csvcmp.py: New file (create_dirs_if_not_exists): From a list of dirs, create them if they don't exist. (remove_insignificant_edits): Given a dict with a "Modification" key, remove edits with "delta < ε". (csv_diff): Generate a csv_diff using the "csvdiff" tool packaged in guix. tests/unit/test_csvcmp.py: Add some tests for "gn3/csvcmp.py"
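The directory helper named above can be approximated with `os.makedirs` and its `exist_ok` flag. This is a sketch under that assumption, with a hypothetical function name, not necessarily how gn3/csvcmp.py does it.

```python
import os


def create_dirs_sketch(dirs):
    """Create each directory in `dirs`, skipping those that already exist."""
    for directory in dirs:
        # exist_ok=True makes the call a no-op for existing directories
        # and also creates any missing parent directories.
        os.makedirs(directory, exist_ok=True)
```

Because of `exist_ok=True`, calling the function twice with the same list is safe.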
2022-03-12	db: Fix error in SQL query	BonfaceKilz
	* gn3/db/traits.py (get_trait_csv_sample_data): Update SQL to fix runtime errors.
2022-03-12	Fix pylint error	BonfaceKilz

2022-03-12	Append case attributes to csv data if they exist	BonfaceKilz

2022-03-12	db: Extend csv query to fetch case attributes	BonfaceKilz
	* gn3/db/traits.py (get_trait_csv_sample_data): Fetch case attribute data if it exists.
2022-03-12	Revert "db: Fetch correct sample data"	BonfaceKilz
	This reverts commit 710769e84b3bc6a2bdd66effdbac0659272ed511.
2022-03-11	Fix typing errors	Frederick Muriuki Muriithi

2022-03-11	Fix some linting issues	Frederick Muriuki Muriithi

2022-03-08	Remove unused function and its tests	Frederick Muriuki Muriithi

2022-03-08	Fix tests, and issues caught by tests	Frederick Muriuki Muriithi
	Fix some issues caught by tests due to changes introducing the hand-off of the partial correlations computations to an external process. Fix some issues due to the changes that introduce context managers for database connections. Update some tests to take the above two changes into consideration.
2022-03-08	Create database connections within context managers	Frederick Muriuki Muriithi
	Use the `with` context manager to open database connections, so as to ensure that those connections are closed once the call is completed. This hopefully avoids the 'too many connections' error.
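The pattern can be sketched as below. The sketch uses the standard library's `sqlite3` as a stand-in, since the project's actual database driver and connection helper are not shown here; `database_connection` and `count_rows` are hypothetical names.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def database_connection(uri=":memory:"):
    """Yield a connection and always close it, even if the body raises."""
    conn = sqlite3.connect(uri)
    try:
        yield conn
    finally:
        conn.close()


def count_rows():
    """Example caller: the connection is closed as soon as the block exits."""
    with database_connection() as conn:
        conn.execute("CREATE TABLE t (v INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,)])
        return conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

Closing in a `finally` clause means a failing query cannot leave the connection open, which is what lets connections pile up over time.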
2022-03-04	Automatically decode Redis strings	Frederick Muriuki Muriithi