genenetwork3 - GeneNetwork3 REST API for data science and machine learning

Age	Commit message	Author
2022-03-22	Added option for running pairscan to rqtl_wrapper.R	zsloan

2022-03-18	README: Update CI badge subdomain.	Arun Isaac
	The CI has been migrated from penguin2.genenetwork.org to ci.genenetwork.org. * README.md: Update CI badge subdomain.
2022-03-18	Clean all csv fields before diffing	BonfaceKilz
	There was a subtle bug where "csvdiff" generated an error related to "different column headings", caused by something akin to diffing: "a, b \n, ..." with "a, b\n, ...". * gn3/csvcmp.py (csv_diff): Clean csv texts before any diffing. * tests/unit/test_csvcmp.py (test_csv_diff_same_columns): Modify test case to capture aforementioned bug.
2022-03-18	Create new function for cleaning individual fields in csv text	BonfaceKilz
	* gn3/csvcmp.py (clean_csv_text): New function. * tests/unit/test_csvcmp.py: Import "csv_text". (test_clean_csv_text): Test case for the above.
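A minimal sketch of what cleaning individual csv fields could look like. This is not the code in gn3/csvcmp.py: the signature and the choice to split naively on commas (no quoted-field handling) are assumptions made for illustration.

```python
def clean_csv_text(csv_text: str) -> str:
    """Strip surrounding whitespace from every field of a csv text.

    Hypothetical sketch: fields are split on plain commas, so quoted
    fields that contain commas are not supported.
    """
    cleaned_lines = []
    for line in csv_text.strip().split("\n"):
        fields = [field.strip() for field in line.split(",")]
        cleaned_lines.append(",".join(fields))
    return "\n".join(cleaned_lines)
```

With this, "a, b \n1, 2" and "a, b\n1, 2" clean to the same text, so a later diff would no longer see different column headings.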
2022-03-15	Feature/refactored pca (#79)	Alexander Kabui
	* compute zscore function * test case for computing zscore * function to compute pca * generate scree plot data * generate new pca trait data from zscores and eigen_vec * remove redundant functions * generate factor loading table data * generate pca temp dataset dict * variable naming and error fixes * unit test for processing factor loadings * minor fixes for generating temp pca dataset * pass datetime as argument to generate_pca temp dataset function * add unittest for caching pca datasets * cache temp datasets * ignore missing imports for sklearn * mypy fixes * pylint fixes * refactor tests for pca * remove unused imports * fix for generating pca traits vals * mypy and code refactoring * pep8 formatting and add docstrings * remove comments /pep8 formatting * sort eigen vectors based on eigen values * minor fix for zscores * fix for rounding variance ratios * refactor tests * rename module to pca * rename datasets to traits * fix failing tests * fix caching function * fixes return x and y coordinates for scree plot * expand exception scope * fix for deprecated numpy.matrix function * fix for failing tests * pep8 fixes * remove comments * fix merge conflict * pylint fixes * rename module name to test_pca
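As an illustration of the z-score step named above, here is a standard-library sketch. The function name and the use of the population standard deviation are assumptions; the commit does not say how the pca module computes z-scores.

```python
from statistics import mean, pstdev
from typing import List


def compute_zscores(values: List[float]) -> List[float]:
    """Return (value - mean) / standard deviation for each value.

    Assumption: population standard deviation (ddof = 0) is used.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so every z-score is zero.
        return [0.0 for _ in values]
    return [(value - mu) / sigma for value in values]
```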
2022-03-14	Dummy White Space commit to fix laminar	BonfaceKilz

2022-03-14	Sort import lines	BonfaceKilz

2022-03-14	Only loop through the diff's modifications if it exists	BonfaceKilz

2022-03-14	Given a csv text and permissible headers, extract invalid headers	BonfaceKilz
	* gn3/csvcmp.py (extract_invalid_csv_headers): New function. * tests/unit/test_csvcmp.py: Import "extract_invalid_csv_headers". (test_extract_invalid_csv_headers_with_some_wrong_headers): Test case for the above.
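The idea of extracting invalid headers can be sketched as follows. The argument order, the return type, and the sample header names are assumptions for illustration, not the actual gn3 signature.

```python
from typing import List


def extract_invalid_csv_headers(allowed_headers: List[str],
                                csv_text: str) -> List[str]:
    """Return the headers in the first csv line that are not allowed.

    Hypothetical sketch: the header row is assumed to be the first line.
    """
    header_line = csv_text.split("\n")[0]
    headers = [header.strip() for header in header_line.split(",")]
    return [header for header in headers if header not in allowed_headers]
```

An empty result means every column heading in the uploaded text is permissible.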
2022-03-14	Get all permissible column data	BonfaceKilz
	* gn3/csvcmp.py: Import "Any" and "List". (get_allowable_sampledata_headers): New function. * tests/unit/test_csvcmp: Import "get_allowable_sampledata_headers". (test_get_allowable_csv_headers): Test case for the above.
2022-03-12	Fix pylint errors in unit_tests	BonfaceKilz

2022-03-12	Remove unused imports	BonfaceKilz

2022-03-12	Fix mypy issues	BonfaceKilz

2022-03-12	Fix pylint issues	BonfaceKilz

2022-03-12	README: Replace "unit-test" instructions with "pytest"	BonfaceKilz

2022-03-12	Compose csv-diff command within single quotes	BonfaceKilz
	* gn3/csvcmp.py (csv_diff): Use single quotes. There was a change in 6d39c92 that broke this.
2022-03-12	Delete noisy "print" statement	BonfaceKilz

2022-03-12	Store the first element as strain_id	BonfaceKilz

2022-03-12	Append the strain name when extracting "actions"	BonfaceKilz
	* gn3/db/sample_data.py (__extract_actions): During updates, make sure that the strain name is part of the returned string when extracting "actions". * tests/unit/db/test_sample_data.py: Add test cases for the above.
2022-03-12	Apply auto-pep8 to sample_data.py and its test file	BonfaceKilz

2022-03-12	Add missing return type-annotations	BonfaceKilz
	* tests/unit/db/test_sample_data.py (delete_sample_data): Add missing return type for type annotations. (insert_sample_data): Ditto.
2022-03-12	Update how data is updated by re-using existing functions	BonfaceKilz
	* gn3/db/sample_data.py (get_sample_data_ids): Re-use "delete_sample_data" and "insert_sample_data" when updating data; and also add logic for updating modified data. * tests/unit/db/test_sample_data.py: Add tests for the above.
2022-03-12	Create action dict that's created when updating data	BonfaceKilz
	* gn3/db/sample_data.py (__extract_actions): An update on a vector of data can contain: inserts, deletes and updates. This function extracts these actions during an update. * tests/unit/db/test_sample_data.py (test_extract_actions): Add test-case for the above.
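One way to picture splitting an update into inserts, deletes and updates is sketched below. Everything here is an assumption: the function name without the leading underscores, the comma-separated inputs, the use of "x" as the marker for a missing value, and the shape of the returned dict.

```python
from typing import Dict, List


def extract_actions(original: str, updated: str, csv_header: str,
                    missing: str = "x") -> Dict[str, List[str]]:
    """Classify each changed column as an insert, a delete or an update.

    Hypothetical rules: missing -> value is an insert, value -> missing
    is a delete, value -> different value is an update.
    """
    actions: Dict[str, List[str]] = {"insert": [], "delete": [], "update": []}
    columns = csv_header.split(",")
    for column, old, new in zip(columns, original.split(","),
                                updated.split(",")):
        if old == new:
            continue  # unchanged column, nothing to do
        if old == missing:
            actions["insert"].append(column)
        elif new == missing:
            actions["delete"].append(column)
        else:
            actions["update"].append(column)
    return actions
```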
2022-03-12	Add test cases for inserting and deleting data	BonfaceKilz
	* tests/unit/db/test_sample_data.py (test_insert_sample_data): Test inserting data. (test_delete_sample_data): Test deleting data.
2022-03-12	Remove check for inserted data when inserting individual data	BonfaceKilz
	* gn3/db/sample_data.py (insert_sample_data)[__insert_data]: Move check to the main body. With this check here, you have 3 redundant checks. For a successful insert, it will insert the first value to the `PublishData` table and ignore the rest of the inserts.
2022-03-12	Make `_map` a constant	BonfaceKilz
	* gn3/db/sample_data.py: Now constant, `_MAP`. (delete_sample_data)[__delete_data]: Replace `_map` with `_MAP`. (insert_sample_data)[__insert_data]: Ditto.
2022-03-12	Fix faulty SQL query string when deleting case-attributes	BonfaceKilz

2022-03-12	Explicitly get CaseAttributeId and fix broken sql query	BonfaceKilz
	* gn3/db/sample_data.py (insert_sample_data): Use correct query string. Also, use CaseAttributeId to determine whether case-attributes were inserted. If so, do not attempt an insert.
2022-03-12	Remove duplicate params	BonfaceKilz
	* gn3/db/sample_data.py (insert_sample_data)[__insert_case_attribute]: Remove extra parameters.
2022-03-12	Remove dead code	BonfaceKilz

2022-03-12	Check whether publish data already exists before inserting	BonfaceKilz
	* gn3/db/sample_data.py (insert_sample_data): If data already exists in the table, do not attempt an insert; otherwise, an error will be generated.
2022-03-12	Fill in empty values in csv text with: "x"	BonfaceKilz
	* gn3/csvcmp.py (fill_csv): Update this function to allow empty lists to be filled with the default value (set in the args). * tests/unit/test_csvcmp.py (test_fill_csv): Update test to capture above.
2022-03-12	Remove test cases related to sample data	BonfaceKilz
	Most of these functions were moved to their own module.
2022-03-12	Fetch id's separately for the insertion edge-case	BonfaceKilz
	* gn3/db/sample_data (get_sample_data_ids): Add an extra condition that caters for inserts: during inserts, joins won't work when fetching the strain_id, publishdata_id, and strain_name. In this case, just create 2 separate queries to do that work.
2022-03-12	Extract a strain name given a csv string and its header	BonfaceKilz
	* gn3/csvcmp.py (extract_strain_name): New function. * gn3/db/sample_data (delete_sample_data): Use the aforementioned function. (insert_sample_data): Ditto. * tests/unit/test_csvcmp: Test cases for above.
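A sketch of looking up the strain name by its header position. The argument order, the default column name "Strain Name", and the empty-string fallback are assumptions, not the actual gn3 behaviour.

```python
def extract_strain_name(csv_header: str, data: str,
                        seek: str = "Strain Name") -> str:
    """Return the value in `data` that sits under the `seek` column.

    Hypothetical sketch: returns an empty string if the column is absent.
    """
    for column, value in zip(csv_header.split(","), data.split(",")):
        if column.strip() == seek:
            return value.strip()
    return ""
```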
2022-03-12	Allow deleting case-attribute data during deletion	BonfaceKilz
	* gn3/db/sample_data.py (delete_sample_data): Modify this function to allow deleting case-attribute values.
2022-03-12	Allow inserting case-attribute data during inserts	BonfaceKilz
	* gn3/db/sample_data.py (insert_sample_data): Modify this function to allow inserting case-attribute values.
2022-03-12	Fetch InbredSetId	BonfaceKilz
	* gn3/db/sample_data.py (get_sample_data_ids): Extend to also fetch InbredSetId. (update_sample_data): Discard the returned value of InbredSetId. (delete_sample_data): Ditto.
2022-03-12	Create a new function for retrieving strain_id and publishdata_id	BonfaceKilz
	* gn3/db/sample_data.py: Import Any, Tuple. (get_sample_data_ids): New function that fetches the strain_id and publishdata_id of a given data point. (update_sample_data): Use `get_sample_data_ids`. (delete_sample_data): Ditto. (insert_sample_data): Ditto.
2022-03-12	Move operations on sample_data to its own module	BonfaceKilz

2022-03-12	Don't add extra key "Column" to dict if there are no changes	BonfaceKilz
	* gn3/csvcmp.py (csv_diff): If the diff is empty, don't add an extra key "Column" to the dictionary. * tests/unit/test_csvcmp (test_csv_diff_only_column_change): Add test-case for the above.
2022-03-12	Test edge cases for csv files when running csvdiff	BonfaceKilz
	* tests/unit/test_csvcmp.py (test_csv_diff): Delete it. (test_csv_diff_same_columns): Test csv_diff against csv texts with the same columns. (test_csv_diff_different_columns): Test csv_diff against csv texts with different columns.
2022-03-12	Fill CSV text if there are non-even rows	BonfaceKilz
	Should you try to run `csvdiff` against 2 csv files where either file has rows with an uneven number of columns, there will be an error. As such, the csv files need to be "filled" before running `csvdiff`. * gn3/csvcmp (csv_diff): For uneven rows in the csv files, fill the csv rows.
2022-03-12	Create new method for filling csv with a default value	BonfaceKilz
	* gn3/csvcmp.py (fill_csv): Given a csv text with uneven or incomplete rows whose lengths are less than the width, fill them with a value which defaults to "x". * tests/unit/test_csvcmp.py (test_fill_csv): Test cases for the above.
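The filling step might look roughly like this. The signature is an assumption, and treating empty fields as missing is my reading of the later "Fill in empty values in csv text" commit rather than something this commit states.

```python
def fill_csv(csv_text: str, width: int, value: str = "x") -> str:
    """Pad every csv row to `width` fields, using `value` as filler.

    Hypothetical sketch: empty fields are also replaced with `value`,
    and rows that are already wide enough are left at their length.
    """
    filled_lines = []
    for line in csv_text.strip().split("\n"):
        fields = [field.strip() or value for field in line.split(",")]
        # A non-positive multiplier yields an empty list, so no padding.
        fields += [value] * (width - len(fields))
        filled_lines.append(",".join(fields))
    return "\n".join(filled_lines)
```

After filling, every row has the same number of columns, which is the precondition for running `csvdiff` described in the commit above this one in the log.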
2022-03-12	Replace "all" with "and"	BonfaceKilz
	* gn3/csvcmp.py (remove_insignificant_edits): "all" evaluates all elements and throws an error when `abs(float(x) - float(y)) < epsilon` is processed. Use "and" instead because of its short-circuiting behaviour.
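The difference between the two forms can be reproduced in isolation. The helper names below are hypothetical; the point is only that a list passed to `all` is built in full before `all` runs, whereas `and` stops at the first false operand.

```python
def is_numeric(text: str) -> bool:
    """Return True if `text` can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def eager_check(old: str, new: str, epsilon: float = 0.001) -> bool:
    # The list is evaluated eagerly, so float("x") raises ValueError
    # even though the first element is already False.
    return all([is_numeric(old), is_numeric(new),
                abs(float(old) - float(new)) < epsilon])


def short_circuit_check(old: str, new: str, epsilon: float = 0.001) -> bool:
    # `and` stops at the first False, so float() is never reached
    # for non-numeric input.
    return (is_numeric(old) and is_numeric(new)
            and abs(float(old) - float(new)) < epsilon)
```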
2022-03-12	Store columns in the output dict	BonfaceKilz
	When inserting, deleting, or editing case-attributes, we need the column headers in order to be able to identify the attribute of interest. * gn3/csvcmp.py (csv_diff): Add extra "Column" key in returned dict.
2022-03-12	Add methods for working with csv data	BonfaceKilz
	* gn3/csvcmp.py: New file. (create_dirs_if_not_exists): From a list of dirs, create them if they don't exist. (remove_insignificant_edits): Given a dict with a "Modification" key, remove edits with "delta < ε". (csv_diff): Generate a csv_diff using the "csvdiff" tool packaged in guix. * tests/unit/test_csvcmp.py: Add some tests for "gn3/csvcmp.py".
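A rough sketch of dropping edits with "delta < ε". The dict layout is an assumption: a "Modifications" list whose entries carry comma-separated "Original" and "Current" rows. The key names in the real code may differ (the commit message writes "Modification"), and the helper `_differs` is hypothetical.

```python
from typing import Any, Dict


def _differs(old: str, new: str, epsilon: float) -> bool:
    """Return True if two fields differ by at least epsilon.

    Non-numeric fields fall back to plain string comparison.
    """
    try:
        return abs(float(old) - float(new)) >= epsilon
    except ValueError:
        return old != new


def remove_insignificant_edits(diff_data: Dict[str, Any],
                               epsilon: float = 0.001) -> Dict[str, Any]:
    """Keep only modifications with at least one significant change."""
    significant = [
        mod for mod in diff_data.get("Modifications", [])
        if any(_differs(old, new, epsilon)
               for old, new in zip(mod["Original"].split(","),
                                   mod["Current"].split(",")))
    ]
    # Return a new dict so the caller's diff is left untouched.
    return {**diff_data, "Modifications": significant}
```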
2022-03-12	db: Fix error in SQL query	BonfaceKilz
	* gn3/db/traits.py (get_trait_csv_sample_data): Update SQL to fix runtime errors.
2022-03-12	Fix pylint error	BonfaceKilz

2022-03-12	Append case attributes to csv data if they exist	BonfaceKilz