genenetwork3 - GeneNetwork3 REST API for data science and machine learning

Age	Commit message	Author
2022-06-29	new variable: CORRELATION_COMMAND	Alexander

2022-06-29	init test correlation rust module	Alexander

2022-06-29	init rust correlation module	Alexander

2022-06-28	Parse the method from UI before passing it to external process	Frederick Muriuki Muriithi
	To reduce the chances of the system failing due to the external process being launched with the wrong parameters, add a parsing stage that converts the method from the UI into a form acceptable by the CLI script. * gn3/commands.py: parse the method from UI * scripts/partial_correlations.py: simplify the acceptable methods
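	The parsing stage described above could look roughly like the sketch below. It is illustrative only: the function name `parse_method`, the lookup table, and the CLI tokens "pearson" and "spearman" are assumptions, not the actual contents of gn3/commands.py. Only the two UI labels come from this log.

```python
# Hypothetical lookup table: UI label (lower-cased) -> token assumed to be
# accepted by the CLI script. The real values may differ.
_UI_TO_CLI_METHOD = {
    "pearson's r": "pearson",
    "spearman's rho": "spearman",
}


def parse_method(ui_method: str) -> str:
    """Convert a method label from the UI into a CLI-friendly token.

    Raises ValueError for anything unrecognised, so the external process is
    never launched with a parameter it cannot handle.
    """
    key = ui_method.strip().lower()
    try:
        return _UI_TO_CLI_METHOD[key]
    except KeyError:
        raise ValueError(f"Unsupported correlation method: {ui_method!r}") from None
```

	Rejecting unknown labels before the process is spawned keeps the failure inside the web application, where it can be reported cleanly.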
2022-06-21	tests: test_file_utils: Replace pytest.mark.skip with unittest.skip	BonfaceKilz
	"python setup.py test" does not honour "pytest.mark.skip", so the marked tests still run and the build fails when you try to package gn3.
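	A minimal sketch of why the replacement helps: `unittest.skip` is recognised by the standard-library runner, whereas `pytest.mark.skip` is only acted on by pytest itself. The test class and method names below are made up for illustration and are not taken from tests/unit/test_file_utils.py.

```python
import unittest


class CacheTests(unittest.TestCase):
    """Hypothetical test case standing in for the real one."""

    @unittest.skip("dependency removed; test disabled")
    def test_cache_hit(self):
        # Never executed: the unittest runner skips it before the body runs.
        self.fail("this body should not run")


def run_suite() -> unittest.TestResult:
    """Run the test case with the plain unittest machinery."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CacheTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

	The returned `TestResult` records the test under `skipped` rather than `failures`, so a runner built on unittest reports success.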
2022-06-21	setup.py: Remove commented out "ipfshttpclient" line	BonfaceKilz

2022-06-21	gn3: fs_helpers: Fix pylint errors	BonfaceKilz
	* gn3/fs_helpers.py: Remove unused "pathlib" import. (cache_ipfs_file): Disable "unused-argument" warning.
2022-06-21	test_file_utils: Disable test cases for "cache_ipfs_file"	BonfaceKilz
	* tests/unit/test_file_utils.py (test_cache_ipfs_file_cache_hit): Skip it. (test_cache_ipfs_file_cache_miss): Ditto.
2022-06-21	gn3: fs_helpers: Remove ipfshttpclient	BonfaceKilz
	This library pollutes the Genenetwork2 profile with an old version of "dataclasses", thereby causing it to fail.
2022-06-21	setup.py: Remove ipfshttpclient	BonfaceKilz

2022-06-21	integration: test_gemma: Skip "GemmaAPITest"	BonfaceKilz

2022-06-21	Replace lint code with human-readable text	BonfaceKilz
	* gn3/db/correlations.py (__fetch_data__): Use human-readable text in place of the lint code.
2022-06-21	db: correlations: Ignore pylint error	BonfaceKilz
	* gn3/db/correlations.py (__fetch_data__): Ignore "Too many args" [R0913] error.
2022-06-21	db: correlations: Ignore types	BonfaceKilz
	* gn3/db/correlations.py (__build_query__): Ignore the "sample_ids" and "joins" types when calling build_query_sgo_lit_corr. (fetch_all_database_data): Ignore the return type. TODO: Ping Alex/Arun to fix this.
2022-06-21	db: datasets.py: Ignore results from sparql.queryAndConvert	BonfaceKilz
	ATM, it's very difficult to work out the correct type that is returned. Ignore this for now and fix it later.
2022-06-21	mypy.ini: Ignore missing lmdb mypy stubs	BonfaceKilz

2022-06-20	Update README: export env variables explicitly	Frederick Muriuki Muriithi

2022-06-20	gn3: genodb: Retire get function.	Arun Isaac
	* gn3/genodb.py (get): Delete function. (matrix): Use db.txn.get instead of get.
2022-06-20	gn3: genodb: Match class and function names of GenotypeMatrix.	Arun Isaac
	* gn3/genodb.py (GenotypeMatrix): Match class and function names.
2022-06-20	gn3: genodb: Remove db, nrows and ncols fields from GenotypeMatrix.	Arun Isaac
	db is unused. nrows and ncols are available in the array and transpose numpy arrays. * gn3/genodb.py (GenotypeMatrix)[db, nrows, ncols]: Delete fields. * gn3/genodb.py (matrix): Do not initialize db, nrows and ncols fields.
2022-06-20	gn3: genodb: Mention reading entire matrix in module docstring.	Arun Isaac
	* gn3/genodb.py: Mention reading entire matrix in module docstring.
2022-06-20	Restrict partial correlation method choices	Frederick Muriuki Muriithi
	- Have "Pearson's r" and "Spearman's rho" as the only valid choices for the partial correlations
2022-06-17	gn3: genodb: Rename Matrix named tuple to GenotypeMatrix.	Arun Isaac
	* gn3/genodb.py (Matrix): Rename to GenotypeMatrix. (matrix): Update invocation of Matrix.
2022-06-17	gn3: genodb: Allow retrieval of the entire genotype matrix.	Arun Isaac
	* gn3/genodb.py: Document nparray in the module docstring. (nparray): New function.
2022-06-17	gn3: genodb: Read optimized storage for the current matrix.	Arun Isaac
	The genotype database now stores only the current version of the matrix in a read-optimized form, while storing the older versions of the matrix in a more compressed form. We are only interested in the current version of the matrix. So, always use the read-optimized storage. * gn3/genodb.py (Matrix)[row_pointers, column_pointers]: Delete fields. [array, transpose]: New fields. * gn3/genodb.py (matrix, row, column): Read from read-optimized storage. (vector_ref): Delete function.
2022-06-09	gn3: genodb: Remove blank line in module docstring.	Arun Isaac
	* gn3/genodb.py: Remove blank line in module docstring.
2022-06-09	gn3: genodb: Rewrite without classes.	Arun Isaac
	We rewrite genodb using only functions. This makes for much more readable code. * gn3/genodb.py: Rewrite without classes.
2022-06-08	gn3: genodb: Support reading columns.	Arun Isaac
	* gn3/genodb.py (Matrix.__init__): Retrieve column pointers from database. (row): Abstract out vector access code to ... (Matrix.__vector): ... here. (Matrix.column): New method.
2022-06-08	gn3: genodb: Read only the most recent genotype matrix.	Arun Isaac
	The genotype database format now supports versioning of matrices. So, we update genodb.py to return only the most recent genotype matrix. * gn3/genodb.py (GenotypeDatabase.matrix): Return only the most recent genotype matrix.
2022-06-08	gn3: genodb: Open genotype database in read-only mode.	Arun Isaac
	* gn3/genodb.py (GenotypeDatabase.__init__): Open genotype database in read-only mode.
2022-06-08	gn3: genodb: Do not create genotype database if it does not exist.	Arun Isaac
	* gn3/genodb.py (GenotypeDatabase.__init__): Do not create genotype database if it does not exist.
2022-06-08	gn3: genodb: Decide on little endianness.	Arun Isaac
	It has been decided that the genotype database will use little endianness wherever applicable. * gn3/genodb.py (Matrix.__init__): Remove TODO note to decide on endianness.
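	To illustrate what fixing the byte order means for a reader, here is a small sketch that decodes a little-endian integer from a byte buffer with the standard `struct` module. The helper name `read_u64_le` and the choice of an unsigned 64-bit field are assumptions. This log does not state the actual field widths used by the genotype database.

```python
import struct


def read_u64_le(buffer: bytes, offset: int = 0) -> int:
    """Decode an unsigned 64-bit integer stored least-significant byte first.

    The "<" prefix forces little-endian decoding regardless of the byte
    order of the machine running the code.
    """
    (value,) = struct.unpack_from("<Q", buffer, offset)
    return value
```

	Stating the byte order explicitly in the format string means the same file reads identically on little-endian and big-endian hosts.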
2022-06-08	gn3: genodb: Do not terminate database strings with null.	Arun Isaac
	* gn3/genodb.py (GenotypeDatabase.get_metadata, GenotypeDatabase.matrix): Do not terminate database strings with the null character.
2022-06-03	gn3: Add genodb.	Arun Isaac
	genodb is a tiny library to read our new genotype database file format. * gn3/genodb.py: New file.
2022-05-31	Remove unnecessary statement	Frederick Muriuki Muriithi

2022-05-31	Extract utility functions from `fetch_all_database_data`	Frederick Muriuki Muriithi
	Extract the utility functions to help with understanding what the `fetch_all_database_data` function is doing. This helps with maintenance.
2022-05-30	Pass trait data as args to `fix_strains` and fix some bugs	Frederick Muriuki Muriithi
	The `fix_strains` function works on the trait data, not the basic trait info. This commit fixes the arguments passed to the function, and also some bugs in the function.
2022-05-27	Move sql for CRUD operations on case-attrs from gn2 to gn3	BonfaceKilz

2022-05-27	Move sql for modifying case-attributes from gn2 to gn3	BonfaceKilz

2022-05-27	sql: caseattributes_audit.sql: New file	BonfaceKilz
	Create new table that stores edits related to case-attributes.
2022-05-27	Return all the results from CaseAttributes column as is	BonfaceKilz
	* gn3/db/sample_data.py: Remove "collections" import. Add "Optional" import. (get_case_attributes): Return the results of "fetchall" from the case attributes. * tests/unit/db/test_sample_data.py (test_get_case_attributes): Update failing test.
2022-05-26	Add Endpoint to get menu items for use in UI	Frederick Muriuki Muriithi

2022-05-24	Run partial correlations with external script	Frederick Muriuki Muriithi
	Use the new external script to run the partial correlations in both cases: against an entire dataset, or against selected traits.
2022-05-24	Fix some linting issues	Frederick Muriuki Muriithi

2022-05-24	New script to compute partial correlations	Frederick Muriuki Muriithi
	* Add a new script to compute the partial correlations against: - a select list of traits, or - an entire dataset depending on the specified subcommand. This new script is meant to supersede the `scripts/partial_correlations.py` script. * Fix the check for errors * Reorganise the order of arguments for the `partial_correlations_with_target_traits` function: move the `method` argument before the `target_trait_names` argument so that the common arguments in the partial correlation computation functions share the same order.
2022-05-21	Fix linting errors	Frederick Muriuki Muriithi

2022-05-21	Use multiprocessing to improve performance	Frederick Muriuki Muriithi

2022-05-21	Process primary, target and control traits in a single iteration	Frederick Muriuki Muriithi
	Rework the code to process the traits in a single iteration to improve performance.
2022-05-21	Return generator object rather than tuples	Frederick Muriuki Muriithi
	Return generator objects rather than pre-computed tuples to reduce the number of iterations needed to process the data, and thus improve the performance of the system somewhat.
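	The idea of returning generators instead of pre-computed tuples can be sketched as follows. The function names and the data are invented for illustration and do not reflect the actual gn3 code. The point is only that chained generators let the consumer walk the data once, without intermediate collections.

```python
from typing import Iterable, Iterator, Tuple


def scaled(values: Iterable[float], factor: float) -> Iterator[float]:
    """Lazily scale each value; nothing is computed until it is consumed."""
    return (value * factor for value in values)


def paired(names: Iterable[str], values: Iterable[float]) -> Iterator[Tuple[str, float]]:
    """Lazily pair names with values; zip pulls one item at a time."""
    return zip(names, values)
```

	Chaining `paired(names, scaled(values, 2.0))` builds no intermediate tuple. The single pass happens only when the caller iterates over the result.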
2022-05-16	Run computation in one-shot asynchronous process	Frederick Muriuki Muriithi
	After reworking the worker/runner to have a one-shot mode, add a function that queues up the task and then runs the worker in the one-shot mode to process the computation in the background.