aboutsummaryrefslogtreecommitdiff
path: root/gn3/db
AgeCommit message (Collapse)Author
2021-09-01Parse data lines into markersMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/db/genotypes.py: parse data lines in file to genetic markers. * tests/unit/db/test_genotypes.py: test that parsing works. Add some tests to check that the parsing of the markers works as expected, and add the code to actually parse the markers.
2021-09-01Parse the genotype file's data headerMuriithi Frederick Muriuki
* gn3/db/genotypes.py: parse data header * tests/unit/db/test_genotypes.py: check that header's parse works correctly. Add tests to check that the parser works as expected. Add code to implement the parsing and pass the tests.
2021-09-01Implement parsing of genotype labelsMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/db/genotypes.py: parse genotype labels * tests/unit/db/test_genotypes.py: test that genotype labels are parsed correctly As part of parsing the genotype files into usable python data structures, this commit adds a function to parse the label lines (beginning with "@") into the appropriate values.
2021-08-31Fix linting errors, minor bugs and reorganise codeMuriithi Frederick Muriuki
* Fix some linting errors and some minor bugs caught by the linter. Move the `random_string` function to separate module for use in multiple places in the code.
2021-08-31Update `heatmap_data` function: remove extraneous dataMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/computations/heatmap.py: update function * gn3/db/traits.py: new function Remove extraneous data and arguments from the function. - Load the genotype file - Generate traits file - Provide both raw traits data, and exported traits data in return
2021-08-31Provide utilities for genotype filesMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/db/genotypes.py: New module * gn3/settings.py: Add new configuration variable * qtlfilesexport.py: Test out new code Add a module containing functions dealing with the genotype files. Add a configuration variable to point to the location of the genotype files.
2021-08-17Provide top-level `riset` key-value pairMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Provide the expected, top-level `riset` key-value pair and eliminate the redundant key-value pair.
2021-08-17Fix errors: add in missing parenthesisMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Call the `cursor.fetchone()` function to get results. Without the parenthesis, the code was trying to use the function itself as the results, which was a bug, and would lead to failure.
2021-08-09Set up the trait dataset type correctlyMuriithi Frederick Muriuki
* gn3/db/traits.py: setup `trait_dataset_type` * tests/unit/db/test_traits.py: fix tests The type ('Temp', 'Geno', 'Publish', and 'ProbeSet') relate to a trait's dataset, and not the trait itself. This commit updates the code to take this into consideration. The dataset type is also set up from a trait's full name, therefore this commit removes the `trait_type` argument from the `retrieve_trait_info` function.
2021-08-09Retrieve the trait dataMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Add functions to retrieve the `value`, `variance`, and `ndata` values for any given trait.
2021-08-09Add missing arguments. Fix typo.Muriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Fix minor bugs in the code.
2021-08-09Fix linting errorsMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Add module, class and function docstrings * Deactivate some irrelevant pylint errors * Fix indentations and line-lengths
2021-08-08Only load extra data if the traits have basic infoMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Only load the extra trait data if the basic trait information is found.
2021-08-08Merge branch 'main' of github.com:genenetwork/genenetwork3 into ↵Muriithi Frederick Muriuki
heatmap_decompose_db_retrieval
2021-08-05db: traits: Return unique values when fetching sample dataBonfaceKilz
2021-08-05Reorganise the database codeMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Reorganise the code to separate the datasets from the traits, and to more closely conform to the same flow as that in GN1
2021-08-05Build up trait_name items from full nameMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * The full name of the traits from search contains multiple parts to it, and as such, we use it to retrieve the appropriate data and set it up in the final trait_info dictionary that is produced.
2021-08-04Fix issues caught by pylintMuriithi Frederick Muriuki
* gn3/computations/slink.py: remove unused imports * gn3/db/traits.py: remove unnecessary `else` clauses * tests/unit/db/test_traits.py: add docstrings for functions
2021-08-04Retrieve the RISet and RISet ID valuesMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Retrieve the RISet and RISet ID values from the database.
2021-08-04Add tests for post-processing functionsMuriithi Frederick Muriuki
Issues: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Add missing tests for some post-processing functions
2021-08-04Avoid string interpolation: use prepared statementMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Following Arun's comment at https://github.com/genenetwork/genenetwork3/pull/31#issuecomment-890915813 this commit eliminates string interpolation, and adds a map of tables for the various types of traits dataset names
2021-07-30Rework db functions to enable postprocessingMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Rework the database functions to return a dict of key-value pairs, which eases the postprocessing of the trait information. The postprocessing is mainly to try an maintain data compatibility with the code that is at the following locations: https://github.com/genenetwork/genenetwork1/blob/master/web/webqtl/base/webqtlTrait.py https://github.com/genenetwork/genenetwork1/blob/master/web/webqtl/base/webqtlDataset.py https://github.com/genenetwork/genenetwork1/blob/master/web/webqtl/heatmap/Heatmap.py This was mainly a proof-of-concept, and the functions do not have testing added for them: there is therefore need to add testing for the new functions, and probably even rework them if they are found to be complicated.
2021-07-30Return dict from query functionsMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/db/traits.py: return dicts rather than tuples/list * tests/unit/db/test_traits.py: Update tests Return dicts with the key-value pairs set up so as to ease with the data manipulation down the pipeline. This is also useful to help with the retrieval of all other extra information that was left out in the first iteration. This commit also updates the tests by ensuring they expect dicts rather than tuples.
2021-07-29Merge branch 'main' into Feature/Update-db-from-csv-dataBonfaceKilz
2021-07-29Delete "update_raw" and it's test-casesBonfaceKilz
2021-07-29Add method for updating values from a sample datasetBonfaceKilz
* gn3/db/traits.py (update_sample_data): New function. * tests/unit/db/test_traits.py: New test cases for ^^.
2021-07-29Add type annotations to the functionMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Add some type annotations to the functions to reduce the chances of bugs creeping into the code.
2021-07-29Retrieve trait informationMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/db/traits.py: add functions to retrieve traits information * tests/unit/db/test_traits.py: add tests for new function Add functions to retrieve traits information as is done in genenetwork1 https://github.com/genenetwork/genenetwork1/blob/master/web/webqtl/base/webqtlTrait.py#L397-L456 At this point, the data retrieval functions are probably incomplete, as there is more of the `retrieveInfo` function in GN1 that has not been considered as of this commit.
2021-07-29Make name retrieval more generalMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/db/traits.py: make function more general * tests/unit/db/test_traits.py: parametrize the tests Make the name retrieval more general for the different types of traits by changing the column specification and table as appropriate.
2021-07-29Retrieve 'ProbeSet' trait nameMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/db/traits.py: new function (retrieve_probeset_trait_name) * tests/unit/db/test_traits.py: test(s) for new function Add a function to retrieve the name of a 'ProbeSet' trait in a manner similar to genenetwork1's retrieval of the same, as implemented here https://github.com/genenetwork/genenetwork1/blob/master/web/webqtl/base/webqtlDataset.py#L140-154 Unlike in genenetwork1, we do not mutate an object, instead, we return the values as retrieved from the database, and the caller will deal with the returned values as appropriate.
2021-07-29db: traits: Remove publishdata columnBonfaceKilz
2021-07-26gn3: db: Create a raw update queryBonfaceKilz
* gn3/db/__init__.py (update_raw): New function.
2021-07-26db: traits: Fetch sample_data from a trait in csv formBonfaceKilz
2021-07-26db: traits: Remove unused functionsBonfaceKilz
2021-07-10gn3: db: Use correct type for columns arg in fetch functionsBonfaceKilz
2021-07-10Fix pylint issuesBonfaceKilz
2021-07-10gn3: db: Add extra argument to specify column in fetch statementsBonfaceKilz
2021-07-10db: phenotypes: Add Probeset data structuresBonfaceKilz
* gn3/db/phenotypes.py (Probeset): New dataclass. (probeset_mapping): New dict. * gn3/db/__init__.py: Add probeset_mapping and Probeset.
2021-06-07Rename json_data column to json_diff_dataBonfaceKilz
2021-06-07gn3: db: Fix how columns from tables is resolvedBonfaceKilz
2021-06-07gn3: db: Add "id_" property to metadata_audit class and mappingBonfaceKilz
2021-06-07gn3: db: Add "fetchall" method.BonfaceKilz
2021-06-07gn3: metadata_audit: Make props for MetadataAudit class optionalBonfaceKilz
2021-06-07gn3: db: Make "WHERE" clause optionalBonfaceKilz
* gn3/db/__init__.py (fetchone): Make "WHERE" an Optional arg.
2021-06-07gn3: db: Use correct DATACLASSMAP entry from metadata_auditBonfaceKilz
2021-06-07gn3: db: sort importsBonfaceKilz
2021-06-03gn3: db: Remove "escape_string" from importsBonfaceKilz
We use prepared statements, so no need to have this.
2021-06-03Use prepared statements for FETCH sql functionBonfaceKilz
2021-06-03gn3: db: Replace items() with keys()BonfaceKilz
* gn3/db/__init__.py (diff_from_dict): We only use the keys of the dict!
2021-06-03Use prepared statements for UPDATE sql functionBonfaceKilz