aboutsummaryrefslogtreecommitdiff
path: root/gn3/computations
AgeCommit message (Collapse)Author
2022-01-05Merge pull request #64 from jgarte/type-hint-normalize-valuesBonfaceKilz
Adds type hint for normalize_values function
2021-12-24Fix typing errorsFrederick Muriuki Muriithi
2021-12-24Fix linting errorsFrederick Muriuki Muriithi
2021-12-24Fix sortingFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * Update the sorting algorithm, for literature and tissue correlations so that it sorts the results by the correlation value first then by the p-value next.
2021-12-24Return the correlation method usedFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * Return the correlation method used
2021-12-24Reduce the total amount of data to be outputFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * There is a lot of data that is not necessary in the final result. This commit removes that data, retaining only data relevant for the display.
2021-12-24Add dataset type to the resultsFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * The dataset type is relevant for the display of the data, therefore, this commit presents the dataset type as part of the results.
2021-12-17Add "success" status to final computation resultsFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
2021-12-14linting: Fix obvious linting issuesFrederick Muriuki Muriithi
2021-12-14mypy: ignore some imports and errorsFrederick Muriuki Muriithi
* Ignore some missing library stubs * Ignore some typing errors * Fix obvious typing errors
2021-12-14Adds type hint for normalize_values functionjgart
2021-12-14TO REVERT: Add logging to see data frameFrederick Muriuki Muriithi
2021-12-14Remove any items with less than 3 samplesFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * pingouin raises an exception whenever one attempts to use it to compute the partial correlation with data that has less than 3 samples.
2021-12-14Fix dataset: use target dataset not primaryFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * Use the target dataset to load the target traits, not the primary trait's dataset, since they might differ.
2021-12-13Provide missing functionFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * Import the missing function.
2021-12-13Fix the removal of controls for corresponding Nones in targetsFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * Fix the code, so that it removes all control values, whose corresponding target values are None, without throwing an error.
2021-12-13Return the primary and control traits in addition to resultsFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * In addition to the partial correlation results, this commit enables the return of the chosen primary trait and the selected control traits. This data is required for presentation on the results page.
2021-12-13Run partial correlations against chosen databaseFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * Run the partial correlations against the database that the user selects, and not the one from which the primary trait is. This was a bug in the code.
2021-12-10format normalize function doc stringAlexander Kabui
2021-12-10minor pr fixesAlexander Kabui
2021-12-10rename variablesAlexander Kabui
2021-12-10try and catch for non matching sample keysAlexander Kabui
2021-12-10update function docs for normalizing strain valuesAlexander Kabui
2021-12-10fix bug:unpacking error when generator returns empty listAlexander Kabui
2021-12-09Prevent error on no result. Fix indexingFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * If the dataset name is not found, don't cause an exception, instead, return the provided search name. * Use the correct inner object
2021-12-08Provide group from primary traitFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * From the collections page, the group is not present, so this commit retrieves the group value from the primary trait.
2021-11-29Fix linting errorsFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
2021-11-29Provide entry-point function for the partial correlationsFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * Provide the entry-point function to the partial correlation feature. This is the function that ochestrates the fetching of the data, and processing it for output by the API endpoint (to be implemented).
2021-11-23Fix a myriad of linting errorsFrederick Muriuki Muriithi
* Fix linting errors like: - Unused variables - Undeclared variable errors (mostly caused by typos, and wrong names) - Missing documentation strings for functions etc.
2021-11-23Migrate `getPartialCorrelationsNormal`Frederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * Migrate the `web.webqtl.correlation.PartialCorrDBPage.getPartialCorrelationsNormal` function in GN1. * Remove function obsoleted by newer implementation of the code
2021-11-19Avoid rounding: compare floats approximatelyFrederick Muriuki Muriithi
Notes: https://github.com/genenetwork/genenetwork3/pull/56#issuecomment-973798918 * As mentioned in the notes, rather than rounding to an arbitrary number of decimal places, it is a much better practice to use approximate comparisons of floats for the tests.
2021-11-18Fix some linting errorsFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * Fix some obvious linting errors and remove obsolete code
2021-11-18Replace code migrated from R with pingouin functionsFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * Replace the code that was in the process of being migrated from R in GeneNetwork1 with calls to pingouin functions that achieve the same thing. Since the functions in this case are computing correlations and partial correlations, rather than having home-rolled functions to do that, this commit makes use of the tried and tested pingouin functions. This avoids complicating our code with edge-case checks, and leverages the performance optimisations done in pingouin.
2021-11-15Fix bugs in recursive partial correlationsFrederick Muriuki Muriithi
* gn3/computations/partial_correlations.py: Remove rounding. Fix computation of remaining covariates * tests/unit/computations/partial_correlations_test_data/pcor_rec_blackbox_test.txt: reduce the number of covariates to between one (1) and three (3) * tests/unit/computations/test_partial_correlations.py: fix some minor bugs It turns out that the computation complexity increases exponentially, with the number of covariates. Therefore, to get a somewhat sensible test time, while retaining a large-ish number of tests, this commit reduces the number of covariates to between 1 and 3.
2021-11-15Fix the columns in built data frameFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * When the z value is a Sequence of sequences of values, each of the internal sequences should form a column of its own, and not a row, as it was originally set up to do.
2021-11-12Merge branch 'main' of github.com:genenetwork/genenetwork3 into ↵Frederick Muriuki Muriithi
partial-correlations
2021-11-11Reimplement correlations2.compute_correlation using pearsonr.Arun Isaac
correlations2.compute_correlation computes the Pearson correlation coefficient. Outsource this computation to scipy.stats.pearsonr. When the inputs are constant, the Pearson correlation coefficient does not exist and is represented by NaN. Update the tests to reflect this. * gn3/computations/correlations2.py: Remove import of sqrt from math. (compute_correlation): Reimplement using scipy.stats.pearsonr. * tests/unit/computations/test_correlation.py: Import math. (TestCorrelation.test_compute_correlation): When inputs are constant, set expected correlation coefficient to NaN.
2021-11-11Reimplement __items_with_values using list comprehension.Arun Isaac
* gn3/computations/correlations2.py: Remove import of reduce from functools. (__items_with_values): Reimplement using list comprehension.
2021-11-11pylint fixes and pep8 formattingAlexander Kabui
2021-11-11fix target and base sample data orderAlexander Kabui
2021-11-11fix:spawned processes memory issuesAlexander Kabui
2021-11-11replace list with generatorsAlexander Kabui
2021-11-09Implement remaining part of `partial_correlation_recursive` functionFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * gn3/computations/partial_correlations.py: implement remaining portion of `partial_correlation_recursive` function. * tests/unit/computations/test_partial_correlations.py: add parsing for new data format and update tests
2021-11-09Fix bug: if three columns, ensure last is "z"Frederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * Fix a bug, caught when the function is called in a recursive form, with the "z*" columns reducing for each cycle through the recursive form. As it was, the last cycle through the recursive form would end up with a DataFrame with the columns "x", "y", and "z0" rather than the columns "x", "y", "z". This commit handles that edge case to ensure that the column name is changed from "z0" to simply "z".
2021-11-04Create blackbox tests for some functions migrated from RFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * gn3/computations/partial_correlations.py: new stub functions (partial_correlation_matrix, partial_correlation_recursive) * tests/unit/computations/partial_correlations_test_data/pcor_mat_blackbox_test.csv: blackbox sample data and results for variance-covariance matrix method * tests/unit/computations/partial_correlations_test_data/pcor_rec_blackbox_test.csv: blackbox sample data and results for recursive method * tests/unit/computations/test_partial_correlations.py: Tests for new function Provide some blackbox testing sample data for checking the operation of the functions migrated from R.
2021-11-04Stub `determine_partials`Frederick Muriuki Muriithi
Issue: * Stub out `determine_partials` which is a migration of `web.webqtl.correlation.correlationFunction.determinePartialsByR` in GN1. The function in GN1 has R code from line 188 to line 344. This will need to be converted over to Python. This function will also need tests.
2021-11-04Implement `compute_partial_correlations_fast`Frederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * Implement `compute_partial_correlations_fast` that is a partial migration of `web.webqtl.correlation.PartialCorrDBPage.getPartialCorrelationsFast` in GN1. This function will probably be reworked once the dependencies are fully migrated. It also needs tests to be added.
2021-11-04Retrieve indices of the selected samplesFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * gn3/computations/partial_correlations.py: New function (good_dataset_samples_indexes). * tests/unit/computations/test_partial_correlations.py: Tests for new function (good_dataset_samples_indexes) Get the indices of the selected samples. This is a partial migration of the `web.webqtl.correlation.PartialCorrDBPage.getPartialCorrelationsFast` function in GN1.
2021-11-04Explicitly round the valuesFrederick Muriuki Muriithi
* Explicitly round the values to prevent issues with the type-checker
2021-11-04Specify ten (10) decimal placesFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi * gn3/computations/partial_correlations.py: specify 10 decimal places * tests/unit/computations/test_partial_correlations.py: update examples Slight differences in python implementations, possibly hardware and operating systems could cause the value of float (double) values to be different in the less significant parts of the decimal places. This commit limits the usable part of the decimals to the first 10 decimal places for now.