Age | Commit message | Author |
2022-02-08 | Merge iterations to remove unnecessary computations...Do all the work in a single iteration to avoid unnecessary iterations that
hamper performance.
| Frederick Muriuki Muriithi |
2022-02-08 | Remove multiprocessing for stability...Web servers are long-running processes, and Python is not very good at
cleaning up after itself, especially in forked processes. This leads to memory
errors in the web server after a while.
This commit removes the use of multiprocessing to avoid such failures.
| Frederick Muriuki Muriithi |
2022-02-08 | Give sorting functions more descriptive names | Frederick Muriuki Muriithi |
2022-02-08 | Use multiprocessing to speed up computation...This commit refactors the code to make it possible to use multiprocessing to
speed up the computation of the partial correlations.
The major refactor is to move the `__compute_trait_info__` function to the
top-level of the module, and provide to it all the other necessary context via
the new args.
| Frederick Muriuki Muriithi |
2022-02-08 | Remove unnecessary computation...In Python3 when slicing,
seq[:min(some_val, len(seq))] == seq[:some_val]
because Python3 simply returns a copy of the entire sequence if `some_val`
is greater than the length of the sequence.
This commit removes the unnecessary call to `min()`
| Frederick Muriuki Muriithi |
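The slicing behaviour described in the commit above can be checked directly. This is a minimal sketch; `take_with_min` and `take` are hypothetical helper names used only for illustration, not functions from the repository.

```python
def take_with_min(seq, some_val):
    """Slice with an explicit (redundant) bound check."""
    return seq[:min(some_val, len(seq))]


def take(seq, some_val):
    """Slice directly; Python clamps an oversized stop index to len(seq)."""
    return seq[:some_val]


values = [1, 2, 3]
# Both forms agree whether the stop index is below, at, or beyond the length.
for stop in (0, 2, 3, 10):
    assert take_with_min(values, stop) == take(values, stop)
```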
2022-01-10 | Use the correct letter case for the keys...* Use the correct case for the keys in order to retrieve the correct values.
| Frederick Muriuki Muriithi |
2022-01-10 | Indicate that string is an f-string...* The string had the f-string syntax to format the values to be inserted into
the string, but was missing the 'f' before the opening quotes to signal to
Python that this was an f-string. This commit fixes that.
| Frederick Muriuki Muriithi |
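The kind of bug fixed by the f-string commit above is easy to reproduce. The snippet below is an illustrative sketch; the variable names and values are made up and do not come from the repository.

```python
name, count = "BXD", 3  # illustrative values only

# Without the 'f' prefix the braces are kept literally.
plain = "Group {name} has {count} traits"

# With the 'f' prefix the values are interpolated into the string.
formatted = f"Group {name} has {count} traits"
```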
2022-01-10 | Remove all pairs with 'None' as the value...* Remove all key-value pairs whose value is None.
| Frederick Muriuki Muriithi |
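Removing the pairs whose value is None, as in the commit above, can be sketched with a dictionary comprehension. `remove_none_values` is a hypothetical name, and the sample keys are invented for illustration.

```python
def remove_none_values(mapping):
    """Return a new dict without the pairs whose value is None."""
    return {key: value for key, value in mapping.items() if value is not None}


# Falsy values such as 0.0 or "" are kept; only None is dropped.
trait = {"name": "trait_a", "symbol": None, "mean": 0.0, "description": ""}
cleaned_trait = remove_none_values(trait)
```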
2022-01-10 | Replace unoptimised function with optimised one...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Replace unoptimised function with one optimised to give better performance.
The optimisation done here is to fetch multiple items/traits from the
database per query, rather than the original form, which fetched a single
item/trait from the database per query.
| Frederick Muriuki Muriithi |
2022-01-10 | Convert NaN to None...Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
Comment:
https://github.com/genenetwork/genenetwork3/pull/67#issuecomment-1000828159
* Convert NaN values to None to avoid possible bugs with the string replace
method used before.
| Frederick Muriuki Muriithi |
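One way to convert NaN values to None without going through string replacement is sketched below. This is an assumption about the general approach, not the code from the commit above; `nan_to_none` and the sample row are hypothetical.

```python
import math


def nan_to_none(value):
    """Return None for a float NaN; return any other value unchanged."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


row = {"corr": float("nan"), "p_value": 0.04, "name": "trait_a"}
cleaned = {key: nan_to_none(value) for key, value in row.items()}
```

Checking with `math.isnan` is needed because NaN never compares equal to itself, so `value == float("nan")` is always false.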
2022-01-10 | Rework database functions to fetch multiple items...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* In an attempt to optimise the performance of the partial correlations
feature, this commit reworks some database access functions to fetch
multiple items from the database, per query, unlike their original forms
which would fetch a single item per query.
This reduces queries to the database, and should hopefully improve the
responsiveness of the partial correlations feature.
| Frederick Muriuki Muriithi |
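The difference between fetching one item per query and fetching many items per query can be sketched as follows. The example uses an in-memory SQLite database so that it is self-contained; the `traits` table, its columns, and both function names are hypothetical and do not reflect the project's actual schema or database driver.

```python
import sqlite3


def fetch_one_per_query(conn, names):
    """Original pattern: one round trip to the database per trait."""
    rows = []
    for name in names:
        row = conn.execute(
            "SELECT name, value FROM traits WHERE name = ?", (name,)
        ).fetchone()
        if row is not None:
            rows.append(row)
    return rows


def fetch_many_per_query(conn, names):
    """Batched pattern: a single query covering all requested traits."""
    if not names:
        return []
    placeholders = ", ".join("?" for _ in names)
    query = (
        "SELECT name, value FROM traits "
        f"WHERE name IN ({placeholders}) ORDER BY name"
    )
    return conn.execute(query, tuple(names)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE traits (name TEXT PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO traits VALUES (?, ?)",
    [("trait_a", 1.5), ("trait_b", 2.5), ("trait_c", 3.5)],
)
```

Only the placeholders are interpolated into the query text; the trait names themselves are still passed as parameters.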
2021-12-24 | Fix typing errors | Frederick Muriuki Muriithi |
2021-12-24 | Fix sorting...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Update the sorting algorithm for literature and tissue correlations so that
it sorts the results by the correlation value first, then by the p-value.
| Frederick Muriuki Muriithi |
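A two-level sort of this kind can be sketched with a tuple key. The commit message does not state the sort direction, so the choice below (largest absolute correlation first, then smallest p-value) is an assumption, as are the `sort_results` name and the `corr` and `p_value` keys.

```python
def sort_results(results):
    """Sort by descending |correlation|, breaking ties by ascending p-value."""
    return sorted(
        results, key=lambda item: (-abs(item["corr"]), item["p_value"])
    )


results = [
    {"name": "a", "corr": 0.5, "p_value": 0.03},
    {"name": "b", "corr": -0.9, "p_value": 0.01},
    {"name": "c", "corr": 0.5, "p_value": 0.01},
]
ordered_names = [item["name"] for item in sort_results(results)]
```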
2021-12-24 | Return the correlation method used...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Return the correlation method used
| Frederick Muriuki Muriithi |
2021-12-24 | Reduce the total amount of data to be output...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* There is a lot of data that is not necessary in the final result. This
commit removes that data, retaining only data relevant for the display.
| Frederick Muriuki Muriithi |
2021-12-24 | Add dataset type to the results...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* The dataset type is relevant for the display of the data, therefore, this
commit presents the dataset type as part of the results.
| Frederick Muriuki Muriithi |
2021-12-17 | Add "success" status to final computation results...Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
| Frederick Muriuki Muriithi |
2021-12-14 | linting: Fix obvious linting issues | Frederick Muriuki Muriithi |
2021-12-14 | mypy: ignore some imports and errors...* Ignore some missing library stubs
* Ignore some typing errors
* Fix obvious typing errors
| Frederick Muriuki Muriithi |
2021-12-14 | TO REVERT: Add logging to see data frame | Frederick Muriuki Muriithi |
2021-12-14 | Remove any items with fewer than 3 samples...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* pingouin raises an exception whenever one attempts to use it to compute the
partial correlation with data that has fewer than 3 samples.
| Frederick Muriuki Muriithi |
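Filtering out items with too few samples before the computation can be sketched as below. The threshold of 3 comes from the commit message; the function names, the data layout, and the decision to count only non-None values are assumptions made for illustration.

```python
MINIMUM_SAMPLES = 3  # threshold taken from the commit message


def has_enough_samples(values):
    """True when at least MINIMUM_SAMPLES values are present (not None)."""
    return sum(value is not None for value in values) >= MINIMUM_SAMPLES


def drop_small_traits(traits):
    """Keep only the traits with enough samples for a partial correlation."""
    return {
        name: values
        for name, values in traits.items()
        if has_enough_samples(values)
    }


traits = {
    "trait_a": (1.0, 2.0, 3.0, 4.0),
    "trait_b": (1.0, None, None, 2.0),  # only two usable samples
    "trait_c": (0.5, 0.7, None, 0.9),
}
usable = drop_small_traits(traits)
```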
2021-12-14 | Fix dataset: use target dataset not primary...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Use the target dataset to load the target traits, not the primary trait's
dataset, since they might differ.
| Frederick Muriuki Muriithi |
2021-12-13 | Provide missing function...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Import the missing function.
| Frederick Muriuki Muriithi |
2021-12-13 | Fix the removal of controls for corresponding Nones in targets...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Fix the code so that it removes all control values whose corresponding
target values are None, without throwing an error.
| Frederick Muriuki Muriithi |
2021-12-13 | Return the primary and control traits in addition to results...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* In addition to the partial correlation results, this commit enables the
return of the chosen primary trait and the selected control traits. This
data is required for presentation on the results page.
| Frederick Muriuki Muriithi |
2021-12-13 | Run partial correlations against chosen database...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Run the partial correlations against the database that the user selects, and
not the one that the primary trait comes from. This was a bug in the code.
| Frederick Muriuki Muriithi |
2021-12-09 | Prevent error on no result. Fix indexing...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* If the dataset name is not found, don't raise an exception; instead, return
the provided search name.
* Use the correct inner object
| Frederick Muriuki Muriithi |
2021-12-08 | Provide group from primary trait...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* From the collections page, the group is not present, so this commit
retrieves the group value from the primary trait.
| Frederick Muriuki Muriithi |
2021-11-29 | Fix linting errors...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
| Frederick Muriuki Muriithi |
2021-11-29 | Provide entry-point function for the partial correlations...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Provide the entry-point function to the partial correlation feature. This is
the function that orchestrates the fetching of the data, and processing it
for output by the API endpoint (to be implemented).
| Frederick Muriuki Muriithi |
2021-11-23 | Fix a myriad of linting errors...* Fix linting errors like:
- Unused variables
- Undeclared variable errors (mostly caused by typos, and wrong names)
- Missing documentation strings for functions
etc.
| Frederick Muriuki Muriithi |
2021-11-23 | Migrate `getPartialCorrelationsNormal`...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Migrate the
`web.webqtl.correlation.PartialCorrDBPage.getPartialCorrelationsNormal`
function in GN1.
* Remove function obsoleted by newer implementation of the code
| Frederick Muriuki Muriithi |
2021-11-19 | Avoid rounding: compare floats approximately...Notes:
https://github.com/genenetwork/genenetwork3/pull/56#issuecomment-973798918
* As mentioned in the notes, rather than rounding to an arbitrary number of
decimal places, it is a much better practice to use approximate comparisons
of floats for the tests.
| Frederick Muriuki Muriithi |
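The idea of comparing floats approximately rather than rounding them can be sketched with the standard library. The tolerance below is an arbitrary illustrative choice, not the one used in the project's tests.

```python
import math

computed = 0.1 + 0.2
expected = 0.3

# Exact equality fails because of binary floating-point representation.
exactly_equal = computed == expected

# An approximate comparison with a relative tolerance succeeds.
approximately_equal = math.isclose(computed, expected, rel_tol=1e-9)
```

Unlike rounding, a tolerance-based check does not fail when two nearly identical values happen to fall on either side of a rounding boundary.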
2021-11-18 | Fix some linting errors...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Fix some obvious linting errors and remove obsolete code
| Frederick Muriuki Muriithi |
2021-11-18 | Replace code migrated from R with pingouin functions...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Replace the code that was in the process of being migrated from R in
GeneNetwork1 with calls to pingouin functions that achieve the same thing.
Since the functions in this case are computing correlations and partial
correlations, rather than having home-rolled functions to do that, this
commit makes use of the tried and tested pingouin functions.
This avoids complicating our code with edge-case checks, and leverages the
performance optimisations done in pingouin.
| Frederick Muriuki Muriithi |
2021-11-15 | Fix bugs in recursive partial correlations...* gn3/computations/partial_correlations.py: Remove rounding. Fix computation
of remaining covariates
*
tests/unit/computations/partial_correlations_test_data/pcor_rec_blackbox_test.txt:
reduce the number of covariates to between one (1) and three (3)
* tests/unit/computations/test_partial_correlations.py: fix some minor bugs
It turns out that the computational complexity increases exponentially with
the number of covariates. Therefore, to get a somewhat sensible test time,
while retaining a large-ish number of tests, this commit reduces the number
of covariates to between 1 and 3.
| Frederick Muriuki Muriithi |
2021-11-15 | Fix the columns in built data frame...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* When the z value is a Sequence of sequences of values, each of the internal
sequences should form a column of its own, and not a row, as it was
originally set up to do.
| Frederick Muriuki Muriithi |
2021-11-09 | Implement remaining part of `partial_correlation_recursive` function...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* gn3/computations/partial_correlations.py: implement remaining portion of
`partial_correlation_recursive` function.
* tests/unit/computations/test_partial_correlations.py: add parsing for new
data format and update tests
| Frederick Muriuki Muriithi |
2021-11-09 | Fix bug: if three columns, ensure last is "z"...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Fix a bug, caught when the function is called in a recursive form, with the
"z*" columns reducing for each cycle through the recursive form.
As it was, the last cycle through the recursive form would end up with a
DataFrame with the columns "x", "y", and "z0" rather than the columns "x",
"y", "z".
This commit handles that edge case to ensure that the column name is changed
from "z0" to simply "z".
| Frederick Muriuki Muriithi |
2021-11-04 | Partially implement `partial_correlation_recursive`...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* gn3/computations/partial_correlations.py: Implement one path for the
`gn3.computations.partial_correlations.partial_correlation_recursive`
function.
* gn3/settings.py: Add a setting for how many decimal places to round to
* tests/unit/computations/test_partial_correlations.py: Update test to take
the number of decimal places into consideration
Implement a single path (where the z value is a vector and not a matrix) for
the `partial_correlation_recursive` function.
| Frederick Muriuki Muriithi |
2021-11-04 | Implement `build_data_frame`...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* gn3/computations/partial_correlations.py: new function (`build_data_frame`)
* tests/unit/computations/test_partial_correlations.py: Add tests for new
function
Add a new function to build a pandas DataFrame object from the provided
values:
- x: a vector of floats (represented with a tuple of floats)
- y: a vector of floats (represented with a tuple of floats)
- z: a vector OR matrix of floats (represented with a tuple of floats or a
tuple of tuples of floats)
| Frederick Muriuki Muriithi |
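The column layout described for `build_data_frame` can be sketched without pandas. The actual function builds a pandas DataFrame; the sketch below returns a plain dict of columns to stay dependency-free. `build_columns` is a hypothetical name, and the "z0", "z1" naming for matrix columns is an assumption based on the column names mentioned in a later commit in this log.

```python
def build_columns(xdata, ydata, zdata):
    """Lay out x, y and z as named columns.

    A vector z becomes a single "z" column; a matrix z (a tuple of tuples)
    becomes one column per inner tuple, named "z0", "z1", and so on.
    """
    columns = {"x": tuple(xdata), "y": tuple(ydata)}
    if zdata and isinstance(zdata[0], (tuple, list)):
        for index, column in enumerate(zdata):
            columns[f"z{index}"] = tuple(column)
    else:
        columns["z"] = tuple(zdata)
    return columns


vector_case = build_columns((1.0, 2.0), (3.0, 4.0), (5.0, 6.0))
matrix_case = build_columns(
    (1.0, 2.0), (3.0, 4.0), ((5.0, 6.0), (7.0, 8.0))
)
```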
2021-11-04 | Create blackbox tests for some functions migrated from R...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* gn3/computations/partial_correlations.py: new stub
functions (partial_correlation_matrix, partial_correlation_recursive)
*
tests/unit/computations/partial_correlations_test_data/pcor_mat_blackbox_test.csv:
blackbox sample data and results for variance-covariance matrix method
*
tests/unit/computations/partial_correlations_test_data/pcor_rec_blackbox_test.csv:
blackbox sample data and results for recursive method
* tests/unit/computations/test_partial_correlations.py: Tests for new function
Provide some blackbox testing sample data for checking the operation of the
functions migrated from R.
| Frederick Muriuki Muriithi |
2021-11-01 | Stub `determine_partials`...Issue:
* Stub out `determine_partials` which is a migration of
`web.webqtl.correlation.correlationFunction.determinePartialsByR` in GN1.
The function in GN1 has R code from line 188 to line 344. This will need to
be converted over to Python.
This function will also need tests.
| Frederick Muriuki Muriithi |
2021-11-01 | Implement `compute_partial_correlations_fast`...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Implement `compute_partial_correlations_fast` that is a partial migration of
`web.webqtl.correlation.PartialCorrDBPage.getPartialCorrelationsFast` in
GN1.
This function will probably be reworked once the dependencies are fully
migrated.
It also needs tests to be added.
| Frederick Muriuki Muriithi |
2021-11-01 | Retrieve indices of the selected samples...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* gn3/computations/partial_correlations.py: New
function (good_dataset_samples_indexes).
* tests/unit/computations/test_partial_correlations.py: Tests for new
function (good_dataset_samples_indexes)
Get the indices of the selected samples. This is a partial migration of the
`web.webqtl.correlation.PartialCorrDBPage.getPartialCorrelationsFast`
function in GN1.
| Frederick Muriuki Muriithi |
2021-10-29 | Explicitly round the values...* Explicitly round the values to prevent issues with the type-checker
| Frederick Muriuki Muriithi |
2021-10-29 | Specify ten (10) decimal places...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* gn3/computations/partial_correlations.py: specify 10 decimal places
* tests/unit/computations/test_partial_correlations.py: update examples
Slight differences in Python implementations, and possibly in hardware and
operating systems, could cause float (double) values to differ in their
less significant decimal places.
This commit limits the usable part of the decimals to the first 10 decimal
places for now.
| Frederick Muriuki Muriithi |
2021-10-29 | Fix linting and typing errors...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
| Frederick Muriuki Muriithi |
2021-10-29 | Complete `build_temporary_tissue_correlations_table`...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* gn3/computations/partial_correlations.py: Remove comments after updating
usage of the function at call point
* gn3/db/correlations.py: Complete the implementation of the
`build_temporary_tissue_correlations_table` function
| Frederick Muriuki Muriithi |
2021-10-29 | Complete implementation of `batch_computed_tissue_correlation`...Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Complete the implementation of the `batch_computed_tissue_correlation`
function
| Frederick Muriuki Muriithi |