Age | Commit message | Author |
|
* Add a new script to compute the partial correlations against:
- a select list of traits, or
- an entire dataset
depending on the specified subcommand. This new script is meant to supersede
the `scripts/partial_correlations.py` script.
* Fix the check for errors
* Reorganise the order of arguments for the
`partial_correlations_with_target_traits` function: move the `method`
argument before the `target_trait_names` argument so that the common
arguments in the partial correlation computation functions share the same
order.
|
|
|
|
After reworking the worker/runner to have a one-shot mode, add a function that
queues up the task and then runs the worker in the one-shot mode to process
the computation in the background.
|
|
|
|
Enable the endpoint to actually compute partial correlations with selected
target traits rather than against an entire dataset.
Fix some issues caused by a recent refactor that broke pcorrs against a dataset.
|
|
returning a string"
This reverts commit b93b22386056347d8002dd2e403425beeb4657cd.
The appropriate fix should have been in GN2. The original statement
args = request.get_json()
was correct, since `request.get_json()` should return a Python object parsed
from the JSON string in the request. Unfortunately, GN2 was encoding the
request data twice, which led to the call returning a JSON-encoded string
instead of the expected object.
The issue has been fixed in GN2, so the "fix" here can be reverted.
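The double-encoding problem described above can be reproduced with the standard `json` module alone. This is a minimal sketch; the payload and its key are hypothetical and are not taken from GN2 or GN3.

```python
import json

# Hypothetical request body, for illustration only.
payload = {"primary_trait": "example::trait"}

encoded_once = json.dumps(payload)
encoded_twice = json.dumps(encoded_once)  # the accidental second encoding

# Decoding once-encoded data gives back the original object.
decoded_once = json.loads(encoded_once)

# Decoding twice-encoded data only undoes the outer layer, so the result
# is still a JSON-encoded string rather than a dict.
decoded_twice = json.loads(encoded_twice)
```

A server that parses the body once therefore sees a `str` when the client has encoded twice, which matches the symptom in the reverted commit.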
|
|
returning a string
|
|
|
|
selected
|
|
it's needed to store the proximal/distal markers for each position
|
|
api/rqtl.py
|
|
in pairscan results + renamed process_rqtl_output to process_rqtl_mapping to distinguish between that and pairscan
|
|
|
|
Fix some issues caught by tests after the changes that hand off the
partial correlations computations to an external process.
Fix some issues caused by the changes that introduce context managers for
database connections.
Update some tests to take the above two changes into consideration.
|
|
Use the `with` context manager to open database connections, to ensure
that those connections are closed once the call is completed. This should
help avoid the 'too many connections' error.
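A minimal sketch of the pattern, assuming `sqlite3` as a stand-in for the real database driver. The `database_connection` helper here is hypothetical and is not necessarily the function used in GN3.

```python
import sqlite3
from contextlib import closing, contextmanager


@contextmanager
def database_connection(uri):
    """Yield a connection and always close it, even if the body raises."""
    conn = sqlite3.connect(uri)
    try:
        yield conn
    finally:
        conn.close()


with database_connection(":memory:") as conn:
    # `closing` makes sure the cursor is released as well.
    with closing(conn.cursor()) as cursor:
        cursor.execute("SELECT 1")
        row = cursor.fetchone()
```

Once the `with` block exits, the connection is closed whether or not an exception was raised, so connections cannot accumulate across requests.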
|
|
|
|
Long-running computations are handed off to external processes. This avoids
timeouts in the webserver and reduces the chance of the webserver becoming
unstable.
The results of these long-running computations are needed eventually, so this
commit provides a way to check the state of a computation and to retrieve its
results, if any.
|
|
Run the partial correlations code in an external Python process, decoupling it
from the server and making it asynchronous.
Summary of changes:
* gn3/api/correlation.py:
- Remove response processing code
- Queue partial corrs processing
- Create new endpoint to get results
* gn3/commands.py
- Compose the pcorrs command to be run in an external process
- Enable running of subprocess commands with list args
* gn3/responses/__init__.py: new module indicator file
* gn3/responses/pcorrs_responses.py: Hold response processing code extracted
from the ~gn3/api/correlation.py~ file
* scripts/partial_correlations.py: CLI script to process the pcorrs
* sheepdog/worker.py:
- Add the *genenetwork3* path at the beginning of the ~sys.path~ list to
override any GN3 in the site-packages
- Add any environment variables to be set for the command to be run
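The hand-off to an external process with list arguments can be sketched with the standard `subprocess` module. This is an illustration under assumptions: `run_in_background` and `check_state` are hypothetical names, and the command run here is a trivial placeholder rather than the actual pcorrs script.

```python
import subprocess
import sys


def run_in_background(command):
    """Start `command` (a list of arguments) without waiting for it."""
    return subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)


def check_state(process):
    """Report whether the external process is still running."""
    return "running" if process.poll() is None else "finished"


# Placeholder command; a real caller would pass the script path and its args.
process = run_in_background([sys.executable, "-c", "print('done')"])

# Blocking here is only for the demonstration; a server would return
# immediately and check the state on a later request.
stdout, _stderr = process.communicate()
state = check_state(process)
```

Passing the command as a list avoids going through a shell, so arguments containing spaces or quotes do not need escaping.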
|
|
* Use `with` in place of plain `open`
* Use f-strings in place of `str.format()`
* Remove string interpolation from queries - provide data as query parameters
* other minor fixes
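A short sketch of why data is passed as query parameters instead of being interpolated into the query string. It assumes `sqlite3` and a hypothetical `traits` table; other drivers may use a different placeholder style than `?`.

```python
import sqlite3
from contextlib import closing

with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute("CREATE TABLE traits (name TEXT)")
    conn.execute("INSERT INTO traits VALUES (?)", ("trait_a",))

    # A value that would change the meaning of the query if it were
    # pasted into the SQL text directly.
    hostile = "trait_a' OR '1'='1"

    # As a parameter, the value is treated as data only, so nothing matches.
    rows = conn.execute(
        "SELECT name FROM traits WHERE name = ?", (hostile,)).fetchall()

    # A legitimate value still matches as expected.
    matches = conn.execute(
        "SELECT name FROM traits WHERE name = ?", ("trait_a",)).fetchall()
```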
|
|
Test that the partial correlations endpoint responds with an appropriate
"not-found" message and the corresponding 404 status code when the requested
primary trait does not exist in the database.
Summary of the changes in each file:
* gn3/api/correlation.py: generalise the building of the response
* gn3/computations/partial_correlations.py: return with a "not-found" if the
primary trait does not exist in the database
* gn3/db/partial_correlations.py: Fix a number of bugs that led to exceptions
in the case that the primary trait did not exist
* pytest.ini: register a `slow` pytest marker
* tests/integration/test_partial_correlations.py: Add a new test to check for
an appropriate 404 response in case of a primary trait that does not exist
in the database.
|
|
Add a test for the partial correlations endpoint, with:
- no data in the request
- missing items in the data
Fix the bugs caught by the test
|
|
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
Comment:
https://github.com/genenetwork/genenetwork3/pull/67#issuecomment-1000828159
* Convert NaN values to None to avoid possible bugs with the string replace
method used before.
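A minimal sketch of converting NaN values to None before serialising, so that they are emitted as `null`. The `nan_to_none` helper and the sample data are hypothetical; the actual conversion in GN3 may differ.

```python
import json
import math


def nan_to_none(value):
    """Recursively replace float NaN values with None."""
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, dict):
        return {key: nan_to_none(val) for key, val in value.items()}
    if isinstance(value, (list, tuple)):
        return [nan_to_none(item) for item in value]
    return value


# Hypothetical results, for illustration only.
results = {"corr": float("nan"), "p_values": [0.05, float("nan")]}
encoded = json.dumps(nan_to_none(results))
```

Converting the values before encoding avoids touching the serialised text, so a string that happens to contain "NaN" is never altered by accident.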
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* `NaN` is not a valid JSON value, and leads to errors in the code. This
commit replaces all `NaN` values with `null`.
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Encode bytes objects to string
* Encode NaN values to "null"
* gn3/api/correlation.py:
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Add an API endpoint for the partial correlation.
* gn3/api/correlation.py:
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Add an API endpoint for the partial correlation.
|
|
* guix.scm: Import (gnu packages rdf).
(genenetwork3)[propagated-inputs]: Add python-sparqlwrapper.
* gn3/settings.py (SPARQL_ENDPOINT): New variable.
* gn3/api/general.py: Import datasets from gn3.db.
(dataset_metadata): New API endpoint.
* gn3/db/datasets.py: Import re, Template from string, Dict and Optional from
typing, JSON and SPARQLWrapper from SPARQLWrapper, SPARQL_ENDPOINT from
gn3.settings.
(sparql_query, dataset_metadata): New functions.
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/non-clustered-heatmaps-and-flipping.gmi
* Update the request endpoint, so that it produces a vertical or horizontal
heatmap depending on the user's request.
|
|
bug/fix_rqtl_covariates
|
|
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi
* Fix issue according to review
https://github.com/genenetwork/genenetwork3/pull/37#discussion_r714549781
|
|
|
|
* Add missing function and module docstrings
* Remove unused imports
* Fix import order
* Rework some code sections to fix issues
* Disable some pylint errors.
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi
* Update the check to look for at least 2 traits before trying to generate the
heatmap.
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi
* gn3/api/heatmaps.py: Serialize the figure to JSON
* gn3/heatmaps.py: Return the figure object
Serialize the Plotly figure into JSON, and return that, so that it can be
used on the client to display the image.
|
|
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi
* gn3/api/heatmaps.py: Fix bugs in data parsing
* gn3/app.py: enable CORS
* gn3/settings.py: add flask-cors configurations
* guix.scm: Add flask-cors dependency
For easier testing of the heatmaps generation feature, this commit activates
the cross-origin resource sharing for all "localhost" origins.
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi
* gn3/api/heatmaps.py: Parse incoming data to build up correct trait names and
respond with only the computed heatmap data.
* gn3/heatmaps.py: Return only the computed data for heatmaps and clustering.
Since GN3 is supposed to handle only the data, and db-access, this commit
ensures that GN3 responds to the client with only the computed heatmap data,
and does not try to generate the heatmaps themselves.
The generation of the heatmaps will be delegated to the UI clients, such as
GeneNetwork2.
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi
* To help with demonstrating that the code is producing the expected output,
for now, we return the path to the generated html file that displays the
interactive heatmap.
At this point, it is mostly useful in the development environment. Moving
forward, we might have to actually stream the raw HTML, or if we can get the
Kaleido library packaged for GNU Guix, stream the image's binary data instead.
|
|
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi
* Integrate the heatmap generation code on the /api/heatmaps/clustered
endpoint.
The endpoint should take a JSON query of the form:
{"traits_names": [ ... ] }
where the "traits_names" value is a list of the full names of traits.
A sample query to the endpoint could be something like the following:
curl -i -X POST "http://localhost:8080/api/heatmaps/clustered" \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-d '{
"traits_names": [
"UCLA_BXDBXH_CARTILAGE_V2::ILM103710672",
"UCLA_BXDBXH_CARTILAGE_V2::ILM2260338",
"UCLA_BXDBXH_CARTILAGE_V2::ILM3140576",
"UCLA_BXDBXH_CARTILAGE_V2::ILM5670577",
"UCLA_BXDBXH_CARTILAGE_V2::ILM2070121",
"UCLA_BXDBXH_CARTILAGE_V2::ILM103990541",
"UCLA_BXDBXH_CARTILAGE_V2::ILM1190722",
"UCLA_BXDBXH_CARTILAGE_V2::ILM6590722",
"UCLA_BXDBXH_CARTILAGE_V2::ILM4200064",
"UCLA_BXDBXH_CARTILAGE_V2::ILM3140463"
]
}'
which should respond with a JSON response containing the raw binary string
for the PNG format and possibly another for the SVG format.
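The same request can be built from Python with the standard library. This is a sketch that assumes a GN3 instance listening on localhost:8080, as in the curl sample; it only constructs the request, and the lines that send it are left commented out.

```python
import json
import urllib.request

# Assumption: same host and port as the curl sample.
url = "http://localhost:8080/api/heatmaps/clustered"

# A shortened list of trait names taken from the curl sample.
body = {
    "traits_names": [
        "UCLA_BXDBXH_CARTILAGE_V2::ILM103710672",
        "UCLA_BXDBXH_CARTILAGE_V2::ILM2260338",
    ]
}

request = urllib.request.Request(
    url,
    data=json.dumps(body).encode("utf-8"),  # encode the body exactly once
    headers={
        "Accept": "application/json",
        "Content-Type": "application/json",
    },
    method="POST",
)

# To send it (requires a running server):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```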
|
|
|
|
integrated into the script yet though)
|