Age | Commit message (Collapse) | Author |
|
* gn3/api/search.py (remove_synteny_field): Tolerate WEIGHT operator in parsed
search queries.
|
|
* gn3/api/search.py: Import gzip, Path from pathlib and curry from
pymonad.tools.
(IntervalLiftoverFunction): New variable.
(query_subqueries, query_terms, field_processor_or, liftover,
liftover_interval, parse_synteny_field, is_synteny_on, remove_synteny_field):
New functions.
(parse_location_field): Generalize to support synteny searches.
(parse_query): Support synteny search queries.
(search_results): Pass synteny files directory to parse_query.
|
|
* gn3/api/search.py: Import partial and reduce from functools. Import Callable
from typing.
(ChromosomalPosition, ChromosomalInterval, FieldProcessor): New classes.
(apply_si_suffix, combine_queries, parse_location_field, interval_start,
interval_end): New functions.
(parse_query): Add field processors for location shorthands.
|
|
* gn3/api/search.py (parse_query): New function.
(search_results): Use parse_query.
|
|
* gn3/api/metadata.py (jsonify_dataset_metadata): Rewrite metadata
end-point to use a dataset's name instead of it's accession_id.
* gn3/db/rdf.py (get_dataset_metadata): Replace accession_id with
name. Use one single RDF query instead of multiple queries.
|
|
* gn3/api/metadata.py: Import URLError.
(jsonify_dataset_metadata): Add URLError in except block.
|
|
* gn3/api/general.py: Delete rdf import. Delete trailing white
spaces.
* gn3/api/metadata.py: Delete trailing white spaces.
|
|
* gn3/api/metadata.py: import RemoteDisconnected.
(jsonify_dataset_metadata): Wrap get_dataset_metadata in try block.
|
|
* gn3/api/general.py: (dataset_metadata) Delete.
* gn3/api/metadata.py: Import Blueprint, jsonify, current_app,
SPARQLWrapper and get_dataset_metadata.
(metadata): New Blueprint
(jsonify_dataset_metadata): New function/end-point.
* gn3/app.py: Import metadata
(create_app): Register metadata blueprint.
|
|
* gn3/api/general.py: Replace gn3.db.datasets import with gn3.db.rdf.
(dataset_metadata) <jsonify>: Replace datasets.dataset_metadata with
rdf.get_dataset_metadata.
* gn3/db/datasets.py: Remove unused imports.
(sparql_query, dataset_metadata): Delete.
* gn3/db/rdf.py: (sparql_query, get_dataset_metadata): New functions.
|
|
App settings should be accessed from current_app. It should not be hard-coded
to a variable in a module.
* gn3/db_utils.py: Do not import XAPIAN_DB_PATH from gn3.settings.
(xapian_database): Accept path argument.
* gn3/api/search.py: Import current_app from flask.
(search_results): Pass Xapian index path to xapian_database.
|
|
* gn3/api/search.py: New file.
* gn3/app.py: Register the search blueprint.
|
|
|
|
|
|
Use new external script to run the partial correlations for both cases,
i.e.
- against an entire dataset, or
- against selected traits
|
|
* Add a new script to compute the partial correlations against:
- a select list of traits, or
- an entire dataset
depending on the specified subcommand. This new script is meant to supercede
the `scripts/partial_correlations.py` script.
* Fix the check for errors
* Reorganise the order of arguments for the
`partial_correlations_with_target_traits` function: move the `method`
argument before the `target_trait_names` argument so that the common
arguments in the partial correlation computation functions share the same
order.
|
|
|
|
After reworking the worker/runner to have a one-shot mode, add a function that
queues up the task and then runs the worker in the one-shot mode to process
the computation in the background.
|
|
|
|
Enable the endpoint to actually compute partial correlations with selected
target traits rather than against an entire dataset.
Fix some issues caused by recent refactor that broke pcorrs against a dataset
|
|
returning a string"
This reverts commit b93b22386056347d8002dd2e403425beeb4657cd.
The appropriate fix should have been in GN2. The original statement
args = request.get_json()
was correct, since `request.get_json()` should return a python object parsed
from the JSON string in the request. Unfortunately, GN2 was encoding the
request data two times, which led to the call returning a JSON-encoded string
instead of the expected object.
The issue has been fixed in GN2 and therefore, the "fix" here can be reverted.
|
|
returning a string
|
|
|
|
selected
|
|
it's needed to store the proximal/distal markers for each position
|
|
api/rqtl.py
|
|
in pairscan results + renamed process_rqtl_output to process_rqtl_mapping to distinguish between that and pairscan
|
|
|
|
Fix some issues caught by tests due to changes introducing the hand-off of the
partial correlations computations to an external process
Fix some issues due to the changes that introduce context managers for
database connections
Update some tests to take the above two changes into consideration
|
|
Use the `with` context manager to open database connections, so as to ensure
that those connections are closed once the call is completed. This hopefully
avoids the 'too many connections' error
|
|
|
|
Long-running computations are handed off to external processes. This avoids
timeouts in the webserver, and also reduces chances of instability of the
webserver.
The results of these long-running computations are needed eventually, so this
commit provides a way to check for the state of the computation, and the
results if any.
|
|
Run the partial correlations code in an external python process decoupling it
from the server and making it asynchronous.
Summary of changes:
* gn3/api/correlation.py:
- Remove response processing code
- Queue partial corrs processing
- Create new endpoint to get results
* gn3/commands.py
- Compose the pcorrs command to be run in an external process
- Enable running of subprocess commands with list args
* gn3/responses/__init__.py: new module indicator file
* gn3/responses/pcorrs_responses.py: Hold response processing code extracted
from ~gn3.api.correlations.py~ file
* scripts/partial_correlations.py: CLI script to process the pcorrs
* sheepdog/worker.py:
- Add the *genenetwork3* path at the beginning of the ~sys.path~ list to
override any GN3 in the site-packages
- Add any environment variables to be set for the command to be run
|
|
* Use `with` in place of plain `open`
* Use f-strings in place of `str.format()`
* Remove string interpolation from queries - provide data as query parameters
* other minor fixes
|
|
Test that the partial correlations endpoint responds with an appropriate
"not-found" message and the corresponding 404 status code in the case where a
request is made and the primary trait requested for does not exist in the
database.
Summary of the changes in each file:
* gn3/api/correlation.py: generalise the building of the response
* gn3/computations/partial_correlations.py: return with a "not-found" if the
primary trait does not exist in the database
* gn3/db/partial_correlations.py: Fix a number of bugs that led to exceptions
in the case that the primary trait did not exist
* pytest.ini: register a `slow` pytest marker
* tests/integration/test_partial_correlations.py: Add a new test to check for
an appropriate 404 response in case of a primary trait that does not exist
in the database.
|
|
Add a test for the partial correlations endpoint, with:
- no data in the request
- missing items in the data
Fix the bugs caught by the test
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
Comment:
https://github.com/genenetwork/genenetwork3/pull/67#issuecomment-1000828159
* Convert NaN values to None to avoid possible bugs with the string replace
method used before.
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* `NaN` is not a valid JSON value, and leads to errors in the code. This
commit replaces all `NaN` values with `null`.
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Encode bytes objects to string
* Encode NaN values to "null"
* gn3/api/correlation.py:
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Add an API endpoint for the partial correlation.
* gn3/api/correlation.py:
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/partial-correlations.gmi
* Add an API endpoint for the partial correlation.
|
|
* guix.scm: Import (gnu packages rdf).
(genenetwork3)[propagated-inputs]: Add python-sparqlwrapper.
* gn3/settings.py (SPARQL_ENDPOINT): New variable.
* gn3/api/general.py: Import datasets from gn3.db.
(dataset_metadata): New API endpoint.
* gn3/db/datasets.py: Import re, Template from string, Dict and Optional from
typing, JSON and SPARQLWrapper from SPARQLWrapper, SPARQL_ENDPOINT from
gn3.settings.
(sparql_query, dataset_metadata): New functions.
|
|
Issue:
https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/non-clustered-heatmaps-and-flipping.gmi
* Update the request endpoint, so that it produces a vertical or horizontal
heatmap depending on the user's request.
|