209 files changed, 9169 insertions, 2383 deletions
diff --git a/MANIFEST.in b/MANIFEST.in
index c515f0e..79339cd 100644
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -1,5 +1,5 @@
 include README.org
 recursive-include etc *.py *.csv
-recursive-include qc_app/static *.js *.css *.png
-recursive-include qc_app/templates *.html
+recursive-include uploader/static *.js *.css *.png
+recursive-include uploader/templates *.html
 recursive-exclude tests/ *.py *.tsv *.csv
\ No newline at end of file
@@ -197,7 +197,7 @@ few environment variables
 #+BEGIN_SRC shell
 export FLASK_APP=wsgi.py
 export FLASK_ENV=development
-export QCAPP_INSTANCE_PATH=/path/to/directory/with/config.py
+export UPLOADER_CONF=/path/to/directory/with/uploader/configuration.py
 #+END_SRC
 then you can run the application with
 #+BEGIN_SRC shell
@@ -208,7 +208,7 @@ flask run
 
 To run the linter over the code base, run:
 #+BEGIN_SRC shell
-  pylint setup.py tests quality_control qc_app r_qtl scripts
+  pylint setup.py tests quality_control uploader r_qtl scripts
 #+END_SRC
 
 To check for correct type usage in the application, run:
@@ -218,13 +218,13 @@ To check for correct type usage in the application, run:
 
 Run unit tests with:
 #+BEGIN_SRC shell
-  $ export QCAPP_CONF=</path/to/configuration/file.py>
+  $ export UPLOADER_CONF=</path/to/configuration/file.py>
   $ pytest -m unit_test
 #+END_SRC
 
 To run ALL tests (not just unit tests):
 #+BEGIN_SRC shell
-  $ export QCAPP_CONF=</path/to/configuration/file.py>
+  $ export UPLOADER_CONF=</path/to/configuration/file.py>
   $ pytest
 #+END_SRC
 
diff --git a/docs/dev/background_jobs.md b/docs/dev/background_jobs.md
new file mode 100644
index 0000000..1a41636
--- /dev/null
+++ b/docs/dev/background_jobs.md
@@ -0,0 +1,62 @@
+# Background Jobs
+
+We run background jobs for long-running processes, e.g. quality-assurance checks
+across multiple huge files, inserting huge data to databases, etc. The system
+needs to keep track of the progress of these jobs and communicate the state to
+the user whenever the user requests.
+
+This details some thoughts on how to handle these jobs, especially in failure
+conditions.
+
+We currently use Redis[^redis] to keep track of the state of the background
+processes.
+
+Every background job started will have a Redis[^redis] key with the prefix `gn-uploader:jobs`
+
+## Users
+
+Currently (2024-10-23T13:29UTC-05:00), we do not track the user that started the job. Moving forward, we will track this information.
+
+We could have the keys be something like, `gn-uploader:jobs:<user-id>:<job-id>`.
+
+Another option is track any particular users jobs with a key of the form
+`gn-uploader:users:<user-id>:jobs` and in that case, have the job keys take the
+form `gn-uploader:jobs:<job-id>`. I (@fredmanglis) favour this option over
+having the user's ID in the jobs keys directly, since it provides a way to
+interact with **ALL** the jobs without indirecting through each specific user.
+This is a useful ability to have, especially for system administrative tasks.
+
+## Multiprocessing Within Jobs
+
+Some jobs, e.g. quality-assurance jobs, can run multiple threads/processes
+themselves. This brings up a problem because Redis[^redis] does not allow
+parallel access to a key, especially for writing.
+
+We also do not want to create bottlenecks by writing to the same key from
+multiple threads/processes.
+
+The design I have currently come up with, that might work is as follows:
+
+- At any point just before where multiple threads/processes are started, a list
+  of new keys, each of which will collect the output from a single thread, will
+  be built.
+- These keys are recorded in the parent's redis key data
+- The threads/processes are started and do whatever they need, pushing their
+  outputs to the appropriate keys within redis.
+
+The new keys for the children threads/processe could build on the theme
+
+
+## Fetching Jobs Status
+
+Different jobs could have different ways of requirements for handling/processing
+their outputs, and those of any children they might spawn. The system will need
+to provide a way to pass in the correct function/code to process the outputs at
+the point where the job status is requested.
+
+This implies that we need to track the type of job in order to be able to select
+the correct code for processing such output.
+
+## Links
+
+- [^redis]: https://redis.io/
diff --git a/docs/dev/quality_assurance_on_csv_files.md b/docs/dev/quality_assurance_on_csv_files.md
new file mode 100644
index 0000000..02d63c9
--- /dev/null
+++ b/docs/dev/quality_assurance_on_csv_files.md
@@ -0,0 +1,52 @@
+# Quality Assurance/Control on CSV Files
+
+## Abbreviations
+
+- CSV files: Character-separated-values files — these are data files structured in a table-like format, with a specific character chosen as the column/field separator. The comma (,) is the most common field separator used by most csv files. It is, however, possible to encounter files with other characters separating the values.
+
+## General Pattern
+
+A general pattern has emerged when performing quality assurance on the data in
+CSV files — the pseudocode below shows the general pattern:
+
+```python
+def qc_function(filepath, …):
+    open(filepath, …)
+
+    headers = read_first_line(…)
+    perform_qc_on_headings(headers, …)
+
+    for each subsequent line in file:
+        perform_qc_on_first_column(line, …)
+
+        for each subsequent field in line:
+            perform_qc_on_field(field, …)
+```
+
+We want to list the errors found in each file, so it makes sense for the `perform_qc_on*` functions in the pseudocode above to return the list of errors found for each file.
+
+The actual quality assurance done on the headers, first column of data rows, and the fields can differ from one type of file to the next, but the structure remains relatively unchanged.
+
+This implies we could make use of a higher-order function that contains the general structure with the actual qc steps passed in as functions that are called in the higher-order structuring function. This gives something like:
+
+```python
+def qc_function(filepath, headers_qc, first_column_qc, data_qc, …):
+    for line in file:
+        if line is a comment line:
+            skip line and continue iteration
+        if line is first non-comment line:
+            line is the header line
+            call headers_qc on fields in this line
+        if line is not first non-comment line:
+            line is data line
+            call first_column_qc on first field of line
+            call data_qc on each of the subsequent fields of the line
+
+    collect and return errors
+```
+
+## Improvements
+
+- Read the file in a separate generator function
+- Parallelize QC if many files are present
+- Add logging/output for user update (how do we do this correctly?)
@@ -28,4 +28,16 @@ ignore_missing_imports = True
 ignore_missing_imports = True
 
 [mypy-yaml.*]
+ignore_missing_imports = True
+
+[mypy-pymonad.tools]
+ignore_missing_imports = True
+
+[mypy-pymonad.either]
+ignore_missing_imports = True
+
+[mypy-authlib.*]
+ignore_missing_imports = True
+
+[mypy-flask_session.*]
 ignore_missing_imports = True
\ No newline at end of file
diff --git a/qc_app/__init__.py b/qc_app/__init__.py
deleted file mode 100644
index 9907695..0000000
--- a/qc_app/__init__.py
+++ /dev/null
@@ -1,45 +0,0 @@
-"""The Quality-Control Web Application entry point"""
-import os
-import logging
-from pathlib import Path
-
-from flask import Flask
-
-from .entry import entrybp
-from .upload import upload
-from .parse import parsebp
-from .samples import samples
-from .base_routes import base
-from .dbinsert import dbinsertbp
-from .errors import register_error_handlers
-
-def override_settings_with_envvars(
-        app: Flask, ignore: tuple[str, ...]=tuple()) -> None:
-    """Override settings in `app` with those in ENVVARS"""
-    for setting in (key for key in app.config if key not in ignore):
-        app.config[setting] = os.environ.get(setting) or app.config[setting]
-
-
-def create_app():
-    """The application factory"""
-    app = Flask(__name__)
-    app.config.from_pyfile(
-        Path(__file__).parent.joinpath("default_settings.py"))
-    if "QCAPP_CONF" in os.environ:
-        app.config.from_envvar("QCAPP_CONF") # Override defaults with instance path
-
-    override_settings_with_envvars(app, ignore=tuple())
-
-    if "QCAPP_SECRETS" in os.environ:
-        app.config.from_envvar("QCAPP_SECRETS")
-
-    # setup blueprints
-    app.register_blueprint(base, url_prefix="/")
-    app.register_blueprint(entrybp, url_prefix="/")
-    app.register_blueprint(parsebp, url_prefix="/parse")
-    app.register_blueprint(upload, url_prefix="/upload")
-    app.register_blueprint(dbinsertbp, url_prefix="/dbinsert")
-    app.register_blueprint(samples, url_prefix="/samples")
-
-    register_error_handlers(app)
-    return app
diff --git a/qc_app/base_routes.py b/qc_app/base_routes.py
deleted file mode 100644
index 9daf439..0000000
--- a/qc_app/base_routes.py
+++ /dev/null
@@ -1,29 +0,0 @@
-"""Basic routes required for all pages"""
-import os
-from flask import Blueprint, send_from_directory
-
-base = Blueprint("base", __name__)
-
-def appenv():
-    """Get app's guix environment path."""
-    return os.environ.get("GN_UPLOADER_ENVIRONMENT")
-
-@base.route("/bootstrap/<path:filename>")
-def bootstrap(filename):
-    """Fetch bootstrap files."""
-    return send_from_directory(
-        appenv(), f"share/genenetwork2/javascript/bootstrap/{filename}")
-
-
-@base.route("/jquery/<path:filename>")
-def jquery(filename):
-    """Fetch jquery files."""
-    return send_from_directory(
-        appenv(), f"share/genenetwork2/javascript/jquery/{filename}")
-
-
-@base.route("/node-modules/<path:filename>")
-def node_modules(filename):
-    """Fetch node-js modules."""
-    return send_from_directory(
-        appenv(), f"lib/node_modules/{filename}")
diff --git a/qc_app/db/__init__.py b/qc_app/db/__init__.py
deleted file mode 100644
index 36e93e8..0000000
--- a/qc_app/db/__init__.py
+++ /dev/null
@@ -1,8 +0,0 @@
-"""Database functions"""
-from .species import species, species_by_id
-from .populations import (
-    save_population,
-    population_by_id,
-    populations_by_species,
-    population_by_species_and_id)
-from .datasets import geno_datasets_by_species_and_population
diff --git a/qc_app/db/platforms.py b/qc_app/db/platforms.py
deleted file mode 100644
index cb527a7..0000000
--- a/qc_app/db/platforms.py
+++ /dev/null
@@ -1,25 +0,0 @@
-"""Handle db interactions for platforms."""
-from typing import Optional
-
-import MySQLdb as mdb
-from MySQLdb.cursors import DictCursor
-
-def platforms_by_species(
-        conn: mdb.Connection, speciesid: int) -> tuple[dict, ...]:
-    """Retrieve platforms by the species"""
-    with conn.cursor(cursorclass=DictCursor) as cursor:
-        cursor.execute("SELECT * FROM GeneChip WHERE SpeciesId=%s "
-                       "ORDER BY GeneChipName ASC",
-                       (speciesid,))
-        return tuple(dict(row) for row in cursor.fetchall())
-
-def platform_by_id(conn: mdb.Connection, platformid: int) -> Optional[dict]:
-    """Retrieve a platform by its ID"""
-    with conn.cursor(cursorclass=DictCursor) as cursor:
-        cursor.execute("SELECT * FROM GeneChip WHERE Id=%s",
-                       (platformid,))
-        result = cursor.fetchone()
-        if bool(result):
-            return dict(result)
-
-    return None
diff --git a/qc_app/db/populations.py b/qc_app/db/populations.py
deleted file mode 100644
index 4485e52..0000000
--- a/qc_app/db/populations.py
+++ /dev/null
@@ -1,54 +0,0 @@
-"""Functions for accessing the database relating to species populations."""
-import MySQLdb as mdb
-from MySQLdb.cursors import DictCursor
-
-def population_by_id(conn: mdb.Connection, population_id) -> dict:
-    """Get the grouping/population by id."""
-    with conn.cursor(cursorclass=DictCursor) as cursor:
-        cursor.execute("SELECT * FROM InbredSet WHERE InbredSetId=%s",
-                       (population_id,))
-        return cursor.fetchone()
-
-def population_by_species_and_id(
-        conn: mdb.Connection, species_id, population_id) -> dict:
-    """Retrieve a population by its identifier and species."""
-    with conn.cursor(cursorclass=DictCursor) as cursor:
-        cursor.execute("SELECT * FROM InbredSet WHERE SpeciesId=%s AND Id=%s",
-                       (species_id, population_id))
-        return cursor.fetchone()
-
-def populations_by_species(conn: mdb.Connection, speciesid) -> tuple:
-    "Retrieve group (InbredSet) information from the database."
-    with conn.cursor(cursorclass=DictCursor) as cursor:
-        query = "SELECT * FROM InbredSet WHERE SpeciesId=%s"
-        cursor.execute(query, (speciesid,))
-        return tuple(cursor.fetchall())
-
-    return tuple()
-
-def save_population(conn: mdb.Connection, population_details: dict) -> dict:
-    """Save the population details to the db."""
-    with conn.cursor(cursorclass=DictCursor) as cursor:
-        cursor.execute(
-            "INSERT INTO InbredSet("
-            "InbredSetId, InbredSetName, Name, SpeciesId, FullName, "
-            "MenuOrderId, Description"
-            ") "
-            "VALUES ("
-            "%(InbredSetId)s, %(InbredSetName)s, %(Name)s, %(SpeciesId)s, "
-            "%(FullName)s, %(MenuOrderId)s, %(Description)s"
-            ")",
-            {
-                "MenuOrderId": 0,
-                "InbredSetId": 0,
-                **population_details
-            })
-        new_id = cursor.lastrowid
-        cursor.execute("UPDATE InbredSet SET InbredSetId=%s WHERE Id=%s",
-                       (new_id, new_id))
-        return {
-            **population_details,
-            "Id": new_id,
-            "InbredSetId": new_id,
-            "population_id": new_id
-        }
diff --git a/qc_app/db/species.py b/qc_app/db/species.py
deleted file mode 100644
index 653e59b..0000000
--- a/qc_app/db/species.py
+++ /dev/null
@@ -1,22 +0,0 @@
-"""Database functions for species."""
-import MySQLdb as mdb
-from MySQLdb.cursors import DictCursor
-
-def species(conn: mdb.Connection) -> tuple:
-    "Retrieve the species from the database."
-    with conn.cursor(cursorclass=DictCursor) as cursor:
-        cursor.execute(
-            "SELECT SpeciesId, SpeciesName, LOWER(Name) AS Name, MenuName, "
-            "FullName FROM Species")
-        return tuple(cursor.fetchall())
-
-    return tuple()
-
-def species_by_id(conn: mdb.Connection, speciesid) -> dict:
-    "Retrieve the species from the database by id."
-    with conn.cursor(cursorclass=DictCursor) as cursor:
-        cursor.execute(
-            "SELECT SpeciesId, SpeciesName, LOWER(Name) AS Name, MenuName, "
-            "FullName FROM Species WHERE SpeciesId=%s",
-            (speciesid,))
-        return cursor.fetchone()
diff --git a/qc_app/entry.py b/qc_app/entry.py
deleted file mode 100644
index f2db878..0000000
--- a/qc_app/entry.py
+++ /dev/null
@@ -1,127 +0,0 @@
-"""Entry-point module"""
-import os
-import mimetypes
-from typing import Tuple
-from zipfile import ZipFile, is_zipfile
-
-from werkzeug.utils import secure_filename
-from flask import (
-    flash,
-    request,
-    url_for,
-    redirect,
-    Blueprint,
-    render_template,
-    current_app as app,
-    send_from_directory)
-
-from qc_app.db import species
-from qc_app.db_utils import with_db_connection
-
-entrybp = Blueprint("entry", __name__)
-
-@entrybp.route("/favicon.ico", methods=["GET"])
-def favicon():
-    """Return the favicon."""
-    return send_from_directory(os.path.join(app.root_path, "static"),
-                               "images/CITGLogo.png",
-                               mimetype="image/png")
-
-
-def errors(rqst) -> Tuple[str, ...]:
-    """Return a tuple of the errors found in the request `rqst`. If no error is
-    found, then an empty tuple is returned."""
-    def __filetype_error__():
-        return (
-            ("Invalid file type provided.",)
-            if rqst.form.get("filetype") not in ("average", "standard-error")
-            else tuple())
-
-    def __file_missing_error__():
-        return (
-            ("No file was uploaded.",)
-            if ("qc_text_file" not in rqst.files or
-                rqst.files["qc_text_file"].filename == "")
-            else tuple())
-
-    def __file_mimetype_error__():
-        text_file = rqst.files["qc_text_file"]
-        return (
-            (
-                ("Invalid file! Expected a tab-separated-values file, or a zip "
-                 "file of the a tab-separated-values file."),)
-            if text_file.mimetype not in (
-                    "text/plain", "text/tab-separated-values",
-                    "application/zip")
-            else tuple())
-
-    return (
-        __filetype_error__() +
-        (__file_missing_error__() or __file_mimetype_error__()))
-
-def zip_file_errors(filepath, upload_dir) -> Tuple[str, ...]:
-    """Check the uploaded zip file for errors."""
-    zfile_errors: Tuple[str, ...] = tuple()
-    if is_zipfile(filepath):
-        with ZipFile(filepath, "r") as zfile:
-            infolist = zfile.infolist()
-            if len(infolist) != 1:
-                zfile_errors = zfile_errors + (
-                    ("Expected exactly one (1) member file within the uploaded zip "
-                     f"file. Got {len(infolist)} member files."),)
-            if len(infolist) == 1 and infolist[0].is_dir():
-                zfile_errors = zfile_errors + (
-                    ("Expected a member text file in the uploaded zip file. Got a "
-                     "directory/folder."),)
-
-            if len(infolist) == 1 and not infolist[0].is_dir():
-                zfile.extract(infolist[0], path=upload_dir)
-                mime = mimetypes.guess_type(f"{upload_dir}/{infolist[0].filename}")
-                if mime[0] != "text/tab-separated-values":
-                    zfile_errors = zfile_errors + (
-                        ("Expected the member text file in the uploaded zip file to"
-                         " be a tab-separated file."),)
-
-    return zfile_errors
-
-@entrybp.route("/", methods=["GET"])
-def index():
-    """Load the landing page"""
-    return render_template("index.html")
-
-@entrybp.route("/upload", methods=["GET", "POST"])
-def upload_file():
-    """Enables uploading the files"""
-    if request.method == "GET":
-        return render_template(
-            "select_species.html", species=with_db_connection(species))
-
-    upload_dir = app.config["UPLOAD_FOLDER"]
-    request_errors = errors(request)
-    if request_errors:
-        for error in request_errors:
-            flash(error, "alert-danger error-expr-data")
-        return redirect(url_for("entry.upload_file"))
-
-    filename = secure_filename(request.files["qc_text_file"].filename)
-    if not os.path.exists(upload_dir):
-        os.mkdir(upload_dir)
-
-    filepath = os.path.join(upload_dir, filename)
-    request.files["qc_text_file"].save(os.path.join(upload_dir, filename))
-
-    zip_errors = zip_file_errors(filepath, upload_dir)
-    if zip_errors:
-        for error in zip_errors:
-            flash(error, "alert-danger error-expr-data")
-        return redirect(url_for("entry.upload_file"))
-
-    return redirect(url_for("parse.parse",
-                            speciesid=request.form["speciesid"],
-                            filename=filename,
-                            filetype=request.form["filetype"]))
-
-@entrybp.route("/data-review", methods=["GET"])
-def data_review():
-    """Provide some help on data expectations to the user."""
-    return render_template("data_review.html")
diff --git a/qc_app/input_validation.py b/qc_app/input_validation.py
deleted file mode 100644
index 9abe742..0000000
--- a/qc_app/input_validation.py
+++ /dev/null
@@ -1,27 +0,0 @@
-"""Input validation utilities"""
-from typing import Any
-
-def is_empty_string(value: str) -> bool:
-    """Check whether as string is empty"""
-    return (isinstance(value, str) and value.strip() == "")
-
-def is_empty_input(value: Any) -> bool:
-    """Check whether user provided an empty value."""
-    return (value is None or is_empty_string(value))
-
-def is_integer_input(value: Any) -> bool:
-    """
-    Check whether user provided a value that can be parsed into an integer.
-    """
-    def __is_int__(val, base):
-        try:
-            int(val, base=base)
-        except ValueError:
-            return False
-        return True
-    return isinstance(value, int) or (
-        (not is_empty_input(value)) and (
-            isinstance(value, str) and (
-                __is_int__(value, 10)
-                or __is_int__(value, 8)
-                or __is_int__(value, 16))))
diff --git a/qc_app/parse.py b/qc_app/parse.py
deleted file mode 100644
index d20f6f0..0000000
--- a/qc_app/parse.py
+++ /dev/null
@@ -1,175 +0,0 @@
-"""File parsing module"""
-import os
-
-import jsonpickle
-from redis import Redis
-from flask import flash, request, url_for, redirect, Blueprint, render_template
-from flask import current_app as app
-
-from quality_control.errors import InvalidValue, DuplicateHeading
-
-from qc_app import jobs
-from qc_app.dbinsert import species_by_id
-from qc_app.db_utils import with_db_connection
-
-parsebp = Blueprint("parse", __name__)
-
-def isinvalidvalue(item):
-    """Check whether item is of type InvalidValue"""
-    return isinstance(item, InvalidValue)
-
-def isduplicateheading(item):
-    """Check whether item is of type DuplicateHeading"""
-    return isinstance(item, DuplicateHeading)
-
-@parsebp.route("/parse", methods=["GET"])
-def parse():
-    """Trigger file parsing"""
-    errors = False
-    speciesid = request.args.get("speciesid")
-    filename = request.args.get("filename")
-    filetype = request.args.get("filetype")
-    if speciesid is None:
-        flash("No species selected", "alert-error error-expr-data")
-        errors = True
-    else:
-        try:
-            speciesid = int(speciesid)
-            species = with_db_connection(
-                lambda con: species_by_id(con, speciesid))
-            if not bool(species):
-                flash("No such species.", "alert-error error-expr-data")
-                errors = True
-        except ValueError:
-            flash("Invalid speciesid provided. Expected an integer.",
-                  "alert-error error-expr-data")
-            errors = True
-
-    if filename is None:
-        flash("No file provided", "alert-error error-expr-data")
-        errors = True
-
-    if filetype is None:
-        flash("No filetype provided", "alert-error error-expr-data")
-        errors = True
-
-    if filetype not in ("average", "standard-error"):
-        flash("Invalid filetype provided", "alert-error error-expr-data")
-        errors = True
-
-    if filename:
-        filepath = os.path.join(app.config["UPLOAD_FOLDER"], filename)
-        if not os.path.exists(filepath):
-            flash("Selected file does not exist (any longer)",
-                  "alert-error error-expr-data")
-            errors = True
-
-    if errors:
-        return redirect(url_for("entry.upload_file"))
-
-    redisurl = app.config["REDIS_URL"]
-    with Redis.from_url(redisurl, decode_responses=True) as rconn:
-        job = jobs.launch_job(
-            jobs.build_file_verification_job(
-                rconn, app.config["SQL_URI"], redisurl,
-                speciesid, filepath, filetype,
-                app.config["JOBS_TTL_SECONDS"]),
-            redisurl,
-            f"{app.config['UPLOAD_FOLDER']}/job_errors")
-
-    return redirect(url_for("parse.parse_status", job_id=job["jobid"]))
-
-@parsebp.route("/status/<job_id>", methods=["GET"])
-def parse_status(job_id: str):
-    "Retrieve the status of the job"
-    with Redis.from_url(app.config["REDIS_URL"], decode_responses=True) as rconn:
-        try:
-            job = jobs.job(rconn, jobs.jobsnamespace(), job_id)
-        except jobs.JobNotFound as _exc:
-            return render_template("no_such_job.html", job_id=job_id), 400
-
-    error_filename = jobs.error_filename(
-        job_id, f"{app.config['UPLOAD_FOLDER']}/job_errors")
-    if os.path.exists(error_filename):
-        stat = os.stat(error_filename)
-        if stat.st_size > 0:
-            return redirect(url_for("parse.fail", job_id=job_id))
-
-    job_id = job["jobid"]
-    progress = float(job["percent"])
-    status = job["status"]
-    filename = job.get("filename", "uploaded file")
-    errors = jsonpickle.decode(
-        job.get("errors", jsonpickle.encode(tuple())))
-    if status in ("success", "aborted"):
-        return redirect(url_for("parse.results", job_id=job_id))
-
-    if status == "parse-error":
-        return redirect(url_for("parse.fail", job_id=job_id))
-
-    app.jinja_env.globals.update(
-        isinvalidvalue=isinvalidvalue,
-        isduplicateheading=isduplicateheading)
-    return render_template(
-        "job_progress.html",
-        job_id = job_id,
-        job_status = status,
-        progress = progress,
-        message = job.get("message", ""),
-        job_name = f"Parsing '{filename}'",
-        errors=errors)
-
-@parsebp.route("/results/<job_id>", methods=["GET"])
-def results(job_id: str):
-    """Show results of parsing..."""
-    with Redis.from_url(app.config["REDIS_URL"], decode_responses=True) as rconn:
-        job = jobs.job(rconn, jobs.jobsnamespace(), job_id)
-
-    if job:
-        filename = job["filename"]
-        errors = jsonpickle.decode(job.get("errors", jsonpickle.encode(tuple())))
-        app.jinja_env.globals.update(
-            isinvalidvalue=isinvalidvalue,
-            isduplicateheading=isduplicateheading)
-        return render_template(
-            "parse_results.html",
-            errors=errors,
-            job_name = f"Parsing '{filename}'",
-            user_aborted = job.get("user_aborted"),
-            job_id=job["jobid"])
-
-    return render_template("no_such_job.html", job_id=job_id)
-
-@parsebp.route("/fail/<job_id>", methods=["GET"])
-def fail(job_id: str):
-    """Handle parsing failure"""
-    with Redis.from_url(app.config["REDIS_URL"], decode_responses=True) as rconn:
-        job = jobs.job(rconn, jobs.jobsnamespace(), job_id)
-
-    if job:
-        error_filename = jobs.error_filename(
-            job_id, f"{app.config['UPLOAD_FOLDER']}/job_errors")
-        if os.path.exists(error_filename):
-            stat = os.stat(error_filename)
-            if stat.st_size > 0:
-                return render_template(
-                    "worker_failure.html", job_id=job_id)
-
-        return render_template("parse_failure.html", job=job)
-
-    return render_template("no_such_job.html", job_id=job_id)
-
-@parsebp.route("/abort", methods=["POST"])
-def abort():
-    """Handle user request to abort file processing"""
-    job_id = request.form["job_id"]
-
-    with Redis.from_url(app.config["REDIS_URL"], decode_responses=True) as rconn:
-        job = jobs.job(rconn, jobs.jobsnamespace(), job_id)
-
-        if job:
-            rconn.hset(name=jobs.job_key(jobs.jobsnamespace(), job_id),
-                       key="user_aborted",
-                       value=int(True))
-
-    return redirect(url_for("parse.parse_status", job_id=job_id))
diff --git a/qc_app/samples.py b/qc_app/samples.py
deleted file mode 100644
index 804f262..0000000
--- a/qc_app/samples.py
+++ /dev/null
@@ -1,354 +0,0 @@
-"""Code regarding samples"""
-import os
-import sys
-import csv
-import uuid
-from pathlib import Path
-from typing import Iterator
-
-import MySQLdb as mdb
-from redis import Redis
-from MySQLdb.cursors import DictCursor
-from flask import (
-    flash,
-    request,
-    url_for,
-    redirect,
-    Blueprint,
-    render_template,
-    current_app as app)
-
-from functional_tools import take
-
-from qc_app import jobs
-from qc_app.files import save_file
-from qc_app.input_validation import is_integer_input
-from qc_app.db_utils import (
-    with_db_connection,
-    database_connection,
-    with_redis_connection)
-from qc_app.db import (
-    species_by_id,
-    save_population,
-    population_by_id,
-    populations_by_species,
-    species as fetch_species)
-
-samples = Blueprint("samples", __name__)
-
-@samples.route("/upload/species", methods=["GET", "POST"])
-def select_species():
-    """Select the species."""
-    if request.method == "GET":
-        return render_template("samples/select-species.html",
-                               species=with_db_connection(fetch_species))
-
-    index_page = redirect(url_for("entry.upload_file"))
-    species_id = request.form.get("species_id")
-    if bool(species_id):
-        species_id = int(species_id)
-        species = with_db_connection(
-            lambda conn: species_by_id(conn, species_id))
-        if bool(species):
-            return redirect(url_for(
-                "samples.select_population", species_id=species_id))
-        flash("Invalid species selected!", "alert-error")
-    flash("You need to select a species", "alert-error")
-    return index_page
-
-@samples.route("/upload/species/<int:species_id>/create-population",
-               methods=["POST"])
-def create_population(species_id: int):
-    """Create new grouping/population."""
-    if not is_integer_input(species_id):
-        flash("You did not provide a valid species. Please select one to "
-              "continue.",
-              "alert-danger")
-        return redirect(url_for("samples.select_species"))
-    species = with_db_connection(lambda conn: species_by_id(conn, species_id))
-    if not bool(species):
-        flash("Species with given ID was not found.", "alert-danger")
-        return redirect(url_for("samples.select_species"))
-
-    species_page = redirect(url_for("samples.select_species"), code=307)
-    with database_connection(app.config["SQL_URI"]) as conn:
-        species = species_by_id(conn, species_id)
-        pop_name = request.form.get("inbredset_name", "").strip()
-        pop_fullname = request.form.get("inbredset_fullname", "").strip()
-
-        if not bool(species):
-            flash("Invalid species!", "alert-error error-create-population")
-            return species_page
-        if (not bool(pop_name)) or (not bool(pop_fullname)):
-            flash("You *MUST* provide a grouping/population name",
-                  "alert-error error-create-population")
-            return species_page
-
-        pop = save_population(conn, {
-            "SpeciesId": species["SpeciesId"],
-            "Name": pop_name,
-            "InbredSetName": pop_fullname,
-            "FullName": pop_fullname,
-            "Family": request.form.get("inbredset_family") or None,
-            "Description": request.form.get("description") or None
-        })
-
-    flash("Grouping/Population created successfully.", "alert-success")
-    return redirect(url_for("samples.upload_samples",
-                            species_id=species_id,
-                            population_id=pop["population_id"]))
-
-@samples.route("/upload/species/<int:species_id>/population",
-               methods=["GET", "POST"])
-def select_population(species_id: int):
-    """Select from existing groupings/populations."""
-    if not is_integer_input(species_id):
-        flash("You did not provide a valid species. Please select one to "
-              "continue.",
-              "alert-danger")
-        return redirect(url_for("samples.select_species"))
-    species = with_db_connection(lambda conn: species_by_id(conn, species_id))
-    if not bool(species):
-        flash("Species with given ID was not found.", "alert-danger")
-        return redirect(url_for("samples.select_species"))
-
-    if request.method == "GET":
-        return render_template(
-            "samples/select-population.html",
-            species=species,
-            populations=with_db_connection(
-                lambda conn: populations_by_species(conn, species_id)))
-
-    population_page = redirect(url_for(
-        "samples.select_population", species_id=species_id), code=307)
-    _population_id = request.form.get("inbredset_id")
-    if not is_integer_input(_population_id):
-        flash("You did not provide a valid population. Please select one to "
-              "continue.",
-              "alert-danger")
-        return population_page
-    population = with_db_connection(
-        lambda conn: population_by_id(conn, _population_id))
-    if not bool(population):
-        flash("Invalid grouping/population!",
-              "alert-error error-select-population")
-        return population_page
-
-    return redirect(url_for("samples.upload_samples",
-                            species_id=species_id,
-                            population_id=_population_id),
-                    code=307)
-
-def read_samples_file(filepath, separator: str, firstlineheading: bool, **kwargs) -> Iterator[dict]:
-    """Read the samples file."""
-    with open(filepath, "r", encoding="utf-8") as inputfile:
-        reader = csv.DictReader(
-            inputfile,
-            fieldnames=(
-                None if firstlineheading
-                else ("Name", "Name2", "Symbol", "Alias")),
-            delimiter=separator,
-            quotechar=kwargs.get("quotechar", '"'))
-        for row in reader:
-            yield row
-
-def save_samples_data(conn: mdb.Connection,
-                      speciesid: int,
-                      file_data: Iterator[dict]):
-    """Save the samples to DB."""
-    data = ({**row, "SpeciesId": speciesid} for row in file_data)
-    total = 0
-    with conn.cursor() as cursor:
-        while True:
-            batch = take(data, 5000)
-            if len(batch) == 0:
-                break
-            cursor.executemany(
-                "INSERT INTO Strain(Name, Name2, SpeciesId, Symbol, Alias) "
-                "VALUES("
-                " %(Name)s, %(Name2)s, %(SpeciesId)s, %(Symbol)s, %(Alias)s"
-                ") ON DUPLICATE KEY UPDATE Name=Name",
-                batch)
-            total += len(batch)
-            print(f"\tSaved {total} samples total so far.")
-
-def cross_reference_samples(conn: mdb.Connection,
-                            species_id: int,
-                            population_id: int,
-                            strain_names: Iterator[str]):
-    """Link samples to their population."""
-    with conn.cursor(cursorclass=DictCursor) as cursor:
-        cursor.execute(
-            "SELECT MAX(OrderId) AS loid FROM StrainXRef WHERE InbredSetId=%s",
-            (population_id,))
-        last_order_id = (cursor.fetchone()["loid"] or 10)
-        total = 0
-        while True:
-            batch = take(strain_names, 5000)
-            if len(batch) == 0:
-                break
-            params_str = ", ".join(["%s"] * len(batch))
-            ## This query is slow -- investigate.
-            cursor.execute(
-                "SELECT s.Id FROM Strain AS s LEFT JOIN StrainXRef AS sx "
-                "ON s.Id = sx.StrainId WHERE s.SpeciesId=%s AND s.Name IN "
-                f"({params_str}) AND sx.StrainId IS NULL",
-                (species_id,) + tuple(batch))
-            strain_ids = (sid["Id"] for sid in cursor.fetchall())
-            params = tuple({
-                "pop_id": population_id,
-                "strain_id": strain_id,
-                "order_id": last_order_id + (order_id * 10),
-                "mapping": "N",
-                "pedigree": None
-            } for order_id, strain_id in enumerate(strain_ids, start=1))
-            cursor.executemany(
-                "INSERT INTO StrainXRef( "
-                " InbredSetId, StrainId, OrderId, Used_for_mapping, PedigreeStatus"
-                ")"
-                "VALUES ("
-                " %(pop_id)s, %(strain_id)s, %(order_id)s, %(mapping)s, "
-                " %(pedigree)s"
-                ")",
-                params)
-            last_order_id += (len(params) * 10)
-            total += len(batch)
-            print(f"\t{total} total samples cross-referenced to the population "
-                  "so far.")
-
-def build_sample_upload_job(# pylint: disable=[too-many-arguments]
-        speciesid: int,
-        populationid: int,
-        samplesfile: Path,
-        separator: str,
-        firstlineheading: bool,
-        quotechar: str):
-    """Define the async command to run the actual samples data upload."""
-    return [
-        sys.executable, "-m", "scripts.insert_samples", app.config["SQL_URI"],
-        str(speciesid), str(populationid), str(samplesfile.absolute()),
-        separator, f"--redisuri={app.config['REDIS_URL']}",
-        f"--quotechar={quotechar}"
-    ] + (["--firstlineheading"] if firstlineheading else [])
-
-@samples.route("/upload/species/<int:species_id>/populations/<int:population_id>/samples",
-               methods=["GET", "POST"])
-def upload_samples(species_id: int, population_id: int):#pylint: disable=[too-many-return-statements]
-    """Upload the samples."""
-    samples_uploads_page = redirect(url_for("samples.upload_samples",
-                                            species_id=species_id,
-                                            population_id=population_id))
-    if not is_integer_input(species_id):
-        flash("You did not provide a valid species. Please select one to "
-              "continue.",
-              "alert-danger")
-        return redirect(url_for("samples.select_species"))
-    species = with_db_connection(lambda conn: species_by_id(conn, species_id))
-    if not bool(species):
-        flash("Species with given ID was not found.", "alert-danger")
-        return redirect(url_for("samples.select_species"))
-
-    if not is_integer_input(population_id):
-        flash("You did not provide a valid population. Please select one "
-              "to continue.",
-              "alert-danger")
-        return redirect(url_for("samples.select_population",
-                                species_id=species_id),
-                        code=307)
-    population = with_db_connection(
-        lambda conn: population_by_id(conn, int(population_id)))
-    if not bool(population):
-        flash("Invalid grouping/population!", "alert-error")
-        return redirect(url_for("samples.select_population",
-                                species_id=species_id),
-                        code=307)
-
-    if request.method == "GET" or request.files.get("samples_file") is None:
-        return render_template("samples/upload-samples.html",
-                               species=species,
-                               population=population)
-
-    try:
-        samples_file = save_file(request.files["samples_file"],
-                                 Path(app.config["UPLOAD_FOLDER"]))
-    except AssertionError:
-        flash("You need to provide a file with the samples data.",
-              "alert-error")
-        return samples_uploads_page
-
-    firstlineheading = (request.form.get("first_line_heading") == "on")
-
-    separator = request.form.get("separator", ",")
-    if separator == "other":
-        separator = request.form.get("other_separator", ",")
-    if not bool(separator):
-        flash("You need to provide a separator character.", "alert-error")
-        return samples_uploads_page
-
-    quotechar = (request.form.get("field_delimiter", '"') or '"')
-
-    redisuri = app.config["REDIS_URL"]
-    with Redis.from_url(redisuri, decode_responses=True) as rconn:
-        the_job = jobs.launch_job(
-            jobs.initialise_job(
-                rconn,
-                jobs.jobsnamespace(),
-                str(uuid.uuid4()),
-                build_sample_upload_job(
-                    species["SpeciesId"],
-                    population["InbredSetId"],
-                    samples_file,
-                    separator,
-                    firstlineheading,
-                    quotechar),
-                "samples_upload",
-                app.config["JOBS_TTL_SECONDS"],
-                {"job_name": f"Samples Upload: {samples_file.name}"}),
-            redisuri,
-            f"{app.config['UPLOAD_FOLDER']}/job_errors")
-    return redirect(url_for(
-        "samples.upload_status", job_id=the_job["jobid"]))
-
-@samples.route("/upload/status/<uuid:job_id>", methods=["GET"])
-def upload_status(job_id: uuid.UUID):
-    """Check on the status of a samples upload job."""
-    job =
with_redis_connection(lambda rconn: jobs.job( - rconn, jobs.jobsnamespace(), job_id)) - if job: - status = job["status"] - if status == "success": - return render_template("samples/upload-success.html", job=job) - - if status == "error": - return redirect(url_for("samples.upload_failure", job_id=job_id)) - - error_filename = Path(jobs.error_filename( - job_id, f"{app.config['UPLOAD_FOLDER']}/job_errors")) - if error_filename.exists(): - stat = os.stat(error_filename) - if stat.st_size > 0: - return redirect(url_for( - "samples.upload_failure", job_id=job_id)) - - return render_template( - "samples/upload-progress.html", - job=job) # maybe also handle this? - - return render_template("no_such_job.html", job_id=job_id), 400 - -@samples.route("/upload/failure/<uuid:job_id>", methods=["GET"]) -def upload_failure(job_id: uuid.UUID): - """Display the errors of the samples upload failure.""" - job = with_redis_connection(lambda rconn: jobs.job( - rconn, jobs.jobsnamespace(), job_id)) - if not bool(job): - return render_template("no_such_job.html", job_id=job_id), 400 - - error_filename = Path(jobs.error_filename( - job_id, f"{app.config['UPLOAD_FOLDER']}/job_errors")) - if error_filename.exists(): - stat = os.stat(error_filename) - if stat.st_size > 0: - return render_template("worker_failure.html", job_id=job_id) - - return render_template("samples/upload-failure.html", job=job) diff --git a/qc_app/static/css/styles.css b/qc_app/static/css/styles.css deleted file mode 100644 index a88c229..0000000 --- a/qc_app/static/css/styles.css +++ /dev/null @@ -1,7 +0,0 @@ -.heading { - text-transform: capitalize; -} - -label { - text-transform: capitalize; -} diff --git a/qc_app/templates/base.html b/qc_app/templates/base.html deleted file mode 100644 index eb5e6b7..0000000 --- a/qc_app/templates/base.html +++ /dev/null @@ -1,51 +0,0 @@ -<!DOCTYPE html> -<html lang="en"> - <head> - <meta charset="UTF-8" /> - <meta application-name="GeneNetwork Quality-Control Application" /> - 
<meta name="viewport" content="width=device-width, initial-scale=1.0" /> - {%block extrameta%}{%endblock%} - - <title>GN Uploader: {%block title%}{%endblock%}</title> - - <link rel="stylesheet" type="text/css" - href="{{url_for('base.bootstrap', - filename='css/bootstrap.min.css')}}" /> - <link rel="stylesheet" type="text/css" - href="{{url_for('base.bootstrap', - filename='css/bootstrap-theme.min.css')}}" /> - - - <link rel="shortcut icon" type="image/png" sizes="64x64" - href="{{url_for('static', filename='images/CITGLogo.png')}}" /> - - <link rel="stylesheet" type="text/css" href="/static/css/custom-bootstrap.css" /> - <link rel="stylesheet" type="text/css" href="/static/css/styles.css" /> - - {%block css%}{%endblock%} - </head> - - <body> - <div class="navbar navbar-inverse navbar-static-top pull-left" - role="navigation" - style="width: 100%;min-width: 850px;white-space: nowrap;"> - <div class="container-fluid" style="width: 100%"> - <ul class="nav navbar-nav"> - <li><a href="/" style="font-weight: bold">GN Uploader</a></li> - <li> - <a href="{{gnuri or 'https://genenetwork.org'}}">GeneNetwork</a> - </li> - </ul> - </div> - </div> - <div class="container"> - {%block contents%}{%endblock%} - </div> - - <script src="{{url_for('base.jquery', - filename='jquery.min.js')}}"></script> - <script src="{{url_for('base.bootstrap', - filename='js/bootstrap.min.js')}}"></script> - {%block javascript%}{%endblock%} - </body> -</html> diff --git a/qc_app/templates/index.html b/qc_app/templates/index.html deleted file mode 100644 index 89d2ae9..0000000 --- a/qc_app/templates/index.html +++ /dev/null @@ -1,81 +0,0 @@ -{%extends "base.html"%} - -{%block title%}Data Upload{%endblock%} - -{%block contents%} -<div class="row"> - <h1 class="heading">data upload</h1> - - <div class="explainer"> - <p>Each of the sections below gives you a different option for data upload. 
- Please read the documentation for each section carefully to understand what - each section is about.</p> - </div> -</div> - -<div class="row"> - <h2 class="heading">R/qtl2 Bundles</h2> - - <div class="explainer"> - <p>This feature combines and extends the two upload methods below. Instead of - uploading one item at a time, the R/qtl2 bundle you upload can contain both - the genotypes data (samples/individuals/cases and their data) and the - expression data.</p> - <p>The R/qtl2 bundle, additionally, can contain extra metadata, that neither - of the methods below can handle.</p> - - <a href="{{url_for('upload.rqtl2.select_species')}}" - title="Upload a zip bundle of R/qtl2 files"> - <button class="btn btn-primary">upload R/qtl2 bundle</button></a> - </div> -</div> - - -<div class="row"> - <h2 class="heading">Expression Data</h2> - - <div class="explainer"> - <p>This feature enables you to upload expression data. It expects the data to - be in <strong>tab-separated values (TSV)</strong> files. The data should be - a simple matrix of <em>phenotype × sample</em>, i.e. The first column is a - list of the <em>phenotypes</em> and the first row is a list of - <em>samples/cases</em>.</p> - - <p>If you haven't done so please go to this page to learn the requirements for - file formats and helpful suggestions to enter your data in a fast and easy - way.</p> - - <ol> - <li><strong>PLEASE REVIEW YOUR DATA.</strong>Make sure your data complies - with our system requirements. 
( - <a href="{{url_for('entry.data_review')}}#data-concerns" - title="Details for the data expectations.">Help</a> - )</li> - <li><strong>UPLOAD YOUR DATA FOR DATA VERIFICATION.</strong> We accept - <strong>.csv</strong>, <strong>.txt</strong> and <strong>.zip</strong> - files (<a href="{{url_for('entry.data_review')}}#file-types" - title="Details for the data expectations.">Help</a>)</li> - </ol> - </div> - - <a href="{{url_for('entry.upload_file')}}" - title="Upload your expression data" - class="btn btn-primary">upload expression data</a> -</div> - -<div class="row"> - <h2 class="heading">samples/cases</h2> - - <div class="explainer"> - <p>For the expression data above, you need the samples/cases in your file to - already exist in the GeneNetwork database. If there are any samples that do - not already exist the upload of the expression data will fail.</p> - <p>This section gives you the opportunity to upload any missing samples</p> - </div> - - <a href="{{url_for('samples.select_species')}}" - title="Upload samples/cases/individuals for your data" - class="btn btn-primary">upload Samples/Cases</a> -</div> - -{%endblock%} diff --git a/qc_app/templates/parse_results.html b/qc_app/templates/parse_results.html deleted file mode 100644 index e2bf7f0..0000000 --- a/qc_app/templates/parse_results.html +++ /dev/null @@ -1,30 +0,0 @@ -{%extends "base.html"%} -{%from "errors_display.html" import errors_display%} - -{%block title%}Parse Results{%endblock%} - -{%block contents%} -<h1 class="heading">{{job_name}}: parse results</h2> - -{%if user_aborted%} -<span class="alert-warning">Job aborted by the user</span> -{%endif%} - -{{errors_display(errors, "No errors found in the file", "We found the following errors", True)}} - -{%if errors | length == 0 and not user_aborted %} -<form method="post" action="{{url_for('dbinsert.select_platform')}}"> - <input type="hidden" name="job_id" value="{{job_id}}" /> - <input type="submit" value="update database" class="btn btn-primary" 
/> -</form> -{%endif%} - -{%if errors | length > 0 or user_aborted %} -<br /> -<a href="{{url_for('entry.upload_file')}}" title="Back to index page." - class="btn btn-primary"> - Go back -</a> -{%endif%} - -{%endblock%} diff --git a/qc_app/templates/rqtl2/create-geno-dataset-success.html b/qc_app/templates/rqtl2/create-geno-dataset-success.html deleted file mode 100644 index 1b50221..0000000 --- a/qc_app/templates/rqtl2/create-geno-dataset-success.html +++ /dev/null @@ -1,55 +0,0 @@ -{%extends "base.html"%} -{%from "flash_messages.html" import flash_messages%} - -{%block title%}Upload R/qtl2 Bundle{%endblock%} - -{%block contents%} -<h2 class="heading">Select Genotypes Dataset</h2> - -<div class="explainer"> - <p>You successfully created the genotype dataset with the following - information. - <dl> - <dt>ID</dt> - <dd>{{geno_dataset.id}}</dd> - - <dt>Name</dt> - <dd>{{geno_dataset.name}}</dd> - - <dt>Full Name</dt> - <dd>{{geno_dataset.fname}}</dd> - - <dt>Short Name</dt> - <dd>{{geno_dataset.sname}}</dd> - - <dt>Created On</dt> - <dd>{{geno_dataset.today}}</dd> - - <dt>Public?</dt> - <dd>{%if geno_dataset.public == 0%}No{%else%}Yes{%endif%}</dd> - </dl> - </p> -</div> - -<div class="row"> - <form id="frm-upload-rqtl2-bundle" - action="{{url_for('upload.rqtl2.select_dataset_info', - species_id=species.SpeciesId, - population_id=population.InbredSetId)}}" - method="POST" - enctype="multipart/form-data"> - <legend class="heading">select from existing genotype datasets</legend> - - <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> - <input type="hidden" name="population_id" - value="{{population.InbredSetId}}" /> - <input type="hidden" name="rqtl2_bundle_file" - value="{{rqtl2_bundle_file}}" /> - <input type="hidden" name="geno-dataset-id" - value="{{geno_dataset.id}}" /> - - <button type="submit" class="btn btn-primary">continue</button> - </form> -</div> - -{%endblock%} diff --git a/qc_app/templates/rqtl2/create-probe-dataset-success.html 
b/qc_app/templates/rqtl2/create-probe-dataset-success.html deleted file mode 100644 index 790d174..0000000 --- a/qc_app/templates/rqtl2/create-probe-dataset-success.html +++ /dev/null @@ -1,59 +0,0 @@ -{%extends "base.html"%} -{%from "flash_messages.html" import flash_messages%} - -{%block title%}Upload R/qtl2 Bundle{%endblock%} - -{%block contents%} -<h2 class="heading">Create ProbeSet Dataset</h2> - -<div class="row"> - <p>You successfully created the ProbeSet dataset with the following - information. - <dl> - <dt>Averaging Method</dt> - <dd>{{avgmethod.Name}}</dd> - - <dt>ID</dt> - <dd>{{dataset.datasetid}}</dd> - - <dt>Name</dt> - <dd>{{dataset.name2}}</dd> - - <dt>Full Name</dt> - <dd>{{dataset.fname}}</dd> - - <dt>Short Name</dt> - <dd>{{dataset.sname}}</dd> - - <dt>Created On</dt> - <dd>{{dataset.today}}</dd> - - <dt>DataScale</dt> - <dd>{{dataset.datascale}}</dd> - </dl> - </p> -</div> - -<div class="row"> - <form id="frm-upload-rqtl2-bundle" - action="{{url_for('upload.rqtl2.select_dataset_info', - species_id=species.SpeciesId, - population_id=population.InbredSetId)}}" - method="POST" - enctype="multipart/form-data"> - <legend class="heading">Create ProbeSet dataset</legend> - - <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> - <input type="hidden" name="population_id" - value="{{population.InbredSetId}}" /> - <input type="hidden" name="rqtl2_bundle_file" value="{{rqtl2_bundle_file}}" /> - <input type="hidden" name="geno-dataset-id" value="{{geno_dataset.Id}}" /> - <input type="hidden" name="tissueid" value="{{tissue.Id}}" /> - <input type="hidden" name="probe-study-id" value="{{study.Id}}" /> - <input type="hidden" name="probe-dataset-id" value="{{dataset.datasetid}}" /> - - <button type="submit" class="btn btn-primary">continue</button> - </form> -</div> - -{%endblock%} diff --git a/qc_app/templates/rqtl2/create-probe-study-success.html b/qc_app/templates/rqtl2/create-probe-study-success.html deleted file mode 100644 index 
d0ee508..0000000 --- a/qc_app/templates/rqtl2/create-probe-study-success.html +++ /dev/null @@ -1,49 +0,0 @@ -{%extends "base.html"%} -{%from "flash_messages.html" import flash_messages%} - -{%block title%}Upload R/qtl2 Bundle{%endblock%} - -{%block contents%} -<h2 class="heading">Create ProbeSet Study</h2> - -<div class="row"> - <p>You successfully created the ProbeSet study with the following - information. - <dl> - <dt>ID</dt> - <dd>{{study.id}}</dd> - - <dt>Name</dt> - <dd>{{study.name}}</dd> - - <dt>Full Name</dt> - <dd>{{study.fname}}</dd> - - <dt>Short Name</dt> - <dd>{{study.sname}}</dd> - - <dt>Created On</dt> - <dd>{{study.today}}</dd> - </dl> - </p> - - <form id="frm-upload-rqtl2-bundle" - action="{{url_for('upload.rqtl2.select_dataset_info', - species_id=species.SpeciesId, - population_id=population.InbredSetId)}}" - method="POST" - enctype="multipart/form-data"> - <legend class="heading">Create ProbeSet study</legend> - - <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> - <input type="hidden" name="population_id" - value="{{population.InbredSetId}}" /> - <input type="hidden" name="rqtl2_bundle_file" value="{{rqtl2_bundle_file}}" /> - <input type="hidden" name="geno-dataset-id" value="{{geno_dataset.Id}}" /> - <input type="hidden" name="probe-study-id" value="{{study.studyid}}" /> - - <button type="submit" class="btn btn-primary">continue</button> - </form> -</div> - -{%endblock%} diff --git a/qc_app/templates/rqtl2/index.html b/qc_app/templates/rqtl2/index.html deleted file mode 100644 index f3329c2..0000000 --- a/qc_app/templates/rqtl2/index.html +++ /dev/null @@ -1,36 +0,0 @@ -{%extends "base.html"%} -{%from "flash_messages.html" import flash_messages%} - -{%block title%}Data Upload{%endblock%} - -{%block contents%} -<h1 class="heading">R/qtl2 data upload</h1> - -<h2>R/qtl2 Upload</h2> - -<form method="POST" action="{{url_for('upload.rqtl2.select_species')}}" - id="frm-rqtl2-upload"> - <legend class="heading">upload R/qtl2 
bundle</legend> - {{flash_messages("error-rqtl2")}} - - <div class="form-group"> - <label for="select:species" class="form-label">Species</label> - <select id="select:species" - name="species_id" - required="required" - class="form-control"> - <option value="">Select species</option> - {%for spec in species%} - <option value="{{spec.SpeciesId}}">{{spec.MenuName}}</option> - {%endfor%} - </select> - <small class="form-text text-muted"> - Data that you upload to the system should belong to a know species. - Here you can select the species that you wish to upload data for. - </small> - </div> - - <button type="submit" class="btn btn-primary" />submit</button> -</form> - -{%endblock%} diff --git a/qc_app/templates/rqtl2/select-geno-dataset.html b/qc_app/templates/rqtl2/select-geno-dataset.html deleted file mode 100644 index 873f9c3..0000000 --- a/qc_app/templates/rqtl2/select-geno-dataset.html +++ /dev/null @@ -1,144 +0,0 @@ -{%extends "base.html"%} -{%from "flash_messages.html" import flash_messages%} - -{%block title%}Upload R/qtl2 Bundle{%endblock%} - -{%block contents%} -<h2 class="heading">Select Genotypes Dataset</h2> - -<div class="row"> - <p>Your R/qtl2 files bundle contains a "geno" specification. 
You will - therefore need to select from one of the existing Genotype datasets or - create a new one.</p> - <p>This is the dataset where your data will be organised under.</p> -</div> - -<div class="row"> - <form id="frm-upload-rqtl2-bundle" - action="{{url_for('upload.rqtl2.select_geno_dataset', - species_id=species.SpeciesId, - population_id=population.InbredSetId)}}" - method="POST" - enctype="multipart/form-data"> - <legend class="heading">select from existing genotype datasets</legend> - - <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> - <input type="hidden" name="population_id" - value="{{population.InbredSetId}}" /> - <input type="hidden" name="rqtl2_bundle_file" - value="{{rqtl2_bundle_file}}" /> - - {{flash_messages("error-rqtl2-select-geno-dataset")}} - - <div class="form-group"> - <legend>Datasets</legend> - <label for="select:geno-datasets" class="form-label">Dataset</label> - <select id="select:geno-datasets" - name="geno-dataset-id" - required="required" - {%if datasets | length == 0%} - disabled="disabled" - {%endif%} - class="form-control" - aria-describedby="help-geno-dataset-select-dataset"> - <option value="">Select dataset</option> - {%for dset in datasets%} - <option value="{{dset['Id']}}">{{dset["Name"]}} ({{dset["FullName"]}})</option> - {%endfor%} - </select> - <span id="help-geno-dataset-select-dataset" class="form-text text-muted"> - Select from the existing genotype datasets for species - {{species.SpeciesName}} ({{species.FullName}}). 
- </span> - </div> - - <button type="submit" class="btn btn-primary">select dataset</button> - </form> -</div> - -<div class="row"> - <p style="color:#FE3535; padding-left:20em; font-weight:bolder;">OR</p> -</div> - -<div class="row"> - <form id="frm-upload-rqtl2-bundle" - action="{{url_for('upload.rqtl2.create_geno_dataset', - species_id=species.SpeciesId, - population_id=population.InbredSetId)}}" - method="POST" - enctype="multipart/form-data"> - <legend class="heading">create a new genotype dataset</legend> - - <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> - <input type="hidden" name="population_id" - value="{{population.InbredSetId}}" /> - <input type="hidden" name="rqtl2_bundle_file" - value="{{rqtl2_bundle_file}}" /> - - {{flash_messages("error-rqtl2-create-geno-dataset")}} - - <div class="form-group"> - <label for="txt:dataset-name" class="form-label">Name</label> - <input type="text" - id="txt:dataset-name" - name="dataset-name" - maxlength="100" - required="required" - class="form-control" - aria-describedby="help-geno-dataset-name" /> - <span id="help-geno-dataset-name" class="form-text text-muted"> - Provide the new name for the genotype dataset, e.g. "BXDGeno" - </span> - </div> - - <div class="form-group"> - <label for="txt:dataset-fullname" class="form-label">Full Name</label> - <input type="text" - id="txt:dataset-fullname" - name="dataset-fullname" - required="required" - maxlength="100" - class="form-control" - aria-describedby="help-geno-dataset-fullname" /> - - <span id="help-geno-dataset-fullname" class="form-text text-muted"> - Provide a longer name that better describes the genotype dataset, e.g. 
- "BXD Genotypes" - </span> - </div> - - <div class="form-group"> - <label for="txt:dataset-shortname" class="form-label">Short Name</label> - <input type="text" - id="txt:dataset-shortname" - name="dataset-shortname" - maxlength="100" - class="form-control" - aria-describedby="help-geno-dataset-shortname" /> - - <span id="help-geno-dataset-shortname" class="form-text text-muted"> - Provide a short name for the genotype dataset. This is optional. If not - provided, we'll default to the same value as the "Name" above. - </span> - </div> - - <div class="form-group"> - <input type="checkbox" - id="chk:dataset-public" - name="dataset-public" - checked="checked" - class="form-check" - aria-describedby="help-geno-datasent-public" /> - <label for="chk:dataset-public" class="form-check-label">Public?</label> - - <span id="help-geno-dataset-public" class="form-text text-muted"> - Specify whether the dataset will be available publicly. Check to make the - dataset publicly available and uncheck to limit who can access the dataset. - </span> - </div> - - <button type="submit" class="btn btn-primary">create new dataset</button> - </form> -</div> - -{%endblock%} diff --git a/qc_app/templates/rqtl2/select-population.html b/qc_app/templates/rqtl2/select-population.html deleted file mode 100644 index 37731f0..0000000 --- a/qc_app/templates/rqtl2/select-population.html +++ /dev/null @@ -1,136 +0,0 @@ -{%extends "base.html"%} -{%from "flash_messages.html" import flash_messages%} - -{%block title%}Select Grouping/Population{%endblock%} - -{%block contents%} -<h1 class="heading">Select grouping/population</h1> - -<div class="explainer"> - <p>The data is organised in a hierarchical form, beginning with - <em>species</em> at the very top. Under <em>species</em> the data is - organised by <em>population</em>, sometimes referred to as <em>grouping</em>. 
- (In some really old documents/systems, you might see this referred to as - <em>InbredSet</em>.)</p> - <p>In this section, you get to define what population your data is to be - organised by.</p> -</div> - -<form method="POST" - action="{{url_for('upload.rqtl2.select_population', species_id=species.SpeciesId)}}"> - <legend class="heading">select grouping/population</legend> - {{flash_messages("error-select-population")}} - - <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> - - <div class="form-group"> - <label for="select:inbredset" class="form-label">population</label> - <select id="select:inbredset" - name="inbredset_id" - required="required" - class="form-control"> - <option value="">Select a grouping/population</option> - {%for pop in populations%} - <option value="{{pop.InbredSetId}}"> - {{pop.InbredSetName}} ({{pop.FullName}})</option> - {%endfor%} - </select> - <span class="form-text text-muted">If you are adding data to an already existing - population, simply pick the population from this drop-down selector. 
If - you cannot find your population from this list, try the form below to - create a new one..</span> - </div> - - <button type="submit" class="btn btn-primary" />select population</button> -</form> - -<p style="color:#FE3535; padding-left:20em; font-weight:bolder;">OR</p> - -<form method="POST" - action="{{url_for('upload.rqtl2.create_population', species_id=species.SpeciesId)}}"> - <legend class="heading">create new grouping/population</legend> - {{flash_messages("error-create-population")}} - - <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> - - <div class="form-group"> - <legend class="heading">mandatory</legend> - - <div class="form-group"> - <label for="txt:inbredset-name" class="form-label">name</label> - <input id="txt:inbredset-name" - name="inbredset_name" - type="text" - required="required" - maxlength="30" - placeholder="Enter grouping/population name" - class="form-control" /> - <span class="form-text text-muted">This is a short name that identifies the - population. Useful for menus, and quick scanning.</span> - </div> - - <div class="form-group"> - <label for="txt:" class="form-label">full name</label> - <input id="txt:inbredset-fullname" - name="inbredset_fullname" - type="text" - required="required" - maxlength="100" - placeholder="Enter the grouping/population's full name" - class="form-control" /> - <span class="form-text text-muted">This can be the same as the name above, or can - be longer. 
Useful for documentation, and human communication.</span> - </div> - </div> - - <div class="form-group"> - <legend class="heading">optional</legend> - - <div class="form-group"> - <label for="num:public" class="form-label">public?</label> - <select id="num:public" - name="public" - class="form-control"> - <option value="0">0 - Only accessible to authorised users</option> - <option value="1">1 - Publicly accessible to all users</option> - <option value="2" selected> - 2 - Publicly accessible to all users</option> - </select> - <span class="form-text text-muted">This determines whether the - population/grouping will appear on the menus for users.</span> - </div> - - <div class="form-group"> - <label for="txt:inbredset-family" class="form-label">family</label> - <input id="txt:inbredset-family" - name="inbredset_family" - type="text" - placeholder="I am not sure what this is about." - class="form-control" /> - <span class="form-text text-muted">I do not currently know what this is about. - This is a failure on my part to figure out what this is and provide a - useful description. Please feel free to remind me.</span> - </div> - - <div class="form-group"> - <label for="txtarea:" class="form-label">Description</label> - <textarea id="txtarea:description" - name="description" - rows="5" - placeholder="Enter a description of this grouping/population" - class="form-control"></textarea> - <span class="form-text text-muted"> - A long-form description of what the population consists of. 
Useful for - humans.</span> - </div> - </div> - - <button type="submit" class="btn btn-primary" /> - create grouping/population</button> -</form> - -{%endblock%} - - -{%block javascript%} -{%endblock%} diff --git a/qc_app/templates/samples/select-population.html b/qc_app/templates/samples/select-population.html deleted file mode 100644 index da19ddc..0000000 --- a/qc_app/templates/samples/select-population.html +++ /dev/null @@ -1,99 +0,0 @@ -{%extends "base.html"%} -{%from "flash_messages.html" import flash_messages%} - -{%block title%}Select Grouping/Population{%endblock%} - -{%block contents%} -<h1 class="heading">Select grouping/population</h1> - -<div> - <p>We organise the samples/cases/strains in a hierarchichal form, starting - with <strong>species</strong> at the very top. Under species, we have a - grouping in terms of the relevant population - (e.g. Inbred populations, cell tissue, etc.)</p> -</div> - -<form method="POST" action="{{url_for('samples.select_population', - species_id=species.SpeciesId)}}"> - <legend class="heading">select grouping/population</legend> - {{flash_messages("error-select-population")}} - - <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> - - <div class="form-group"> - <label for="select:inbredset" class="form-label">grouping/population</label> - <select id="select:inbredset" - name="inbredset_id" - required="required" - class="form-control"> - <option value="">Select a grouping/population</option> - {%for pop in populations%} - <option value="{{pop.InbredSetId}}"> - {{pop.InbredSetName}} ({{pop.FullName}})</option> - {%endfor%} - </select> - </div> - - <button type="submit" class="btn btn-primary">select population</button> -</form> - -<p style="color:#FE3535; padding-left:20em; font-weight:bolder;">OR</p> - -<form method="POST" action="{{url_for('samples.create_population', - species_id=species.SpeciesId)}}"> - <legend class="heading">create new grouping/population</legend> - 
{{flash_messages("error-create-population")}} - - <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> - <div class="form-group"> - <legend>mandatory</legend> - - <label for="txt:inbredset-name" class="form-label">name</label> - <input id="txt:inbredset-name" - name="inbredset_name" - type="text" - required="required" - placeholder="Enter grouping/population name" - class="form-control" /> - - <label for="txt:" class="form-label">full name</label> - <input id="txt:inbredset-fullname" - name="inbredset_fullname" - type="text" - required = "required" - placeholder="Enter the grouping/population's full name" - class="form-control" /> - </div> - <div class="form-group"> - <legend>Optional</legend> - - <label for="num:public" class="form-label">public?</label> - <input id="num:public" - name="public" - type="number" - min="0" max="2" value="2" - class="form-control" /> - - <label for="txt:inbredset-family" class="form-label">family</label> - <input id="txt:inbredset-family" - name="inbredset_family" - type="text" - placeholder="I am not sure what this is about." 
- class="form-control" /> - - <label for="txtarea:" class="form-label">Description</label> - <textarea id="txtarea:description" - name="description" - rows="5" - placeholder="Enter a description of this grouping/population" - class="form-control"></textarea> - </div> - - <button type="submit" class="btn btn-primary">create grouping/population</button> -</form> - -{%endblock%} - - -{%block javascript%} -{%endblock%} diff --git a/qc_app/templates/samples/select-species.html b/qc_app/templates/samples/select-species.html deleted file mode 100644 index edadc61..0000000 --- a/qc_app/templates/samples/select-species.html +++ /dev/null @@ -1,30 +0,0 @@ -{%extends "base.html"%} -{%from "flash_messages.html" import flash_all_messages%} - -{%block title%}Select Grouping/Population{%endblock%} - -{%block contents%} -<h2 class="heading">upload samples/cases</h2> - -<p>We need to know what species your data belongs to.</p> - -{{flash_all_messages()}} - -<form method="POST" action="{{url_for('samples.select_species')}}"> - <legend class="heading">upload samples</legend> - <div class="form-group"> - <label for="select_species02" class="form-label">Species</label> - <select id="select_species02" - name="species_id" - required="required" - class="form-control"> - <option value="">Select species</option> - {%for spec in species%} - <option value="{{spec.SpeciesId}}">{{spec.MenuName}}</option> - {%endfor%} - </select> - </div> - - <button type="submit" class="btn btn-primary">submit</button> -</form> -{%endblock%} diff --git a/qc_app/templates/samples/upload-progress.html b/qc_app/templates/samples/upload-progress.html deleted file mode 100644 index 7bb02be..0000000 --- a/qc_app/templates/samples/upload-progress.html +++ /dev/null @@ -1,22 +0,0 @@ -{%extends "base.html"%} -{%from "cli-output.html" import cli_output%} - -{%block extrameta%} -<meta http-equiv="refresh" content="5"> -{%endblock%} - -{%block title%}Job Status{%endblock%} - -{%block contents%} -<h1 
class="heading">{{job.job_name}}</h2> - -<p> -<strong>status</strong>: -<span>{{job["status"]}} ({{job.get("message", "-")}})</span><br /> -</p> - -<p>saving to database...</p> - -{{cli_output(job, "stdout")}} - -{%endblock%} diff --git a/qc_app/templates/samples/upload-samples.html b/qc_app/templates/samples/upload-samples.html deleted file mode 100644 index e62de57..0000000 --- a/qc_app/templates/samples/upload-samples.html +++ /dev/null @@ -1,139 +0,0 @@ -{%extends "base.html"%} -{%from "flash_messages.html" import flash_messages%} - -{%block title%}Upload Samples{%endblock%} - -{%block css%}{%endblock%} - -{%block contents%} -<h1 class="heading">upload samples</h1> - -{{flash_messages("alert-success")}} - -<p>You can now upload a character-separated value (CSV) file that contains - details about your samples. The CSV file should have the following fields: - <dl> - <dt>Name</dt> - <dd>The primary name for the sample</dd> - - <dt>Name2</dt> - <dd>A secondary name for the sample. This can simply be the same as - <strong>Name</strong> above. This field <strong>MUST</strong> contain a - value.</dd> - - <dt>Symbol</dt> - <dd>A symbol for the sample. Can be an empty field.</dd> - - <dt>Alias</dt> - <dd>An alias for the sample. 
Can be an empty field.</dd> - </dl> -</p> - -<form id="form-samples" - method="POST" - action="{{url_for('samples.upload_samples', - species_id=species.SpeciesId, - population_id=population.InbredSetId)}}" - enctype="multipart/form-data"> - <legend class="heading">upload samples</legend> - - <div class="form-group"> - <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> - <label class="form-label">species:</label> - <span class="form-text">{{species.SpeciesName}} [{{species.MenuName}}]</span> - </div> - - <div class="form-group"> - <input type="hidden" name="inbredset_id" value="{{population.InbredSetId}}" /> - <label class="form-label">grouping/population:</label> - <span class="form-text">{{population.Name}} [{{population.FullName}}]</span> - </div> - - <div class="form-group"> - <label for="file-samples" class="form-label">select file</label> - <input type="file" name="samples_file" id="file:samples" - accept="text/csv, text/tab-separated-values" - class="form-control" /> - </div> - - <div class="form-group"> - <label for="select:separator" class="form-label">field separator</label> - <select id="select:separator" - name="separator" - required="required" - class="form-control"> - <option value="">Select separator for your file: (default is comma)</option> - <option value="	">TAB</option> - <option value=" ">Space</option> - <option value=",">Comma</option> - <option value=";">Semicolon</option> - <option value="other">Other</option> - </select> - <input id="txt:separator" - type="text" - name="other_separator" - class="form-control" /> - <small class="form-text text-muted"> - If you select '<strong>Other</strong>' for the field separator value, - enter the character that separates the fields in your CSV file in the form - field below. 
- </small> - </div> - - <div class="form-group form-check"> - <input id="chk:heading" - type="checkbox" - name="first_line_heading" - class="form-check-input" /> - <label for="chk:heading" class="form-check-label"> - first line is a heading?</label> - <small class="form-text text-muted"> - Select this if the first line in your file contains headings for the - columns. - </small> - </div> - - <div class="form-group"> - <label for="txt:delimiter" class="form-label">field delimiter</label> - <input id="txt:delimiter" - type="text" - name="field_delimiter" - maxlength="1" - class="form-control" /> - <small class="form-text text-muted"> - If there is a character delimiting the string texts within particular - fields in your CSV, provide the character here. This can be left blank if - no such delimiters exist in your file. - </small> - </div> - - <button type="submit" - class="btn btn-primary">upload samples file</button> -</form> - -<table id="tbl:samples-preview" class="table"> - <caption class="heading">preview content</caption> - - <thead> - <tr> - <th>Name</th> - <th>Name2</th> - <th>Symbol</th> - <th>Alias</th> - </tr> - </thead> - - <tbody> - <tr id="default-row"> - <td colspan="4"> - Please make some selections to preview the data.</td> - </tr> - </tbody> -</table> - -{%endblock%} - - -{%block javascript%} -<script src="/static/js/upload_samples.js" type="text/javascript"></script> -{%endblock%} diff --git a/qc_app/templates/samples/upload-success.html b/qc_app/templates/samples/upload-success.html deleted file mode 100644 index cb745c3..0000000 --- a/qc_app/templates/samples/upload-success.html +++ /dev/null @@ -1,18 +0,0 @@ -{%extends "base.html"%} -{%from "cli-output.html" import cli_output%} - -{%block title%}Job Status{%endblock%} - -{%block contents%} -<h1 class="heading">{{job.job_name}}</h2> - -<p> -<strong>status</strong>: -<span>{{job["status"]}} ({{job.get("message", "-")}})</span><br /> -</p> - -<p>Successfully uploaded the samples.</p> - 
-{{cli_output(job, "stdout")}} - -{%endblock%} diff --git a/qc_app/templates/select_species.html b/qc_app/templates/select_species.html deleted file mode 100644 index 3b1a8a9..0000000 --- a/qc_app/templates/select_species.html +++ /dev/null @@ -1,92 +0,0 @@ -{%extends "base.html"%} -{%from "flash_messages.html" import flash_messages%} -{%from "upload_progress_indicator.html" import upload_progress_indicator%} - -{%block title%}expression data: select species{%endblock%} - -{%block contents%} -{{upload_progress_indicator()}} - -<h2 class="heading">expression data: select species</h2> - -<div class="row"> - <form action="{{url_for('entry.upload_file')}}" - method="POST" - enctype="multipart/form-data" - id="frm-upload-expression-data"> - <legend class="heading">upload expression data</legend> - {{flash_messages("error-expr-data")}} - - <div class="form-group"> - <label for="select_species01" class="form-label">Species</label> - <select id="select_species01" - name="speciesid" - required="required" - class="form-control"> - <option value="">Select species</option> - {%for aspecies in species%} - <option value="{{aspecies.SpeciesId}}">{{aspecies.MenuName}}</option> - {%endfor%} - </select> - </div> - - <div class="form-group"> - <legend class="heading">file type</legend> - - <div class="form-check"> - <input type="radio" name="filetype" value="average" id="filetype_average" - required="required" class="form-check-input" /> - <label for="filetype_average" class="form-check-label">average</label> - </div> - - <div class="form-check"> - <input type="radio" name="filetype" value="standard-error" - id="filetype_standard_error" required="required" - class="form-check-input" /> - <label for="filetype_standard_error" class="form-check-label"> - standard error - </label> - </div> - </div> - - <div class="form-group"> - <span id="no-file-error" class="alert-danger" style="display: none;"> - No file selected - </span> - <label for="file_upload" class="form-label">select 
file</label> - <input type="file" name="qc_text_file" id="file_upload" - accept="text/plain, text/tab-separated-values, application/zip" - class="form-control"/> - </div> - - <button type="submit" - class="btn btn-primary" - data-toggle="modal" - data-target="#upload-progress-indicator">upload file</button> - </form> -</div> -{%endblock%} - - -{%block javascript%} -<script type="text/javascript" src="static/js/upload_progress.js"></script> -<script type="text/javascript"> - function setup_formdata(form) { - var formdata = new FormData(); - formdata.append( - "speciesid", - form.querySelector("#select_species01").value) - formdata.append( - "qc_text_file", - form.querySelector("input[type='file']").files[0]); - formdata.append( - "filetype", - selected_filetype( - Array.from(form.querySelectorAll("input[type='radio']")))); - return formdata; - } - - setup_upload_handlers( - "frm-upload-expression-data", make_data_uploader(setup_formdata)); -</script> -{%endblock%} diff --git a/qc_app/templates/unhandled_exception.html b/qc_app/templates/unhandled_exception.html deleted file mode 100644 index 6e6a051..0000000 --- a/qc_app/templates/unhandled_exception.html +++ /dev/null @@ -1,21 +0,0 @@ -{%extends "base.html"%} - -{%block title%}System Error{%endblock%} - -{%block css%} -<link rel="stylesheet" href="/static/css/two-column-with-separator.css" /> -{%endblock%} - -{%block contents%} -<p> - An error has occured, and your request has been aborted. Please notify the - administrator to try and get this sorted. 
-</p> -<p> - Provide the following information to help the administrator figure out and fix - the issue:<br /> - <hr /><br /> - {{trace}} - <hr /><br /> -</p> -{%endblock%} diff --git a/qc_app/upload/__init__.py b/qc_app/upload/__init__.py deleted file mode 100644 index 5f120d4..0000000 --- a/qc_app/upload/__init__.py +++ /dev/null @@ -1,7 +0,0 @@ -"""Package handling upload of files.""" -from flask import Blueprint - -from .rqtl2 import rqtl2 - -upload = Blueprint("upload", __name__) -upload.register_blueprint(rqtl2, url_prefix="/rqtl2") diff --git a/r_qtl/errors.py b/r_qtl/exceptions.py index 417eb58..9620cf4 100644 --- a/r_qtl/errors.py +++ b/r_qtl/exceptions.py @@ -6,7 +6,7 @@ class RQTLError(Exception): class InvalidFormat(RQTLError): """Raised when the format of the file(s) is invalid.""" -class MissingFileError(InvalidFormat): +class MissingFileException(InvalidFormat): """ Raise when at least one file listed in the control file is missing from the R/qtl2 bundle. diff --git a/r_qtl/fileerrors.py b/r_qtl/fileerrors.py index e76676c..c253d71 100644 --- a/r_qtl/fileerrors.py +++ b/r_qtl/fileerrors.py @@ -1,5 +1,14 @@ """QC errors as distinguished from actual exceptions""" from collections import namedtuple +InvalidValue = namedtuple( + "InvalidValue", + ("filename", + "rowtitle", + "coltitle", + "cellvalue", + "message")) + + MissingFile = namedtuple( "MissingFile", ("controlfilekey", "filename", "message")) diff --git a/r_qtl/r_qtl2.py b/r_qtl/r_qtl2.py index 0a96e7c..c6307f5 100644 --- a/r_qtl/r_qtl2.py +++ b/r_qtl/r_qtl2.py @@ -1,22 +1,27 @@ """The R/qtl2 parsing and processing code.""" import io +import os import csv import json from pathlib import Path -from zipfile import ZipFile from functools import reduce, partial +from zipfile import ZipFile, is_zipfile from typing import Union, Iterator, Iterable, Callable, Optional import yaml from functional_tools import take, chain -from r_qtl.errors import InvalidFormat, MissingFileError +from r_qtl.exceptions 
import InvalidFormat, MissingFileException FILE_TYPES = ( "geno", "founder_geno", "pheno", "covar", "phenocovar", "gmap", "pmap", "phenose") +__CONTROL_FILE_ERROR_MESSAGE__ = ( + "The zipped bundle that was provided does not contain a valid control file " + "in either JSON or YAML format.") + def __special_file__(filename): """ @@ -30,7 +35,81 @@ def __special_file__(filename): return (is_macosx_special_file or is_nix_hidden_file) -def control_data(zfile: ZipFile) -> dict: +def extract(zfile: ZipFile, outputdir: Path) -> tuple[Path, ...]: + """Extract a ZipFile + + This function will extract a zipfile `zfile` to the directory `outputdir`. + + Parameters + ---------- + zfile: zipfile.ZipFile object - the zipfile to extract. + outputdir: Optional pathlib.Path object - where the extracted files go. + + Returns + ------- + A tuple of Path objects, each pointing to a member in the zipfile. + """ + outputdir.mkdir(parents=True, exist_ok=True) + return tuple(Path(zfile.extract(member, outputdir)) + for member in zfile.namelist() + if not __special_file__(member)) + + +def transpose_csv( + inpath: Path, + linesplitterfn: Callable, + linejoinerfn: Callable, + outpath: Path) -> Path: + """Transpose a file: Make its rows into columns and its columns into rows. + + This function will create a new file, `outfile`, with the same content as + the original, `infile`, except transposed i.e. The rows of `infile` are the + columns of `outfile` and the columns of `infile` are the rows of `outfile`. + + Parameters + ---------- + inpath: The CSV file to transpose. 
+ linesplitterfn: A function to use for splitting each line into columns + linejoinerfn: A function to use to rebuild the lines + outpath: The path where the transposed data is stored + """ + def __read_by_line__(_path): + with open(_path, "r", encoding="utf8") as infile: + for line in infile: + if line.startswith("#"): + continue + yield line + + transposed_data= (f"{linejoinerfn(items)}\n" for items in zip(*( + linesplitterfn(line) for line in __read_by_line__(inpath)))) + + with open(outpath, "w", encoding="utf8") as outfile: + for line in transposed_data: + outfile.write(line) + + return outpath + + +def transpose_csv_with_rename(inpath: Path, + linesplitterfn: Callable, + linejoinerfn: Callable) -> Path: + """Renames input file and creates new transposed file with the original name + of the input file. + + Parameters + ---------- + inpath: Path to the input file. Should be a pathlib.Path object. + linesplitterfn: A function to use for splitting each line into columns + linejoinerfn: A function to use to rebuild the lines + """ + transposedfilepath = Path(inpath) + origbkp = inpath.parent.joinpath(f"{inpath.stem}___original{inpath.suffix}") + os.rename(inpath, origbkp) + return transpose_csv( + origbkp, linesplitterfn, linejoinerfn, transposedfilepath) + + +def __control_data_from_zipfile__(zfile: ZipFile) -> dict: """Retrieve the control file from the zip file info.""" files = tuple(filename for filename in zfile.namelist() @@ -39,7 +118,7 @@ def control_data(zfile: ZipFile) -> dict: or filename.endswith(".json")))) num_files = len(files) if num_files == 0: - raise InvalidFormat("Expected a json or yaml control file.") + raise InvalidFormat(__CONTROL_FILE_ERROR_MESSAGE__) if num_files > 1: raise InvalidFormat("Found more than one possible control file.") @@ -56,6 +135,80 @@ def control_data(zfile: ZipFile) -> dict: else yaml.safe_load(zfile.read(files[0]))) } +def __control_data_from_dirpath__(dirpath: Path): + """Load control data from a given directory 
path.""" + files = tuple(path for path in dirpath.iterdir() + if (not __special_file__(path.name) + and (path.suffix in (".yaml", ".json")))) + num_files = len(files) + if num_files == 0: + raise InvalidFormat(__CONTROL_FILE_ERROR_MESSAGE__) + + if num_files > 1: + raise InvalidFormat("Found more than one possible control file.") + + with open(files[0], "r", encoding="utf8") as infile: + return { + "na.strings": ["NA"], + "comment.char": "#", + "sep": ",", + **{ + f"{key}_transposed": False for key in FILE_TYPES + }, + **(json.loads(infile.read()) + if files[0].suffix == ".json" + else yaml.safe_load(infile.read())) + } + + +def control_data(control_src: Union[Path, ZipFile]) -> dict: + """Read the R/qtl2 bundle control file. + + Parameters + ---------- + control_src: Path object or ZipFile object. + If a directory path is provided, this function will read the control + data from the control file in that directory. + It is important that the Path be a directory and contain data from one + and only one R/qtl2 bundle. + + If a ZipFile object is provided, then the control data is read from the + control file within the zip file. We are moving away from parsing data + directly from ZipFile objects, and this is retained only until the + transition to using extracted files is complete. + + Returns + ------- + Returns a dict object with the control data that determines what the files + in the bundle are and how to parse them. 
+ + Raises + ------ + r_qtl.exceptions.InvalidFormat + """ + def __cleanup__(cdata): + return { + **cdata, + **dict((filetype, + ([cdata[filetype]] if isinstance(cdata[filetype], str) + else cdata[filetype]) + ) for filetype in + (typ for typ in cdata.keys() if typ in FILE_TYPES)) + } + + if isinstance(control_src, ZipFile): + return __cleanup__(__control_data_from_zipfile__(control_src)) + if isinstance(control_src, Path): + if is_zipfile(control_src): + return __cleanup__( + __control_data_from_zipfile__(ZipFile(control_src))) + if control_src.is_dir(): + return __cleanup__(__control_data_from_dirpath__(control_src)) + raise InvalidFormat( + "Expects either a zipped bundle of files or a path-like object " + "pointing to the zipped R/qtl2 bundle.") + + def replace_na_strings(cdata, val): """Replace values indicated in `na.strings` with `None`.""" return (None if val in cdata.get("na.strings", ["NA"]) else val) @@ -267,7 +420,7 @@ def file_data(zfile: ZipFile, zfile, member_key, cdata, process_transposed_value): yield row except KeyError as exc: - raise MissingFileError(*exc.args) from exc + raise MissingFileException(*exc.args) from exc def cross_information(zfile: ZipFile, cdata: dict) -> Iterator[dict]: """Load cross information where present.""" @@ -401,3 +554,21 @@ def load_samples(zipfilepath: Union[str, Path], pass return tuple(samples) + + + +def read_text_file(filepath: Union[str, Path]) -> Iterator[str]: + """Read the raw text from a text file.""" + with open(filepath, "r", encoding="utf8") as _file: + for line in _file: + yield line + + +def read_csv_file(filepath: Union[str, Path], + separator: str = ",", + comment_char: str = "#") -> Iterator[tuple[str, ...]]: + """Read a file as a csv file.""" + for line in read_text_file(filepath): + if line.startswith(comment_char): + continue + yield tuple(field.strip() for field in line.split(separator)) diff --git a/r_qtl/r_qtl2_qc.py b/r_qtl/r_qtl2_qc.py index be1eac4..2d9e9a8 100644 --- a/r_qtl/r_qtl2_qc.py +++ 
b/r_qtl/r_qtl2_qc.py @@ -1,12 +1,14 @@ """Quality control checks for R/qtl2 data bundles.""" -from zipfile import ZipFile +from pathlib import Path from functools import reduce, partial +from zipfile import ZipFile, is_zipfile from typing import Union, Iterator, Optional, Callable -from r_qtl import errors as rqe from r_qtl import r_qtl2 as rqtl2 +from r_qtl import exceptions as rqe from r_qtl.r_qtl2 import FILE_TYPES from r_qtl.fileerrors import MissingFile +from r_qtl.exceptions import InvalidFormat from quality_control.errors import InvalidValue from quality_control.checks import decimal_points_error @@ -39,11 +41,10 @@ def bundle_files_list(cdata: dict) -> tuple[tuple[str, str], ...]: return fileslist -def missing_files(zfile: ZipFile) -> tuple[tuple[str, str], ...]: - """ - Retrieve a list of files listed in the control file that do not exist in the - bundle. - """ + +def __missing_from_zipfile__( + zfile: ZipFile, cdata: dict) -> tuple[tuple[str, str], ...]: + """Check for missing files from a still-compressed zip file.""" def __missing_p__(filedetails: tuple[str, str]): _cfkey, thefile = filedetails try: @@ -52,14 +53,53 @@ def missing_files(zfile: ZipFile) -> tuple[tuple[str, str], ...]: except KeyError: return True - return tuple(afile for afile in bundle_files_list(rqtl2.control_data(zfile)) + return tuple(afile for afile in bundle_files_list(cdata) if __missing_p__(afile)) -def validate_bundle(zfile: ZipFile): + +def __missing_from_dirpath__( + dirpath: Path, cdata: dict) -> tuple[tuple[str, str], ...]: + """Check for missing files from an extracted bundle.""" + allfiles = tuple(_file.name for _file in dirpath.iterdir()) + return tuple(afile for afile in bundle_files_list(cdata) + if afile[1] not in allfiles) + + +def missing_files(bundlesrc: Union[Path, ZipFile]) -> tuple[tuple[str, str], ...]: + """ + Retrieve a list of files listed in the control file that do not exist in the + bundle. 
+ + Parameters + ---------- + bundlesrc: Path object or ZipFile object: This is the bundle under check. + + Returns + ------- + A tuple of names listed in the control file that do not exist in the bundle. + + Raises + ------ + r_qtl.exceptions.InvalidFormat + """ + cdata = rqtl2.control_data(bundlesrc) + if isinstance(bundlesrc, ZipFile): + return __missing_from_zipfile__(bundlesrc, cdata) + if isinstance(bundlesrc, Path): + if is_zipfile(bundlesrc): + return __missing_from_zipfile__(ZipFile(bundlesrc), cdata) + if bundlesrc.is_dir(): + return __missing_from_dirpath__(bundlesrc, cdata) + raise InvalidFormat( + "Expects either a zipfile.ZipFile object or a pathlib.Path object " + "pointing to a directory containing the R/qtl2 bundle.") + + +def validate_bundle(zfile: Union[Path, ZipFile]): """Ensure the R/qtl2 bundle is valid.""" missing = missing_files(zfile) if len(missing) > 0: - raise rqe.MissingFileError( + raise rqe.MissingFileException( "The following files do not exist in the bundle: " + ", ".join(mfile[1] for mfile in missing)) @@ -111,6 +151,6 @@ def retrieve_errors(zfile: ZipFile, filetype: str, checkers: tuple[Callable]) -> if value is not None: for checker in checkers: yield checker(lineno=lineno, field=field, value=value) - except rqe.MissingFileError: + except rqe.MissingFileException: fname = cdata.get(filetype) yield MissingFile(filetype, fname, f"Missing '{filetype}' file '{fname}'.") diff --git a/scripts/cli_parser.py b/scripts/cli_parser.py index 308ee4b..d42ae66 100644 --- a/scripts/cli_parser.py +++ b/scripts/cli_parser.py @@ -19,6 +19,12 @@ def init_cli_parser(program: str, description: Optional[str] = None) -> Argument type=int, default=86400, help="How long to keep any redis keys around.") + parser.add_argument( + "--loglevel", + type=str, + default="INFO", + choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"], + help="The severity of events to track with the logger.") return parser def add_global_data_arguments(parser: ArgumentParser) 
-> ArgumentParser: diff --git a/scripts/insert_data.py b/scripts/insert_data.py index 1465348..4b2e5f3 100644 --- a/scripts/insert_data.py +++ b/scripts/insert_data.py @@ -14,8 +14,8 @@ from MySQLdb.cursors import DictCursor from functional_tools import take from quality_control.file_utils import open_file -from qc_app.db_utils import database_connection -from qc_app.check_connections import check_db, check_redis +from uploader.db_utils import database_connection +from uploader.check_connections import check_db, check_redis # Set up logging stderr_handler = logging.StreamHandler(stream=sys.stderr) diff --git a/scripts/insert_samples.py b/scripts/insert_samples.py index 8431462..e3577b6 100644 --- a/scripts/insert_samples.py +++ b/scripts/insert_samples.py @@ -7,10 +7,11 @@ import argparse import MySQLdb as mdb from redis import Redis -from qc_app.db_utils import database_connection -from qc_app.check_connections import check_db, check_redis -from qc_app.db import species_by_id, population_by_id -from qc_app.samples import ( +from uploader.db_utils import database_connection +from uploader.check_connections import check_db, check_redis +from uploader.species.models import species_by_id +from uploader.population.models import population_by_id +from uploader.samples.models import ( save_samples_data, read_samples_file, cross_reference_samples) diff --git a/scripts/process_rqtl2_bundle.py b/scripts/process_rqtl2_bundle.py index 4da3936..ade9862 100644 --- a/scripts/process_rqtl2_bundle.py +++ b/scripts/process_rqtl2_bundle.py @@ -2,6 +2,7 @@ import sys import uuid import json +import argparse import traceback from typing import Any from pathlib import Path @@ -13,13 +14,13 @@ from redis import Redis from functional_tools import take -import r_qtl.errors as rqe import r_qtl.r_qtl2 as rqtl2 import r_qtl.r_qtl2_qc as rqc +import r_qtl.exceptions as rqe -from qc_app import jobs -from qc_app.db_utils import database_connection -from qc_app.check_connections import check_db, 
check_redis +from uploader import jobs +from uploader.db_utils import database_connection +from uploader.check_connections import check_db, check_redis from scripts.cli_parser import init_cli_parser from scripts.redis_logger import setup_redis_logger @@ -94,10 +95,11 @@ def process_bundle(dbconn: mdb.Connection, logger.info("Processing geno files.") genoexit = install_genotypes( dbconn, - meta["speciesid"], - meta["populationid"], - meta["geno-dataset-id"], - Path(meta["rqtl2-bundle-file"]), + argparse.Namespace( + speciesid=meta["speciesid"], + populationid=meta["populationid"], + datasetid=meta["geno-dataset-id"], + rqtl2bundle=Path(meta["rqtl2-bundle-file"])), logger) if genoexit != 0: raise Exception("Processing 'geno' file failed.") @@ -109,10 +111,11 @@ def process_bundle(dbconn: mdb.Connection, if has_pheno_file(thejob): phenoexit = install_pheno_files( dbconn, - meta["speciesid"], - meta["platformid"], - meta["probe-dataset-id"], - Path(meta["rqtl2-bundle-file"]), + argparse.Namespace( + speciesid=meta["speciesid"], + platformid=meta["platformid"], + dataset_id=meta["probe-dataset-id"], + rqtl2bundle=Path(meta["rqtl2-bundle-file"])), logger) if phenoexit != 0: raise Exception("Processing 'pheno' file failed.") diff --git a/scripts/qc.py b/scripts/qc.py index e8573a9..6de051f 100644 --- a/scripts/qc.py +++ b/scripts/qc.py @@ -11,7 +11,7 @@ from quality_control.utils import make_progress_calculator from quality_control.errors import InvalidValue, DuplicateHeading from quality_control.parsing import FileType, strain_names, collect_errors -from qc_app.db_utils import database_connection +from uploader.db_utils import database_connection from .cli_parser import init_cli_parser diff --git a/scripts/qc_on_rqtl2_bundle.py b/scripts/qc_on_rqtl2_bundle.py index 40809b7..fc95d13 100644 --- a/scripts/qc_on_rqtl2_bundle.py +++ b/scripts/qc_on_rqtl2_bundle.py @@ -16,13 +16,13 @@ from redis import Redis from quality_control.errors import InvalidValue from 
quality_control.checks import decimal_points_error -from qc_app import jobs -from qc_app.db_utils import database_connection -from qc_app.check_connections import check_db, check_redis +from uploader import jobs +from uploader.db_utils import database_connection +from uploader.check_connections import check_db, check_redis -from r_qtl import errors as rqe from r_qtl import r_qtl2 as rqtl2 from r_qtl import r_qtl2_qc as rqc +from r_qtl import exceptions as rqe from r_qtl import fileerrors as rqfe from scripts.process_rqtl2_bundle import parse_job @@ -105,7 +105,7 @@ def retrieve_errors_with_progress(rconn: Redis,#pylint: disable=[too-many-locals __update_processed__(value) rconn.hset(fqjobid, f"{filetype}-linecount", count) - except rqe.MissingFileError: + except rqe.MissingFileException: fname = cdata.get(filetype) yield rqfe.MissingFile(filetype, fname, ( f"The file '{fname}' does not exist in the bundle despite it being " @@ -133,7 +133,7 @@ def qc_geno_errors(rconn, fqjobid, _dburi, _speciesid, zfile, logger) -> bool: def fetch_db_geno_samples(conn: mdb.Connection, speciesid: int) -> tuple[str, ...]: """Fetch samples/cases/individuals from the database.""" - samples = set() + samples = set()# type: ignore[var-annotated] with conn.cursor() as cursor: cursor.execute("SELECT Name, Name2 from Strain WHERE SpeciesId=%s", (speciesid,)) @@ -191,12 +191,13 @@ def check_pheno_samples( return allerrors -def qc_pheno_errors(rconn, fqjobid, dburi, speciesid, zfile, logger) -> bool: +def qc_pheno_errors(# pylint: disable=[too-many-arguments] + rconn, fqjobid, dburi, speciesid, zfile, logger) -> bool: """Check for errors in `pheno` file(s).""" cdata = rqtl2.control_data(zfile) if "pheno" in cdata: logger.info("Checking for errors in the 'pheno' file…") - perrs = tuple() + perrs = tuple()# type: ignore[var-annotated] with database_connection(dburi) as dbconn: perrs = check_pheno_samples( dbconn, speciesid, zfile.filename, logger) + tuple( @@ -216,7 +217,8 @@ def 
qc_pheno_errors(rconn, fqjobid, dburi, speciesid, zfile, logger) -> bool: return False -def qc_phenose_errors(rconn, fqjobid, dburi, speciesid, zfile, logger) -> bool: +def qc_phenose_errors(# pylint: disable=[too-many-arguments] + rconn, fqjobid, _dburi, _speciesid, zfile, logger) -> bool: """Check for errors in `phenose` file(s).""" cdata = rqtl2.control_data(zfile) if "phenose" in cdata: @@ -258,7 +260,9 @@ def run_qc(rconn: Redis, if qc_missing_files(rconn, fqjobid, zfile, logger): return 1 - def with_zipfile(rconn, fqjobid, dbconn, speciesid, filename, logger, func): + def with_zipfile(# pylint: disable=[too-many-arguments] + rconn, fqjobid, dbconn, speciesid, filename, logger, func + ): with ZipFile(filename, "r") as zfile: return func(rconn, fqjobid, dbconn, speciesid, zfile, logger) diff --git a/scripts/qc_on_rqtl2_bundle2.py b/scripts/qc_on_rqtl2_bundle2.py new file mode 100644 index 0000000..7e5d253 --- /dev/null +++ b/scripts/qc_on_rqtl2_bundle2.py @@ -0,0 +1,346 @@ +"""Run Quality Control checks on R/qtl2 bundle.""" +import os +import sys +import json +from time import sleep +from pathlib import Path +from zipfile import ZipFile +from argparse import Namespace +from datetime import timedelta +import multiprocessing as mproc +from functools import reduce, partial +from logging import Logger, getLogger, StreamHandler +from typing import Union, Sequence, Callable, Iterator + +import MySQLdb as mdb +from redis import Redis + +from quality_control.errors import InvalidValue +from quality_control.checks import decimal_points_error + +from uploader import jobs +from uploader.db_utils import database_connection +from uploader.check_connections import check_db, check_redis + +from r_qtl import r_qtl2 as rqtl2 +from r_qtl import r_qtl2_qc as rqc +from r_qtl import exceptions as rqe +from r_qtl import fileerrors as rqfe + +from scripts.process_rqtl2_bundle import parse_job +from scripts.redis_logger import setup_redis_logger +from scripts.cli_parser import 
init_cli_parser, add_global_data_arguments +from scripts.rqtl2.bundleutils import build_line_joiner, build_line_splitter + + +def check_for_missing_files( + rconn: Redis, fqjobid: str, extractpath: Path, logger: Logger) -> bool: + """Check that all files listed in the control file do actually exist.""" + logger.info("Checking for missing files.") + missing = rqc.missing_files(extractpath) + # add_to_errors(rconn, fqjobid, "errors-generic", tuple( + # rqfe.MissingFile( + # mfile[0], mfile[1], ( + # f"File '{mfile[1]}' is listed in the control file under " + # f"the '{mfile[0]}' key, but it does not actually exist in " + # "the bundle.")) + # for mfile in missing)) + if len(missing) > 0: + logger.error("Missing files in the bundle!") + return True + return False + + +def open_file(file_: Path) -> Iterator: + """Open file and return one line at a time.""" + with open(file_, "r", encoding="utf8") as infile: + for line in infile: + yield line + + +def check_markers( + filename: str, + row: tuple[str, ...], + save_error: lambda val: val +) -> tuple[rqfe.InvalidValue]: + """Check that the markers are okay""" + errors = tuple() + counts = {} + for marker in row: + counts = {**counts, marker: counts.get(marker, 0) + 1} + if marker is None or marker == "": + errors = errors + (save_error(rqfe.InvalidValue( + filename, + "markers", + "-", + marker, + "A marker MUST be a valid value.")),) + + return errors + tuple( + save_error(rqfe.InvalidValue( + filename, + "markers", + key, + f"Marker '{key}' was repeated {value} times")) + for key,value in counts.items() if value > 1) + + +def check_geno_line( + filename: str, + headers: tuple[str, ...], + row: tuple[Union[str, None]], + cdata: dict, + save_error: lambda val: val +) -> tuple[rqfe.InvalidValue]: + """Check that the geno line is correct.""" + errors = tuple() + # Verify that line has same number of columns as headers + if len(headers) != len(row): + errors = errors + (save_error(rqfe.InvalidValue( + filename, + headers[0], 
+ row[0], + row[0], + "Every line MUST have the same number of columns.")),) + + # First column is the individuals/cases/samples + if not bool(row[0]): + errors = errors + (save_error(rqfe.InvalidValue( + filename, + headers[0], + row[0], + row[0], + "The sample/case MUST be a valid value.")),) + + def __process_value__(val): + if val in cdata["na.strings"]: + return None + if val in cdata["alleles"]: + return cdata["genotypes"][val] + + genocode = cdata.get("genotypes", {}) + for coltitle, cellvalue in zip(headers[1:],row[1:]): + if ( + bool(genocode) and + cellvalue is not None and + cellvalue not in genocode.keys() + ): + errors = errors + (save_error(rqfe.InvalidValue( + filename, row[0], coltitle, cellvalue, + f"Value '{cellvalue}' is invalid. Expected one of " + f"'{', '.join(genocode.keys())}'.")),) + + return errors + + +def push_file_error_to_redis(rconn: Redis, key: str, error: InvalidValue) -> InvalidValue: + """Push the file error to redis a json string + + Parameters + ---------- + rconn: Connection to redis + key: The name of the list where we push the errors + error: The file error to save + + Returns + ------- + Returns the file error it saved + """ + if bool(error): + rconn.rpush(key, json.dumps(error._asdict())) + return error + + +def file_errors_and_details( + redisargs: dict[str, str], + file_: Path, + filetype: str, + cdata: dict, + linesplitterfn: Callable, + linejoinerfn: Callable, + headercheckers: tuple[Callable, ...], + bodycheckers: tuple[Callable, ...] 
+) -> dict: + """Compute errors, and other file metadata.""" + errors = tuple() + if cdata[f"{filetype}_transposed"]: + rqtl2.transpose_csv_with_rename(file_, linesplitterfn, linejoinerfn) + + with Redis.from_url(redisargs["redisuri"], decode_responses=True) as rconn: + save_error_fn = partial(push_file_error_to_redis, + rconn, + error_list_name(filetype, file_.name)) + for lineno, line in enumerate(open_file(file_), start=1): + row = linesplitterfn(line) + if lineno == 1: + headers = tuple(row) + errors = errors + reduce( + lambda errs, fnct: errs + fnct( + file_.name, row[1:], save_error_fn), + headercheckers, + tuple()) + continue + + errors = errors + reduce( + lambda errs, fnct: errs + fnct( + file_.name, headers, row, cdata, save_error_fn), + bodycheckers, + tuple()) + + filedetails = { + "filename": file_.name, + "filesize": os.stat(file_).st_size, + "linecount": lineno + } + rconn.hset(redisargs["fqjobid"], + f"file-details:{filetype}:{file_.name}", + json.dumps(filedetails)) + return {**filedetails, "errors": errors} + + +def error_list_name(filetype: str, filename: str): + """Compute the name of the list where the errors will be pushed. + + Parameters + ---------- + filetype: The type of file. One of `r_qtl.r_qtl2.FILE_TYPES` + filename: The name of the file. 
+ """ + return f"errors:{filetype}:{filename}" + + +def check_for_geno_errors( + redisargs: dict[str, str], + extractdir: Path, + cdata: dict, + linesplitterfn: Callable[[str], tuple[Union[str, None], ...]], + linejoinerfn: Callable[[tuple[Union[str, None], ...]], str], + logger: Logger +) -> bool: + """Check for errors in genotype files.""" + if "geno" in cdata or "founder_geno" in cdata: + genofiles = tuple( + extractdir.joinpath(fname) for fname in cdata.get("geno", [])) + fgenofiles = tuple( + extractdir.joinpath(fname) for fname in cdata.get("founder_geno", [])) + allgenofiles = genofiles + fgenofiles + with Redis.from_url(redisargs["redisuri"], decode_responses=True) as rconn: + error_list_names = [ + error_list_name("geno", file_.name) for file_ in allgenofiles] + for list_name in error_list_names: + rconn.delete(list_name) + rconn.hset( + redisargs["fqjobid"], + "geno-errors-lists", + json.dumps(error_list_names)) + processes = [ + mproc.Process(target=file_errors_and_details, + args=( + redisargs, + file_, + ftype, + cdata, + linesplitterfn, + linejoinerfn, + (check_markers,), + (check_geno_line,)) + ) + for ftype, file_ in ( + tuple(("geno", file_) for file_ in genofiles) + + tuple(("founder_geno", file_) for file_ in fgenofiles)) + ] + for process in processes: + process.start() + # Set expiry for any created error lists + for key in error_list_names: + rconn.expire(name=key, + time=timedelta(seconds=redisargs["redisexpiry"])) + + # TODO: Add the errors to redis + if any(rconn.llen(errlst) > 0 for errlst in error_list_names): + logger.error("At least one of the 'geno' files has (an) error(s).") + return True + logger.info("No error(s) found in any of the 'geno' files.") + + else: + logger.info("No 'geno' files to check.") + + return False + + +# def check_for_pheno_errors(...): +# """Check for errors in phenotype files.""" +# pass + + +# def check_for_phenose_errors(...): +# """Check for errors in phenotype, standard-error files.""" +# pass + + +# def 
check_for_phenocovar_errors(...): +# """Check for errors in phenotype-covariates files.""" +# pass + + +def run_qc(rconn: Redis, args: Namespace, fqjobid: str, logger: Logger) -> int: + """Run quality control checks on R/qtl2 bundles.""" + thejob = parse_job(rconn, args.redisprefix, args.jobid) + print(f"THE JOB =================> {thejob}") + jobmeta = thejob["job-metadata"] + inpath = Path(jobmeta["rqtl2-bundle-file"]) + extractdir = inpath.parent.joinpath(f"{inpath.name}__extraction_dir") + with ZipFile(inpath, "r") as zfile: + rqtl2.extract(zfile, extractdir) + + ### BEGIN: The quality control checks ### + cdata = rqtl2.control_data(extractdir) + splitter = build_line_splitter(cdata) + joiner = build_line_joiner(cdata) + + redisargs = { + "fqjobid": fqjobid, + "redisuri": args.redisuri, + "redisexpiry": args.redisexpiry + } + check_for_missing_files(rconn, fqjobid, extractdir, logger) + # check_for_pheno_errors(...) + check_for_geno_errors(redisargs, extractdir, cdata, splitter, joiner, logger) + # check_for_phenose_errors(...) + # check_for_phenocovar_errors(...) 
+ ### END: The quality control checks ### + + def __fetch_errors__(rkey: str) -> tuple: + return tuple(json.loads(rconn.hget(fqjobid, rkey) or "[]")) + + return (1 if any(( + bool(__fetch_errors__(key)) + for key in + ("errors-geno", "errors-pheno", "errors-phenos", "errors-phenocovar"))) + else 0) + + +if __name__ == "__main__": + def main(): + """Enter R/qtl2 bundle QC runner.""" + args = add_global_data_arguments(init_cli_parser( + "qc-on-rqtl2-bundle", "Run QC on R/qtl2 bundle.")).parse_args() + check_redis(args.redisuri) + check_db(args.databaseuri) + + logger = getLogger("qc-on-rqtl2-bundle") + logger.addHandler(StreamHandler(stream=sys.stderr)) + logger.setLevel("DEBUG") + + fqjobid = jobs.job_key(args.redisprefix, args.jobid) + with Redis.from_url(args.redisuri, decode_responses=True) as rconn: + logger.addHandler(setup_redis_logger( + rconn, fqjobid, f"{fqjobid}:log-messages", + args.redisexpiry)) + + exitcode = run_qc(rconn, args, fqjobid, logger) + rconn.hset( + jobs.job_key(args.redisprefix, args.jobid), "exitcode", exitcode) + return exitcode + + sys.exit(main()) diff --git a/scripts/qcapp_wsgi.py b/scripts/qcapp_wsgi.py index 349c006..fe77031 100644 --- a/scripts/qcapp_wsgi.py +++ b/scripts/qcapp_wsgi.py @@ -5,8 +5,8 @@ from logging import getLogger, StreamHandler from flask import Flask -from qc_app import create_app -from qc_app.check_connections import check_db, check_redis +from uploader import create_app +from uploader.check_connections import check_db, check_redis def setup_logging(appl: Flask) -> Flask: """Setup appropriate logging paradigm depending on environment.""" diff --git a/scripts/redis_logger.py b/scripts/redis_logger.py index 2ae682b..d3fde5f 100644 --- a/scripts/redis_logger.py +++ b/scripts/redis_logger.py @@ -1,5 +1,6 @@ """Utilities to log to redis for our worker scripts.""" import logging +from typing import Optional from redis import Redis @@ -26,6 +27,26 @@ class RedisLogger(logging.Handler): 
self.redisconnection.rpush(self.messageslistname, self.format(record)) self.redisconnection.expire(self.messageslistname, self.expiry) +class RedisMessageListHandler(logging.Handler): + """Send messages to specified redis list.""" + def __init__(self, + rconn: Redis, + fullyqualifiedkey: str, + loglevel: int = logging.NOTSET, + expiry: Optional[int] = 86400): + super().__init__(loglevel) + self.redisconnection = rconn + self.fullyqualifiedkey = fullyqualifiedkey + self.expiry = expiry + + def emit(self, record): + """Log out to specified `fullyqualifiedkey`.""" + self.redisconnection.rpush(self.fullyqualifiedkey, self.format(record)) + if bool(self.expiry): + self.redisconnection.expire(self.fullyqualifiedkey, self.expiry) + else: + self.redisconnection.persist(self.fullyqualifiedkey) + def setup_redis_logger(rconn: Redis, fullyqualifiedjobid: str, job_messagelist: str, diff --git a/scripts/rqtl2/bundleutils.py b/scripts/rqtl2/bundleutils.py new file mode 100644 index 0000000..17faa7c --- /dev/null +++ b/scripts/rqtl2/bundleutils.py @@ -0,0 +1,44 @@ +"""Common utilities to operate on R/qtl2 bundles.""" +from typing import Union, Callable + +def build_line_splitter(cdata: dict) -> Callable[[str], tuple[Union[str, None], ...]]: + """Build and return a function to use to split data in the files. + + Parameters + ---------- + cdata: A dict holding the control information included with the R/qtl2 + bundle. + + Returns + ------- + A function that takes a string and returns a tuple of strings. + """ + separator = cdata["sep"] + na_strings = cdata["na.strings"] + def __splitter__(line: str) -> tuple[Union[str, None], ...]: + return tuple( + item if item not in na_strings else None + for item in + (field.strip() for field in line.strip().split(separator))) + return __splitter__ + + +def build_line_joiner(cdata: dict) -> Callable[[tuple[Union[str, None], ...]], str]: + """Build and return a function to use to join data in the files. 
+ + Parameters + ---------- + cdata: A dict holding the control information included with the R/qtl2 + bundle. + + Returns + ------- + A function that takes a tuple of strings and returns a string. + """ + separator = cdata["sep"] + na_strings = cdata["na.strings"] + def __joiner__(row: tuple[Union[str, None], ...]) -> str: + return separator.join( + (na_strings[0] if item is None else item) + for item in row) + return __joiner__ diff --git a/scripts/rqtl2/cli_parser.py b/scripts/rqtl2/cli_parser.py index bcc7a4f..9bb60a3 100644 --- a/scripts/rqtl2/cli_parser.py +++ b/scripts/rqtl2/cli_parser.py @@ -2,12 +2,22 @@ from pathlib import Path from argparse import ArgumentParser -def add_common_arguments(parser: ArgumentParser) -> ArgumentParser: - """Add common arguments to the CLI parser.""" - parser.add_argument("datasetid", - type=int, - help="The dataset to which the data belongs.") +def add_bundle_argument(parser: ArgumentParser) -> ArgumentParser: + """Add the `rqtl2bundle` argument.""" parser.add_argument("rqtl2bundle", type=Path, help="Path to R/qtl2 bundle zip file.") return parser + + +def add_datasetid_argument(parser: ArgumentParser) -> ArgumentParser: + """Add the `datasetid` argument.""" + parser.add_argument("datasetid", + type=int, + help="The dataset to which the data belongs.") + return parser + + +def add_common_arguments(parser: ArgumentParser) -> ArgumentParser: + """Add common arguments to the CLI parser.""" + return add_bundle_argument(add_datasetid_argument(parser)) diff --git a/scripts/rqtl2/entry.py b/scripts/rqtl2/entry.py index 93fc130..bc4cd9f 100644 --- a/scripts/rqtl2/entry.py +++ b/scripts/rqtl2/entry.py @@ -1,23 +1,30 @@ """Build common script-entry structure.""" -from logging import Logger +import logging from typing import Callable from argparse import Namespace from redis import Redis from MySQLdb import Connection -from qc_app import jobs -from qc_app.db_utils import database_connection -from qc_app.check_connections import check_db, 
check_redis +from uploader import jobs +from uploader.db_utils import database_connection +from uploader.check_connections import check_db, check_redis from scripts.redis_logger import setup_redis_logger -def build_main(args: Namespace, - run_fn: Callable[[Connection, Namespace], int], - logger: Logger, - loglevel: str = "INFO") -> Callable[[],int]: +def build_main( + args: Namespace, + run_fn: Callable[[Connection, Namespace, logging.Logger], int], + loggername: str +) -> Callable[[],int]: """Build a function to be used as an entry-point for scripts.""" def main(): + logging.basicConfig( + format=( + "%(asctime)s - %(levelname)s %(name)s: " + "(%(pathname)s: %(lineno)d) %(message)s"), + level=args.loglevel) + logger = logging.getLogger(loggername) check_db(args.databaseuri) check_redis(args.redisuri) if not args.rqtl2bundle.exists(): @@ -26,13 +33,12 @@ def build_main(args: Namespace, with (Redis.from_url(args.redisuri, decode_responses=True) as rconn, database_connection(args.databaseuri) as dbconn): - fqjobid = jobs.job_key(jobs.jobsnamespace(), args.jobid) + fqjobid = jobs.job_key(args.redisprefix, args.jobid) logger.addHandler(setup_redis_logger( rconn, fqjobid, f"{fqjobid}:log-messages", args.redisexpiry)) - logger.setLevel(loglevel) - return run_fn(dbconn, args) + return run_fn(dbconn, args, logger) return main diff --git a/scripts/rqtl2/install_genotypes.py b/scripts/rqtl2/install_genotypes.py index 68ae365..20a19da 100644 --- a/scripts/rqtl2/install_genotypes.py +++ b/scripts/rqtl2/install_genotypes.py @@ -1,11 +1,11 @@ """Load genotypes from R/qtl2 bundle into the database.""" import sys +import argparse import traceback -from pathlib import Path from zipfile import ZipFile from functools import reduce from typing import Iterator, Optional -from logging import Logger, getLogger, StreamHandler +from logging import Logger, getLogger import MySQLdb as mdb from MySQLdb.cursors import DictCursor @@ -19,10 +19,15 @@ from scripts.rqtl2.entry import build_main 
from scripts.rqtl2.cli_parser import add_common_arguments from scripts.cli_parser import init_cli_parser, add_global_data_arguments -def insert_markers(dbconn: mdb.Connection, - speciesid: int, - markers: tuple[str, ...], - pmapdata: Optional[Iterator[dict]]) -> int: +__MODULE__ = "scripts.rqtl2.install_genotypes" + +def insert_markers( + dbconn: mdb.Connection, + speciesid: int, + markers: tuple[str, ...], + pmapdata: Optional[Iterator[dict]], + _logger: Logger +) -> int: """Insert genotype and genotype values into the database.""" mdata = reduce(#type: ignore[var-annotated] lambda acc, row: ({#type: ignore[arg-type, return-value] @@ -45,12 +50,15 @@ def insert_markers(dbconn: mdb.Connection, "marker": marker, "chr": mdata.get(marker, {}).get("chr"), "pos": mdata.get(marker, {}).get("pos") - } for marker in markers}.items())) + } for marker in markers}.values())) return cursor.rowcount -def insert_individuals(dbconn: mdb.Connection, - speciesid: int, - individuals: tuple[str, ...]) -> int: +def insert_individuals( + dbconn: mdb.Connection, + speciesid: int, + individuals: tuple[str, ...], + _logger: Logger +) -> int: """Insert individuals/samples into the database.""" with dbconn.cursor() as cursor: cursor.executemany( @@ -61,10 +69,13 @@ def insert_individuals(dbconn: mdb.Connection, for individual in individuals)) return cursor.rowcount -def cross_reference_individuals(dbconn: mdb.Connection, - speciesid: int, - populationid: int, - individuals: tuple[str, ...]) -> int: +def cross_reference_individuals( + dbconn: mdb.Connection, + speciesid: int, + populationid: int, + individuals: tuple[str, ...], + _logger: Logger +) -> int: """Cross reference any inserted individuals.""" with dbconn.cursor(cursorclass=DictCursor) as cursor: paramstr = ", ".join(["%s"] * len(individuals)) @@ -80,11 +91,13 @@ def cross_reference_individuals(dbconn: mdb.Connection, tuple(ids)) return cursor.rowcount -def insert_genotype_data(dbconn: mdb.Connection, - speciesid: int, - genotypes: 
tuple[dict, ...], - individuals: tuple[str, ...]) -> tuple[ - int, tuple[dict, ...]]: +def insert_genotype_data( + dbconn: mdb.Connection, + speciesid: int, + genotypes: tuple[dict, ...], + individuals: tuple[str, ...], + _logger: Logger +) -> tuple[int, tuple[dict, ...]]: """Insert the genotype data values into the database.""" with dbconn.cursor(cursorclass=DictCursor) as cursor: paramstr = ", ".join(["%s"] * len(individuals)) @@ -120,11 +133,14 @@ def insert_genotype_data(dbconn: mdb.Connection, "markerid": row["markerid"] } for row in data) -def cross_reference_genotypes(dbconn: mdb.Connection, - speciesid: int, - datasetid: int, - dataids: tuple[dict, ...], - gmapdata: Optional[Iterator[dict]]) -> int: +def cross_reference_genotypes( + dbconn: mdb.Connection, + speciesid: int, + datasetid: int, + dataids: tuple[dict, ...], + gmapdata: Optional[Iterator[dict]], + _logger: Logger +) -> int: """Cross-reference the data to the relevant dataset.""" _rows, markers, mdata = reduce(#type: ignore[var-annotated] lambda acc, row: (#type: ignore[return-value,arg-type] @@ -140,31 +156,43 @@ def cross_reference_genotypes(dbconn: mdb.Connection, (tuple(), tuple(), {})) with dbconn.cursor(cursorclass=DictCursor) as cursor: - paramstr = ", ".join(["%s"] * len(markers)) - cursor.execute("SELECT Id, Name FROM Geno " - f"WHERE SpeciesId=%s AND Name IN ({paramstr})", - (speciesid,) + markers) - markersdict = {row["Id"]: row["Name"] for row in cursor.fetchall()} - cursor.executemany( + markersdict = {} + if len(markers) > 0: + paramstr = ", ".join(["%s"] * len(markers)) + insertparams = (speciesid,) + markers + selectquery = ("SELECT Id, Name FROM Geno " + f"WHERE SpeciesId=%s AND Name IN ({paramstr})") + _logger.debug( + "The select query was\n\t%s\n\nwith the parameters\n\t%s", + selectquery, + (speciesid,) + markers) + cursor.execute(selectquery, insertparams) + markersdict = {row["Id"]: row["Name"] for row in cursor.fetchall()} + + insertquery = ( "INSERT INTO 
GenoXRef(GenoFreezeId, GenoId, DataId, cM) " "VALUES(%(datasetid)s, %(markerid)s, %(dataid)s, %(pos)s) " - "ON DUPLICATE KEY UPDATE GenoFreezeId=GenoFreezeId", - tuple({ - **row, - "datasetid": datasetid, - "pos": mdata.get(markersdict.get( - row.get("markerid"), {}), {}).get("pos") - } for row in dataids)) + "ON DUPLICATE KEY UPDATE GenoFreezeId=GenoFreezeId") + insertparams = tuple({ + **row, + "datasetid": datasetid, + "pos": mdata.get(markersdict.get( + row.get("markerid"), "nosuchkey"), {}).get("pos") + } for row in dataids) + _logger.debug( + "The insert query was\n\t%s\n\nwith the parameters\n\t%s", + insertquery, insertparams) + cursor.executemany(insertquery, insertparams) return cursor.rowcount def install_genotypes(#pylint: disable=[too-many-arguments, too-many-locals] dbconn: mdb.Connection, - speciesid: int, - populationid: int, - datasetid: int, - rqtl2bundle: Path, - logger: Logger = getLogger()) -> int: + args: argparse.Namespace, + logger: Logger = getLogger(__name__) +) -> int: """Load any existing genotypes into the database.""" + (speciesid, populationid, datasetid, rqtl2bundle) = ( + args.speciesid, args.populationid, args.datasetid, args.rqtl2bundle) count = 0 with ZipFile(str(rqtl2bundle.absolute()), "r") as zfile: try: @@ -189,20 +217,22 @@ def install_genotypes(#pylint: disable=[too-many-arguments, too-many-locals] speciesid, tuple(key for key in batch[0].keys() if key != "id"), (rqtl2.file_data(zfile, "pmap", cdata) if "pmap" in cdata - else None)) + else None), + logger) individuals = tuple(row["id"] for row in batch) - insert_individuals(dbconn, speciesid, individuals) + insert_individuals(dbconn, speciesid, individuals, logger) cross_reference_individuals( - dbconn, speciesid, populationid, individuals) + dbconn, speciesid, populationid, individuals, logger) _num_rows, dataids = insert_genotype_data( - dbconn, speciesid, batch, individuals) + dbconn, speciesid, batch, individuals, logger) cross_reference_genotypes( dbconn, speciesid, 
datasetid, dataids, (rqtl2.file_data(zfile, "gmap", cdata) - if "gmap" in cdata else None)) + if "gmap" in cdata else None), + logger) count = count + len(batch) except rqtl2.InvalidFormat as exc: logger.error(str(exc)) @@ -224,15 +254,5 @@ if __name__ == "__main__": return parser.parse_args() - thelogger = getLogger("install_genotypes") - thelogger.addHandler(StreamHandler(stream=sys.stderr)) - main = build_main( - cli_args(), - lambda dbconn, args: install_genotypes(dbconn, - args.speciesid, - args.populationid, - args.datasetid, - args.rqtl2bundle), - thelogger, - "INFO") + main = build_main(cli_args(), install_genotypes, __MODULE__) sys.exit(main()) diff --git a/scripts/rqtl2/install_phenos.py b/scripts/rqtl2/install_phenos.py index b5cab8e..a6e9fb2 100644 --- a/scripts/rqtl2/install_phenos.py +++ b/scripts/rqtl2/install_phenos.py @@ -1,10 +1,10 @@ """Load pheno from R/qtl2 bundle into the database.""" import sys +import argparse import traceback -from pathlib import Path from zipfile import ZipFile from functools import reduce -from logging import Logger, getLogger, StreamHandler +from logging import Logger, getLogger import MySQLdb as mdb from MySQLdb.cursors import DictCursor @@ -18,6 +18,8 @@ from r_qtl import r_qtl2_qc as rqc from functional_tools import take +__MODULE__ = "scripts.rqtl2.install_phenos" + def insert_probesets(dbconn: mdb.Connection, platformid: int, phenos: tuple[str, ...]) -> int: @@ -95,12 +97,11 @@ def cross_reference_probeset_data(dbconn: mdb.Connection, def install_pheno_files(#pylint: disable=[too-many-arguments, too-many-locals] dbconn: mdb.Connection, - speciesid: int, - platformid: int, - datasetid: int, - rqtl2bundle: Path, + args: argparse.Namespace, logger: Logger = getLogger()) -> int: """Load data in `pheno` files and other related files into the database.""" + (speciesid, platformid, datasetid, rqtl2bundle) = ( + args.speciesid, args.platformid, args.datasetid, args.rqtl2bundle) with ZipFile(str(rqtl2bundle), "r") as zfile: 
try: rqc.validate_bundle(zfile) @@ -155,16 +156,5 @@ if __name__ == "__main__": return parser.parse_args() - thelogger = getLogger("install_phenos") - thelogger.addHandler(StreamHandler(stream=sys.stderr)) - main = build_main( - cli_args(), - lambda dbconn, args: install_pheno_files(dbconn, - args.speciesid, - args.platformid, - args.datasetid, - args.rqtl2bundle, - thelogger), - thelogger, - "DEBUG") + main = build_main(cli_args(), install_pheno_files, __MODULE__) sys.exit(main()) diff --git a/scripts/rqtl2/phenotypes_qc.py b/scripts/rqtl2/phenotypes_qc.py new file mode 100644 index 0000000..83828e4 --- /dev/null +++ b/scripts/rqtl2/phenotypes_qc.py @@ -0,0 +1,468 @@ +"""Run quality control on phenotypes-specific files in the bundle.""" +import sys +import uuid +import shutil +import logging +import tempfile +import contextlib +from pathlib import Path +from logging import Logger +from zipfile import ZipFile +from argparse import Namespace +import multiprocessing as mproc +from functools import reduce, partial +from typing import Union, Iterator, Callable, Optional, Sequence + +import MySQLdb as mdb +from redis import Redis + +from r_qtl import r_qtl2 as rqtl2 +from r_qtl import r_qtl2_qc as rqc +from r_qtl import exceptions as rqe +from r_qtl.fileerrors import InvalidValue + +from functional_tools import chain + +from quality_control.checks import decimal_places_pattern + +from uploader.files import sha256_digest_over_file +from uploader.samples.models import samples_by_species_and_population + +from scripts.rqtl2.entry import build_main +from scripts.redis_logger import RedisMessageListHandler +from scripts.rqtl2.cli_parser import add_bundle_argument +from scripts.cli_parser import init_cli_parser, add_global_data_arguments +from scripts.rqtl2.bundleutils import build_line_joiner, build_line_splitter + +__MODULE__ = "scripts.rqtl2.phenotypes_qc" + +def validate(phenobundle: Path, logger: Logger) -> dict: + """Check that the bundle is generally valid""" + try: + 
rqc.validate_bundle(phenobundle) + except rqe.RQTLError as rqtlerr: + # logger.error("Bundle file validation failed!", exc_info=True) + return { + "skip": True, + "logger": logger, + "phenobundle": phenobundle, + "errors": (" ".join(rqtlerr.args),) + } + return { + "errors": tuple(), + "skip": False, + "phenobundle": phenobundle, + "logger": logger + } + + +def check_for_mandatory_pheno_keys( + phenobundle: Path, + logger: Logger, + **kwargs +) -> dict: + """Check that the mandatory keys exist for phenotypes.""" + if kwargs.get("skip", False): + return { + **kwargs, + "logger": logger, + "phenobundle": phenobundle + } + + _mandatory_keys = ("pheno", "phenocovar") + _cdata = rqtl2.read_control_file(phenobundle) + _errors = kwargs.get("errors", tuple()) + tuple( + f"Expected '{key}' file(s) are not declared in the bundle." + for key in _mandatory_keys if key not in _cdata.keys()) + return { + **kwargs, + "logger": logger, + "phenobundle": phenobundle, + "errors": _errors, + "skip": len(_errors) > 0 + } + + +def check_for_averages_files( + phenobundle: Path, + logger: Logger, + **kwargs +) -> dict: + """Check that averages files appear together""" + if kwargs.get("skip", False): + return { + **kwargs, + "logger": logger, + "phenobundle": phenobundle + } + + _together = (("phenose", "phenonum"), ("phenonum", "phenose")) + _cdata = rqtl2.read_control_file(phenobundle) + _errors = kwargs.get("errors", tuple()) + tuple( + f"'{first}' is defined in the control file but there is no " + f"corresponding '{second}'" + for first, second in _together + if ((first in _cdata.keys()) and (second not in _cdata.keys()))) + return { + **kwargs, + "logger": logger, + "phenobundle": phenobundle, + "errors": _errors, + "skip": len(_errors) > 0 + } + + +def extract_bundle( + bundle: Path, workdir: Path, jobid: uuid.UUID +) -> tuple[Path, tuple[Path, ...]]: + """Extract the bundle.""" + with ZipFile(bundle) as zfile: + extractiondir = workdir.joinpath( + 
f"{str(jobid)}-{sha256_digest_over_file(bundle)}-{bundle.name}") + return extractiondir, rqtl2.extract(zfile, extractiondir) + + +def undo_transpose(filetype: str, cdata: dict, extractiondir): + """Undo transposition of all files of type `filetype` in the bundle.""" + if len(cdata.get(filetype, [])) > 0 and cdata.get(f"{filetype}_transposed", False): + files = (extractiondir.joinpath(_file) for _file in cdata[filetype]) + for _file in files: + rqtl2.transpose_csv_with_rename( + _file, + build_line_splitter(cdata), + build_line_joiner(cdata)) + + +@contextlib.contextmanager +def redis_logger( + redisuri: str, loggername: str, filename: str, fqkey: str +) -> Iterator[logging.Logger]: + """Build a Redis message-list logger.""" + rconn = Redis.from_url(redisuri, decode_responses=True) + logger = logging.getLogger(loggername) + logger.propagate = False + handler = RedisMessageListHandler( + rconn, + fullyqualifiedkey(fqkey, filename))#type: ignore[arg-type] + handler.setFormatter(logging.getLogger().handlers[0].formatter) + logger.addHandler(handler) + try: + yield logger + finally: + rconn.close() + + +def qc_phenocovar_file( + filepath: Path, + redisuri, + fqkey: str, + separator: str, + comment_char: str): + """Check that `phenocovar` files are structured correctly.""" + with redis_logger( + redisuri, + f"{__MODULE__}.qc_phenocovar_file", + filepath.name, + fqkey) as logger: + logger.info("Running QC on file: %s", filepath.name) + _csvfile = rqtl2.read_csv_file(filepath, separator, comment_char) + _headings = tuple(heading.lower() for heading in next(_csvfile)) + _errors: tuple[InvalidValue, ...] 
= tuple() + for heading in ("description", "units"): + if heading not in _headings: + _errors = _errors + (InvalidValue( + filepath.name, + "header row", + "-", + "-", + (f"File {filepath.name} is missing the {heading} heading " + "in the header line.")),) + + def collect_errors(errors_and_linecount, line): + _errs, _lc = errors_and_linecount + logger.info("Testing record '%s'", line[0]) + if len(line) != len(_headings): + _errs = _errs + (InvalidValue( + filepath.name, + line[0], + "-", + "-", + (f"Record {_lc} in file {filepath.name} has a different " + "number of columns than the number of headings")),) + _line = dict(zip(_headings, line)) + if not bool(_line["description"]): + _errs = _errs + ( + InvalidValue(filepath.name, + _line[_headings[0]], + "description", + _line["description"], + "The description is not provided!"),) + + return _errs, _lc+1 + + return { + filepath.name: dict(zip( + ("errors", "linecount"), + reduce(collect_errors, _csvfile, (_errors, 1)))) + } + + +def merge_dicts(*dicts): + """Merge multiple dicts into a single one.""" + return reduce(lambda merged, dct: {**merged, **dct}, dicts, {}) + + +def decimal_points_error(# pylint: disable=[too-many-arguments] + filename: str, + rowtitle: str, + coltitle: str, + cellvalue: str, + message: str, + decimal_places: int = 1 +) -> Optional[InvalidValue]: + """Returns an error if the value does not meet the checks.""" + if not bool(decimal_places_pattern(decimal_places).match(cellvalue)): + return InvalidValue(filename, rowtitle, coltitle, cellvalue, message) + return None + + +def integer_error( + filename: str, + rowtitle: str, + coltitle: str, + cellvalue: str, + message: str +) -> Optional[InvalidValue]: + """Returns an error if the value does not meet the checks.""" + try: + value = int(cellvalue) + if value <= 0: + raise ValueError("Must be a non-zero, positive number.") + return None + except ValueError as _verr: + return InvalidValue(filename, rowtitle, coltitle, cellvalue, message) + + +def 
qc_pheno_file(# pylint: disable=[too-many-arguments] + filepath: Path, + redisuri: str, + fqkey: str, + samples: tuple[str, ...], + phenonames: tuple[str, ...], + separator: str, + comment_char: str, + na_strings: Sequence[str], + error_fn: Callable = decimal_points_error +): + """Run QC/QA on a `pheno` file.""" + with redis_logger( + redisuri, + f"{__MODULE__}.qc_pheno_file", + filepath.name, + fqkey) as logger: + logger.info("Running QC on file: %s", filepath.name) + _csvfile = rqtl2.read_csv_file(filepath, separator, comment_char) + _headings: tuple[str, ...] = tuple( + heading.lower() for heading in next(_csvfile)) + _errors: tuple[InvalidValue, ...] = tuple() + + _absent = tuple(pheno for pheno in _headings[1:] if pheno not in phenonames) + if len(_absent) > 0: + _errors = _errors + (InvalidValue( + filepath.name, + "header row", + "-", + ", ".join(_absent), + (f"The phenotype names ({', '.join(_absent)}) do not exist in any " + "of the provided phenocovar files.")),) + + def collect_errors(errors_and_linecount, line): + _errs, _lc = errors_and_linecount + if line[0] not in samples: + _errs = _errs + (InvalidValue( + filepath.name, + line[0], + _headings[0], + line[0], + (f"The sample named '{line[0]}' does not exist in the database. 
" + "You will need to upload that first.")),) + + for field, value in zip(_headings[1:], line[1:]): + if value in na_strings: + continue + _err = error_fn( + filepath.name, + line[0], + field, + value) + _errs = _errs + ((_err,) if bool(_err) else tuple()) + + return _errs, _lc+1 + + return { + filepath.name: dict(zip( + ("errors", "linecount"), + reduce(collect_errors, _csvfile, (_errors, 1)))) + } + + +def phenotype_names(filepath: Path, + separator: str, + comment_char: str) -> tuple[str, ...]: + """Read phenotype names from `phenocovar` file.""" + return reduce(lambda tpl, line: tpl + (line[0],),#type: ignore[arg-type, return-value] + rqtl2.read_csv_file(filepath, separator, comment_char), + tuple())[1:] + +def fullyqualifiedkey( + prefix: str, + rest: Optional[str] = None +) -> Union[Callable[[str], str], str]: + """Compute fully qualified Redis key.""" + if not bool(rest): + return lambda _rest: f"{prefix}:{_rest}" + return f"{prefix}:{rest}" + +def run_qc(# pylint: disable=[too-many-locals] + dbconn: mdb.Connection, + args: Namespace, + logger: Logger +) -> int: + """Run quality control checks on the bundle.""" + logger.debug("Beginning the quality assuarance checks.") + results = check_for_averages_files( + **check_for_mandatory_pheno_keys( + **validate(args.rqtl2bundle, logger))) + errors = results.get("errors", tuple()) + if len(errors) > 0: + logger.error("We found the following errors:\n%s", + "\n".join(f" - {error}" for error in errors)) + return 1 + # Run QC on actual values + # Steps: + # - Extract file to specific directory + extractiondir, *_bundlefiles = extract_bundle( + args.rqtl2bundle, args.workingdir, args.jobid) + + # - For every pheno, phenocovar, phenose, phenonum file, undo + # transposition where relevant + cdata = rqtl2.control_data(extractiondir) + with mproc.Pool(mproc.cpu_count() - 1) as pool: + pool.starmap( + undo_transpose, + ((ftype, cdata, extractiondir) + for ftype in ("pheno", "phenocovar", "phenose", "phenonum"))) + + # - 
Fetch samples/individuals from database. + logger.debug("Fetching samples/individuals from the database.") + samples = tuple(#type: ignore[var-annotated] + item for item in set(reduce( + lambda acc, item: acc + ( + item["Name"], item["Name2"], item["Symbol"], item["Alias"]), + samples_by_species_and_population( + dbconn, args.speciesid, args.populationid), + tuple())) + if bool(item)) + + # - Check that `description` and `units` are present in phenocovar for + # all phenotypes + with mproc.Pool(mproc.cpu_count() - 1) as pool: + logger.debug("Check for errors in 'phenocovar' file(s).") + _phenocovar_qc_res = merge_dicts(*pool.starmap(qc_phenocovar_file, tuple( + (extractiondir.joinpath(_file), + args.redisuri, + chain( + "phenocovar", + fullyqualifiedkey(args.jobid), + fullyqualifiedkey(args.redisprefix)), + cdata["sep"], + cdata["comment.char"]) + for _file in cdata.get("phenocovar", [])))) + + # - Check all samples in pheno files exist in database + # - Check all phenotypes in pheno files exist in phenocovar files + # - Check all numeric values in pheno files + phenonames = tuple(set( + name for names in pool.starmap(phenotype_names, tuple( + (extractiondir.joinpath(_file), cdata["sep"], cdata["comment.char"]) + for _file in cdata.get("phenocovar", []))) + for name in names)) + + dec_err_fn = partial(decimal_points_error, message=( + "Expected a non-negative number with at least one decimal " + "place.")) + + logger.debug("Check for errors in 'pheno' file(s).") + _pheno_qc_res = merge_dicts(*pool.starmap(qc_pheno_file, tuple(( + extractiondir.joinpath(_file), + args.redisuri, + chain( + "pheno", + fullyqualifiedkey(args.jobid), + fullyqualifiedkey(args.redisprefix)), + samples, + phenonames, + cdata["sep"], + cdata["comment.char"], + cdata["na.strings"], + dec_err_fn + ) for _file in cdata.get("pheno", [])))) + + # - Run the 3 checks above on phenose and phenonum values too + # qc_phenose_files(…) + # qc_phenonum_files(…) + logger.debug("Check for errors in 
'phenose' file(s).") + _phenose_qc_res = merge_dicts(*pool.starmap(qc_pheno_file, tuple(( + extractiondir.joinpath(_file), + args.redisuri, + chain( + "phenose", + fullyqualifiedkey(args.jobid), + fullyqualifiedkey(args.redisprefix)), + samples, + phenonames, + cdata["sep"], + cdata["comment.char"], + cdata["na.strings"], + dec_err_fn + ) for _file in cdata.get("phenose", [])))) + + logger.debug("Check for errors in 'phenonum' file(s).") + _phenonum_qc_res = merge_dicts(*pool.starmap(qc_pheno_file, tuple(( + extractiondir.joinpath(_file), + args.redisuri, + chain( + "phenonum", + fullyqualifiedkey(args.jobid), + fullyqualifiedkey(args.redisprefix)), + samples, + phenonames, + cdata["sep"], + cdata["comment.char"], + cdata["na.strings"], + partial(integer_error, message=( + "Expected a non-negative, non-zero integer value.")) + ) for _file in cdata.get("phenonum", [])))) + + # - Delete all extracted files + shutil.rmtree(extractiondir) + raise NotImplementedError("WIP!") + + +if __name__ == "__main__": + def cli_args(): + """Process command-line arguments for `phenotypes_qc`""" + parser = add_bundle_argument(add_global_data_arguments(init_cli_parser( + program="PhenotypesQC", + description=( + "Perform Quality Control checks on a phenotypes bundle file")))) + parser.add_argument( + "--workingdir", + default=f"{tempfile.gettempdir()}/phenotypes_qc", + help=("The directory where this script will put its intermediate " + "files."), + type=Path) + return parser.parse_args() + + main = build_main(cli_args(), run_qc, __MODULE__) + sys.exit(main()) diff --git a/scripts/validate_file.py b/scripts/validate_file.py index 0028795..a40d7e7 100644 --- a/scripts/validate_file.py +++ b/scripts/validate_file.py @@ -12,8 +12,8 @@ from redis.exceptions import ConnectionError # pylint: disable=[redefined-builti from quality_control.utils import make_progress_calculator from quality_control.parsing import FileType, strain_names, collect_errors -from qc_app import jobs -from 
qc_app.db_utils import database_connection +from uploader import jobs +from uploader.db_utils import database_connection from .cli_parser import init_cli_parser from .qc import add_file_validation_arguments diff --git a/scripts/worker.py b/scripts/worker.py index 0eb9ea5..91b0332 100644 --- a/scripts/worker.py +++ b/scripts/worker.py @@ -11,8 +11,8 @@ from tempfile import TemporaryDirectory from redis import Redis -from qc_app import jobs -from qc_app.check_connections import check_redis +from uploader import jobs +from uploader.check_connections import check_redis def parse_args(): "Parse the command-line arguments" diff --git a/tests/conftest.py b/tests/conftest.py index a39acf0..9012221 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -11,8 +11,8 @@ from redis import Redis from functional_tools import take -from qc_app import jobs, create_app -from qc_app.jobs import JOBS_PREFIX +from uploader import jobs, create_app +from uploader.jobs import JOBS_PREFIX from quality_control.errors import InvalidValue, DuplicateHeading diff --git a/tests/qc_app/__init__.py b/tests/uploader/__init__.py index e69de29..e69de29 100644 --- a/tests/qc_app/__init__.py +++ b/tests/uploader/__init__.py diff --git a/tests/qc_app/test_entry.py b/tests/uploader/test_entry.py index 0c614a5..0c614a5 100644 --- a/tests/qc_app/test_entry.py +++ b/tests/uploader/test_entry.py diff --git a/tests/qc_app/test_expression_data_pages.py b/tests/uploader/test_expression_data_pages.py index c2f7de1..c2f7de1 100644 --- a/tests/qc_app/test_expression_data_pages.py +++ b/tests/uploader/test_expression_data_pages.py diff --git a/tests/uploader/test_files.py b/tests/uploader/test_files.py new file mode 100644 index 0000000..cb22fff --- /dev/null +++ b/tests/uploader/test_files.py @@ -0,0 +1,17 @@ +"""Tests functions in the `uploader.files` module.""" +from pathlib import Path + +import pytest + +from uploader.files import sha256_digest_over_file + +@pytest.mark.unit_test +@pytest.mark.parametrize( 
+ "filepath,expectedhash", + ((Path("tests/test_data/average.tsv.zip"), + "a371c654c095c030edad468e1c3d6b176ea8adfbcd91a322afd37779044478d9"), + (Path("tests/test_data/standarderror.tsv"), + "a08332e0b06391d50eecb722f69d85fbdf374a2d77713ee879d3fd6c60419d55"))) +def test_sha256_digest_over_file(filepath: Path, expectedhash: str): + """Test the `sha256_digest_over_file` function.""" + assert sha256_digest_over_file(filepath) == expectedhash diff --git a/tests/qc_app/test_parse.py b/tests/uploader/test_parse.py index 3915a4d..076c47c 100644 --- a/tests/qc_app/test_parse.py +++ b/tests/uploader/test_parse.py @@ -4,7 +4,7 @@ import sys import redis import pytest -from qc_app.jobs import job, jobsnamespace +from uploader.jobs import job, jobsnamespace from tests.conftest import uploadable_file_object @@ -24,7 +24,7 @@ def test_parse_with_existing_uploaded_file(#pylint: disable=[too-many-arguments] 1. the system redirects to the job/parse status page 2. the job is placed on redis for processing """ - monkeypatch.setattr("qc_app.jobs.uuid4", lambda : job_id) + monkeypatch.setattr("uploader.jobs.uuid4", lambda : job_id) # Upload a file speciesid = 1 filename = "no_data_errors.tsv" diff --git a/tests/qc_app/test_progress_indication.py b/tests/uploader/test_progress_indication.py index 14a1050..14a1050 100644 --- a/tests/qc_app/test_progress_indication.py +++ b/tests/uploader/test_progress_indication.py diff --git a/tests/qc_app/test_results_page.py b/tests/uploader/test_results_page.py index 8c8379f..8c8379f 100644 --- a/tests/qc_app/test_results_page.py +++ b/tests/uploader/test_results_page.py diff --git a/tests/qc_app/test_uploads_with_zip_files.py b/tests/uploader/test_uploads_with_zip_files.py index 1506cfa..1506cfa 100644 --- a/tests/qc_app/test_uploads_with_zip_files.py +++ b/tests/uploader/test_uploads_with_zip_files.py diff --git a/uploader/__init__.py b/uploader/__init__.py new file mode 100644 index 0000000..9fdb383 --- /dev/null +++ b/uploader/__init__.py @@ -0,0 
+1,89 @@ +"""The Quality-Control Web Application entry point""" +import os +import sys +import logging +from pathlib import Path + +from flask import Flask, request +from flask_session import Session + +from uploader.oauth2.client import user_logged_in, authserver_authorise_uri + +from . import session +from .base_routes import base +from .species import speciesbp +from .oauth2.views import oauth2 +from .expression_data import exprdatabp +from .errors import register_error_handlers + +def override_settings_with_envvars( + app: Flask, ignore: tuple[str, ...]=tuple()) -> None: + """Override settings in `app` with those in ENVVARS""" + for setting in (key for key in app.config if key not in ignore): + app.config[setting] = os.environ.get(setting) or app.config[setting] + + +def __log_gunicorn__(app: Flask) -> Flask: + """Set up logging for the WSGI environment with GUnicorn""" + logger = logging.getLogger("gunicorn.error") + app.logger.handlers = logger.handlers + app.logger.setLevel(logger.level) + return app + + +def __log_dev__(app: Flask) -> Flask: + """Set up logging for the development environment.""" + stderr_handler = logging.StreamHandler(stream=sys.stderr) + app.logger.addHandler(stderr_handler) + + root_logger = logging.getLogger() + root_logger.addHandler(stderr_handler) + root_logger.setLevel(app.config["LOG_LEVEL"]) + + return app + + +def setup_logging(app: Flask) -> Flask: + """Set up logging for the application.""" + software, *_version_and_comments = os.environ.get( + "SERVER_SOFTWARE", "").split('/') + return __log_gunicorn__(app) if bool(software) else __log_dev__(app) + + +def create_app(): + """The application factory""" + app = Flask(__name__) + app.config.from_pyfile( + Path(__file__).parent.joinpath("default_settings.py")) + if "UPLOADER_CONF" in os.environ: + app.config.from_envvar("UPLOADER_CONF") # Override defaults with instance path + + override_settings_with_envvars(app, ignore=tuple()) + + secretsfile = 
app.config.get("UPLOADER_SECRETS", "").strip() + if bool(secretsfile): + secretsfile = Path(secretsfile).absolute() + app.config["UPLOADER_SECRETS"] = secretsfile + if secretsfile.exists(): + # Silently ignore secrets if the file does not exist. + app.config.from_pyfile(secretsfile) + + setup_logging(app) + + # setup jinja2 symbols + app.add_template_global(lambda : request.url, name="request_url") + app.add_template_global(authserver_authorise_uri) + app.add_template_global(lambda: app.config["GN2_SERVER_URL"], + name="gn2server_uri") + app.add_template_global(user_logged_in) + app.add_template_global(lambda : session.user_details()["email"], name="user_email") + + Session(app) + + # setup blueprints + app.register_blueprint(base, url_prefix="/") + app.register_blueprint(oauth2, url_prefix="/oauth2") + app.register_blueprint(speciesbp, url_prefix="/species") + + register_error_handlers(app) + return app diff --git a/uploader/authorisation.py b/uploader/authorisation.py new file mode 100644 index 0000000..ee8fe97 --- /dev/null +++ b/uploader/authorisation.py @@ -0,0 +1,67 @@ +"""Authorisation utilities.""" +import logging +from functools import wraps + +from typing import Callable +from flask import flash, redirect +from pymonad.either import Left, Right, Either +from authlib.jose import KeySet, JsonWebToken +from authlib.jose.errors import BadSignatureError + +from uploader import session +from uploader.oauth2.client import auth_server_jwks + +def require_login(function): + """Check that the user is logged in before executing `func`.""" + @wraps(function) + def __is_session_valid__(*args, **kwargs): + """Check that the user is logged in and their token is valid.""" + def __clear_session__(_no_token): + session.clear_session_info() + flash("You need to be logged in.", "alert-danger") + return redirect("/") + + return session.user_token().either( + __clear_session__, + lambda token: function(*args, **kwargs)) + return __is_session_valid__ + + +def 
__validate_token__(jwks: KeySet, token: dict) -> Either: + """Check that a token is signed by a key from the authorisation server.""" + for key in jwks.keys: + try: + # Fixes CVE-2016-10555. See + # https://docs.authlib.org/en/latest/jose/jwt.html + jwt = JsonWebToken(["RS256"]) + jwt.decode(token["access_token"], key) + return Right(token) + except BadSignatureError: + pass + + return Left({"token": token}) + + +def require_token(func: Callable) -> Callable: + """ + Wrap functions that require the user be authorised to perform the operations + that the functions in question provide. + """ + def __invalid_token__(_whatever): + logging.debug("==========> Failure log: %s", _whatever) + raise Exception( + "You attempted to access a feature of the system that requires " + "authorisation. Unfortunately, we could not verify you have the " + "appropriate authorisation to perform the action you requested. " + "You might need to log in, or if you already are logged in, you " + "need to log out, then log back in to get a newer token/session.") + @wraps(func) + def __wrapper__(*args, **kwargs): + return session.user_token().then(lambda tok: { + "jwks": auth_server_jwks(), + "token": tok + }).then(lambda vals: __validate_token__(**vals)).either( + __invalid_token__, + lambda tok: func(*args, **{**kwargs, "token": tok})) + + return __wrapper__ diff --git a/uploader/base_routes.py b/uploader/base_routes.py new file mode 100644 index 0000000..742a254 --- /dev/null +++ b/uploader/base_routes.py @@ -0,0 +1,53 @@ +"""Basic routes required for all pages""" +import os +from urllib.parse import urljoin + +from flask import (Blueprint, + current_app as app, + send_from_directory) + +from uploader.ui import make_template_renderer +from uploader.oauth2.client import user_logged_in + +base = Blueprint("base", __name__) +render_template = make_template_renderer("home") + + +@base.route("/favicon.ico", methods=["GET"]) +def favicon(): + """Return the favicon.""" + return 
send_from_directory(os.path.join(app.root_path, "static"), + "images/CITGLogo.png", + mimetype="image/png") + + +@base.route("/", methods=["GET"]) +def index(): + """Load the landing page""" + return render_template("index.html" if user_logged_in() else "login.html", + gn2server_intro=urljoin(app.config["GN2_SERVER_URL"], + "/intro")) + +def appenv(): + """Get app's guix environment path.""" + return os.environ.get("GN_UPLOADER_ENVIRONMENT") + +@base.route("/bootstrap/<path:filename>") +def bootstrap(filename): + """Fetch bootstrap files.""" + return send_from_directory( + appenv(), f"share/genenetwork2/javascript/bootstrap/{filename}") + + +@base.route("/jquery/<path:filename>") +def jquery(filename): + """Fetch jquery files.""" + return send_from_directory( + appenv(), f"share/genenetwork2/javascript/jquery/{filename}") + + +@base.route("/node-modules/<path:filename>") +def node_modules(filename): + """Fetch node-js modules.""" + return send_from_directory( + appenv(), f"lib/node_modules/{filename}") diff --git a/qc_app/check_connections.py b/uploader/check_connections.py index ceccc32..2561e55 100644 --- a/qc_app/check_connections.py +++ b/uploader/check_connections.py @@ -5,7 +5,7 @@ import traceback import redis import MySQLdb -from qc_app.db_utils import database_connection +from uploader.db_utils import database_connection def check_redis(uri: str): "Check the redis connection" diff --git a/uploader/datautils.py b/uploader/datautils.py new file mode 100644 index 0000000..46a55c4 --- /dev/null +++ b/uploader/datautils.py @@ -0,0 +1,38 @@ +"""Generic data utilities: Rename module.""" +import math +from functools import reduce +from typing import Union, Sequence + +def enumerate_sequence(seq: Sequence[dict], start:int = 1) -> Sequence[dict]: + """Enumerate sequence beginning at 1""" + return tuple({**item, "sequence_number": seqno} + for seqno, item in enumerate(seq, start=start)) + + +def order_by_family(items: tuple[dict, ...], + family_key: str = "Family", + 
order_key: str = "FamilyOrderId") -> list: + """Order the populations by their families.""" + def __family_order__(item): + orderval = item[order_key] + return math.inf if orderval is None else orderval + + def __order__(ordered, current): + _key = (__family_order__(current), current[family_key]) + return { + **ordered, + _key: ordered.get(_key, tuple()) + (current,) + } + + return sorted(tuple(reduce(__order__, items, {}).items()), + key=lambda item: item[0][0]) + + +def safe_int(val: Union[str, int, float]) -> int: + """ + Convert val into an integer: if val cannot be converted, return a zero. + """ + try: + return int(val) + except ValueError: + return 0 diff --git a/uploader/db/__init__.py b/uploader/db/__init__.py new file mode 100644 index 0000000..d2b1d9d --- /dev/null +++ b/uploader/db/__init__.py @@ -0,0 +1,2 @@ +"""Database functions""" +from .datasets import geno_datasets_by_species_and_population diff --git a/qc_app/db/averaging.py b/uploader/db/averaging.py index 62bbe67..62bbe67 100644 --- a/qc_app/db/averaging.py +++ b/uploader/db/averaging.py diff --git a/qc_app/db/datasets.py b/uploader/db/datasets.py index 767ec41..767ec41 100644 --- a/qc_app/db/datasets.py +++ b/uploader/db/datasets.py diff --git a/qc_app/db/tissues.py b/uploader/db/tissues.py index 9fe7bab..9fe7bab 100644 --- a/qc_app/db/tissues.py +++ b/uploader/db/tissues.py diff --git a/qc_app/db_utils.py b/uploader/db_utils.py index ef26398..d31e2c2 100644 --- a/qc_app/db_utils.py +++ b/uploader/db_utils.py @@ -3,10 +3,11 @@ import logging import traceback import contextlib from urllib.parse import urlparse -from typing import Any, Tuple, Optional, Iterator, Callable +from typing import Any, Tuple, Iterator, Callable import MySQLdb as mdb from redis import Redis +from MySQLdb.cursors import Cursor from flask import current_app as app def parse_db_url(db_url) -> Tuple: @@ -19,10 +20,9 @@ def parse_db_url(db_url) -> Tuple: @contextlib.contextmanager -def database_connection(db_url: 
Optional[str] = None) -> Iterator[mdb.Connection]: +def database_connection(db_url: str) -> Iterator[mdb.Connection]: """function to create db connector""" - host, user, passwd, db_name, db_port = parse_db_url( - db_url or app.config["SQL_URI"]) + host, user, passwd, db_name, db_port = parse_db_url(db_url) connection = mdb.connect( host, user, passwd, db_name, port=(db_port or 3306)) try: @@ -44,3 +44,11 @@ def with_redis_connection(func: Callable[[Redis], Any]) -> Any: redisuri = app.config["REDIS_URL"] with Redis.from_url(redisuri, decode_responses=True) as rconn: return func(rconn) + + +def debug_query(cursor: Cursor): + """Debug the actual query run with MySQLdb""" + for attr in ("_executed", "statement", "_last_executed"): + if hasattr(cursor, attr): + logging.debug("MySQLdb QUERY: %s", getattr(cursor, attr)) + break diff --git a/uploader/default_settings.py b/uploader/default_settings.py new file mode 100644 index 0000000..1acb247 --- /dev/null +++ b/uploader/default_settings.py @@ -0,0 +1,20 @@ +""" +The default configuration file. The values here should be overridden in the +actual configuration file used for the production and staging systems. +""" +LOG_LEVEL = "WARNING" +SECRET_KEY = b"<Please! Please! Please! Change This!>" +UPLOAD_FOLDER = "/tmp/qc_app_files" +REDIS_URL = "redis://" +JOBS_TTL_SECONDS = 1209600 # 14 days +GNQC_REDIS_PREFIX="gn-uploader" +SQL_URI = "" + +GN2_SERVER_URL = "https://genenetwork.org/" + +SESSION_TYPE = "redis" +SESSION_PERMANENT = True +SESSION_USE_SIGNER = True + +JWKS_ROTATION_AGE_DAYS = 7 # Days (from creation) to keep a JWK in use. +JWKS_DELETION_AGE_DAYS = 14 # Days (from creation) to keep a JWK around before deleting it. 
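The settings above are layered by `create_app`: the defaults are loaded first, then the file named by `UPLOADER_CONF`, and finally same-named environment variables override whatever is set. The following is a minimal sketch of that last step only, using a plain dict in place of the Flask config. The names `override_with_envvars` and `fake_env` are hypothetical and not part of the commit. One consequence worth noting is that values taken from the environment are always strings, so a numeric default such as `JOBS_TTL_SECONDS` would need converting after an override.

```python
import os


def override_with_envvars(config: dict, environ=os.environ, ignore=()) -> dict:
    """Return a copy of `config` with values replaced by same-named env vars.

    Illustrative stand-in for `override_settings_with_envvars`; it works on a
    plain dict rather than on `app.config`.
    """
    return {
        # An unset or empty environment variable falls back to the current value.
        key: (value if key in ignore else environ.get(key) or value)
        for key, value in config.items()
    }


# Two of the defaults from `default_settings.py`.
defaults = {"LOG_LEVEL": "WARNING", "JOBS_TTL_SECONDS": 1209600}

# A stand-in for `os.environ`, so the example does not depend on the real environment.
fake_env = {"LOG_LEVEL": "DEBUG", "JOBS_TTL_SECONDS": "60", "UNRELATED": "x"}

merged = override_with_envvars(defaults, environ=fake_env)
print(merged)
```

Only keys already present in the configuration are considered, so `UNRELATED` is ignored, and `JOBS_TTL_SECONDS` comes back as the string `"60"` rather than an integer.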
diff --git a/qc_app/errors.py b/uploader/errors.py index 3e7c893..3e7c893 100644 --- a/qc_app/errors.py +++ b/uploader/errors.py diff --git a/uploader/expression_data/__init__.py b/uploader/expression_data/__init__.py new file mode 100644 index 0000000..fc8bd41 --- /dev/null +++ b/uploader/expression_data/__init__.py @@ -0,0 +1,2 @@ +"""Package handling upload of files.""" +from .views import exprdatabp diff --git a/qc_app/dbinsert.py b/uploader/expression_data/dbinsert.py index ef08423..32ca359 100644 --- a/qc_app/dbinsert.py +++ b/uploader/expression_data/dbinsert.py @@ -11,10 +11,12 @@ from flask import ( flash, request, url_for, Blueprint, redirect, render_template, current_app as app) -from qc_app.db_utils import with_db_connection, database_connection -from qc_app.db import species, species_by_id, populations_by_species - -from . import jobs +from uploader import jobs +from uploader.authorisation import require_login +from uploader.population.models import populations_by_species +from uploader.species.models import all_species, species_by_id +from uploader.platforms.models import platform_by_species_and_id +from uploader.db_utils import with_db_connection, database_connection dbinsertbp = Blueprint("dbinsert", __name__) @@ -40,25 +42,17 @@ def genechips(): return {**acc, speciesid: (chip,)} return {**acc, speciesid: acc[speciesid] + (chip,)} - with database_connection() as conn: + with database_connection(app.config["SQL_URI"]) as conn: with conn.cursor(cursorclass=DictCursor) as cursor: cursor.execute("SELECT * FROM GeneChip ORDER BY GeneChipName ASC") return reduce(__organise_by_species__, cursor.fetchall(), {}) return {} -def platform_by_id(genechipid:int) -> Union[dict, None]: - "Retrieve the gene platform by id" - with database_connection() as conn: - with conn.cursor(cursorclass=DictCursor) as cursor: - cursor.execute( - "SELECT * FROM GeneChip WHERE GeneChipId=%s", - (genechipid,)) - return cursor.fetchone() def 
studies_by_species_and_platform(speciesid:int, genechipid:int) -> tuple: "Retrieve the studies by the related species and gene platform" - with database_connection() as conn: + with database_connection(app.config["SQL_URI"]) as conn: with conn.cursor(cursorclass=DictCursor) as cursor: query = ( "SELECT Species.SpeciesId, ProbeFreeze.* " @@ -82,7 +76,7 @@ def organise_groups_by_family(acc:dict, group:dict) -> dict: def tissues() -> tuple: "Retrieve type (Tissue) information from the database." - with database_connection() as conn: + with database_connection(app.config["SQL_URI"]) as conn: with conn.cursor(cursorclass=DictCursor) as cursor: cursor.execute("SELECT * FROM Tissue ORDER BY Name") return tuple(cursor.fetchall()) @@ -90,6 +84,7 @@ def tissues() -> tuple: return tuple() @dbinsertbp.route("/platform", methods=["POST"]) +@require_login def select_platform(): "Select the platform (GeneChipId) used for the data." job_id = request.form["job_id"] @@ -105,7 +100,7 @@ def select_platform(): return render_template( "select_platform.html", filename=filename, filetype=job["filetype"], totallines=int(job["currentline"]), - default_species=default_species, species=species(conn), + default_species=default_species, species=all_species(conn), genechips=gchips[default_species], genechips_data=json.dumps(gchips)) return render_error(f"File '{filename}' no longer exists.") @@ -113,6 +108,7 @@ def select_platform(): return render_error("Unknown error") @dbinsertbp.route("/study", methods=["POST"]) +@require_login def select_study(): "View to select/create the study (ProbeFreeze) associated with the data." form = request.form @@ -142,6 +138,7 @@ def select_study(): return render_error(f"Missing data: {aserr.args[0]}") @dbinsertbp.route("/create-study", methods=["POST"]) +@require_login def create_study(): "Create a new study (ProbeFreeze)." 
form = request.form @@ -154,7 +151,7 @@ def create_study(): assert form.get("inbredsetid"), "group" assert form.get("tissueid"), "type/tissue" - with database_connection() as conn: + with database_connection(app.config["SQL_URI"]) as conn: with conn.cursor(cursorclass=DictCursor) as cursor: values = ( form["genechipid"], @@ -186,7 +183,7 @@ def create_study(): def datasets_by_study(studyid:int) -> tuple: "Retrieve datasets associated with a study with the ID `studyid`." - with database_connection() as conn: + with database_connection(app.config["SQL_URI"]) as conn: with conn.cursor(cursorclass=DictCursor) as cursor: query = "SELECT * FROM ProbeSetFreeze WHERE ProbeFreezeId=%s" cursor.execute(query, (studyid,)) @@ -196,7 +193,7 @@ def datasets_by_study(studyid:int) -> tuple: def averaging_methods() -> tuple: "Retrieve averaging methods from database" - with database_connection() as conn: + with database_connection(app.config["SQL_URI"]) as conn: with conn.cursor(cursorclass=DictCursor) as cursor: cursor.execute("SELECT * FROM AvgMethod") return tuple(cursor.fetchall()) @@ -205,7 +202,7 @@ def averaging_methods() -> tuple: def dataset_datascales() -> tuple: "Retrieve datascales from database" - with database_connection() as conn: + with database_connection(app.config["SQL_URI"]) as conn: with conn.cursor() as cursor: cursor.execute( 'SELECT DISTINCT DataScale FROM ProbeSetFreeze ' @@ -218,6 +215,7 @@ def dataset_datascales() -> tuple: return tuple() @dbinsertbp.route("/dataset", methods=["POST"]) +@require_login def select_dataset(): "Select the dataset to add the file contents against" form = request.form @@ -238,6 +236,7 @@ def select_dataset(): return render_error(f"Missing data: {aserr.args[0]}") @dbinsertbp.route("/create-dataset", methods=["POST"]) +@require_login def create_dataset(): "Select the dataset to add the file contents against" form = request.form @@ -255,7 +254,7 @@ def create_dataset(): assert form.get("datasetconfidentiality"), "Dataset 
confidentiality" assert form.get("datasetdatascale"), "Dataset Datascale" - with database_connection() as conn: + with database_connection(app.config["SQL_URI"]) as conn: with conn.cursor(cursorclass=DictCursor) as cursor: datasetname = form["datasetname"] cursor.execute("SELECT * FROM ProbeSetFreeze WHERE Name=%s", @@ -293,7 +292,7 @@ def create_dataset(): def study_by_id(studyid:int) -> Union[dict, None]: "Get a study by its Id" - with database_connection() as conn: + with database_connection(app.config["SQL_URI"]) as conn: with conn.cursor(cursorclass=DictCursor) as cursor: cursor.execute( "SELECT * FROM ProbeFreeze WHERE Id=%s", @@ -302,7 +301,7 @@ def study_by_id(studyid:int) -> Union[dict, None]: def dataset_by_id(datasetid:int) -> Union[dict, None]: "Retrieve a dataset by its id" - with database_connection() as conn: + with database_connection(app.config["SQL_URI"]) as conn: with conn.cursor(cursorclass=DictCursor) as cursor: cursor.execute( ("SELECT AvgMethod.Name AS AvgMethodName, ProbeSetFreeze.* " @@ -317,41 +316,44 @@ def selected_keys(original: dict, keys: tuple) -> dict: return {key: value for key,value in original.items() if key in keys} @dbinsertbp.route("/final-confirmation", methods=["POST"]) +@require_login def final_confirmation(): "Preview the data before triggering entry into the database" - form = request.form - try: - assert form.get("filename"), "filename" - assert form.get("filetype"), "filetype" - assert form.get("species"), "species" - assert form.get("genechipid"), "platform" - assert form.get("studyid"), "study" - assert form.get("datasetid"), "dataset" - - speciesid = form["species"] - genechipid = form["genechipid"] - studyid = form["studyid"] - datasetid=form["datasetid"] - return render_template( - "final_confirmation.html", filename=form["filename"], - filetype=form["filetype"], totallines=form["totallines"], - species=speciesid, genechipid=genechipid, studyid=studyid, - datasetid=datasetid, the_species=selected_keys( - 
with_db_connection(lambda conn: species_by_id(conn, speciesid)), - ("SpeciesName", "Name", "MenuName")), - platform=selected_keys( - platform_by_id(genechipid), - ("GeneChipName", "Name", "GeoPlatform", "Title", "GO_tree_value")), - study=selected_keys( - study_by_id(studyid), ("Name", "FullName", "ShortName")), - dataset=selected_keys( - dataset_by_id(datasetid), - ("AvgMethodName", "Name", "Name2", "FullName", "ShortName", - "DataScale"))) - except AssertionError as aserr: - return render_error(f"Missing data: {aserr.args[0]}") + with database_connection(app.config["SQL_URI"]) as conn: + form = request.form + try: + assert form.get("filename"), "filename" + assert form.get("filetype"), "filetype" + assert form.get("species"), "species" + assert form.get("genechipid"), "platform" + assert form.get("studyid"), "study" + assert form.get("datasetid"), "dataset" + + speciesid = form["species"] + genechipid = form["genechipid"] + studyid = form["studyid"] + datasetid=form["datasetid"] + return render_template( + "final_confirmation.html", filename=form["filename"], + filetype=form["filetype"], totallines=form["totallines"], + species=speciesid, genechipid=genechipid, studyid=studyid, + datasetid=datasetid, the_species=selected_keys( + with_db_connection(lambda conn: species_by_id(conn, speciesid)), + ("SpeciesName", "Name", "MenuName")), + platform=selected_keys( + platform_by_species_and_id(conn, speciesid, genechipid), + ("GeneChipName", "Name", "GeoPlatform", "Title", "GO_tree_value")), + study=selected_keys( + study_by_id(studyid), ("Name", "FullName", "ShortName")), + dataset=selected_keys( + dataset_by_id(datasetid), + ("AvgMethodName", "Name", "Name2", "FullName", "ShortName", + "DataScale"))) + except AssertionError as aserr: + return render_error(f"Missing data: {aserr.args[0]}") @dbinsertbp.route("/insert-data", methods=["POST"]) +@require_login def insert_data(): "Trigger data insertion" form = request.form diff --git a/uploader/expression_data/views.py 
b/uploader/expression_data/views.py new file mode 100644 index 0000000..bbe6538 --- /dev/null +++ b/uploader/expression_data/views.py @@ -0,0 +1,384 @@ +"""Views for expression data""" +import os +import uuid +import mimetypes +from typing import Tuple +from zipfile import ZipFile, is_zipfile + +import jsonpickle +from redis import Redis +from werkzeug.utils import secure_filename +from flask import (flash, + request, + url_for, + redirect, + Blueprint, + current_app as app) + +from quality_control.errors import InvalidValue, DuplicateHeading + +from uploader import jobs +from uploader.datautils import order_by_family +from uploader.ui import make_template_renderer +from uploader.authorisation import require_login +from uploader.species.models import all_species, species_by_id +from uploader.db_utils import with_db_connection, database_connection +from uploader.population.models import (populations_by_species, + population_by_species_and_id) + +exprdatabp = Blueprint("expression-data", __name__) +render_template = make_template_renderer("expression-data") + +def isinvalidvalue(item): + """Check whether item is of type InvalidValue""" + return isinstance(item, InvalidValue) + + +def isduplicateheading(item): + """Check whether item is of type DuplicateHeading""" + return isinstance(item, DuplicateHeading) + + +def errors(rqst) -> Tuple[str, ...]: + """Return a tuple of the errors found in the request `rqst`. If no error is + found, then an empty tuple is returned.""" + def __filetype_error__(): + return ( + ("Invalid file type provided.",) + if rqst.form.get("filetype") not in ("average", "standard-error") + else tuple()) + + def __file_missing_error__(): + return ( + ("No file was uploaded.",) + if ("qc_text_file" not in rqst.files or + rqst.files["qc_text_file"].filename == "") + else tuple()) + + def __file_mimetype_error__(): + text_file = rqst.files["qc_text_file"] + return ( + ( + ("Invalid file! 
Expected a tab-separated-values file, or a zip " + "file of a tab-separated-values file."),) + if text_file.mimetype not in ( + "text/plain", "text/tab-separated-values", + "application/zip") + else tuple()) + + return ( + __filetype_error__() + + (__file_missing_error__() or __file_mimetype_error__())) + + +def zip_file_errors(filepath, upload_dir) -> Tuple[str, ...]: + """Check the uploaded zip file for errors.""" + zfile_errors: Tuple[str, ...] = tuple() + if is_zipfile(filepath): + with ZipFile(filepath, "r") as zfile: + infolist = zfile.infolist() + if len(infolist) != 1: + zfile_errors = zfile_errors + ( + ("Expected exactly one (1) member file within the uploaded zip " + f"file. Got {len(infolist)} member files."),) + if len(infolist) == 1 and infolist[0].is_dir(): + zfile_errors = zfile_errors + ( + ("Expected a member text file in the uploaded zip file. Got a " + "directory/folder."),) + + if len(infolist) == 1 and not infolist[0].is_dir(): + zfile.extract(infolist[0], path=upload_dir) + mime = mimetypes.guess_type(f"{upload_dir}/{infolist[0].filename}") + if mime[0] != "text/tab-separated-values": + zfile_errors = zfile_errors + ( + ("Expected the member text file in the uploaded zip file to" + " be a tab-separated file."),) + + return zfile_errors + + +@exprdatabp.route("populations/expression-data", methods=["GET"]) +@require_login +def index(): + """Display the expression data index page.""" + with database_connection(app.config["SQL_URI"]) as conn: + if not bool(request.args.get("species_id")): + return render_template("expression-data/index.html", + species=order_by_family(all_species(conn)), + activelink="expression-data") + species = species_by_id(conn, request.args.get("species_id")) + if not bool(species): + flash("Could not find species selected!", "alert-danger") + return redirect(url_for("species.populations.expression-data.index")) + return redirect(url_for( + "species.populations.expression-data.select_population", + 
species_id=species["SpeciesId"])) + + +@exprdatabp.route("<int:species_id>/populations/expression-data/select-population", + methods=["GET"]) +@require_login +def select_population(species_id: int): + """Select the expression data's population.""" + with database_connection(app.config["SQL_URI"]) as conn: + species = species_by_id(conn, species_id) + if not bool(species): + flash("No such species!", "alert-danger") + return redirect(url_for("species.populations.expression-data.index")) + + if not bool(request.args.get("population_id")): + return render_template("expression-data/select-population.html", + species=species, + populations=order_by_family( + populations_by_species(conn, species_id), + order_key="FamilyOrder"), + activelink="expression-data") + + population = population_by_species_and_id( + conn, species_id, request.args.get("population_id")) + if not bool(population): + flash("No such population!", "alert-danger") + return redirect(url_for( + "species.populations.expression-data.select_population", + species_id=species_id)) + + return redirect(url_for("species.populations.expression-data.upload_file", + species_id=species_id, + population_id=population["Id"])) + + +@exprdatabp.route("<int:species_id>/populations/<int:population_id>/" + "expression-data/upload", + methods=["GET", "POST"]) +@require_login +def upload_file(species_id: int, population_id: int): + """Enables uploading the files""" + with database_connection(app.config["SQL_URI"]) as conn: + species = species_by_id(conn, species_id) + population = population_by_species_and_id(conn, species_id, population_id) + if request.method == "GET": + return render_template("expression-data/select-file.html", + species=species, + population=population) + + upload_dir = app.config["UPLOAD_FOLDER"] + request_errors = errors(request) + if request_errors: + for error in request_errors: + flash(error, "alert-danger error-expr-data") + return redirect(url_for("species.populations.expression-data.upload_file")) 
+ + filename = secure_filename( + request.files["qc_text_file"].filename)# type: ignore[arg-type] + if not os.path.exists(upload_dir): + os.mkdir(upload_dir) + + filepath = os.path.join(upload_dir, filename) + request.files["qc_text_file"].save(os.path.join(upload_dir, filename)) + + zip_errors = zip_file_errors(filepath, upload_dir) + if zip_errors: + for error in zip_errors: + flash(error, "alert-danger error-expr-data") + return redirect(url_for("species.populations.expression-data.index.upload_file")) + + return redirect(url_for("species.populations.expression-data.parse_file", + species_id=species_id, + population_id=population_id, + filename=filename, + filetype=request.form["filetype"])) + + +@exprdatabp.route("/data-review", methods=["GET"]) +@require_login +def data_review(): + """Provide some help on data expectations to the user.""" + return render_template("expression-data/data-review.html") + + +@exprdatabp.route( + "<int:species_id>/populations/<int:population_id>/expression-data/parse", + methods=["GET"]) +@require_login +def parse_file(species_id: int, population_id: int): + """Trigger file parsing""" + _errors = False + filename = request.args.get("filename") + filetype = request.args.get("filetype") + + species = with_db_connection(lambda con: species_by_id(con, species_id)) + if not bool(species): + flash("No such species.", "alert-danger") + _errors = True + + if filename is None: + flash("No file provided", "alert-danger") + _errors = True + + if filetype is None: + flash("No filetype provided", "alert-danger") + _errors = True + + if filetype not in ("average", "standard-error"): + flash("Invalid filetype provided", "alert-danger") + _errors = True + + if filename: + filepath = os.path.join(app.config["UPLOAD_FOLDER"], filename) + if not os.path.exists(filepath): + flash("Selected file does not exist (any longer)", "alert-danger") + _errors = True + + if _errors: + return redirect(url_for("species.populations.expression-data.upload_file")) + + 
redisurl = app.config["REDIS_URL"] + with Redis.from_url(redisurl, decode_responses=True) as rconn: + job = jobs.launch_job( + jobs.build_file_verification_job( + rconn, app.config["SQL_URI"], redisurl, + species_id, filepath, filetype,# type: ignore[arg-type] + app.config["JOBS_TTL_SECONDS"]), + redisurl, + f"{app.config['UPLOAD_FOLDER']}/job_errors") + + return redirect(url_for("species.populations.expression-data.parse_status", + species_id=species_id, + population_id=population_id, + job_id=job["jobid"])) + + +@exprdatabp.route( + "<int:species_id>/populations/<int:population_id>/expression-data/parse/" + "status/<uuid:job_id>", + methods=["GET"]) +@require_login +def parse_status(species_id: int, population_id: int, job_id: str): + "Retrieve the status of the job" + with Redis.from_url(app.config["REDIS_URL"], decode_responses=True) as rconn: + try: + job = jobs.job(rconn, jobs.jobsnamespace(), job_id) + except jobs.JobNotFound as _exc: + return render_template("no_such_job.html", job_id=job_id), 400 + + error_filename = jobs.error_filename( + job_id, f"{app.config['UPLOAD_FOLDER']}/job_errors") + if os.path.exists(error_filename): + stat = os.stat(error_filename) + if stat.st_size > 0: + return redirect(url_for("parse.fail", job_id=job_id)) + + job_id = job["jobid"] + progress = float(job["percent"]) + status = job["status"] + filename = job.get("filename", "uploaded file") + _errors = jsonpickle.decode( + job.get("errors", jsonpickle.encode(tuple()))) + if status in ("success", "aborted"): + return redirect(url_for("species.populations.expression-data.results", + species_id=species_id, + population_id=population_id, + job_id=job_id)) + + if status == "parse-error": + return redirect(url_for("species.populations.expression-data.fail", job_id=job_id)) + + app.jinja_env.globals.update( + isinvalidvalue=isinvalidvalue, + isduplicateheading=isduplicateheading) + return render_template( + "expression-data/job-progress.html", + job_id = job_id, + job_status = 
status, + progress = progress, + message = job.get("message", ""), + job_name = f"Parsing '{filename}'", + errors=_errors, + species=with_db_connection( + lambda conn: species_by_id(conn, species_id)), + population=with_db_connection( + lambda conn: population_by_species_and_id( + conn, species_id, population_id))) + + +@exprdatabp.route( + "<int:species_id>/populations/<int:population_id>/expression-data/parse/" + "<uuid:job_id>/results", + methods=["GET"]) +@require_login +def results(species_id: int, population_id: int, job_id: uuid.UUID): + """Show results of parsing...""" + with Redis.from_url(app.config["REDIS_URL"], decode_responses=True) as rconn: + job = jobs.job(rconn, jobs.jobsnamespace(), job_id) + + if job: + filename = job["filename"] + _errors = jsonpickle.decode(job.get("errors", jsonpickle.encode(tuple()))) + app.jinja_env.globals.update( + isinvalidvalue=isinvalidvalue, + isduplicateheading=isduplicateheading) + return render_template( + "expression-data/parse-results.html", + errors=_errors, + job_name = f"Parsing '{filename}'", + user_aborted = job.get("user_aborted"), + job_id=job["jobid"], + species=with_db_connection( + lambda conn: species_by_id(conn, species_id)), + population=with_db_connection( + lambda conn: population_by_species_and_id( + conn, species_id, population_id))) + + return render_template("expression-data/no-such-job.html", job_id=job_id) + + +@exprdatabp.route( + "<int:species_id>/populations/<int:population_id>/expression-data/parse/" + "<uuid:job_id>/fail", + methods=["GET"]) +@require_login +def fail(species_id: int, population_id: int, job_id: str): + """Handle parsing failure""" + with Redis.from_url(app.config["REDIS_URL"], decode_responses=True) as rconn: + job = jobs.job(rconn, jobs.jobsnamespace(), job_id) + + if job: + error_filename = jobs.error_filename( + job_id, f"{app.config['UPLOAD_FOLDER']}/job_errors") + if os.path.exists(error_filename): + stat = os.stat(error_filename) + if stat.st_size > 0: + return 
render_template( + "worker_failure.html", job_id=job_id) + + return render_template("parse_failure.html", job=job) + + return render_template("expression-data/no-such-job.html", + **with_db_connection(lambda conn: { + "species_id": species_by_id(conn, species_id), + "population_id": population_by_species_and_id( + conn, species_id, population_id)}), + job_id=job_id) + + +@exprdatabp.route( + "<int:species_id>/populations/<int:population_id>/expression-data/parse/" + "abort", + methods=["POST"]) +@require_login +def abort(species_id: int, population_id: int): + """Handle user request to abort file processing""" + job_id = request.form["job_id"] + + with Redis.from_url(app.config["REDIS_URL"], decode_responses=True) as rconn: + job = jobs.job(rconn, jobs.jobsnamespace(), job_id) + + if job: + rconn.hset(name=jobs.job_key(jobs.jobsnamespace(), job_id), + key="user_aborted", + value=int(True)) + + return redirect(url_for("species.populations.expression-data.parse_status", + species_id=species_id, + population_id=population_id, + job_id=job_id)) diff --git a/qc_app/files.py b/uploader/files.py index b163612..d37a53e 100644 --- a/qc_app/files.py +++ b/uploader/files.py @@ -1,7 +1,9 @@ """Utilities to deal with uploaded files.""" import hashlib from pathlib import Path +from typing import Iterator from datetime import datetime + from flask import current_app from werkzeug.utils import secure_filename @@ -21,6 +23,27 @@ def save_file(fileobj: FileStorage, upload_dir: Path) -> Path: fileobj.save(filepath) return filepath + def fullpath(filename: str): """Get a file's full path. 
This makes use of `flask.current_app`.""" return Path(current_app.config["UPLOAD_FOLDER"], filename).absolute() + + +def chunked_binary_read(filepath: Path, chunksize: int = 2048) -> Iterator: + """Read a file in binary mode in chunks.""" + with open(filepath, "rb") as inputfile: + while True: + data = inputfile.read(chunksize) + if data != b"": + yield data + continue + break + + +def sha256_digest_over_file(filepath: Path) -> str: + """Compute the sha256 digest over a file's contents.""" + filehash = hashlib.sha256() + for chunk in chunked_binary_read(filepath): + filehash.update(chunk) + + return filehash.hexdigest() diff --git a/uploader/genotypes/__init__.py b/uploader/genotypes/__init__.py new file mode 100644 index 0000000..d0025d6 --- /dev/null +++ b/uploader/genotypes/__init__.py @@ -0,0 +1 @@ +"""The Genotypes module.""" diff --git a/uploader/genotypes/models.py b/uploader/genotypes/models.py new file mode 100644 index 0000000..44c98b1 --- /dev/null +++ b/uploader/genotypes/models.py @@ -0,0 +1,101 @@ +"""Functions for handling genotypes.""" +from typing import Optional +from datetime import datetime + +import MySQLdb as mdb +from MySQLdb.cursors import Cursor, DictCursor + +from uploader.db_utils import debug_query + +def genocode_by_population( + conn: mdb.Connection, population_id: int) -> tuple[dict, ...]: + """Get the allele/genotype codes.""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute("SELECT * FROM GenoCode WHERE InbredSetId=%s", + (population_id,)) + return tuple(dict(item) for item in cursor.fetchall()) + + +def genotype_markers_count(conn: mdb.Connection, species_id: int) -> int: + """Find the total count of the genotype markers for a species.""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute( + "SELECT COUNT(Name) AS markers_count FROM Geno WHERE SpeciesId=%s", + (species_id,)) + return int(cursor.fetchone()["markers_count"]) + + +def genotype_markers( + conn: mdb.Connection, + species_id: 
int, + offset: int = 0, + limit: Optional[int] = None +) -> tuple[dict, ...]: + """Retrieve markers from the database.""" + _query = "SELECT * FROM Geno WHERE SpeciesId=%s" + if bool(limit) and limit > 0:# type: ignore[operator] + _query = _query + f" LIMIT {limit} OFFSET {offset}" + + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute(_query, (species_id,)) + debug_query(cursor) + return tuple(dict(row) for row in cursor.fetchall()) + + +def genotype_dataset( + conn: mdb.Connection, + species_id: int, + population_id: int, + dataset_id: Optional[int] = None +) -> Optional[dict]: + """Retrieve genotype datasets from the database. + + Apparently, you should only ever have one genotype dataset for a population. + """ + _query = ( + "SELECT gf.* FROM Species AS s INNER JOIN InbredSet AS iset " + "ON s.Id=iset.SpeciesId INNER JOIN GenoFreeze AS gf " + "ON iset.Id=gf.InbredSetId " + "WHERE s.Id=%s AND iset.Id=%s") + _params = (species_id, population_id) + if bool(dataset_id): + _query = _query + " AND gf.Id=%s" + _params = _params + (dataset_id,)# type: ignore[assignment] + + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute(_query, _params) + debug_query(cursor) + result = cursor.fetchone() + if bool(result): + return dict(result) + return None + + +def save_new_dataset( + cursor: Cursor, + population_id: int, + name: str, + fullname: str, + shortname: str +) -> dict: + """Save a new genotype dataset into the database.""" + params = { + "InbredSetId": population_id, + "Name": name, + "FullName": fullname, + "ShortName": shortname, + "CreateTime": datetime.now().date().isoformat(), + "public": 2, + "confidentiality": 0, + "AuthorisedUsers": None + } + cursor.execute( + "INSERT INTO GenoFreeze(" + "Name, FullName, ShortName, CreateTime, public, InbredSetId, " + "confidentiality, AuthorisedUsers" + ") VALUES (" + "%(Name)s, %(FullName)s, %(ShortName)s, %(CreateTime)s, %(public)s, " + "%(InbredSetId)s, %(confidentiality)s, 
%(AuthorisedUsers)s" + ")", + params) + return {**params, "Id": cursor.lastrowid} diff --git a/uploader/genotypes/views.py b/uploader/genotypes/views.py new file mode 100644 index 0000000..0821eca --- /dev/null +++ b/uploader/genotypes/views.py @@ -0,0 +1,204 @@ +"""Views for the genotypes.""" +from MySQLdb.cursors import DictCursor +from flask import (flash, + request, + url_for, + redirect, + Blueprint, + render_template, + current_app as app) + +from uploader.ui import make_template_renderer +from uploader.oauth2.client import oauth2_post +from uploader.authorisation import require_login +from uploader.db_utils import database_connection +from uploader.species.models import all_species, species_by_id +from uploader.monadic_requests import make_either_error_handler +from uploader.request_checks import with_species, with_population +from uploader.datautils import safe_int, order_by_family, enumerate_sequence +from uploader.population.models import (populations_by_species, + population_by_species_and_id) + +from .models import (genotype_markers, + genotype_dataset, + save_new_dataset, + genotype_markers_count, + genocode_by_population) + +genotypesbp = Blueprint("genotypes", __name__) +render_template = make_template_renderer("genotypes") + +@genotypesbp.route("populations/genotypes", methods=["GET"]) +@require_login +def index(): + """Direct entry-point for genotypes.""" + with database_connection(app.config["SQL_URI"]) as conn: + if not bool(request.args.get("species_id")): + return render_template("genotypes/index.html", + species=order_by_family(all_species(conn)), + activelink="genotypes") + species = species_by_id(conn, request.args.get("species_id")) + if not bool(species): + flash(f"Could not find species with ID '{request.args.get('species_id')}'!", + "alert-danger") + return redirect(url_for("species.populations.genotypes.index")) + return redirect(url_for("species.populations.genotypes.select_population", + species_id=species["SpeciesId"])) + + 
+@genotypesbp.route("/<int:species_id>/populations/genotypes/select-population", + methods=["GET"]) +@require_login +@with_species(redirect_uri="species.populations.genotypes.index") +def select_population(species: dict, species_id: int): + """Select the population under which the genotypes go.""" + with database_connection(app.config["SQL_URI"]) as conn: + if not bool(request.args.get("population_id")): + return render_template("genotypes/select-population.html", + species=species, + populations=order_by_family( + populations_by_species(conn, species_id), + order_key="FamilyOrder"), + activelink="genotypes") + + population = population_by_species_and_id( + conn, species_id, request.args.get("population_id")) + if not bool(population): + flash("Invalid population selected!", "alert-danger") + return redirect(url_for( + "species.populations.genotypes.select_population", + species_id=species_id)) + + return redirect(url_for("species.populations.genotypes.list_genotypes", + species_id=species_id, + population_id=population["Id"])) + + +@genotypesbp.route( + "/<int:species_id>/populations/<int:population_id>/genotypes", + methods=["GET"]) +@require_login +@with_population(species_redirect_uri="species.populations.genotypes.index", + redirect_uri="species.populations.genotypes.select_population") +def list_genotypes(species: dict, population: dict, **kwargs):# pylint: disable=[unused-argument] + """List genotype details for species and population.""" + with database_connection(app.config["SQL_URI"]) as conn: + return render_template("genotypes/list-genotypes.html", + species=species, + population=population, + genocode=genocode_by_population( + conn, population["Id"]), + total_markers=genotype_markers_count( + conn, species["SpeciesId"]), + dataset=genotype_dataset(conn, + species["SpeciesId"], + population["Id"]), + activelink="list-genotypes") + + +@genotypesbp.route("/<int:species_id>/genotypes/list-markers", methods=["GET"]) +@require_login 
+@with_species(redirect_uri="species.populations.genotypes.index") +def list_markers(species: dict, **kwargs):# pylint: disable=[unused-argument] + """List a species' genetic markers.""" + with database_connection(app.config["SQL_URI"]) as conn: + start_from = max(safe_int(request.args.get("start_from") or 0), 0) + count = safe_int(request.args.get("count") or 20) + return render_template("genotypes/list-markers.html", + species=species, + total_markers=genotype_markers_count( + conn, species["SpeciesId"]), + start_from=start_from, + count=count, + markers=enumerate_sequence( + genotype_markers(conn, + species["SpeciesId"], + offset=start_from, + limit=count), + start=start_from+1), + activelink="list-markers") + +@genotypesbp.route( + "/<int:species_id>/populations/<int:population_id>/genotypes/datasets/" + "<int:dataset_id>/view", + methods=["GET"]) +@require_login +def view_dataset(species_id: int, population_id: int, dataset_id: int): + """View details regarding a specific dataset.""" + with database_connection(app.config["SQL_URI"]) as conn: + species = species_by_id(conn, species_id) + if not bool(species): + flash("Invalid species provided!", "alert-danger") + return redirect(url_for("species.populations.genotypes.index")) + + population = population_by_species_and_id( + conn, species_id, population_id) + if not bool(population): + flash("Invalid population selected!", "alert-danger") + return redirect(url_for( + "species.populations.genotypes.select_population", + species_id=species_id)) + + dataset = genotype_dataset(conn, species_id, population_id, dataset_id) + if not bool(dataset): + flash("Could not find such a dataset!", "alert-danger") + return redirect(url_for( + "species.populations.genotypes.list_genotypes", + species_id=species_id, + population_id=population_id)) + + return render_template("genotypes/view-dataset.html", + species=species, + population=population, + dataset=dataset, + activelink="view-dataset") + + +@genotypesbp.route( + 
"/<int:species_id>/populations/<int:population_id>/genotypes/datasets/" + "create", + methods=["GET", "POST"]) +@require_login +@with_population(species_redirect_uri="species.populations.genotypes.index", + redirect_uri="species.populations.genotypes.select_population") +def create_dataset(species: dict, population: dict, **kwargs):# pylint: disable=[unused-argument] + """Create a genotype dataset.""" + with (database_connection(app.config["SQL_URI"]) as conn, + conn.cursor(cursorclass=DictCursor) as cursor): + if request.method == "GET": + return render_template("genotypes/create-dataset.html", + species=species, + population=population, + activelink="create-dataset") + + form = request.form + new_dataset = save_new_dataset( + cursor, + population["Id"], + form["geno-dataset-name"], + form["geno-dataset-fullname"], + form["geno-dataset-shortname"]) + + def __success__(_success): + flash("Successfully created genotype dataset.", "alert-success") + return redirect(url_for( + "species.populations.genotypes.list_genotypes", + species_id=species["SpeciesId"], + population_id=population["Id"])) + + return oauth2_post( + "auth/resource/genotypes/create", + json={ + **dict(request.form), + "species_id": species["SpeciesId"], + "population_id": population["Id"], + "dataset_id": new_dataset["Id"], + "dataset_name": form["geno-dataset-name"], + "dataset_fullname": form["geno-dataset-fullname"], + "dataset_shortname": form["geno-dataset-shortname"], + "public": "on" + } + ).either( + make_either_error_handler( + "There was an error creating the genotype dataset."), + __success__) diff --git a/uploader/input_validation.py b/uploader/input_validation.py new file mode 100644 index 0000000..627c69e --- /dev/null +++ b/uploader/input_validation.py @@ -0,0 +1,71 @@ +"""Input validation utilities""" +import re +import json +import base64 +from typing import Any + +def is_empty_string(value: str) -> bool: + """Check whether as string is empty""" + return (isinstance(value, str) and 
value.strip() == "") + + +def is_empty_input(value: Any) -> bool: + """Check whether user provided an empty value.""" + return (value is None or is_empty_string(value)) + + +def is_integer_input(value: Any) -> bool: + """ + Check whether user provided a value that can be parsed into an integer. + """ + def __is_int__(val, base): + try: + int(val, base=base) + except ValueError: + return False + return True + return isinstance(value, int) or ( + (not is_empty_input(value)) and ( + isinstance(value, str) and ( + __is_int__(value, 10) + or __is_int__(value, 8) + or __is_int__(value, 16)))) + + +def is_valid_representative_name(repr_name: str) -> bool: + """ + Check whether the given representative name is valid according to our rules. + + Parameters + ---------- + repr_name: a string of characters. + + Checks For + ---------- + * The name MUST start with a letter [a-zA-Z] + * The name MUST end with a letter [a-zA-Z] or number [0-9] + * The name MUST be composed of letters [a-zA-Z], numbers [0-9], + underscores (_) and/or hyphens (-). + + Returns + ------- + Boolean indicating whether or not the name is valid. 
+ """ + pattern = re.compile(r"^[a-zA-Z]+[a-zA-Z0-9_-]*[a-zA-Z0-9]$") + return bool(pattern.match(repr_name)) + + +def encode_errors(errors: tuple[tuple[str, str], ...], form) -> bytes: + """Encode form errors into base64 string.""" + return base64.b64encode( + json.dumps({ + "errors": dict(errors), + "original_formdata": dict(form) + }).encode("utf8")) + + +def decode_errors(errorstr) -> dict[str, dict]: + """Decode errors from base64 string""" + if not bool(errorstr): + return {"errors": {}, "original_formdata": {}} + return json.loads(base64.b64decode(errorstr.encode("utf8")).decode("utf8")) diff --git a/qc_app/jobs.py b/uploader/jobs.py index 21889da..4a3fc80 100644 --- a/qc_app/jobs.py +++ b/uploader/jobs.py @@ -10,7 +10,7 @@ from typing import Union, Optional from redis import Redis from flask import current_app as app -JOBS_PREFIX = "JOBS" +JOBS_PREFIX = "jobs" class JobNotFound(Exception): """Raised if we try to retrieve a non-existent job.""" diff --git a/uploader/monadic_requests.py b/uploader/monadic_requests.py new file mode 100644 index 0000000..c492df5 --- /dev/null +++ b/uploader/monadic_requests.py @@ -0,0 +1,104 @@ +"""Wrap requests functions with monads.""" +import traceback +from typing import Union, Optional, Callable + +import requests +from requests.models import Response +from pymonad.either import Left, Right, Either +from flask import (flash, + request, + redirect, + render_template, + current_app as app, + escape as flask_escape) + +# HTTP status codes indicating a successful request. +SUCCESS_CODES = (200, 201, 202, 203, 204, 205, 206, 207, 208, 226) + +# Possible error(s) that can be encountered while attempting to do a request. +PossibleError = Union[Response, Exception] + + +def make_error_handler( + redirect_to: Optional[Response] = None, + cleanup_thunk: Callable = lambda *args: None +) -> Callable[[PossibleError], Response]: + """ + Build a function to gracefully handle errors encountered while doing + requests. 
+ + :rtype: Callable + """ + redirect_to = redirect_to or redirect(request.url) + def __handler__(resp_or_exc: PossibleError) -> Response: + cleanup_thunk() + if issubclass(type(resp_or_exc), Exception): + # Is an exception! + return render_template( + "unhandled_exception.html", + trace=traceback.format_exception(resp_or_exc)) + if isinstance(resp_or_exc, Response): + flash("The authorisation server responded with " + f"({flask_escape(resp_or_exc.status_code)}, " + f"{flask_escape(resp_or_exc.reason)}) for the request to " + f"'{flask_escape(resp_or_exc.request.url)}'", + "alert-danger") + return redirect_to + + flash("Unspecified error!", "alert-danger") + app.logger.debug("Error (%s): %s", type(resp_or_exc), resp_or_exc) + return redirect_to + return __handler__ + + +def get(url, params=None, **kwargs) -> Either: + """ + A wrapper around `requests.get` function. + + Takes the same arguments as `requests.get`. + + :rtype: pymonad.either.Either + """ + try: + resp = requests.get(url, params=params, **kwargs) + if resp.status_code in SUCCESS_CODES: + return Right(resp.json()) + return Left(resp) + except requests.exceptions.RequestException as exc: + return Left(exc) + + +def post(url, data=None, json=None, **kwargs) -> Either: + """ + A wrapper around `requests.post` function. + + Takes the same arguments as `requests.post`. 
+ + :rtype: pymonad.either.Either + """ + try: + resp = requests.post(url, data=data, json=json, **kwargs) + if resp.status_code in SUCCESS_CODES: + return Right(resp.json()) + return Left(resp) + except requests.exceptions.RequestException as exc: + return Left(exc) + + +def make_either_error_handler(msg): + """Make generic error handler for pymonads Either objects.""" + def __fail__(error): + if issubclass(type(error), Exception): + app.logger.debug("\n\n%s (Exception)\n\n", msg, exc_info=True) + raise error + if issubclass(type(error), Response): + try: + _data = error.json() + except Exception as _exc: + raise Exception(error.content) from _exc + raise Exception(_data) + + app.logger.debug("\n\n%s\n\n", msg) + raise Exception(error) + + return __fail__ diff --git a/uploader/oauth2/__init__.py b/uploader/oauth2/__init__.py new file mode 100644 index 0000000..aaea638 --- /dev/null +++ b/uploader/oauth2/__init__.py @@ -0,0 +1 @@ +"""Package to handle OAuth2 authentication/authorisation issues.""" diff --git a/uploader/oauth2/client.py b/uploader/oauth2/client.py new file mode 100644 index 0000000..e7128de --- /dev/null +++ b/uploader/oauth2/client.py @@ -0,0 +1,230 @@ +"""OAuth2 client utilities.""" +import json +import time +import random +from datetime import datetime, timedelta +from urllib.parse import urljoin, urlparse + +import requests +from flask import request, current_app as app + +from pymonad.either import Left, Right, Either + +from authlib.common.urls import url_decode +from authlib.jose.errors import BadSignatureError +from authlib.jose import KeySet, JsonWebKey, JsonWebToken +from authlib.integrations.requests_client import OAuth2Session + +from uploader import session +import uploader.monadic_requests as mrequests + +SCOPE = ("profile group role resource register-client user masquerade " + "introspect migrate-data") + + +def authserver_uri(): + """Return URI to authorisation server.""" + return app.config["AUTH_SERVER_URL"] + + +def 
oauth2_clientid(): + """Return the client id.""" + return app.config["OAUTH2_CLIENT_ID"] + + +def oauth2_clientsecret(): + """Return the client secret.""" + return app.config["OAUTH2_CLIENT_SECRET"] + + +def __fetch_auth_server_jwks__() -> KeySet: + """Fetch the JWKs from the auth server.""" + return KeySet([ + JsonWebKey.import_key(key) + for key in requests.get( + urljoin(authserver_uri(), "auth/public-jwks") + ).json()["jwks"]]) + + +def __update_auth_server_jwks__(jwks) -> KeySet: + """Update the JWKs from the servers if necessary.""" + last_updated = jwks["last-updated"] + now = datetime.now().timestamp() + # Maybe the `two_hours` variable below can be made into a configuration + # variable and passed in to this function + two_hours = timedelta(hours=2).seconds + if bool(last_updated) and (now - last_updated) < two_hours: + return jwks["jwks"] + + return session.set_auth_server_jwks(__fetch_auth_server_jwks__()) + + +def auth_server_jwks() -> KeySet: + """Fetch the auth-server JSON Web Keys information.""" + _jwks = session.session_info().get("auth_server_jwks") or {} + if bool(_jwks): + return __update_auth_server_jwks__({ + "last-updated": _jwks["last-updated"], + "jwks": KeySet([ + JsonWebKey.import_key(key) for key in _jwks.get( + "jwks", {"keys": []})["keys"]]) + }) + + return __update_auth_server_jwks__({ + "last-updated": (datetime.now() - timedelta(hours=3)).timestamp() + }) + + +def oauth2_client(): + """Build the OAuth2 client for use fetching data.""" + def __update_token__(token, refresh_token=None, access_token=None):# pylint: disable=[unused-argument] + """Update the token when refreshed.""" + session.set_user_token(token) + + def __json_auth__(client, _method, uri, headers, body): + return ( + uri, + {**headers, "Content-Type": "application/json"}, + json.dumps({ + **dict(url_decode(body)), + "client_id": client.client_id, + "client_secret": client.client_secret + })) + + def __client__(token) -> OAuth2Session: + client = OAuth2Session( + 
oauth2_clientid(), + oauth2_clientsecret(), + scope=SCOPE, + token_endpoint=urljoin(authserver_uri(), "/auth/token"), + token_endpoint_auth_method="client_secret_post", + token=token, + update_token=__update_token__) + client.register_client_auth_method( + ("client_secret_post", __json_auth__)) + return client + + def __token_expired__(token): + """Check whether the token has expired.""" + jwks = auth_server_jwks() + if bool(jwks): + for jwk in jwks.keys: + try: + jwt = JsonWebToken(["RS256"]).decode( + token["access_token"], key=jwk) + return datetime.now().timestamp() > jwt["exp"] + except BadSignatureError as _bse: + pass + + return False + + def __delay__(): + """Do a tiny delay.""" + time.sleep(random.choice(tuple(i/1000.0 for i in range(0,100)))) + + def __refresh_token__(token): + """Refresh the token if necessary — synchronise amongst threads.""" + if __token_expired__(token): + __delay__() + if session.is_token_refreshing(): + while session.is_token_refreshing(): + __delay__() + + return session.user_token().either(None, lambda _tok: _tok) + + session.toggle_token_refreshing() + _client = __client__(token) + _client.get(urljoin(authserver_uri(), "auth/user/")) + session.toggle_token_refreshing() + return _client.token + + return token + + return session.user_token().then(__refresh_token__).either( + lambda _notok: __client__(None), + __client__) + + +def user_logged_in(): + """Check whether the user has logged in.""" + suser = session.session_info()["user"] + return suser["logged_in"] and suser["token"].is_right() + + +def authserver_authorise_uri(): + """Build up the authorisation URI.""" + req_baseurl = urlparse(request.base_url, scheme=request.scheme) + host_uri = f"{req_baseurl.scheme}://{req_baseurl.netloc}/" + return urljoin( + authserver_uri(), + "auth/authorise?response_type=code" + f"&client_id={oauth2_clientid()}" + f"&redirect_uri={urljoin(host_uri, 'oauth2/code')}") + + +def __no_token__(_err) -> Left: + """Handle situation where request is 
attempted with no token.""" + resp = requests.models.Response() + resp._content = json.dumps({#pylint: disable=[protected-access] + "error": "AuthenticationError", + "error-description": ("You need to authenticate to access requested " + "information.")}).encode("utf-8") + resp.status_code = 400 + return Left(resp) + + +def oauth2_get(url, **kwargs) -> Either: + """Do a get request to the authentication/authorisation server.""" + def __get__(_token) -> Either: + _uri = urljoin(authserver_uri(), url) + try: + resp = oauth2_client().get( + _uri, + **{ + **kwargs, + "headers": { + **kwargs.get("headers", {}), + "Content-Type": "application/json" + } + }) + if resp.status_code in mrequests.SUCCESS_CODES: + return Right(resp.json()) + return Left(resp) + except Exception as exc:#pylint: disable=[broad-except] + app.logger.error("Error retrieving data from auth server: (GET %s)", + _uri, + exc_info=True) + return Left(exc) + return session.user_token().either(__no_token__, __get__) + + +def oauth2_post(url, data=None, json=None, **kwargs):#pylint: disable=[redefined-outer-name] + """Do a POST request to the authentication/authorisation server.""" + def __post__(_token) -> Either: + _uri = urljoin(authserver_uri(), url) + _headers = ({ + **kwargs.get("headers", {}), + "Content-Type": "application/json" + } + if bool(json) else kwargs.get("headers", {})) + try: + request_data = { + **(data or {}), + **(json or {}), + "client_id": oauth2_clientid(), + "client_secret": oauth2_clientsecret() + } + resp = oauth2_client().post( + _uri, + data=(request_data if bool(data) else None), + json=(request_data if bool(json) else None), + **{**kwargs, "headers": _headers}) + if resp.status_code in mrequests.SUCCESS_CODES: + return Right(resp.json()) + return Left(resp) + except Exception as exc:#pylint: disable=[broad-except] + app.logger.error("Error retrieving data from auth server: (POST %s)", + _uri, + exc_info=True) + return Left(exc) + return 
session.user_token().either(__no_token__, __post__) diff --git a/uploader/oauth2/jwks.py b/uploader/oauth2/jwks.py new file mode 100644 index 0000000..efd0499 --- /dev/null +++ b/uploader/oauth2/jwks.py @@ -0,0 +1,86 @@ +"""Utilities dealing with JSON Web Keys (JWK)""" +import os +from pathlib import Path +from typing import Any, Union +from datetime import datetime, timedelta + +from flask import Flask +from authlib.jose import JsonWebKey +from pymonad.either import Left, Right, Either + +def jwks_directory(app: Flask, configname: str) -> Path: + """Compute the directory where the JWKs are stored.""" + appsecretsdir = Path(app.config[configname]).parent + if appsecretsdir.exists() and appsecretsdir.is_dir(): + jwksdir = Path(appsecretsdir, "jwks/") + if not jwksdir.exists(): + jwksdir.mkdir() + return jwksdir + raise ValueError( + "The `appsecretsdir` value should be a directory that actually exists.") + + +def generate_and_save_private_key( + storagedir: Path, + kty: str = "RSA", + crv_or_size: Union[str, int] = 2048, + options: tuple[tuple[str, Any]] = (("iat", datetime.now().timestamp()),) +) -> JsonWebKey: + """Generate a private key and save to `storagedir`.""" + privatejwk = JsonWebKey.generate_key( + kty, crv_or_size, dict(options), is_private=True) + keyname = f"{privatejwk.thumbprint()}.private.pem" + with open(Path(storagedir, keyname), "wb") as pemfile: + pemfile.write(privatejwk.as_pem(is_private=True)) + + return privatejwk + + +def pem_to_jwk(filepath: Path) -> JsonWebKey: + """Parse a PEM file into a JWK object.""" + with open(filepath, "rb") as pemfile: + return JsonWebKey.import_key(pemfile.read()) + + +def __sorted_jwks_paths__(storagedir: Path) -> tuple[tuple[float, Path], ...]: + """A sorted list of the JWK file paths with their creation timestamps.""" + return tuple(sorted(((os.stat(keypath).st_ctime, keypath) + for keypath in (Path(storagedir, keyfile) + for keyfile in os.listdir(storagedir) + if keyfile.endswith(".pem"))), + key=lambda tpl: 
tpl[0])) + + +def list_jwks(storagedir: Path) -> tuple[JsonWebKey, ...]: + """ + List all the JWKs in a particular directory in the order they were created. + """ + return tuple(pem_to_jwk(keypath) for ctime,keypath in + __sorted_jwks_paths__(storagedir)) + + +def newest_jwk(storagedir: Path) -> Either: + """ + Return an Either monad with the newest JWK or a message if none exists. + """ + existingkeys = __sorted_jwks_paths__(storagedir) + if len(existingkeys) > 0: + return Right(pem_to_jwk(existingkeys[-1][1])) + return Left("No JWKs exist") + + +def newest_jwk_with_rotation(jwksdir: Path, keyage: int) -> JsonWebKey: + """ + Retrieve the latest JWK, creating a new one if older than `keyage` days. + """ + def newer_than_days(jwkey): + filestat = os.stat(Path( + jwksdir, f"{jwkey.as_dict()['kid']}.private.pem")) + oldesttimeallowed = (datetime.now() - timedelta(days=keyage)) + if filestat.st_ctime < (oldesttimeallowed.timestamp()): + return Left("JWK is too old!") + return jwkey + + return newest_jwk(jwksdir).then(newer_than_days).either( + lambda _errmsg: generate_and_save_private_key(jwksdir), + lambda key: key) diff --git a/uploader/oauth2/views.py b/uploader/oauth2/views.py new file mode 100644 index 0000000..61037f3 --- /dev/null +++ b/uploader/oauth2/views.py @@ -0,0 +1,138 @@ +"""Views for OAuth2 related functionality.""" +import uuid +from datetime import datetime, timedelta +from urllib.parse import urljoin, urlparse, urlunparse + +from authlib.jose import jwt +from flask import ( + flash, + jsonify, + url_for, + request, + redirect, + Blueprint, + current_app as app) + +from uploader import session +from uploader import monadic_requests as mrequests +from uploader.monadic_requests import make_error_handler + +from . 
import jwks +from .client import ( + SCOPE, + oauth2_get, + user_logged_in, + authserver_uri, + oauth2_clientid, + oauth2_clientsecret) + +oauth2 = Blueprint("oauth2", __name__) + +@oauth2.route("/code") +def authorisation_code(): + """Receive authorisation code from auth server and use it to get token.""" + def __process_error__(resp_or_exception): + app.logger.debug("ERROR: (%s)", resp_or_exception) + flash("There was an error retrieving the authorisation token.", + "alert-danger") + return redirect("/") + + def __fail_set_user_details__(_failure): + app.logger.debug("Fetching user details fails: %s", _failure) + flash("Could not retrieve the user details", "alert-danger") + return redirect("/") + + def __success_set_user_details__(_success): + app.logger.debug("Session info: %s", _success) + return redirect("/") + + def __success__(token): + session.set_user_token(token) + return oauth2_get("auth/user/").then( + lambda usrdets: session.set_user_details({ + "user_id": uuid.UUID(usrdets["user_id"]), + "name": usrdets["name"], + "email": usrdets["email"], + "token": session.user_token(), + "logged_in": True})).either( + __fail_set_user_details__, + __success_set_user_details__) + + code = request.args.get("code", "").strip() + if not bool(code): + flash("AuthorisationError: No code was provided.", "alert-danger") + return redirect("/") + + baseurl = urlparse(request.base_url, scheme=request.scheme) + issued = datetime.now() + jwtkey = jwks.newest_jwk_with_rotation( + jwks.jwks_directory(app, "UPLOADER_SECRETS"), + int(app.config["JWKS_ROTATION_AGE_DAYS"])) + return mrequests.post( + urljoin(authserver_uri(), "auth/token"), + json={ + "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer", + "code": code, + "scope": SCOPE, + "redirect_uri": urljoin( + urlunparse(baseurl), + url_for("oauth2.authorisation_code")), + "assertion": jwt.encode( + header={ + "alg": "RS256", + "typ": "JWT", + "kid": jwtkey.as_dict()["kid"] + }, + payload={ + "iss": 
str(oauth2_clientid()), + "sub": request.args["user_id"], + "aud": urljoin(authserver_uri(),"auth/token"), + "exp": (issued + timedelta(minutes=5)).timestamp(), + "nbf": int(issued.timestamp()), + "iat": int(issued.timestamp()), + "jti": str(uuid.uuid4()) + }, + key=jwtkey).decode("utf8"), + "client_id": oauth2_clientid() + }).either(__process_error__, __success__) + +@oauth2.route("/public-jwks") +def public_jwks(): + """List the available JWKs""" + return jsonify({ + "documentation": ( + "The keys are listed in order of creation, from the oldest (first) " + "to the newest (last)."), + "jwks": tuple(key.as_dict() for key + in jwks.list_jwks(jwks.jwks_directory( + app, "UPLOADER_SECRETS"))) + }) + + +@oauth2.route("/logout", methods=["GET"]) +def logout(): + """Log out of any active sessions.""" + def __unset_session__(session_info): + _user = session_info["user"] + _user_str = f"{_user['name']} ({_user['email']})" + session.clear_session_info() + flash("Successfully logged out.", "alert-success") + return redirect("/") + + if user_logged_in(): + return session.user_token().then( + lambda _tok: mrequests.post( + urljoin(authserver_uri(), "auth/revoke"), + json={ + "token": _tok["refresh_token"], + "token_type_hint": "refresh_token", + "client_id": oauth2_clientid(), + "client_secret": oauth2_clientsecret() + })).either( + make_error_handler( + redirect_to=redirect("/"), + cleanup_thunk=lambda: __unset_session__( + session.session_info())), + lambda res: __unset_session__(session.session_info())) + flash("There is no user that is currently logged in.", "alert-info") + return redirect("/") diff --git a/uploader/phenotypes/__init__.py b/uploader/phenotypes/__init__.py new file mode 100644 index 0000000..c17d32c --- /dev/null +++ b/uploader/phenotypes/__init__.py @@ -0,0 +1,2 @@ +"""Package for handling ('classical') phenotype data""" +from .views import phenotypesbp diff --git a/uploader/phenotypes/models.py b/uploader/phenotypes/models.py new file mode 100644 index 
0000000..9324601 --- /dev/null +++ b/uploader/phenotypes/models.py @@ -0,0 +1,232 @@ +"""Database and utility functions for phenotypes.""" +from typing import Optional +from functools import reduce +from datetime import datetime + +import MySQLdb as mdb +from MySQLdb.cursors import Cursor, DictCursor + +from uploader.db_utils import debug_query + +def datasets_by_population( + conn: mdb.Connection, + species_id: int, + population_id: int +) -> tuple[dict, ...]: + """Retrieve all of a population's phenotype studies.""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute( + "SELECT s.SpeciesId, pf.* FROM Species AS s " + "INNER JOIN InbredSet AS iset ON s.Id=iset.SpeciesId " + "INNER JOIN PublishFreeze AS pf ON iset.Id=pf.InbredSetId " + "WHERE s.Id=%s AND iset.Id=%s;", + (species_id, population_id)) + return tuple(dict(row) for row in cursor.fetchall()) + + +def dataset_by_id(conn: mdb.Connection, + species_id: int, + population_id: int, + dataset_id: int) -> dict: + """Fetch dataset details by identifier""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute( + "SELECT s.SpeciesId, pf.* FROM Species AS s " + "INNER JOIN InbredSet AS iset ON s.Id=iset.SpeciesId " + "INNER JOIN PublishFreeze AS pf ON iset.Id=pf.InbredSetId " + "WHERE s.Id=%s AND iset.Id=%s AND pf.Id=%s", + (species_id, population_id, dataset_id)) + return dict(cursor.fetchone()) + + +def phenotypes_count(conn: mdb.Connection, + population_id: int, + dataset_id: int) -> int: + """Count the number of phenotypes in the dataset.""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute( + "SELECT COUNT(*) AS total_phenos FROM Phenotype AS pheno " + "INNER JOIN PublishXRef AS pxr ON pheno.Id=pxr.PhenotypeId " + "INNER JOIN PublishFreeze AS pf ON pxr.InbredSetId=pf.InbredSetId " + "WHERE pxr.InbredSetId=%s AND pf.Id=%s", + (population_id, dataset_id)) + return int(cursor.fetchone()["total_phenos"]) + + +def dataset_phenotypes(conn: mdb.Connection, + 
population_id: int, + dataset_id: int, + offset: int = 0, + limit: Optional[int] = None) -> tuple[dict, ...]: + """Fetch the actual phenotypes.""" + _query = ( + "SELECT pheno.*, pxr.Id, ist.InbredSetCode FROM Phenotype AS pheno " + "INNER JOIN PublishXRef AS pxr ON pheno.Id=pxr.PhenotypeId " + "INNER JOIN PublishFreeze AS pf ON pxr.InbredSetId=pf.InbredSetId " + "INNER JOIN InbredSet AS ist ON pf.InbredSetId=ist.Id " + "WHERE pxr.InbredSetId=%s AND pf.Id=%s") + ( + f" LIMIT {limit} OFFSET {offset}" if bool(limit) else "") + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute(_query, (population_id, dataset_id)) + debug_query(cursor) + return tuple(dict(row) for row in cursor.fetchall()) + + +def __phenotype_se__(cursor: Cursor, + species_id: int, + population_id: int, + dataset_id: int, + xref_id: str) -> dict: + """Fetch standard-error values (if they exist) for a phenotype.""" + _sequery = ( + "SELECT pxr.Id AS xref_id, pxr.DataId, str.Id AS StrainId, pse.error, nst.count " + "FROM Phenotype AS pheno " + "INNER JOIN PublishXRef AS pxr ON pheno.Id=pxr.PhenotypeId " + "INNER JOIN PublishSE AS pse ON pxr.DataId=pse.DataId " + "INNER JOIN NStrain AS nst ON pse.DataId=nst.DataId " + "INNER JOIN Strain AS str ON nst.StrainId=str.Id " + "INNER JOIN StrainXRef AS sxr ON str.Id=sxr.StrainId " + "INNER JOIN PublishFreeze AS pf ON sxr.InbredSetId=pf.InbredSetId " + "INNER JOIN InbredSet AS iset ON pf.InbredSetId=iset.InbredSetId " + "WHERE (str.SpeciesId, pxr.InbredSetId, pf.Id, pxr.Id)=(%s, %s, %s, %s)") + cursor.execute(_sequery, + (species_id, population_id, dataset_id, xref_id)) + return {(row["DataId"], row["StrainId"]): { + "xref_id": row["xref_id"], + "DataId": row["DataId"], + "error": row["error"], + "count": row["count"] + } for row in cursor.fetchall()} + +def __organise_by_phenotype__(pheno, row): + """Organise disparate data rows into phenotype 'objects'.""" + _pheno = pheno.get(row["Id"]) + return { + **pheno, + row["Id"]: { + "Id": 
row["Id"], + "Pre_publication_description": row["Pre_publication_description"], + "Post_publication_description": row["Post_publication_description"], + "Original_description": row["Original_description"], + "Units": row["Units"], + "Pre_publication_abbreviation": row["Pre_publication_abbreviation"], + "Post_publication_abbreviation": row["Post_publication_abbreviation"], + "xref_id": row["pxr.Id"], + "data": { + **(_pheno["data"] if bool(_pheno) else {}), + (row["DataId"], row["StrainId"]): { + "DataId": row["DataId"], + "mean": row["mean"], + "Locus": row["Locus"], + "LRS": row["LRS"], + "additive": row["additive"], + "Sequence": row["Sequence"], + "comments": row["comments"], + "value": row["value"], + "StrainName": row["Name"], + "StrainName2": row["Name2"], + "StrainSymbol": row["Symbol"], + "StrainAlias": row["Alias"] + } + } + } + } + + +def __merge_pheno_data_and_se__(data, sedata) -> dict: + """Merge phenotype data with the standard errors.""" + return { + key: {**value, **sedata.get(key, {})} + for key, value in data.items() + } + + +def phenotype_by_id( + conn: mdb.Connection, + species_id: int, + population_id: int, + dataset_id: int, + xref_id +) -> Optional[dict]: + """Fetch a specific phenotype.""" + _dataquery = ("SELECT pheno.*, pxr.*, pd.*, str.*, iset.InbredSetCode " + "FROM Phenotype AS pheno " + "INNER JOIN PublishXRef AS pxr ON pheno.Id=pxr.PhenotypeId " + "INNER JOIN PublishData AS pd ON pxr.DataId=pd.Id " + "INNER JOIN Strain AS str ON pd.StrainId=str.Id " + "INNER JOIN StrainXRef AS sxr ON str.Id=sxr.StrainId " + "INNER JOIN PublishFreeze AS pf ON sxr.InbredSetId=pf.InbredSetId " + "INNER JOIN InbredSet AS iset ON pf.InbredSetId=iset.InbredSetId " + "WHERE " + "(str.SpeciesId, pxr.InbredSetId, pf.Id, pxr.Id)=(%s, %s, %s, %s)") + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute(_dataquery, + (species_id, population_id, dataset_id, xref_id)) + _pheno: dict = reduce(__organise_by_phenotype__, cursor.fetchall(), {}) + if 
bool(_pheno) and len(_pheno.keys()) == 1: + _pheno = tuple(_pheno.values())[0] + return { + **_pheno, + "data": tuple(__merge_pheno_data_and_se__( + _pheno["data"], + __phenotype_se__(cursor, + species_id, + population_id, + dataset_id, + xref_id)).values()) + } + if bool(_pheno) and len(_pheno.keys()) > 1: + raise Exception( + "We found more than one phenotype with the same identifier!") + + return None + + +def phenotypes_data(conn: mdb.Connection, + population_id: int, + dataset_id: int, + offset: int = 0, + limit: Optional[int] = None) -> tuple[dict, ...]: + """Fetch the data for the phenotypes.""" + # — Phenotype -> PublishXRef -> PublishData -> Strain -> StrainXRef -> PublishFreeze + _query = ("SELECT pheno.*, pxr.*, pd.*, str.*, iset.InbredSetCode " + "FROM Phenotype AS pheno " + "INNER JOIN PublishXRef AS pxr ON pheno.Id=pxr.PhenotypeId " + "INNER JOIN PublishData AS pd ON pxr.DataId=pd.Id " + "INNER JOIN Strain AS str ON pd.StrainId=str.Id " + "INNER JOIN StrainXRef AS sxr ON str.Id=sxr.StrainId " + "INNER JOIN PublishFreeze AS pf ON sxr.InbredSetId=pf.InbredSetId " + "INNER JOIN InbredSet AS iset ON pf.InbredSetId=iset.InbredSetId " + "WHERE pxr.InbredSetId=%s AND pf.Id=%s") + ( + f" LIMIT {limit} OFFSET {offset}" if bool(limit) else "") + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute(_query, (population_id, dataset_id)) + debug_query(cursor) + return tuple(dict(row) for row in cursor.fetchall()) + + +def save_new_dataset(cursor: Cursor, + population_id: int, + dataset_name: str, + dataset_fullname: str, + dataset_shortname: str) -> dict: + """Create a new phenotype dataset.""" + params = { + "population_id": population_id, + "dataset_name": dataset_name, + "dataset_fullname": dataset_fullname, + "dataset_shortname": dataset_shortname, + "created": datetime.now().date().isoformat(), + "public": 2, + "confidentiality": 0, + "users": None + } + cursor.execute( + "INSERT INTO PublishFreeze(Name, FullName, ShortName, CreateTime, " + 
"public, InbredSetId, confidentiality, AuthorisedUsers) " + "VALUES(%(dataset_name)s, %(dataset_fullname)s, %(dataset_shortname)s, " + "%(created)s, %(public)s, %(population_id)s, %(confidentiality)s, " + "%(users)s)", + params) + debug_query(cursor) + return {**params, "Id": cursor.lastrowid} diff --git a/uploader/phenotypes/views.py b/uploader/phenotypes/views.py new file mode 100644 index 0000000..02e8078 --- /dev/null +++ b/uploader/phenotypes/views.py @@ -0,0 +1,368 @@ +"""Views handling ('classical') phenotypes.""" +import sys +import uuid +import json +from pathlib import Path +from functools import wraps + +from redis import Redis +from requests.models import Response +from MySQLdb.cursors import DictCursor +from flask import (flash, + request, + url_for, + redirect, + Blueprint, + render_template, + current_app as app) + +# from r_qtl import r_qtl2 as rqtl2 +from r_qtl import r_qtl2_qc as rqc +from r_qtl import exceptions as rqe + +from uploader import jobs +from uploader.files import save_file#, fullpath +from uploader.oauth2.client import oauth2_post +from uploader.authorisation import require_login +from uploader.db_utils import database_connection +from uploader.species.models import all_species, species_by_id +from uploader.monadic_requests import make_either_error_handler +from uploader.request_checks import with_species, with_population +from uploader.datautils import safe_int, order_by_family, enumerate_sequence +from uploader.population.models import (populations_by_species, + population_by_species_and_id) +from uploader.input_validation import (encode_errors, + decode_errors, + is_valid_representative_name) + +from .models import (dataset_by_id, + phenotype_by_id, + phenotypes_count, + save_new_dataset, + dataset_phenotypes, + datasets_by_population) + +phenotypesbp = Blueprint("phenotypes", __name__) + +@phenotypesbp.route("/phenotypes", methods=["GET"]) +@require_login +def index(): + """Direct entry-point for phenotypes data handling.""" + 
with database_connection(app.config["SQL_URI"]) as conn: + if not bool(request.args.get("species_id")): + return render_template("phenotypes/index.html", + species=order_by_family(all_species(conn)), + activelink="phenotypes") + + species = species_by_id(conn, request.args.get("species_id")) + if not bool(species): + flash("No such species!", "alert-danger") + return redirect(url_for("species.populations.phenotypes.index")) + return redirect(url_for("species.populations.phenotypes.select_population", + species_id=species["SpeciesId"])) + + +@phenotypesbp.route("<int:species_id>/phenotypes/select-population", + methods=["GET"]) +@require_login +@with_species(redirect_uri="species.populations.phenotypes.index") +def select_population(species: dict, **kwargs):# pylint: disable=[unused-argument] + """Select the population for your phenotypes.""" + with database_connection(app.config["SQL_URI"]) as conn: + if not bool(request.args.get("population_id")): + return render_template("phenotypes/select-population.html", + species=species, + populations=order_by_family( + populations_by_species( + conn, species["SpeciesId"]), + order_key="FamilyOrder"), + activelink="phenotypes") + + population = population_by_species_and_id( + conn, species["SpeciesId"], int(request.args["population_id"])) + if not bool(population): + flash("No such population found!", "alert-danger") + return redirect(url_for( + "species.populations.phenotypes.select_population", + species_id=species["SpeciesId"])) + + return redirect(url_for("species.populations.phenotypes.list_datasets", + species_id=species["SpeciesId"], + population_id=population["Id"])) + + + +@phenotypesbp.route( + "<int:species_id>/populations/<int:population_id>/phenotypes/datasets", + methods=["GET"]) +@require_login +@with_population(species_redirect_uri="species.populations.phenotypes.index", + redirect_uri="species.populations.phenotypes.select_population") +def list_datasets(species: dict, population: dict, **kwargs):# pylint: 
disable=[unused-argument] + """List available phenotype datasets.""" + with database_connection(app.config["SQL_URI"]) as conn: + return render_template("phenotypes/list-datasets.html", + species=species, + population=population, + datasets=datasets_by_population( + conn, + species["SpeciesId"], + population["Id"]), + activelink="list-datasets") + + +def with_dataset( + species_redirect_uri: str, + population_redirect_uri: str, + redirect_uri: str +): + """Ensure the dataset actually exists.""" + def __decorator__(func): + @wraps(func) + @with_population(species_redirect_uri, population_redirect_uri) + def __with_dataset__(**kwargs): + try: + _spcid = int(kwargs["species_id"]) + _popid = int(kwargs["population_id"]) + _dsetid = int(kwargs.get("dataset_id")) + select_dataset_uri = redirect(url_for( + redirect_uri, species_id=_spcid, population_id=_popid)) + if not bool(_dsetid): + flash("You need to select a valid 'dataset_id' value.", + "alert-danger") + return select_dataset_uri + with database_connection(app.config["SQL_URI"]) as conn: + dataset = dataset_by_id(conn, _spcid, _popid, _dsetid) + if not bool(dataset): + flash("You must select a valid dataset.", + "alert-danger") + return select_dataset_uri + except ValueError as _verr: + app.logger.debug( + "Exception converting 'dataset_id' to integer: %s", + kwargs.get("dataset_id"), + exc_info=True) + flash("Expected 'dataset_id' value to be an integer.", 
+ "alert-danger") + return select_dataset_uri + return func(dataset=dataset, **kwargs) + return __with_dataset__ + return __decorator__ + + +@phenotypesbp.route( + "<int:species_id>/populations/<int:population_id>/phenotypes/datasets" + "/<int:dataset_id>/view", + methods=["GET"]) +@require_login +@with_dataset( + species_redirect_uri="species.populations.phenotypes.index", + population_redirect_uri="species.populations.phenotypes.select_population", + redirect_uri="species.populations.phenotypes.list_datasets") +def view_dataset(# pylint: disable=[unused-argument] + species: dict, population: dict, dataset: dict, **kwargs): + """View a specific dataset""" + with database_connection(app.config["SQL_URI"]) as conn: + dataset = dataset_by_id( + conn, species["SpeciesId"], population["Id"], dataset["Id"]) + if not bool(dataset): + flash("Could not find such a phenotype dataset!", "alert-danger") + return redirect(url_for( + "species.populations.phenotypes.list_datasets", + species_id=species["SpeciesId"], + population_id=population["Id"])) + + start_at = max(safe_int(request.args.get("start_at") or 0), 0) + count = int(request.args.get("count") or 20) + return render_template("phenotypes/view-dataset.html", + species=species, + population=population, + dataset=dataset, + phenotype_count=phenotypes_count( + conn, population["Id"], dataset["Id"]), + phenotypes=enumerate_sequence( + dataset_phenotypes(conn, + population["Id"], + dataset["Id"], + offset=start_at, + limit=count), + start=start_at+1), + start_from=start_at, + count=count, + activelink="view-dataset") + + +@phenotypesbp.route( + "<int:species_id>/populations/<int:population_id>/phenotypes/datasets" + "/<int:dataset_id>/phenotype/<xref_id>", + methods=["GET"]) +@require_login +@with_dataset( + species_redirect_uri="species.populations.phenotypes.index", + population_redirect_uri="species.populations.phenotypes.select_population", + redirect_uri="species.populations.phenotypes.list_datasets") +def 
view_phenotype(# pylint: disable=[unused-argument] + species: dict, + population: dict, + dataset: dict, + xref_id: int, + **kwargs +): + """View an individual phenotype from the dataset.""" + def __render__(privileges): + return render_template( + "phenotypes/view-phenotype.html", + species=species, + population=population, + dataset=dataset, + phenotype=phenotype_by_id(conn, + species["SpeciesId"], + population["Id"], + dataset["Id"], + xref_id), + privileges=(privileges + ### For demo! Do not commit this part + + ("group:resource:edit-resource", + "group:resource:delete-resource",) + ### END: For demo! Do not commit this part + ), + activelink="view-phenotype") + + def __fail__(error): + if isinstance(error, Response) and error.json() == "No linked resource!": + return __render__(tuple()) + return make_either_error_handler( + "There was an error fetching the roles and privileges.")(error) + + with database_connection(app.config["SQL_URI"]) as conn: + return oauth2_post( + "/auth/resource/phenotypes/individual/linked-resource", + json={ + "species_id": species["SpeciesId"], + "population_id": population["Id"], + "dataset_id": dataset["Id"], + "xref_id": xref_id + } + ).then( + lambda resource: tuple( + privilege["privilege_id"] for role in resource["roles"] + for privilege in role["privileges"]) + ).then(__render__).either(__fail__, lambda resp: resp) + + +@phenotypesbp.route( + "<int:species_id>/populations/<int:population_id>/phenotypes/datasets/create", + methods=["GET", "POST"]) +@require_login +@with_population( + species_redirect_uri="species.populations.phenotypes.index", + redirect_uri="species.populations.phenotypes.select_population") +def create_dataset(species: dict, population: dict, **kwargs):# pylint: disable=[unused-argument] + """Create a new phenotype dataset.""" + with (database_connection(app.config["SQL_URI"]) as conn, + conn.cursor(cursorclass=DictCursor) as cursor): + if request.method == "GET": + return 
render_template("phenotypes/create-dataset.html", + activelink="create-dataset", + species=species, + population=population, + **decode_errors( + request.args.get("error_values", ""))) + + form = request.form + _errors: tuple[tuple[str, str], ...] = tuple() + if not is_valid_representative_name( + (form.get("dataset-name") or "").strip()): + _errors = _errors + (("dataset-name", "Invalid dataset name."),) + + if not bool((form.get("dataset-fullname") or "").strip()): + _errors = _errors + (("dataset-fullname", + "You must provide a value for 'Full Name'."),) + + if bool(_errors): + return redirect(url_for( + "species.populations.phenotypes.create_dataset", + species_id=species["SpeciesId"], + population_id=population["Id"], + error_values=encode_errors(_errors, form))) + + dataset_shortname = ( + form["dataset-shortname"] or form["dataset-name"]).strip() + _pheno_dataset = save_new_dataset( + cursor, + population["Id"], + form["dataset-name"].strip(), + form["dataset-fullname"].strip(), + dataset_shortname) + return redirect(url_for("species.populations.phenotypes.list_datasets", + species_id=species["SpeciesId"], + population_id=population["Id"])) + + +@phenotypesbp.route( + "<int:species_id>/populations/<int:population_id>/phenotypes/datasets" + "/<int:dataset_id>/add-phenotypes", + methods=["GET", "POST"]) +@require_login +@with_dataset( + species_redirect_uri="species.populations.phenotypes.index", + population_redirect_uri="species.populations.phenotypes.select_population", + redirect_uri="species.populations.phenotypes.list_datasets") +def add_phenotypes(species: dict, population: dict, dataset: dict, **kwargs):# pylint: disable=[unused-argument, too-many-locals] + """Add one or more phenotypes to the dataset.""" + add_phenos_uri = redirect(url_for( + "species.populations.phenotypes.add_phenotypes", + species_id=species["SpeciesId"], + population_id=population["Id"], + dataset_id=dataset["Id"])) + _redisuri = app.config["REDIS_URL"] + _sqluri = 
app.config["SQL_URI"] + with (Redis.from_url(_redisuri, decode_responses=True) as rconn, + # database_connection(_sqluri) as conn, + # conn.cursor(cursorclass=DictCursor) as cursor + ): + if request.method == "GET": + return render_template("phenotypes/add-phenotypes.html", + species=species, + population=population, + dataset=dataset, + activelink="add-phenotypes") + + try: + ## Handle huge files here... + phenobundle = save_file(request.files["phenotypes-bundle"], + Path(app.config["UPLOAD_FOLDER"])) + rqc.validate_bundle(phenobundle) + except AssertionError as _aerr: + app.logger.debug("File upload error!", exc_info=True) + flash("Expected a zipped bundle of files with phenotypes' " + "information.", + "alert-danger") + return add_phenos_uri + except rqe.RQTLError as rqtlerr: + app.logger.debug("Bundle validation error!", exc_info=True) + flash("R/qtl2 Error: " + " ".join(rqtlerr.args), "alert-danger") + return add_phenos_uri + + _jobid = uuid.uuid4() + _namespace = jobs.jobsnamespace() + _ttl_seconds = app.config["JOBS_TTL_SECONDS"] + _job = jobs.initialise_job( + rconn, + _namespace, + str(_jobid), + [sys.executable, "-m", "scripts.rqtl2.phenotypes_qc", _sqluri, + _redisuri, _namespace, str(_jobid), str(species["SpeciesId"]), + str(population["Id"]), str(dataset["Id"]), "--redisexpiry", + str(_ttl_seconds)], "phenotype_qc", _ttl_seconds, + {"job-metadata": json.dumps({ + "speciesid": species["SpeciesId"], + "populationid": population["Id"], + "datasetid": dataset["Id"], + "bundle": str(phenobundle.absolute())})}) + # jobs.launch_job( + # _job, + # redisuri, + # f"{app.config['UPLOAD_FOLDER']}/job_errors") + + raise NotImplementedError("Please implement this...") diff --git a/uploader/platforms/__init__.py b/uploader/platforms/__init__.py new file mode 100644 index 0000000..8cb89c9 --- /dev/null +++ b/uploader/platforms/__init__.py @@ -0,0 +1,2 @@ +"""Module to handle management of genetic platforms.""" +from .views import platformsbp diff --git 
a/uploader/platforms/models.py b/uploader/platforms/models.py new file mode 100644 index 0000000..a859371 --- /dev/null +++ b/uploader/platforms/models.py @@ -0,0 +1,95 @@ +"""Handle db interactions for platforms.""" +from typing import Optional + +import MySQLdb as mdb +from MySQLdb.cursors import Cursor, DictCursor + +def platforms_by_species( + conn: mdb.Connection, + speciesid: int, + offset: int = 0, + limit: Optional[int] = None +) -> tuple[dict, ...]: + """Retrieve platforms by the species""" + _query = ("SELECT * FROM GeneChip WHERE SpeciesId=%s " + "ORDER BY GeneChipName ASC") + if bool(limit) and limit > 0:# type: ignore[operator] + _query = f"{_query} LIMIT {limit} OFFSET {offset}" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute(_query, (speciesid,)) + return tuple(dict(row) for row in cursor.fetchall()) + + +def species_platforms_count(conn: mdb.Connection, species_id: int) -> int: + """Get the number of platforms in the database for a particular species.""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute( + "SELECT COUNT(GeneChipName) AS count FROM GeneChip " + "WHERE SpeciesId=%s", + (species_id,)) + return int(cursor.fetchone()["count"]) + + +def platform_by_id(conn: mdb.Connection, platformid: int) -> Optional[dict]: + """Retrieve a platform by its ID""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute("SELECT * FROM GeneChip WHERE Id=%s", + (platformid,)) + result = cursor.fetchone() + if bool(result): + return dict(result) + + return None + + +def platform_by_species_and_id( + conn: mdb.Connection, species_id: int, platformid: int +) -> Optional[dict]: + """Retrieve a platform by its species and ID""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute("SELECT * FROM GeneChip WHERE SpeciesId=%s AND Id=%s", + (species_id, platformid)) + result = cursor.fetchone()#pylint: disable=[duplicate-code] + if bool(result): + return dict(result) + + return None + + +def 
save_new_platform(# pylint: disable=[too-many-arguments] + cursor: Cursor, + species_id: int, + geo_platform: str, + platform_name: str, + platform_shortname: str, + platform_title: str, + go_tree_value: Optional[str] +) -> dict: + """Save a new platform to the database.""" + params = { + "species_id": species_id, + "GeoPlatform": geo_platform, + "GeneChipName": platform_name, + "Name": platform_shortname, + "Title": platform_title, + "GO_tree_value": go_tree_value + } + cursor.execute("SELECT SpeciesId, GeoPlatform FROM GeneChip") + assert (species_id, geo_platform) not in ( + (row["SpeciesId"], row["GeoPlatform"]) for row in cursor.fetchall()) + cursor.execute( + "INSERT INTO " + "GeneChip(SpeciesId, GeneChipName, Name, GeoPlatform, Title, GO_tree_value) " + "VALUES(" + "%(species_id)s, %(GeneChipName)s, %(Name)s, %(GeoPlatform)s, " + "%(Title)s, %(GO_tree_value)s" + ")", + params) + new_id = cursor.lastrowid + cursor.execute("UPDATE GeneChip SET GeneChipId=%s WHERE Id=%s", + (new_id, new_id)) + return { + **params, + "Id": new_id, + "GeneChipId": new_id + } diff --git a/uploader/platforms/views.py b/uploader/platforms/views.py new file mode 100644 index 0000000..2d61b6a --- /dev/null +++ b/uploader/platforms/views.py @@ -0,0 +1,112 @@ +"""The endpoints for the platforms""" +from MySQLdb.cursors import DictCursor +from flask import ( + flash, + request, + url_for, + redirect, + Blueprint, + current_app as app) + +from uploader.ui import make_template_renderer +from uploader.authorisation import require_login +from uploader.db_utils import database_connection +from uploader.species.models import all_species, species_by_id +from uploader.datautils import safe_int, order_by_family, enumerate_sequence + +from .models import (save_new_platform, + platforms_by_species, + species_platforms_count) + +platformsbp = Blueprint("platforms", __name__) +render_template = make_template_renderer("platforms") + +@platformsbp.route("platforms", methods=["GET"]) +@require_login 
+def index(): + """Entry-point to the platforms feature.""" + with database_connection(app.config["SQL_URI"]) as conn: + if not bool(request.args.get("species_id")): + return render_template( + "platforms/index.html", + species=order_by_family(all_species(conn)), + activelink="platforms") + + species = species_by_id(conn, request.args["species_id"]) + if not bool(species): + flash("No species selected.", "alert-danger") + return redirect(url_for("species.platforms.index")) + + return redirect(url_for("species.platforms.list_platforms", + species_id=species["SpeciesId"])) + + +@platformsbp.route("<int:species_id>/platforms", methods=["GET"]) +@require_login +def list_platforms(species_id: int): + """List all the available genetic sequencing platforms.""" + with database_connection(app.config["SQL_URI"]) as conn: + species = species_by_id(conn, species_id) + if not bool(species): + flash("No species provided.", "alert-danger") + return redirect(url_for("species.platforms.index")) + + start_from = max(safe_int(request.args.get("start_from") or 0), 0) + count = safe_int(request.args.get("count") or 20) + return render_template( + "platforms/list-platforms.html", + species=species, + platforms=enumerate_sequence( + platforms_by_species(conn, + species_id, + offset=start_from, + limit=count), + start=start_from+1), + start_from=start_from, + count=count, + total_platforms=species_platforms_count(conn, species_id), + activelink="list-platforms") + + +@platformsbp.route("<int:species_id>/platforms/create", methods=["GET", "POST"]) +@require_login +def create_platform(species_id: int): + """Create a new genetic sequencing platform.""" + with (database_connection(app.config["SQL_URI"]) as conn, + conn.cursor(cursorclass=DictCursor) as cursor): + species = species_by_id(conn, species_id) + if not bool(species): + flash("No species provided.", "alert-danger") + return redirect(url_for("species.platforms.index")) + + if request.method == "GET": + return render_template( + 
"platforms/create-platform.html", + species=species, + activelink="create-platform") + + try: + form = request.form + _new_platform = save_new_platform( + cursor, + species_id, + form["geo-platform"], + form["platform-name"], + form["platform-shortname"], + form["platform-title"], + form.get("go-tree-value") or None) + except KeyError as _kerr: + flash(f"Required value for field {_kerr.args[0]} was not provided.", + "alert-danger") + return redirect(url_for("species.platforms.create_platform", + species_id=species_id)) + except AssertionError as _aerr: + flash(f"Platform with GeoPlatform value of '{form['geo-platform']}'" + f" already exists for species '{species['FullName']}'.", + "alert-danger") + return redirect(url_for("species.platforms.create_platform", + species_id=species_id)) + + flash("Platform created successfully", "alert-success") + return redirect(url_for("species.platforms.list_platforms", + species_id=species_id)) diff --git a/uploader/population/__init__.py b/uploader/population/__init__.py new file mode 100644 index 0000000..bf6bf3c --- /dev/null +++ b/uploader/population/__init__.py @@ -0,0 +1,3 @@ +"""Package to handle creation and management of Populations/InbredSets""" + +from .views import popbp diff --git a/uploader/population/models.py b/uploader/population/models.py new file mode 100644 index 0000000..6dcd85e --- /dev/null +++ b/uploader/population/models.py @@ -0,0 +1,87 @@ +"""Functions for accessing the database relating to species populations.""" +import MySQLdb as mdb +from MySQLdb.cursors import DictCursor + +def population_by_id(conn: mdb.Connection, population_id) -> dict: + """Get the grouping/population by id.""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute("SELECT * FROM InbredSet WHERE InbredSetId=%s", + (population_id,)) + return cursor.fetchone() + +def population_by_species_and_id( + conn: mdb.Connection, species_id, population_id) -> dict: + """Retrieve a population by its identifier and 
species.""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute("SELECT * FROM InbredSet WHERE SpeciesId=%s AND Id=%s", + (species_id, population_id)) + return cursor.fetchone() + +def populations_by_species(conn: mdb.Connection, speciesid) -> tuple: + "Retrieve group (InbredSet) information from the database." + with conn.cursor(cursorclass=DictCursor) as cursor: + query = "SELECT * FROM InbredSet WHERE SpeciesId=%s" + cursor.execute(query, (speciesid,)) + return tuple(cursor.fetchall()) + + return tuple() + + +def population_families(conn) -> tuple: + """Fetch the families under which populations are grouped.""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute( + "SELECT DISTINCT(Family) FROM InbredSet WHERE Family IS NOT NULL") + return tuple(row["Family"] for row in cursor.fetchall()) + + +def population_genetic_types(conn) -> tuple: + """Fetch the distinct genetic types recorded for populations.""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute( + "SELECT DISTINCT(GeneticType) FROM InbredSet WHERE GeneticType IS " + "NOT NULL") + return tuple(row["GeneticType"] for row in cursor.fetchall()) + + +def save_population(cursor: mdb.cursors.Cursor, population_details: dict) -> dict: + """Save the population details to the db.""" + cursor.execute("SELECT DISTINCT(Family), FamilyOrder FROM InbredSet " + "WHERE Family IS NOT NULL AND Family != '' " + "AND FamilyOrder IS NOT NULL " + "ORDER BY FamilyOrder ASC") + _families = { + row["Family"]: int(row["FamilyOrder"]) + for row in cursor.fetchall() + } + params = { + "MenuOrderId": 0, + "InbredSetId": 0, + "public": 2, + **population_details, + "FamilyOrder": _families.get( + population_details["Family"], + max(_families.values(), default=0)+1) + } + cursor.execute( + "INSERT INTO InbredSet(" + "InbredSetId, InbredSetName, Name, SpeciesId, FullName, " + "public, MappingMethodId, GeneticType, Family, FamilyOrder," + " MenuOrderId, InbredSetCode, Description" + ") " + 
"VALUES (" + "%(InbredSetId)s, %(InbredSetName)s, %(Name)s, %(SpeciesId)s, " + "%(FullName)s, %(public)s, %(MappingMethodId)s, %(GeneticType)s, " + "%(Family)s, %(FamilyOrder)s, %(MenuOrderId)s, %(InbredSetCode)s, " + "%(Description)s" + ")", + params) + new_id = cursor.lastrowid + cursor.execute("UPDATE InbredSet SET InbredSetId=%s WHERE Id=%s", + (new_id, new_id)) + return { + **params, + "Id": new_id, + "InbredSetId": new_id, + "population_id": new_id + } diff --git a/qc_app/upload/rqtl2.py b/uploader/population/rqtl2.py index e691636..9968bd6 100644 --- a/qc_app/upload/rqtl2.py +++ b/uploader/population/rqtl2.py @@ -3,7 +3,6 @@ import sys import json import traceback from pathlib import Path -from datetime import date from uuid import UUID, uuid4 from functools import partial from zipfile import ZipFile, is_zipfile @@ -27,20 +26,19 @@ from flask import ( from r_qtl import r_qtl2 -from qc_app import jobs -from qc_app.files import save_file, fullpath -from qc_app.dbinsert import species as all_species -from qc_app.db_utils import with_db_connection, database_connection - -from qc_app.db.platforms import platform_by_id, platforms_by_species -from qc_app.db.averaging import averaging_methods, averaging_method_by_id -from qc_app.db.tissues import all_tissues, tissue_by_id, create_new_tissue -from qc_app.db import ( - species_by_id, - save_population, - populations_by_species, - population_by_species_and_id,) -from qc_app.db.datasets import ( +from uploader import jobs +from uploader.files import save_file, fullpath +from uploader.species.models import all_species +from uploader.db_utils import with_db_connection, database_connection + +from uploader.authorisation import require_login +from uploader.platforms.models import platform_by_id, platforms_by_species +from uploader.db.averaging import averaging_methods, averaging_method_by_id +from uploader.db.tissues import all_tissues, tissue_by_id, create_new_tissue +from uploader.population.models import 
(populations_by_species, + population_by_species_and_id) +from uploader.species.models import species_by_id +from uploader.db.datasets import ( geno_dataset_by_id, geno_datasets_by_species_and_population, @@ -53,36 +51,41 @@ from qc_app.db.datasets import ( rqtl2 = Blueprint("rqtl2", __name__) + @rqtl2.route("/", methods=["GET", "POST"]) @rqtl2.route("/select-species", methods=["GET", "POST"]) +@require_login def select_species(): """Select the species.""" if request.method == "GET": - return render_template("rqtl2/index.html", species=with_db_connection(all_species)) + return render_template("expression-data/rqtl2/index.html", + species=with_db_connection(all_species)) species_id = request.form.get("species_id") species = with_db_connection( lambda conn: species_by_id(conn, species_id)) if bool(species): return redirect(url_for( - "upload.rqtl2.select_population", species_id=species_id)) + "species.populations.expression-data.rqtl2.select_population", + species_id=species_id)) flash("Invalid species or no species selected!", "alert-error error-rqtl2") - return redirect(url_for("upload.rqtl2.select_species")) + return redirect(url_for("expression-data.rqtl2.select_species")) -@rqtl2.route("/upload/species/<int:species_id>/select-population", +@rqtl2.route("<int:species_id>/expression-data/rqtl2/select-population", methods=["GET", "POST"]) +@require_login def select_population(species_id: int): """Select/Create the population to organise data under.""" with database_connection(app.config["SQL_URI"]) as conn: species = species_by_id(conn, species_id) if not bool(species): flash("Invalid species selected!", "alert-error error-rqtl2") - return redirect(url_for("upload.rqtl2.select_species")) + return redirect(url_for("expression-data.rqtl2.select_species")) if request.method == "GET": return render_template( - "rqtl2/select-population.html", + "expression-data/rqtl2/select-population.html", species=species, populations=populations_by_species(conn, species_id)) @@ 
-91,51 +94,14 @@ def select_population(species_id: int): if not bool(population): flash("Invalid Population!", "alert-error error-rqtl2") return redirect( - url_for("upload.rqtl2.select_population", pgsrc="error"), + url_for("expression-data.rqtl2.select_population", pgsrc="error"), code=307) - return redirect(url_for("upload.rqtl2.upload_rqtl2_bundle", + return redirect(url_for("expression-data.rqtl2.upload_rqtl2_bundle", species_id=species["SpeciesId"], population_id=population["InbredSetId"])) -@rqtl2.route("/upload/species/<int:species_id>/create-population", - methods=["POST"]) -def create_population(species_id: int): - """Create a new population for the given species.""" - population_page = redirect(url_for("upload.rqtl2.select_population", - species_id=species_id)) - with database_connection(app.config["SQL_URI"]) as conn: - species = species_by_id(conn, species_id) - population_name = request.form.get("inbredset_name", "").strip() - population_fullname = request.form.get("inbredset_fullname", "").strip() - if not bool(species): - flash("Invalid species!", "alert-error error-rqtl2") - return redirect(url_for("upload.rqtl2.select_species")) - if not bool(population_name): - flash("Invalid Population Name!", "alert-error error-rqtl2") - return population_page - if not bool(population_fullname): - flash("Invalid Population Full Name!", "alert-error error-rqtl2") - return population_page - new_population = save_population(conn, { - "SpeciesId": species["SpeciesId"], - "Name": population_name, - "InbredSetName": population_fullname, - "FullName": population_fullname, - "Family": request.form.get("inbredset_family") or None, - "Description": request.form.get("description") or None - }) - - flash("Population created successfully.", "alert-success") - return redirect( - url_for("upload.rqtl2.upload_rqtl2_bundle", - species_id=species_id, - population_id=new_population["population_id"], - pgsrc="create-population"), - code=307) - - class __RequestError__(Exception): 
#pylint: disable=[invalid-name] """Internal class to avoid pylint's `too-many-return-statements` error.""" @@ -143,6 +109,7 @@ class __RequestError__(Exception): #pylint: disable=[invalid-name] @rqtl2.route(("/upload/species/<int:species_id>/population/<int:population_id>" "/rqtl2-bundle"), methods=["GET", "POST"]) +@require_login def upload_rqtl2_bundle(species_id: int, population_id: int): """Allow upload of R/qtl2 bundle.""" with database_connection(app.config["SQL_URI"]) as conn: @@ -151,18 +118,19 @@ def upload_rqtl2_bundle(species_id: int, population_id: int): conn, species["SpeciesId"], population_id) if not bool(species): flash("Invalid species!", "alert-error error-rqtl2") - return redirect(url_for("upload.rqtl2.select_species")) + return redirect(url_for("expression-data.rqtl2.select_species")) if not bool(population): flash("Invalid Population!", "alert-error error-rqtl2") return redirect( - url_for("upload.rqtl2.select_population", pgsrc="error"), + url_for("expression-data.rqtl2.select_population", pgsrc="error"), code=307) if request.method == "GET" or ( request.method == "POST" and bool(request.args.get("pgsrc"))): - return render_template("rqtl2/upload-rqtl2-bundle-step-01.html", - species=species, - population=population) + return render_template( + "expression-data/rqtl2/upload-rqtl2-bundle-step-01.html", + species=species, + population=population) try: app.logger.debug("Files in the form: %s", request.files) @@ -172,7 +140,7 @@ def upload_rqtl2_bundle(species_id: int, population_id: int): app.logger.debug(traceback.format_exc()) flash("Please provide a valid R/qtl2 zip bundle.", "alert-error error-rqtl2") - return redirect(url_for("upload.rqtl2.upload_rqtl2_bundle", + return redirect(url_for("expression-data.rqtl2.upload_rqtl2_bundle", species_id=species_id, population_id=population_id)) @@ -186,7 +154,7 @@ def upload_rqtl2_bundle(species_id: int, population_id: int): the_file, request.files["rqtl2_bundle_file"].filename)#type: ignore[arg-type] 
return redirect(url_for( - "upload.rqtl2.rqtl2_bundle_qc_status", jobid=jobid)) + "expression-data.rqtl2.rqtl2_bundle_qc_status", jobid=jobid)) def trigger_rqtl2_bundle_qc( @@ -238,9 +206,10 @@ def chunks_directory(uniqueidentifier: str) -> Path: return Path(app.config["UPLOAD_FOLDER"], f"tempdir_{uniqueidentifier}") -@rqtl2.route(("/upload/species/<int:species_id>/population/<int:population_id>" +@rqtl2.route(("<int:species_id>/populations/<int:population_id>/rqtl2/" "/rqtl2-bundle-chunked"), methods=["GET"]) +@require_login def upload_rqtl2_bundle_chunked_get(# pylint: disable=["unused-argument"] species_id: int, population_id: int @@ -248,7 +217,7 @@ def upload_rqtl2_bundle_chunked_get(# pylint: disable=["unused-argument"] """ Extension to the `upload_rqtl2_bundle` endpoint above that provides a way for testing whether all the chunks have been uploaded and to assist with - resuming a failed upload. + resuming a failed expression-data. """ fileid = request.args.get("resumableIdentifier", type=str) or "" filename = request.args.get("resumableFilename", type=str) or "" @@ -282,9 +251,10 @@ def __merge_chunks__(targetfile: Path, chunkpaths: tuple[Path, ...]) -> Path: return targetfile -@rqtl2.route(("/upload/species/<int:species_id>/population/<int:population_id>" +@rqtl2.route(("<int:species_id>/population/<int:population_id>/rqtl2/upload/" "/rqtl2-bundle-chunked"), methods=["POST"]) +@require_login def upload_rqtl2_bundle_chunked_post(species_id: int, population_id: int): """ Extension to the `upload_rqtl2_bundle` endpoint above that allows large @@ -310,29 +280,40 @@ def upload_rqtl2_bundle_chunked_post(species_id: int, population_id: int): "statuscode": 400 }), 400 - # save chunk data - chunks_directory(_fileid).mkdir(exist_ok=True) - request.files["file"].save(Path(chunks_directory(_fileid), - chunk_name(_uploadfilename, _chunk))) - - # Check whether upload is complete - chunkpaths = tuple( - Path(chunks_directory(_fileid), chunk_name(_uploadfilename, _achunk)) 
- for _achunk in range(1, _totalchunks+1)) - if all(_file.exists() for _file in chunkpaths): - # merge_files and clean up chunks - __merge_chunks__(_targetfile, chunkpaths) - chunks_directory(_fileid).rmdir() - jobid = trigger_rqtl2_bundle_qc( - species_id, population_id, _targetfile, _uploadfilename) - return url_for( - "upload.rqtl2.rqtl2_bundle_qc_status", jobid=jobid) + try: + # save chunk data + chunks_directory(_fileid).mkdir(exist_ok=True, parents=True) + request.files["file"].save(Path(chunks_directory(_fileid), + chunk_name(_uploadfilename, _chunk))) + + # Check whether upload is complete + chunkpaths = tuple( + Path(chunks_directory(_fileid), chunk_name(_uploadfilename, _achunk)) + for _achunk in range(1, _totalchunks+1)) + if all(_file.exists() for _file in chunkpaths): + # merge_files and clean up chunks + __merge_chunks__(_targetfile, chunkpaths) + chunks_directory(_fileid).rmdir() + jobid = trigger_rqtl2_bundle_qc( + species_id, population_id, _targetfile, _uploadfilename) + return url_for( + "expression-data.rqtl2.rqtl2_bundle_qc_status", jobid=jobid) + except Exception as exc:# pylint: disable=[broad-except] + msg = "Error processing uploaded file chunks." 
+ app.logger.error(msg, exc_info=True, stack_info=True) + return jsonify({ + "message": msg, + "error": type(exc).__name__, + "error-description": " ".join(str(arg) for arg in exc.args), + "error-trace": traceback.format_exception(exc) + }), 500 return "OK" @rqtl2.route("/upload/species/rqtl2-bundle/qc-status/<uuid:jobid>", methods=["GET", "POST"]) +@require_login def rqtl2_bundle_qc_status(jobid: UUID): """Check the status of the QC jobs.""" with (Redis.from_url(app.config["REDIS_URL"], decode_responses=True) as rconn, @@ -344,24 +325,25 @@ def rqtl2_bundle_qc_status(jobid: UUID): if bool(messagelistname) else []) jobstatus = thejob["status"] if jobstatus == "error": - return render_template("rqtl2/rqtl2-qc-job-error.html", - job=thejob, - errorsgeneric=json.loads( - thejob.get("errors-generic", "[]")), - errorsgeno=json.loads( - thejob.get("errors-geno", "[]")), - errorspheno=json.loads( - thejob.get("errors-pheno", "[]")), - errorsphenose=json.loads( - thejob.get("errors-phenose", "[]")), - errorsphenocovar=json.loads( - thejob.get("errors-phenocovar", "[]")), - messages=logmessages) + return render_template( + "expression-data/rqtl2/rqtl2-qc-job-error.html", + job=thejob, + errorsgeneric=json.loads( + thejob.get("errors-generic", "[]")), + errorsgeno=json.loads( + thejob.get("errors-geno", "[]")), + errorspheno=json.loads( + thejob.get("errors-pheno", "[]")), + errorsphenose=json.loads( + thejob.get("errors-phenose", "[]")), + errorsphenocovar=json.loads( + thejob.get("errors-phenocovar", "[]")), + messages=logmessages) if jobstatus == "success": jobmeta = json.loads(thejob["job-metadata"]) species = species_by_id(dbconn, jobmeta["speciesid"]) return render_template( - "rqtl2/rqtl2-qc-job-results.html", + "expression-data/rqtl2/rqtl2-qc-job-results.html", species=species, population=population_by_species_and_id( dbconn, species["SpeciesId"], jobmeta["populationid"]), @@ -380,14 +362,14 @@ def rqtl2_bundle_qc_status(jobid: UUID): return None return 
render_template( - "rqtl2/rqtl2-qc-job-status.html", + "expression-data/rqtl2/rqtl2-qc-job-status.html", job=thejob, geno_percent=compute_percentage(thejob, "geno"), pheno_percent=compute_percentage(thejob, "pheno"), phenose_percent=compute_percentage(thejob, "phenose"), messages=logmessages) except jobs.JobNotFound: - return render_template("rqtl2/no-such-job.html", jobid=jobid) + return render_template("expression-data/rqtl2/no-such-job.html", jobid=jobid) def redirect_on_error(flaskroute, **kwargs): @@ -403,7 +385,7 @@ def check_species(conn: mdb.Connection, formargs: dict) -> Optional[ corresponding species exists in the database. Maybe give the function a better name...""" - speciespage = redirect_on_error("upload.rqtl2.select_species") + speciespage = redirect_on_error("expression-data.rqtl2.select_species") if "species_id" not in formargs: return "You MUST provide the Species identifier.", speciespage @@ -422,7 +404,7 @@ def check_population(conn: mdb.Connection, Maybe give the function a better name...""" poppage = redirect_on_error( - "upload.rqtl2.select_species", species_id=species_id) + "expression-data.rqtl2.select_species", species_id=species_id) if "population_id" not in formargs: return "You MUST provide the Population identifier.", poppage @@ -437,12 +419,12 @@ def check_r_qtl2_bundle(formargs: dict, species_id, population_id) -> Optional[tuple[str, Response]]: """Check for the existence of the R/qtl2 bundle.""" - fileuploadpage = redirect_on_error("upload.rqtl2.upload_rqtl2_bundle", + fileuploadpage = redirect_on_error("expression-data.rqtl2.upload_rqtl2_bundle", species_id=species_id, population_id=population_id) if not "rqtl2_bundle_file" in formargs: return ( - "You MUST provide a R/qtl2 zip bundle for upload.", fileuploadpage) + "You MUST provide a R/qtl2 zip bundle for expression-data.", fileuploadpage) if not Path(fullpath(formargs["rqtl2_bundle_file"])).exists(): return "No R/qtl2 bundle with the given name exists.", fileuploadpage @@ 
-455,7 +437,7 @@ def check_geno_dataset(conn: mdb.Connection, species_id, population_id) -> Optional[tuple[str, Response]]: """Check for the Genotype dataset.""" - genodsetpg = redirect_on_error("upload.rqtl2.select_dataset_info", + genodsetpg = redirect_on_error("expression-data.rqtl2.select_dataset_info", species_id=species_id, population_id=population_id) if not bool(formargs.get("geno-dataset-id")): @@ -480,7 +462,7 @@ def check_geno_dataset(conn: mdb.Connection, def check_tissue( conn: mdb.Connection,formargs: dict) -> Optional[tuple[str, Response]]: """Check for tissue/organ/biological material.""" - selectdsetpg = redirect_on_error("upload.rqtl2.select_dataset_info", + selectdsetpg = redirect_on_error("expression-data.rqtl2.select_dataset_info", species_id=formargs["species_id"], population_id=formargs["population_id"]) if not bool(formargs.get("tissueid", "").strip()): @@ -508,7 +490,7 @@ def check_probe_study(conn: mdb.Connection, species_id, population_id) -> Optional[tuple[str, Response]]: """Check for the ProbeSet study.""" - dsetinfopg = redirect_on_error("upload.rqtl2.select_dataset_info", + dsetinfopg = redirect_on_error("expression-data.rqtl2.select_dataset_info", species_id=species_id, population_id=population_id) if not bool(formargs.get("probe-study-id")): @@ -526,7 +508,7 @@ def check_probe_dataset(conn: mdb.Connection, species_id, population_id) -> Optional[tuple[str, Response]]: """Check for the ProbeSet dataset.""" - dsetinfopg = redirect_on_error("upload.rqtl2.select_dataset_info", + dsetinfopg = redirect_on_error("expression-data.rqtl2.select_dataset_info", species_id=species_id, population_id=population_id) if not bool(formargs.get("probe-dataset-id")): @@ -554,6 +536,7 @@ def with_errors(endpointthunk: Callable, *checkfns): @rqtl2.route(("/upload/species/<int:species_id>/population/<int:population_id>" "/rqtl2-bundle/select-geno-dataset"), methods=["POST"]) +@require_login def select_geno_dataset(species_id: int, population_id: int): 
"""Select from existing geno datasets.""" with database_connection(app.config["SQL_URI"]) as conn: @@ -563,17 +546,17 @@ def select_geno_dataset(species_id: int, population_id: int): if not bool(geno_dset): flash("No genotype dataset was provided!", "alert-error error-rqtl2") - return redirect(url_for("upload.rqtl2.select_geno_dataset", + return redirect(url_for("expression-data.rqtl2.select_geno_dataset", species_id=species_id, population_id=population_id, pgsrc="error"), code=307) flash("Genotype accepted", "alert-success error-rqtl2") - return redirect(url_for("upload.rqtl2.select_dataset_info", + return redirect(url_for("expression-data.rqtl2.select_dataset_info", species_id=species_id, population_id=population_id, - pgsrc="upload.rqtl2.select_geno_dataset"), + pgsrc="expression-data.rqtl2.select_geno_dataset"), code=307) return with_errors(__thunk__, @@ -590,77 +573,9 @@ def select_geno_dataset(species_id: int, population_id: int): @rqtl2.route(("/upload/species/<int:species_id>/population/<int:population_id>" - "/rqtl2-bundle/create-geno-dataset"), - methods=["POST"]) -def create_geno_dataset(species_id: int, population_id: int): - """Create a new geno dataset.""" - with database_connection(app.config["SQL_URI"]) as conn: - def __thunk__(): - sgeno_page = redirect(url_for("upload.rqtl2.select_dataset_info", - species_id=species_id, - population_id=population_id, - pgsrc="error"), - code=307) - errorclasses = "alert-error error-rqtl2 error-rqtl2-create-geno-dataset" - if not bool(request.form.get("dataset-name")): - flash("You must provide the dataset name", errorclasses) - return sgeno_page - if not bool(request.form.get("dataset-fullname")): - flash("You must provide the dataset full name", errorclasses) - return sgeno_page - public = 2 if request.form.get("dataset-public") == "on" else 0 - - with conn.cursor(cursorclass=DictCursor) as cursor: - datasetname = request.form["dataset-name"] - new_dataset = { - "name": datasetname, - "fname": 
request.form.get("dataset-fullname"), - "sname": request.form.get("dataset-shortname") or datasetname, - "today": date.today().isoformat(), - "pub": public, - "isetid": population_id - } - cursor.execute("SELECT * FROM GenoFreeze WHERE Name=%s", - (datasetname,)) - results = cursor.fetchall() - if bool(results): - flash( - f"A genotype dataset with name '{escape(datasetname)}' " - "already exists.", - errorclasses) - return redirect(url_for("upload.rqtl2.select_dataset_info", - species_id=species_id, - population_id=population_id, - pgsrc="error"), - code=307) - cursor.execute( - "INSERT INTO GenoFreeze(" - "Name, FullName, ShortName, CreateTime, public, InbredSetId" - ") " - "VALUES(" - "%(name)s, %(fname)s, %(sname)s, %(today)s, %(pub)s, %(isetid)s" - ")", - new_dataset) - flash("Created dataset successfully.", "alert-success") - return render_template( - "rqtl2/create-geno-dataset-success.html", - species=species_by_id(conn, species_id), - population=population_by_species_and_id( - conn, species_id, population_id), - rqtl2_bundle_file=request.form["rqtl2_bundle_file"], - geno_dataset={**new_dataset, "id": cursor.lastrowid}) - - return with_errors(__thunk__, - partial(check_species, conn=conn), - partial(check_population, conn=conn, species_id=species_id), - partial(check_r_qtl2_bundle, - species_id=species_id, - population_id=population_id)) - - -@rqtl2.route(("/upload/species/<int:species_id>/population/<int:population_id>" "/rqtl2-bundle/select-tissue"), methods=["POST"]) +@require_login def select_tissue(species_id: int, population_id: int): """Select from existing tissues.""" with database_connection(app.config["SQL_URI"]) as conn: @@ -669,10 +584,10 @@ def select_tissue(species_id: int, population_id: int): flash("Invalid tissue selection!", "alert-error error-select-tissue error-rqtl2") - return redirect(url_for("upload.rqtl2.select_dataset_info", + return redirect(url_for("expression-data.rqtl2.select_dataset_info", species_id=species_id, 
population_id=population_id, - pgsrc="upload.rqtl2.select_geno_dataset"), + pgsrc="expression-data.rqtl2.select_geno_dataset"), code=307) return with_errors(__thunk__, @@ -691,14 +606,15 @@ def select_tissue(species_id: int, population_id: int): @rqtl2.route(("/upload/species/<int:species_id>/population/<int:population_id>" "/rqtl2-bundle/create-tissue"), methods=["POST"]) +@require_login def create_tissue(species_id: int, population_id: int): """Add new tissue, organ or biological material to the system.""" form = request.form datasetinfopage = redirect( - url_for("upload.rqtl2.select_dataset_info", + url_for("expression-data.rqtl2.select_dataset_info", species_id=species_id, population_id=population_id, - pgsrc="upload.rqtl2.select_geno_dataset"), + pgsrc="expression-data.rqtl2.select_geno_dataset"), code=307) with database_connection(app.config["SQL_URI"]) as conn: tissuename = form.get("tissuename", "").strip() @@ -717,7 +633,7 @@ def create_tissue(species_id: int, population_id: int): tissue = create_new_tissue(conn, tissuename, tissueshortname) flash("Tissue created successfully!", "alert-success") return render_template( - "rqtl2/create-tissue-success.html", + "expression-data/rqtl2/create-tissue-success.html", species=species_by_id(conn, species_id), population=population_by_species_and_id( conn, species_id, population_id), @@ -735,11 +651,12 @@ def create_tissue(species_id: int, population_id: int): @rqtl2.route(("/upload/species/<int:species_id>/population/<int:population_id>" "/rqtl2-bundle/select-probeset-study"), methods=["POST"]) +@require_login def select_probeset_study(species_id: int, population_id: int): """Select or create a probeset study.""" with database_connection(app.config["SQL_URI"]) as conn: def __thunk__(): - summary_page = redirect(url_for("upload.rqtl2.select_dataset_info", + summary_page = redirect(url_for("expression-data.rqtl2.select_dataset_info", species_id=species_id, population_id=population_id), code=307) @@ -770,11 +687,12 @@ 
def select_probeset_study(species_id: int, population_id: int): @rqtl2.route(("/upload/species/<int:species_id>/population/<int:population_id>" "/rqtl2-bundle/select-probeset-dataset"), methods=["POST"]) +@require_login def select_probeset_dataset(species_id: int, population_id: int): """Select or create a probeset dataset.""" with database_connection(app.config["SQL_URI"]) as conn: def __thunk__(): - summary_page = redirect(url_for("upload.rqtl2.select_dataset_info", + summary_page = redirect(url_for("expression-data.rqtl2.select_dataset_info", species_id=species_id, population_id=population_id), code=307) @@ -810,6 +728,7 @@ def select_probeset_dataset(species_id: int, population_id: int): @rqtl2.route(("/upload/species/<int:species_id>/population/<int:population_id>" "/rqtl2-bundle/create-probeset-study"), methods=["POST"]) +@require_login def create_probeset_study(species_id: int, population_id: int): """Create a new probeset study.""" errorclasses = "alert-error error-rqtl2 error-rqtl2-create-probeset-study" @@ -817,7 +736,7 @@ def create_probeset_study(species_id: int, population_id: int): def __thunk__(): form = request.form dataset_info_page = redirect( - url_for("upload.rqtl2.select_dataset_info", + url_for("expression-data.rqtl2.select_dataset_info", species_id=species_id, population_id=population_id), code=307) @@ -844,7 +763,7 @@ def create_probeset_study(species_id: int, population_id: int): errorclasses) return dataset_info_page return render_template( - "rqtl2/create-probe-study-success.html", + "expression-data/rqtl2/create-probe-study-success.html", species=species_by_id(conn, species_id), population=population_by_species_and_id( conn, species_id, population_id), @@ -872,13 +791,14 @@ def create_probeset_study(species_id: int, population_id: int): @rqtl2.route(("/upload/species/<int:species_id>/population/<int:population_id>" "/rqtl2-bundle/create-probeset-dataset"), methods=["POST"]) +@require_login def create_probeset_dataset(species_id: int, 
population_id: int):#pylint: disable=[too-many-return-statements] """Create a new probeset dataset.""" errorclasses = "alert-error error-rqtl2 error-rqtl2-create-probeset-dataset" with database_connection(app.config["SQL_URI"]) as conn: def __thunk__():#pylint: disable=[too-many-return-statements] form = request.form - summary_page = redirect(url_for("upload.rqtl2.select_dataset_info", + summary_page = redirect(url_for("expression-data.rqtl2.select_dataset_info", species_id=species_id, population_id=population_id), code=307) @@ -928,7 +848,7 @@ def create_probeset_dataset(species_id: int, population_id: int):#pylint: disabl errorclasses) return summary_page return render_template( - "rqtl2/create-probe-dataset-success.html", + "expression-data/rqtl2/create-probe-dataset-success.html", species=species_by_id(conn, species_id), population=population_by_species_and_id( conn, species_id, population_id), @@ -963,6 +883,7 @@ def create_probeset_dataset(species_id: int, population_id: int):#pylint: disabl @rqtl2.route(("/upload/species/<int:species_id>/population/<int:population_id>" "/rqtl2-bundle/dataset-info"), methods=["POST"]) +@require_login def select_dataset_info(species_id: int, population_id: int): """ If `geno` files exist in the R/qtl2 bundle, prompt user to provide the @@ -982,7 +903,7 @@ def select_dataset_info(species_id: int, population_id: int): conn,form.get("geno-dataset-id", "").strip()) if "geno" in cdata and not bool(form.get("geno-dataset-id")): return render_template( - "rqtl2/select-geno-dataset.html", + "expression-data/rqtl2/select-geno-dataset.html", species=species, population=population, rqtl2_bundle_file=thefile.name, @@ -992,7 +913,7 @@ def select_dataset_info(species_id: int, population_id: int): tissue = tissue_by_id(conn, form.get("tissueid", "").strip()) if "pheno" in cdata and not bool(tissue): return render_template( - "rqtl2/select-tissue.html", + "expression-data/rqtl2/select-tissue.html", species=species, population=population, 
rqtl2_bundle_file=thefile.name, @@ -1006,7 +927,7 @@ def select_dataset_info(species_id: int, population_id: int): conn, form.get("probe-study-id", "").strip()) if "pheno" in cdata and not bool(probeset_study): return render_template( - "rqtl2/select-probeset-study-id.html", + "expression-data/rqtl2/select-probeset-study-id.html", species=species, population=population, rqtl2_bundle_file=thefile.name, @@ -1022,7 +943,7 @@ def select_dataset_info(species_id: int, population_id: int): conn, form.get("probe-dataset-id", "").strip()) if "pheno" in cdata and not bool(probeset_dataset): return render_template( - "rqtl2/select-probeset-dataset.html", + "expression-data/rqtl2/select-probeset-dataset.html", species=species, population=population, rqtl2_bundle_file=thefile.name, @@ -1033,7 +954,7 @@ def select_dataset_info(species_id: int, population_id: int): conn, int(form["probe-study-id"])), avgmethods=averaging_methods(conn)) - return render_template("rqtl2/summary-info.html", + return render_template("expression-data/rqtl2/summary-info.html", species=species, population=population, rqtl2_bundle_file=thefile.name, @@ -1055,6 +976,7 @@ def select_dataset_info(species_id: int, population_id: int): @rqtl2.route(("/upload/species/<int:species_id>/population/<int:population_id>" "/rqtl2-bundle/confirm-bundle-details"), methods=["POST"]) +@require_login def confirm_bundle_details(species_id: int, population_id: int): """Confirm the details and trigger R/qtl2 bundle processing...""" redisuri = app.config["REDIS_URL"] @@ -1097,7 +1019,7 @@ def confirm_bundle_details(species_id: int, population_id: int): redisuri, f"{app.config['UPLOAD_FOLDER']}/job_errors") - return redirect(url_for("upload.rqtl2.rqtl2_processing_status", + return redirect(url_for("expression-data.rqtl2.rqtl2_processing_status", jobid=jobid)) return with_errors(__thunk__, @@ -1135,13 +1057,19 @@ def rqtl2_processing_status(jobid: UUID): if thejob["status"] == "error": return render_template( - 
"rqtl2/rqtl2-job-error.html", job=thejob, messages=logmessages) + "expression-data/rqtl2/rqtl2-job-error.html", + job=thejob, + messages=logmessages) if thejob["status"] == "success": - return render_template("rqtl2/rqtl2-job-results.html", - job=thejob, - messages=logmessages) + return render_template( + "expression-data/rqtl2/rqtl2-job-results.html", + job=thejob, + messages=logmessages) return render_template( - "rqtl2/rqtl2-job-status.html", job=thejob, messages=logmessages) + "expression-data/rqtl2/rqtl2-job-status.html", + job=thejob, + messages=logmessages) except jobs.JobNotFound as _exc: - return render_template("rqtl2/no-such-job.html", jobid=jobid) + return render_template("expression-data/rqtl2/no-such-job.html", + jobid=jobid) diff --git a/uploader/population/views.py b/uploader/population/views.py new file mode 100644 index 0000000..36201ba --- /dev/null +++ b/uploader/population/views.py @@ -0,0 +1,199 @@ +"""Views dealing with populations/inbredsets""" +import json +import base64 + +from MySQLdb.cursors import DictCursor +from flask import (flash, + request, + url_for, + redirect, + Blueprint, + current_app as app) + +from uploader.samples.views import samplesbp +from uploader.oauth2.client import oauth2_post +from uploader.ui import make_template_renderer +from uploader.authorisation import require_login +from uploader.genotypes.views import genotypesbp +from uploader.db_utils import database_connection +from uploader.datautils import enumerate_sequence +from uploader.phenotypes.views import phenotypesbp +from uploader.expression_data.views import exprdatabp +from uploader.monadic_requests import make_either_error_handler +from uploader.input_validation import is_valid_representative_name +from uploader.species.models import (all_species, + species_by_id, + order_species_by_family) + +from .models import (save_population, + population_families, + populations_by_species, + population_genetic_types, + population_by_species_and_id) + +__active_link__ 
= "populations" +popbp = Blueprint("populations", __name__) +popbp.register_blueprint(samplesbp, url_prefix="/") +popbp.register_blueprint(genotypesbp, url_prefix="/") +popbp.register_blueprint(phenotypesbp, url_prefix="/") +popbp.register_blueprint(exprdatabp, url_prefix="/") +render_template = make_template_renderer("populations") + + +@popbp.route("/populations", methods=["GET", "POST"]) +@require_login +def index(): + """Entry point for populations.""" + with database_connection(app.config["SQL_URI"]) as conn: + if not bool(request.args.get("species_id")): + return render_template( + "populations/index.html", + species=order_species_by_family(all_species(conn))) + species = species_by_id(conn, request.args.get("species_id")) + if not bool(species): + flash("Invalid species identifier provided!", "alert-danger") + return redirect(url_for("species.populations.index")) + return redirect(url_for("species.populations.list_species_populations", + species_id=species["SpeciesId"])) + +@popbp.route("/<int:species_id>/populations", methods=["GET"]) +@require_login +def list_species_populations(species_id: int): + """List a particular species' populations.""" + with database_connection(app.config["SQL_URI"]) as conn: + species = species_by_id(conn, species_id) + if not bool(species): + flash("No species was found for given ID.", "alert-danger") + return redirect(url_for("species.populations.index")) + return render_template( + "populations/list-populations.html", + species=species, + populations=enumerate_sequence(populations_by_species( + conn, species_id)), + activelink="list-populations") + + +@popbp.route("/<int:species_id>/populations/create", methods=["GET", "POST"]) +@require_login +def create_population(species_id: int): + """Create a new population.""" + with (database_connection(app.config["SQL_URI"]) as conn, + conn.cursor(cursorclass=DictCursor) as cursor): + species = species_by_id(conn, species_id) + + if request.method == "GET": + error_values = 
request.args.get("error_values") + if not bool(error_values): + error_values = base64.b64encode( + '{"errors":{}, "error_values": {}}'.encode("utf8") + ).decode("utf8") + + error_values = json.loads(base64.b64decode( + error_values.encode("utf8")).decode("utf8"))# type: ignore[union-attr] + return render_template( + "populations/create-population.html", + species=species, + families = population_families(conn), + genetic_types = population_genetic_types(conn), + mapping_methods=( + {"id": "0", "value": "No mapping support"}, + {"id": "1", "value": "GEMMA, QTLReaper, R/qtl"}, + {"id": "2", "value": "GEMMA"}, + {"id": "3", "value": "R/qtl"}, + {"id": "4", "value": "GEMMA, PLINK"}), + activelink="create-population", + **error_values) + + if not bool(species): + flash("You must select a species.", "alert-danger") + return redirect(url_for("species.populations.index")) + + errors: tuple[tuple[str, str], ...] = tuple() + + population_name = (request.form.get( + "population_name") or "").strip() + if not bool(population_name): + errors = errors + (("population_name", + "You must provide a name for the population!"),) + + if not is_valid_representative_name(population_name): + errors = errors + (( + "population_name", + "The population name can only contain letters, numbers, " + "hyphens and underscores."),) + + population_fullname = (request.form.get( + "population_fullname") or "").strip() + if not bool(population_fullname): + errors = errors + ( + ("population_fullname", "Full Name MUST be provided."),) + + if bool(errors): + values = base64.b64encode( + json.dumps({ + "errors": dict(errors), + "error_values": dict(request.form) + }).encode("utf8")) + return redirect(url_for("species.populations.create_population", + species_id=species["SpeciesId"], + error_values=values)) + + new_population = save_population(cursor, { + "SpeciesId": species["SpeciesId"], + "Name": population_name, + "InbredSetName": population_fullname, + "FullName": population_fullname, + 
"InbredSetCode": request.form.get("population_code") or None, + "Description": request.form.get("population_description") or None, + "Family": request.form.get("population_family") or None, + "MappingMethodId": request.form.get("population_mapping_method_id"), + "GeneticType": request.form.get("population_genetic_type") or None + }) + + def __flash_success__(_success): + flash("Successfully created resource.", "alert-success") + return redirect(url_for( + "species.populations.view_population", + species_id=species["SpeciesId"], + population_id=new_population["InbredSetId"])) + + app.logger.debug("We begin setting up the privileges here…") + return oauth2_post( + "auth/resource/populations/create", + json={ + **dict(request.form), + "species_id": species_id, + "population_id": new_population["Id"], + "public": "on" + } + ).either( + make_either_error_handler( + "There was an error creating the population"), + __flash_success__) + + +@popbp.route("/<int:species_id>/populations/<int:population_id>", + methods=["GET"]) +@require_login +def view_population(species_id: int, population_id: int): + """View the details of a population.""" + with database_connection(app.config["SQL_URI"]) as conn: + species = species_by_id(conn, species_id) + population = population_by_species_and_id(conn, species_id, population_id) + error = False + + if not bool(species): + flash("You must select a species.", "alert-danger") + error = True + + if not bool(population): + flash("You must select a population.", "alert-danger") + error = True + + if error: + return redirect(url_for("species.populations.index")) + + return render_template("populations/view-population.html", + species=species, + population=population, + activelink="view-population") diff --git a/uploader/request_checks.py b/uploader/request_checks.py new file mode 100644 index 0000000..a24b2f7 --- /dev/null +++ b/uploader/request_checks.py @@ -0,0 +1,75 @@ +"""Functions to perform common checks. 
+ +These are useful for reusability, and hence maintainability of the code. +""" +from functools import wraps + +from flask import flash, url_for, redirect, current_app as app + +from uploader.species.models import species_by_id +from uploader.db_utils import database_connection +from uploader.population.models import population_by_species_and_id + +def with_species(redirect_uri: str): + """Ensure the species actually exists.""" + def __decorator__(function): + @wraps(function) + def __with_species__(**kwargs): + try: + species_id = int(kwargs.get("species_id")) + if not bool(species_id): + flash("Expected species_id value to be present!", + "alert-danger") + return redirect(url_for(redirect_uri)) + with database_connection(app.config["SQL_URI"]) as conn: + species = species_by_id(conn, species_id) + if not bool(species): + flash("Could not find species with that ID", + "alert-danger") + return redirect(url_for(redirect_uri)) + except ValueError as _verr: + app.logger.debug( + "Exception converting value to integer: %s", + kwargs.get("species_id"), + exc_info=True) + flash("Expected an integer for 'species_id' value.", + "alert-danger") + return redirect(url_for(redirect_uri)) + return function(**{**kwargs, "species": species}) + return __with_species__ + return __decorator__ + + +def with_population(species_redirect_uri: str, redirect_uri: str): + """Ensure the population actually exists.""" + def __decorator__(function): + @wraps(function) + @with_species(redirect_uri=species_redirect_uri) + def __with_population__(**kwargs): + try: + species_id = int(kwargs["species_id"]) + population_id = int(kwargs.get("population_id")) + select_population_uri = redirect(url_for( + redirect_uri, species_id=species_id)) + if not bool(population_id): + flash("Expected population_id value to be present!", + "alert-danger") + return select_population_uri + with database_connection(app.config["SQL_URI"]) as conn: + population = population_by_species_and_id( + conn, species_id, 
population_id) + if not bool(population): + flash("Could not find population with that ID", + "alert-danger") + return select_population_uri + except ValueError as _verr: + app.logger.debug( + "Exception converting value to integer: %s", + kwargs.get("population_id"), + exc_info=True) + flash("Expected an integer for 'population_id' value.", + "alert-danger") + return select_population_uri + return function(**{**kwargs, "population": population}) + return __with_population__ + return __decorator__ diff --git a/uploader/samples/__init__.py b/uploader/samples/__init__.py new file mode 100644 index 0000000..1bd6d2d --- /dev/null +++ b/uploader/samples/__init__.py @@ -0,0 +1 @@ +"""Samples package. Handle samples uploads and editing.""" diff --git a/uploader/samples/models.py b/uploader/samples/models.py new file mode 100644 index 0000000..d7d5384 --- /dev/null +++ b/uploader/samples/models.py @@ -0,0 +1,104 @@ +"""Functions for handling samples.""" +import csv +from typing import Iterator + +import MySQLdb as mdb +from MySQLdb.cursors import DictCursor + +from functional_tools import take + +def samples_by_species_and_population( + conn: mdb.Connection, + species_id: int, + population_id: int +) -> tuple[dict, ...]: + """Fetch the samples by their species and population.""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute( + "SELECT iset.InbredSetId, s.* FROM InbredSet AS iset " + "INNER JOIN StrainXRef AS sxr ON iset.InbredSetId=sxr.InbredSetId " + "INNER JOIN Strain AS s ON sxr.StrainId=s.Id " + "WHERE s.SpeciesId=%(species_id)s " + "AND iset.InbredSetId=%(population_id)s", + {"species_id": species_id, "population_id": population_id}) + return tuple(cursor.fetchall()) + + +def read_samples_file(filepath, separator: str, firstlineheading: bool, **kwargs) -> Iterator[dict]: + """Read the samples file.""" + with open(filepath, "r", encoding="utf-8") as inputfile: + reader = csv.DictReader( + inputfile, + fieldnames=( + None if firstlineheading + 
else ("Name", "Name2", "Symbol", "Alias")), + delimiter=separator, + quotechar=kwargs.get("quotechar", '"')) + for row in reader: + yield row + + +def save_samples_data(conn: mdb.Connection, + speciesid: int, + file_data: Iterator[dict]): + """Save the samples to DB.""" + data = ({**row, "SpeciesId": speciesid} for row in file_data) + total = 0 + with conn.cursor() as cursor: + while True: + batch = take(data, 5000) + if len(batch) == 0: + break + cursor.executemany( + "INSERT INTO Strain(Name, Name2, SpeciesId, Symbol, Alias) " + "VALUES(" + " %(Name)s, %(Name2)s, %(SpeciesId)s, %(Symbol)s, %(Alias)s" + ") ON DUPLICATE KEY UPDATE Name=Name", + batch) + total += len(batch) + print(f"\tSaved {total} samples total so far.") + + +def cross_reference_samples(conn: mdb.Connection, + species_id: int, + population_id: int, + strain_names: Iterator[str]): + """Link samples to their population.""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute( + "SELECT MAX(OrderId) AS loid FROM StrainXRef WHERE InbredSetId=%s", + (population_id,)) + last_order_id = (cursor.fetchone()["loid"] or 10) + total = 0 + while True: + batch = take(strain_names, 5000) + if len(batch) == 0: + break + params_str = ", ".join(["%s"] * len(batch)) + ## This query is slow -- investigate. 
+ cursor.execute( + "SELECT s.Id FROM Strain AS s LEFT JOIN StrainXRef AS sx " + "ON s.Id = sx.StrainId WHERE s.SpeciesId=%s AND s.Name IN " + f"({params_str}) AND sx.StrainId IS NULL", + (species_id,) + tuple(batch)) + strain_ids = (sid["Id"] for sid in cursor.fetchall()) + params = tuple({ + "pop_id": population_id, + "strain_id": strain_id, + "order_id": last_order_id + (order_id * 10), + "mapping": "N", + "pedigree": None + } for order_id, strain_id in enumerate(strain_ids, start=1)) + cursor.executemany( + "INSERT INTO StrainXRef( " + " InbredSetId, StrainId, OrderId, Used_for_mapping, PedigreeStatus" + ")" + "VALUES (" + " %(pop_id)s, %(strain_id)s, %(order_id)s, %(mapping)s, " + " %(pedigree)s" + ")", + params) + last_order_id += (len(params) * 10) + total += len(batch) + print(f"\t{total} total samples cross-referenced to the population " + "so far.") diff --git a/uploader/samples/views.py b/uploader/samples/views.py new file mode 100644 index 0000000..ed79101 --- /dev/null +++ b/uploader/samples/views.py @@ -0,0 +1,280 @@ +"""Code regarding samples""" +import os +import sys +import uuid +from pathlib import Path + +from redis import Redis +from flask import (flash, + request, + url_for, + redirect, + Blueprint, + current_app as app) + +from uploader import jobs +from uploader.files import save_file +from uploader.ui import make_template_renderer +from uploader.authorisation import require_login +from uploader.request_checks import with_population +from uploader.input_validation import is_integer_input +from uploader.datautils import safe_int, order_by_family, enumerate_sequence +from uploader.population.models import population_by_id, populations_by_species +from uploader.db_utils import (with_db_connection, + database_connection, + with_redis_connection) +from uploader.species.models import (all_species, + species_by_id, + order_species_by_family) + +from .models import samples_by_species_and_population + +samplesbp = Blueprint("samples", __name__) 
+render_template = make_template_renderer("samples") + +@samplesbp.route("/samples", methods=["GET"]) +@require_login +def index(): + """Direct entry-point for uploading/handling the samples.""" + with database_connection(app.config["SQL_URI"]) as conn: + if not bool(request.args.get("species_id")): + return render_template( + "samples/index.html", + species=order_species_by_family(all_species(conn)), + activelink="samples") + species = species_by_id(conn, request.args.get("species_id")) + if not bool(species): + flash("No such species!", "alert-danger") + return redirect(url_for("species.populations.samples.index")) + return redirect(url_for("species.populations.samples.select_population", + species_id=species["SpeciesId"])) + + +@samplesbp.route("<int:species_id>/samples/select-population", methods=["GET"]) +@require_login +def select_population(species_id: int): + """Select the population to use for the samples.""" + with database_connection(app.config["SQL_URI"]) as conn: + species = species_by_id(conn, species_id) + if not bool(species): + flash("Invalid species!", "alert-danger") + return redirect(url_for("species.populations.samples.index")) + + if not bool(request.args.get("population_id")): + return render_template("samples/select-population.html", + species=species, + populations=order_by_family( + populations_by_species( + conn, + species_id), + order_key="FamilyOrder"), + activelink="samples") + + population = population_by_id(conn, request.args.get("population_id")) + if not bool(population): + flash("Population not found!", "alert-danger") + return redirect(url_for( + "species.populations.samples.select_population", + species_id=species_id)) + + return redirect(url_for("species.populations.samples.list_samples", + species_id=species_id, + population_id=population["Id"])) + +@samplesbp.route("<int:species_id>/populations/<int:population_id>/samples") +@require_login +def list_samples(species_id: int, population_id: int): + """ + List the samples in a 
particular population and give the ability to upload + new ones. + """ + with database_connection(app.config["SQL_URI"]) as conn: + species = species_by_id(conn, species_id) + if not bool(species): + flash("Invalid species!", "alert-danger") + return redirect(url_for("species.populations.samples.index")) + + population = population_by_id(conn, population_id) + if not bool(population): + flash("Population not found!", "alert-danger") + return redirect(url_for( + "species.populations.samples.select_population", + species_id=species_id)) + + all_samples = enumerate_sequence(samples_by_species_and_population( + conn, species_id, population_id)) + total_samples = len(all_samples) + offset = max(safe_int(request.args.get("from") or 0), 0) + count = int(request.args.get("count") or 20) + return render_template("samples/list-samples.html", + species=species, + population=population, + samples=all_samples[offset:offset+count], + offset=offset, + count=count, + total_samples=total_samples, + activelink="list-samples") + + +def build_sample_upload_job(# pylint: disable=[too-many-arguments] + speciesid: int, + populationid: int, + samplesfile: Path, + separator: str, + firstlineheading: bool, + quotechar: str): + """Define the async command to run the actual samples data upload.""" + return [ + sys.executable, "-m", "scripts.insert_samples", app.config["SQL_URI"], + str(speciesid), str(populationid), str(samplesfile.absolute()), + separator, f"--redisuri={app.config['REDIS_URL']}", + f"--quotechar={quotechar}" + ] + (["--firstlineheading"] if firstlineheading else []) + + +@samplesbp.route("<int:species_id>/populations/<int:population_id>/upload-samples", + methods=["GET", "POST"]) +@require_login +def upload_samples(species_id: int, population_id: int):#pylint: disable=[too-many-return-statements] + """Upload the samples.""" + samples_uploads_page = redirect(url_for( + "species.populations.samples.upload_samples", + species_id=species_id, + population_id=population_id)) + if 
not is_integer_input(species_id): + flash("You did not provide a valid species. Please select one to " + "continue.", + "alert-danger") + return redirect(url_for("species.populations.samples.index")) + species = with_db_connection(lambda conn: species_by_id(conn, species_id)) + if not bool(species): + flash("Species with given ID was not found.", "alert-danger") + return redirect(url_for("species.populations.samples.index")) + + if not is_integer_input(population_id): + flash("You did not provide a valid population. Please select one " + "to continue.", + "alert-danger") + return redirect(url_for("species.populations.samples.select_population", + species_id=species_id), + code=307) + population = with_db_connection( + lambda conn: population_by_id(conn, int(population_id))) + if not bool(population): + flash("Invalid grouping/population!", "alert-error") + return redirect(url_for("species.populations.samples.select_population", + species_id=species_id), + code=307) + + if request.method == "GET" or request.files.get("samples_file") is None: + return render_template("samples/upload-samples.html", + species=species, + population=population) + + try: + samples_file = save_file(request.files["samples_file"], + Path(app.config["UPLOAD_FOLDER"])) + except AssertionError: + flash("You need to provide a file with the samples data.", + "alert-error") + return samples_uploads_page + + firstlineheading = (request.form.get("first_line_heading") == "on") + + separator = request.form.get("separator", ",") + if separator == "other": + separator = request.form.get("other_separator", ",") + if not bool(separator): + flash("You need to provide a separator character.", "alert-error") + return samples_uploads_page + + quotechar = (request.form.get("field_delimiter", '"') or '"') + + redisuri = app.config["REDIS_URL"] + with Redis.from_url(redisuri, decode_responses=True) as rconn: + #TODO: Add a QC step here: what do we check? + # 1. 
Does any sample in the uploaded file exist within the database? + # If yes, what is/are its/their species and population? + # 2. If yes to 1. above, provide error with notes on which species and + # populations already own the samples. + the_job = jobs.launch_job( + jobs.initialise_job( + rconn, + jobs.jobsnamespace(), + str(uuid.uuid4()), + build_sample_upload_job( + species["SpeciesId"], + population["InbredSetId"], + samples_file, + separator, + firstlineheading, + quotechar), + "samples_upload", + app.config["JOBS_TTL_SECONDS"], + {"job_name": f"Samples Upload: {samples_file.name}"}), + redisuri, + f"{app.config['UPLOAD_FOLDER']}/job_errors") + return redirect(url_for( + "species.populations.samples.upload_status", + species_id=species_id, + population_id=population_id, + job_id=the_job["jobid"])) + + +@samplesbp.route("<int:species_id>/populations/<int:population_id>/" + "upload-samples/status/<uuid:job_id>", + methods=["GET"]) +@require_login +@with_population(species_redirect_uri="species.populations.samples.index", + redirect_uri="species.populations.samples.select_population") +def upload_status(species: dict, population: dict, job_id: uuid.UUID, **kwargs):# pylint: disable=[unused-argument] + """Check on the status of a samples upload job.""" + job = with_redis_connection(lambda rconn: jobs.job( + rconn, jobs.jobsnamespace(), job_id)) + if job: + status = job["status"] + if status == "success": + return render_template("samples/upload-success.html", + job=job, + species=species, + population=population,) + + if status == "error": + return redirect(url_for( + "species.populations.samples.upload_failure", job_id=job_id)) + + error_filename = Path(jobs.error_filename( + job_id, f"{app.config['UPLOAD_FOLDER']}/job_errors")) + if error_filename.exists(): + stat = os.stat(error_filename) + if stat.st_size > 0: + return redirect(url_for( + "species.populations.samples.upload_failure", job_id=job_id)) + + return render_template("samples/upload-progress.html", + species=species, + 
population=population, + job=job) # maybe also handle this? + + return render_template("no_such_job.html", + job_id=job_id, + species=species, + population=population), 400 + +@samplesbp.route("/upload/failure/<uuid:job_id>", methods=["GET"]) +@require_login +def upload_failure(job_id: uuid.UUID): + """Display the errors of the samples upload failure.""" + job = with_redis_connection(lambda rconn: jobs.job( + rconn, jobs.jobsnamespace(), job_id)) + if not bool(job): + return render_template("no_such_job.html", job_id=job_id), 400 + + error_filename = Path(jobs.error_filename( + job_id, f"{app.config['UPLOAD_FOLDER']}/job_errors")) + if error_filename.exists(): + stat = os.stat(error_filename) + if stat.st_size > 0: + return render_template("worker_failure.html", job_id=job_id) + + return render_template("samples/upload-failure.html", job=job) diff --git a/uploader/session.py b/uploader/session.py new file mode 100644 index 0000000..b538187 --- /dev/null +++ b/uploader/session.py @@ -0,0 +1,118 @@ +"""Deal with user sessions""" +from uuid import UUID, uuid4 +from datetime import datetime +from typing import Any, Optional, TypedDict + +from authlib.jose import KeySet +from flask import request, session +from pymonad.either import Left, Right, Either + + +class UserDetails(TypedDict): + """Session information relating specifically to the user.""" + user_id: UUID + name: str + email: str + token: Either + logged_in: bool + + +class SessionInfo(TypedDict): + """All Session information we save.""" + session_id: UUID + user: UserDetails + anon_id: UUID + user_agent: str + ip_addr: str + masquerade: Optional[UserDetails] + auth_server_jwks: Optional[dict[str, Any]] + + +__SESSION_KEY__ = "GN::uploader::session_info" # Do not use this outside this module!! 
+ + +def clear_session_info(): + """Clears the session.""" + session.pop(__SESSION_KEY__) + + +def save_session_info(sess_info: SessionInfo) -> SessionInfo: + """Save `session_info`.""" + # T0d0: if it is an existing session, verify that certain important security + # bits have not changed before saving. + # old_session_info = session.get(__SESSION_KEY__) + # if bool(old_session_info): + # if old_session_info["user_agent"] == request.headers.get("User-Agent"): + # session[__SESSION_KEY__] = sess_info + # return sess_info + # # request session verification + # return verify_session(sess_info) + # New session + session[__SESSION_KEY__] = sess_info + return sess_info + + +def session_info() -> SessionInfo: + """Retrieve the session information""" + anon_id = uuid4() + return save_session_info( + session.get(__SESSION_KEY__, { + "session_id": uuid4(), + "user": { + "user_id": anon_id, + "name": "Anonymous User", + "email": "anon@ymous.user", + "token": Left("INVALID-TOKEN"), + "logged_in": False + }, + "anon_id": anon_id, + "user_agent": request.headers.get("User-Agent"), + "ip_addr": request.environ.get("HTTP_X_FORWARDED_FOR", + request.remote_addr), + "masquerade": None + })) + + +def set_user_token(token: str) -> SessionInfo: + """Set the user's token.""" + info = session_info() + return save_session_info({ + **info, "user": {**info["user"], "token": Right(token)}})#type: ignore[misc] + + +def set_user_details(userdets: UserDetails) -> SessionInfo: + """Set the user details information""" + return save_session_info({**session_info(), "user": userdets})#type: ignore[misc] + +def user_details() -> UserDetails: + """Retrieve user details.""" + return session_info()["user"] + +def user_token() -> Either: + """Retrieve the user token.""" + return session_info()["user"]["token"] + + +def set_auth_server_jwks(keyset: KeySet) -> KeySet: + """Update the JSON Web Keys in the session.""" + save_session_info({ + **session_info(),# type: ignore[misc] + "auth_server_jwks": { + 
"last-updated": datetime.now().timestamp(), + "jwks": keyset.as_dict() + } + }) + return keyset + + +def toggle_token_refreshing(): + """Toggle the state of the token_refreshing variable.""" + _session = session_info() + return save_session_info({ + **_session, + "token_refreshing": not _session.get("token_refreshing", False)}) + + +def is_token_refreshing(): + """Returns whether the token is being refreshed or not.""" + return session_info().get("token_refreshing", False) diff --git a/uploader/species/__init__.py b/uploader/species/__init__.py new file mode 100644 index 0000000..83f2165 --- /dev/null +++ b/uploader/species/__init__.py @@ -0,0 +1,2 @@ +"""Package to handle creation and management of species.""" +from .views import speciesbp diff --git a/uploader/species/models.py b/uploader/species/models.py new file mode 100644 index 0000000..51f941c --- /dev/null +++ b/uploader/species/models.py @@ -0,0 +1,152 @@ +"""Database functions for species.""" +import math +from typing import Optional +from functools import reduce + +import MySQLdb as mdb +from MySQLdb.cursors import DictCursor + +def all_species(conn: mdb.Connection) -> tuple: + "Retrieve the species from the database." 
+ with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute( + "SELECT Id AS SpeciesId, SpeciesName, LOWER(Name) AS Name, " + "MenuName, FullName, TaxonomyId, Family, FamilyOrderId, OrderId " + "FROM Species ORDER BY FamilyOrderId ASC, OrderID ASC") + return tuple(cursor.fetchall()) + + return tuple() + +def order_species_by_family(species: tuple[dict, ...]) -> list: + """Order the species by their family""" + def __family_order_id__(item): + orderid = item["FamilyOrderId"] + return math.inf if orderid is None else orderid + def __order__(ordered, current): + _key = (__family_order_id__(current), current["Family"]) + return { + **ordered, + _key: ordered.get(_key, tuple()) + (current,) + } + ordered = reduce(__order__, species, {})# type: ignore[var-annotated] + return sorted(tuple(ordered.items()), key=lambda item: item[0][0]) + + +def species_by_id(conn: mdb.Connection, speciesid) -> dict: + "Retrieve the species from the database by id." + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute( + "SELECT Id AS SpeciesId, SpeciesName, LOWER(Name) AS Name, " + "MenuName, FullName, TaxonomyId, Family, FamilyOrderId, OrderId " + "FROM Species WHERE SpeciesId=%s", + (speciesid,)) + return cursor.fetchone() + + +def save_species(conn: mdb.Connection, + common_name: str, + scientific_name: str, + family: str, + taxon_id: Optional[str] = None) -> dict: + """ + Save a new species to the database. + + Parameters + ---------- + conn: A connection to the MariaDB database. + taxon_id: The taxonomy identifier for the new species. + common_name: The species' common name. + scientific_name: The species' scientific name. + """ + genus, species_name = scientific_name.split(" ") + families = species_families(conn) + with conn.cursor() as cursor: + cursor.execute("SELECT MAX(OrderId) FROM Species") + species = { + "common_name": common_name, + "common_name_lower": common_name.lower(), + "menu_name": f"{common_name} ({genus[0]}. 
{species_name.lower()})", + "scientific_name": scientific_name, + "family": family, + "family_order": families[family], + "taxon_id": taxon_id, + "species_order": cursor.fetchone()[0] + 5 + } + cursor.execute( + "INSERT INTO Species(" + "SpeciesName, Name, MenuName, FullName, Family, FamilyOrderId, " + "TaxonomyId, OrderId" + ") VALUES (" + "%(common_name)s, %(common_name_lower)s, %(menu_name)s, " + "%(scientific_name)s, %(family)s, %(family_order)s, %(taxon_id)s, " + "%(species_order)s" + ")", + species) + species_id = cursor.lastrowid + cursor.execute("UPDATE Species SET SpeciesId=%s WHERE Id=%s", + (species_id, species_id)) + return { + **species, + "species_id": species_id + } + + +def update_species(# pylint: disable=[too-many-arguments] + conn: mdb.Connection, + species_id: int, + common_name: str, + scientific_name: str, + family: str, + family_order: int, + species_order: int +): + """Update a species' details. + + Parameters + ---------- + conn: A connection to the MariaDB database. + species_id: The species identifier + + Key-Word Arguments + ------------------ + common_name: A layman's name for the species + scientific_name: A binomial nomenclature name for the species + family: The grouping under which the species falls + family_order: The ordering for the "family" above + species_order: The ordering of this species in relation to others + """ + with conn.cursor(cursorclass=DictCursor) as cursor: + genus, species_name = scientific_name.split(" ") + species = { + "species_id": species_id, + "common_name": common_name, + "common_name_lower": common_name.lower(), + "menu_name": f"{common_name} ({genus[0]}. 
{species_name.lower()})", + "scientific_name": scientific_name, + "family": family, + "family_order": family_order, + "species_order": species_order + } + cursor.execute( + "UPDATE Species SET " + "SpeciesName=%(common_name)s, " + "Name=%(common_name_lower)s, " + "MenuName=%(menu_name)s, " + "FullName=%(scientific_name)s, " + "Family=%(family)s, " + "FamilyOrderId=%(family_order)s, " + "OrderId=%(species_order)s " + "WHERE Id=%(species_id)s", + species) + + +def species_families(conn: mdb.Connection) -> dict: + """Retrieve the families under which species are grouped.""" + with conn.cursor(cursorclass=DictCursor) as cursor: + cursor.execute( + "SELECT DISTINCT(Family), FamilyOrderId FROM Species " + "WHERE Family IS NOT NULL") + return { + fam["Family"]: fam["FamilyOrderId"] + for fam in cursor.fetchall() + } diff --git a/uploader/species/views.py b/uploader/species/views.py new file mode 100644 index 0000000..10715a5 --- /dev/null +++ b/uploader/species/views.py @@ -0,0 +1,200 @@ +"""Endpoints handling species.""" +from pymonad.either import Left, Right, Either +from flask import (flash, + request, + url_for, + redirect, + Blueprint, + current_app as app) + +from uploader.population import popbp +from uploader.platforms import platformsbp +from uploader.ui import make_template_renderer +from uploader.db_utils import database_connection +from uploader.oauth2.client import oauth2_get, oauth2_post +from uploader.authorisation import require_login, require_token +from uploader.datautils import order_by_family, enumerate_sequence + +from .models import (all_species, + save_species, + species_by_id, + update_species, + species_families) + + +speciesbp = Blueprint("species", __name__) +speciesbp.register_blueprint(popbp, url_prefix="/") +speciesbp.register_blueprint(platformsbp, url_prefix="/") +render_template = make_template_renderer("species") + + +@speciesbp.route("/", methods=["GET"]) +@require_login +def list_species(): + """List and display all the species in the 
database.""" + with database_connection(app.config["SQL_URI"]) as conn: + return render_template("species/list-species.html", + allspecies=enumerate_sequence(all_species(conn))) + +@speciesbp.route("/<int:species_id>", methods=["GET"]) +@require_login +def view_species(species_id: int): + """View details of a particular species and menus to act upon it.""" + with database_connection(app.config["SQL_URI"]) as conn: + species = species_by_id(conn, species_id) + if bool(species): + return render_template("species/view-species.html", + species=species, + activelink="view-species") + flash("Could not find a species with the given identifier.", + "alert-danger") + return redirect(url_for("species.list_species")) + +@speciesbp.route("/create", methods=["GET", "POST"]) +@require_login +def create_species(): + """Create a new species.""" + # We can use uniprot's API to fetch the details with something like + # https://rest.uniprot.org/taxonomy/<taxonID> e.g. + # https://rest.uniprot.org/taxonomy/6239 + with (database_connection(app.config["SQL_URI"]) as conn, + conn.cursor() as cursor): + if request.method == "GET": + return render_template("species/create-species.html", + families=species_families(conn), + activelink="create-species") + + error = False + taxon_id = request.form.get("species_taxonomy_id", "").strip() or None + + common_name = request.form.get("common_name", "").strip() + if not bool(common_name): + flash("The common species name MUST be provided.", "alert-danger") + error = True + + scientific_name = request.form.get("scientific_name", "").strip() + if not bool(scientific_name): + flash("The species' scientific name MUST be provided.", + "alert-danger") + error = True + + parts = tuple(name.strip() for name in scientific_name.split(" ")) + if len(parts) != 2 or not all(bool(name) for name in parts): + flash("The scientific name you provided is invalid.", "alert-danger") + error = True + + cursor.execute( + "SELECT * FROM Species WHERE FullName=%s", 
(scientific_name,)) + res = cursor.fetchone() + if bool(res): + flash("A species already exists with the provided scientific name.", + "alert-danger") + error = True + + family = request.form.get("species_family", "").strip() + if not bool(family): + flash("The species' family MUST be selected.", "alert-danger") + error = True + + if bool(taxon_id): + cursor.execute( + "SELECT * FROM Species WHERE TaxonomyId=%s", (taxon_id,)) + res = cursor.fetchone() + if bool(res): + flash("A species already exists with the provided taxonomy ID.", + "alert-danger") + error = True + + if error: + return redirect(url_for("species.create_species", + common_name=common_name, + scientific_name=scientific_name, + taxon_id=taxon_id)) + + species = save_species( + conn, common_name, scientific_name, family, taxon_id) + flash("Species saved successfully!", "alert-success") + return redirect(url_for("species.view_species", species_id=species["species_id"])) + + +@speciesbp.route("/<int:species_id>/edit-extra", methods=["GET", "POST"]) +@require_login +@require_token +#def edit_species(species_id: int): +def edit_species_extra(token: dict, species_id: int):# pylint: disable=[unused-argument] + """Edit a species' details. + + Parameters + ---------- + token: A JWT token used for authorisation. + species_id: An identifier for the species being edited. 
+ """ + def __failure__(res): + app.logger.debug( + "There was an error in the attempt to edit the species: %s", res) + flash(res, "alert-danger") + return redirect(url_for("species.view_species", species_id=species_id)) + + def __system_resource_uuid__(resources) -> Either: + sys_res = [ + resource for resource in resources + if resource["resource_category"]["resource_category_key"] == "system" + ] + if len(sys_res) != 1: + return Left("Could not find/identify a valid system resource.") + return Right(sys_res[0]["resource_id"]) + + def __check_privileges__(authorisations): + if len(authorisations.items()) != 1: + return Left("Got authorisations for more than a single resource!") + + auths = tuple(authorisations.items())[0][1] + authorised = "system:species:edit-extra-info" in tuple( + privilege["privilege_id"] + for role in auths["roles"] + for privilege in role["privileges"]) + if authorised: + return Right(authorised) + return Left("You are not authorised to edit species extra details.") + + with database_connection(app.config["SQL_URI"]) as conn: + species = species_by_id(conn, species_id) + all_the_species = all_species(conn) + families = species_families(conn) + family_order = tuple( + item[0] for item in order_by_family(all_the_species) + if item[0][1] is not None) + if bool(species) and request.method == "GET": + return oauth2_get("auth/user/resources").then( + __system_resource_uuid__ + ).then( + lambda resource_id: oauth2_post( + "auth/resource/authorisation", + json={"resource-ids": [resource_id]}) + ).then(__check_privileges__).then( + lambda authorisations: render_template( + "species/edit-species.html", + species=species, + families=families, + family_order=family_order, + max_order_id = max( + row["OrderId"] for row in all_the_species + if row["OrderId"] is not None), + activelink="edit-species") + ).either(__failure__, lambda res: res) + + if bool(species) and request.method == "POST": + update_species(conn, + species_id, + 
request.form["species_name"], + request.form["species_fullname"], + request.form["species_family"], + int(request.form["species_familyorderid"]), + int(request.form["species_orderid"])) + flash("Updated species successfully.", "alert-success") + return redirect(url_for("species.edit_species_extra", + species_id=species_id)) + + flash("Species with the given identifier was not found!", + "alert-danger") + return redirect(url_for("species.list_species")) diff --git a/qc_app/static/css/custom-bootstrap.css b/uploader/static/css/custom-bootstrap.css index 67f1199..67f1199 100644 --- a/qc_app/static/css/custom-bootstrap.css +++ b/uploader/static/css/custom-bootstrap.css diff --git a/uploader/static/css/styles.css b/uploader/static/css/styles.css new file mode 100644 index 0000000..f482c1b --- /dev/null +++ b/uploader/static/css/styles.css @@ -0,0 +1,161 @@ +body { + margin: 0.7em; + box-sizing: border-box; + display: grid; + grid-template-columns: 1fr 6fr; + grid-template-rows: 5em 100%; + grid-gap: 20px; + + font-family: Georgia, Garamond, serif; + font-style: normal; +} + +#header { + grid-column: 1/3; + width: 100%; + /* background: cyan; */ + padding-top: 0.5em; + border-radius: 0.5em; + + background-color: #336699; + border-color: #080808; + color: #FFFFFF; + background-image: none; +} + +#header .header { + font-size: 2em; + display: inline-block; + text-align: start; +} + +#header .header-nav { + display: inline-block; + color: #FFFFFF; +} + +#header .header-nav li { + border-width: 1px; + border-color: #FFFFFF; + vertical-align: middle; + margin: 0.2em; + border-style: solid; + border-width: 2px; + border-radius: 0.5em; + text-align: center; +} + +#header .header-nav a { + color: #FFFFFF; + text-decoration: none; +} + +#nav-sidebar { + grid-column: 1/2; + /* background: #e5e5ff; */ + padding-top: 0.5em; + border-radius: 0.5em; + font-size: 1.2em; +} + +#main { + grid-column: 2/3; + width: 100%; + /* background: gray; */ + border-radius: 0.5em; +} + +.pagetitle { 
+ padding-top: 0.5em; + /* background: pink; */ + border-radius: 0.5em; + /* background-color: #6699CC; */ + /* background-color: #77AADD; */ + background-color: #88BBEE; +} + +.pagetitle h1 { + text-align: start; + text-transform: capitalize; + padding-left: 0.25em; +} + +.pagetitle .breadcrumb { + background: none; +} + +.pagetitle .breadcrumb .active a { + color: #333333; +} + +.pagetitle .breadcrumb a { + color: #666666; +} + +.main-content { + font-size: 1.275em; +} + +.breadcrumb { + text-transform: capitalize; +} + +dd { + margin-left: 3em; + font-size: 0.88em; + padding-bottom: 1em; +} + +input[type="submit"], .btn { + text-transform: capitalize; +} + +.card { + margin-top: 0.3em; + border-width: 1px; + border-style: solid; + border-radius: 0.3em; + border-color: #AAAAAA; + padding: 0.5em; +} + +.activemenu { + border-style: solid; + border-radius: 0.5em; + border-color: #AAAAAA; + background-color: #EFEFEF; +} + +.danger { + color: #A94442; + border-color: #DCA7A7; + background-color: #F2DEDE; +} + +.heading { + border-bottom: solid #EEBB88; +} + +.subheading { + padding: 1em 0 0.1em 0.5em; + border-bottom: solid #88BBEE; +} + +form { + margin-top: 0.3em; + background: #E5E5FF; + padding: 0.5em; + border-radius:0.5em; +} + +form .form-control { + background-color: #EAEAFF; +} + +.sidebar-content .card .card-title { + font-size: 1.5em; +} + +.sidebar-content .card-text table tbody td:nth-child(1) { + font-weight: bolder; +} diff --git a/qc_app/static/css/two-column-with-separator.css b/uploader/static/css/two-column-with-separator.css index b6efd46..b6efd46 100644 --- a/qc_app/static/css/two-column-with-separator.css +++ b/uploader/static/css/two-column-with-separator.css diff --git a/qc_app/static/images/CITGLogo.png b/uploader/static/images/CITGLogo.png Binary files differ index ae99fed..ae99fed 100644 --- a/qc_app/static/images/CITGLogo.png +++ b/uploader/static/images/CITGLogo.png diff --git a/uploader/static/js/misc.js b/uploader/static/js/misc.js new 
file mode 100644 index 0000000..cf7b39e --- /dev/null +++ b/uploader/static/js/misc.js @@ -0,0 +1,6 @@ +"Miscellaneous functions and event-handlers" + +$(".not-implemented").click((event) => { + event.preventDefault(); + alert("This feature is not implemented yet. Please bear with us."); +}); diff --git a/qc_app/static/js/select_platform.js b/uploader/static/js/select_platform.js index 4fdd865..4fdd865 100644 --- a/qc_app/static/js/select_platform.js +++ b/uploader/static/js/select_platform.js diff --git a/qc_app/static/js/upload_progress.js b/uploader/static/js/upload_progress.js index 9638b36..9638b36 100644 --- a/qc_app/static/js/upload_progress.js +++ b/uploader/static/js/upload_progress.js diff --git a/qc_app/static/js/upload_samples.js b/uploader/static/js/upload_samples.js index aed536f..aed536f 100644 --- a/qc_app/static/js/upload_samples.js +++ b/uploader/static/js/upload_samples.js diff --git a/qc_app/static/js/utils.js b/uploader/static/js/utils.js index 045dd47..045dd47 100644 --- a/qc_app/static/js/utils.js +++ b/uploader/static/js/utils.js diff --git a/uploader/templates/base.html b/uploader/templates/base.html new file mode 100644 index 0000000..019aa39 --- /dev/null +++ b/uploader/templates/base.html @@ -0,0 +1,132 @@ +<!DOCTYPE html> +<html lang="en"> + + <head> + + <meta charset="UTF-8" /> + <meta application-name="GeneNetwork Quality-Control Application" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0" /> + {%block extrameta%}{%endblock%} + + <title>GN Uploader: {%block title%}{%endblock%}</title> + + <link rel="stylesheet" type="text/css" + href="{{url_for('base.bootstrap', + filename='css/bootstrap.min.css')}}" /> + <link rel="stylesheet" type="text/css" + href="{{url_for('base.bootstrap', + filename='css/bootstrap-theme.min.css')}}" /> + <link rel="stylesheet" type="text/css" href="/static/css/styles.css" /> + + {%block css%}{%endblock%} + + </head> + + <body> + <header id="header" class="container-fluid"> + <div 
class="row"> + <span class="header col-lg-9">GeneNetwork Data Quality Control and Upload</span> + <nav class="header-nav col-lg-3"> + <ul class="nav justify-content-end"> + <li> + {%if user_logged_in()%} + <a href="{{url_for('oauth2.logout')}}" + title="Log out of the system">{{user_email()}} — Log Out</a> + {%else%} + <a href="{{authserver_authorise_uri()}}" + title="Log in to the system">Log In</a> + {%endif%} + </li> + </ul> + </nav> + </header> + + <aside id="nav-sidebar" class="container-fluid"> + <ul class="nav flex-column"> + <li {%if activemenu=="home"%}class="activemenu"{%endif%}> + <a href="/" >Home</a></li> + <li {%if activemenu=="species"%}class="activemenu"{%endif%}> + <a href="{{url_for('species.list_species')}}" + title="View and manage species information.">Species</a></li> + <li {%if activemenu=="platforms"%}class="activemenu"{%endif%}> + <a href="{{url_for('species.platforms.index')}}" + title="View and manage species platforms.">Sequencing Platforms</a></li> + <li {%if activemenu=="populations"%}class="activemenu"{%endif%}> + <a href="{{url_for('species.populations.index')}}" + title="View and manage species populations.">Populations</a></li> + <li {%if activemenu=="samples"%}class="activemenu"{%endif%}> + <a href="{{url_for('species.populations.samples.index')}}" + title="Upload population samples.">Samples</a></li> + <li {%if activemenu=="genotypes"%}class="activemenu"{%endif%}> + <a href="{{url_for('species.populations.genotypes.index')}}" + title="Upload Genotype data.">Genotype Data</a></li> + <!-- + TODO: Maybe include menus here for managing studies and dataset or + maybe have the studies/datasets managed under their respective + sections, e.g. "Publish*" studies/datasets under the "Phenotypes" + section, "ProbeSet*" studies/datasets under the "Expression Data" + sections, etc. 
+ --> + <li {%if activemenu=="phenotypes"%}class="activemenu"{%endif%}> + <a href="{{url_for('species.populations.phenotypes.index')}}" + title="Upload phenotype data.">Phenotype Data</a></li> + <li {%if activemenu=="expression-data"%}class="activemenu"{%endif%}> + <a href="{{url_for('species.populations.expression-data.index')}}" + title="Upload expression data.">Expression Data</a></li> + <li {%if activemenu=="individuals"%}class="activemenu"{%endif%}> + <a href="#" + class="not-implemented" + title="Upload individual data.">Individual Data</a></li> + <li {%if activemenu=="rna-seq"%}class="activemenu"{%endif%}> + <a href="#" + class="not-implemented" + title="Upload RNA-Seq data.">RNA-Seq Data</a></li> + <li {%if activemenu=="async-jobs"%}class="activemenu"{%endif%}> + <a href="#" + class="not-implemented" + title="View and manage the background jobs you have running"> + Background Jobs</a></li> + </ul> + </aside> + + <main id="main" class="main container-fluid"> + + <div class="pagetitle row"> + <h1>GN Uploader: {%block pagetitle%}{%endblock%}</h1> + <nav> + <ol class="breadcrumb"> + <li {%if activelink is not defined or activelink=="home"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('base.index')}}">Home</a> + </li> + {%block lvl1_breadcrumbs%}{%endblock%} + </ol> + </nav> + </div> + + <div class="row"> + <div class="container-fluid"> + <div class="col-md-8 main-content"> + {%block contents%}{%endblock%} + </div> + <div class="sidebar-content col-md-4"> + {%block sidebarcontents%}{%endblock%} + </div> + </div> + </div> + </main> + + + <script src="{{url_for('base.jquery', + filename='jquery.min.js')}}"></script> + <script src="{{url_for('base.bootstrap', + filename='js/bootstrap.min.js')}}"></script> + <script type="text/javascript" src="/static/js/misc.js"></script> + {%block javascript%}{%endblock%} + + </body> + +</html> diff --git a/qc_app/templates/cli-output.html 
b/uploader/templates/cli-output.html index 33fb73b..33fb73b 100644 --- a/qc_app/templates/cli-output.html +++ b/uploader/templates/cli-output.html diff --git a/qc_app/templates/continue_from_create_dataset.html b/uploader/templates/continue_from_create_dataset.html index 03bb49c..03bb49c 100644 --- a/qc_app/templates/continue_from_create_dataset.html +++ b/uploader/templates/continue_from_create_dataset.html diff --git a/qc_app/templates/continue_from_create_study.html b/uploader/templates/continue_from_create_study.html index 34e6e5e..34e6e5e 100644 --- a/qc_app/templates/continue_from_create_study.html +++ b/uploader/templates/continue_from_create_study.html diff --git a/qc_app/templates/dbupdate_error.html b/uploader/templates/dbupdate_error.html index e1359d2..e1359d2 100644 --- a/qc_app/templates/dbupdate_error.html +++ b/uploader/templates/dbupdate_error.html diff --git a/qc_app/templates/dbupdate_hidden_fields.html b/uploader/templates/dbupdate_hidden_fields.html index ccbc299..ccbc299 100644 --- a/qc_app/templates/dbupdate_hidden_fields.html +++ b/uploader/templates/dbupdate_hidden_fields.html diff --git a/qc_app/templates/errors_display.html b/uploader/templates/errors_display.html index 715cfcf..715cfcf 100644 --- a/qc_app/templates/errors_display.html +++ b/uploader/templates/errors_display.html diff --git a/uploader/templates/expression-data/base.html b/uploader/templates/expression-data/base.html new file mode 100644 index 0000000..d63fd7e --- /dev/null +++ b/uploader/templates/expression-data/base.html @@ -0,0 +1,13 @@ +{%extends "populations/base.html"%} + +{%block lvl3_breadcrumbs%} +<li {%if activelink=="expression-data"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.expression-data.index')}}"> + Expression Data</a> +</li> +{%block lvl4_breadcrumbs%}{%endblock%} +{%endblock%} diff --git a/qc_app/templates/data_review.html 
b/uploader/templates/expression-data/data-review.html index b7528fd..c985b03 100644 --- a/qc_app/templates/data_review.html +++ b/uploader/templates/expression-data/data-review.html @@ -26,7 +26,7 @@ <small class="text-muted"> If you encounter an error saying your sample(s)/case(s) do not exist in the GeneNetwork database, then you will have to use the - <a href="{{url_for('samples.select_species')}}" + <a href="{{url_for('species.populations.samples.index')}}" title="Upload samples/cases feature">Upload Samples/Cases</a> option on this system to upload them. </small> @@ -70,8 +70,8 @@ column</li> <li>The values of each field <strong>ARE NOT</strong> quoted.</li> <li>Here is an - <a href="https://gitlab.com/fredmanglis/gnqc_py/-/blob/main/tests/test_data/no_data_errors.tsv"> - example file</a> with a single data row.</li> + <a href="https://gitlab.com/fredmanglis/gnqc_py/-/blob/main/tests/test_data/no_data_errors.tsv" + target="_blank">example file</a> with a single data row.</li> </ul> </li> <li>.txt files: Content has the same format as .tsv file above</li> diff --git a/uploader/templates/expression-data/index.html b/uploader/templates/expression-data/index.html new file mode 100644 index 0000000..9ba3582 --- /dev/null +++ b/uploader/templates/expression-data/index.html @@ -0,0 +1,33 @@ +{%extends "expression-data/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-select-species.html" import select_species_form%} + +{%block title%}Expression Data{%endblock%} + +{%block pagetitle%}Expression Data{%endblock%} + +{%block breadcrumb%} +<li class="breadcrumb-item"> + <a href="{{url_for('base.index')}}">Home</a> +</li> +<li class="breadcrumb-item active"> + <a href="{{url_for('species.populations.expression-data.index')}}" + title="Upload expression data."> + Expression Data</a> +</li> +{%endblock%} + +{%block contents%} +<div class="row"> + <h2 class="heading">Expression Data</h2> + {{flash_all_messages()}} + + <p>This 
section allows you to enter the expression data for your experiment. + You will need to select the species that your data concerns below.</p> +</div> + +<div class="row"> + {{select_species_form(url_for("species.populations.expression-data.index"), + species)}} +</div> +{%endblock%} diff --git a/qc_app/templates/job_progress.html b/uploader/templates/expression-data/job-progress.html index 1af0763..ef264e1 100644 --- a/qc_app/templates/job_progress.html +++ b/uploader/templates/expression-data/job-progress.html @@ -1,5 +1,6 @@ {%extends "base.html"%} {%from "errors_display.html" import errors_display%} +{%from "populations/macro-display-population-card.html" import display_population_card%} {%block extrameta%} <meta http-equiv="refresh" content="5"> @@ -11,7 +12,9 @@ <h1 class="heading">{{job_name}}</h2> <div class="row"> - <form action="{{url_for('parse.abort')}}" method="POST"> + <form action="{{url_for('species.populations.expression-data.abort', + species_id=species.SpeciesId, + population_id=population.Id)}}" method="POST"> <legend class="heading">Status</legend> <div class="form-group"> <label for="job_status" class="form-label">status:</label> @@ -38,3 +41,7 @@ </div> {%endblock%} + +{%block sidebarcontents%} +{{display_population_card(species, population)}} +{%endblock%} diff --git a/qc_app/templates/no_such_job.html b/uploader/templates/expression-data/no-such-job.html index 42a2d48..d22c429 100644 --- a/qc_app/templates/no_such_job.html +++ b/uploader/templates/expression-data/no-such-job.html @@ -1,7 +1,8 @@ {%extends "base.html"%} {%block extrameta%} -<meta http-equiv="refresh" content="5;url={{url_for('entry.upload_file')}}"> +<meta http-equiv="refresh" + content="5;url={{url_for('species.populations.expression-data.index.upload_file')}}"> {%endblock%} {%block title%}No Such Job{%endblock%} diff --git a/qc_app/templates/parse_failure.html b/uploader/templates/expression-data/parse-failure.html index 31f6be8..31f6be8 100644 --- 
a/qc_app/templates/parse_failure.html +++ b/uploader/templates/expression-data/parse-failure.html diff --git a/uploader/templates/expression-data/parse-results.html b/uploader/templates/expression-data/parse-results.html new file mode 100644 index 0000000..03a23e2 --- /dev/null +++ b/uploader/templates/expression-data/parse-results.html @@ -0,0 +1,39 @@ +{%extends "base.html"%} +{%from "errors_display.html" import errors_display%} +{%from "populations/macro-display-population-card.html" import display_population_card%} + +{%block title%}Parse Results{%endblock%} + +{%block contents%} + +<div class="row"> + <h2 class="heading">{{job_name}}: parse results</h2> + + {%if user_aborted%} + <span class="alert-warning">Job aborted by the user</span> + {%endif%} + + {{errors_display(errors, "No errors found in the file", "We found the following errors", True)}} + + {%if errors | length == 0 and not user_aborted %} + <form method="post" action="{{url_for('dbinsert.select_platform')}}"> + <input type="hidden" name="job_id" value="{{job_id}}" /> + <input type="submit" value="update database" class="btn btn-primary" /> + </form> + {%endif%} + + {%if errors | length > 0 or user_aborted %} + <br /> + <a href="{{url_for('species.populations.expression-data.upload_file', + species_id=species.SpeciesId, + population_id=population.Id)}}" + title="Back to index page." 
+ class="btn btn-primary">Go back</a> + + {%endif%} +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_population_card(species, population)}} +{%endblock%} diff --git a/uploader/templates/expression-data/select-file.html b/uploader/templates/expression-data/select-file.html new file mode 100644 index 0000000..4ca461e --- /dev/null +++ b/uploader/templates/expression-data/select-file.html @@ -0,0 +1,115 @@ +{%extends "expression-data/base.html"%} +{%from "flash_messages.html" import flash_messages%} +{%from "upload_progress_indicator.html" import upload_progress_indicator%} +{%from "populations/macro-display-population-card.html" import display_population_card%} + +{%block title%}Expression Data — Upload Data{%endblock%} + +{%block pagetitle%}Expression Data — Upload Data{%endblock%} + +{%block contents%} +{{upload_progress_indicator()}} + +<div class="row"> + <h2 class="heading">Upload Expression Data</h2> + + <p>This feature enables you to upload expression data. It expects the data to + be in <strong>tab-separated values (TSV)</strong> files. The data should be + a simple matrix of <em>phenotype × sample</em>, i.e. The first column is a + list of the <em>phenotypes</em> and the first row is a list of + <em>samples/cases</em>.</p> + + <p>If you haven't done so please go to this page to learn the requirements for + file formats and helpful suggestions to enter your data in a fast and easy + way.</p> + + <ol> + <li><strong>PLEASE REVIEW YOUR DATA.</strong>Make sure your data complies + with our system requirements. 
( + <a href="{{url_for('species.populations.expression-data.data_review')}}#data-concerns" + title="Details for the data expectations.">Help</a> + )</li> + <li><strong>UPLOAD YOUR DATA FOR DATA VERIFICATION.</strong> We accept + <strong>.csv</strong>, <strong>.txt</strong> and <strong>.zip</strong> + files (<a href="{{url_for('species.populations.expression-data.data_review')}}#file-types" + title="Details for the data expectations.">Help</a>)</li> + </ol> +</div> + +<div class="row"> + <form action="{{url_for( + 'species.populations.expression-data.upload_file', + species_id=species.SpeciesId, + population_id=population.Id)}}" + method="POST" + enctype="multipart/form-data" + id="frm-upload-expression-data"> + {{flash_messages("error-expr-data")}} + + <div class="form-group"> + <legend class="heading">File Type</legend> + + <div class="radio"> + <label for="filetype_average" class="form-check-label"> + <input type="radio" name="filetype" value="average" id="filetype_average" + required="required" class="form-check-input" /> + Average</label> + <p class="form-text text-muted"> + <small>The averages data …</small></p> + </div> + + <div class="radio"> + <label for="filetype_standard_error" class="form-check-label"> + <input type="radio" name="filetype" value="standard-error" + id="filetype_standard_error" required="required" + class="form-check-input" /> + Standard Error + </label> + <p class="form-text text-muted"> + <small>The standard errors computed from the averages …</small></p> + </div> + </div> + + <div class="form-group"> + <span id="no-file-error" class="alert-danger" style="display: none;"> + No file selected + </span> + <label for="file_upload" class="form-label">Select File</label> + <input type="file" name="qc_text_file" id="file_upload" + accept="text/plain, text/tab-separated-values, application/zip" + class="form-control"/> + <p class="form-text text-muted"> + <small>Select the file to upload.</small></p> + </div> + + <button type="submit" + 
class="btn btn-primary" + data-toggle="modal" + data-target="#upload-progress-indicator">upload file</button> + </form> +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_population_card(species, population)}} +{%endblock%} + +{%block javascript%} +<script type="text/javascript" src="/static/js/upload_progress.js"></script> +<script type="text/javascript"> + function setup_formdata(form) { + var formdata = new FormData(); + formdata.append( + "qc_text_file", + form.querySelector("input[type='file']").files[0]); + formdata.append( + "filetype", + selected_filetype( + Array.from(form.querySelectorAll("input[type='radio']")))); + return formdata; + } + + setup_upload_handlers( + "frm-upload-expression-data", make_data_uploader(setup_formdata)); +</script> +{%endblock%} diff --git a/uploader/templates/expression-data/select-population.html b/uploader/templates/expression-data/select-population.html new file mode 100644 index 0000000..8555e27 --- /dev/null +++ b/uploader/templates/expression-data/select-population.html @@ -0,0 +1,29 @@ +{%extends "expression-data/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-display-species-card.html" import display_species_card%} +{%from "populations/macro-select-population.html" import select_population_form%} + +{%block title%}Expression Data{%endblock%} + +{%block pagetitle%}Expression Data{%endblock%} + + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <p>You have selected the species. 
Now you need to select the population that + the expression data belongs to.</p> +</div> + +<div class="row"> + {{select_population_form(url_for( + "species.populations.expression-data.select_population", + species_id=species.SpeciesId), + populations)}} +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_species_card(species)}} +{%endblock%} diff --git a/qc_app/templates/final_confirmation.html b/uploader/templates/final_confirmation.html index 0727fc8..0727fc8 100644 --- a/qc_app/templates/final_confirmation.html +++ b/uploader/templates/final_confirmation.html diff --git a/qc_app/templates/flash_messages.html b/uploader/templates/flash_messages.html index b7af178..b7af178 100644 --- a/qc_app/templates/flash_messages.html +++ b/uploader/templates/flash_messages.html diff --git a/uploader/templates/genotypes/base.html b/uploader/templates/genotypes/base.html new file mode 100644 index 0000000..1b274bf --- /dev/null +++ b/uploader/templates/genotypes/base.html @@ -0,0 +1,12 @@ +{%extends "populations/base.html"%} + +{%block lvl3_breadcrumbs%} +<li {%if activelink=="genotypes"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.genotypes.index')}}">Genotypes</a> +</li> +{%block lvl4_breadcrumbs%}{%endblock%} +{%endblock%} diff --git a/uploader/templates/genotypes/create-dataset.html b/uploader/templates/genotypes/create-dataset.html new file mode 100644 index 0000000..10331c1 --- /dev/null +++ b/uploader/templates/genotypes/create-dataset.html @@ -0,0 +1,82 @@ +{%extends "genotypes/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "populations/macro-display-population-card.html" import display_population_card%} + +{%block title%}Genotypes — Create Dataset{%endblock%} + +{%block pagetitle%}Genotypes — Create Dataset{%endblock%} + +{%block lvl4_breadcrumbs%} +<li {%if activelink=="create-dataset"%} + class="breadcrumb-item active" + {%else%} + 
class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.genotypes.create_dataset', + species_id=species.SpeciesId, + population_id=population.Id)}}">Create Dataset</a> +</li> +{%endblock%} + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <form id="frm-geno-create-dataset" + method="POST" + action="{{url_for('species.populations.genotypes.create_dataset', + species_id=species.SpeciesId, + population_id=population.Id)}}"> + <legend>Create a new Genotype Dataset</legend> + + <div class="form-group"> + <label for="txt-geno-dataset-name" class="form-label">Name</label> + <input type="text" + id="txt-geno-dataset-name" + name="geno-dataset-name" + required="required" + class="form-control" /> + <small class="form-text text-muted"> + <p>This is a short representative, but constrained name for the genotype + dataset.<br /> + The field will only accept letters ('A-Za-z'), numbers (0-9), hyphens + and underscores. Any other character will cause the name to be + rejected.</p></small> + </div> + + <div class="form-group"> + <label for="txt-geno-dataset-fullname" class="form-label">Full Name</label> + <input type="text" + id="txt-geno-dataset-fullname" + name="geno-dataset-fullname" + required="required" + class="form-control" /> + <small class="form-text text-muted"> + <p>This is a longer, more descriptive name for your dataset.</p></small> + </div> + + <div class="form-group"> + <label for="txt-geno-dataset-shortname" + class="form-label">Short Name</label> + <input type="text" + id="txt-geno-dataset-shortname" + name="geno-dataset-shortname" + class="form-control" /> + <small class="form-text text-muted"> + <p>A short name for your dataset. 
If you leave this field blank, the + short name will be set to the same value as the + "<strong>Name</strong>" field above.</p></small> + </div> + + <div class="form-group"> + <input type="submit" + class="btn btn-primary" + value="create dataset" /> + </div> + </form> +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_population_card(species, population)}} +{%endblock%} diff --git a/uploader/templates/genotypes/index.html b/uploader/templates/genotypes/index.html new file mode 100644 index 0000000..e749f5a --- /dev/null +++ b/uploader/templates/genotypes/index.html @@ -0,0 +1,28 @@ +{%extends "genotypes/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-select-species.html" import select_species_form%} + +{%block title%}Genotypes{%endblock%} + +{%block pagetitle%}Genotypes{%endblock%} + + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <p> + This section allows you to upload genotype information for your experiments, + in the case that you have not previously done so. + </p> + <p> + We'll need to link the genotypes to the species and population, so do please + go ahead and select those in the next two steps. 
+ </p> +</div> + +<div class="row"> + {{select_species_form(url_for("species.populations.genotypes.index"), + species)}} +</div> +{%endblock%} diff --git a/uploader/templates/genotypes/list-genotypes.html b/uploader/templates/genotypes/list-genotypes.html new file mode 100644 index 0000000..e4c39eb --- /dev/null +++ b/uploader/templates/genotypes/list-genotypes.html @@ -0,0 +1,148 @@ +{%extends "genotypes/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "populations/macro-display-population-card.html" import display_population_card%} + +{%block title%}Genotypes{%endblock%} + +{%block pagetitle%}Genotypes{%endblock%} + +{%block lvl4_breadcrumbs%} +<li {%if activelink=="list-genotypes"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.genotypes.list_genotypes', + species_id=species.SpeciesId, + population_id=population.Id)}}">List genotypes</a> +</li> +{%endblock%} + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <h2>Genetic Markers</h2> + <p>There are a total of {{total_markers}} currently registered genetic markers + for the "{{species.FullName}}" species. You can click + <a href="{{url_for('species.populations.genotypes.list_markers', + species_id=species.SpeciesId)}}" + title="View genetic markers for species '{{species.FullName}}'"> + this link to view the genetic markers + </a>. + </p> +</div> + +<div class="row"> + <h2>Genotype Encoding</h2> + <p> + The genotype encoding used for the "{{population.FullName}}" population from + the "{{species.FullName}}" species is as shown in the table below. 
+ </p> + <table class="table"> + + <thead> + <tr> + <th>Allele Type</th> + <th>Allele Symbol</th> + <th>Allele Value</th> + </tr> + </thead> + + <tbody> + {%for row in genocode%} + <tr> + <td>{{row.AlleleType}}</td> + <td>{{row.AlleleSymbol}}</td> + <td>{{row.DatabaseValue if row.DatabaseValue is not none else "NULL"}}</td> + </tr> + {%else%} + <tr> + <td colspan="7" class="text-info"> + <span class="glyphicon glyphicon-exclamation-sign"></span> + There is no explicit genotype encoding defined for this population. + </td> + </tr> + {%endfor%} + </tbody> + </table> + + {%if genocode | length < 1%} + <a href="#add-genotype-encoding" + title="Add a genotype encoding system for this population" + class="btn btn-primary"> + add genotype encoding + </a> + {%endif%} +</div> + +<div class="row text-danger"> + <h3>Some Important Concepts to Consider/Remember</h3> + <ul> + <li>Reference vs. Non-reference alleles</li> + <li>In <em>GenoCode</em> table, items are ordered by <strong>InbredSet</strong></li> + </ul> + <h3>Possible references</h3> + <ul> + <li>https://mr-dictionary.mrcieu.ac.uk/term/genotype/</li> + <li>https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7363099/</li> + </ul> +</div> + +<div class="row"> + <h2>Genotype Datasets</h2> + + <p>The genotype data is organised under various genotype datasets. 
You can + click on the link for the relevant dataset to view a little more information + about it.</p> + + {%if dataset is not none%} + <table class="table"> + <thead> + <tr> + <th>Name</th> + <th>Full Name</th> + </tr> + </thead> + + <tbody> + <tr> + <td>{{dataset.Name}}</td> + <td><a href="{{url_for('species.populations.genotypes.view_dataset', + species_id=species.SpeciesId, + population_id=population.Id, + dataset_id=dataset.Id)}}" + title="View details regarding and manage dataset '{{dataset.FullName}}'"> + {{dataset.FullName}}</a></td> + </tr> + </tbody> + </table> + {%else%} + <p class="text-warning"> + <span class="glyphicon glyphicon-exclamation-sign"></span> + There is no genotype dataset defined for this population. + </p> + <p> + <a href="{{url_for('species.populations.genotypes.create_dataset', + species_id=species.SpeciesId, + population_id=population.Id)}}" + title="Create a new genotype dataset for the '{{population.FullName}}' population for the '{{species.FullName}}' species." + class="btn btn-primary"> + create new genotype dataset</a></p> + {%endif%} +</div> +<div class="row text-warning"> + <p> + <span class="glyphicon glyphicon-exclamation-sign"></span> + <strong>NOTE</strong>: Currently the GN2 (and related) system(s) expect a + single genotype dataset. If there is more than one, the system apparently + fails in unpredictable ways. 
+ </p> + <p>Fix this to allow multiple datasets, each with a different assembly from + all the rest.</p> +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_population_card(species, population)}} +{%endblock%} diff --git a/uploader/templates/genotypes/list-markers.html b/uploader/templates/genotypes/list-markers.html new file mode 100644 index 0000000..9198b44 --- /dev/null +++ b/uploader/templates/genotypes/list-markers.html @@ -0,0 +1,102 @@ +{%extends "genotypes/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-display-species-card.html" import display_species_card%} + +{%block title%}Genotypes: List Markers{%endblock%} + +{%block pagetitle%}Genotypes: List Markers{%endblock%} + +{%block lvl4_breadcrumbs%} +<li {%if activelink=="list-markers"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.genotypes.list_markers', + species_id=species.SpeciesId)}}">List markers</a> +</li> +{%endblock%} + +{%block contents%} +{{flash_all_messages()}} + +{%if markers | length > 0%} +<div class="row"> + <p> + There are a total of {{total_markers}} genotype markers for this species. 
+ </p> + <div class="row"> + <div class="col-md-2" style="text-align: start;"> + {%if start_from > 0%} + <a href="{{url_for('species.populations.genotypes.list_markers', + species_id=species.SpeciesId, + start_from=start_from-count, + count=count)}}"> + <span class="glyphicon glyphicon-backward"></span> + Previous + </a> + {%endif%} + </div> + <div class="col-md-8" style="text-align: center;"> + Displaying markers {{start_from+1}} to {{start_from+count if start_from+count < total_markers else total_markers}} of + {{total_markers}} + </div> + <div class="col-md-2" style="text-align: end;"> + {%if start_from + count < total_markers%} + <a href="{{url_for('species.populations.genotypes.list_markers', + species_id=species.SpeciesId, + start_from=start_from+count, + count=count)}}"> + Next + <span class="glyphicon glyphicon-forward"></span> + </a> + {%endif%} + </div> + </div> + <table class="table"> + <thead> + <tr> + <th title="">#</th> + <th title="">Marker Name</th> + <th title="Chromosome">Chr</th> + <th title="Physical location of the marker in megabasepairs"> + Location (Mb)</th> + <th title="">Source</th> + <th title="">Source2</th> + </thead> + + <tbody> + {%for marker in markers%} + <tr> + <td>{{marker.sequence_number}}</td> + <td>{{marker.Marker_Name}}</td> + <td>{{marker.Chr}}</td> + <td>{{marker.Mb}}</td> + <td>{{marker.Source}}</td> + <td>{{marker.Source2}}</td> + </tr> + {%endfor%} + </tbody> + </table> +</div> +{%else%} +<div class="row"> + <p class="text-warning"> + <span class="glyphicon glyphicon-exclamation-sign"></span> + This species does not currently have any genetic markers uploaded, therefore, + there is nothing to display here. 
+ </p> + <p> + <a href="#add-genetic-markers-for-species-{{species.SpeciesId}}" + title="Add genetic markers for this species" + class="btn btn-primary"> + add genetic markers + </a> + </p> +</div> +{%endif%} +{%endblock%} + +{%block sidebarcontents%} +{{display_species_card(species)}} +{%endblock%} diff --git a/uploader/templates/genotypes/select-population.html b/uploader/templates/genotypes/select-population.html new file mode 100644 index 0000000..7c81943 --- /dev/null +++ b/uploader/templates/genotypes/select-population.html @@ -0,0 +1,31 @@ +{%extends "genotypes/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-display-species-card.html" import display_species_card%} +{%from "populations/macro-select-population.html" import select_population_form%} + +{%block title%}Genotypes{%endblock%} + +{%block pagetitle%}Genotypes{%endblock%} + + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <p> + You have indicated that you intend to upload the genotypes for species + '{{species.FullName}}'. We now just require the population for your + experiment/study, and you should be good to go. 
+ </p> +</div> + +<div class="row"> + {{select_population_form(url_for("species.populations.genotypes.select_population", + species_id=species.SpeciesId), + populations)}} +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_species_card(species)}} +{%endblock%} diff --git a/uploader/templates/genotypes/view-dataset.html b/uploader/templates/genotypes/view-dataset.html new file mode 100644 index 0000000..e7ceb36 --- /dev/null +++ b/uploader/templates/genotypes/view-dataset.html @@ -0,0 +1,61 @@ +{%extends "genotypes/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "populations/macro-display-population-card.html" import display_population_card%} + +{%block title%}Genotypes: View Dataset{%endblock%} + +{%block pagetitle%}Genotypes: View Dataset{%endblock%} + +{%block lvl4_breadcrumbs%} +<li {%if activelink=="view-dataset"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.genotypes.view_dataset', + species_id=species.SpeciesId, + population_id=population.Id, + dataset_id=dataset.Id)}}">view dataset</a> +</li> +{%endblock%} + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <h2>Genotype Dataset Details</h2> + <table class="table"> + <thead> + <tr> + <th>Name</th> + <th>Full Name</th> + </tr> + </thead> + + <tbody> + <tr> + <td>{{dataset.Name}}</td> + <td>{{dataset.FullName}}</td> + </tr> + </tbody> + </table> +</div> + +<div class="row text-warning"> + <h2>Assembly Details</h2> + + <p>Maybe include the assembly details here if found to be necessary.</p> +</div> + +<div class="row"> + <h2>Genotype Data</h2> + + <p class="text-danger"> + Provide link to enable uploading of genotype data here.</p> +</div> + +{%endblock%} + +{%block sidebarcontents%} +{{display_population_card(species, population)}} +{%endblock%} diff --git a/qc_app/templates/http-error.html b/uploader/templates/http-error.html index 374fb86..374fb86 100644 --- 
a/qc_app/templates/http-error.html +++ b/uploader/templates/http-error.html diff --git a/uploader/templates/index.html b/uploader/templates/index.html new file mode 100644 index 0000000..d6f57eb --- /dev/null +++ b/uploader/templates/index.html @@ -0,0 +1,99 @@ +{%extends "base.html"%} +{%from "flash_messages.html" import flash_all_messages%} + +{%block title%}Home{%endblock%} + +{%block pagetitle%}Home{%endblock%} + +{%block contents%} + +<div class="row"> + {{flash_all_messages()}} + <div class="explainer"> + <p>Welcome to the <strong>GeneNetwork Data Quality Control and Upload System</strong>. This system is provided to help in uploading your data onto GeneNetwork where you can do analysis on it.</p> + + <p>The sections below provide an overview of what features the menu items on + the left provide to you. Please peruse the information to get a good + big-picture understanding of what the system provides you and how to get + the most out of it.</p> + + {%block extrapageinfo%}{%endblock%} + + <h2>Species</h2> + + <p>The GeneNetwork service provides datasets and tools for doing genetic + studies — from + <a href="{{gn2server_intro}}" + target="_blank" + title="GeneNetwork introduction — opens in a new tab."> + its introduction</a>: + + <blockquote class="blockquote"> + <p>GeneNetwork is a group of linked data sets and tools used to study + complex networks of genes, molecules, and higher order gene function + and phenotypes. …</p> + </blockquote> + </p> + + <p>With this in mind, it follows that the data in the system is centered + aroud a variety of species. 
The <strong>species section</strong> will + list the currently available species in the system, and give you the + ability to add new ones, if the one you want to work on does not currently + exist on GeneNetwork</p> + + <h2>Populations</h2> + + <p>Your studies will probably focus on a particular subset of the entire + species you are interested in – your population.</p> + <p>Populations are a way to organise the species data so as to link data to + specific know populations for a particular species, e.g. The BXD + population of mice (Mus musculus)</p> + <p>In older GeneNetwork documentation, you might run into the term + <em>InbredSet</em>. Should you run into it, it is a term that we've + deprecated that essentially just means the population.</p> + + <h2>Samples</h2> + + <p>These are the samples or individuals (sometimes cases) that were involved + in the experiment, and from whom the data was derived.</p> + + <h2>Genotype Data</h2> + + <p>This section will allow you to view and upload the genetic markers for + your species, and the genotype encodings used for your particular + population.</p> + <p>While, technically, genetic markers relate to the species in general, and + not to a particular population, the data (allele information) itself + relates to the particular population it was generated from – + specifically, to the actual individuals used in the experiment.</p> + <p>This is the reason why the genotype data information comes under the + population, and will check for the prior existence of the related + samples/individuals before attempting an upload of your data.</p> + + <h2>Expression Data</h2> + + <p class="text-danger"> + <span class="glyphicon glyphicon-exclamation-sign"></span> + <strong>TODO</strong>: Document this …</p> + + <h2>Phenotype Data</h2> + + <p class="text-danger"> + <span class="glyphicon glyphicon-exclamation-sign"></span> + <strong>TODO</strong>: Document this …</p> + + <h2>Individual Data</h2> + + <p class="text-danger"> + <span 
class="glyphicon glyphicon-exclamation-sign"></span> + <strong>TODO</strong>: Document this …</p> + + <h2>RNA-Seq Data</h2> + + <p class="text-danger"> + <span class="glyphicon glyphicon-exclamation-sign"></span> + <strong>TODO</strong>: Document this …</p> + </div> +</div> + +{%endblock%} diff --git a/qc_app/templates/insert_error.html b/uploader/templates/insert_error.html index 5301288..5301288 100644 --- a/qc_app/templates/insert_error.html +++ b/uploader/templates/insert_error.html diff --git a/qc_app/templates/insert_progress.html b/uploader/templates/insert_progress.html index 52177d6..52177d6 100644 --- a/qc_app/templates/insert_progress.html +++ b/uploader/templates/insert_progress.html diff --git a/qc_app/templates/insert_success.html b/uploader/templates/insert_success.html index 7e1fa8d..7e1fa8d 100644 --- a/qc_app/templates/insert_success.html +++ b/uploader/templates/insert_success.html diff --git a/uploader/templates/login.html b/uploader/templates/login.html new file mode 100644 index 0000000..1f71416 --- /dev/null +++ b/uploader/templates/login.html @@ -0,0 +1,11 @@ +{%extends "index.html"%} + +{%block title%}Data Upload{%endblock%} + +{%block pagetitle%}log in{%endblock%} + +{%block extrapageinfo%} +<p class="text-dark text-primary"> + You <strong>do need to be logged in</strong> to upload data onto this system. 
+ Please do that by clicking the "Log In" button at the top of the page.</p> +{%endblock%} diff --git a/uploader/templates/macro-table-pagination.html b/uploader/templates/macro-table-pagination.html new file mode 100644 index 0000000..292c531 --- /dev/null +++ b/uploader/templates/macro-table-pagination.html @@ -0,0 +1,26 @@ +{%macro table_pagination(start_at, page_count, total_count, base_uri, name)%} +{%set ns = namespace(forward_uri=base_uri, back_uri=base_uri)%} +{%set ns.forward_uri="brr"%} + <div class="row"> + <div class="col-md-2" style="text-align: start;"> + {%if start_at > 0%} + <a href="{{base_uri + + '?start_at='+((start_at-page_count)|string) + + '&count='+(page_count|string)}}"> + <span class="glyphicon glyphicon-backward"></span> + Previous + </a> + {%endif%} + </div> + <div class="col-md-8" style="text-align: center;"> + Displaying {{name}} {{start_at+1}} to {{start_at+page_count if start_at+page_count < total_count else total_count}} of {{total_count}}</div> + <div class="col-md-2" style="text-align: end;"> + {%if start_at + page_count < total_count%} + <a href="{{base_uri + + '?start_at='+((start_at+page_count)|string) + + '&count='+(page_count|string)}}"> + Next<span class="glyphicon glyphicon-forward"></span></a> + {%endif%} + </div> + </div> +{%endmacro%} diff --git a/uploader/templates/phenotypes/add-phenotypes.html b/uploader/templates/phenotypes/add-phenotypes.html new file mode 100644 index 0000000..196bc69 --- /dev/null +++ b/uploader/templates/phenotypes/add-phenotypes.html @@ -0,0 +1,231 @@ +{%extends "phenotypes/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "macro-table-pagination.html" import table_pagination%} +{%from "phenotypes/macro-display-pheno-dataset-card.html" import display_pheno_dataset_card%} + +{%block title%}Phenotypes{%endblock%} + +{%block pagetitle%}Phenotypes{%endblock%} + +{%block lvl4_breadcrumbs%} +<li {%if activelink=="add-phenotypes"%} + class="breadcrumb-item active" + {%else%} 
+ class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.phenotypes.add_phenotypes', + species_id=species.SpeciesId, + population_id=population.Id, + dataset_id=dataset.Id)}}">View Datasets</a> +</li> +{%endblock%} + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <form id="frm-add-phenotypes" + method="POST" + enctype="multipart/form-data" + action="{{url_for('species.populations.phenotypes.add_phenotypes', + species_id=species.SpeciesId, + population_id=population.Id, + dataset_id=dataset.Id)}}"> + <legend>Add New Phenotypes</legend> + + <div class="form-text help-block"> + <p>Select the zip file bundle containing information on the phenotypes you + wish to upload, then click the "Upload Phenotypes" button below to + upload the data.</p> + <p>See the <a href="#section-file-formats">File Formats</a> section below + to get an understanding of what is expected of the bundle files you + upload.</p> + <p><strong>This will not update any existing phenotypes!</strong></p> + </div> + + <div class="form-group"> + <label for="finput-phenotypes-bundle" class="form-label"> + Phenotypes Bundle</label> + <input type="file" + id="finput-phenotypes-bundle" + name="phenotypes-bundle" + accept="application/zip, .zip" + required="required" + class="form-control" /> + </div> + + <div class="form-group"> + <input type="submit" + value="upload phenotypes" + class="btn btn-primary" /> + </div> + </form> +</div> + +<div class="row"> + <h2 class="heading" id="section-file-formats">File Formats</h2> + <p>We accept an extended form of the + <a href="https://kbroman.org/qtl2/assets/vignettes/input_files.html#format-of-the-data-files" + title="R/qtl2 software input file format documentation"> + input files' format used with the R/qtl2 software</a> as a single ZIP + file</p> + <p>The files that are used for this feature are: + <ul> + <li>the <em>control</em> file</li> + <li><em>pheno</em> file(s)</li> + <li><em>phenocovar</em> file(s)</li> + 
<li><em>phenose</em> files(s)</li> + </ul> + </p> + <p>Other files within the bundle will be ignored, for this feature.</p> + <p>The following section will detail the expectations for each of the + different file types within the uploaded ZIP file bundle for phenotypes:</p> + + <h3 class="subheading">Control File</h3> + <p>There <strong>MUST be <em>one, and only one</em></strong> file that acts + as the control file. This file can be: + <ul> + <li>a <em>JSON</em> file, or</li> + <li>a <em>YAML</em> file.</li> + </ul> + </p> + + <p>The control file is useful for defining things about the bundle such as:</p> + <ul> + <li>The field separator value (default: <code>sep: ','</code>). There can + only ever be one field separator and it <strong>MUST</strong> be the same + one for <strong>ALL</strong> files in the bundle.</li> + <li>The comment character (default: <code>comment.char: '#'</code>). Any + line that starts with this character will be considered a comment line and + be ignored in its entirety.</li> + <li>Code for missing values (default: <code>na.strings: 'NA'</code>). You + can specify more than one code to indicate missing values, e.g. + <code>{…, "na.strings": ["NA", "N/A", "-"], …}</code></li> + </ul> + + <h3 class="subheading"><em>pheno</em> File(s)</h3> + <p>These files are the main data files. 
You must have at least one of these + files in your bundle for it to be valid for this step.</p> + <p>The data is a matrix of <em>individuals × phenotypes</em> by default, as + below:<br /> + <code> + id,10001,10002,10003,10004,…<br /> + BXD1,61.400002,54.099998,483,49.799999,…<br /> + BXD2,49,50.099998,403,45.5,…<br /> + BXD5,62.5,53.299999,501,62.900002,…<br /> + BXD6,53.099998,55.099998,403,NA,…<br /> + ⋮<br /></code> + </p> + <p>If the <code>pheno_transposed</code> value is set to <code>True</code>, + then the data will be a <em>phenotypes × individuals</em> matrix as in the + example below:<br /> + <code> + id,BXD1,BXD2,BXD5,BXD6,…<br /> + 10001,61.400002,49,62.5,53.099998,…<br /> + 10002,54.099998,50.099998,53.299999,55.099998,…<br /> + 10003,483,403,501,403,…<br /> + 10004,49.799999,45.5,62.900002,NA,…<br /> + ⋮ + </code> + </p> + + + <h3 class="subheading"><em>phenocovar</em> File(s)</h3> + <p>At least one phenotypes metadata file with the metadata values such as + descriptions, PubMed Identifier, publication titles (if present), etc.</p> + <p>The data in this/these file(s) is a matrix of + <em>phenotypes × phenotypes-covariates</em>. The first column is always the + phenotype names/identifiers — same as in the R/qtl2 format.</p> + <p><em>phenocovar</em> files <strong>should never be transposed</strong>!</p> + <p>This file <strong>MUST</strong> be present in the bundle, and have data for + the bundle to be considered valid by our system for this step.<br /> + In addition to that, the following are the fields that <strong>must be + present</strong>, and + have values, in the file before the file is considered valid: + <ul> + <li><em>description</em>: A description for each phenotype. Useful + for users to know what the phenotype is about.</li> + <li><em>units</em>: The units of measurement for the phenotype, + e.g. 
milligrams for brain weight, centimetres/millimetres for + tail-length, etc.</li> + </ul></p> + + <p>The following <em>optional</em> fields can also be provided: + <ul> + <li><em>pubmedid</em>: A PubMed Identifier for the publication where + the phenotype is published. If this field is not provided, the system will + assume your phenotype is not published.</li> + </ul> + </p> + <p>These files will be marked up in the control file with the + <code>phenocovar</code> key, as in the examples below: + <ol> + <li>JSON: single file<br /> + <code>{<br /> + ⋮,<br /> + "phenocovar": "your_covariates_file.csv",<br /> + ⋮<br /> + } + </code> + </li> + <li>JSON: multiple files<br /> + <code>{<br /> + ⋮,<br /> + "phenocovar": [<br /> + "covariates_file_01.csv",<br /> + "covariates_file_01.csv",<br /> + ⋮<br /> + ],<br /> + ⋮<br /> + } + </code> + </li> + <li>YAML: single file or<br /> + <code> + ⋮<br /> + phenocovar: your_covariates_file.csv<br /> + ⋮ + </code> + </li> + <li>YAML: multiple files<br /> + <code> + ⋮<br /> + phenocovar:<br /> + - covariates_file_01.csv<br /> + - covariates_file_02.csv<br /> + - covariates_file_03.csv<br /> + …<br /> + ⋮ + </code> + </li> + </ol> + </p> + + <h3 class="subheading"><em>phenose</em> and <em>phenonum</em> File(s)</h3> + <p>These are extensions to the R/qtl2 standard, i.e. these types ofs file are + not supported by the original R/qtl2 file format</p> + <p>We use these files to upload the standard errors (<em>phenose</em>) when + the data file (<em>pheno</em>) is average data. In that case, the + <em>phenonum</em> file(s) contains the number of individuals that were + involved when computing the averages.</p> + <p>Both types of files are matrices of <em>individuals × phenotypes</em> by + default. 
Like the related <em>pheno</em> files, if + <code>pheno_transposed: True</code>, then the file will be a matrix of + <em>phenotypes × individuals</em>.</p> +</div> + +<div class="row text-warning"> + <h3 class="subheading">Notes for Devs (well… Fred, really.)</h3> + <p>Use the following resources for automated retrieval of certain data</p> + <ul> + <li><a href="https://www.ncbi.nlm.nih.gov/pmc/tools/developers/" + title="NCBI APIs: Retrieve articles' metadata etc."> + NCBI APIS</a></li> + </ul> +</div> + +{%endblock%} + +{%block sidebarcontents%} +{{display_pheno_dataset_card(species, population, dataset)}} +{%endblock%} diff --git a/uploader/templates/phenotypes/base.html b/uploader/templates/phenotypes/base.html new file mode 100644 index 0000000..3bc5dea --- /dev/null +++ b/uploader/templates/phenotypes/base.html @@ -0,0 +1,12 @@ +{%extends "populations/base.html"%} + +{%block lvl3_breadcrumbs%} +<li {%if activelink=="phenotypes"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.phenotypes.index')}}">Phenotypes</a> +</li> +{%block lvl4_breadcrumbs%}{%endblock%} +{%endblock%} diff --git a/uploader/templates/phenotypes/create-dataset.html b/uploader/templates/phenotypes/create-dataset.html new file mode 100644 index 0000000..93de92f --- /dev/null +++ b/uploader/templates/phenotypes/create-dataset.html @@ -0,0 +1,106 @@ +{%extends "phenotypes/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "macro-table-pagination.html" import table_pagination%} +{%from "populations/macro-display-population-card.html" import display_population_card%} + +{%block title%}Phenotypes{%endblock%} + +{%block pagetitle%}Phenotypes{%endblock%} + +{%block lvl4_breadcrumbs%} +<li {%if activelink=="create-dataset"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.phenotypes.create_dataset', + 
species_id=species.SpeciesId, + population_id=population.Id)}}">Create Datasets</a> +</li> +{%endblock%} + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <p>Create a new phenotype dataset.</p> +</div> + +<div class="row"> + <form id="frm-create-pheno-dataset" + action="{{url_for('species.populations.phenotypes.create_dataset', + species_id=species.SpeciesId, + population_id=population.Id)}}" + method="POST"> + + <div class="form-group"> + <label class="form-label" for="txt-dataset-name">Name</label> + {%if errors["dataset-name"] is defined%} + <small class="form-text text-muted danger"> + <p>{{errors["dataset-name"]}}</p></small> + {%endif%} + <input type="text" + name="dataset-name" + id="txt-dataset-name" + value="{{original_formdata.get('dataset-name') or (population.InbredSetCode + 'Publish')}}" + {%if errors["dataset-name"] is defined%} + class="form-control danger" + {%else%} + class="form-control" + {%endif%} + required="required" /> + <small class="form-text text-muted"> + <p>A short representative name for the dataset.</p> + <p>Recommended: Use the population code and append "Publish" at the end. + <br />This field will only accept names composed of + letters ('A-Za-z'), numbers (0-9), hyphens and underscores.</p> + </small> + </div> + + <div class="form-group"> + <label class="form-label" for="txt-dataset-fullname">Full Name</label> + {%if errors["dataset-fullname"] is defined%} + <small class="form-text text-muted danger"> + <p>{{errors["dataset-fullname"]}}</p></small> + {%endif%} + <input id="txt-dataset-fullname" + name="dataset-fullname" + type="text" + value="{{original_formdata.get('dataset-fullname', '')}}" + {%if errors["dataset-fullname"] is defined%} + class="form-control danger" + {%else%} + class="form-control" + {%endif%} + required="required" /> + <small class="form-text text-muted"> + <p>A longer, descriptive name for the dataset — useful for humans. 
+ </p></small> + </div> + + <div class="form-group"> + <label class="form-label" for="txt-dataset-shortname">Short Name</label> + <input id="txt-dataset-shortname" + name="dataset-shortname" + type="text" + class="form-control" + value="{{original_formdata.get('dataset-shortname') or (population.InbredSetCode + ' Publish')}}" /> + <small class="form-text text-muted"> + <p>An optional, short name for the dataset. <br /> + If this is not provided, it will default to the value provided for the + <strong>Name</strong> field above.</p></small> + </div> + + <div class="form-group"> + <input type="submit" + class="btn btn-primary" + value="create phenotype dataset" /> + </div> + + </form> +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_population_card(species, population)}} +{%endblock%} diff --git a/uploader/templates/phenotypes/index.html b/uploader/templates/phenotypes/index.html new file mode 100644 index 0000000..0c691e6 --- /dev/null +++ b/uploader/templates/phenotypes/index.html @@ -0,0 +1,26 @@ +{%extends "phenotypes/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-select-species.html" import select_species_form%} + +{%block title%}Phenotypes{%endblock%} + +{%block pagetitle%}Phenotypes{%endblock%} + + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <p>This section deals with phenotypes that + <span class="text-warning"> + <span class="glyphicon glyphicon-exclamation-sign"></span> + … what are the characteristics of these phenotypes? 
…</span></p> + <p>Select the species to begin the process of viewing/uploading data about + your phenotypes</p> +</div> + +<div class="row"> + {{select_species_form(url_for("species.populations.phenotypes.index"), + species)}} +</div> +{%endblock%} diff --git a/uploader/templates/phenotypes/list-datasets.html b/uploader/templates/phenotypes/list-datasets.html new file mode 100644 index 0000000..2eaf43a --- /dev/null +++ b/uploader/templates/phenotypes/list-datasets.html @@ -0,0 +1,65 @@ +{%extends "phenotypes/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "populations/macro-display-population-card.html" import display_population_card%} + +{%block title%}Phenotypes{%endblock%} + +{%block pagetitle%}Phenotypes{%endblock%} + +{%block lvl4_breadcrumbs%} +<li {%if activelink=="list-datasets"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.phenotypes.list_datasets', + species_id=species.SpeciesId, + population_id=population.Id)}}">List Datasets</a> +</li> +{%endblock%} + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + {%if datasets | length > 0%} + <p>The dataset(s) available for this population is/are:</p> + + <table class="table"> + <thead> + <tr> + <th>Name</th> + <th>Full Name</th> + <th>Short Name</th> + </tr> + </thead> + + <tbody> + {%for dataset in datasets%} + <tr> + <td><a href="{{url_for('species.populations.phenotypes.view_dataset', + species_id=species.SpeciesId, + population_id=population.Id, + dataset_id=dataset.Id)}}">{{dataset.Name}}</a></td> + <td>{{dataset.FullName}}</td> + <td>{{dataset.ShortName}}</td> + </tr> + {%endfor%} + </tbody> + </table> + {%else%} + <p class="text-warning"> + <span class="glyphicon glyphicon-exclamation-sign"></span> + There is no dataset for this population!</p> + <p><a href="{{url_for('species.populations.phenotypes.create_dataset', + species_id=species.SpeciesId, + 
population_id=population.Id)}}" + class="btn btn-primary" + title="Create a new phenotype dataset.">create dataset</a></p> + {%endif%} +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_population_card(species, population)}} +{%endblock%} diff --git a/uploader/templates/phenotypes/macro-display-pheno-dataset-card.html b/uploader/templates/phenotypes/macro-display-pheno-dataset-card.html new file mode 100644 index 0000000..11b108b --- /dev/null +++ b/uploader/templates/phenotypes/macro-display-pheno-dataset-card.html @@ -0,0 +1,31 @@ +{%from "populations/macro-display-population-card.html" import display_population_card%} + +{%macro display_pheno_dataset_card(species, population, dataset)%} +{{display_population_card(species, population)}} + +<div class="card"> + <div class="card-body"> + <h5 class="card-title">Phenotypes' Dataset</h5> + <div class="card-text"> + <table class="table"> + <tbody> + <tr> + <td>Name</td> + <td>{{dataset.Name}}</td> + </tr> + + <tr> + <td>Full Name</td> + <td>{{dataset.FullName}}</td> + </tr> + + <tr> + <td>Short Name</td> + <td>{{dataset.ShortName}}</td> + </tr> + </tbody> + </table> + </div> + </div> +</div> +{%endmacro%} diff --git a/uploader/templates/phenotypes/select-population.html b/uploader/templates/phenotypes/select-population.html new file mode 100644 index 0000000..eafd4a7 --- /dev/null +++ b/uploader/templates/phenotypes/select-population.html @@ -0,0 +1,28 @@ +{%extends "phenotypes/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-display-species-card.html" import display_species_card%} +{%from "populations/macro-select-population.html" import select_population_form%} + +{%block title%}Phenotypes{%endblock%} + +{%block pagetitle%}Phenotypes{%endblock%} + + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <p>Select the population for your phenotypes to view and manage the phenotype + datasets that relate to it.</p> +</div> + +<div class="row"> + 
{{select_population_form(url_for("species.populations.phenotypes.select_population", + species_id=species.SpeciesId), + populations)}} +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_species_card(species)}} +{%endblock%} diff --git a/uploader/templates/phenotypes/view-dataset.html b/uploader/templates/phenotypes/view-dataset.html new file mode 100644 index 0000000..b136bb6 --- /dev/null +++ b/uploader/templates/phenotypes/view-dataset.html @@ -0,0 +1,96 @@ +{%extends "phenotypes/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "macro-table-pagination.html" import table_pagination%} +{%from "populations/macro-display-population-card.html" import display_population_card%} + +{%block title%}Phenotypes{%endblock%} + +{%block pagetitle%}Phenotypes{%endblock%} + +{%block lvl4_breadcrumbs%} +<li {%if activelink=="view-dataset"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.phenotypes.view_dataset', + species_id=species.SpeciesId, + population_id=population.Id, + dataset_id=dataset.Id)}}">View Datasets</a> +</li> +{%endblock%} + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <p>The basic dataset details are:</p> + + <table class="table"> + <thead> + <tr> + <th>Name</th> + <th>Full Name</th> + <th>Short Name</th> + </tr> + </thead> + + <tbody> + <tr> + <td>{{dataset.Name}}</td> + <td>{{dataset.FullName}}</td> + <td>{{dataset.ShortName}}</td> + </tr> + </tbody> + </table> +</div> + +<div class="row"> + <p><a href="{{url_for('species.populations.phenotypes.add_phenotypes', + species_id=species.SpeciesId, + population_id=population.Id, + dataset_id=dataset.Id)}}" + title="Add a bunch of phenotypes" + class="btn btn-primary">Add phenotypes</a></p> +</div> + +<div class="row"> + <h2>Phenotype Data</h2> + + <p>This dataset has a total of {{phenotype_count}} phenotypes.</p> + + {{table_pagination(start_from, count, phenotype_count, 
url_for('species.populations.phenotypes.view_dataset', species_id=species.SpeciesId, population_id=population.Id, dataset_id=dataset.Id), "phenotypes")}} + + <table class="table"> + <thead> + <tr> + <th>#</th> + <th>Record</th> + <th>Description</th> + </tr> + </thead> + + <tbody> + {%for pheno in phenotypes%} + <tr> + <td>{{pheno.sequence_number}}</td> + <td><a href="{{url_for('species.populations.phenotypes.view_phenotype', + species_id=species.SpeciesId, + population_id=population.Id, + dataset_id=dataset.Id, + xref_id=pheno['pxr.Id'])}}" + title="View phenotype details"> + {{pheno.InbredSetCode}}_{{pheno["pxr.Id"]}}</a></td> + <td>{{pheno.Post_publication_description or pheno.Pre_publication_abbreviation or pheno.Original_description}}</td> + </tr> + {%else%} + <tr><td colspan="5"></td></tr> + {%endfor%} + </tbody> + </table> +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_population_card(species, population)}} +{%endblock%} diff --git a/uploader/templates/phenotypes/view-phenotype.html b/uploader/templates/phenotypes/view-phenotype.html new file mode 100644 index 0000000..99bb8e5 --- /dev/null +++ b/uploader/templates/phenotypes/view-phenotype.html @@ -0,0 +1,126 @@ +{%extends "phenotypes/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "populations/macro-display-population-card.html" import display_population_card%} + +{%block title%}Phenotypes{%endblock%} + +{%block pagetitle%}Phenotypes{%endblock%} + +{%block lvl4_breadcrumbs%} +<li {%if activelink=="view-phenotype"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.phenotypes.view_phenotype', + species_id=species.SpeciesId, + population_id=population.Id, + dataset_id=dataset.Id, + xref_id=xref_id)}}">View Datasets</a> +</li> +{%endblock%} + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <div class="panel panel-default"> + <div class="panel-heading"><strong>Basic 
Phenotype Details</strong></div> + + <table class="table"> + <tbody> + <tr> + <td><strong>Phenotype</strong></td> + <td>{{phenotype.Post_publication_description or phenotype.Pre_publication_abbreviation or phenotype.Original_description}}</td> + </tr> + <tr> + <td><strong>Cross-Reference ID</strong></td> + <td>{{phenotype.xref_id}}</td> + </tr> + <tr> + <td><strong>Collation</strong></td> + <td>{{dataset.FullName}}</td> + </tr> + <tr> + <td><strong>Units</strong></td> + <td>{{phenotype.Units}}</td> + </tr> + </tbody> + </table> + + <form action="#edit-delete-phenotype" + method="POST" + id="frm-delete-phenotype"> + + <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> + <input type="hidden" name="population_id" value="{{population.Id}}" /> + <input type="hidden" name="dataset_id" value="{{dataset.Id}}" /> + <input type="hidden" name="phenotype_id" value="{{phenotype.Id}}" /> + + <div class="btn-group btn-group-justified"> + <div class="btn-group"> + {%if "group:resource:edit-resource" in privileges%} + <input type="submit" + title="Edit the values for the phenotype. This is meant to be used when you need to update only a few values." + class="btn btn-primary not-implemented" + value="edit" /> + {%endif%} + </div> + <div class="btn-group"></div> + <div class="btn-group"> + {%if "group:resource:delete-resource" in privileges%} + <input type="submit" + title="Delete the entire phenotype. This is useful when you need to change data for most or all of the fields for this phenotype." 
+ class="btn btn-danger not-implemented" + value="delete" /> + {%endif%} + </div> + </div> + </form> + </div> +</div> + +<div class="row"> + <div class="panel panel-default"> + <div class="panel-heading"><strong>Phenotype Data</strong></div> + {%if "group:resource:view-resource" in privileges%} + <table class="table"> + <thead> + <tr> + <th>#</th> + <th>Sample</th> + <th>Value</th> + <th>Symbol</th> + <th>SE</th> + <th>N</th> + </tr> + </thead> + + <tbody> + {%for item in phenotype.data%} + <tr> + <td>{{loop.index}}</td> + <td>{{item.StrainName}}</td> + <td>{{item.value}}</td> + <td>{{item.Symbol or "-"}}</td> + <td>{{item.error or "-"}}</td> + <td>{{item.count or "-"}}</td> + </tr> + {%endfor%} + </tbody> + </table> + {%else%} + <p class="text-danger"> + <span class="glyphicon glyphicon-exclamation-sign"></span> + You do not currently have privileges to view this phenotype in greater + detail. + </p> + {%endif%} + </div> +</div> + +{%endblock%} + +{%block sidebarcontents%} +{{display_population_card(species, population)}} +{%endblock%} diff --git a/uploader/templates/platforms/base.html b/uploader/templates/platforms/base.html new file mode 100644 index 0000000..dac965f --- /dev/null +++ b/uploader/templates/platforms/base.html @@ -0,0 +1,13 @@ +{%extends "species/base.html"%} + +{%block lvl3_breadcrumbs%} +<li {%if activelink=="platforms"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.platforms.index')}}"> + Sequencing Platforms</a> +</li> +{%block lvl4_breadcrumbs%}{%endblock%} +{%endblock%} diff --git a/uploader/templates/platforms/create-platform.html b/uploader/templates/platforms/create-platform.html new file mode 100644 index 0000000..0866d5e --- /dev/null +++ b/uploader/templates/platforms/create-platform.html @@ -0,0 +1,124 @@ +{%extends "platforms/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-display-species-card.html" import 
display_species_card%} + +{%block title%}Platforms — Create Platforms{%endblock%} + +{%block pagetitle%}Platforms — Create Platforms{%endblock%} + +{%block lvl3_breadcrumbs%} +<li {%if activelink=="create-platform"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.platforms.create_platform', + species_id=species.SpeciesId)}}">create platform</a> +</li> +{%endblock%} + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <h2>Create New Platform</h2> + + <p>You can create a new genetic sequencing platform below.</p> +</div> + +<div class="row"> + <form id="frm-create-platform" + method="POST" + action="{{url_for('species.platforms.create_platform', + species_id=species.SpeciesId)}}"> + + <div class="form-group"> + <label for="txt-geo-platform" class="form-label">GEO Platform</label> + <input type="text" + id="txt-geo-platform" + name="geo-platform" + required="required" + class="form-control" /> + <small class="form-text text-muted"> + <p>This is the platform's + <a href="https://www.ncbi.nlm.nih.gov/geo/browse/?view=platforms&tax={{species.TaxonomyId}}" + title="Platforms for '{{species.FullName}}' on NCBI"> + accession value on NCBI</a>. 
If you do not know the value, click the + link and search on NCBI for species '{{species.FullName}}'.</p></small> + </div> + + <div class="form-group"> + <label for="txt-platform-name" class="form-label">Platform Name</label> + <input type="text" + id="txt-platform-name" + name="platform-name" + required="required" + class="form-control" /> + <small class="form-text text-muted"> + <p>This is the name of the genetic sequencing platform.</p></small> + </div> + + <div class="form-group"> + <label for="txt-platform-shortname" class="form-label"> + Platform Short Name</label> + <input type="text" + id="txt-platform-shortname" + name="platform-shortname" + required="required" + class="form-control" /> + <small class="form-text text-muted"> + <p>Use the following conventions for this field: + <ol> + <li>Start with a 4-letter vendor code, e.g. "Affy" for "Affymetrix", "Illu" for "Illumina", etc.</li> + <li>Append an underscore to the 4-letter vendor code</li> + <li>Use the name of the array given by the vendor, e.g. U74AV2, MOE430A, etc.</li> + </ol> + </p> + </small> + </div> + + <div class="form-group"> + <label for="txt-platform-title" class="form-label">Platform Title</label> + <input type="text" + id="txt-platform-title" + name="platform-title" + required="required" + class="form-control" /> + <small class="form-text text-muted"> + <p>The full platform title. Sometimes, this is the same as the Platform + Name above.</p></small> + </div> + + <div class="form-group"> + <label for="txt-go-tree-value" class="form-label">GO Tree Value</label> + <input type="text" + id="txt-go-tree-value" + name="go-tree-value" + class="form-control" /> + <small class="form-text text-muted"> + <p>This is a Chip identification value useful for analysis with the + <strong> + <a href="https://www.geneweaver.org/" + title="Go to the GeneWeaver site." 
+ target="_blank">GeneWeaver</a></strong> + and + <strong> + <a href="https://www.webgestalt.org/" + title="Go to the WEB-based GEne SeT AnaLysis Toolkit site." + target="_blank">WebGestalt</a></strong> + tools.<br /> + This can be left blank for custom platforms.</p></small> + </div> + + <div class="form-group"> + <input type="submit" + value="create new platform" + class="btn btn-primary" /> + </div> + </form> +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_species_card(species)}} +{%endblock%} diff --git a/uploader/templates/platforms/index.html b/uploader/templates/platforms/index.html new file mode 100644 index 0000000..35b6464 --- /dev/null +++ b/uploader/templates/platforms/index.html @@ -0,0 +1,21 @@ +{%extends "platforms/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-select-species.html" import select_species_form%} + +{%block title%}Platforms{%endblock%} + +{%block pagetitle%}Platforms{%endblock%} + + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <p>In this section, you will be able to view and manage the sequencing + platforms that are currently supported by GeneNetwork.</p> +</div> + +<div class="row"> + {{select_species_form(url_for("species.platforms.index"), species)}} +</div> +{%endblock%} diff --git a/uploader/templates/platforms/list-platforms.html b/uploader/templates/platforms/list-platforms.html new file mode 100644 index 0000000..718dd1d --- /dev/null +++ b/uploader/templates/platforms/list-platforms.html @@ -0,0 +1,93 @@ +{%extends "platforms/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-display-species-card.html" import display_species_card%} + +{%block title%}Platforms — List Platforms{%endblock%} + +{%block pagetitle%}Platforms — List Platforms{%endblock%} + + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <p>View the list of the genetic sequencing platforms that are currently + supported 
by GeneNetwork.</p> + <p>If you cannot find the platform you wish to use, you can add it by clicking + the "Create Platform" button below.</p> + <p><a href="{{url_for('species.platforms.create_platform', + species_id=species.SpeciesId)}}" + title="Create a new genetic sequencing platform for species {{species.FullName}}" + class="btn btn-primary">Create Platform</a></p> +</div> + +<div class="row"> + <h2>Supported Platforms</h2> + {%if platforms is defined and platforms | length > 0%} + <p>There are {{total_platforms}} platforms supported by GeneNetwork</p> + + <div class="row"> + <div class="col-md-2" style="text-align: start;"> + {%if start_from > 0%} + <a href="{{url_for('species.platforms.list_platforms', + species_id=species.SpeciesId, + start_from=start_from-count, + count=count)}}"> + <span class="glyphicon glyphicon-backward"></span> + Previous + </a> + {%endif%} + </div> + <div class="col-md-8" style="text-align: center;"> + Displaying platforms {{start_from+1}} to {{start_from+count if start_from+count < total_platforms else total_platforms}} of + {{total_platforms}} + </div> + <div class="col-md-2" style="text-align: end;"> + {%if start_from + count < total_platforms%} + <a href="{{url_for('species.platforms.list_platforms', + species_id=species.SpeciesId, + start_from=start_from+count, + count=count)}}"> + Next + <span class="glyphicon glyphicon-forward"></span> + </a> + {%endif%} + </div> + </div> + + <table class="table"> + <thead> + <tr> + <th>#</th> + <th>Platform Name</th> + <th><a href="https://www.ncbi.nlm.nih.gov/geo/browse/?view=platforms&tax={{species.TaxonomyId}}" + title="Gene Expression Omnibus: Platforms section" + target="_blank">GEO Platform</a></th> + <th>Title</th> + </tr> + </thead> + + <tbody> + {%for platform in platforms%} + <tr> + <td>{{platform.sequence_number}}</td> + <td>{{platform.GeneChipName}}</td> + <td><a href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc={{platform.GeoPlatform}}" + title="View platform on the Gene 
Expression Omnibus" + target="_blank">{{platform.GeoPlatform}}</a></td> + <td>{{platform.Title}}</td> + </tr> + {%endfor%} + </tbody> + </table> + {%else%} + <p class="text-warning"> + <span class="glyphicon glyphicon-exclamation-sign"></span> + There are no platforms supported at this time!</p> + {%endif%} +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_species_card(species)}} +{%endblock%} diff --git a/uploader/templates/populations/base.html b/uploader/templates/populations/base.html new file mode 100644 index 0000000..d763fc1 --- /dev/null +++ b/uploader/templates/populations/base.html @@ -0,0 +1,12 @@ +{%extends "species/base.html"%} + +{%block lvl2_breadcrumbs%} +<li {%if activelink=="populations"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.index')}}">Populations</a> +</li> +{%block lvl3_breadcrumbs%}{%endblock%} +{%endblock%} diff --git a/uploader/templates/populations/create-population.html b/uploader/templates/populations/create-population.html new file mode 100644 index 0000000..b05ce37 --- /dev/null +++ b/uploader/templates/populations/create-population.html @@ -0,0 +1,252 @@ +{%extends "populations/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-select-species.html" import select_species_form%} +{%from "species/macro-display-species-card.html" import display_species_card%} + +{%block title%}Create Population{%endblock%} + +{%block pagetitle%}Create Population{%endblock%} + +{%block lvl3_breadcrumbs%} +<li {%if activelink=="create-population"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.create_population', + species_id=species.SpeciesId)}}">create population</a> +</li> +{%endblock%} + + +{%block contents%} +<div class="row"> + <p>The population is the next hierarchical node under Species. 
Data is grouped under a specific population, under a particular species.</p> + <p> + This page enables you to create a new population, in the case that you + cannot find the population you want in the + <a + href="{{url_for('species.populations.list_species_populations', + species_id=species.SpeciesId)}}" + title="Population for species '{{species.FullName}}'."> + list of species populations + </a> + </p> +</div> + +<div class="row"> + <form method="POST" + action="{{url_for('species.populations.create_population', + species_id=species.SpeciesId)}}"> + + <legend>Create Population</legend> + + {{flash_all_messages()}} + + <div {%if errors.population_fullname%} + class="form-group has-error" + {%else%} + class="form-group" + {%endif%}> + <label for="txt-population-fullname" class="form-label">Full Name</label> + {%if errors.population_fullname%} + <small class="form-text text-danger">{{errors.population_fullname}}</small> + {%endif%} + <input type="text" + id="txt-population-fullname" + name="population_fullname" + required="required" + minLength="3" + maxLength="100" + value="{{error_values.population_fullname or ''}}" + class="form-control" /> + <small class="form-text text-muted"> + <p> + This is a descriptive name for your population — useful for + humans. + </p> + </small> + </div> + + <div {%if errors.population_name%} + class="form-group has-error" + {%else%} + class="form-group" + {%endif%}> + <label for="txt-population-name" class="form-label">Name</label> + {%if errors.population_name%} + <small class="form-text text-danger">{{errors.population_name}}</small> + {%endif%} + <input type="text" + id="txt-population-name" + name="population_name" + required="required" + minLength="3" + maxLength="30" + value="{{error_values.population_name or ''}}" + class="form-control" /> + <small class="form-text text-muted"> + <p> + This is a short representative, but constrained name for your + population. 
+ <br /> + The field will only accept letters ('A-Za-z'), numbers (0-9), hyphens + and underscores. Any other character will cause the name to be + rejected. + </p> + </small> + </div> + + <div class="form-group"> + <label for="txt-population-code" class="form-label">Population Code</label> + <input type="text" + id="txt-population-code" + name="population_code" + maxLength="5" + minLength="3" + value="{{error_values.population_code or ''}}" + class="form-control" /> + <small class="form-text text-muted"> + <p class="text-danger"> + <span class="glyphicon glyphicon-exclamation-sign"></span> + What is this field for? Confirm with Arthur and the rest. + </p> + </small> + </div> + + <div {%if errors.population_description%} + class="form-group has-error" + {%else%} + class="form-group" + {%endif%}> + <label for="txt-population-description" class="form-label"> + Description + </label> + {%if errors.population_description%} + <small class="form-text text-danger">{{errors.population_description}}</small> + {%endif%} + <textarea + id="txt-population-description" + name="population_description" + required="required" + class="form-control" + rows="5">{{error_values.population_description or ''}}</textarea> + <small class="form-text text-muted"> + <p> + This is a more detailed description for your population. This is + useful to communicate with other researchers some details regarding + your population, and what it's about. + <br /> + Put, here, anything that describes your population but does not go + cleanly under metadata. 
+ </p> + </small> + </div> + + <div {%if errors.population_family%} + class="form-group has-error" + {%else%} + class="form-group" + {%endif%}> + <label for="select-population-family" class="form-label">Family</label> + <select id="select-population-family" + name="population_family" + class="form-control" + required="required"> + <option value="">Please select a family</option> + {%for family in families%} + <option value="{{family}}" + {%if error_values.population_family == family%} + selected="selected" + {%endif%}>{{family}}</option> + {%endfor%} + </select> + <small class="form-text text-muted"> + <p> + This is a rough grouping of the populations in GeneNetwork into lists + of common types of populations. + </p> + </small> + </div> + + <div {%if errors.population_mapping_method_id%} + class="form-group has-error" + {%else%} + class="form-group" + {%endif%}> + <label for="select-population-mapping-methods" + class="form-label">Mapping Methods</label> + + <select id="select-population-mapping-methods" + name="population_mapping_method_id" + class="form-control" + required="required"> + <option value="">Select appropriate mapping methods</option> + {%for mmethod in mapping_methods%} + <option value="{{mmethod.id}}" + {%if error_values.population_mapping_method_id == mmethod.id%} + selected="selected" + {%endif%}>{{mmethod.value}}</option> + {%endfor%} + </select> + + <small class="form-text text-muted"> + <p>Select the mapping methods that your population will support.</p> + </small> + </div> + + <div {%if errors.population_genetic_type%} + class="form-group has-error" + {%else%} + class="form-group" + {%endif%}> + <label for="select-population-genetic-type" + class="form-label">Genetic Type</label> + <select id="select-population-genetic-type" + name="population_genetic_type" + class="form-control"> + <option value="">Select proper genetic type</option> + {%for gtype in genetic_types%} + <option value="{{gtype}}" + {%if error_values.population_genetic_type == 
gtype%} + selected="selected" + {%endif%}>{{gtype}}</option> + {%endfor%} + </select> + <small class="form-text text-muted text-danger"> + <p> + <span class="glyphicon glyphicon-exclamation-sign"></span> + This might be a poorly named field. + </p> + <p> + It probably has more to do with the mating crosses/crossings used to + produce the individuals in the population. I am no biologist, however, + and I'm leaving this here to remind myself to confirm this. + </p> + <p> + I still don't know what riset is.<br /> + … probably something to do with Recombinant Inbred Strains + </p> + <p> + Possible resources for this: + <ul> + <li>https://www.informatics.jax.org/silver/chapters/3-2.shtml</li> + <li>https://www.informatics.jax.org/silver/chapters/9-2.shtml</li> + </ul> + </p> + </small> + </div> + + <div class="form-group"> + <input type="submit" + value="create population" + class="btn btn-primary" /> + </div> + + </form> +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_species_card(species)}} +{%endblock%} diff --git a/uploader/templates/populations/index.html b/uploader/templates/populations/index.html new file mode 100644 index 0000000..4354e02 --- /dev/null +++ b/uploader/templates/populations/index.html @@ -0,0 +1,24 @@ +{%extends "populations/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-select-species.html" import select_species_form%} + +{%block title%}Populations{%endblock%} + +{%block pagetitle%}Populations{%endblock%} + + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <p> + Your experiment data will relate to a particular population from a + particular species. Let us know what species it is you want to work with + below. 
+ </p> +</div> + +<div class="row"> + {{select_species_form(url_for("species.populations.index"), species)}} +</div> +{%endblock%} diff --git a/uploader/templates/populations/list-populations.html b/uploader/templates/populations/list-populations.html new file mode 100644 index 0000000..7c7145f --- /dev/null +++ b/uploader/templates/populations/list-populations.html @@ -0,0 +1,93 @@ +{%extends "populations/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-select-species.html" import select_species_form%} +{%from "species/macro-display-species-card.html" import display_species_card%} + +{%block title%}Populations{%endblock%} + +{%block pagetitle%}Populations{%endblock%} + +{%block lvl3_breadcrumbs%} +<li {%if activelink=="list-populations"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.list_species_populations', + species_id=species.SpeciesId)}}">List populations</a> +</li> +{%endblock%} + + +{%block contents%} +{{flash_all_messages()}} +<div class="row"> + <p> + The following populations/groups exist for the '{{species.FullName}}' + species. + </p> + <p> + Click on the population's name to select and continue using the population. + </p> +</div> + +<div class="row"> + <p> + If the population you need for the species '{{species.FullName}}' does not + exist, click on the "Create Population" button below to create a new one. + </p> + <p> + <a href="{{url_for('species.populations.create_population', + species_id=species.SpeciesId)}}" + title="Create a new population for species '{{species.FullName}}'." 
+ class="btn btn-danger"> + Create Population + </a> + </p> +</div> + +<div class="row"> + <table class="table"> + <caption>Populations for {{species.FullName}}</caption> + <thead> + <tr> + <th>#</th> + <th>Name</th> + <th>Full Name</th> + <th>Description</th> + </tr> + </thead> + + <tbody> + {%for population in populations%} + <tr> + <td>{{population["sequence_number"]}}</td> + <td> + <a href="{{url_for('species.populations.view_population', + species_id=species.SpeciesId, + population_id=population.InbredSetId)}}" + title="Population '{{population.FullName}}' for species '{{species.FullName}}'."> + {{population.Name}} + </a> + </td> + <td>{{population.FullName}}</td> + <td>{{population.Description}}</td> + </tr> + {%else%} + <tr> + <td colspan="4"> + <p class="text-danger"> + <span class="glyphicon glyphicon-exclamation-sign"></span> + There were no populations found for {{species.FullName}}! + </p> + </td> + </tr> + {%endfor%} + </tbody> + </table> +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_species_card(species)}} +{%endblock%} diff --git a/uploader/templates/populations/macro-display-population-card.html b/uploader/templates/populations/macro-display-population-card.html new file mode 100644 index 0000000..79f7925 --- /dev/null +++ b/uploader/templates/populations/macro-display-population-card.html @@ -0,0 +1,46 @@ +{%from "species/macro-display-species-card.html" import display_species_card%} + +{%macro display_population_card(species, population)%} +{{display_species_card(species)}} + +<div class="card"> + <div class="card-body"> + <h5 class="card-title">Population</h5> + <div class="card-text"> + <table class="table"> + <tbody> + <tr> + <td>Name</td> + <td>{{population.Name}}</td> + </tr> + + <tr> + <td>Full Name</td> + <td>{{population.FullName}}</td> + </tr> + + <tr> + <td>Code</td> + <td>{{population.InbredSetCode}}</td> + </tr> + + <tr> + <td>Genetic Type</td> + <td>{{population.GeneticType}}</td> + </tr> + + <tr> + <td>Family</td> + 
<td>{{population.Family}}</td> + </tr> + + <tr> + <td>Description</td> + <td>{{(population.Description or "")[0:500]}}…</td> + </tr> + </tbody> + </table> + </div> + </div> +</div> +{%endmacro%} diff --git a/uploader/templates/populations/macro-select-population.html b/uploader/templates/populations/macro-select-population.html new file mode 100644 index 0000000..af4fd3a --- /dev/null +++ b/uploader/templates/populations/macro-select-population.html @@ -0,0 +1,30 @@ +{%macro select_population_form(form_action, populations)%} +<form method="GET" action="{{form_action}}"> + <legend>Select Population</legend> + + <div class="form-group"> + <label for="select-population" class="form-label">Select Population</label> + <select id="select-population" + name="population_id" + class="form-control" + required="required"> + <option value="">Select Population</option> + {%for family in populations%} + <optgroup {%if family[0][1] is not none%} + label="{{family[0][1]}}" + {%else%} + label="Undefined" + {%endif%}> + {%for population in family[1]%} + <option value="{{population.Id}}">{{population.FullName}}</option> + {%endfor%} + </optgroup> + {%endfor%} + </select> + </div> + + <div class="form-group"> + <input type="submit" value="Select" class="btn btn-primary" /> + </div> +</form> +{%endmacro%} diff --git a/qc_app/templates/rqtl2/create-tissue-success.html b/uploader/templates/populations/rqtl2/create-tissue-success.html index 5f2c5a0..d6fe154 100644 --- a/qc_app/templates/rqtl2/create-tissue-success.html +++ b/uploader/templates/populations/rqtl2/create-tissue-success.html @@ -56,7 +56,7 @@ <form id="frm-create-tissue-success-continue" method="POST" - action="{{url_for('upload.rqtl2.select_dataset_info', + action="{{url_for('expression-data.rqtl2.select_dataset_info', species_id=species.SpeciesId, population_id=population.InbredSetId)}}" style="display: inline; width: 100%; grid-column: 1 / 2; @@ -85,7 +85,7 @@ <div class="row"> <form 
id="frm-create-tissue-success-select-existing" method="POST" - action="{{url_for('upload.rqtl2.select_tissue', + action="{{url_for('expression-data.rqtl2.select_tissue', species_id=species.SpeciesId, population_id=population.InbredSetId)}}" style="display: inline; width: 100%; grid-column: 3 / 4; diff --git a/uploader/templates/populations/rqtl2/index.html b/uploader/templates/populations/rqtl2/index.html new file mode 100644 index 0000000..ec6ffb8 --- /dev/null +++ b/uploader/templates/populations/rqtl2/index.html @@ -0,0 +1,54 @@ +{%extends "base.html"%} +{%from "flash_messages.html" import flash_messages%} + +{%block title%}Data Upload{%endblock%} + +{%block contents%} +<h1 class="heading">R/qtl2 data upload</h1> + +<h2>R/qtl2 Upload</h2> + +<div class="row"> + <form method="POST" action="{{url_for('expression-data.rqtl2.select_species')}}" + id="frm-rqtl2-upload"> + <legend class="heading">upload R/qtl2 bundle</legend> + {{flash_messages("error-rqtl2")}} + + <div class="form-group"> + <label for="select:species" class="form-label">Species</label> + <select id="select:species" + name="species_id" + required="required" + class="form-control"> + <option value="">Select species</option> + {%for spec in species%} + <option value="{{spec.SpeciesId}}">{{spec.MenuName}}</option> + {%endfor%} + </select> + <small class="form-text text-muted"> + Data that you upload to the system should belong to a known species. + Here you can select the species that you wish to upload data for. + </small> + </div> + + <input type="submit" class="btn btn-primary" value="submit" /> + </form> +</div> + +<div class="row"> + <h2 class="heading">R/qtl2 Bundles</h2> + + <div class="explainer"> + <p>This feature combines and extends the two upload methods below. 
Instead of + uploading one item at a time, the R/qtl2 bundle you upload can contain both + the genotypes data (samples/individuals/cases and their data) and the + expression data.</p> + <p>The R/qtl2 bundle, additionally, can contain extra metadata, that neither + of the methods below can handle.</p> + + <a href="{{url_for('expression-data.rqtl2.select_species')}}" + title="Upload a zip bundle of R/qtl2 files"> + <button class="btn btn-primary">upload R/qtl2 bundle</button></a> + </div> +</div> +{%endblock%} diff --git a/qc_app/templates/rqtl2/no-such-job.html b/uploader/templates/populations/rqtl2/no-such-job.html index b17004f..b17004f 100644 --- a/qc_app/templates/rqtl2/no-such-job.html +++ b/uploader/templates/populations/rqtl2/no-such-job.html diff --git a/qc_app/templates/rqtl2/rqtl2-job-error.html b/uploader/templates/populations/rqtl2/rqtl2-job-error.html index 72a334b..9817518 100644 --- a/qc_app/templates/rqtl2/rqtl2-job-error.html +++ b/uploader/templates/populations/rqtl2/rqtl2-job-error.html @@ -14,7 +14,13 @@ could be.</p> <p>If you find that you cannot figure out what the problem is on your own, please contact the team running the system for assistance, providing the - R/qtl2 bundle you uploaded, and a screenshot of this page.</p> + following details: + <ul> + <li>R/qtl2 bundle you uploaded</li> + <li>This URL: <strong>{{request_url()}}</strong></li> + <li>(maybe) a screenshot of this page</li> + </ul> + </p> </div> <h4>stdout</h4> diff --git a/qc_app/templates/rqtl2/rqtl2-job-results.html b/uploader/templates/populations/rqtl2/rqtl2-job-results.html index 4ecd415..4ecd415 100644 --- a/qc_app/templates/rqtl2/rqtl2-job-results.html +++ b/uploader/templates/populations/rqtl2/rqtl2-job-results.html diff --git a/qc_app/templates/rqtl2/rqtl2-job-status.html b/uploader/templates/populations/rqtl2/rqtl2-job-status.html index e896f88..e896f88 100644 --- a/qc_app/templates/rqtl2/rqtl2-job-status.html +++ 
b/uploader/templates/populations/rqtl2/rqtl2-job-status.html diff --git a/qc_app/templates/rqtl2/rqtl2-qc-job-error.html b/uploader/templates/populations/rqtl2/rqtl2-qc-job-error.html index 90e8887..90e8887 100644 --- a/qc_app/templates/rqtl2/rqtl2-qc-job-error.html +++ b/uploader/templates/populations/rqtl2/rqtl2-qc-job-error.html diff --git a/qc_app/templates/rqtl2/rqtl2-qc-job-results.html b/uploader/templates/populations/rqtl2/rqtl2-qc-job-results.html index 59bc8cd..b3c3a8f 100644 --- a/qc_app/templates/rqtl2/rqtl2-qc-job-results.html +++ b/uploader/templates/populations/rqtl2/rqtl2-qc-job-results.html @@ -15,7 +15,7 @@ <div class="row"> <form id="form-qc-job-results" - action="{{url_for('upload.rqtl2.select_dataset_info', + action="{{url_for('expression-data.rqtl2.select_dataset_info', species_id=species.SpeciesId, population_id=population.Id)}}" method="POST"> diff --git a/qc_app/templates/rqtl2/rqtl2-qc-job-status.html b/uploader/templates/populations/rqtl2/rqtl2-qc-job-status.html index f4a6266..f4a6266 100644 --- a/qc_app/templates/rqtl2/rqtl2-qc-job-status.html +++ b/uploader/templates/populations/rqtl2/rqtl2-qc-job-status.html diff --git a/qc_app/templates/rqtl2/rqtl2-qc-job-success.html b/uploader/templates/populations/rqtl2/rqtl2-qc-job-success.html index 2861a04..f126835 100644 --- a/qc_app/templates/rqtl2/rqtl2-qc-job-success.html +++ b/uploader/templates/populations/rqtl2/rqtl2-qc-job-success.html @@ -18,7 +18,7 @@ --> <div class="row"> <form id="frm-upload-rqtl2-bundle" - action="{{url_for('upload.rqtl2.select_dataset_info', + action="{{url_for('expression-data.rqtl2.select_dataset_info', species_id=species.SpeciesId, population_id=population.InbredSetId)}}" method="POST" diff --git a/uploader/templates/populations/rqtl2/select-geno-dataset.html b/uploader/templates/populations/rqtl2/select-geno-dataset.html new file mode 100644 index 0000000..3233abc --- /dev/null +++ b/uploader/templates/populations/rqtl2/select-geno-dataset.html @@ -0,0 +1,69 
@@ +{%extends "base.html"%} +{%from "flash_messages.html" import flash_messages%} + +{%block title%}Upload R/qtl2 Bundle{%endblock%} + +{%block contents%} +<h2 class="heading">Select Genotypes Dataset</h2> + +<div class="row"> + <p>Your R/qtl2 files bundle could contain a "geno" specification. You will + therefore need to select from one of the existing Genotype datasets or + create a new one.</p> + <p>This is the dataset where your data will be organised under.</p> +</div> + +<div class="row"> + <form id="frm-upload-rqtl2-bundle" + action="{{url_for('expression-data.rqtl2.select_geno_dataset', + species_id=species.SpeciesId, + population_id=population.InbredSetId)}}" + method="POST" + enctype="multipart/form-data"> + <legend class="heading">select from existing genotype datasets</legend> + + <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> + <input type="hidden" name="population_id" + value="{{population.InbredSetId}}" /> + <input type="hidden" name="rqtl2_bundle_file" + value="{{rqtl2_bundle_file}}" /> + + {{flash_messages("error-rqtl2-select-geno-dataset")}} + + <div class="form-group"> + <legend>Datasets</legend> + <label for="select:geno-datasets" class="form-label">Dataset</label> + <select id="select:geno-datasets" + name="geno-dataset-id" + required="required" + {%if datasets | length == 0%} + disabled="disabled" + {%endif%} + class="form-control" + aria-describedby="help-geno-dataset-select-dataset"> + <option value="">Select dataset</option> + {%for dset in datasets%} + <option value="{{dset['Id']}}">{{dset["Name"]}} ({{dset["FullName"]}})</option> + {%endfor%} + </select> + <span id="help-geno-dataset-select-dataset" class="form-text text-muted"> + Select from the existing genotype datasets for species + {{species.SpeciesName}} ({{species.FullName}}). 
+ </span> + </div> + + <button type="submit" class="btn btn-primary">select dataset</button> + </form> +</div> + +<div class="row"> + <p>If the genotype dataset you need does not currently exist for your dataset, + go to the <a href="{{url_for( + 'species.populations.genotypes.create_dataset', + species_id=species.SpeciesId, + population_id=population.Id)}}" + title="Create a new genotypes dataset for {{species.FullName}}"> + genotypes page to create the genotype dataset</a></p> +</div> + +{%endblock%} diff --git a/uploader/templates/populations/rqtl2/select-population.html b/uploader/templates/populations/rqtl2/select-population.html new file mode 100644 index 0000000..ded425f --- /dev/null +++ b/uploader/templates/populations/rqtl2/select-population.html @@ -0,0 +1,57 @@ +{%extends "expression-data/index.html"%} +{%from "flash_messages.html" import flash_messages%} +{%from "species/macro-display-species-card.html" import display_species_card%} + +{%block title%}Select Grouping/Population{%endblock%} + +{%block contents%} +<h1 class="heading">Select grouping/population</h1> + +<div class="row"> + <p>The data is organised in a hierarchical form, beginning with + <em>species</em> at the very top. Under <em>species</em> the data is + organised by <em>population</em>, sometimes referred to as <em>grouping</em>. 
+ (In some really old documents/systems, you might see this referred to as + <em>InbredSet</em>.)</p> + <p>In this section, you get to define what population your data is to be + organised by.</p> +</div> + +<div class="row"> + <form method="POST" + action="{{url_for('expression-data.rqtl2.select_population', + species_id=species.SpeciesId)}}"> + <legend class="heading">select grouping/population</legend> + {{flash_messages("error-select-population")}} + + <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> + + <div class="form-group"> + <label for="select:inbredset" class="form-label">population</label> + <select id="select:inbredset" + name="inbredset_id" + required="required" + class="form-control"> + <option value="">Select a grouping/population</option> + {%for pop in populations%} + <option value="{{pop.InbredSetId}}"> + {{pop.InbredSetName}} ({{pop.FullName}})</option> + {%endfor%} + </select> + <span class="form-text text-muted">Select the population for your data from + the list below.</span> + </div> + + <button type="submit" class="btn btn-primary">select population</button> +</form> +</div> + +{%endblock%} + +{%block sidebarcontents%} +{{display_species_card(species)}} +{%endblock%} + + +{%block javascript%} +{%endblock%} diff --git a/qc_app/templates/rqtl2/select-probeset-dataset.html b/uploader/templates/populations/rqtl2/select-probeset-dataset.html index 26f52ed..74f8f69 100644 --- a/qc_app/templates/rqtl2/select-probeset-dataset.html +++ b/uploader/templates/populations/rqtl2/select-probeset-dataset.html @@ -15,7 +15,7 @@ {%if datasets | length > 0%} <div class="row"> <form method="POST" - action="{{url_for('upload.rqtl2.select_probeset_dataset', + action="{{url_for('expression-data.rqtl2.select_probeset_dataset', species_id=species.SpeciesId, population_id=population.Id)}}" id="frm:select-probeset-dataset"> <legend class="heading">Select from existing ProbeSet datasets</legend> @@ -68,7 +68,7 @@ <div class="row"> <form 
method="POST" - action="{{url_for('upload.rqtl2.create_probeset_dataset', + action="{{url_for('expression-data.rqtl2.create_probeset_dataset', species_id=species.SpeciesId, population_id=population.Id)}}" id="frm:create-probeset-dataset"> <legend class="heading">Create a new ProbeSet dataset</legend> diff --git a/qc_app/templates/rqtl2/select-probeset-study-id.html b/uploader/templates/populations/rqtl2/select-probeset-study-id.html index b9bf52e..e3fd9cc 100644 --- a/qc_app/templates/rqtl2/select-probeset-study-id.html +++ b/uploader/templates/populations/rqtl2/select-probeset-study-id.html @@ -12,7 +12,7 @@ <p>In this page, you can either select from a existing dataset:</p> <form method="POST" - action="{{url_for('upload.rqtl2.select_probeset_study', + action="{{url_for('expression-data.rqtl2.select_probeset_study', species_id=species.SpeciesId, population_id=population.Id)}}" id="frm:select-probeset-study"> <legend class="heading">Select from existing ProbeSet studies</legend> @@ -62,7 +62,7 @@ <p>Create a new ProbeSet dataset below:</p> <form method="POST" - action="{{url_for('upload.rqtl2.create_probeset_study', + action="{{url_for('expression-data.rqtl2.create_probeset_study', species_id=species.SpeciesId, population_id=population.Id)}}" id="frm:create-probeset-study"> <legend class="heading">Create new ProbeSet study</legend> diff --git a/qc_app/templates/rqtl2/select-tissue.html b/uploader/templates/populations/rqtl2/select-tissue.html index 34e1758..fe3080a 100644 --- a/qc_app/templates/rqtl2/select-tissue.html +++ b/uploader/templates/populations/rqtl2/select-tissue.html @@ -15,7 +15,7 @@ {%if tissues | length > 0%} <div class="row"> <form method="POST" - action="{{url_for('upload.rqtl2.select_tissue', + action="{{url_for('expression-data.rqtl2.select_tissue', species_id=species.SpeciesId, population_id=population.Id)}}" id="frm:select-probeset-dataset"> <legend class="heading">Select from existing ProbeSet datasets</legend> @@ -65,7 +65,7 @@ to the 
system below.</p> <form method="POST" - action="{{url_for('upload.rqtl2.create_tissue', + action="{{url_for('expression-data.rqtl2.create_tissue', species_id=species.SpeciesId, population_id=population.Id)}}" id="frm:create-probeset-dataset"> <legend class="heading">Add new tissue, organ or biological material</legend> diff --git a/qc_app/templates/rqtl2/summary-info.html b/uploader/templates/populations/rqtl2/summary-info.html index 1be87fa..0adba2e 100644 --- a/qc_app/templates/rqtl2/summary-info.html +++ b/uploader/templates/populations/rqtl2/summary-info.html @@ -44,7 +44,7 @@ <div class="row"> <form id="frm:confirm-rqtl2bundle-details" - action="{{url_for('upload.rqtl2.confirm_bundle_details', + action="{{url_for('expression-data.rqtl2.confirm_bundle_details', species_id=species.SpeciesId, population_id=population.InbredSetId)}}" method="POST" diff --git a/qc_app/templates/rqtl2/upload-rqtl2-bundle-step-01.html b/uploader/templates/populations/rqtl2/upload-rqtl2-bundle-step-01.html index 07c240f..9d45c5f 100644 --- a/qc_app/templates/rqtl2/upload-rqtl2-bundle-step-01.html +++ b/uploader/templates/populations/rqtl2/upload-rqtl2-bundle-step-01.html @@ -71,13 +71,13 @@ </div> <form id="frm-upload-rqtl2-bundle" - action="{{url_for('upload.rqtl2.upload_rqtl2_bundle', + action="{{url_for('expression-data.rqtl2.upload_rqtl2_bundle', species_id=species.SpeciesId, population_id=population.InbredSetId)}}" method="POST" enctype="multipart/form-data" data-resumable-target="{{url_for( - 'upload.rqtl2.upload_rqtl2_bundle_chunked_post', + 'expression-data.rqtl2.upload_rqtl2_bundle_chunked_post', species_id=species.SpeciesId, population_id=population.InbredSetId)}}"> <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> diff --git a/qc_app/templates/rqtl2/upload-rqtl2-bundle-step-02.html b/uploader/templates/populations/rqtl2/upload-rqtl2-bundle-step-02.html index 93b1dc9..8210ed0 100644 --- a/qc_app/templates/rqtl2/upload-rqtl2-bundle-step-02.html +++ 
b/uploader/templates/populations/rqtl2/upload-rqtl2-bundle-step-02.html @@ -14,7 +14,7 @@ <p>Click "Continue" below to proceed.</p> <form id="frm-upload-rqtl2-bundle" - action="{{url_for('upload.rqtl2.select_dataset_info', + action="{{url_for('expression-data.rqtl2.select_dataset_info', species_id=species.SpeciesId, population_id=population.InbredSetId)}}" method="POST" diff --git a/uploader/templates/populations/view-population.html b/uploader/templates/populations/view-population.html new file mode 100644 index 0000000..1e2964e --- /dev/null +++ b/uploader/templates/populations/view-population.html @@ -0,0 +1,96 @@ +{%extends "populations/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-select-species.html" import select_species_form%} +{%from "species/macro-display-species-card.html" import display_species_card%} + +{%block title%}Populations{%endblock%} + +{%block pagetitle%}Populations{%endblock%} + +{%block lvl3_breadcrumbs%} +<li {%if activelink=="view-population"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.view_population', + species_id=species.SpeciesId, + population_id=population.InbredSetId)}}">view population</a> +</li> +{%endblock%} + + +{%block contents%} +<div class="row"> + <h2>Population Details</h2> + + {{flash_all_messages()}} + + <dl> + <dt>Name</dt> + <dd>{{population.Name}}</dd> + + <dt>FullName</dt> + <dd>{{population.FullName}}</dd> + + <dt>Code</dt> + <dd>{{population.InbredSetCode}}</dd> + + <dt>Genetic Type</dt> + <dd>{{population.GeneticType}}</dd> + + <dt>Family</dt> + <dd>{{population.Family}}</dd> + + <dt>Description</dt> + <dd><pre>{{population.Description or "-"}}</pre></dd> + </dl> +</div> + +<div class="row"> + … maybe provide a way to organise populations in the same family here … +</div> + +<div class="row"> + <h3>Actions</h3> + + <p> + Click any of the following links to use this population in 
performing the + subsequent operations. + </p> + + <nav class="nav"> + <ul> + <li> + <a href="{{url_for('species.populations.genotypes.list_genotypes', + species_id=species.SpeciesId, + population_id=population.Id)}}" + title="Upload genotypes for {{species.FullName}}">Upload Genotypes</a> + </li> + <li> + <a href="{{url_for('species.populations.samples.list_samples', + species_id=species.SpeciesId, + population_id=population.Id)}}" + title="Manage samples: Add new or delete existing."> + manage samples</a> + </li> + <li> + <a href="#" title="Upload expression data">upload expression data</a> + </li> + <li> + <a href="#" title="Upload phenotype data">upload phenotype data</a> + </li> + <li> + <a href="#" title="Upload individual data">upload individual data</a> + </li> + <li> + <a href="#" title="Upload RNA-Seq data">upload RNA-Seq data</a> + </li> + </ul> + </nav> +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_species_card(species)}} +{%endblock%} diff --git a/uploader/templates/samples/base.html b/uploader/templates/samples/base.html new file mode 100644 index 0000000..291782b --- /dev/null +++ b/uploader/templates/samples/base.html @@ -0,0 +1,12 @@ +{%extends "populations/base.html"%} + +{%block lvl3_breadcrumbs%} +<li {%if activelink=="samples"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.samples.index')}}">Samples</a> +</li> +{%block lvl4_breadcrumbs%}{%endblock%} +{%endblock%} diff --git a/uploader/templates/samples/index.html b/uploader/templates/samples/index.html new file mode 100644 index 0000000..ee4a63e --- /dev/null +++ b/uploader/templates/samples/index.html @@ -0,0 +1,19 @@ +{%extends "samples/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "species/macro-select-species.html" import select_species_form%} + +{%block title%}Populations{%endblock%} + +{%block pagetitle%}Populations{%endblock%} + + +{%block contents%} 
+{{flash_all_messages()}} + +<div class="row"> + <p>GeneNetwork has a selection of different species of organisms to choose from. Within those species, there are the populations of interest for a variety of experiments, from which you, the researcher, picked your samples (or individuals or cases) from. Here you can provide some basic details about your samples.</p> + <p>To start off, we will need to know what species and population your samples belong to. Please provide that information in the next sections.</p> + + {{select_species_form(url_for("species.populations.samples.index"), species)}} +</div> +{%endblock%} diff --git a/uploader/templates/samples/list-samples.html b/uploader/templates/samples/list-samples.html new file mode 100644 index 0000000..13e5cec --- /dev/null +++ b/uploader/templates/samples/list-samples.html @@ -0,0 +1,132 @@ +{%extends "samples/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "populations/macro-select-population.html" import select_population_form%} +{%from "populations/macro-display-population-card.html" import display_population_card%} + +{%block title%}Samples — List Samples{%endblock%} + +{%block pagetitle%}Samples — List Samples{%endblock%} + +{%block lvl4_breadcrumbs%} +<li {%if activelink=="list-samples"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.samples.list_samples', + species_id=species.SpeciesId, + population_id=population.Id)}}">List</a> +</li> +{%endblock%} + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <p> + You selected the population "{{population.FullName}}" from the + "{{species.FullName}}" species. + </p> +</div> + +{%if samples | length > 0%} +<div class="row"> + <p> + This population already has <strong>{{total_samples}}</strong> + samples/individuals entered. You can explore the list of samples in this + population in the table below. 
+ </p> +</div> + +<div class="row"> + <div class="col-md-2"> + {%if offset > 0:%} + <a href="{{url_for('species.populations.samples.list_samples', + species_id=species.SpeciesId, + population_id=population.Id, + from=offset-count, + count=count)}}"> + <span class="glyphicon glyphicon-backward"></span> + Previous + </a> + {%endif%} + </div> + + <div class="col-md-8" style="text-align: center;"> + Samples {{offset}} — {{offset+(count if offset + count < total_samples else total_samples - offset)}} / {{total_samples}} + </div> + + <div class="col-md-2"> + {%if offset + count < total_samples:%} + <a href="{{url_for('species.populations.samples.list_samples', + species_id=species.SpeciesId, + population_id=population.Id, + from=offset+count, + count=count)}}"> + Next + <span class="glyphicon glyphicon-forward"></span> + </a> + {%endif%} + </div> +</div> +<div class="row"> + <table class="table"> + <thead> + <tr> + <th>#</th> + <th>Name</th> + <th>Auxiliary Name</th> + <th>Symbol</th> + <th>Alias</th> + </tr> + </thead> + + <tbody> + {%for sample in samples%} + <tr> + <td>{{sample.sequence_number}}</td> + <td>{{sample.Name}}</td> + <td>{{sample.Name2}}</td> + <td>{{sample.Symbol or "-"}}</td> + <td>{{sample.Alias or "-"}}</td> + </tr> + {%endfor%} + </tbody> + </table> + + <p> + <a href="#" + title="Delete all samples for population '{{population.FullName}}' from species + '{{species.FullName}}'." + class="btn btn-danger"> + delete all samples + </a> + </p> +</div> + +{%else%} + +<div class="row"> + <p> + There are no samples entered for this population. Do please go ahead and add + the samples for this population by clicking on the button below. + </p> + + <p> + <a href="{{url_for('species.populations.samples.upload_samples', + species_id=species.SpeciesId, + population_id=population.Id)}}" + title="Add samples for population '{{population.FullName}}' from species + '{{species.FullName}}'." 
+ class="btn btn-primary"> + add samples + </a> + </p> +</div> +{%endif%} + +{%endblock%} + +{%block sidebarcontents%} +{{display_population_card(species, population)}} +{%endblock%} diff --git a/uploader/templates/samples/select-population.html b/uploader/templates/samples/select-population.html new file mode 100644 index 0000000..f437780 --- /dev/null +++ b/uploader/templates/samples/select-population.html @@ -0,0 +1,39 @@ +{%extends "samples/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "populations/macro-select-population.html" import select_population_form%} +{%from "species/macro-display-species-card.html" import display_species_card%} + +{%block title%}Samples — Select Population{%endblock%} + +{%block pagetitle%}Samples — Select Population{%endblock%} + + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <p>You have selected "{{species.FullName}}" as the species that your data relates to.</p> + <p>Next, we need information regarding the population your data relates to. Do please select the population from the existing ones below.</p> +</div> + +<div class="row"> + {{select_population_form( + url_for("species.populations.samples.select_population", species_id=species.SpeciesId), + populations)}} +</div> + +<div class="row"> + <p> + If you cannot find the population your data relates to in the drop-down + above, you might want to + <a href="{{url_for('species.populations.create_population', + species_id=species.SpeciesId)}}" + title="Create a new population for species '{{species.FullName}}'."> + add a new population to GeneNetwork</a> + instead.</p> 
+</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_species_card(species)}} +{%endblock%} diff --git a/qc_app/templates/samples/upload-failure.html b/uploader/templates/samples/upload-failure.html index 09e2ecf..458ab55 100644 --- a/qc_app/templates/samples/upload-failure.html +++ b/uploader/templates/samples/upload-failure.html @@ -1,10 +1,12 @@ {%extends "base.html"%} {%from "cli-output.html" import cli_output%} +{%from "populations/macro-display-population-card.html" import display_population_card%} {%block title%}Samples Upload Failure{%endblock%} {%block contents%} -<h1 class="heading">{{job.job_name}}</h2> +<div class="row"> +<h2 class="heading">{{job.job_name[0:50]}}…</h2> <p>There was a failure attempting to upload the samples.</p> @@ -17,11 +19,19 @@ <li><strong>status</strong>: {{job.status}}</li> <li><strong>job type</strong>: {{job["job-type"]}}</li> </ul> +</div> +<div class="row"> <h4>stdout</h4> {{cli_output(job, "stdout")}} +</div> +<div class="row"> <h4>stderr</h4> {{cli_output(job, "stderr")}} +</div> +{%endblock%} +{%block sidebarcontents%} +{{display_population_card(species, population)}} {%endblock%} diff --git a/uploader/templates/samples/upload-progress.html b/uploader/templates/samples/upload-progress.html new file mode 100644 index 0000000..677d457 --- /dev/null +++ b/uploader/templates/samples/upload-progress.html @@ -0,0 +1,31 @@ +{%extends "samples/base.html"%} +{%from "cli-output.html" import cli_output%} +{%from "populations/macro-display-population-card.html" import display_population_card%} + +{%block extrameta%} +<meta http-equiv="refresh" content="5"> +{%endblock%} + +{%block title%}Job Status{%endblock%} + +{%block contents%} +<div class="row" style="overflow-x: clip;"> +<h2 class="heading">{{job.job_name[0:50]}}…</h2> + +<p> +<strong>status</strong>: +<span>{{job["status"]}} ({{job.get("message", "-")}})</span><br /> +</p> + +<p>saving to database...</p> +</div> + +<div class="row"> + {{cli_output(job, "stdout")}} 
+</div> + +{%endblock%} + +{%block sidebarcontents%} +{{display_population_card(species, population)}} +{%endblock%} diff --git a/uploader/templates/samples/upload-samples.html b/uploader/templates/samples/upload-samples.html new file mode 100644 index 0000000..25d3290 --- /dev/null +++ b/uploader/templates/samples/upload-samples.html @@ -0,0 +1,160 @@ +{%extends "samples/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} +{%from "populations/macro-select-population.html" import select_population_form%} +{%from "populations/macro-display-population-card.html" import display_population_card%} + +{%block title%}Samples — Upload Samples{%endblock%} + +{%block pagetitle%}Samples — Upload Samples{%endblock%} + +{%block lvl4_breadcrumbs%} +<li {%if activelink=="uploade-samples"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.populations.samples.upload_samples', + species_id=species.SpeciesId, + population_id=population.Id)}}">List</a> +</li> +{%endblock%} + +{%block contents%} +{{flash_all_messages()}} + +<div class="row"> + <p> + You can now upload the samples for the "{{population.FullName}}" population + from the "{{species.FullName}}" species here. + </p> + <p> + Upload a <strong>character-separated value (CSV)</strong> file that contains + details about your samples. The CSV file should have the following fields: + <dl> + <dt>Name</dt> + <dd>The primary name/identifier for the sample/individual.</dd> + + <dt>Name2</dt> + <dd>A secondary name for the sample. This can simply be the same as + <strong>Name</strong> above. This field <strong>MUST</strong> contain a + value.</dd> + + <dt>Symbol</dt> + <dd>A symbol for the sample. This can be a strain name, e.g. 'BXD60' for + species that have strains. This field can be left empty for species like + Humans that do not have strains.</dd> + + <dt>Alias</dt> + <dd>An alias for the sample. 
Can be an empty field, or take on the same + value as that of the Symbol.</dd> + </dl> + </p> +</div> + +<div class="row"> + <form id="form-samples" + method="POST" + action="{{url_for('species.populations.samples.upload_samples', + species_id=species.SpeciesId, + population_id=population.InbredSetId)}}" + enctype="multipart/form-data"> + <legend class="heading">upload samples</legend> + + <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> + <input type="hidden" name="population_id" value="{{population.Id}}" /> + + <div class="form-group"> + <label for="file-samples" class="form-label">select file</label> + <input type="file" name="samples_file" id="file:samples" + accept="text/csv, text/tab-separated-values" + class="form-control" /> + </div> + + <div class="form-group"> + <label for="select:separator" class="form-label">field separator</label> + <select id="select:separator" + name="separator" + required="required" + class="form-control"> + <option value="">Select separator for your file: (default is comma)</option> + <option value="	">TAB</option> + <option value=" ">Space</option> + <option value=",">Comma</option> + <option value=";">Semicolon</option> + <option value="other">Other</option> + </select> + <input id="txt:separator" + type="text" + name="other_separator" + class="form-control" /> + <small class="form-text text-muted"> + If you select '<strong>Other</strong>' for the field separator value, + enter the character that separates the fields in your CSV file in the form + field below. + </small> + </div> + + <div class="form-group form-check"> + <input id="chk:heading" + type="checkbox" + name="first_line_heading" + class="form-check-input" /> + <label for="chk:heading" class="form-check-label"> + first line is a heading?</label> + <small class="form-text text-muted"> + Select this if the first line in your file contains headings for the + columns. 
+ </small> + </div> + + <div class="form-group"> + <label for="txt:delimiter" class="form-label">field delimiter</label> + <input id="txt:delimiter" + type="text" + name="field_delimiter" + maxlength="1" + class="form-control" /> + <small class="form-text text-muted"> + If there is a character delimiting the string texts within particular + fields in your CSV, provide the character here. This can be left blank if + no such delimiters exist in your file. + </small> + </div> + + <button type="submit" + class="btn btn-primary">upload samples file</button> + </form> +</div> + +<div class="row"> + <h3>Preview File Content</h3> + + <table id="tbl:samples-preview" class="table"> + <caption class="heading">preview content</caption> + + <thead> + <tr> + <th>Name</th> + <th>Name2</th> + <th>Symbol</th> + <th>Alias</th> + </tr> + </thead> + + <tbody> + <tr id="default-row"> + <td colspan="4"> + Please make some selections in the form above to preview the data.</td> + </tr> + </tbody> + </table> +</div> +{%endblock%} + +{%block sidebarcontents%} +{{display_population_card(species, population)}} +{%endblock%} + +{%block javascript%} +<script src="/static/js/upload_samples.js" type="text/javascript"></script> +{%endblock%} diff --git a/uploader/templates/samples/upload-success.html b/uploader/templates/samples/upload-success.html new file mode 100644 index 0000000..881d466 --- /dev/null +++ b/uploader/templates/samples/upload-success.html @@ -0,0 +1,36 @@ +{%extends "samples/base.html"%} +{%from "cli-output.html" import cli_output%} +{%from "populations/macro-display-population-card.html" import display_population_card%} + +{%block title%}Job Status{%endblock%} + +{%block contents%} + +<div class="row" style="overflow-x: clip;"> + <h2 class="heading">{{job.job_name[0:50]}}…</h2> + + <p> + <strong>status</strong>: + <span>{{job["status"]}} ({{job.get("message", "-")}})</span><br /> + </p> + + <p>Successfully uploaded the samples.</p> + <p> + <a 
href="{{url_for('species.populations.samples.list_samples', + species_id=species.SpeciesId, + population_id=population.Id)}}" + title="View population samples"> + View samples + </a> + </p> +</div> + +<div class="row"> + {{cli_output(job, "stdout")}} +</div> + +{%endblock%} + +{%block sidebarcontents%} +{{display_population_card(species, population)}} +{%endblock%} diff --git a/qc_app/templates/select_dataset.html b/uploader/templates/select_dataset.html index 2f07de8..2f07de8 100644 --- a/qc_app/templates/select_dataset.html +++ b/uploader/templates/select_dataset.html diff --git a/qc_app/templates/select_platform.html b/uploader/templates/select_platform.html index d9bc68f..d9bc68f 100644 --- a/qc_app/templates/select_platform.html +++ b/uploader/templates/select_platform.html diff --git a/qc_app/templates/select_study.html b/uploader/templates/select_study.html index 648ad4c..648ad4c 100644 --- a/qc_app/templates/select_study.html +++ b/uploader/templates/select_study.html diff --git a/uploader/templates/species/base.html b/uploader/templates/species/base.html new file mode 100644 index 0000000..04391db --- /dev/null +++ b/uploader/templates/species/base.html @@ -0,0 +1,12 @@ +{%extends "base.html"%} + +{%block lvl1_breadcrumbs%} +<li {%if activelink=="species"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.list_species')}}">Species</a> +</li> +{%block lvl2_breadcrumbs%}{%endblock%} +{%endblock%} diff --git a/uploader/templates/species/create-species.html b/uploader/templates/species/create-species.html new file mode 100644 index 0000000..0d0bedf --- /dev/null +++ b/uploader/templates/species/create-species.html @@ -0,0 +1,132 @@ +{%extends "species/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} + +{%block title%}Create Species{%endblock%} + +{%block pagetitle%}Create Species{%endblock%} + +{%block lvl2_breadcrumbs%} +<li {%if activelink=="create-species"%} + 
class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.create_species')}}">Create</a> +</li> +{%endblock%} + +{%block contents%} +<div class="row"> + <form id="frm-create-species" + method="POST" + action="{{url_for('species.create_species')}}"> + <legend>Create Species</legend> + + {{flash_all_messages()}} + + <div class="form-group"> + <label for="txt-taxonomy-id" class="form-label"> + Taxonomy ID</label> + <div class="input-group"> + <input id="txt-taxonomy-id" + name="species_taxonomy_id" + type="text" + class="form-control" /> + <span class="input-group-btn"> + <button id="btn-search-taxonid" class="btn btn-info">Search</button> + </span> + </div> + <small class="form-text text-small text-muted">Provide the taxonomy ID for + your species that can be used to link to external sites like NCBI. Enter + the taxonomy ID and click "Search" to auto-fill the form with data. + <br /> + While it is recommended to provide a value for this field, doing so is + optional. + </small> + </div> + + <div class="form-group"> + <label for="txt-species-name" class="form-label">Common Name</label> + <input id="txt-species-name" + name="common_name" + type="text" + class="form-control" + required="required" /> + <small class="form-text text-muted">Provide the common, possibly + non-scientific name for the species here, e.g. Human, Mouse, etc.</small> + </div> + + <div class="form-group"> + <label for="txt-species-scientific" class="form-label"> + Scientific Name</label> + <input id="txt-species-scientific" + name="scientific_name" + type="text" + class="form-control" + required="required" /> + <small class="form-text text-muted">Provide the scientific name for the + species you are creating, e.g. 
Homo sapiens, Mus musculus, etc.</small> + </div> + + <div class="form-group"> + <label for="select-species-family" class="form-label">Family</label> + <select id="select-species-family" + name="species_family" + required="required" + class="form-control"> + <option value="">Please select a grouping</option> + {%for family in families%} + <option value="{{family}}">{{family}}</option> + {%endfor%} + </select> + <small class="form-text text-muted"> + This is a generic grouping for the species that determines under which + grouping the species appears in the GeneNetwork menus</small> + </div> + + <div class="form-group"> + <input type="submit" + value="create new species" + class="btn btn-primary" /> + </div> + + </form> +</div> +{%endblock%} + +{%block javascript%} +<script> + var lastTaxonId = null; + + var fetch_taxonomy = (taxonId) => { + var uri = ( + "https://rest.uniprot.org/taxonomy/" + encodeURIComponent(taxonId)); + $.get( + uri, + {}, + (data, textStatus, jqXHR) => { + if(textStatus == "success") { + lastTaxonId = taxonId; + $("#txt-species-scientific").val(data.scientificName); + $("#txt-species-name").val(data.commonName); + return false; + } + var msg = ( + `Request to '${uri}' failed with message '${textStatus}'. ` +
+ "Please try again later, or fill the details manually."); + alert(msg); + console.error(msg, data, textStatus); + return false; + }, + "json"); + }; + + $("#btn-search-taxonid").on("click", (event) => { + event.preventDefault(); + var taxonId = $("#txt-taxonomy-id").val(); + if((taxonId !== "") && (taxonId !== lastTaxonId)) { + fetch_taxonomy(taxonId); + } + }); +</script> +{%endblock%} diff --git a/uploader/templates/species/edit-species.html b/uploader/templates/species/edit-species.html new file mode 100644 index 0000000..5a26455 --- /dev/null +++ b/uploader/templates/species/edit-species.html @@ -0,0 +1,177 @@ +{%extends "species/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} + +{%block title%}Edit Species{%endblock%} + +{%block pagetitle%}Edit Species{%endblock%} + +{%block css%} +<style type="text/css"> + .card { + margin-top: 0.3em; + border-width: 1px; + border-style: solid; + border-radius: 0.3em; + border-color: #AAAAAA; + padding: 0.5em; + } +</style> +{%endblock%} + +{%block lvl2_breadcrumbs%} +<li {%if activelink=="edit-species"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.edit_species_extra', + species_id=species.SpeciesId)}}">Edit</a> +</li> +{%endblock%} + +{%block contents%} +{{flash_all_messages()}} +<div class="row"> + <form id="frm-edit-species" + method="POST" + action="{{url_for('species.edit_species_extra', + species_id=species.SpeciesId)}}"> + + <legend>Edit Extra Detail for Species '{{species.FullName}}'</legend> + + <input type="hidden" name="species_id" value="{{species.SpeciesId}}" /> + + <div class="form-group"> + <label for="lbl-species-taxonid" class="form-label"> + Taxonomy Id + </label> + <label id="lbl-species-taxonid" + disabled="disabled" + class="form-control">{{species.TaxonomyId}}</label> + </div> + + <div class="form-group"> + <label for="txt-species-name" class="form-label"> + Common Name + </label> + <input type="text" + 
id="txt-species-name" + name="species_name" + required="required" + value="{{species.SpeciesName}}" + class="form-control" /> + <small class="form-text text-muted"> + This is the layman's name for the species, e.g. mouse</small> + </div> + + <div class="form-group"> + <label for="txt-species-fullname" class="form-label"> + Scientific Name + </label> + <input type="text" + id="txt-species-fullname" + name="species_fullname" + required="required" + value="{{species.FullName}}" + class="form-control" /> + <small class="form-text text-muted"> + A scientific name for the species that mostly adheres to the biological + binomial nomenclature system.</small> + </div> + + <div class="form-group"> + <label for="select-species-family" class="form-label"> + Family + </label> + <select id="select-species-family" + name="species_family" + class="form-control"> + <option value="">Select the family</option> + {%for family in families%} + <option value="{{family}}" + {%if species.Family == family%} + selected="selected" + {%endif%}>{{family}}</option> + {%endfor%} + </select> + <small class="form-text text-muted"> + A general classification for the species. This is mostly for use in + GeneNetwork's menus.</small> + </div> + + <div class="form-group"> + <label for="txt-species-familyorderid" class="form-label"> + Family Order Id + </label> + <input type="number" + id="txt-species-familyorderid" + name="species_familyorderid" + value="{{species.FamilyOrderId}}" + required="required" + class="form-control" /> + <small class="form-text text-muted"> + This is a number that determines the order of the "Family" groupings + above in the GeneNetwork menus. 
This is an integer value that's manually + assigned.</small> + </div> + + <div class="form-group"> + <label for="txt-species-orderid" class="form-label"> + Order Id + </label> + <input type="number" + id="txt-species-orderid" + name="species_orderid" + value="{{species.OrderId or (max_order_id + 5)}}" + class="form-control" /> + <small class="form-text text-muted"> + This integer value determines the order of the species in relation to + each other, but also within the respective "Family" groups.</small> + </div> + + <div class="form-group"> + <input type="submit" value="Submit Changes" class="btn btn-primary" /> + </div> + + </form> +</div> +{%endblock%} + +{%block sidebarcontents%} + +<div class="card"> + <div class="card-body"> + <h5 class="card-title">Family Order</h5> + <div class="card-text"> + <p>The current family order is as follows</p> + <table class="table"> + <thead> + <tr> + <th>Family Order Id</th> + <th>Family</th> + </tr> + </thead> + <tbody> + {%for item in family_order%} + <tr> + <td>{{item[0]}}</td> + <td>{{item[1]}}</td> + </tr> + {%endfor%} + </tbody> + </table> + </div> + </div> +</div> + +<div class="card"> + <div class="card-body"> + <h5 class="card-title">Order ID</h5> + <div class="card-text"> + <p>The current largest OrderID is: {{max_order_id}}</p> + <p>We recommend giving a new species an order ID that is five more than + the current highest i.e. {{max_order_id + 5}}.</p> + </div> + </div> +</div> +{%endblock%} diff --git a/uploader/templates/species/list-species.html b/uploader/templates/species/list-species.html new file mode 100644 index 0000000..85c9d40 --- /dev/null +++ b/uploader/templates/species/list-species.html @@ -0,0 +1,75 @@ +{%extends "species/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} + +{%block title%}List Species{%endblock%} + +{%block pagetitle%}List Species{%endblock%} + +{%block contents%} +{{flash_all_messages()}} +<div class="row"> + <p> + All data in GeneNetwork revolves around species. 
This is the core of the + system.</p> + <p>Here you can see a list of all the species available in GeneNetwork. + Click on the link beside each species to view greater detail on the species, + and access further operations that are possible for said species.</p> +</div> + +<div class="row"> + <p>If you cannot find the species you are looking for below, click the button + below to create it.</p> + <p><a href="{{url_for('species.create_species')}}" + title="Add a new species to GeneNetwork" + class="btn btn-danger">Create Species</a></p> +</div> + +<div class="row"> + <table class="table"> + <caption>Available Species</caption> + <thead> + <tr> + <th>#</th> + <th title="A common, layman's name for the species.">Common Name</th> + <th title="The scientific name for the species">Organism Name</th> + <th title="An identifier for the species in the NCBI taxonomy database"> + Taxonomy ID + </th> + <th title="A generic grouping used internally by GeneNetwork for organising species."> + Family + </th> + </tr> + </thead> + <tbody> + {%for species in allspecies%} + <tr> + <td>{{species["sequence_number"]}}</td> + <td>{{species["SpeciesName"]}}</td> + <td> + <a href="{{url_for('species.view_species', + species_id=species['SpeciesId'])}}" + title="View details in GeneNetwork on {{species['FullName']}}"> + {{species["FullName"]}} + </a> + </td> + <td> + <a href="https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id={{species['TaxonomyId']}}" + title="View species details on NCBI" + target="_blank">{{species["TaxonomyId"]}}</a> + </td> + <td>{{species.Family}}</td> + </tr> + {%else%} + <tr> + <td colspan="3"> + <p class="text-danger"> + <span class="glyphicon glyphicon-exclamation-mark"></span> + There were no species found! 
+ </p> + </td> + </tr> + {%endfor%} + </tbody> + </table> +</div> +{%endblock%} diff --git a/uploader/templates/species/macro-display-species-card.html b/uploader/templates/species/macro-display-species-card.html new file mode 100644 index 0000000..166c7b9 --- /dev/null +++ b/uploader/templates/species/macro-display-species-card.html @@ -0,0 +1,22 @@ +{%macro display_species_card(species)%} +<div class="card"> + <div class="card-body"> + <h5 class="card-title">Species</h5> + <div class="card-text"> + <table class="table"> + <tbody> + <tr> + <td>Common Name</td> + <td>{{species.SpeciesName}}</td> + </tr> + + <tr> + <td>Scientific Name</td> + <td>{{species.FullName}}</td> + </tr> + </tbody> + </table> + </div> + </div> +</div> +{%endmacro%} diff --git a/uploader/templates/species/macro-select-species.html b/uploader/templates/species/macro-select-species.html new file mode 100644 index 0000000..dd086c0 --- /dev/null +++ b/uploader/templates/species/macro-select-species.html @@ -0,0 +1,36 @@ +{%macro select_species_form(form_action, species)%} +{%if species | length > 0%} +<form method="GET" action="{{form_action}}"> + <div class="form-group"> + <label for="select-species" class="form-label">Species</label> + <select id="select-species" + name="species_id" + class="form-control" + required="required"> + <option value="">Select Species</option> + {%for group in species%} + {{group}} + <optgroup {%if group[0][1] is not none%} + label="{{group[0][1].capitalize()}}" + {%else%} + label="Undefined" + {%endif%}> + {%for aspecies in group[1]%} + <option value="{{aspecies.SpeciesId}}">{{aspecies.MenuName}}</option> + {%endfor%} + </optgroup> + {%endfor%} + </select> + </div> + + <div class="form-group"> + <input type="submit" value="Select" class="btn btn-primary" /> + </div> +</form> +{%else%} +<p class="text-danger"> + <span class="glyphicon glyphicon-exclamation-mark"></span> + We could not find species to select from! 
+</p> +{%endif%} +{%endmacro%} diff --git a/uploader/templates/species/view-species.html b/uploader/templates/species/view-species.html new file mode 100644 index 0000000..b01864d --- /dev/null +++ b/uploader/templates/species/view-species.html @@ -0,0 +1,84 @@ +{%extends "species/base.html"%} +{%from "flash_messages.html" import flash_all_messages%} + +{%block title%}View Species{%endblock%} + +{%block pagetitle%}View Species{%endblock%} + +{%block lvl2_breadcrumbs%} +<li {%if activelink=="view-species"%} + class="breadcrumb-item active" + {%else%} + class="breadcrumb-item" + {%endif%}> + <a href="{{url_for('species.view_species', species_id=species.SpeciesId)}}">View</a> +</li> +{%endblock%} + +{%block contents%} +{{flash_all_messages()}} +<div class="row"> + <h2>Details on species {{species.FullName}}</h2> + + <dl> + <dt>Common Name</dt> + <dd>{{species.SpeciesName}}</dd> + + <dt>Scientific Name</dt> + <dd>{{species.FullName}}</dd> + + <dt>Taxonomy ID</dt> + <dd>{{species.TaxonomyId}}</dd> + </dl> + + <h3>Actions</h3> + + <p> + You can proceed to perform any of the following actions for species + {{species.FullName}} + </p> + + <ol> + <li> + <a href="{{url_for('species.populations.list_species_populations', + species_id=species.SpeciesId)}}" + title="Create/Edit populations for {{species.FullName}}"> + Manage populations</a> + </li> + </ol> + + +</div> +{%endblock%} + +{%block sidebarcontents%} +<div class="card"> + <div class="card-body"> + <h5 class="card-title">Species Extras</h5> + <div class="card-text"> + <p>Some extra internal-use details (mostly for UI concerns on GeneNetwork)</p> + <p> + <small> + If you do not understand what the following are about, simply ignore them + — + They have no bearing whatsoever on your data, or its analysis. 
+ </small> + </p> + <dl> + <dt>Family</dt> + <dd>{{species.Family}}</dd> + + <dt>FamilyOrderId</dt> + <dd>{{species.FamilyOrderId}}</dd> + + <dt>OrderId</dt> + <dd>{{species.OrderId}}</dd> + </dl> + </div> + <a href="{{url_for('species.edit_species_extra', + species_id=species.SpeciesId)}}" + class="card-link" + title="Edit the species' internal-use details.">Edit</a> + </div> +</div> +{%endblock%} diff --git a/qc_app/templates/stdout_output.html b/uploader/templates/stdout_output.html index 85345a9..85345a9 100644 --- a/qc_app/templates/stdout_output.html +++ b/uploader/templates/stdout_output.html diff --git a/uploader/templates/unhandled_exception.html b/uploader/templates/unhandled_exception.html new file mode 100644 index 0000000..cfb0c0b --- /dev/null +++ b/uploader/templates/unhandled_exception.html @@ -0,0 +1,24 @@ +{%extends "base.html"%} +{%from "flash_messages.html" import flash_all_messages%} + +{%block title%}System Error{%endblock%} + +{%block css%} +<link rel="stylesheet" href="/static/css/two-column-with-separator.css" /> +{%endblock%} + +{%block contents%} +<div class="row"> + {{flash_all_messages()}} + <h1>Exception!</h1> + + <p>An error has occurred, and your request has been aborted. 
Please notify the + administrator to try and get this fixed.</p> + <p>The system has failed with the following error:</p> +</div> +<div class="row"> + <pre> + {{trace}} + </pre> +</div> +{%endblock%} diff --git a/qc_app/templates/upload_progress_indicator.html b/uploader/templates/upload_progress_indicator.html index e274e83..e274e83 100644 --- a/qc_app/templates/upload_progress_indicator.html +++ b/uploader/templates/upload_progress_indicator.html diff --git a/qc_app/templates/worker_failure.html b/uploader/templates/worker_failure.html index b65b140..b65b140 100644 --- a/qc_app/templates/worker_failure.html +++ b/uploader/templates/worker_failure.html diff --git a/uploader/ui.py b/uploader/ui.py new file mode 100644 index 0000000..1994056 --- /dev/null +++ b/uploader/ui.py @@ -0,0 +1,14 @@ +"""Utilities to handle the UI""" +from flask import render_template as flask_render_template + +def make_template_renderer(default): + """Build a renderer that sets the active menu and a default active link.""" + def render_template(template, **kwargs): + return flask_render_template( + template, + **{ + **kwargs, + "activemenu": default, + "activelink": kwargs.get("activelink", default) + }) + return render_template |