#+STARTUP: inlineimages
#+TITLE: GeneNetwork Quality Control Application

** Project Goals

The project checks data files for correct syntax and other errors before allowing the data to be uploaded.

The files are *"tab-separated values"* (TSV) files, and must conform to the following criteria:

*** Line-Level Checks

- Must be tab-separated

*** Cell-Level Checks

- No empty data cells
- No data cells with spurious characters like ~eeeee~, ~5.555iloveguix~, etc.
- Decimal numbers must conform to the following criteria:
  - when checking an average file, decimal numbers must contain exactly three places to the right of the dot.
  - when checking a standard error file, decimal numbers must contain six or more places to the right of the dot.
  - there must be a number to the left of the dot (e.g. 0.55555 is allowed but .55555 is not).
- Check line endings to make sure they are Unix and not DOS
- Check strain headers against a source of truth (see strains.csv)

** Development

For reproducibility, this project is developed using guix.

To launch a guix shell for development, do

#+BEGIN_SRC shell
  guix shell --container --network --share=/some/host/directory=/the/upload/directory --development --file=guix.scm
#+END_SRC

which creates an environment that is isolated from the rest of your system.

We share a host directory with the container (that is writeable by the user that started the web application) to serve as the upload directory for the application.

*** Run the CLI version

Run the CLI version of the application with

#+BEGIN_SRC shell
  python3 -m scripts.qc --help
#+END_SRC

*** Run the web version

You need to have a running Redis instance, and configure the application to connect to it.
To run the web version of the qc app in development mode, you need to set up a few environment variables:

#+BEGIN_SRC shell
  export FLASK_APP=wsgi.py
  export FLASK_ENV=development
  export QCAPP_INSTANCE_PATH=/path/to/directory/with/config.py
#+END_SRC

then you can run the application with

#+BEGIN_SRC shell
  flask run
#+END_SRC

*** Checks

Run tests with:

#+BEGIN_SRC shell
  pytest
#+END_SRC

To run the linter over the code base, run:

#+BEGIN_SRC shell
  pylint *.py tests quality_control qc_app scripts
#+END_SRC

To check for correct type usage in the application, run:

#+BEGIN_SRC shell
  mypy --show-error-codes .
#+END_SRC

** Deploying/Installing QC

*** CLI: Docker

Generate the docker image file with

#+BEGIN_SRC shell
  guix pack -f docker -S /bin=bin genenetwork-qc
#+END_SRC

That creates the image file with a path such as:

#+BEGIN_EXAMPLE
  /gnu/store/ibz5qlwzv0lyply2by7422z0c6jfaa6s-genenetwork-qc-docker-pack.tar.gz
#+END_EXAMPLE

You can now load this file into docker with

#+BEGIN_SRC shell
  docker load < /gnu/store/ibz5qlwzv0lyply2by7422z0c6jfaa6s-genenetwork-qc-docker-pack.tar.gz
#+END_SRC

and from there, you can run the application as detailed in the [[#run-cli-version][Running QC: CLI-Version]] section below.

*** CLI: Guix

The application can be installed using guix by pointing to the [[./guix.scm][guix.scm]] file as follows:

#+BEGIN_SRC shell
  guix package [-p /path/to/qc/profile] -f guix.scm
#+END_SRC

*** Web-Version

**** TODO Document deployment details for the web version of GeneNetwork QC better

** Running QC

*** Command-Line Version
:PROPERTIES:
:CUSTOM_ID: run-cli-version
:END:

Install the application as shown in the [[*Deploying/Installing QC][Deploying/Installing QC]] section above.
To run qc against a file, the syntax is:

#+BEGIN_SRC shell
  qc [--strainsfile <strainsfile-path>] [--verbose] <filetype> <filepath>
#+END_SRC

where
- ~<filetype>~ is one of "*average*" or "*standard-error*"
- ~<filepath>~ is either an absolute path to the file, or a path relative to the current working directory
- if the ~--strainsfile~ option is not provided, it will default to the one in the root directory of this repository
- the ~--verbose~ option is a flag, defaulting to ~False~, that controls the display of optional progress messages

To view the usage information for the application, run

#+BEGIN_SRC shell
  qc --help
#+END_SRC

*** Web Version

**** TODO Document usage of the web-UI version of the application

*** Docker

Download the docker image file from [[https://git.genenetwork.org/fredmanglis/gnqc_py/releases][the releases page]] of the application and load it into docker with something like:

#+BEGIN_SRC shell
  docker load < genenetwork-qc-0.0.1-1-oxu472i-docker.tar.gz
#+END_SRC

replacing ~genenetwork-qc-0.0.1-1-oxu472i-docker.tar.gz~ with the actual name of the release you downloaded.

Run the application with something like:

#+BEGIN_SRC shell
  docker run -v /path/to/gnqc_py/tests/test_data:/data -ti \
         genenetwork-qc:latest /bin/qc average /data/average_error_at_end_200MB.tsv
#+END_SRC

replacing ~/path/to/gnqc_py/tests/test_data~ with the path to the folder containing the file you want to check, and ~average_error_at_end_200MB.tsv~ with the name of the file you want to check for errors.
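
** Illustrative Sketch of the Cell-Level Checks

The decimal-number rules listed under Project Goals can be sketched in Python roughly as follows. This is a minimal illustration only, not the application's actual implementation: the names ~check_cell~, ~AVERAGE_PATTERN~ and ~STANDARD_ERROR_PATTERN~ are hypothetical, and the sketch assumes unsigned values that always contain a dot, which the rules themselves do not state.

```python
import re

# Hypothetical patterns derived from the documented rules:
# at least one digit before the dot, then a fixed or minimum
# number of digits after it.
AVERAGE_PATTERN = re.compile(r"[0-9]+\.[0-9]{3}")          # exactly three places
STANDARD_ERROR_PATTERN = re.compile(r"[0-9]+\.[0-9]{6,}")  # six or more places


def check_cell(value: str, filetype: str) -> bool:
    """Return True if `value` is an acceptable decimal for `filetype`.

    `filetype` is "average" or "standard-error", mirroring the file
    types named in this document. Empty cells and cells with spurious
    characters fail because the whole cell must match the pattern.
    """
    if filetype == "average":
        pattern = AVERAGE_PATTERN
    elif filetype == "standard-error":
        pattern = STANDARD_ERROR_PATTERN
    else:
        raise ValueError(f"unknown file type: {filetype!r}")
    # fullmatch requires the entire cell to match, not just a prefix.
    return pattern.fullmatch(value) is not None


def has_dos_line_ending(raw_line: bytes) -> bool:
    """Return True if the raw line ends with a DOS (CRLF) line ending."""
    return raw_line.endswith(b"\r\n")
```

For example, ~check_cell("0.555", "average")~ passes, while ~check_cell(".555", "average")~ and ~check_cell("5.555iloveguix", "average")~ do not. Matching the whole cell against one pattern covers the empty-cell, spurious-character and decimal-place rules together; whether the real checks report these cases separately is not covered by this sketch.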