quality control of delimited files

project goals

This project started as a collaboration with Arthur Centeno to check TSV files against the following criteria:

  • no empty data cells
  • no data cells with spurious characters, such as eeeee or 5.555iloveguix
  • decimal numbers must conform to the following criteria:
    • when checking an average file, decimal numbers must have exactly three digits to the right of the dot.
    • when checking a standard error file, decimal numbers must have six or more digits to the right of the dot.
    • there must be a digit to the left of the dot (e.g. 0.55555 is allowed but .55555 is not).
  • line endings must be Unix (LF), not DOS (CRLF)
  • check strain headers against a source of truth (see strains.csv)
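
The cell-level and line-ending rules above can be sketched in a few lines. This is an illustrative Python sketch, not the project's own implementation: the names `PATTERNS`, `cell_problem`, and `has_dos_line_endings` and the file-kind labels "average" and "standard-error" are hypothetical. It also assumes that every data cell is a decimal number and that a leading minus sign is acceptable, neither of which the criteria state.

```python
import re

# Hypothetical file kinds mapped to the decimal format each one requires.
# Assumption: an optional leading minus sign is allowed.
PATTERNS = {
    # at least one digit, a dot, then exactly three digits
    "average": re.compile(r"-?[0-9]+\.[0-9]{3}"),
    # at least one digit, a dot, then six or more digits
    "standard-error": re.compile(r"-?[0-9]+\.[0-9]{6,}"),
}


def cell_problem(cell, kind):
    """Return a short description of what is wrong with a data cell, or None."""
    if cell == "":
        return "empty cell"
    # fullmatch rejects trailing junk (5.555iloveguix) and a missing
    # leading digit (.555), because the whole cell must fit the pattern.
    if PATTERNS[kind].fullmatch(cell) is None:
        return "malformed number"
    return None


def has_dos_line_endings(raw):
    """Return True if the raw bytes of a file contain a DOS (CRLF) line ending."""
    return b"\r\n" in raw
```

For example, `cell_problem("0.555", "average")` returns `None`, while `cell_problem(".555", "average")` and `cell_problem("5.555iloveguix", "average")` both return `"malformed number"`. The line-ending check takes bytes on purpose: reading a file in text mode may translate CRLF to LF before the check ever sees it.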


  • do not hard-code the path when loading strain files
  • full test coverage

development with guix

guix shell --container --manifest=manifest.scm

running tests

sbcl --load run-tests.lisp