From 89610108c3edb5b494eb716a03a97246358d39ce Mon Sep 17 00:00:00 2001
From: Frederick Muriuki Muriithi
Date: Fri, 2 Feb 2024 06:14:50 +0300
Subject: New issue: Quality control of Data in Uploaded R/qtl2 Bundles

Detail some quality-control checks that will be run against the data in
the uploaded R/qtl2 bundles.
---
 issues/quality-control/qc-r-qtl2-bundles.gmi | 53 ++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)
 create mode 100644 issues/quality-control/qc-r-qtl2-bundles.gmi

diff --git a/issues/quality-control/qc-r-qtl2-bundles.gmi b/issues/quality-control/qc-r-qtl2-bundles.gmi
new file mode 100644
index 0000000..896831d
--- /dev/null
+++ b/issues/quality-control/qc-r-qtl2-bundles.gmi
@@ -0,0 +1,53 @@
+# Quality Control of Data in Uploaded R/qtl2 Bundles
+
+## Tags
+
+* assigned: fredm, acenteno
+* status: open
+* type: feature request
+* priority: medium
+* keywords: quality control, QC, R/qtl2 bundle
+
+## Description
+
+Currently (2024-02-02T05:41+03:00UTC), the code simply allows the upload of data, doing the bare minimum in terms of quality control. In this document, we detail the quality control checks that are required to be run against the uploaded data, to ensure the data we have is acceptable.
+
+The following "key" details the meanings of certain notations in this file:
+
+* //[ ]//: not started
+* //[-]//: partially done or in progress
+* //[x]//: completed
+
+### [-] Control File
+
+* [x] MUST exist in bundle
+* [x] One and only one control file in the bundle
+* [-] Defaults for control data are auto-provided by code
+* [ ] Every file listed in control file MUST exist in the bundle
+
+### [ ] //geno// File(s)
+
+* [ ] Every value existing in file is one of the genotype encodings in the control file
+
+### [ ] //phenocovar// File(s)
+
+* [ ] At least one of the //phenocovar// files contains a **description** column
+* [ ] The description of every phenotype fits the rules[1]
+
+### [ ] //pheno// File(s)
+
+* [ ] Check for a minimal number of decimal places (three?)
+
+### [ ] //phenose// File(s)
+
+This is a proposed addition for our specific use-case. If the data in the //pheno// file(s) was derived from averaging values, then the user could provide the corresponding "//standard error//" file(s).
+
+* [ ] Check for a minimal number of decimal places (six?)
+
+## Questions Fred Has
+
+* Is there a way to detect whether data has been log2 normalised? GN requires that all data is log2 normalised.
+
+## Resources
+
+* [1]: Description rules: https://info.genenetwork.org/faq.php#q-22
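
A few of the checks listed in the patch could be sketched roughly as follows. This is only an illustration and not the uploader's actual code: the function names (`missing_files`, `invalid_genotype_values`, `has_min_decimals`) are hypothetical, the control file is assumed to be already parsed into a dict, and the missing-value strings, comment character and separator are assumed defaults that a real check would read from the control file.

```python
import csv
import io
import re

# Plain decimal number, capturing the digits after the decimal point (if any).
DECIMAL = re.compile(r"[-+]?\d+(?:\.(\d+))?")


def missing_files(control, bundle_names,
                  file_keys=("geno", "pheno", "phenocovar", "phenose")):
    """List files named in the control data that are absent from the bundle.

    `file_keys` is an assumption: the control-file keys that name data files.
    """
    missing = []
    for key in file_keys:
        listed = control.get(key, [])
        if isinstance(listed, str):  # a key may hold one name or a list of names
            listed = [listed]
        missing.extend(name for name in listed if name not in bundle_names)
    return missing


def invalid_genotype_values(geno_text, encodings,
                            na_strings=("-", "NA"), sep=",", comment_char="#"):
    """Return (row id, column, value) for each cell outside the allowed set.

    Assumes the first column holds row identifiers and the first
    non-comment line is the header.
    """
    allowed = set(encodings) | set(na_strings)
    lines = (line for line in io.StringIO(geno_text)
             if not line.startswith(comment_char))
    rows = csv.reader(lines, delimiter=sep)
    header = next(rows, None)
    if header is None:
        return []
    errors = []
    for row in rows:
        for column, value in zip(header[1:], row[1:]):
            if value not in allowed:
                errors.append((row[0], column, value))
    return errors


def has_min_decimals(value, minimum=3):
    """True when `value` is a plain decimal with at least `minimum` decimal places."""
    match = DECIMAL.fullmatch(value.strip())
    if match is None:
        return False  # not a plain decimal number (e.g. a missing-value marker)
    return len(match.group(1) or "") >= minimum
```

Each function returns data rather than raising, so a caller could collect every problem in a bundle and report them together. The decimal-place check would use `minimum=3` for //pheno// values and `minimum=6` for //phenose// values, if the thresholds proposed in the issue are adopted.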