author: Frederick Muriuki Muriithi, 2024-02-02 06:14:50 +0300
committer: Frederick Muriuki Muriithi, 2024-02-02 06:15:59 +0300
commit: 89610108c3edb5b494eb716a03a97246358d39ce
tree: 6eb26ccd9484d6b888c2c41b017256fc689467a1 /issues
parent: deff59a0a6b53bff87451b9945de8c529bff3db5
New issue: Quality control of Data in Uploaded R/qtl2 Bundles
Detail some quality-control checks that will be run against the data in the uploaded R/qtl2 bundles.
Diffstat (limited to 'issues')
-rw-r--r-- issues/quality-control/qc-r-qtl2-bundles.gmi | 53
1 files changed, 53 insertions, 0 deletions
diff --git a/issues/quality-control/qc-r-qtl2-bundles.gmi b/issues/quality-control/qc-r-qtl2-bundles.gmi
new file mode 100644
index 0000000..896831d
--- /dev/null
+++ b/issues/quality-control/qc-r-qtl2-bundles.gmi
@@ -0,0 +1,53 @@
+# Quality Control of Data in Uploaded R/qtl2 Bundles
+
+## Tags
+
+* assigned: fredm, acenteno
+* status: open
+* type: feature request
+* priority: medium
+* keywords: quality control, QC, R/qtl2 bundle
+
+## Description
+
+Currently (2024-02-02T05:41+03:00), the code simply accepts uploaded data and performs only minimal quality control. This document details the quality-control checks that must be run against the uploaded data to ensure that the data we store is acceptable.
+
+The following "key" details the meanings of certain notations in this file:
+
+* //[ ]//: not started
+* //[-]//: partially done or in progress
+* //[x]//: completed
+
+### [-] Control File
+
+* [x] MUST exist in bundle
+* [x] One and only one control file in the bundle
+* [-] Defaults for control data are auto-provided by code
+* [ ] Every file listed in control file MUST exist in the bundle
+
+### [ ] //geno// File(s)
+
+* [ ] Every value in the file is one of the genotype encodings listed in the control file
+
+### [ ] //phenocovar// File(s)
+
+* [ ] At least one of the //phenocovar// files contains a **description** column
+* [ ] The description of every phenotype fits the rules[1]
+
+### [ ] //pheno// File(s)
+
+* [ ] Check that values have a minimum number of decimal places (three?)
+
+### [ ] //phenose// File(s)
+
+This is a proposed addition for our specific use-case. If the data in the //pheno// file(s) was derived from averaging values, then the user could provide the corresponding "//standard error//" file(s).
+
+* [ ] Check that values have a minimum number of decimal places (six?)
+
+## Questions Fred Has
+
+* Is there a way to detect whether data has been log2 normalised? GN requires that all data is log2 normalised.
+
+## Resources
+
+* [1]: Description rules: https://info.genenetwork.org/faq.php#q-22
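
Three of the checks listed in the issue (every file named in the control file exists in the bundle, every //geno// value is a known encoding, and values carry a minimum number of decimal places) could be sketched roughly as below. This is a minimal illustration, not the project's implementation. The function names are hypothetical. It assumes the control data is already parsed into a dict, that the bundle's member names are available as a list, and that the //geno// file is delimited, non-transposed text with sample IDs in the first column and no comment lines. The fallback values for "genotypes", "na.strings" and "sep" follow the R/qtl2 defaults as I understand them and should be checked against the R/qtl2 documentation.

```python
import csv
import io
from decimal import Decimal, InvalidOperation

# Control-file keys that name data files (an assumed, non-exhaustive subset).
FILE_KEYS = ("geno", "pheno", "phenocovar", "gmap", "pmap", "covar")


def missing_files(control, bundle_members):
    """Return files named in the control data that are absent from the bundle."""
    listed = []
    for key in FILE_KEYS:
        value = control.get(key, [])
        # A key may hold a single file name or a list of file names.
        listed.extend([value] if isinstance(value, str) else value)
    return sorted(set(listed) - set(bundle_members))


def invalid_genotypes(control, geno_text):
    """Return (sample_id, marker, value) for each value that is not an encoding."""
    # Assumed defaults, used only when the control data omits these keys.
    allowed = set(control.get("genotypes", {"A": 1, "H": 2, "B": 3}))
    allowed |= set(control.get("na.strings", ["NA"]))
    reader = csv.reader(io.StringIO(geno_text), delimiter=control.get("sep", ","))
    header = next(reader)
    errors = []
    for row in reader:
        # Skip the first column, which holds the sample ID.
        for marker, value in zip(header[1:], row[1:]):
            if value not in allowed:
                errors.append((row[0], marker, value))
    return errors


def too_few_decimals(values, minimum=3):
    """Return the values written with fewer than `minimum` decimal places."""
    flagged = []
    for text in values:
        try:
            exponent = Decimal(text).as_tuple().exponent
        except InvalidOperation:
            continue  # non-numeric text (e.g. "NA") belongs to a separate check
        if not isinstance(exponent, int):
            continue  # NaN and infinities have no decimal places to count
        if -exponent < minimum:
            flagged.append(text)
    return flagged
```

Each function returns a list of problems instead of raising on the first one, so that a single upload could report every failure at once. The decimal-place check works on the text as written in the file, since converting to a float first would lose trailing zeros.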