path: root/scripts
Age | Commit message | Author
2024-10-22 | Refactor `qc_pheno_file` and reuse it for different file types. The QC/QA steps taken by the `qc_pheno_file` function are very similar across the "pheno", "phenose" and "phenonum" files. This commit makes `qc_pheno_file` a higher-order function: the file-type specific check(s) are passed in as a callable (function) to be used for the QC/QA process. | Frederick Muriuki Muriithi
2024-10-22 | Check for errors in `pheno` files. | Frederick Muriuki Muriithi
2024-10-21 | Check `phenocovar` files for errors. | Frederick Muriuki Muriithi
2024-10-17 | Cleanup: Delete all extracted files after processing. | Frederick Muriuki Muriithi
2024-10-17 | Leave TODO notes for what needs to be done. | Frederick Muriuki Muriithi
2024-10-17 | Fetch samples from database. Fetch the samples from the database. These will be used to verify that the samples in the phenotype files already exist in the database and are valid. | Frederick Muriuki Muriithi
2024-10-17 | Undo transpose for any transposed files. To reduce the complexity involved in processing the files, we undo any transposition of the CSV files for those files that are marked as transposed. | Frederick Muriuki Muriithi
2024-10-17 | Pass new arguments to QC function. | Frederick Muriuki Muriithi
2024-10-17 | Add `speciesid` and `populationid` arguments to the script. | Frederick Muriuki Muriithi
2024-10-17 | Add the working directory argument to the script. Add a `--working-dir` argument to allow passing a directory in which the script processes the files. If not provided, it defaults to a directory within the system's temporary directory. | Frederick Muriuki Muriithi
2024-10-17 | Extract the R/qtl2 bundle for processing. To enable processing of the files individually, this commit enables the extraction of the files into a known working directory in which all further processing will take place. | Frederick Muriuki Muriithi
2024-10-17 | Extract common functions. | Frederick Muriuki Muriithi
2024-10-17 | Save errors for each file in lists. Parallelise error checking. Save the errors for each file in a Redis list for that file. Make error checking parallel, i.e. ensure every file of a particular type is checked completely independently of other files of the same type. | Frederick Muriuki Muriithi
2024-10-17 | Rewrite the QC code for R/qtl2 | Frederick Muriuki Muriithi
2024-10-14 | Initialise background script for running QC on phenotype bundles. | Frederick Muriuki Muriithi
2024-10-14 | BugFix: Use provided prefix. Use the provided prefix rather than calling the `jobs.jobsnamespace()` function, which depends on an app context existing. | Frederick Muriuki Muriithi
2024-10-14 | Make addition of arguments independent of each other. | Frederick Muriuki Muriithi
2024-09-09 | Enable samples uploads. | Frederick Muriuki Muriithi
2024-09-03 | Initialise the populations package and update references. | Frederick Muriuki Muriithi
2024-08-28 | Move code handling expression data upload into new module. | Frederick Muriuki Muriithi
2024-08-16 | Log out correct parameters. | Frederick Muriuki Muriithi
2024-08-13 | Bug: cross reference with NULL cM when "gmap" file is absent. The "gmap" file might not exist in some bundles. In those instances, cross-reference the data without including the genotypes' physical positions (cM). | Frederick Muriuki Muriithi
2024-08-12 | Rename module: Module contains exceptions classes. | Frederick Muriuki Muriithi
2024-08-08 | Fix bugs and pass in logger to functions. | Frederick Muriuki Muriithi
2024-08-06 | Pass logger on to inner functions. Pass the logger forward to inner functions to help with debugging. | Frederick Muriuki Muriithi
2024-07-25 | Fix typing and linting errors. | Frederick Muriuki Muriithi
2024-07-25 | Rename module: qc_app --> uploader | Frederick Muriuki Muriithi
2024-07-05 | bug: Return a hashable key, not a dict. | Frederick Muriuki Muriithi
2024-07-02 | Call correct method. | Frederick Muriuki Muriithi
2024-07-02 | Ensure no duplicated values for the query. | Frederick Muriuki Muriithi
2024-07-01 | Check for genotype samples in the database. Check for genotype samples in both the R/qtl2 file and in the database. | Frederick Muriuki Muriithi
2024-06-27 | Fix bug with the logging setup. | Frederick Muriuki Muriithi
2024-04-12 | Move entry-point module to scripts package. This ensures the entry-point script/module is actually installed together with the rest of the code. | Frederick Muriuki Muriithi
2024-04-08 | Fix pylint and mypy errors. | Frederick Muriuki Muriithi
2024-04-04 | Remove unused database connection. | Frederick Muriuki Muriithi
2024-04-03 | Reduce size of data inserted per query. Reduce the size of data inserted per query since MariaDB allows a packet with a maximum size of 1GB. This should hopefully resolve the `…OperationalError: (2006, 'Server has gone away')` error. | Frederick Muriuki Muriithi
2024-03-29 | Quiet linter. | Frederick Muriuki Muriithi
2024-03-22 | Notify user if identifiers are not consistent. | Frederick Muriuki Muriithi
2024-03-22 | Map names in files to names in database. | Frederick Muriuki Muriithi
2024-03-22 | Fix linting issue. | Frederick Muriuki Muriithi
2024-03-20 | Fix bug: correctly merge standard-error values in file to data in db. The `read_datavalues(…)` function returns a dict of the form `{ ProbeSetName01: ({…}, …), ProbeSetName02: ({…}, …), ︙ }`. Previously, the generator would try to index into the keys of the datavalues, which were strings, leading to an error. This commit changes the generator to return the values of the datavalues dict as a flattened list of values. | Frederick Muriuki Muriithi
2024-02-28 | Fix bug: fetch from the cursor, not from the return value of `cursor.execute(…)` (ref: load-raw-data-no-parsing) | Frederick Muriuki Muriithi
2024-02-21 | Check that samples/cases are consistent. Ensure that **ALL** samples/cases/individuals mentioned in any of the pheno files actually exist in at least one of the geno files. | Frederick Muriuki Muriithi
2024-02-21 | Pass 'filename' value to error checker function. | Frederick Muriuki Muriithi
2024-02-20 | Track filename in the errors. R/qtl2 bundles can contain more than one file of the same type. When errors are encountered in any of the files, we need to be able to inform the user which file it is, in addition to the line and column number. | Frederick Muriuki Muriithi
2024-02-15 | Filter out NULL values | Frederick Muriuki Muriithi
2024-02-15 | Only log out information if the check is actually run. | Frederick Muriuki Muriithi
2024-02-13 | Add some parallelism to the QC Checks | Frederick Muriuki Muriithi
2024-02-13 | Provide nice UI progress indicators. | Frederick Muriuki Muriithi
2024-02-12 | Check for errors in the 'phenose' file. | Frederick Muriuki Muriithi
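
The 2024-10-22 refactor in the log above describes turning `qc_pheno_file` into a higher-order function that receives the file-type specific check as a callable. The following is only a minimal sketch of that pattern, not the repository's code: the names `qc_file`, `check_decimal` and `check_non_negative_int`, the error tuple layout, and the assumption that the first column holds the sample identifier are all hypothetical.

```python
import csv
import io
from typing import Callable, Iterable, Optional

# Hypothetical error record: (filename, row number, column name, message).
Error = tuple[str, int, str, str]
# A checker takes one cell value and returns an error message, or None if valid.
Checker = Callable[[str], Optional[str]]


def check_decimal(value: str) -> Optional[str]:
    """Hypothetical check: the value must be 'NA' or parse as a number."""
    if value == "NA":
        return None
    try:
        float(value)
    except ValueError:
        return f"expected a number or NA, got {value!r}"
    return None


def check_non_negative_int(value: str) -> Optional[str]:
    """Hypothetical check: the value must be 'NA' or ASCII digits only."""
    if value == "NA" or (value.isascii() and value.isdigit()):
        return None
    return f"expected a non-negative integer or NA, got {value!r}"


def qc_file(filename: str, lines: Iterable[str], checker: Checker) -> list[Error]:
    """Run the shared QC loop, delegating per-cell validation to `checker`."""
    errors: list[Error] = []
    reader = csv.reader(lines)
    headers = next(reader, [])
    for rowno, row in enumerate(reader, start=2):  # row 1 is the header
        # Skip the first column, assumed here to hold the sample identifier.
        for column, value in zip(headers[1:], row[1:]):
            message = checker(value.strip())
            if message is not None:
                errors.append((filename, rowno, column, message))
    return errors


# The same loop serves different file types; only the checker changes.
pheno = io.StringIO("id,bw,len\nS1,12.5,abc\nS2,NA,3\n")
errors = qc_file("pheno.csv", pheno, check_decimal)
```

Keeping the filename, row and column in each error record mirrors the 2024-02-20 entry, which notes that a bundle can hold several files of the same type and the user needs to know which one failed.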