Age | Commit message | Author |
2024-08-12 | Update check for missing files: Check from directory.
Enable the check for missing files to act upon a directory into which
the R/qtl2 bundle has been extracted.
| Frederick Muriuki Muriithi |
2024-08-12 | Add utility to transpose CSVs, renaming the original file. | Frederick Muriuki Muriithi |
2024-08-12 | Define new InvalidValue error type.
Redesign the InvalidValue error type for the R/qtl2 bundles to list
the errors according to the row and column titles rather than line
numbers. This makes the error-reporting independent of whether or not
the file is transposed.
This will replace the use of the older
`quality_control.errors.InvalidValue` error type that depends on the
line and column numbers, and thus cannot work with transposable files.
| Frederick Muriuki Muriithi |
2024-08-12 | Rename module: Module contains exception classes. | Frederick Muriuki Muriithi |
2024-08-09 | Read R/qtl2 control data from a directory with extracted files. | Frederick Muriuki Muriithi |
2024-08-08 | Function to transpose CSV files.
Some files come in a transposed form, so we need to transpose them
again in order to use the same processing code for all files.
| Frederick Muriuki Muriithi |
2024-08-08 | Add utility function to extract R/qtl2 zip bundles | Frederick Muriuki Muriithi |
2024-06-20 | Check for special files that might share names/extensions
Check for special files and hidden files that might be inadvertently
added to the zip bundle by the operating system in use that might
share names and/or extensions with the main bundle files.
| Frederick Muriuki Muriithi |
2024-02-21 | Check that samples/cases are consistent
Ensure that **ALL** samples/cases/individuals mentioned in any of the
pheno files actually exist in at least one of the geno files.
| Frederick Muriuki Muriithi |
2024-02-20 | Track filename in the errors
R/qtl2 bundles can contain more than one file of the same type. When
errors are encountered in any of the files, we need to be able to
inform the user which file it is, in addition to the line and column
number.
| Frederick Muriuki Muriithi |
2024-02-20 | Generalise fetching of samples/cases/individuals. | Frederick Muriuki Muriithi |
2024-02-20 | Read samples from geno file. | Frederick Muriuki Muriithi |
2024-02-20 | Read each file separately
Provide the function 'read_file_data' in the 'r_qtl.r_qtl2' module to
read each file in the bundle separately.
The function 'file_data' in the 'r_qtl.r_qtl2' module reads *ALL* the
files of a particular type (e.g. geno files) and returns a single
generator object with the data from *ALL* the files. This does not
lend itself well to error checking.
We needed a way to check for errors, and report them for each and
every file in the bundle, for easier tracking and fixing.
| Frederick Muriuki Muriithi |
2024-02-20 | Stand-alone function to read control file
Add a function that, given the path to the zip file, will read the
control data. It creates its own context manager.
| Frederick Muriuki Muriithi |
2024-02-16 | Replace genotype codes with values in control file. | Frederick Muriuki Muriithi |
2024-02-16 | Convert missing value codes to None | Frederick Muriuki Muriithi |
2024-02-16 | Strip comment lines. | Frederick Muriuki Muriithi |
2024-02-16 | Read raw text data from a file in the zip bundle | Frederick Muriuki Muriithi |
2024-02-13 | Make "FILE_TYPES" part of public interface for module/package. | Frederick Muriuki Muriithi |
2024-02-12 | Refactor: Use new decimal places checker. | Frederick Muriuki Muriithi |
2024-02-12 | Check for errors in the 'phenose' file. | Frederick Muriuki Muriithi |
2024-02-12 | Provide the key for each file listed in the control file. | Frederick Muriuki Muriithi |
2024-02-08 | Generalise error retrieval: extract common structure
Extract the common structure into a separate function and pass in
checkers that return the errors they find.
| Frederick Muriuki Muriithi |
2024-02-08 | Use error objects rather than plain tuple values. | Frederick Muriuki Muriithi |
2024-02-06 | Check that pheno values are numerical and have at least 3 decimal places | Frederick Muriuki Muriithi |
2024-02-05 | Check that data in geno file is valid
Add a function to ensure the values in the geno files are all listed
in the control data under the "genotypes" key.
| Frederick Muriuki Muriithi |
2024-02-05 | Fix linting and type errors. | Frederick Muriuki Muriithi |
2024-02-05 | Do general bundle validation and show errors
* Display any and all errors on the UI
* Move `validate_bundle` to QC module and refactor to use
`missing_files`
| Frederick Muriuki Muriithi |
2024-02-05 | Retrieve list of all files, and list of missing files
Add a QC function to list all files listed in the control file, and
another to list only the files missing from the bundle.
| Frederick Muriuki Muriithi |
2024-02-02 | List file types in a single place for easier reuse | Frederick Muriuki Muriithi |
2024-02-02 | Ensure control file defaults are set up in code. | Frederick Muriuki Muriithi |
2024-01-16 | Provide defaults for various control variables
`na.strings` has a default value of "NA", as stated in
https://kbroman.org/qtl2/assets/vignettes/input_files.html#CSV_files
quote:
> Missing value codes will be specified in the control file (as
> na.strings, with default value "NA") and will apply across all
> files, so a missing value code for one file cannot be an allowed
> value in another file.
For `comment.char`:
> The CSV files can include a header with a set of comment lines
> initiated by a value specified in the control file as comment.char
> (with default value "#").
For `sep`:
The default separator is expected to be the comma, as stated in
https://kbroman.org/qtl2/assets/vignettes/input_files.html#field-separator
quote:
> If the data files use a separator other than a comma ...
| Frederick Muriuki Muriithi |
2024-01-15 | Process `na.strings` even for default cases
There was a bug where the `na.strings` were not processed correctly if
the user called the `r_qtl.r_qtl2.file_data(...)` function without
explicitly providing the `process_*` arguments.
This commit fixes that.
| Frederick Muriuki Muriithi |
2024-01-15 | Extract common functional tools to separate package. | Frederick Muriuki Muriithi |
2024-01-10 | Provide convenience functions to avoid subtle call errors | Frederick Muriuki Muriithi |
2024-01-10 | Make identifier column name explicit
Since the R/qtl2 bundle generator could name the identifier column
anything, this commit converts the incoming identifier column name
into something explicit that we know and can use.
| Frederick Muriuki Muriithi |
2024-01-09 | Raise exception on reading non-existing file
The validation checks ensure that whatever files are listed in the
control file exist in the zip file bundle. It is still possible,
however, that the code tries to read a file that does not exist in the
zip file and is not listed in the control file. In those cases, raise
the appropriate exception.
| Frederick Muriuki Muriithi |
2024-01-08 | Upload R/qtl2 zip bundle and check for errors. | Frederick Muriuki Muriithi |
2024-01-04 | Parse sex information from R/qtl bundle. | Frederick Muriuki Muriithi |
2024-01-04 | Parse cross information from R/qtl2 bundle. | Frederick Muriuki Muriithi |
2024-01-04 | Process sex and cross information data in "covar" files. | Frederick Muriuki Muriithi |
2024-01-04 | Parse multiple files with same file key. | Frederick Muriuki Muriithi |
2024-01-03 | Use generic parser. Remove obsoleted functions. | Frederick Muriuki Muriithi |
2024-01-03 | Parse founder_geno files. Generalise parsing files.
* Add tests for parsing "founder_geno" files
* Extract common file parsing structure out to more general function
* Use generic function to parse "founder_geno" file in test
| Frederick Muriuki Muriithi |
2024-01-03 | Parse the phenotype data from the R/qtl2 bundle. | Frederick Muriuki Muriithi |
2024-01-03 | Rename argument and add documentation to functions. | Frederick Muriuki Muriithi |
2024-01-03 | Extract processing of transposed files into reusable function.
The processing of transposed files is similar across files. This
commit extracts the common parts into a separate function.
| Frederick Muriuki Muriithi |
2024-01-03 | Refactor: Extract potentially reusable functions
The processing of transposed files is probably going to be very
similar, thus the need to extract some reusable code from the
geno-file-specific function in preparation.
| Frederick Muriuki Muriithi |
2024-01-02 | Abstract away non-transposed file processing
Since the processing of non-transposed files is mostly similar,
abstract away the common operations into a separate function and use
the function instead of repeating the same pattern of code throughout
the codebase.
| Frederick Muriuki Muriithi |
2024-01-02 | Cleanup: Fix linting and typing errors and update docs. | Frederick Muriuki Muriithi |