gn-uploader: Data uploader for GeneNetwork

Age	Commit message	Author
2024-08-12	Bug: Ensure file type values are lists.	Frederick Muriuki Muriithi

2024-08-12	Update check for missing files: Check from directory.	Frederick Muriuki Muriithi
	Enable the check for missing files to act on the directory into which the R/qtl2 bundle has been extracted.
2024-08-12	Add utility to transpose CSVs, renaming the original file.	Frederick Muriuki Muriithi

2024-08-12	Define new InvalidValue error type.	Frederick Muriuki Muriithi
	Redesign the InvalidValue error type for the R/qtl2 bundles to list the errors according to the row and column titles rather than line numbers. This makes the error reporting independent of whether or not the file is transposed. This will replace the use of the older `quality_control.errors.InvalidValue` error type, which depends on the line and column numbers and thus cannot work with transposable files.
2024-08-12	Rename module: Module contains exception classes.	Frederick Muriuki Muriithi

2024-08-09	Read R/qtl2 control data from a directory with extracted files.	Frederick Muriuki Muriithi

2024-08-08	Function to transpose CSV files.	Frederick Muriuki Muriithi
	Some files come in a transposed form, so we need to transpose them again in order to use the same processing code for all files.
2024-08-08	Add utility function to extract R/qtl2 zip bundles	Frederick Muriuki Muriithi

2024-06-20	Check for special files that might share names/extensions	Frederick Muriuki Muriithi
	Check for special files and hidden files, inadvertently added to the zip bundle by the operating system in use, that might share names and/or extensions with the main bundle files.
2024-02-21	Check that samples/cases are consistent	Frederick Muriuki Muriithi
	Ensure that ALL samples/cases/individuals mentioned in any of the pheno files actually exist in at least one of the geno files.
2024-02-20	Track filename in the errors	Frederick Muriuki Muriithi
	R/qtl2 bundles can contain more than one file of the same type. When errors are encountered in any of the files, we need to be able to inform the user which file it is, in addition to the line and column numbers.
2024-02-20	Generalise fetching of samples/cases/individuals.	Frederick Muriuki Muriithi

2024-02-20	Read samples from geno file.	Frederick Muriuki Muriithi

2024-02-20	Read each file separately	Frederick Muriuki Muriithi
	Provide the function 'read_file_data' in the 'r_qtl.r_qtl2' module to read each file in the bundle separately. The function 'file_data' in the 'r_qtl.r_qtl2' module reads ALL the files of a particular type (e.g. geno files) and returns a single generator object with the data from ALL the files. This does not lend itself well to error checking. We needed a way to check for errors and report them for each and every file in the bundle, for easier tracking and fixing.
2024-02-20	Stand-alone function to read control file	Frederick Muriuki Muriithi
	Add a function that, given the path to the zip file, will read the control data. It creates its own context manager.
2024-02-16	Replace genotype codes with values in control file.	Frederick Muriuki Muriithi

2024-02-16	Convert missing value codes to None	Frederick Muriuki Muriithi

2024-02-16	Strip comment lines.	Frederick Muriuki Muriithi

2024-02-16	Read raw text data from a file in the zip bundle	Frederick Muriuki Muriithi

2024-02-13	Make "FILE_TYPES" part of public interface for module/package.	Frederick Muriuki Muriithi

2024-02-12	Refactor: Use new decimal places checker.	Frederick Muriuki Muriithi

2024-02-12	Check for errors in the 'phenose' file.	Frederick Muriuki Muriithi

2024-02-12	Provide the key for each file listed in the control file.	Frederick Muriuki Muriithi

2024-02-08	Generalise error retrieval: extract common structure	Frederick Muriuki Muriithi
	Extract the common structure into a separate function and pass in checkers that return the errors they find.
2024-02-08	Use error objects rather than plain tuple values.	Frederick Muriuki Muriithi

2024-02-06	Check that pheno values are numerical and have at least 3 decimal places	Frederick Muriuki Muriithi

2024-02-05	Check that data in geno file is valid	Frederick Muriuki Muriithi
	Add a function to ensure the values in the geno files are all listed in the control data under the "genotypes" key.
2024-02-05	Fix linting and type errors.	Frederick Muriuki Muriithi

2024-02-05	Do general bundle validation and show errors	Frederick Muriuki Muriithi
	* Display any and all errors on the UI
	* Move `validate_bundle` to QC module and refactor to use `missing_files`
2024-02-05	Retrieve list of all files, and list of missing files	Frederick Muriuki Muriithi
	Add a QC function to list all files listed in the control file, and another to list only the files missing from the bundle.
2024-02-02	List file types in a single place for easier reuse	Frederick Muriuki Muriithi

2024-02-02	Ensure control file defaults are set up in code.	Frederick Muriuki Muriithi

2024-01-16	Provide defaults for various control variables	Frederick Muriuki Muriithi
	`na.strings` has a default value of "NA", as stated in https://kbroman.org/qtl2/assets/vignettes/input_files.html#CSV_files:
	> Missing value codes will be specified in the control file (as na.strings, with default value "NA") and will apply across all files, so a missing value code for one file cannot be an allowed value in another file.
	For `comment.char`:
	> The CSV files can include a header with a set of comment lines initiated by a value specified in the control file as comment.char (with default value "#").
	For `sep`: the default separator is expected to be the comma, as stated in https://kbroman.org/qtl2/assets/vignettes/input_files.html#field-separator:
	> If the data files use a separator other than a comma ...
2024-01-15	Process `na.strings` even for default cases	Frederick Muriuki Muriithi
	There was a bug where the `na.strings` were not processed correctly if the user called the `r_qtl.r_qtl2.file_data(...)` function without explicitly providing the `process_*` arguments. This commit fixes that.
2024-01-15	Extract common functional tools to separate package.	Frederick Muriuki Muriithi

2024-01-10	Provide convenience functions to avoid subtle call errors	Frederick Muriuki Muriithi

2024-01-10	Make identifier column name explicit	Frederick Muriuki Muriithi
	Since the R/qtl2 bundle generator could name the identifier column anything, this commit converts the incoming identifier column name into something explicit that we know and can use.
2024-01-09	Raise exception on reading non-existing file	Frederick Muriuki Muriithi
	The validation checks ensure that whatever files are listed in the control file exist in the zip file bundle. It is still possible, however, that the code tries to read a file that does not exist in the zip file and is not listed in the control file. In those cases, raise the appropriate exception.
2024-01-08	Upload R/qtl2 zip bundle and check for errors.	Frederick Muriuki Muriithi

2024-01-04	Parse sex information from R/qtl bundle.	Frederick Muriuki Muriithi

2024-01-04	Parse cross information from R/qtl2 bundle.	Frederick Muriuki Muriithi

2024-01-04	Process sex and cross information data in "covar" files.	Frederick Muriuki Muriithi

2024-01-04	Parse multiple files with same file key.	Frederick Muriuki Muriithi

2024-01-03	Use generic parser. Remove obsoleted functions.	Frederick Muriuki Muriithi

2024-01-03	Parse founder_geno files. Generalise parsing files.	Frederick Muriuki Muriithi
	* Add tests for parsing "founder_geno" files
	* Extract common file parsing structure out to more general function
	* Use generic function to parse "founder_geno" file in test
2024-01-03	Parse the phenotype data from the R/qtl2 bundle.	Frederick Muriuki Muriithi

2024-01-03	Rename argument and add documentation to functions.	Frederick Muriuki Muriithi

2024-01-03	Extract processing of transposed files into reusable function.	Frederick Muriuki Muriithi
	The processing of transposed files is similar across files. This commit extracts the common parts into a separate function.
2024-01-03	Refactor: Extract potentially reusable functions	Frederick Muriuki Muriithi
	The processing of transposed files is probably going to be very similar, thus the need to extract some reusable code from the geno-file-specific function in preparation.
2024-01-02	Abstract away non-transposed file processing	Frederick Muriuki Muriithi
	Since the processing of non-transposed files is mostly similar, abstract away the common operations into a separate function and use the function instead of repeating the same pattern of code throughout the codebase.
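
Several of the commits above deal with transposed CSV files, which are transposed back so that the same processing code can be used for all files. A minimal sketch of that idea follows. It is not the repository's implementation: the function name `transpose_csv_text`, its `sep` parameter, and the choice to pad ragged rows with empty strings are assumptions made for illustration.

```python
import csv
import io
from itertools import zip_longest


def transpose_csv_text(text: str, sep: str = ",") -> str:
    """Return `text` with its rows and columns swapped.

    Hypothetical helper, not part of gn-uploader.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=sep))
    # zip_longest pads short rows with "" so ragged input stays rectangular
    # (an assumption; a stricter version could raise an error instead).
    transposed = zip_longest(*rows, fillvalue="")
    out = io.StringIO()
    csv.writer(out, delimiter=sep, lineterminator="\n").writerows(transposed)
    return out.getvalue()


if __name__ == "__main__":
    print(transpose_csv_text("id,m1,m2\ns1,A,B\ns2,B,A\n"), end="")
```

Working on text in memory keeps the sketch independent of the file system. A file-based utility, such as the one the log describes as renaming the original file, would wrap the same step with reading and writing.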