Probe (cell) level data from the CEL file: These CEL values produced by GCOS are 75% quantiles from a set of 91 pixel values per cell.
Probe set data: The expression data were processed by Yanhua Qu (UTHSC). Probe set data were generated from the fully normalized CEL files (quantile and batch corrected) using the standard MAS 5 Tukey biweight procedure. A 1-unit difference represents roughly a two-fold difference in expression level. Expression levels below 5 are usually close to background noise levels.
Data quality control: A total of 62 samples passed RNA quality control.
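The statement that a 1-unit difference represents roughly a two-fold difference can be illustrated with a short sketch. It assumes the probe set values behave like a log2 scale, which the text implies but does not state outright; the function name `fold_difference` is hypothetical.

```python
def fold_difference(value_a, value_b):
    """Approximate fold difference between two expression values.

    Assumption: values are on a log2-like scale, so each unit of
    difference corresponds to roughly a doubling of expression.
    """
    return 2.0 ** abs(value_a - value_b)


# Example: a 1-unit difference is about two-fold, a 3-unit difference about eight-fold.
one_unit = fold_difference(9.0, 8.0)
three_units = fold_difference(8.0, 11.0)
```

The result is only approximate, since the text itself describes the relationship as rough.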
Probe level QC: Log2 probe data from all arrays were inspected in DataDesk before and after quantile normalization. Inspection involved examining scatterplots of pairs of arrays for signal homogeneity (i.e., high correlation and linearity of the bivariate plots) and reviewing the correlation coefficients for all pairs of arrays (62 x 61 / 2 = 1,891 pairs). Arrays whose probe data were not homogeneous when compared with any other array were flagged. If the correlation at the probe level was less than approximately 0.92, we deleted that array data set. Three arrays were lost during this process (BXD19_M_Str_Batch03, BXD23_F_Str_Batch03, and BXD24_F_Str_Batch03).
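The pairwise correlation screen described above can be sketched as follows. This is not the DataDesk procedure the authors used, only a minimal illustration in plain Python. The function names are hypothetical, and summarizing each array by the median of its correlations with the other arrays is an assumption made for this sketch; the text gives only the approximate 0.92 cutoff.

```python
from itertools import combinations
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def flag_low_correlation_arrays(arrays, threshold=0.92):
    """Return names of arrays whose median correlation with the others is below threshold.

    `arrays` maps an array name to its list of log2 probe values.
    Assumption: the median over partner arrays is used as the summary;
    the source only states a cutoff of approximately 0.92.
    """
    names = sorted(arrays)
    partners = {name: [] for name in names}
    # Visit every unordered pair once: n * (n - 1) / 2 comparisons.
    for a, b in combinations(names, 2):
        r = pearson(arrays[a], arrays[b])
        partners[a].append(r)
        partners[b].append(r)
    return [name for name in names if median(partners[name]) < threshold]
```

Using the median rather than the mean keeps one poor array from dragging down the summary of every array it is paired with.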
Probe set level QC: The final normalized strain averages were evaluated for outliers. This involved counting the number of times that the probe set value for a particular strain was more than two standard deviations from the mean of all strains. (We used the PDNN transform as our reference probe set data for this QC step.) Two strains, each represented by a single array, generated more than 5,000 outlier counts (10% of the number of probe sets). These two arrays generated a large number of outliers across the entire range of expression, and since we do not yet have replicate arrays for either of these two strains, we opted to delete them from the final April 2005 striatum data sets.
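The outlier count used in this step can be sketched as below. This is an illustration under stated assumptions, not the authors' actual code: the function names are hypothetical, the sample standard deviation is assumed, and the 10% review fraction mirrors the figure quoted in the text.

```python
from statistics import mean, stdev


def outlier_counts(strain_values, n_sd=2.0):
    """Count, per strain, the probe sets lying more than n_sd SDs from the all-strain mean.

    `strain_values` maps a strain name to its list of probe set values,
    with every list in the same probe set order.
    Assumption: the sample standard deviation across strains is used.
    """
    strains = list(strain_values)
    n_probe_sets = len(strain_values[strains[0]])
    counts = {strain: 0 for strain in strains}
    for i in range(n_probe_sets):
        column = [strain_values[strain][i] for strain in strains]
        centre, spread = mean(column), stdev(column)
        for strain in strains:
            if abs(strain_values[strain][i] - centre) > n_sd * spread:
                counts[strain] += 1
    return counts


def strains_to_review(counts, n_probe_sets, max_fraction=0.10):
    """Return strains whose outlier count exceeds max_fraction of all probe sets."""
    return [s for s, c in counts.items() if c > max_fraction * n_probe_sets]
```

Because each strain's own value contributes to the mean and standard deviation, a single strain can only exceed two standard deviations when enough other strains are present (at least six with the sample standard deviation).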