Diffstat (limited to 'general/datasets/Sa_m2_0905_p/processing.rtf')
-rw-r--r-- general/datasets/Sa_m2_0905_p/processing.rtf 26
1 files changed, 26 insertions, 0 deletions
diff --git a/general/datasets/Sa_m2_0905_p/processing.rtf b/general/datasets/Sa_m2_0905_p/processing.rtf
new file mode 100644
index 0000000..8a23d84
--- /dev/null
+++ b/general/datasets/Sa_m2_0905_p/processing.rtf
@@ -0,0 +1,26 @@
+<blockquote><strong>Probe (cell) level data from the CEL file: </strong>These CEL values produced by <a class="normal" href="http://www.affymetrix.com/support/technical/product_updates/gcos_download.affx" target="_blank">GCOS</a> are the 75% quantiles from a set of 91 pixel values per cell. Probe values were processed as follows:
+<ul>
+ <li>Step 1: We added an offset of 1.0 unit to each cell signal to ensure that all values could be logged without generating negative values.</li>
+ <li>Step 2: We took the log base 2 of each probe signal.</li>
+ <li>Step 3: We computed the Z scores for each probe signal.</li>
+ <li>Step 4: We multiplied all Z scores by 2.</li>
+ <li>Step 5: We added 8 to the value of all Z scores. The consequence of this simple set of transformations is to produce a set of Z scores that have a mean of 8, a variance of 4, and a standard deviation of 2. The advantage of this modified Z score is that a two-fold difference in expression level corresponds approximately to a 1 unit difference.</li>
+ <li>Step 6a: The 430A and 430B arrays include a set of 100 shared probe sets (2200 probes) that have identical sequences. These probes provide a way to calibrate expression of the A and B arrays to a common scale. The absolute mean expression on the 430B array is almost invariably lower than that on the 430A array. To bring the two arrays into alignment, we regressed Z scores of the common set of probes to obtain a linear regression correction that rescales the 430B arrays to the 430A array. In our case this involved multiplying all 430B Z scores by the slope of the regression and adding or subtracting a very small offset. The result of this step is that the mean of the 430A GeneChip expression is fixed at a value of 8, whereas that of the 430B chip is typically 7. Thus the average of the A and B arrays is approximately 7.5.</li>
+ <li>Step 6b: We recentered the whole set of 430A and B transcripts to a mean of 8 and a standard deviation of 2. This involved reapplying Steps 3 through 5 above, but now using the entire set of probes and probe sets from a merged 430A and B data set.</li>
+</ul>
+
+<p><strong>Probe set data from the TXT file: </strong>These TXT files were generated using MAS 5. The same simple steps described above were also applied to these values. Every microarray data set therefore has a mean expression of 8 with a standard deviation of 2. A 1-unit difference therefore represents roughly a two-fold difference in expression level. Expression levels below 5 are usually close to background noise levels.</p>
+</blockquote>
+
+<p>About the marker set:</p>
+
+<blockquote>
+<p>The 56 mice were each genotyped at 309 MIT microsatellite markers distributed across the genome, including the Y chromosome. The genotyping error check routine (Lincoln and Lander, 1992) implemented within <a class="normal" href="http://biostat.jhsph.edu/~kbroman/qtl" target="_blank">R/qtl</a> (Broman et al., 2003) showed no likely errors at p &lt; .01. Initial genotypes were generated at OHSU. Approximately 200 genotypes were generated at UTHSC by Jing Gu and Shuhua Qi.</p>
+</blockquote>
+
+<p>About the chromosome and megabase position values:</p>
+
+<blockquote>The chromosomal locations of M430A and M430B probe sets were determined by <a class="normal" href="http://genome.ucsc.edu/cgi-bin/hgBlat?command=start&amp;org=mouse" target="_blank">BLAT</a> analysis of concatenated probe sequences using the Mouse Genome Sequencing Consortium March 2005 (mm6) assembly. This BLAT analysis is performed periodically by Yanhua Qu as each new build of the mouse genome is released. We thank Yan Cui (UTHSC) for allowing us to use his Linux cluster to perform this analysis. It is possible to confirm the BLAT alignment results yourself simply by clicking on the <strong>Verify</strong> link in the Trait Data and Editing Form (right side of the <strong>Location</strong> line).
+
+<p>&nbsp;</p>
+</blockquote>
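
Note on the probe-level steps in the added file: the transformation in Steps 1 through 5 and the 430B-to-430A rescaling in Step 6a can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline actually used for this data set. The function names are hypothetical, the population standard deviation is assumed because the text does not say which estimator was used, and the regression is assumed to be an ordinary least-squares fit of the 430A scores on the 430B scores over the shared probes.

```python
import math
import statistics


def modified_z_scores(signals, offset=1.0, scale=2.0, center=8.0):
    """Steps 1-5: add offset, take log2, compute Z scores, multiply by 2, add 8."""
    logged = [math.log2(s + offset) for s in signals]
    mean = statistics.fmean(logged)
    # Assumption: population SD; the source does not name the estimator.
    sd = statistics.pstdev(logged)
    return [scale * (x - mean) / sd + center for x in logged]


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ slope * x + intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def rescale_b_to_a(b_scores, shared_b, shared_a):
    """Step 6a: map 430B scores onto the 430A scale using the shared probes."""
    # Assumption: regress the A-array scores on the B-array scores.
    slope, intercept = fit_line(shared_b, shared_a)
    return [slope * z + intercept for z in b_scores]


# Made-up cell intensities, for illustration only.
cells = [120.0, 340.0, 95.0, 2100.0, 780.0]
z = modified_z_scores(cells)
print(round(statistics.fmean(z), 6), round(statistics.pstdev(z), 6))
```

By construction the transformed values have a mean of 8 and a standard deviation of 2, which is what Step 5 describes. The sketch does not guard against a constant input, where the standard deviation is zero and the division fails.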