BXD Brain mRNA U74Av2 Database (August/03 Freeze)
About the mice used to map microarray data:
The set of animals used for mapping (a mapping panel) consists of 30 groups of genetically uniform mice of the BXD type. The parental strains are C57BL/6J (B6 or B) and DBA/2J (D2 or D). The first generation hybrid is labeled F1. The F1 hybrids were made by crossing B6 females to D2 males.
All other lines are recombinant inbred strains derived from C57BL/6J and DBA/2J crosses. BXD2 through BXD32 were produced by Dr. Benjamin Taylor starting in the late 1970s. BXD33 through BXD42 were also produced by Dr. Taylor, but they were generated in the 1990s. Lines BXDA12 and BXDA20 are two partially inbred advanced recombinant strains (F8 and F9) that are part of a large set of BXD-Advanced strains being produced by Drs. Robert Williams, Lu Lu, Guomin Zhou, Lee Silver, and Jeremy Peirce. There will eventually be ~45 of these strains. For additional background on recombinant inbred strains, please see http://www.nervenet.org/papers/bxn.html.
The table below lists the arrays by strain, sex, and age. Each array was hybridized to a pool of mRNA from 3 mice.
About the tissue used to generate these data:
Most expression data are averages based on three microarrays (U74Av2). Each individual array experiment involved a pool of brain tissue (forebrain plus the midbrain, but without the olfactory bulb) that was taken from three adult animals usually of the same age. A total of 83 arrays were used: 67 were female pools and 16 were male pools. Animals ranged in age from 56 to 441 days, usually with a balanced design (one pool at 8 weeks, one pool at ~20 weeks, one pool at approximately 1 year).
About data processing:
Probe (cell) level data from the .CEL file: These .CEL values produced by MAS 5.0 are the 75% quantiles from a set of 36 pixel values per cell (the pixel with the 12th highest value represents the whole cell).
- Step 1: We added an offset of 1.0 to the .CEL expression values for each cell to ensure that all values could be logged without generating negative values.
- Step 2: We took the log base 2 of each cell.
- Step 3: We computed the Z-score for each cell.
- Step 4: We multiplied all Z scores by 2.
- Step 5: We added 8 to the value of all Z-scores. The consequence of this simple set of transformations is to produce a set of Z-scores that have a mean of 8, a variance of 4, and a standard deviation of 2. The advantage of this modified Z-score is that a two-fold difference in expression level corresponds approximately to a 1 unit difference.
- Step 6: We computed the arithmetic mean of the values for the set of microarrays for each of the individual strains. We have not (yet) corrected for variance introduced by sex, age, or a sex-by-age interaction. We have not corrected for background beyond the background correction implemented by Affymetrix in generating the .CEL file.
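The six steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to build the database: the function names and the intensity values are hypothetical, the Z-score is assumed to be computed within each array, and the population standard deviation is assumed (the text does not say which estimator was used).

```python
import math
import statistics


def modified_z_scores(cel_values):
    """Steps 1-5 for one array: offset, log2, Z-score, scale by 2, shift by 8."""
    # Steps 1 and 2: add an offset of 1.0, then take log base 2.
    logged = [math.log2(v + 1.0) for v in cel_values]
    # Step 3: Z-score within the array (population SD is an assumption).
    mean = statistics.fmean(logged)
    sd = statistics.pstdev(logged)
    # Steps 4 and 5: multiply by 2 and add 8, giving mean 8 and SD 2.
    return [2.0 * (x - mean) / sd + 8.0 for x in logged]


def strain_means(arrays_by_strain):
    """Step 6: arithmetic mean across the arrays of each strain, cell by cell."""
    result = {}
    for strain, arrays in arrays_by_strain.items():
        transformed = [modified_z_scores(a) for a in arrays]
        result[strain] = [statistics.fmean(cells) for cells in zip(*transformed)]
    return result


# Hypothetical intensities for two arrays of one strain (not real data).
example = {
    "BXD2": [
        [120.0, 850.0, 40.0, 3000.0],
        [100.0, 900.0, 55.0, 2800.0],
    ],
}
means = strain_means(example)
one_array = modified_z_scores(example["BXD2"][0])
```

Each transformed array has a mean of 8 and a standard deviation of 2 by construction; the per-strain averages in `means` need not have exactly those moments. An array whose values are all identical would have a standard deviation of zero and would need separate handling.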
Probe set data from the .TXT file: These .TXT files were generated using MAS 5.0. The same simple steps described above were also applied to these values. Every microarray data set therefore has a mean expression of 8 with a standard deviation of 2. A 1-unit difference therefore represents roughly a two-fold difference in expression level. Expression levels below 5 are usually close to background noise levels.
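The "roughly two-fold" reading of a 1-unit difference can be checked with a small calculation. Because the Z-scores are multiplied by 2, one unit on the transformed scale equals half a standard deviation of the log2 values, so the rule is exact only when that standard deviation is 2. The sketch below makes this explicit; the function name and the `log2_sd` parameter are hypothetical, and the default of 2.0 is an assumption about the raw data, not a value stated above.

```python
def fold_difference(unit_difference, log2_sd=2.0):
    """Approximate fold difference for a difference on the transformed scale.

    One transformed unit equals log2_sd / 2 log2 units, because the
    Z-scores were multiplied by 2 before the offset of 8 was added.
    """
    return 2.0 ** (unit_difference * log2_sd / 2.0)


one_unit = fold_difference(1.0)     # two-fold when log2_sd is 2
three_units = fold_difference(3.0)  # eight-fold when log2_sd is 2
```

If the log2 values of a particular array have a standard deviation other than 2, a 1-unit difference corresponds to a somewhat larger or smaller fold difference, which is why the text says "roughly".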
About the chromosome and megabase position values:
The chromosomal locations of probe sets and gene markers were determined by BLAT analysis using the Mouse Genome Sequencing Consortium Feb 2002 Assembly (see http://genome.ucsc.edu/cgi-bin/hgBlat?command=start&org=mouse). We thank John Hogenesch (GNF) and Rob Edwards (UTHSC) for help in extracting and generating these position data.
Resolving Gene Identity and Position Problems:
Users should confirm the identity and positions of probe sets. Probe sets that are intended to target transcripts from a single gene occasionally map to different chromosomes. For example, two probe sets supposedly target the thyroid hormone alpha receptor (Thra): probe sets 99076_at and 99077_at map to Chr 14 at 13.556 Mb and Chr 11 at 99.537 Mb, respectively. One of these must be wrong, and since Thra maps to Chr 11 rather than Chr 14, it is likely that 99076_at is either mismapped or mislabeled as Thra. To determine which problem is more likely, re-BLAT the perfect match probe sequences. This is usually quite simple: paste all of the perfect match probes into a single BLAT query. For example, to test probe set 99076_at, paste this sequence into the BLAT query window:
GTTAG ACTTT TTCAT CTGCC AAGTC TTTAG TAAGT GACCT
ACCTA CAGGG TGACC TACCT ACAGG CTTAG AGATT ACCTA
CAGGC TTAGA GATCA TGGTA AGATT CATGA ACAAC ACCCC
GTGCA GATTC ATGAA CAACA CCCCG TGCCG TAACG ACATT
AAGAA CCTGC TTTAT AACTT GTTGC TACAG GATTT GAACC
AGGAT TTGAA CTTCT GTGGT ACAGA CTTCT GTGGT ACAGT
TAGGA GAGCC TTCTG TGGTA CAGTT AGGAG AGCTG GTGTG
TCTGT CATTC AGTAG GGACC TGTCA TTCAG TAGGG ACCAT
AACTC TGTCA TTCAG TAGGG ACCAT AACTA TTCAG TAGGG
ACCAT AACTG CTGCG CTTAC GTTCA GTGGG TATGG CTTTG
TGAAT TCTTT ACATG ATAGC ATTC
(NOTE: BLAT is insensitive to sequence overlap and extra spaces. The sequence above is a concatenation of all PM probes without any concern for probe overlap. The Perfect Match sequences are available on WebQTL by selecting the link on the Trait Data and Editing window.)
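Building such a query by hand is easy, but a short helper can do it for many probe sets. The following is a minimal sketch, not part of WebQTL or BLAT: the function name is hypothetical, and the two input strings are simply the first 50 bases of the sequence above cut into 25-base pieces for illustration (whether they match the true probe boundaries is an assumption).

```python
def build_blat_query(probes, group=5, groups_per_line=8):
    """Concatenate perfect match probes into one spaced, line-wrapped query.

    Probe overlap is ignored and whitespace is purely cosmetic, in line
    with the note that BLAT is insensitive to both.
    """
    # Remove any whitespace inside each probe and join them end to end.
    sequence = "".join("".join(p.split()).upper() for p in probes)
    invalid = set(sequence) - set("ACGTN")
    if invalid:
        raise ValueError(f"unexpected characters in probes: {sorted(invalid)}")
    # Group bases in fives, eight groups per line, as in the listing above.
    chunks = [sequence[i:i + group] for i in range(0, len(sequence), group)]
    lines = [
        " ".join(chunks[i:i + groups_per_line])
        for i in range(0, len(chunks), groups_per_line)
    ]
    return "\n".join(lines)


# Illustrative input: the first 50 bases of the 99076_at sequence above.
probes = [
    "GTTAGACTTTTTCATCTGCCAAGTC",
    "TTTAGTAAGTGACCTACCTACAGGG",
]
query = build_blat_query(probes)
print(query)
```

The printed text can be pasted directly into the BLAT query window.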
This query returns a BLAT Search Results page.
The results confirm that the probe set maps to Chr 14 (a score of 219 is good). However, if you click on the browser link in the BLAT Search Results window, you will see that the gene that these probes target is actually BC008556 (a nuclear receptor subfamily 1, group D, member 2 gene), not Thra. The Chr 19 hit with a score of 171 can be discounted since it does not correspond to a known transcript.
Data source acknowledgment:
Data were generated with funds to RWW from the Dunavant Chair of Excellence, University of Tennessee Health Science Center, Department of Pediatrics. The majority of arrays were processed at Genome Explorations by Dr. Divyen Patel.
References:
Williams RW, Shou S, Lu L, Qu Y, Manly KF, Wang J, Chesler E, Hsu HC, Mountz J, Threadgill DW (2002) Massively parallel QTL mapping of microarray data reveals mouse forebrain transcriptional networks. Soc. Neurosci Abst.
Williams RW, Manly KF, Shou S, Chesler E, Hsu HC, Mountz J, Wang J, Threadgill DW, Lu L (2002) Massively parallel complex trait analysis of transcriptional activity in mouse brain. International Mouse Genome Conference 16: 46.
Manly KF, Wang J, Shou S, Qu Y, Chesler E, Lu L, Hsu HC, Mountz JD, Threadgill D, Williams RW (2002) QTL mapping with microarray expression data. International Mouse Genome Conference 16: 88.
Wang J, Williams RW, Manly KF (2003) WebQTL: Web-based complex trait analysis. Neuroinformatics 1: 299-308.