OHSU/VA B6D2F2 Brain mRNA M430 RMA Database (August/05 Freeze)

Accession number: GN84

This August 2005 data freeze provides estimates of mRNA expression in adult brains of F2 intercross mice (C57BL/6J x DBA/2J F2) measured using Affymetrix M430A and M430B microarray pairs. Data were generated at The Oregon Health Sciences University (OHSU) in Portland, Oregon, by John Belknap and Robert Hitzemann. Data were processed using the RMA protocol and are presented with a secondary normalization: to simplify comparison between transforms, RMA values of each array were adjusted to an average of 8 units and a standard deviation of 2 units (a variance of 4). This data set was run as a single large batch with effort to balance samples by sex, age, and environment.

    About the cases used to generate this set of data:

Fifty-six B6D2F2 samples, each taken from a single brain hemisphere of an individual mouse, were assayed using 56 pairs of M430A and M430B Affymetrix short oligomer microarrays. [The remaining hemisphere will be used later for an analysis of specific brain regions.] Each array ID (see table below) includes a three-letter code: the first letter usually denotes the sex of the case (note that we have made a few corrections and there are therefore several sex-discordant IDs), the second letter denotes the hemisphere (R or L), and the third letter is the mouse number within each cell. The F2 mice were experimentally naive, born within a 3-day period from second litters of each dam, and housed at weaning (20 to 24 days of age) in like-sex groups of 3 to 4 mice for females and 2 to 3 mice for males in standard mouse shoebox cages within Thoren racks. All 56 F2 mice were killed at 77 to 79 days of age by cervical dislocation on December 17, 2003. The brains were immediately split at the midline and then quickly frozen on dry ice. The brains were stored for about two weeks at -80 degrees C until further use.

The F2 was derived as follows: C57BL/6J (B6) and DBA/2J (D2) breeders were obtained from The Jackson Laboratory, and two generations later their progeny were crossed to produce B6D2F1 and D2B6F1 hybrids at the Portland VA Veterinary Medical Unit (AAALAC approved). The reciprocal F1s were mated to create an F2 population with both progenitor X and Y chromosomes about equally represented.

    About the tissue used to generate these data:

Brain samples were from 31 males and 25 females, with 28 right and 28 left hemispheres distributed with good balance across the two sexes. The tissue arrayed included the forebrain, midbrain, one olfactory bulb, the cerebellum, and the rostral part of the medulla. The medulla was trimmed transversely at the caudal aspect of the cerebellum. The sagittal cut was made from a dorsal to ventral direction. (Note that several of the other brain transcriptome databases do not include olfactory bulb or cerebellum.) Total RNA was isolated with TRIZOL Reagent (Life Technologies Inc.) using a modification of the single-step acid guanidinium isothiocyanate phenol-chloroform extraction method according to the manufacturer’s protocol. The extracted RNA was then purified using RNeasy (Qiagen, Inc.). RNA samples were evaluated by UV spectroscopy for purity; only samples with an A260/280 ratio greater than 1.8 were used. RNA quality was monitored by visualization on an ethidium bromide-stained denaturing formaldehyde agarose gel. Samples containing at least 10 micrograms of total RNA were sent to the OHSU Gene Microarray Shared Resource facility for analysis. The procedures used at the facility precisely follow the manufacturer’s specifications. Details can be found at http://www.ohsu.edu/gmsr/amc. Following labeling, all samples were hybridized to the GeneChip Test3 array for quality control. Any sample whose target performance did not meet the recommended thresholds would have been discarded. All labeled samples passed the threshold and were hybridized to the 430A and 430B arrays.

    About the arrays:

All 56 430A&B arrays used in this project were purchased at one time and had the same Affymetrix lot number. The table below lists the arrays by order, Case ID, and Array ID.

Order  CaseID  ArrayID
1      061     CASE05_061
2      062     CASE05_062
3      063     CASE05_063
4      064     CASE05_064
5      065     CASE05_065
6      066     CASE05_066
7      067     CASE05_067
8      068     CASE05_068
9      069     CASE05_069
10     070     CASE05_070
11     071     CASE05_071
12     072     CASE05_072
13     073     CASE05_073
14     074     CASE05_074
15     075     CASE05_075
16     076     CASE05_076
17     077     CASE05_077
18     078     CASE05_078
19     079     CASE05_079
20     080     CASE05_080
21     702     CASE05_702
22     704     CASE05_704
23     707     CASE05_707
24     709     CASE05_709
25     710     CASE05_710
26     712     CASE05_712
27     713     CASE05_713
28     715     CASE05_715
29     716     CASE05_716
30     719     CASE05_719
31     720     CASE05_720
32     722     CASE05_722
33     723     CASE05_723
34     724     CASE05_724
35     725     CASE05_725
36     726     CASE05_726
37     727     CASE05_727
38     728     CASE05_728
39     729     CASE05_729
40     732     CASE05_732
41     734     CASE05_734
42     735     CASE05_735
43     736     CASE05_736
44     737     CASE05_737
45     739     CASE05_739
46     741     CASE05_741
47     743     CASE05_743
48     746     CASE05_746
49     754     CASE05_754
50     771     CASE05_771
51     785     CASE05_785
52     793     CASE05_793
53     795     CASE05_795
54     796     CASE05_796
55     798     CASE05_798
56     799     CASE05_799
57     800     CASE05_800
58     801     CASE05_801
59     802     CASE05_802
60     803     CASE05_803
61     806     CASE05_806
62     807     CASE05_807
63     808     CASE05_808
64     809     CASE05_809
65     811     CASE05_811
66     813     CASE05_813
67     814     CASE05_814
68     815     CASE05_815
69     816     CASE05_816
70     817     CASE05_817
71     819     CASE05_819
72     821     CASE05_821
73     824     CASE05_824
74     825     CASE05_825
75     826     CASE05_826
76     828     CASE05_828
77     829     CASE05_829
78     830     CASE05_830
79     833     CASE05_833
80     835     CASE05_835

    About the marker set:

The 56 mice were each genotyped at 309 MIT microsatellite markers distributed across the genome, including the Y chromosome. The genotyping error check routine (Lincoln and Lander, 1992) implemented within R/qtl (Broman et al., 2003) showed no likely errors at p < 0.01. Initial genotypes were generated at OHSU. Approximately 200 genotypes were generated at UTHSC by Jing Gu and Shuhua Qi.

    About data processing:

Probe (cell) level data from the CEL file: These CEL values produced by GCOS are the 75% quantiles from a set of 91 pixel values per cell. Probe values were processed as follows:
  • Step 1: We added an offset of 1.0 unit to each cell signal to ensure that all values could be logged without generating negative values. We then computed the log base 2 of each cell.
  • Step 2: We took the log base 2 of each probe signal.
  • Step 3: We computed the Z scores for each probe signal.
  • Step 4: We multiplied all Z scores by 2.
  • Step 5: We added 8 to the value of all Z scores. The consequence of this simple set of transformations is to produce a set of Z scores that have a mean of 8, a variance of 4, and a standard deviation of 2. The advantage of this modified Z score is that a two-fold difference in expression level corresponds approximately to a 1 unit difference.
  • Step 6a: The 430A and 430B arrays include a set of 100 shared probe sets (2200 probes) that have identical sequences. These probes provide a way to calibrate expression of the A and B arrays to a common scale. The absolute mean expression on the 430B array is almost invariably lower than that on the 430A array. To bring the two arrays into alignment, we regressed Z scores of the common set of probes to obtain a linear regression correction to rescale the 430B arrays to the 430A array. In our case this involved multiplying all 430B Z scores by the slope of the regression and adding or subtracting a very small offset. The result of this step is that the mean of the 430A GeneChip expression is fixed at a value of 8, whereas that of the 430B chip is typically 7. Thus the average of the A and B arrays is approximately 7.5.
  • Step 6b: We recentered the whole set of 430A and B transcripts to a mean of 8 and a standard deviation of 2. This involved reapplying Steps 3 through 5 above, but now using the entire set of probes and probe sets from a merged 430A and B data set.
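
The steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the code used to build the database: the function names, the toy signal values, the choice of the population standard deviation, and the use of ordinary least squares for the Step 6a regression are all assumptions made for the example. In the toy data, the first three values of each array stand in for the shared probes.

```python
import math
import statistics


def modified_z(signals, offset=1.0, scale=2.0, center=8.0):
    """Steps 1-5: add offset, take log2, compute Z scores, multiply by 2, add 8."""
    logged = [math.log2(s + offset) for s in signals]        # Steps 1-2
    mean = statistics.fmean(logged)
    sd = statistics.pstdev(logged)                           # population SD (assumption)
    return [scale * (v - mean) / sd + center for v in logged]  # Steps 3-5


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept (assumed method)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def rescale_b_to_a(b_values, shared_b, shared_a):
    """Step 6a: regress shared-probe A values on B values, then apply to all of B."""
    slope, intercept = fit_line(shared_b, shared_a)
    return [slope * v + intercept for v in b_values]


def recenter(values, scale=2.0, center=8.0):
    """Step 6b: re-standardize the merged A and B values to mean 8, SD 2."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [scale * (v - mean) / sd + center for v in values]


# Hypothetical raw cell signals; indices 0-2 play the role of the shared probes.
a_raw = [100.0, 400.0, 1600.0, 60.0, 250.0, 900.0, 3000.0, 8000.0]
b_raw = [90.0, 380.0, 1500.0, 20.0, 35.0, 60.0, 120.0, 300.0]

a_z = modified_z(a_raw)
b_z = modified_z(b_raw)
b_on_a = rescale_b_to_a(b_z, shared_b=b_z[:3], shared_a=a_z[:3])
merged = recenter(a_z + b_on_a)
```

After Step 5 each array has a mean of 8 and a standard deviation of 2 on its own; after Step 6b the same holds for the merged list.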

Probe set data: The uncorrected, untransformed CEL files were subjected to probe (low) level processing using both the RMA (Robust Multiarray Average; Irizarry et al. 2003) and PDNN (Position Dependent Nearest Neighbor; Zhang et al. 2003) methods, because these two performed the best of four methods tested in a recent four inbred strain comparison using the M430A chip on whole brain samples (Hitzemann et al., submitted). RMA was implemented by the Affy package (11/24/03 version) within Bioconductor (http://www.bioconductor.org) and PDNN by the PerfectMatch v. 2.3 program from Li Zhang (PDNN). For the sake of comparison with other data sets, MAS 5 files have also been generated.

To better compare data sets, the same simple steps (1 through 6 above) were applied to PDNN and RMA values. Every microarray data set therefore has a mean expression of 8 units with a standard deviation of 2 units. A 1-unit difference therefore represents roughly a 2-fold difference in expression level. Expression levels below 5 are usually close to background noise levels.
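
As a rough worked example of how to read values on this scale, the sketch below converts a difference in units into an approximate fold difference and flags values below the background level of 5 mentioned above. The function names are hypothetical, and the conversion is only the approximation described in the text: the values are rescaled Z scores rather than exact log2 values.

```python
def approx_fold_change(value_a, value_b):
    """Approximate fold difference between two values on the mean-8, SD-2 scale.

    Treats a 1-unit difference as a 2-fold difference, which is approximate.
    """
    return 2.0 ** (value_a - value_b)


def likely_background(value, threshold=5.0):
    """True if the value falls below the level described as close to background."""
    return value < threshold


fold_up = approx_fold_change(10.0, 9.0)     # 1 unit higher: about 2-fold
fold_down = approx_fold_change(9.0, 11.0)   # 2 units lower: about one quarter
low_signal = likely_background(4.5)
```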

    About the chromosome and megabase position values:

The chromosomal locations of M430A and M430B probe sets were determined by BLAT analysis of concatenated probe sequences using the Mouse Genome Sequencing Consortium March 2005 (mm6) assembly. This BLAT analysis is performed periodically by Yanhua Qu as each new build of the mouse genome is released. We thank Yan Cui (UTHSC) for allowing us to use his Linux cluster to perform this analysis. It is possible to confirm the BLAT alignment results yourself simply by clicking on the Verify link in the Trait Data and Editing Form (right side of the Location line).

    Data source acknowledgment:

This project was supported by two Department of Veterans Affairs Merit Review Awards (to JK Belknap and R Hitzemann, respectively), AA10760 (Portland Alcohol Research Center), AA06243, AA13484, AA11034, DA05228 and MH51372.

Please contact either John Belknap or Robert Hitzemann at the Dept. of Behavioral Neuroscience, Oregon Health & Science University (L470), or Research Service (R&D5), Portland VA Medical Ctr., Portland, OR 97239 USA.

    References:

Hitzemann, R, McWeeney, S, Harrington, S, Malmanger, B, Lawler, M, Belknap, JK (2004) Brain gene expression among four inbred mouse strains: The development of an analysis strategy for the integration of QTL and gene expression data. Submitted.

Irizarry, RA, Bolstad, BM, Collin, F, Cope, LM, Hobbs, B, Speed, TP (2003) Summaries of Affymetrix GeneChip probe level data. Nuc Acids Res 31:1-15.

Lincoln, SE, Lander, ES (1992) Systematic detection of errors in genetic linkage data. Genomics 14:604-610.

Zhang, L, Miles, MF, Aldape, KD (2003) A model of molecular interactions on short oligonucleotide microarrays. Nat Biotech 21:818-821.

    Information about this text file:

This text file was originally generated by John Belknap, March 2004. Updated by RWW, October 31, 2004.