OHSU/VA B6D2F2 Brain mRNA M430 (Aug05) RMA / WebQTL

OHSU/VA B6D2F2 Brain mRNA M430 RMA Database (August/05 Freeze)

Accession number: GN84

This August 2005 data freeze provides estimates of mRNA expression in adult brains of F2 intercross mice (C57BL/6J x DBA/2J F2) measured using Affymetrix M430A and M430B microarray pairs. Data were generated at Oregon Health & Science University (OHSU) in Portland, Oregon, by John Belknap and Robert Hitzemann. Data were processed using the RMA protocol. To simplify comparison between transforms, the RMA values of each array were then adjusted to an average of 8 units and a standard deviation of 2 units. This data set was run as a single large batch with effort to balance samples by sex, age, and environment.

About the cases used to generate this set of data:

Fifty-six B6D2F2 samples, each taken from a single brain hemisphere from an individual mouse, were assayed using 56 M430A and M430B Affymetrix short oligomer microarrays. [The remaining hemisphere will be used later for an analysis of specific brain regions.] Each array ID (see table below) includes a three-letter code: the first letter usually denotes the sex of the case (note that we have made a few corrections, so several IDs are sex-discordant), the second letter denotes the hemisphere (R or L), and the third letter is the mouse number within each cell. The F2 mice were experimentally naive, born within a 3-day period from second litters of each dam, and housed at weaning (20 to 24 days of age) in like-sex groups of 3 to 4 mice for females and 2 to 3 mice for males in standard mouse shoebox cages within Thoren racks. All 56 F2 mice were killed at 77 to 79 days of age by cervical dislocation on December 17, 2003. The brains were immediately split at the midline and then quickly frozen on dry ice. The brains were stored for about two weeks at -80 degrees C until further use.

The F2 was derived as follows: C57BL/6J (B6) and DBA/2J (D2) breeders were obtained from The Jackson Laboratory, and two generations later their progeny were crossed to produce B6D2F1 and D2B6F1 hybrids at the Portland VA Veterinary Medical Unit (AAALAC approved). The reciprocal F1s were mated to create an F2 population with both progenitor X and Y chromosomes about equally represented.

About the tissue used to generate these data:

Brain samples were from 31 males and 25 females and from 28 right and 28 left hemispheres, distributed with good balance across the two sexes. The tissue arrayed included the forebrain, midbrain, one olfactory bulb, the cerebellum, and the rostral part of the medulla. The medulla was trimmed transversely at the caudal aspect of the cerebellum. The sagittal cut was made from a dorsal to ventral direction. (Note that several of the other brain transcriptome databases do not include olfactory bulb or cerebellum.) Total RNA was isolated with TRIZOL Reagent (Life Technologies Inc.) using a modification of the single-step acid guanidinium isothiocyanate phenol-chloroform extraction method according to the manufacturer’s protocol. The extracted RNA was then purified using RNeasy (Qiagen, Inc.). RNA samples were evaluated by UV spectroscopy for purity; only samples with an A260/280 ratio greater than 1.8 were used. RNA quality was monitored by visualization on an ethidium bromide-stained denaturing formaldehyde agarose gel. Samples containing at least 10 micrograms of total RNA were sent to the OHSU Gene Microarray Shared Resource facility for analysis. The procedures used at the facility precisely follow the manufacturer’s specifications. Details can be found at http://www.ohsu.edu/gmsr/amc. Following labeling, all samples were hybridized to the GeneChip Test3 array for quality control. Any sample whose target performance did not meet the recommended thresholds would have been discarded. All labeled samples passed the threshold and were hybridized to the 430A and 430B arrays.

About the arrays:

All 56 430A&B arrays used in this project were purchased at one time and had the same Affymetrix lot number. The table below lists the arrays by Case ID and Array ID.

Order	CaseID	ArrayID
1	061	CASE05_061
2	062	CASE05_062
3	063	CASE05_063
4	064	CASE05_064
5	065	CASE05_065
6	066	CASE05_066
7	067	CASE05_067
8	068	CASE05_068
9	069	CASE05_069
10	070	CASE05_070
11	071	CASE05_071
12	072	CASE05_072
13	073	CASE05_073
14	074	CASE05_074
15	075	CASE05_075
16	076	CASE05_076
17	077	CASE05_077
18	078	CASE05_078
19	079	CASE05_079
20	080	CASE05_080
21	702	CASE05_702
22	704	CASE05_704
23	707	CASE05_707
24	709	CASE05_709
25	710	CASE05_710
26	712	CASE05_712
27	713	CASE05_713
28	715	CASE05_715
29	716	CASE05_716
30	719	CASE05_719
31	720	CASE05_720
32	722	CASE05_722
33	723	CASE05_723
34	724	CASE05_724
35	725	CASE05_725
36	726	CASE05_726
37	727	CASE05_727
38	728	CASE05_728
39	729	CASE05_729
40	732	CASE05_732
41	734	CASE05_734
42	735	CASE05_735
43	736	CASE05_736
44	737	CASE05_737
45	739	CASE05_739
46	741	CASE05_741
47	743	CASE05_743
48	746	CASE05_746
49	754	CASE05_754
50	771	CASE05_771
51	785	CASE05_785
52	793	CASE05_793
53	795	CASE05_795
54	796	CASE05_796
55	798	CASE05_798
56	799	CASE05_799
57	800	CASE05_800
58	801	CASE05_801
59	802	CASE05_802
60	803	CASE05_803
61	806	CASE05_806
62	807	CASE05_807
63	808	CASE05_808
64	809	CASE05_809
65	811	CASE05_811
66	813	CASE05_813
67	814	CASE05_814
68	815	CASE05_815
69	816	CASE05_816
70	817	CASE05_817
71	819	CASE05_819
72	821	CASE05_821
73	824	CASE05_824
74	825	CASE05_825
75	826	CASE05_826
76	828	CASE05_828
77	829	CASE05_829
78	830	CASE05_830
79	833	CASE05_833
80	835	CASE05_835

About the marker set:

The 56 mice were each genotyped at 309 MIT microsatellite markers distributed across the genome, including the Y chromosome. The genotyping error check routine (Lincoln and Lander, 1992) implemented within R/qtl (Broman et al., 2003) showed no likely errors at p < 0.01. Initial genotypes were generated at OHSU. Approximately 200 genotypes were generated at UTHSC by Jing Gu and Shuhua Qi.

About data processing:

Probe (cell) level data from the CEL file: These CEL values produced by GCOS are the 75% quantiles from a set of 91 pixel values per cell. Probe values were processed as follows:

Step 1: We added an offset of 1.0 unit to each cell signal to ensure that all values could be logged without generating negative values.
Step 2: We took the log base 2 of each probe signal.
Step 3: We computed the Z scores for each probe signal.
Step 4: We multiplied all Z scores by 2.
Step 5: We added 8 to the value of all Z scores. The consequence of this simple set of transformations is to produce a set of Z scores that have a mean of 8, a variance of 4, and a standard deviation of 2. The advantage of this modified Z score is that a two-fold difference in expression level corresponds approximately to a 1 unit difference.
Step 6a: The 430A and 430B arrays include a set of 100 shared probe sets (2200 probes) that have identical sequences. These probes provide a way to calibrate expression of the A and B arrays to a common scale. The absolute mean expression on the 430B array is almost invariably lower than that on the 430A array. To bring the two arrays into alignment, we regressed Z scores of the common set of probes to obtain a linear regression correction that rescales the 430B arrays to the 430A array. In our case this involved multiplying all 430B Z scores by the slope of the regression and adding or subtracting a very small offset. The result of this step is that the mean of the 430A GeneChip expression is fixed at a value of 8, whereas that of the 430B chip is typically 7. Thus the average of the A and B arrays is approximately 7.5.
Step 6b: We recentered the whole set of 430A and B transcripts to a mean of 8 and a standard deviation of 2. This involved reapplying Steps 3 through 5 above, but now using the entire set of probes and probe sets from a merged 430A and B data set.
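
The steps above can be sketched in a few lines of Python with NumPy. This is only an illustration on simulated signals, not the code used to build the database: the function names, the simulated intensities, the use of the population standard deviation, and the direction of the regression (430A values regressed on 430B values for the shared probes) are all assumptions made for the example.

```python
import numpy as np

def rescale_to_8_and_2(values):
    """Steps 3-5: Z score, multiply by 2, add 8."""
    v = np.asarray(values, dtype=float)
    z = (v - v.mean()) / v.std()  # population SD (ddof=0) is an assumption
    return 2.0 * z + 8.0

def modified_z(raw_signal, offset=1.0):
    """Steps 1-2 (offset, log base 2) followed by Steps 3-5."""
    logged = np.log2(np.asarray(raw_signal, dtype=float) + offset)
    return rescale_to_8_and_2(logged)

def rescale_b_to_a(a_shared, b_shared, b_all):
    """Step 6a: fit A = slope * B + intercept on the shared probes,
    then apply that fit to every value from the B array."""
    slope, intercept = np.polyfit(b_shared, a_shared, deg=1)
    return slope * np.asarray(b_all, dtype=float) + intercept

# Simulated data (hypothetical): the first 2200 probes on each array are
# the shared probes; the remaining B-array probes are dimmer on average.
rng = np.random.default_rng(0)
n_shared = 2200
abundance = rng.normal(8.0, 2.0, size=n_shared)  # log2 scale
log_a = np.concatenate([abundance + rng.normal(0, 0.1, n_shared),
                        rng.normal(9.0, 2.0, size=3000)])
log_b = np.concatenate([abundance + rng.normal(0, 0.1, n_shared),
                        rng.normal(7.0, 2.0, size=3000)])

z_a = modified_z(2.0 ** log_a)  # Steps 1-5 on the A array
z_b = modified_z(2.0 ** log_b)  # Steps 1-5 on the B array

# Step 6a: bring the B array onto the A array's scale.
z_b_rescaled = rescale_b_to_a(z_a[:n_shared], z_b[:n_shared], z_b)

# Step 6b: recenter the merged A and B values to mean 8, SD 2.
merged = rescale_to_8_and_2(np.concatenate([z_a, z_b_rescaled]))

print("A mean, SD:", z_a.mean(), z_a.std())
print("B mean after Step 6a:", z_b_rescaled.mean())
print("merged mean, SD:", merged.mean(), merged.std())
```

In this simulation the rescaled 430B values end up with a mean below 8, mirroring the pattern described in Step 6a, and Step 6b restores a mean of 8 and a standard deviation of 2 for the merged set.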

Probe set data: The uncorrected, untransformed CEL files were subject to probe (low) level processing using both the RMA (Robust Multiarray Average; Irizarry et al. 2003) and PDNN (Position Dependent Nearest Neighbor; Zhang et al. 2003) methods because these two performed the best of four methods tested in a recent four inbred strain comparison using the M430A chip on whole brain samples (Hitzemann et al., submitted). RMA was implemented by the Affy package (11/24/03 version) within Bioconductor (http://www.bioconductor.org) and PDNN by the PerfectMatch v. 2.3 program from Li Zhang. For the sake of comparison with other data sets, MAS 5 files have also been generated.
To better compare data sets, the same simple steps (1 through 6 above) were applied to PDNN and RMA values. Every microarray data set therefore has a mean expression of 8 units with a standard deviation of 2 units. A 1-unit difference therefore represents roughly a 2-fold difference in expression level. Expression levels below 5 are usually close to background noise levels.
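
As a rough worked example of the rule of thumb above, the following Python sketch converts a difference between two normalized expression values into an approximate fold difference and flags values near background. The function names and the constant are hypothetical, and the conversion is only approximate because the values are rescaled Z scores rather than raw log2 intensities.

```python
NOISE_FLOOR = 5.0  # values below this are usually close to background

def approx_fold_change(value_a, value_b):
    """Approximate fold difference, treating 1 unit as a 2-fold change."""
    return 2.0 ** (value_a - value_b)

def near_background(value, floor=NOISE_FLOOR):
    """True when an expression value falls below the noise floor."""
    return value < floor

print(approx_fold_change(10.0, 8.0))  # 4.0 (2 units is about 4-fold)
print(near_background(4.6))           # True
```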

About the chromosome and megabase position values:

The chromosomal locations of M430A and M430B probe sets were determined by BLAT analysis of concatenated probe sequences using the Mouse Genome Sequencing Consortium March 2005 (mm6) assembly. This BLAT analysis is performed periodically by Yanhua Qu as each new build of the mouse genome is released. We thank Yan Cui (UTHSC) for allowing us to use his Linux cluster to perform this analysis. It is possible to confirm the BLAT alignment results yourself simply by clicking on the Verify link in the Trait Data and Editing Form (right side of the Location line).

Data source acknowledgment:

This project was supported by two Department of Veterans Affairs Merit Review Awards (to JK Belknap and R Hitzemann, respectively), AA10760 (Portland Alcohol Research Center), AA06243, AA13484, AA11034, DA05228 and MH51372.

Please contact either John Belknap or Robert Hitzemann at the Dept. of Behavioral Neuroscience, Oregon Health & Science University (L470), or Research Service (R&D5), Portland VA Medical Ctr., Portland, OR 97239 USA.

References:

Hitzemann, R, McWeeney, S, Harrington, S, Malmanger, B, Lawler, M, Belknap, JK (2004) Brain gene expression among four inbred mouse strains: The development of an analysis strategy for the integration of QTL and gene expression data. Submitted.

Irizarry, RA, Bolstad, BM, Collin, F, Cope, LM, Hobbs, B, Speed, TP (2003) Summaries of Affymetrix GeneChip probe level data. Nuc Acids Res 31:e15.

Lincoln, SE, Lander, ES (1992) Systematic detection of errors in genetic linkage data. Genomics 14:604-610.

Zhang, L, Miles, MF, Aldape, KD (2003) A model of molecular interactions on short oligonucleotide microarrays. Nat Biotech 21:818-821.

Information about this text file:

This text file was originally generated by John Belknap, March 2004. Updated by RWW, October 31, 2004.