INIA M430 brain RMA Database (January/06 Freeze)

Accession number: GN102

    Summary:

HIGHLY SELECTIVE DATA SET: This January 2006 data freeze provides estimates of mRNA expression in adult forebrain and midbrain from 43 lines of mice including C57BL/6J, DBA/2J, reciprocal F1 hybrids, and 39 BXD recombinant inbred strains. Data were generated at UTHSC and the University of Memphis with support from grants from the NIAAA Integrative Neuroscience Initiative on Alcoholism (INIA). Samples were hybridized in small pools (n = 3) to a total of 121 Affymetrix M430A and B array pairs. This data set only includes the highest quality subset of 76 arrays that have been quantile normalized at both probe and probe set levels. This data set was initially processed using the RMA protocol. Data were renormalized after generating the RMA values using a second quantile normalization step and a round of correction for group and batch effects. To simplify comparisons among transforms, final RMA values of each array have been adjusted to an average of 8 units and a standard deviation of 2 units. A total of 355 probe sets have LRS values above 50.

    About the cases used to generate this set of data:

We have used a set of BXD recombinant inbred strains generated by crossing C57BL/6J (B6 or B) with DBA/2J (D2 or D). The BXDs are particularly useful for systems genetics because both parental strains have been sequenced (8x coverage of B6 and 1.5x coverage of D2). Physical maps in WebQTL incorporate approximately 2 million B vs D SNPs from Celera. BXD2 through BXD32 were bred by Benjamin A. Taylor starting in the late 1970s. BXD33 through BXD42 were bred by Taylor in the 1990s. These strains are available from The Jackson Laboratory. BXD43 through BXD99 were bred by Lu Lu, Jeremy Peirce, Lee M. Silver, and Robert W. Williams in the late 1990s and early 2000s using advanced intercross progeny (Peirce et al. 2004). Many of the 50 new BXD strains are available from Lu Lu and colleagues.

All stock was obtained originally from The Jackson Laboratory between 1999 and 2003. Most BXD animals were born and housed at the University of Tennessee Health Science Center. Some cases were bred at the University of Memphis (Douglas Matthews) or the University of Alabama (John Mountz and Hui-Chen Hsu).

    About the tissue used to generate this set of data:

The INIA M430 brain Database (Jan06) consists of 78 Affymetrix 430A and 430B microarray pairs. Each pair was hybridized in sequence (A array first, B array second) with a pool of brain tissue (forebrain minus olfactory bulb or retina, plus the entire midbrain) taken from three adult animals of closely matched age and the same sex. RNA was extracted at UTHSC by Lu Lu, Zhiping Jia, and Hongtao Zhai. All samples were subsequently processed in the INIA Bioanalytical Core at the W. Harry Feinstone Center of Excellence by Thomas R. Sutter, Shirlean Goodwin, and colleagues at the University of Memphis.

Replication and Sample Balance: Our goal was to obtain data for independent biological sample pools from at least one sample of each sex for all BXD strains. While we achieved this goal technically, not all of the replicates were of sufficient quality to be included in this highly selected set. This data set is now complete and includes more than 20 replicates. Despite the lack of replicates for about 20 strains, we still strongly recommend this data set over earlier data sets that included more arrays, many of which are suboptimal.

Batch Structure: Before running the first batch of 30 pairs of arrays (dated Jan04), we ran four test samples (Nov03). The main batch of 30 includes the four test samples (four technical replicates). The Nov03 data were combined with the Jan04 data and treated as a single batch that consists of one male and one female pool from C57BL/6J, DBA/2J, the B6D2F1 hybrid, 11 female BXD samples, and 11 male BXD samples. The second large batch was run in February 2005 (Feb05) and consists of 71 pairs of arrays. Two more batches were run, the final one in December 2005 (16 array pairs). Batch effects were corrected at the individual probe level as described below.

The table below summarizes information on strain, sex, age, sample name, batch, the grouping to which an array's data set belongs based on expression similarity, and source of mice.

Id  Strain    Sex  Age  Sample   Batch  Final Grouping  Source
1   B6D2F1    F    127  R0919F1  2      e_2             UTM JB
2   B6D2F1    F    127  R0919F2  2      e_2             UTM JB
3   B6D2F1    F    64   R1053F1  3      g_3             UTM RW
4   B6D2F1    F    64   R1053F1  3      e_3             UTM RW
5   B6D2F1    M    66   R1057F1  3      e_3             UTM RW
6   D2B6F1    F    57   R1066F1  3      e_3             UTM RW
7   C57BL/6J  F    65   R0903F1  1      se_1            UTM RW
8   C57BL/6J  F    65   R0903F1  2      e_2             UTM RW
9   C57BL/6J  M    66   R0906F1  1      e_1             UTM RW
10  C57BL/6J  M    76   R0997F1  3      g_3             UTM RW
11  DBA/2J    F    60   R0917F1  1      e_1             UTM RW
12  DBA/2J    F    64   R1123F1  3      g_3             UTM RW
13  DBA/2J    M    60   R0918F1  2      sgA_2           UTM RW
14  DBA/2J    M    73   R1009F1  3      w_3             UTM RW
15  BXD1      M    181  R0956F1  3      e_3             UTM JB
16  BXD2      F    142  R0907F1  3      e_3             UAB
17  BXD5      F    56   R0744F1  3      o_3             UMemphis
18  BXD5      M    71   R0728F1  2      e_2             UMemphis
19  BXD6      F    57   R1711F1  3      g_3             JAX
20  BXD8      M    71   R2664F1  4      se_4            JAX
21  BXD11     F    97   R0745F1  3      gA_3            UAB
22  BXD12     F    64   R0896F1  3      o_3             UMemphis
23  BXD12     M    64   R0897F1  2      e_2             UMemphis
24  BXD13     F    86   R0748F1  2      e_2             UMemphis
25  BXD13     F    86   R0730F1  3      e_3             UMemphis
26  BXD13     M    76   R0929F1  3      e_3             UMemphis
27  BXD14     M    68   R1051F1  3      e_3             UTM RW
28  BXD15     F    80   R0928F1  3      e_3             UMemphis
29  BXD18     F    108  R0771F1  2      e_2             UAB
30  BXD19     M    157  R1229F1  3      gA_3            UTM JB
31  BXD21     F    67   R0740F1  3      gA_3            UAB
32  BXD23     F    88   R0815F1  3      gA_3            UAB
33  BXD23     F    66   R1035F1  3      gA_3            UTM RW
34  BXD23     M    66   R1256F1  4      e_4             UTM RW
35  BXD23     M    66   R1037F1  3      gA_3            UTM RW
36  BXD24     F    71   R0914F1  3      e_3             UMemphis
37  BXD24     M    71   R0913F1  2      e_2             UMemphis
38  BXD25     F    74   R0373F1  2      e_2             UTM RW
39  BXD25     M    58   R2623F1  4      e_4             UTM RW
40  BXD27     M    54   R2660F1  4      e_4             UTM RW
41  BXD28     F    113  R0892F1  3      e_3             UTM RW
42  BXD28     M    79   R0911F1  3      g_3             UMemphis
43  BXD31     M    61   R1141F1  3      e_3             UTM RW
44  BXD32     F    93   R0898F1  2      e_2             UAB
46  BXD32     M    76   R1217F2  4      e_4             UMemphis
47  BXD32     M    65   R1478F1  3      e_3             UMemphis
48  BXD34     M    72   R0916F1  2      e_2             UMemphis
49  BXD34     F    92   R0900F1  3      e_3             UMemphis
50  BXD36     F    79   R2654F1  4      e_4             UTM RW
51  BXD36     F    61   R1145F1  3      e_3             UTM RW
52  BXD36     M    77   R0926F1  2      e_2             UMemphis
53  BXD38     F    69   R0729F1  3      e_3             UMemphis
54  BXD38     F    83   R1208F1  3      g_3             UMemphis
55  BXD39     F    76   R1712F1  3      e_3             JAX
57  BXD40     F    184  R0741F1  3      e_3             UAB
58  BXD40     M    56   R0894F1  3      e_3             UMemphis
59  BXD42     F    100  R0742F1  3      e_3             UAB
60  BXD43     F    61   R1199F1  3      e_3             UTM RW
61  BXD43     F    59   R0980F1  4      e_4             UTM RW
62  BXD44     M    58   R1072F1  3      e_3             UTM RW
63  BXD45     F    58   R1398F1  3      o_3             UTM RW
64  BXD45     M    81   R1658F2  4      e_4             UTM RW
65  BXD48     F    59   R0946F1  3      e_3             UTM RW
66  BXD51     F    63   R1430F1  3      e_3             UTM RW
67  BXD51     M    65   R1001F1  3      e_3             UTM RW
68  BXD60     M    59   R1075F1  3      g_3             UTM RW
69  BXD62     M    58   R1027F1  3      e_3             UTM RW
70  BXD69     F    60   R1438F1  3      e_3             UTM RW
71  BXD69     M    64   R1193F1  3      o_3             UTM RW
72  BXD73     F    60   R1275F1  3      e_3             UTM RW
73  BXD73     M    76   R1442F1  3      g_3             UTM RW
74  BXD77     M    61   R1426F1  3      g_3             UTM RW
75  BXD87     F    89   R1713F1  3      e_3             UTM RW
76  BXD90     F    71   R2628F1  4      e_4             UTM RW
77  BXD90     M    61   R1452F   3      g_3             UTM RW
78  BXD92     F    58   R1299F1  3      e_3             UTM RW
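
The "Final Grouping" codes can be decoded mechanically. The sketch below is only an illustration based on the coding described under data processing (an optional "s" prefix for a group standard, a group name, then the batch number). The function name `parse_grouping` and the returned dictionary keys are hypothetical, and the optional "B" before the batch number is an assumption made so that codes written like "e_B2" are also accepted.

```python
import re

# Group names come from the processing notes: e, g, o, w, and gA.
# "gA" is listed before "g" so the longer name is tried first.
_LABEL = re.compile(r"^(s?)(gA|e|g|o|w)_B?(\d+)$")

def parse_grouping(label):
    """Split a code such as 'sgA_2' into standard flag, group, and batch."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized grouping label: {label!r}")
    return {
        "standard": match.group(1) == "s",  # True if the array is a group standard
        "group": match.group(2),
        "batch": int(match.group(3)),
    }

example = parse_grouping("sgA_2")
```

Here `example` describes an array that is the standard for group gA and belongs to batch 2.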

The table below lists quality information for each array, taken from its report file: scale factor, background, the proportions of present, absent, and marginal calls, and values for the control genes (Affy-b-Actin and Affy-Gapdh).

Id  Strain    Sample   Final grouping  Set  Scale factor  Background  Present  Absent  Marginal  Affy-b-Actin  Affy-Gapdh
1   B6D2F1    R0919F1  e_B2            A    14.212        46.93       0.417    0.564   0.019     1.24          0.8
1   B6D2F1    R0919F1  e_B2            B    30.349        42.21       0.233    0.748   0.019     1.24          0.74
2   B6D2F1    R0919F2  e_B2            A    5.95          53          0.468    0.511   0.021     1.17          0.73
2   B6D2F1    R0919F2  e_B2            B    14.795        47.95       0.264    0.716   0.02      1.19          0.75
3   B6D2F1    R1053F1  g_B3            A    4.445         50.82       0.536    0.447   0.017     1.92          1.69
3   B6D2F1    R1053F1  g_B3            B    16.596        51.44       0.278    0.702   0.02      1.93          1.76
4   B6D2F1    R1053F1  e_B3            A    11.196        42.4        0.457    0.523   0.02      1.84          1.32
4   B6D2F1    R1053F1  e_B3            B    16.596        51.44       0.278    0.702   0.02      1.93          1.76
5   B6D2F1    R1057F1  e_B3            A    7.332         42.21       0.505    0.475   0.02      1.64          1.2
5   B6D2F1    R1057F1  e_B3            B    16.444        40.31       0.314    0.661   0.025     1.13          1.31
6   C57BL/6J  R0903F1  se_B1           A    10.15         46.46       0.418    0.562   0.019     1.13          0.76
6   C57BL/6J  R0903F1  se_B1           B    20.223        47.78       0.222    0.759   0.018     1.36          0.89
7   C57BL/6J  R0903F1  e_B2            A    7.406         52.47       0.473    0.507   0.02      1.01          0.74
7   C57BL/6J  R0903F1  e_B2            B    20.71         46.98       0.252    0.729   0.02      1.08          0.74
8   C57BL/6J  R0906F1  e_B1            A    9.407         46.55       0.439    0.54    0.022     1             0.8
8   C57BL/6J  R0906F1  e_B1            B    28.77         44.52       0.21     0.77    0.019     1.04          0.74
9   C57BL/6J  R0997F1  g_B3            A    8.118         55.74       0.448    0.53    0.022     0.9           1.04
9   C57BL/6J  R0997F1  g_B3            B    13.24         49.64       0.316    0.661   0.023     1.41          1.11
10  D2B6F1    R1066F1  e_B3            A    8.147         46.39       0.481    0.5     0.019     0.97          1.22
10  D2B6F1    R1066F1  e_B3            B    18.835        43.24       0.285    0.695   0.021     1.11          1.29
11  DBA/2J    R0917F1  e_B1            A    13.775        50.2        0.253    0.729   0.019     1.18          0.76
11  DBA/2J    R0917F1  e_B1            B    22.301        47.49       0.241    0.741   0.018     1.37          0.88
12  DBA/2J    R1123F1  g_B3            A    9.452         50.14       0.456    0.523   0.021     1.37          1.87
12  DBA/2J    R1123F1  g_B3            B    23.467        42.27       0.25     0.729   0.021     0.91          1.9
13  DBA/2J    R0918F1  sgA_B2          A    9.105         48.24       0.462    0.517   0.019     1.22          0.81
13  DBA/2J    R0918F1  sgA_B2          B    25.007        46.99       0.244    0.736   0.019     1.22          0.81
14  DBA/2J    R1009F1  w_B3            A    5.736         42.88       0.527    0.455   0.017     1.11          2.4
14  DBA/2J    R1009F1  w_B3            B    17.739        43.75       0.291    0.69    0.019     0.91          2.36
15  BXD1      R0956F1  e_B3            A    4.923         44.74       0.519    0.46    0.021     1.5           1.09
15  BXD1      R0956F1  e_B3            B    15.937        39.5        0.31     0.665   0.025     1.47          1.21
16  BXD2      R0907F1  e_B3            A    6.191         45.77       0.48     0.498   0.022     1.37          1.23
16  BXD2      R0907F1  e_B3            B    16.15         43.78       0.3      0.677   0.023     1.74          1.37
17  BXD5      R0744F1  o_B3            A    10.448        60.78       0.403    0.576   0.021     1.23          1.38
17  BXD5      R0744F1  o_B3            B    28.054        44.72       0.236    0.746   0.018     1.43          1.68
18  BXD5      R0728F1  e_B2            A    7.884         53.56       0.43     0.549   0.021     1.12          0.71
18  BXD5      R0728F1  e_B2            B    18.92         42.5        0.245    0.735   0.019     1             0.76
19  BXD6      R1711F1  g_B3            A    7.1           46.57       0.498    0.481   0.02      1.97          1.66
19  BXD6      R1711F1  g_B3            B    12.465        46.02       0.319    0.66    0.022     2.06          1.78
20  BXD8      R2664F1  se_B4           A    2.126         45.64       0.594    0.39    0.016     1.73          1
20  BXD8      R2664F1  se_B4           B    7.133         41.85       0.377    0.603   0.02      1.95          0.99
21  BXD11     R0745F1  gA_B3           A    6.242         40.99       0.501    0.48    0.019     1.4           1.24
21  BXD11     R0745F1  gA_B3           B    18.681        41.11       0.278    0.702   0.02      1.28          1.27
22  BXD12     R0896F1  o_B3            A    8.237         51.23       0.433    0.546   0.021     1.72          1.28
22  BXD12     R0896F1  o_B3            B    19.781        43.61       0.264    0.714   0.022     1.44          1.45
23  BXD12     R0897F1  e_B2            A    10.713        46.56       0.421    0.56    0.019     1.23          0.75
23  BXD12     R0897F1  e_B2            B    20.093        50.31       0.236    0.744   0.02      1.25          0.76
24  BXD13     R0748F1  e_B2            A    7.149         57.35       0.435    0.543   0.022     1.02          0.74
24  BXD13     R0748F1  e_B2            B    12.77         56.44       0.248    0.734   0.019     1.05          0.8
25  BXD13     R0730F1  e_B3            A    6.076         44.57       0.49     0.488   0.022     1.26          1.45
25  BXD13     R0730F1  e_B3            B    15.7          44.24       0.293    0.687   0.02      1.31          1.52
26  BXD13     R0929F1  e_B3            A    5.493         47.46       0.507    0.472   0.021     1.65          1.35
26  BXD13     R0929F1  e_B3            B    14.739        46.05       0.301    0.677   0.023     0.93          1.62
27  BXD14     R1051F1  e_B3            A    6.393         45.19       0.49     0.489   0.021     1.22          1.26
27  BXD14     R1051F1  e_B3            B    15.488        41.14       0.325    0.653   0.022     1.12          1.38
28  BXD15     R0928F1  e_B3            A    5.646         39.95       0.524    0.456   0.02      1.95          1.34
28  BXD15     R0928F1  e_B3            B    19.344        37.65       0.296    0.682   0.023     1.33          1.42
29  BXD18     R0771F1  e_B2            A    4.168         54.8        0.503    0.477   0.02      1.13          0.77
29  BXD18     R0771F1  e_B2            B    9.679         54.7        0.277    0.702   0.02      1.4           0.76
30  BXD19     R1229F1  gA_B3           A    6.991         39.65       0.49     0.491   0.02      1.92          1.29
30  BXD19     R1229F1  gA_B3           B    20.945        40.5        0.277    0.702   0.021     1.54          1.22
31  BXD21     R0740F1  gA_B3           A    6.229         42.24       0.483    0.495   0.022     1.31          1.25
31  BXD21     R0740F1  gA_B3           B    16.584        41.88       0.306    0.673   0.021     1.43          1.23
32  BXD23     R0815F1  gA_B3           A    4.753         48.12       0.521    0.46    0.019     1.4           1.06
32  BXD23     R0815F1  gA_B3           B    11.555        39.41       0.353    0.626   0.022     1.44          1.1
33  BXD23     R1035F1  gA_B3           A    6.281         39.58       0.503    0.476   0.02      1.31          1.6
33  BXD23     R1035F1  gA_B3           B    22.536        34.86       0.292    0.686   0.021     1.31          1.67
34  BXD23     R1256F1  e_B4            A    2.233         46.66       0.575    0.408   0.017     1.8           1.13
34  BXD23     R1256F1  e_B4            B    4.862         43.16       0.399    0.58    0.021     1.73          1.01
35  BXD23     R1037F1  gA_B3           A    5.37          41.47       0.519    0.462   0.019     1.35          1.25
35  BXD23     R1037F1  gA_B3           B    18.483        37.49       0.305    0.671   0.024     1.24          1.28
36  BXD24     R0914F1  e_B3            A    6.212         51.11       0.497    0.482   0.021     1.09          1.53
36  BXD24     R0914F1  e_B3            B    19.649        36.07       0.309    0.671   0.021     1.4           1.76
37  BXD24     R0913F1  e_B2            A    9.002         49.85       0.437    0.543   0.02      1.24          0.71
37  BXD24     R0913F1  e_B2            B    14.375        51.49       0.246    0.734   0.02      1.36          0.79
38  BXD25     R0373F1  e_B2            A    6.222         56.95       0.457    0.522   0.022     1.37          0.75
38  BXD25     R0373F1  e_B2            B    8.337         50.91       0.291    0.685   0.024     1.19          0.77
39  BXD25     R2623F1  e_B4            A    1.985         45.8        0.588    0.395   0.016     1.6           1
39  BXD25     R2623F1  e_B4            B    7.555         40          0.374    0.607   0.019     1.78          1.03
40  BXD27     R2660F1  e_B4            A    2.688         51.77       0.582    0.403   0.016     1.4           0.84
40  BXD27     R2660F1  e_B4            B    5.735         54.08       0.392    0.588   0.02      1.51          0.78
41  BXD28     R0892F1  e_B3            A    4.143         47.2        0.537    0.442   0.021     1.05          1.08
41  BXD28     R0892F1  e_B3            B    16.413        45.83       0.297    0.682   0.021     1.04          1.23
42  BXD28     R0911F1  g_B3            A    5.811         43.06       0.517    0.465   0.018     1.19          1.43
42  BXD28     R0911F1  g_B3            B    16.22         41.15       0.3      0.678   0.022     0.85          1.65
43  BXD31     R1141F1  e_B3            A    3.607         42.59       0.547    0.435   0.019     1             1.15
43  BXD31     R1141F1  e_B3            B    11.826        41.26       0.329    0.65    0.021     1.04          1.27
44  BXD32     R0898F1  e_B2            A    9.574         45.43       0.447    0.532   0.022     1.3           0.7
44  BXD32     R0898F1  e_B2            B    28.57         42.93       0.23     0.752   0.019     1.42          0.69
45  BXD32     R1214F1  w_B3            A    5.506         41.54       0.527    0.454   0.019     1.4           2.12
46  BXD32     R1217F2  e_B4            A    1.861         68.71       0.581    0.404   0.015     1.62          0.89
46  BXD32     R1217F2  e_B4            B    5.388         55.49       0.376    0.602   0.022     1.94          0.83
47  BXD32     R1478F1  e_B3            A    5.452         42.1        0.52     0.46    0.019     1.36          1.68
47  BXD32     R1478F1  e_B3            B    14.805        38.7        0.332    0.647   0.021     1.53          1.84
48  BXD34     R0916F1  e_B2            A    5.377         55.95       0.446    0.534   0.021     1.12          0.75
48  BXD34     R0916F1  e_B2            B    13.775        50.2        0.253    0.729   0.019     1.18          0.76
49  BXD34     R0900F1  e_B3            A    7.206         45.6        0.484    0.495   0.021     1.11          1.15
49  BXD34     R0900F1  e_B3            B    14.661        52.1        0.494    0.497   0.021     1.11          1.15
50  BXD36     R2654F1  e_B4            A    2.646         53.84       0.559    0.424   0.017     1.89          1.27
50  BXD36     R2654F1  e_B4            B    7.062         54.84       0.334    0.647   0.019     1.91          1.24
51  BXD36     R1145F1  e_B3            A    5.229         41.48       0.515    0.466   0.019     0.97          1.12
51  BXD36     R1145F1  e_B3            B    12.661        40.04       0.334    0.644   0.022     1.04          1.13
52  BXD36     R0926F1  e_B2            A    5.841         55.5        0.438    0.541   0.021     1.26          0.74
52  BXD36     R0926F1  e_B2            B    13.353        53.81       0.263    0.716   0.021     1.23          0.76
53  BXD38     R0729F1  e_B3            A    5.472         83.41       0.469    0.512   0.019     0.92          1.09
53  BXD38     R0729F1  e_B3            B    10.88         67.39       0.299    0.679   0.022     1.06          1.2
54  BXD38     R1208F1  g_B3            A    3.532         43.38       0.544    0.438   0.018     1.15          1.27
54  BXD38     R1208F1  g_B3            B    15.234        43.65       0.311    0.667   0.023     1.08          1.38
55  BXD39     R1712F1  e_B3            A    7.514         44.54       0.49     0.489   0.021     1.69          1.42
55  BXD39     R1712F1  e_B3            B    12.624        44.61       0.318    0.661   0.021     1.34          1.55
56  BXD39     R0602F1  w_B3            B    20.231        37.07       0.301    0.68    0.02      1.07          2.33
57  BXD40     R0741F1  e_B3            A    5.234         45.68       0.51     0.469   0.02      1.69          1.17
57  BXD40     R0741F1  e_B3            B    12.242        46.89       0.323    0.656   0.021     1.12          1.23
58  BXD40     R0894F1  e_B3            A    5.326         44.9        0.52     0.459   0.021     1.26          1.21
58  BXD40     R0894F1  e_B3            B    10.339        41.24       0.352    0.625   0.024     0.81          1.4
59  BXD42     R0742F1  e_B3            A    5.542         43.66       0.522    0.458   0.021     1.72          1.17
59  BXD42     R0742F1  e_B3            B    15.095        41.37       0.319    0.66    0.022     1.27          1.24
60  BXD43     R1199F1  e_B3            A    6.171         41.28       0.523    0.458   0.019     1.06          1.23
60  BXD43     R1199F1  e_B3            B    16.534        40.32       0.291    0.685   0.024     0.99          1.54
61  BXD43     R0980F1  e_B4            A    1.592         63.75       0.591    0.392   0.017     1.76          0.95
61  BXD43     R0980F1  e_B4            B    5.815         48.89       0.378    0.601   0.021     2.06          0.97
62  BXD44     R1072F1  e_B3            A    7.858         41.12       0.476    0.502   0.022     1.52          1.74
62  BXD44     R1072F1  e_B3            B    23.065        41.32       0.264    0.717   0.019     1.25          1.84
63  BXD45     R1398F1  o_B3            A    13.911        45.87       0.384    0.595   0.021     1.24          1.7
63  BXD45     R1398F1  o_B3            B    40.07         47.47       0.178    0.805   0.017     1.21          1.68
64  BXD45     R1658F2  e_B4            A    2.368         56.29       0.573    0.408   0.019     1.42          0.84
64  BXD45     R1658F2  e_B4            B    7.006         49.52       0.372    0.608   0.02      1.45          0.8
65  BXD48     R0946F1  e_B3            A    6.565         47.79       0.487    0.493   0.021     1.68          1.27
65  BXD48     R0946F1  e_B3            B    17.499        41.87       0.292    0.687   0.021     1.54          1.35
66  BXD51     R1430F1  e_B3            A    7.042         57.48       0.46     0.519   0.022     1.17          1.29
66  BXD51     R1430F1  e_B3            B    19.373        48.26       0.259    0.72    0.021     2.07          1.48
67  BXD51     R1001F1  e_B3            A    4.689         58.81       0.501    0.48    0.019     1.88          1.31
67  BXD51     R1001F1  e_B3            B    16.032        55.59       0.266    0.715   0.019     1.31          1.64
68  BXD60     R1075F1  g_B3            A    8.189         49.9        0.465    0.513   0.022     1.39          1.34
68  BXD60     R1075F1  g_B3            B    19.219        45.14       0.277    0.705   0.018     1.77          1.41
69  BXD62     R1027F1  e_B3            A    7.447         44.42       0.491    0.488   0.021     2.03          1.23
69  BXD62     R1027F1  e_B3            B    19.391        41.09       0.285    0.696   0.019     1.05          1.44
70  BXD69     R1438F1  e_B3            A    6.297         44.19       0.512    0.469   0.019     1.77          1.5
70  BXD69     R1438F1  e_B3            B    12.335        46.58       0.311    0.667   0.021     1.25          1.62
71  BXD69     R1193F1  o_B3            A    5.749         83.56       0.414    0.564   0.022     1.49          1.58
71  BXD69     R1193F1  o_B3            B    20.513        44.28       0.261    0.718   0.021     1.14          1.58
72  BXD73     R1275F1  e_B3            A    6.478         40.91       0.499    0.481   0.02      1.05          1.52
72  BXD73     R1275F1  e_B3            B    16.931        41.6        0.299    0.681   0.02      1.62          1.53
73  BXD73     R1442F1  g_B3            A    8.584         62.86       0.428    0.552   0.02      1.78          1.69
73  BXD73     R1442F1  g_B3            B    17.378        55.71       0.26     0.72    0.02      1.17          1.83
74  BXD77     R1426F1  g_B3            A    6.306         46.27       0.501    0.481   0.018     1.77          1.49
74  BXD77     R1426F1  g_B3            B    13.365        48.96       0.309    0.67    0.022     1.26          1.63
75  BXD87     R1713F1  e_B3            A    6.243         39.43       0.515    0.466   0.018     1.38          1.34
75  BXD87     R1713F1  e_B3            B    14.997        42.78       0.305    0.673   0.022     1.71          1.58
76  BXD90     R2628F1  e_B4            A    2.096         58.74       0.572    0.412   0.016     1.57          0.82
76  BXD90     R2628F1  e_B4            B    8.913         49.12       0.332    0.646   0.023     1.88          0.85
77  BXD90     R1452F   g_B3            A    7.478         52.26       0.449    0.531   0.02      1.17          1.74
77  BXD90     R1452F   g_B3            B    15.469        40.59       0.312    0.668   0.02      1.7           1.74
78  BXD92     R1299F1  e_B3            A    8.264         45.38       0.478    0.503   0.019     1.4           1.37
78  BXD92     R1299F1  e_B3            B    18.369        43.4        0.29     0.689   0.021     1.91          1.6

    About the array platform:

Affymetrix Mouse Genome 430A and B array pairs: The 430A and B array pairs consist of 992,936 25-nucleotide probes that collectively estimate the expression of approximately 39,000 transcripts. The array sequences were selected late in 2002 using Unigene Build 107. The arrays nominally contain the same probe sequences as the 430 2.0 series. However, we have found that roughly 75,000 probes differ between the A and B arrays and the 430 2.0 array.

    About data processing:

Probe (cell) level data from the CEL file: These CEL values produced by GCOS are 75% quantiles from a set of 91 pixel values per cell.
  • Step 1: We added an offset of 1.0 unit to each cell signal to ensure that all values could be logged without generating negative values. We then computed the log base 2 of each cell.
  • Step 2: We performed a quantile normalization of the log base 2 values for the total set of 105 arrays (processed as two batches) using the same initial steps used by the RMA transform.
  • Step 3: We computed the Z scores for each cell value.
  • Step 4: We multiplied all Z scores by 2.
  • Step 5: We added 8 to the value of all Z scores. The consequence of this simple set of transformations is to produce a set of Z scores that have a mean of 8, a variance of 4, and a standard deviation of 2. The advantage of this modified Z score is that a two-fold difference in expression level corresponds approximately to a 1 unit difference.
  • Step 6: We eliminated much of the systematic technical variance introduced by the two batches (n = 34 and n = 71 array pairs) at the probe level. To do this we calculated the ratio of each batch mean to the mean of both batches and used this as a single multiplicative probe-specific batch correction factor. The consequence of this simple correction is that the mean probe signal value for each batch is the same.
  • Step 7a: The 430A and 430B arrays include a set of 100 shared probe sets (a total of 2200 probes) that have identical sequences. These probes and probe sets provide a way to calibrate expression of the 430A and 430B arrays to a common scale. To bring the two arrays into alignment, we regressed Z scores of the common set of probes to obtain a linear regression correction to rescale the 430B arrays to the 430A array. In our case this involved multiplying all 430B Z scores by the slope of the regression and adding or subtracting a small offset. The result of this step is that the mean of the 430A expression is fixed at a value of 8, whereas that of the 430B chip is typically reduced to 7. The average of the merged 430A and 430B array data set is approximately 7.5.
  • Step 7b: We recentered the merged 430A and 430B data sets to a mean of 8 and a standard deviation of 2. This involved reapplying Steps 3 through 5.
  • Step 8: Finally, we computed the arithmetic mean of the values for the set of microarrays for each strain. Technical replicates were averaged before computing the mean for independent biological samples. Note that we have not (yet) corrected for variance introduced by differences in sex, age, source of animals, or any interaction terms. We have not corrected for background beyond the background correction implemented by Affymetrix in generating the CEL file. We eventually hope to add statistical controls and adjustments for some of these variables.
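
The probe-level steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used to build this data set: all function names are hypothetical, the Z score is assumed to use the population standard deviation, ties in the quantile normalization are broken by position, and the batch target in Step 6 is assumed to be the mean of the batch means.

```python
import math
import statistics

def log2_with_offset(cells, offset=1.0):
    """Step 1: add an offset so no value is below 1, then take log base 2."""
    return [math.log2(c + offset) for c in cells]

def quantile_normalize(arrays):
    """Step 2 (sketch): force every array onto the same distribution.

    Each value is replaced by the mean, across arrays, of the values that
    hold the same rank. Ties are broken by position (a simplification).
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [statistics.fmean([s[i] for s in sorted_arrays]) for i in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized

def rescale_2z_plus_8(values):
    """Steps 3 to 5: Z score each value, multiply by 2, and add 8."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD is an assumption
    return [2.0 * (v - mean) / sd + 8.0 for v in values]

def batch_correct(probe_values, batches):
    """Step 6 for a single probe: one multiplicative factor per batch.

    After correction every batch has the same mean for this probe. The
    common target (mean of the batch means) is an assumption.
    """
    by_batch = {}
    for v, b in zip(probe_values, batches):
        by_batch.setdefault(b, []).append(v)
    batch_means = {b: statistics.fmean(vs) for b, vs in by_batch.items()}
    target = statistics.fmean(list(batch_means.values()))
    return [v * target / batch_means[b] for v, b in zip(probe_values, batches)]

# Toy data: two arrays with four cells each (made-up signal values).
raw = [[10.0, 200.0, 50.0, 3000.0], [20.0, 150.0, 90.0, 2500.0]]
logged = [log2_with_offset(a) for a in raw]
normalized = quantile_normalize(logged)
scaled = [rescale_2z_plus_8(a) for a in normalized]
corrected = batch_correct([4.0, 6.0, 9.0, 11.0], ["a", "a", "b", "b"])
```

In the toy example each array in `scaled` has a mean of 8 and a standard deviation of 2, and the two batches in `corrected` share the same mean for the probe.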
Probe set data: The expression data were processed by Yanhua Qu (UTHSC). The original CEL files were read into the R environment (Ihaka and Gentleman 1996). Data were processed using the Robust Multichip Average (RMA) method (Irizarry et al. 2003). Values were log2 transformed. Probe set values listed in WebQTL are the averages of biological replicates within strain. A few technical replicates were averaged and treated as single samples. A 1-unit difference represents roughly a two-fold difference in expression level. Expression levels below 5 are usually close to background noise levels.
  • Step 1: Get the CAB files for all arrays (121 arrays).
  • Step 2: Unpack the CAB files using GCOS 1.4 to obtain DAT, CEL, RPT, and CHP files.
  • Step 3: Put the RPT data into a spreadsheet.
  • Step 4: Transform the CEL data files to the old CEL format using the Transfer Tool (121 arrays).
  • Step 5: Transform the old-format CEL files using RMA and PDNN (121 arrays). The 430A and 430B arrays are processed separately using RMA and PDNN, and 430A and 430B are normalized separately to Z scores (2Z+8).
  • Step 6: Examine all scatter plots of the probe sets using DataDesk and categorize them by similarity. We are looking for batch and sub-batch structure. There are still quite obvious differences. For the INIA data we defined 5 groups that did NOT align exactly with the batches. The results are indicated in the table under the heading "Final Grouping." These are letters followed by the batch. For example "e_2" is an "e" type data set from batch 2. The prefix "s" means that an array was considered the "standard" for a particular group. For example sgA_2 is the "standard" for the gA group and was a member of batch 2. We defined groups "e" (originally "e" stood for 'excellent'), "g" (originally 'g' stood for good), "o" (OK), "w" (wide), and "gA" (good subdivision A).
  • Step 7: Delete obviously bad arrays (n = 3 were deleted, leaving 118 arrays). Array BXD8(S167) has a high scale factor (A:16.797, B:35.646); BXD18(R1220) and BXD33(R2627) have high 3'/5' ratios: B_Act_Sig(64.20), GAPD_Sig(84.20) and B_Act_Sig(49.92), GAPD_Sig(84.17), respectively.
  • Step 8: Group rescale the four minor groups to the same level as the largest group (please note that a group may have arrays from multiple physical batches). This group correction is done on a probe set-by-probe set level. The result of this rescaling is a group-corrected data set.
  • Step 9: Look at the group-rescaled arrays and delete any arrays that do not look good, where good usually means a correlation of >0.96 with respect to other arrays. For the INIA data set of 118 arrays we deleted 40 arrays using very strict goodness criteria.
  • Step 10: Reprocess the remaining 78 good old-format CEL files as in Step 5: process the 430A and 430B sets separately using RMA and PDNN, and normalize 430A and 430B separately to Z scores (2Z+8).
  • Step 11: Bring the two arrays (430A and 430B) into alignment. To do this we regressed Z scores of the common set of 100 probe sets to obtain a linear regression correction to rescale the 430B arrays to the 430A array values. Make data sets for RMA_430AB and PDNN_430AB. Normalize 430AB to Z scores.
  • Step 12: Rank order of probe sets: Run all of the arrays through a second quantile normalization. This involves computing the average of all probe sets across all arrays. These averages are then rank ordered. We also rank order each of the individual array data sets. Probe sets for each individual array are then assigned a new expression value based on (1) their rank within the particular array and (2) the value of that particular rank taken from the AVERAGE data. This forces every array to have exactly the same distribution as the average data. The result of this process is colinear expression of all arrays.
  • Step 13: We normalize the means of each of these groups to a common value set to that of the largest group (group e, now with 37 members). If the mean for probe set 100001 is 8 in group e whereas group g has a mean of 8.5, then we simply apply a correction factor of 8/8.5 to probe set 100001 in group g. The intent of this step is to correct for group effects on a probe set-by-probe set level.
  • Step 14: Verify that all arrays have correlations >0.98 using the RMA transform. Two arrays were discovered that had escaped deletion. Delete these arrays (BXD32-R1214, BXD39-R0602).
  • Step 15: Finally, we compute the arithmetic mean of the values for the set of 76 final arrays for each strain.
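
Two of the probe set steps above lend themselves to a short sketch: the 430A/430B alignment by linear regression and the per-probe-set group correction (the 8/8.5 example). The code below is illustrative only. The function names are hypothetical, the regression direction (430A values regressed on 430B values for the shared probe sets) is an assumption based on the description, and the toy numbers are made up.

```python
import statistics

def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def align_b_to_a(shared_a, shared_b, all_b):
    """Rescale 430B values onto the 430A scale.

    shared_a and shared_b hold values for the probe sets present on both
    arrays, in the same order. Every 430B value is multiplied by the
    fitted slope and shifted by the fitted offset.
    """
    slope, intercept = fit_line(shared_b, shared_a)
    return [slope * v + intercept for v in all_b]

def group_correct(values, groups, reference="e"):
    """Group correction for a single probe set.

    Each group's values are multiplied by (reference group mean / group
    mean), so that all group means match the reference group.
    """
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    means = {g: statistics.fmean(vs) for g, vs in by_group.items()}
    return [v * means[reference] / means[g] for v, g in zip(values, groups)]

# Toy shared probe sets where the 430A value equals 2 * (430B value) - 5.
aligned = align_b_to_a([7.0, 9.0, 11.0, 13.0], [6.0, 7.0, 8.0, 9.0], [10.0])

# Group e has mean 8 and group g has mean 8.5, so g is scaled by 8/8.5.
adjusted = group_correct([8.0, 8.0, 8.5, 8.5], ["e", "e", "g", "g"])
```

With these toy inputs the single 430B value of 10 maps to 15 on the 430A scale, and the group g values are brought down to the group e mean of 8.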

This data set includes further normalization to produce final estimates of expression that can be compared directly to the other transforms (average of 8 units and stabilized standard deviation of 2 units within each array). Please see Bolstad and colleagues (2003) for a helpful comparison of RMA and two other common methods of processing Affymetrix array data sets.

    About the chromosome and megabase position values:

The chromosomal locations of probe sets included on the microarrays were determined by BLAT analysis using the Mouse Genome Sequencing Consortium May 2004 Assembly (see http://genome.ucsc.edu/cgi-bin/hgBlat?command=start&org=mouse). We thank Dr. Yan Cui (UTHSC) for allowing us to use his Linux cluster to perform this analysis.

    Data source acknowledgment:

Support for acquisition of microarray data was generously provided by the NIAAA and its INIA grant program to RWW, Thomas Sutter, and Daniel Goldowitz (U01AA013515, U01AA013499-03S1, U01AA013488, U01AA013503-03S1). Support for the continued development of the GeneNetwork and WebQTL was provided by a NIMH Human Brain Project grant (P20MH062009). All arrays were processed at the University of Memphis by Thomas Sutter and colleagues with support of the INIA Bioanalytical Core.

    Information about this text file:

This text file was originally generated by RWW, YHQ, and EJC, Oct 2004. Updated by RWW, Nov 5, 2004; April 7, 2005; RNA/tissue preparation protocol updated by JLP, Sept 2, 2005; Sept 26, 2005.