Hippocampus Consortium M430v2 (Oct05) PDNN

    Summary:

PRELIMINARY: The October 2005 Hippocampus Consortium data set provides estimates of mRNA expression in the adult hippocampus of approximately 99 genetically diverse strains of mice including 68 BXD recombinant inbred strains, 13 CXB recombinant inbred strains, a set of 16 diverse inbred strains, and 2 reciprocal F1 hybrids. The hippocampus is an important and intriguing part of the forebrain that is crucial for memory formation, and that is often affected in epilepsy, Alzheimer's disease, and schizophrenia. Unlike most other parts of the brain, the hippocampus contains a remarkable population of stem cells that continue to generate neurons and glial cells even in adult mammals (Kempermann, 2005). This genetic analysis of transcript expression in the hippocampus (dentate gyrus, CA1-CA3, and parts of the subiculum) is a joint effort of 14 investigators that is supported by numerous agencies described in the acknowledgments section. Samples were processed using a total of 206 Affymetrix Mouse Expression 430 2.0 short oligomer microarrays (MOE430 2.0 or M430v2), of which 179 passed stringent quality control and error checking. This particular data set was processed using the position-dependent nearest neighbor method (PDNN) of Zhang and colleagues. To simplify comparison among the transforms we have used, the quantile normalized PDNN values from each array have been adjusted to an average expression of 8 units and a standard deviation of 2 units.

    About the strains used to generate this set of data:

This analysis used 68 BXD strains, the complete set of 13 CXB recombinant inbred strains, and a mouse diversity panel consisting of 16 inbred strains and a pair of reciprocal F1 hybrids (B6D2F1 and D2B6F1).

The BXD genetic reference population of recombinant inbred strains consists of approximately 80 strains. Approximately 800 classical phenotypes from sets of 10 to 70 of these strains have been integrated in the GeneNetwork. The BXD strains in this data set include 29 of the BXD strains made by Benjamin Taylor at the Jackson Laboratory in the 1970s and 1990s (BXD1 through BXD40). All of these strains are fully inbred, many well beyond the 100th filial (F) generation of inbreeding. We have also included 39 inbred (25 strains at F20+) and nearly inbred (14 strains between F14 and F20) BXD lines generated by Lu and Peirce. All of these strains, including those between F14 and F20, have been genotyped at 13,377 SNPs.

The CXB is the first and oldest set of recombinant inbred strains. Over 500 classical phenotypes from these strains have been integrated in the GeneNetwork. It is noteworthy that the CXB strains segregate for the hippocampal lamination defect (Hld), characterized by Nowakowski and colleagues (1984). All of the CXBs have been recently genotyped at 13,377 SNPs.

Mouse Diversity Panel (MDP). We have profiled an MDP consisting of 16 inbred strains and a pair of reciprocal F1 hybrids: B6D2F1 and D2B6F1. These strains were selected for several reasons:

  • genetic and phenotypic diversity, including use by the Phenome Project
  • their use in making genetic reference populations including recombinant inbred strains, consomic strains, congenic and recombinant congenic strains
  • their use by the Complex Trait Consortium to make the Collaborative Cross (Nairobi/Wellcome, Oak Ridge/DOE, and Perth/UWA)
  • genome sequence data from three sources (NHGRI, Celera, and Perlegen-NIEHS)
  • availability from The Jackson Laboratory

All eight parents of the Collaborative Cross (129, A, C57BL/6J, CAST, NOD, NZO, PWK, and WSB) have been included in the MDP (noted below in the list). Twelve MDP strains have been sequenced, or are currently being resequenced by Perlegen for the NIEHS. This panel will be extremely helpful in systems genetic analysis of a wide variety of traits, and will be a powerful adjunct in fine mapping modulators using what is essentially an association analysis of sequence variants.

  1. 129S1/SvImJ
        Collaborative Cross strain sequenced by NIEHS; background for many knockouts; Phenome Project A list
  2. A/J
        Collaborative Cross strain sequenced by Perlegen/NIEHS; parent of the AXB/BXA panel
  3. AKR/J
        Sequenced by NIEHS; Phenome Project B list
  4. BALB/cByJ
        STILL IN PROGRESS (samples did not pass quality control); Sequenced by NIEHS; maternal parent of the CXB panel; Phenome Project A list
  5. BALB/cJ
        Widely used strain with forebrain abnormalities (callosal defects); Phenome Project A list
  6. C3H/HeJ
        Sequenced by Perlegen/NIEHS; paternal parent of the BXH panel; Phenome Project A list
  7. C57BL/6J
        Sequenced by NHGRI; parental strain of AXB/BXA, BXD, and BXH; Phenome Project A list
  8. C57BL/6ByJ
        Paternal substrain of B6 used to generate the CXB panel
  9. CAST/Ei
        Collaborative Cross strain sequenced by NIEHS; Phenome Project A list
  10. DBA/2J
        Sequenced by Perlegen/NIEHS and Celera; paternal parent of the BXD panel; Phenome Project A list
  11. KK/HIJ
        Sequenced by Perlegen/NIEHS
  12. LG/J
        Paternal parent of the LGXSM panel
  13. NOD/LtJ
        Collaborative Cross strain sequenced by NIEHS; Phenome Project B list; diabetic
  14. NZO/HILtJ
        Collaborative Cross strain
  15. PWD/PhJ
        Sequenced by Perlegen/NIEHS; parental strain for a consomic set by Forejt and colleagues
  16. PWK/PhJ
        Collaborative Cross strain; Phenome Project D list
  17. WSB/EiJ
        Collaborative Cross strain sequenced by NIEHS; Phenome Project C list
  18. B6D2F1 and D2B6F1
        F1 hybrids generated by crossing C57BL/6J with DBA/2J

We have not combined data from reciprocal F1s because they have different Y chromosome and mitochondrial haplotypes. Parent-of-origin effects (imprinting, maternal environment) may also lead to interesting differences in hippocampal transcript levels.

These strains are available from The Jackson Laboratory. BXD43 through BXD100 are available from Lu Lu and colleagues at UTHSC.

    About the animals and tissue used to generate this set of data:

BXD animals were obtained from UTHSC, UAB, or directly from The Jackson Laboratory (see Table 1 below). Animals were housed at UTHSC, Beth Israel Deaconess, or the Jackson Laboratory before sacrifice. Virtually all CXB animals were obtained directly at the Jackson Laboratory by Lu Lu. We thank Muriel Davisson for making it possible to collect these cases on site. Standard inbred strain stock was from The Jackson Laboratory, but most animals were housed or reared at UTHSC. Mice were killed by cervical dislocation and brains were removed and placed in RNAlater prior to dissection. Cerebella and olfactory bulbs were removed; brains were hemisected, and both hippocampi were dissected whole by Hong Tao Zhang in the Lu lab. Hippocampal samples are very close to complete (see Lu et al., 2001) but probably include variable amounts of subiculum and fimbria.

A pool of dissected tissue, typically from six hippocampi and three naive adults of the same strain, sex, and age, was collected in one session and used to generate cRNA samples. Two hundred and one RNA samples were extracted at UTHSC by Zhiping Jia, four samples by Shuhua Qi (R2331H1, R2332H1, R2350H1, R2349H1), and one by Siming Shou (R0129H2).

A great majority of animals used in this study were between 45 and 90 days of age (average of 66 days, maximum range from 41 to 196 days; see Table 1 below).

Sample Processing: Samples were processed in the INIA Bioanalytical Core at the W. Harry Feinstone Center for Genomic Research, The University of Memphis, led by Thomas R. Sutter. All processing steps were performed by Shirlean Goodwin. In brief, RNA purity was evaluated using the 260/280 nm absorbance ratio, and values had to be greater than 1.8. The majority of samples were 1.9 to 2.1. RNA integrity was assessed using the Agilent Bioanalyzer 2100. We required an RNA integrity number (RIN) of greater than 8. This RIN value is based on the intensity ratio and amplitude of 18S and 28S rRNA signals. The standard Eberwine T7 polymerase method was used to catalyze the synthesis of cDNA template from polyA-tailed RNA using Superscript II reverse transcriptase (Invitrogen Inc.). The Enzo Life Sciences, Inc., BioArray High Yield RNA Transcript Labeling Kit (T7, Part No. 42655) was used to synthesize labeled cRNA. The cRNA was evaluated using both the 260/280 ratio (values of 2.0 or 2.1 are acceptable) and the Bioanalyzer output (a dark cRNA smear on the 2100 output centered roughly between 600 and 2000 nucleotides is required). Those samples that passed both QC steps (10% usually failed and new RNA samples had to be acquired and processed) were then sheared using a fragmentation buffer included in the Affymetrix GeneChip Sample Cleanup Module (Part No. 900371). Fragmented cRNA samples were either stored at -80 deg. C until use or were immediately injected onto the array. The arrays were hybridized and washed following standard Affymetrix protocols.
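
The RNA acceptance thresholds described above can be written as a small filter. This is a minimal sketch in Python, not the software used by the Core: the function name rna_passes_qc and the sample records are hypothetical, and only the two cutoffs (260/280 ratio greater than 1.8, RIN greater than 8) come from the text.

```python
def rna_passes_qc(ratio_260_280, rin, min_ratio=1.8, min_rin=8.0):
    """Return True when an RNA sample meets both the purity and integrity cutoffs."""
    return ratio_260_280 > min_ratio and rin > min_rin

# Hypothetical sample records: (sample label, 260/280 ratio, RIN)
samples = [
    ("S1", 2.0, 9.1),   # passes both checks
    ("S2", 1.7, 9.0),   # fails the purity check
    ("S3", 2.0, 7.5),   # fails the integrity check
]
passed = [label for label, ratio, rin in samples if rna_passes_qc(ratio, rin)]
print(passed)
```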

Replication and Sample Balance: We obtained a male sample pool and female sample pool from each isogenic group. While all strains were originally represented by matched male and female samples, not all data sets passed the final quality control steps. Seventy-seven of 99 strains are represented by pairs or (rarely) trios of arrays. The first and last samples are technical replicates of a B6D2F1 hippocampal pool (aliquots R1291H3 and R1291H4).

Experimental Design and Batch Structure: This data set consists of arrays processed in six groups over a three month period (May 2005 to August 2005). Each group consists of 32 to 34 arrays. Sex, strain, and strain type (BXD, CXB, and MDP) were interleaved among groups to ensure reasonable balance and to minimize group-by-strain statistical confounds in group normalization. The two independent samples from a single strain were always run in different groups. All arrays were processed using a single protocol by a single operator, Shirlean Goodwin.

All samples in a group were labeled on one day, except for a few cases that failed QC on their first pass. The hybridization station accommodates up to 20 samples, and for this reason each group was split into a large first set of 20 samples and a second set of 12 to 14 samples. Samples were washed in groups of four and then held at 4 deg C until all 20 (or 12-14) arrays were ready to scan. The last four samples out of the wash stations were scanned directly. Samples were scanned in sets of four.
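
The design rule that independent samples from one strain are always run in different groups can be checked mechanically. The following is a minimal sketch, assuming the array records are available as (strain, sample ID, group) tuples; the function name is hypothetical and the four example rows are copied from Table 1.

```python
from collections import defaultdict

def strains_sharing_a_group(records):
    """Return strains that have two or more different samples in the same group."""
    seen = defaultdict(set)                  # (strain, group) -> sample IDs
    for strain, sample_id, group in records:
        seen[(strain, group)].add(sample_id)
    return sorted({strain for (strain, _), ids in seen.items() if len(ids) > 1})

records = [
    ("BXD01", "R1509H1", 1),
    ("BXD01", "R1507H1", 3),
    ("BXD02", "R1520H1", 4),
    ("BXD02", "R1516H1", 1),
]
print(strains_sharing_a_group(records))  # []
```

An empty list means that no strain has two samples confounded with a single processing group.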

    Data Table 1:

This table lists all arrays by order of processing (Run), Sample ID, Strain, Sex, Age, number of animals in each sample pool (Pool), F generation number when less than 30 (GenN), and the Source of animals. SampleID is the ID number of the pooled RNA sample with an H1 through H3 suffix to indicate the actual hippocampal RNA aliquot used to prepare cRNA. Grp is the sequential group processing number (1 - 6).
    
Sort  Run  SampleID  Strain       Sex  Age  GenN  Source  Pool  Grp  Notes  Data_Links_to_Affy_Files
1     4    R1509H1   BXD01        F    59   >50   GDR     4     1    u      EXP RPT TXT CEL DAT
2     74   R1507H1   BXD01        M    58   >50   GDR     4     3           EXP RPT TXT CEL DAT
3     102  R1520H1   BXD02        F    56   >50   GDR     4     4           EXP RPT TXT CEL DAT
4     6    R1516H1   BXD02        M    61   >50   GDR     4     1    r      EXP RPT TXT CEL DAT
5     8    R1593H2   BXD05        F    60   >50   GDR     3     1    r      EXP RPT TXT CEL DAT
6     80   R1692H1   BXD05        M    60   >50   GDR     2     3           EXP RPT TXT CEL DAT
7     10   R1539H2   BXD06        F    59   >50   GDR     4     1    s      EXP RPT TXT CEL DAT
8     127  R1538H1   BXD06        M    59   >50   GDR     4     4           EXP RPT TXT CEL DAT
9     12   R1518H1   BXD08        F    56   >50   GDR     4     1    t      EXP RPT TXT CEL DAT
10    189  R1548H1   BXD08        M    59   >50   GDR     3     6           EXP RPT TXT CEL DAT
11    14   R1350H2   BXD09        F    86   >50   UMem    3     1    u      EXP RPT TXT CEL DAT
12    117  R1351H3   BXD09        M    86   >50   UMem    3     4           EXP RPT TXT CEL DAT
13    173  R1531H1   BXD11        F    56   >50   GDR     4     6           EXP RPT TXT CEL DAT
14    16   R1367H1   BXD11        M    56   >50   GDR     4     1    r      EXP RPT TXT CEL DAT
15    18   R1530H1   BXD12        F    58   >50   GDR     4     1    s      EXP RPT TXT CEL DAT
16    119  R1567H1   BXD12        M    58   >50   GDR     4     4           EXP RPT TXT CEL DAT
17    177  R1529H1   BXD13        F    58   >50   GDR     4     6           EXP RPT TXT CEL DAT
18    20   R1662H1   BXD13        M    60   >50   GDR     3     1    NA     EXP RPT TXT CEL DAT
19    22   R1280H2   BXD14        F    56   >50   LuLu    3     1    s      EXP RPT TXT CEL DAT
20    121  R1544H1   BXD14        M    59   >50   GDR     4     4           EXP RPT TXT CEL DAT
21    179  R1524H1   BXD15        F    60   >50   GDR     4     6           EXP RPT TXT CEL DAT
22    24   R1515H1   BXD15        M    61   >50   GDR     4     1    s      EXP RPT TXT CEL DAT
23    26   R1661H1   BXD16        F    61   >50   GDR     3     1    s      EXP RPT TXT CEL DAT
24    123  R1594H1   BXD16        M    61   >50   GDR     3     4           EXP RPT TXT CEL DAT
25    181  R1568H1   BXD19        F    60   >50   GDR     4     6           EXP RPT TXT CEL DAT
26    28   R1471H1   BXD19        M    157  >50   JBo     2     1    t      EXP RPT TXT CEL DAT
27    30   R1573H1   BXD20        F    59   >50   GDR     4     1    s      EXP RPT TXT CEL DAT
28    32   R1347H2   BXD21        F    64   >50   UMem    3     1    s      EXP RPT TXT CEL DAT
29    125  R1349H3   BXD21        M    64   >50   UMem    3     4           EXP RPT TXT CEL DAT
30    183  R1848H1   BXD22        F    196  >50   UAB     3     6           EXP RPT TXT CEL DAT
31    34   R1525H1   BXD22        M    59   >50   GDR     4     2           EXP RPT TXT CEL DAT
32    156  R1254H1   BXD23        F    66   >50   LuLu    4     5           EXP RPT TXT CEL DAT
33    36   R1337H2   BXD23        M    102  >50   UAB     3     2           EXP RPT TXT CEL DAT
34    38   R1343H2   BXD24        F    71   >50   UMem    2     2           EXP RPT TXT CEL DAT
35    94   R1517H1   BXD24        M    57   >50   GDR     4     3           EXP RPT TXT CEL DAT
36    40   R1366H1   BXD27        F    60   >50   GDR     4     2           EXP RPT TXT CEL DAT
37    158  R1849H1   BXD27        M    70   >50   UAB     2     5           EXP RPT TXT CEL DAT
38    76   R1353H1   BXD28        F    79   >50   UMem    3     3           EXP RPT TXT CEL DAT
39    42   R2332H1   BXD28        M    60   >50   GDR     3     2           EXP RPT TXT CEL DAT
40    44   R1532H1   BXD29        F    57   >50   GDR     4     2           EXP RPT TXT CEL DAT
41    160  R1356H1   BXD29        M    76   >50   UMem    3     5           EXP RPT TXT CEL DAT
42    96   R1242H2   BXD31        F    61   >50   LuLu    4     3           EXP RPT TXT CEL DAT
43    46   R1240H2   BXD31        M    61   >50   LuLu    3     2           EXP RPT TXT CEL DAT
44    162  R1470H1   BXD32        F    76   >50   UMem    2     5           EXP RPT TXT CEL DAT
45    48   R1508H2   BXD32        M    58   >50   GDR     4     2           EXP RPT TXT CEL DAT
46    50   R1345H3   BXD33        F    65   >50   UMem    2     2           EXP RPT TXT CEL DAT
47    97   R1581H1   BXD33        M    59   >50   GDR     4     3           EXP RPT TXT CEL DAT
48    52   R1527H1   BXD34        F    59   >50   GDR     4     2           EXP RPT TXT CEL DAT
49    168  R1339H1   BXD34        M    74   >50   UMem    3     5           EXP RPT TXT CEL DAT
50    88   R1469H1   BXD36        F    83   >50   UMem    3     3           EXP RPT TXT CEL DAT
51    54   R1363H1   BXD36        M    77   >50   UMem    3     2           EXP RPT TXT CEL DAT
52    92   R1855H1   BXD38        F    55   >50   GDR     3     3           EXP RPT TXT CEL DAT
53    56   R1510H1   BXD38        M    65   >50   UMem    3     2           EXP RPT TXT CEL DAT
54    58   R1528H2   BXD39        F    59   >50   GDR     4     2           EXP RPT TXT CEL DAT
55    99   R1514H1   BXD39        M    59   >50   GDR     4     3           EXP RPT TXT CEL DAT
56    100  R1522H1   BXD40        F    59   >50   GDR     4     4           EXP RPT TXT CEL DAT
57    60   R1359H1   BXD40        M    73   >50   UMem    3     2           EXP RPT TXT CEL DAT
58    62   R1519H1   BXD42        F    58   >50   GDR     4     2           EXP RPT TXT CEL DAT
59    101  R1512H1   BXD42        M    59   >50   GDR     4     4           EXP RPT TXT CEL DAT
60    5    R1334H2   BXD43        F    59   22    LuLu    4     1    r      EXP RPT TXT CEL DAT
61    84   R1303H1   BXD43        M    63   24    LuLu    3     3           EXP RPT TXT CEL DAT
62    67   R1326H1   BXD44        F    65   20    LuLu    4     3           EXP RPT TXT CEL DAT
63    7    R1577H2   BXD44        M    56   20    LuLu    4     1    r      EXP RPT TXT CEL DAT
64    103  R1399H2   BXD45        F    58   20    LuLu    3     4           EXP RPT TXT CEL DAT
65    191  R1465H1   BXD45        M    62   20    LuLu    4     6           EXP RPT TXT CEL DAT
66    105  R1316H1   BXD48        F    58   21    LuLu    3     4           EXP RPT TXT CEL DAT
67    78   R1575H3   BXD48        M    65   22    LuLu    4     3           EXP RPT TXT CEL DAT
68    175  R1879H1   BXD50        F    69   18    LuLu    3     6           EXP RPT TXT CEL DAT
69    13   R1944H2   BXD50        M    81   18    LuLu    2     1    r      EXP RPT TXT CEL DAT
70    72   R2331H1   BXD51        F    66   25    LuLu    3     3           EXP RPT TXT CEL DAT
71    193  R1330H1   BXD51        M    65   21    LuLu    4     6           EXP RPT TXT CEL DAT
72    107  R2095H2   BXD55        F    61   18    LuLu    3     4           EXP RPT TXT CEL DAT
73    17   R1474H1   BXD55        M    57   15    LuLu    2     1    r      EXP RPT TXT CEL DAT
74    109  R1331H1   BXD60        F    60   21    LuLu    4     4           EXP RPT TXT CEL DAT
75    19   R1281H2   BXD60        M    59   22    LuLu    4     1    s      EXP RPT TXT CEL DAT
76    111  R1914H2   BXD61        F    63   20    LuLu    2     4           EXP RPT TXT CEL DAT
77    21   R1856H2   BXD61        M    94   19    LuLu    2     1    s      EXP RPT TXT CEL DAT
78    23   R1246H1   BXD62        F    54   22    LuLu    4     1    s      EXP RPT TXT CEL DAT
79    195  R1585H1   BXD62        M    64   20    LuLu    4     6           EXP RPT TXT CEL DAT
80    25   R1945H1   BXD63        F    107  21    LuLu    4     1    t      EXP RPT TXT CEL DAT
81    197  R2093H1   BXD63        M    70   21    LuLu    2     6           EXP RPT TXT CEL DAT
82    27   R2062H2   BXD64        F    65   19    LuLu    2     1    u      EXP RPT TXT CEL DAT
83    95   R2061H1   BXD64        M    87   17    LuLu    4     3           EXP RPT TXT CEL DAT
84    29   R2054H2   BXD65        F    55   20    LuLu    2     1    r      EXP RPT TXT CEL DAT
85    199  R2056H1   BXD65        M    89   17    LuLu    2     6           EXP RPT TXT CEL DAT
86    31   R1941H2   BXD66        F    78   20    LuLu    4     1    r      EXP RPT TXT CEL DAT
87    115  R1949H2   BXD66        M    96   21    LuLu    2     4           EXP RPT TXT CEL DAT
88    185  R2060H1   BXD67        F    54   20    LuLu    3     6           EXP RPT TXT CEL DAT
89    33   R2052H1   BXD67        M    61   20    LuLu    3     1    t      EXP RPT TXT CEL DAT
90    142  R2074H1   BXD68        F    60   19    LuLu    3     5           EXP RPT TXT CEL DAT
91    35   R1928H1   BXD68        M    72   16    LuLu    2     2           EXP RPT TXT CEL DAT
92    37   R1439H3   BXD69        F    60   21    LuLu    3     2           EXP RPT TXT CEL DAT
93    86   R1559H1   BXD69        M    64   20    LuLu    3     3           EXP RPT TXT CEL DAT
94    144  R2134H1   BXD70        F    64   21    LuLu    2     5           EXP RPT TXT CEL DAT
95    39   R2063H1   BXD70        M    55   20    LuLu    3     2           EXP RPT TXT CEL DAT
96    113  R1277H1   BXD73        F    60   20    LuLu    2     4           EXP RPT TXT CEL DAT
97    41   R1443H2   BXD73        M    76   21    LuLu    3     2           EXP RPT TXT CEL DAT
98    43   R2055H2   BXD74        M    79   18    LuLu    4     2           EXP RPT TXT CEL DAT
99    146  R2316H1   BXD74        M    193  18    LuLu    2     5           EXP RPT TXT CEL DAT
100   45   R1871H1   BXD75        F    61   21    LuLu    4     2           EXP RPT TXT CEL DAT
101   90   R1844H2   BXD75        M    90   20    LuLu    4     3           EXP RPT TXT CEL DAT
102   47   R1948H2   BXD76        F    81   16    LuLu    3     2           EXP RPT TXT CEL DAT
103   166  R2094H1   BXD76        M    61   17    LuLu    3     5           EXP RPT TXT CEL DAT
104   98   R2262H1   BXD77        F    62   24    LuLu    3     3           EXP RPT TXT CEL DAT
105   49   R1423H1   BXD77        M    62   20    LuLu    4     2           EXP RPT TXT CEL DAT
106   51   R1947H1   BXD79        F    108  17    LuLu    2     2           EXP RPT TXT CEL DAT
107   169  R2092H1   BXD79        M    86   15    LuLu    3     5           EXP RPT TXT CEL DAT
108   164  R1880H1   BXD80        F    68   19    LuLu    3     5           EXP RPT TXT CEL DAT
109   53   R1881H2   BXD80        M    68   19    LuLu    3     2           EXP RPT TXT CEL DAT
110   55   R2075H1   BXD83        F    60   15    LuLu    3     2           EXP RPT TXT CEL DAT
111   187  R2076H1   BXD83        M    60   15    LuLu    3     6           EXP RPT TXT CEL DAT
112   171  R2077H1   BXD84        F    62   17    LuLu    2     6           EXP RPT TXT CEL DAT
113   57   R2135H3   BXD84        M    75   17    LuLu    2     2           EXP RPT TXT CEL DAT
114   59   R1473H1   BXD85        F    79   20    LuLu    4     2           EXP RPT TXT CEL DAT
115   129  R1597H1   BXD85        M    86   21    LuLu    3     4           EXP RPT TXT CEL DAT
116   130  R1415H1   BXD86        F    77   20    LuLu    3     4           EXP RPT TXT CEL DAT
117   61   R1419H1   BXD86        M    58   21    LuLu    3     2           EXP RPT TXT CEL DAT
118   131  R1946H2   BXD87        F    101  20    LuLu    2     4           EXP RPT TXT CEL DAT
119   63   R1710H1   BXD87        M    96   20    LuLu    3     2           EXP RPT TXT CEL DAT
120   64   R1872H2   BXD89        F    90   20    LuLu    2     2           EXP RPT TXT CEL DAT
121   132  R1850H2   BXD89        M    82   19    LuLu    4     4           EXP RPT TXT CEL DAT
122   65   R2058H1   BXD90        F    61   23    LuLu    3     2           EXP RPT TXT CEL DAT
123   133  R1453H1   BXD90        M    61   20    LuLu    2     4           EXP RPT TXT CEL DAT
124   66   R1301H2   BXD92        F    58   21    LuLu    3     2           EXP RPT TXT CEL DAT
125   134  R1309H1   BXD92        M    59   21    LuLu    3     4           EXP RPT TXT CEL DAT
126   148  R2057H1   BXD93        F    92   19    LuLu    2     5           EXP RPT TXT CEL DAT
127   9    R2059H1   BXD93        M    58   19    LuLu    4     1    s      EXP RPT TXT CEL DAT
128   82   R2313H1   BXD94        F    59   14    LuLu    3     3           EXP RPT TXT CEL DAT
129   136  R2314H1   BXD94        M    59   14    LuLu    3     5           EXP RPT TXT CEL DAT
130   150  R1847H1   BXD96        F    70   20    LuLu    3     5           EXP RPT TXT CEL DAT
131   11   R1846H2   BXD96        M    63   20    LuLu    4     1    s      EXP RPT TXT CEL DAT
132   138  R2053H1   BXD97        F    55   21    LuLu    3     5           EXP RPT TXT CEL DAT
133   15   R1927H2   BXD97        M    67   20    LuLu    3     1    r      EXP RPT TXT CEL DAT
134   154  R1942H1   BXD98        F    62   19    LuLu    3     5           EXP RPT TXT CEL DAT
135   68   R1943H2   BXD98        M    62   19    LuLu    3     3           EXP RPT TXT CEL DAT
136   70   R2197H1   BXD99        F    70   14    LuLu    4     3           EXP RPT TXT CEL DAT
137   140  R2315H1   BXD99        M    84   14    LuLu    2     5           EXP RPT TXT CEL DAT
138   69   R2116H1   CXB1         F    55   >50   JAX     3     3           EXP RPT TXT CEL DAT
139   104  R2096H1   CXB1         M    55   >50   JAX     2     4           EXP RPT TXT CEL DAT
140   124  R2124H1   CXB10        F    53   >50   JAX     2     4           EXP RPT TXT CEL DAT
141   87   R2108H1   CXB10        M    53   >50   JAX     3     3           EXP RPT TXT CEL DAT
142   89   R2125H1   CXB11        F    58   >50   JAX     3     3           EXP RPT TXT CEL DAT
143   114  R2128H1   CXB11        M    58   >50   JAX     2     4           EXP RPT TXT CEL DAT
144   126  R2126H1   CXB12        F    47   >50   JAX     3     4           EXP RPT TXT CEL DAT
145   91   R2109H1   CXB12        M    47   >50   JAX     3     3           EXP RPT TXT CEL DAT
146   93   R2127H2   CXB13        F    56   >50   JAX     3     3           EXP RPT TXT CEL DAT
147   128  R2110H1   CXB13        M    56   >50   JAX     3     4           EXP RPT TXT CEL DAT
148   116  R2117H1   CXB2         F    62   >50   JAX     2     4           EXP RPT TXT CEL DAT
149   71   R2098H1   CXB2         M    68   >50   JAX     3     3           EXP RPT TXT CEL DAT
150   73   R2118H1   CXB3         F    47   >50   JAX     3     3           EXP RPT TXT CEL DAT
151   106  R2100H1   CXB3         M    47   >50   JAX     3     4           EXP RPT TXT CEL DAT
152   118  R2119H1   CXB4         F    58   >50   JAX     3     4           EXP RPT TXT CEL DAT
153   75   R2101H1   CXB4         M    58   >50   JAX     3     3           EXP RPT TXT CEL DAT
154   77   R0129H2   CXB5         M    70   >50   LuLu    3     3           EXP RPT TXT CEL DAT
155   108  R2131H1   CXB5         M    42   >50   JAX     3     4           EXP RPT TXT CEL DAT
156   79   R2120H1   CXB6         F    49   >50   JAX     3     3           EXP RPT TXT CEL DAT
157   120  R2102H1   CXB6         M    49   >50   JAX     3     4           EXP RPT TXT CEL DAT
158   110  R2121H1   CXB7         F    63   >50   JAX     2     4           EXP RPT TXT CEL DAT
159   81   R2104H2   CXB7         M    58   >50   JAX     2     3           EXP RPT TXT CEL DAT
160   83   R2122H1   CXB8         F    54   >50   JAX     3     3           EXP RPT TXT CEL DAT
161   122  R2105H1   CXB8         M    41   >50   JAX     3     4           EXP RPT TXT CEL DAT
162   85   R2123H1   CXB9         F    54   >50   JAX     3     3           EXP RPT TXT CEL DAT
163   112  R2106H1   CXB9         M    54   >50   JAX     3     4           EXP RPT TXT CEL DAT
164   135  R2028H2   129S1/SvImJ  F    66   >50   JAX     3     5           EXP RPT TXT CEL DAT
165   170  R2029H1   129S1/SvImJ  M    66   >50   LuLu    4     6           EXP RPT TXT CEL DAT
166   186  R2031H2   A/J          F    57   >50   JAX     3     6           EXP RPT TXT CEL DAT
167   149  R2030H1   A/J          M    57   >50   LuLu    4     5           EXP RPT TXT CEL DAT
168   151  R2032H2   AKR/J        F    66   >50   LuLu    3     5           EXP RPT TXT CEL DAT
169   172  R2033H2   AKR/J        M    67   >50   LuLu    3     6           EXP RPT TXT CEL DAT
170   188  R2034H2   BALB/cByJ    F    63   >50   LuLu    2     6           EXP RPT TXT CEL DAT
171   152  R2035H2   BALB/cByJ    M    63   >50   JAX     3     5           EXP RPT TXT CEL DAT
172   137  R2036H2   BALB/cJ      F    51   >50   JAX     3     5           EXP RPT TXT CEL DAT
173   174  R2037H2   BALB/cJ      M    51   >50   LuLu    2     6           EXP RPT TXT CEL DAT
174   190  R2038H2   C3H/HeJ      F    63   >50   JAX     3     6           EXP RPT TXT CEL DAT
175   153  R2039H1   C3H/HeJ      M    63   >50   LuLu    3     5           EXP RPT TXT CEL DAT
176   139  R2137H1   C57BL/6ByJ   F    55   >50   LuLu    4     5           EXP RPT TXT CEL DAT
177   176  R2136H1   C57BL/6ByJ   M    55   >50   LuLu    2     6           EXP RPT TXT CEL DAT
178   192  R2040H2   C57BL/6J     F    64   >50   LuLu    2     6           EXP RPT TXT CEL DAT
179   2    R2041H2   C57BL/6J     M    65   >50   LuLu    3     1    s      EXP RPT TXT CEL DAT
180   155  R1449H2   C57BL/6J     M    71   >50   LuLu    3     5           EXP RPT TXT CEL DAT
181   141  R2042H2   CAST/Ei      F    64   >50   LuLu    2     5           EXP RPT TXT CEL DAT
182   178  R2043H2   CAST/Ei      M    64   >50   JAX     2     6           EXP RPT TXT CEL DAT
183   165  R1602H2   DBA/2J       F    60   >50   LuLu    3     5           EXP RPT TXT CEL DAT
184   203  R2044H2   DBA/2J       F    63   >50   LuLu    3     6           EXP RPT TXT CEL DAT
185   3    R2045H2   DBA/2J       M    65   >50   LuLu    2     1    s      EXP RPT TXT CEL DAT
186   194  R1683H1   KK/HIJ       F    72   >50   JAX     3     6           EXP RPT TXT CEL DAT
187   157  R1687H2   KK/HIJ       M    72   >50   JAX     2     5           EXP RPT TXT CEL DAT
188   143  R2046H1   LG/J         F    63   >50   JAX     5     5           EXP RPT TXT CEL DAT
189   180  R2047H1   LG/J         M    63   >50   LuLu    2     6           EXP RPT TXT CEL DAT
190   196  R2048H1   NOD/LtJ      F    77   >50   LuLu    4     6           EXP RPT TXT CEL DAT
191   159  R2049H2   NOD/LtJ      M    76   >50   LuLu    4     5           EXP RPT TXT CEL DAT
192   182  R2350H1   NZO/HILtJ    M    96   >50   JAX     2     6           EXP RPT TXT CEL DAT
193   145  R2200H1   NZO/HILtJ    F    62   >50   DanG    4     5           EXP RPT TXT CEL DAT
194   198  R2050H1   PWD/PhJ      F    65   >50   JAX     3     6           EXP RPT TXT CEL DAT
195   161  R2051H2   PWD/PhJ      M    64   >50   JAX     2     5           EXP RPT TXT CEL DAT
196   147  R2322H1   PWK/PhJ      F    63   >50   JAX     3     5           EXP RPT TXT CEL DAT
197   184  R2349H1   PWK/PhJ      M    83   >50   JAX     1     6           EXP RPT TXT CEL DAT
198   200  R2198H1   WSB/EiJ      F    58   >50   LuLu    4     6           EXP RPT TXT CEL DAT
199   163  R2199H1   WSB/EiJ      M    58   >50   JAX     3     5           EXP RPT TXT CEL DAT
200   201  R1289H1   B6D2F1       F    64   NA    LuLu    4     6           EXP RPT TXT CEL DAT
201   1    R1291H3   B6D2F1       M    66   NA    LuLu    4          s      EXP RPT TXT CEL DAT
202   204  R1291H4   B6D2F1       M    66   NA    JAX     3     6           EXP RPT TXT CEL DAT
203   167  R1595H1   D2B6F1       F    63   NA    LuLu    3     5           EXP RPT TXT CEL DAT
204   202  R1286H1   D2B6F1       F    57   NA    LuLu    3     6           EXP RPT TXT CEL DAT

    Downloading all data:

All data links (right-most column above) will be made active as soon as the global analysis of these data by the Consortium has been accepted for publication. Please see text on Data Sharing Policies, and Conditions and Limitations, and Contacts. Following publication, download a summary text file or Excel file of the PDNN probe set data. Contact RW Williams regarding data access problems.

    About the array platform:

Affymetrix Mouse Genome 430 2.0 array: The 430v2 array consists of 992936 useful 25-nucleotide probes that estimate the expression of approximately 39,000 transcripts and the majority of known genes and expressed sequence tags. The array sequences were selected late in 2002 using Unigene Build 107 by Affymetrix. The UTHSC group has recently reannotated all probe sets on this array, producing more accurate data on probe and probe set targets. All probes were aligned to the most recent assembly of the Mouse Genome (Build 34, mm6) using Jim Kent's BLAT program. Many of the probe sets have been manually curated by Jing Gu and Rob Williams.

    About data processing:

Harshlight was used to examine the image quality of the array (CEL files). Bad areas (bubbles, scratches, blemishes) of arrays were masked.

First pass data quality control: Affymetrix GCOS provides useful array quality control data including:

  1. The scale factor used to normalize mean probe intensity. This averaged 3.3 for the 179 arrays that passed and 6.2 for arrays that were excluded. The scale factor is not a particularly critical parameter.
  2. The average background level. Values averaged 54.8 units for the data sets that passed and 55.8 for data sets that were excluded. This factor is not important for quality control.
  3. The percentage of probe sets that are associated with good signal ("present" calls). This averaged 50% for the 179 data sets that passed and 42% for those that failed. Values for passing data sets extended from 43% to 55%. This is a particularly important criterion.
  4. The 3':5' signal ratios of actin and Gapdh. Values for passing data sets averaged 1.5 for actin and 1.0 for Gapdh. Values for excluded data sets averaged 12.9 for actin and 9.6 for Gapdh. This is a highly discriminative QC criterion, although one must keep in mind that only two transcripts are being tested. Sequence variation among strains (particularly wild derivative strains such as CAST/Ei) may affect these ratios.
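
The two discriminating GCOS metrics above (present calls and 3':5' ratios) can be combined into a simple first-pass flag. This is a minimal sketch only. The text reports averages and ranges rather than explicit cutoffs, so both thresholds here are assumptions: min_present is set to the lowest present-call percentage observed among passing arrays (43%), and max_ratio of 3.0 is an illustrative value, not one stated by the Consortium. The function name gcos_flags is hypothetical.

```python
def gcos_flags(present_pct, actin_ratio, gapdh_ratio,
               min_present=43.0, max_ratio=3.0):
    """Return the reasons an array looks suspect; an empty list means no flag."""
    flags = []
    if present_pct < min_present:          # assumed cutoff, see lead-in
        flags.append("low present calls")
    if actin_ratio > max_ratio:            # assumed cutoff, see lead-in
        flags.append("high actin 3':5' ratio")
    if gapdh_ratio > max_ratio:
        flags.append("high Gapdh 3':5' ratio")
    return flags

print(gcos_flags(50.0, 1.5, 1.0))    # average values reported for passing arrays
print(gcos_flags(42.0, 12.9, 9.6))   # average values reported for excluded arrays
```

Flags of this kind are a prompt for inspection rather than an automatic exclusion, since sequence variation in wild derivative strains can shift the 3':5' ratios.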

The second step in our post-processing QC involves a count of the number of probe sets in each array that are more than 2 standard deviations (z score units) from the mean across the entire set of 206 arrays. This was the most important criterion used to eliminate "bad" data sets. All 206 arrays were processed together using standard RMA and PDNN methods. The count and percentage of probe sets in each array that were beyond the 2 z threshold were computed. Using the RMA transform, the average percentage of probe sets beyond the 2 z threshold for the 179 arrays that finally passed our QC procedure was 1.76% (median of 1.18%). In contrast, the 2 z percentage was more than 10-fold higher (mean of 22.4% and median of 20.2%) for those arrays that were excluded. This method is not very sensitive to the transformation method that is used. Using the PDNN transform, the average percentage of probe sets exceeding the threshold was 1.31% for good arrays and 22.6% for those that were excluded. In our opinion, this 2 z criterion is the most useful criterion for the final decision of whether or not to include arrays, although again, allowances need to be made for wild strains that one expects to be different from the majority of conventional inbred strains. For example, if a data set has excellent characteristics on all of the Affymetrix GCOS metrics listed above, but generates a high 2 z percentage, then one would include the sample if one can verify that there are no problems in sample and data set identification.

The entire procedure can be reapplied once the initial outlier data sets have been eliminated to detect any remaining outlier data sets.
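
The 2 z count can be sketched as follows. This is an illustration under stated assumptions, not the code used for this data set: the function name pct_beyond_2z is hypothetical, the data are assumed to be a list of arrays that each hold one value per probe set, and the population standard deviation is used because the text does not say which estimator was applied.

```python
from statistics import mean, pstdev

def pct_beyond_2z(data):
    """data[array][probe_set] -> percentage of probe sets per array beyond 2 SD.

    The mean and SD of each probe set are computed across all arrays.
    """
    n_arrays, n_sets = len(data), len(data[0])
    counts = [0] * n_arrays
    for j in range(n_sets):
        column = [data[i][j] for i in range(n_arrays)]
        mu, sd = mean(column), pstdev(column)
        if sd == 0:
            continue                      # constant probe set: no outliers
        for i, value in enumerate(column):
            if abs(value - mu) > 2 * sd:
                counts[i] += 1
    return [100.0 * c / n_sets for c in counts]

# Six hypothetical arrays with two probe sets each; the last array is aberrant
data = [[8.0, 8.0], [8.0, 8.1], [8.0, 7.9], [8.0, 8.0], [8.0, 8.1], [12.0, 7.9]]
print(pct_beyond_2z(data))
```

Rerunning the function on the arrays that remain after exclusion corresponds to the second pass described above.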

DataDesk was used to examine the statistical quality of the probe level (CEL) data after step 5 below. DataDesk allows rapid detection of subsets of probes that are particularly sensitive to still unknown factors in array processing. Arrays can then be categorized at the probe level into "reaction classes." A reaction class is a group of arrays for which the expression of essentially all probes is collinear over the full range of log2 values. A single but large group of arrays (n = 32) processed in essentially the identical manner by a single operator can produce arrays belonging to as many as four different reaction classes. Reaction classes are NOT related to strain, age, sex, treatment, or any known biological parameter (technical replicates can belong to different reaction classes). We do not yet understand the technical origins of reaction classes. The number of probes that contribute to the definition of reaction classes is quite small (<10% of all probes). We have categorized all arrays in this data set into one of 5 reaction classes. These have then been treated as if they were separate batches. Probes in these data type "batches" have been aligned to a common mean as described below.
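
Aligning batches to a common mean can be pictured with a small sketch. The exact procedure used for the reaction classes is not spelled out in this text, so the approach below is an assumption: for a single probe, each batch is shifted by a constant so that its mean equals the mean across all arrays. The function name align_batches and the example values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def align_batches(values, batches):
    """Shift one probe's values so every batch mean equals the overall mean.

    values  -- expression values for a single probe, one per array
    batches -- batch (reaction class) label for each array, in the same order
    """
    grand_mean = mean(values)
    by_batch = defaultdict(list)
    for v, b in zip(values, batches):
        by_batch[b].append(v)
    # Assumed correction: one additive offset per batch
    offsets = {b: grand_mean - mean(vs) for b, vs in by_batch.items()}
    return [v + offsets[b] for v, b in zip(values, batches)]

aligned = align_batches([7.0, 9.0, 10.0, 12.0], ["A", "A", "B", "B"])
print(aligned)  # [8.5, 10.5, 8.5, 10.5]
```

An additive shift of this kind removes mean differences between batches but leaves within-batch variation untouched; it also removes any real biological difference that happens to be confounded with batch, which is one reason the interleaved design matters.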

Probe (cell) level data from the CEL file: These CEL values produced by GCOS are 75% quantiles from a set of 91 pixel values per cell.

  1. We added an offset of 1.0 unit to each cell signal to ensure that all values could be logged without generating negative values. We then computed the log base 2 of each cell.
  2. We performed a quantile normalization of the log base 2 values for all arrays using the same initial steps used by the RMA transform.
  3. We computed the Z scores for each cell value.
  4. We multiplied all Z scores by 2.
  5. We added 8 to the value of all Z scores. The consequence of this simple set of transformations is to produce a set of Z scores that have a mean of 8, a variance of 4, and a standard deviation of 2. The advantage of this modified Z score is that a two-fold difference in expression level (probe brightness level) corresponds approximately to a 1 unit difference.
  6. Finally, we computed the arithmetic mean of the values for the set of microarrays for each strain. Technical replicates were averaged before computing the mean for independent biological samples. Note that we have not (yet) corrected for variance introduced by differences in sex or any interaction terms. We have not corrected for background beyond the background correction implemented by Affymetrix in generating the CEL file. We eventually hope to add statistical controls and adjustments for some of these variables.
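
The transformation steps listed above, up to the rescaling to a mean of 8 and a standard deviation of 2, can be sketched in a few lines. This is a minimal illustration, not the Consortium's pipeline: all function names are hypothetical, the raw intensities are invented, ties in the quantile normalization are handled naively, and the population standard deviation is assumed for the Z scores.

```python
from math import log2
from statistics import mean, pstdev

def quantile_normalize(arrays):
    """Give every array the same distribution: the mean of the sorted values."""
    n = len(arrays[0])
    sorted_cols = [sorted(a) for a in arrays]
    reference = [mean(col[k] for col in sorted_cols) for k in range(n)]
    result = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])   # indices from low to high
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        result.append(out)
    return result

def modified_z(values):
    """Rescale one array to mean 8 and SD 2 (Z score times 2, plus 8)."""
    mu, sd = mean(values), pstdev(values)
    return [2.0 * (v - mu) / sd + 8.0 for v in values]

def transform(raw_arrays):
    logged = [[log2(v + 1.0) for v in a] for a in raw_arrays]  # offset, then log2
    return [modified_z(a) for a in quantile_normalize(logged)]

# Hypothetical raw cell intensities for two tiny arrays
raw = [[100.0, 400.0, 1600.0, 6400.0], [6000.0, 90.0, 1500.0, 380.0]]
scaled = transform(raw)
print([round(mean(a), 6) for a in scaled], [round(pstdev(a), 6) for a in scaled])
```

Because the rescaling divides by each array's own standard deviation, one unit on the final scale equals exactly one log2 unit (a two-fold difference) only when the standard deviation of the log2 values is 2; this is why the correspondence is described as approximate.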

    Probe set data from the CHP file: The expression values were generated using PDNN. The same simple steps described above were also applied to these values. Every microarray data set therefore has a mean expression of 8 with a standard deviation of 2. A 1 unit difference represents roughly a two-fold difference in expression level. Expression levels below 5 are usually close to background noise levels.

  • Probe level QC: Log2 probe data of all arrays were inspected in DataDesk before and after quantile normalization. Inspection involved examining scatterplots of pairs of arrays for signal homogeneity (i.e., high correlation and linearity of the bivariate plots) and looking at all pairs of correlation coefficients. XY plots of probe expression and signal variance were also examined. Probe level array data sets were organized into reaction groups. Arrays with probe data that were not homogeneous when compared to other arrays were flagged.

    Probe set level QC: The final normalized individual array data were evaluated for outliers. This involved counting the number of times that the probe set value for a particular array was beyond two standard deviations of the mean. This outlier analysis was carried out using the PDNN, RMA and MAS5 transforms and outliers across different levels of expression. Arrays that were associated with an average of more than 8% outlier probe sets across all transforms and at all expression levels were eliminated. In contrast, most other arrays generated fewer than 5% outliers.
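
The exclusion rule in the paragraph above (an average of more than 8% outlier probe sets across transforms) can be sketched as a short filter. The function name, the array labels, and the percentages are hypothetical; only the 8% cutoff and the three transform names come from the text.

```python
from statistics import mean

def arrays_to_exclude(outlier_pct, cutoff=8.0):
    """outlier_pct maps array ID -> {transform name: percent outlier probe sets}."""
    return sorted(array_id for array_id, by_transform in outlier_pct.items()
                  if mean(list(by_transform.values())) > cutoff)

# Hypothetical outlier percentages for two arrays
outlier_pct = {
    "array_A": {"PDNN": 1.3, "RMA": 1.8, "MAS5": 2.0},
    "array_B": {"PDNN": 22.6, "RMA": 22.4, "MAS5": 19.0},
}
print(arrays_to_exclude(outlier_pct))  # ['array_B']
```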

    Validation of strains and sex of each array data set: A subset of probes and probe sets with a Mendelian pattern of inheritance was used to construct an expression correlation matrix for all arrays and the ideal Mendelian expectation for each strain constructed from the genotypes. There should naturally be a very high correlation in the expression patterns of transcripts with Mendelian phenotypes within each strain, as well as with the genotype strain distribution pattern of markers for the strain.
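
The idea behind this strain check can be sketched as a correlation between an array's expression values and the genotype pattern expected for each candidate strain. This is an illustration under assumptions, not the procedure actually run: the function names are hypothetical, genotypes are coded 0 or 1, each probe set is assumed to be oriented so that the allele coded 1 gives the higher expression, and the example values are invented.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def best_matching_strain(expression, genotypes_by_strain):
    """Return the strain whose genotype pattern correlates best with expression.

    expression          -- values for probe sets with Mendelian expression
    genotypes_by_strain -- strain -> genotype codes (0 or 1) at matching markers
    """
    return max(genotypes_by_strain,
               key=lambda s: pearson(expression, genotypes_by_strain[s]))

# Hypothetical array labeled BXD01, compared against three genotype patterns
expression = [10.1, 7.9, 10.3, 8.0, 9.9]
genotypes = {
    "BXD01": [1, 0, 1, 0, 1],
    "BXD02": [0, 0, 1, 1, 0],
    "BXD05": [0, 1, 0, 1, 0],
}
print(best_matching_strain(expression, genotypes))  # BXD01
```

A mismatch between the best matching strain and the label on the array is a signal to recheck sample identification before any exclusion decision.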

    Sex of the samples was validated using sex-specific probe sets such as Xist and Dby.
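
A sex check based on these two probe sets can be sketched as below. Xist is expressed from the inactive X chromosome in females and Dby is Y-linked, so the two signals should point in opposite directions. The function name and the threshold of 8 units (the array mean on the rescaled data) are assumptions chosen for illustration, not values taken from this study.

```python
def infer_sex(xist, dby, threshold=8.0):
    """Classify a sample from Xist and Dby expression on the mean-8, SD-2 scale."""
    if xist > threshold and dby <= threshold:
        return "F"
    if dby > threshold and xist <= threshold:
        return "M"
    return "ambiguous"          # e.g. a mixed pool or a mislabeled sample

print(infer_sex(11.2, 6.1))     # F
print(infer_sex(6.0, 10.5))     # M
print(infer_sex(10.0, 10.0))    # ambiguous
```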

        Data source acknowledgment:

    Data were generated with funds provided by a variety of public and private sources to members of the Consortium. All of us thank Muriel Davisson, Cathy Lutz, and colleagues at the Jackson Laboratory for making it possible for us to add all of the CXB strains, and one or more samples from KK/HIJ, WSB/Ei, NZO/HILtJ, LG/J, CAST/Ei, PWD/PhJ, and PWK/PhJ to this study. We thank Yan Cui at UTHSC for allowing us to use his Linux cluster to align all M430 2.0 probes and probe sets to the mouse genome. We thank Hui-Chen Hsu and John Mountz for providing us BXD tissue samples, as well as many strains of BXD stock. We thank Douglas Matthews (UMem in Table 1) and John Boughter (JBo in Table 1) for sharing BXD stock with us. Members of the Hippocampus Consortium thank the following sources for financial support of this effort:

    • David C. Airey, Ph.D.
      Grant Support: Vanderbilt Institute for Integrative Genomics
      Department of Pharmacology
      david.airey at vanderbilt.edu
    • Lu Lu, M.D.
      Grant Support: NIH U01AA13499, U24AA13513
    • Fred H. Gage, Ph.D.
      Grant Support: NIH XXXX
    • Dan Goldowitz, Ph.D.
      Grant Support: NIAAA INIA AA013503
      University of Tennessee Health Science Center
      Dept. Anatomy and Neurobiology
      email: dgold@nb.utmem.edu
    • Shirlean Goodwin, Ph.D.
      Grant Support: NIAAA INIA U01AA013515
    • Gerd Kempermann, M.D.
      Grant Support: The Volkswagen Foundation Grant on Permissive and Persistent Factors in Neurogenesis in the Adult Central Nervous System
      Humboldt-Universität Berlin
      Universitätsklinikum Charité
      email: gerd.kempermann at mdc-berlin.de
    • Kenneth F. Manly, Ph.D.
      Grant Support: NIH P20MH062009 and U01CA105417
    • Richard S. Nowakowski, Ph.D.
      Grant Support: R01 NS049445-01
    • Yanhua Qu, Ph.D.
      Grant Support: NIH U01CA105417
    • Glenn D. Rosen, Ph.D.
      Grant Support: NIH P20
    • Leonard C. Schalkwyk, Ph.D.
      Grant Support: MRC Career Establishment Grant G0000170
      Social, Genetic and Developmental Psychiatry
      Institute of Psychiatry, King's College London
      PO82, De Crespigny Park London SE5 8AF
      L.Schalkwyk@iop.kcl.ac.uk
    • Guus Smit, Ph.D.
      Dutch NeuroBsik Mouse Phenomics Consortium
      Center for Neurogenomics & Cognitive Research
      Vrije Universiteit Amsterdam, The Netherlands
      e-mail: guus.smit at falw.vu.nl
      Grant Support: BSIK 03053
    • Thomas Sutter, Ph.D.
      Grant Support: INIA U01 AA13515 and the W. Harry Feinstone Center for Genomic Research
    • Stephen Whatley, Ph.D.
      Grant Support: XXXX
    • Robert W. Williams, Ph.D.
      Grant Support: NIH U01AA013499, P20MH062009, U01AA013499, U01AA013513

        About this text file:

    This text file was originally generated prospectively by RWW on July 30, 2005. Updated by RWW July 31, 2005.