<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>BXD Genotype / WebQTL</TITLE>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<LINK REL="stylesheet" TYPE="text/css" HREF='/css/general.css'>
<LINK REL="stylesheet" TYPE="text/css" HREF='/css/menu.css'>

</HEAD>
<BODY  bottommargin="2" leftmargin="2" rightmargin="2" topmargin="2" text=#000000 bgColor=#ffffff>
<TABLE cellSpacing=5 cellPadding=4 width="100%" border=0>
<TBODY>
<TR>
<script language="JavaScript" src="/javascript/header.js"></script>
</TR>
<TR>
<TD bgColor=#eeeeee class="solidBorder">
<Table width= "100%" cellSpacing=0 cellPadding=5><TR>
<!-- Body Start from Here -->
<TD valign="top" height="200" width="100%" bgcolor="#eeeeee">

<P class="title">BXD Genotypes Database

 <A HREF="/webqtl/main.py?FormID=editHtml"><img src="/images/modify.gif" alt="modify this page" border= 0 valign="middle"></A></P>


<P class="subtitle">&nbsp;&nbsp;&nbsp;&nbsp;Coming Soon:</P>
<Blockquote>
The BXD genotype file is being upgraded in 2010-2011 using the new high density Affymetrix array developed in the laboratories of Drs. Fernando Pardo-Manuel de Villena (University of North Carolina) and Gary Churchill (The Jackson Laboratory). This cutting-edge research tool, produced by Affymetrix, provides more than 100 times the SNP coverage of any other available mouse genotyping platform, permitting high resolution mapping and genomic analysis (580,000 high quality SNPs out of 623,124 SNPs, plus 916,269 invariant probes).

<P>Yang H, Ding Y, Hutchins LN, Szatkiewicz J, Bell TA, Paigen BJ, Graber JH, Pardo-Manuel de Villena F, Churchill GA (2009) A customized and versatile high density genotyping array for the mouse. Nat Methods 6:663-666.

<!-- Also has exon 1 and exon 2 invariant probes for expression analysis. 
Segmental duplication regions are slightly problematic, some multicopy gap-filling probes.
Variable intensity oligonuceotide picks up probe failure (A-B)
D2 has a het rate of 1.218%
C57 has het rate of 0.897%
PANCEVO has 12% het rate.
Copy number variation has to account for GC and fragment length. Mismatch are not as bright. Go back to separate bight by dim Het calls.
FhetCalls VeeNos. 
Mitochondrial SNPs.  MOLD/RkJ and CASA/RkJ have M. dom mitchondria.
129S1 and 129S6 differ a 18.5% of genome. 129S1 has LP genome segments
Can do methylation assays using the array. MspI and Hpa11 restriction
Hyuna Yang, Jin Szatkiewicz,

MSM sequenced used to search of B6 sequence gaps.
2773 B6 singletons
WSB, CAST, PWK
7.5 million Perlegen clean SNPs to construct phylogenetic tree, region by region. 25,000 different phylogenetic trees.  A SNP at each branch point and each region. Built a biased tree. For all SNPs you can figure out why it was chosen. W Yang et al., 2007
-->

<P><A HREF="http://jaxservices.jax.org/mdarray.html" target="_blank" class="fs14">JAX® Mouse Diversity Genotyping Array</A>

<P><B>New genotype array key features.</B> This genotyping array can simultaneously assay over 620,000 phylogenetically informative SNPs. SNPs are spaced approximately one every 4.3 kb across the genome and were selected to be highly polymorphic among characterized mouse strains. Genotypes called from analysis of the array data are highly reliable. In an internal study of two strains, 99.7% of the polymorphic SNPs that had genotypes in the NCBI dbSNP database had matching genotypes from the Diversity Array.


</BLOCKQUOTE>

</Blockquote>


<P class="subtitle">&nbsp;&nbsp;&nbsp;&nbsp;Synposis:</P>


<Blockquote>
The BXD genotype file used from June 2005 through 2011 exploits a set of 3796 markers typed across 88 extant and extinct BXD strains (BXD1 through BXD100). The mean interval between informative markers is about 0.7 Mb. This genotype file includes all markers, both SNPs and microsatellites, with unique strain distribution patterns (SDPs), as well as pairs of markers for those SDPs represented by two or more markers. Where three or more markers had the same SDP, we retained only the most proximal and most distal markers in the genotype file. This particular file has also been smoothed to eliminate genotypes that are likely to be erroneous. We have also conservatively imputed a small number of missing genotypes (usually over very short intervals). Smoothing genotypes in this way reduces the total number of SDPs and also lowers the rate of false discovery. However, this procedure may also eliminate some genuine SDPs.
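<P>The marker-retention rule described above can be illustrated with a minimal Python sketch. It assumes that markers with the same SDP occur as consecutive runs along one chromosome, and it uses hypothetical marker names and a hypothetical function name (<I>compress_sdps</I>); it is not the procedure actually used to build the file.</P>
<PRE>
```python
from itertools import groupby

def compress_sdps(markers):
    """Keep the first and last marker of each run of identical SDPs.

    `markers` is a list of (name, sdp) pairs in map order on one
    chromosome (an assumption of this sketch). A run of one marker is
    kept as is, a run of two is kept as a pair, and a longer run is
    reduced to its most proximal and most distal markers.
    """
    kept = []
    for _, run in groupby(markers, key=lambda m: m[1]):
        run = list(run)
        kept.append(run[0])
        if len(run) > 1:
            kept.append(run[-1])
    return kept

# Hypothetical markers typed in three strains (one letter per strain).
markers = [("m1", "BBD"), ("m2", "BBD"), ("m3", "BBD"),
           ("m4", "BDD"), ("m5", "DDD"), ("m6", "DDD")]
print([name for name, _ in compress_sdps(markers)])
```
</PRE>
<P>In this example the interior marker m2 is dropped because m1 and m3 already bracket the run that shares its SDP.</P>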

<P><B>The smoothed BXD genotype data file can be downloaded from

<BR><A HREF="http://www.genenetwork.org/genotypes/BXD.geno" target="_blank" class="fs14">GeneNetwork at the URL http://www.genenetwork.org/genotypes/BXD.geno</A>.</B>

<P>Please Note: For a limited number of markers and strains, the genotypes of BXDs have been called heterozygous. This is usually done over comparatively short intervals in some of the newer strains that may not have been fully inbred when they were initially genotyped. Use of the genotype file above in external software packages such as R/qtl requires careful treatment of this issue to prevent bias in empirical significance thresholds. It is recommended to treat these rare heterozygous loci as missing data and to ensure that only the additive effects of B vs. D alleles are estimated by these packages. (Note from Elissa Chesler, Dec 2010.)
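<P>One way to follow this recommendation before exporting the data is sketched below in Python. The file layout is an assumption to check against the downloaded file: lines starting with @ or # are taken to be metadata or comments, the first remaining line is taken to be a column header, each marker row is assumed to be tab-delimited with four leading columns (Chr, Locus, cM, Mb), and calls are assumed to be coded B, D, H (heterozygous), and U (unknown). The function name and the example marker are hypothetical.</P>
<PRE>
```python
def mask_heterozygotes(lines, het="H", missing="U", n_meta=4):
    """Return the rows of a genotype file with heterozygous calls recoded as missing.

    The layout (comment prefixes, four leading columns, call codes)
    is an assumption of this sketch, not a documented specification.
    """
    out = []
    header_seen = False
    for line in lines:
        line = line.rstrip("\n")
        # Pass blank, comment (#), and metadata (@) lines through unchanged.
        if not line or line[0] in "#@":
            out.append(line)
            continue
        # The first remaining line is the column header.
        if not header_seen:
            header_seen = True
            out.append(line)
            continue
        fields = line.split("\t")
        meta, calls = fields[:n_meta], fields[n_meta:]
        calls = [missing if c == het else c for c in calls]
        out.append("\t".join(meta + calls))
    return out

# Hypothetical three-strain example.
example = [
    "@name:BXD",
    "Chr\tLocus\tcM\tMb\tBXD1\tBXD2\tBXD43",
    "1\tmarker_1\t0.0\t3.0\tB\tD\tH",
]
for row in mask_heterozygotes(example):
    print(row)
```
</PRE>
<P>With the heterozygous calls recoded, a downstream package sees only B, D, and missing genotypes, so only the additive B vs. D contrast can be estimated.</P>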
 

</BLOCKQUOTE>

</Blockquote>

<P class="subtitle">&nbsp;&nbsp;&nbsp;&nbsp;Source of Genotypes:</P>

<Blockquote>
<P>In collaboration with members of the CTC (Richard Mott, Jonathan Flint, and colleagues), we have helped genotype a total of 480 strains using a panel of <A HREF="http://www.well.ox.ac.uk/mouse/INBREDS/" target="_blank" class="fs14">13,377</A> SNPs. These SNPs have been combined with our previous microsatellite genotypes to produce new consensus maps for the expanded set of BXD strains, using the latest mouse genome assembly as a reference frame for marker order (Mouse Build 36 - UCSC mm8). The order of markers given in the BXD genotype file is essentially the same as that given in Build 36. (Files were updated from mm6 to mm8 in January 2007.)

<P>A total of 88 strains were genotyped using the full set of SNPs, and 7482 of these SNPs were informative. <I>Informative</I> in this sense simply means that the C57BL/6J and DBA/2J parental strains have different alleles. To reduce false positive errors when mapping with this ultra dense map, we have eliminated most single genotypes that generate double-recombinant haplotypes, which are most commonly produced by typing errors ("smoothed" genotypes). For this reason, the genotypes used in GeneNetwork differ from those downloaded directly from Richard Mott's <A HREF="http://www.well.ox.ac.uk/mouse/INBREDS/" target="_blank" class="fs14">web site</A> at the Wellcome Trust, Oxford.
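<P>The idea behind smoothing can be sketched in Python as follows. This is only an illustration under stated assumptions: calls are coded B, D, or U (unknown), an isolated call is flagged whenever both flanking markers carry the opposite parental genotype, and flagged calls are recoded as unknown. The actual procedure removed only "most" such genotypes and presumably also weighed the distance between markers, which this sketch ignores. The function name is hypothetical.</P>
<PRE>
```python
def smooth_singletons(calls):
    """Recode isolated calls that imply a double recombinant as unknown ('U').

    `calls` holds one strain's genotypes ('B', 'D', or 'U') at
    consecutive markers along a chromosome. Comparisons always use
    the original calls, so one correction never triggers another.
    """
    out = list(calls)
    for i in range(1, len(calls) - 1):
        left, mid, right = calls[i - 1], calls[i], calls[i + 1]
        flanks_agree = left == right and left in ("B", "D")
        if flanks_agree and mid in ("B", "D") and mid != left:
            out[i] = "U"
    return out

# The lone D at the third marker is flanked by B on both sides.
print("".join(smooth_singletons(list("BBDBBDDD"))))
```
</PRE>
<P>A genuine recombination, such as the B to D transition later in the example, is left untouched because the flanking calls disagree.</P>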

<P>
We have genotyped all available BXD strains from The Jackson Laboratory. BXD1 through BXD32 were produced by Benjamin Taylor starting in the late 1970s. BXD33 through BXD42 were produced by Taylor in the 1990s (Taylor et al., 1999). All BXD strains with numbers higher than BXD42 (BXD43 through BXD100) were generated by Lu Lu and Robert Williams at UTHSC, and by Jeremy Peirce and Lee Silver at Princeton University. We thank Guomin Zhou for generating the advanced intercross stock used to produce most of these advanced RI strains both at UTHSC and Princeton.

<!--As of July 2005, these lines are an average of 18 generations inbred.-->

There are approximately 48 of these advanced BXD strains, each of which archives approximately twice as many recombinations as a typical F2-derived recombinant inbred strain (Peirce et al. <A HREF="http://www.webqtl.org/reference.html" target="_blank" class="fs14">2003</a>).
</Blockquote>



<P class="subtitle">&nbsp;&nbsp;&nbsp;&nbsp;Mapping Algorithm:</P>

<Blockquote>
<P>Due to the very high density of markers, the mapping algorithm used to map BXD data sets has been modified and is a mixture of simple marker regression, linear interpolation, and standard Haley-Knott interval mapping. When two adjacent markers have identical SDPs, they will have identical linkage statistics, as will the entire interval between these two markers (assuming complete and error-free haplotype data for all strains). On a physical map the <A HREF="http://www.genenetwork.org/glossary.html#LRS" target="_blank" class="fs14">LRS</A> and the <A HREF="http://www.genenetwork.org/glossary.html#additive" target="_blank" class="fs14">additive effect</A> values will therefore be constant over this physical interval. Between neighboring markers that have different SDPs and that are separated by 1 cM or more, we use a conventional interval mapping method (Haley-Knott) combined with a Haldane estimate of genetic distance. When the interval is less than 1 cM, we simply interpolate linearly between markers based on a physical scale between those markers. The result of this <B>mixture mapping algorithm</B> is a linkage map of a trait that has an unusual profile that is particularly striking on a physical (Mb) scale, with many plateaus, abrupt linear transitions between plateaus, and a few regions with the standard graceful curves typical of interval maps.</P>
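<P>The three cases described above can be illustrated with a minimal Python sketch. It is not the GeneNetwork implementation. It assumes genotypes coded -1 (B) and +1 (D) with no missing or heterozygous calls, a simple two-genotype Haley-Knott regression, genetic position within an interval taken as proportional to physical position, and two flanking markers at distinct Mb positions. All function names and the marker dictionary layout are hypothetical.</P>
<PRE>
```python
import math

def haldane_r(d_cm):
    """Haldane map function: recombination fraction for a distance in cM."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))

def lrs(x, y):
    """Likelihood ratio statistic for a simple regression of trait y on x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    rss0 = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or rss0 == 0.0:
        return 0.0
    rss1 = rss0 - sxy * sxy / sxx
    # Guard against a perfect fit, where the residual sum of squares is 0.
    return n * math.log(rss0 / max(rss1, 1e-12))

def expected_genotype(left, right, r1, r2):
    """Expected genotype (-1 = B, +1 = D) at a point between two markers."""
    r = r1 + r2 - 2.0 * r1 * r2  # flanking markers, no interference
    if left == right:
        p_same_as_left = (1.0 - r1) * (1.0 - r2) / (1.0 - r)
    else:
        p_same_as_left = (1.0 - r1) * r2 / r
    return left * (2.0 * p_same_as_left - 1.0)

def mixture_lrs(pos_mb, m1, m2, y):
    """LRS at pos_mb between markers m1 and m2 (dicts with 'cm', 'mb', 'geno')."""
    g1, g2 = m1["geno"], m2["geno"]
    if g1 == g2:
        return lrs(g1, y)  # identical SDPs: a plateau
    gap_cm = m2["cm"] - m1["cm"]
    frac = (pos_mb - m1["mb"]) / (m2["mb"] - m1["mb"])
    if gap_cm >= 1.0:
        # Haley-Knott: regress the trait on expected genotypes.
        d1 = frac * gap_cm
        r1, r2 = haldane_r(d1), haldane_r(gap_cm - d1)
        x = [expected_genotype(a, b, r1, r2) for a, b in zip(g1, g2)]
        return lrs(x, y)
    # Short interval: interpolate linearly on the physical (Mb) scale.
    return (1.0 - frac) * lrs(g1, y) + frac * lrs(g2, y)
```
</PRE>
<P>Evaluating <I>mixture_lrs</I> on a grid of positions along a chromosome would give the mix of plateaus, straight ramps, and smooth curves described above, depending on which branch applies to each interval.</P>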
</Blockquote>


<P class="subtitle">&nbsp;&nbsp;&nbsp;&nbsp;Archival Genotypes:</P>

<Blockquote>
<B>Archival BXD Genotype file</B>: Prior to July 2005, the marker genotypes used to map all BXD data sets consisted of a set of 779 markers described by Williams and colleagues (2001) that also included a small number of additional SNPs from Tim Wiltshire and Mathew Pletcher (GNF, La Jolla), new microsatellite markers generated by Grant Morahan and Jing Gu (Msw type markers), and a few CTC markers by Jing Gu. This old marker data set was made obsolete by the ultra high density Illumina SNP genotype data generated in spring 2005. The old genotype file is still available for use on the Archive site.
</Blockquote>


<P class="subtitle">&nbsp;&nbsp;&nbsp;&nbsp;Download Genotypes:</P>
<Blockquote>
The entire BXD genotype data set used for mapping traits can be downloaded at <A HREF="http://www.genenetwork.org/genotypes/BXD.geno" class="fs14">www.genenetwork.org/genotypes/BXD.geno</A>. </P>
</Blockquote>









<!-- OLD TEXT used prior to June 2005

<P class="subtitle">&nbsp;&nbsp;&nbsp;&nbsp;About the genotypes used in these studies:</P>

<Blockquote>
WebQTL mapping algorithms rely on genotypes for the BXD strains that include both microsatellite markers (labeled <I>Mit</I> and <I>Msw</I>) and single nucleotide polymorphisms (labeled <I>Gnf</I>). The current set of markers (n = 779) have been carefully error-checked. Closely linked genetic markers often have the same strain distribution pattern (SDP) across the BXD strains. For computational efficiency, we only use a single marker associated with each SDP.
</Blockquote>


<Blockquote>
Marker-strain pairs for which we were missing genotypes were often inferred from flanking markers. In marker sets lacking genotypes for a particular strain, a note is included to that effect in the marker set description below.
</Blockquote>

<P class="subtitle">&nbsp;&nbsp;&nbsp;&nbsp;About the marker sets:</P>

<Blockquote> <U><B>Mit</B></U><br>
<I>Mit</I> markers, described by William Dietrich and colleagues (<a href="http://www.broad.mit.edu/cgi-bin/mouse/sts_info?database=mouserelease" target="_blank" class="fs14">1992</a>), are the most widely used of the three marker sets. These markers typically consist of regions of repeated dinucleotides (so-called CA repeat microsatellites) that vary in length among strains. The CA repeat polymorphisms are flanked by unique sequence that can be used to design polymerase chain reaction (PCR) primers that will selectively amplify the intervening variable region. While many of the <i>Mit</i> markers have been typed in the BXD strain set by a number of investigators, the genotypes used here are those reported in the consensus map created by Williams and colleagues (<a href="http://www.genomebiology.com/2001/2/11/research/0046" target="_blank" class="fs14">2001</a>).

<br>
<br><i>Mit</i> marker names: D + (Chr of Marker) + Mit + (Order Found) <UL>
<LI>D indicates that the marker is a DNA segment.
<LI><i>Mit</i>  indicates that the marker was identified at the Massachusetts Institute of Technology.
<LI>Order Found indicates the order in which the markers were identified. </UL>
</Blockquote>

<Blockquote><B><U>Gnf</U></B>

<br><i>Gnf</i> markers are single nucleotide polymorphisms (SNPs) identified between B6 and D2 by genomic sequence sampling. Polymorphisms were typed by Mathew Pletcher and Tim Wiltshire using the Sequenom MassEXTEND system (Wiltshire et al., <A HREF="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=12612341&dopt=Abstract" target="_blank" class="fs14">2003</A>) For BXD8 as well as BXD67 and BXD68, genotypes were ofteninferred from flanking markers. Each of the genotyping reactions was set up in duplicate. Physical positions were determined for each marker and integrated with previous BXD RI mapping data based on a combination of physical and genetic positions. Unsupported double crossovers were verified by manual inspection to ensure accuracy of calls. A full list of SNPs identified in the sequence sampling can be found at <a href="http://www.gnf.org/SNP/" class = "normalsize">http://www.gnf.org/SNP</a>.
<br>
<br><i>Gnf</i> marker names: S + (Chr of Marker) + Gnf + (Mb position) <UL>
<LI>S indicates the marker is a SNP
<LI><i>Gnf</i> indicates that the marker originated at the Genomics Institute of the Novartis Research Foundation.
<LI>Mb position may include decimal values. </UL>
</Blockquote>

<Blockquote><B><U>Msw</U></B>

<br><i>Msw</i> markers are variable length tracts of nucleotide repeats designed and tested by Grant Morahan, Keith Satterley, Robert W. Williams, and Jing Gu. In contrast to the variable CA repeats of Mit markers, the <i>Msw</i> markers exploit polymorphisms in tri- tetra-, penta-, and hexa-nucleotide repeats. <i>Msw</i> markers were typed by Shuhua Qi and Jing Gu at UTHSC using previously described methods (Williams et al. <a href="http://www.genomebiology.com/2001/2/11/research/0046" target="_blank" class="fs14">2001</a>). Genotypes for BXD67 and BXD68 were often inferred from flanking markers. Physical positions were determined for each marker by BLAT analysis of the microsatellite sequence against the most recent assembly of the mouse genome (currently mm5 of May 2004) and integrated with previous BXD RI mapping data based on a combination of physical and genetic positions.
<br>

<br><i>Msw</i> marker names: D + (Chr of Marker) + <i>Msw</i> + (Mb Position)<UL>
<LI>D indicates that the marker is a DNA segment <i>Msw</i> indicates the marker source.
<LI>Mb Position is marker position to the nearest megabase.
<LI>Mb position may include decimal values and, in rare cases, a letter suffix (a or b) if alternative primers were used to amplify the same repeat.
</UL></Blockquote>


END DELETED TEXT   -->



<P class="subtitle">&nbsp;&nbsp;&nbsp;&nbsp;Acknowledgments:</P>

<Blockquote>
The great majority of SNP genotypes were generated at <A HREF="http://www.illumina.com/Products/prod_snp.ilmn" target="_empty" class="fs14">Illumina</A> with support from the Wellcome Trust to JF and RM, a Human Brain Project grant to RWW (P20-MH 62009 and IBN-0003982), and by the NIAAA INIA Genotyping Core (U24AA13513). Genotypes for Mit and Msw markers were generated by Jing Gu and Lu Lu with support from NIH (P20-MH 62009). Markers for the <i>Msw</i> set were designed by Grant Morahan and Keith Satterley. <i>Gnf</i> SNP genotypes were generated by Tim Wiltshire and Mathew Pletcher. The selection of markers to include in the final file was carried out by Jing Gu and Robert W. Williams.
</Blockquote>


<P class="subtitle">&nbsp;&nbsp;&nbsp;&nbsp;Reference:</P>

<Blockquote>
<P>Dietrich WF, Katz H, Lincoln SE (<a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=1353738&dopt=Abstract" class="fs14">1992</a>) A genetic map of the mouse suitable for typing in intraspecific crosses. Genetics 131:423-447.
</P></Blockquote>

<Blockquote><P>
Taylor BA, Wnek C, Kotlus BS, Roemer N, MacTaggart T, Phillips SJ  (<A HREF="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=10087289&dopt=Abstract" class="fs14">1999</A>) Genotyping new BXD recombinant inbred mouse strains and comparison of BXD and consensus maps.
Mamm Genome 10:335-348.
</P></Blockquote>

<Blockquote><P>
Williams RW, Gu J, Qi S, Lu L (<a href="http://www.genomebiology.com/2001/2/11/research/0046" class=normal>2001</a>) The genetic structure of recombinant inbred mice: High-resolution consensus maps for complex trait analysis. Genome Biology 2:RESEARCH0046.
</P></Blockquote>

<Blockquote><P>
Wiltshire T, Pletcher MT, Batalov S, Barnes SW, Tarantino LM, Cooke MP, Wu H, Smylie K, Santrosyan A, Copeland NG, Jenkins NA, Kalush F, Mural RJ, Glynne RJ, Kay SA, Adams MD, Fletcher CF (<A HREF="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=12612341&dopt=Abstract" class="fs14">2003</A>) Genome-wide single-nucleotide polymorphism analysis defines haplotype patterns in mouse. Proc Natl Acad Sci USA 100:3380-3385.
</P></Blockquote>

<Blockquote>
<P>This text file was originally written by Jeremy Peirce (August 21, 2003). Updated August 22, 2003 by RW/JP/LL. Updated October 19, 2004 by RW. Updated extensively July 26, 2005 by RW.
</P></Blockquote>



<Blockquote><P>
<P></P>

</TD>
</TR></TABLE>
</TD>
</TR>
<TR>
<TD align=center bgColor=#ddddff class="solidBorder">



<!--Start of footer-->
<TABLE width="90%">
<script language='JavaScript' src='/javascript/footer.js'></script>
</TABLE>
<!--End of footer-->



</TD>
</TR>
</TABLE>
<!-- /Footer -->
<!-- menu script itself. you should not modify this file -->
<script language="JavaScript" src="/javascript/menu_new.js"></script>
<!-- items structure. menu hierarchy and links are stored there -->
<script language="JavaScript" src="/javascript/menu_items.js"></script>
<!-- files with geometry and styles structures -->

<script language="JavaScript" src="/javascript/menu_tpl.js"></script>
<script language="JavaScript">
<!--//
// Note where menu initialization block is located in HTML document.
// Don't try to position menu locating menu initialization block in
// some table cell or other HTML element. Always put it before </body>
// each menu gets two parameters (see demo files)
// 1. items structure
// 2. geometry structure
new menu (MENU_ITEMS, MENU_POS);
// make sure files containing definitions for these variables are linked to the document
// if you got some javascript error like "MENU_POS is not defined", then you've made syntax
// error in menu_tpl.js file or that file isn't linked properly.

// also take a look at stylesheets loaded in header in order to set styles
//-->
</script>
</BODY>
</HTML>