path: root/gnqa/paper1_eval/src/data/datasets/human/dataset_citizenscientist_general_5.json
blob: 1fe6931bb1297b5a972253d5051ff561adf54f35
{
  "question": [
    "Create a how-to guide for genetic sequencing",
    "Create a guide for genetic sequencing",
    "Define dyslipidemia.",
    "What is cytochrome?",
    "How does one tell the difference between X and Y DNA, with respect  to DNA tracing and determining QTLs?"
  ],
  "answer": [
    "1. Identify the gene or genome of interest for sequencing.\n2. Choose the appropriate sequencing technology, such as next-generation sequencing.\n3. Prepare the sample by extracting DNA and creating a library for sequencing.\n4. Run the sequencing process using the chosen platform.\n5. Analyze the sequencing data using bioinformatics tools. This may include aligning the sequence reads to a reference genome, identifying variants, and interpreting the results.\n6. Validate the results using additional experiments or databases.\n7. Document and share the findings in a suitable format, such as a scientific paper or a database entry.\n8. For further analysis or research, use online resources like the '1000 genomes' project, the Mouse Genome Database, or the UCSC genome browser tutorial.\n9. Stay updated with best practice guidelines for the use of next-generation sequencing applications in genome diagnostics.",
    "1. Initiate a project for sequencing, like the '1000 genomes' project or the rainbow trout project, involving collaborators from various institutions.\n2. Choose a sequencing platform, such as Illumina, and prepare the sample for sequencing.\n3. Use bioinformatics tools to analyze the sequence data. Tools like Seqnature for RNA-seq or Acembly for predicting gene structures can be used.\n4. Construct an individualized pseudogenome for the sample using prior knowledge of variant sites in the sample's genome.\n5. Use online resources like NCBI Map Viewer for graphical depictions of genetic and physical maps and to locate genes, markers, and SNPs on the assembled sequences.\n6. Use websites offering annotation of the draft genome for various analyses such as gene predictions and similarity searches.\n7. Monitor the progress of sequencing online and aim for a resolution of selective constraint down to a segment length of eight nucleotides.",
    "Dyslipidemia is the term for blood fat disorders, which include high triglycerides, low HDL cholesterol, and high LDL cholesterol. These conditions can foster plaque buildups in artery walls.",
    "Cytochrome is a type of protein that contains heme groups and is responsible for the transport of electrons. They are found in aerobic cells and play a crucial role in the respiratory chain, aiding in the process of oxidative phosphorylation. There are different types of cytochromes, including cytochromes a, b, and c, each undergoing oxidation-reduction changes in a determined sequence.",
    "The difference between X and Y DNA in terms of DNA tracing and determining QTLs is not explicitly discussed in the background text. However, it is mentioned that the mouse genome consists of 19 pairs of autosomes and the X and Y chromosomes. In general, the X and Y chromosomes are different in size, gene content, and inheritance patterns, which could potentially influence the process of DNA tracing and determining QTLs. However, specific methods or implications related to these differences are not detailed in the provided text."
  ],
  "contexts": [
    [
      "Biesecker, L., Mullikin, J., Facio, F., Turner,C., Cherukuri, P., Blakesley, R., Bouffard, G.,Chines, P., Cruz, P., Hansen, N., Teer, J.,Maskeri, B., Young, A., Manolio, T., Wilson,A., Finkel, T., Hwang, P., Arai, A., Remaley,A., Sachdev, V., Shamburek, R., Cannon, R.,and Green, E. (2009) The ClinSeq Project:Piloting large-scale genome sequencing forresearch in genomic medicine. Genome Res. 19, 16651674. 32. Wang, Z., Gerstein, M., and Snyder, M. (2009)RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 10, 5763. 33. Mortazavi, A., Williams, B., McCue, K.,Schaeffer, L., and Wold, B.",
      "Extensive documentation canbe downloaded in either pdf or Hypertext formats. The Tutorial is especially helpful, but readers should be aware that its files are somewhat inconspicuously tuckedin with Sample Data files, rather than being included in the Map Manager QTXmanual. For the current example, genotype data were downloaded from the Mouse GenomeDatabase (2001) (http://www.informatics.jax.org/). Specifically, it consists of mousechromosome 1 genotypes from the CopelandJenkins backcross, and a selected subset of 10 markers spanning the entire 100 cM length of the chromosome.The full listof organisms, target sequence coverage and progress in sequencing can be monitored online (http://www.genome.gov/10002154). Based on the equations of Eddy( 2005) and simulations of Margulies et al. (2005), these genome sequences shouldprovide resolution of selective constraint down to a segment length of eight nucleotides, approaching the same scale as individual transcription factor-bindingsites.",
      "Resequencing of genomic regions of interest will also be necessary (see Electronic-DatabaseInformation for current examples).",
      "Best practice guidelines for the use of next-generation sequencing applications in genome diagnostics: A national collaborativestudy of dutch genome diagnostic laboratories. Human Mutation,34(10):13131321, Aug 2013. [363] D. Welter, J. MacArthur, J. Morales, T. Burdett, P. Hall,H. Junkins, A. Klemm, P. Flicek, T. Manolio, L. Hindor, and290BIBLIOGRAPHYet al. The nhgri gwas catalog, a curated resource of snp-trait associations. Nucleic Acids Research, 42(D1):D1001D1006, Dec2013. [364] Harm-Jan Westra, Marjolein J Peters, Tonu Esko, HaniehYaghootkar, Claudia Schurmann, Johannes Kettunen, Mark WChristiansen, Benjamin P Fairfax, Katharina Schramm, Joseph EPowell, and et al.",
      "Zweig, A. S., Karolchik, D., Kuhn, R. M., Haussler, D., and Kent,W. J. (2008). UCSC genome browser tutorial. Genomics 92, 75 84. Vol. 9, Summer 2010107Downloaded from http://www.lifescied.org/ by guest on May 12, 2015",
      "Useful Online Genomics Resources.",
      "Bioinformatics tools for pathogen whole-genome sequencing",
      "The '1000 genomes' project and related individual sequencing projects",
      "Sequencing a draft reference genomefor rainbow trout has been recently initiated; aconsortium involves collaborators from USDANational Cold and Cool Water Aquaculture(USDA-NCCCWA), Washington State University, University of Oregon, University ofCalifornia Davis, and West Virginia University (Palti et al. , 2011). The project will producea genome map and a draft reference genomesequence using the Illumina platform by sequencing the bacterial articial chromosomes(BACs) from the physical map minimal tilingpath (Palti, 2010; Miller et al. , 2011).",
      "Extensive documentation canbe downloaded in either pdf or Hypertext formats. The Tutorial is especially helpful, but readers should be aware that its files are somewhat inconspicuously tuckedin with Sample Data files, rather than being included in the Map Manager QTXmanual. For the current example, genotype data were downloaded from the Mouse GenomeDatabase (2001) (http://www.informatics.jax.org/). Specifically, it consists of mousechromosome 1 genotypes from the CopelandJenkins backcross, and a selected subset of 10 markers spanning the entire 100 cM length of the chromosome.The full listof organisms, target sequence coverage and progress in sequencing can be monitored online (http://www.genome.gov/10002154). Based on the equations of Eddy( 2005) and simulations of Margulies et al. (2005), these genome sequences shouldprovide resolution of selective constraint down to a segment length of eight nucleotides, approaching the same scale as individual transcription factor-bindingsites.",
      "The large number of bioinformatic tools that have beenmade available to scientists during the last few years has presented theproblem of which to use and how best to obtain scientifically valid answers(3). In this chapter, we will provide a guide for the most efficient way toanalyze a given sequence or to collect information regarding a gene, protein,structure, or interaction of interest by applying current publicly available software and databases that mainly use the World Wide Web.Hum Hered 63:678432122Gene/Protein Sequence AnalysisA Compilation of Bioinformatic ToolsBernd H. A. Rehm and Frank Reinecke1. IntroductionThe advent of automated high throughput DNA sequencing methods hasstrongly enabled genome sequencing strategies, culminating in determination ofthe entire human genome (1,2). An enormous amount of DNA sequence dataare available and databases still grow exponentially (see Fig. 22.1). Analysisof this overwhelming amount of data, including hundreds of genomes fromboth prokaryotes and eukaryotes, has given rise to the field of bioinformatics.",
      "2014) (https://github.com/jaxcs/Seqnature) developed in the Churchill group at the Jackson Laboratory is similar, andis tailored to RNA-seq in the DO. Author ManuscriptConstruction of an individualized pseudogenome for a sample requires prior knowledge ofvariant sites in that samples genome. In, for instance, an F1 cross between strains for whichwhole-genome sequencing data are available, imputing the pseudogenome is trivial. Genomes of recombinant individuals (e.g. , CC or DO) can be expressed as mosaics offounder haplotypes on the basis of genotyping (discussed previously), and a pseudogenomestitched together accordingly.",
      "Geschwind and KonopkaPage 9Box 3NIH-PA Author ManuscriptThe challenges of next-generation sequencingNext-generation sequencing will be revolutionary in the amount and content of datagenerated, but there are many obstacles to surmount. Extensive comparisons ofsequencing data have not been published demonstrating whether there are batch effects indata due to sample preparation, library generation, flow cell preparation or machine run. Few studies have compared the commercial platforms for either gene expression or generegulation81,82. Data storage and analysis are currently a much larger challenge than datageneration.",
      "In the following section, we provide an overview of the finished genome sequencing projects and report them in chronological order of their publication.",
      "Extensive documentation canbe downloaded in either pdf or Hypertext formats. The Tutorial is especially helpful, but readers should be aware that its files are somewhat inconspicuously tuckedin with Sample Data files, rather than being included in the Map Manager QTXmanual. For the current example, genotype data were downloaded from the Mouse GenomeDatabase (2001) (http://www.informatics.jax.org/). Specifically, it consists of mousechromosome 1 genotypes from the CopelandJenkins backcross, and a selected subset of 10 markers spanning the entire 100 cM length of the chromosome.The full listof organisms, target sequence coverage and progress in sequencing can be monitored online (http://www.genome.gov/10002154). Based on the equations of Eddy( 2005) and simulations of Margulies et al. (2005), these genome sequences shouldprovide resolution of selective constraint down to a segment length of eight nucleotides, approaching the same scale as individual transcription factor-bindingsites.",
      "Extensive documentation canbe downloaded in either pdf or Hypertext formats. The Tutorial is especially helpful, but readers should be aware that its files are somewhat inconspicuously tuckedin with Sample Data files, rather than being included in the Map Manager QTXmanual. For the current example, genotype data were downloaded from the Mouse GenomeDatabase (2001) (http://www.informatics.jax.org/). Specifically, it consists of mousechromosome 1 genotypes from the CopelandJenkins backcross, and a selected subset of 10 markers spanning the entire 100 cM length of the chromosome."
    ],
    [
      "gov/mapview/) evolved to allow graphical depictions of, and comparisons between,a wide range of genetic and physical maps in parallel with NCBI draft and finished sequence contigs. The locations of genes, markers, and SNPs are indicatedon the assembled sequences. As with Ensembl, there is a NCBI analysis protocolwhich aims to predict gene structures based upon EST and mRNA alignments withthe draft genome. This is carried out by a program called Acembly (unpublished;http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/index.html), which aims toderive gene structure from these alignments alone.There are now three well-designed websites (Table 4.1) offering users the chance tobrowse annotation of the draft human genome. All three sites offer a graphical interface to display the results of various analyses, such as gene predictions and similaritysearches, for draft and finished genomic sequence. These interfaces are indispensable for rapid, intuitive comparisons between the features predicted by differentprograms. For instance, one can see at once where an exon prediction overlaps withinterspersed repeats or a SNP.",
      "Resequencing of genomic regions of interest will also be necessary (see Electronic-DatabaseInformation for current examples).",
      "Useful Online Genomics Resources.",
      "Bioinformatics tools for pathogen whole-genome sequencing",
      "The '1000 genomes' project and related individual sequencing projects",
      "Sequencing a draft reference genomefor rainbow trout has been recently initiated; aconsortium involves collaborators from USDANational Cold and Cool Water Aquaculture(USDA-NCCCWA), Washington State University, University of Oregon, University ofCalifornia Davis, and West Virginia University (Palti et al. , 2011). The project will producea genome map and a draft reference genomesequence using the Illumina platform by sequencing the bacterial articial chromosomes(BACs) from the physical map minimal tilingpath (Palti, 2010; Miller et al. , 2011).",
      "gov/mapview/) evolved to allow graphical depictions of, and comparisons between,a wide range of genetic and physical maps in parallel with NCBI draft and finished sequence contigs. The locations of genes, markers, and SNPs are indicatedon the assembled sequences. As with Ensembl, there is a NCBI analysis protocolwhich aims to predict gene structures based upon EST and mRNA alignments withthe draft genome. This is carried out by a program called Acembly (unpublished;http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/index.html), which aims toderive gene structure from these alignments alone.There are now three well-designed websites (Table 4.1) offering users the chance tobrowse annotation of the draft human genome. All three sites offer a graphical interface to display the results of various analyses, such as gene predictions and similaritysearches, for draft and finished genomic sequence. These interfaces are indispensable for rapid, intuitive comparisons between the features predicted by differentprograms. For instance, one can see at once where an exon prediction overlaps withinterspersed repeats or a SNP.The full listof organisms, target sequence coverage and progress in sequencing can be monitored online (http://www.genome.gov/10002154). Based on the equations of Eddy( 2005) and simulations of Margulies et al. (2005), these genome sequences shouldprovide resolution of selective constraint down to a segment length of eight nucleotides, approaching the same scale as individual transcription factor-bindingsites.",
      "The large number of bioinformatic tools that have beenmade available to scientists during the last few years has presented theproblem of which to use and how best to obtain scientifically valid answers(3). In this chapter, we will provide a guide for the most efficient way toanalyze a given sequence or to collect information regarding a gene, protein,structure, or interaction of interest by applying current publicly available software and databases that mainly use the World Wide Web.",
      "2014) (https://github.com/jaxcs/Seqnature) developed in the Churchill group at the Jackson Laboratory is similar, andis tailored to RNA-seq in the DO. Author ManuscriptConstruction of an individualized pseudogenome for a sample requires prior knowledge ofvariant sites in that samples genome. In, for instance, an F1 cross between strains for whichwhole-genome sequencing data are available, imputing the pseudogenome is trivial. Genomes of recombinant individuals (e.g. , CC or DO) can be expressed as mosaics offounder haplotypes on the basis of genotyping (discussed previously), and a pseudogenomestitched together accordingly.",
      "In the following section, we provide an overview of the finished genome sequencing projects and report them in chronological order of their publication.",
      "gov/mapview/) evolved to allow graphical depictions of, and comparisons between,a wide range of genetic and physical maps in parallel with NCBI draft and finished sequence contigs. The locations of genes, markers, and SNPs are indicatedon the assembled sequences. As with Ensembl, there is a NCBI analysis protocolwhich aims to predict gene structures based upon EST and mRNA alignments withthe draft genome. This is carried out by a program called Acembly (unpublished;http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/index.html), which aims toderive gene structure from these alignments alone.There are now three well-designed websites (Table 4.1) offering users the chance tobrowse annotation of the draft human genome. All three sites offer a graphical interface to display the results of various analyses, such as gene predictions and similaritysearches, for draft and finished genomic sequence. These interfaces are indispensable for rapid, intuitive comparisons between the features predicted by differentprograms. For instance, one can see at once where an exon prediction overlaps withinterspersed repeats or a SNP.The full listof organisms, target sequence coverage and progress in sequencing can be monitored online (http://www.genome.gov/10002154). Based on the equations of Eddy( 2005) and simulations of Margulies et al. (2005), these genome sequences shouldprovide resolution of selective constraint down to a segment length of eight nucleotides, approaching the same scale as individual transcription factor-bindingsites.",
      "gov/mapview/) evolved to allow graphical depictions of, and comparisons between,a wide range of genetic and physical maps in parallel with NCBI draft and finished sequence contigs. The locations of genes, markers, and SNPs are indicatedon the assembled sequences. As with Ensembl, there is a NCBI analysis protocolwhich aims to predict gene structures based upon EST and mRNA alignments withthe draft genome. This is carried out by a program called Acembly (unpublished;http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/index.html), which aims toderive gene structure from these alignments alone.There are now three well-designed websites (Table 4.1) offering users the chance tobrowse annotation of the draft human genome. All three sites offer a graphical interface to display the results of various analyses, such as gene predictions and similaritysearches, for draft and finished genomic sequence. These interfaces are indispensable for rapid, intuitive comparisons between the features predicted by differentprograms. For instance, one can see at once where an exon prediction overlaps withinterspersed repeats or a SNP.The full listof organisms, target sequence coverage and progress in sequencing can be monitored online (http://www.genome.gov/10002154). Based on the equations of Eddy( 2005) and simulations of Margulies et al. (2005), these genome sequences shouldprovide resolution of selective constraint down to a segment length of eight nucleotides, approaching the same scale as individual transcription factor-bindingsites.",
      "LETTER RESEARCHthe sequence information available from the assembled scaftigs), resulting in the final reference gene catalogue used in this study."
    ],
    [
      "The dyslipidaemia in patients with Type 2 diabetes is characterised by hypertriglyceridaemia, low High Density Lipoprotein (HDL) and relatively normal Low Density Lipoprotein (LDL) and total cholesterol (Owen, et al., 2002).The lipid profiles of HNF1A MODY patients are similar to non-diabetic controls, but have higher HDL cholesterol and lower triglyceride levels compared to Type 2 diabetes patients (McDonald, et al., 2012;Owen, et al.,",
      "Dyslipidaemia Atherogenic dyslipidemia is the overall term for blood fatdisordershigh triglycerides, low HDL cholesterol and high LDL cholesterolthat foster plaque buildups in artery walls. Ecogenetics The interaction of genetics with the environment. Eicosanoid Any of a family of naturally occurring substances derived from20-carbon polyunsaturated fatty acids; they include prostaglandins,thromboxanes, leukotrienes and epoxyeicosatrienoic acids, and function ashormones. Endophenotype Measurable biological characteristics thought to lie along thepathway from gene to disorder, but that are closer to the gene and are thoughtto have a simpler relationship with a given gene.",
      "M A N U S C R I P T A C C E P T E D ACCEPTED MANUSCRIPTAbnormal lipid profiles have been known to be associated with the metabolic syndrome and AD for over a decade (Kuo et al., 1998;Roher et al., 1999;Merched et al., 2000).Plasma lipidomics allows the detection of sphingolipids and glycerophospholipids such as Cer, PI and PE that are present in very small amounts in the plasma.Changes in lipids disturb plasma membrane asymmetry (Axelsen et al., 2011), and this is likely to disturb peripheral liver A endosomal metabolism that is essential for mediating the clearance of A via ApoE-or another apolipoprotein-mediated pathway.",
      "Serum lipids are important determinants of cardiovascular diseases and are related to morbidity [187] .The high heritability of circulating lipid levels is well established, and earlier studies of individuals with extreme lipid values or families with Mendelian forms of dyslipidemias have reported the involvement of numerous genes and respective proteins in lipid metabolism [188] .Recent GWA studies mostly carried out in samples enriched for T2DM cases have implicated a total of 19 loci controlling serum high density lipoprotein (HDL) cholesterol, low density lipoprotein (LDL) cholesterol and triglycerides (TG).The loci include the genes encoding ABCA1 , APOB , CELSR2 , CETP , DOCK7 , GALNT2 , GCKR , HMGCR , LDLR , LIPC , LIPG , LPL , MLXIPL , shown according to the number of at -risk alleles in each class (for a total of 15 single nucleotide polymorphisms genotyped in each individual, from the best replicated variants following the results of genome -wide association studies).The study was performed in 4232 patients with diabetes and 4595 normoglycemic adult subjects.Adapted from Cauchi et al. [193] .",
      "Familial HypercholesterolemiaFamilial hypercholesterolemia is an inherited condition in which patients have extremely high blood levels of low-density lipoprotein (LDL) cholesterol, which results in abnormal deposition of cholesterol in various parts of the body and a dramatically increased risk of cardiovascular disease, which often manifests at an early age.Several genes have been implicated in this disorder.Mutations in LDLR, which encodes the LDL receptor, can affect the synthesis, structure, and function of the LDL receptor in a variety of ways, 6 resulting in the impaired ability of cells to remove cholesterol-carrying LDL particles from the bloodstream and thus the accumulation of LDL cholesterol in the blood.Although familial hypercholesterolemia is often regarded as an autosomal dominant disorder, LDLR mutations have an additive (codominant) effect such that patients who have 2 LDLR mutations have higher blood LDL cholesterol levels and experience earlier cardiovascular disease (as early as childhood) compared with patients with 1 LDLR mutation.Mutations in the APOB gene, which encodes the apolipoprotein B protein, which is a core protein of LDL particles and facilitates their removal from the bloodstream, can mimic the effects of LDLR mutations and result in familial hypercholesterolemia. 7 Finally, mutations in 2 other genes that encode proteins that affect the function of the LDL receptor, PCSK9 and LDLRAP1, can also result in familial hypercholesterolemia. 8,9 Unlike the other 3 genes, LDLRAP1 mutations are recessive and thus are required to affect both copies of the gene for patients to manifest disease.",
      "Dyslipidemia was defined according to ATP III.High TC was defined as serum level of TC equal to or greater than 6.21 mmol/L (!240 mg/dL).Low HDL-C was defined as serum level of HDL-C less than 1.03 mmol/L (<40 mg/dL).High LDL-C was defined as serum level of LDL-C equal to or greater than 4.16 mmol/L (!160 mg/dL).High TG was defined as serum level of triglyceride equal to or greater than 2.26 mmol/L (!200 mg/dL).The ratio of TC/HDL-C equal to or greater than five was defined as abnormal.All the participants were free of lipid-lowering medicines.Studies reported that dyslipidemia was associated with hypertension [27e30].However, in our current study, this association was not observed in the Yi people by multivariate logistic regression analysis.Significant association between ever alcohol drinking, dyslipidemia, low HDL-C and high ratio of TC/HDL-C was observed (OR<1).Our findings showed an association between increasing BMI and increasing prevalence of dyslipidemia.The association between ever smoking and dyslipidemia was not observed in the current research by multivariate logistic regression analysis.Compared with females aged below 45 years, females at menopausal stage had higher risk for dyslipidemia, adjusted for family income, educational level, physical activity, ever smoking, ever alcohol drinking, BMI, and history of hypertension and T2DM.In exploring associated factors for prevalence of dyslipidemia, we used multivariate logistic regression analysis with high TC, TG and LDL-C, low HDL-C and abnormal ratio of TC/HDL-C as dependent variables, respectively.Independent variables included age, gender, family income, educational level, physical activity, ever smoking, ever alcohol drinking, BMI, and history of hypertension and T2DM, each of them had a p-value less than 0.05.No factor associated with high LDL-C was observed.Overweight or obesity, and history of T2DM were independent factors related to dyslipidemia.Male Yi people had significantly higher 
risk for dyslipidemia, as compared with that in females.Ever alcohol drinking was associated with dyslipidemia, i.e., low HDL-C and higher ratio of TC/ HDL-C.Compared with light physical activity, people engaged in moderate or heavy labor had lower risk for high TG Fig. 1.Prevalence of dyslipidemia by time before and after emigration in the Yi migrants.There are some limitations in this study that require consideration in interpretation of our findings.One is limited sample size.Another one is that no dietary survey was conducted in the current study, so no dietary factors associated with dyslipidemia could be estimated.",
      "Perturbation of lipid metabolismIt is known that the characteristic features of dyslipidemia in humans with type 2 diabetes are high plasma triglyceride concentration, low HDL cholesterol concentrations and increased concentration of small dense LDL-cholesterol particles, while total cholesterol is not increased in patients with diabetes.These lipid changes in these individuals may be due to an increased free fatty acid flux secondary to insulin resistance [49].However, the perturbations observed here in lipid metabolism reflect the state of already treated diabetes.Association studies with blood lipid parameters [Adamski et al., unpublished data, [50]] show that many PC species associate with HDL and total cholesterol levels while PE species associate with triglyceride levels.In this study, we observed lower phosphatidylcholine (PC) and higher phosphatidylethanolamine (PE) concentration in the diabetes group matches the lower HDL and total cholesterol levels and higher triglyceride levels in this group, indicating that these glycerophospholipids may provide a more differentiated view of the shifted lipid homeostasis in patients with diabetes as what can be obtained from the bulk blood cholesterol and triglyceride parameters alone.Consistent with this finding, Gall et al. [34] observed reduced levels of multiple acylglycerophosphocholine species that were highly correlated with insulin resistance as measured by the euglycemic clamp.",
      "INTRODUCTIONCardiovascular disease (CVD) is the leading cause of morbidity and mortality worldwide (He et al., 2005;Lozano et al., 2013;Murray et al., 2013).Suboptimal lipid levels contribute to the atherosclerotic process, with clinical trials and observational studies demonstrating a strong relation between blood lipid concentrations and CVD (Hokanson and Austin, 1996;LaRosa et al., 1999;Di Angelantonio et al., 2009;Huxley et al., 2011).The heritabilities of low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglyceride concentrations have long been established (Friedlander et al., 1997;Malhotra and Wolford, 2005;Luo et al., 2010;Zhang et al., 2010).More recently, genome-wide association studies (GWASs) have made important strides in identifying single nucleotide polymorphisms (SNPs) that contribute to the inter-individual variability in these complex phenotypes (Saxena et al., 2007;Kathiresan et al., 2008;Kooner et al., 2008;Wallace et al., 2008;Willer et al., 2008;Aulchenko et al., 2009;Kathiresan et al., 2009;Teslovich et al., 2010;Waterworth et al., 2010;Kim et al., 2011;Tan et al., 2012).Despite such progress, up to 75% of the variance in lipid levels due to genetic factors remains unexplained (Teslovich et al., 2010).Further research is needed to identify novel variants, genes, and biological pathways with important influences on lipid phenotypes.",
      "AACE, Association of Clinical Endocrinologists; ADA, American Diabetes Association; HDL, high-density lipoprotein; IDF, International Diabetes Federation; LDL, low-density lipoprotein; NR, no recommendation; T2DM, type 2 diabetes mellitus; WDF, World Diabetes Foundation. *Individualized goals. High-risk or established cardiovascular disease.",
      "Although hyperlipidemia is traditionally considered a risk factor for type 2 diabetes (T2D), evidence has emerged from statin trials and candidate gene investigations suggesting that lower LDL cholesterol (LDL-C) increases T2D risk.We thus sought to more comprehensively examine the phenotypic and genotypic relationships of LDL-C with T2D.Using data from the UK Biobank, we found that levels of circulating LDL-C were negatively associated with T2D prevalence (odds ratio 0.41 [95% CI 0.39, 0.43] per mmol/L unit of LDL-C), despite positive associations of circulating LDL-C with HbA 1c and BMI.We then performed the first genome-wide exploration of variants simultaneously associated with lower circulating LDL-C and increased T2D risk, using data on LDL-C from the UK Biobank (n 5 431,167) and the Global Lipids Genetics Consortium (n 5 188,577), and data on T2D from the Diabetes Genetics Replication and Meta-Analysis consortium (n 5 898,130).We identified 31 loci associated with lower circulating LDL-C and increased T2D, capturing several potential mechanisms.Seven of these loci have previously been identified for this dual phenotype, and nine have previously been implicated in nonalcoholic fatty liver disease.These findings extend our current understanding of the higher T2D risk among individuals with low circulating LDL-C and of the underlying mechanisms, including those responsible for the diabetogenic effect of LDL-C-lowering medications.Lipid-lowering medications, in particular from the statin drug class, are effective at lowering levels of circulating LDL-C and rates of adverse cardiovascular events (4) but convey an increased T2D risk (odds ratio [OR] 1.09) (5,6) in a dose-dependent manner (7).This increased risk, however, is outweighed at a population level by the cardiovascular event rate reduction.An increased T2D risk has also been reported in observational studies.Individuals with low levels of circulating LDL-C (e.g., ,60 mg/dL) exhibit a higher risk of 
prevalent and incident T2D (8,9), and among individuals with coronary disease, LDL-C and T2D are inversely related (10).In addition, individuals with familial hypercholesterolemia exhibit a decreased risk of T2D as well as lower BMI and triglyceride (TG) levels (11).",
      "HypercholesterolemiaHyperlipidemia in the form of elevated cholesterol is among the most common medical disorder seen in individuals above the age of 40 years.Statins are the most widely obvserved drug class directed at lowering serum cholesterol.Statin drugs act by inhibiting HGM-CoA reductase activity and are among the most effective cholesterol-lowering agents available [156][157][158].However, there is a degree of variability in the response to statins among patients.Genetic variation at the APOE locus has been associated with plasma lipoprotein concentrations in both fasting and postprandial states [159].In this regard, APOE E2 carriers have been reported to be more responsive to lipid-lowering therapies [160].Differential response to statin medications has also been reported in relation to the patient's genotype status of the cholesteryl ester transfer gene [161], -fibrinogen gene [162] and lipoprotein lipase gene [163].Likewise, the ApoE protein has been associated with late-onset and sporadic Alzheimer's disease (AD) [164,165].However, in a study comparing influence of APOE genotype with clinical response to tacrine (acetylcholinesterase inhibitor), approximately two-thirds of the APOE E4 carriers presented ADAS scores that were worse compared with baseline levels [165].Whether ApoE will become clinically relevant in predicting response to AD therapy in the treatment of AD remains to be seen.",
      "The Genetics of Hypercholesterolemia and Related Lipid PhenotypesHypercholesterolemia, Lipid Levels, and Their Familial Nature Population-based, long-term prospective studies and large clinical trials of the late 20th century incontrovertibly demonstrated that elevated LDL cholesterol (LDL-C) and reduced high-density lipoprotein cholesterol (HDL-C) were CVD risk factors. 88Clinical trials have demonstrated that lowering LDL-C and raising HDL-C can ameliorate risk. 89As this knowledge has been incorporated into clinical practice, mean serum total cholesterol concentrations have dropped in the United States in recent decades; however, 50% of US adults still have total cholesterol concentrations of at least 5.2 mmol/L (200 mg/dL), 90 the level that the National Cholesterol Education Program Expert Panel considers \"borderline-high risk.\" 91ecause at least half of the variation in serum cholesterol and other lipids can be explained by genetic variation, 91,92 unraveling the genetic pathogenesis of hypercholesterolemia and other lipid abnormalities could reap significant public health benefits.For example, identifying the common variants in genes that contribute to LDL-C and HDL-C could provide a knowledge base for the development of novel treatments and/or screening tests to determine who would most benefit from lifestyle modification or treatment for dyslipidemias.Important strides to this end have, in fact, already been made.",
      "LipidsPopulation-based, long-term prospective studies and large clinical trials of the late 20th century incontrovertibly demonstrated that elevated LDL-C and reduced high-density lipoprotein cholesterol are CVD risk factors. 111Because at least half of the variation in serum cholesterol and other lipids can be explained by genetic variation, 112,113 unraveling the genetic pathogenesis of hypercholesterolemia and other lipid abnormalities could reap significant public health benefits by providing a knowledge base for the development of novel treatments or screening tests to determine who would most benefit from lifestyle modification or treatment for dyslipidemias.Important strides to this end have, in fact, already been made.",
      "Obesityn=7255 in 2 population studies; genome metabolome integrated network analysis; serum Valcrcel et al 39 Fatty acids and lipoprotein subclasses n=1269 individual twins, including 561 complete pairs; genetic and environmental cause of the associations of serum fatty acids with lipoprotein profile; serum Jelenkovic et al 41 Biomarkers and risk assessment Subclinical atherosclerosis n=1595 young adults; circulating biomarkers for 6-year high carotid intima media thickness, new systemic biomarkers with improved risk stratification for subclinical atherosclerosis in comparison with conventional lipids; serum Wrtz et al 19 Type 1 diabetes mellitus and kidney disease Up to n=3544 patients with type 1 diabetes mellitus; cross-sectional and prospective associations of various systemic metabolites and lipoprotein subclass measures with the severity of diabetic kidney disease and mortality; introducing multiparametric risk assessment of diabetic nephropathy; serum Mkinen et al 36 Mkinen et al 26 All-cause mortality n=17 345 from 2 general population cohorts; 4 circulating biomarkers for 5-year risk of death; biomarker associations with multiple causes of death suggest novel systemic connectivities across seemingly disparate morbidities; improved prediction of the short-term risk of death from all causes above established risk factors; serum and plasma Fischer et al 32",
      "IntroductionCoronary artery disease (CAD) is the leading cause of morbidity and mortality worldwide.Although our understanding of cardiovascular disease is improving, the underlying mechanisms and the comprehensive and detailed pathogenesis of CAD remain unclear.Disequilibrium of lipid metabolism is a causative factor of predisposition to CAD (Weber and Noels 2011).As a key structural component of functioning lipoproteins such as chylomicrons, very low-density lipoprotein (VLDL), intermediate-density lipoprotein (IDL), and low-density lipoprotein (LDL), apolipoprotein B (APOB) is a crucial protein involved in the metabolism and maintenance of serum cholesterol homeostasis.These functioning lipoproteins participate in the process of transporting cholesterol and triglycerides throughout the circulation.Additionally, APOB binds to the LDL receptor, which mediates LDL degradation.Furthermore, a recent review indicated that subendothelial retention of APOB-containing lipoproteins was responsible for the initiation of atherogenesis (Benn 2009).Therefore, APOB plays an important role in the development of CAD."
    ],
    [
      "Recent genetic analysis suggests that free radical production can be increased by decoupling electron transport of ubiquinone to O 2 .A missense mutation in C. elegans cytochrome b560 causes a decrease in life span and oxygen hypersensitivity (as well as radiation hypersensitivity, a reasonable pleiotrophy because radiation induces free radicals, which damage DNA).Paradoxically, this is the opposite phenotype from the clk-1 defect in coenzyme Q biosynthesis.A model that explains this is that the cytochrome missense mutation causes a toxic build up of ubisemiquinone (a free radical that can generate superoxide) because the normal pathway for further reduction of singly reduced coenzyme Q is compromised (50).",
      "Keywords: ethanol, Coenzyme Q, oxidative stress, hippocampus, mouse models, genetics, genomicsINTRODUCTIONCoenzyme Q (CoQ or ubiquinol) is a lipophilic molecule present in every cell membrane in thebody (Crane, 2001; Turunen et al. , 2004). It is best known for its roles as a mitochondrial electrontransporter and a potent membrane anti-oxidant (Ernster and Dallner, 1995; Bentinger et al. , 2007). CoQ is made up of a benzoquinone ring with an isoprenoid side chain (containing 610 units)conserved across species from yeast (as CoQ6 ), to mice (as CoQ7 ), to humans (as CoQ10 ) (Lenaz,1985).",
      "The MT-CYB, a polypeptide with approximately 400 amino acid residues is one of the integral subunit of complex III of electron transport chain.Previously, mutations in MT-CYB gene have been described in uterine tumors (Shaik et al. 2011), cardiomyopathy (Feigenbaum et al. 2006), exercise intolerance (Massie et al. 2010) and histiocytoid cardiomyopathy (andreu et al. 2000).The MT-CYB T15062C, C15238a, T15378G and C15491G variants identified in the present study were specific to right atrial appendage tissues.These were previously not found to be reported in any of the human diseases in Mitomap database.",
      "Mitochondria are indispensable organelles as they are responsible for the production of the majority of ATP in the cell.Most cellular ATP is generated by oxidative phosphorylation (OxPhos), a process through which electrons are extracted from reducing equivalents and transferred through four different respiratory complexes (RCs) present in the mitochondria inner membrane (CI-CIV).Electron transfer is coupled with the generation of a proton gradient through the mitochondrial membrane that drives the phosphorylation of ADP to ATP by the ATP-synthase complex (also known as complex V).",
      "Second, the protein product of the CYP24A1 (cytochrome P450, family 24, subfamilyA, polypeptide 1; HGNC:2602) gene is responsible for degradation of vitamin D intoa physiologically inactive form. Vitamin D was shown to be essential for propermuscle functioning (Endo et al. , 2003; Pfeifer et al. , 2002) and polymorphisms in thevitamin D receptor are associated in humans with changed muscle strength in bothgenders (Windelinckx et al. , 2007); these changes are likely to influence the levels ofphysical activity.",
      "The product of the UQCR gene is the ubiquinol-cytochrome c reductase complex, also called mitochondrial complex III.It functions to form a part of the mitochondrial respiratory chain.It may also act as a binding factor for the iron-sulfur protein.Mitochondrial Complex III is composed of one mitochondrial-encoded subunit (MT-CYB) and ten nuclear-encoded subunits.The complex is located within the mitochondrial inner membrane and plays an important role in biochemical synthesis of ATP.It functions to catalyze electrons to transfer from succinate and nicotinamide adenine dinucleotide linked dehydrogenases to mitochondrially encoded cytochrome b.It also functions to utilize the energy to translocate protons across the membrane 27 .Deficiency of isolated complex III has been detected in patients of neuromuscular and nonneuromuscular disorders in both children and adults 28 .",
      "Figure 3. Mito-nuclear co-adaptation from functional assays. (a) Activities of oxidative phosphorylation (OXPHOS) enzymes in cultured cells with a Mus musculus (Mm) nucleus and mitochondria from other species (i.e.cytonuclear hybrid, or 'cybrid' cells).Species names, abbreviations and divergence times in millions of years (my) are as follows: Mus spretus (Ms) w2 my, Mus caroli (Mc) w3 my, Mus dunni (Md) w4 my, Mus pahari (Mp) w6 my, Rattus norvegicus (Rn) and Otomys irroratus (Oi) w12 my.OXPHOS enzyme activities were normalized to control activity (Mm, 100%Gs.d. ), and show normal levels with mitochondria from Ms, Mc and Md.The Mp cybrids showed normal complex I, II and III activity, but a deficiency in complex IV activity.The Rn cybrid showed deficiencies of complex I and III and a partial defect of complex IV, whereas the Oi cybrid showed a marked complex I and IV defect, and a severe complex III defect. (*P!0.05; **P!0.005; reproduced, with permission, from[49]. )(b) Cytochrome oxidase complex (COX) activities in the copepod Tigriopus californicus using mitochondria isolated from Santa Cruz (SC) or San Diego (SD) individuals and cytochrome c isolated from either SC (light-green bars) or SD (dark-green bars).Assays at 188C (bi) or 258C (bii). (Error barsZs.e. ; P!0.0001 for COX-by-CYTC, COX-by-Temperature, and CYTC-by Temperature interactions.Reproduced, with permission, from[54].)",
      "Mitochondria are indispensable organelles as they are responsiblefor the production of the majority of ATP in the cell. Most cellularATP is generated by oxidative phosphorylation (OxPhos), a processthrough which electrons are extracted from reducing equivalentsand transferred through four different respiratory complexes(RCs) present in the mitochondria inner membrane (CICIV). Electron transfer is coupled with the generation of a protongradient through the mitochondrial membrane that drives the phosphorylation of ADP to ATP by the ATP-synthase complex (also knownas complex V). A full list of affiliations appears at the end of the paper.",
      "One is a conversion interaction; its inputsare citratecytosol + CoAcytosol + ATPcytosol and its outputs are acetyl-CoAcytosol +oxaloacetic acidcytosol + ADPcytosol + P04cytosol . The second is a catalytic interaction; its input is ATP citrate lyasecytosol . In another example, to represent thetranslocation of citrate from the mitochondrion to the cytosol, two entities and asingle conversion interaction are used: citratemitochondrion goes to citratecytosol . Theformation or modication of a protein complex can be represented. For example,ACLA and ACLB are the subunits that compose the enzyme ACL.",
      "In 1925, Keilin (188) discovered cytochromes in aerobic cells.He concluded that there are three distinct pigments, which he called cytochromes a, b, and c, and that they underwent oxidation-reduction changes in a determined sequence, which bridges dehydrogenase discovered by Wieland (414) and oxygenase by Warburg (407), leading to the concept of the respiratory chain.NADH + Flavoprotein -+ Cytochromes b +c+a+a3+OzDuring 1940 -1950s extensive studies identified mitochondria as centers of energy metabolism.In 1950-1960s studies on isolated mitochondria had clarified gross structure and bioenergetics of the respiratory chain that produces most of bioenergy in a cell.Mitochondria carry out the tricarboxylic acid cycle and the P-oxidation pathway for fatty acids.These degradative sequences essentially remove hydrogen from metabolic fuels with the release of CO2 and transfer it through coenzymic carrier to the respiratory chain in the mitochondrial inner membrane.The chain passes the electrons sequentially through complex I (NADH dehydrogenase) or complex II (succinate dehydrogenase), coenzyme Q (CoQ), complex III (ubiquinol: cytochrome c oxidoreductase), cytochrome c, and complex IV (cytochrome oxidase) to oxygen to give water, as schematically illustrated in Figure 1.The released energy is used to pump protons out of the mitochondrial inner membrane, creating an electrochemical gradient.The energy stored in this gradient is the driving force for complex V (ATP synthetase), which is also associated with the inner membrane, to condense ADP and Pi to make ATP.From the above mechanism of oxygen reduction, a large quantity of ROS is expected to generate from the genetically defected active sites of cytochrome oxidase and/or cytochrome b, or with too much oxygen supply over enzymic capability to dispose ROS.For example, the cytochrome oxidase subunit II is assessed as the binding site of cytochrome c.Hence, lack of the subunit II, which is commonly detected in the 
patients with mitochondrial myopathy (379), with KSS (241), or with MERRF (218), inevitably results in the decreased binding capacity of cytochrome c to the oxidase, namely, the increased Michaelis constant (Km) of cytochrome c, leading to enhanced reduction of the respiratory chain at the cytochrome b region and complex I, from where ROS production will explode similar to the ischemia-reperfusion episode of heart. All the patients harboring severe point mutations in the cytochrome oxidase subunit genes or in the cytochrome b gene expressed the most severe clinical phenotype (280); for instance, a recipient of heart transplantation at age 7 (283) or a case of fatal infantile cardiomyopathy who died at age 1 (291). A greater magnitude of ROS is produced by isolated mitochondria exposed to hyperbaric oxygen (36) or in vivo reperfusion of ischemic heart (204,267). electron transfer carriers and coupling ATPase, with a direct conversion of promitochondria to respiratory functional organelles (398). Promitochondria look clearly identifiable as normal mitochondria, having a normal outer membrane and an inner membrane with poorly developed cristae. Hence, a correlation between mitochondrial morphology and human aging would be expected not in the gross structure, but in histochemical demonstration of mtDNA-encoded enzymes. The histochemical examination of biopsied hepatic cells (386) revealed no distinct difference with age in the activity of succinate dehydrogenase, which is encoded by a nuclear gene. In contrast, mtDNA-encoded cytochrome oxidase (complex IV) in 140 autopsied hearts revealed randomly distributed cardiomyocytes without enzyme activity (254). The expression of the defect was independent of an underlying heart disease, but age was a discriminating factor. The defects occurred sporadically in the second decade but were regularly present from the sixth decade on. The results indicate that cytochrome oxidase-deficient heart muscle cells represent a degenerative lesion associated with 
cellular aging and may be involved in the reduction of myocardial contractile ability in senescence. Concerning number and size of mitochondria with aging, Tauchi and Sato (386) carried out an extensive survey of area and circumference of mitochondria of the Similarly, histochemical activity of cytochrome oxidase in limb muscle and in diaphragm (255) revealed randomly distributed muscle fibers without the enzyme activity, in contrast to normal histochemical reactivity for succinate dehydrogenase. A histochemical analysis of diaphragmatic muscles from 49 subjects of different ages (54) demonstrated respiratory failure (cytochrome oxidase negativity) in occasional fibers from the fourth decade on with an ~10-fold increase between the fourth and ninth decade (from 0.16 to 2.85%). It thus had been a matter of great interest to discover why cytochrome oxidase has not been shown to generate significant amounts of such intermediates. On the basis of optical studies of oxy- and peroxy-cytochrome oxidase by Chance et al. 
(63), it became clear that the intermediates of oxygen reduction remain within the active site of cytochrome oxidase until the final reaction stage of water is achieved, probably for protection against cellular intoxication. From general properties of the mitochondrial generation of H2O2 and the effect of hyperbaric oxygen, it was postulated (36) that besides the well-known flavin reaction, formation of H2O2 may be due to interaction with an energy-dependent component of the respiratory chain at the cytochrome b level. These findings clearly indicated that the active sites of complexes IV and III, consisting of cytochromes a and b, respectively, play a crucial role not only in cellular energy production, but also in protection against cellular oxidative damage. Hence, attenuation of the active sites, even at a low absolute level, could result in serious outcomes for cellular viability. This point, however, has not been well recognized by researchers until recently.",
      "Second, the proteinproduct of the CYP24A1 (cytochrome P450, family 24, subfamilyA, polypeptide 1; HGNC:2602) gene is responsible for degradationof vitamin D into a physiologically inactive form. Vitamin D wasshown to be essential for proper muscle functioning [48,49] andpolymorphisms in the vitamin D receptor are associated in humanswith changed muscle strength in both genders [50]; these changesare likely to inuence the levels of physical activity. However, neither Mc3r nor Cyp24a1 contain a known non-synonymous codingSNP between the progenitor strains, A and B6.",
      "The cytochrome P450s (CYPs) causes the site-specific oxidization of the cyclic skeleton synthesized by OSCs.",
      "Second, the proteinproduct of the CYP24A1 (cytochrome P450, family 24, subfamilyA, polypeptide 1; HGNC:2602) gene is responsible for degradationof vitamin D into a physiologically inactive form. Vitamin D wasshown to be essential for proper muscle functioning [48,49] andpolymorphisms in the vitamin D receptor are associated in humanswith changed muscle strength in both genders [50]; these changesare likely to inuence the levels of physical activity. However, neither Mc3r nor Cyp24a1 contain a known non-synonymous codingSNP between the progenitor strains, A and B6.",
      "IntroductionThe mitochondrion of the modern human cell is the product of an ancient symbiosis in which an oxidative bacterium took up residence in the proto-nucleated cell that had developed motility and endocytosis.Following this initial symbiotic event, most of the genes of the mitochondrion were transferred to the nuclear DNA (nDNA) where they now reside, are replicated and transcribed.The resulting nDNA-encoded mitochondrial mRNAs are then translated on cytosolic ribosomes into proteins which are selectively imported into the mitochondrion.This mitochondrial protein import is frequently mediated by an amino terminal targeting peptide which is removed on entrance of the polypeptide into the mitochondrial matrix.",
      "The ITCHY library gave rise to variants with improved kcat with the substrateused for selection compared to either of the parental enzymes and additionallyshowed activity on ethacrynic acid, a compound recognized by neither parental enzyme. This combination of a human with nonhuman enzymes to formactive chimeras shows that this method could be used for the humanization ofproteins with therapeutic values that show no conserved framework allowingfor rational grafting. Chapter 36 Directed Protein Evolution3.1.9. SHIPRECCytochromes are proteins that contain heme groups and are responsible forthe transport of electrons.",
      "One is a conversion interaction; its inputsare citratecytosol + CoAcytosol + ATPcytosol and its outputs are acetyl-CoAcytosol +oxaloacetic acidcytosol + ADPcytosol + P04cytosol . The second is a catalytic interaction; its input is ATP citrate lyasecytosol . In another example, to represent thetranslocation of citrate from the mitochondrion to the cytosol, two entities and asingle conversion interaction are used: citratemitochondrion goes to citratecytosol . Theformation or modication of a protein complex can be represented. For example,ACLA and ACLB are the subunits that compose the enzyme ACL."
    ],
    [
      "Genetic mapping inmouse strains enhances the power of detecting modifier genes and identifying complexgenetic interactions. Genomewide quantitative trait locus (QTL) analysis, as described inmore detail below, represents a promising approach to detect genetic variants that areassociated with specific phenotypes and interact with each other. 16ACCEPTED MANUSCRIPTIn experimental crosses of two (inbred) strains the first generation (F1) ofoffsprings is genetically heterozygous but equal. Then in the next generation (F2) thePTstrain-specific genetic information is distributed across the genomes of their progeny andRIeach offspring is genetically unique.",
      "This contrast can be exploited to identify subregions that underlie the trans-QTLs [67]. SNPs were counted for all four pairs of parental haplotypesBvs D, B vs H, B vs C, and L vs Sand SNP profiles for the fourcrosses were compared (figure 6). Qrr1 is a highly polymorphicPLoS Genetics | www.plosgenetics.org8November 2008 | Volume 4 | Issue 11 | e1000260QTL Hotspot on Mouse Distal Chromosome 1Figure 5. QTL for aminoacyl-tRNA synthetases in distal Qrr1.",
      "The traditional approach to QTL mapping is to usetwo strains that differ maximally in the phenotype asparental strains for genetic crosses, with the followingcaveats. QTL analysis based on a single cross will mostlikely reflect only a small portion of the net geneticvariation, and QTL detection will be limited to regionswhere the two progenitor strains have functional polymorphisms. Data from multiple crosses, or from an HS,will overcome this limitation and can also be used toreduce QTL intervals [5,30].",
      "These candidate genes are then sequenced in the two parental inbredstrains looking for sequence dierences in coding or regulatory regions. After ne mapping the QTL interval and shortening the list of plausiblecandidate polymorphisms, the major challenge remains \u0001 proving denitivelywhich nucleotide polymorphism underlies the QTL. The most direct proofwould be replacing one strains allele with another strains allele (creating aFIG. 1. Intercross breeding strategy for mapping quantitative trait loci (QTLs). On the right, the parental, F1 hybrid, and intercross (F2) mousegenerations are depicted.",
      "One key advantage of lookingat transcripts with expression levels linking to behavioral QTL is the potential tomake inferences about the causal DNA variants underlying behavioral traits andtheir mechanisms. In analysis of behavioral traits alone, since linkagedisequilibrium is far-ranging in the BXD cross and each marker represents a widestretch of genomic sequence across which inheritance is correlated, it is difficultto narrow down which gene and DNA sequence variation at a given mappedlocus influences the behavioral trait.",
      "Interval-specific haplotype analysisApproximately 97% of the genetic variation betweeninbred mouse strains is ancestral [22], so regions ofidentity by descent (IBD) between two strains used todetect a QTL are highly unlikely to contain the causalgenetic polymorphism underlying the QTL [28]. Forexample, a cross between C57BL/6J and A/J mice detectedwww.sciencedirect.coma blood pressure QTL on Chr 1 [7].",
      "Interval-specific haplotype analysisApproximately 97% of the genetic variation betweeninbred mouse strains is ancestral [22], so regions ofidentity by descent (IBD) between two strains used todetect a QTL are highly unlikely to contain the causalgenetic polymorphism underlying the QTL [28]. Forexample, a cross between C57BL/6J and A/J mice detectedwww.sciencedirect.coma blood pressure QTL on Chr 1 [7].",
      "At present, the BXD panel is composed of 80 different strains that all have beenfully genotyped.26 Variation in any quantifiable trait can be associated with thesegregation of parental alleles, and linkage genetics can map this variation toquantitative trait loci (QTLs), thereby identifying the genomic region(s) affectingthat trait. An overview of the QTL mapping approach is depicted in Figure 2. Classical QTL analysis has permitted the identification of loci that areassociated with variation in HSC traits.",
      "The progenitor mouse strainsshould have sufficient variation for the traits of interest and they should be genetically diverseenough to enable genetic mapping (BENNETT et al. 2006; FLINT 2003; GRISEL 2000). Thesample size required for the identification of QTL depends largely on the effect size that aQTL contributes to phenotypes on interest. Inference about QTL can be made if one or moregenetic markers are over- or underrepresented in the analysed individuals. Genotyping isoften done by means of microsatellite markers, which contains mono, di-, tri-, ortetranucleotide tandem repeats flanked by specific sequences (Figure 4a).In general,linking genetic variation with trait variation identifies QTL and a significant linkage ofphenotype and genotype suggest that the DNA status helps to determine trait expression. As stated above, mouse QTL studies provide distinct advantages over human studiesin the examination of genetic causes of a quantitative trait (e.g. alcoholism), even in theabsence of specific hypotheses regarding its aetiology or candidate genes.This comparison gives information about the reliability of the observed genotypeinformation: The more the marker locations differ between the two maps (which signifiesvariation in marker positions), the higher the possibility of genotyping errors. QTL mapping was done in several stages to identify loci acting individually and QTL thatinteracted, either additively or epistatically. To determine individually-acting QTL, a singleQTL genome scan was conducted with the function scanone.",
      "Importantly, whereasthese studies required substantial labor, time, and resources, X-QTL is a quick and easyapproach to achieve a comparable level of genetic dissection. The levels of complexityobserved here (e.g. 14 loci explaining 70% of the genetic variance for 4-NQO resistance) arestill dramatically lower than those seen in for some human traits in GWAS (e.g. 40 lociexplaining 5% of the variance for height 2,5). One obvious explanation is the difference inexperimental designs (line crosses vs. population association studies), but differences ingenetic architectures among species and traits may also contribute.",
      "The method uses two pieces of information: mapping data from crosses thatinvolve more than two inbred strains and sequence variants in the progenitor strains within the intervalcontaining a quantitative trait locus (QTL). By testing whether the strain distribution pattern in the progenitor strains is consistent with the observed genetic effect of the QTL we can assign a probability that anysequence variant is a quantitative trait nucleotide (QTN). It is not necessary to genotype the animals exceptat a skeleton of markers; the genotypes at all other polymorphisms are estimated by a multipoint analysis.",
      "The method uses two pieces of information: mapping data from crosses thatinvolve more than two inbred strains and sequence variants in the progenitor strains within the intervalcontaining a quantitative trait locus (QTL). By testing whether the strain distribution pattern in the progenitor strains is consistent with the observed genetic effect of the QTL we can assign a probability that anysequence variant is a quantitative trait nucleotide (QTN). It is not necessary to genotype the animals exceptat a skeleton of markers; the genotypes at all other polymorphisms are estimated by a multipoint analysis.",
      "which allows the the estimation of the degree of dominance of detected QTLs.but point out that various designs or linkage analysis methods have advantages and disadvantages and need to be chosen for the question at hand. In general, there are two common statistical approaches for detecting and/or locating QTL. Both approaches involve moving along the chromosome and considering data for one or several markers at a time and relating these to the traits of interest.",
      "Genotyping all the individual progeny formarkers that show allelic variation between the parental strains (either single nucleotide polymorphisms or simple sequence repeats) will allow the detection of associations between trait values and marker genotype, and in this way demonstrate to whichset of markers a QTL is linked. To reduce the genotyping effort, selective genotypingof the individuals at the extremes of the phenotypic spectrum can be performed (20,23). Although these three approaches are in general considered to be the best to detect andmap QTL, they have several disadvantages for quantitative traits involving HSC.",
      "So, how do you go about planning and performing a QTL study, and howdo you identify the responsible gene within a QTL that you have identified? Generally, one starts by performing a strain survey to find two parental inbredstrains that have a markedly different trait. One can now look up many differenttraits of inbred mice online at the Mouse Phenome Database (http://phenome. jax.org/pub-cgi/phenome/mpdcgi?rtn=docs/home). However, the trait you maywant to study may not be present in wild type mice, so you may want to crossa mutant (or genetically engineered) strain onto several inbred strains.QTL Theory and PlanningThe theory behind the most basic form of QTL mapping is based upon intercrossing two inbred strains. The mouse genome consists of 19 pairs of autosomes (non sex-determining chromosome) and the X and Y chromosomes. Inthe example shown in Fig. 18.1, we are intercrossing stain A (shown with ablack chromosome pair) with strain B (shown with a white chromosome pair). The initial F1 (filial generation 1) mice are true hybrids, with each individualFrom: Molecular Biomethods Handbook, 2nd Edition.",
      "These candidate genes are then sequenced in the two parental inbredstrains looking for sequence dierences in coding or regulatory regions. After ne mapping the QTL interval and shortening the list of plausiblecandidate polymorphisms, the major challenge remains \u0001 proving denitivelywhich nucleotide polymorphism underlies the QTL. The most direct proofwould be replacing one strains allele with another strains allele (creating aFIG. 1. Intercross breeding strategy for mapping quantitative trait loci (QTLs). On the right, the parental, F1 hybrid, and intercross (F2) mousegenerations are depicted.",
      "QTL mapping studies thenseek to detect the polymorphisms underlying the complex traits of interest byscanning for alleles that co-vary withthe traits. Similar experiments also can be conducted with special derivatives of inbredstrains known as recombinant inbred(RI) mice. These animals are derivedby cross-breeding two or more distinctparental strains (which often divergewidely for the trait of interest), followedby inbreeding of the offspring for severalgenerations (Bailey 1971). Given thecorrect breeding strategy, this method1This is an issue faced by GWASs researchers when classifyingsamples as cases or controls."
    ]
  ],
  "task_id": [
    "CDFC418BD568E839C09656C57808ADA1",
    "5DEB102510F48D0BF9C278DC895A8BD1",
    "58D6F365917926445960756A26B3FDC8",
    "A5DEAEAC441B3BDC65B58EA6923FAE73",
    "0CDD1C9219114BB2770C28D541F1060A"
  ]
}