Evaluation code for paper 1

author: SoloDShelby 2024-07-19 14:41:40 +0300
committer: SoloDShelby 2024-07-19 14:41:40 +0300
commit: 3fa31b50af2861382fbe2c76406f5a04c3fefc93 (patch)
tree: 34d581648b0e0d3fc8dbe6577752a4fd433a3258 /gnqa/paper1_eval/src/data/datasets/old
parent: 74616897e30c7daafe5e74d34073466464921316 (diff)
14 files changed, 2176 insertions, 0 deletions
diff --git a/gnqa/paper1_eval/src/data/datasets/old/aging1_dataset.json b/gnqa/paper1_eval/src/data/datasets/old/aging1_dataset.json
new file mode 100644
index 00000000..f5b979c1
--- /dev/null
+++ b/gnqa/paper1_eval/src/data/datasets/old/aging1_dataset.json
@@ -0,0 +1,128 @@
+{
+  "question": [
+    "List as many studies as you can that include rapamycin",
+    "Why is it so difficult to map gene loci that control aging in humans?",
+    "How would one evaluate a candidate gene for mouse lifespan?",
+    "Do we find significant quantitative trait loci in aging studies, if so give the top three examples with descriptions?",
+    "What are the most fundamental human life expectancy extending geroscience studies?"
+  ],
+  "answer": [
+    "1. A study assessing the impact of a rapamycin derivative on healthy aging in people (Mannick et al. 2014). 2. A veterinary clinical trial testing rapamycin as a treatment for osteosarcoma (Paoloni et al. 2010). 3. A veterinary clinical trial assessing the side effect profile and effects on age-associated cardiac function following 10 weeks of rapamycin treatment in healthy elderly dogs (Kaeberlein 2015). 4. A study on the pharmacokinetic analysis of rapamycin treatment in healthy dogs (Larson et al. 2016). 5. A study on the effects of rapamycin in aging mice (Johnson et al. 2015). 6. A study on the effects of rapamycin in rodents (Johnson et al. 2013). 7. A study on the effects of rapamycin on the replicative life of yeast (Reference [57]). 8. A study on the effects of rapamycin in 20-month-old mice (Reference [58]). 9. A study on the combination of rapamycin with other compounds for anti-aging activity (Reference [59]).",
+    "Mapping gene loci that control aging in humans is difficult due to several reasons. Firstly, aging is a complex phenotype influenced by multiple genes and their interactions with each other and the environment. Secondly, the effects of common genetic variations on longevity are small, requiring large studies for identification. Thirdly, human studies often face issues such as environmental heterogeneity, genetic diversity, and lack of birth matched controls. Lastly, the influence of social, economic, and cultural factors, which are unique to humans, can't be replicated in traditional model organisms, adding to the complexity.",
+    "One would evaluate a candidate gene for mouse lifespan by conducting gene mapping methods and using mouse life span as a surrogate for aging. This involves studying the genetic contributions to mortality rates, conducting sex-specific analyses, and determining any changes in pathology associated with the loci. The study would involve a large population of mice, and the results would be compared to Mendelian expectations. Additionally, epigenetic predictors and differential methylation analysis could be used to assess the relationship between the gene and lifespan. The gene's influence on specific late-life diseases and its effect on the risk of these diseases would also be evaluated.",
+    "Yes, significant quantitative trait loci (QTLs) have been found in aging studies. Three examples include:  1. Lfdr1: This QTL was found on chromosome 7 and affects lifespan and fertility in response to dietary restriction. It also has suggestive effects on fuel efficiency (FE).  2. Fedr1 and Fedr2: These QTLs were identified on chromosomes 9 and 15 respectively. They have significant effects on fuel efficiency in response to dietary restriction.  3. QTLs associated with age-related thymic involution: These were identified in a study using C57BL/6 J X DBA/2 J (BXD) recombinant inbred (RI) strains of mice.",
+    "The most fundamental human life expectancy extending geroscience studies include genetic and genomic research into longevity. These studies use both hypothesis-driven candidate gene approaches and explorative genome-wide analyses. They also involve the study of biomarkers related to aging, deep phenotyping, and the use of novel technologies like next-generation sequencing. Additionally, they examine the role of environmental factors, lifestyle, and heritability in human lifespan variation."
+  ],
+  "contexts": [
+    [
+      "Intervention trials and cell-based monotherapy",
+      "Rapamycin has been shown to be well tolerated in dogs, improves outcome in a glycogen storage disease model (Yi et al. 2014), and is currently being tested in veterinary clinical trials as a treatment for osteosarcoma (Paoloni et al. 2010).A veterinary clinical trial is underway to assess the side effect profile and effects on age-associated cardiac function following 10 weeks of rapamycin treatment in healthy elderly dogs (Kaeberlein 2015).",
+      "Rapamycin is used clinically to prevent organ transplant rejection, for some forms of cancer, and to prevent restenosis in cardiac stents (Kaeberlein 2013b). Short-term treatment with the rapamycin derivative RAD001 improves age-associated decline in immune function, as measured by antibody response to an influenza vaccine, in healthy elderly people (Mannick et al. 2014).",
+      "To date, only one study has been performed assessing the impact of a rapamycin derivative on healthy aging in people.In this trial, it was observed that 6 weeks of treatment with the rapamycin derivative RAD001 (everolimus) was sufficient to enhance function of the aged immune system, as assessed by response to an influenza vaccine (Mannick et al. 2014).This recapitulates what was observed in elderly mice (Chen et al. 2009), and suggests that at least some of the mechanisms by which rapamycin delays aging in mice work similarly in humans.Although both compounds have essentially identical biological activities, RAD001 was used in this study instead of rapamycin because the study was funded by Novartis, who holds the patent rights for RAD001 (rapamycin is now off patent and sold as a generic drug).The doses of RAD001 used in the human immune aging study were lower than those typically used to prevent organ transplant rejection and showed improved side effect profiles, although some adverse effects, including the presence of mouth sores in a subset of the patients, were noted.",
+      "This trial is designed to determine whether treatment with the drug rapamycin (see Table 1) can significantly reduce age-related disease and disability as well as mortality in middle-aged large dogs.The initial phase of this trial, which is in progress at the time of this writing, is intended to enroll at least 32 dogs 6 years of age or older and 40 lb in weight or greater.Each animal receives an initial veterinary exam and comprehensive blood work along with a cardiac exam including echocardiography (Fig. 3).Those dogs that do not present with any abnormalities or significant pre-existing health conditions are randomized into either placebo or rapamycin treatment groups for a 10-week treatment period.Initial rapamycin dosing regimens were determined, in part, based on pharmacokinetic analysis of rapamycin treatment in healthy dogs (Larson et al. 2016).After 10 weeks in the study, each dog receives another full exam and blood chemistry panel as well as repeat cardiac exam.The primary goals of this first phase are to establish appropriate dosing of rapamycin in the absence of significant adverse events and to determine whether similar improvements in cardiac function are achieved in aged dogs after 10 weeks of rapamycin treatment, as has been observed in laboratory mice (Dai et al. 2014;Flynn et al. 2013).",
+      "Fig. 3 Design of the current short-term rapamycin intervention trial.Dogs must weigh at least 40 pounds and be at least 6 years old at time of entry into the study.If no significant pre-existing health conditions are detected at the first exam, dogs are randomized into either placebo or one of the rapamycin treatment groups.Red indicates the 10-week period during which the dogs receive either rapamycin or placebo.Dogs receive the same generic rapamycin (sirolimus) pill that is provided to human patients.Asterisk Serum and feces are collected at each appointment for future metabolomic and microbiome analyses and for quantitation of circulating rapamycin levels",
+      "Pending the outcome of phase 1, we anticipate enrolling several hundred additional dogs with similar entry criteria into a longer-term, 3-5 year study, to carefully assess the extent to which rapamycin improves health and reduces mortality in middle-age companion dogs.In addition to cardiac function, assessments of multiple age-related phenotypes will be performed including measures of cognitive function, muscle function, kidney function, glucose homeostasis, and cancer incidence.Many of these parameters are beneficially impacted by rapamycin in aging mice (Johnson et al. 2015), and we predict that rapamycin will induce similar improvements in aging dogs.",
+      "Rapamycin is currently the most effective pharmacological intervention for extending lifespan and delaying a broad range of age-related functional declines in rodents (Johnson et al. 2013).However, the doses used clinically to prevent organ transplant rejection are associated with side effects, such as impaired wound healing, edema, elevated circulating triglycerides, impaired glucose homeostasis, gastrointestinal discomfort, and mouth ulcers (Augustine et al. 2007;de Oliveira et al. 2011).These adverse side effects would likely preclude long-term use of rapamycin at these levels in otherwise healthy people.With the possible exception of impaired glucose homeostasis (Lamming et al. 2012), these side effects have not been observed at doses that are associated with increased lifespan and healthspan in mice, however, raising the possibility that lower doses of this drug could promote healthy aging with minimal adverse effects.",
+      "Rapamycin  Rapamycin is a macrolide isolated from Streptomyces hygroscopicus, a bacteria from Pascua Island (Rapa Nui).It has functions as an antibiotic, an immune suppressant drug, and it is also proposed as a CRM.After the first studies, it was found that rapamycin could induce the extension of the replicative life of yeast through the inhibition of TOR signaling [57].This compound could extend the lifetime useful in 20-month-old mice in correlation with TOR activity [58].These studies were the basis of the research to determine the function of rapamycin as a CRM, due to its modulating properties over proteostasis.In addition, studies suggest that rapamycin can be combined with other compounds (metformin, losartan, statins, propranolol, and aspirin among others) to potentiate their anti-aging activity [59].",
+      "Rapamycin  Rapamycin is a macrolide isolated from Streptomyces hygroscopicus, a bacteria from Pascua Island (Rapa Nui).It has functions as an antibiotic, an immune suppressant drug, and it is also proposed as a CRM.After the first studies, it was found that rapamycin could induce the extension of the replicative life of yeast through the inhibition of TOR signaling [57].This compound could extend the lifetime useful in 20-month-old mice in correlation with TOR activity [58].These studies were the basis of the research to determine the function of rapamycin as a CRM, due to its modulating properties over proteostasis.In addition, studies suggest that rapamycin can be combined with other compounds (metformin, losartan, statins, propranolol, and aspirin among others) to potentiate their anti-aging activity [59].",
+      "One out of the 25 FDA approved Breast cancer drugs (Gemcitabine), was found in the top 20 drug list from LINCS from breast cancer stage I (dark magenta). As shown in Fig. 12, one drug out of 25 FDA approved Breast cancer drugs, Gemcitabine, was found as repurposed drug from LINCS for breast cancer stage III. Letrozole (Breast cancer drug) has similar structure (greater than 60%) with Ruxolitinib (repurposed drug from LINCS) a drug for the treatment of intermediate or high-risk myelofibrosis (Fig. 13).",
+      "One out of the 25 FDA approved Breast cancer drugs (Palbociclib), was found in the top 20 drug list from LINCS from breast cancer stage II (deep pink). Scientific Reports | 6:20518 | DOI: 10.1038/srep20518  13 www.nature.com/scientificreports/  Figure 11. Highlighted target genes that physically interact with genes from the breast cancer stage II common network pattern and their corresponding repurposed drugs from LINCS, along with their structurally similar Breast cancer drugs. As shown in Figs 16\u201317 two target genes (TOP2A and TYMS) are also involved in the Triple Negative pattern.",
+      "Two of them (Gemcitabine and Palbociclib) are included in the list of the 25 known FDA-approved Breast cancer therapeutic drugs. We performed a Hypergeometric distribution test in order to find the statistical significance of this drug overlapping. More precisely, LINCS_L1000 database is comprised from 20,413 chemical reagents. Twenty two out of twenty five breast cancer drugs are also included in LINCS database. Finally, from the 105 drugs that were found from our analysis, the probability of finding two drugs to overlap with the Breast Cancer drugs in LINCS is 0.005471157, pointing out that there is statistical significance in their selection.",
+      "Two from the 25 FDA approved Breast cancer drugs (Gemcitabine and Palbociclib), was found in the top 20 drug list from LINCS from Luminal A breast cancer (dark magenta and deep pink respectively).",
+      "18 two drugs out of 25 FDA approved Breast cancer drugs \u2013 Gemcitabine and Palbociclib \u2013 were also found as repurposed drugs from LINCS for breast cancer Luminal A (Fig. 18). Two genes from the Luminal A network pattern physically interact with four genes that involved in Histone deacetylases class (HDAC1, HDAC2, HDAC3 and HDAC8), which are target genes of Vorinostat (repurposed drug from LINCS). Vorinostat is a member of a larger class of compounds that inhibit histone deacetylases (HDAC) and it is used to treat cutaneous T cell lymphoma (CTCL).",
+      "One out of the 25 FDA approved Breast cancer drugs (Gemcitabine), was found in the top 20 drug list from LINCS from breast cancer stage III (dark magenta). that was found from the drug repurposing analysis of HER2 pattern. It has similar structure - 75% with WZ-4002 repurposed drug, which is a novel mutant-selective inhibitor of EGFR. Finally, both Palbociclib and WZ-4002 are structurally similar to Dasatinib (more than 60%), which is a cancer drug used to treat acute lymphoblastic leukemia.",
+      "Network pattern for each breast cancer subtype and the common interactions across Luminal A and Luminal B. As shown in Fig. 8, one drug out of 25 FDA approved Breast cancer drugs, Gemcitabine, was proposed as repurposed drug by the LINCS for breast cancer stage I. Furthermore, Gemcitabine is quite similar (tanimoto31 similarity greater than 80%) with Clofarabine and Kinetin-riboside (repurposed drugs from LINCS). Clofarabine is also an anti-cancer, antineoplastic chemotherapy drug and is classified as an antimetabolite.",
+      "Hierarchical clustering using tanimoto similarity (Soergel distance) was applied to each of the top 20 drug list from LINCS and the 25 known FDA-approved Breast cancer therapeutic drugs (Supplementary Figs 54\u201361). LINCS Drug Names were transformed into ChemSpider IDs (see Supplementary Table 1) In synopsis, the unique drugs for the breast cancer stages were 63 and for the breast cancer subtypes 58, as we have located common drugs across them. Taking their union and removing the duplicates we conclude to a total of 105 repurposed drugs.",
+      "13, is also structurally similar (greater than 60%) with 6-(1,3-Benzodioxol-5-yl)-N-(cyclopentylmethyl)-4-quinazolinamine (repurposed small molecule from LINCS). As in breast cancer stages I and III one drug out of 25 FDA approved Breast cancer drugs \u2013 Gemcitabine \u2013 was found as repurposed drug from LINCS for breast cancer stage IV (Fig. 14). A repurposed drug from LINCS \u2013 Homoharringtonine was found to be structurally similar with Everolimus and Vinblastine Breast cancer drugs (greater than 70%). On the other hand, as shown in Fig.",
+      "Rapamycin has serious side effects, particularly as an immunosuppressor, and thus it is not suitable as an antiaging drug.As in sirtuins, however, these studies highlight the road from basic discovery on the biology of aging to antiaging interventions.Further studies of the TOR pathway and of repressors more specific of its downstream signaling pathway are ongoing.Whether rapamycin produces a change in another parameter related to energy uptake or utilization is unknown, and determining which of its effects modulate lifespan is an important unsolved question.Like resveratrol, TOR has attracted considerable attention from the pharmaceutical industry, particularly in the context of cancer (Meric-Bernstam and Gonzalez-Angulo, 2009)."
+    ],
+    [
+      "FUTURE DIRECTIONS: HIGHER RESOLUTION DATA VIA HIGHER THROUGHPUT ASSAYS  One inescapable conclusion of the aggregate results of genome-wide studies of aging to date (see summary Table 1) is that we have not come close to saturating the number of potentially lifespan-altering genes in any organism.This is in no small part because directly generating survival curves is a relatively time-consuming process in most model organisms using current methods.There are several possible ways to address this.One way that has been tried is by attempting to find surrogate phenotypes [72,73,126] that can be screened more rapidly, or even scored under selection.Another is mining candidates from the many whole-genome expression profiles.Results to date with these have been very fruitful, but have not suggested that these methods alone will rapidly saturate our search for lifespan-and healthspan-altering genes in tractable model organisms.",
+      "Genetic linkage studies of long-lived human families identified a longevity locus while candidate gene approaches have been used to identify and confirm the association between specific variants in the FOXO3A gene and human longevity [3\u20137]. Genome-wide association studies have also been used to identify the association of APOE with life  123 Aging Clin Exp Res  span and have yielded insights into potential biological pathways and processes related to aging. Despite these successes, several problems are inherent in human longevity studies including potentially high degrees of environmental heterogeneity, genetic diversity, and lack of birth matched controls, among others [8].",
+      "Additional association studies with these families and replication of these results with an independent data set should facilitate the positional cloning of a gene that influences the ability to age well and achieve exceptional longevity.Identification of the genes in humans that allow certain individuals to live to extreme old age should lead to insights on cellular pathways that are important to the aging process.",
+      "The aging process most certainly is under highly polygenic controls\u2026 This should not discourage us from pursuing a search for those loci which may be of profound importance to human aging as it ordinarily occurs in most human beings.",
+      "In most experimentally modified animal model systems, single-gene mutations in many different genes have major life extension effects (Fontana et al., 2010;Kenyon, 2010).However, natural human and animal longevity is presumed to be a complex trait (Finch & Tanzi, 1997).In humans, both candidate gene and genome-wide genetic association approaches have been applied in an attempt to identify longevity loci.The frequency of genetic variants has been typically compared between nonagenarian cases and young controls, revealing loci at which genetic variants may contribute to a higher or lower probability of survival into old age.The initial candidate gene studies aimed at finding human longevity genes were dominated by contradictory results (Christensen et al., 2006).The more consistent evidence obtained by repeated observation in independent cohort studies for association with longevity has so far only been observed for three loci, the apolipoprotein E (APOE) locus (Schachter et al., 1994;Christensen et al., 2006), the FOXO3A locus (Willcox et al., 2008;Flachsbart et al., 2009;Pawlikowska et al., 2009;Soerensen et al., 2010), and the AKT1 locus (Pawlikowska et al., 2009).Thus, despite the expectation that longevity would be influenced by many genetic variants with small effect sizes, the effect of variants has consistently been shown in only three genes.",
+      "1993), and gene expression microarrays (Pletcher et al. 2002). Given the ambiguities and limitations of large-effect mutant studies of aging, discussed earlier, those publications do not provide very useful evidence with respect to the question of the number of loci that affect aging. At present, the best answer to the question of the number of genes controlling aging is many (Rose and Long 2002), in keeping with the original expectations of evolutionary biologists. However, studies of the genetics of the experimental evolution of aging are now amenable to the application of genomic methods.",
+      "Accepted Article  \u00a9 2013 The Authors Aging Cell \u00a9 2013 Blackwell Publishing Ltd/Anatomical Society of Great Britain and Ireland over 90 years and 1,955 controls between 55 and 80 years did not reveal genome-wide significant loci (Newman et al., 2010) and neither did the analyses of all-cause mortality and survival free of major disease in this cohort (Walter et al., 2011).A smaller Dutch study of 403 nonagenarians and 1,670 controls younger than 65 years identified the APOE gene as a mortality locus (Deelen et al., 2011), which was confirmed in a German study of 763 long-lived individuals and 1,085 younger controls (Nebel et al., 2011) and a longitudinal study of 1,606 Danes showed that the effect size of this association increases at the highest ages (Jacobsen et al., 2010).Apparently, the influence of the common genetic variation on longevity is small which requires large meta-GWA studies for identification.Alternatively, rare genetic variants may play a more important role in longevity.Since the previous linkage studies showed contradictory results potentially due to heterogeneity in the longevity phenotype, it is expected that longevity is influenced by many private rare variants.",
+      "The lack of success in the identification of genes related to aging in humans may be due to the complexity of the phenotype.One approach to investigate aging and longevity is to compare frequencies of genetic variants between nonagenarians or centenarians and the general population.This approach led to the discovery of an association between APOE (Deelen et al., 2011;Ewbank, 2007;Gerdes et al., 2000) and more recently FOXO3A (Anselmi et al., 2009;Flachsbart et al., 2009;Li et al., 2009a;Pawlikowska et al., 2009;Willcox et al., 2008) and human aging and longevity.However, a recent genome-wide association study (GWAS) of individuals reaching the age of 90 or older failed to identify genome-wide significant variants (Newman et al., 2010).",
+      "Several explanations are possible for the lack of genome-wide significant findings. First, mortality is arguably one of the most complex phenotypes, and several trajectories toward extreme old age have been identified (Evert et al., 2003). Multiple genes could mediate the aging process but would have their effects through numerous different patho-physiological processes and diseases that act as intermediate factors on the pathway to death (de Magalhaes et al., 2010). Therefore, any common variation in genes associated with aging probably has a small effect.",
+      "Second, the largely negative findings of this and other studies contrast with the intriguing animal studies of longevity.Very large effects of single genes on lifespan have indeed been observed in laboratory animals, but humans often have several homologues of these genes which might significantly differ in function or compensate for mutated genes through redundant mechanisms (Kuningas et al., 2008).This could explain why our top findings did not include genes in these pathways found in animal models.Animal models also represent genetically homogenous populations and are exposed to controlled environmental influences.The lack of replication of animal model findings in humans suggests that the use of knockout animals may not provide the optimal approach to understanding the variation in survival in humans as interactions with environmental factors may obscure the associations and prevent the identification of loci in humans.",
+      "1993), and gene expression microarrays (Pletcher et al. 2002). Given the ambiguities and limitations of large-effect mutant studies of aging, discussed earlier, those publications do not provide very useful evidence with respect to the question of the number of loci that affect aging. At present, the best answer to the question of the number of genes controlling aging is many (Rose and Long 2002), in keeping with the original expectations of evolutionary biologists. However, studies of the genetics of the experimental evolution of aging are now amenable to the application of genomic methods.",
+      "The remarkable discoveries of the past 2 decades showing that single genes can regulate aging in model organisms demonstrate that aging can be genetically manipulated (Finch and Ruvkun, 2001; Kenyon, 2010). Hundreds of genes that modulate longevity have now been identified in model organisms (de Magalh\u00e3es et al., 2009a). In some cases (e.g., in worms), mutations in single genes can extend lifespan by almost 10-fold (Ayyadevara et al., 2008). Nonetheless, aging is a complex process that derives not from single genes but from the interactions of multiple genes with each other and with the environment. Evidence from animal systems shows a major impact of the environment on aging, yet environmental manipulations of aging act through genes and proteins, usually by triggering signaling pathways and modulating gene expression. In fact, some genes have been shown in model organisms to have varying effects on lifespan depending on diet (Heikkinen et al., 2009). Genes that can regulate aging in model organisms cannot be directly applied to humans through genetic manipulations for numerous legal, ethical, and technical reasons. If we could understand how the environment modulates these aging-related genes, we might be able to create antiaging therapies applicable to humans, potentially through diet, lifestyle, and even pharmacological interventions. Therefore, understanding genome-environment interactions in the context of aging can be a powerful approach to identify attractive targets for drug design.",
+      "Results from mutational analysis across eukaryote model organisms have shown unexpected conservation of genes and processes regulating aging.While unique properties exist within particular organisms that modulate these foundational networks, the conservation provides a tool to refine human genetic studies.As noted, GWAS for human longevity metrics suffer from large sample size requirements to obtain statistical resolution due to multiple hypothesis testing across the genome.Assuming that evolutionary genesets for longevity could be generated with confidence, an intersection of them with human variation data would increase the sensitivity of association studies.This would serve as a selective filter to refine the number of loci investigated for association in human populations.Similarly, such evolutionary filters could refine analysis of rare, unique variation within genome sequence data from extremely long-lived cohorts.A similar approach to refine human longevity GWAS used an intersection with age-related disease datasets.This 'disease-informed' GWAS helped refine candidates (iGWAS, Fortney et al., 2015), though, it should be noted that this particular strategy would further blur the distinction between aging and longevity as discussed above.The definition of gene sets from evolutionary experiments in longevity, across clades, would similarly empower detection of networks previously hidden under GWAS in human population analyses (Figure 3).",
+      "ANALYSIS OF HUMAN VARIATION IN THE GENETIC CONTROL OF LONGEVITY  Heritability studies have convincingly demonstrated that at least some fraction of human lifespan is heritable. In tandem, large-scale genome-wide association studies (GWAS) have identified numerous loci associated with age-related traits (Buniello et al., 2019). While genetic studies have functionally shown an inverse effect of multiple age-related, disease-associated variants on lifespan regulation, the number of well-replicated longevity-conferring variants remains limited to variants in APOE (ApoE \u03b52), and more recently, CDKN2A/B and IL6 (see Table 1). To date, studies in humans have been hampered by the specific phenotype definitions used, sample sizes of the extreme phenotypes, and modest heritability of the longevity-related traits (Breitbach et al., 2019). This is due to the complex interplay of biological and social factors involved in human aging, as well as the limited power of GWAS, which require sampling thousands of subjects to achieve statistical significance (Breitbach et al., 2019). Genetic studies of aging have also been hindered by an inconsistent use of definitions of aging (reviewed in Baghdadi et al., 2020). The two main ways of conducting research on the genetics of longevity in human populations are by studying (i) the lifespan (continuous trait, years lived) and (ii) the longevity (dichotomous trait, i.e., being among the longest-lived individuals within a specific population). These complexities have limited the resolution and capability of broad association studies of human longevity. Importantly, these genomic analyses focus on a shift of survival in a population; these variables may be genetically distinct from the mechanisms establishing potential for longevity overall (Figure 1A). We argue that an understanding of this shift in lifespan as well as genetic mechanisms of regulating a species specific 'set points' (Figure 1B) will aid in the conceptual distinction of aging and longevity in humans.",
+      "TRANSLATION OF LONGEVITY MODEL ORGANISMS AND CORE AGING PATHWAYS  Genetic studies on lifespan have proven to be challenging.While longevity is a defining trait for a given species, the lifespan of individuals is of limited heritability, making analyses more difficult.Exceptional human life span, although a rare phenotype, is likely multifactorial; refined analyses are required to obtain statistically robust genomic signatures of longevity (Zhang et al., 2020) and these have proven elusive.Unlike laboratory models, the effect of environmental variance cannot be controlled in human studies, potentially masking purely biological aging mechanisms.Even laboratory models cannot replicate the complex \"environment\" of humans; it includes psychosocial, economic, and cultural factors, rather than strictly biological.These human-specific confounders are difficult or impossible to target in traditional model organisms.Despite these limitations, experimentally tractable model organisms have proven invaluable in deciphering the purely genetic contribution to lifespan, including genes and pathways conserved across the tree of life.",
+      "Our analyses show that it is extremely unlikely that there is a single gene harboring rare protein-altering variants shared by all supercentenarians but no controls. It is not surprising that a highly complex trait such as longevity is not explained by a single Mendelian gene.",
+      "With modern genomic technologies and large-scale data analysis methods, it is possible to sift through the genes of populations to find the loci that act to postpone aging [3]. There are uncertainties with the comparison of populations with different rates of aging. However, it is superior to experimental designs that only consider age-dependence or dietary-response, without determining causal mechanisms.",
+      "Although the models data set comprises all genes (to our knowledge) shown by the time of the latest update to statistically increase longevity or alter the aging process in a noticeable way, in the human data set we try to evaluate whether a given intervention is affecting the aging process itself or not. For example, many mutations may increase longevity by decreasing the incidence of specific diseases, rather than by altering the basic process of aging (de Magalh\u00e3es et al., 2005a, 2005b). Therefore, the human data set is not merely an extension of the work conducted in model organisms and of its bibliography, but a manually selected list of the most pertinent human aging candidate genes, each presented with a higher annotation level. We cite studies on whether the functions of aging-associated genes in model organisms are conserved in their human orthologues. Likewise, we cite flaws in previous studies based on new published observations, although we have a neutral stance on conflicting findings from different research groups. Our policy is to cite all conflicting reports and let visitors make their own decisions on how to interpret them. By contrast, each entry in GenAge model organisms has only one reference: the first publication reporting an association of the gene with longevity or aging. Moreover, one of the latest enhancements in the human data set was the inclusion of Gene Ontology annotation. Gene Ontology terms and annotation files were obtained from the Gene Ontology Consortium website (http://www.geneontology.org/) and provide an additional layer of description for the gene products in a cellular context (Ashburner et al., 2000).",
+      "Conclusions and prospects  Over the past two decades the human aging field has built up the necessary resources to study the biology of aging and longevity by establishing human populations with a diversity of designs. Meta-analyses integrating genetic and phenotypic datasets have successfully identified variants associated with a range of age-related traits and diseases. Despite these accomplishments, the number of novel leads contributing to human lifespan regulation is limited. Although positive regions of linkage and suggestive GWAS hits have been reported, the field has not yet identified the loci that explain the clustering of longevity in families and the variation in biological aging rate in the population. As for animal models, down-signaling of the IIS and mTOR pathway appeared to be relevant in humans. These findings are being followed up by molecular and physiological profiling using skin, fat and muscle tissue of long-lived family members and controls. Human studies now also include the response of nutrient sensing systems to the application of dietary and physical challenges.",
+      "Most of the human candidate gene studies were performed in cross-sectional designs (Box 1 and Fig. 1), comparing allele frequencies of potential longevity loci between highly aged individuals and young controls. The candidate gene studies based on single genes have pointed to a role for genes involved in, e.g., GH/insulin/IGF-1 signaling, immune regulation, and lipoprotein metabolism (Supporting Information Table S1), although most of these results have not (yet) been confirmed in sufficient independent studies. The most convincing human longevity loci today are APOE and FOXO3A, which have frequently been associated with longevity in cross-sectional studies (for a review, see [26]) and survival in prospective studies [27][28][29] (Fig. 3). APOE encodes the protein apolipoprotein E, which seems to play a role in, e.g., lipoprotein metabolism, cognitive function, and immune regulation [30]. FOXO3A encodes the protein forkhead box O3, which acts as a transcription factor for many different genes involved in processes like apoptosis and oxidative stress [31]."
+    ],
+    [
+      "Funding: See page 22 Preprinted: 24 June 2021 Received: 03 November 2021 Accepted: 01 April 2022 Published: 07 April 2022 Reviewing Editor: Joris Deelen, Max Planck Institute for Biology of Ageing, Germany Copyright Mozhui et al. This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited. Editor's evaluation This article used three newly generated epigenetic predictors to test how they differ between genetically diverse mice from the BXD family (by looking at metabolic traits and lifespan).",
+      "Longevity data was obtained from a parallel cohort of BXD mice housed in the same UTHSC colony, and members of this \u2018longevity cohort\u2019 were allowed to age until natural death (more detail on the longevity cohort can be found in Roy et al., 2021). Males were excluded and strain-by-diet lifespan summary statistics were derived. Only strain-by-diet groups with five or more observations for lifespan were included in the correlational analyses with the epigenetic predictors. Multivariable EWAS Site-by-site differential methylation analysis (EWAS) was performed on the 27,966 CpGs using a multivariable regression model.",
+      "Conclusions These results suggest a novel locus influencing survival in the B6/D2 genetic background, perhaps via a metabolic disorder that emerges by 200 days of age in male animals. Keywords: Pathology, Longevity, Lifespan, Mouse, Linkage. Introduction Longevity, the quintessential complex trait, likely reflects all aspects of an organism\u2019s life history. In humans, the heritability of age at death is estimated at 25\u201333% [1]. Genetic contributions to mortality rates are thus of great interest and may aid in the understanding of disease etiology and the process of aging itself [2].",
+      "Leduc MS, Hageman RS, Meng Q et al (2010) Identification of genetic determinants of IGF-1 levels and longevity among mouse inbred strains. Aging Cell 9(5):823\u2013836. doi:10.1111/j.14749726.2010.00612.x 10. Lang DH, Gerhard GS, Griffith JW et al (2010) Quantitative trait loci (QTL) analysis of longevity in C57BL/6J by DBA/2J (BXD) recombinant inbred mice. Aging Clin Exp Res 22(1):8\u201319 11. Gelman R, Watson A, Bronson R et al (1988) Murine chromosomal regions correlated with longevity. Genetics 118(4):693\u2013704 12. Jackson AU, Galecki AT, Burke DT et al (2002) Mouse loci associated with life span exhibit sex-specific and epistatic effects.",
+      "Here, we have extended this analysis to search for genotypes related to survival to the age of 800 days in a population of a reciprocal F2 cross between (B6) and (D2) mice. Since QTL for longevity in mice have shown strong sex specificity [10, 12], we conducted sex-specific analyses. In addition, we also determined whether there were any changes in pathology associated with the loci that showed frequency distortions with aging. To confirm the associations of the loci of interest with longevity and pathology, we performed replication analyses on a panel of BXD recombinant inbred strains.",
+      "Methods We examined a population of 1200 mice that were F2 generation offspring of a 4-way reciprocal cross between C57BL/6J and DBA/2J strains. Animals were sacrificed at age 200, 500, or 800 days and genotyped at 96 markers. The 800-day-old cohort, comprising the survivors of a much larger breeding group, was examined for enriched frequency of alleles that benefit survival and depletion of alleles that reduce survival. Results Loci on Chr 13 in males and on Chr X in females were significantly distorted from Mendelian expectations, even after conservative correction for multiple testing.",
+      "Assessing epigenetic age in long-lived mice  The epigenetic-aging model was applied to the methylation profiles of long-lived mice and the age-matched controls not used for training (Additional file 2: Datasets used summary). Reductions in age were calculated by subtracting the epigenetic ages of the untreated, wild-type mice from those of the treated mice of the same genetic background. To assess the significance, we used an ANOVA for all 22-month-old mice or only 22-month-old UM-HET3 mice. We also compared the epigenetic ages between treatments with their age-matched controls from the same genetic background using a t-test (Additional file 4: Treatment vs wild type stats).",
+      "Editor's evaluation  This article used three newly generated epigenetic predictors to test how they differ between genetically diverse mice from the BXD family (by looking at metabolic traits and lifespan). The authors subsequently identified several quantitative trait loci for the different predictors, using linkage analysis, and performed transcriptome and proteome analyses of liver and adipose tissue. The described results provide some important new insights on the underlying biology of epigenetic mouse aging and may be used to inform future studies in other model organisms and humans focused on studying the relationship between epigenetic aging and metabolism.",
+      "352(6291): p. aad0189. Liao, C.Y. , et al. , Genetic variation in the murine lifespan response to dietary restriction: from life extension to life shortening. Aging Cell, 2010. 9(1): p. 92-5. Johnson, M., Laboratory Mice and Rats. Mater. Methods, 2012. 2: p. 113. Fontaine, D.A. and D.B. Davis, Attention to Background Strain Is Essential for Metabolic Research: C57BL/6 and the International Knockout Mouse Consortium. Diabetes, 2016. 65(1): p. 25-33. Simon, M.M. , et al. , A comparative phenotypic and genomic analysis of C57BL/6J and C57BL/6N mouse strains. Genome Biol, 2013. 14(7): p. R82. Lilue, J., et al.",
+      "Materials and Methods  Study Design. Female mice of the long-lived F1 hybrid strain C3B10RF1 were fed and maintained as described (7). Briefly, mice were weaned at 28 days, individually housed, given free access to water, and randomly assigned to study groups. Comparisons between five groups of mice were used to determine the effects of aging and CR on gene expression. Control young (7-month-old; n = 3) and old (27-month-old; n = 3) mice were fed 95 kcal of a semipurified control diet (Harlan Teklad, Madison, WI; no. TD94145) per week after weaning. Long-term CR (LT-CR) young (7-month-old; n = 3) and old (27-month-old; n = 3) mice were fed 53 kcal of a semipurified CR diet (Harlan Teklad; no. TD94146) per week after weaning. Short-term CR (ST-CR) mice were 34-month-old control mice that were switched to 80 kcal of CR diet for 2 weeks, followed by 53 kcal for 2 weeks (n = 3). The effects of age on gene expression in control mice were determined by comparison between results from the young control and the old control groups. The effects of LT-CR on gene expression were determined by comparison between results from the young control and the young LT-CR groups, and from the old control and the old LT-CR groups. The effects of ST-CR were determined by comparison between results from the old control and the ST-CR groups. Mice were fasted for 48 h before killing. Mice were killed by cervical dislocation, and the livers were rapidly excised and flash frozen in liquid nitrogen. No signs of pathology were detected in any of the animals used. All animal use protocols were approved by the institutional animal use committee of the University of California, Riverside.",
+      "Accessing data resources in the mouse phenome database for genetic analysis of murine life span and health span. J. Gerontol. A Biol. Sci. Med. Sci. 71 (2), 170\u2013177. Brown, R.E. , Stanford, L., Schellinck, H.M., 2000. Developing standardized behavioral tests for knockout and mutant mice. ILAR J. 41 (3), 163\u2013174. Bubier, J.A. , Jay, J.J., Baker, C.L. , Bergeson, S.E. , Ohno, H., Metten, P., Crabbe, J.C., Chesler, E.J. , 2014. Identi\ufb01cation of a QTL in Mus musculus for alcohol preference, withdrawal, and Ap3m2 expression using integrative functional genomics and precision genetics. Genetics 197 (4), 1377\u20131393. Burn, C.C. , 2008.",
+      "Our own work has taken a different tack: we have attempted to determine whether mutations with differential effects on aging may be present within the many available populations of laboratory-adopted inbred mice. The goal is not so much to clone these genes, if indeed they exist, because positional cloning strategies of this kind require many thousands of animals and would be extremely expensive using an assay, age at death, that is itself so costly. Instead, the goal has been to use gene mapping methods to test hypotheses about aging and to develop new animal models that will be useful for testing well-specified hypotheses about the molecular basis for age-dependent changes. In the absence of a validated battery of biomarkers of aging, we (like most others) have reluctantly decided to use mouse life span as a crude surrogate for aging itself, reasoning that genetic alleles that extend life span well beyond the median for the tested population may be operating via an influence on aging itself. Work conducted using recombinant inbred mouse stocks (Gelman et al., 1988; de Haan and Van Zant, 1999) has suggested that life-span differences between pairs of inbred mouse lines might reflect the influence of as few as 4-7 polymorphic loci, providing some basis for hope that some of these would have an effect large enough to be detected by a genome scan experiment involving 300-1,200 mice.",
+      "The available dataset also provides examples in which genetic variants seem to influence the risk of specific late-life diseases.Figure 8-6, for example, shows longevity results for mice stratified by their inheritance at the 12th chromosome locus D12Mit167.This is a locus associated with differential longevity in both male and female mice, with the strongest effect (adjusted p < 0.01) seen in those mice living more than 657 days (Jackson et al., unpublished results).The longest-lived mice are those that inherit both the C57BL/6 allele from their mother and the C3H allele from their father; on average, they survive 93 days longer than siblings with the BALB plus C3H combination.Figure 8-6 shows that the D12Mit167, like the pair of loci illustrated in Figure 8-5, has significant and similar effects in mice dying of cancer (85 days) and in mice dying of non-neoplastic diseases (126 days).A more detailed analysis of the cancers, however, suggests that while lymphoma and hepatoma victims are equally protected by the favorable alleles (effect sizes of 93 and 167 days, respec-  mice of two subgroups: those dying of the urinary syndrome MUS, and those dying of all other causes.The genetic analysis contrasts mice with both the C57BL/6 allele at D4Mit84 and the C3H allele at D9Mit110 to mice with any of the three other allele combinations.In the males dying of causes other than MUS, this allele pair is associated with a 170-day increment in longevity (post-hoc p < 0.00003).But for males that do die of MUS, the same allele combination is associated with a 187-day decline in mean life span (post-hoc p < 0.03).This effect is thus pleiotropic, in that these alleles accelerate death in mice susceptible to MUS, while postponing death for all other males in the population.Although these loci are associated with differential longevity in mice that do develop MUS, they do not have a significant effect on the chances that MUS will indeed occur (not shown).The risk of 
developing MUS seems to be under control of a separate locus on chromosome 6.As shown in the bottom panel of Figure 8-7, males that inherit the C3H allele at D6Mit268 are far more likely to develop MUS (28 percent risk) than are their brothers who receive the DBA/2 allele at this locus (7 percent risk; p = 0.012 by two-tailed Fisher's exact test).",
+      "Previously, the methylation status of CpG sites within the genes Prima1, Hsf4, Kcns1 was shown to qualify as a reliable predictor of chronological age of B6 mice (ref. 10). This same study also revealed enhanced epigenetic aging of the D2 strain in accordance with its general reduced mean life span, supporting the possibility that the panel might also serve as a marker for the biological age in mice. Applying this B6-trained marker panel to our (congenic) experimental strains, we observed that epigenetic age predictions correlated with chronological age in B6 (R\u00b2 = 0.93) and line A mice (R\u00b2 = 0.89).",
+      "34. Gelman R, Watson A, Bronson R & Yunis E Murine chromosomal regions correlated with longevity. Genetics 118, 693\u2013704 (1988). [PubMed: 3163317] 35. Houtkooper RHet al.The metabolic footprint of aging in mice. Sci. Rep1, (2011). 36. Houtkooper RHet al.Mitonuclear protein imbalance as a conserved longevity mechanism. Nature497, 451\u2013457 (2013). [PubMed: 23698443] 37. Williams EGet al.An Evolutionarily conserved role for the aryl hydrocarbon receptor in the regulation of movement. PLOS Genet. 10, e1004673 (2014). [PubMed: 25255223] 38. Lang DHet al.Quantitative trait loci (QTL) analysis of longevity in C57BL/6J by DBA/2J (BXD) recombinant inbred mice. Aging Clin. Exp. Res. 22, 8\u201319 (2010).",
+      "For females, hairs of the congenic mice grew 31% faster, also highly significant (P = 0.0006, 1-tailed). These results validated the presence of a gene in the differential region affecting FE. Discussion We report the outcomes of a quantitative genetic study on aging and longevity in the mouse. We studied an extant series of recombinant inbred strains (ILSXISS) that have been used both in DR aging studies as well as to study alcohol sensitivity (Williams et al. , 2004).",
+      "FOURTH STEP: MEDICAL TESTING OF CANDIDATE DRUGS  Many genes are common between fruit flies and mammals, but by no means all. Therefore, it is important to test biochemical pathways that work in fruit flies with mammals. Mice are the system of choice, as they have relatively short lifespans (2-3 years) and a great deal is known of their genetics. Mortality rate measurements, like those studied in fruit flies [10], might speed up mouse trials to just 6-12 months. Mouse trials would also help address issues of safety, such as liver and kidney toxicity, before going on to human trials.",
+      "Experimental Procedures  Mouse Breeding, Maintenance, and Longevity. Cdc42GAP \u2212/\u2212 and p53 \u2212/\u2212 mice were generated as previously described (6, 35), and the mice used in the studies were mixed C57BL/6 \u03e9/\u03ea 129/Sv inbred. Littermates of different genotypes were housed and fed freely with standard mouse chow over their life span in a pathogen-free environment and were monitored for vitality and longevity. Mice exhibiting extreme morbidity were euthanized and subjected to necropsy. All animal procedures were approved by the Institutional Animal Care and Use Committee at the Children's Hospital Research Foundation."
+    ],
+    [
+      "Genetic associations for two biological age measures point to distinct aging phenotypes. Aging Cell 20:e13376. DOI: https://doi.org/10.1111/acel.13376, PMID: 34038024 Lang DH, Gerhard GS, Griffith JW, Vogler GP, Vandenbergh DJ, Blizard DA, Stout JT, Lakoski JM, McClearn GE. 2010. Quantitative trait loci (QTL) analysis of longevity in C57BL/6J by DBA/2J (BXD) recombinant inbred mice. Aging Clinical and Experimental Research 22:8\u201319. DOI: https://doi.org/10.1007/BF03324809, PMID: 20305363 Lappalainen T. 2015. Functional genomics bridges the gap between quantitative genetics and molecular biology. Genome Research 25:1427\u20131431.",
+      "Pharmacol Biochem Behav 81, 764\u2013768. Hsu, H.C., Lu, L., Yi, N., Van Zant, G., Williams, R.W. & Mountz, J.D. (2007) Quantitative trait locus (QTL) mapping in aging systems. Methods Mol Biol 371, 321\u2013348. Hurlin, P.J. & Huang, J. (2006) The MAX-interacting transcription factor network. Semin Cancer Biol 16, 265\u2013274. Jones, B.C. , Tarantino, L.M. , Rodriguez, L.A., Reed, C.L. , McClearn, G.E. , Plomin, R. & Erwin, V.G. (1999) Quantitative-trait loci analysis of cocaine-related behaviours and neurochemistry. Pharmacogenetics 9, 607\u2013617. Jones, B.C. , Beard, J.L. , Gibson, J.N. , Unger, E.L., Allen, R.P. , McCarthy, K.A. & Earley, C.J.",
+      "Interestingly, the correlation analysis indicates QTL Mapping in Aging Systems  333  Fig. 5. Basic statistics provided by the WebQTL GeneNetwork website. The strain distribution pattern (SDP) of the quantitative trait is presented in the basic statistics page of WebQTL in the following ways: (A) the raw data of the quantitative trait obtained from each BXD recombinant inbred (RI) strain, (B) data mean and distribution, (C) bar graph showing the mean and variable of each strain, and (D) the normal probability plot of the SDP.",
+      "23 Quantitative Trait Locus (QTL) Mapping in Aging Systems Hui-Chen Hsu, Lu Lu, Nengjun Yi, Gary Van Zant, Robert W. Williams, and John D. Mountz Summary Understanding the genetic basis of the effects of aging on the decline in the immune response is an enormous undertaking. The most prominent age-related change in the immune system is thymic involution. This chapter will focus on the use of C57BL/6J X DBA/2J (BXD) recombinant inbred (RI) strains of mice to map genetic loci associated with age-related thymic involution in mice.",
+      "For further prioritization, we converted the mouse QTL regions to the corresponding syntenic regions in the human genome and retrieved GWAS annotations for these intervals (Buniello et al., 2019). We specifically searched for the traits: epigenetic aging, longevity, age of menarche/menopause/puberty, Alzheimer's disease, and age-related cognitive decline and dementia. This highlighted five genes in Eaa11 and three genes in Eaa19 (Supplementary file 4c). We also identified a GWAS that found associations between variants near Myof-Cyp26a1 and human longevity (Yashin et al., 2018), and a meta-GWAS that found gene-level associations between Nkx2-3 and Cutc, and epigenetic aging (Supplementary file 4c; McCartney et al., 2021).",
+      "Jiang, C. and Zeng, Z. B. (1995). Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140, 1111\u20131127. Jin, W., Riley, R. M., Wolfinger, R. D.et al. (2001). The contributions of sex, genotype and age to transcriptional variance in Drosophila melanogaster. Nat Genet 29, 389\u2013395. Kempermann, G., Chesler, E. J., Lu, L. et al. (2006). Natural variation and genetic covariance in adult hippocampal neurogenesis. Proc Natl Acad Sci U S A 103, 780\u2013785. Kendziorski, C. M., Chen, M., Yuan, M. et al. (2006). Statistical methods for expression quantitative trait loci (eQTL) mapping. Biometrics 62, 19\u201327.",
+      "Hypothesis-free genome-wide approaches have also been undertaken. Genome-wide linkage scans reported evidence for linkage with longevity on chromosome 4q25 (Puca et al., 2001), 3p24-22, 9q31-34, and 12q24 (Boyden & Kunkel, 2010). However, the evidence for these loci is still very weak as the results, obtained in centenarians and their families, could not be replicated in nonagenarian sibling pairs (Beekman et al., 2006) or have yet to be tested in other studies. A meta GWAS of survival to 90 years or older in 1836 cases and 1955 controls did not find any significant genome-wide associations (Newman et al., 2010). Thus far, hypothesis-free approaches have not identified any loci involved in longevity.",
+      "Abiola O, Angel JM, Avner P, Bachmanov AA, Belknap JK, Bennett B, et al. The nature and identification of quantitative trait loci: a community\u2019s view. Nat Rev Genet. Nature Publishing Group; 2003; 4: 911\u2013916. https://doi.org/10.1038/nrg1206 PMID: 14634638  18. Grupe A, Germer S, Usuka J, Aud D, Belknap JK, Klein RF, et al. In silico mapping of complex diseaserelated traits in mice. Science. American Association for the Advancement of Science; 2001; 292: 1915\u20131918. https://doi.org/10.1126/science.1058889 PMID: 11397946  19. Pletcher MT, McClurg P, Batalov S, Su AI, Barnes SW, Lagler E, et al.",
+      "coid levels, etc. The mapping project should thus help to guide the search for human genes that regulate these interesting phenotypes and at the same time spark new investigations, in animal models, for the biochemical differences that mediate the genetic effects we detect. At the same time, the dataset that emerges should also allow us to test more general questions about the nature of aging and its genetic control. We may, for example, be able to identify QTLs that not only retard the development of one or more age-sensitive T-cell subsets, but also retard age-dependent changes in protein conformation, bone matrix turnover, and brain GFAP levels. Such a finding would imply that these changes are influenced, together, by a common biochemical pathway, and the corresponding QTLs would be excellent candidates for genes that regulate aging per se, rather than merely one among the many more age-sensitive traits. In the same way, it will be of particular interest to determine if QTLs that regulate age-sensitive traits also are associated with differences in life span, and conversely if QTLs identified on the basis of longevity effects modify one (or nearly all?) of the age-sensitive traits in our test battery.",
+      "The strategy for mapping such quantitative trait loci (QTL) involves looking for preferential segregation of specific alleles or allele combinations in mice that differ in life span (or, more generally, any age-sensitive trait of interest). Our test population, called UM-HET3, consisted of a group of mice bred as the progeny of females of the (BALB/c \u00d7 C57BL/6)F1 genotype and males of the (C3H/HeJ \u00d7 DBA/2)F1 genotype. Mice bred in this way are, from a genetic perspective, all siblings; each shares a random half of its alleles with every other animal in the UM-HET3 population. The current set of analyses was conducted when genotype and longevity data were available from a group of 110 virgin males and 143 virgin females. The analytical method adjusted, by permutation testing, for Type I errors attributable to the simultaneous evaluation of multiple linkage hypotheses, and also included gender as a covariate to look for instances of sex-specific genetic effects. Because we had particular interest in regulation of late-life diseases rather than in causes of premature death, and because of evidence that genetic influences on mouse longevity were particularly strong when early deaths were not considered (Covelli et al., 1989), we repeated each analysis after exclusion of those animals dying before 657 days of age, i.e., the age at which 20 percent of the animals had already died.",
+      "The proportion of the phenotypic variance accounted for by the QTL yield for Hbact and Hbrear was substantial and of the same order of magnitude as that contributed by age. A small number of age-dependent QTL were found in the midst of a majority of age-stable QTL (see discussion above). These age-sensitive loci point toward genes whose functions are correlated with important behavioral changes during aging.",
+      "Ageing genes and pathways. Assessing the loci of interest for colocalisation with gene expression quantitative trait loci (eQTL), we find strong evidence (FDR SMR < 5%; P HEIDI > 1%; see \"Methods\") of cis-acting eQTL colocalisation for eight out of 10 loci. In total, we highlight 27 unique genes acting across 32 tissues, especially whole blood (12 genes) and the tibial nerve (7 genes) (Supplementary Data 5). In blood, higher expression levels of BCL3 and CKM (near APOE); CTC-510F12.2, ILF3, KANK2 and PDE4A (near LDLR); USP28 and ANKK1 (near ZW10); and CDKN2B are linked to an increase in multivariate ageing traits (i.e. improved survival), while the opposite is true for EXOC3L2 (near APOE), TTC12 (near ZW10), and FOXO3. For the multivariate signal near SLC4A7 we find colocalisation with expression of NEK10 (liver); for the signal near LPA we find colocalisation with expression of SLC22A1/A3 (multiple tissues) and MAP3K4 (pituitary); and for the signal near FGD6 we find colocalisation with expression of FGD6 itself (adipose/arterial). Including trans-acting eQTL from blood, while keeping the same thresholds for colocalisation, we additionally discover that higher expression of FOXO3B colocalises with the life-extending signal near FOXO3. When we include genes which could not be tested for heterogeneity (N eQTL < 3), we identify one additional cis-acting and 49 additional trans-acting genes (of which 10 colocalise with the signal near LINC02513) (Table 2; Supplementary Data 5).",
+      "Quantitative trait loci (QTLs) can be identified in several ways, but is there a definitive test of whether a candidate locus actually corresponds to a specific QTL? Much of the genetic variation that underlies disease susceptibility and morphology is complex and is governed by loci that have quantitative effects on the phenotype. Gene-gene and gene-environment interactions are common and make these loci difficult to analyse. Here, we present a community\u2019s view on the steps that are necessary to identify genetic loci that govern quantitative traits, along with a set of interpretive guidelines.",
+      "3 Quantitative Trait Analysis in the Investigation of Function and Aging of Hematopoietic Stem Cells. Hans-Willem Snoeck. Summary: Extensive genetically determined quantitative variation exists in the number and function of hematopoietic stem cells in inbred mouse strains. Furthermore, aging of hematopoietic stem cells is genetically determined. Gene identification of quantitative trait loci involved in the regulation and aging of hematopoietic stem cells would provide novel insights into regulatory mechanisms that are relevant in vivo and may be clinically important.",
+      "In order to find the causal loci for heritable differences in transcript levels and possible interactions between age and genotype, we applied a two-time-point model. In this model, we used three factors, (1) relative age, (2) genotype (marker), and (3) the interaction between factors 1 and 2, to explain the differences in gene expression between RILs and age groups. With this mapping procedure, we found almost 900 genes that had an eQTL or gxa eQTL in developing and/or aging worms (P < 0.0001; Fig. 2). Almost half of these genes with heritable transcript differences were found to have a genotype-by-age effect (396 at P < 0.0001; Table 1) allocated to a specific marker, which we coined genotype-by-age expression-QTL (gxa eQTL). One specific hotspot (trans-band) for gxa eQTL was found on chromosome IV for aging worms and a trans-band for eQTL on chromosome I was detected in developing worms (Fig. 2).",
+      "We found three significant QTLs (genetic regions harboring genes controlling these various aging traits, Supplementary Table 5). On chromosome 7, we found a QTL affecting lifespan and fertility after DR that we have named Lfdr1 for \u201clongevity and fertility response to dietary restriction\u201d QTL 1; this QTL also has suggestive effects on FE (Fig. 5D). Two QTLs having significant effects on FE were identified on chromosomes 9 and 15. These we have named Fedr1 and Fedr2, respectively, for \u201cfuel efficiency response to dietary restriction\u201d QTLs 1 and 2.",
+      "Quantitative trait locus (QTL) mapping in aging systems. Methods in Molecular Biology (Clifton, NJ). 2007; 371:321\u2013348. Hunter KW, Crawford NPS. The future of mouse QTL mapping to diagnose disease in mice in the age of whole-genome association studies. Annual Review of Genetics. 2008; 42:131\u2013141. Ito R, Robbins TW, Everitt BJ. Differential control over cocaine-seeking behavior by nucleus accumbens core and shell. Nature Neuroscience. 2004; 7:389\u2013397. [PubMed: 15034590] Kapp MB. Ethical and legal issues in research involving human subjects: do you want a piece of me? Journal of Clinical Pathology. 2006; 59:335\u2013339.",
+      "Jiang, C. and Zeng, Z. B. (1995). Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140, 1111\u20131127. Jin, W., Riley, R. M., Wolfinger, R. D. et al. (2001). The contributions of sex, genotype and age to transcriptional variance in Drosophila melanogaster. Nat Genet 29, 389\u2013395. Kempermann, G., Chesler, E. J., Lu, L. et al. (2006). Natural variation and genetic covariance in adult hippocampal neurogenesis. Proc Natl Acad Sci U S A 103, 780\u2013785. Kendziorski, C. M., Chen, M., Yuan, M. et al. (2006). Statistical methods for expression quantitative trait loci (eQTL) mapping. Biometrics 62, 19\u201327."
+    ],
+    [
+      "Introduction  With the development of human genomics research, a large number of studies of the genetics of longevity have been conducted. Scientists from various countries have proposed many different theories concerning the mechanisms of aging from different perspectives, involving oxidative stress, energy metabolism, signal transduction pathways, immune response, etc. [1,2]. These mechanisms interact with each other and are influenced by heredity to some degree [2,3]. The identification of longevity-related biological markers is critical to an in-depth understanding of the mechanisms of carrier protection against common disease and/or of the retardation of the process of aging.",
+      "INTRODUCTION  Human aging is affected by genes, life style, and environmental factors. The genetic contribution to average human aging can be modest with genes explaining \u223c20-25% of the variability of human survival to the mid-eighties (Herskind et al., 1996; Fraser and Shavlik, 2001). By contrast, genetic factors may have greater impact on survival to the ninth through eleventh decades (Tan et al., 2008). Notably, exceptional longevity is rare and may involve biological mechanisms that differ from those implicated in usual human aging.",
+      "Introduction  Geroscience refers to research aimed at understanding the mechanisms of biological aging (Kennedy et al. 2014). A major goal of geroscience is to define the genetic, epigenetic, and environmental features that determine individual rates of aging. From a translational perspective, a further goal is to use this knowledge to develop interventions that can slow or delay aging in order to promote healthy longevity and increase healthspan, the period of life spent in good health free from chronic disease and disability (Burch et al. 2014; Pitt and Kaeberlein 2015).",
+      "the maximum human life span.Several avenues to studying aging have placed us on Department of Biology Massachusetts Institute of Technology the threshold of understanding basic underlying mechanisms.These approaches include the identification of Cambridge, Massachusetts 02139 key genes and pathways important in aging; genetic studies of heritable diseases that cause the appearance of premature aging in affected people; physiological ex-Introduction periments that relate the pace of aging to caloric intake; Is aging the final act in the script of developmental bioland advances in human genetics, as well as cell and ogy?The characteristic changes that are part and parcel molecular biology leading to an understanding of the of aging appear similar to developmentally regulated basis of many diseases of aging.Strikingly, single gene programs.But why would aging mechanisms have been mutations have been found to significantly extend the evolutionarily selected as advantageous?Indeed, evolife span in C. 
elegans, yeast, and, most recently, Drolutionary biologists might argue that aging occurs by sophila, suggesting that aging may be relatively simple, default due to the absence of selection in the postreproat least in these organisms.Further, the limited replicaductive phase of life.By this view, the aging process is tion potential of human cells in culture has been attribnot programmed, but, rather, the detritus of the absence uted to a specific mechanism (i.e., the shortening of of selection for maintenance (Medawar, 1952; Kirkwood, telomeric ends of chromosomes).An important chal- 1977).However, it is quite reasonable that any mechalenge is now to relate these recent findings to the more nisms that sprang up to slow or regulate the pace of complex case of human aging.aging would be selected, because lucky individualsIn this review, we will discuss several important mocould potentially give rise to more progeny.Therefore, lecular models of aging that come from current research.it is reasonable to suppose that life span extending pro-These are damage by reactive oxygen species (ROS) cesses have been selected and that these can be viewed generated by metabolism, genome instability, genetias an elaboration of development itself.In principle, cally programmed extension mechanisms, cell death, such extension mechanisms may act to slow or forestall and systemic aging.Questions to be posed include the deleterious changes in an organism that progressively following.What evidence exists for and against these lead to death.The life span of an organism, therefore, models?Can more than one of these models apply to is the sum of deleterious changes and counteracting aging of different tissues in humans-specifically do repair and maintenance mechanisms that respond to organs with continually dividing cells age by the same the damage (Figure 1).mechanism as organs that are postmitotic?Finally, is A priori, one imagines such longevity mechanisms to aging amenable to therapeutic 
intervention, and would be much less complex than those regulating embryonic such intervention be advisable?development.The spatial and temporal constraints on embryonic development are many, while requirements Oxidative Damage for longevity mechanisms might be much more specific One theory of aging proposes that ROS which are generif there were a single process (or a few processes) whose ated by metabolism cause cumulative damage over a breakdown is the limiting event in longevity (i.e., the lifetime (Harman, 1981).Roughly two to three percent Achilles heel).of oxygen taken up is chemically reduced by the addition Aging is defined when two criteria are met.First, the of single electrons, which are sequentially converted probability of death at any point in time increases with into ROS, including the superoxide anion, hydrogen perthe age of the organism.This statistical definition applies oxide, and the hydroxyl radical.ROS have been shown from yeast to mammals and reflects the progressive to cause molecular damage relatively indiscriminately nature of aging.Second, characteristic changes in pheto proteins, lipids, and nucleic acids.In addition, specific notype occur in all individuals over time due to the limdamage has been observed in the mitochondrial DNA, iting processes.which we consider below in Genome Instability.The phenotypic definition is equally general and is What is the evidence that oxidative damage causes useful in distinguishing the aging process itself from aging?One category of study that is supportive of this diseases of aging, such as cancer and heart disease.view involves animals transgenic for genes encoding Phenotypes of aging affect all of the individuals in a antioxidants.Transgenic Drosophila overexpressing both population, while diseases of aging affect only a subset.Cu/Zn SOD and catalase live 34% longer than controls Both impact on life span, but in different ways.For exam-(Orr and Sohal, 1994).A more recent study shows that ple, the 
many advances in medicine and public health expression of human SOD1 exclusively in Drosophila in this century have caused a large increase in the averadult motor neurons leads to a 40% extension in life age life span of humans in developed countries.Howspan (Parkes et al., 1998).Further experiments are necever, because these advances have not altered the aging essary to clarify the nature of this primary role of motor neurons in life span.Conversely, mice knocked out for either GPX1 (encoding glutathione peroxidase), SOD1,",
+      "the maximum human life span.Several avenues to studying aging have placed us on Department of Biology Massachusetts Institute of Technology the threshold of understanding basic underlying mechanisms.These approaches include the identification of Cambridge, Massachusetts 02139 key genes and pathways important in aging; genetic studies of heritable diseases that cause the appearance of premature aging in affected people; physiological ex-Introduction periments that relate the pace of aging to caloric intake; Is aging the final act in the script of developmental bioland advances in human genetics, as well as cell and ogy?The characteristic changes that are part and parcel molecular biology leading to an understanding of the of aging appear similar to developmentally regulated basis of many diseases of aging.Strikingly, single gene programs.But why would aging mechanisms have been mutations have been found to significantly extend the evolutionarily selected as advantageous?Indeed, evolife span in C. elegans, yeast, and, most recently, Drolutionary biologists might argue that aging occurs by sophila, suggesting that aging may be relatively simple, default due to the absence of selection in the postreproat least in these organisms.Further, the limited replicaductive phase of life.By this view, the aging process is tion potential of human cells in culture has been attribnot programmed, but, rather, the detritus of the absence uted to a specific mechanism (i.e., the shortening of of selection for maintenance (Medawar, 1952; Kirkwood, telomeric ends of chromosomes).An important chal- 1977).However, it is quite reasonable that any mechalenge is now to relate these recent findings to the more nisms that sprang up to slow or regulate the pace of complex case of human aging.aging would be selected, because lucky individuals",
+      "Currently prevailing studies of genetic and biological origin of human health and longevity follow largely two approaches which focus on the aging-related diseases and on individuals with exceptionally long lives (Martin et al. 2007). This study provides de facto the rationale for a new approach. Specifically, Fig. 2 suggests that a promising strategy could be to focus on individuals who died prematurely. Studies of genetic profiles of short-lived subjects compared to those who aged more successfully (i.e., those who lived longer and perhaps healthier lives) can be a core of this strategy. Importantly, this strategy can be naturally implemented in longitudinal studies of aging and longevity by focusing on individuals who died first.",
+      "The average human life expectancy has been increasing for centuries [1]. Based on twin studies, the heritability of human lifespan has been estimated to be ~25%, although this estimate differs among studies [2]. On the other hand, the heritability of lifespan based on the correlation of the mid-parent (i.e., the average of the father and mother) and offspring difference between age at death and expected lifespan was estimated to be 12% [3]. A recent study has indicated that the different heritability estimates may be inflated due to assortative mating, leaving a true heritability that is below 10% [4]. The heritability of lifespan, estimated using the sibling relative risk, increases with age [5] and is assumed to be enriched in long-lived families, particularly when belonging to the 10% longest-lived of their generation [6]. To identify genetic associations with human lifespan, several genome-wide association (GWA) studies have been performed [7-20]. These studies have used a discrete (i.e., older cases versus younger controls) or a continuous phenotype (such as age at death of individuals or their parents). The selection of cases for the studies using a discrete longevity phenotype has been based on the survival to ages above 90 or 100 years or belonging to the top 10% or 1% of survivors in a population. Studies defining cases using a discrete longevity phenotype often need to rely on controls from more contemporary birth cohorts, because all others from the case birth cohorts have died before sample collection. Previous GWA studies have identified several genetic variants, but the only locus that has shown genome-wide significance (P \u2264 5 \u00d7 10^\u22128) in multiple independent meta-analyses of GWA studies is apolipoprotein E (APOE) [21], where the ApoE \u03b54 variant is associated with lower odds of being a long-lived case.",
+      "Introduction  Worldwide human populations have shown an increase in mean life expectancy in the past two centuries (Oeppen & Vaupel, 2002). This is mainly because of environmental factors such as improved hygiene, nutrition, and health care. The large variation in healthy lifespan among the elderly has prompted research into the determinants of aging and lifespan regulation. The genetic contribution to human lifespan variation was estimated at 25-30% in twin studies (Gudmundsson et al., 2000; Skytthe et al., 2003; Hjelmborg et al., 2006). The most prominent genetic influence is observed in families in which the capacity to attain a long lifespan clusters (Perls et al., 2000; Schoenmaker et al., 2006). Exceptional longevity can be reached with a low degree of age-related disability (Christensen et al., 2008; Terry et al., 2008), raising the question whether protective mechanisms against disease exist in long-lived subjects.",
+      "Introduction  Human life expectancies are increasing almost everywhere in the world where socio-economic circumstances are permissive (Tuljapurkar et al., 2000) and there is no evidence that a limit to life is anywhere near (Oeppen and Vaupel, 2002). While this increase in life span would prevent a proposed compression of morbidity (Fries, 1980), there is no evidence that higher average life spans are associated with an extension of the period of increased morbidity (Manton and Gu, 2001). On the contrary, older individuals have never been so healthy and further improvements in life style, environmental conditions and medical care are likely to help this trend to continue. Especially the medical sciences now seem poised to push the biological limits of longevity further by a number of innovations that seem to affect basic mechanisms of ageing and disease rather than merely alleviating its symptoms. While in the past medicine contributed mainly to public health advances by reducing infectious diseases, thereby helping infant mortality to decline, more recent developments hold promise for a more basic intervention in the processes that underlie age-related decline. An example is atherosclerosis, a common problem in ageing and, along with hypertension, the cause of most cardiovascular disease. Basic medical research has likely contributed significantly to the current dramatic decline in cardiovascular disease by actively intervening in some of its main risk factors, i.e., lipid levels and hypertension (Levi et al., 2002). However, one could question whether age-related diseases should be seen as separate from ageing. In this respect, ageing has been considered as a process of cellular degeneration and death universal to all or most species, increasing the risk of fatal disease in humans and other mammals. Would it be possible to define such a process and ultimately understand it in terms of the time-dependent, coordinated action of the products of multiple genes interacting with the environment? If so, then ageing per se rather than the diseases associated with it, may offer a more logical starting point for further increasing healthy life expectancies through prevention and therapy. This is especially true now that we have a working draft of the human genome and are in a position to determine the functional significance of each gene as part of the dynamic network of all genes that ultimately determine the physiology of an organism. Termed 'Functional Genomics', this new discipline is now often called upon to solve the complex problems in biology, such as to understand functional control mechanisms and investigate the role that genotype and environment play in determining disease phenotypes. The question is then if this same approach would apply to ageing as a complex phenotype. What is ageing, how does it differ from its diametrical opposite, i.e., organismal development, and what role can functional genomics play in unraveling the basic causes of ageing and exploit such knowledge for developing new, rational strategies for extending healthy life span?",
+      "Introduction  As a result of improvements in health care and living conditions over the past two centuries, the average human life expectancy has dramatically increased in many regions of the world [1]. This major success reflects the great malleability of the ageing process. Unfortunately, for most people, ageing is accompanied with an increased risk of developing age-related illnesses/disabilities and frailty. Therefore new approaches are required to understand the genetic, cellular, and molecular factors controlling ageing to identify strategies to extend healthy life span.",
+      "The search for the genetic determinants of extreme human longevity has been challenged by the phenotype's rarity and its nonspecific definition by investigators. To address these issues, we established a consortium of four studies of extreme longevity that contributed 2,070 individuals who survived to the oldest one percentile of survival for the 1900 U.S. birth year cohort. We conducted various analyses to discover longevity-associated variants (LAV) and characterized those LAVs that differentiate survival to extreme age at death (eSAVs) from those LAVs that become more frequent in centenarians because of mortality selection (eg, survival to younger years). The analyses identified new rare variants in chromosomes 4 and 7 associated with extreme survival and with reduced risk for cardiovascular disease and Alzheimer's disease. The results confirm the importance of studying truly rare survival to discover those combinations of common and rare variants associated with extreme longevity and longer health span.",
+      "The search for the genetic determinants of extreme human longevity has been challenged by the phenotype's rarity and its nonspecific definition by investigators. To address these issues, we established a consortium of four studies of extreme longevity that contributed 2,070 individuals who survived to the oldest one percentile of survival for the 1900 U.S. birth year cohort. We conducted various analyses to discover longevity-associated variants (LAV) and characterized those LAVs that differentiate survival to extreme age at death (eSAVs) from those LAVs that become more frequent in centenarians because of mortality selection (eg, survival to younger years). The analyses identified new rare variants in chromosomes 4 and 7 associated with extreme survival and with reduced risk for cardiovascular disease and Alzheimer's disease. The results confirm the importance of studying truly rare survival to discover those combinations of common and rare variants associated with extreme longevity and longer health span.",
+      "Introduction  The recent, remarkable extension of life expectancy is largely attributed to the postponement of mortality at old age (Vaupel, 1997, 2010). The years of life gained in the older population residing in developed nations are a success story of public health measures and improved health care. In addition to such external factors, longevity and healthy aging consistently show a modest heritability between 20% and 50% and aging-associated genetic research may provide further insights into the mechanisms of aging (Herskind et al., 1996; McGue et al., 1993; Reed and Dick, 2003). It has been postulated that genes involved in pathways associated with aging identified in animal models, such as insulin-like growth factor (IGF)-insulin signaling, regulation of lipoprotein metabolism, the mTOR pathway, and the oxidative stress response may also influence survival to old or even exceptionally old age in humans (Christensen et al., 2006; Kenyon, 2010; Vellai et al., 2003). However, in humans, common variants within genes involved in these pathways have not been consistently associated with lifespan (Christensen et al., 2006; Kenyon, 2010; Kuningas et al., 2008; Vijg and Suh, 2005).",
+      "DESIGNS TO STUDY PARAMETERS OF HEALTHY AGEING, MORBIDITY, MORTALITY AND LONGEVITY  Human cohorts may vary considerably in their morbidity, mortality and longevity characteristics and yet they have shown a common increase in mean life expectancy in the past two centuries [5]. This is mainly due to improved hygiene, nutrition and healthcare. There is a large variation in healthy lifespan among the elderly and remarkably exceptional longevity (EL) can be reached with a low degree of age-related disability [6,7]. Heritability studies comparing the concordance of lifespan in monozygous and dizygous twins estimated a 25-30% genetic contribution to human lifespan variation [8-11], which becomes increasingly important at higher ages. The most prominent genetic influence is present in families in which survival to high ages clusters [12,13]. Unlike model systems where single-gene mutations have major life extension effects, human longevity is presumed to be a complex trait [14].",
+      "INTRODUCTION  Genomic studies into human longevity are inspired by the fact that, in animal models, healthy lifespan has proved to be remarkably plastic, and major pathways of lifespan regulation have been identified. Considerable lifespan extension has been induced in models as diverse as yeast, worms, fish, flies and rodents by applying genetic manipulation and dietary restriction (DR) (see [1] for review). Reduced activity of nutrient-sensing pathways such as insulin/insulin-like growth factor (IGF-1) signalling (IIS) and target of rapamycin (TOR) signalling mediated lifespan extension, and also the extension of lifespan by DR [2]. An interesting observation from the perspective of human ageing is that, in rodents and monkeys, diets restricted in glucose, fat or protein uptake reduced or delayed the risk of cancer and metabolic disease, thus extending the healthspan of the animals [2]. Following the discovery of genes and pathways involved in animal lifespan extension, human research has focused on the corresponding candidate human genes with genetic, genomic and epigenetic studies into ageing and longevity. The designs of these studies differ with respect to the selection of naturally occurring phenotypes and the study populations, which include population-based, patient-based, family-based and exposure-based cohorts.",
+      "GENETIC STUDIES OF HUMAN LONGEVITY  Genetic and genomic studies into longevity have been performed based on a hypothesis, referred to as a candidate gene approach. Alternatively, explorative genome-wide analyses have been applied in which genetic variation and gene transcription across the complete genome are being studied for associations with longevity and related traits. Genetic studies into human disease and longevity include candidate gene approaches, genome-wide association studies (GWASs) and genome-wide linkage studies.",
+      "ANALYSIS OF HUMAN VARIATION IN THE GENETIC CONTROL OF LONGEVITY  Heritability studies have convincingly demonstrated that at least some fraction of human lifespan is heritable. In tandem, large-scale genome-wide association studies (GWAS) have identified numerous loci associated with age-related traits (Buniello et al., 2019). While genetic studies have functionally shown an inverse effect of multiple age-related, disease-associated variants on lifespan regulation, the number of well-replicated longevity-conferring variants remains limited to variants in APOE (ApoE \u03b52), and more recently, CDKN2A/B and IL6 (see Table 1). To date, studies in humans have been hampered by the specific phenotype definitions used, sample sizes of the extreme phenotypes, and modest heritability of the longevity-related traits (Breitbach et al., 2019). This is due to the complex interplay of biological and social factors involved in human aging, as well as the limited power of GWAS, which require sampling thousands of subjects to achieve statistical significance (Breitbach et al., 2019). Genetic studies of aging have also been hindered by an inconsistent use of definitions of aging (reviewed in Baghdadi et al., 2020). The two main ways of conducting research on the genetics of longevity in human populations are by studying (i) the lifespan (continuous trait, years lived) and (ii) the longevity (dichotomous trait, i.e., being among the longest-lived individuals within a specific population). These complexities have limited the resolution and capability of broad association studies of human longevity. Importantly, these genomic analyses focus on a shift of survival in a population; these variables may be genetically distinct from the mechanisms establishing potential for longevity overall (Figure 1A). We argue that an understanding of this shift in lifespan as well as genetic mechanisms of regulating a species specific 'set points' (Figure 1B) will aid in the conceptual distinction of aging and longevity in humans.",
+      "Introduction  Human longevity is influenced by multiple genetic and environmental factors. Approximately 25-32% of the overall variation in adult lifespan is because of genetic variation that becomes particularly important for survival at advanced age (Hjelmborg et al., 2006). Epidemiological studies have revealed that long-lived individuals (LLI), that is, people surviving to the 95th percentile of the respective birth cohort-specific age distributions (Gudmundsson et al., 2000), frequently show a favorable ('healthy') course of the aging process, with the absence or a delayed onset of age-related diseases (Hitt et al., 1999). Hence, the LLI offer the key to elucidate the molecular mechanisms underlying the 'healthy aging' phenotype (Perls, 2006).",
+      "Conclusions and prospects  Over the past two decades the human aging field has built up the necessary resources to study the biology of aging and longevity by establishing human populations with a diversity of designs. Meta-analyses integrating genetic and phenotypic datasets have successfully identified variants associated with a range of age-related traits and diseases. Despite these accomplishments, the number of novel leads contributing to human lifespan regulation is limited. Although positive regions of linkage and suggestive GWAS hits have been reported, the field has not yet identified the loci that explain the clustering of longevity in families and the variation in biological aging rate in the population. As for animal models, down-signaling of the IIS and mTOR pathway appeared to be relevant in humans. These findings are being followed up by molecular and physiological profiling using skin, fat and muscle tissue of long-lived family members and controls. Human studies now also include the response of nutrient sensing systems to the application of dietary and physical challenges.",
+      "Human lifespan variation is mainly determined by environmental factors, whereas the genetic contribution is 25-30% and expected to be polygenic. Two complementary fields go hand in hand in order to unravel the mechanisms of biological aging: genomic and biomarker research. Explorative and candidate gene studies of the human genome by genetic, transcriptomic, and epigenomic approaches have resulted in the identification of a limited number of interesting positive linkage regions, genes, and pathways that contribute to lifespan variation. The possibilities to further exploit these findings are rapidly increasing through the use of novel technologies, such as next-generation sequencing. Genomic research is progressively being integrated with biomarker studies on aging, including the application of (noninvasive) deep phenotyping and omics data - generated using novel technologies - in a wealth of studies in human populations. Hence, these studies may assist in obtaining a more holistic perspective on the role of the genome in aging and lifespan regulation."
+    ]
+  ]
+}
diff --git a/gnqa/paper1_eval/src/data/datasets/old/aging2_dataset.json b/gnqa/paper1_eval/src/data/datasets/old/aging2_dataset.json
new file mode 100644
index 00000000..b5605903
--- /dev/null
+++ b/gnqa/paper1_eval/src/data/datasets/old/aging2_dataset.json
@@ -0,0 +1,128 @@
+{
+  "question": [
+    "List and describe the grand challenges related to aging research in genetics and genomics.",
+    "How do gene-by-drug effects (GxD) modulate life expectancies and health?",
+    "What loci and gene variants modulate normal variation in lifespan, and in which cases is this sex-specific (GxS)?",
+    "what is the influence of rapamycin and aging on the in vivo physiological and metabolic states and cellular bioenergetic, and how do these relate to the rapamycin-induced health and lifespan modification for that genometype?",
+    "Describe the genotypes related to alzheimers and dementia which have commonalities with those for aging?"
+  ],
+  "answer": [
+    "The grand challenges related to aging research in genetics and genomics include:  1. Lack of awareness and interest: Health professionals, governments, and researchers often lack awareness and interest in genetic diseases, which hinders progress.  2. Limited funding and access to technology: Research funding is often limited, and access to high-end technology is not always available, which restricts the scope of research.  3. Economic and health disparities: These disparities can affect the application and benefits of genomic medicine, making it less accessible to those in lower income brackets.  4. Ethical, legal, and social issues: The interpretation, storage, and sharing of genetic data raise ethical and legal concerns, including issues of privacy and consent.  5. Rapid advancements: The fast pace of developments in genetics and genomics can outpace the ability to address emerging ethical, legal, and social questions.  6. Public knowledge and trust: There is a need to increase public knowledge about genomic research and address issues of trust in medicine and scientific research.  7. Integration of genomics into public health: This is a major future challenge for healthcare systems, requiring dialogue with all stakeholders in society.",
+    "Gene-by-drug effects (GxD) can modulate life expectancies and health by influencing how individuals respond to medications. Genetic variations can affect how drugs are absorbed, disseminated, metabolized, and excreted in the body, which can lead to differences in drug efficacy and the occurrence of adverse drug reactions. For example, a genetic variant in the HMG-CoA reductase gene can modify the LDL-C response to pravastatin. Additionally, certain genes are involved in DNA damage repair and oxidative stress, which can influence susceptibility to adverse drug effects. Therefore, understanding these genetic variations can help in personalizing treatment and potentially improving health outcomes.",
+    "The study identified 10 genomic loci which influence healthspan, parental lifespan, and longevity. These include loci near FOXO3, SLC4A7, LINC02513, ZW10, and FGD6. The life-extending variant near FOXO3 is associated with a delay in the age at menarche, suggesting a possible sex-specific effect. However, the study did not perform sex-stratified analyses, so it's unclear if there are other sex-specific longevity-related genetic variants.",
+    "Rapamycin, an mTOR inhibitor, has been shown to extend the lifespan of various organisms, including mice, by delaying or reversing aging in multiple cell types. It does this by regulating rates of protein synthesis and energy utilization, which are processes associated with aging. In the hematopoietic system, rapamycin limits age-related increases in stem cells and biomarkers of aging, enhancing the performance of these cells. However, rapamycin also has potential unwanted metabolic effects, such as insulin resistance and glucose intolerance, though these effects are controversial and potentially reversible. The mTOR pathway, which rapamycin targets, integrates signals from insulin, cytokines, nutrients, oxygen, and mitogenic stimuli, and its regulation has implications for longevity and against the negative effects of aging. Rapamycin also induces autophagy, a process important for cellular homeostasis and damage prevention. Despite these benefits, the exact mechanisms by which rapamycin extends lifespan and whether it delays aging or affects specific diseases remain unclear.",
+    "The genotypes related to Alzheimer's and dementia that have commonalities with those for aging include a locus on chromosome 17 associated with the age at onset, and a specific variant in CCL11. Other genes associated with Alzheimer's include APP, PSEN1, PSEN2, and APOE. These genes are also associated with early-onset Alzheimer's disease. The APOE gene is the strongest genetic risk factor for later onset Alzheimer's. The heritability of late-onset Alzheimer's disease (LOAD) is estimated to be ~60-80%, suggesting a large proportion of individual differences in LOAD risk is driven by genetics."
+  ],
+  "contexts": [
+    [
+      "There is a great need for continuing efforts to increase public knowledge about genomic research.As individuals and communities from diverse social backgrounds become more aware of genomic research and the potential role of genetics in contributing to health outcomes, the public will hopefully be more informed about the implications of genomic research for personal medical care, public health and more broadly the public representation of diverse population groups based on genetic findings.This knowledge should reinforce the ability of potential participants to make informed choices about joining a genetic study.There are complicated issues underlying public trust in medicine as well as scientific and genetic research that must be addressed.Innovative strategies for public education and community engagement should take into account cultural settings and historical experiences that have contributed to distrust in the past.",
+      "The issues discussed in this section refl ect key current concerns, but, given the rapid advances in genetic and genomic research, new issues will continue to confront families in the next few years.For example, major advances in the developing area of neuropsychiatric genetics, studies of the heritable nature of psychiatric and other nervous system disorders, characterized at the molecular, cellular, or behavioral levels, will challenge family members to address the potential role genes play in the development of schizophrenia, bipolar, or affective disorders (Genomics Network, n.d.).",
+      "Future Implications and Communication Research Directions  Given ever-expanding research on genetics and genomics, scholars interested in family interaction will be challenged to stay abreast of the implications for family disclosure and discussion of genetic health.We believe that the following issues will emerge as key concerns:",
+      "Conclusion  After more than four decades of working, genetics and genomic medicine still faces a considerable challenge to be addressed.Lack of awareness of health professionals and government, lack of interest of researcher on genetic diseases, limited research funding, limited access to high technology, low national health budget and low income family are seem to be the main obstacles to be overcome in implementation of genetics and genomic medicine.Despite these conditions, several research centers still managed to do some studies and few numbers of genetic testing.Several collaborations with countries abroad have been done to overcome some obstacles.Yet, Indonesia still has to accelerate this effort to be able to catch up its lag.Mentoring and collaborations are needed to enable Indonesia in doing so.",
+      "Opportunities for Population-Based Research on Aging Human Subjects:  Pathology and Genetics",
+      "Concluding remarks  The next decade will provide a window of opportunity to prepare health professionals, public health practitioners, the public and policy makers for the advent of genomics on health and health care.This will be a doable project but will require regional, national, European and global coordination on both the vertical and horizontal levels.We argue that there is an ethical obligation to prepare society to meet this challenge and to take up the opportunities provided by the science in a medically useful, effective, efficient, socially desirable and ethically justifiable manner.Here, health literacy, health communication and empowerment in managing risks are key for opening the doors to a truly beneficial Public Health Genomics practice.This can be facilitated by implementing ethical benchmarks and legal safeguards 70 such as respect for autonomy and social justice in the context of policy development.",
+      "Clarifying the general conditions under which genomic knowledge can be put to best practice in the field of public health, paying particular consideration to the ethical, legal and social implications 12,17,35 is currently the most pressing task in Public Health Genomics.Aiming the application of genetic and molecular science to the promotion of health and disease prevention through the organised efforts of society, integral to its activities is a dialogue with all stakeholders in society, including industry, governments, health professionals and the general public. 18Thus, the integration of genomics into public health research, policy and practice is one of the major future challenges for our health-care systems. 36,37Expertise is already feasible and can be clustered and evaluated for a socially accountable use.",
+      "Public health needs to prepare itself for the upcoming challenges, which derive from genomics.In this sense, it needs to strengthen the communication efforts among all sciences involved.Public health can serve as the umbrella, that spans the disciplines such as genetics, ethics, law and all other stakeholders.",
+      "Economic and health disparities related to genetics and genomics.",
+      "Capabilities and limitations of current genetic/genomic technologies.",
+      "Identify ethical, legal, and social issues associated with genetic/genomic information.",
+      "Ongoing research contributing to improved understanding of the genetic/genomic influences on health.",
+      "Economic and health disparities related to genetics and genomics. Integrate knowledge from psychology, history, politics, sociology and culture when delivering genetic and genomic care.",
+      "Ethical and legal issues surrounding genetic and genomic information and services.",
+      "Developments in genetics and genomics occur very rapidly and bring with them new ethical, legal and social questions that need swift, sensible and responsible responses (Pepper, 2011).Examples include next-generation sequencing, genetic cohort studies and biobanks, which have raised questions about data management, including quality of interpretation of data, data storage, data sharing, consent for re-use of data, as well as concerns about identifiability and privacy interests of those who provide samples (Kaye, 2012;Wolf, 2013;Pinxten and Howard, 2014).However, the rapidity of advancement poses difficulties for those who must determine the responses to these questions.They are often slow or even overtaken by further advancements.Ethical, legal and social-related challenges should be prioritised for policymakers, researchers, clinicians and public health practitioners to maximise the benefits of genomic and genetic applications while minimising the risk of harm to people (Geller et al., 2014).Any education strategy developed should therefore be dynamic.",
+      "Query 2. Perceptions of Genetics and Genomics  Awareness of Genetic and Genomic Advancements.",
+      "In addition, 4 scholarly commentaries in this issue provide insights into several current practical issues and developments in genetics and genomics.Feero and colleagues 11 describe advances in genomics science and explore many of the issues surrounding translation of these advances to routine \"personalized\" patient care.Offit 12 discusses the increasing availability of direct-to-consumer marketing of genomic and genetic testing and sounds an appropriately cautionary note about the need for standards, quality control, and appropriate regulation.Uhlmann and Guttmacher 13 present a useful collection of practical Internet genetics resources for clinicians and patients, including genetics information on specific diseases; guidelines for genetic testing; and educational resources to help clinicians integrate genetics into patient care.Ginsberg and colleagues 14 discuss the importance of centralized biorepositories for genetics and genomics research and empha-size the need to develop and implement standards for informed consent, informatics, and governance.",
+      "Key Themes Relevant To Genomic Research . . . . . . . . . . . . . . . . . . . . . . . . . . 3",
+      "A first step is to define the challenges that stand in the way of realizing the promise of genomic medicine.These include addressing gaps in the oversight of genetic testing (including regulation of companies providing test interpretation services), ensuring that realistic claims are made in promotional materials for genetic testing, determining the appropriate role of new genomic technologies in patient care, ensuring the privacy of patients' genomic data, and improving insurance coverage and reimbursement for genetic services.The Secretary's Advisory Committee on Genetics, Health, and Society (SACGHS), on which two of us serve, advises the secretary of health and human services and reports on these issues.",
+      "How can we maximize the benefits of these new developments and minimize the harms?How can we encourage patients' involvement and autonomy yet establish appropriate safeguards while avoiding inappropriate paternalism?How do we promote Preparing for a Consumer-Driven Genomic Age the understanding that interpretations of genomic information may evolve as research unravels the meaning of gene-gene and gene-environment interactions and the roles of noncoding DNA sequences, copy-number variants, epigenetic mechanisms, and behavioral factors in health and disease?"
+    ],
+    [
+      "A supervised (pathway driven) approach was used to specifically query three general gene ontology (GO) areas of interest, namely xenobiotic metabolism, DNA damage repair, and oxidative stress-related genes (Table 1).These gene categories are hypothesized to play important roles in sex-and age-related susceptibility to adverse drug effects [18,30].Of the 122 genes included in the xenobiotic metabolism gene list in the Ingenuity Knowledge Base, 61 were differentially expressed.These included Cyp2d4, the rat ortholog of human gene CYP2D6, which is speculated to metabolize up to 25% of commonly prescribed drugs [31].Genes involved in DNA Damage Repair, derived from Ingenuity, were combined with the list by Wood et al. [32] to give 222 genes involved in DNA damage repair.Sixty-five of these genes (approximately 25%) were found to be differentially expressed in the liver.Oxidative Stress genes were defined by 68 genes included in \"response to oxidative stress\" (IPA) of which 23 genes were differentially expressed (Table 1).",
+      "Pharmacogenomics has advanced the field of drug-response assessment.For example, the first experiences with guiding vitamin K antagonist therapy with the aid of CYP2C9 (cytochrome P450, family 2, subfamily C, polypeptide 9) or VKORC1 (vitamin K epox- ide reductase complex, subunit 1) polymorphisms (93 ), and the use of cytochrome P450 polymorphisms for assessing clopidogrel response have entered US Food and Drug Administration recommendations (94 ).Disease prevention lags behind.Gene chips and modern sequencing approaches that allow largescale interrogation of the genome at the population level will generate novel hypotheses of disease causation.Furthermore, with the continuing drop in the costs of whole-genome sequencing, the practicing physician may soon be faced with having to comment on the disease risks of a patient's \u03fe4 \u03eb 10 6 sequence variants before any clinical signs occur, a task that no certified genetic counselor could fulfill at present.With advent of GWASs, ethical and practical concerns of reporting genetic research results have become apparent.Initial efforts at defining rules of reporting large-scale association results and assessing the level of evidence also apply to nextgeneration large-scale genomics (95,96 ).Reports have suggested that on the consumer side, genomewide genetic profiling of employees of health and technology companies does not change anxiety symptoms, dietary fat intake, or exercise behavior (i.e., lifestyle factors) over a 6-month period (97 ); however, the association of genetic variation with risk and the dissection of objective markers of risk and risk factors that reside in the causal pathways of disease will need careful assessment before these approaches can enter clinical decision making (98 ).A data set containing 80 genes associated with coronary heart disease in GWASs was uploaded and overlaid onto the molecular networks developed from information contained in the Ingenuity Knowledge Base.Networks of Network Eligible Molecules were then algorithmically generated on the basis of their connectivity.The most substantially enriched network, as shown, comprises 36 genes, of which 20 are coronary heart disease genes.",
+      "19.3.1 An environmental or pharmacogenetic basis for drug efficacy and ADR? Before getting into the complexities of PGx, it is important to recognize that many non-genetic factors also influence the efficacy of medications, including the patient\u2019s age, sex and general health, but also environmental factors, such as concomitant therapies, drug interactions and diet. To give a seemingly innocuous example, grapefruit juice is an inhibitor of intestinal cytochrome P-450 3A4, which is responsible for the first-pass metabolism of many medications.",
+      "Finally, it is possible that other molecules (or drugs) might modulate the biological context within which the drug\u2013 target interaction takes place. Variation in any of the elements that control these types of processes can lead to variability in drug action, which might well confound the search for causative genes among the usual ADME and target-related candidates. 19.3 PHARMACOGENETICS (PGx)  519  19.3.5 Using bioinformatics to gain understanding of adverse drug reaction (ADR) One of the biggest concerns during the development of any medication is the possibility of unintended consequences in the patient.",
+      "19.3 Pharmacogenetics (PGx) It is well known that after exposure to a drug, almost any given cohort of patients show a wide variety of responses. In an ideal situation, patients show a beneficial response to the therapy, although they may also show no response or a weak response, and perhaps most worryingly, they may experience an adverse drug reaction (ADR), which in extreme situations could lead to serious illness or even death. ADR is an increasingly serious problem with a huge toll in lives and health-care costs every year.",
+      "A good understanding of disease biology and effective chemistry is not the only requirement for an efficacious drug; we also must understand how variation at the target affects drug action, and how variation in other genes affects the way drugs are absorbed, disseminated, metabolized and excreted. Genetic analysis in the drug development paradigm also faces some unique challenges; for example, the exquisite rarity of some adverse reactions makes collection of sufficient samples for well-powered genetic analysis almost impossible.",
+      "19.3.1 An environmental or pharmacogenetic basis for drug efficacy and ADR? Before getting into the complexities of PGx, it is important to recognize that many non-genetic factors also influence the efficacy of medications, including the patient\u2019s age, sex and general health, but also environmental factors, such as concomitant therapies, drug interactions and diet. To give a seemingly innocuous example, grapefruit juice is an inhibitor of intestinal cytochrome P-450 3A4, which is responsible for the first-pass metabolism of many medications.",
+      "Finally, it is possible that other molecules (or drugs) might modulate the biological context within which the drug\u2013 target interaction takes place. Variation in any of the elements that control these types of processes can lead to variability in drug action, which might well confound the search for causative genes among the usual ADME and target-related candidates. 19.3 PHARMACOGENETICS (PGx)  519  19.3.5 Using bioinformatics to gain understanding of adverse drug reaction (ADR) One of the biggest concerns during the development of any medication is the possibility of unintended consequences in the patient.",
+      "19.3 Pharmacogenetics (PGx) It is well known that after exposure to a drug, almost any given cohort of patients show a wide variety of responses. In an ideal situation, patients show a beneficial response to the therapy, although they may also show no response or a weak response, and perhaps most worryingly, they may experience an adverse drug reaction (ADR), which in extreme situations could lead to serious illness or even death. ADR is an increasingly serious problem with a huge toll in lives and health-care costs every year.",
+      "A good understanding of disease biology and effective chemistry is not the only requirement for an efficacious drug; we also must understand how variation at the target affects drug action, and how variation in other genes affects the way drugs are absorbed, disseminated, metabolized and excreted. Genetic analysis in the drug development paradigm also faces some unique challenges; for example, the exquisite rarity of some adverse reactions makes collection of sufficient samples for well-powered genetic analysis almost impossible.",
+      "Drug-Gene Interactions Predicting Efficacy  In 1 candidate gene study, a genetic variant in the HMG-CoA reductase gene, present in 6.7% of patients, modified the LDL-C response to pravastatin by 6.4 mg/dL. 244][247] However, these effect sizes are small and difficult to distinguish from random variation in individual patients.Indeed, the metformin finding is less important for its potential clinical applications than for the biological insight provided by this link between glucose control and a gene involved in the response to DNA damage. 245,246",
+      "Nutrition and metabolism  The power of these new experimental protocols, comparing gene expression profiles to understand spontaneous differences in phenotype due to disease, was extended by inducing phenotypic differences using creative molecular intervention.The first experiments to manipulate phenotype in this way used drugs.A comparison of the gene expression of a drug-induced phenotype with that of the normal phenotype was brilliantly executed in a single study that simultaneously identified a mechanism for the regulation of sterol uptake in the intestine and a genetic disease, sitosterolemia [17  \u2022 ], mice were treated with a lipid-metabolism altering compound and the expression profiles of various tissues compared with normal mice using gene arrays.Differentially expressed genes were evaluated 'in silico,' and an unknown gene was found using bioinformatic tools to be homologous to the ATP-binding cassette (ABC) family of genes.Members of the ABC family include cellular cholesterol transport proteins.Defects in a member of this family (ABCA1) form the basis for the poor cholesterol delivery to high-density lipoprotein (HDL) that underlies Tangiers disease [18], another cholesterol-related disease [19].Through the use of a variety of in silico techniques, Berge et al. [17 \u2022\u2022 ] concluded that the proteins produced from the newly discovered genes, ABCG5 and ABCG8, were responsible for the regulated reverse transport of newly absorbed cholesterol and phytosterols out of the apical surface of intestinal cells.Using public gene databases, a human homolog of the putative mouse transporter was identified, cloned and used to screen sitosterolemic humans.Dysfunctional mutations were found in these genes in all individuals suffering from sitosterolemia.Thus, individuals suffering from sitosterolemia lack the machinery responsible for the selective and controlled transport of cholesterol, and therefore hyperabsorb various sterols (including plant sterols).This study illustrated many of the strengths of genomic experimentation: the identification of phenotypically important genes using global differential gene expression analysis; querying internet databases to deduce structure/function relationships from sequence comparison; and the characterization of individual variation (polymorphism) linked to health.These findings have transformed our understanding of lipid absorption and metabolism, begging the question: how long would this knowledge have waited to be discovered without genomics?",
+      "19.3.1 An environmental or pharmacogenetic basis for drug efficacy and ADR? Before getting into the complexities of PGx, it is important to recognize that many non-genetic factors also influence the efficacy of medications, including the patient\u2019s age, sex and general health, but also environmental factors, such as concomitant therapies, drug interactions and diet. To give a seemingly innocuous example, grapefruit juice is an inhibitor of intestinal cytochrome P-450 3A4, which is responsible for the first-pass metabolism of many medications.",
+      "Finally, it is possible that other molecules (or drugs) might modulate the biological context within which the drug\u2013 target interaction takes place. Variation in any of the elements that control these types of processes can lead to variability in drug action, which might well confound the search for causative genes among the usual ADME and target-related candidates. 19.3 PHARMACOGENETICS (PGx)  519  19.3.5 Using bioinformatics to gain understanding of adverse drug reaction (ADR) One of the biggest concerns during the development of any medication is the possibility of unintended consequences in the patient.",
+      "19.3 Pharmacogenetics (PGx) It is well known that after exposure to a drug, almost any given cohort of patients show a wide variety of responses. In an ideal situation, patients show a beneficial response to the therapy, although they may also show no response or a weak response, and perhaps most worryingly, they may experience an adverse drug reaction (ADR), which in extreme situations could lead to serious illness or even death. ADR is an increasingly serious problem with a huge toll in lives and health-care costs every year.",
+      "A good understanding of disease biology and effective chemistry is not the only requirement for an efficacious drug; we also must understand how variation at the target affects drug action, and how variation in other genes affects the way drugs are absorbed, disseminated, metabolized and excreted. Genetic analysis in the drug development paradigm also faces some unique challenges; for example, the exquisite rarity of some adverse reactions makes collection of sufficient samples for well-powered genetic analysis almost impossible.",
+      "19.3.1 An environmental or pharmacogenetic basis for drug efficacy and ADR? Before getting into the complexities of PGx, it is important to recognize that many non-genetic factors also influence the efficacy of medications, including the patient\u2019s age, sex and general health, but also environmental factors, such as concomitant therapies, drug interactions and diet. To give a seemingly innocuous example, grapefruit juice is an inhibitor of intestinal cytochrome P-450 3A4, which is responsible for the first-pass metabolism of many medications.",
+      "Finally, it is possible that other molecules (or drugs) might modulate the biological context within which the drug\u2013 target interaction takes place. Variation in any of the elements that control these types of processes can lead to variability in drug action, which might well confound the search for causative genes among the usual ADME and target-related candidates. 19.3 PHARMACOGENETICS (PGx)  519  19.3.5 Using bioinformatics to gain understanding of adverse drug reaction (ADR) One of the biggest concerns during the development of any medication is the possibility of unintended consequences in the patient.",
+      "19.3 Pharmacogenetics (PGx) It is well known that after exposure to a drug, almost any given cohort of patients show a wide variety of responses. In an ideal situation, patients show a beneficial response to the therapy, although they may also show no response or a weak response, and perhaps most worryingly, they may experience an adverse drug reaction (ADR), which in extreme situations could lead to serious illness or even death. ADR is an increasingly serious problem with a huge toll in lives and health-care costs every year.",
+      "A good understanding of disease biology and effective chemistry is not the only requirement for an efficacious drug; we also must understand how variation at the target affects drug action, and how variation in other genes affects the way drugs are absorbed, disseminated, metabolized and excreted. Genetic analysis in the drug development paradigm also faces some unique challenges; for example, the exquisite rarity of some adverse reactions makes collection of sufficient samples for well-powered genetic analysis almost impossible."
+    ],
+    [
+      "In one case, a gene identified by mutation recovered from a genetic screen in the laboratory, methuselah, may have variants in natural populations.In particular, the common ATATC haplotype has a sharp geographic (north-south) cline in U.S. populations, which, intriguingly, is associated with an 18% difference in life span (97).It would be interesting to examine these natural populations for differences in their reproductive schedule.Extensive studies show that life span can be rapidly selected as an indirect outcome of artificial selection for age at reproduction.Samples from natural populations of Drosophila contain genetic variants that can be rapidly selected, within 15 generations, for 50% or greater differences in life span on the basis of choosing individuals that are reproductive at early versus later ages (93).Selection was reversible, indicating that these life history variants depended on existing gene combinations not new mutations.Among the genes that differed in quantitative expression between young-and old-selected lines were heat shock proteins, e.g., hsp 22 (60).An overarching conclusion from fly aging genetics is that stress resistance is coupled to longevity (94), as in C. elegans.Other gene candidates are being sought by QTL analysis and show complex interactions with gender and population density (17,115).",
+      "Murabito JM, Yuan R, Lunetta KL (2012) The search for longevity and healthy aging genes: insights from epidemiological studies and samples of long-lived individuals. J Gerontol A Biol Sci Med Sci 67(5):470\u2013479. doi:10.1093/gerona/gls089 20. Nuzhdin SV, Pasyukova EG, Dilda CL et al (1997) Sex-specific quantitative trait loci affecting longevity in Drosophila melanogaster. Proc Natl Acad Sci USA 94(18):9734\u20139739 21. Gems D, Riddle DL (2000) Genetic, behavioral and environmental determinants of male longevity in Caenorhabditis elegans. Genetics 154(4):1597\u20131610  123  22.",
+      "Somatic mutations with the inherited gene variations of each individual cumulatively or synergistically influence the health span and life span [11].Very few genetic variants have been associated with human longevity, but those found include the transcription factor FOXO3 gene, the APOE/TOMM40 and the CDKN2B/ ANRIL loci, which are associated with Alzheimer's disease and cellular senescence [12][13][14].In fact, the heritability for human longevity has been estimated to be approximately 20-30%, according to studies of twins, suggesting that external factors such as diet, environment, physical activity and microbiomes are important factors that influence the health span [14][15][16].The increase in the rate of retrotranscription reflects genome deregulation, creating additional mutations, DNA damage, and other forms of genome instability.For instance, the expression of several families of retrotransposable elements increases with age, as observed in mouse skeletal muscle and human fibroblasts, particularly the long interspersed nuclear element-1 (L1 LINE) [17,18].",
+      "Our study has several limitations. First, we did not analyse the sex and mitochondrial chromosomes, since we were unable to gather enough cohorts that could contribute to the analysis of these chromosomes. However, these chromosomes may harbour loci associated with longevity that we thus have missed. Second, although we included as many cohorts as possible, the sample size of our study is still relatively small (especially for the 99th percentile analysis) in comparison to GWA studies of age-related diseases, such as T2D and cardiovascular disease, and parental age at death 11,51,52. Hence, this limited our power to detect loci with a low MAF (<1%) that contribute to longevity. Third, we did not perform sex-stratified analyses and may thus have missed sex-specific longevity-related genetic variants. The reason for this is that (1) we only identified a limited number of suggestive significant associations in our unstratified 90th and 99th percentile analyses, (2) our sample size is modest (especially when stratified by sex), and (3) thus far, there has been no report of any genome-wide significant sex-specific longevity locus.",
+      "In most experimentally modified animal model systems, single-gene mutations in many different genes have major life extension effects (Fontana et al., 2010;Kenyon, 2010).However, natural human and animal longevity is presumed to be a complex trait (Finch & Tanzi, 1997).In humans, both candidate gene and genome-wide genetic association approaches have been applied in an attempt to identify longevity loci.The frequency of genetic variants has been typically compared between nonagenarian cases and young controls, revealing loci at which genetic variants may contribute to a higher or lower probability of survival into old age.The initial candidate gene studies aimed at finding human longevity genes were dominated by contradictory results (Christensen et al., 2006).The more consistent evidence obtained by repeated observation in independent cohort studies for association with longevity has so far only been observed for three loci, the apolipoprotein E (APOE) locus (Schachter et al., 1994;Christensen et al., 2006), the FOXO3A locus (Willcox et al., 2008;Flachsbart et al., 2009;Pawlikowska et al., 2009;Soerensen et al., 2010), and the AKT1 locus (Pawlikowska et al., 2009).Thus, despite the expectation that longevity would be influenced by many genetic variants with small effect sizes, the effect of variants has consistently been shown in only three genes.",
+      "Previously, it has been suggested that genetic variation in the FOXO1 gene is specifically contributing to human female longevity (reviewed in Chung et al., 2010). However, at chromosome 13q14.11 harboring the FOXO1 gene we found no evidence for linkage with female longevity (LOD<0.05) and at the gene position of FOXO1 we found no evidence for association in the females-only meta-analysis (p-values>0.042) in the GEHA Study. Potentially, the effect of this locus is not only influenced by gender but also by genetic background.",
+      ", 2003), to study GXE and consequences of treatments as a function of age, diet, and sex (Fleet et al., 2016; Philip et al., 2010; Roy et al., 2020; Sandoval-Sierra et al., 2020; Williams et al., 2016, 2020), gene pleiotropy (Wang et al., 2016a), and to test behavioral predictions based on differences in brain architecture (Yang et al., 2008). Here we summarize the current status of this resource with a focus on genetic structure, and on the power and precision of mapping trait variance to loci and genes.",
+      "Somatic mutations with the inherited gene variations of each individual cumulatively or synergistically influence the health span and life span [11].Very few genetic variants have been associated with human longevity, but those found include the transcription factor FOXO3 gene, the APOE/TOMM40 and the CDKN2B/ ANRIL loci, which are associated with Alzheimer's disease and cellular senescence [12][13][14].In fact, the heritability for human longevity has been estimated to be approximately 20-30%, according to studies of twins, suggesting that external factors such as diet, environment, physical activity and microbiomes are important factors that influence the health span [14][15][16].The increase in the rate of retrotranscription reflects genome deregulation, creating additional mutations, DNA damage, and other forms of genome instability.For instance, the expression of several families of retrotransposable elements increases with age, as observed in mouse skeletal muscle and human fibroblasts, particularly the long interspersed nuclear element-1 (L1 LINE) [17,18].",
+      "The Height-Life Span Nexus  Several observations and lines of experimentation have raised the issue of whether interindividual differences in aging rate are influenced by genes that modulate body size and early-life growth patterns. These include (a) the association between small stature and exceptional longevity in calorically restricted rodents (Yu et al., 1985), methionine-restricted rats (Orentreich et al., 1993), and mutant dwarf mice (Brown-Borg et al., 1996; Miller, 1999); and (b) the association between small body size and longer life span in natural populations of mice (Falconer et al., 1978), flies (Hillesheim and Stearns, 1992), dogs (Li et al., 1996), and, possibly, people (Samaras and Storms, 1992). The correlation in dogs is particularly striking: selective breeding for dogs of different body size has produced breeds varying in size from Chihuahua to Irish wolfhound. These breeds also vary greatly in mean longevity, from approximately 7 to 10.5 years, and the correlation between breed longevity and breed body weight (Miller, 1999) is a remarkable R^2 = 0.56. These differences are genetic and affect stature rather than obesity: no amount of overeating will convert a West Highland white terrier to a St. Bernard. The selective pressures applied were designed to create dogs of specific sizes and temperaments and were not intended to influence aging rate or life span. The clear implication is that the effects on longevity are pleiotropic, i.e., that genes selected for their effect on body size and conformation influenced life span as a side effect. It is of interest to note that the few analyses (Eigenmann et al., 1984, 1988) of the hormonal basis for interbreed differences in body size have shown that the genes in question influence levels of IGF-1, the most likely mediator of the life-span effects in the long-lived df/df and dw/dw mouse mutants. Could it be mere coincidence that long-lived mutant nematode worms (Kimura et al., 1997) also show mutations in genes related to insulin and IGF-1 receptors?",
+      "The antagonistic pleiotropy and hyperfunction theories of ageing predict the presence of genetic variants important for growth and development in early life with deleterious effects towards the end of the reproductive window 19,20. While we are unable to directly capture the genetic effects on individuals before age 40 due to the study design of our datasets, we found that the life-extending variant near FOXO3 is associated with a delay in the age at menarche and a decrease in intracranial volume and cognitive abilities. It thus appears that there are loci exhibiting antagonistic effects, although we are unable to discern whether this is due to true pleiotropy or due to linkage of causal variants within a region  Genes which showed a significant effect (FDR < 5%) of gene expression on ageing traits are displayed here. Gene names are annotated with the direction of effect, where + and - indicate whether the life-extending association of the locus is linked with higher or lower gene expression, respectively. Locus: nearest gene to lead variant in the multivariate analysis, Chr: chromosome, Position: base-pair position of lead variant (GRCh37), Cis-genes: genes in physical proximity (<500 kb) to the lead variant of the locus which colocalise with the multivariate signal, Trans-genes: genes located more than 500 kb from the lead variant of the locus.",
+      "Ageing phenotypes, such as years lived in good health (healthspan), total years lived (lifespan), and survival until an exceptional old age (longevity), are of interest to us all but require exceptionally large sample sizes to study genetically.Here we combine existing genome-wide association summary statistics for healthspan, parental lifespan, and longevity in a multivariate framework, increasing statistical power, and identify 10 genomic loci which influence all three phenotypes, of which five (near FOXO3, SLC4A7, LINC02513, ZW10, and FGD6) have not been reported previously at genome-wide significance.The majority of these 10 loci are associated with cardiovascular disease and some affect the expression of genes known to change their activity with age.In total, we implicate 78 genes, and find these to be enriched for ageing pathways previously highlighted in model organisms, such as the response to DNA damage, apoptosis, and homeostasis.Finally, we identify a pathway worthy of further study: haem metabolism.",
+      "Here, we assess the degree of genetic overlap between published GWAS of three different kinds of ageing phenotypes (healthspan, parental lifespan, and longevity, defined as survival to an age above the 90th percentile) and perform a multivariate meta-analysis to identify genetic variants related to healthy ageing. We subsequently characterise the sex- and age-specific effects of loci which affect all three ageing traits and look up reported associations with age-related phenotypes and diseases. Finally, we link the observed signal in these loci to the expression of specific genes, including some that are currently studied in model organisms, and identify pathways involved in healthy ageing.",
+      "Ageing phenotypes, such as years lived in good health (healthspan), total years lived (lifespan), and survival until an exceptional old age (longevity), are of interest to us all but require exceptionally large sample sizes to study genetically.Here we combine existing genome-wide association summary statistics for healthspan, parental lifespan, and longevity in a multivariate framework, increasing statistical power, and identify 10 genomic loci which influence all three phenotypes, of which five (near FOXO3, SLC4A7, LINC02513, ZW10, and FGD6) have not been reported previously at genome-wide significance.The majority of these 10 loci are associated with cardiovascular disease and some affect the expression of genes known to change their activity with age.In total, we implicate 78 genes, and find these to be enriched for ageing pathways previously highlighted in model organisms, such as the response to DNA damage, apoptosis, and homeostasis.Finally, we identify a pathway worthy of further study: haem metabolism.",
+      "LongevityMap--human genetic variants associated with longevity  Variation in human lifespan has been found to be 20-30% heritable, with increasing heritability at advanced ages (27).As next-generation sequencing and genome-wide approaches advance, so does the capacity for performing longevity association studies.To catalog the increasing volume of data in genetic studies of human longevity, we created LongevityMap (http://genomics.senescence.info/longevity/), a database of genes, gene variants and chromosomal locations associated with longevity (28).This differs from the GenAge database, which focuses mostly on data from model organisms and the few genes associated with human ageing (e.g.genes causing progeroid syndromes).",
+      "Genes/loci identified by genome-wide association studies of longevity and lifespan traits.",
+      "ANALYSIS OF HUMAN VARIATION IN THE GENETIC CONTROL OF LONGEVITY  Heritability studies have convincingly demonstrated that at least some fraction of human lifespan is heritable. In tandem, large-scale genome-wide association studies (GWAS) have identified numerous loci associated with age-related traits (Buniello et al., 2019). While genetic studies have functionally shown an inverse effect of multiple age-related, disease-associated variants on lifespan regulation, the number of well-replicated longevity-conferring variants remains limited to variants in APOE (ApoE \u03b52), and more recently, CDKN2A/B and IL6 (see Table 1). To date, studies in humans have been hampered by the specific phenotype definitions used, sample sizes of the extreme phenotypes, and modest heritability of the longevity-related traits (Breitbach et al., 2019). This is due to the complex interplay of biological and social factors involved in human aging, as well as the limited power of GWAS, which require sampling thousands of subjects to achieve statistical significance (Breitbach et al., 2019). Genetic studies of aging have also been hindered by an inconsistent use of definitions of aging (reviewed in Baghdadi et al., 2020). The two main ways of conducting research on the genetics of longevity in human populations are by studying (i) the lifespan (continuous trait, years lived) and (ii) the longevity (dichotomous trait, i.e., being among the longest-lived individuals within a specific population). These complexities have limited the resolution and capability of broad association studies of human longevity. Importantly, these genomic analyses focus on a shift of survival in a population; these variables may be genetically distinct from the mechanisms establishing potential for longevity overall (Figure 1A). We argue that an understanding of this shift in lifespan as well as genetic mechanisms of regulating a species specific 'set points' (Figure 1B) will aid in the conceptual distinction of aging and longevity in humans.",
+      "Put more simply: What is the strength of evidence in favor of GXE effects on lifespan? We ask if youthful adult body weight (~120 days) predicts lifespan. Is the change in body weight in adults in response to a HFD a causal predictor of lifespan? Finally, we ask whether levels of classic serum metabolites or metabolic hormones measured in middle-age or old-age predict variation in lifespan? Our focus is both on overall effects and on strain-specific difference in effect of diet on lifespan and weight gain, rather than on specific genetic modifiers or loci of lifespan.",
+      "Studies in various models have revealed that genetic differences and somatic mutations underlie longevity, but non-genetic contributions also play a major role (Cournil and Kirkwood, 2001). Calorie restriction (Bordone and Guarente, 2005), lowering of basal metabolic rate (Ruggiero et al., 2008), upregulated stress response (Migliaccio et al., 1999), restoration of mitonuclear protein balance (Houtkooper et al., 2013), and reduced fertility (Westendorp and Kirkwood, 1998) have all been shown to correlate with lifespan extension. These observations illuminate the role of 'epi'-genetic mechanisms in modulating longevity pathways.",
+      "Introduction  Approximately 25-30% of the variation in adult lifespan is attributable to genetic factors that become more important with increasing age and exert their strongest effects in nonagenarians and centenarians (G\u00f6gele et al., 2010; Hjelmborg et al., 2006). As yet, however, only a few genetic variants have been found consistently to influence longevity. The first to be discovered was the \u03b54 allele of the apolipoprotein E (APOE) gene, a mortality factor that predisposes to both Alzheimer's and cardiovascular diseases (Corder et al., 1993; Panza et al., 2004). APOE \u03b54 is the only variant with a reportedly large adverse effect upon survival at advanced age (Sch\u00e4chter et al., 1994), and this association has been replicated in several populations (Christensen et al., 2006). Variation in the human forkhead box O3A gene (FOXO3A), in contrast, has been found to be associated with the ability to live long, an effect corroborated by studies in Japanese, German, Italian, US-American, Jewish, Chinese and Danish populations (Anselmi et al., 2009; Flachsbart et al., 2009; Li et al., 2009; Pawlikowska et al., 2009; Soerensen et al., 2010; Willcox et al., 2008). More recently, we have identified exonuclease 1 (EXO1) as a potential novel longevity gene (Nebel et al., 2009). All three genes were detected through candidate-gene approaches.",
+      "Studies of mono- and dizygous twins have revealed that the genetic contribution to the variation in human lifespan is about 25-30% [12,13], and is most prominent in families clustered for longevity [14,15]. This genetic contribution is mainly apparent after the age of 60 years and seems to increase with age [13,16]. Furthermore, human lifespan is a complex trait which is assumed to be determined by many genes with small individual effects [17], although the polygenic architecture still needs to be characterized [18,19]. The diverse health features of long-lived families illustrate that different age-related diseases have common determinants and implicate that pathways can be identified that attenuate aging and delay age-related disease. From a genomic perspective, individuals from long-lived families are assumed to be characterized by a decreased prevalence of disease-promoting variants (referred to as disease-susceptibility alleles) and an increased prevalence of variants conferring maintenance of health and protection from disease, when compared to population controls. In the last 5 years, many disease-susceptibility alleles have been identified (National Human Genome Research Institute (NHGRI) genome-wide association study (GWAS) Catalog; http://www.genome.gov/gwastudies/) [20]. A first comparison between long-lived individuals, selected from both long-lived families (LLS) and the general population (Leiden 85-plus study), and young controls showed no difference in the distribution or frequency of disease-susceptibility alleles identified in cancer, coronary artery disease and type 2 diabetes [21]. The search for lifespan regulating loci, contributing to longevity and population mortality, must therefore extend beyond a focus on disease-susceptibility alleles. We will first discuss the efforts to identify longevity loci by genetics approaches."
+    ],
+    [
+      "One surprising result of our experiment was the relatively weak support for involvement of the insulin/insulin-like signaling (IIS) or target-of-rapamycin (TOR) pathways in the evolution of late-life performance. Mutations in genes within these pathways can alter life span and fertility in flies and other organisms (Partridge and Gems 2002); natural genetic variation in expression of IIS/TOR-pathway genes has been reported to predict aging-related phenotypes (Nuzhdin et al. 2009), and natural clinal variation in the insulin receptor gene InR has been associated with variation in stress resistance and fecundity (Paaby et al. 2010). We therefore expected that some of these genes would contribute to the evolution of life span and late-life fecundity in our experiment. Only one gene previously annotated with the Gene Ontology biological function \"determination of adult life span\" (Cct1) was among the genes bearing the strongest signature of selection, no more than would be expected by chance (1/96 of the candidate genes that had some biological process annotation, compared to 116/10,792 of all genes with some biological-process annotation, \u03c7\u00b2[1] = 0.002, P > 0.96). Genes annotated with the functions \"aging\" or \"determination of adult life span\" were also significantly underrepresented among differentially expressed genes (43/215 transcripts with these annotations had P < 0.05 for line or line-by-age effects, compared to 4488/13,258 of all annotated transcripts, \u03c7\u00b2[1] = 18.1, P < 0.0001). Most of the genes we identified are therefore novel candidates for the regulation of life span and late-age performance.",
+      "Rapamycin  Rapamycin has been shown to robustly increase lifespan in at least three different mouse strains and to improve healthspan measures including cognitive function, cardiac function, immune function, obesity, and cancer incidence (Johnson et al. 2015;Kaeberlein 2014).",
+      "mTOR activates the kinase S6K, which phosphorylates S6, inhibiting autophagy [92].Rapamycin can extend the life span of organisms from yeast to mammals in a dose-dependent manner [95].However, some data suggest that rapamycin has unwanted metabolic effects, including insulin resistance, hyperlipidemia, glucose intolerance, and hypophosphatemia; however, whether rapamycin is responsible for these effects remains controversial, and some of the effects are reversible [96,97].The mTOR pathway integrates different signals from insulin, cytokines, nutrients, oxygen, and mitogenic stimuli, and its regulation has important implications for longevity and against the negative effects of aging [92].",
+      "The molecular mechanisms that drive cellular senescence in proliferative and nonproliferative cells are being discovered.One of the metabolic pathways associated with aging is the growth-promoting mitogen/nutrient-sensing pathway, in which the target of rapamycin (mTOR) is considered a central signaling molecule that affects multiple cellular pathways associated with aging [137].In particular, mTOR participates in the transition of cells from quiescence to senescence [138].",
+      "Inductors of Autophagy and its Impact on Aging  Autophagy has a role in homeostasis, which plays an essential role in the maintenance of cellular physiology and the prevention of cellular damage.Among the inducers of autophagy have been described the already-mentioned rapamycin, resveratrol, and polyamines; however, only polyamines have demonstrated results in clinical research in humans [65].It is known that these compounds can induce the canonical autophagy pathway, which includes inactivation of the mammalian objective of the rapamycin complex 1 (mTORC1), allowing phosphorylation and activation of the Unc-51 complex (Ulk1/2), where the cascade of the other members of the complex is subsequently activated, ULK as FIP200 and ATG13 [65].",
+      "A third example illustrates that pharmacological targeting of pathways that have been implicated in promoting aging may also restore youthfulness at cellular and biochemical levels.Among the key regulators associated with interventions that extend life span is the enzyme mTOR, which senses cellular nutrient levels and in turn regulates rates of protein synthesis and energy utilization.Notably, administration of rapamycin, an mTOR inhibitor, starting at midlife can extend the life span of mice, suggesting that aging can be delayed or reversed in multiple cell types (Harrison et al., 2009).In the hematopoietic system, aging is associated with an increase in mTOR activation in stem cells and progenitors (Chen et al., 2009).Administration of rapamycin to old mice to inhibit mTOR not only limited the normal age-related increases in hematopoietic stem cells and biomarkers of aging in those cells, but also enhanced the performance of the stem cells to become as effective as young stem cells in heterochronic transplantation experiments (Chen et al., 2009) (Figure 1).",
+      "Rapamycin inhibits TOR signalling to alter nDNA translation, inducing mitonuclear protein imbalance35, and increases lifespan in various species, including mice33. Rapamycin also increased mean worm lifespan (by 16%)34 in a ubl-5-dependent manner, induced UPRmt, but not UPRER or heat shock response, and increased respiration (Fig. 6a, c and Supplementary Fig. 9a). This was associated with increased ATP levels, equal citrate synthase activity and altered nDNA/mtDNA oxidative phosphorylation protein ratio (Fig. 6d, e). Additionally, rapamycin changed the balance between nDNA- and mtDNA-encoded oxidative phosphorylation subunits in mouse hepatocytes in a dose dependent manner (Fig. 6f, g).",
+      "Zylber, E., Vesco, C. & Penman, S. Selective inhibition of the synthesis of mitochondria-associated RNA by ethidium bromide. J. Mol. Biol. 44, 195\u2013204 (1969). 33. Harrison, D. E. et al. Rapamycin fed late in life extends lifespan in genetically heterogeneous mice. Nature 460, 392\u2013395 (2009). 34. Robida-Stubbs, S. et al. TOR signaling and rapamycin influence longevity by regulating SKN-1/Nrf and DAF-16/FoxO. Cell Metab. 15, 713\u2013724 (2012). 35. Zid, B. M. et al. 4E-BP extends lifespan upon dietary restriction by enhancing mitochondrial activity in Drosophila. Cell 139, 149\u2013160 (2009). 36. Schulz, T. J. et al.",
+      "a, Rapamycin (Rapa, 1 nM) extends worm lifespan in a ubl-5-dependent manner; b, ubl-5-dependently induced UPRmt (hsp-6::GFP) but not UPRER (hsp-4::GFP) (n = 4). c\u2013e, Rapamycin increased respiration (c, n = 10) and ATP content but not citrate synthase activity (d, n = 3) and induced mitonuclear protein imbalance (e). f\u2013h, In mouse hepatocytes, rapamycin induces mitonuclear protein imbalance (f, g) and induces UPRmt as shown at the protein (f, g, n = 3), and transcriptional (h, n = 8) level. i, Resveratrol (Resv, 25 mM) induced mitonuclear protein imbalance in mouse hepatocytes (n = 4).",
+      "pivotal in this aspect providing molecular insights and having huge conceptual contributions in the field.Characterising the contribution of individual mutants in ageing is a continuously active and informative activity in the field.On top of these studies, genome-wide screens have provided insights on the role of evolutionarily conserved processes and signalling pathways in ageing such as nutrient response [17,18], protein translation, oxidative damage [19,20], mitochondrial function [21,22] and autophagy [22,23] opening new avenues for biogerontology research.Yeasts have proved informative and helped in understanding mechanisms of highly conserved pathways (from yeast to human) in physiology, health and disease such as the Target of Rapamycin (TOR) [24], glucose sensing (PKA) and stress response pathways (Sty1/p38) [25].",
+      "mTOR activates the kinase S6K, which phosphorylates S6, inhibiting autophagy [92].Rapamycin can extend the life span of organisms from yeast to mammals in a dose-dependent manner [95].However, some data suggest that rapamycin has unwanted metabolic effects, including insulin resistance, hyperlipidemia, glucose intolerance, and hypophosphatemia; however, whether rapamycin is responsible for these effects remains controversial, and some of the effects are reversible [96,97].The mTOR pathway integrates different signals from insulin, cytokines, nutrients, oxygen, and mitogenic stimuli, and its regulation has important implications for longevity and against the negative effects of aging [92].",
+      "The molecular mechanisms that drive cellular senescence in proliferative and nonproliferative cells are being discovered.One of the metabolic pathways associated with aging is the growth-promoting mitogen/nutrient-sensing pathway, in which the target of rapamycin (mTOR) is considered a central signaling molecule that affects multiple cellular pathways associated with aging [137].In particular, mTOR participates in the transition of cells from quiescence to senescence [138].",
+      "Inductors of Autophagy and its Impact on Aging  Autophagy has a role in homeostasis, which plays an essential role in the maintenance of cellular physiology and the prevention of cellular damage.Among the inducers of autophagy have been described the already-mentioned rapamycin, resveratrol, and polyamines; however, only polyamines have demonstrated results in clinical research in humans [65].It is known that these compounds can induce the canonical autophagy pathway, which includes inactivation of the mammalian objective of the rapamycin complex 1 (mTORC1), allowing phosphorylation and activation of the Unc-51 complex (Ulk1/2), where the cascade of the other members of the complex is subsequently activated, ULK as FIP200 and ATG13 [65].",
+      "Background  Genetic, dietary and drug interventions can enhance longevity and suppress age-associated disease, such as cancer. Prominent genetic interventions that robustly extend longevity and healthspan in mammals include those that decrease growth hormone (GH) and insulin-like growth factor (IGF) signalling; for example, Ames dwarf mice live more than 50% longer than their wild-type siblings [1]. These diminutive mice result from a point mutation in a gene (Prop1 df/df) that drives development of the pituitary gland, so that mutant mice are deficient in specific hormones. The GH deficiency, in particular, has been shown to underlie their enhanced health span and extended lifespan. Ames mice are highly insulin-sensitive, resistant to some stresses and the incidence of cancer is delayed [2][3][4]. Dietary and drug interventions that extend lifespan include calorie restriction (CR) and the mTOR inhibitor rapamycin [5]. Like the Ames dwarf mutation, CR and rapamycin also suppress and/or delay the incidence of cancer [5][6][7]. A detailed understanding of how these interventions exert their beneficial effects is essential to develop strategies to promote healthy aging in humans [8]. Currently, these interventions are thought to exert their effects by related and interconnected effects on some or all of the following: genome stability, the epigenome, telomere attrition and/or function, protein quality control, mitochondrial function, nutrient sensing, cellular senescence, stem cell exhaustion, cellular stress responses and altered intercellular communication [9]. Of note, the effects of longevity promoting interventions on the epigenome, a key determinant of cell phenotype, are poorly understood.",
+      "The target of rapamycin (TOR) signaling pathway has also emerged as a major regulator of lifespan.TOR is a highly conserved kinase that transduces signals from nutrients to regulate cell size, cell growth, and metabolism (Martin & Hall, 2005).Genetic studies in yeast Saccharomyces cerevisiae have shown that reduced levels of nutrients, namely amino acids and sugars, can extend yeast lifespan through regulation of the TOR signaling pathway (Kaeberlein et al ., 2005;Powers et al ., 2006).In Drosophila , recent studies have shown that amino acid restriction, rather than 'calorie restriction', extends lifespan (Min & Tatar, 2006).In C. elegans , either inactivation of CeTOR/let-363 by RNAi, or mutations in Raptor/daf-15 , encoding a regulatory subunit of CeTOR, leads to lifespan extension (Vellai et al ., 2003;Jia et al ., 2004).",
+      "As mentioned above, a number of genes regulating longevity also control growth and development. Some of these, such as the insulin/IGF1/GH pathway, have been suggested to play a role in the mechanisms of CR (Fig. 1). An emerging critical player is the target of rapamycin (TOR) signaling pathway, which involves both nutrient sensing and regulation of growth. Several genes in the TOR pathway, and the TOR gene itself, regulate longevity in flies (Kapahi et al., 2004) and both longevity and dauer diapause in worms (Jia et al., 2004). Strikingly, not only have genetic manipulations of the TOR gene extended lifespan in yeast and worms (Stanfel et al., 2009) but also feeding rapamycin (which inhibits TOR and is also known as sirolimus) to middle-aged mice significantly (9-14%) increased lifespan (Harrison et al., 2009). Whether rapamycin is extending lifespan by delaying of aging or by affecting a specific disease, such as cancer, remains unclear. More recent studies show that starting rapamycin administration earlier in life does not result in a significantly greater increase in lifespan (10-18%) than that obtained in middle-aged mice (Miller et al., 2011).",
+      "Replacement of the C/ebp\u03b1 gene with C/ebp\u03b2 increases lifespan by 20% [35,36], and may alter the rate of aging [37], indicating that altering the isoform expression of these genes can affect lifespan.Moreover, the life-extending drug rapamycin may affect isoform ratios of C/ebp\u03b2.Rapamycin has been shown to increase lifespan via the suppression of Mtor [38] which in turn controls the isoform ratios of C/ebp\u03b2 [39].Therefore, we speculate that rapamycin may in part exert its life extending effect through C/ebp\u03b2.",
+      "The genome-wide RNAi study conducted by the Ruvkun lab, authored by Hamilton et al. [88], identified a total of 89 additional aging genes with disparate functions including cell structure, cell surface proteins, cell signaling, cellular metabolism, and protein turnover. Of the 66 genes with previously known functions, 17 corresponded to various aspects of carbon metabolism, including citric acid cycle enzymes and subunits of complexes I, IV, and V of the ETC. Researchers also speculated that protein translation might play a role in lifespan regulation, based on the identification of iff-1 (T05G5.10), a gene that has homology to the translation initiation factor eIF5A. Other hits from this screen included two genes containing PH domains known to interact with phosphatidylinositol lipids, multiple G protein-coupled receptors, protein processing and degradation genes such as proteases and ubiquitin ligases/hydrolases, and chromatin modifying factors.",
+      "How cellular processes that regulate aging impact genome stability also remains unclear. Compelling evidence now exists that in all eukaryotes, aging is regulated by conserved insulin/insulin-like growth factor (IGF-1) pathways and growth-signaling pathways regulated by the target of rapamycin (TOR) family of kinases (4). In general, experimental manipulations that upregulate these pathways promote aging, and manipulations that downregulate these pathways, including mutational inactivation or caloric restriction, extend life span and mitigate age-related pathologies. Downregulation of these pathways often leads to a reduction in oxidative stress and oxidative damage to DNA and other cellular constituents. For the most part, however, the relationship between aging and changes in oxidative damage downstream of alterations in growth-signaling pathways remains correlative rather than causal.",
+      "The potential of interventional approaches targeted at aging has yet to be realized in part because aging is a complicated multisystem process that has remained enigmatic. However, research over the last two decades has led to significant excitement. One of the most striking findings is that it is possible to administer a clinically approved drug, rapamycin, to mice at 20 months of age and extend both their life span and health span (Harrison et al., 2009). Surprisingly, much of the recent success of aging research can be traced back to one of its simplest model organisms: yeast. Two of the major pathways studied in the context of aging and age-related disease are the sirtuin pathway and the TOR signaling pathway, and yeast was pivotal in their discovery."
+    ],
+    [
+      "We briefly comment on rare mutations that shorten life span through the early onset of diseases that are increasingly common during aging in the general population, e.g., familial forms of Alzheimer, breast cancer, coronary artery disease, type II diabetes, etc. The later onset forms of these diseases are associated with causes of death at later ages. A major question is what role the more common allelic variants of these same genes have in \"normal aging\". Although examination of this huge emerging topic goes beyond the present discussion, we may consider the example of Werner's syndrome, a rare autosomal recessive that causes adult onset progeria with a high incidence of cancer and atherosclerosis (70). The absence of Alzheimer-type dementia in Werner's syndrome illustrates the \"segmental\" nature of this and other progerias (70). Thus, heritable shortening of life span should not be considered as a simple acceleration of general aging processes. The Werner's lesion maps to a defective gene encoding a helicase and exonuclease, which also has several polymorphisms. In Japan, 1367Arg was associated with a lower risk of myocardial infarction (70), although it was not associated with longevity in Finland (14). In general, we know little of the genetic factors involved in frailty and morbidity at later ages, which are important to the gene-environment interactions implied in the major longevity increase seen during the twentieth century.",
+      "Indicative diseases associated with the candidate aging genes",
+      "Dementia has an age- and sex-standardized prevalence of ~7.1% in Europeans 1 , with Alzheimer's disease (AD) being the most common form of dementia (50-70% of cases) 2 . AD is pathologically characterized by the presence of amyloid-beta plaques and tau neurofibrillary tangles in the brain 3 . Most patients are diagnosed with AD after the age of 65, termed late-onset AD (LOAD), while only 1% of AD cases have an early onset (before the age of 65) 3 . On the basis of twin studies, the heritability of LOAD is estimated to be ~60-80% (refs. 4,5 ), suggesting that a large proportion of individual differences in LOAD risk is driven by genetics. The heritability of LOAD is spread across many genetic variants; however, Zhang et al. 6 suggested that LOAD is more of an oligogenic than a polygenic disorder due to the large effects of APOE variants. Zhang et al. 6 and Holland et al. 7 predicted there to be ~100-10,000 causal variants contributing to LOAD; however, only a fraction have been identified. Increasing the sample size of genome-wide association studies (GWAS) will improve the statistical power to identify the missing causal variants and may highlight additional disease mechanisms. In combination with increasing the number of samples, it is beneficial to use different approaches to identify rare and private variation to help identify additional causal variants and increase understanding of disease mechanisms; however, we deem this to be out of the scope of the current analysis.",
+      "Dementia has an age-and sex-standardized prevalence of ~7.1% in Europeans 1 , with Alzheimer's disease (AD) being the most common form of dementia (50-70% of cases) 2 .AD is pathologically characterized by the presence of amyloid-beta plaques and tau neurofibrillary tangles in the brain 3 .Most patients are diagnosed with AD after the age of 65, termed late-onset AD (LOAD), while only 1% of AD cases have an early onset (before the age of 65) 3 .On the basis of twin studies, the heritability of LOAD is estimated to be ~60-80% (refs. 4,5 ), suggesting that a large proportion of individual differences in LOAD risk is driven by genetics.The heritability of LOAD is spread across many genetic variants; however, Zhang et al. 6 suggested that LOAD is more of an oligogenic than a polygenic disorder due to the large effects of APOE variants.Zhang et al. 6 and Holland et al. 7 predicted there to be ~100-10,000 causal variants contributing to LOAD; however, only a fraction have been identified.Increasing the sample size of genome-wide association studies (GWAS) will improve the statistical power to identify the missing causal variants and may highlight additional disease mechanisms.In combination with increasing the number of samples, it is beneficial to use different approaches to identify rare and private variation to help identify additional causal variants and increase understanding of disease mechanisms; however, we deem this to be out of the scope of the current analysis.The largest previous GWAS of LOAD, identified 29 risk loci from 71,880 (46,613 proxy) cases and 383,378 (318,246 proxy) controls 8 .Our current study expands this to include 90,338 (46,613 proxy) cases and 1,036,225 (318,246 proxy) controls.The recruitment of LOAD cases can be difficult due to the late age of onset, so proxy cases can allow for the inclusion of younger individuals by estimating their risk of LOAD using parental status.Proxy cases and controls were defined on the basis of known 
parental LOAD status weighted by parental age (Supplementary Information).In the current study, we identified 38 loci, including seven loci that have not been reported previously.Functional follow-up analyses implicated tissues, cell types and genes of interest through tissue and cell type enrichment, colocalization and statistical fine-mapping.This study highlights microglia, immune cells and protein catabolism as relevant to LOAD, while identifying previously unidentified genes of potential interest. ResultsGenome-wide inferences.We performed meta-analysis on data from 13 cohorts, totaling 1,126,563 individuals (Supplementary",
+      "Introduction  Alzheimer's disease (AD) is a complex disorder and is the most common form of dementia [1].After age, family history is the single greatest risk factor for AD.AD can be classified into early and late onset forms.Mutations in three genes: PSEN1/2 and APP are known to cause early onset AD in an autosomal dominant manner [2,3].The majority of AD cases, however, are late onset (LOAD) and the APOE e4 allele is the strongest known genetic risk factor.Many additional genetic polymorphisms have been identified, though with substantially lower risk estimates [1,4,5,6,7,8,9,10].LOAD appears to be inherited and/or sporadic and there is evidence of a maternal inheritance pattern [11].Current estimates suggest that more than 20% of inherited LOAD cases are maternally inherited [12].",
+      "INTRODUCTION  Many common noninfectious diseases exhibit a more severe clinical presentation in older individuals.These diseases often exhibit complex etiology and can affect different tissues and cell types, with a wide spectrum of clinical outcomes.Prominent aging-associated neurodegenerative diseases are Alzheimer's disease (AD), Parkinson's disease (PD), and age-related macular degeneration (AMD), all of which can severely compromise the quality of life and have serious repercussions on both the individual and society at large.These late-onset diseases generally result from the interplay between multiple genetic susceptibility factors and environmental components.Sequencing of the human genome, cataloging of millions of single nucleotide polymorphisms (SNPs) together with the development of a map of common haplotypes, and technological innovations in genotyping are among the major milestones that are facilitating exploration of the genetic basis of common diseases (1,7,50).In the field of AMD genetics, these advances have led to the identification of several genetic susceptibility factors and enabled us to start dissecting the relationship between environmental risk factors and the genetic constitution of each individual (66,118,148).As a result, new opportunities are emerging for improved understanding of disease pathogenesis that may lead to better management and treatment of AMD.Clinical aspects of AMD are discussed only briefly (for a more in-depth discussion, see Reference 79).",
+      "Aging-associated neurodegenerative diseases significantly influence the quality of life of affected individuals.Genetic approaches, combined with genomic technology, have provided powerful insights into common late-onset diseases, such as age-related macular degeneration (AMD).Here, we discuss current findings on the genetics of AMD to highlight areas of rapid progress and new challenges.We also attempt to integrate available genetic and biochemical data with cellular pathways involved in aging to formulate an integrated model of AMD pathogenesis.",
+      "Aging-associated neurodegenerative diseases significantly influence the quality of life of affected individuals.Genetic approaches, combined with genomic technology, have provided powerful insights into common late-onset diseases, such as age-related macular degeneration (AMD).Here, we discuss current findings on the genetics of AMD to highlight areas of rapid progress and new challenges.We also attempt to integrate available genetic and biochemical data with cellular pathways involved in aging to formulate an integrated model of AMD pathogenesis.",
+      "Genetics of Alzheimer Disease: Early-Onset AD  In the early to mid-1990s, genetic studies of AD focused on extended families with high burden of disease (two or more cases among first-degree relatives), and used linkage analysis of highly polymorphic genetic markers called short tandem repeats (STRs, or microsatellites) in order to identify genomic regions co-transmitting with disease in affected family members. This strategy, followed by \"fine mapping\" (the positional cloning of candidate genes), was used to identify genes and genetic variants contributing to AD risk. The first three genes known to cause AD were identified among families with multiple early-onset cases (age-at-onset <60 years): APP, encoding amyloid precursor protein [Goate et al., 1991], and PS1 and PS2, encoding presenilins I and II respectively [Levy-Lahad et al., 1995; Rogaev et al., 1995; Sherrington et al., 1995], each transmitting disease-causing variants in the predicted autosomal-dominant fashion.",
+      "Alzheimer's disease (AD) (MIM: 104300) is a highly heritable disease with great complexity in its genetic contributors, and represents the most common form of dementia. With the gradual aging of the world's population, leading to increased prevalence of AD, and the substantial cost of care for those afflicted, identifying the genetic causes of disease represents a critical effort in identifying therapeutic targets. Here we provide a comprehensive review of genomic studies of AD, from the earliest linkage studies identifying monogenic contributors to early-onset forms of AD to the genome-wide and rare variant association studies of recent years that are being used to characterize the mosaic of genetic contributors to late-onset AD (LOAD), and which have identified approximately 20 genes with common variants contributing to LOAD risk. In addition, we explore studies employing alternative approaches to identify genetic contributors to AD, including studies of AD-related phenotypes and multi-variant association studies such as pathway analyses. Finally, we introduce studies of next-generation sequencing, which have recently helped identify multiple low-frequency and rare variant contributors to AD, and discuss ongoing efforts with next-generation sequencing studies to develop statistically well-powered and comprehensive genomic studies of AD. Through this review, we help uncover the many insights the genetics of AD have provided into the pathways and pathophysiology of AD.",
+      "Alzheimer's disease (AD) (MIM: 104300) is a highly heritable disease with great complexity in its genetic contributors, and represents the most common form of dementia. With the gradual aging of the world's population, leading to increased prevalence of AD, and the substantial cost of care for those afflicted, identifying the genetic causes of disease represents a critical effort in identifying therapeutic targets. Here we provide a comprehensive review of genomic studies of AD, from the earliest linkage studies identifying monogenic contributors to early-onset forms of AD to the genome-wide and rare variant association studies of recent years that are being used to characterize the mosaic of genetic contributors to late-onset AD (LOAD), and which have identified approximately 20 genes with common variants contributing to LOAD risk. In addition, we explore studies employing alternative approaches to identify genetic contributors to AD, including studies of AD-related phenotypes and multi-variant association studies such as pathway analyses. Finally, we introduce studies of next-generation sequencing, which have recently helped identify multiple low-frequency and rare variant contributors to AD, and discuss ongoing efforts with next-generation sequencing studies to develop statistically well-powered and comprehensive genomic studies of AD. Through this review, we help uncover the many insights the genetics of AD have provided into the pathways and pathophysiology of AD.",
+      "Indeed, as age increases, there is an exponential increase in the incidence of AD, with a corresponding effect on healthcare costs and quality of life. AD is a complex disease involving several genetic and environmental components (Hardy, 1997; Munoz & Feldman, 2000), and 15% of patients have a genetic predisposition. Almost 100 candidate genes are currently known to be involved in the development of AD, and only 4 (APP, PSEN1, PSEN2, APOE) in humans have been proven to play a direct role in AD pathogenesis (Thomas & Fenech, 2007).",
+      "The genetics of Alzheimer disease (AD) to date support an age-dependent dichotomous model whereby earlier age of disease onset (<60 years) is explained by 3 fully penetrant genes (APP [NCBI Entrez gene 351], PSEN1 [NCBI Entrez gene 5663], and PSEN2 [NCBI Entrez gene 5664]), whereas later age of disease onset (\u226565 years) representing most cases of AD has yet to be explained by a purely genetic model. The APOE gene (NCBI Entrez gene 348) is the strongest genetic risk factor for later onset, although it is neither sufficient nor necessary to explain all occurrences of disease. Numerous putative genetic risk alleles and genetic variants have been reported. Although all have relevance to biological mechanisms that may be associated with AD pathogenesis, they await replication in large representative populations. Genome-wide association studies have emerged as an increasingly effective tool for identifying genetic contributions to complex diseases and represent the next frontier for furthering our understanding of the underlying etiologic, biological, and pathologic mechanisms associated with chronic complex disorders. There have already been success stories for diseases such as macular degeneration and diabetes mellitus. Whether this will hold true for a genetically complex and heterogeneous disease such as AD is not known, although early reports are encouraging. This review considers recent publications from studies that have successfully applied genome-wide association methods to investigations of AD by taking advantage of the currently available high-throughput arrays, bioinformatics, and software advances. The inherent strengths, limitations, and challenges associated with study design issues in the context of AD are presented herein.",
+      "Arch Neurol. 2008;65(3):329-334   Alzheimer disease (AD) is the most common cause of dementia and the most prevalent neurodegenerative disorder associated with aging. 1 Alzheimer disease is a heterogeneous disorder with a complex etiology owing to genetic and environmental influences as causal or risk modifiers. The neuropathologic hallmarks of disease are extracellular amyloid plaques and intracellular neurofibrillary tangles of hyperphosphorylated tau protein. 2 Only 10% of AD cases occurring before 60 years of age (early-onset AD) are due to rare, fully penetrant (autosomal dominant) mutations in 3 genes: A\u03b2 precursor protein (APP) on chromosome 21, 3 presenilin 1 (PSEN1) on chromosome 14, 4 and presenilin 2 (PSEN2) on chromosome 1. 5,6 In contrast, most cases of AD are later in onset (\u226565 years of age) (late-onset AD), are nonfamilial, and are likely the result of highly prevalent genetic variants with low penetrance. 7 To date, the only genetic risk factor for late-onset AD remains the apolipoprotein E gene (APOE), specifically the \u03b54 allele, which is moderately penetrant, accounting for up to 50% of cases. 8 However, a robust literature reports numerous putative genetic risk alleles and promising genetic variants. Recent reports from individual studies reveal significant associations with the sortilin-related receptor (SORL1 [NCBI Entrez gene 6653]) 9,10 and glycine-rich protein 2-associated binding protein 2 (GAB2 [NCBI Entrez gene 9846]) 11 on chromosome 11; death-associated protein kinase 1 (DAPK1 [NCBI Entrez gene 1612]), 12 ubiquilin 1 (UBQLN1 [NCBI Entrez gene 299798]), 13 and adenosine triphosphate-binding cassette transporter 1, subfamily A (ABCA1 [NCBI Entrez gene 19]), on chromosome 9 14 ; and low-density lipoprotein receptor-related protein 6 (LRP6 [NCBI Entrez gene 4040]) on chromosome 12. 15 All of these putative variants still lack replication in large representative populations but have relevance to neuropathologic mechanisms and pathways that may be associated with AD pathogenesis (   A large meta-analysis from the AlzGene database 16 17 All are associated with relevant biological mechanisms and pathways but await replication to further elucidate their utility as significant markers for AD.",
+      "Background  Alzheimer's disease (AD) is the most common neurodegenerative disorder and the leading cause of dementia in the elderly [1].Diagnosis of AD is based on the presence of neurofibrillary tangles and amyloid plaques [2], and symptoms typically include memory loss and impaired cognitive ability.Although the pathological hallmarks associated with dementia-related symptoms in AD appear largely similar between both the early-onset and late-onset forms of the disease, their underlying etiologies contrast [3].Whereas early-onset AD is a familial autosomal dominant disorder caused by rare, highly penetrant mutations in one of a small set of genes (APP, PSEN1, and PSEN2), the more common late-onset form of the disease (accounting for 90-95 % of cases) occurs sporadically, and risk is determined by complex underlying mechanisms [3][4][5][6].Estimates based on twin concordance rates suggest heritability of late-onset AD is as high as 70 %, implicating major roles for genetic as well as non-genetic factors [6].Indeed, through candidate gene studies, as well as more recent genome-wide association studies (GWASs) and whole-exome sequencing, both common and rare variants associated with the late-onset form of AD have been identified [7][8][9][10][11].Collectively, however, common GWAS variants account for only a modest proportion (~30 %) of the underlying variance in disease susceptibility [12].Several environmental factors are also thought to play a role [5,6], yet exactly how these contribute to risk, onset, and progression remains poorly defined.",
+      "Alzheimer's disease is the most common type of dementia, and it is characterized by a decline in memory or other thinking skills. The greatest risk factor for Alzheimer's disease is advanced age. A recent genome-wide study identified a locus on chromosome 17 associated with the age at onset, and a specific variant in CCL11 is probably responsible for the association. The association of a protective haplotype with a 10-year delay in the onset of Alzheimer's disease and the identification of a CCL11 variant with possible functional roles in this association might allow the future development of immunomodulators with the potential to halve disease incidence.",
+      "Alzheimer's disease is the most common type of dementia, and it is characterized by a decline in memory or other thinking skills. The greatest risk factor for Alzheimer's disease is advanced age. A recent genome-wide study identified a locus on chromosome 17 associated with the age at onset, and a specific variant in CCL11 is probably responsible for the association. The association of a protective haplotype with a 10-year delay in the onset of Alzheimer's disease and the identification of a CCL11 variant with possible functional roles in this association might allow the future development of immunomodulators with the potential to halve disease incidence.",
+      "INTRODUCTION  Alzheimer's disease (AD) is a common debilitating disorder with a prevalence that rises steeply with age from below 1% at 65 years to as high as 40% after the age of 90 [Bachman et al., 1992].Genes are known to play a role in the development of AD.Twin studies show heritabilities of around 60% [Bergem et al., 1997;Gatz et al., 1997].Indeed, variation in four genes has already been shown to cause rare forms of early-onset AD [the Amyloid Precursor Protein Gene (APP); Goate et al., 1991; Presenilin 1 (PS1); Sherrington et al., 1995; Presenilin 2 (PS2); Levy Lahad et al., 1995, Rogaev et al., 1995] or increase the general risk of disease development [Apolipoprotein E (APOE), Corder et al., 1993].As well as increasing disease susceptibility, APOE e4 alleles are associated with reduced age at onset (AAO) and appear to show their strongest effect below 70 years [Farrer et al., 1997].There is also evidence from both twin [Pedersen et al., 2001] and family studies [Tunstall et al., 2000;Li et al., 2002] that AAO in AD is heritable.Daw et al. [2000] have estimated that in addition to APOE, there are at least four loci with similar effect sizes, which contribute to AAO in AD.",
+      "Introduction  Alzheimer's disease (AD), a devastating neurodegenerative disease, is the most common form of dementia among the elderly. Genetically, AD is a complex and multifactorial disease with the possible involvement of multiple genes. The rare early-onset form of the disease usually follows an autosomal-dominant inheritance pattern and to date three genes have been identified: amyloid precursor protein (APP) and presenilin 1 and 2 (PSEN1 and PSEN2). The common late-onset form of the disease is much more complex than the early-onset form and until recently the apolipoprotein E (APOE) gene was the only major genetic factor accounting for 20-29% of the risk for late-onset AD. 1,2 Recent large genome-wide association studies (GWAS) have identified nine additional genes for late-onset AD, including CR1, BIN1, CLU (a.k.a.4][5][6][7] There is high heritability for AD risk (up to 80%), 8 but the total risk attributable to all confirmed loci is about 50%, indicating the presence of additional risk genes for late-onset AD.",
+      "ONE OF EVERY 5 PERSONS AGED 65 years is predicted to develop Alzheimer disease (AD) in their lifetime, and genetic variants may play an important part in the development of the disease. 1 The apparent substantial heritability of late-onset AD 2 is inadequately explained by genetic variation within the well-replicated genes (apolipoprotein E [APOE; RefSeq NG_007084], presenilin-1 [PSEN1; RefSeq NG_007386], presenilin-2 [PSEN2; RefSeq NG_007381], and amyloid beta precursor protein [APP;"
+    ]
+  ]
+}
diff --git a/gnqa/paper1_eval/src/data/datasets/old/diabetes_1_dataset.json b/gnqa/paper1_eval/src/data/datasets/old/diabetes_1_dataset.json
new file mode 100644
index 00000000..9354ac70
--- /dev/null
+++ b/gnqa/paper1_eval/src/data/datasets/old/diabetes_1_dataset.json
@@ -0,0 +1,128 @@
+{
+  "question": [
+    "What is the age range in mouse for the onset of diabetes?",
+    "Do we find significant quantitative trait loci in the studies for diabetes?",
+    "What environmental factors affect mice that also influence diabetes?",
+    "List mouse studies that involve insulin.",
+    "Are there phenotypes or genotypes associated with early onset diabetes in mouse?"
+  ],
+  "answer": [
+    "The onset of diabetes in mice can occur as early as 3 to 4 weeks of age, with blood sugar concentrations often elevating between 5 and 8 weeks. However, in some cases, hyperglycemia can develop at approximately 20 weeks of age.",
+    "Yes, the studies identified significant quantitative trait loci associated with both type 1 and type 2 diabetes. These include loci such as TCF7L2, DNER, SRR, HNF1A, KCNQ1, CDKN2A/CDKN2B, JAZF1, CDC123, CAMK1D, TSPAN8, LGR5, THADA, ADAMTS9, NOTCH2, and others. However, the significance of some loci varied across different populations and studies.",
+    "Several environmental factors can affect mice and influence diabetes. These include the stress response, which can be greater in male mice and can confound feeding behavior, the unique gut microbiota of rodents from different production facilities, cage position, single versus group housing, the skill level of the researcher, ambient room temperature, and the type of cage bedding. Diet, particularly high-fat feeding, can also induce obesity and diabetes. Additionally, the intrauterine environment can impact the subsequent development of diabetes amongst offspring.",
+    "1. Diabetes incidence study where mice were tested for diabetes monthly by blood glucose and weekly by urine assessment. 2. Studies involving transgenic/knockout animal models of type 2 diabetes to study the role of genes and their effects on peripheral insulin action. 3. Study involving male type-2 diabetic db/db mice and genetic control non-diabetic db/ mice. 4. Study involving young adult mice fed a high-fat or Western diet to elicit DIO and insulin resistance. 5. Study involving Akita mouse genotyping. 6. Study involving 12-week-old male type 2 diabetic db/db mice and contemporary control wild-type mice. 7. Study involving transgenic mice to create specific models of type 1 and type 2 diabetes. 8. Study involving AKITA mice derived from a C57BL/6NSlc mouse with a spontaneous mutation in the insulin 2 gene. 9. Study monitoring mice for the development of diabetes.",
+    "Yes, there are phenotypes and genotypes associated with early onset diabetes in mice. The db/db mouse strain, which carries the diabetes (db) mutation, is one such genotype. These mice develop an abnormal and characteristic deposition of fat beginning at 3 to 4 weeks of age, which is an associated phenotype. The C57BL/Ks strain also exhibits early onset diabetes. The severity of the diabetes condition in these mice depends on the interaction of the mutant gene with the inbred background."
+  ],
+  "contexts": [
+    [
+      "Diabetes incidence study. Mice were kept for 20-28 weeks and tested for diabetes monthly by blood glucose and weekly by urine assessment, with a positive indication being followed by twice-weekly blood testing.Mice were diagnosed as diabetic when the blood glucose concentration was over 260 mg/dl (14.4 mM) after 2-3 h of fasting for two sequential tests.Glucose and insulin tolerance tests were performed by injecting glucose (2 g/kg body weight) or insulin (1 U/kg body weight) intraperitoneally in mice fasted for 6-7 h.Tail vein blood was tested by a Contour glucometer.Assessments of plasma insulin, proinsulin and C-peptide levels were performed using commercial ELISA kits, according to the manufacturer's instructions (insulin, proinsulin and C-peptide mouse ELISA kits, R&D Systems Quantikine).Assays were performed with blinding, with mice coded by number until experimental end.",
+      "Subsequently, genetic dissection of the diabetes-associated traits in the male BC1 progeny obtained from a cross between (normal B6 female \u00d7 diabetic TH male)F1 female and diabetic TH male mice (B6 cross) was carried out. Because of the sexual dimorphism, with respect to NIDDM onset, we used diabetic TH male mice as breeders to ensure the presence of a mutant allele(s) and targeted our genetic dissection using only male BC1 progeny. In male BC1 mice hyperglycemia developed at approximately 20 weeks of age and was sustained through a 30-week period studied. Based on these data, we measured plasma glucose levels three times in biweekly intervals (to minimize phenotyping error) between 20 and 26 weeks of age, and the mean of the three measurements was used for genetic analysis. Body weights were measured at 20 weeks. At the end of the study (26 weeks), plasma insulin levels and nasal-anal lengths were measured, and the five regional fat pads were dissected and weighed from a subset of 133 mice. In total, 206 male BC1 mice were collected, and individual mice were genotyped with 92 SSLP markers at approximately 20-cM intervals (covering ~96% of the genome).",
+      "The Diabetes (db) Mouse (Chromosome 4). Diabetes (db), an autosomal recessive mutation, occurred in the C57BL/KsJ (BL/Ks) inbred strain and on this background is characterized by obesity, hyperphagia, and a severe diabetes with marked hyperglycaemia [7,22]. Increased plasma insulin concentration is observed as early as 10 days of age [10]. The concentration of insulin peaks at 6 to 10 times normal by 2 to 3 months of age then drops precipitously to near normal levels. Prior to the fall in plasma insulin concentration, the most consistent morphological feature of the islets of Langerhans appears to be hyperplasia and hypertrophy of the beta cells in an attempt to produce sufficient insulin to control blood glucose concentration at physiological levels. The drop in plasma insulin concentration is concomitant with islet atrophy and rapidly rising blood glucose concentrations that remain over 400 mg per 100 ml until death at 5 to 8 months [7]. Compared with other obesity mutants the diabetic condition is more severe and the lifespan is markedly decreased.",
+      "The animal models available for diabetes research (Table 1) are most often more like maturity-onset diabetes in man. Obesity is a consistent factor and insulinopaenia is rare. However, the time of gene expression at about two weeks of age is within the time period of juvenile expression. The severity and clinical course of the diabetes produced depends on the interaction of the mutant gene with the inbred background rather than the action of the gene itself. Thus on one inbred background a well-compensated, maturity-onset type diabetes, compatible with near normal life, is observed, whereas on another inbred background the syndrome presents as a juvenile-type diabetes with insulinopaenia, islet cell degeneration, marked hyperglycaemia, some ketosis and a much shortened lifespan. Unfortunately, vascular, retinal and the other complications of diabetes are not seen consistently in these rodent syndromes. It seems that the severely diabetic animal either does not live long enough to develop these complications or that rodents are particularly resistant to those complications that commonly afflict human diabetics. Several comprehensive bibliographies and excellent reviews of the various studies carried out with each of these syndromes in animals have been published [2,3,19,30,31,32]. This presentation will be restricted primarily to the research undertaken by my colleagues and myself with the two mouse mutations, diabetes (db) and obese (ob). Both mutations have been extensively studied by numerous investigators in attempts to define the primary lesion causing the syndrome. As yet, the primary defect remains elusive, although several possibilities are becoming increasingly plausible in the light of current research. Although the metabolic abnormalities associated with both obese and diabetes have many similarities with regard to the overall progression of the obesity-diabetes state, the documentation of two single genes on separate chromosomes makes it unlikely that the 
two syndromes are caused by the same primary lesion.However, the marked similarity between the two mutants when maintained on the same genetic background implies that the defects may occur in the same metabolic pathway.",
+      "Diabetes-obesity syndromes in rodents",
+      "Diabetes-related clinical traits for 275 B6XBTBR-ob/ ob F2 male mice at 10 weeks of age.",
+      "However, in other contexts, B6 mice are more likely than D2 to spontaneously develop diabetic syndromes, indicating that risk factors exist on both genetic backgrounds [29]. QTL mapping studies indicate that these murine metabolic traits have a complex genetic architecture that is not dominated by any single allele [29\u201331], much like humans [32, 33]. Prior work identified candidate genes on Chr 13 that might underlie diabetes-related traits, including RASA1, Nnt, and PSK1. RASA1 shows strong sequence differences between B6 and D2 strains [34]. Rasche et al.",
+      "In total, about 360 male mice (10 for each strain) were fed with either a regular chow diet (CD) or a high-fat diet (HFD) to induce obesity and associated metabolic stress. At 20 weeks of age, a test meal bolus was administered orally, and postprandial BAs and blood glucose levels were analyzed at three different time points (before and 30 or 60 min after gavage). Nine weeks later, the mice were sacrificed 4 h after feeding, a time point in which the main metabolic adaptive processes in response to BA-mediated food intake are captured.",
+      "BB rats usually develop diabetes just after puberty and have similar incidence in males and females.Around 90% of rats develop diabetes between 8 and 16 weeks of age.The diabetic phenotype is quite severe, and the rats require insulin therapy for survival.Although the animals have insulitis with the presence of T cells, B cells, macrophages and NK cells, the animals are lymphopenic with a severe reduction in CD4 + T cells and a near absence of CD8 + T cells (Mordes et al., 2004).Lymphopenia is not a characteristic of type 1 diabetes in humans or NOD mice (Mordes et al., 2004) and is seen to be a disadvantage in using the BB as a model of type 1 diabetes in humans.Also, in contrast to NOD mice, the insulitis is not preceded by peri-insulitis.However, the model has been valuable in elucidating more about the genetics of type 1 diabetes (Wallis et al., 2009), and it has been suggested that it may be the preferable small animal model for islet transplantation tolerance induction (Mordes et al., 2004).In addition, BB rats have been used in intervention studies (Hartoft-Nielsen et al., 2009;Holmberg et al., 2011) and studies of diabetic neuropathy (Zhang et al., 2007).",
+      "Ageing likewise affects metabolic parameters in rodents. Analogous to what occurs in humans, the body weight of the C57BL/6J mouse, the most commonly used mouse strain for metabolic studies, increases with age, peaking at ~9 months 133 , and older C57BL/6J mice (22 months) have reduced lean mass and increased fat mass compared with young 3-month-old mice 134 . In both rats and mice, fasting glucose levels are mostly stable throughout life, but whereas glucose tolerance generally worsens with age in rats, mice are less affected [135][136][137][138][139][140] . In fact, 2-year-old male C57BL/6J mice were significantly more glucose tolerant than their 5-month-old counterparts 138 . Consistent with these findings, glucose-stimulated insulin release from the pancreas decreases with age in rats, but not in mice 137,138 .",
+      "All mice homozygous for the diabetes gene (db/db) become diabetic, the first distinguishing feature being a marked tendency to obesity with large fat depositions observed in the axillary and inguinal regions at about 3 to 4 weeks of age.",
+      "In many of these diabetic mice blood sugar concentration tends to increase gradually between 5 and 12 weeks of age, after which it may rise sharply to over 500 mg/100 ml of blood almost overnight. The diabetic condition, thus, appears to develop in two phases, an early one when there is some regulation of blood sugar concentration, and a later stage characterized by a marked increase in hyperglycemia and a complete loss of metabolic control. A few exceptional diabetics, usually females, exhibit a pattern similar to that shown in Fig. 3.",
+      "Results All mice homozygous for the trait, diabetes (db), develop an abnormal and characteristic deposition of fat beginning at 3 to 4 weeks of age, making their early identification possible. The difference in size and appearance of litter-mate 6-week old mice, one normal and one diabetic, is shown in Fig. 1. Weight increases  Fig. 1. C57BL/Ks-db litter-mates at 6 weeks.",
+      "of age; more often this elevation occurs between 5 and 8 weeks. In older diabetic mice blood sugar concentrations greater than 600 mg/100 ml are not uncommon.",
+      "In older mice with blood sugar concentrations over 250 mg/100 ml, injections of up to 100 units / 100 g were completely ineffective in reducing blood sugar to normal levels. Continued treatment of young diabetic mice with daily injections of insulin, although controlling blood sugar concentrations initially, did not prevent or delay either the obesity or the uncontrollable high blood sugar concentrations, which usually develop at about 6 to 8 weeks of age.",
+      "Although the early onset of diabetes in db mice coincides with that in juvenile diabetes in man, the symptoms of obesity and elevated serum insulin are more suggestive of the pattern of development observed in the maturity-onset type of diabetes. As yet, none of the lesions associated with advanced diabetes in humans such as retinopathies, cardiovascular and kidney lesions have been observed, possibly because of the early onset of the diabetes and the relatively rapid deterioration and death of these mice.",
+      "To screen for genes that show correlation with different phenotypic outcome in diabetic mouse models, we used the cross-sectional design and performed microarray analysis on 24-wk-old STZ-treated and db/db mice with established renal pathology. In parallel with the functional genomics characterization, each individual mouse underwent a detailed renal phenotype analysis. Mice that were treated with low doses of STZ developed diabetes and moderately severe albuminuria (twice the control). In mice with C57B6/J background, the mesangial changes were mild or absent. Mice with 129SvJ genetic background developed significant glomerular changes. However, these were not significantly different from the age-matched controls (K. Sharma, K. Susztak, and E.P. B\u00f6ttinger, unpublished observations). The db/db mice became insulin resistant and developed diabetes at approximately 8 wk of age. Albuminuria was detected as early as 3 to 4 wk after the development of hyperglycemia. The glomerular histology was characterized by severe diffuse mesangial expansion, as previously reported (49).",
+      "Renal lesions in diabetic mouse models  Db/db mice, which have a recessive mutation in the hypothalamic leptin receptor, develop obesity at 4 wk of age and type 2 diabetes at approximately 8 wk of age.In C57BL/6J background, the diabetes and the obesity are usually less severe than in the C57BL/KsJ background (44).Kidneys are generally enlarged in this mouse strain, and structural glomerular changes (e.g., diffuse glomerulosclerosis, GBM thickening) occur without evidence of tubulointerstitial disease (40).Glomerular lesions of the KK mice are characterized by diffuse and nodular mesangial sclerosis without evidence of tubular disease (45).The lack of reliable mouse models prompted the National Institute of Diabetes and Digestive and Kidney Diseases to fund a consortium for the development and phenotyping of new diabetic mouse models that would resemble closely human DNP.",
+      "In total, about 360 male mice (10 for each strain) were fed with either a regular chow diet (CD) or a high-fat diet (HFD) to induce obesity and associated metabolic stress. At 20 weeks of age, a test meal bolus was administered orally, and postprandial BAs and blood glucose levels were analyzed at three different time points (before and 30 or 60 min after gavage). Nine weeks later, the mice were sacrificed 4 h after feeding, a time point in which the main metabolic adaptive processes in response to BA-mediated food intake are captured.",
+      "Assessment of Diabetes  Mice were monitored for the development of diabetes as described previously (Wicker et al. 1994)."
+    ],
+    [
+      "Additional large-scale meta-analysis predicated upon increased sample size, were carried out on existing datasets from the WTCCC [18] and the Genetics of Kidneys in Diabetes (GoKinD) study [69][70][71] plus control data derived from the National Institute of Mental Health.These investigators observed significant association of previously observed loci.Importantly, they did not observe evidence of new T1D loci reaching the threshold for genome-wide significance.Instead they re-analyzed the most nominally significant associated SNP in an independent British cohort of approximately 6000 cases, 7000 controls and in 2800 families, where they uncovered four additional loci, BACH2 (previously reported [67]), 10p15 harboring protein kinase C theta (PRKCQ), 15q24 harboring nine genes including the cathepsin H (CTSH), complement 1q (C1q), tumor necrosis factor related protein 6 (C1QTNF6) and somatostatin receptor 3 (SSTR3) genes.Table 1 summarizes the 16 T1D loci reported to date.An example of a tag-SNP that captures the association with T1D in each instance is highlighted together with its relative minor allele frequency in controls and what magnitude of risk or protection it confers.Key references regarding the role of each locus in the context of the disease are included and along with the chromosomal band where each locus resides, the main candidate gene (symbol and full name) is highlighted.",
+      "Detection of established loci  We explored the extent to which previously reported type 2 diabetes association signals could be detected in African-descent individuals.Based on the previously reported effect sizes and the effect allele frequency and sample size from our African meta-analysis, we had sufficient power (80%) to detect three signals (TCF7L2, DNER and SRR) at genome-wide significance (p < 2.5 \u00d7 10 \u22128 ) (ESM Table 2).Only the TCF7L2 variant reached genome-wide significance in our study, whereas both variants in DNER (rs1861612) and SRR (rs391300), originally discovered in Pima Indians and East Asians, respectively, had p > 0.1 (ESM Table 2).",
+      "On the basis of the combined stage 1-3 analyses, we found that six signals reached compelling levels of evidence (P = 5.0 \u00d7 10 \u22128 or better) for association with T2D (Table 2). As in all linkage disequilibrium (LD)-mapping approaches, characterization of the causal variants responsible, their effect sizes and the genes through which they act will require extensive resequencing and fine-mapping. However, on the basis of current evidence, we found that the most associated variants in each of these signals map to intron 1 of JAZF1, between CDC123 and CAMK1D, between TSPAN8 and LGR5, in exon 24 of THADA, near ADAMTS9 and in intron 5 of NOTCH2.",
+      "Replication study of newly identified type 1 diabetes risk loci",
+      "Although these are considered to be loci convincingly associated with susceptibility to type 2 diabetes in populations of European descent, other genes related to susceptibility to the disease are probably still unidentified, particularly those for populations of other ancestries.In order to uncover genetic variants that increase the risk of type 2 diabetes, we conducted a genome-wide association study in Japanese individuals with type 2 diabetes and unrelated controls.We first genotyped 268,068 SNPs, which covered approximately 56% of common SNPs in the Japanese, in 194 individuals with type 2 diabetes and diabetic retinopathy (case 1) and in 1,558 controls (control 1) collected in the BioBank Japan.We compared the allele frequencies of 207,097 successfully genotyped SNPs and selected the 8,323 SNPs showing the lowest P values.We then attempted to genotype these 8,323 SNPs in 1,367 individuals with type 2 diabetes and diabetic retinopathy (case 2) and for 1,266 controls (control 2) (stage 2), and successfully obtained data for 6,731 SNPs (the P value distribution in the second test is shown in Supplementary Fig. 1a online).The results of principal component analysis 8 in the stage 1 and 2 samples and HapMap samples revealed that there was no evidence for population stratification between the case and control groups throughout the present tests (Supplementary Fig. 
1b,c). We selected the 9 SNP loci showing P values <0.0001 (additive model in stage 2, Table 1) and genotyped a third set of cases and controls comprising 3,557 Japanese individuals with type 2 diabetes (cases 3,4,5) and 1,352 controls (controls 3,4). We evaluated the differences in the population structure among these three sets of case and two sets of control groups by Wright's F test. As the results indicated that there was no difference in the population structure among these groups (Supplementary Table 1b online), we combined these populations for the third test of case-control study. The third set of analysis identified the significant associations for six SNPs (Table 1), including the CDKAL1 locus at 6p22.3 (rs4712524, rs9295475 and rs9460546), the IGF2BP2 locus at 3q27.2 (rs6769511 and rs4376068) and the KCNQ1 locus at 11p15.5 (rs2283228). The remaining three SNPs (rs13259803, rs612774 and rs10836097) had P values of >0.05 in the third test and were not further examined. CDKAL1 and IGF2BP2 were previously reported as susceptibility genes for type 2 diabetes in the Japanese population 9 . Therefore, we focused on the KCNQ1 locus, which was highly associated with type 2 diabetes.",
+      "We consider these data as an interesting preliminary result that surely requires additional independent studies including a higher number of patients in order to confirm and clarify the possible contribution of this locus to the development of T2DM complications.",
+      "DISCUSSION  Taken together, our full second-stage approach and combined meta-analysis have revealed additional loci associated with type 1 diabetes. Clearly the risks are relatively modest compared with previously described associations, and it was only with this sample size at our disposal that we could detect and establish these signals as true positives through an independent validation effort.",
+      "Identification of susceptibility loci  The degree of evidence for all reported T2D loci was quantified as follows: a locus with a logarithm of odds ratio (LOD) score of 3 or more was considered significant, a LOD score between 2.2 and 3 was considered suggestive and a LOD score between 1 and 2.2 was considered nominal.For T2D, only those loci were included that were significant at least once, or were suggestive in at least one study and at least nominal in two or more studies.The inclusion of the second category of loci was based on a study by Wiltshire et al. [72], in which it was postulated that locus counting is a useful additional tool for the evaluation of genome scan data for complex trait loci.We used the same two criteria to determine the loci from the five papers published on obesity since 2004 and combined these loci with those from Bell et al. [7].As obesity phenotypes, BMI, serum leptin levels, abdominal subcutaneous and visceral fat, and percentage body fat were included.All of these phenotypes were used as continuous quantitative traits, as well as with various cut-off levels.",
+      "Today, more than 100 loci for type 2 diabetes and glycemic traits have been identified through numerous GWA studies of common and rare variation in populations of diverse ancestral origins [31]; however, to date, very few GWA studies have been published in cohorts of Mexican ancestry. The first GWA study performed in a non-European cohort was published in 2007 and comprised 561 Mexican American type 2 diabetes cases and controls drawn from the Starr County Health Studies [32]. Although no loci reached genome-wide significance, several loci identified in prior GWA studies in Europeans were replicated [32]. This analysis was subsequently expanded (N = 1273) and meta-analyzed with a cohort from Mexico City (N = 1310) in 2011 [33,34]. The most significant variants observed in this meta-analysis included known regions near HNF1A and KCNQ1. Top association signals were then meta-analyzed with the DIAGRAM and DIAGRAM+ datasets of European ancestry individuals, resulting in two regions reaching genome-wide significance: HNF1A and CDKN2A/CDKN2B (Table 1). Top association signals in both studies were annotated to explore their roles as expression quantitative trait loci (eQTL) in both adipose and muscle tissues, revealing a marked excess of trans-acting eQTL in top signals in both tissue types.",
+      "75\u201379 The main conclusion is that there is no major locus for T2D (analogous to HLA in type 1 diabetes). This is not surprising given the modest \u03bbs for T2D (approximately 3.5 in Europeans), imposing a limit on the magnitude of any single gene effect. 4 Many scans have consequently been significantly underpowered to detect the modest gene effects anticipated. Certainly, few T2D scans have reported linkages meeting the established criteria for genomewide significance. 80 This modest power, combined with the diversity of the pedigrees sampled and the analytical techniques used, means that the replication of positive findings between data sets has been the exception rather than the rule.",
+      "Quantitative Trait Analysis  Exploration of putative T2DM variants with quantitative glycemic traits in a subset of African-American samples (n = 671 from the IRAS and IRASFS control samples, Table S5) revealed     limited insight into the biological mechanism associated with T2DM risk.In addition, the five putative African-American T2DM susceptibility loci were tested for association with quantitative measures of glucose homeostasis in the European Caucasian population, in silico, by the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC; [16]).These results did not provide further insight into the probable role these variants may have in disease susceptibility (Table S6).The most significantly associated SNP in African Americans, rs7560163, failed quality controls filters and was not included in analysis likely due to being monomorphic as seen in a representative Caucasian population from the HapMap project (Table S4).",
+      "Discovery of novel loci for T2D susceptibility.We tested for T2D association with ~27 million variants passing quality-control filters, ~21 million of which had a minor allele frequency (MAF) < 5%.Our meta-analysis identified variants at 231 loci reaching genomewide significance (P < 5 \u00d7 10 \u22128 ) in the BMI-unadjusted analysis (N eff 231,436) and 152 in the smaller (N eff 157,401) BMI-adjusted analysis.Of the 243 loci identified across these two analyses, 135 mapped outside regions previously implicated in T2D risk (Methods, Fig. 1 and Supplementary Table 2).",
+      "Genetic studies performed since 2012 have identified many additional T2D loci based on risk alleles common in one population but less common in others.Studies in African Americans identified RND3-RBM43 (28), HLA-B and INS-IGF2 (29).Studies in South Asians identified TMEM163 (30) and SGCG (31).One locus, SLC16A11-SLC16A13, was simultaneously identified in Japanese and Mexican Americans (32,33), and studies in East Asians identified ANK1 (34), GRK5 and RASGRP1 (35), LEP and GPSM1 (32), and CCDC63 and C12orf51 (36).A study of individuals from Greenland identified TBC1D4 (37), and a sequencing-based study of Danes with follow-up in other Europeans identified MACF1 (38).Finally, the largest GWAS to date in American Indians identified DNER at near genome-wide significance (P = 6.6 \u00d7 10 \u22128 ) (39).Three of these studies imputed GWAS data using the 1000 Genomes Project sequence-based reference panels, providing better genome coverage (29,32,33,40).Taken together, these studies highlight the value of diverse populations, including founder and historically isolated populations, to detect risk loci.",
+      "Finally, a recent study identified additional susceptibility loci for type 2 diabetes by performing a meta-analysis of three published GWAs. 21 As acknowledged by the authors, GWAs are limited by the modest effect sizes of individual common variants and the need for stringent statistical thresholds. Thus, by combining data involving 10,128 samples, the authors found in the initial stages of the analysis highly associated variants (they followed only 69 signals out of over 2 million meta-analyzed SNPs) with P values <10 \u22124 in unknown loci, and 11 of these type 2 diabetes-associated SNPs were taken forward to further stages of analysis. Large stage replication testing allowed the detection of at least six previously unknown loci with robust evidence for association with type 2 diabetes.",
+      "Surprisingly, data about previous published loci associated with type 2 diabetes were not sufficiently powerful to reach a significant P value in individual scans.For example, variants at SLC30A8 and PPARG were significantly associated with type 2 diabetes only when pooling all the GWAs data, whereas in a single genome scan (DGI), no gene showed a positive signal (P value: 0.92 and 0.83, respectively).Thus, this may suggest that GWAs are still underpowered to find SNPs with small effect size.",
+      "Background: The two genome-wide association studies published by us and by the Wellcome Trust Case-Control Consortium (WTCCC) revealed a number of novel loci, but neither had the statistical power to elucidate all of the genetic components of type 1 diabetes risk, a task for which larger effective sample sizes are needed. Methods: We analysed data from two sources: (1) The previously published second stage of our study, with a total sample size of the two stages consisting of 1046 Canadian case-parent trios and 538 multiplex families with 929 affected offspring from the Type 1 Diabetes Genetics Consortium (T1DGC); (2) the Rapid Response 2 (RR2) project of the T1DGC, which genotyped 4417 individuals from 1062 non-overlapping families, including 2059 affected individuals (mostly sibling pairs) for the 1536 markers with the highest statistical significance for type 1 diabetes in the WTCCC results. Results: One locus, mapping to a linkage disequilibrium (LD) block at chr15q14, reached statistical significance by combining results from two markers (rs17574546 and rs7171171) in perfect LD with each other (r 2 = 1). We obtained a joint p value of 1.3 \u00d7 10 \u22126 , which exceeds by an order of magnitude the conservative threshold of 3.26 \u00d7 10 \u22125 obtained by correcting for the 1536 single nucleotide polymorphisms (SNPs) tested in our study. Meta-analysis with the original WTCCC genome-wide data produced a p value of 5.83 \u00d7 10 \u22129 . Conclusions: A novel type 1 diabetes locus was discovered. It involves RASGRP1, a gene known to play a crucial role in thymocyte differentiation and T cell receptor (TCR) signalling by activating the Ras signalling pathway.",
+      "Finally, we examined whether genes identified using our association studies were enriched within diabetes-related pathways.We collated a list of 42 genes to which 53 CpG sites associated with T2D traits (CS score \u22651.77, combined P < 0.017) mapped.Even in this small dataset, pathway analysis (Supplementary Material, Table S12) indicated significant enrichment in 31 pathways (Fisher's exact P < 0.05), including those related to circadian clock (P = 0.005), adipocytokine signaling (P = 0.009), leptin pathway (P = 0.023), HDL-mediated lipid transport (P = 0.031) and insulin signaling (P = 0.033).",
+      "In recent years, progress has been made in following up mechanistic studies of GWAS type 2 diabetes-association signals [6,7,9,[25][26][27][28][29][30], but challenges remain in sifting through the many associated variants at a locus to identify those influencing disease. We hypothesized that a common variant with modest effect underlies the association at the CDC123/CAMK1D locus and evaluated the location of high LD variants (r 2 \u2265 0.7; n = 11) at the locus relative to known transcripts and to putative DNA regulatory elements. We identified two variants that overlapped putative islet and/or liver regulatory regions and none located in exons. We did not assess variants in lower LD (r 2 < 0.7), and additional functional SNPs may exist at this locus acting through alternate functional mechanisms untested in the current study.",
+      "Meta-analysis results for T2D SNPs for insulin and glucose-related traits.",
+      "By combining genome-wide association data from 8,130 individuals with type 2 diabetes (T2D) and 38,987 controls of European descent and following up previously unidentified meta-analysis signals in a further 34,412 cases and 59,925 controls, we identified 12 new T2D association signals with combined P < 5 \u00d7 10 \u22128 .These include a second independent signal at the KCNQ1 locus; the first report, to our knowledge, of an X-chromosomal association (near DUSP9); and a further instance of overlap between loci implicated in monogenic and multifactorial forms of diabetes (at HNF1A).The identified loci affect both beta-cell function and insulin action, and, overall, T2D association signals show evidence of enrichment for genes involved in cell cycle regulation.We also show that a high proportion of T2D susceptibility loci harbor independent association signals influencing apparently unrelated complex traits."
+    ],
+    [
+      "Methods  Mouse models of diabetes. All animal studies were conducted according to a protocol approved by the Institutional Animal Care and Use Committee at the Beckman Research Institute of City of Hope. Male type-2 diabetic db/db mice (T2D leptin receptor deficient; Strain BKS.Cg-m +/+ lepr db/J) and genetic control non-diabetic db/+ mice (10-12 weeks old) were obtained from The Jackson Laboratory (Bar Harbor, ME) 11,17 . Male C57BL/6 mice (10 week old, The Jackson Laboratory) were injected with 50 mg/kg of STZ intraperitoneally on 5 consecutive days. Mice injected with diluent served as controls. Diabetes was confirmed by tail vein blood glucose levels (fasting glucose >300 mg/dl). Each group was composed of five to six mice. Mice were sacrificed at 4-5 or 22 (ref. 17) weeks post-induction of diabetes. Glomeruli were isolated from freshly harvested kidneys by a sieving technique 11,17 in which renal capsules were removed, and the cortical tissue of each kidney separated by dissection. The cortical tissue was then carefully strained through a stainless sieve with a pore size of 150 \u03bcm by applying gentle pressure. Enriched glomerular tissue below the sieve was collected and transferred to another sieve with a pore size of 75 \u03bcm. After several washes with cold PBS, the glomerular tissue remaining on top of the sieve was collected. Pooled glomeruli were centrifuged, and the pellet was collected for RNA, protein extraction or for preparing MMCs 11,17 . Male Chop-KO mice were also obtained from the Jackson Laboratory (B6.129S(Cg)-Ddit3 tm2.1Dron /J). Based on our previous experience, sample size was determined to have enough power to detect an estimated difference between two groups. With minimum sample size of 5 in each group, the study can provide at least 80% power to detect an effect size of 2 between diabetic and non-diabetic groups or treated and untreated groups at the 0.05 significant level using two-sided t-test. Since we expected 
larger variation between groups especially for the mice with oligo-injection, we used more than 5 mice in each group (with 6 mice in each group, we have 80% power to detect an effect size of 1.8 at the 0.05 confidence level).Our actual results with current sample size did show statistical significance for majority of the miRNAs in the cluster.Histopathological and biochemical analysis of tissues or cells derived from animal models were performed by investigators masked to the genotypes or treatments of the animals.",
+      "In these models, adult offspring of diabetic animals were noted to have normal development of the endocrine pancreas (Aerts et al., 1997; Ma et al., 2012). However, they develop glucose intolerance and impaired insulin response to glucose challenge, and display insulin resistance, mainly in the liver and muscle, highlighting the presence of both insulin resistance and \u03b2-cell dysfunction (Aerts et al., 1988; Holemans et al., 1991a,b). The key role of the intrauterine environment was demonstrated by a series of embryo transfer experiments, which showed that the diabetes risk in a low genetic risk strain can be substantially increased by the hyperglycaemic environment of a dam with a high genetic risk of diabetes (Gill-Randall et al., 2004).",
+      "Diabetes-obesity syndromes in rodents",
+      "However, in other contexts, B6 mice are more likely than D2 to spontaneously develop diabetic syndromes, indicating that risk factors exist on both genetic backgrounds [29]. QTL mapping studies indicate that these murine metabolic traits have a complex genetic architecture that is not dominated by any single allele [29\u201331], much like humans [32, 33]. Prior work identified candidate genes on Chr 13 that might underlie diabetes-related traits, including RASA1, Nnt, and PSK1. RASA1 shows strong sequence differences between B6 and D2 strains [34]. Rasche et al.",
+      "Other diet-induced rodent models of type 2 diabetes. Although rats and mice are the most commonly used models for studies of type 2 diabetes, other rodents have also been identified as useful models. These include the desert gerbil and the newly described Nile grass rat, both of which tend to develop obesity in captivity.",
+      "Summary of rodent models of type 2 diabetes",
+      "Since the obesity is induced by environmental manipulation rather than genes, it is thought to model the human situation more accurately than genetic models of obesity-induced diabetes. High fat feeding is often used in transgenic or knock-out models, which may not show an overt diabetic phenotype under normal conditions, but when the beta cells are 'pushed', the gene may be shown to be of importance. It should be noted that the background strain of the mice can determine the susceptibility to diet-induced metabolic changes, and thus, effects could be missed if a more resistant strain is used (Surwit et al., 1995; Bachmanov et al., 2001; Almind and Kahn, 2004). It has also been reported that there is heterogeneity of the response to high fat feeding within the inbred C57BL/6 strain, indicating that differential responses to a high-fat diet are not purely genetic (Burcelin et al., 2002).",
+      "Other considerations and limitations  A myriad of factors affect animal experiments. Men elicit a greater stress response in mice than women 292, likely confounding feeding behaviour. Rodents from different production facilities (for example, Jackson Laboratory and Taconic) have unique gut microbiotas 293, perhaps contributing to differences in their susceptibility to DIO and related diabetic complications 293. Similarly, cage position within a rack of cages, single versus group housing, the skill level of the researcher, ambient room temperature or the type of cage bedding can all affect experimental outcomes.",
+      "We believe there are several factors that researchers should consider when conducting obesity and diabetes mellitus research in rodents (FIG. 2). Although our list is by no means exhaustive, it demonstrates the complexity and interconnectedness of the myriad of factors that can confound experimental outcomes. Although it is impossible to control for everything, researchers should accurately detail all experimental conditions and methods to allow for better interpretation of the results and, importantly, for better reproducibility.",
+      "Figure 2 | Important experimental parameters and potential confounders of experimental outcomes in obesity and diabetes research and their interrelatedness. Countless factors influence experimental outcomes when using animal models, and what is enumerated here is by no means a complete list. This figure is one depiction of the multifactorial and interconnected genetic and environmental matrix that makes it virtually impossible to design the perfect experiment. For example, single-housing mice to obtain more accurate food intake data introduces a stress that in turn affects food intake. The severity of this stress response is both strain-specific and sex-dependent. What is important is to be aware of these challenges and to control for them in the most optimal manner. It is equally, if not more, important to accurately and comprehensively detail all experimental conditions in research papers, as these have bearing on the interpretation and reproducibility of the published results. DIO, diet-induced obesity.",
+      "Another concern pertains to control mice.Compared with free-living mice in the wild, laboratory control mice with ad libitum access to food are sedentary, overweight, glucose intolerant and tend to die at a younger age 297 .Comparisons between mice with DIO and control mice might be analogous to investigating the genetic cause of obesity-resistance by comparing humans who are overweight or obese.This potential problem with control mice could explain why the use of DIO diets that have 40% to 60% of total energy from fat is so prevalent, as this might be necessary to achieve divergent weight gains.With free access to running wheels, C57BL/6J mice voluntarily run 5-10 km per day 298,299 .As is the case with humans 300 , mice get health benefits from regular physical activity including weight loss, decreased adiposity and improved insulin sensitivity 301,302 .Physical activity might also affect the epigenome over several generations 303 .An enriched physical and social cage environment alone improves leptin sensitivity and energy expenditure in mice, independent of physical activity 304,305 .Overall, these data suggest that with standard mouse husbandry, chow-fed laboratory mice are not the ideal healthy and lean control group for meaningful obesity research.",
+      "To better address these points, various animal models have been developed. For example, using HFD-T2DM male rats, the F1 female offspring showed reduced \u03b2 cell area and insulin secretion, together with glucose intolerance, without changes in body weight [145]. The islets of the F1 female offspring showed differential expression of many genes involved in Ca2+, mitogen-activated protein kinase and Wnt signaling, apoptosis and cell cycle regulation [145]. Similarly, in pregnant C57BL6J mice, food deprivation resulted in \u03b2 cell mass reduction and an increased risk of \u03b2 cell failure in offspring [146].",
+      "They are probably typical of those few mice that develop diabetes more slowly and do not tax the pancreatic insulin supply as severely early in the course of the disease. Attempts at therapy. Attempts to keep the weight of diabetic mice within normal limits by total or partial food restriction resulted in premature deaths. After it was discovered that gluconeogenesis is greatly increased in diabetic mice, attempts were made to regulate blood sugar levels and also weight gain by feeding rations devoid of carbohydrate.",
+      "The degree of dependence of adiposity, hyperglycemia, and islet hypertrophy on food consumption varies among these mice, but in all, the increase in islet volume and consequent \u03b2-cell hyperplasia appears to be an effective means of maintaining blood sugar concentrations at near normal levels. In contrast, neither the diabetic sand rat [5] nor the diabetic mouse has hypertrophied islets and neither effectively controls blood sugar levels.",
+      "HV~MEI,: Studies with the Mutation, Diabetes  almost undetectable. Similarly, the activities of citrate lyase and glucose-6-phosphate dehydrogenase were greatly decreased in these older diabetic as compared  Diabetologia  the diabetic mice have attained maximum weight, after which no further accumulation of adipose tissue is noted. Fig. 8.",
+      "Rodent models of monogenic obesity and diabetes  Obesity and the consequent insulin resistance is a major harbinger of Type 2 diabetes mellitus in humans. Consequently, animal models of obesity have been used in an attempt to gain insights into the human condition. Some strains maintain euglycaemia by mounting a robust and persistent compensatory \u03b2-cell response, matching the insulin resistance with hyperinsulinaemia. The ob/ob mouse and fa/fa rats are good examples of this phenomenon. Others, such as the db/db mouse and Psammomys obesus (discussed later), rapidly develop hyperglycaemia as their \u03b2-cells are unable to maintain the high levels of insulin secretion required throughout life. Investigation of these different animal models may help explain why some humans with morbid obesity never develop Type 2 diabetes whilst others become hyperglycaemic at relatively modest levels of insulin resistance and obesity.",
+      "As with the KK mouse, the Israeli sand rat model is particularly useful when studying the effects of diet and exercise [120] on the development of Type 2 diabetes.",
+      "Animal models of diabetes in pregnancy and the role of intrauterine environment  Another important field of diabetes research that has relied heavily on animal experimentation is the study of diabetes in pregnancy and the role of the intrauterine environment on the subsequent development of diabetes amongst offspring.",
+      "Animal models of Type 2 diabetes mellitus",
+      "Assessment of Diabetes  Mice were monitored for the development of diabetes as described previously (Wicker et al. 1994)."
+    ],
+    [
+      "Methods  Mouse models of diabetes. All animal studies were conducted according to a protocol approved by the Institutional Animal Care and Use Committee at the Beckman Research Institute of City of Hope. Male type-2 diabetic db/db mice (T2D leptin receptor deficient; Strain BKS.Cg-m +/+ lepr db/J) and genetic control non-diabetic db/+ mice (10-12 weeks old) were obtained from The Jackson Laboratory (Bar Harbor, ME) 11,17. Male C57BL/6 mice (10 weeks old, The Jackson Laboratory) were injected with 50 mg kg\u22121 of STZ intraperitoneally on 5 consecutive days. Mice injected with diluent served as controls. Diabetes was confirmed by tail vein blood glucose levels (fasting glucose >300 mg dl\u22121). Each group was composed of five to six mice. Mice were sacrificed at 4-5 or 22 (ref. 17) weeks post-induction of diabetes. Glomeruli were isolated from freshly harvested kidneys by a sieving technique 11,17 in which renal capsules were removed, and the cortical tissue of each kidney separated by dissection. The cortical tissue was then carefully strained through a stainless sieve with a pore size of 150 \u00b5m by applying gentle pressure. Enriched glomerular tissue below the sieve was collected and transferred to another sieve with a pore size of 75 \u00b5m. After several washes with cold PBS, the glomerular tissue remaining on top of the sieve was collected. Pooled glomeruli were centrifuged, and the pellet was collected for RNA, protein extraction or for preparing MMCs 11,17. Male Chop-KO mice were also obtained from the Jackson Laboratory (B6.129S(Cg)-Ddit3 tm2.1Dron /J). Based on our previous experience, sample size was determined to have enough power to detect an estimated difference between two groups. With a minimum sample size of 5 in each group, the study can provide at least 80% power to detect an effect size of 2 between diabetic and non-diabetic groups or treated and untreated groups at the 0.05 significance level using a two-sided t-test. Since we expected 
larger variation between groups especially for the mice with oligo-injection, we used more than 5 mice in each group (with 6 mice in each group, we have 80% power to detect an effect size of 1.8 at the 0.05 confidence level). Our actual results with current sample size did show statistical significance for the majority of the miRNAs in the cluster. Histopathological and biochemical analyses of tissues or cells derived from animal models were performed by investigators masked to the genotypes or treatments of the animals.",
+      "Diabetes incidence study. Mice were kept for 20-28 weeks and tested for diabetes monthly by blood glucose and weekly by urine assessment, with a positive indication being followed by twice-weekly blood testing.Mice were diagnosed as diabetic when the blood glucose concentration was over 260 mg/dl (14.4 mM) after 2-3 h of fasting for two sequential tests.Glucose and insulin tolerance tests were performed by injecting glucose (2 g/kg body weight) or insulin (1 U/kg body weight) intraperitoneally in mice fasted for 6-7 h.Tail vein blood was tested by a Contour glucometer.Assessments of plasma insulin, proinsulin and C-peptide levels were performed using commercial ELISA kits, according to the manufacturer's instructions (insulin, proinsulin and C-peptide mouse ELISA kits, R&D Systems Quantikine).Assays were performed with blinding, with mice coded by number until experimental end.",
+      "Animal group and study design  First, one set of animals comprising 12-week-old male type 2 diabetic db/db (C57BL/KsJ-db\u2212/db\u2212, n = 8) and contemporary control wild-type (C57BL/KsJ-db+/db\u2212, n = 8) mice (Jackson Laboratories) were included in this study.Their weights and blood glucose levels were analysed to eliminate variation.Erectile functions of the animals were evaluated by the apomorphine-induced penile erection test, according to a previously described protocol (Pan et al. 2014).Afterwards, intracavernous pressure (ICP) investigations and histological measurements were applied to further confirm the results of the function tests.Then, all mice were sacrificed and the corpus cavernosum (CC) was collected from each mouse.Because the tissue of the CC is difficult to crush, we randomly collected the CCs from two mice and mixed them into one subgroup.As a result, four diabetic subgroups (DB groups) and four normal control subgroups (NC groups) were used for molecular measurements.Second, another set of animals, including three T2DMED and three normal control mice that were independent from the original set of animals, were included in the validation experiments using qRT-PCR.Third, another separate set of animals, including five T2DMED and five control mice, were used to verify one of the predicted targets, IGF-1, using ELISA.A luciferase reporter assay was performed to verify the binding of the differentially expressed miRNAs to the target gene IGF-1.All procedures were approved by the Institutional Animal Care and Use committee at Nanjing Medical University.",
+      "Summary of rodent models of type 2 diabetes",
+      "Summary of rodent models of type 1 diabetes",
+      "Knock-out and transgenic mice in diabetes research  Transgenic mice have been used to create specific models of type 1 and type 2 diabetes, including hIAPP mice, humanized mice with aspects of the human immune system and mice allowing conditional ablation of beta cells, as outlined above. Beta cells expressing fluorescent proteins can also provide elegant methods of tracking beta cells for use in diabetes research (Hara et al., 2003).",
+      "Genetically induced insulin-dependent diabetes  AKITA mice. The AKITA mouse was derived in Akita, Japan from a C57BL/6NSlc mouse with a spontaneous mutation in the insulin 2 gene preventing correct processing of proinsulin. This causes an overload of misfolded proteins and subsequent ER stress. This results in a severe insulin-dependent diabetes starting from 3 to 4 weeks of age, which is characterized by hyperglycaemia, hypoinsulinaemia, polyuria and polydipsia. Untreated homozygotes rarely survive longer than 12 weeks. The lack of beta cell mass in this model makes it an alternative to streptozotocin-treated mice in transplantation studies (Mathews et al., 2002). It has also been used as a model of type 1 diabetic macrovascular disease (Zhou et al., 2011) and neuropathy (Drel et al., 2011). In addition, this model is commonly used to study potential alleviators of ER stress in the islets and in this respect models some of the pathology of type 2 diabetes (Chen et al., 2011).",
+      "To achieve a slow pathogenesis of T2DM, young adult mice 284 or rats 285 are fed a high-fat or Western diet to elicit DIO and insulin resistance. Single or multiple injections with low-dose streptozotocin (~30-40 mg/kg intraperitoneally) then elicit partial loss of \u03b2-cells, which results in hypoinsulinaemia and hyperglycaemia. Protocols are being continuously refined and likely differ between species and even strains 283. The HFD streptozotocin rat is sensitive to metformin, further demonstrating the utility of this model 285. Downsides of streptozotocin treatment include liver and kidney toxicity and mild carcinogenic adverse effects (TABLE 1).",
+      "Materials and methods 2.1 Mouse models 2.1.1 Mouse strains 2.1.2 Induction of type 1 diabetes 2.1.3 Insulin treatment on diabetic mice 2.1.4 Akita mouse genotyping 2.2 Characterization of diabetic nephropathy in mice 2.2.1 Proteinuria measurement 2.2.2 Glomerular cells quantification 2.2.3 Methenamine silver staining quantification",
+      "ii) Rodent models of diabetic retinopathy",
+      "There are some good reviews available in the literature describing the transgenic/knockout animal models of type 2 diabetes [114][115][116][117][118]. The transgenic and knockout models are developed for studying the role of genes and their effects on peripheral insulin action such as insulin receptor, IRS-1, IRS-2, glucose transporter (GLUT 4), peroxisome proliferator activated receptor-\u03b3 (PPAR-\u03b3) and tumour necrosis factor-\u03b1 (TNF-\u03b1) as well as in insulin secretion such as GLUT-2, glucokinase (GK), islet amyloid polypeptide (IAPP) and GLP-1 and in hepatic glucose production (expression of PEPCK) associated with development of type 2 diabetes. Further, combination or double knockout mouse models including defect in insulin action and insulin secretion (e.g., IRS-1 +/- / GK +/- double knockout) have been produced which clearly illustrate the mechanisms associated with development of insulin resistance and beta cell dysfunction leading to overt hyperglycaemic state in human type 2 diabetes. These genetically modified animals exhibit various phenotypic features of type 2 diabetes varying from mild to severe hyperglycaemia, insulin resistance, hyperinsulinaemia, impaired glucose tolerance and others as explained in detail elsewhere 6,9,[114][115][116][117][118]. Very recently, tissue specific knockout mouse models have been achieved, allowing further insight into the insulin action with respect to particular target tissues (muscle, adipose tissue and liver) associated with insulin resistance and type 2 diabetes 115,117,118. The transgenic/knockout animals are currently used mostly for the mechanistic study in diabetes research and not usually recommended for screening programme as they are more complicated and costly.",
+      "Functional deficits refs  Non-Alzheimer-disease mouse [71][72][73][74]76,78,81,85,87 and rat 59,75,77 ,79,95,97  Mouse [81][82][83][84][85] and rat 79,111  Cerebral effects of inducing diabetes or insulin resistance in normal rodents (that is, non-Alzheimer-disease rodent models) and in rodents genetically modified to accumulate amyloid-\u03b2 in the brain (that is, rodent models of Alzheimer disease). Common interventions to induce diabetic conditions in rodents included recessive mutations in the leptin gene (Lep; also known as Ob), defects in the leptin receptor (LEPR; also known as OB-R), diet and administration of streptozotocin. Rodents with pancreatic overexpression of human amylin spontaneously develop both type 2 diabetes mellitus and dementia-like pathology.",
+      "Animal models have been used extensively in diabetes research.Early studies used pancreatectomised dogs to confirm the central role of the pancreas in glucose homeostasis, culminating in the discovery and purification of insulin.Today, animal experimentation is contentious and subject to legal and ethical restrictions that vary throughout the world.Most experiments are carried out on rodents, although some studies are still performed on larger animals.Several toxins, including streptozotocin and alloxan, induce hyperglycaemia in rats and mice.Selective inbreeding has produced several strains of animal that are considered reasonable models of Type 1 diabetes, Type 2 diabetes and related phenotypes such as obesity and insulin resistance.Apart from their use in studying the pathogenesis of the disease and its complications, all new treatments for diabetes, including islet cell transplantation and preventative strategies, are initially investigated in animals.In recent years, molecular biological techniques have produced a large number of new animal models for the study of diabetes, including knock-in, generalized knock-out and tissue-specific knockout mice.",
+      "Animal models of Type 2 diabetes mellitus",
+      "As with the KK mouse, the Israeli sand rat model is particularly useful when studying the effects of diet and exercise [120] on the development of Type 2 diabetes.",
+      "Animal models of Type 1 diabetes",
+      "Animal models have been used extensively in diabetes research.Early studies used pancreatectomised dogs to confirm the central role of the pancreas in glucose homeostasis, culminating in the discovery and purification of insulin.Today, animal experimentation is contentious and subject to legal and ethical restrictions that vary throughout the world.Most experiments are carried out on rodents, although some studies are still performed on larger animals.Several toxins, including streptozotocin and alloxan, induce hyperglycaemia in rats and mice.Selective inbreeding has produced several strains of animal that are considered reasonable models of Type 1 diabetes, Type 2 diabetes and related phenotypes such as obesity and insulin resistance.Apart from their use in studying the pathogenesis of the disease and its complications, all new treatments for diabetes, including islet cell transplantation and preventative strategies, are initially investigated in animals.In recent years, molecular biological techniques have produced a large number of new animal models for the study of diabetes, including knock-in, generalized knock-out and tissue-specific knockout mice.",
+      "Rodent models of monogenic obesity and diabetes  Obesity and the consequent insulin resistance is a major harbinger of Type 2 diabetes mellitus in humans. Consequently, animal models of obesity have been used in an attempt to gain insights into the human condition. Some strains maintain euglycaemia by mounting a robust and persistent compensatory \u03b2-cell response, matching the insulin resistance with hyperinsulinaemia. The ob/ob mouse and fa/fa rats are good examples of this phenomenon. Others, such as the db/db mouse and Psammomys obesus (discussed later), rapidly develop hyperglycaemia as their \u03b2-cells are unable to maintain the high levels of insulin secretion required throughout life. Investigation of these different animal models may help explain why some humans with morbid obesity never develop Type 2 diabetes whilst others become hyperglycaemic at relatively modest levels of insulin resistance and obesity.",
+      "Introduction  Animal experimentation has a long history in the field of diabetes research. The aim of this article is to review the commonly used animal models and discuss the recent technological advances that are being employed in the discipline. The review is based on an extensive literature search using the terms rodent, mouse, rat, animal model, transgenics, knockout, diabetes and pathogenesis, in scientific journal databases such as MEDLINE\u00ae. In addition, abstracts presented at meetings of Diabetes UK, the European Association for the Study of Diabetes and the American Diabetes Association over the last 5 years were examined in order to gain an appreciation of recent and ongoing research projects.",
+      "Assessment of Diabetes  Mice were monitored for the development of diabetes as described previously (Wicker et al. 1994)."
+    ],
+    [
+      "Subsequently, genetic dissection of the diabetes-associated traits in the male BC1 progeny obtained from a cross between (normal B6 female \u00d7 diabetic TH male)F1 female and diabetic TH male mice (B6 cross) was carried out. Because of the sexual dimorphism, with respect to NIDDM onset, we used diabetic TH male mice as breeders to ensure the presence of a mutant allele(s) and targeted our genetic dissection using only male BC1 progeny. In male BC1 mice hyperglycemia developed at approximately 20 weeks of age and was sustained through a 30-week period studied. Based on these data, we measured plasma glucose levels three times in biweekly intervals (to minimize phenotyping error) between 20 and 26 weeks of age, and the mean of the three measurements was used for genetic analysis. Body weights were measured at 20 weeks. At the end of the study (26 weeks), plasma insulin levels and nasal-anal lengths were measured, and the five regional fat pads were dissected and weighed from a subset of 133 mice. In total, 206 male BC1 mice were collected, and individual mice were genotyped with 92 SSLP markers at approximately 20-cM intervals (covering ~96% of the genome).",
+      "Effects of Inbred Background (Table 2).The syndrome produced in BL/Ks diabetes (db) mice, while similar in early development to that of BL/6 obese (ob) mice, has a more severe diabetes-like condition and a less pronounced obesity.However, both mutations when maintained on the same inbred background exhibit identical syndromes from 3 weeks of age on [9,21].Both diabetes and obese mice of the BL/Ks strain have the severe diabetes characterized by insulinopaenia and islet atrophy, whereas both mutations maintained on the BL/6 strain have mild diabetes characterized by islet hypertrophy and hyperplasia of the beta cells.Islet hypertrophy is either sustained or followed by atrophy depending on modifiers in the genetic background rather than the specific action of the mutant gene.The markedly different obesity-diabetes states exhibited when obese and diabetes mice are on different backgrounds points out the importance of strict genetic control in studies with all types of obese-hyperglycaemic mutants.Genetic studies [11] have shown that the modifiers leading to islet hypertrophy and well-compensated diabetes compatible with a near normal lifespan are dominant to those factors causing severe diabetes.Two other mutations, yellow and fat, cause similar diabetes-syndromes and yet have identical symptoms on both inbred backgrounds (Table 2).This may suggest that the primary insult caused by these mutations is not as severe as that for obese and diabetes and that this more gradual initiation of obesity permits the host genome to make a response (islet hypertrophy) compatible with life rather than islet atrophy, insulinopaenia, and life-shortening diabetes.",
+      "The animal models available for diabetes research (Table 1) are most often more like maturity-onset diabetes in man. Obesity is a consistent factor and insulinopaenia is rare. However, the time of gene expression at about two weeks of age is within the time period of juvenile expression. The severity and clinical course of the diabetes produced depends on the interaction of the mutant gene with the inbred background rather than the action of the gene itself. Thus on one inbred background a well-compensated, maturity-onset type diabetes, compatible with near normal life is observed whereas on another inbred background the syndrome presents as a juvenile-type diabetes with insulinopaenia, islet cell degeneration, marked hyperglycaemia, some ketosis and a much shortened lifespan. Unfortunately, vascular, retinal and the other complications of diabetes are not seen consistently in these rodent syndromes. It seems that the severely diabetic animal either does not live long enough to develop these complications or that rodents are particularly resistant to those complications that commonly afflict human diabetics. Several comprehensive bibliographies and excellent reviews of the various studies carried out with each of these syndromes in animals have been published [2,3,19,30,31,32]. This presentation will be restricted primarily to the research undertaken by my colleagues and myself with the two mouse mutations: diabetes (db) and obese (ob). Both mutations have been extensively studied by numerous investigators in attempts to define the primary lesion causing the syndrome. As yet, the primary defect remains elusive, although several possibilities are becoming increasingly plausible in the light of current research. Although the metabolic abnormalities associated with both obese and diabetes have many similarities with regard to the overall progression of the obesity-diabetes state, the documentation of two single genes on separate chromosomes makes it unlikely that the 
two syndromes are caused by the same primary lesion. However, the marked similarity between the two mutants when maintained on the same genetic background implies that the defects may occur in the same metabolic pathway.",
+      "Diabetes-obesity syndromes in rodents",
+      "The Diabetes (db) Mouse (Chromosome 4). Diabetes (db), an autosomal recessive mutation, occurred in the C57BL/KsJ (BL/Ks) inbred strain and on this background is characterized by obesity, hyperphagia, and a severe diabetes with marked hyperglycaemia [7,22]. Increased plasma insulin concentration is observed as early as 10 days of age [10]. The concentration of insulin peaks at 6 to 10 times normal by 2 to 3 months of age then drops precipitously to near normal levels. Prior to the fall in plasma insulin concentration, the most consistent morphological feature of the islets of Langerhans appears to be hyperplasia and hypertrophy of the beta cells in an attempt to produce sufficient insulin to control blood glucose concentration at physiological levels. The drop in plasma insulin concentration is concomitant with islet atrophy and rapidly rising blood glucose concentrations that remain over 400 mg per 100 ml until death at 5 to 8 months [7]. Compared with other obesity mutants the diabetic condition is more severe and the lifespan is markedly decreased.",
+      "Diabetes-related clinical traits for 275 B6XBTBR-ob/ob F2 male mice at 10 weeks of age.",
+      "Results  We generated an F2 inter-cross between diabetes-resistant (B6) and diabetes-susceptible (BTBR) mouse strains, made genetically obese in response to the Lep ob mutation [24]. The cross consisted of >500 mice, evenly split between males and females. A comprehensive set of ~5000 genotype markers were used to genotype each F2 mouse (~2000 informative SNPs were used for analysis), and the expression levels of ~40 K transcripts (corresponding to 25,901 unique genes) were monitored in five tissues (adipose, liver, pancreatic islets, hypothalamus, and gastroc (gastrocnemius muscle)) that were harvested from each mouse at 10 weeks of age. In addition to gene expression, several key T2D-related traits were determined for each mouse. The medians, and 1st and 3rd quartiles for the following traits: body weight, the number of islets harvested per pancreas, HOMA, plasma insulin, glucose, triglyceride, and C-peptide are listed in Table 1.",
+      "However, in other contexts, B6 mice are more likely than D2 to spontaneously develop diabetic syndromes, indicating that risk factors exist on both genetic backgrounds [29]. QTL mapping studies indicate that these murine metabolic traits have a complex genetic architecture that is not dominated by any single allele [29\u201331], much like humans [32, 33]. Prior work identified candidate genes on Chr 13 that might underlie diabetes-related traits, including RASA1, Nnt, and PSK1. RASA1 shows strong sequence differences between B6 and D2 strains [34]. Rasche et al.",
+      "Thus, there is a rich literature indicating strong genetic effects on glucose metabolism in the B6 and D2 genetic background, and a male-specific form of diabetes is known to spontaneously occur in hybrids of this strain. Dental traits The reported link between a Chr 13 locus and dental malocclusions [46] might provide an alternative or additional explanation of the associations we observe. Dental malocclusions were the only major male-specific cause of death we observed in this mouse population (20 % of males that died before the 750-day phenotyping tests, 0 % of females).",
+      "Obesity-associated diabetes (''diabesity'') in mouse strains is characterized by severe insulin resistance, hyperglycaemia and progressive failure, and loss of beta cells. This condition is observed in inbred obese mouse strains such as the New Zealand Obese (NZO/HlLt and NZO/HlBomDife) or the TALLYHO/JngJ mouse. In lean strains such as C57BLKS/J, BTBR T+ tf/J or DBA/2J carrying diabetes susceptibility genes (''diabetes susceptible'' background), it can be induced by introgression of the obesity-causing mutations Lep<ob> (ob) or Lepr<db> (db). Outcross populations of these models have been employed in the genome-wide search for mouse diabetes genes, and have led to positional cloning of the strong candidates Pctp, Tbc1d1, Zfp69, and Ifi202b (NZO-derived obesity) and Sorcs1, Lisch-like, Tomosyn-2, App, Tsc2, and Ube2l6 (obesity caused by the ob or db mutation). Some of these genes have been shown to play a role in the regulation of the human glucose or lipid metabolism. Thus, dissection of the genetic basis of obesity and diabetes in mouse models can identify regulatory mechanisms that are relevant for the human disease.",
+      "Polygenic basis of ''diabesity'' in mice: the interaction of obesity and diabetes genes Obesity-associated diabetes (''diabesity'') is due to interaction of genes causing obesity with diabetes genes. This conclusion is based on findings indicating that obesity is a necessary but not sufficient condition for the type 2 diabetes-like hyperglycaemia: Obese mice are insulin resistant and therefore more or less glucose intolerant, but in some strains such as C57BL/6J-ob/ob, insulin resistance is compensated by hyperinsulinemia and beta cell hyperplasia, and plasma glucose is only moderately elevated. Other models such as C57BLKS/J-db/db and NZO present overt diabetes mellitus as defined by a threshold of 16.6 mM (300 mg/dl) plasma glucose (Leiter et al. 1998); mice crossing this threshold usually exhibit progressive failure and subsequent apoptosis of beta cells. This type 2 diabetes-like condition is not due to the obesity-causing gene variants but to other genes in the genetic background of the strain, which cause obesity-associated diabetes. The severe and early onsetting diabetes of the C57BLKS/J-db/db strain is due to the C57BLKS/J background, since mice carrying the db mutation on the C57BL/6J background are not diabetic (Stoehr et al. 2000). Conversely, C57BL/6J-ob/ob mice are normoglycemic, whereas introgression of the ob mutation into the C57BLKS/J background produced a severely diabetic strain (Coleman 1978). Furthermore, it has been shown that in crosses of lean, normoglycaemic strains with diabetic strains the lean strain can introduce variants that markedly aggravate the diabetic phenotype (Leiter et al. 1998; Plum et al. 2000).",
+      "Obesity-associated diabetes (''diabesity'') in mouse strains is characterized by severe insulin resistance, hyperglycaemia and progressive failure, and loss of beta cells. This condition is observed in inbred obese mouse strains such as the New Zealand Obese (NZO/HlLt and NZO/HlBomDife) or the TALLYHO/JngJ mouse. In lean strains such as C57BLKS/J, BTBR T+ tf/J or DBA/2J carrying diabetes susceptibility genes (''diabetes susceptible'' background), it can be induced by introgression of the obesity-causing mutations Lep<ob> (ob) or Lepr<db> (db). Outcross populations of these models have been employed in the genome-wide search for mouse diabetes genes, and have led to positional cloning of the strong candidates Pctp, Tbc1d1, Zfp69, and Ifi202b (NZO-derived obesity) and Sorcs1, Lisch-like, Tomosyn-2, App, Tsc2, and Ube2l6 (obesity caused by the ob or db mutation). Some of these genes have been shown to play a role in the regulation of the human glucose or lipid metabolism. Thus, dissection of the genetic basis of obesity and diabetes in mouse models can identify regulatory mechanisms that are relevant for the human disease.",
+      "Spontaneous type 2 diabetic models  Spontaneously diabetic animals of type 2 diabetes may be obtained from the animals with one or several genetic mutations transmitted from generation to generation (e.g., ob/ob, db/db mice) or by selection from non-diabetic outbred animals by repeated breeding over several generations [e.g., Goto-Kakizaki (GK) rat, Tsumura Suzuki Obese Diabetes (TSOD) mouse]. These animals generally inherited diabetes either as single or multigene defects. The metabolic peculiarities result from a single gene defect (monogenic), which may be due to a dominant gene (e.g., Yellow obese or KK/Ay mouse) or a recessive gene (diabetic or db/db mouse, Zucker fatty rat), or they can be of polygenic origin [e.g., Kuo Kondo (KK) mouse, New Zealand obese (NZO) mouse] 13. Type 2 diabetes occurring in the majority of human beings is a result of interaction between environmental and multiple gene defects, though certain subtypes of diabetes do also exist with a well defined cause [i.e., maturity onset diabetes of youth (MODY) due to a defect in the glucokinase gene], and these single gene defects may cause type 2 diabetes only in a few cases.",
+      "Mice of the KK strain exhibit a multigenic syndrome of hyperphagia, moderate obesity, hyperinsulinemia, and hyperglycemia (Ikeda 1994; Nakamura and Yamada 1963, 1967; Reddi and Camerini-Davalos 1988). Most KK males develop non-insulin-dependent diabetes after 4 months of age (Leiter and Herberg 1997). While KK females are much less diabetes prone, they do become obese. Previous analyses indicate that the inheritance of obesity and diabetes phenotypes in KK mice is multigenic (Nakamura and Yamada 1963; Reddi and Camerini-Davalos 1988). In the present study, we have searched for QTLs affecting male and female adiposity and related traits in an intercross between strains KK and B6.",
+      "We have previously shown that diabetes traits show strong heritability in an F2 intercross between the diabetes-resistant C57BL/6 leptinob/ob and the diabetes-susceptible BTBR leptinob/ob mouse strains. We assume that the disease phenotype is brought about by a complex pattern of gene expression changes in key tissues [21,22]. However, we also recognize the complexity inherent in discriminating the gene expression changes that cause diabetes from those that occur as a consequence of the disease. For example, many genes are known to be responsive to elevated blood glucose levels [43].",
+      "Although the early onset of diabetes in db mice coincides with that in juvenile diabetes in man, the symptoms of obesity and elevated serum insulin are more suggestive of the pattern of development observed in the maturity-onset type of diabetes. As yet, none of the lesions associated with advanced diabetes in humans such as retinopathies, cardiovascular and kidney lesions have been observed, possibly because of the early onset of the diabetes and the relatively rapid deterioration and death of these mice.",
+      "Key-words: Spontaneous Diabetes, Genotype: C57BL/Ks-db, Diabetes in mice, Mutation: diabetes, Obesity, Prediabetes, Insulin in plasma, Insulin in pancreas.",
+      "Results All mice homozygous for the trait, diabetes (db), develop an abnormal and characteristic deposition of fat beginning at 3 to 4 weeks of age, making their early identification possible. The difference in size and appearance of litter-mate 6-week old mice, one normal and one diabetic, is shown in Fig. 1. Weight increases  Fig. 1. C57BL/Ks-db litter-mates at 6 weeks.",
+      "Diabetologia 3, 238-248 (1967)  Studies with the Mutation, Diabetes, in the Mouse* D. L. Coleman and Katharine P. Hummel  The Jackson Laboratory, Bar Harbor, Maine  Summary. The mutation, diabetes (db), that occurred in the C57BL/Ks strain of mice is a unit autosomal recessive gene with full penetrance, and causes metabolic disturbances in homozygous mice resembling diabetes mellitus in man.",
+      "To screen for genes that show correlation with different phenotypic outcome in diabetic mouse models, we used the cross-sectional design and performed microarray analysis on 24-wk-old STZ-treated and db/db mice with established renal pathology. In parallel with the functional genomics characterization, each individual mouse underwent a detailed renal phenotype analysis. Mice that were treated with low doses of STZ developed diabetes and moderately severe albuminuria (twice the control). In mice with C57B6/J background, the mesangial changes were mild or absent. Mice with 129SvJ genetic background developed significant glomerular changes. However, these were not significantly different from the age-matched controls (K. Sharma, K. Susztak, and E.P. B\u00f6ttinger, unpublished observations). The db/db mice became insulin resistant and developed diabetes at approximately 8 wk of age. Albuminuria was detected as early as 3 to 4 wk after the development of hyperglycemia. The glomerular histology was characterized by severe diffuse mesangial expansion, as previously reported (49)."
+    ]
+  ]
+}
\ No newline at end of file
diff --git a/gnqa/paper1_eval/src/data/datasets/old/diabetes_2_dataset.json b/gnqa/paper1_eval/src/data/datasets/old/diabetes_2_dataset.json
new file mode 100644
index 00000000..97480b71
--- /dev/null
+++ b/gnqa/paper1_eval/src/data/datasets/old/diabetes_2_dataset.json
@@ -0,0 +1,128 @@
+{
+  "question": [
+    "How do gene-environment interactions influence diabetes risk and progression?",
+    "What non-coding RNAs are involved in diabetes, and what roles do they play?",
+    "How do gene-environment interactions influence diabetes risk and progression?",
+    "Can we identify genetic predictors of diabetes complications?",
+    "What are the genetic bases for the varying efficacy of diabetes treatments among individuals?"
+  ],
+  "answer": [
+    "Gene-environment interactions influence diabetes risk and progression by the interplay of genetic predisposition and environmental factors such as diet, physical activity, and lifestyle. Certain genetic variants may increase the risk of type 2 diabetes (T2D), but this risk can be modified by environmental factors. For instance, the adverse effect of some T2D-associated genetic variants may be attenuated by higher physical activity levels or a healthy lifestyle. Conversely, low physical activity and dietary factors characterizing a Western dietary pattern may augment the risk. Understanding these interactions can help in the development of personalized prevention strategies and treatments for T2D.",
+    "MicroRNAs and long noncoding RNAs (lncRNAs) are involved in diabetes. MicroRNAs modulate post-transcriptional control of gene expression through degradation or translational repression of key messenger RNAs. They can regulate pathogenic responses such as angiogenesis, blood flow, neural cell dysfunction, tissue-specific inflammation and glucose metabolism. They also hold potential as diagnostic biomarkers and possible drug-targets for regulation of dysfunctional cell responses. LncRNAs are implicated in complications associated with diabetes, such as diabetic retinopathy and diabetic nephropathy. They can regulate cell proliferation, viability, migration, and the expression of pathological genes via post-transcriptional mechanisms.",
+    "Gene-environment interactions influence diabetes risk and progression by the interplay of genetic predisposition and environmental factors such as diet, physical activity, and lifestyle. Certain genetic variants may increase the risk of type 2 diabetes (T2D), but this risk can be modified by environmental factors. For instance, the adverse effect of some T2D-associated genetic variants may be attenuated by higher physical activity levels or a healthy lifestyle. Conversely, low physical activity and dietary factors characterizing a Western dietary pattern may augment the risk. Understanding these interactions can help in the development of personalized prevention strategies and treatments for T2D.",
+    "Yes, genetic predictors of diabetes complications can be identified. Studies have revealed several susceptibility loci for diabetic complications such as retinopathy and nephropathy. However, these genetic factors only explain a small proportion of the phenotypic variation observed in type 2 diabetes patients, indicating a need for the identification of more novel genetic risk factors.",
+    "The varying efficacy of diabetes treatments among individuals can be attributed to genetic variants present in drug receptors or drug metabolizers, such as OCT genes, KCNJ11, ABCC8, and CYP2C9. These genetic variants can influence the metabolism, transportation, and therapeutic mechanisms of antidiabetic drugs, leading to differences in drug disposition, glycemic response, tolerability, and incidence of adverse effects. Additionally, gene-gene, gene-environment, and gene-treatment interactions may also contribute to the variation in disease progression and response to therapy."
+  ],
+  "contexts": [
+    [
+      "Additional evidence supporting a potentially important role for environmental modulation of genetic risk was found in previous population studies. For example, although some of the GWAS-identified T2D loci could be replicated successfully in various populations (e.g., CDKAL1, HHEX, IGF2BP2, TCF7L2 and SLC30A8), more genetic variants have been identified only in some specific populations [26]. T2D risk alleles showed extreme directional differentiation between different populations compared with other common diseases [29]. Different T2D loci and loci frequencies across different populations may reflect the adaptation to the local environments and diets along with human migration [30]. Therefore, the interplay between gene and environment leads to a more complex pathogenesis of T2D and related traits. These hypotheses are strongly supported by a number of recent GxE studies [7,11,31,32]. For example, Qi et al. [31] generated a genetic risk score (GRS) using ten GWAS-identified SNPs and observed a significant interaction between the Western dietary pattern and GRS in the Health Professionals Follow-Up Study. The Western dietary pattern was only positively associated with risk of T2D among men with a high GRS, but not with low GRS subjects. Another large meta-analysis of 14 cohort studies [32] revealed that dietary whole-grain intake potentially interacted with one GCKR variant (rs780094) for fasting insulin in individuals of European descent. Greater whole-grain intake was associated with a smaller reduction of fasting insulin in individuals with the insulin-raising allele of rs780094, compared to the non-risk allele.",
+      "Gene\u2013exercise interaction in type 2 diabetes When studying gene\u2013environment interaction on the quantitative traits that underlie diabetes, the power to detect interaction is highly dependent on the precision with which non-genetic exposures are measured (Wareham et al 2002). Achievement of optimal glycaemic control is the focus of traditional treatment paradigms. Regular exercise, both aerobic (walking, jogging, or cycling) and resistance (weightlifting) training results in increased glucose uptake and insulin sensitivity and is a primary modality used in the treatment of type 2 diabetes patients (Sigal et al 2007).",
+      "Gene-Environment Interaction  Evidence from the epidemiology of T2D overwhelmingly supports a strong environmental influence interacting with genetic predisposition in a synergistic fashion as has been recently reviewed [123], however current state-of-the-art methods for measuring environmental effects lack precision and can result in changes in statistical power to detect interaction [123,124]. Since lifestyle factors are important in preventing diabetes [125,126], interaction of gene variants with measures of dietary intake and exercise have been selected for studies on gene-environment interaction. For example, HNF1B (rs4430796) was shown to interact with exercise; low levels of activity enhanced the risk of T2D in association with absence of the risk allele, but there was no protective effect of exercise when the allele was present. It follows that subgrouping by genotype may serve to enhance risk prediction while considering gene-environment interaction as has been done for exercise [127]. Also lifestyle including exercise modified the effect of a CDKN2A/B variant on 2-hour glucose levels in the Diabetes Prevention Program [128] but was not confirmed in the HERITAGE study using different measurements and phenotypes involving insulin sensitivity and \u03b2-cell function [129]. The pro12ala PPARG variant also interacts with physical activity for effect on 2-hour glucose levels [130], which was confirmed in the smaller HERITAGE study [129]. In addition, a relationship of dietary fat intake with plasma insulin and BMI differs by the pro12ala PPARG genotype [131].",
+      "A person's risk of type 2 diabetes or obesity reflects the joint effects of genetic predisposition and relevant environmental exposures. Efforts to determine whether these genetic and environmental components of risk interact (in the statistical sense that joint effects cannot be predicted from main effects alone) 70 face challenges associated with measuring relevant exposures (diet and physical activity being notoriously difficult to estimate) and the effect of imprecision on statistical power. 71 Although claims that statistical interactions reflect shared mechanisms (i.e., that the interacting factors act through the same pathways) are probably overstated, understanding the relative contributions of genetic and environmental components to risk is important. After all, environmental factors can be modified more readily than genetic factors. Genetic discoveries have provided a molecular basis for the clinically useful classification of monogenic forms of diabetes and obesity. 3,4 Will the same be true for the common forms of these conditions? Probably not: as far as the common variants are concerned, each patient with diabetes or obesity has an individual \"barcode\" of susceptibility alleles and protective alleles across many loci. It is possible to show that the genetic profiles of lean subjects with type 2 diabetes and obese subjects with type 2 diabetes are not identical, but these differences appear to be inadequate for clinically useful subclassification. 22,72 If efforts to uncover less prevalent, higher-penetrance alleles are successful, more precise classification of disease subtypes may become possible, particularly if genetic data can be integrated with clinical and biochemical information. For example, in persons presenting with diabetes in early adulthood, there are several possible diagnoses: various subtypes of maturity-onset diabetes of the young or mitochondrial diabetes, for example, as well as type 1 or type 2 diabetes. Assigning the correct diagnosis has both prognostic and therapeutic benefits for the patient (Table 3).",
+      "Genes, environment, and development of type 2 diabetes  Genes and the environment together are important determinants of insulin resistance and \u03b2-cell dysfunction (figure 2). Because changes in the gene pool cannot account for the rapid increase in prevalence of type 2 diabetes in recent decades, environmental changes are essential to understanding of the epidemic.",
+      "Type 2 diabetes (T2D) is thought to arise from the complex interplay of both genetic and environmental factors. Since the advent of genome-wide association studies (GWAS), we have seen considerable progress in our understanding of the role that genetics and gene-environment interactions play in the development of T2D. Recent work suggests that the adverse effect of several T2D loci may be abolished or at least attenuated by higher physical activity levels or healthy lifestyle, whereas low physical activity and dietary factors characterizing a Western dietary pattern may augment it. However, there still remain inconsistencies warranting further investigation. Lack of statistical power and measurement errors for the environmental factors continue to challenge our efforts for characterizing interactions. Although our recent focus on established T2D loci is reasonable, we may be overlooking many other potential loci not captured by recent T2D GWAS. Agnostic approaches to the discovery of gene and environment interactions may address this possibility, but their application to the field is currently limited and still faces conceptual challenges. Nonetheless, continued investment in gene-environment interaction studies through large collaborative efforts holds promise in furthering our understanding of the interplay between genetic and environmental factors.",
+      "Type 2 diabetes (T2D) is thought to arise from the complex interplay of both genetic and environmental factors. Since the advent of genome-wide association studies (GWAS), we have seen considerable progress in our understanding of the role that genetics and gene-environment interactions play in the development of T2D. Recent work suggests that the adverse effect of several T2D loci may be abolished or at least attenuated by higher physical activity levels or healthy lifestyle, whereas low physical activity and dietary factors characterizing a Western dietary pattern may augment it. However, there still remain inconsistencies warranting further investigation. Lack of statistical power and measurement errors for the environmental factors continue to challenge our efforts for characterizing interactions. Although our recent focus on established T2D loci is reasonable, we may be overlooking many other potential loci not captured by recent T2D GWAS. Agnostic approaches to the discovery of gene and environment interactions may address this possibility, but their application to the field is currently limited and still faces conceptual challenges. Nonetheless, continued investment in gene-environment interaction studies through large collaborative efforts holds promise in furthering our understanding of the interplay between genetic and environmental factors.",
+      "Gene and Environment Selection  Environmental factors selected for recent G \u00d7 E interactions studies continue to be the established modifiable risk factors for T2D such as obesity, physical activity, dietary fat, and carbohydrate quality as well as measures of pre- and post-uterine environment. The genetic factors selected, however, have shifted from biological candidates based on functional evidence to genome-wide established loci for T2D or related traits (Table 1). This approach may improve power to detect and strengthen causal inference for an interaction (49). Focusing on established T2D loci may also further our understanding of their functional role in disease development in addition to their public health relevance in the context of genetic risk modification (13).",
+      "We have seen considerable progress in our understanding of the role that both environment and genetics play in the development of T2D. Recent work suggests that the adverse effect of some established T2D-associated loci may be greatly attenuated by appropriate changes in certain lifestyle factors. Our recent approach to studies of G \u00d7 E interactions in T2D has gained considerable advantage over previous approaches, but it is clearly not optimal. Lack of statistical power and measurement error for environmental factors will continue to challenge our efforts to characterize G \u00d7 E interactions. Although our recent focus on established T2D loci is reasonable, we may be overlooking many other potential loci not captured by recent T2D GWAS. Agnostic approaches to the discovery of G \u00d7 E interactions may address this possibility, but their application to the field is currently limited and still faces conceptual challenges. Nevertheless, large collaborative efforts have the potential to uncover true G \u00d7 E interactions, which will enhance our understanding of the interplays between genes and environment in the etiology of T2D.",
+      "The purpose of the present review is to summarize recent epidemiological approaches and progress pertaining to gene-environment (G \u00d7 E) interactions potentially implicated in the pathogenesis of T2D and its related traits. We also discuss continuing challenges, evolving approaches, and recommendations for future efforts in this field.",
+      "FUTURE PERSPECTIVES  Continued investment in studies of G \u00d7 E interactions for T2D holds promise on several grounds. First, such studies may provide insight into the function of novel T2D loci and pathways by which environmental exposures act and, therefore, yield a better understanding of T2D etiology (66). They could also channel experimental studies in a productive direction. Second, knowledge of G \u00d7 E interactions may help identify high-risk individuals for diet and lifestyle interventions. This may also apply to pharmacological interventions if individuals carrying certain genotypes are more or less responsive to specific medications. The finding that patients with rare forms of neonatal diabetes resulting from KCNJ11 mutations respond better to sulfonylurea than to insulin therapy is just one example demonstrating the potential for this application of G \u00d7 E interaction research (69). Third, we are fast approaching an era when individuals can feasibly obtain their complete genetic profile and thus a snapshot of their genetic predisposition to disease. It will therefore be the responsibility of health professionals to ensure that their patients have an accurate interpretation of this information and a means to curb their genetic risk. A long-held goal of genetic research has been to tailor diet and lifestyle advice to an individual's genetic profile, which will, in turn, motivate him or her to adopt and maintain a protective lifestyle. There is currently no evidence that this occurs. Findings to date, however, indicate that behavioral changes can substantially mitigate diabetogenic and obesogenic effects of individual or multiple risk alleles, which has much broader clinical and public health implications.",
+      "Gene-Nutrient or Dietary Pattern Interactions in The Development of T2DM  Recently, several studies have demonstrated the significant effects of genotype by environment interactions on T2DM [48,49]. However, further clarification of the role of these interactions at the genome-wide level could help predict disease risk more accurately and facilitate the development of dietary recommendations to improve prevention and treatment. Moreover, it would be very interesting to identify the specific dietary factors that are the most influential in the variation of a given T2DM-related phenotype and to what extent these dietary factors contribute to the phenotypic variation (Table 2). In particular, the dietary factors considered are macro- and micronutrients, foods and type of diets. A recent review presents evidence on the dietary environment and genetics as risk factors for T2DM [50].",
+      "Introduction  Genome wide association studies (GWAS) of type 2 diabetes mellitus and relevant endophenotypes have shed new light on the complex etiology of the disease and underscored the multiple molecular mechanisms involved in the pathogenic processes leading to hyperglycemia [1]. Even though these studies have successfully mapped many diabetes risk genetic loci that could not be detected by linkage analysis, the risk single nucleotide polymorphisms (SNP) have small effect sizes and generally explain little of disease heritability estimates [2]. The poor contribution of risk loci to diabetes inheritance suggests a prominent role of environmental factors (e.g., diet, physical activity, lifestyle), gene \u00d7 environment interactions and epigenetic mechanisms in the pathological processes leading to the deterioration of glycemic control [3,4].",
+      "The literature on gene-environment interactions in diabetes-related traits is extensive, but few studies are accompanied by adequate replication data or compelling mechanistic explanations. Moreover, most studies are cross-sectional, from which temporal patterns and causal effects cannot be confidently ascertained. This has undermined confidence in many published reports of gene-environment interactions across many diseases; although interaction studies in psychiatry have been especially heavily criticized [3], many of the points made in that area relate to other diseases, not least to T2D, where the diagnostic phenotype (elevated blood glucose or HbA1c) is a consequence of underlying and usually unmeasured physiological defects (e.g., at the level of the pancreatic beta-cell, peripheral tissue, liver, and gut), and the major environmental risk factors are difficult to measure well. Nevertheless, several promising examples of gene-environment interactions relating to cardiometabolic disease exist, as discussed below and described in Table 1, and interaction studies with deep genomic coverage in large cohorts are now conceivable; the hope is that these studies will highlight novel disease mechanisms and biological pathways that will fuel subsequent functional and clinical translation studies. This is important, because diabetes medicine may rely increasingly on genomic stratification of patient populations and disease phenotype, for which gene-environment interaction studies might prove highly informative.",
+      "The genome is often the conduit through which environmental exposures convey their effects on health and disease. Whilst not all diseases act by directly perturbing the genome, the phenotypic responses are often genetically determined. Hence, whilst diseases are often defined as having differing degrees of genetic determination, genetic and environmental factors are, with few exceptions, inseparable features of most diseases, not least type 2 diabetes. It follows that to optimize diabetes prevention and treatment will require that the etiological roles of genetic and environmental risk factors be jointly considered. As we discuss here, studies focused on quantifying gene-environment and gene-treatment interactions are gathering momentum and may eventually yield data that helps guide health-related choices and medical interventions for type 2 diabetes and other complex diseases.",
+      "The genome is often the conduit through which environmental exposures convey their effects on health and disease. Whilst not all diseases act by directly perturbing the genome, the phenotypic responses are often genetically determined. Hence, whilst diseases are often defined as having differing degrees of genetic determination, genetic and environmental factors are, with few exceptions, inseparable features of most diseases, not least type 2 diabetes. It follows that to optimize diabetes prevention and treatment will require that the etiological roles of genetic and environmental risk factors be jointly considered. As we discuss here, studies focused on quantifying gene-environment and gene-treatment interactions are gathering momentum and may eventually yield data that helps guide health-related choices and medical interventions for type 2 diabetes and other complex diseases.",
+      "Predisposition is influenced by the level of certain environmental exposures, personal factors, access to good-quality primary care, and by genotype. Interactions between genetic and nongenetic risk factors are hypothesized to raise diabetes risk in a synergistic manner; reciprocally, health-enhancing changes in behavior, body composition, or medication may reduce the risk of disease conveyed by genetic factors. Defining the nature of these interactions and identifying ways through which reliable observations of gene-environment interactions (GEIs) can be translated into the public health setting might help 1) optimize targeting of health interventions to persons most likely to respond well to them, 2) improve cost- and health-effectiveness of existing preventive and treatment paradigms; 3) reduce unnecessary adverse consequences of interventions; 4) increase patient adherence to health practitioners' recommendations; and 5) identify novel interventions that are beneficial only in a defined genetic subgroup of the population. In this Perspective, we describe the rationale and evidence relating to the existence of gene-environment and gene-treatment interactions in type 2 diabetes. We discuss the tried, tested, and often-failed approaches to investigating gene-lifestyle interactions in type 2 diabetes; we discuss some recent developments in gene-treatment interactions (pharmacogenetics); and we look forward to the strategies that are likely to dominate these fields of research in the future. We conclude with a discussion of the requirements for translating findings from these future studies into a form where they can be used to help predict, prevent, or treat diabetes. Here we describe the rationale and evidence concerning GEIs and gene-treatment interactions in type 2 diabetes, provide an interpretation of current findings and strategies, and offer a view for their future translation.",
+      "T2DM results from the contribution of many genes [10], many environmental factors [11], and the interactions among those genetic and environmental factors. Physical activity and dietary fat have been reported to be important modifiers of the associations between glucose homeostasis and well-known candidate genes for T2DM [12], and there is reason to believe that a significant proportion of the susceptibility genes identified by GWASs will interact with these environmental factors to influence the disease risk. Florez et al. [13] reported that response to the Diabetes Prevention Program lifestyle intervention did not differ by genotype groups at TCF7L2 rs7903146. A more recent report from the Diabetes Prevention Program [14] showed that among 10 of the recently identified diabetes susceptibility polymorphisms (single nucleotide polymorphisms, SNPs), only CDKN2A/B rs10811661 was shown to marginally modify the effect of the lifestyle intervention on diabetes risk reduction. Similarly, the study of Brito et al. [15] reported that among 17 of the diabetes SNPs, only HNF1B rs4430796 significantly interacted with physical activity to influence impaired glucose tolerance risk and incident diabetes.",
+      "Gene-Environment Interactions. An individual's risk of developing T2D is the product of the interaction between that individual's genetic constitution and the environment the individual inhabits. Whilst the contribution of genetic factors to disease risk is relatively easy to quantify, the impact of environmental exposure is less easily measured in a clinical setting. Nevertheless, efforts have been made to study the interactions between some of the known susceptibility loci for T2D and the environment, and these findings may be useful for the development of prediction models and tailoring clinical treatment for T2D [122,123]. For example, for carriers of the risk allele for TCF7L2, diets of low glycaemic load [124,125] and a more intensive lifestyle modification regime (versus that recommended for nonrisk carriers) [61,62,126,127] have been shown to reduce the risk of T2D. Meaningful studies for gene-environment interactions will require samples of sufficient size to increase statistical power [128] and accurate methods for measuring environmental exposure, for example, the use of metabolomics to identify and assess metabolic characteristics, changes, and phenotypes in response to the environment, diet, lifestyle, and pathophysiological states. This information will allow the generation of better risk prediction models and personalisation/stratification of treatment, the holy grail of GWAS.",
+      "Other aspects that have been overlooked in large GWAS on T2DM relate to environmental effects such as diet, physical activity, and stresses, which may affect gene expression. For example, fish oil may stimulate PPARG in much the same fashion as the thiazolidinedione class of drugs; however, studies on the interaction of the PPARG variant with dietary components have not been performed. The spectacular rise in the incidence of diabetes among Pima Indians and other populations as they adopt Western diets and lifestyles dramatically demonstrates the key role of the environment [12]. Consequently, it could be expected that the effect of a common gene variant among populations that have very different diets and exercise habits might be totally different, thus explaining some instances of lack of replication [4]. Another variable that influences the statistical and real association of an SNP with a disease or response to a diet is epigenetic interaction. Epigenetics is the study of heritable changes in gene function that occur without a change in the DNA sequence, such as DNA methylation and chromatin remodeling. Both mechanisms can affect gene expression by altering the accessibility of DNA to regulatory proteins or complexes such as transcription factors, and they can be influenced by certain nutrients and by overall caloric intake. Thus, it can be expected that long-term exposure to certain diets could produce permanent epigenetic changes in the genome [7]."
+    ],
+    [
+      "It is important to find better treatments for diabetic nephropathy (DN), a debilitating renal complication. Targeting early features of DN, including renal extracellular matrix (ECM) accumulation and glomerular hypertrophy, can prevent disease progression. Here we show that a megacluster of nearly 40 microRNAs and their host long non-coding RNA transcript (lnc-MGC) are coordinately increased in the glomeruli of mouse models of DN, and in mesangial cells treated with transforming growth factor-\u03b21 (TGF-\u03b21) or high glucose. Lnc-MGC is regulated by an endoplasmic reticulum (ER) stress-related transcription factor, CHOP. Cluster microRNAs and lnc-MGC are decreased in diabetic Chop-/- mice that showed protection from DN. Target genes of megacluster microRNAs have functions related to protein synthesis and ER stress. A chemically modified oligonucleotide targeting lnc-MGC inhibits cluster microRNAs, glomerular ECM and hypertrophy in diabetic mice. Relevance to human DN is also demonstrated. These results demonstrate the translational implications of targeting lnc-MGC for controlling DN progression.",
+      "It is important to find better treatments for diabetic nephropathy (DN), a debilitating renal complication. Targeting early features of DN, including renal extracellular matrix (ECM) accumulation and glomerular hypertrophy, can prevent disease progression. Here we show that a megacluster of nearly 40 microRNAs and their host long non-coding RNA transcript (lnc-MGC) are coordinately increased in the glomeruli of mouse models of DN, and in mesangial cells treated with transforming growth factor-\u03b21 (TGF-\u03b21) or high glucose. Lnc-MGC is regulated by an endoplasmic reticulum (ER) stress-related transcription factor, CHOP. Cluster microRNAs and lnc-MGC are decreased in diabetic Chop-/- mice that showed protection from DN. Target genes of megacluster microRNAs have functions related to protein synthesis and ER stress. A chemically modified oligonucleotide targeting lnc-MGC inhibits cluster microRNAs, glomerular ECM and hypertrophy in diabetic mice. Relevance to human DN is also demonstrated. These results demonstrate the translational implications of targeting lnc-MGC for controlling DN progression.",
+      "Platelets are key players in CVD and their involvement in the development of cardiovascular complications is strengthened in diabetes (148). Platelets play an important role in the pathophysiology of thrombosis and represent an important source of different RNA species, including pseudogenes, intronic transcripts, non-coding RNAs, and antisense transcripts (149,150). These molecules can be released by platelets through microvesicles, contributing to the horizontal transfer of molecular signals delivered through the bloodstream to specific sites of action (151). The downregulation of miR-223, miR-126, or miR-146a observed in diabetic and hyperglycemic patients (137,152) has been associated with increased platelet reactivity and aggregation (153,154). In line with these findings, silencing of miR-223 in mice caused a hyperreactive and hyperadhesive platelet phenotype, and was associated with calpain activation through the increased expression of beta1 integrin, kindlin-3, and factor XIII (153,155). Moreover, the modulation of the expression levels of platelet miRNAs can also be measured in plasma. In fact, plasma levels of miR-223 and miR-126 are decreased in diabetics (137,156). This leads to the upregulation of the P2Y12 receptor, as well as P-selectin, further contributing to platelet dysfunction (156). As a result of this interaction, the activation level of platelets in type 2 DM is increased (149,156,157). Consistent with this, circulating miR-223 levels are independent predictors of high on-treatment platelet reactivity (158). Another interesting mechanism linking platelets and diabetes involves miR-103b, a platelet-derived biomarker proposed for the early diagnosis of type 2 DM, and the secreted frizzled-related protein-4 (SFRP4), a potential biomarker of early \u03b2 cell dysfunction and diabetes. In fact, platelet-derived miR-103b is able to downregulate SFRP4, whose expression levels are significantly increased in pancreatic islets and in the blood of patients 
with prediabetes or overt diabetes (159). These interesting results identify miR-103b as a novel potential marker of prediabetes and diabetes, and disclose a novel potential therapeutic target in type 2 DM.",
+      "In vitro and in vivo studies concerning the mechanisms that are responsible for the endothelial dysfunction in diabetes demonstrated that, in the presence of high glucose concentrations, upregulation of miR-185 reduced the expression of the glutathione peroxidase-1 (GPx-1) gene, which encodes an enzyme that is important in the prevention of oxidative stress (129); in contrast, upregulation of miR-34a and miR-204 contributed to endothelial cell senescence by impairing SIRT-1 expression and function (130,131). In the endothelium, miR-126 exerts proangiogenic and anti-inflammatory activities. At a functional level, it enhances VEGF and fibroblast growth factor activities, contributing to vascular integrity and angiogenesis (132,133), and recruits progenitor cells through the chemokine CXCL12 (134), while it suppresses inflammation by inhibiting TNF-\u03b1, ROS, and NADPH oxidase via HMGB1 (135). Consistently, miR-126 levels are down-regulated in both myocardial tissue and plasma from type 2 diabetic patients without any known anamnestic data for CVD (136,137), and in patients with CAD (138), suggesting that it could represent a new diagnostic marker for diabetes and CVD. Other studies in endothelial colony-forming cells, as well as in endothelial progenitor cells (EPCs) exposed to high glucose, demonstrated that miR-134 and miR-130a affected cell motility and apoptosis, respectively (139,140).",
+      "Numerous recent reports have demonstrated abnormal expression of various miRNAs in renal, vascular and retinal cells under diabetic conditions, and in vivo models of related diabetic complications [8,87-91]. Notably, the functional relevance of these miRNAs has been highlighted by the fact they target key genes associated with the progression of, or protection against, these complications. In particular, the role of miRNAs in diabetic nephropathy has been extensively studied, including in the actions of TGF-\u03b2 related to fibrosis and other key renal outcomes in vitro and in vivo [8,87-90]. In diabetic retinopathy, several miRNAs have been reported to modulate the disease by targeting factors associated with angiogenesis, inflammation, and oxidant stress in RECs and in diabetic retinas [88,89]. Reports have also implicated various miRNAs in the aberrant expression of genes associated with diabetic cardiomyopathy [88,91]. In addition, effective in vivo targeting of miRNAs has now been demonstrated thanks to advances in nucleotide chemistry and the design of nuclease-resistant anti-miRNAs, which suggest future translational potential of miRNA-based therapies for human diabetic complications [8]. Importantly, since miRNAs are stable in biological fluids such as urine and serum [8], they are being assessed in samples from various clinical cohorts as valuable biomarkers for the early detection of diabetic complications, for which there is a major unmet clinical need. It is clear that research in the field of miRNAs and diabetic complications will continue at a rapid pace.",
+      "Introduction. Diabetes-related complications represent one of the most important health problems worldwide with dire social and economic projections (Cooper, 2012). One of the most important medical concerns of the diabetes epidemic is diabetic nephropathy (DN). Diabetic nephropathy is regarded as a prototypical disease of gene and environmental interactions because not all diabetic subjects with traditional risk factors develop clinically evident nephropathy, indicating a role for individual susceptibility. The majority (>85%) of GWAS-identified single nucleotide polymorphisms (SNPs) are located in the non-coding regions of the genome and thus their functional implication lies in identifying the target genes, cell types, and the mode of dysregulation caused by these non-coding SNPs (Maurano et al., 2012). Recent studies indicate that complex trait-causing variants localize to cell-type-specific, functionally important gene regulatory regions where they can disrupt or create transcription factor binding sites to alter transcript levels only in disease-target cell types (Ko and Susztak, 2013; Susztak, 2014). Several elements of the immune system including cytokines and resident chemokines, macrophage recruitment, T lymphocytes, and immune complex deposition have recently been associated with DN (Navarro-Gonz\u00e1lez and Mora-Fern\u00e1ndez, 2008; Gaballa and Farag, 2013). Since renal cells are also capable of synthesizing pro-inflammatory cytokines such as tumor necrosis factor-alpha (TNF-\u03b1), interleukin-1\u03b2 (IL-1\u03b2) and interleukin-6 (IL-6), these cytokines, acting in a paracrine or autocrine manner, may induce significant effects leading to the development and progression of several renal disorders (Matoba et al., 2010; Pruijm et al., 2012; Shankar et al., 2011). The rationale of this study involved a concerted effort of genotyping, correlation and gene expression techniques involving three pro-inflammatory cytokine genes in the development 
and progression of DN as well as identification of high risk patients involving susceptibility or poor clinical outcome.",
+      "They also identified enrichment in coagulation and complement pathways, signaling pathways, tissue remodeling, and antigen presentation, including PI3K-Akt, Rap1, Toll-like, and NOD-like. Sun et al. [25] studied diabetic retinopathy and identified four stress-inducible genes, Rbm3, Cirbp, Mt1, and Mt2, which are commonly expressed in most retinal cell types. Diabetes increases inflammatory factor gene expression in retinal microglia and stimulates immediate early gene (IEG) expression in retinal astrocytes. Van Zyl et al. [30] studied glaucoma cases and identified the cell types that express genes implicated in glaucoma.",
+      "One of the major problems facing clinical nephrology currently throughout the world is an exponential increase in patients with end-stage renal disease (ESRD), which is largely related to a high incidence of diabetic nephropathy. The latter is characterized by a multitude of metabolic and signaling events following excessive channeling of glucose, which leads to an increased synthesis of extracellular matrix (ECM) glycoproteins resulting in glomerulosclerosis, interstitial fibrosis and ultimately ESRD. With the incidence of nephropathy at pandemic levels and a high rate of ESRD, physicians around the world must treat a disproportionately large number of diabetic patients with up-to-date innovative measures. In this regard, identification of genes that are crucially involved in the progression of diabetic nephropathy would enhance the discovery of new biomarkers and could also promote the development of novel therapeutic strategies. Over the last decade, we focused on the recent methodologies of high-throughput and genome-wide screening for identification of relevant genes in various animal models, which included the following: (1) single nucleotide polymorphism-based genome-wide screening; (2) the transcriptome approach, such as differential display reverse transcription polymerase chain reaction (DDRT-PCR), representational difference analysis of cDNA (cDNA-RDA)/suppressive subtractive hybridization, SAGE (serial analysis of gene expression) and DNA Microarray; and (3) the proteomic approach and 2-dimensional polyacrylamide gel electrophoresis (2D-PAGE) coupled with mass spectroscopic analysis. Several genes, such as Tim44 (translocase of inner mitochondrial membrane-44), RSOR/MIOX (renal specific oxidoreductase/myo-inositol oxygenase), UbA52, Rap1b (Ras-related GTPase), gremlin, osteopontin, hydroxysteroid dehydrogenase-3\u03b2 isotype 4 and those of the Wnt signaling pathway, were identified as differentially expressed genes in kidneys of diabetic 
rodents. Functional analysis of these genes and the subsequent translational research in the clinical settings would be very valuable in the prevention and treatment of diabetic nephropathy. Future trends for identification of the biomarkers and therapeutic target genes should also include genome scale DNA/histone methylation profiling, metabolomic approaches (e.g. metabolic phenotyping by 1H spectroscopy) and lectin microarray for glycan profiling along with the development of robust data-mining strategies.",
+      "In relation to the regulation of gene expression, the role of microRNAs (miRNAs) in diabetic retinopathy has been gaining more emphasis. miRNAs are non-coding small RNAs which modulate post-transcriptional control of gene expression through degradation or translational repression of key messenger RNAs. miRNAs can be detected in serum (free, associated with proteins or within membrane-bound particles) (Weiland et al., 2012), vitreous (Ragusa et al., 2013) and aqueous (Dunmire et al., 2013). As reviewed by Mastropasqua et al., miRNAs hold considerable interest for diabetic retinopathy since they can regulate important pathogenic responses such as angiogenesis, blood flow, neural cell dysfunction, tissue-specific inflammation and glucose metabolism (Mastropasqua et al., 2014). Although based on a small patient sample, it has been reported that three separate miRNAs (miR-21, miR-181c, and miR-1179) in serum of patients with diabetic retinopathy have potential to be used as biomarkers for early detection of disease (Li et al., 2014; Qing et al., 2014). While this is still a growing research area, miRNAs hold considerable clinical potential in the diabetic retinopathy field, both as possible drug-targets for regulation of dysfunctional cell responses and as diagnostic biomarkers.",
+      "Roles of lncRNAs in diabetic complications. Apart from being involved in major metabolic tissues during diabetes as discussed above, lncRNAs are implicated in complications associated with diabetes. Diabetic retinopathy is one of the common complications in diabetic patients, which leads to impaired or loss of vision. Altered expression of lncRNAs, namely MALAT1 [82,83] and MEG3 [84], is reported to be associated with diabetic retinopathy. In STZ-induced diabetic rats, the expression of MALAT1 is elevated in the endothelial cells of the retina and knockdown of MALAT1 ameliorates retinopathy in STZ-induced rats [82]. The lncRNA, MEG3, was also found to be downregulated in the retina of STZ-induced diabetic mice and its in vitro knockdown in retinal endothelial cells was found to regulate cell proliferation, viability, and migration [84]. Hyperglycemia, as in diabetes, causes upregulation of ANRIL levels in endothelial cells [85,86], and this elevates the levels of the PRC2 subunit, EZH2, which consequently promotes the expression of VEGF, a key promoter of angiogenesis [85]. Another major complication associated with diabetes is diabetic nephropathy, and this is considered a major cause of end-stage renal disease and disability in diabetic patients [87]. Recent studies show that lncRNAs play important roles in the development of diabetic nephropathy and accumulation of extracellular matrix (ECM) proteins. There is higher expression of the lncRNA, PVT1, during diabetic nephropathy, and this increase leads to increased fibrosis due to accumulation of ECM proteins in renal cells [88]; downregulation of PVT1 reduces ECM accumulation [88]. LncRNA PVT1 is also a host to miR-1207-5p and this miRNA is shown to regulate the expression of fibronectin1 (FN1), plasminogen activator inhibitor-1 (PAI1), and transforming growth factor beta 1 (TGF\u03b21) [89]. In renal tubular injury during diabetes, the lncRNA, MIAT, is under-expressed, and this negatively correlates with creatinine and 
BUN levels in the serum of these subjects. It has been shown to regulate cell viability of proximal convoluted renal tubules [90]. In diabetic nephropathic mice, the lncRNA, MGC, is increased in renal mesangial cells. Interestingly, this lncRNA harbours a cluster of approximately 40 miRNAs, and is regulated by the ER stress marker C/EBP homologous protein (CHOP) [91]. In CHOP-deficient mice, there is decreased expression of the lncRNA, MGC, and the clustered miRNAs, and these mice have shown an improvement in diabetic nephropathy [91]. Diabetic nephropathy is also associated with increased levels of lincRNA, Gm4419, and this exerts its action by interacting with NF-\u03baB. Knockdown of this lincRNA in renal mesangial cells lowers cellular proliferation and inhibits expression of NF-\u03baB in hyperglycemic states [92]. The lncRNA, TUG1, that is upregulated in diabetic nephropathy acts as a sponge for miR-377 and regulates PPAR-\u03b3 expression, which further modulates the expression of FN1, collagen type IV alpha 1 chain (COL4A1), PAI1, and TGF\u03b21 in renal mesangial cells [93]. Diabetic cardiomyopathy is a critical end-stage complication associated with diabetes. Several such cardiovascular complications and myocardial dysfunction in diabetic patients lead to heart failure [94]. Differential expression analysis in cardiac tissue from normal and diabetic rats shows that the lncRNA, MALAT1, is upregulated during cardiomyopathy and knockdown of this lncRNA improves left ventricular systolic function by reducing myocardial inflammation in diabetic rats [95,96]. Decreased expression of the lncRNA, H19, is also reported during diabetes [68,70], and this often results in decreased expression of the exonic miRNA, miR-675 [97,98]. miR-675 directly targets the voltage-dependent anion channel 1 (VDAC1), which is involved in mitochondria-mediated apoptosis in the cardiac tissue during diabetes. H19 overexpression in diabetic rats reduces oxidative stress, apoptosis, and 
inflammation, and improves ventricle function [98]. LncRNAs NONRATT021972 and uc.48+ are reported to be associated with diabetic neuropathic pain [99,100], and inhibition of both has been shown to alleviate such neuropathic pain by activating the P2X3 receptor. Impaired wound closure is a notable complication associated with diabetes and a recent report shows decreased levels of the lncRNA, Lethe, in such impaired dorsal wounds of diabetic mice. This was demonstrated to be associated with increased ROS production, possibly through regulation of NOX2 expression [101].",
+      "Together, these findings point to important roles for various lncRNAs in complications associated with diabetes, which therefore merit detailed study.",
+      "An overall important consideration in study design is that, similar to RNA, noncoding RNAs are tissue and cell specific [24,77-82]. Given that it is still unknown if pathogenic changes in AMD are localized to specific ocular tissues or systemic, one must take into consideration that potential biomarkers identified in the peripheral blood as \"disease associated\" may not reflect the disease mechanism occurring in the neural retina and/or RPE.",
+      "Skol et al. developed methods to study genomics and transcriptomics together to help discover genes that cause diabetic retinopathy. Genes involved in how cells respond to high blood sugar were first identified using cells grown in the lab. By comparing the activity of these genes in people with and without retinopathy, the study identified genes associated with an increased risk of retinopathy in diabetes. In people with retinopathy, the activity of the folliculin gene (FLCN) increased more in response to high blood sugar. This was further verified with independent groups of people and using computer models to estimate the effect of different versions of the folliculin gene.",
+      "miRNAs in Kidney Disease and Diabetic Nephropathy. Diabetic nephropathy is a progressive kidney disease and a major debilitating complication of both type 1 and type 2 diabetes that can lead to end-stage renal disease (ESRD) and related cardiovascular disorders. Absence or lower levels of particular miRNAs in the kidney compared with other organs may permit renal specific expression of target proteins that are important for kidney functions [45]. Figure 4 depicts the connection between the role of miRNAs and kidney fibrosis. Altered expression of miRNAs causes renal fibrosis by inducing EMT, EndMT, and other fibrogenic stimuli. The accumulative effects of hyperglycaemia, inflammatory cytokines, proteinuria, ageing, high blood pressure, and hypoxia result in alteration of miRNA expression profiles. The altered miRNA levels initiate such a transition program in the normal kidney, leading finally to fibrosis. Some of the miRNAs that are more abundant in the kidney compared with other organs include miR-192, miR-194, miR-204, miR-215, and miR-216. A critical role of miRNA regulation in the progression of glomerular and tubular damage and the development of proteinuria has been suggested by studies in mice with podocyte-specific deletion of Dicer [46]. There was a rapid progression of renal disease with initial development of albuminuria followed by pathological features of glomerulosclerosis and tubulointerstitial fibrosis. It is likely that these phenotypes are due to the global loss of miRNAs because of Dicer deletion, but, given multiple miRNAs and their myriad targets, the precise pathways responsible require identification. These investigators also identified specific miRNA changes, for example, the downregulation of the miR-30 family when Dicer was deleted. Of relevance, the miR-30 family was found to target connective tissue growth factor, a profibrotic molecule that is also downstream of transforming growth factor (TGF)-\u03b2 [47]. Thus, the targets of these miRNAs may 
regulate critical glomerular and podocyte functions. These findings have also been complemented by an elegant study revealing a developmental role for the miR-30 family during pronephric kidney development in Xenopus [48]. Sun et al. [49] identified five miRNAs (miR-192, miR-194, miR-204, miR-215, and miR-216) that were highly expressed in human and mouse kidney using miRNA microarray. A recent report using new proteomic approaches to profile and identify miRNA targets demonstrated that miRNAs repress their targets at both the mRNA and translational levels and that the effects are mostly relatively mild [50]. The role of miR-192 remains controversial and highlights the complex nature of miRNA research. Kato et al. [51] observed increased renal expression of miR-192 in streptozotocin- (STZ-) induced diabetes and in the db/db mouse and demonstrated that transforming growth factor \u03b21 (TGF-\u03b21) upregulated miR-192 in mesangial cells (MCs). miR-192 repressed the translation of Zeb2, a transcriptional repressor that binds to the E-box in the collagen 1\u03b12 (Col1a2) gene. They proposed that miR-192 repressed Zeb2 and resulted in increased Col1a2 expression in vitro and contributed to increased collagen deposition in vivo. These data suggest a role for miR-192 in the development of the matrix accumulation observed in DN. It is interesting that the expression of miR-192 was increased by TGF-\u03b2 in mouse MCs, whereas, conversely, the expression of its target, Zeb2, was decreased [51]. This also paralleled the increased Col1a2 and TGF-\u03b2 expression [51]. These results suggested that the increase in TGF-\u03b2 in vivo in diabetic glomeruli and in vitro in MCs can induce miR-192 expression, which can target and downregulate Zeb2, thereby increasing Col1a2. This is supported by the report showing that miR-192 is upregulated in human MCs treated with high glucose [51]. TGF-\u03b2-induced downregulation of Zeb2 (via miR-192) and Zeb1 (via potentially another miRNA) can cooperate to enhance Col1a2 expression via 
de-repression at E-box elements [51]. In contrast to the above, other reports suggest the relationship between miR-192 and renal fibrosis may be more complicated. Krupa et al. [52] identified two miRNAs in human renal biopsies, the expression of which differed by more than twofold between progressors and nonprogressors with respect to DN, the greatest change occurring in miR-192, which was significantly lower in patients with advanced DN, correlating with tubulointerstitial fibrosis and low glomerular filtration rate. They also reported, in contrast to the Kato et al. [51] study in MCs, that TGF-\u03b21 decreased expression of miR-192 in cultured proximal tubular cells (PTCs). These investigators concluded that a decrease in miR-192 is associated with increased renal fibrosis in vivo. Interestingly, connective tissue growth factor (CTGF) treatment also resulted in fibrogenesis but caused the induction of miR-192/215 and, consequently, decreased Zeb2 and increased E-cadherin. The contrasting findings above highlight the complex nature of miRNA research. Some of the differences may relate to models and/or experimental conditions; however, one often overlooked explanation is that some effects of miRNAs and inhibitors are likely to be indirect in nature. A recent report also showed that BMP6-induced miR-192 decreases the expression of Zeb1 in breast cancer cells [53]. Thus, a TGF-\u03b2-induced increase in the expression of key miRNAs (miR-192 and miR-200 family members) might coordinately downregulate E-box repressors Zeb1 and Zeb2 to increase Col1a2 expression in MCs related to the pathogenesis of DN. The proximal promoter of the Col1a2 gene responds to TGF-\u03b2 via smads and SP1. Conversely, the downregulation of Zeb1 and Zeb2 by TGF-\u03b2 via miR-200 family and miR-192 can affect upstream E-box regions. Because E-boxes are present in the upstream genomic regions of the miR-200 family, miR-200 family members may themselves be regulated by Zeb1 and Zeb2 [54]. It is possible that the miR-200 family 
upregulated by TGF-\u03b2 or in diabetic glomeruli under early stages of the disease can also regulate collagen expression related to diabetic kidney disease by targeting and downregulating E-box repressors. miR-192 might initiate signaling from TGF-\u03b2 to upregulate miR-200 family members, which subsequently could amplify the signaling by further regulating themselves through downregulation of E-box repressors. Such events could lead to progressive renal dysfunction under pathologic conditions such as diabetes, in which TGF-\u03b2 levels are enhanced. Conversely, there are several reports that miR-200 family members and miR-192 can be suppressed by TGF-\u03b2, and this promotes epithelial-to-mesenchymal transition (EMT) in cancer and other kidney-derived epithelial cell lines via subsequent upregulation of targets Zeb1 and Zeb2 to repress E-cadherin [54,55].",
+      "DR. HARRINGTON: You mentioned Liu's data from China [abstract; Liu Z-H et al J Am Soc Nephrol 14:400A, 2003], which overwhelmed me. Apparently there are 182 genes whose expression is up- or down-regulated significantly in patients with diabetes. If I asked you to pick the \"top three\" genes other than the ACE polymorphisms, which three would you choose and why? DR. ADLER: Well, actually I didn't see all of their results nor did they report all 182. But I guess my favorite ones would be some that relate to the ROS pathway because this is an all-purpose pathway of cell injury fueled by a hyperglycemic environment; some that relate to podocyte structure to explain the development of proteinuria; and TGF-\u03b2, which is a master regulator of sclerosis and fibrosis.",
+      "Figure 1 | Emerging molecular mechanisms of diabetic nephropathy. Diabetic conditions induce the expression of growth factors such as transforming growth factor \u03b21 and angiotensin II, cytokines and AGEs to promote inflammation, fibrosis and hypertrophy, which contribute to the progression of diabetic nephropathy. These factors stimulate various signal transduction mechanisms that activate downstream transcription factors. They can also affect DNA methylation and histone modifications, which result in increased chromatin accessibility to transcription factors near pathological genes in renal cells. Coordinated interactions between transcription factors and epigenetic mechanisms can increase the expression of not only coding RNAs, but also noncoding RNAs such as microRNAs and lncRNAs. Furthermore, microRNAs and lncRNAs can also increase the expression of pathological genes via post-transcriptional mechanisms. Notably, the induction of key coding genes and proteins, lncRNAs and microRNAs can also 'lock' open chromatin states to create persistent expression of genes, which could be one mechanism of metabolic memory. Abbreviations: AGE, advanced glycation end-product; lncRNA, long noncoding RNA.",
+      "Key points  \u25a0 Diabetic conditions induce inflammation, fibrosis and hypertrophy in renal cells through various cytokines and growth factors such as transforming growth factor \u03b21, angiotensin II and platelet-derived growth factor \u25a0 The engagement of cytokines and growth factors with their receptors triggers signal transduction cascades that result in the activation of transcription factors to increase expression of inflammatory and fibrotic genes \u25a0 These signalling mechanisms affect epigenetic states-such as DNA methylation and chromatin histone modifications-to augment the expression of profibrotic and inflammatory genes, as well as noncoding RNAs \u25a0 Noncoding RNAs that are induced by diabetic conditions can also promote the expression of pathological genes via various post-transcriptional and post-translational mechanisms \u25a0 These epigenetic mechanisms and noncoding RNAs can lead to persistently open chromatin structures at pathological genes and sustained gene expression, which can also be a mechanism for 'metabolic memory' \u25a0 Key epigenetic regulators, microRNAs and long noncoding RNAs could serve as new therapeutic targets for diabetic nephropathy",
+      "| Diabetic nephropathy (DN), a severe microvascular complication frequently associated with both type 1 and type 2 diabetes mellitus, is a leading cause of renal failure.The condition can also lead to accelerated cardiovascular disease and macrovascular complications.Currently available therapies have not been fully efficacious in the treatment of DN, suggesting that further understanding of the molecular mechanisms underlying the pathogenesis of DN is necessary for the improved management of this disease.Although key signal transduction and gene regulation mechanisms have been identified, especially those related to the effects of hyperglycaemia, transforming growth factor \u03b21 and angiotensin II, progress in functional genomics, high-throughput sequencing technology, epigenetics and systems biology approaches have greatly expanded our knowledge and uncovered new molecular mechanisms and factors involved in DN.These mechanisms include DNA methylation, chromatin histone modifications, novel transcripts and functional noncoding RNAs, such as microRNAs and long noncoding RNAs.In this Review, we discuss the significance of these emerging mechanisms, how they mediate the actions of growth factors to augment the expression of extracellular matrix and inflammatory genes associated with DN and their potential usefulness as diagnostic biomarkers or novel therapeutic targets for DN.",
+      "| microRNAs relevant to the pathogenesis of diabetic nephropathy",
+      "Review criteria  A search for original published articles focusing on \"diabetic nephropathy\", \"signal transduction\", \"noncoding RNAs\", \"microRNAs\", \"long noncoding RNAs\", \"genetics\" and \"epigenetics\" was performed in MEDLINE and PubMed.All articles identified were English-language, full-text papers.We also searched the reference lists of identified articles for further relevant papers."
+    ],
+    [
+      "Additional evidence supporting a potentially important role for environmental modulation of genetic risk was found in previous population studies. For example, although some of the GWAS-identified T2D loci could be replicated successfully in various populations (e.g., CDKAL1, HHEX, IGF2BP2, TCF7L2 and SLC30A8), more genetic variants have been identified only in some specific populations [26]. T2D risk alleles showed extreme directional differentiation between different populations compared with other common diseases [29]. Different T2D loci and loci frequencies across different populations may reflect the adaptation to the local environments and diets along with human migration [30]. Therefore, the interplay between gene and environment leads to a more complex pathogenesis of T2D and related traits. These hypotheses are strongly supported by a number of recent GxE studies [7,11,31,32]. For example, Qi et al. [31] generated a genetic risk score (GRS) using ten GWAS-identified SNPs and observed a significant interaction between the Western dietary pattern and GRS in the Health Professionals Follow-Up Study. The Western dietary pattern was positively associated with risk of T2D only among men with a high GRS, not among those with a low GRS. Another large meta-analysis of 14 cohort studies [32] revealed that dietary whole-grain intake potentially interacted with one GCKR variant (rs780094) for fasting insulin in individuals of European descent. Greater whole-grain intake was associated with a smaller reduction of fasting insulin in individuals with the insulin-raising allele of rs780094, compared to the non-risk allele.",
+      "Gene\u2013exercise interaction in type 2 diabetes When studying gene\u2013environment interaction on the quantitative traits that underlie diabetes, the power to detect interaction is highly dependent on the precision with which non-genetic exposures are measured (Wareham et al 2002). Achievement of optimal glycaemic control is the focus of traditional treatment paradigms. Regular exercise, both aerobic (walking, jogging, or cycling) and resistance (weightlifting) training results in increased glucose uptake and insulin sensitivity and is a primary modality used in the treatment of type 2 diabetes patients (Sigal et al 2007).",
+      "Gene-Environment Interaction  Evidence from the epidemiology of T2D overwhelmingly supports a strong environmental influence interacting with genetic predisposition in a synergistic fashion, as has been recently reviewed [123]; however, current state-of-the-art methods for measuring environmental effects lack precision and can result in changes in statistical power to detect interaction [123,124]. Since lifestyle factors are important in preventing diabetes [125,126], interaction of gene variants with measures of dietary intake and exercise have been selected for studies on gene-environment interaction. For example, HNF1B (rs4430796) was shown to interact with exercise; low levels of activity enhanced the risk of T2D in association with absence of the risk allele, but there was no protective effect of exercise when the allele was present. It follows that subgrouping by genotype may serve to enhance risk prediction while considering gene-environment interaction, as has been done for exercise [127]. Also, lifestyle including exercise modified the effect of a CDKN2A/B variant on 2-hour glucose levels in the Diabetes Prevention Program [128], but this was not confirmed in the HERITAGE study using different measurements and phenotypes involving insulin sensitivity and \u03b2-cell function [129]. The pro12ala PPARG variant also interacts with physical activity for effect on 2-hour glucose levels [130], which was confirmed in the smaller HERITAGE study [129]. In addition, the relationship of dietary fat intake with plasma insulin and BMI differs by the pro12ala PPARG genotype [131].",
+      "A person's risk of type 2 diabetes or obesity reflects the joint effects of genetic predisposition and relevant environmental exposures. Efforts to determine whether these genetic and environmental components of risk interact (in the statistical sense that joint effects cannot be predicted from main effects alone) 70 face challenges associated with measuring relevant exposures (diet and physical activity being notoriously difficult to estimate) and the effect of imprecision on statistical power.71 Although claims that statistical interactions reflect shared mechanisms (i.e., that the interacting factors act through the same pathways) are probably overstated, understanding the relative contributions of genetic and environmental components to risk is important. After all, environmental factors can be modified more readily than genetic factors. Genetic discoveries have provided a molecular basis for the clinically useful classification of monogenic forms of diabetes and obesity.3,4 Will the same be true for the common forms of these conditions? Probably not: as far as the common variants are concerned, each patient with diabetes or obesity has an individual \"barcode\" of susceptibility alleles and protective alleles across many loci. It is possible to show that the genetic profiles of lean subjects with type 2 diabetes and obese subjects with type 2 diabetes are not identical, but these differences appear to be inadequate for clinically useful subclassification.22,72 If efforts to uncover less prevalent, higher-penetrance alleles are successful, more precise classification of disease subtypes may become possible, particularly if genetic data can be integrated with clinical and biochemical information. For example, in persons presenting with diabetes in early adulthood, there are several possible diagnoses: various subtypes of maturity-onset diabetes of the young or mitochondrial diabetes, for example, as well as type 1 or type 2 diabetes. Assigning the correct diagnosis has both prognostic and therapeutic benefits for the patient (Table 3).",
+      "Genes, environment, and development of type 2 diabetes  Genes and the environment together are important determinants of insulin resistance and \u03b2-cell dysfunction (figure 2). Because changes in the gene pool cannot account for the rapid increase in prevalence of type 2 diabetes in recent decades, environmental changes are essential to understanding of the epidemic.",
+      "Type 2 diabetes (T2D) is thought to arise from the complex interplay of both genetic and environmental factors.Since the advent of genomewide association studies (GWAS), we have seen considerable progress in our understanding of the role that genetics and gene-environment interactions play in the development of T2D.Recent work suggests that the adverse effect of several T2D loci may be abolished or at least attenuated by higher physical activity levels or healthy lifestyle, whereas low physical activity and dietary factors characterizing a Western dietary pattern may augment it.However, there still remain inconsistencies warranting further investigation.Lack of statistical power and measurement errors for the environmental factors continue to challenge our efforts for characterizing interactions.Although our recent focus on established T2D loci is reasonable, we may be overlooking many other potential loci not captured by recent T2D GWAS.Agnostic approaches to the discovery of gene and environment interactions may address this possibility, but their application to the field is currently limited and still faces conceptual challenges.Nonetheless, continued investment in gene-environment interaction studies through large collaborative efforts holds promise in furthering our understanding of the interplay between genetic and environmental factors.",
+      "Type 2 diabetes (T2D) is thought to arise from the complex interplay of both genetic and environmental factors.Since the advent of genomewide association studies (GWAS), we have seen considerable progress in our understanding of the role that genetics and gene-environment interactions play in the development of T2D.Recent work suggests that the adverse effect of several T2D loci may be abolished or at least attenuated by higher physical activity levels or healthy lifestyle, whereas low physical activity and dietary factors characterizing a Western dietary pattern may augment it.However, there still remain inconsistencies warranting further investigation.Lack of statistical power and measurement errors for the environmental factors continue to challenge our efforts for characterizing interactions.Although our recent focus on established T2D loci is reasonable, we may be overlooking many other potential loci not captured by recent T2D GWAS.Agnostic approaches to the discovery of gene and environment interactions may address this possibility, but their application to the field is currently limited and still faces conceptual challenges.Nonetheless, continued investment in gene-environment interaction studies through large collaborative efforts holds promise in furthering our understanding of the interplay between genetic and environmental factors.",
+      "Gene and Environment Selection  Environmental factors selected for recent G \u00d7 E interactions studies continue to be the established modifiable risk factors for T2D such as obesity, physical activity, dietary fat, and carbohydrate quality as well as measures of pre-and post-uterine environment.The genetic factors selected, however, have shifted from biological candidates based on functional evidence to genome-wide established loci for T2D or related traits (Table 1).This approach may improve power to detect and strengthen causal inference for an interaction (49).Focusing on established T2D loci may also further our understanding of their functional role in disease development in addition to their public health relevance in the context of genetic risk modification (13).",
+      "We have seen considerable progress in our understanding of the role that both environment and genetics play in the development of T2D.Recent work suggests that the adverse effect of some established T2D-associated loci may be greatly attenuated by appropriate changes in certain lifestyle factors.Our recent approach to studies of G \u00d7 E interactions in T2D has gained considerable advantage over previous approaches, but it is clearly not optimal.Lack of statistical power and measurement error for environmental factors will continue to challenge our efforts to characterize G \u00d7 E interactions.Although our recent focus on established T2D loci is reasonable, we may be overlooking many other potential loci not captured by recent T2D GWAS.Agnostic approaches to the discovery of G \u00d7 E interactions may address this possibility, but their application to the field is currently limited and still faces conceptual challenges.Nevertheless, large collaborative efforts have the potential to uncover true G \u00d7 E interactions, which will enhance our understanding of the interplays between genes and environment in the etiology of T2D.",
+      "The purpose of the present review is to summarize recent epidemiological approaches and progress pertaining to gene-environment (G \u00d7 E) interactions potentially implicated in the pathogenesis of T2D and its related traits.We also discuss continuing challenges, evolving approaches, and recommendations for future efforts in this field.",
+      "FUTURE PERSPECTIVES  Continued investment in studies of G \u00d7 E interactions for T2D holds promise on several grounds.First, such studies may provide insight into the function of novel T2D loci and pathways by which environmental exposures act and, therefore, yield a better understanding of T2D etiology (66).They could also channel experimental studies in a productive direction.Second, knowledge of G \u00d7 E interactions may help identify high-risk individuals for diet and lifestyle interventions.This may also apply to pharmacological interventions if individuals carrying certain genotypes are more or less responsive to specific medications.The finding that patients with rare forms of neonatal diabetes resulting from KCNJ11 mutations respond better to sulfonylurea than to insulin therapy is just one example demonstrating the potential for this application of G \u00d7 E interaction research (69).Third, we are fast approaching an era when individuals can feasibly obtain their complete genetic profile and thus a snapshot of their genetic predisposition to disease.It will therefore be the responsibility of health professionals to ensure that their patients have an accurate interpretation of this information and a means to curb their genetic risk.A long-held goal of genetic research has been to tailor diet and lifestyle advice to an individual's genetic profile, which will, in turn, motivate him or her to adopt and maintain a protective lifestyle.There is currently no evidence that this occurs.Findings to date, however, indicate that behavioral changes can substantially mitigate diabetogenic and obesogenic effects of individual or multiple risk alleles, which has much broader clinical and public health implications.",
+      "Gene-Nutrient or Dietary Pattern Interactions in The Development of T2DM  Recently, several studies have demonstrated the significant effects of genotype by environment interactions on T2DM [48,49]. However, further clarification of the role of these interactions at the genome-wide level could help predict disease risk more accurately and facilitate the development of dietary recommendations to improve prevention and treatment. Moreover, it would be very interesting to identify the specific dietary factors that are the most influential in the variation of a given T2DM-related phenotype and to what extent these dietary factors contribute to the phenotypic variation (Table 2). In particular, the dietary factors considered are macro- and micronutrients, foods and type of diets. A recent review presents evidence on the dietary environment and genetics as risk factors for T2DM [50]. * Adiponectin (ADIPOQ).",
+      "Introduction  Genome-wide association studies (GWAS) of type 2 diabetes mellitus and relevant endophenotypes have shed new light on the complex etiology of the disease and underscored the multiple molecular mechanisms involved in the pathogenic processes leading to hyperglycemia [1]. Even though these studies have successfully mapped many diabetes risk genetic loci that could not be detected by linkage analysis, the risk single nucleotide polymorphisms (SNPs) have small effect sizes and generally explain little of disease heritability estimates [2]. The poor contribution of risk loci to diabetes inheritance suggests a prominent role of environmental factors (e.g., diet, physical activity, lifestyle), gene \u00d7 environment interactions and epigenetic mechanisms in the pathological processes leading to the deterioration of glycemic control [3,4].",
+      "The literature on gene-environment interactions in diabetes-related traits is extensive, but few studies are accompanied by adequate replication data or compelling mechanistic explanations. Moreover, most studies are cross-sectional, from which temporal patterns and causal effects cannot be confidently ascertained. This has undermined confidence in many published reports of gene-environment interactions across many diseases; although interaction studies in psychiatry have been especially heavily criticized [3], many of the points made in that area relate to other diseases, not least to T2D, where the diagnostic phenotype (elevated blood glucose or HbA1c) is a consequence of underlying and usually unmeasured physiological defects (e.g., at the level of the pancreatic beta-cell, peripheral tissue, liver, and gut), and the major environmental risk factors are difficult to measure well. Nevertheless, several promising examples of gene-environment interactions relating to cardiometabolic disease exist, as discussed below and described in Table 1, and interaction studies with deep genomic coverage in large cohorts are now conceivable; the hope is that these studies will highlight novel disease mechanisms and biological pathways that will fuel subsequent functional and clinical translation studies. This is important, because diabetes medicine may rely increasingly on genomic stratification of patient populations and disease phenotype, for which gene-environment interaction studies might prove highly informative.",
+      "The genome is often the conduit through which environmental exposures convey their effects on health and disease. Whilst not all diseases act by directly perturbing the genome, the phenotypic responses are often genetically determined. Hence, whilst diseases are often defined as having differing degrees of genetic determination, genetic and environmental factors are, with few exceptions, inseparable features of most diseases, not least type 2 diabetes. It follows that optimizing diabetes prevention and treatment will require that the etiological roles of genetic and environmental risk factors be jointly considered. As we discuss here, studies focused on quantifying gene-environment and gene-treatment interactions are gathering momentum and may eventually yield data that help guide health-related choices and medical interventions for type 2 diabetes and other complex diseases.",
+      "The genome is often the conduit through which environmental exposures convey their effects on health and disease. Whilst not all diseases act by directly perturbing the genome, the phenotypic responses are often genetically determined. Hence, whilst diseases are often defined as having differing degrees of genetic determination, genetic and environmental factors are, with few exceptions, inseparable features of most diseases, not least type 2 diabetes. It follows that optimizing diabetes prevention and treatment will require that the etiological roles of genetic and environmental risk factors be jointly considered. As we discuss here, studies focused on quantifying gene-environment and gene-treatment interactions are gathering momentum and may eventually yield data that help guide health-related choices and medical interventions for type 2 diabetes and other complex diseases.",
+      "Predisposition is influenced by the level of certain environmental exposures, personal factors, access to good-quality primary care, and by genotype. Interactions between genetic and nongenetic risk factors are hypothesized to raise diabetes risk in a synergistic manner; reciprocally, health-enhancing changes in behavior, body composition, or medication may reduce the risk of disease conveyed by genetic factors. Defining the nature of these interactions and identifying ways through which reliable observations of gene-environment interactions (GEIs) can be translated into the public health setting might help 1) optimize targeting of health interventions to persons most likely to respond well to them; 2) improve cost- and health-effectiveness of existing preventive and treatment paradigms; 3) reduce unnecessary adverse consequences of interventions; 4) increase patient adherence to health practitioners' recommendations; and 5) identify novel interventions that are beneficial only in a defined genetic subgroup of the population. In this Perspective, we describe the rationale and evidence relating to the existence of gene-environment and gene-treatment interactions in type 2 diabetes. We discuss the tried, tested, and often-failed approaches to investigating gene-lifestyle interactions in type 2 diabetes; we discuss some recent developments in gene-treatment interactions (pharmacogenetics); and we look forward to the strategies that are likely to dominate these fields of research in the future. We conclude with a discussion of the requirements for translating findings from these future studies into a form where they can be used to help predict, prevent, or treat diabetes. Here we describe the rationale and evidence concerning GEIs and gene-treatment interactions in type 2 diabetes, provide an interpretation of current findings and strategies, and offer a view for their future translation.",
+      "T2DM results from the contribution of many genes [10] , many environmental factors [11] , and the interactions among those genetic and environmental factors.Physical activity and dietary fat have been reported to be important modifiers of the associations between glucose homeostasis and well-known candidate genes for T2DM [12] and there is reason to believe that a significant proportion of the susceptibility genes identified by GWASs will interact with these environmental factors to influence the disease risk.Florez et al. [13] reported that response to the Diabetes Prevention Program lifestyle intervention did not differ by genotype groups at TCF7L2 rs7903146 [13] .A more recent report from the Diabetes Prevention Program [14] showed that among 10 of the recently identified diabetes susceptibility polymorphisms (single nucleotide polymorphisms, SNPs), only CDKN2A/B rs10811661 was shown to marginally modify the effect of the lifestyle intervention on diabetes risk reduction.Similarly, the study of Brito et al. [15] reported that among 17 of the diabetes SNPs, only HNF1B rs4430796 significantly interacted with physical activity to influence impaired glucose tolerance risk and incident diabetes.",
+      "Gene-Environment Interactions. An individual's risk of developing T2D is the product of interaction between the individual's genetic constitution and the environment inhabited by the individual. Whilst the contribution of genetic factors to disease risk is relatively easy to quantify, the impact of environmental exposure is less easily measured in a clinical setting. Nevertheless, efforts have been made to study the interactions between some of the known susceptibility loci for T2D and the environment, and these findings may be useful for the development of prediction models and tailoring clinical treatment for T2D [122,123]. For example, for carriers of the risk allele for TCF7L2, diets of low glycaemic load [124,125] and a more intensive lifestyle modification regime (versus that recommended for nonrisk carriers) [61,62,126,127] have been shown to reduce the risk of T2D. Meaningful studies for gene-environment interactions will require samples of sufficient size to increase statistical power [128] and accurate methods for measuring environmental exposure, for example, the use of metabolomics to identify and assess metabolic characteristics, changes, and phenotypes in response to the environment, diet, lifestyle, and pathophysiological states. This information will allow the generation of better risk prediction models and personalisation/stratification of treatment, the holy grail of GWAS.",
+      "Other aspects that have been overlooked in large GWAS on T2DM relate to environmental effects such as diet, physical activity, and stresses, which may affect gene expression. For example, fish oil may stimulate PPARG in much the same fashion as the thiazolidinedione class of drugs; however, studies on the interaction of the PPARG variant with dietary components have not been performed. The spectacular rise in the incidence of diabetes among Pima Indians and other populations as they adopt Western diets and lifestyles dramatically demonstrates the key role of the environment [12]. Consequently, it could be expected that the effect of a common gene variant among populations that have very different diets and exercise habits might be totally different, thus explaining some instances of lack of replication [4]. Another variable that influences the statistical and real association of an SNP with a disease or response to a diet is epigenetic interaction. Epigenetics is the study of heritable changes in gene function that occur without a change in the DNA sequence, such as DNA methylation and chromatin remodeling. Both mechanisms can affect gene expression by altering the accessibility of DNA to regulatory proteins or complexes such as transcription factors, and they can be influenced by certain nutrients and by overall caloric intake. Thus, it can be expected that long-term exposure to certain diets could produce permanent epigenetic changes in the genome [7]."
+    ],
+    [
+      "Researchers are expanding our understanding of genetic risk factors for diabetes through ongoing discoveries. Genetic variants associated with increased susceptibility to type 2 diabetes, a disease that affects more than 200 million people worldwide, have been identified (NHGRI & NIDDK, 2007). Such discoveries accelerate efforts to understand genetic contributions to chronic illness, as well as facilitate greater investigation of how these genetic factors interact with each other and with lifestyle factors. Ultimately, once the association of these variants with diabetes is confirmed, genetic tests may be utilized to identify (even before escalating blood sugars) those individuals, like Vanessa, who may be able to delay or prevent diabetes with healthy lifestyle decisions and behaviors. Information to assist nurses in this challenge is available in a toolkit \"Your Game Plan for Preventing Type 2 Diabetes\" (Your Game Plan, n.d.). Would you have known whether or not genetic testing was available for Vanessa? If you had said no to this question but could have explained the progress currently being made in understanding diabetes, Vanessa would have had access to the best care possible today.",
+      "A considerable amount of work has focused on dissecting the genetics of diabetes itself; however, fewer studies have been conducted on the molecular mechanisms leading to its specific complications such as DR. To identify susceptibility loci that are associated with T2D retinopathy in a Taiwanese population, we conducted a genome-wide association study involving 749 T2D cases (174 with retinopathy and 575 without retinopathy) and 100 nondiabetic controls and identified 12 previously unknown susceptibility loci related to DR.",
+      "Progress toward wider use of genetic testing in the prediction of type 2 diabetes and its complications will require three developments.The first involves identification of a growing number of risk variants that, collectively, deliver greater predictive and discriminative performance than the subset thus far known.The second involves understanding how genetic information can be combined with other conventional risk factors (and possibly with non-DNA-based biomarkers, as these emerge) to provide a more accurate assessment of individual risk.It should be kept in mind that susceptibility genotype information will not be orthogonal to those traditional factors, since several of them (such as ethnicity, family history, and BMI) capture overlapping genetic information.The third development will be evidence that imparting such information results in clinically meaningful differences in individual behavior or provides a more rational basis for therapeutic or preventative interventions.",
+      "Future directions  Delays in identifying genetic variants that are robustly associated with differences in individual predisposition to the complications of diabetes have constrained progress towards a mechanistic understanding of these conditions. Some approaches to overcome these limitations are outlined in Figure 4.",
+      "Recent advances in GWAS have substantially improved our understanding of the pathophysiology of diabetes, but the currently identified genetic susceptibility loci are insufficient to explain differences in diabetes risk across different ethnic groups or the rapid rise in diabetes prevalence over the past several decades.Clinical utility of these loci in predicting future risk of diabetes is also limited.",
+      "Conclusions: Together these results provide further evidence for the implication of genetic factors in the development of type 2 diabetes complications and highlight several potential key loci, able to modify the risk of developing these conditions. Moreover, the candidate variant approach proves a strong and consistent effect for multiple variants across different populations.",
+      "Studies show evidence of a considerable genetic component predisposing to diabetic complications, explaining even around 50% of the risk of proliferative retinopathy [11]. In the last few decades, genetic research including genome-wide association studies (GWAS), linkage analysis, and candidate gene approach has revealed several susceptibility loci for diabetic retinopathy and nephropathy (VEGF, CAT, FTO, UCP1, and INSR), and also macrovascular complications (ADIPOQ). Nevertheless, they explain only a small proportion of the phenotypic variation observed in T2DM patients [12][13][14][15][16][17], justifying a need for identification of novel genetic risk factors for T2DM complications and improvement of knowledge about molecular mechanisms underlying these comorbid conditions.",
+      "Methods:  We performed a genome-wide association study in 601 type 2 diabetes patients after stratifying them according to the presence or absence of four types of diabetes complications: diabetic neuropathy, diabetic nephropathy, macrovascular complications, and ophthalmic complications.",
+      "Background: Type 2 diabetes complications cause a serious emotional and economical burden to patients and healthcare systems globally.Management of both acute and chronic complications of diabetes, which dramatically impair the quality of patients' life, is still an unsolved issue in diabetes care, suggesting a need for early identification of individuals with high risk for developing diabetes complications. Methods:We performed a genome-wide association study in 601 type 2 diabetes patients after stratifying them according to the presence or absence of four types of diabetes complications: diabetic neuropathy, diabetic nephropathy, macrovascular complications, and ophthalmic complications. Results:The analysis revealed ten novel associations showing genome-wide significance, including rs1132787 (GYPA, OR = 2.71; 95% CI = 2.02-3.64)and diabetic neuropathy, rs2477088 (PDE4DIP, OR = 2.50; 95% CI = 1.87-3.34),rs4852954 (NAT8, OR = 2.27; 95% CI = 2.71-3.01),rs6032 (F5, OR = 2.12; 95% CI = 1.63-2.77),rs6935464 (RPS6KA2, OR = 2.25; 95% CI = 6.69-3.01)and macrovascular complications, rs3095447 (CCDC146, OR = 2.18; 95% CI = 1.66-2.87)and ophthalmic complications.By applying the targeted approach of previously reported susceptibility loci we managed to replicate three associations: MAPK14 (rs3761980, rs80028505) and diabetic neuropathy, APOL1 (rs136161) and diabetic nephropathy.Conclusions: Together these results provide further evidence for the implication of genetic factors in the development of type 2 diabetes complications and highlight several potential key loci, able to modify the risk of developing these conditions.Moreover, the candidate variant approach proves a strong and consistent effect for multiple variants across different populations.",
+      "Discussion  Here we present the results of the genome-wide association study for T2DM complications performed in a population of Latvia for the first time, revealing 10 susceptibility loci for T2DM complications, including diabetic neuropathy, macrovascular and ophthalmic complications.As in other reports aimed to identify the risk factors of T2DM complications [15,32], the control group of our study consisted of T2DM patients with no evidence of the complication type of interest instead of conventional healthy subjects, since the implementation of healthy controls would rather reveal genetic associations with the diagnosis of T2DM itself, not the T2DM complications.",
+      "Genetic determinants of diabetes and metabolic syndromes.",
+      "Conclusions  As compared with clinical risk factors alone, common genetic variants associated with the risk of diabetes had a small effect on the ability to predict the future development of type 2 diabetes.The value of genetic factors increased with an increasing duration of follow-up.",
+      "Research Gaps  After consideration of the known genetic associations with diabetes risk, consensus developed that the field is not yet at a place where genetics has provided actionable information to guide treatment decisions, with a few notable exceptions, namely in MODY.The experts agreed there is a need to use the increasingly accessible and affordable technologies to further refine our understanding of how genetic variations affect the rate of progression of diabetes and its complications.The expert committee also highlighted the importance of determining categorical phenotypic subtypes of diabetes in order to link specific genetic associations to these phenotypic subtypes.These types of information are necessary to develop the tools to predict response to-and side effects of-therapeutic approaches for diabetes in patient populations.",
+      "COMPLICATIONS  In addition to the genetic determinants of diabetes, several gene mutations and polymorphisms have been associated with the clinical complications of diabetes.The cumulative data on diabetes patients with a variety of micro-and macrovascular complications support the presence of strong genetic factors involved in the development of various complications [200] .A list of genes have been reported that are associated with diabetes complications including ACE and AKR1B1 in nephropathy, VEGF and AKRB1 in retinopathy and ADIPOQ and GLUL in cardiovascular diseases [200] .",
+      "How do we identify the major 'culprits' at the implicated genome-wide association study loci? If population-based genetics, including genome-wide association studies, have allowed progress in the identification of Type 2 diabetes loci to be rapid over the past few years, progress towards determining which of the gene variants close to the implicated loci confer altered disease risk and how (at the molecular, cellular and whole body level) has lagged some way behind.Indeed, given the number of possible single nucleotide polymorphisms and genes, unravelling these questions represents a monumental challenge, requiring multiple, complementary approaches.Nonetheless, the rewards of success, in terms of new understanding of disease mechanisms and even the identification of new targets for therapeutic intervention, are likely to be great, potentially allowing the treatment of underlying disease aetiology in a personalized (stratified) manner.",
+      "During the last decade, there have been substantial advances in the identification and characterization of DNA sequence variants associated with individual predisposition to type 1 and type 2 diabetes.As well as providing insights into the molecular, cellular, and physiological mechanisms involved in disease pathogenesis, these risk variants, when combined into a polygenic score, capture information on individual patterns of disease predisposition that have the potential to influence clinical management.In this review, we describe the various opportunities that polygenic scores provide: to predict diabetes risk, to support differential diagnosis, and to understand phenotypic and clinical heterogeneity.We also describe the challenges that will need to be overcome if this potential is to be fully realized.",
+      "During the last decade, there have been substantial advances in the identification and characterization of DNA sequence variants associated with individual predisposition to type 1 and type 2 diabetes.As well as providing insights into the molecular, cellular, and physiological mechanisms involved in disease pathogenesis, these risk variants, when combined into a polygenic score, capture information on individual patterns of disease predisposition that have the potential to influence clinical management.In this review, we describe the various opportunities that polygenic scores provide: to predict diabetes risk, to support differential diagnosis, and to understand phenotypic and clinical heterogeneity.We also describe the challenges that will need to be overcome if this potential is to be fully realized.",
+      "Conclusions and Future Directions GWAS and GWAS meta-analyses have by far been the most efficient way to identify new T2D genes (Figure 2), but their predictive value for future occurrence of T2D has been very limited compared to classic risk factors such as obesity and fasting glucose levels (Walford et al., 2014).Although it might be good news that our genome does not fully dictate our future, the knowledge of its specificities may help us to improve our health.Early genetic studies showed that the higher risk for T2D conferred by TCF7L2 variant can be reversed by lifestyle intervention (Florez et al., 2006), opening avenues for strategies targeted on genetically selected individuals with pre-diabetes.TCF7L2 has also been shown to be associated with a lower efficiency of oral sulfonylureas in newly diagnosed T2D patients (Pearson et al., 2007), but a more recent Danish study suggested that in contrast to clinical markers, all known T2D-associated variants do not significantly affect the time to prescription of the first drug after disease onset (Hornbak et al., 2014).In other words, frequent SNPs are not helpful to predict patients' futures, though the good use of genetic data may contribute to provide better care to newly diagnosed T2D patients who are currently all treated the same (with metformin).",
+      "Background  Multiple genetic loci have been convincingly associated with the risk of type 2 diabetes mellitus.We tested the hypothesis that knowledge of these loci allows better prediction of risk than knowledge of common phenotypic risk factors alone.",
+      "Genetic predisposition to diabetes mellitus type 2: will large collaborative efforts be able to overcome the geneticist's nightmare?"
+    ],
+    [
+      "Genetics and pharmacogenomics  We are at the dawn of the age of pharmacogenomics and personalized medicine and ever closer to achieving the \"$1,000 genome. \"What does this mean for diabetes?Forward genetic approaches (i.e., starting from phenotype and identifying the genetic cause) to dissecting mendelian forms of diabetes have been hugely successful in identifying a small subset of diabetic patients in whom rare, highly penetrant mutations of a single gene cause their diabetes (13).While common variants of these genes that make a small contribution to polygenic diabetes may also exist (13), the variants causing monogenic diabetes have limited utility in pharmacogenetics due to their low allele frequency.The vast majority of type 2 diabetes patients have polygenetic forms of the disease that typically also require a permissive environment (e.g., obesity, sedentary lifestyle, advancing age, etc.) to be penetrant.Each locus contributes a small amount of risk (odds ratios typically ranging from 1.1- to 1.5-fold), so large cohorts are needed to identify the at-risk alleles.Some of the loci identified to date include transcription factor 7-like 2 (TCF7L2) (14), calpain 10 (CAPN10) (15), peroxisome proliferator-activated receptor \u03b3 (PPARG) (16), and potassium inwardly rectifying channel, subfamily J, member 11 (KCNJ11) (17).However, the pace of gene identification is increasing due to the availability of large-scale databases of genetic variation and advances in genotyping technology.A recent genome-wide study identified solute carrier family 30, member 8 (SLC30A8), a \u03b2 cell Zn transporter, and two other genomic regions as additional diabetes risk loci (18).",
+      "With further progress in unravelling the pathogenic roles of genes and epigenomic phenomena in type 2 diabetes, pharmacogenomic and pharmacoepigenomic studies might eventually yield treatment choices that can be personalised for individual patients.",
+      "Pharmacogenomics of Type 2 Diabetes  With the advent of GWAS, studies on the roles of inherited and acquired genetic variations in drug response have undergone an evolution from pharmacogenetics into pharmacogenomics, with a shift from the focus on individual candidate genes to GWAS [147].Clinically, it is often observed that even patients who receive similar antidiabetic regimens demonstrate large variability in drug disposition, glycemic response, tolerability, and incidence of adverse effects [148].This interindividual variability can be attributed to specific gene polymorphisms involved in the metabolism, transportation, and therapeutic mechanisms of oral antidiabetic drugs.Pharmacogenomics is on the agenda to explore feasible genetic testing to predict treatment outcome, so that appropriate steps could be taken to treat type 2 diabetes more efficiently.",
+      "Future directions  Delays in identifying genetic variants that are robustly associated with differences in individual predisposition to the complications of diabetes, have constrained progress towards a mechanistic understanding of these conditions.Some approaches to overcome these limitations are outlined in Figure 4.",
+      "Genomics of T2D  Diet, lifestyle, environment, and even genetic variation influence an individual's response to disease therapy.Like GWAS which identify genetic variants conferring risk for a disease, studies have been carried out for identifying genetic variants responsible for patient differences in drug response.Pharmacogenomics in diabetes focuses on the study of gene polymorphisms which influence an individual's response to antidiabetic drugs.Such genetic variants influence the pharmacodynamics and/or pharmacokinetics of the drug, thus affecting its efficacy or toxicity in an individual.The difference in response to treatments and therapies across individuals on account of these factors strengthens the case for personalized medicine in diabetes.",
+      "Genetics & genomics of T2D  \u2022 Genome-wide association studies (GWAS) have been helpful in identifying a large number of genetic variants conferring risk to T2D.However, only close to 10% heritability is explained by these variants.Other genetic variants, particularly those which are rare but with significant effects need to be identified.\u2022 Genetic variability is responsible for the difference in response to antidiabetic drugs seen across individuals.",
+      "The aim of this study was to summarize current knowledge and provide perspectives on the relationships between human genetic variants, type 2 diabetes, antidiabetic treatment, and disease progression.Type 2 diabetes is a complex disease with clear-cut diagnostic criteria and treatment guidelines.Yet, the interindividual response to therapy and slope of disease progression varies markedly among patients with type 2 diabetes.Gene-gene, gene-environment, and gene-treatment interactions may explain some of the variation in disease progression.Several genetic variants have been suggested to be associated with response to antidiabetic drugs.Some are present in drug receptors or drug metabolizers (OCT genes, KCNJ11, ABCC8, and CYP2C9).Numerous type 2 diabetes risk variants have been identified, but genetic risk score models applying these variants have failed to identify 'disease progressors' among patients with diabetes.Although genetic risk scores are based on a few known loci and only explain a fraction of the heritability of type 2 diabetes, it seems that the genes responsible for the development of diabetes may not be the same driving disease progression after the diagnosis has been made.Pharmacogenetic interactions explain some of the interindividual variation in responses to antidiabetic treatment and may provide the foundation for future genotype-based treatment standards.Pharmacogenetics and Genomics 25:475-484",
+      "Diabetes progression is a multifactorial process; however, pharmacogenetics seems to play an important role in understanding the different phenotypes and progression rates among diabetic patients.Genetic variants associated with decreased effect of a certain drug might explain why some individuals are more likely to experience glycemic deterioration on a given treatment.In the following sections, different genetic variants and their impact on treatment efficacy and outcome will be addressed.",
+      "The aim of this study was to summarize current knowledge and provide perspectives on the relationships between human genetic variants, type 2 diabetes, antidiabetic treatment, and disease progression.Type 2 diabetes is a complex disease with clear-cut diagnostic criteria and treatment guidelines.Yet, the interindividual response to therapy and slope of disease progression varies markedly among patients with type 2 diabetes.Gene-gene, gene-environment, and gene-treatment interactions may explain some of the variation in disease progression.Several genetic variants have been suggested to be associated with response to antidiabetic drugs.Some are present in drug receptors or drug metabolizers (OCT genes, KCNJ11, ABCC8, and CYP2C9).Numerous type 2 diabetes risk variants have been identified, but genetic risk score models applying these variants have failed to identify 'disease progressors' among patients with diabetes.Although genetic risk scores are based on a few known loci and only explain a fraction of the heritability of type 2 diabetes, it seems that the genes responsible for the development of diabetes may not be the same driving disease progression after the diagnosis has been made.Pharmacogenetic interactions explain some of the interindividual variation in responses to antidiabetic treatment and may provide the foundation for future genotype-based treatment standards.Pharmacogenetics and Genomics 25:475-484",
+      "To date, a number of genetic variants have been identified to be associated with response to antidiabetic drugs.Of these, some variants are present in either drug receptors or drug metabolizers as for OCT genes, KCNJ11, ABCC8, and CYP2C9.Other variants are known T2D susceptibility variants such as TCF7L2.To identify variants of importance for antiglycemic drug response, GWAS in large cohorts of patients with diabetes with detailed measures of pharmacotherapy are lacking.The pharmacologic management of patients with diabetes often involves drug classes other than antidiabetics.Pharmacogenetic studies on statin and antihypertensive treatment have reported several genetic variants associated with treatment response and adverse drug reactions [101,102].It therefore seems natural to conclude that the future perspectives in pharmacogenetics is to conduct genetic studies in large cohorts with wellphenotyped individuals, thorough data collection on baseline treatment, concomitant treatment, adherence to therapy as well as data collection on comorbidity and additional disease diagnoses.These types of pharmacogenetic studies may provide unique opportunities for future genotype-based treatment standards and may help in delaying or changing the slope of disease progression among patients with T2D.",
+      "Genetic determinants of diabetes and metabolic syndromes.",
+      "Thus, specific answers are lacking as to the genetic basis for type 2 diabetes.Still, speculations can be made about what eventually will be found.It is almost certain the genetic basis for type 2 diabetes and other common metabolic diseases will be extremely complex-that a predisposition for the disease will require several genetic hits as opposed to just one.Also, it is generally assumed there will be many susceptibility genes for type 2 diabetes, with enormous variability in different families and ethnic groups.Not known is whether there will be a common form of type 2 diabetes, with any one or even a few susceptibility genes accounting for a sizeable percentage of affected persons.As such, identifying diabetes genes will be slow and difficult.",
+      "Ta rge ted T r e atmen t a nd Pr e v en t ion  4][75] In monogenic forms of diabetes, at least, genetic testing already drives the choice of therapy.For example, in patients who have maturity-onset diabetes of the young due to mutations in the gene encoding glucokinase (GCK), the hyperglycemia is mild and stable, the risk of complications is low, and dietary management is often sufficient.In contrast, in patients who have maturity-onset diabetes of the young due to mutations in HNF1A, the disease follows a more aggressive course, with a greater risk of severe complications, but is particularly responsive to the hypoglycemic effects of sulfonylureas. 62,73Most children with neonatal diabetes have mutations in KCNJ11 or ABCC8, adjacent genes that jointly encode the beta-cell ATP-sensitive potassium channel that mediates glucose-stimulated insulin secretion and is the target of sulfonylureas.In such children, treatment with sulfonylureas has proved more effective and convenient than the lifelong insulin therapy previously considered the default option. 74,75n children with severe obesity due to profound leptin deficiency, exogenous leptin therapy is lifesaving. 76s yet, there are insufficient genetic data to support management decisions for common forms of type 2 diabetes and obesity. 77Although the TCF7L2 genotype is associated with variation in the response to sulfonylurea treatment, 78 the effect is too modest to guide the care of individual patients.For the time being, the contribution of genetic information to therapy is most likely to come through the drug-discovery pipeline.Information from genetic studies could be used to identify new targets for pharmaceutical intervention that have validated effects on physiological characteristics, to provide information about new and existing targets (e.g., clues about the long-term safety of pathway intervention), 32 and to characterize high-risk groups to enable more efficient clinical trials of agents designed to reduce the progression of type 2 diabetes or obesity or the risk of complications.",
+      "Type 2 Diabetes  While a subset of genetic variants are linked to both type 1 and type 2 diabetes (42,43), the two diseases have a largely distinct genetic basis, which could be leveraged toward classification of diabetes (44).Genome-wide association studies have identified more than 130 genetic variants associated with type 2 diabetes, glucose levels, or insulin levels; however, these variants explain less than 15% of disease heritability (45)(46)(47).There are many possibilities for explaining the majority of type 2 diabetes heritability, including disease heterogeneity, gene-gene interactions, and epigenetics.Most type 2 variants are in noncoding genomic regions.Some variants, such as those in KCNQ1, show strong parent-of-origin effects (48).It is possible that children of mothers carrying KCNQ1 are born with a reduced functional b-cell mass and thereby are less able to increase their insulin secretion when exposed to insulin resistance (49).Another area of particular interest has been the search for rare variants protecting from type 2 diabetes, such as loss-of-function mutations in SLC30A8 (50), which could offer potential new drug targets for type 2 diabetes.",
+      "Research Gaps  After consideration of the known genetic associations with diabetes risk, consensus developed that the field is not yet at a place where genetics has provided actionable information to guide treatment decisions, with a few notable exceptions, namely in MODY.The experts agreed there is a need to use the increasingly accessible and affordable technologies to further refine our understanding of how genetic variations affect the rate of progression of diabetes and its complications.The expert committee also highlighted the importance of determining categorical phenotypic subtypes of diabetes in order to link specific genetic associations to these phenotypic subtypes.These types of information are necessary to develop the tools to predict response to-and side effects of-therapeutic approaches for diabetes in patient populations.",
+      "Genome-wide association (GWAS) and sequencing studies are providing new insights into the genetic basis of type 2 diabetes (T2D) and the inter-individual variation in glycemic traits, including levels of glucose, insulin, proinsulin and hemoglobin A1c (HbA1c).At the end of 2011, established loci (P < 5 \u00d7 10 \u22128 ) totaled 55 for T2D and 32 for glycemic traits.Since then, most new loci have been detected by analyzing common [minor allele frequency (MAF)>0.05]variants in increasingly large sample sizes from populations around the world, and in trans-ancestry studies that successfully combine data from diverse populations.Most recently, advances in sequencing have led to the discovery of four loci for T2D or glycemic traits based on low-frequency (0.005 < MAF \u2264 0.05) variants, and additional low-frequency, potentially functional variants have been identified at GWAS loci.Established published loci now total \u223c88 for T2D and 83 for one or more glycemic traits, and many additional loci likely remain to be discovered.Future studies will build on these successes by identifying additional loci and by determining the pathogenic effects of the underlying variants and genes.",
+      "Together, the findings from these studies were among the first to demonstrate that the genetic etiology of hyperglycemia may modulate response to hypoglycemia agents.Such results yielded strong implications for patient management and paved the way toward elucidating additional genetic factors that might influence drug response in the treatment of T2D.",
+      "A  number of studies have implicated a genetic basis for type 2 diabetes (1).The discovery of monogenic forms of the disease underscored the phenotypic and genotypic heterogeneity, although monogenic forms account for only a few percent of the disease (1).Defining the genetic basis of the far more common polygenic form of the disease presents more difficulties (2,3).Nevertheless, some interesting results have recently emerged.A genome scan of Hispanic-American families (330 affected sib-pairs [ASPs]) found linkage to chromosome 2q37 (logarithm of odds [LOD] 4.15) (4), and the causative gene has been recently reported (5).A number of other genome scans in various racial groups have identified other putative susceptibility loci (6 -8).The largest genome-wide scan for type 2 diabetes loci reported to date studied 477 Finnish families (716 ASPs) and found evidence for linkage to chromosome 20q12-13.1(LOD 2.06 at D20S107) (9).Interestingly, similar results have been reported by at least three other groups (10 -12).",
+      "Because more than one genetic mutation contributes to T1D, the differences that occur between individuals of different backgrounds (for instance, race and locality) may need to be considered in the design of treatments.Personalized medicine is about the ability to classify individuals into subpopulations that differ in their susceptibility to a particular disease or in their response to a specific treatment (Blau and Liakopoulou, 2013;Timmeman, 2013).This will allow for a more accurate diagnosis per individual, and design of specific treatment plans including gene therapy.",
+      "Genetic predisposition to diabetes mellitus type 2: will large collaborative efforts be able to overcome the geneticist's nightmare?"
+    ]
+  ]
+}
\ No newline at end of file
diff --git a/gnqa/paper1_eval/src/data/datasets/old/experts_aging1_dataset.json b/gnqa/paper1_eval/src/data/datasets/old/experts_aging1_dataset.json
new file mode 100644
index 00000000..6429e096
--- /dev/null
+++ b/gnqa/paper1_eval/src/data/datasets/old/experts_aging1_dataset.json
@@ -0,0 +1,128 @@
+{
+  "question": [
+    "Which mouse genes have been associated with longevity?",
+    "Is lifespan determined by genetics?",
+    "Is there a direct association between aging and susceptibility to having diabetes?",
+    "Which genes are associated with aging in humans?",
+    "What genetic factors influence aging in humans?"
+  ],
+  "answer": [
+    "The mouse genes associated with longevity include the C3H allele at D2Mit58, the BALB allele at D16Mit182, the C57BL/6 allele at D4Mit84, the C3H allele at D9Mit110, and the C57BL/6 and C3H alleles at D12Mit167. Additionally, a locus on chromosome 2 and another on chromosome 6 have been associated with longevity.",
+    "Lifespan is partially determined by genetics, with studies estimating that genetics contribute to about 25-30% of the variation in human lifespan. However, it is also significantly influenced by non-heritable factors such as environment and lifestyle.",
+    "Yes, there is a direct association between aging and susceptibility to having diabetes. The risk of onset of type 2 diabetes increases with age, and most diabetic patients in certain regions are 40 years old or more. Additionally, aging is associated with changes in body composition and glucose tolerance, which can contribute to the development of diabetes.",
+    "The genes associated with aging in humans are APOE and FOXO3A.",
+    "Several genetic factors influence aging in humans. These include the apolipoprotein E gene (APOE), the forkhead box O3A (FOXO3A) gene in the insulin-IGF1 signaling pathway, and the exonuclease 1 (EXO1) gene. Other genes associated with aging are those involved in pathways such as insulin-like growth factor (IGF)-insulin signaling, regulation of lipoprotein metabolism, the mTOR pathway, and the oxidative stress response. Additionally, genes related to immune response, energy metabolism, signal transduction pathways, and cellular senescence also play a role in aging."
+  ],
+  "contexts": [
+    [
+      "DOI: https://doi.org/10.7554/eLife.75244  \b  24 of 30 Chromosomes and Gene Expression | Genetics and Genomics  Research article\ufeff\ufeff\ufeff\ufeff\ufeff\ufeff Continued Author(s)  Year  Dataset title  Dataset URL  Database and Identifier  Longevityteam  2021  Genetics of longevity in BXD mice  http://www.\u200b BDL_10006, 10006 genenetwork.\u200borg/\u200b show_\u200btrait?\u200btrait_\u200bid=\u200b 10006&\u200bdataset=\u200bBXD-\u200b LongevityPublish  Longevityteam  2021  Genetics of longevity in BXD mice  http://www.\u200b BDL_10010, 10010 genenetwork.\u200borg/\u200b show_\u200btrait?\u200btrait_\u200bid=\u200b 10010&\u200bdataset=\u200bBXD-\u200b LongevityPublish  Longevityteam  2021  Genetics of longevity in BXD mice  http://www.\u200b BDL_10011, 10011 genenetwork.\u200borg/\u200b show_\u200btrait?\u200btrait_\u200bid=\u200b 10011&\u200bdataset=\u200bBXD-\u200b LongevityPublish  Longevityteam  2020  Genetics of longevity in BXD mice  http://www.\u200b BDL_10021, 10021 genenetwork.\u200borg/\u200b show_\u200btrait?\u200btrait_\u200bid=\u200b 10021&\u200bdataset=\u200bBXD-\u200b LongevityPublish  Longevityteam  2020  Genetics of longevity in BXD mice  http://www.\u200b BDL_10022, 10022 genenetwork.\u200borg/\u200b show_\u200btrait?\u200btrait_\u200bid=\u200b 10022&\u200bdataset=\u200bBXD-\u200b LongevityPublish  Longevityteam  2020  Genetics of longevity in BXD mice  http://www.\u200b BDL_10025, 10025 genenetwork.\u200borg/\u200b show_\u200btrait?\u200btrait_\u200bid=\u200b 10025&\u200bdataset=\u200bBXD-\u200b LongevityPublish  Longevityteam  2021  Genetics and epigenetics of aging and longevity in BXD mice  http://www.\u200b BDL_10066, 10066 genenetwork.\u200borg/\u200b show_\u200btrait?\u200btrait_\u200bid=\u200b 10066&\u200bdataset=\u200bBXD-\u200b LongevityPublish  References Albertsen HM, Smith SA, Mazoyer S, Fujimoto E, Stevens J, Williams B, Rodriguez P, Cropp CS, Slijepcevic P, Carlson M. 1994.",
+      "DOI: https://doi.org/10.7554/eLife.75244  \b  24 of 30 Chromosomes and Gene Expression | Genetics and Genomics  Research article\ufeff\ufeff\ufeff\ufeff\ufeff\ufeff Continued Author(s)  Year  Dataset title  Dataset URL  Database and Identifier  Longevityteam  2021  Genetics of longevity in BXD mice  http://www.\u200b BDL_10006, 10006 genenetwork.\u200borg/\u200b show_\u200btrait?\u200btrait_\u200bid=\u200b 10006&\u200bdataset=\u200bBXD-\u200b LongevityPublish  Longevityteam  2021  Genetics of longevity in BXD mice  http://www.\u200b BDL_10010, 10010 genenetwork.\u200borg/\u200b show_\u200btrait?\u200btrait_\u200bid=\u200b 10010&\u200bdataset=\u200bBXD-\u200b LongevityPublish  Longevityteam  2021  Genetics of longevity in BXD mice  http://www.\u200b BDL_10011, 10011 genenetwork.\u200borg/\u200b show_\u200btrait?\u200btrait_\u200bid=\u200b 10011&\u200bdataset=\u200bBXD-\u200b LongevityPublish  Longevityteam  2020  Genetics of longevity in BXD mice  http://www.\u200b BDL_10021, 10021 genenetwork.\u200borg/\u200b show_\u200btrait?\u200btrait_\u200bid=\u200b 10021&\u200bdataset=\u200bBXD-\u200b LongevityPublish  Longevityteam  2020  Genetics of longevity in BXD mice  http://www.\u200b BDL_10022, 10022 genenetwork.\u200borg/\u200b show_\u200btrait?\u200btrait_\u200bid=\u200b 10022&\u200bdataset=\u200bBXD-\u200b LongevityPublish  Longevityteam  2020  Genetics of longevity in BXD mice  http://www.\u200b BDL_10025, 10025 genenetwork.\u200borg/\u200b show_\u200btrait?\u200btrait_\u200bid=\u200b 10025&\u200bdataset=\u200bBXD-\u200b LongevityPublish  Longevityteam  2021  Genetics and epigenetics of aging and longevity in BXD mice  http://www.\u200b BDL_10066, 10066 genenetwork.\u200borg/\u200b show_\u200btrait?\u200btrait_\u200bid=\u200b 10066&\u200bdataset=\u200bBXD-\u200b LongevityPublish  References Albertsen HM, Smith SA, Mazoyer S, Fujimoto E, Stevens J, Williams B, Rodriguez P, Cropp CS, Slijepcevic P, Carlson M. 1994.",
+      "Leduc MS, Hageman RS, Meng Q et al (2010) Identification of genetic determinants of IGF-1 levels and longevity among mouse inbred strains. Aging Cell 9(5):823\u2013836. doi:10.1111/j.1474-9726.2010.00612.x 10. Lang DH, Gerhard GS, Griffith JW et al (2010) Quantitative trait loci (QTL) analysis of longevity in C57BL/6J by DBA/2J (BXD) recombinant inbred mice. Aging Clin Exp Res 22(1):8\u201319 11. Gelman R, Watson A, Bronson R et al (1988) Murine chromosomal regions correlated with longevity. Genetics 118(4):693\u2013704 12. Jackson AU, Galecki AT, Burke DT et al (2002) Mouse loci associated with life span exhibit sex-specific and epistatic effects.",
+      "Conclusions: These results suggest a novel locus influencing survival in the B6/D2 genetic background, perhaps via a metabolic disorder that emerges by 200 days of age in male animals. Keywords: Pathology, Longevity, Lifespan, Mouse, Linkage. Introduction: Longevity, the quintessential complex trait, likely reflects all aspects of an organism\u2019s life history. In humans, the heritability of age at death is estimated at 25\u201333% [1]. Genetic contributions to mortality rates are thus of great interest and may aid in the understanding of disease etiology and the process of aging itself [2].",
+      "Here, we have extended this analysis to search for genotypes related to survival to the age of 800 days in a population of a reciprocal F2 cross between (B6) and (D2) mice. Since QTL for longevity in mice have shown strong sex specificity [10, 12], we conducted sex-specific analyses. In addition, we determined whether there were any changes in pathology associated with the loci that showed frequency distortions with aging. To confirm the associations of the loci of interest with longevity and pathology, we performed replication analyses on a panel of BXD recombinant inbred strains.",
+      "352(6291): p. aad0189. Liao, C.Y. , et al. , Genetic variation in the murine lifespan response to dietary restriction: from life extension to life shortening. Aging Cell, 2010. 9(1): p. 92-5. Johnson, M., Laboratory Mice and Rats. Mater. Methods, 2012. 2: p. 113. Fontaine, D.A. and D.B. Davis, Attention to Background Strain Is Essential for Metabolic Research: C57BL/6 and the International Knockout Mouse Consortium. Diabetes, 2016. 65(1): p. 25-33. Simon, M.M. , et al. , A comparative phenotypic and genomic analysis of C57BL/6J and C57BL/6N mouse strains. Genome Biol, 2013. 14(7): p. R82. Lilue, J., et al.",
+      "Mamm Genome 2001;12: 930\u20132. 21 Gelman R, Watson A, Bronson R, Yunis E. Murine chromosomal regions correlated with longevity. Genetics 1988;118:693\u2013704. 22 Peirce JL, Lu L, Gu J, Silver LM, Williams RW. A new set of BXD recombinant inbred lines from advanced intercross populations in mice. BMC Genet 2004;5:7. 23 Rahman ZS, Tin SK, Buenaventura PN et al. A novel susceptibility locus on chromosome 2 in the (New Zealand Black \\ New Zealand White) F1 hybrid mouse model of systemic lupus erythematosus. J Immunol 2002;168:3042\u20139. 24 Kono DH, Burlingame RW, Owens DG et al.",
+      "Conversely, the BXD strain with the shortest life span (BXD14) has the lowest responsiveness to the stimulatory effect of TGF-\u03b22 when old (48). The region on chromosome 2 where a suggestive QTL regulating the responsiveness to TGF-\u03b22 in old mice is located also contains two QTL for longevity (32). Finally, the strongest support for this hypothesis is the correlation between longevity and the age-related increase in the serum-dependent effect of TGF-\u03b22 on LSK cells, the extent of which may determine stem cell function in aged mice.",
+      "FIGURE 8-5 Genetic regulation of longevity in mice stratified by cause of death. Female mice that inherit the C3H allele at D2Mit58 plus the BALB allele at D16Mit182 (light gray bars) have significantly higher longevity than their sisters (dark gray bars) with the C57BL/6 plus DBA/2 allele combination (\"all causes\" of death combined). Subsets of mice that died either of cancer or of a nonneoplastic (\"benign\") illness both show the association between genotype and longevity. Among the mice dying of neoplasia, subsets dying of lymphoma or of fibrosarcoma show equivalent, and significant, genotypic effects. Bars indicate means plus standard error of the mean. SOURCE: Miller et al. (unpublished results).",
+      "The available dataset also provides examples in which genetic variants seem to influence the risk of specific late-life diseases. Figure 8-6, for example, shows longevity results for mice stratified by their inheritance at the 12th chromosome locus D12Mit167. This is a locus associated with differential longevity in both male and female mice, with the strongest effect (adjusted p < 0.01) seen in those mice living more than 657 days (Jackson et al., unpublished results). The longest-lived mice are those that inherit both the C57BL/6 allele from their mother and the C3H allele from their father; on average, they survive 93 days longer than siblings with the BALB plus C3H combination. Figure 8-6 shows that D12Mit167, like the pair of loci illustrated in Figure 8-5, has significant and similar effects in mice dying of cancer (85 days) and in mice dying of non-neoplastic diseases (126 days). A more detailed analysis of the cancers, however, suggests that while lymphoma and hepatoma victims are equally protected by the favorable alleles (effect sizes of 93 and 167 days, respec-  mice of two subgroups: those dying of the urinary syndrome MUS, and those dying of all other causes. The genetic analysis contrasts mice with both the C57BL/6 allele at D4Mit84 and the C3H allele at D9Mit110 to mice with any of the three other allele combinations. In the males dying of causes other than MUS, this allele pair is associated with a 170-day increment in longevity (post-hoc p < 0.00003). But for males that do die of MUS, the same allele combination is associated with a 187-day decline in mean life span (post-hoc p < 0.03). This effect is thus pleiotropic, in that these alleles accelerate death in mice susceptible to MUS, while postponing death for all other males in the population. Although these loci are associated with differential longevity in mice that do develop MUS, they do not have a significant effect on the chances that MUS will indeed occur (not shown). The risk of developing MUS seems to be under control of a separate locus on chromosome 6. As shown in the bottom panel of Figure 8-7, males that inherit the C3H allele at D6Mit268 are far more likely to develop MUS (28 percent risk) than are their brothers who receive the DBA/2 allele at this locus (7 percent risk; p = 0.012 by two-tailed Fisher's exact test).",
+      "High levels of CD8M cells are associated with diminished longevity in mated females (left panel; p < 0.001), but not in virgin females (center panel). Among virgin males, those dying of diseases other than the urinary syndrome MUS show no association between CD8M and longevity (open circles, upper line), but those dying because of MUS show a nonsignificant trend (filled circles, lower line, R = -0.27, p = 0.13) similar to the relationship observed in mated females. SOURCE: Miller et al. (unpublished results). Male or female mice that inherit the C57BL/6 (maternal) and C3H (paternal) alleles at D12Mit167 (light gray bars) are longer lived than their siblings that inherit the BALB plus C3H combination. The \"effect size\" shown at the right represents that difference in mean longevity between mice in the two genetically different groups, with (**) = p < 0.01 and (*) = p < 0.05 by t-test. Similar effect sizes are seen for mice dying of cancer or of non-neoplastic illnesses (\"benign\"), and among the cancer deaths the genetic effect is similar for deaths due to lymphoma and hepatoma. The genetic effect on longevity seems to be minimal, however, for mice dying of fibrosarcoma. Bars show means plus standard errors. SOURCE: Miller et al. (unpublished results).",
+      "Our own work has taken a different tack: we have attempted to determine whether mutations with differential effects on aging may be present within the many available populations of laboratory-adapted inbred mice. The goal is not so much to clone these genes, if indeed they exist, because positional cloning strategies of this kind require many thousands of animals and would be extremely expensive using an assay, age at death, that is itself so costly. Instead, the goal has been to use gene mapping methods to test hypotheses about aging and to develop new animal models that will be useful for testing well-specified hypotheses about the molecular basis for age-dependent changes. In the absence of a validated battery of biomarkers of aging, we (like most others) have reluctantly decided to use mouse life span as a crude surrogate for aging itself, reasoning that genetic alleles that extend life span well beyond the median for the tested population may be operating via an influence on aging itself. Work conducted using recombinant inbred mouse stocks (Gelman et al., 1988; de Haan and Van Zant, 1999) has suggested that life-span differences between pairs of inbred mouse lines might reflect the influence of as few as 4-7 polymorphic loci, providing some basis for hope that some of these would have an effect large enough to be detected by a genome scan experiment involving 300-1,200 mice.",
+      ", Vogler, G.P., Vandenbergh, D.J., Blizard, D.A., Stout, J.T. & McClearn, G.E. Quantitative Trait Locus (QTL) Analysis of Longevity in C57BL/6J by DBA/2J (BXD) Recombinant Inbred Mice. Aging Clin Exp Res (in press). Lionikas, A., Blizard, D.A., Vandenbergh, D.J., Glover, M.G., Stout, J.T., Vogler, G.P., McClearn, G.E. & Larsson, L. (2003) Genetic architecture of fast- and slow-twitch skeletal muscle weight in 200-day-old mice of the C57BL/6J and DBA/2J lineage. Physiol Genomics 16, 141\u2013152. Lionikas A., Blizard D.A., Gerhard G.S., Vandenbergh D.J., Stout J.T., Vogler G.P., McClearn G.E.",
+      "Deficiency mapping of quantitative trait loci affecting longevity in Drosophila melanogaster. Genetics 2000;156:1129\u20131146. [PubMed: 11063689] 33. Ma RZ, et al. Identification of Bphs, an autoimmune disease locus, as histamine receptor H1. Science 2002;297:620\u2013623. [PubMed: 12142541]  Nat Rev Genet. Author manuscript; available in PMC 2007 November 5. Page 12  NIH-PA Author Manuscript  34. Vivian JL, Chen Y, Yee D, Schneider E, Magnuson T. An allelic series of mutations in Smad2 and Smad4 identified in a genotype-based screen of N-ethyl-N-nitrosourea-mutagenized mouse embryonic stem cells. Proc. Natl Acad. Sci. USA 2002;99:15542\u201315547. [PubMed: 12432092] 35. Vogel G. Scientists dream of 1001 complex mice.",
+      "34. Gelman R, Watson A, Bronson R & Yunis E. Murine chromosomal regions correlated with longevity. Genetics 118, 693\u2013704 (1988). [PubMed: 3163317] 35. Houtkooper RH et al. The metabolic footprint of aging in mice. Sci. Rep. 1, (2011). 36. Houtkooper RH et al. Mitonuclear protein imbalance as a conserved longevity mechanism. Nature 497, 451\u2013457 (2013). [PubMed: 23698443] 37. Williams EG et al. An evolutionarily conserved role for the aryl hydrocarbon receptor in the regulation of movement. PLOS Genet. 10, e1004673 (2014). [PubMed: 25255223] 38. Lang DH et al. Quantitative trait loci (QTL) analysis of longevity in C57BL/6J by DBA/2J (BXD) recombinant inbred mice. Aging Clin. Exp. Res. 22, 8\u201319 (2010).",
+      "In addition, the B6 mouse strain is one of the longest-lived mouse strains with a mean lifespan of 3 years versus other mouse strains with mean lifespan from 1.5-2 years. Therefore, it is evident that the genetic background of a particular mouse strain can have a profound effect on the biology of the HSC population as well as organismal longevity. Indeed, it is for this reason that it is difficult to compare findings from various laboratories where different mouse strains are used.",
+      "This study indicated a large amount of genetic variation for mouse longevity; heritability was 34% for AL and 36% for DR (60% of AL food intake). There was no significant correlation between mean longevity under these two conditions, although maximum lifespans of the AL and DR mice were significantly correlated. Similar observations were made at the UTHSCSA on the ILSXISS RI mice (Liao et al., 2010a, b; Mattson 2010), where they also observed similar heritability (28% AL males, 36% AL females, 55% DR males, 53% DR females).",
+      "For females, hairs of the congenic mice grew 31% faster, also highly significant (P = 0.0006, 1-tailed). These results validated the presence of a gene in the differential region affecting FE. Discussion We report the outcomes of a quantitative genetic study on aging and longevity in the mouse. We studied an extant series of recombinant inbred strains (ILSXISS) that have been used both in DR aging studies and to study alcohol sensitivity (Williams et al., 2004).",
+      "(2007) is a separate issue from the analyses conducted in this study (the AL efficiency model will be tested in future studies). Exp Gerontol. Author manuscript; available in PMC 2011 September 1. Rikke et al. Other studies have also reported that individual mice that maintained the highest BW were likely to be the longest-lived individuals among cohorts of genetically identical mice (Weindruch et al., 1986; Harper et al., 2006).",
+      "Age-associated changes are conserved between mouse strains  Life span and aging vary between mouse strains. For example, C57BL/6 mice are long-lived compared to the short-lived DBA/2 mice (Turturro et al. 1999). To test the generality of our observations, we also examined LT-HSCs, ST-HSCs and MPPs in young and old mice from the DBA/2 strain, which originates from a distinct breeding lineage (Fox 1997)."
+    ],
+    [
+      "Our results show that longevity is partly determined by the predisposition to common diseases and, to an even greater extent, by modifiable risk factors. The genetic architecture of lifespan appears complex and diverse and there appears to be no single genetic elixir of long life.",
+      "Longevity is of interest to us all, and philosophers have long speculated on the extent to which it is pre-determined by fate. Here we focus on a narrower question: the extent and nature of its genetic basis and how this inter-relates with that of health and disease traits. In what follows, we shall use longevity as an umbrella term. We shall also more specifically refer to lifespan (the duration of life) and long-livedness (living to extreme old age, usually defined by a threshold, such as 90 years). Up to 25% of the variability in human lifespan has been estimated to be genetic [1], but genetic variation at only three loci (near APOE, FOXO3A and CHRNA3/5) [2][3][4][5] has so far been demonstrated to be robustly associated with lifespan.",
+      "GENETICS OF LIFE SPAN IN HUMANS  Most studies of human twins agree that the heritability of life span is less than 50% (45,68). Of particular interest is an ongoing study of aging in Swedish twins that includes a large group of adopted twins who were reared separately. Ljungquist et al. (68) concluded that \"a maximum of one-third the variance in integrated mortality risk is attributable to genetic factors and that almost all of the remaining variance is due to nonshared, individually unique environmental factors.\" Moreover, this heritability declined with age and was negligible after the age of 85 in men and 90 in women.",
+      "How can lifespan be controlled by a single gene? Two possibilities are, first, that the mutations that extend lifespan are in genes whose products regulate the activity of many other genes and, second, that these genes do not in fact control the rate of ageing.",
+      "Since that time, observations across species have shown that life span can be extended by genetic factors. One of the first demonstrations of this entailed the study of recombinant inbred populations of the nematode worm Caenorhabditis elegans by Thomas E. Johnson. Tom, then a postdoc in William (Bill) Wood's lab at the University of Colorado Boulder, and Bill demonstrated that crosses of C. elegans strains did not display the heterosis effect that interfered with many other studies: \"As predicted, we found significant genetic effects on life span as well as other life history traits.\" This finding established a method for evaluating genetic factors that influenced life-span variation. In fact, their measurements of life span of the recombinant inbred strains demonstrated the heritability of life span to be 19%-51% (1). Consistent with theories of the 1970s and 1980s, it was concluded that these genetic factors were a collection of small influences across many genes. This finding was one of the first steps in demonstrating that genetic factors influence aging. As genetic analysis was making great progress in understanding other biological processes, such as developmental programming, the realization that aging could be investigated using the same tools was highly significant.",
+      "Although it is known that health and lifespan are heavily influenced by genetics [14], variations in the lifespan of different individuals within the same species seem to be more the result of the accumulation over time of molecular damage that compromises the function of the cells [15]. These molecular alterations can occur both at the genetic and epigenetic levels and depend on genetic, environmental, and stochastic factors [16]. This complex multifactorial mix determines characteristics, such as longevity and a healthy lifespan, which are central concerns of human existence (Fig. 13.1). This chapter describes different types of tools in genomics used in ageing research and their different applications in clinical scenarios.",
+      "Age at death in adulthood has a moderate genetic component overall, with a heritability of approximately 25% (Murabito et al., 2012). Heritability of longevity increases with age, with a negligible genetic contribution to survival up to approximately 60 years of age, after which an increasing genetic component to survival is observed (Brooks-Wilson, 2013; Christensen et al., 2006). Most genetic studies of aging have focused on long-lived individuals, typically defined as centenarians 100 years or older, who may have had exceptional survival due to medical interventions (Murabito et al., 2012). A number of genetic associations with exceptional longevity have been made (Atzmon et al., 2006; Bojesen and Nordestgaard, 2008; Hurme et al., 2005; Kuningas et al., 2007; Melzer et al., 2007; Pawlikowska et al., 2009; Sanders et al., 2010; Suh et al., 2008; Willcox et al., 2008), with only markers at APOE and FOXO3A being well replicated (Murabito et al., 2012). Overall, the results of genetic and epidemiological longevity studies suggest aging is a complex trait and that achievement of exceptional longevity may not best capture the genetics of resistance to or delay of age-associated disease (Christensen et al., 2006).",
+      "Introduction  Worldwide human populations have shown an increase in mean life expectancy in the past two centuries (Oeppen & Vaupel, 2002). This is mainly because of environmental factors such as improved hygiene, nutrition, and health care. The large variation in healthy lifespan among the elderly has prompted research into the determinants of aging and lifespan regulation. The genetic contribution to human lifespan variation was estimated at 25-30% in twin studies (Gudmundsson et al., 2000; Skytthe et al., 2003; Hjelmborg et al., 2006). The most prominent genetic influence is observed in families in which the capacity to attain a long lifespan clusters (Perls et al., 2000; Schoenmaker et al., 2006). Exceptional longevity can be reached with a low degree of age-related disability (Christensen et al., 2008; Terry et al., 2008), raising the question whether protective mechanisms against disease exist in long-lived subjects.",
+      "Introduction  Human lifespan is a highly complex trait, the product of myriad factors involving health, lifestyle, genetics, environment, and chance. The extent of the role of genetic variation in human lifespan has been widely debated (van den Berg et al., 2017), with estimates of broad sense heritability ranging from around 25% based on twin studies (Ljungquist et al., 1998; Herskind et al., 1996; McGue et al., 1993) (perhaps over-estimated [Young et al., 2018]) to around 16.1% (narrow sense 12.2%) based on large-scale population data (Kaplanis et al., 2018). One very recent study suggests it is much lower still (<7%) (Ruby et al., 2018), pointing to assortative mating as the source of resemblance amongst kin.",
+      "Many factors besides genetics influence how long a person will live, and our lifespan cannot be read from our DNA alone. Nevertheless, Timmers et al. had hoped to narrow down their search and discover specific genes that directly influence how quickly people age, beyond diseases. If such genes exist, their effects were too small to be detected in this study. The next step will be to expand the study to include more participants, which will hopefully pinpoint further genomic regions and help disentangle the biology of ageing and disease.",
+      "Life Span  During the last decade a variety of twin studies have shown that approximately 25 percent of the variation in life span is caused by genetic differences. This seems to be a rather consistent finding in various Nordic countries in different time periods and even so among other species not living in the wild (Herskind et al., 1996; Iachine et al., 1999; Finch and Tanzi, 1997). their relative magnitude and pattern depend on sex and on the socioeconomic environment experienced by successive birth cohorts. Genetic effects were most pronounced in periods with consciously controlled fertility, suggesting that the genetic disposition primarily affects fertility behavior and motivation for having children. Analyses of fertility motivation in some of the more recent twin cohorts, measured by age at first attempt to have children, supported this interpretation.",
+      "The Height-Life Span Nexus  Several observations and lines of experimentation have raised the issue of whether interindividual differences in aging rate are influenced by genes that modulate body size and early-life growth patterns. These include (a) the association between small stature and exceptional longevity in calorically restricted rodents (Yu et al., 1985), methionine-restricted rats (Orentreich et al., 1993), and mutant dwarf mice (Brown-Borg et al., 1996; Miller, 1999); and (b) the association between small body size and longer life span in natural populations of mice (Falconer et al., 1978), flies (Hillesheim and Stearns, 1992), dogs (Li et al., 1996), and, possibly, people (Samaras and Storms, 1992). The correlation in dogs is particularly striking: selective breeding for dogs of different body size has produced breeds varying in size from Chihuahua to Irish wolfhound. These breeds also vary greatly in mean longevity, from approximately 7 to 10.5 years, and the correlation between breed longevity and breed body weight (Miller, 1999) is a remarkable R\u00b2 = 0.56. These differences are genetic and affect stature rather than obesity: no amount of overeating will convert a West Highland white terrier to a St. Bernard. The selective pressures applied were designed to create dogs of specific sizes and temperaments and were not intended to influence aging rate or life span. The clear implication is that the effects on longevity are pleiotropic, i.e., that genes selected for their effect on body size and conformation influenced life span as a side effect. It is of interest to note that the few analyses (Eigenmann et al., 1984, 1988) of the hormonal basis for interbreed differences in body size have shown that the genes in question influence levels of IGF-1, the most likely mediator of the life-span effects in the long-lived df/df and dw/dw mouse mutants. Could it be mere coincidence that long-lived mutant nematode worms (Kimura et al., 1997) also show mutations in genes related to insulin and IGF-1 receptors?",
+      "Altogether, the twin and genealogical studies have shown that human lifespan is heritable, but is significantly influenced by non-heritable factors, which may explain why genetic studies of lifespan have proven to be challenging.",
+      "Twin studies have shown that the heritability of lifespan ranges between 0.01 and 0.27 in various European populations (Ljungquist et al., 1998; van den Berg et al., 2017). Large genealogical studies are more powered to address questions FIGURE 1 | Relationship between aging and lifespan variation versus species defining lifespan. (A) Lifespan comparisons within species, measured as mean (50%) or portion of a population living till extended limits of lifespan (90-95%). Differences between populations (orange and green) can identify specific genetic or environmental changes associating with long life. These factors promote viability and often associate with increasing healthspan. Mutant analysis within a particular model organism often encompasses these types of changes as it relates to lifespan. (B) Maximum lifespans recorded for different species (A-E). While lifespan variation within a species is capped to a certain extent, variation between species can range dramatically. Changes to maximum lifespan often are associated with protective mechanisms for genomic and genetic fidelity as well as life history changes as they relate to maturation and reproduction.",
+      "The genetic component of human lifespan based on twin studies has been estimated to be around 20-30 percent in the normal population [7], but higher in long-lived families [8][9][10]. Furthermore, siblings, parents, and offspring of centenarians also live well beyond average [11,12]. Lifestyle choices in terms of smoking, alcohol consumption, exercise, or diet do not appear to differ between centenarians and controls [13]. Taken together, these findings provide ample evidence that extreme longevity has a genetic component.",
+      "Introduction  Human longevity is influenced by multiple genetic and environmental factors. Approximately 25-32% of the overall variation in adult lifespan is because of genetic variation that becomes particularly important for survival at advanced age (Hjelmborg et al., 2006). Epidemiological studies have revealed that long-lived individuals (LLI), that is, people surviving to the 95th percentile of the respective birth cohort-specific age distributions (Gudmundsson et al., 2000), frequently show a favorable ('healthy') course of the aging process, with the absence or a delayed onset of age-related diseases (Hitt et al., 1999). Hence, the LLI offer the key to elucidate the molecular mechanisms underlying the 'healthy aging' phenotype (Perls, 2006).",
+      "Unraveling the heritability of human longevity was one of the first problems faced by geneticists. Just over a century ago, Mary Beeton and Karl Pearson [1] described a resemblance among relatives for the duration of life. A short time later, Yule [2] and Fisher [3] proved that the correlation is to be expected if lifespan is influenced by what had recently been termed 'genes' [4]. Indeed, a century of correlation studies has established that something on the order of 30-50% of the total variation in human life span is attributable to genetic variation [5]. Despite the wealth of diversity, specific genes contributing to this variation have proven notoriously difficult to identify. Sample size and issues of shared environment limit family-based methods such as linkage analysis, where rough genomic positions of important genetic variants are identified by comparing a small number of exceptionally long-lived people in defined pedigrees.",
+      "Human lifespan variation is mainly determined by environmental factors, whereas the genetic contribution is 25-30% and expected to be polygenic. Two complementary fields go hand in hand in order to unravel the mechanisms of biological aging: genomic and biomarker research. Explorative and candidate gene studies of the human genome by genetic, transcriptomic, and epigenomic approaches have resulted in the identification of a limited number of interesting positive linkage regions, genes, and pathways that contribute to lifespan variation. The possibilities to further exploit these findings are rapidly increasing through the use of novel technologies, such as next-generation sequencing. Genomic research is progressively being integrated with biomarker studies on aging, including the application of (noninvasive) deep phenotyping and omics data (generated using novel technologies) in a wealth of studies in human populations. Hence, these studies may assist in obtaining a more holistic perspective on the role of the genome in aging and lifespan regulation.",
+      "Human lifespan variation is mainly determined by environmental factors, whereas the genetic contribution is 25-30% and expected to be polygenic. Two complementary fields go hand in hand in order to unravel the mechanisms of biological aging: genomic and biomarker research. Explorative and candidate gene studies of the human genome by genetic, transcriptomic, and epigenomic approaches have resulted in the identification of a limited number of interesting positive linkage regions, genes, and pathways that contribute to lifespan variation. The possibilities to further exploit these findings are rapidly increasing through the use of novel technologies, such as next-generation sequencing. Genomic research is progressively being integrated with biomarker studies on aging, including the application of (noninvasive) deep phenotyping and omics data (generated using novel technologies) in a wealth of studies in human populations. Hence, these studies may assist in obtaining a more holistic perspective on the role of the genome in aging and lifespan regulation.",
+      "Studies of mono- and dizygous twins have revealed that the genetic contribution to the variation in human lifespan is about 25-30% [12,13], and is most prominent in families clustered for longevity [14,15]. This genetic contribution is mainly apparent after the age of 60 years and seems to increase with age [13,16]. Furthermore, human lifespan is a complex trait which is assumed to be determined by many genes with small individual effects [17], although the polygenic architecture still needs to be characterized [18,19]. The diverse health features of long-lived families illustrate that different age-related diseases have common determinants and imply that pathways can be identified that attenuate aging and delay age-related disease. From a genomic perspective, individuals from long-lived families are assumed to be characterized by a decreased prevalence of disease-promoting variants (referred to as disease-susceptibility alleles) and an increased prevalence of variants conferring maintenance of health and protection from disease, when compared to population controls. In the last 5 years, many disease-susceptibility alleles have been identified (National Human Genome Research Institute (NHGRI) genome-wide association study (GWAS) Catalog; http://www.genome.gov/gwastudies/) [20]. A first comparison between long-lived individuals, selected from both long-lived families (LLS) and the general population (Leiden 85-plus study), and young controls showed no difference in the distribution or frequency of disease-susceptibility alleles identified in cancer, coronary artery disease and type 2 diabetes [21]. The search for lifespan regulating loci, contributing to longevity and population mortality, must therefore extend beyond a focus on disease-susceptibility alleles. We will first discuss the efforts to identify longevity loci by genetics approaches."
+    ],
+    [
+      "Our result provides a novel hypothesis on the mechanism for the connection between two aging-related diseases: Alzheimer's disease and type 2 diabetes.",
+      "There are two major factors that underlie these alarming projections.The first is T2D is associated with age, and Western populations are aging rapidly.The second major explanation is our lifestyles have changed dramatically in recent years.Epidemiological studies have identified strong T2D risk relationships for obesity, sedentary behavior [2][3][4], and diets rich in energy [5], processed carbohydrates [6], and animal fats [7].Collectively, these lifestyle factors impede the actions of insulin and raise hepatic glucose production, which can result in the diminution of endogenous insulin production and T2D.The strongest evidence for a causal relationship between adverse lifestyle behaviors and T2D comes from randomized controlled trials that show intensive lifestyle interventions involving structured exercise regimes which promote habitual physical activity (PA) and have a major beneficial impact on diabetes incidence in high-risk individuals [8,9].",
+      "Epidemiological studies examining the associations between lifestyle behaviors and diabetes risk have reached similar conclusions as the clinical trials described above.For example, the 14-year follow-up University of Pennsylvania Alumni Health Study [52] (n = 5,990 men aged 39-68 years) showed PA (leisure time physical activity [LTPA] expressed in kcal expended per week through walking, stair climbing, and sports) was inversely associated with the incidence of T2D.Incidence rates declined as energy expenditure rose from 500 through 3,500 kcal/week.The age-adjusted relative risk ratio (RR) of T2D was reduced by about 6% for each 500 kcal increment increase in PA energy expenditure.",
+      "Overall, results were similar in analyses restricted to diabetes mellitus identified at baseline only, although the confidence interval included 1. These results suggest that diabetes mellitus is related to risk of AD in old age. These findings are consistent with the results of 2 large longitudinal cohort studies. 5,6 In one study, 5 diabetes mellitus doubled the risk of AD during 2 years of follow-up in a sample of more than 6000 older persons from a defined cohort. The other study, 6 using data from about 2500 Japanese American men, found a similar result: diabetes mellitus approximately doubled the risk of AD. In contrast, 2 other longitudinal studies 7,8 did not demonstrate a significant association between diabetes mellitus and incident AD, but in both, the results were in the direction of increased risk. Some, [9][10][11] but not all, 12 previous studies found that diabetes mellitus was related to change in cognitive function. One factor that may contribute to variability from study to study is that diabetes mellitus may be related to decline in some cognitive systems but not others. 4][15] Although diabetes mellitus was related to level of global cognition and multiple cognitive domains at baseline, we found that diabetes mellitus was only related to decline in perceptual speed. The one study 12 that did not find a relation between diabetes mellitus and cognitive decline did not include a measure of perceptual speed.",
+      "COMMENT  In a cohort of more than 800 older persons, we found that diabetes mellitus sometime in the study was associated with an increased risk of developing AD during a mean of 5.5 years of observation.The risk of incident AD was 65% higher in those with diabetes mellitus than in those without it.",
+      "In summary, these findings suggest that diabetes mellitus is associated with AD and decline in cognitive function in older persons. December 12, 2003.",
+      "DIABETES MELLITUS AND RISK OF AD  During the follow-up evaluations, 151 persons developed AD, of whom 31 had diabetes mellitus.In a proportional hazards model adjusted for age, sex, and educational level, there was a 65% increase in the risk of developing AD in those with diabetes mellitus compared with those without diabetes mellitus (hazard ratio, 1.65; 95% confidence interval, 1.10-2.47).The cumulative hazard of AD over time, adjusted for age, sex, and educational level, is shown graphically in Figure 1 for typical participants with and without diabetes mellitus.Similar results were found in analyses with diabetes mellitus identified at baseline only (hazard ratio, 1.53; 95% confidence interval, 0.96-2.45).",
+      "Age. Age is another factor that has a considerable effect on outcomes in obesity and T2DM research.In humans, body weight increases with age and peaks at ~55 years in both men and women.Ageing per se is associated with a redistribution of both the fat-free mass and the fat mass, with the latter increase starting at ~30 years of age 129 .Intramuscular and intrahepatic fat are particularly increased in older persons, and this increase has been linked to insulin resistance 130 .Partially on the basis of these changes, ageing has been proposed to be an independent determinant of glucose tolerance, which progressively worsens with age 131,132 .",
+      "Age also plays a vital role in the onset of diabetes (Cowie & Eberhardt, 1995).In south-east Asia almost 97% diabetic patients are 40 years old or more (IDF Atlas, 2017).In Bangladesh, the reported age of diabetes is \u226540 years in 71% urban and 85% rural female, while in the case of male the proportion is 85.5% urban and 86.5% in rural population (IDF Atlas, 2017).The current study also pinpointed an exponential increase in the risk of onset of T2DM with the increase of age when 40 years was chosen as the reference (Table S4).",
+      "Whether age and stress variables are risk factors for type 2 diabetes incidence was assessed by multivariate logistic regression (Table S4). Subjects in the age groups of (40-60) and >60 years had 1.78\u00d7 (p = .005) and 3.19\u00d7 (p = .006) greater risk for type 2 diabetes respectively than group of <40 years. Overall, patients under stressful condition are more likely to develop T2DM than that of nonstressed respondent (p = .000). Moreover, when stress is divided into two groups - low stress and high stress, we found that both males (p = .000) and females (p = .000) with high stress were at high risk of diabetes mellitus, whereas the association between low stress and T2DM incidence was significant only among males (Male: p = .002; Female: p = .115). The distribution and association of the genotypes, age, and stress with T2DM have been summarized in Table 3 and Figure 3. There was no difference in T2DM incidence between CT (p = .030) and TT/CC (p = .034) genotype containing people who were in age group of 40-60 years (Table 3). In contrast, people who were more than 60 years old with CT genotype (OR = 4.636, p = .029) were more prone to T2DM than that of TT/CC genotype (OR = 3.714, p = .007) subjects (Table 3).",
+      "Research Gaps  There is a clear correlation of environmental influences to diabetes risk.Yet, the assembled experts agreed that hypothesis-driven research is needed to define direct causal relationships between specific environmental factors and pathophysiologies leading to diabetes.Research efforts need to address environmental etiologies of type 1 diabetes and determine their relative contribution to onset of autoimmunity and progression to symptomatic disease.Whether there is a direct causal role of the intestinal microbiota in pathogenesis of type 1 and type 2 diabetes and response to therapies needs to be determined.Public health interventions that successfully reduce the levels of consumption of energy-dense foods and/or reduce sedentary time and increase time spent in physical activity need to be evaluated to determine whether they can reduce type 2 diabetes incidence at a population level.",
+      "In sum, it is clear that multiple risk factors are involved in diabetes-associated cognitive decrements as well as in dementia in relation to diabetes 38 .On the basis of our assessment of the literature, it is also clear that there are still substantial knowledge gaps on how the risk factors interconnect, how the risk factors translate to potentially modifiable mechanisms and which genetic factors are involved.",
+      "The aim of this study was to investigate the association between age at natural menopause and risk of developing type 2 diabetes, and to assess whether this association is independent of potential intermediate risk factors for type 2 diabetes.Furthermore, we examined the role of endogenous sex hormone levels in the association between age at natural menopause and type 2 diabetes.",
+      "Aims/hypothesis In this study, we aimed to examine the association between age at natural menopause and risk of type 2 diabetes, and to assess whether this association is independent of potential mediators.Methods We included 3639 postmenopausal women from the prospective, population-based Rotterdam Study.Age at natural menopause was self-reported retrospectively and was treated as a continuous variable and in categories (premature,  <40 years; early, 40-44 years; normal, 45-55 years; and late  menopause, >55 years [reference]).Type 2 diabetes events were diagnosed on the basis of medical records and glucose measurements from Rotterdam Study visits.HRs and 95% CIs were calculated using Cox proportional hazards models, adjusted for confounding factors; in another model, they were additionally adjusted for potential mediators, including obesity, C-reactive protein, glucose and insulin, as well as for levels of total oestradiol and androgens.",
+      "Aims/hypothesis In this study, we aimed to examine the association between age at natural menopause and risk of type 2 diabetes, and to assess whether this association is independent of potential mediators.Methods We included 3639 postmenopausal women from the prospective, population-based Rotterdam Study.Age at natural menopause was self-reported retrospectively and was treated as a continuous variable and in categories (premature,  <40 years; early, 40-44 years; normal, 45-55 years; and late  menopause, >55 years [reference]).Type 2 diabetes events were diagnosed on the basis of medical records and glucose measurements from Rotterdam Study visits.HRs and 95% CIs were calculated using Cox proportional hazards models, adjusted for confounding factors; in another model, they were additionally adjusted for potential mediators, including obesity, C-reactive protein, glucose and insulin, as well as for levels of total oestradiol and androgens.Results During a median follow-up of 9.2 years, we identified 348 individuals with incident type 2 diabetes.After adjustment for confounders, HRs for type 2 diabetes were 3.7 (95% CI 1.8, 7.5), 2.4 (95% CI 1.3, 4.3) and 1.60 (95% CI 1.0, 2.8) for women with premature, early and normal menopause, respectively, relative to those with late menopause (ptrend <0.001).The HR for type 2 diabetes per 1 year older at menopause was 0.96 (95% CI 0.94, 0.98).Further adjustment for BMI, glycaemic traits, metabolic risk factors, C-reactive protein, endogenous sex hormone levels or shared genetic factors did not affect this association.Conclusions/interpretation Early onset of natural menopause is an independent marker for type 2 diabetes in postmenopausal women.",
+      "association and explore whether the timing of natural menopause can add value to diabetes prediction and prevention.",
+      "Although drawing of definitive conclusions is difficult from these observational studies, their results suggest that young-onset type 2 diabetes is associated with a much more frequent occurrence of adverse macrovascular and microvascular outcomes and a more rapidly progressing severity of complications than is seen in type 1 diabetes or later-onset type 2 diabetes.",
+      "In a study of the age-specific incidence of type 2 diabetes in the UK (a retrospective cohort study of patients with newly diagnosed type 2 diabetes between 1990 and 2010), the investigators reported a substantial increase in the proportion of people aged 40 years or younger at diagnosis",
+      "The prevalence of type 2 diabetes in adolescents and young adults is dramatically increasing.Similar to older-onset type 2 diabetes, the major predisposing risk factors are obesity, family history, and sedentary lifestyle.Onset of diabetes at a younger age (defined here as up to age 40 years) is associated with longer disease exposure and increased risk for chronic complications.Young-onset type 2 diabetes also affects more individuals of working age, accentuating the adverse societal effects of the disease.Furthermore, evidence is accumulating that young-onset type 2 diabetes has a more aggressive disease phenotype, leading to premature development of complications, with adverse effects on quality of life and unfavourable effects on long-term outcomes, raising the possibility of a future public health catastrophe.In this Review, we describe the epidemiology and existing knowledge regarding pathophysiology, risk factors, complications, and management of type 2 diabetes in adolescents and young adults.",
+      "The biological processes linking aging and disease risk are poorly understood.Still, aging is considered to date as one of the main factors responsible for several complex diseases including cancer, cardiovascular diseases, and diabetes."
+    ],
+    [
+      "Genomic analysis of longevity offers the potential to illuminate the biology of human aging.Here, using genome-wide association meta-analysis of 606,059 parents' survival, we discover two regions associated with longevity (HLA-DQA1/DRB1 and LPA).We also validate previous suggestions that APOE, CHRNA3/5, CDKN2A/B, SH2B3 and FOXO3A influence longevity.Next we show that giving up smoking, educational attainment, openness to new experience and high-density lipoprotein (HDL) cholesterol levels are most positively genetically correlated with lifespan while susceptibility to coronary artery disease (CAD), cigarettes smoked per day, lung cancer, insulin resistance and body fat are most negatively correlated.We suggest that the effect of education on lifespan is principally mediated through smoking while the effect of obesity appears to act via CAD.Using instrumental variables, we suggest that an increase of one body mass index unit reduces lifespan by 7 months while 1 year of education adds 11 months to expected lifespan.",
+      "Genomic analysis of longevity offers the potential to illuminate the biology of human aging.Here, using genome-wide association meta-analysis of 606,059 parents' survival, we discover two regions associated with longevity (HLA-DQA1/DRB1 and LPA).We also validate previous suggestions that APOE, CHRNA3/5, CDKN2A/B, SH2B3 and FOXO3A influence longevity.Next we show that giving up smoking, educational attainment, openness to new experience and high-density lipoprotein (HDL) cholesterol levels are most positively genetically correlated with lifespan while susceptibility to coronary artery disease (CAD), cigarettes smoked per day, lung cancer, insulin resistance and body fat are most negatively correlated.We suggest that the effect of education on lifespan is principally mediated through smoking while the effect of obesity appears to act via CAD.Using instrumental variables, we suggest that an increase of one body mass index unit reduces lifespan by 7 months while 1 year of education adds 11 months to expected lifespan.",
+      "Background: Genetic research on longevity has provided important insights into the mechanism of aging and aging-related diseases. Pinpointing important genetic variants associated with aging could provide insights for aging research. Methods: We performed a whole-genome sequencing in 19 centenarians to establish the genetic basis of human longevity. Results: Using SKAT analysis, we found 41 significantly correlated genes in centenarians as compared to control genomes. Pathway enrichment analysis of these genes showed that immune-related pathways were enriched, suggesting that immune pathways might be critically involved in aging. HLA typing was next performed based on the whole-genome sequencing data obtained. We discovered that several HLA subtypes were significantly overrepresented. Conclusions: Our study indicated a new mechanism of longevity, suggesting potential genetic variants for further study.",
+      "Genetic linkage studies of long-lived human families identified a longevity locus while candidate gene approaches have been used to identify and confirm the association between specific variants in the FOXO3A gene and human longevity [3\u20137]. Genome-wide association studies have also been used to identify the association of APOE with life span and have yielded insights into potential biological pathways and processes related to aging. Despite these successes, several problems are inherent in human longevity studies including potentially high degrees of environmental heterogeneity, genetic diversity, and lack of birth matched controls, among others [8].",
+      "Additional association studies with these families and replication of these results with an independent data set should facilitate the positional cloning of a gene that influences the ability to age well and achieve exceptional longevity.Identification of the genes in humans that allow certain individuals to live to extreme old age should lead to insights on cellular pathways that are important to the aging process.",
+      "In conclusion, we performed a genome-wide association study of longevity-related phenotypes in individuals of European, East Asian and African American ancestry and identified the APOE and GPR78 loci to be associated with these phenotypes in our study. Moreover, our gene-level association analyses highlight a role for tissue-specific expression of genes at chromosome 5q13.3, 12q13.2, 17q21.31, and 19q13.32 in longevity. Genetic correlation analyses show that our longevity-related phenotypes are genetically correlated with several disease-related phenotypes, which in turn could help to identify phenotypes that could be used as potential biomarkers for longevity in future (genetic) studies.",
+      "The only two genes associated with human longevity that have been replicated in multiple populations are FOXO3A and APOE [11,12,15,26,28-31]. The effect sizes of these two genes for longevity are small with odds ratios of 1.26 and 1.45 for survival to age 100 in replicate studies for FOXO3A and APOE, respectively [10,29]. These genes account for only a small portion of the genetic contribution to longevity measured through family heritability studies [4,5]. Therefore, much of the heritability of lifespan remains to be explained.",
+      "Ageing in humans is typified by the decline of physiological functions in various organs and tissues leading to an increased probability of death. Some individuals delay, escape or survive much of this age-related decline and live past age 100. Studies comparing centenarians to average-aged individuals have found polymorphisms in genes that are associated with long life, including APOE and FOXO3A, which have been replicated many times. However, the associations found in humans account for small percentages of the variance in lifespan and many other gene associations have not been replicated in additional populations. Therefore, ageing is probably a highly polygenic trait. In humans, it is important to also consider differences in age-related decline that occur within and among tissues. Longitudinal data of age-related traits can be used in association studies to test for polymorphisms that predict how an individual will change over time. Transcriptional and genetic association studies of different tissues have revealed common and unique pathways involved in human ageing. Genomic convergence is a method that combines multiple types of functional genomic information such as transcriptional profiling, expression quantitative trait mapping and gene association. The genomic convergence approach has been used to implicate the gene MMP20 in human kidney ageing. New human genetics technologies are continually in development and may lead to additional breakthroughs in human ageing in the near future.",
+      "Ageing in humans is typified by the decline of physiological functions in various organs and tissues leading to an increased probability of death. Some individuals delay, escape or survive much of this age-related decline and live past age 100. Studies comparing centenarians to average-aged individuals have found polymorphisms in genes that are associated with long life, including APOE and FOXO3A, which have been replicated many times. However, the associations found in humans account for small percentages of the variance in lifespan and many other gene associations have not been replicated in additional populations. Therefore, ageing is probably a highly polygenic trait. In humans, it is important to also consider differences in age-related decline that occur within and among tissues. Longitudinal data of age-related traits can be used in association studies to test for polymorphisms that predict how an individual will change over time. Transcriptional and genetic association studies of different tissues have revealed common and unique pathways involved in human ageing. Genomic convergence is a method that combines multiple types of functional genomic information such as transcriptional profiling, expression quantitative trait mapping and gene association. The genomic convergence approach has been used to implicate the gene MMP20 in human kidney ageing. New human genetics technologies are continually in development and may lead to additional breakthroughs in human ageing in the near future.",
+      "In most experimentally modified animal model systems, single-gene mutations in many different genes have major life extension effects (Fontana et al., 2010;Kenyon, 2010).However, natural human and animal longevity is presumed to be a complex trait (Finch & Tanzi, 1997).In humans, both candidate gene and genome-wide genetic association approaches have been applied in an attempt to identify longevity loci.The frequency of genetic variants has been typically compared between nonagenarian cases and young controls, revealing loci at which genetic variants may contribute to a higher or lower probability of survival into old age.The initial candidate gene studies aimed at finding human longevity genes were dominated by contradictory results (Christensen et al., 2006).The more consistent evidence obtained by repeated observation in independent cohort studies for association with longevity has so far only been observed for three loci, the apolipoprotein E (APOE) locus (Schachter et al., 1994;Christensen et al., 2006), the FOXO3A locus (Willcox et al., 2008;Flachsbart et al., 2009;Pawlikowska et al., 2009;Soerensen et al., 2010), and the AKT1 locus (Pawlikowska et al., 2009).Thus, despite the expectation that longevity would be influenced by many genetic variants with small effect sizes, the effect of variants has consistently been shown in only three genes.",
+      "The lack of success in the identification of genes related to aging in humans may be due to the complexity of the phenotype.One approach to investigate aging and longevity is to compare frequencies of genetic variants between nonagenarians or centenarians and the general population.This approach led to the discovery of an association between APOE (Deelen et al., 2011;Ewbank, 2007;Gerdes et al., 2000) and more recently FOXO3A (Anselmi et al., 2009;Flachsbart et al., 2009;Li et al., 2009a;Pawlikowska et al., 2009;Willcox et al., 2008) and human aging and longevity.However, a recent genome-wide association study (GWAS) of individuals reaching the age of 90 or older failed to identify genome-wide significant variants (Newman et al., 2010).",
+      "Human longevity and healthy aging show moderate heritability (20%-50%). We conducted a meta-analysis of genome-wide association studies from 9 studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium for 2 outcomes: (1) all-cause mortality, and (2) survival free of major disease or death. No single nucleotide polymorphism (SNP) was a genome-wide significant predictor of either outcome (p < 5 \u00d7 10^-8). We found 14 independent SNPs that predicted risk of death, and 8 SNPs that predicted event-free survival (p < 10^-5). These SNPs are in or near genes that are highly expressed in the brain (HECW2, HIP1, BIN2, GRIA1), genes involved in neural development and function (KCNQ4, LMO4, GRIA1, NETO1) and autophagy (ATG4C), and genes that are associated with risk of various diseases including cancer and Alzheimer's disease. In addition to considerable overlap between the traits, pathway and network analysis corroborated these findings. These findings indicate that variation in genes involved in neurological processes may be an important factor in regulating aging free of major disease and achieving longevity.",
+      "Human longevity and healthy aging show moderate heritability (20%-50%). We conducted a meta-analysis of genome-wide association studies from 9 studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium for 2 outcomes: (1) all-cause mortality, and (2) survival free of major disease or death. No single nucleotide polymorphism (SNP) was a genome-wide significant predictor of either outcome (p < 5 \u00d7 10^-8). We found 14 independent SNPs that predicted risk of death, and 8 SNPs that predicted event-free survival (p < 10^-5). These SNPs are in or near genes that are highly expressed in the brain (HECW2, HIP1, BIN2, GRIA1), genes involved in neural development and function (KCNQ4, LMO4, GRIA1, NETO1) and autophagy (ATG4C), and genes that are associated with risk of various diseases including cancer and Alzheimer's disease. In addition to considerable overlap between the traits, pathway and network analysis corroborated these findings. These findings indicate that variation in genes involved in neurological processes may be an important factor in regulating aging free of major disease and achieving longevity.",
+      "In addition to aging- and CR-related genes, another source of candidate genes and pathways for drug design are human longevity-associated genes (Barzilai and Shuldiner, 2001; Browner et al., 2004; Kenyon, 2010). Dozens of genes have now been associated with human longevity (de Magalh\u00e3es et al., 2009a), although only a handful of genes have been shown to have consistent effects across populations.",
+      "The genetic basis of human longevity has so far been primarily investigated by association studies. Most results from these experiments have been difficult to confirm in independent samples, probably owing to the modest heritability, multifactorial nature, and heterogeneity of the phenotype (Christensen et al., 2006). To date, variation in only two genes has been identified, which has an effect on longevity in various populations: (i) the apolipoprotein E gene (APOE) (Sch\u00e4chter et al., 1994; Christensen et al., 2006) and (ii) the forkhead box O3A (FOXO3A) gene in the insulin-IGF1 signaling (IIS) pathway (Willcox et al., 2008; Flachsbart et al., 2009). Given the apparent lack of susceptibility candidates, it is conceivable that other genetic factors influence the function or expression of genes relevant for human longevity.",
+      "GenAge: the aging gene database Philosophy and overview of resources  It is undisputed that genetic factors influence aging.In a remarkable series of recent breakthroughs, a number of genes capable of altering the aging process as a whole -or at least to a large degree -have been identified in animal models and even a few in humans (Finch & Ruvkun, 2001;de Magalh\u00e3es, 2005;Kenyon, 2005).Furthermore, multiple alleles have been examined for their association with human exceptional longevity (Vijg & Suh, 2005).This is a fascinating and important area of research, yet there are now so many genes being associated with aging and longevity that keeping track of them all is becoming increasingly more difficult.Moreover, it is necessary now to study not only individual genes but their interactions with each other and with the environment, and how together genes give rise to a given phenotype: the so-called systems biology approach.To help researchers address these issues we created GenAge, a database of genes related to longevity and/or aging.",
+      "The only two genes associated with human longevity that have been replicated in multiple populations are FOXO3A and APOE [11,12,15,26,28-31]. The effect sizes of these two genes for longevity are small with odds ratios of 1.26 and 1.45 for survival to age 100 in replicate studies for FOXO3A and APOE, respectively [10,29]. These genes account for only a small portion of the genetic contribution to longevity measured through family heritability studies [4,5]. Therefore, much of the heritability of lifespan remains to be explained.",
+      "Ageing in humans is typified by the decline of physiological functions in various organs and tissues leading to an increased probability of death. Some individuals delay, escape or survive much of this age-related decline and live past age 100. Studies comparing centenarians to average-aged individuals have found polymorphisms in genes that are associated with long life, including APOE and FOXO3A, which have been replicated many times. However, the associations found in humans account for small percentages of the variance in lifespan and many other gene associations have not been replicated in additional populations. Therefore, ageing is probably a highly polygenic trait. In humans, it is important to also consider differences in age-related decline that occur within and among tissues. Longitudinal data of age-related traits can be used in association studies to test for polymorphisms that predict how an individual will change over time. Transcriptional and genetic association studies of different tissues have revealed common and unique pathways involved in human ageing. Genomic convergence is a method that combines multiple types of functional genomic information such as transcriptional profiling, expression quantitative trait mapping and gene association. The genomic convergence approach has been used to implicate the gene MMP20 in human kidney ageing. New human genetics technologies are continually in development and may lead to additional breakthroughs in human ageing in the near future.",
+      "Ageing in humans is typified by the decline of physiological functions in various organs and tissues leading to an increased probability of death. Some individuals delay, escape or survive much of this age-related decline and live past age 100. Studies comparing centenarians to average-aged individuals have found polymorphisms in genes that are associated with long life, including APOE and FOXO3A, which have been replicated many times. However, the associations found in humans account for small percentages of the variance in lifespan and many other gene associations have not been replicated in additional populations. Therefore, ageing is probably a highly polygenic trait. In humans, it is important to also consider differences in age-related decline that occur within and among tissues. Longitudinal data of age-related traits can be used in association studies to test for polymorphisms that predict how an individual will change over time. Transcriptional and genetic association studies of different tissues have revealed common and unique pathways involved in human ageing. Genomic convergence is a method that combines multiple types of functional genomic information such as transcriptional profiling, expression quantitative trait mapping and gene association. The genomic convergence approach has been used to implicate the gene MMP20 in human kidney ageing. New human genetics technologies are continually in development and may lead to additional breakthroughs in human ageing in the near future.",
+      "Most of the human candidate gene studies were performed in cross-sectional designs (Box 1 and Fig. 1), comparing allele frequencies of potential longevity loci between highly aged individuals and young controls.The candidate gene studies based on single genes have pointed a role for genes involved in, e.g., GH/insulin/IGF-1 signaling, immune regulation, and lipoprotein metabolism (Supporting Information Table S1), although most of these results have not (yet) been confirmed in sufficient independent studies.The most convincing human longevity loci today are APOE and FOXO3A which have frequently been associated with longevity in cross-sectional studies (see for a review [26]) and survival in prospective studies [27][28][29] (Fig. 3).APOE encodes the protein apolipoprotein E which seems to play a role in e.g., lipoprotein metabolism, cognitive function, and immune regulation [30].FOXO3A encodes the protein forkhead box O3 which acts as a transcription factor for many different genes involved in processes like apoptosis and oxidative stress [31]."
+    ],
+    [
+      "Genomic analysis of longevity offers the potential to illuminate the biology of human aging.Here, using genome-wide association meta-analysis of 606,059 parents' survival, we discover two regions associated with longevity (HLA-DQA1/DRB1 and LPA).We also validate previous suggestions that APOE, CHRNA3/5, CDKN2A/B, SH2B3 and FOXO3A influence longevity.Next we show that giving up smoking, educational attainment, openness to new experience and high-density lipoprotein (HDL) cholesterol levels are most positively genetically correlated with lifespan while susceptibility to coronary artery disease (CAD), cigarettes smoked per day, lung cancer, insulin resistance and body fat are most negatively correlated.We suggest that the effect of education on lifespan is principally mediated through smoking while the effect of obesity appears to act via CAD.Using instrumental variables, we suggest that an increase of one body mass index unit reduces lifespan by 7 months while 1 year of education adds 11 months to expected lifespan.",
+      "Genomic analysis of longevity offers the potential to illuminate the biology of human aging.Here, using genome-wide association meta-analysis of 606,059 parents' survival, we discover two regions associated with longevity (HLA-DQA1/DRB1 and LPA).We also validate previous suggestions that APOE, CHRNA3/5, CDKN2A/B, SH2B3 and FOXO3A influence longevity.Next we show that giving up smoking, educational attainment, openness to new experience and high-density lipoprotein (HDL) cholesterol levels are most positively genetically correlated with lifespan while susceptibility to coronary artery disease (CAD), cigarettes smoked per day, lung cancer, insulin resistance and body fat are most negatively correlated.We suggest that the effect of education on lifespan is principally mediated through smoking while the effect of obesity appears to act via CAD.Using instrumental variables, we suggest that an increase of one body mass index unit reduces lifespan by 7 months while 1 year of education adds 11 months to expected lifespan.",
+      "Recent developments on the genetics of aging can be seen as several streams of effort.In general, humans show a relatively modest (<50%) heritability of life spans (results obtained from twin studies discussed below).The apoE polymorphisms are remarkable for their influence on both cardiovascular disease and Alzheimer disease.In contrast, rare mutant genes with high penetrance cause these same diseases but with early onset and a major shortening of the life span.Shortlived laboratory models (fruit flies, nematodes, mice) are yielding rapid advances, with the discovery of mutants that increase life spans in association with altered metabolism, which leads to questions on the physiological organization of aging processes.Although these early findings do not show that a conserved genetic program actually controls aging processes across animal phylogeny, it is striking how frequently findings of metabolic rate, insulin signaling, and free radicals have emerged from very different approaches to aging in nematodes and mammals, for example.These findings hint that the genetic control of life span was already developed in the common ancestor of modern animals so that subsequent evolution of life spans was mediated by quantitative changes in the control of metabolism through insulin and the production of free radicals.",
+      "Background: Genetic research on longevity has provided important insights into the mechanism of aging and aging-related diseases.Pinpointing import genetic variants associated with aging could provide insights for aging research.Methods: We performed a whole-genome sequencing in 19 centenarians to establish the genetic basis of human longevity.Results: Using SKAT analysis, we found 41 significantly correlated genes in centenarians as compared to control genomes.Pathway enrichment analysis of these genes showed that immune-related pathways were enriched, suggesting that immune pathways might be critically involved in aging.HLA typing was next performed based on the whole-genome sequencing data obtained.We discovered that several HLA subtypes were significantly overrepresented.Conclusions: Our study indicated a new mechanism of longevity, suggesting potential genetic variants for further study.",
+      "Introduction  With the development of human genomics research, a large number of studies of the genetics of longevity have been conducted. Scientists from various countries have proposed many different theories concerning the mechanisms of aging from different perspectives, involving oxidative stress, energy metabolism, signal transduction pathways, immune response, etc. [1,2]. These mechanisms interact with each other and are influenced by heredity to some degree [2,3]. The identification of longevity-related biological markers is critical to an in-depth understanding of the mechanisms of carrier protection against common disease and/or of the retardation of the process of aging.",
+      "INTRODUCTION  Human aging is affected by genes, life style, and environmental factors.The genetic contribution to average human aging can be modest with genes explaining \u223c20-25% of the variability of human survival to the mid-eighties (Herskind et al., 1996;Fraser and Shavlik, 2001).By contrast, genetic factors may have greater impact on survival to the ninth through eleventh decades (Tan et al., 2008).Notably, exceptional longevity is rare and may involve biological mechanisms that differ from those implicated in usual human aging.",
+      "Somatic mutations with the inherited gene variations of each individual cumulatively or synergistically influence the health span and life span [11].Very few genetic variants have been associated with human longevity, but those found include the transcription factor FOXO3 gene, the APOE/TOMM40 and the CDKN2B/ ANRIL loci, which are associated with Alzheimer's disease and cellular senescence [12][13][14].In fact, the heritability for human longevity has been estimated to be approximately 20-30%, according to studies of twins, suggesting that external factors such as diet, environment, physical activity and microbiomes are important factors that influence the health span [14][15][16].The increase in the rate of retrotranscription reflects genome deregulation, creating additional mutations, DNA damage, and other forms of genome instability.For instance, the expression of several families of retrotransposable elements increases with age, as observed in mouse skeletal muscle and human fibroblasts, particularly the long interspersed nuclear element-1 (L1 LINE) [17,18].",
+      "Influence of Genetic Factors in Ageing and Lifespan  Ageing is defined as the decline of physiological functions in several tissues and organs inducing an increasing probability of death [17].The understanding of genetic factors involved in ageing has been limited due to the complexity of this process and the heterogeneity among individuals and even among tissues [18][19][20].Tissue cells adopt a senescent phenotype as a consequence of multiple intrinsic, extrinsic, and stochastic factors [21].The combination of these genetic factors is related to longevity and healthy ageing [22].Although this decline is somewhat predictable, some individuals show a much slower decline and get to live past the age of 100.Studies in these individuals showed polymorphisms in some genes which are associated with long life, such as APOE and FOXO3.However, these associations have not been consistent across different populations, suggesting that ageing is rather polygenic [23].",
+      "Before the advent of NGS technologies, several scientists were interested in the study of allele variants associated with aging, but they were limited by the lack of aging rate biomarkers.Now with NGS technologies, these biomarkers have been emerged such as the epigenetic clock that is described in the DNA methylation sequencing section of this chapter.In this post-genomic era, different strategies have been developed in order to understand the genetic factors involved in aging [17].One strategy used is the study of aging in extreme longevity groups of people, called centenarians.Centenarians are a group that can reach an age above 100 years and has an incidence of 1 every 10,000 people [18].In a pioneering study using extreme longevity people (308 individuals belonging to 137 sibships showing extreme longevity), genome-wide scan analysis identified a region on chromosome 4 associated with extreme longevity [19] that corresponds to the microsomal transfer protein (MTP) [20], which is associated with abetalipoproteinemia and hypobeta lipoproteinemia in humans [21,22].Another approach to study the genetic factors involved in longevity consists in assessing allele frequencies from people of different ages, looking for those polymorphisms (SNPs) with enhanced allele frequencies in high-longevity individuals.Those alleles with diminished frequencies in aged individuals may be associated with age-related diseases.Using this approximation, an SNP that shifts isoleucine to valine was identified in the PKA-anchoring protein (AKAP2) gene.This polymorphism is associated with reduced longevity and cardiac disease [23].Genome-wide association studies (GWAS) have confirmed only three loci that affect longevity: FOXO3A, APOE, and an intergenic locus on chromosome 5q33.3[24][25][26].",
+      "Most genetic studies involved with aging have focused on identifying genes contributing to particular diseases. More recently, it has been recognized that it is also valuable to examine genetic factors related to disease-free or healthy aging (1,2). Utilizing twins from the National Academy of Sciences-National Research Council (NAS-NRC) twin panel, we have demonstrated that healthy physical aging is under a significant degree of genetic influence, with a heritability over 50% (3). Our definition of healthy aging focused principally on freedom from cardiovascular disease, and has received considerable support in the more recent literature. Brand and colleagues (4) reported that parental age at death was a significant predictor of coronary heart disease death in the Framingham offspring study and concluded that familial similarities for age at death may be mediated through shared coronary heart disease risk factors. Frederiksen and colleagues (5) reported that increased parental life was associated with a reduction in odds ratio for their children to have diabetes, ischemic heart disease, heart failure, stroke, and hypertension. We have found that better midlife lipid levels and blood pressures were associated with increased parental longevity in the National Heart, Lung, and Blood Institute twin study (6). Centenarian siblings and offspring, besides having increased longevity, have been shown to have better health and better cardiovascular risk factor profiles (7-10).",
+      "Introduction  The recent, remarkable extension of life expectancy is largely attributed to the postponement of mortality at old age (Vaupel, 1997, 2010). The years of life gained in the older population residing in developed nations are a success story of public health measures and improved health care. In addition to such external factors, longevity and healthy aging consistently show a modest heritability between 20% and 50% and aging-associated genetic research may provide further insights into the mechanisms of aging (Herskind et al., 1996; McGue et al., 1993; Reed and Dick, 2003). It has been postulated that genes involved in pathways associated with aging identified in animal models, such as insulin-like growth factor (IGF)-insulin signaling, regulation of lipoprotein metabolism, the mTOR pathway, and the oxidative stress response may also influence survival to old or even exceptionally old age in humans (Christensen et al., 2006; Kenyon, 2010; Vellai et al., 2003). However, in humans, common variants within genes involved in these pathways have not been consistently associated with lifespan (Christensen et al., 2006; Kenyon, 2010; Kuningas et al., 2008; Vijg and Suh, 2005).",
+      "Human longevity and healthy aging show moderate heritability (20%-50%). We conducted a meta-analysis of genome-wide association studies from 9 studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium for 2 outcomes: (1) all-cause mortality, and (2) survival free of major disease or death. No single nucleotide polymorphism (SNP) was a genome-wide significant predictor of either outcome (p < 5 \u00d7 10^-8). We found 14 independent SNPs that predicted risk of death, and 8 SNPs that predicted event-free survival (p < 10^-5). These SNPs are in or near genes that are highly expressed in the brain (HECW2, HIP1, BIN2, GRIA1), genes involved in neural development and function (KCNQ4, LMO4, GRIA1, NETO1) and autophagy (ATG4C), and genes that are associated with risk of various diseases including cancer and Alzheimer's disease. In addition to considerable overlap between the traits, pathway and network analysis corroborated these findings. These findings indicate that variation in genes involved in neurological processes may be an important factor in regulating aging free of major disease and achieving longevity.",
+      "Translational  Although there is much debate about the processes driving human aging, there is little doubt that genetic influences play a significant role (1). Humans clearly live very much longer than the currently favored laboratory models of aging, and such interspecies differences in reproductively 'fit' life span must have an inherited genetic foundation. Within human populations, environmental and behavioral exposures are important but at least a quarter of life expectancy variation in twin or family studies is attributable to inherited genetic or epigenetic factors (2). Age-related conditions such as type 2 diabetes, myocardial infarction, common cancers, and Alzheimer's disease (AD) typically have onsets after the fourth decade of life; \"successful\" agers delay these onsets until relatively late in life (3). Many aging traits and diseases show moderate heritability, including cardiovascular disease (CVD) (4) and impaired physical functioning (5), independent of known environmental risk factors.",
+      "Many factors contribute to aging, including genes. This is the first article in a 10-part series that highlight some of what is known about the influence of genes on aging and emerging treatment options that may slow down or potentially reverse the aging process. The series will address genes, adducts, and telomeres, decreased immune defenses, oxidation and inefficient mitochondria, toxins and radiation, glycosylation, caloric intake and sirtuin production, neurotransmitter imbalance, hormone mechanisms, reduced nitric oxide, and stem cell slowdown. Underpinning these factors are wear and tear on cells and aging as a result of inability to repair or replace these affected cells. These topics have been addressed in research, health magazines, and even by talk show hosts. There is even a LongevityMap website addressing significant and nonsignificant genetic association studies in aging across the human genome (http://genomics.senescence.info/longevity/). The series will address a scientific and clinical approach to genome-related aging topics.",
+      "The genetic basis of human longevity has so far been primarily investigated by association studies. Most results from these experiments have been difficult to confirm in independent samples, probably owing to the modest heritability, multifactorial nature, and heterogeneity of the phenotype (Christensen et al., 2006). To date, variation in only two genes has been identified, which has an effect on longevity in various populations: (i) the apolipoprotein E gene (APOE) (Sch\u00e4chter et al., 1994; Christensen et al., 2006) and (ii) the forkhead box O3A (FOXO3A) gene in the insulin-IGF1 signaling (IIS) pathway (Willcox et al., 2008; Flachsbart et al., 2009). Given the apparent lack of susceptibility candidates, it is conceivable that other genetic factors influence the function or expression of genes relevant for human longevity.",
+      "Introduction  Human longevity is influenced by multiple genetic and environmental factors. Approximately 25-32% of the overall variation in adult lifespan is because of genetic variation that becomes particularly important for survival at advanced age (Hjelmborg et al., 2006). Epidemiological studies have revealed that long-lived individuals (LLI), that is, people surviving to the 95th percentile of the respective birth cohort-specific age distributions (Gudmundsson et al., 2000), frequently show a favorable ('healthy') course of the aging process, with the absence or a delayed onset of age-related diseases (Hitt et al., 1999). Hence, the LLI offer the key to elucidate the molecular mechanisms underlying the 'healthy aging' phenotype (Perls, 2006).",
+      "Introduction  Approximately 25-30% of the variation in adult lifespan is attributable to genetic factors that become more important with increasing age and exert their strongest effects in nonagenarians and centenarians (G\u00f6gele et al., 2010; Hjelmborg et al., 2006). As yet, however, only a few genetic variants have been found consistently to influence longevity. The first to be discovered was the e4 allele of the apolipoprotein E (APOE) gene, a mortality factor that predisposes to both Alzheimer's and cardiovascular diseases (Corder et al., 1993; Panza et al., 2004). APOE e4 is the only variant with a reportedly large adverse effect upon survival at advanced age (Sch\u00e4chter et al., 1994), and this association has been replicated in several populations (Christensen et al., 2006). Variation in the human forkhead box O3A gene (FOXO3A), in contrast, has been found to be associated with the ability to live long, an effect corroborated by studies in Japanese, German, Italian, US-American, Jewish, Chinese and Danish populations (Anselmi et al., 2009; Flachsbart et al., 2009; Li et al., 2009; Pawlikowska et al., 2009; Soerensen et al., 2010; Willcox et al., 2008). More recently, we have identified exonuclease 1 (EXO1) as a potential novel longevity gene (Nebel et al., 2009). All three genes were detected through candidate-gene approaches.",
+      "GenAge: the aging gene database Philosophy and overview of resources  It is undisputed that genetic factors influence aging.In a remarkable series of recent breakthroughs, a number of genes capable of altering the aging process as a whole -or at least to a large degree -have been identified in animal models and even a few in humans (Finch & Ruvkun, 2001;de Magalh\u00e3es, 2005;Kenyon, 2005).Furthermore, multiple alleles have been examined for their association with human exceptional longevity (Vijg & Suh, 2005).This is a fascinating and important area of research, yet there are now so many genes being associated with aging and longevity that keeping track of them all is becoming increasingly more difficult.Moreover, it is necessary now to study not only individual genes but their interactions with each other and with the environment, and how together genes give rise to a given phenotype: the so-called systems biology approach.To help researchers address these issues we created GenAge, a database of genes related to longevity and/or aging.",
+      "Increases in longevity of the general population worldwide are an unprecedented phenomenon with significant health and social impact. Although environmental factors have led to an increase in life span, there is ample evidence that genetic factors are involved in extreme longevity both in humans (1-7) and in other organisms (8). The protective genetic factors that lead to longevity are likely to involve fundamental processes of aging that may be different from those associated with early mortality or premature onset of age-related diseases in younger individuals. The mechanisms of aging in humans are far from understood, but available evidence suggests that several pathways-inflammation, oxidative stress and stress responses, cellular senescence, DNA damage and repair, and the growth hormone or insulin-like growth factor and insulin (GH, IGF, INS) axis-may play key roles (9-12). Model organisms suggest that inhibiting the GH, IGF, or INS axis, which is involved in regulating cell proliferation, cell death, wound repair, and metabolism, may promote longevity by reducing oxidative stress and slowing the rate of cell replication and the accumulation of somatic-cell DNA mutations (13). There is also evidence for other important pathways such as the heat-shock proteins and heat-shock factors that are highly conserved across species and play a role in prolongevity transcription pathways. Clinical and epidemiological investigations, including candidate gene studies, have suggested that inflammation pathways may affect life span and risk of age-related conditions such as cardiovascular disease (CVD) and its risk factors (14-19). A combination of multiple genetic variants may be required for an individual to achieve exceptional longevity, which may account in part for its rarity.",
+      "Human lifespan variation is mainly determined by environmental factors, whereas the genetic contribution is 25-30% and expected to be polygenic.Two complementary fields go hand in hand in order to unravel the mechanisms of biological aging: genomic and biomarker research.Explorative and candidate gene studies of the human genome by genetic, transcriptomic, and epigenomic approaches have resulted in the identification of a limited number of interesting positive linkage regions, genes, and pathways that contribute to lifespan variation.The possibilities to further exploit these findings are rapidly increasing through the use of novel technologies, such as next-generation sequencing.Genomic research is progressively being integrated with biomarker studies on aging, including the application of (noninvasive) deep phenotyping and omics data -generated using novel technologies -in a wealth of studies in human populations.Hence, these studies may assist in obtaining a more holistic perspective on the role of the genome in aging and lifespan regulation."
+    ]
+  ]
+}
\ No newline at end of file
diff --git a/gnqa/paper1_eval/src/data/datasets/old/experts_aging2_dataset.json b/gnqa/paper1_eval/src/data/datasets/old/experts_aging2_dataset.json
new file mode 100644
index 00000000..4153c307
--- /dev/null
+++ b/gnqa/paper1_eval/src/data/datasets/old/experts_aging2_dataset.json
@@ -0,0 +1,128 @@
+{
+  "question": [
+    "Why is it so difficult to map gene loci that control aging in humans?",
+    "what causes the aging process",
+    "What genes are associated with aging?",
+    "what genetic factor are associated with aging",
+    "which genes are involved in the aging process"
+  ],
+  "answer": [
+    "Mapping gene loci that control aging in humans is difficult due to several reasons. Firstly, aging is a complex phenotype influenced by multiple genes and numerous different pathophysiological processes and diseases. Secondly, any common variation in genes associated with aging probably has a small effect, requiring large studies for identification. Thirdly, human studies face issues like environmental heterogeneity, genetic diversity, and lack of birth matched controls. Lastly, the influence of environmental factors and human-specific confounders like psychosocial, economic, and cultural factors can potentially mask purely biological aging mechanisms, making the analysis more difficult.",
+    "The aging process is caused by a combination of factors including genetic influences, environmental conditions, and stochastic processes. It involves the accumulation of molecular damage, mutations, incomplete repair, and genetic programs. Other factors include wear and tear on cells, decreased immune defenses, oxidation and inefficient mitochondria, toxins and radiation, glycosylation, caloric intake and sirtuin production, neurotransmitter imbalance, hormone mechanisms, reduced nitric oxide, and stem cell slowdown. Aging is also associated with changes in dynamic biological, physiological, environmental, psychological, behavioral, and social processes.",
+    "The genes associated with aging include the apolipoprotein E gene (APOE), the forkhead box O3A (FOXO3A) gene, HLA-DQA1/DRB1, LPA, CHRNA3/5, CDKN2A/B, SH2B3, and AKT1. Other genes involved in processes like growth and development, energy metabolism, oxidative stress, genomic stability maintenance, and neurocognition are also associated with aging.",
+    "Several genetic factors are associated with aging. These include the APOE, FOXO3A, and CHRNA3/5 genes, as well as the CDKN2A/B, SH2B3, and MTP genes. Other factors include the HLA-DQA1/DRB1 and LPA regions, and the AKAP2 gene. Additionally, genes highly expressed in the brain, genes involved in neural development and function, and genes associated with autophagy have been linked to aging.",
+    "The genes involved in the aging process include daf-16, daf-2, BAZ2B, HMGB4, NOC2L, RAI1, SIK1, SMARCA2, SPZ1, TBP, TRIP13, ZKSCAN1, DBH, TPO, LSS, GPER, HCRTR2, ATG2A, NEDD4L, PSMB1, UBXN4, USP6, EEF1A2, ITGB2, TUBB2C, WRN, ABCA7, AZGP1, CD36, DEGS2, PI4KA, SOAT2, APOE, LDLR, CDKN2B, RBM38, IGF1R, FOXO3, SNCA, NAP1L4, GAB2, QKI, and many others."
+  ],
+  "contexts": [
+    [
+      "Recent developments on the genetics of aging can be seen as several streams of effort.In general, humans show a relatively modest (<50%) heritability of life spans (results obtained from twin studies discussed below).The apoE polymorphisms are remarkable for their influence on both cardiovascular disease and Alzheimer disease.In contrast, rare mutant genes with high penetrance cause these same diseases but with early onset and a major shortening of the life span.Shortlived laboratory models (fruit flies, nematodes, mice) are yielding rapid advances, with the discovery of mutants that increase life spans in association with altered metabolism, which leads to questions on the physiological organization of aging processes.Although these early findings do not show that a conserved genetic program actually controls aging processes across animal phylogeny, it is striking how frequently findings of metabolic rate, insulin signaling, and free radicals have emerged from very different approaches to aging in nematodes and mammals, for example.These findings hint that the genetic control of life span was already developed in the common ancestor of modern animals so that subsequent evolution of life spans was mediated by quantitative changes in the control of metabolism through insulin and the production of free radicals.",
+      "FUTURE DIRECTIONS: HIGHER RESOLUTION DATA VIA HIGHER THROUGHPUT ASSAYS  One inescapable conclusion of the aggregate results of genome-wide studies of aging to date (see summary Table 1) is that we have not come close to saturating the number of potentially lifespan-altering genes in any organism.This is in no small part because directly generating survival curves is a relatively time-consuming process in most model organisms using current methods.There are several possible ways to address this.One way that has been tried is by attempting to find surrogate phenotypes [72,73,126] that can be screened more rapidly, or even scored under selection.Another is mining candidates from the many whole-genome expression profiles.Results to date with these have been very fruitful, but have not suggested that these methods alone will rapidly saturate our search for lifespan-and healthspan-altering genes in tractable model organisms.",
+      "Chromosome mapping of genes that were differentially expressed in mice of different ages and/or in response to CR revealed a wide distribution of genes with some physical clustering of responsive genes within the genome.The latter findings are consistent with the concept that aging is a complex process and that evolutionary adaptations to aging, if they exist, may or may not involve geographic clustering of functionally related genes.",
+      "Genetic linkage studies of long-lived human families identified a longevity locus while candidate gene approaches have been used to identify and confirm the association between specific variants in the FOXO3A gene and human longevity [3\u20137]. Genome-wide association studies have also been used to identify the association of APOE with life span and have yielded insights into potential biological pathways and processes related to aging. Despite these successes, several problems are inherent in human longevity studies including potentially high degrees of environmental heterogeneity, genetic diversity, and lack of birth matched controls, among others [8].",
+      "The aging process most certainly is under highly polygenic controls\u2026 This should not discourage us from pursuing a search for those loci which may be of profound importance to human aging as it ordinarily occurs in most human beings.",
+      "In most experimentally modified animal model systems, single-gene mutations in many different genes have major life extension effects (Fontana et al., 2010;Kenyon, 2010).However, natural human and animal longevity is presumed to be a complex trait (Finch & Tanzi, 1997).In humans, both candidate gene and genome-wide genetic association approaches have been applied in an attempt to identify longevity loci.The frequency of genetic variants has been typically compared between nonagenarian cases and young controls, revealing loci at which genetic variants may contribute to a higher or lower probability of survival into old age.The initial candidate gene studies aimed at finding human longevity genes were dominated by contradictory results (Christensen et al., 2006).The more consistent evidence obtained by repeated observation in independent cohort studies for association with longevity has so far only been observed for three loci, the apolipoprotein E (APOE) locus (Schachter et al., 1994;Christensen et al., 2006), the FOXO3A locus (Willcox et al., 2008;Flachsbart et al., 2009;Pawlikowska et al., 2009;Soerensen et al., 2010), and the AKT1 locus (Pawlikowska et al., 2009).Thus, despite the expectation that longevity would be influenced by many genetic variants with small effect sizes, the effect of variants has consistently been shown in only three genes.",
+      "1993), and gene expression microarrays (Pletcher et al. 2002). Given the ambiguities and limitations of large-effect mutant studies of aging, discussed earlier, those publications do not provide very useful evidence with respect to the question of the number of loci that affect aging. At present, the best answer to the question of the number of genes controlling aging is many (Rose and Long 2002), in keeping with the original expectations of evolutionary biologists. However, studies of the genetics of the experimental evolution of aging are now amenable to the application of genomic methods.",
+      "Accepted Article  \u00a9 2013 The Authors Aging Cell \u00a9 2013 Blackwell Publishing Ltd/Anatomical Society of Great Britain and Ireland over 90 years and 1,955 controls between 55 and 80 years did not reveal genome-wide significant loci (Newman et al., 2010) and neither did the analyses of all-cause mortality and survival free of major disease in this cohort (Walter et al., 2011).A smaller Dutch study of 403 nonagenarians and 1,670 controls younger than 65 years identified the APOE gene as a mortality locus (Deelen et al., 2011), which was confirmed in a German study of 763 long-lived individuals and 1,085 younger controls (Nebel et al., 2011) and a longitudinal study of 1,606 Danes showed that the effect size of this association increases at the highest ages (Jacobsen et al., 2010).Apparently, the influence of the common genetic variation on longevity is small which requires large meta-GWA studies for identification.Alternatively, rare genetic variants may play a more important role in longevity.Since the previous linkage studies showed contradictory results potentially due to heterogeneity in the longevity phenotype, it is expected that longevity is influenced by many private rare variants.",
+      "Ageing is complex and takes a long time to study - a lifetime in fact. This makes it difficult to discern its causes, among the countless possibilities based on an individual's genes, behaviour or environment. While thousands of regions in an individual's genetic makeup are known to influence their risk of different diseases, those that affect how long they will live have proved harder to disentangle. Timmers et al. sought to pinpoint such regions, and then use this information to predict, based on their DNA, whether someone had a better or worse chance of living longer than average.",
+      "Several explanations are possible for the lack of genome-wide significant findings. First, mortality is arguably 1 of the most complex phenotypes, and several trajectories toward extreme old age have been identified (Evert et al., 2003). Multiple genes could mediate the aging process but would have their effects through numerous different pathophysiological processes and diseases that act as intermediate factors on the pathway to death (de Magalhaes et al., 2010). Therefore, any common variation in genes associated with aging probably has a small effect.",
+      "Second, the largely negative findings of this and other studies contrast with the intriguing animal studies of longevity.Very large effects of single genes on lifespan have indeed been observed in laboratory animals, but humans often have several homologues of these genes which might significantly differ in function or compensate for mutated genes through redundant mechanisms (Kuningas et al., 2008).This could explain why our top findings did not include genes in these pathways found in animal models.Animal models also represent genetically homogenous populations and are exposed to controlled environmental influences.The lack of replication of animal model findings in humans suggests that the use of knockout animals may not provide the optimal approach to understanding the variation in survival in humans as interactions with environmental factors may obscure the associations and prevent the identification of loci in humans.",
+      "The lack of success in the identification of genes related to aging in humans may be due to the complexity of the phenotype.One approach to investigate aging and longevity is to compare frequencies of genetic variants between nonagenarians or centenarians and the general population.This approach led to the discovery of an association between APOE (Deelen et al., 2011;Ewbank, 2007;Gerdes et al., 2000) and more recently FOXO3A (Anselmi et al., 2009;Flachsbart et al., 2009;Li et al., 2009a;Pawlikowska et al., 2009;Willcox et al., 2008) and human aging and longevity.However, a recent genome-wide association study (GWAS) of individuals reaching the age of 90 or older failed to identify genome-wide significant variants (Newman et al., 2010).",
+      "In addition to timing differences, a small proportion of genes (10%-15%) exhibit opposite trends of expression changes with age in humans and macaques (Supplemental Fig. S13). Interestingly, such differences are ~1.5 times more common in aging than in development, an observation consistent with the lower strength of purifying selection on the gene regulation at old age (discussed below). These differences could also reflect extreme shifts in developmental timing between species, as well as technical artifacts. Future studies, using additional species and alternative methodology, are needed to address this issue.",
+      "1993), and gene expression microarrays (Pletcher et al. 2002). Given the ambiguities and limitations of large-effect mutant studies of aging, discussed earlier, those publications do not provide very useful evidence with respect to the question of the number of loci that affect aging. At present, the best answer to the question of the number of genes controlling aging is many (Rose and Long 2002), in keeping with the original expectations of evolutionary biologists. However, studies of the genetics of the experimental evolution of aging are now amenable to the application of genomic methods.",
+      "The remarkable discoveries of the past 2 decades showing that single genes can regulate aging in model organisms demonstrate that aging can be genetically manipulated (Finch and Ruvkun, 2001; Kenyon, 2010). Hundreds of genes that modulate longevity have now been identified in model organisms (de Magalh\u00e3es et al., 2009a). In some cases (e.g., in worms), mutations in single genes can extend lifespan by almost 10-fold (Ayyadevara et al., 2008). Nonetheless, aging is a complex process that derives not from single genes but from the interactions of multiple genes with each other and with the environment. Evidence from animal systems shows a major impact of the environment on aging, yet environmental manipulations of aging act through genes and proteins, usually by triggering signaling pathways and modulating gene expression. In fact, some genes have been shown in model organisms to have varying effects on lifespan depending on diet (Heikkinen et al., 2009). Genes that can regulate aging in model organisms cannot be directly applied to humans through genetic manipulations for numerous legal, ethical, and technical reasons. If we could understand how the environment modulates these aging-related genes, we might be able to create antiaging therapies applicable to humans, potentially through diet, lifestyle, and even pharmacological interventions. Therefore, understanding genome-environment interactions in the context of aging can be a powerful approach to identify attractive targets for drug design.",
+      "TRANSLATION OF LONGEVITY MODEL ORGANISMS AND CORE AGING PATHWAYS  Genetic studies on lifespan have proven to be challenging.While longevity is a defining trait for a given species, the lifespan of individuals is of limited heritability, making analyses more difficult.Exceptional human life span, although a rare phenotype, is likely multifactorial; refined analyses are required to obtain statistically robust genomic signatures of longevity (Zhang et al., 2020) and these have proven elusive.Unlike laboratory models, the effect of environmental variance cannot be controlled in human studies, potentially masking purely biological aging mechanisms.Even laboratory models cannot replicate the complex \"environment\" of humans; it includes psychosocial, economic, and cultural factors, rather than strictly biological.These human-specific confounders are difficult or impossible to target in traditional model organisms.Despite these limitations, experimentally tractable model organisms have proven invaluable in deciphering the purely genetic contribution to lifespan, including genes and pathways conserved across the tree of life.",
+      "ANALYSIS OF HUMAN VARIATION IN THE GENETIC CONTROL OF LONGEVITY  Heritability studies have convincingly demonstrated that at least some fraction of human lifespan is heritable. In tandem, large-scale genome-wide association studies (GWAS) have identified numerous loci associated with age-related traits (Buniello et al., 2019). While genetic studies have functionally shown an inverse effect of multiple age-related, disease-associated variants on lifespan regulation, the number of well-replicated longevity-conferring variants remains limited to variants in APOE (ApoE \u03b52), and more recently, CDKN2A/B and IL6 (see Table 1). To date, studies in humans have been hampered by the specific phenotype definitions used, sample sizes of the extreme phenotypes, and modest heritability of the longevity-related traits (Breitbach et al., 2019). This is due to the complex interplay of biological and social factors involved in human aging, as well as the limited power of GWAS, which require sampling thousands of subjects to achieve statistical significance (Breitbach et al., 2019). Genetic studies of aging have also been hindered by an inconsistent use of definitions of aging (reviewed in Baghdadi et al., 2020). The two main ways of conducting research on the genetics of longevity in human populations are by studying (i) the lifespan (continuous trait, years lived) and (ii) the longevity (dichotomous trait, i.e., being among the longest-lived individuals within a specific population). These complexities have limited the resolution and capability of broad association studies of human longevity. Importantly, these genomic analyses focus on a shift of survival in a population; these variables may be genetically distinct from the mechanisms establishing potential for longevity overall (Figure 1A). We argue that an understanding of this shift in lifespan as well as genetic mechanisms of regulating a species specific 'set points' (Figure 1B) will aid in the conceptual distinction of aging and longevity in humans.",
+      "With modern genomic technologies and large-scale data analysis methods, it is possible to sift through the genes of populations to find the loci that act to postpone aging. [3] There are uncertainties with the comparison of populations with different rates of aging. However, it is superior to experimental designs that only consider age-dependence or dietary-response, without determining causal mechanisms.",
+      "Most of the human candidate gene studies were performed in cross-sectional designs (Box 1 and Fig. 1), comparing allele frequencies of potential longevity loci between highly aged individuals and young controls.The candidate gene studies based on single genes have pointed a role for genes involved in, e.g., GH/insulin/IGF-1 signaling, immune regulation, and lipoprotein metabolism (Supporting Information Table S1), although most of these results have not (yet) been confirmed in sufficient independent studies.The most convincing human longevity loci today are APOE and FOXO3A which have frequently been associated with longevity in cross-sectional studies (see for a review [26]) and survival in prospective studies [27][28][29] (Fig. 3).APOE encodes the protein apolipoprotein E which seems to play a role in e.g., lipoprotein metabolism, cognitive function, and immune regulation [30].FOXO3A encodes the protein forkhead box O3 which acts as a transcription factor for many different genes involved in processes like apoptosis and oxidative stress [31].",
+      "Conclusions and prospects  Over the past two decades the human aging field has built up the necessary resources to study the biology of aging and longevity by establishing human populations with a diversity of designs.Meta-analyses integrating genetic and phenotypic datasets have successfully identified variants associated with a range of age-related traits and diseases.Despite these accomplishments, the number of novel leads contributing to human lifespan regulation is limited.Although positive regions of linkage and suggestive GWAS hits have been reported, the field has not yet identified the loci that explain the clustering of longevity in families and the variation in biological aging rate in the population.As for animal models, down-signaling of the IIS and mTOR pathway appeared to be relevant in humans.These findings are being followed up by molecular and physiological profiling using skin, fat and muscle tissue of long-lived family members and controls.Human studies now also include the response of nutrient sensing systems to the application of dietary and physical challenges."
+    ],
+    [
+      "There are multiple definitions of the aging process.Aging may be perceived as the random, systemic loss of molecular fidelity that, after reproductive maturity, accumulates to levels that eventually exceed tissue repair, turnover, or maintenance capacity (Hayflick 2004).The underlying molecular mechanisms of aging remain a subject of debates (de Magalhaes et al. 2009): tissue deterioration might not be programmed, being just a function of increase in entropy (Hayflick 2004).No genes are necessary to drive a stochastic process; however, there are genes that act to prevent an organism from destruction and disorganization.It may be due to the absence of specific disease-causing alleles or due to the presence of favorable alleles (Halaschek-Wiener et al. 2009).These genes may inhibit entropy, regulate inflammation, maintain DNA repair (such as telomere maintenance factors), or provide antioxidant functions (e.g., antagonists of reactive oxygen species).As healthy cells adapt to degeneration, differential expression of genes with age may indicate a transcriptional response to aging rather than a deleterious mechanism of aging per se (de Magalhaes et al. 2009).It might be postulated that there exist alleles that confer a pleiotropic effect on structure and function during aging (Lunetta et al. 2007).These alleles should regulate the ability of an organism to withstand challenging endogenous and exogenous influences.",
+      "Why does ageing evolve? The intrinsic decline in function that occurs during ageing appears to be caused by the accumulation of damage, particularly at the molecular level.As far as we know, no genes have evolved specifically because they cause damage to accumulate, and the evolution of ageing can therefore be understood only as a side-effect of other causes of evolutionary change.The mechanisms by which ageing can evolve were first elucidated by J.B.S. Haldane [14], P.B. Medawar [15] and G.C. Williams [16].Extrinsic hazards from disease, predation and accidents mean that even potentially immortal organisms will die.Genetic effects that become apparent only later in life encounter a reduced force of natural selection, because not all their bearers will survive to express them.Haldane pointed out that late-onset genetic diseases in humans, such as Huntington's disease, encounter only weak selection, because most reproduction is complete by the age of onset [14].Ageing could therefore result from the accumulation under mutation pressure of age-specific, deleterious mutations.In addition, if some mutations have pleiotropic effects, with beneficial effects in youth, such as high fecundity, but also with a higher subsequent rate of ageing, then they could be incorporated into the population by natural selection, which will act more strongly on the early, beneficial effect.Thus, variation in the rate of ageing would result from the readjustment of a tradeoff between youthful benefits and the subsequent rate of ageing.Both processes imply that faster ageing will evolve where the extrinsic hazard to adults is greatest, a hypothesis in general supported by the data [1,2,17].",
+      "A. Theories  In looking back at the development of aging studies, we can see that it did not follow a straight or logical course. On the contrary, it can be compared with the flow of several convergent streams winding in their course. To date, numerous proposals have been made for the paradigm of aging. These include Hayflick's contributions (153) on programmed cellular incapacitation derived from fibroblast studies, a decrease in immunologic response, deleterious endocrinological changes, nuclear somatic gene mutation, mitochondrial somatic gene mutation, oxygen free radical damage to proteins and nucleic acids, molecular instabilities, molecular cross-linking, glycation reactions, and so on. There is little doubt that many of these factors contribute to the overall aging, but what are primary causes, and what are secondary outcomes?",
+      "Ageing Is Adjusted by Genetic, Environmental, and Stochastic Processes  Enough evidence suggests that ageing is the result of different events such as molecular damage, mutations, incomplete repair, genetic programs, and continued development, among others [16]. These events, in turn, are caused by genetic factors, environmental conditions, and even stochastic factors, which are mentioned below in this chapter.",
+      "Different stochastic theories of ageing focus on specific mechanisms that may lead to ageing.The catastrophic error theory poses that the accumulation of errors in protein synthesis causes damage in cell function.The theory of cross-linking holds this process between proteins and other macromolecules responsible for ageing, while the theory of free radicals suggests that ageing is the result of inadequate protection against cell and tissue damage by free radicals and oxidative stress throughout life.Finally, the wear-and-tear theory poses that the cumulative damage that eventually leads to ageing and death is, in fact, the result of the continuous functioning of vital processes, during which stochastic errors gradually arise.",
+      "Introduction  Aging is a natural and irreversible process characterized by a progressive decay in physiological, biochemical, and structural functions of individuals. Aging is a multifactorial process that can be affected by two main factors: environmental and genetic. Environmental factors are nutrition, pathologies, pollution exposure, physical activity, and microbiota, while genetic factors are issues that have been associated with antioxidant and DNA damage responses, the fidelity of genetic information transfer, the efficiency of protein degradation, the extent of cellular responsiveness to stress, the mechanisms of epigenetic regulation, and the ability to elongate telomeres. All of them can determine how fast we age. Traditionally, aging studies had used several model organisms, from yeast to mammals, especially rodents (rats and mice). Most of the studies are made under controlled conditions, where only a few variables are observed, and the subjects are members of the same strain with the same genetic backgrounds or the same mutations. The information that so far has been obtained about aging has helped us to describe different factors that influence this process and that are the fundamental concepts of the various theories of aging. However, these theories do not fully explain the aging process in the different models of aging study. This is the case of the study of aging in humans, where it is very difficult to control the environmental and genetic variables. That is why issues haven't been solved such as the following: How does time influence aging? When do we start to age? How do we know we are old? Is it possible to delay aging? Those and more questions are the cornerstones for aging studies. Biological aging has been associated with the decrease in the repair and regeneration capacity of tissues and organs; it is a time-dependent process. This reduction can be observed by an increase in the acquisition of diseases and functional and reproductive disability, which eventually lead to death. On the other hand, it has been observed that in humans, people with the same chronological age exhibit different trajectories in the decrease of physiological functions associated with biological aging, which complicates the understanding of the molecular and physiological phenomena that drive the complex and multifactorial processes that underlie biological aging in humans.",
+      "The underlying cause of aging remains one of the central mysteries of biology. Recent studies in several different systems suggest that not only may the rate of aging be modified by environmental and genetic factors, but also that the aging clock can be reversed, restoring characteristics of youthfulness to aged cells and tissues. This Review focuses on the emerging biology of rejuvenation through the lens of epigenetic reprogramming. By defining youthfulness and senescence as epigenetic states, a framework for asking new questions about the aging process emerges.",
+      "Aging does not happen in a vacuum. Aging must be the result of changes that occur in molecules that have existed at one time with no age changes. It is the state of these pre-existing molecules that governs longevity determination. The pre-existing state is, as I have already described, maintained by repair and turnover systems that themselves eventually succumb to irreparable age changes. Longevity determination is the state of all molecules prior to succumbing to irreparable loss of molecular structure.",
+      "Biological aging is more than simply the occurrence of random changes in molecules. It also includes the role of the many repair systems found within cells. Thus, a more complete, but less concise, explanation of the first causes of aging in biological systems is the following:",
+      "Understanding the deleterious processes that cause aging has been a human endeavor ever since we figured out that we grew old and that we didn't like it. Many hypotheses have been proposed to explain the root cause of aging (1). One broad-based hypothesis is that generalized homeostatic failure leads to age-related decline. Although notions of time- and use-related deterioration may be applicable to mechanical objects, they fall short as analogies to biological systems because energy input should theoretically maintain living systems indefinitely. Yet, despite the regenerative potential of biological organisms, progressive deterioration accompanies postmaturational aging. That the organism's repair capabilities cannot keep up with wear and tear is, according to evolutionary theory, explained by the inevitable declining force of natural selection with age. According to this reasoning, there is no selective advantage to maintaining somatic cells in perfect order much beyond reproductive maturation (1). Hence, a long life depends on the timing of maturation and the quality of somatic cell maintenance.",
+      "Wear and tear on the DNA often has been touted as a possible basis for our progressive age-related decline.Supporting this notion is the work of de Boer et al. (2) reported on page 1276 of this week's issue.They reveal important evidence for imperfect genome maintenance of DNA damage as a possible causal factor in aging.Harman, with his \"free radical theory of aging\" (3), was the first to propose that metabolic by-products called reactive oxygen species (ROS) continually damage cellular macromolecules, including DNA.Incomplete repair of such damage would lead to its accumulation over time and eventually result in age-related deterioration.A number of observations support the free radical theory, including the discovery that dietary restriction delays aging and extends life-span in a wide range of rodents and other species, possibly by reducing free radical damage.The notion that genomic DNA could be a major target of continual free radical attack over time is supported by the recent observation that genetic lesions accumulate with age and that dietary restriction reduces this accumulation in rodents (4).In addition, deletion of p66 shc , a signaling protein that maintains oxidant levels, increases resistance to oxidative damage and extends the life-span of mice (5).",
+      "Instead, aging is expected to be a pervasive failure of adaptation across most, if not all, of the physiological mechanisms that sustain survival and reproduction among young individuals. For this reason, evolutionary biologists have generally been skeptical of proposals that attribute \u201cthe cause of aging\u201d to any one physiological mechanism or gene for aging or programmed death. Although common genetic pathways might be identified that contribute to aging among a variety of organisms (cf.",
+      "Background  Aging is a complex process characterized by the progressive degeneration of a healthy phenotype and correlated with a decline in the ability to withstand cellular stress and damage.The subject of investigation for decades, the underlying molecular genetic causes of and responses to aging remain an area of active study.Research from model systems has characterized a range of physiological and molecular phenotypes associated with aging.These include genomic instability caused by accumulation of DNA damage, dysregulation of repair mechanisms, and telomere attrition; epigenetic alterations; dysregulation of transcription; loss of proteostasis; cellular senescence; and deregulated nutrient sensing, metabolic pathways, and energy use (reviewed in [1]).Separating causation from correlation between these phenotypes and aging remains a challenge, however.",
+      "Introduction  Understanding what actually causes ageing remains admittedly a fundamental and fascinating problem in biology [1].Experimental data accumulated in the last three decades have led to the identification of various environmental and genetic factors, as well as chemical substances that influence lifespan in divergent eukaryotic species [1,2].Organisms normally age faster and hence live shorter under stress conditions that can lead to the generation of DNA mutations and, often as a consequence of mutations, damaged cytoplasmic constituents (including injured proteins, lipids, carbohydrates and organelles).Such types of damage can interfere with cellular functioning; thereby, they should be eliminated by effective repair and self-cleaning mechanisms to maintain cellular homeostasis.These mechanisms include DNA repair pathways, molecular chaperons, as well as the proteasome-ubiquitin system and lysosome-mediated autophagy, the main forms of cellular self-degradation [3].This has led to the attractive model that the gradual, lifelong accumulation of unrepaired cellular damage drives the ageing process and determines the incidence of age-related fatal diseases [4,5].",
+      "In conclusion, aging may not be primarily due to damage accumulating from the basic biochemical reactions that make up life but rather the result of the developmental program or of changes brought about by it. Our hypothesis is that the timing of development regulates the rate of aging among mammals, with a subset of developmental mechanisms determining the pace and causing most age-related changes. Maybe people change as they grow old due to the same mechanisms that drive changes throughout the earlier stages in life.",
+      "Instead, aging is expected to be a pervasive failure of adaptation across most, if not all, of the physiological mechanisms that sustain survival and reproduction among young individuals. For this reason, evolutionary biologists have generally been skeptical of proposals that attribute \u201cthe cause of aging\u201d to any one physiological mechanism or gene for aging or programmed death. Although common genetic pathways might be identified that contribute to aging among a variety of organisms (cf.",
+      "In 2021, Science published a special issue entitled \"125 Questions: Exploration and Discovery.\" One of these 125 questions was \"Can we stop ourselves from aging?\" The U.S. National Institute on Aging (NIA) at the National Institutes of Health (NIH) states that \"aging is associated with changes in dynamic biological, physiological, environmental, psychological, behavioral, and social processes.\" Although geneticists and epidemiologists have long debated the relative importance of the role played by genotype or the environment in the development of age-related diseases, it is apparent that both can play substantial roles in this process [6,7]. However, most etiological studies have concentrated on the role of genotype and have considered the environment to play a secondary role. Nevertheless, an analysis of GBD data showed that nearly 50% of deaths worldwide are attributable to environmental exposure, primarily exposure to airborne particulates (including household air pollution and occupational exposure; 14% of all deaths), smoking and secondhand smoke (13%), plasma sodium concentrations (6%), and alcohol consumption (5%) [8]. In contrast, a recent analysis of 28 chronic diseases in identical twins showed that the genetic-related risks of developing one of five age-related diseases were 33.3%, 10.6%, 36.3%, 19.5%, and 33.9% for AD, PD, CAD, COPD, and T2DM, respectively, with a mean of only 26% [9]. The results of over 400 genome-wide association studies (GWASs) have also elucidated that the heritability of degenerative diseases is only approximately 10% [10,11]. Consequently, nongenetic drivers, such as environmental factors, are now recognized as major risk factors for age-related diseases. The contributions of environmental factors to the development of age-related diseases can be revealed by analyses of all of the factors to which individuals are exposed in their life and the relationships between these exposures and age-related diseases [12,13].",
+      "Introduction  The fundamental manifestation of the aging process is a progressive decline in the functional maintenance of tissue homeostasis and an increasing propensity to degenerative diseases and death [1]. It has attracted significant interest to study the underlying mechanisms of aging, and many theories have been put forward to explain the phenomenon of aging. There is an emerging consensus that aging is a multifactorial process, which is genetically determined and influenced epigenetically by environment [2]. Most aging theories postulate a single physiological cause of aging, and likely these theories are correct to a certain degree and in certain aspects of aging.",
+      "Many factors contribute to aging, including genes. This is the first article in a 10-part series that highlights some of what is known about the influence of genes on aging and emerging treatment options that may slow down or potentially reverse the aging process. The series will address genes, adducts, and telomeres, decreased immune defenses, oxidation and inefficient mitochondria, toxins and radiation, glycosylation, caloric intake and sirtuin production, neurotransmitter imbalance, hormone mechanisms, reduced nitric oxide, and stem cell slowdown. Underpinning these factors are wear and tear on cells and aging as a result of inability to repair or replace these affected cells. These topics have been addressed in research, health magazines, and even by talk show hosts. There is even a LongevityMap website addressing significant and nonsignificant genetic association studies in aging across the human genome (http://genomics.senescence.info/longevity/). The series will address a scientific and clinical approach to genome-related aging topics.",
+      "Trying to explain aging in terms of a singular process would be in conflict with evolutionary theory. Even if loss of genome sequence integrity was the most conserved cause of aging, already active in the first replicators (Vijg, 2007), natural selection would allow a multitude of mutations with late adverse effects to accumulate in the germline, many of which would be positively selected for because of their beneficial effects early in life (Williams, 1957). In this respect, somatic mutation accumulation could be a conserved, inevitable cause of aging but superposed on multiple other processes that usually cause the earlier demise of an individual."
+    ],
+    [
+      "Genomic analysis of longevity offers the potential to illuminate the biology of human aging.Here, using genome-wide association meta-analysis of 606,059 parents' survival, we discover two regions associated with longevity (HLA-DQA1/DRB1 and LPA).We also validate previous suggestions that APOE, CHRNA3/5, CDKN2A/B, SH2B3 and FOXO3A influence longevity.Next we show that giving up smoking, educational attainment, openness to new experience and high-density lipoprotein (HDL) cholesterol levels are most positively genetically correlated with lifespan while susceptibility to coronary artery disease (CAD), cigarettes smoked per day, lung cancer, insulin resistance and body fat are most negatively correlated.We suggest that the effect of education on lifespan is principally mediated through smoking while the effect of obesity appears to act via CAD.Using instrumental variables, we suggest that an increase of one body mass index unit reduces lifespan by 7 months while 1 year of education adds 11 months to expected lifespan.",
+      "Genomic analysis of longevity offers the potential to illuminate the biology of human aging.Here, using genome-wide association meta-analysis of 606,059 parents' survival, we discover two regions associated with longevity (HLA-DQA1/DRB1 and LPA).We also validate previous suggestions that APOE, CHRNA3/5, CDKN2A/B, SH2B3 and FOXO3A influence longevity.Next we show that giving up smoking, educational attainment, openness to new experience and high-density lipoprotein (HDL) cholesterol levels are most positively genetically correlated with lifespan while susceptibility to coronary artery disease (CAD), cigarettes smoked per day, lung cancer, insulin resistance and body fat are most negatively correlated.We suggest that the effect of education on lifespan is principally mediated through smoking while the effect of obesity appears to act via CAD.Using instrumental variables, we suggest that an increase of one body mass index unit reduces lifespan by 7 months while 1 year of education adds 11 months to expected lifespan.",
+      "Recent developments on the genetics of aging can be seen as several streams of effort.In general, humans show a relatively modest (<50%) heritability of life spans (results obtained from twin studies discussed below).The apoE polymorphisms are remarkable for their influence on both cardiovascular disease and Alzheimer disease.In contrast, rare mutant genes with high penetrance cause these same diseases but with early onset and a major shortening of the life span.Shortlived laboratory models (fruit flies, nematodes, mice) are yielding rapid advances, with the discovery of mutants that increase life spans in association with altered metabolism, which leads to questions on the physiological organization of aging processes.Although these early findings do not show that a conserved genetic program actually controls aging processes across animal phylogeny, it is striking how frequently findings of metabolic rate, insulin signaling, and free radicals have emerged from very different approaches to aging in nematodes and mammals, for example.These findings hint that the genetic control of life span was already developed in the common ancestor of modern animals so that subsequent evolution of life spans was mediated by quantitative changes in the control of metabolism through insulin and the production of free radicals.",
+      "Studies revealed from 300 to 750 genes related to longevity that are critically involved in a variety of life activities, such as growth and development, energy metabolism, oxidative stress, genomic stability maintenance, and neurocognition [4].These candidate genes include mainly APOE, a gene involved in lipoprotein metabolism [5,6].Others are those involved in cell cycle regulation, cell growth and signal transduction, the maintenance of genome stability, and the endocrine-related pathway [7][8][9].In addition, the candidates for longevity encompass genes related to drug metabolism, the ones involved in protein folding, stabilization, and degradation, as well those related to coagulation and regulation of circulation [10], etc.In most cases, these genes or their polymorphic sites were examined in multiple population replication studies, which discovered certain longevity-associated genes or pathways [4][5][6][7][8][9][10].",
+      "Additional association studies with these families and replication of these results with an independent data set should facilitate the positional cloning of a gene that influences the ability to age well and achieve exceptional longevity.Identification of the genes in humans that allow certain individuals to live to extreme old age should lead to insights on cellular pathways that are important to the aging process.",
+      "Somatic mutations with the inherited gene variations of each individual cumulatively or synergistically influence the health span and life span [11].Very few genetic variants have been associated with human longevity, but those found include the transcription factor FOXO3 gene, the APOE/TOMM40 and the CDKN2B/ ANRIL loci, which are associated with Alzheimer's disease and cellular senescence [12][13][14].In fact, the heritability for human longevity has been estimated to be approximately 20-30%, according to studies of twins, suggesting that external factors such as diet, environment, physical activity and microbiomes are important factors that influence the health span [14][15][16].The increase in the rate of retrotranscription reflects genome deregulation, creating additional mutations, DNA damage, and other forms of genome instability.For instance, the expression of several families of retrotransposable elements increases with age, as observed in mouse skeletal muscle and human fibroblasts, particularly the long interspersed nuclear element-1 (L1 LINE) [17,18].",
+      "Before the advent of NGS technologies, several scientists were interested in the study of allele variants associated with aging, but they were limited by the lack of aging rate biomarkers.Now with NGS technologies, these biomarkers have been emerged such as the epigenetic clock that is described in the DNA methylation sequencing section of this chapter.In this post-genomic era, different strategies have been developed in order to understand the genetic factors involved in aging [17].One strategy used is the study of aging in extreme longevity groups of people, called centenarians.Centenarians are a group that can reach an age above 100 years and has an incidence of 1 every 10,000 people [18].In a pioneering study using extreme longevity people (308 individuals belonging to 137 sibships showing extreme longevity), genome-wide scan analysis identified a region on chromosome 4 associated with extreme longevity [19] that corresponds to the microsomal transfer protein (MTP) [20], which is associated with abetalipoproteinemia and hypobeta lipoproteinemia in humans [21,22].Another approach to study the genetic factors involved in longevity consists in assessing allele frequencies from people of different ages, looking for those polymorphisms (SNPs) with enhanced allele frequencies in high-longevity individuals.Those alleles with diminished frequencies in aged individuals may be associated with age-related diseases.Using this approximation, an SNP that shifts isoleucine to valine was identified in the PKA-anchoring protein (AKAP2) gene.This polymorphism is associated with reduced longevity and cardiac disease [23].Genome-wide association studies (GWAS) have confirmed only three loci that affect longevity: FOXO3A, APOE, and an intergenic locus on chromosome 5q33.3[24][25][26].",
+      "Unbiased genome-wide studies of longevity in S. cerevisiae and C. elegans have led to the identification of more than one hundred genes that determine life span in one or both organisms.Key pathways have been uncovered linking nutrient and growth factor cues to longevity.Quantitative measures of the degree to which aging is evolutionary conserved are now possible.A major challenge for the future is determining which of these genes play a similar role in human aging and using that information to develop therapies toward age-associated diseases.",
+      "Involvement of genes in a wide range of fundamental biological processes suggests also a broad role of these genes in regulating the aging-related phenotypes.",
+      "Gene associations with age-related traits found using longitudinal study data.",
+      "In most experimentally modified animal model systems, single-gene mutations in many different genes have major life extension effects (Fontana et al., 2010;Kenyon, 2010).However, natural human and animal longevity is presumed to be a complex trait (Finch & Tanzi, 1997).In humans, both candidate gene and genome-wide genetic association approaches have been applied in an attempt to identify longevity loci.The frequency of genetic variants has been typically compared between nonagenarian cases and young controls, revealing loci at which genetic variants may contribute to a higher or lower probability of survival into old age.The initial candidate gene studies aimed at finding human longevity genes were dominated by contradictory results (Christensen et al., 2006).The more consistent evidence obtained by repeated observation in independent cohort studies for association with longevity has so far only been observed for three loci, the apolipoprotein E (APOE) locus (Schachter et al., 1994;Christensen et al., 2006), the FOXO3A locus (Willcox et al., 2008;Flachsbart et al., 2009;Pawlikowska et al., 2009;Soerensen et al., 2010), and the AKT1 locus (Pawlikowska et al., 2009).Thus, despite the expectation that longevity would be influenced by many genetic variants with small effect sizes, the effect of variants has consistently been shown in only three genes.",
+      "The lack of success in the identification of genes related to aging in humans may be due to the complexity of the phenotype.One approach to investigate aging and longevity is to compare frequencies of genetic variants between nonagenarians or centenarians and the general population.This approach led to the discovery of an association between APOE (Deelen et al., 2011;Ewbank, 2007;Gerdes et al., 2000) and more recently FOXO3A (Anselmi et al., 2009;Flachsbart et al., 2009;Li et al., 2009a;Pawlikowska et al., 2009;Willcox et al., 2008) and human aging and longevity.However, a recent genome-wide association study (GWAS) of individuals reaching the age of 90 or older failed to identify genome-wide significant variants (Newman et al., 2010).",
+      "Thus, substantially more work is needed in this area to establish whether longevity is driven by nuclear genomic stability. Diverse and unexpected bits of evidence support a relationship. For example, a disproportionate number of genes identified in unbiased and targeted genome-wide association studies (GWASs) as associated with longevity are involved in genome maintenance (75). One study involved age of natural menopause in \u223c70,000 women and led to the identification of 44 genetic variants associated with early or late menopause, a strong biomarker of healthy aging (76). Approximately two-thirds of these are associated with genome maintenance genes. Seven of ten significantly associated pathways are involved in DNA repair. The highly significant overrepresentation of DNA repair pathways indicates an intimate connection between genome maintenance and aging phenotypes. From unrelated studies, we know that reduced expression of the repair endonuclease ERCC1-XPF causes accelerated aging (3), whereas ERCC1 is one of the top genes under positive selective pressure in the longest-lived mammalian species, the bowhead whale (77). Intriguingly, hepatocytes from old rats have impaired NER, whereas caloric restriction, which extends longevity, restored the NER capacity of old rats to that of youthful levels (42). In a human interventional study, brief caloric restriction increased NER capacity in PBMCs of individuals who had low NER prior to dietary intervention (78). Therefore, increased DNA repair capacity could promote longevity and may even prove amenable to improvement.",
+      "In addition to aging- and CR-related genes, another source of candidate genes and pathways for drug design are human longevity-associated genes (Barzilai and Shuldiner, 2001; Browner et al., 2004; Kenyon, 2010). Dozens of genes have now been associated with human longevity (de Magalh\u00e3es et al., 2009a), although only a handful of genes have been shown to have consistent effects across populations.",
+      "Genes/loci identified by genome-wide association studies of longevity and lifespan traits.",
+      "The genetic basis of human longevity has so far been primarily investigated by association studies. Most results from these experiments have been difficult to confirm in independent samples, probably owing to the modest heritability, multifactorial nature, and heterogeneity of the phenotype (Christensen et al., 2006). To date, variation in only two genes has been identified, which has an effect on longevity in various populations: (i) the apolipoprotein E gene (APOE) (Sch\u00e4chter et al., 1994; Christensen et al., 2006) and (ii) the forkhead box O3A (FOXO3A) gene in the insulin-IGF1 signaling (IIS) pathway (Willcox et al., 2008; Flachsbart et al., 2009). Given the apparent lack of susceptibility candidates, it is conceivable that other genetic factors influence the function or expression of genes relevant for human longevity.",
+      "Since many alleles will fit the two patterns just described, it follows that we expect many genetic and biochemical mechanisms of aging. There are some experiments that have attempted to estimate the number of genes involved in aging, particularly in Drosophila. Quantitative genetic estimates of gene number have probably been subject to artifacts [6,8] and are highly imprecise. Molecular genetic estimates using 2-D gels [3] and high-density gene-expression arrays [12] indicate the involvement of at least 300 genetic loci in Drosophila aging, and that estimate is highly conservative. For now, the best conclusion is probably that many genes are involved in aging in fruit flies. Vertebrates are unlikely to have fewer genes involved in aging, in view of their larger genomes.",
+      "GenAge consists of several searchable data sets. Considering the extraordinary discoveries in the genetics of aging in model organisms, GenAge includes a data set of genes associated with longevity and/or aging in model organisms. We consider a given gene for inclusion in GenAge if genetic manipulations of the gene result in noticeable changes in the aging phenotype and/or longevity. Most genes in GenAge are from the four typical model organisms: mice, worms, fruit flies and yeast (Table 1). Strikingly, homologues of many genes, such as insulin receptors and sirtuins, have been shown to regulate aging in model organisms separated by large evolutionary distances (Kenyon, 2005; Liu et al., 2005; Smith et al., 2008). Moreover, we have shown that genes associated with aging and/or longevity in model organisms are evolutionary conserved in terms of having more homologues than predicted by chance (Budovsky et al., 2007, 2008) and exhibiting slower molecular evolution rates (de Magalh\u00e3es & Church, 2007). Therefore, it is now clear that at least some genes identified in model organisms may be relevant to human aging.",
+      "Gene associations with age-related traits found using longitudinal study data.",
+      "Most of the human candidate gene studies were performed in cross-sectional designs (Box 1 and Fig. 1), comparing allele frequencies of potential longevity loci between highly aged individuals and young controls.The candidate gene studies based on single genes have pointed a role for genes involved in, e.g., GH/insulin/IGF-1 signaling, immune regulation, and lipoprotein metabolism (Supporting Information Table S1), although most of these results have not (yet) been confirmed in sufficient independent studies.The most convincing human longevity loci today are APOE and FOXO3A which have frequently been associated with longevity in cross-sectional studies (see for a review [26]) and survival in prospective studies [27][28][29] (Fig. 3).APOE encodes the protein apolipoprotein E which seems to play a role in e.g., lipoprotein metabolism, cognitive function, and immune regulation [30].FOXO3A encodes the protein forkhead box O3 which acts as a transcription factor for many different genes involved in processes like apoptosis and oxidative stress [31]."
+    ],
+    [
+      "Genomic analysis of longevity offers the potential to illuminate the biology of human aging.Here, using genome-wide association meta-analysis of 606,059 parents' survival, we discover two regions associated with longevity (HLA-DQA1/DRB1 and LPA).We also validate previous suggestions that APOE, CHRNA3/5, CDKN2A/B, SH2B3 and FOXO3A influence longevity.Next we show that giving up smoking, educational attainment, openness to new experience and high-density lipoprotein (HDL) cholesterol levels are most positively genetically correlated with lifespan while susceptibility to coronary artery disease (CAD), cigarettes smoked per day, lung cancer, insulin resistance and body fat are most negatively correlated.We suggest that the effect of education on lifespan is principally mediated through smoking while the effect of obesity appears to act via CAD.Using instrumental variables, we suggest that an increase of one body mass index unit reduces lifespan by 7 months while 1 year of education adds 11 months to expected lifespan.",
+      "Genomic analysis of longevity offers the potential to illuminate the biology of human aging.Here, using genome-wide association meta-analysis of 606,059 parents' survival, we discover two regions associated with longevity (HLA-DQA1/DRB1 and LPA).We also validate previous suggestions that APOE, CHRNA3/5, CDKN2A/B, SH2B3 and FOXO3A influence longevity.Next we show that giving up smoking, educational attainment, openness to new experience and high-density lipoprotein (HDL) cholesterol levels are most positively genetically correlated with lifespan while susceptibility to coronary artery disease (CAD), cigarettes smoked per day, lung cancer, insulin resistance and body fat are most negatively correlated.We suggest that the effect of education on lifespan is principally mediated through smoking while the effect of obesity appears to act via CAD.Using instrumental variables, we suggest that an increase of one body mass index unit reduces lifespan by 7 months while 1 year of education adds 11 months to expected lifespan.",
+      "Recent developments on the genetics of aging can be seen as several streams of effort.In general, humans show a relatively modest (<50%) heritability of life spans (results obtained from twin studies discussed below).The apoE polymorphisms are remarkable for their influence on both cardiovascular disease and Alzheimer disease.In contrast, rare mutant genes with high penetrance cause these same diseases but with early onset and a major shortening of the life span.Shortlived laboratory models (fruit flies, nematodes, mice) are yielding rapid advances, with the discovery of mutants that increase life spans in association with altered metabolism, which leads to questions on the physiological organization of aging processes.Although these early findings do not show that a conserved genetic program actually controls aging processes across animal phylogeny, it is striking how frequently findings of metabolic rate, insulin signaling, and free radicals have emerged from very different approaches to aging in nematodes and mammals, for example.These findings hint that the genetic control of life span was already developed in the common ancestor of modern animals so that subsequent evolution of life spans was mediated by quantitative changes in the control of metabolism through insulin and the production of free radicals.",
+      "Background: Genetic research on longevity has provided important insights into the mechanism of aging and aging-related diseases. Pinpointing important genetic variants associated with aging could provide insights for aging research. Methods: We performed a whole-genome sequencing in 19 centenarians to establish the genetic basis of human longevity. Results: Using SKAT analysis, we found 41 significantly correlated genes in centenarians as compared to control genomes. Pathway enrichment analysis of these genes showed that immune-related pathways were enriched, suggesting that immune pathways might be critically involved in aging. HLA typing was next performed based on the whole-genome sequencing data obtained. We discovered that several HLA subtypes were significantly overrepresented. Conclusions: Our study indicated a new mechanism of longevity, suggesting potential genetic variants for further study.",
+      "Background: Biological aging estimators derived from DNA methylation data are heritable and correlate with morbidity and mortality. Consequently, identification of genetic and environmental contributors to the variation in these measures in populations has become a major goal in the field. Results: Leveraging DNA methylation and SNP data from more than 40,000 individuals, we identify 137 genome-wide significant loci, of which 113 are novel, from genome-wide association study (GWAS) meta-analyses of four epigenetic clocks and epigenetic surrogate markers for granulocyte proportions and plasminogen activator inhibitor 1 levels, respectively. We find evidence for shared genetic loci associated with the Horvath clock and expression of transcripts encoding genes linked to lipid metabolism and immune function. Notably, these loci are independent of those reported to regulate DNA methylation levels at constituent clock CpGs. A polygenic score for GrimAge acceleration showed strong associations with adiposity-related traits, educational attainment, parental longevity, and C-reactive protein levels. Conclusion: This study illuminates the genetic architecture underlying epigenetic aging and its shared genetic contributions with lifestyle factors and longevity.",
+      "INTRODUCTION  Human aging is affected by genes, life style, and environmental factors.The genetic contribution to average human aging can be modest with genes explaining \u223c20-25% of the variability of human survival to the mid-eighties (Herskind et al., 1996;Fraser and Shavlik, 2001).By contrast, genetic factors may have greater impact on survival to the ninth through eleventh decades (Tan et al., 2008).Notably, exceptional longevity is rare and may involve biological mechanisms that differ from those implicated in usual human aging.",
+      "Before the advent of NGS technologies, several scientists were interested in the study of allele variants associated with aging, but they were limited by the lack of aging rate biomarkers.Now with NGS technologies, these biomarkers have been emerged such as the epigenetic clock that is described in the DNA methylation sequencing section of this chapter.In this post-genomic era, different strategies have been developed in order to understand the genetic factors involved in aging [17].One strategy used is the study of aging in extreme longevity groups of people, called centenarians.Centenarians are a group that can reach an age above 100 years and has an incidence of 1 every 10,000 people [18].In a pioneering study using extreme longevity people (308 individuals belonging to 137 sibships showing extreme longevity), genome-wide scan analysis identified a region on chromosome 4 associated with extreme longevity [19] that corresponds to the microsomal transfer protein (MTP) [20], which is associated with abetalipoproteinemia and hypobeta lipoproteinemia in humans [21,22].Another approach to study the genetic factors involved in longevity consists in assessing allele frequencies from people of different ages, looking for those polymorphisms (SNPs) with enhanced allele frequencies in high-longevity individuals.Those alleles with diminished frequencies in aged individuals may be associated with age-related diseases.Using this approximation, an SNP that shifts isoleucine to valine was identified in the PKA-anchoring protein (AKAP2) gene.This polymorphism is associated with reduced longevity and cardiac disease [23].Genome-wide association studies (GWAS) have confirmed only three loci that affect longevity: FOXO3A, APOE, and an intergenic locus on chromosome 5q33.3[24][25][26].",
+      "Even more disappointing result is that some genes predisposing to geriatric diseases discovered by GWAS appear to be not correlated with human longevity (Beekman et al. 2010;Deelen et al. 2011).This result questions whether findings obtained from GWAS may provide insights into the bio-genetic mechanisms underlying a healthy lifespan.In fact, this finding is very surprising because (1) genetic studies of non-human species have discovered numerous genes predisposing to aging-related processes (Cutler and Mattson 2006;Vijg and Suh 2005;Kenyon 2005;Johnson 2006;Greer and Brunet 2008), (2) nongenetic association studies show that the long-living individuals are typically in better health compared to the short-living individuals (Barzilai et al. 2003;Willcox et al. 2008b;Willcox et al. 2008a;Evert et al. 2003), and (3) candidate-gene studies (but not GWAS) document that the same genes can affect diseases and lifespan (Koropatnick et al. 2008;Kulminski et al. 2011).This is an apparent paradox which has to be carefully examined.A prominent geneticist and evolutionary biologist T. G. Dobzhansky asserts that \"nothing in biology makes sense except in the light of evolution. \"Evolution primarily maximizes fitness of individuals of reproductive age.The classical evolutionary biological theory of aging claims that aging occurs because of decline in the force of natural selection with age (Kirkwood and Austad 2000).Then, according to that theory, aging-related (senescent) phenotypes with post-reproductive manifestation are non-adaptive and subject to stochastic variation.Therefore, at a first glance evolution should not be relevant to senescent phenotypes (apart so-called grandmother hypothesis; Hawkes et al. 1998).Such phenotypes, however, can be caused by reproductive-age-related risk factors making, thus, evolution to be relevant to them (Vijg and Suh 2005;Di Rienzo and Hudson 2005;Drenos and Kirkwood 2010).",
+      "On the other hand, the same evolutionary-motivated strategy suggesting to focus on more heterogeneous phenotypes (as opposite to more homogenous) can be highly beneficial for unraveling genetic predisposition to fundamental mechanisms of intrinsic biological aging and, consequently, to geriatric diseases.Indeed, aging is associated with systemic remodeling of an organism's functioning which increases chances of virtually all geriatric disorders (Franco et al. 2009;Franceschi et al. 2000;Martin et al. 2007;Cutler and Mattson 2006).Experiments with laboratory animals (Johnson 2006) and heritability estimates in humans (Christensen et al. 2006;Iachine et al. 1998) show that aging can be genetically regulated (Finch and Tanzi 1997;Martin et al. 2007;Vaupel 2010).Accordingly, yielding insights in genetic predisposition to aging-related processes in an organism could be a major breakthrough in preventing and/or ameliorating not one geriatric trait, but perhaps a major subset of such traits (Martin et al. 2007) that can greatly advance progress in solving the problem of extending healthy lifespan in humans.",
+      "In conclusion, we performed a genome-wide association study of longevity-related phenotypes in individuals of European, East Asian and African American ancestry and identified the APOE and GPR78 loci to be associated with these phenotypes in our study.Moreover, our gene-level association analyses highlight a role for tissue-specific expression of genes at chromosome 5q13.3,12q13.2,17q21.31,and 19q13.32 in longevity.Genetic correlation analyses show that our longevity-related phenotypes are genetically correlated with several disease-related phenotypes, which in turn could help to identify phenotypes that could be used as potential biomarkers for longevity in future (genetic) studies.",
+      "Most genetic studies involved with aging have focused on identifying genes contributing to particular diseases. More recently, it has been recognized that it is also valuable to examine genetic factors related to disease-free or healthy aging (1,2). Utilizing twins from the National Academy of Sciences-National Research Council (NAS-NRC) twin panel, we have demonstrated that healthy physical aging is under a significant degree of genetic influence, with a heritability over 50% (3). Our definition of healthy aging focused principally on freedom from cardiovascular disease, and has received considerable support in the more recent literature. Brand and colleagues (4) reported that parental age at death was a significant predictor of coronary heart disease death in the Framingham offspring study and concluded that familial similarities for age at death may be mediated through shared coronary heart disease risk factors. Frederiksen and colleagues (5) reported that increased parental life was associated with a reduction in odds ratio for their children to have diabetes, ischemic heart disease, heart failure, stroke, and hypertension. We have found that better midlife lipid levels and blood pressures were associated with increased parental longevity in the National Heart, Lung, and Blood Institute twin study (6). Centenarian siblings and offspring, besides having increased longevity, have been shown to have better health and better cardiovascular risk factor profiles (7)(8)(9)(10).",
+      "The lack of success in the identification of genes related to aging in humans may be due to the complexity of the phenotype.One approach to investigate aging and longevity is to compare frequencies of genetic variants between nonagenarians or centenarians and the general population.This approach led to the discovery of an association between APOE (Deelen et al., 2011;Ewbank, 2007;Gerdes et al., 2000) and more recently FOXO3A (Anselmi et al., 2009;Flachsbart et al., 2009;Li et al., 2009a;Pawlikowska et al., 2009;Willcox et al., 2008) and human aging and longevity.However, a recent genome-wide association study (GWAS) of individuals reaching the age of 90 or older failed to identify genome-wide significant variants (Newman et al., 2010).",
+      "Human longevity and healthy aging show moderate heritability (20%-50%). We conducted a meta-analysis of genome-wide association studies from 9 studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium for 2 outcomes: (1) all-cause mortality, and (2) survival free of major disease or death. No single nucleotide polymorphism (SNP) was a genome-wide significant predictor of either outcome (p < 5 \u00d7 10^-8). We found 14 independent SNPs that predicted risk of death, and 8 SNPs that predicted event-free survival (p < 10^-5). These SNPs are in or near genes that are highly expressed in the brain (HECW2, HIP1, BIN2, GRIA1), genes involved in neural development and function (KCNQ4, LMO4, GRIA1, NETO1) and autophagy (ATG4C), and genes that are associated with risk of various diseases including cancer and Alzheimer's disease. In addition to considerable overlap between the traits, pathway and network analysis corroborated these findings. These findings indicate that variation in genes involved in neurological processes may be an important factor in regulating aging free of major disease and achieving longevity.",
+      "Introduction  The recent, remarkable extension of life expectancy is largely attributed to the postponement of mortality at old age (Vaupel, 1997, 2010). The years of life gained in the older population residing in developed nations are a success story of public health measures and improved health care. In addition to such external factors, longevity and healthy aging consistently show a modest heritability between 20% and 50% and aging-associated genetic research may provide further insights into the mechanisms of aging (Herskind et al., 1996; McGue et al., 1993; Reed and Dick, 2003). It has been postulated that genes involved in pathways associated with aging identified in animal models, such as insulin-like growth factor (IGF)-insulin signaling, regulation of lipoprotein metabolism, the mTOR pathway, and the oxidative stress response may also influence survival to old or even exceptionally old age in humans (Christensen et al., 2006; Kenyon, 2010; Vellai et al., 2003). However, in humans, common variants within genes involved in these pathways have not been consistently associated with lifespan (Christensen et al., 2006; Kenyon, 2010; Kuningas et al., 2008; Vijg and Suh, 2005).",
+      "Human longevity and healthy aging show moderate heritability (20%-50%). We conducted a meta-analysis of genome-wide association studies from 9 studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium for 2 outcomes: (1) all-cause mortality, and (2) survival free of major disease or death. No single nucleotide polymorphism (SNP) was a genome-wide significant predictor of either outcome (p < 5 \u00d7 10^-8). We found 14 independent SNPs that predicted risk of death, and 8 SNPs that predicted event-free survival (p < 10^-5). These SNPs are in or near genes that are highly expressed in the brain (HECW2, HIP1, BIN2, GRIA1), genes involved in neural development and function (KCNQ4, LMO4, GRIA1, NETO1) and autophagy (ATG4C), and genes that are associated with risk of various diseases including cancer and Alzheimer's disease. In addition to considerable overlap between the traits, pathway and network analysis corroborated these findings. These findings indicate that variation in genes involved in neurological processes may be an important factor in regulating aging free of major disease and achieving longevity.",
+      "Many factors contribute to aging, including genes. This is the first article in a 10-part series that highlights some of what is known about the influence of genes on aging and emerging treatment options that may slow down or potentially reverse the aging process. The series will address genes, adducts, and telomeres, decreased immune defenses, oxidation and inefficient mitochondria, toxins and radiation, glycosylation, caloric intake and sirtuin production, neurotransmitter imbalance, hormone mechanisms, reduced nitric oxide, and stem cell slowdown. Underpinning these factors are wear and tear on cells and aging as a result of inability to repair or replace these affected cells. These topics have been addressed in research, health magazines, and even by talk show hosts. There is even a LongevityMap website addressing significant and nonsignificant genetic association studies in aging across the human genome (http://genomics.senescence.info/longevity/). The series will address a scientific and clinical approach to genome-related aging topics.",
+      "The genetic basis of human longevity has so far been primarily investigated by association studies. Most results from these experiments have been difficult to confirm in independent samples, probably owing to the modest heritability, multifactorial nature, and heterogeneity of the phenotype (Christensen et al., 2006). To date, variation in only two genes has been identified, which has an effect on longevity in various populations: (i) the apolipoprotein E gene (APOE) (Sch\u00e4chter et al., 1994; Christensen et al., 2006) and (ii) the forkhead box O3A (FOXO3A) gene in the insulin-IGF1 signaling (IIS) pathway (Willcox et al., 2008; Flachsbart et al., 2009). Given the apparent lack of susceptibility candidates, it is conceivable that other genetic factors influence the function or expression of genes relevant for human longevity.",
+      "Introduction  Approximately 25-30% of the variation in adult lifespan is attributable to genetic factors that become more important with increasing age and exert their strongest effects in nonagenarians and centenarians (G\u00f6gele et al., 2010; Hjelmborg et al., 2006). As yet, however, only a few genetic variants have been found consistently to influence longevity. The first to be discovered was the e4 allele of the apolipoprotein E (APOE) gene, a mortality factor that predisposes to both Alzheimer's and cardiovascular diseases (Corder et al., 1993; Panza et al., 2004). APOE e4 is the only variant with a reportedly large adverse effect upon survival at advanced age (Sch\u00e4chter et al., 1994), and this association has been replicated in several populations (Christensen et al., 2006). Variation in the human forkhead box O3A gene (FOXO3A), in contrast, has been found to be associated with the ability to live long, an effect corroborated by studies in Japanese, German, Italian, US-American, Jewish, Chinese and Danish populations (Anselmi et al., 2009; Flachsbart et al., 2009; Li et al., 2009; Pawlikowska et al., 2009; Soerensen et al., 2010; Willcox et al., 2008). More recently, we have identified exonuclease 1 (EXO1) as a potential novel longevity gene (Nebel et al., 2009). All three genes were detected through candidate-gene approaches.",
+      "GenAge: the aging gene database Philosophy and overview of resources  It is undisputed that genetic factors influence aging.In a remarkable series of recent breakthroughs, a number of genes capable of altering the aging process as a whole -or at least to a large degree -have been identified in animal models and even a few in humans (Finch & Ruvkun, 2001;de Magalh\u00e3es, 2005;Kenyon, 2005).Furthermore, multiple alleles have been examined for their association with human exceptional longevity (Vijg & Suh, 2005).This is a fascinating and important area of research, yet there are now so many genes being associated with aging and longevity that keeping track of them all is becoming increasingly more difficult.Moreover, it is necessary now to study not only individual genes but their interactions with each other and with the environment, and how together genes give rise to a given phenotype: the so-called systems biology approach.To help researchers address these issues we created GenAge, a database of genes related to longevity and/or aging.",
+      "Increases in longevity of the general population worldwide are an unprecedented phenomenon with significant health and social impact. Although environmental factors have led to an increase in life span, there is ample evidence that genetic factors are involved in extreme longevity both in humans (1-7) and in other organisms (8). The protective genetic factors that lead to longevity are likely to involve fundamental processes of aging that may be different from those associated with early mortality or premature onset of age-related diseases in younger individuals. The mechanisms of aging in humans are far from understood, but available evidence suggests that several pathways (inflammation, oxidative stress and stress responses, cellular senescence, DNA damage and repair, and the growth hormone or insulin-like growth factor and insulin (GH, IGF, INS) axis) may play key roles (9-12). Model organisms suggest that inhibiting the GH, IGF, or INS axis, which is involved in regulating cell proliferation, cell death, wound repair, and metabolism, may promote longevity by reducing oxidative stress and slowing the rate of cell replication and the accumulation of somatic-cell DNA mutations (13). There is also evidence for other important pathways such as the heat-shock proteins and heat-shock factors that are highly conserved across species and play a role in prolongevity transcription pathways. Clinical and epidemiological investigations, including candidate gene studies, have suggested that inflammation pathways may affect life span and risk of age-related conditions such as cardiovascular disease (CVD) and its risk factors (14-19). A combination of multiple genetic variants may be required for an individual to achieve exceptional longevity, which may account in part for its rarity."
+    ],
+    [
+      "Indicative biological pathways associated with the candidate aging genes",
+      "Fig. 2 Significant biological processes associated with the candidate aging genes",
+      "Following are examples of the identified genes and experimental or GWAS link between these genes and aging.On the list of the 25 top genes, NAP1L4 encodes a member of the nucleosome assembly protein (NAP) family, which interacts with both core and linker histones, and shuttles between the cytoplasm and nucleus, suggesting a role as histone chaperone.Histone protein levels decline during aging, and dramatically affect chromatin structure.Remarkably, the lifespan can be extended by manipulations that reverse the age-dependent changes to chromatin structure, indicating the pivotal role of chromatin structure in aging [32].In another example, gene expression of NAP1L4 increases with age in the skin tissue [33].Findings of GWAS link a number of the identified genes to age-related disorders, such as GAB2 and late onset Alzheimer's disease [86], and QKI and coronary heart disease/myocardial infarction [79].Interestingly, GWAS reports also link QKI to successful aging [87].",
+      "Examples of biological candidate genes with pleiotropic functions, which are involved in aging in general and in musculoskeletal aging in particular, are numerous: (a) in addition to the IGF-1 and vitamin D genes, estrogen metabolism pathway genes, including estrogen receptors and aromatase (CYP19), are associated with fat-free mass (Walsh et al. 2005) and BMD (Shearman et al. 2004), prostate and breast cancer (Gallicchio et al. 2006), and cardiovascular disease risk (Shearman et al. 2003).",
+      "In-depth analysis of the age-regulated genes revealed that multiple genes in the DNA damage response pathway were upregulated with age including those that function in non-homologous end-joining repair (mre11, rad50, Ku80 and mus308) and in translesion DNA synthesis (mus205 and DNApol-eta) [44-46]. Genes that encoded enzymes with antioxidant properties, such as the thioredoxin reductase Trxr-1, and antioxidant genes involved in glutamate metabolism, such as GlnRS, isoQC and QC, were also upregulated with age [47-50]. We also observed increased age-associated expression of chaperone genes (Cct1, Cct4, Cct5, Cct6, Hsc70-4) and the unfolded protein response transcription factor Xbp1, consistent with an induction of the unfolded protein response [51-53]. Under stress conditions, there is a translational switch that favors production of stress-related proteins while decreasing translation of other proteins [54]. Paralogs of canonical translation factors such as NAT1 and Rack1, which were both upregulated, promote this switch to cap-independent translation [55,56]. Notably, Rheb, which is downregulated with age, positively regulates ribosome production and cap-dependent translation by activating the mechanistic target of rapamycin (mTOR) kinase pathway [57]. Thus, decreased Rheb levels during aging could decrease mTOR pathway activity, which extends lifespan and is protective against age-related pathology [58]. Together, these data suggest that multiple genes are induced in aging photoreceptors to mitigate the effects of oxidative stress, protein misfolding and DNA damage.",
+      "CellAge vs human orthologues of longevity-associated model organism genes  To understand how senescence is linked to the genetics of aging processes, we looked at the intersection of CellAge genes and the 869 genes in the human orthologues of model organisms' longevity-associated genes (LAGs) dataset, collected based on quantitative changes in lifespan [34].Like CellAge, where genes are classified based on whether their upregulation induces, inhibits, or has an unknown impact on CS, the longevity orthologues dataset also provides information on the effect of upregulation of its genes, namely whether it promotes (pro, 421) or inhibits (anti, 448) longevity (Additional file 1: Table S7; Additional file 2: Fig. S2).",
+      "Using network biology, we implicated the CellAge genes in various processes, particularly cell division and immune system processes.We used network topology to identify potential regulators of CS and bottlenecks that could impact various downstream processes if deregulated.Indeed, we identified 11 genes that have already been shown to contribute towards CS, which will be added to future versions of CellAge.Finally, we experimentally verified 26 genes that induce CS morphology or biomarkers when knocked down in human mammary fibroblasts.Of these, 13 genes (C9orf40, CDC25A, CDCA4, CKAP2, GTF3C4, HAUS4, IMMT, MCM7, MTHFD2, MYBL2, NEK2, NIPA2, and TCEB3) were strong hits in inducing a senescent phenotype.",
+      "Results: We develop CellAge (http://genomics.senescence.info/cells),a manually curated database of 279 human genes driving cellular senescence, and perform various integrative analyses.Genes inducing cellular senescence tend to be overexpressed with age in human tissues and are significantly overrepresented in anti-longevity and tumor-suppressor genes, while genes inhibiting cellular senescence overlap with pro-longevity and oncogenes.Furthermore, cellular senescence genes are strongly conserved in mammals but not in invertebrates.We also build cellular senescence protein-protein interaction and co-expression networks.Clusters in the networks are enriched for cell cycle and immunological processes.Network topological parameters also reveal novel potential cellular senescence regulators.Using siRNAs, we observe that all 26 candidates tested induce at least one marker of senescence with 13 genes (C9orf40, CDC25A, CDCA4, CKAP2, GTF3C4, HAUS4, IMMT, MCM7, MTHFD2, MYBL2, NEK2, NIPA2, and TCEB3) decreasing cell number, activating p16/p21, and undergoing morphological changes that resemble cellular senescence.Conclusions: Overall, our work provides a benchmark resource for researchers to study cellular senescence, and our systems biology analyses reveal new insights and gene regulators of cellular senescence.",
+      "Genomics-a fundamental basis for understanding skin aging  In the last decade, genomic tools such as gene chips have been widely developed. This accomplishment has provided us with deeper insights into the molecular events underlying skin aging [137]. Gene expression profiling has led to identification of pathways affected by aging, and this information has led to the development of new strategies to enable better skin repair and antiaging benefits [138]. Gene expression patterns were examined in sun-protected (buttocks) and sun-exposed skin (extensor forearm) from 10 young (age 19 to 20 years) and 10 older women (age 63 to 67 years) to examine gene expression profiles associated with chronologic skin aging and photoaging. Chronologic and photoaging were both associated with downregulation of the biologic process of lipid synthesis. In particular, genes involved in cholesterol and fatty acid synthesis were downregulated, as were genes associated with epidermal differentiation, including keratin filaments and cornified envelope components. An upregulation of the biologic processes of inflammatory response and wound healing, the molecular functions of cytokine activity and protease activity and the cellular component theme of extracellular matrix was also observed in both skin aging types. Elastin gene expression was upregulated with aging only in the photodamaged arm and remained unchanged in the sun-protected buttock. This finding corresponds to the histopathologic findings that show typical elastotic changes, the \"solar elastosis,\" in photoaged skin [139]. Further studies conducted to investigate changes in gene expression during skin aging have been performed on naturally aged human foreskin obtained from children and elderly men. Some of the mechanisms proposed to be involved in the induction of aging comprise disturbed lipid metabolism, altered insulin and STAT3 signalling, upregulation of apoptotic genes partly due to the deregulation of FOXO1, downregulation of members of the jun and fos family, differential expression of cytoskeletal proteins (eg, keratin 2A, 6A, and 16A), extracellular matrix components (eg, PI3, S100A2, A7, A9, SPRR2B), and proteins involved in cell-cycle control (eg, CDKs, GOS2) [140]. Similar results have been presented by a study related to aging of skeletal muscle [141]. In a previous study, we proposed that one of the factors significantly involved in the initiation of aging might be the physiologic decline of hormones occurring with age. Human SZ95 sebocytes in vitro treated with hormone levels that can be found in 60 year-old women produce less lipids than sebocytes treated with a hormone mixture representing that found in the serum of 20 year-old women [6]. A differential gene expression between SZ95 sebocytes under the 20 and 60 year-old hormone mixture detected differentially expressed genes that are involved in biologic processes such as DNA repair and stability, mitochondrial function, oxidative stress, cell cycle and apoptosis, ubiquitin-induced proteolysis, and transcriptional regulation [139,140]. A comparison of these results with data obtained from the aged kidney [142] identified key genes that may be of great importance for global aging. The most significantly altered signalling pathway was that of TGF-\u03b2. A disturbed function of this cascade has been also associated with tumorigenesis, such as in pancreatic, prostate, intestine, breast, and uterine cancer. [Figure legend fragment, beginning missing in the source:] c-Fos, which heterodimerize to form the activator protein 1 (AP-1) complex. AP-1 is a key regulator of skin aging, because it induces the expression of the MMP family and inhibits type I procollagen gene expression through interference with TGF-\u03b2 signalling pathway. It has been postulated that MAP kinases may be activated by excess production of reactive oxygen species (ROS) that occurs with advanced age and may be superimposed by extrinsic factors such as ultraviolet irradiation. Excess ROS production also leads to accumulation of cellular damage, which includes oxidation of DNA resulting in mutations, oxidation of proteins leading to reduced function, and oxidation of membrane lipids resulting in reduced transport efficiency and altered transmembrane signalling. IL, interleukin; NF-\u03baB, nuclear factor-\u03baB; TGF-\u03b2, transforming growth factor-\u03b2; TSP-1, thrombospondin-1; TSP-2, thrombospondin-2; VEGF, vascular endothelial growth factor.",
+      "Analysis of prior research (Online Resource 5) shows that the revealed genes can be explicitly involved in other key biological processes in an organism whose role is known to be changing with aging.Specifically, ten genes (BAZ2B, HMGB4, NOC2L, RAI1, SIK1, SMARCA2, SPZ1, TBP, TRIP13, and ZKSCAN1) regulate transcription which is believed to be disrupted when an organism is getting older (Roy et al. 2002).The DBH, TPO, and LSS genes are involved in synthesis of catecholamine, thyroid, and vitamin D hormones, respectively.The GPER binds estrogen and HCRTR2 binds orexin-A and orexin-B neuropeptid hormones.Hormonal deregulation with aging is considered to be one of the major components of senescent processes in an organism (Barzilai and Gabriely 2010).Five genes (ATG2A, NEDD4L, PSMB1, UBXN4, and USP6) are involved in degradation of proteins through ubiquitin-proteasome and the lysosomal/autophagic system.Dysfunction of this system leads to accumulation of damaged proteins in an organism that is associated with aging (Koga et al. 2011).Protein degradation through ubiquitin-mediated proteolysis plays an important role in cell-cycle regulation (Reed 2003).The PSMB1, SIK1, TRIP13, and TTN genes in the revealed set coordinate cell cycle.Cell cycle is linked with the aging-related processes in humans through a gradual increase in cell division errors in all tissues in an organism (Ly et al. 2000).Five genes (EEF1A2, DBH, ITGB2, TUBB2C, and WRN) take part in regulation of apoptosis which plays an important role in the aging process and tumorigenesis (Salvioli et al. 2008).Seven genes (ABCA7, AZGP1, CD36, DEGS2, LSS, PI4KA, and SOAT2) are involved in lipid metabolism which plays one of the key roles in human longevity and healthy aging (Barzilai et al. 2003).",
+      "Genes that are age-regulated in all tissues would reveal genes involved in core mechanisms that underlie cellular ageing.Zahn et al. [63] discovered genetic pathways that show common age regulation in human kidney, brain and muscle.They used microarrays to analyse expression in 81 skeletal muscle samples from patients aged 16 -86 years and found 250 age-regulated muscle genes [63].Similar to the ageing expression profile for the kidney, the overall expression behaviour of this set of age-regulated muscle genes correlated with the physiological as well as chronological age of the muscle sample.Next, they compared their muscle-ageing results to previously published data on kidney and brain ageing of similarly large sample size [56,60].Although most of the age-related changes were tissue specific, they found evidence for common age regulation of six genetic pathways in all three tissues.Specifically, there is an overall increase in expression of the extracellular matrix genes, the ribosomal genes, the cell growth genes and the complement activation genes in all three tissues.Increased overall expression of the extracellular matrix and complement activation gene sets with advancing age may contribute to widespread fibrosis and inflammation in the elderly.There is an overall decrease in expression of the chloride transport genes and the electron transport genes in all three tissues.Decreased overall expression of electron transport chain genes with age might support the mitochondrial free-radical theory of ageing [67], as free-radical generation by mitochondria would preferentially damage the electron transport chain protein complexes.Decreased expression of the electron transport genes (encoded in the nucleus) might be caused by feedback regulation from damage to the electron transport chain protein complexes [63].However, it is also possible that increased oxidative damage occurs as a consequence of the decreased expression of the electron transport chain 
genes.In addition, an increasing number of studies in model organisms have critically challenged the mitochondrial free-radical theory of ageing [68].",
+      "Discussion  Aging studies from model organisms such as yeast, worms, and flies have repeatedly shown that changes in the expression of certain genes have an effect upon longevity.Although similar aging processes are likely to operate across multiple species [30], it has been much more difficult to identify longevity candidate genes in human studies [30].A key question in human aging is to what extent a signature of aging may be detectable across tissues.Until now there has been a lack of large transcriptional profiles from the same human individuals in multiple tissues.The MuTHER study provides insight into the human aging process by interrogating the largest multiple human tissue gene expression resource to identify genes in which expression was affected by chronological age.The analysis of the skin and adipose tissues samples identified several hundred genes responsive to changes in chronological age.However, the 43 shared genes in skin and adipose tissue showed a single common identifiable pathway related to the stress response.From over 1,800 transcripts that have altered expression with age in skin and adipose tissues, 14 also had age-related differential expression in brain.The limited overlap in these two experiments may partly reflect the smaller sample size of the brain expression dataset, the differences in age range between the studies (16 to 83 years for brain samples; 39 to 85 years for MUTHER samples), or the inclusion of males in the brain samples.But it may also imply, as other studies have suggested, that the effects of age on gene transcription are tissue specific [6,31,32].This hypothesis was supported by the comparison with known related aging genes from the GenAge database, which identified an overlap for a small number of aging-related genes with our data.The GenAge database was the result of a meta-analysis using age-related expression profiles from human brain, kidney, and skeletal muscle, and several expression profiles from mouse 
and rat; no adipose tissue or skin samples were included (Additional file, Table 1 in [7]).The limited overlap between these datasets supports the idea that molecular signatures of aging reflect predominantly a tissue-specific transcriptional response.The lack of age-related genes in transformed LCLs, suggest that the transformation to immortalize a cell line may mask or even remove the age-related signatures in gene expression.The transformation of primary B lymphocytes into LCLs requires infection by the Epstein-Barr virus which has the effect of disrupting the p53 signaling pathway in order to induce growth and survival [33].Joehanes et al. [15] identified only five genes with age-associated expression in LCLs, including p53 itself (TP53).Although the authors attribute the lack of age-affected genes to their small sample size (n=50) and narrow age range, our analysis with a much larger sample size found even fewer age-related changes, suggesting a lack of detectable aging signature in LCLs.The analysis in the subset of fresh lymphocytes suggested an age influence in fresh lymphocytes may potentially be detectable with a larger sample size.",
+      "Genes Whose Expression Decreased with Age.Of the 26 genes that decreased expression with age in control mice, 23% are involved in DNA replication and the cell cycle (Table 2).Most of these have a negative effect on cell growth and division.Among these, the product of phosphatase and tensin homolog (Pten) gene is a tumor suppressor that induces cell-cycle arrest through inhibition of the phosphoinositide 3-kinase pathway (28).B cell translocation gene 2 (Btg2) is a tumor suppressor that increases expression in response to DNA damage (29).The murine gene product of the amino-terminal enhancer of split (Aes) is a potent corepressor of gene expression and cellular proliferation (30).Calcium-binding protein A11 (S100a10) binds to and regulates the activity of annexin II, which is involved in the transduction of calcium-related mitogenic signals (31).Insulin-like growth factor (IGF) binding protein 1 (Igfbp1) plays an important role in the negative regulation of the IGF-1 system, a stimulator of mitogenesis (32).",
+      "daf-16 dependent genes  Among the 52 genes that we have tested, 29 genes act almost completely in a daf-16 dependent manner, to regulate lifespan (Table 2).One of the genes identified was daf-2 (Y55D5A_391.b).This serves as a proof of principle that our screen is effective in identification of aging genes.",
+      "Several of the genes we identify have previously been shown to influence lifespan in experiments on model organisms.For example, knockouts of the orthologs of APOE, LDLR, CDKN2B, and RBM38 in mice shortens their lifespan [24][25][26][27] , while knockout of IGF1R has the opposite effect 28 .Similarly, overexpression of the FOXO3 orthologue in Drosophila melanogaster 29 and the SNCA orthologue in Caenorhabditis elegans 30 have shown to extend their respective lifespans.Many of our genes are also enriched for pathways previously related to ageing in eukaryotic model organisms, including genomic stability, cellular senescence, and nutrient sensing 31 .For example, FOXO3 and IGF1R are well-known players modulating survival in response to dietary restriction 32 , but we also highlight genes involved in the response to DNA damage and apoptosis, such as CDKN2B, USP28, E2F2, and BCL3.In addition to hallmarks discovered in model organisms, our results suggest that haem metabolism may play a role in human ageing.This pathway includes genes involved in processing haem and differentiation of erythroblasts 33 .Although the enrichment is largely driven by genes linked to the LDLR locus, genes linked to other loci of interest (such as FOXO3, CDKN2B, LINC02513) are involved in similar biological pathways: myeloid differentiation, erythrocyte homeostasis, and chemical homeostasis.",
+      "Hundreds of genes in several pathways act as regulators of ageing (1,32). However, analysis of DrugAge and other HAGR databases has revealed that the overlap between the targets of lifespan-extending drugs and known ageing related genes is modest (31). This indicates that most ageing-related pathways have yet to be targeted pharmacologically; DrugAge may aid in guiding further assays. This was recently demonstrated in one study where machine learning was used to predict whether a compound would increase lifespan in worms using data from DrugAge. The best model had 80% prediction accuracy and the top hit compounds could broadly be divided into compounds affecting mitochondria, inflammation, cancer, and gonadotropin-releasing hormone (33).",
+      "Top 25 genes co-expressed with aging related genes",
+      "Aging-related gene prediction and putative transcriptional mechanisms  GeneFriends was used to identify genes related to aging.A seed list of genes known to be consistently overexpressed with age in mammals was used [18].In total, 1119 genes were co-expressed with the aging seed list at p <10 -6 ; Table 1 shows the top 25 genes.Many of these genes have been associated with age-related diseases.Several other genes that have been shown to play a role in aging such as lysosomal-associated membrane protein-2 Lamp2 [19] (p = 5.68 -30 ), Fas [20] (p = 2.70 -31 ) and growth hormone receptor Ghr [21] (p = 1.34 -19 ) also showed a significant co-expression.Anxa2, Anxa3 and Anxa4 also show a low p-value (p < 10 -25 ) as well as several S100 calcium binding proteins which have been shown to interact with annexins [22].",
+      "Genetic studies have shown that aging can be slowed in mutants that are defective in a wide range of cellular processes (such as mitochondrial function, chromatin regulation, insulin signaling, transcriptional regulation, and genome stability). This indicates that aging is a complex process driven by diverse molecular pathways and biochemical events. As such, a powerful approach to study aging is to use systems biology, which allows a multitude of factors affecting aging to be analyzed in parallel. For example, DNA microarrays and gene expression chips have been used to perform a genome-wide analysis of changes in gene expression in old age. Extensive studies in Caenorhabditis elegans and Drosophila melanogaster have identified hundreds of age-regulated genes (Hill et al. 2000; Zou et al. 2000; Lund et al. 2002; Pletcher et al. 2002; Murphy et al. 2003). Several studies have described age-regulated genes in the muscle and brain of mice (Lee et al. 1999, 2000) and the retina and muscle of humans (Yoshida et al. 2002; Welle et al. 2003, 2004). These age-regulated genes may serve as markers of aging, enabling one to assess physiological age independently of chronological age. Analysis of the functions of these age-regulated genes has identified specific biochemical mechanisms that change toward the end of life.",
+      "Age-Regulated Genes Involved in Reproductive Capacity. Decline in reproductive capacity is an age-related phenotype, and the reproductive system seems to play an important role in longevity (22). For example, signals from germ cells can affect lifespan in C. elegans (23). In our study, we observed decreased RNA levels for several genes involved in reproduction (Fig. 3). These include two genes that encode members of the Acp family. The Acp from male flies stimulates female egg-laying and facilitates storage of sperm in the female genital tract (24). In addition, two ESTs showing age-regulated decrease of transcript levels represent different genes with homology to Arabidopsis MALE STERILITY 2 (MS2; ref. 25), a gene involved in gametogenesis. Furthermore, an EST with homology to peanut, a member of the septin family (26), is down-regulated in older flies. This downregulation may reflect a decrease in spermatogenesis. [Table note:] In Northern analysis, the ratios were calculated by dividing mRNA levels at 25-, 40-, and 50-day time points by those at 3-day time points after normalization with mRNA levels of the control gene rp49. Ratios in microarray analysis are provided from each of the duplicate experiments for comparison."
+    ]
+  ]
+}
\ No newline at end of file
diff --git a/gnqa/paper1_eval/src/data/datasets/old/experts_general1_dataset.json b/gnqa/paper1_eval/src/data/datasets/old/experts_general1_dataset.json
new file mode 100644
index 00000000..5643ab90
--- /dev/null
+++ b/gnqa/paper1_eval/src/data/datasets/old/experts_general1_dataset.json
@@ -0,0 +1,128 @@
+{
+  "question": [
+    "How do I determine which gene in my QTL is causal for the trait?",
+    "Why do males have two Y chromosomes and females only one?",
+    "what type of dataset is useful for qtl mapping analysis in genenetwork2?",
+    "What are the ethical considerations surrounding prenatal genetic testing and the selective termination of pregnancies based on genetic factors?",
+    "What are the potential benefits and risks associated with gene editing technologies like CRISPR-Cas9?"
+  ],
+  "answer": [
+    "To determine which gene in your QTL is causal for the trait, you can follow several steps. First, narrow down the QTL interval to a reasonable size. Then, sort through the genes within the interval to identify potential Quantitative Trait Genes (QTGs). This can be challenging as more than one gene may be involved and the function of some genes may be unknown. You can use techniques like LCMS to establish causal relationships between DNA loci and transcripts. Once potential QTGs are identified, you need to prove that a particular gene is involved in the quantitative trait under study. This can be done by complementation of a QTL, which can be achieved in several ways. You can also use mediation analysis to identify the causal gene between the genetic variant and the trait-of-interest. Finally, investigate whether the expression of these genes correlates with the phenotype(s) of interest. If the gene expression strongly correlates with the QTL phenotype, this further strengthens the assumption that this gene might be causal.",
+    "Actually, males have one Y chromosome and one X chromosome, while females have two X chromosomes. This combination determines the sex of an individual.",
+    "GeneNetwork utilizes datasets containing phenotypes and genotypes for QTL mapping analysis. These datasets may include genetic, genomic, and phenotypic data. The data are often formatted and normalized, and come with metadata. The database also includes data on complex phenotypes ranging from gene expression to behavior in standard inbred strains, and panels of mouse recombinant inbred strains. Rat and Arabidopsis populations are also represented in the database.",
+    "The ethical considerations surrounding prenatal genetic testing and selective termination of pregnancies based on genetic factors include the potential for implicit pressure on individuals to violate personal ethics to reduce financial burden on society, the risk of routinization of testing leading to social or medical expectations of testing in all eligible individuals, and the potential compromise of values of informed consent and individual autonomy. There are also cultural and religious beliefs to consider, as well as the potential psychological impact on parents who may feel guilt if they are carriers of genetic conditions. Furthermore, the decision to terminate a pregnancy based on genetic factors is a joint decision between parents, and the involvement of extended family members in this process varies greatly across different cultures.",
+    "The potential benefits of gene editing technologies like CRISPR-Cas9 include the ability to modify genes for the treatment of diseases, improvement of crop species, and the development of personalized drug or cell therapies. It can also be used for functional screening in the development of therapies and for the study of molecular causes of ageing. However, there are risks associated with these technologies. These include off-target effects, which can lead to unwanted mutations, and the potential for wide-ranging deletions or recombination events. There's also a risk of triggering a P53 response leading to apoptosis in cycling cells, and the potential for subjects to generate antibodies to Cas9, which could limit gene therapies. Furthermore, the long-term safety of CRISPR genome editing in humans is yet to be determined."
+  ],
+  "contexts": [
+    [
+      "Prior belief or knowledge about the number of true causal and true independent links that might be expected in a typical QTL, depending on the study design, should be considered to safeguard against high false-positive rates (low positive predictive values). In studies that involve mapping gene expression (eQTL), protein (pQTL) or metabolite (mQTL) traits, information about co-localization of QTL and genes that are functionally linked to the trait provides information about the likelihood of causal links.",
+      "The next step is to investigate whether the expression of these genes correlates with the phenotype(s) of interest.This would suggest a chain of causality: a variant within a gene causes a change in its expression, and the expression of that gene correlates with expression of a phenotypic trait of interest.To do this, we created a correlation matrix between all genes within a QTL with a cis-eQTL in any brain tissue as well as the phenotypes that contributed to the QTL (Supplementary Table S6).Any gene with a cis-eQTL and a significantly correlated expression was considered a good candidate.If the gene only had a cis-eQTL and correlation in a single brain region, then it suggested that this brain region might also be of interest for the phenotype (adding another link to this chain).",
+      "One possible approach to facilitate this endeavor is to identify quantitative trait loci (QTL) that contribute to the phenotype and consequently unravel the candidate genes within these loci. Each proposed candidate locus contains multiple genes and, therefore, further analysis is required to choose plausible candidate genes. One of such methods is to use comparative genomics in order to narrow down the QTL to a region containing only a few genes. We illustrate this strategy by applying it to genetic findings regarding physical activity (PA) in mice and human.",
+      "Network analyses We now have two QTL, and we have picked potentially interesting genes within each, but now we want to build up more evidence for which gene in our QTL interval is causal. The first, and most obvious way, is to see what genes our trait of interest correlates with, in tissues that we expect to be related to the trait. We calculated the Spearman\u2019s correlation between the trait BXD_17850 and all probes with expression data in T helper cells (GN319).",
+      "Another approach to help to determine if a gene located near the mapped QTL would have effects to influence the quantitative trait will be to use genetically engineered mice to determine if altering the expression of a candidate gene will alter the phenotype of interest (38). However, it is possible that a quantitative trait is a combined effect of multiple genes located near the QTL (39).",
+      "With a known QTL and a body of evidence suggesting possible roles for the affected gene, phenotypes can be predicted that may be modulated as a result of this sequence variation. If this phenotype is of interest, it can be directly measured and a traditional \u2018forward\u2019 QTL analysis carried out to confirm the prediction. Such an approach is extremely attractive when the enormous cost and time required for phenotyping a large panel is considered.",
+      "The first step is to narrow down the list of candidate causal genes within a Quantitative Trait Locus (QTL)\u2014a chromosomal region containing sequence variants strongly associated with phenotypic variation. [Fig 1. Interval mapping of oviduct gross pathology across the BXD strains reveals a QTL on distal Chr 3. The LRS values are plotted in blue across the genome and measure the strength of the association between chromosome and Mb position (top and bottom X-axis, respectively) and phenotype expression. Allele contribution is shown by the red (C57BL/6J) and green (DBA/2J) lines. Red and grey horizontal lines indicate genome-]",
+      "A special case is the correlation of the target phenotype with the expression of the priorized gene(s) (RNA or protein amounts). This refers to colocalization of the QTL of the target phenotype with the eQTL position. Correlation can also be examined between the target QTL phenotype and expression of all genes in the QTL interval. If the gene expression strongly correlates with the QTL phenotype, this further strengthens the assumption that this gene might be causal (see Note 12). For performing a correlation analysis: \u2013  Go to the Trait Overview Page, as described in step 3, point 1.",
+      "QTL mapping of traits in mouse cohorts often ends up with a genetic locus, composed of a list of candidate genes. Several studies proposed the use of mediation analysis to identify the causal gene (mediator) between the genetic variant (independent variable) and the trait-of-interest (dependent variable) (Figure 1.4B) [7, 47, 61, 77]. Mediation analysis can be used either on gene expression levels to identify the regulatory mechanisms [7, 47, 61], or on phenotypic traits to discover the potential causal drivers contributing to the phenotypic variances [77] (Figure 1.4C upper).",
+      "1a). Second-generation offspring are then phenotyped and genotyped, and linkage analysis is carried out to identify a region that is associated with the trait. This approach has led to the identification of thousands of quantitative trait loci (QTLs) for various phenotypes and diseases. However, each QTL region is large, often tens of megabases, and contains hundreds of genes. The process of identifying the causal variant and the gene involved is therefore difficult and costly. Of the thousands of QTLs identified, only a small fraction of genes has been identified.",
+      "Network analyses We now have two QTL, and we have picked potentially interesting genes within each, but now we want to build up more evidence for which gene in our QTL interval is causal. The first, and most obvious way, is to see what genes our trait of interest correlates with, in tissues that we expect to be related to the trait. We calculated the Spearman\u2019s correlation between the trait BXD_17850 and all probes with expression data in T helper cells (GN319).",
+      "10 JUNE 2016 \u2022 VOL 352 ISSUE 6291 aad0189-5 RESEARCH | RESEARCH ARTICLE. Solving QTLs: Finding the quantitative trait gene. For cis-QTLs, the causal factors can be quickly identified: With few exceptions, they will be driven by variants within the gene itself or immediately adjacent. For trans-QTLs, mQTLs, and cQTLs, the identification of the causal quantitative trait gene (QTG) is challenging due to the width of the QTLs.",
+      "Once the QTL interval is reduced to a reasonable size, the next step in the process involves sorting through the genes within the interval and attempting to determine which is the QTG. This step is daunting because more than one gene may be involved and the function of some genes within the interval may be unknown. Until recently, this step emphasized the detection of polymorphisms within coding sequence (reviewed in Korstanje and Paigen, 2002 and Glazier et al. 2002); for a polymorphism that produces an amino acid substitution, one can often infer and then test for a functional consequence.",
+      "To understand the genetic networks that underlie quantitative variation in the trait, it is also very important to discover genes whose expression is correlated with the trait after accounting for the known effects of the QTL on the trait. Many of these genes may have expression that is associated with QTL genotype, and would therefore be identified as important via the tests described above. Other  genes, however, may have expression values that are correlated with the trait but unassociated with genotype at the QTL.",
+      "The approach is motivated by the fact that a research project is often focused on a specific classical quantitative trait. If a major QTL for this classical trait has been identified, it is often desirable to test whether this QTL is also associated with the transcription level of any genes, which will provide clues as to which genes belong to the pathway that the QTL uses to modulate the classical trait.",
+      "Confirmation of Candidate Genes The next step is to prove that a particular gene is involved in the quantitative trait under study. This is done by complementation of a QTL, which can be achieved in several ways (9\u201311,40). In principle, transgenic complementation is the most straightforward. This approach has been used successfully to demonstrate that Pla2g2a was the correct candidate gene for Mom1, a modifier of the apcmin allele that causes adenomatous polyposis coli (41).",
+      "So, how do you go about planning and performing a QTL study, and how do you identify the responsible gene within a QTL that you have identified? Generally, one starts by performing a strain survey to find two parental inbred strains that have a markedly different trait. One can now look up many different traits of inbred mice online at the Mouse Phenome Database (http://phenome. jax.org/pub-cgi/phenome/mpdcgi?rtn=docs/home). However, the trait you may want to study may not be present in wild type mice, so you may want to cross a mutant (or genetically engineered) strain onto several inbred strains.",
+      "Along with correlations, this tool also derives new traits representing the principal components (Figure 2d). The user can add these principal components to their Trait Collection and proceed to perform QTL mapping, as in the case of a single trait QTL mapping. The R/QTL (Broman et al. 2003) and R/CAPE (Tyler et al. 2013) packages can be used for deeper analysis of epistasis and pleiotropy for multiple traits and multiple regulatory loci. Prioritizing Candidate Genes. Following the identification of a significant QTL, focus shifts to identifying the particular gene(s) that cause the QTL.",
+      "The investigators first identified all QTLs associated with a classical phenotype and then winnowed the list of potentially associated gene-expression traits on the basis of their correlation or eQTL overlap with the phenotype of interest. Candidate genes then were ranked by applying  the LCMS technique, which uses the eQTL data to establish causal relationships between DNA loci and transcripts as well as between transcripts and phenotypes and finally identifies a model that best fits the data.",
+      "The goal of QTL mapping is clearly the identification and eventual confirmation of candidate genes (QTGs) underlying the phenotype. The evidence required for such confirmation has engendered much discussion (Complex Trait Consortium 2003; Glazier et al. 2002) and is likely to vary depending on the nature of the trait and specific resources available to pin down underlying genes (e.g. availability of knock-in or knock-outs, specific antibodies, siRNA, etc.). The paucity of QTGs meeting such multifaceted standards is testament to the difficulty of narrowing the confidence interval sufficiently to identify and test suitable candidate genes (Flint et al."
+    ],
+    [
+      "Y chromosome in peripheral blood cells increases with age in men (6) and is correlated with increased risk of cancer mortality and Alzheimer's disease (6,7). X chromosome mosaicism in women also increases with age (8), as does autosomal mosaicism in both sexes (9,10). Recent studies have shown that the prevalence of age-related mosaic abnormalities is greater in men than women (9,10); however, mechanisms underlying the sex differences observed in chromosomal mosaicism in humans are unknown.",
+      "Recent reports suggested a role of Y chromosome loss in risk for all-cause mortality and common age-related disease such as cancer, Alzheimer disease as well as severe atherosclerosis [12-20]. Building on such reports, we aimed to evaluate the contribution of male Y chromosome mosaicism to the risk for late-stage AMD.",
+      "Box 1. Sex-specific cytonuclear interactions  Several predictions about the nature of cytonuclear conflicts follow from the patterns of chromosomal inheritance (Table I).In a mated pair of animals, mtDNA is co-transmitted with half of the autosomal genes, two-thirds of the X-linked genes and none of the Y-linked genes [76].This predicts that, relative to the autosomal case, positive nuclear-mitochondrial interactions are more likely to evolve for X-linked loci whereas deleterious interactions between Y-linked genes and mtDNA should accumulate (or cannot be purged efficiently).",
+      "In addition to genetic data, the 9p Network Cohort dataset also lists the gender for all 719 individuals. Of these individuals, 406 individuals are female and 313 are male, indicating a female bias (Binomial test p = 0.0006). This result was surprising considering that no female bias has been previously reported in 9p deletion and duplication syndromes. A possible explanation for the significant bias in the 9p Network Cohort dataset is the XY sex reversal phenotype, which is commonly observed in individuals with 9p deletion syndrome. This phenotype could lead to individuals with XY sex chromosomes being listed in the dataset as having a female gender. To further examine this hypothesis, we subset our dataset to include only the 236 individuals whose sex chromosomes are listed in their genetic information. For this much smaller subset, 125 individuals had female sex chromosomes and 111 had male sex chromosomes, indicating no significant sex bias (Binomial test p = 0.4). We also found no significant gender bias in this group (Binomial test p = 0.2), although we did confirm that four of the individuals with XY sex chromosomes had a gender of female. This comparison suggests that the XY sex reversal phenotype may be responsible for a female gender bias, but not a sex bias, in 9p deletion and duplication syndrome cohorts.",
+      "Duplicated variants with multiple alternative alleles and variants in sex chromosomes X and Y",
+      "Autosome-One of the numbered, or nonsex, chromosomes (1 through 22).X and Y are the sex chromosomes.",
+      "Given such a high abundance of young male-biased genes, we asked whether their parental genes are also male-biased.We found that fewer parental genes of X-linked male-biased duplicates were also male-biased (20%, 2/10) compared to the parental genes of autosomal young male-biased duplicates (32%, 12/37).These data, despite the small sample sizes and being statistically not significant, may suggest that compared to autosomal young genes, X-linked young genes more often evolved novel male-biased expression.However, as the majority of young genes are the result of intrachromosomal duplication events, the pattern might also reflect the fact that X-linked old genes are less likely to be male-biased.",
+      "A slight excess of X-linked female-biased genes was also detected (Fig. 2).Although most of them are old, a few recently arose on the X chromosome over 4 to 6 Myr in the common ancestor of the D. melanogaster and D. simulans clade (branch 5).This can be interpreted in the context of the dominance model of the sexual antagonism hypothesis.In this case, a dominant, X-linked gene that is favorable to females but disadvantageous for males can become fixed.The slow accumulation of female-biased genes in the X reflects an overall low rate of female gene origination, either due to a small dominance effect (the degree of dominance h!1/2), or a minor disadvantageous effect on males (the ratio of fitness effects of male relative to female k!0) along with a favorable effect on females (Vicoso and Charlesworth 2006, Equation 10).",
+      "Regarding the second step in the evolution of male-biased genes, namely X\u2192A transposition, sexual antagonism favorable for autosomal fixation (Vicoso and Charlesworth 2006) and/or MSCI (Lifschytz and Lindsley 1972; Betran et al. 2002) may play a role in this process. On the other hand, the within-chromosomal duplication rate is higher than the between-chromosomal duplication rate (Emerson et al. 2008), which may contribute to the slow pace of X\u2192A transposition.",
+      "It has been observed that male-biased genes in Drosophila are overrepresented on autosomes (Parisi et al. 2003; Ranz et al. 2003). Consistent with this result, a dynamic process that can explain the nonrandom autosomal distribution has also been observed, in which autosomal new genes with X-linked parental genes are often male-biased. Specifically, a significant excess of autosomal testis-expressed retrogenes were identified as RNA-duplicates of X-linked parental genes (Betran et al. 2002). Recently, similar X\u2192A gene traffic was observed in the DNA-level duplication and relocation data set of the Drosophila genus (Vibranovski et al. 2009b), and was further confirmed for DNA-level duplications in the D. pseudoobscura neo-X chromosome (Meisel et al. 2009). In addition, selective extinction of neo-X linked male-biased genes also occurred in D. pseudoobscura (Sturgill et al. 2007). These three lines of genome-wide investigation support a common pattern of out-of-X traffic for male-biased genes, resulting in an enrichment of these genes on autosomes in the long term.",
+      "It has been reported that the initial manifestations of new gene emergence, namely polymorphic duplicates, occur at a lower frequency on the X chromosome, thus indicating that these duplicates are subject to stronger purifying selection (Emerson et al. 2008).Therefore, the excessive fixation of X-linked duplicates might not occur via neutral processes.Positive selection could have facilitated the fixation of X-linked young genes in addition to driving their subsequent sequence evolution.",
+      "Occasionally, Y chromosome DNA is detected in the maternal plasma, and the fetus appears to have female genitalia on sonographic examination.The underlying mechanisms for this include a twin demise, a maternal disorder of sexual differentiation, such as Swyer syndrome, or that the mother has undergone a bone marrow or solid organ transplant from a male donor (Bianchi, 2018;Hartwig, Ambye, Sorensen, & Jorgensen, 2017).",
+      "Because of the differences in sex chromosome number, the sex-unmatched comparison contains internal controls, i.e., in this comparison, genes on the X-chromosome and Y-chromosome (but not those on the autosomes) should show copy number imbalances reflective of a single copy change. We showed that the sample that is not sex-matched had readily detectable differences in aCGH signals for genes on the X and Y chromosomes. No such patterns were evident for the autosomes of the sex unmatched individuals or for the sex chromosomes of the sex matched samples.",
+      "Sex chromosome: The X or Y chromosome in human beings that determines the sex of an individual. Females have two X chromosomes in diploid cells; males have an X and a Y chromosome. The sex chromosomes comprise the 23rd chromosome pair in a karyotype. See also: autosome. Sex-linked: Traits or diseases associated with the X or Y chromosome; generally seen in males.",
+      "X chromosome: One of the two sex chromosomes, X and Y. See also: Y chromosome, sex chromosome Y chromosome: One of the two sex chromosomes, X and Y. See also; X chromosome, sex chromosome",
+      "The male heterogamety (XY) is the most common reported system, but many species have female heterogamety (ZW), and more occasionally, multiple chromosome systems (Almeida-Toledo and Foresti, 2001; Devlin and Nagahama, 2002; Penman and Piferrer, 2008). Given the low resolution of optical microscopy to differentiate sex chromosomes in fish, researchers have looked for an alternative in the tenfold longer meiotic chromosomes to detect mispairing tracts at the synaptonemal complex as an indication of the sex differentiated region with variable success.",
+      "The exclusive female constitution of gynogenetic genomes provides information on the SD system, especially in a XX/XY system, where all female progenies are expected. If ZZ/ZW is the underlying system, male offspring always will be present, but the interpretation is more complex and will depend on the distance of the SD region to centromere and on the viability of WW offspring (Devlin and Nagahama, 2002; Penman and Piferrer, 2008). Induced triploids, on the other hand, are constituted by the combination of two female and one male genomes (Piferrer et al.",
+      "The existence of a maternally silenced X-linked imprinted locus playing a role in social cognition could explain why males (X m Y) are more vulnerable to disorders of social cognition such as autism spectrum disorders than are females (X m X p ).The absence of the expression of this gene would not lead to autism itself, but would eliminate a putative protective factor, making an individual more susceptible to the effects of other ASD-predisposing genetic mutations or environmental factors.",
+      "When meiosis takes place, a pair of chromosomes may fail to separate properly, creating a sperm or egg that has either two copies or no copy of a specific chromosome. This is a sporadic event and it is called nondisjunction. Nondisjunction can lead to an extra chromosome, called trisomy, or a missing chromosome, called monosomy (GHR, 2008l). Down syndrome is an example of trisomy. Individuals who have Down syndrome have an extra chromosome number 21. Turner syndrome is an example of monosomy. Girls who have Turner syndrome have only one X chromosome. This causes them to have short stature and be infertile (NHGRI, 2008l).",
+      "X-Linked Inheritance. X-linked genetic disorders (also called sex-linked) are caused by gene mutations on the X chromosome. Most often X-linked genetic disorders are seen in males. Males inherit the X chromosome from their mother and the Y chromosome from their father. Because males have only one X chromosome, if they inherit a gene mutation on the X chromosome from their mother, they will have the disorder. Examples of X-linked genetic disorders occurring in males include hemophilia and Duchenne muscular dystrophy (GHR, 2008o)."
+    ],
+    [
+      "The project also provides online analysis tools to allow identification of correlations within its data set. GeneNetwork (http://www.genenetwork.org), encompassing WebQTL, is a database of genotypes and complex phenotypes ranging from gene expression to behaviour in standard inbred strains, and six panels of mouse recombinant inbred strains including the two largest sets (BXD and LXS) of approximately 80 strains each. Rat and Arabidopsis populations are also represented. Approximately 1500 phenotypes spanning the 25 year history of these strains are incorporated in this public resource, many of which were retrieved from the literature.",
+      "GN spares the user most of these problem. Data are formatted and normalized, and usually come with good metadata (often in the form of links to more information). This greatly simplifies QTL and eQTL analysis, candidate gene discovery, coexpression analysis, and hypothesis testing [3, 10].",
+      "Suitable for quantitative genetics (QTL mapping) and systems genetics, including correlation and network analysis to compare associations between tissues and between other rodent or human data sets  Description and usage  [32]  [31]  [30]  [11]  References  Many of the Data Sets are amenable to systems genetics mapping and other methods and are accessible at GeneNetwork. The Description and Usage column provides details about the data set and potential usage.",
+      "Bioinformatics All of the genetic analyses were carried out in GeneNetwork, which is an open source bioinformatics resource for systems genetics that exists as both a repository for genetic, genomic and phenotypic data together with a suite of statistical programs for data analysis that includes mapping and evaluating QTLs, examining phenotype/genotype correlations and building interaction networks. QTL mapping The QTL mapping module of GeneNetwork was used to identify QTLs for hippocampal morphometry and radial maze trait data. This module enables interval mapping, composite interval mapping and a pairwise scan option to identify epistatic effects.",
+      "There are four options for QTL mapping on the GeneNetwork website: interval mapping, marker regression analysis, composite interval mapping, and pairscan analysis. In this case, interval mapping was used to compute linkage maps for the entire genome. The log of odds (LOD) score was used to assert that a causal relation exists between a chromosomal location and a phenotypic variant, such as Gsto1 expression variation.",
+      "Webqtl is an online database [110] of linked datasets, including genotype and expression data, covering multiple species including mouse, macaque monkey, rat, drosophila, arabidopsis, plants and humans [60]. While this tool cannot be used to calculate eQTLs, it can be used to find and visualize eQTLs in different species, strains and tissues. It can perform single- and multiple-interval QTL mapping of up to 100 selected traits. Users can also upload their own trait data for populations included in the database. It can also calculate and display trait-correlation matrices and network graphs (also for up to 100 traits).",
+      "GN spares the user most of these problem. Data are formatted and normalized, and usually come with good metadata (often in the form of links to more information). This greatly simplifies QTL and eQTL analysis, candidate gene discovery, coexpression analysis, and hypothesis testing [3, 10].",
+      "Suitable for quantitative genetics (QTL mapping) and systems genetics, including correlation and network analysis to compare associations between tissues and between other rodent or human data sets  Description and usage  [32]  [31]  [30]  [11]  References  Many of the Data Sets are amenable to systems genetics mapping and other methods and are accessible at GeneNetwork. The Description and Usage column provides details about the data set and potential usage.",
+      "QTL MAPPING AND QTG DISCOVERY IN THE RCC. A variety of statistical methods and tools have been developed for QTL mapping and implemented in free software for public use. These methods are well suited for simple backcross and F2 RCC populations. R/qtl (refs. 9, 39) was developed for identification of QTLs and higher order modeling. Another Web-based tool, GeneNetwork or WebQTL (GeneNetwork.org) (ref. 40), was developed for QTL mapping and to explore associations between variants, molecular traits (e.g., gene expression), and higher order phenotypes (e.g., behavior) and facilitate QTG identification.",
+      "This enables gene expression correlation and interval mapping, candidate gene searches and multitrait analyses. Each exported dataset was subject to an interval mapping analysis, which uses GeneNetwork\u2019s embedded MapManager software (Manly et al. 2001) to perform Haley\u2013Knott regression. Empirical P values were derived using 1000 permutations using the incorporated permutation feature of WebQTL. The peak of each statistically significant (P-value <0.05) or suggestive (P-value <0.63) (Lander & Kruglyak 1995) QTL was determined based on empirical P values (Doerge & Churchill 1996). A one-LOD drop-off was used to determine the QTL confidence interval about each peak.",
+      "The peak linkage value and position was databased in GeneNetwork and users can rapidly retrieve and view these mapping results for any probe set. Any of the QTL maps can also be rapidly regenerated using the same Haley-Knott methods, again using functions imbedded in GeneNetwork. GeneNetwork also enable a search for epistatic interactions (pair scanning function) and composite interval mapping with control for a single marker. Data quality control  We used two simple but effective methods to confirm correct sample identification of all data entered into GeneNetwork.",
+      "QTL analysis All QTL mapping for phenotypes was performed using the WebQTL software module of the GeneNetwork (www.genenetwork.org) [34]. Interval mapping to evaluate potential QTLs was calculated from the likelihood ratio statistics (LRS) as the software\u2019s default measurement of the association between differences in traits and differences in particular genotype markers. Another common measure score, the log of the odds (LOD) ratio, can be converted from the LRS (LRS/4.61). Suggestive and significant LRS values were determined by applying 1000 permutations.",
+      "Unlike interval-specific haplotype analysis, which is most useful for narrowing a QTL shared by multiple crosses, genome-wide haplotype analysis requires only phenotype information from many inbred strains and can effectively narrow a QTL identified in only one experimental cross [36]. After narrowing the QTL to an interval that is <5 Mb using these bioinformatics techniques or classical experimental methods, strain-specific sequence and gene expression comparisons are effective for focusing on a few strong candidate genes (Figure 7).",
+      "We considered QTL intervals that achieved genome-wide significance for one phenotype, and genome-wide suggestive for others, as highest priority for candidate gene analysis. The January 2017 BXD genotype file was used. Updated linear mixed model mapping algorithms are now available on GeneNetwork 2 (Sloan et al., 2016), that account for kinship among strains. These new algorithms include GEMMA (Zhou and Stephens, 2012), pyLMM (Sul et al., 2016), and R/qtl2.",
+      "The peak linkage value and position were databased in GeneNetwork and users can rapidly retrieve and view these mapping results for any probe set. Any of the QTL maps can also be rapidly regenerated using the same Haley-Knott methods, again using functions embedded in GeneNetwork. GeneNetwork also enables a search for epistatic interactions (pair scanning function) and composite interval mapping with control for a single marker. Data quality control  We used two simple but effective methods to confirm correct sample identification of all data entered into GeneNetwork.",
+      "There are four options for QTL mapping on the GeneNetwork website: interval mapping, marker regression analysis, composite interval mapping, and pairscan analysis. In this case, interval mapping was used to compute linkage maps for the entire genome. The log of odds (LOD) score was used to assert that a causal relation exists between a chromosomal location and a phenotypic variant, such as Gsto1 expression variation.",
+      "eQTL mapping  QTL mapping was performed with GeneNetwork, an online bioinformatics resource featuring tools for systems genetic and complex trait analysis [9, 35]. QTL mapping involves entering VMB and CP iron data (strain means and SEM) as quantitative traits; the software generates whole-genome interval maps for each trait. The interval maps graphically illustrate phenotype\u2013genotype associations as peaks (QTL) indicating the strength of association between genomic polymorphisms and the quantitative trait throughout the genome.",
+      "Genetic Mapping In this study we utilize GeneNetwork, a database containing phenotypes and genotypes, which also serves as an analysis engine for quantitative trait locus (QTL) mapping, genetic correlations, and phenome-wide association studies (PheWAS) (Sloan et al., 2016; Mulligan et al., 2017; Watson and Ashbrook, 2020). QTL analysis involves connecting phenotype data with genotype data to examine genetic variation in traits controlled by multiple genes and their interaction with the environment (also called complex traits) (Lynch et al., 1998; Myles and Wayne, 2008; Goddard et al., 2016).",
+      "Once the resulting record set of the query is returned, it can be further restricted by selecting relevant records based on attached annotations before forwarding it for further analysis. To map genetic loci associated with mRNA abundance or trait phenotypes, any one of the three QTL mapping functions currently employed by GeneNetwork's WebQTL module can be used. These are 1. interval mapping, 2. single-marker regression, or 3. composite mapping [29,30].",
+      "genenetwork.org/) a set of 3795 markers. Linkage is reported with genome-wide significance levels based on 2000 permutation tests. Two types of QTL mapping analyses\u2013simple mapping using the Haley\u2013Knott regression equation, and composite interval mapping\u2013were utilized in this study. Simple interval mapping was performed to illustrate the significance of any QTLs that regulate the TID. As a secondary analysis, composite interval mapping which controlled for the influence of Tyrp1 was also performed with the goal of identifying any secondary QTLs that may have been masked by the major QTL on Chr 4."
+    ],
+    [
+      "A number of additional ethical implications must be considered. Associating financial investments with the prevention of disease, especially where reproductive decisions are involved, requires sensitivity, caution, and ethical rigor. Funding decisions based on imputed cost-savings must not result in implicit pressure on individuals to violate personal ethics to reduce financial burden on society. As discussions regarding prenatal testing have demonstrated [35], there is risk that \"routinization\" of testing may lead to social or medical expectations of testing in all eligible individuals. These expectations, if linked with financial incentives for the health system, could risk applying implicit pressure on serious, and potentially irreversible, personal decisions. Such expectations, if applied at the population level, could risk becoming normalized, compromising the values of informed consent and individual autonomy.",
+      "With regard to pregnancies affected by a genetic condition identified through population carrier screening, we modeled the decision to terminate affected pregnancies conservatively (0.50). This is despite the literature suggesting rates above 0.90 for elective TOP for conditions such as Down syndrome [33] and SMA [34]. We recognize this issue is controversial, and that laws and ethical positions vary considerably between countries/jurisdictions. Variations in population attitudes based on age, religion, and other factors, as well as the criticality of preserving individual choice, were acknowledged in adopting this highly conservative estimate.",
+      "The use of genetic testing from pre-conception through adulthood is expanding rapidly. As a result of this expansion, new ethical issues are emerging related to genetic testing and informed consent. These new issues create ethical challenges for nurses and all healthcare providers. Currently expanding areas include newborn screening and genetic testing of children. These new ethical challenges will be described below.",
+      "The use of genetic testing from pre-conception through adulthood is expanding rapidly. Psychological risks for parents who are carriers may include parental guilt.",
+      "Ethnic and cultural backgrounds may also play a role in the decisions that families make regarding prenatal testing. Moyer et al. (1999) concluded that Caucasian women more often undergo prenatal diagnoses than African American or Asian women, or Latinas. Furthermore, Awwad et al. (2008) found American couples less inclined to involve extended relatives in the prenatal decision-making process than Native Palestinian couples. Both of these examples clearly indicate that cultural differences can impact the ways in which families negotiate prenatal decisions. Further research needs to investigate how different families engage in such discussions and decision-making processes, especially as prenatal testing becomes more common and better able to predict or prevent a wider range of genetic conditions. Tightly closed ethnic groups remain at high risk of serving as carriers for genetic mutations, but the management of this possibility varies greatly. For example, some Ashkenazi Jewish groups use screening for mutations for Tay-Sachs disease (TSD) as the basis for rabbinical marriage advice; whereas, children born to Amish families in Pennsylvania more often present with glutaric aciduria type 1 (GA1) but, given their beliefs, parents tend not to accept prenatal testing because of the implication of abortion (McKusick, 2000).",
+      "Researchers studying factors that contribute toward a couple's choice to undergo prenatal testing have determined that partners base their decision upon several factors, including, but not limited to: parental beliefs about abortion, attitudes regarding disability and their \"perceptions of the usefulness of having the information revealed by genetic tests\" (Moyer et al., 1999, p. 522). Abortion beliefs constitute a key issue in the decision-making process. Even though a majority of parents receiving abnormal prenatal test results terminate their pregnancies (Redlinger-Grosse, Bernhardt, Berg, Muenke, & Biesecker, 2002), Moyer et al. noted that, when asked, more families reported that they would make use of prenatal testing than would be willing to terminate a pregnancy. The decision to continue or terminate a pregnancy after prenatal testing comprises a joint decision between both parents (e.g., Awwad et al., 2008; Beeson & Golbus, 1985); however, the nature of the conversations leading to the decision and the involvement of extended family members in the decision-making process remains highly understudied.",
+      "The Genetic Divide(s) and Communication  The ability of scientists to \"map\" disease through several generations (Collins, 1999) raises practical and ethical issues of access to resulting opportunities and creates family communication challenges. Currently, prenatal testing for chromosomal diseases has become increasingly common (Moyer et al., 1999). Options such as pre-implantation genetic diagnosis (PGD) can identify over 1,250 disease-related mutations creating an opportunity for parents to select unaffected embryos for implantation in the womb (R. M. Green, 2008). Test results provide potential parents with information that may lead to decisions involving intervention in the genetic makeup of future children. Although some families welcome such options, others may be unable or unwilling to consider such procedures, due to financial concerns or moral/ethical/religious beliefs.",
+      "Privacy Issues  Finally, privacy issues should be seriously considered when the use of genetic testing is contemplated, especially with respect to whole-genome sequencing of healthy people. It is an unanswered question under what circumstances, to what extent, and by what means genetic data should be incorporated into the medical record. Although easy access to such data could be helpful to providers in improving patient care, it remains to be seen how other parties (eg, insurance companies) might act on the data in ways that do not benefit patients. The US Congress acted to prohibit discrimination by employers and health insurers on the basis of genetic testing with the Genetic Information Nondiscrimination Act in 2008, but further safeguards will undoubtedly be needed as the health implications of genetic data become clearer.",
+      "The ethical evaluation of genetic testing in children is traditionally based on the balance of clinical benefits and risks (American Society of Human Genetics Board of Directors and the American College of Medical Genetics Board of Directors, 1995; Andrews, Fullerton, Holtzman, & Motolsky, 1994; Clarke, 1994; Wertz, Fanos, & Reilly, 1994). In the early 1990s, when there were only scant data about children who had received genetic tests results, the presumption was to give greater weight to the potential risks and to restrict testing. However, this criterion is not necessarily consistent with the general practice of respecting broad parental discretion in health care decisionmaking for and on behalf of their children. In general, parents are the presumed decision makers for their children and their decisions are respected unless they are abusive or neglectful (Buchanan & Brock, 1989; Goldstein, Freud, & Solnit, 1979; Ross, 1998). The tension between assessments of benefits and risks made by health care providers and policy makers, and the procedural respect owed to parental authority will be clearly tested as the ability to conduct and interpret whole-genome sequencing and related technologies gain in momentum.",
+      "Ethical Considerations in Developing Policy for ''Comprehensive'' Genomic Testing  In the near future, genomic testing is likely to become more accessible and will provide both information about the risks of common conditions such as heart disease, diabetes, and hypertension as well as predictions about individual responses to specific pharmaceuticals and other medical therapies (Aspinall & Hamermesh, 2007). Over time, the number and range of conditions for which such testing is available is likely to expand to include more behavioral traits, ranging from information about anxiety and depression, to attention and addiction (Rothstein, 2005).",
+      "Objective: Ethical evaluation of genetic testing in children is traditionally based on balancing clinical benefits and risks. However, this focus can be inconsistent with the general practice of respecting parental decision-making about their children's health care. We argue that respect for parental decision-making should play a larger role in shaping pediatric genetic testing practices, and play a similar role regarding decisions to use emerging genomic technologies. Methods: Genomic testing involves the examination of thousands of DNA markers spanning genes throughout the genome and their interrelationships, yielding virtually limitless interpretations. We presume that parents and providers should proceed cautiously in applying genomic testing in children, as we explore how genomic testing will stress the fault lines of the traditional ethical analysis. Results: Empirical data about the psychosocial risks and benefits of genetic testing of children do not reveal serious harms, yet virtually no such data exist yet about genomic testing. Unless empirical social and behavioral data indicate that genomic testing is highly likely to cause serious harms to the children, parental decisions to obtain comprehensive genomic testing in their children should be respected. Once comprehensive genomic testing of children becomes routine, resultant information may be more easily integrated by families than anticipated. Conclusions: Research on the social and behavioral impact of comprehensive genomic testing on children and their families is needed to further inform parents, clinicians, and policy makers.",
+      "Objective: Ethical evaluation of genetic testing in children is traditionally based on balancing clinical benefits and risks. However, this focus can be inconsistent with the general practice of respecting parental decision-making about their children's health care. We argue that respect for parental decision-making should play a larger role in shaping pediatric genetic testing practices, and play a similar role regarding decisions to use emerging genomic technologies. Methods: Genomic testing involves the examination of thousands of DNA markers spanning genes throughout the genome and their interrelationships, yielding virtually limitless interpretations. We presume that parents and providers should proceed cautiously in applying genomic testing in children, as we explore how genomic testing will stress the fault lines of the traditional ethical analysis. Results: Empirical data about the psychosocial risks and benefits of genetic testing of children do not reveal serious harms, yet virtually no such data exist yet about genomic testing. Unless empirical social and behavioral data indicate that genomic testing is highly likely to cause serious harms to the children, parental decisions to obtain comprehensive genomic testing in their children should be respected. Once comprehensive genomic testing of children becomes routine, resultant information may be more easily integrated by families than anticipated. Conclusions: Research on the social and behavioral impact of comprehensive genomic testing on children and their families is needed to further inform parents, clinicians, and policy makers.",
+      "To the extent that ''personal meaning'' gains wider acceptance as a legitimate criterion for expanding the availability of new tests and applications of genomic technology, the current policies and practices of restricting some genetic testing of children and mandating other tests will need to be reevaluated. There will be some parents who will find the information that becomes available through new technologies and data useful in shaping their parenting practices, while others will be more skeptical of their value. These disparate parental judgments may be independent of professional assessments of clinical validity and utility. Extrapolating from the empirical data about predictive genetic testing of children in at-risk families discussed earlier, we speculate that once comprehensive genomic testing of children becomes routine, the information may be more easily integrated by families than might be predicted. This is not meant to imply that whatever information parents want about their children should be provided carte blanche. Clearly, education and counseling will be crucial to ensure that families understand the limitations of the information. However, restrictions and mandates should be based on a criterion of risk of serious harm (Diekema, 2004). Given the lack of data confirming harm and the related data that indicate children may fare better than anticipated, such restrictions and mandates cannot be justified. Policies and practices will also need to clarify the role of the older adolescent in the decision-making process, although the issues related to balancing and assessing parental and adolescent interests and preferences goes beyond the focus of this article. This is also not meant to ignore the professional and moral obligation to educate parents and to help parents make good decisions on behalf of their children. It is morally appropriate for providers to strongly recommend particular tests in infancy and young childhood (i.e., PKU testing), and to strongly discourage other tests (e.g., ApoE testing of children for adult onset Alzheimer disease and heart disease because ApoE is not predictive but only provides an increased relative risk and has limited sensitivity and specificity) (Roberts, Cupples, Relkin, Whitehouse, & Green, 2005). Selective and directive recommendations are a routine aspect of pediatric practice. However, it will become increasingly important for professional organizations to begin to reconcile their support for mandatory genetic testing for some conditions and their support for restrictions for other conditions with the broad discretion that parents have and need in the health care arena in order to promote their children's well-being.",
+      "What limits should be imposed, if any, need to be determined prior to commercial feasibility. In this article, we consider how genetic testing decisions for children have been made traditionally and how the anticipation of comprehensive genomic testing in the near future will stress the fault lines of traditional approaches. The potential for comprehensive genomic testing in children could shift the equilibrium towards expanding or reducing parental discretion, and forces us to reexamine the evidence for our genetic testing policies and practices. We will highlight specific domains where further empirical social and behavioral research is necessary to inform policy and practice.",
+      "Prenatal genetics is largely practiced by maternal-fetal medicine specialists due to severe deficiency in the number of qualified clinical geneticists. Recent years have witnessed a tremendous growth in the demand for chorionic villous sampling and amniocentesis for the diagnosis of single gene disorders. At KFSHRC alone, the number of prenatal samples that are tested for single gene disorders has increased from 5 in 2004 to 250 in 2013. Therapeutic abortion is permitted by law if performed within 120 days from the time of fertilization in order to comply with the Islamic view of the timing of ensoulment (Alkuraya and Kilani 2001). However, the approved indication for the procedure, which is \"severe malformation\", must be authorized by three attending-level physicians. The definition of \"severe\" is left to the discretion of the medical team after consulting with the family. For example, intellectual disability is a common indication for many therapeutic abortion procedures. Contrary to commonly held views, we have shown that early prenatal diagnosis is the method of choice for couples who had one or more children with single gene disorders, as long as they are provided with a culturally sensitive genetic counseling that addresses their religious and cultural concerns (Alkuraya and Kilani 2001). Nearly 45% of these couples opt for early prenatal diagnosis compared to 35% who choose preimplantation genetic diagnosis (PGD) (Alkuraya 2013a). PGD is available freely at KFSHRC but is also provided by the private sector. Noninvasive prenatal screening using cell-free fetal DNA in maternal blood is quickly becoming integrated in prenatal care. KFSHRC offers this test routinely to all pregnant women regardless of their perceived risk and the MOH is considering making this test available throughout its vast network of hospitals and medical centers.",
+      "Social and psychological implications of accessing genetic services and information.",
+      "A corollary of the predictive power of genetic information is the limited ability to prevent or treat many conditions with significant genetic factors involved. Indeed, virtually all of the complex ethical and legal issues relevant to genetic testing would disappear if there were effective preventions or treatments available for genetic conditions. The ability to predict future disease in conjunction with a limited ability to do much about it has important social and psychological implications that must be addressed in conducting genetic research.",
+      "Interpretations of the literature will likely mirror the priorities and evaluative tendencies of the reader. Are you willing to accept the overall trends in genetic and genomic testing evaluation and to trust that the existing clinical approaches will apply informed consent appropriately while identifying and supporting the rare individual who has a serious adverse response to the testing? If so, you might advocate that attention be turned more toward other issues relevant to the effective implementation of genetic and genomic testing. Or do you feel a strong need to understand in more detail the possible psychosocial harms of the testing, particularly the subtler impacts or responses of individuals who do not fit the norm? In that case, you would likely encourage renewed and innovative efforts to study the psychosocial consequences of the receipt of risk information from genetic and genomic testing.",
+      "Other social issues require our attention if genomic medicine is to benefit our patients. How should genetic tests be regulated? What, if any, are the appropriate uses of direct-to-consumer marketing of genetic tests? The Internet has recently had a proliferation of genetic-testing sites that feature claims grounded in greed and pseudoscience, rather than in data or reality. How will health care providers and the public distinguish between these and responsible testing services, whether they are available through the Internet or in the hospital?",
+      "Environmental Factors  As widespread use of genetic testing increases, it is the responsibility of the medical community to ensure its equitable use across socioeconomic and cultural spectrums."
+    ],
+    [
+      "Gene editing has gained considerable interest with the identification of the CRISPR-Cas9 system [27], which allows for a targeted modification in the DNA sequence of an organism. Researchers can utilize their knowledge of the basic biology of the gene and its protein function to precisely change the DNA sequence, thus altering the protein function of the gene and allowing for edits to stay within the species. Researchers at the University of Missouri used the CRISPR-Cas9 system to modify the CD163 gene such that the PRRS virus is not able to replicate inside the pig [28]. This slight modification of the swine genome through gene editing keeps the pigs from succumbing to PRRS which has an annual estimated loss to the United States swine industry of over $660 million per year. Despite this benefit, given the public's concerns over food safety, it is likely that approval for such technology is years away in the US, Canada and Europe. However, in some cultures, there is a wide range of non-livestock species that are consumed. Therefore, it is conceivable that these countries and cultures may be open to transgenic/gene edited livestock. They may see the importance of useful gene editing which may lead to approval and consumption of reasonable genetically edited animal products such as those with modifications that are already found in nature or those that offer a substantial welfare benefit to society.",
+      "As a researcher who has devoted an entire career since 1994 to the development of genome editing tools and methods, I have been amazed by the rapid progress in the field over the last few years. Considering the widespread use of the tools, I am sure that the pace will continue to accelerate. Indeed, programmable nucleases may eventually enable humans, products of evolution, to become masters of evolution. [...] delivered preassembled recombinant Cas9-guide RNA ribonucleoproteins (RNPs) into animal embryos [6,9] and plant [11] and mammalian cells [73][74][75]. Indeed, Cas9 RNPs were rapidly turned over in cells [73], reducing off-target effects and mosaicism in gene-edited organisms [11]. Cas9 RNPs can be delivered into cells by various methods, including microinjection [6,9], electroporation [73], lipofection [74] and protein transduction [75]. Importantly, and unlike in conventional gene therapy, where therapeutic genes are delivered via plasmids or viral vectors, Cas9 RNP delivery does not involve the use of exogenous DNA; host innate immune responses against foreign DNA are not elicited, and undesired integration of foreign DNA into the host genome is avoided.",
+      "In comparison to a transgenic approach, a gene editing technique such as CRISPR-Cas9 offers the advantage that gene-edited crops are not considered genetically modified organism (GMO) in some countries, such as the US, where the demand for natural food colorants such as anthocyanins is high. Indeed, the use of GMO crops as a source of natural pigments may be inconsistent with consumer interests. However, carrot cultivars engineered with either the transgenic or gene editing approach have not been reported so far, but their development is possible.",
+      "The notable accuracy and versatility of CRISPR-Cas for genome editing also opened the door to its use in preclinical and translational settings. In the latter case, CRISPR in vivo gene editing has led to several proof-of-concept studies that would have been unachievable without it, as in the first ever correction of inherited pathogenic mutations linked to degenerative disease in a living organism [22] and even shown to be possible in human embryos [23,24]. It also has great potential in the field of precision medicine as large-scale population DNA sequencing studies have provided vast amounts of information linking particular diseases with specific genetic mutations which could, in theory, be targeted through CRISPR [25,26]. This could be used during the identification and validation of potential DNA targets during the development of personalised drug or cell therapies, which will require the generation of engineered cell lines and/or animal models. Techniques such as HDR-mediated gene targeting are too labour intensive, with low targeting efficiencies and long times necessary for their establishment, and consequently are not ideally suited for drug discovery purposes. Conversely, CRISPR-Cas has been proven to be efficient for editing virtually any kind of cell line, from primary immune cells to induced pluripotent stem cells (iPSCs) [27,28]. Additionally, CRISPR can also be used for functional screening in the development of combined inhibitory therapy aimed at strengthening the efficiency of targeted therapeutics. An example of the latter is shown in a study where a variation of the technology known as CRISPR interference (CRISPRi) was used in genome-wide scale to identify different survival pathways used by cancer cells after oncogene inactivation and allowing the identification of successful combination therapies [29]. In terms of translational applications, the overall safety of CRISPR genome editing in humans will require long-term scrutiny before its adoption in the clinic. Nonetheless, a number of CRISPR-based clinical trials are currently in progress, including studies focused on targeting patients' own T cells in order to improve the immune response towards some forms of malignant cancer [30,31], and others aimed at correcting pathogenic mutations in the hematopoietic cells of patients with beta-thalassemia and sickle cell disease [32].",
+      "Caveats and Ethical Concerns of CRISPR-Cas Applications  Despite the presence of both a PAM sequence and a specific gRNA, the CRISPR-Cas9 system is not infallible. In fact, DSBs can occur at different sites in the genome, potentially causing so-called \"off-target\" effects. This eventuality remains to date the biggest concern in the field, as possible undesirable modifications must be properly identified and followed in order to guarantee safety for medical purposes. Nevertheless, there is still little evidence of the biological consequence of Cas9 off-target effects. Two recent studies describe new methods to investigate potential off-target effects in both mammals and plants [33,34]. In both cases, whole-genome sequencing revealed that selective nucleotide changes, such as conversion of an adenine to a guanine, caused off-target occurrence very rarely, with a frequency comparable to the one of spontaneous mutations. However, substitution of a cytosine with a thymidine was linked to a sizable number of off-target mutations. This newly acquired information adds to the plethora of studies conducted on the safety of CRISPR, which altogether highlight the need for the establishment of clinical standards for the future use of genome-editing techniques in the clinic. Despite this and other technical challenges still ahead for CRISPR genome editing, the pace at which this technology has developed in recent years suggests many of these concerns could be addressed soon, as long as proper ethical guidelines and regulatory mechanisms are established.",
+      "Conclusions  There is no reason to doubt that the development of CRISPR-Cas genome editing represents an unprecedented breakthrough in modern science, as it has potential applications in a wide array of disciplines ranging from agriculture, zoology and renewable energy to biomedicine and synthetic biology. This powerful tool holds promise for further elucidating the molecular causes of ageing by allowing scientists to probe genetic and epigenetic pathways with a level of sophistication that was unattainable just a few years ago. It will do so in traditional animal and cell models of ageing, but it will also drastically accelerate the generation of refined versions of those models or even allow the development of new research approaches in non-model organisms. Moreover, CRISPR-based genome editing is already having a significant impact in research aiming to understand the cellular and molecular origins of age-related diseases, as well as developing potential treatments against them. The application of CRISPR-Cas gene editing for the treatment of age-related diseases is not over the horizon yet, as it will require the identification of causative genes and their role under a variety of contexts that could be as diverse as the ageing process is across individuals. However, CRISPR-Cas might also hold the key for solving such a conundrum, as it has opened the way for achieving true personalised medicine by providing both the precision and scalability required for conducting genome-wide functional screens during the refinement of drug- and cell-based therapies for age-related diseases.",
+      "Since its discovery, CRISPR-Cas technology has ignited a biological revolution by providing a highly versatile platform that allows fast and efficient genome editing in an ever-growing list of organisms. In this chapter we will first describe the most recent advances in the development and application of the CRISPR-Cas platform in biomedical research. Then we will discuss the most recent and notable basic research applications of this technology in the study of the molecular causes of ageing. Finally, we will review how CRISPR-Cas has been used for creating new models for the study of age-related diseases, as well as for manipulating disease-associated gene pathways.",
+      "Caveats of advanced genome editing tools  Off-target effects. The DNA-binding domains of ZFNs and TALENs need to be very specific for the target site to avoid off-target cleavage, which results in unwanted mutations and potentially cytotoxic effects [27]. CRISPR/Cas9 is also known to generate off-target alterations, albeit apparently at low incidence [28,29], since mispairing is allowed between the guide RNA and the genomic DNA. Nonetheless, caution is required in their design and use. Some strategies involving the optimization of the guide RNA/Cas9 include using software tools to predict potential off-target sites (http://omictools.com/crispr-cas9-c1268-p1.html), truncating the guide RNA (<20 nucleotides) to decrease off-target mutagenesis [30], lowering the dosage of guide RNA and Cas9 plasmids, and decreasing the number of mismatches between the guide RNA and the genomic DNA. A \"double nick\" system with Cas9 nickase, which contains a single inactive catalytic domain, may also be used [31\u201333]. Figure 1: Genome editing methodologies which can be applied to human pluripotent stem cells. Homologous recombination (HR), or the more advanced tools such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) or clustered regularly interspaced short palindromic repeat (CRISPR)/Cas system can be applied to human pluripotent stem cells (hPSCs) either to 1) create naturally occurring mutations or 2) repair a mutation to generate isogenic controls in hPSCs, to understand the function of a gene of interest.",
+      "CRISPR screening technologies  The discovery of CRISPR-Cas9 as a sequence-specific programmable nuclease democratized gene editing and fueled progress in forward genetic screening [20, 66]. Genetic screens using Cas9 with a pooled single-guide RNA (sgRNA) library allow the interrogation of seemingly all genes in a genome in a single experiment [96, 97]. Engineered Cas9 variants further extend the versatility of forward genetic screening. Catalytically inactive Cas9 (dCas9) fused with chromatin effector domains permits specific activation (CRISPRa) or inhibition (CRISPRi) of gene expression [37, 54]. Recently developed and emerging technologies (base editors, prime editors, and Cas transposases) are beginning to enable new types of genetic screens with directed, controlled, and on-demand mutations by allowing the creation of user-specified modifications, such as single base conversion, deletions, and insertions [4, 42, 58].",
+      "Coming on the heels of engineered nucleases, CRISPR-Cas9 tools have accelerated the pace of genomic research by permitting highly efficient knockouts or edits of virtually any gene in cells or model organisms. Multiple CRISPR-Cas9-based clinical trials are in progress or are expected to begin soon. Although Cas9-engineered cells haven't yet demonstrated efficacy at scale, early trial results suggest that such cells are stable and don't cause acute adverse reactions in humans. Long-term safety is yet to be determined. Current applications largely focus on single-gene disorders for which gene editing can be carried out ex vivo on appropriate cells, such as bone marrow hematopoietic stem cells in the case of sickle cell anemia. Exploration is under way to develop delivery systems that can target the gene-editing apparatus to the appropriate tissue in vivo.",
+      "Over the past 8 years, CRISPR (clustered regularly interspaced short palindromic repeats)-Cas9 (CRISPR-associated protein 9) technologies have emerged as accessible and adaptable tools for studying and altering genomes. CRISPR-Cas9 can be used to induce genome edits by creating targeted DNA breaks that trigger site-specific DNA repair. In next-generation formats, it can also control the transcriptional output of genes or alter genome sequences using a process of nucleotide base editing that does not require repair of DNA breaks. As these technologies continue to mature, it will become increasingly possible to alter cellular genomes efficiently and accurately.",
+      "The type II CRISPR-Cas9 systems, repurposed from prokaryotic adaptive immune responses, are now widely used for targeted genome modifications in plants, animals, and human cells (Kim et al. 2014;Woo et al. 2015;Zuris et al. 2015).In particular, Cas9 nucleases have shown promise for gene and cell therapy (Maeder and Gersbach 2016).Typically, these nucleases are expressed or delivered in vivo using plasmid DNA or viruses (Yin et al. 2014;Ran et al. 2015).However, plasmid DNA delivery is often inefficient, especially in vivo, and can cause integration of small plasmid fragments degraded by endogenous nucleases at on-target and offtarget sites in the genome (Kim et al. 2014).Viral delivery of Cas9 can be highly efficient in vivo (Ran et al. 2015;Long et al. 2016;Nelson et al. 2016;Tabebordbar et al. 2016), but may be hampered by antibodies or T cells induced against the protein (Shankar et al. 2007;Calcedo et al. 2015;Chew et al. 2016).We and others have shown that preassembled Cas9 ribonucleoproteins (RNPs) can be delivered to human primary and stem cells and mice to modify target genes (Kim et al. 2014;Schumann et al. 2015;Zuris et al. 2015).Cas9 RNPs are rapidly turned over in cells, reducing off-target effects.Furthermore, Cas9 RNPs are unlikely to be limited by host immune systems because they function and disappear before the generation of antibodies and T cells directed against them.Currently, despite these advantages of RNPs, the difficult delivery of Cas9 RNPs in vivo limits its utility for therapeutic applications (Zuris et al. 2015).Here, we show that in vivo genome editing of an wild-type gene, whose up-regulation is responsible for pathogenesis, could be a new therapeutic modality for the treatment of nongenetic degenerative diseases.Our ultimate goal is to harness Cas9 RNPs for a clinical application of therapeutic genome surgery in patients with AMD.",
+      "Clustered regularly interspaced short palindromic repeat (CRISPR)-Cas nucleases have revolutionized the field of gene editing and have tremendous application in the field of molecular medicine [98][99][100][101][102].Despite a significant surge in CRISPR/Cas9mediated genome editing in various disease models, the progress in the field of AD has lagged behind substantially.We believe that genome editing can significantly improve the development of AD models and also create novel opportunities for the development of the next generation precision targeted AD gene and stem cell therapies.Since there are several excellent review articles on CRISPR/Cas9-mediated genome editing, here we will limit our focus on select recent articles that are noteworthy.CRISPR/Cas9 system can be engineered to either activate transcription (gain-of-function) or achieve gene silencing (Loss-of-function).Dahlman et al. have developed a CRISPR-based system that uses catalytically active Cas9 and distinct single guide (sgRNA) constructs to activate and knockout different genes in the same cell [103].Konermann et al. have used structure-guided engineering of a CRISPR-Cas9 complex to mediate efficient transcriptional activation at endogenous genomic loci [104].Using crystallographic studies, they have engineered a combination of sgRNA2.0,NLS-dCas9-VP64 and MS2-p65-HSF1 to develop one of the most effective transcription activation system.",
+      "Limitations of CRISPR-Cas9  CRISPR provides a simple and easy tool not only for in vitro use but potentially also for in vivo genome editing. However, there are limitations and downsides to this approach. First, and despite considerable improvements in the technology, the risk of the off-target effect remains and must be considered carefully. Second, DSB may lead to wide-ranging deletions or recombination events involving the on-target site (204). Third, in cycling cells, DNA double strand breaks caused by Cas9 cleavage may trigger a P53 response leading to apoptosis and enrichment for potentially oncogenic P53-deficient cells (205,206). Fourth, subjects may generate antibodies to Cas9, potentially limiting gene therapies (207,208).",
+      "Genome editing tools that target the desired genomic region and allow for variants to be altered (e.g. from risk to protective), or for more substantial changes to be made (e.g. the deletion of a longer stretch of DNA harbouring a number of variants), can help to answer each of these questions. These technologies are evolving rapidly (Figure 1 and Table 2). The most recently developed of these, Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) technology, originally developed by Doudna, Charpentier and their colleagues (72,73) and Zhang and his colleagues (50), has become a widely used tool for this purpose. Engineered CRISPR/Cas9 technology uses a guide RNA (gRNA) to direct CRISPR-associated endonuclease (Cas) to the target DNA and generate a double strand DNA break. Correction of a mutation or variant in the target DNA sequence can then be carried out by homology-directed DNA repair (HDR) with a donor template. Since its discovery eight years ago, CRISPR technology has evolved quickly to be a critical part of the molecular biologist's toolbox.",
+      "INTRODUCTION  Genome editing technologies based on the clustered regularly interspaced short palindromic repeats (CRISPR)-associated endonuclease Cas9 enable rapid and efficient modification of endogenous genes in a variety of cell types, allowing for analysis of gene function in many organs in vivo.CRISPR-Cas9 induces DNA double strand breaks (DSBs) at single-guide RNA (sgRNA)-specific loci in the genome, which are repaired through either non-homologous end-joining (NHEJ) or homology-directed repair (HDR) pathways.While NHEJ introduces unpredictable pattern of insertion or deletion (indel) mutations, HDR directs a precise recombination event between a homologous DNA donor template and the damaged DNA site (Cong et al., 2013;Cox et al., 2015;Doudna and Charpentier, 2014;Heidenreich and Zhang, 2016;Jinek et al., 2012;Mali et al., 2013;Sander and Joung, 2014;Wang et al., 2013;Yang et al., 2013).Thus, HDR can be used to precisely introduce sequence insertions, deletions or mutations by encoding the desired changes in the donor template DNA.",
+      "CRISPR technology has rapidly changed the face of biological research, such that precise genome editing has now become routine for many labs within several years of its initial development.What makes CRISPR/Cas9 so revolutionary is the ability to target a protein (Cas9) to an exact genomic locus, through designing a specific short complementary nucleotide sequence, that together with a common scaffold sequence, constitute the guide RNA bridging the protein and the DNA.Wild-type Cas9 cleaves both DNA strands at its target sequence, but this protein can also be modified to exert many other functions.For instance, by attaching an activation domain to catalytically inactive Cas9 and targeting a promoter region, it is possible to stimulate the expression of a specific endogenous gene.In principle, any genomic region can be targeted, and recent efforts have successfully generated pooled guide RNA libraries for coding and regulatory regions of human, mouse and Drosophila genomes with high coverage, thus facilitating functional phenotypic screening.In this review, we will highlight recent developments in the area of CRISPR-based functional genomics and discuss potential future directions, with a special focus on mammalian cell systems and arrayed library screening.",
+      "CRISPR technology has rapidly changed the face of biological research, such that precise genome editing has now become routine for many labs within several years of its initial development.What makes CRISPR/Cas9 so revolutionary is the ability to target a protein (Cas9) to an exact genomic locus, through designing a specific short complementary nucleotide sequence, that together with a common scaffold sequence, constitute the guide RNA bridging the protein and the DNA.Wild-type Cas9 cleaves both DNA strands at its target sequence, but this protein can also be modified to exert many other functions.For instance, by attaching an activation domain to catalytically inactive Cas9 and targeting a promoter region, it is possible to stimulate the expression of a specific endogenous gene.In principle, any genomic region can be targeted, and recent efforts have successfully generated pooled guide RNA libraries for coding and regulatory regions of human, mouse and Drosophila genomes with high coverage, thus facilitating functional phenotypic screening.In this review, we will highlight recent developments in the area of CRISPR-based functional genomics and discuss potential future directions, with a special focus on mammalian cell systems and arrayed library screening.",
+      "The recent development of clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9 for experimental purposes has dismantled the perception that genome editing technology is off-limits for screening in mammalian systems (Heintze et al., 2013).Since this system employs the basic principle of Watson-Crick base pairing for gene targeting, generation of libraries with whole-genome target coverage is relatively easy and cost-effective.For instance, simple protocols are available to synthesize pooled lentiviral libraries by in silico design of oligonucleotides, which can then be cloned, packaged and delivered to cells by viral transduction (Paddison et al., 2004;LeProust et al., 2010).Similarly, the generation of arrayed libraries can be achieved by following protocols originally developed for arrayed shRNA library production that have been in use for a number of years (Moffat et al., 2006).All in all, the stage is set for CRISPR to make an enormous impact on genomic screening and thus scientific discovery in the coming years, and recent demonstrations of this system have shown great promise (Shalem et al., 2015).However, a number of technical challenges must be addressed in order to maximize the benefit of this technology.In this review, we will discuss current applications of CRISPR in functional genomics and provide a perspective on future developments in this area.",
+      "Genome editing for crop improvement  Reports of CRISPR-Cas9-based genome editing first appeared in 2013 (Cong et al., 2013; Feng et al., 2013; Mao et al., 2013). Since then, genome editing technologies have proven to be powerful and efficient tools for the improvement of many crop species. At present, genome editing has been widely used to introduce/modify agronomically important traits, such as increased yield, improved nutritional quality, and resistance to biotic and abiotic stresses, in multiple crops, including rice, wheat, maize, tomato, and potato (Lu et al., 2017; Soyk et al., 2017; Tang et al., 2017; D'Ambrosio et al., 2018; Ye et al., 2018; Miao et al., 2019; Zhang et al., 2019; Zhong et al., 2019; Butt et al., 2020; Zhang et al., 2020c; Li et al., 2021b; Zhan et al., 2021). CRISPR-Cas-based genome editing has been extended to targeted mutagenesis, base editing, and precisely targeted gene/allele replacement or tagging in plants. Importantly, using CRISPR-Cas9 technology, transgenes present in the genomes of genome-edited plants can be removed by chromosomal segregation via a simple self-pollination or hybridization step. Gene editing technologies continue to be developed and utilized (Mao et al., 2013; Lu and Zhu, 2017; Lu et al., 2020)."
+    ]
+  ]
+}
\ No newline at end of file
diff --git a/gnqa/paper1_eval/src/data/datasets/old/experts_general2_dataset.json b/gnqa/paper1_eval/src/data/datasets/old/experts_general2_dataset.json
new file mode 100644
index 00000000..20e7d003
--- /dev/null
+++ b/gnqa/paper1_eval/src/data/datasets/old/experts_general2_dataset.json
@@ -0,0 +1,128 @@
+{
+  "question": [
+    "What about recombination in human centromeres?",
+    "What about recombination in the human genome?",
+    "How can I add a new species to the GeneNetwork database?",
+    "Describe the role of mitochondrial DNA in heredity and how it differs from nuclear DNA.",
+    "what is ensembl?"
+  ],
+  "answer": [
+    "Human centromeres contain the CENP-B box, a 17-bp motif that is bound by the centromere protein CENP-B. This motif is present on every chromosome except for the Y chromosome. It is necessary for the formation of de novo centromeres on artificial chromosomes, but it is not essential for the formation of neocentromeres. Mice lacking CENP-B are viable and fertile.",
+    "The text mentions the application of site-specific recombinase technology, which allows investigators to engineer genes in the mouse that will allow for the deletion, insertion, inversion, or exchange of chromosomal DNA with high fidelity. However, it does not provide specific details about recombination in the human genome.",
+    "The text does not provide specific instructions on how to add a new species to the GeneNetwork database. It is recommended to contact the administrators or operators of the GeneNetwork database for guidance on this process.",
+    "Mitochondrial DNA (mtDNA) is inherited solely from the maternal lineage, unlike nuclear DNA which requires both paternal and maternal contributions. This uniparental transmission of mtDNA is ensured by complex mechanisms that eliminate paternal mitochondria from sperm during fertilization. mtDNA also exhibits a higher mutation rate than nuclear DNA, leading to significant population-level mtDNA polymorphisms. These polymorphisms can be clustered into distinct haplogroups that represent major branch points on the mitochondrial phylogenetic tree. In contrast, nuclear DNA undergoes bi-parental recombination.",
+    "Ensembl is a joint project between the EBI and the Wellcome Trust Sanger Institute. It is a publicly available web resource that contains automatically annotated genomes and integrates data from a wide range of biological research sources. Launched in 1999, the Ensembl database was the first to provide a window on the draft genome, curating the results of a series of computational analyses. It provides comprehensive information about the human genome including variants like SNPs, insertions, deletions, and somatic mutations for several species. Ensembl also offers tools for data retrieval and analysis."
+  ],
+  "contexts": [
+    [
+      "Primate and mouse centromeric DNA contain the CENP-B box, a 17-bp motif that is bound by the centromere protein CENP-B. In humans, the CENP-B box is present on every chromosome except for the Y chromosome [20]. It is paradoxically necessary for formation of de novo centromeres on artificial chromosomes, but it is not essential for the formation of neocentromeres, and mice lacking CENP-B are viable and fertile [21,22].",
+      "Box 3 Mechanism of homologous recombination and end joining  The severe phenotype of the mouse mutants and the highly cancer-prone human syndromes highlight the importance of homologous recombination.Mouse KU mutants display sensitivity to agents that lead to breaks in DNA, and have immunological problems because the KU proteins are involved in V(D)J recombination of antibody gene sequences.In addition, these mutants display poor development, several features of premature ageing and increased apoptosis of postmitotic neurons in the developing brain.Mice with defects in DNA-PK cs (SCID mice) display a similar but generally milder phenotype.In contrast, XRCC4-and ligase IV-knockout mice seem more severe, with late embryonic lethality resulting from massive ATM-and p53-dependent neuronal apoptosis 33,38 .",
+      "Cells in G1 have only the homologous chromosome for recombination repair.However, this may be difficult to find in the complex genome.Moreover, it is potentially dangerous as a template for repair as it may lead to homozygosity for recessive mutations.As an alternative, the end-joining reaction simply links ends of a DSB together, without any template, using the end-binding KU70/80 complex and DNA-PK cs , followed by ligation by XRCC4-ligase4 (reviewed by 27,33; see the right panel of the figure, stages V-VII).The function of KU70/80 might involve end protection and approximating the ends, in addition to a signalling function by DNA-PK cs .End joining may be further facilitated when the ends are still held together through nucleosomes or other structures.End joining is sometimes associated with gain or loss of a few nucleotides if internal microhomologies are used for annealing before sealing.This implies the involvement of DNA polymerases and/or nucleases.Note that the KU complex is also involved in telomere metabolism 27,62 .found to be lethal 34 .Inactivation of ATR by itself is inviable already at the blastocyst stage.Inactivation of BRCA1 and BRCA2 in mice is also embryonically lethal; cell lines display defects in homologous recombination [35][36][37] .",
+      "371  A tentative scenario for the homologousrecombination reaction is depicted in the left panel of the figure.To promote strand invasion into homologous sequences, the 5\u1371-3\u1371 exonuclease activity of the RAD50/MRE11/NBS1 complex (also a substrate for ATM phosphorylation) exposes both 3\u1371 ends 30 (I).RPA facilitates assembly of a RAD51 nucleoprotein filament that probably includes RAD51-related proteins XRCC2, XRCC3, RAD51B, C and D. RAD52 stimulates filament assembly (II).RAD51 has, like its Escherichia coli RecA counterpart, the ability to exchange the single strand with the same sequence from a double-stranded DNA molecule.Correct positioning of the sister chromatids by cohesins probably facilitates the identification of a homologous sequence.A candidate for the complex chromatin transactions associated with these DNA gymnastics is RAD54, a member of the SWI/SNF family of DNA-dependent ATPases.After identification of the identical sister chromatid sequence, the intact double-stranded copy is used as a template to properly heal the broken ends by DNA synthesis (III).Finally, the so-called Hollidayjunctions are resolved by resolvases 27,33,60 (IV).Homologous recombination involves the simultaneous action of large numbers of the same molecules, which are found to be concentrated in radiation-induced nuclear foci.These depend on, and also include, the BRCA1 and BRCA2 proteins 36 .Recent evidence implicates BRCA2 directly or indirectly in nuclear translocation of RAD51 (ref.61).",
+      "This picture poses more questions than it seeks to answer. Is the grouping of the regions by product rather than by type of region correct? Given that the recombination fraction between HLA-A and HLA-B is of the order of .08%, and that this is likely to represent a distance of at least hundreds of thousands of nucleotides, how are the pieces put together over such relatively long distances? Is it possible that regions of the DNA loop out, so that transcripts can be made directly from noncontiguous DNA sequences, the loops being held in place by small RNAs as suggested for the control of splicing by Steitz, and her colleagues [24] and by others [25]? If these small RNAs are coded for well outside the HLA region, does this provide a mechanism for control of expression of products by unlinked genes, as may be the case for one of the constituent polypeptides of the HLA-DR product? What might be the nature of the signals that control which of a multiple set of alternative regions is expressed by any given chromosome?",
+      "Mamm Genome. 2006; 17:220\u2013229. [PubMed: 16518689] 72. Romanoski CE, et al. Systems genetics analysis of gene-by-environment interactions in human cells. Am J Hum Genet. 2010; 86:399\u2013410. [PubMed: 20170901] 73. Myers S, Freeman C, Auton A, Donnelly P, McVean G. A common sequence motif associated with recombination hot spots and genome instability in humans. Nature Genet. 2008; 40:1124\u2013 1129. [PubMed: 19165926] 74. Myers S, et al. Drive against hotspot motifs in primates implicates the PRDM9 gene in meiotic recombination. Science. 2010; 327:876\u2013879. [PubMed: 20044541] 75. Cordell HJ. Detecting gene-gene interactions that underlie human diseases. Nature Rev Genet. 2009; 10:392\u2013404.",
+      "Classification of common conserved sequences in mammalian intergenic regions. Hum. Mol. Genet. 2002, 11, 669\u2013674. 25. Zhu, L.; Swergold, G.D.; Seldin, M.F. Examination of sequence homology between human chromosome 20 and the mouse genome: Intense conservation of many genomic elements. Hum. Genet. 2003, 113, 60\u201370. 26. Pevzner, P.; Tesler, G. Human and mouse genomic sequences reveal extensive breakpoint reuse in mammalian evolution. Proc. Natl. Acad. Sci. USA 2003, 100, 7672\u20137677. 27. Christmann, R.B. ; Sampaio-Barros, P.; Stifano, G.; Borges, C.L. ; de Carvalho, C.R. ; Kairalla, R.; Parra, E.R. ; Spira, A.; Simms, R.; Capellozzi, V.L. ; et al.",
+      "a The table lists proteins in which mutations have been shown to increase homologous recombination (HR), gross chromosomal rearrangements (GCRs), chromosomal instability (CIN), sister chromatid exchanges (SCEs), tri-nucleotide repeat expansions and contractions (TNR), telomere fusions (Tel fusion), or fragile telomeres (Tel fragility).A phenotype inside brackets ([ ]) indicates that it is caused by overexpression of the protein.For further details and references see Supplementary Table1.Abbreviations: DSB, double-strand break; PCNA, proliferating cell nuclear antigen; RFC, replication factor C complex; SCF, Skp1-Cdc53/Cullin-F-box.",
+      "Figure 3 Intermediates and chromosome structural alterations, as observed by different techniques. (a) Replication fork stalling, as monitored by 2D-gel electrophoresis and Southern analysis in yeast (for details about the technique, see Reference 161). (b) Slower human replication forks covering shorter DNA synthesis tracks, as determined by incorporation of IdU and CldU via DNA combing (52), which permits visualization of the process of replication on DNA fibers. (c) Accumulation of double-strand breaks (DSBs) or replicative stress, as inferred by \u03b3H2AX foci or by \u03b3H2AX pan staining, respectively, in human cells. (d) DSBs or ssDNA (single-stranded DNA) gaps as seen directly by nuclear \"comet tails\" via single-cell electrophoresis assays in human cells (52). (e) Sister-chromatid exchanges (SCEs), as determined by Giemsa staining in human cells (207). (f) Hyper-recombination, as determined by colony sectoring in yeast (5). (g) Gross chromosomal rearrangements (GCRs), as determined by spectral karyotyping in mouse cells (118). (h) Translocations, as visualized by pulse-field gel electrophoresis in yeast (168). (i) Fragile sites, as detected by mitotic spreads in human cells (109). (j) Telomere fusions, as determined by CO-FISH (chromosome-orientation fluorescent in situ hybridization) in mouse cells (124). (k) Anaphase bridges, presumably resulting from unfinished replication, dicentric chromosomes, and sister-chromatid nondisjunction, as detected by fluorescence microscopy in mouse cells. Arrows indicate the specific structural alterations referred to in each panel; in panel h, closed and open arrows indicate the position where the translocated or missing parental chromosome migrate or should migrate, respectively. When necessary, a normal control is shown on top of the panel, with the exception of panel a, which is shown on the left. Detailed description of each technique can be found in the references provided. Photos are from the laboratories of A. Nussenzweig (g), A. Losada (k), M. Blasco (j), L. Tora (i), and ours (all others). Abbreviations: HR, homologous recombination; NHEJ, nonhomologous end-joining.",
+      "In humans, the pericentromeric region of chromosome 9 is densely packed with segmental genomic duplications (segdups) and is prone to microdeletions and microduplications. 5In order to evaluate this region for microdeletions and microduplications in family T, we screened genomic DNA from affected individual II-7 by arrayCGH with the Nimblegen HD2 platform with the previously described CHP-SKN sample 6 as the reference.Data were normalized and CNVs were called by identifying regions where Z-scores consistently deviated from the diploid mean.At 9q21.11, a genomic duplication of ~270 kb was apparent in the genomic DNA of II-7 (Figure 1D).The Genomic duplications may or may not be in tandem with their parent segment and may be either in the same or inverted orientation. 7We developed primers that would uniquely amplify genomic DNA with the duplication under each of these conditions.Forward (5 0 -CCCAGCAGA AGCAATGGTGGTAGCC-3 0 ) and reverse (5 0 -GGTGGTGAA TCCAAAAACACAAGAACAAAGTC-3 0 ) primers diagnostic for a tandem inverted duplication (Figure 2A) yielded products of expected size in family T relatives with hearing loss, but yielded no product in unaffected family T relatives (Figure 2B).Genotypes of all 58 participating relatives in family T indicated that the tandem inverted duplication was coinherited with hearing loss.The duplication spans approximately positions 71,705,804 to 71,974,823 (hg19) on chromosome 9 for a size of ~269,023 bp.The duplication includes the entire locus for the tight junction protein TJP2, which spans positions 71,788,971 to 71,870,124 (hg19).",
+      "Chromosomal context of human NORs  Human NORs are positioned on the short arms of the acrocentric chromosomes that still remain unsequenced and thus missing from the current human genome draft, GRCh38.p7. Seeking an understanding of the chromosomal context of human NORs and to identify potential NOR regulatory elements, my laboratory has begun to characterize the sequences on both proximal (centromeric) and distal (telomeric) sides of the rDNA arrays (Fig. 3A; Floutsakou et al. 2013). Building on earlier reports of sequences distal and proximal to the rDNA array on HSA21 and HSA22, respectively (Worton et al. 1988; Sakai et al. 1995; Gonzalez and Sylvester 1997), 207 kb of sequence immediately proximal and 379 kb distal to rDNA arrays have been reported recently (Floutsakou et al. 2013). Consensus proximal junction (PJ) and distal junction (DJ) sequences were constructed mostly from chromosome 21 BACs (bacterial artificial chromosomes). Comparison of these sequences with BACs and cosmids derived from the other acrocentrics revealed that the PJ and DJ sequences are, respectively, \u223c95% and 99% identical between all five acrocentric chromosomes. Conservation of DJ sequences among the acrocentrics is consistent with frequent recombination between the rDNA arrays on each of the acrocentric chromosomes (Worton et al. 1988). However, conservation of PJ sequences suggests that there must also be frequent recombination events in the interval between the centromere and rDNA arrays. Proximal sequences are almost entirely segmentally duplicated, similar to the regions bordering centromeres. Consequently, they are unlikely to contain any specific elements that would regulate the activity of the linked NOR. In contrast, the distal sequence is predominantly unique to the acrocentric short arms and is dominated by a very large inverted repeat. Each arm of the inverted repeat is >100 kb, and they share an average sequence identity of 80%. There is a large (\u223c40-kb) block of a 48-base-pair (bp) satellite repeat, CER, at the distal end of the DJ (Fig. 3A). CER blocks are found distal to the rDNA on all acrocentric chromosomes, with additional pericentromeric blocks on chromosomes 14 and 22. Finally, there are two blocks of a novel 138-bp tandem repeat, ACRO138, present within the DJ.",
+      "The conservation of DJ sequence between the five human acrocentric chromosomes provides a unique opportunity to visualize NORs by FISH.Whereas the rDNA content of NORs can vary greatly, probing of human metaphase chromosome spreads with a DJ BAC results in signal that is consistent between NORs (Floutsakou et al. 2013).Using this probing scheme, it was observed that in most human cell lines analyzed, including multiple primary lines, at least one and sometimes as many as four of the NORs present have very little or no detectable rDNA (C van Vuuren and B McStay, unpubl. ).Many studies have used silver staining of metaphase spreads prepared from stimulated human peripheral blood lymphocytes to determine how many NORs are active in normal human cells.The number of active NORs ranges from seven to 10, with an average of eight (Heliot et al. 2000).Possibly, NORs with low rDNA content are active but fall below a detection threshold in silver staining.At this point, it is worth considering the distribution of active versus silent rDNA repeats in humans and other mammals.If 50% of rDNA repeats are truly repressed, there are insufficient \"silent\" NORs to house them.We must conclude that active NORs are a mosaic of active and silent repeats.",
+      "However, excluding some cases, recombination suppression occurs in a small genomic tract where these genes are located, and it does not extend over most of the sex chromosome pair, as occurs in mammals and birds (Bergero and Charlesworth, 2009). It is not clear if this suppression occurs by the presence of inversions or as a modulation of the recombination mechanism itself, but both could be involved (Bergero and Charlesworth, 2009). Evidence of recombination in the SD region in sex reversal individuals supports the second hypothesis.",
+      "Orthologous chromosomes between baboon and human",
+      "Lichter P, Cremer T, Borden J, Manuelidis L, Ward DC (1988) Delineation of individual human chromosomes in metaphase and interphase cells by in situ suppression hybridization using recombinant DNA libraries. Hum Genet 80:224\u2013234 3. Jang W, Yonescu R, Knutsen T, Brown T, Reppert T, Sirotkin K, Schuler GD, Ried T, Kirsch IR (2006) Linking the human cytogenetic map with nucleotide sequence: the CCAP clone set. Cancer Genet Cytogenet 168:89\u201397 4.",
+      "Nature Genet 1:222\u2013225 55. Foote S, Vollrath D, Hilton A, Page DC (1992) The human Y chromosome: overlapping DNA clones spanning the euchromatic region. Science 258:60\u201366 56. Chumakov IM, Rigault P, Le Gall I et al (1995) A YAC contig map of the human genome. Nature 377:175\u2013297 57. Hudson TJ, Stein LD, Gerety SS et al (1995) An STS-based map of the human genome. Science 270:1945\u20131954 58. Coffey AJ, Roberts RG, Green ED et al (1992) Construction of a 2.6-Mb contig in yeast artificial chromosomes spanning the human dystrophin gene using an STS-based approach. Genomics 12:474\u2013484 59.",
+      "Figure 4 Schematic depiction of proposed mechanisms for observed intrachromosomal rearrangements.The blue and red arrows indicate the orientation of the integrated plasmid loci and the recovered mouse sequences, respectively, on the original non-rearranged chromosome (left column).All four combinations are given for an arbitrarily orientated chromosome (green line).The middle column shows how two breakpoints (lightning signs) could lead to the inversion or deletion of the encompassed chromosomal sequence (yellow-orange dual tone line) and result in a recoverable mutation in the right column.The last row indicates the two options for a transposition, in which either the transgene locus or the recovered mouse sequence is copied or excised (as indicated by the pink and light blue arrows) and integrates in the breakpoint at the other location.",
+      "As mentioned above, by taking into account that for a genome rearrangement to be detected, the 5\u2032 plasmid sequence of the breakpoint in lacZ must remain intact and end immediately in front of the recovered mouse sequence, the simplest intrachromosomal mutation that could have taken place was inferred (Fig. 4). Rearrangements with breakpoints in the mouse genome on either site of the integrated plasmid concatamer, but with reversely orientated sequences, could be inversions (Fig. 4). Rearrangements in the direction of the integrated plasmids, proximal for chromosome 3 and distal for chromosome 4 (Fig. 3), with similarly orientated breakpoints in the mouse genome, could be deletions (Fig. 4). Rearrangements in the reverse direction of the integrated plasmids, with reversely orientated mouse sequences, are more complicated and might be owing to transpositions (Fig. 4). According to these schemes, half of the intrachromosomal rearrangements would have been inversions, whereas deletions and transpositions each made up one fourth (Fig. 3). Alternatively, these rearrangements could be explained by translocations involving the transgene clusters integrated on either the homolog or the other chromosome.",
+      "FIGURE 3. Telomere arrays of chicken and human chromosomes: the chicken genome contains more telomere sequence than the human genome. Chicken (a) and human (b) metaphase chromosomes and interphase cells hybridized with a telomeric sequence-peptide nucleic acid (PNA)-fluorescein probe. Human and chicken slide preparations were processed, and images were captured using the same parameters. Qualitatively, the telomere-positive fluorescent signals (white spots) from chicken cells and chromosomes have greater intensity than those of human (4\u2032,6 diamidino-2-phenylindole, DAPI counterstain).",
+      "In a previous study on the accumulation of spontaneous genome rearrangements in normal mice with aging, we discovered that 50% of the events were intrachromosomal, i.e., large deletions or inversions [22].In contrast, in this present study most of the rearrangements resulted from inter-chromosomal recombination, in both the Ercc1-mutant and control animals (Table 3).Previously, we used lacZ-plasmid line 60 mice with integration sites on Chromosomes 3 and 4, while in the present study line 30 mice were used with a single integration site on Chromosome 11.This indicates that the relative frequency of translocations is founder line specific and could be due to the position of the lacZ-plasmid cluster on the chromosome.Indeed, the chromosomal integration sites in line 60 mice are in the E1 region of Chromosome 3 (half way along the chromosome) and the C5 region of Chromosome 4 (two-thirds of the way along the chromosome) [22], while the integration site of founder line 30 (used in this study) is on the centromeric tip of Chromosome 11 (region A1-A2; not shown).The proximal location on Chromosome 11 prevents the detection of all but relatively small intra-chromosomal recombinations; larger events would lead to loss of the centromere and, therefore, the entire chromosome.If the orientation of the integration site in line 30, which is currently unknown, is towards the centromere, transpositions and inversions towards the distal end are the only detectable large intra-chromosomal rearrangements (for a detailed explanation of the different chromosomal events that can occur at the lacZ locus, see [22])."
+    ],
+    [
+      "Genome Res, 2011, 21: 1769\u20131776 Mattick JS, Dinger ME. The extent of functionality in the human genome. HUGO J, 2013, 7, doi:10.1186/1877-6566-1187-1182 ENCODE Project Consortium, Bernstein BE, Birney E, Dunham I, Green ED, Gunter C, Snyder M. An integrated encyclopedia of DNA elements in the human genome. Nature, 2012, 489: 57\u201374 Pheasant M, Mattick JS. Raising the estimate of functional human sequences. Genome Res, 2007, 17: 1245\u20131253 Hu T, Long M, Yuan D, Zhu Z, Huang Y, Huang S. The genetic equidistance result, misreading by the molecular clock and neutral theory and reinterpretation nearly half of a century later.",
+      "This approach enables, on the one hand, studying the process of mammalian evolution and, on the other hand, translational studies using model organisms of complex human phenotypes. Detection of regions conserved between distant species points to high functional importance of these fragments of the DNA sequence. Human and mouse developmental lines diverged about 75 million years ago, and ever since evolutionary forces shaped the two genotypes in a different manner (Waterston et al., 2002). Nevertheless, the extent of the changes is small enough for conservation of local gene order (Waterston et al., 2002).",
+      "First, the human and mouse genome projects elucidated the sequences of over 20,000 genes [Lander et al., 2001; Venter et al., 2001], and most are expressed in the CNS. The availability of gene sequences has allowed rapid analysis of candidate human disease and disorder genes and the isolation of the mouse homologues. Second, the application of site-speci\ufb01c recombinase technology provides investigators with the opportunity to engineer genes in the mouse that will allow for the deletion, insertion, inversion, or exchange of chromosomal DNA with high \ufb01delity (for review see Branda and Dymechi, 2004).",
+      "In some cases, structural variations, such as copy number polymorphisms, exist (Feuk et al. , 2006); however, because of the nature of the genome assembly process, these will invariably be collapsed into a single contig that does not reflect the natural sequence. To address the technical challenges of whole-genome assembly, the human genome is released as defined \u2018builds\u2019 on a quarterly basis (Lander et al. , 2001; reviewed in Chapter 4). The increasing complexity of processes that map data to the genome implicitly involves some lag in availability of the most current sequence assembly.",
+      "In practical terms, this has meant that we acquire many fragments, from a few hundred bases to a few hundred kilobases in length, of a genome that must then be assembled computationally to produce a continuous sequence. In the case of the human genome, two unfinished \u2018draft\u2019 sequences were produced by different methods, one by the International Human Genome Sequencing Consortium (IHGSC) and one by Celera Genomics (CG). The IHGSC began with a BAC (bacterial artificial chromosome) clone-based physical map of the genome (IHGSC, 2001).",
+      "4 Assembling a View of the Human Genome Colin A. M. Semple Bioinformatics, MRC Human Genetics Unit, Edinburgh EH4 2XU, UK  4.1 Introduction The miraculous birth of the draft human genome sequence took place against the odds. It was only made possible by parallel revolutions in the technologies used to produce, store and analyse the sequence data, and by the development of new, large-scale consortia to organize and obtain funding for the work (Watson, 1990). The initial flood of human sequence has subsided as the sequencing centres have sequenced genomes from other mammalian orders and beyond.",
+      "The Human Genome Project is generating vast amounts of new information at breakneck speed and causing a fundamental shift in disease research. Now with the availability of a nearly complete, high-accuracy sequence of the mouse genome (7), a new and powerful paradigm for biomedical research is established. The remarkable similarity of mouse and human genomes, in both synteny and sequence, unconditionally validates the mouse as an exceptional model organism for understanding human biology. The discovery among inbred mouse strains of defined regions of high and low genomic variation inherited primarily from two ancestral Mus subspecies (6) holds great promise to make mapping and positional cloning more rapid and feasible. Haplotype maps of inbred mouse strains combined with sophisticated delineation of their phenotypic variation and gene expression patterns will enable complex trait analysis on an unprecedented scale. This issue of Journal of Applied Physiology highlights inbred strain surveys exploring phenotypic variation in drug responses [see Crabbe et al. (1) and Watters et al. (8) in this issue]. These mouse initiatives demonstrate a viable, cost-effective alternative to human research requiring family studies, population linkage analysis, or genome-wide genotyping on a multitude of individuals for association mapping.",
+      "How Many Genes are There in the Human Genome?",
+      "The Landscape of Human Genome Variation",
+      "In some cases, structural variations, such as copy number polymorphisms, exist (Feuk et al. , 2006); however, because of the nature of the genome assembly process, these will invariably be collapsed into a single contig that does not reflect the natural sequence. To address the technical challenges of whole-genome assembly, the human genome is released as defined \u2018builds\u2019 on a quarterly basis (Lander et al. , 2001; reviewed in Chapter 4). The increasing complexity of processes that map data to the genome implicitly involves some lag in availability of the most current sequence assembly.",
+      "In practical terms, this has meant that we acquire many fragments, from a few hundred bases to a few hundred kilobases in length, of a genome that must then be assembled computationally to produce a continuous sequence. In the case of the human genome, two unfinished \u2018draft\u2019 sequences were produced by different methods, one by the International Human Genome Sequencing Consortium (IHGSC) and one by Celera Genomics (CG). The IHGSC began with a BAC (bacterial artificial chromosome) clone-based physical map of the genome (IHGSC, 2001).",
+      "4 Assembling a View of the Human Genome Colin A. M. Semple Bioinformatics, MRC Human Genetics Unit, Edinburgh EH4 2XU, UK  4.1 Introduction The miraculous birth of the draft human genome sequence took place against the odds. It was only made possible by parallel revolutions in the technologies used to produce, store and analyse the sequence data, and by the development of new, large-scale consortia to organize and obtain funding for the work (Watson, 1990). The initial flood of human sequence has subsided as the sequencing centres have sequenced genomes from other mammalian orders and beyond.",
+      "Science 291:1304\u2013 1351 3. Lander ES et al (2001) Initial sequencing and analysis of the human genome. Nature 409:860\u2013921 4. Engle LJ, Simpson CL, Landers JE (2006) Using high-throughput SNP technologies to study cancer. Oncogene 25:1594\u20131601 5. Elston RC, Anne Spence M (2006) Advances in statistical human genetics over the last 25 years. Stat Med 25:3049\u20133080 6. Larson GP et al (2005) Genetic linkage of prostate cancer risk to the chromosome 3 region bearing FHIT. Cancer Res 65:805\u2013814 7. Botstein D, Risch N (2003) Discovering genotypes underlying human phenotypes: past successes for mendelian disease, future approaches for complex disease.",
+      "McPherson JD, Marra M, Hillier L et al (2001) A physical map of the human genome. Nature 409:934\u2013941 13. Burke DT, Carle GF, Olson MV. (1987) Cloning of large segments of exogenous DNA into yeast by means of artificial chromosome vectors. Science 236:806\u2013812 14. Fleischmann RD, Adams MD, White O et al (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd Science 269:496\u2013512 15. Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408:796\u2013815 16.",
+      "The human genome has been cracked wide open in recent years and is spilling many of its secrets. More than 100 genome-wide association studies have been conducted for scores of human diseases, identifying hundreds of polymorphisms that are widely seen to influence disease risk. After many years in which the study of complex human traits was mired in false claims and methodologic inconsistencies, genomics has brought not only comprehensive representation of common variation but also welcome rigor in the interpretation of statistical evidence. Researchers now know how to properly account for most of the multiple hypothesis testing involved in mining the genome for associations, and most reported associations reflect real biologic causation. But do they matter?",
+      "In some cases, structural variations, such as copy number polymorphisms, exist (Feuk et al. , 2006); however, because of the nature of the genome assembly process, these will invariably be collapsed into a single contig that does not reflect the natural sequence. To address the technical challenges of whole-genome assembly, the human genome is released as defined \u2018builds\u2019 on a quarterly basis (Lander et al. , 2001; reviewed in Chapter 4). The increasing complexity of processes that map data to the genome implicitly involves some lag in availability of the most current sequence assembly.",
+      "In practical terms, this has meant that we acquire many fragments, from a few hundred bases to a few hundred kilobases in length, of a genome that must then be assembled computationally to produce a continuous sequence. In the case of the human genome, two unfinished \u2018draft\u2019 sequences were produced by different methods, one by the International Human Genome Sequencing Consortium (IHGSC) and one by Celera Genomics (CG). The IHGSC began with a BAC (bacterial artificial chromosome) clone-based physical map of the genome (IHGSC, 2001).",
+      "In some cases, structural variations, such as copy number polymorphisms, exist (Feuk et al. , 2006); however, because of the nature of the genome assembly process, these will invariably be collapsed into a single contig that does not reflect the natural sequence. To address the technical challenges of whole-genome assembly, the human genome is released as defined \u2018builds\u2019 on a quarterly basis (Lander et al. , 2001; reviewed in Chapter 4). The increasing complexity of processes that map data to the genome implicitly involves some lag in availability of the most current sequence assembly.",
+      "In practical terms, this has meant that we acquire many fragments, from a few hundred bases to a few hundred kilobases in length, of a genome that must then be assembled computationally to produce a continuous sequence. In the case of the human genome, two unfinished \u2018draft\u2019 sequences were produced by different methods, one by the International Human Genome Sequencing Consortium (IHGSC) and one by Celera Genomics (CG). The IHGSC began with a BAC (bacterial artificial chromosome) clone-based physical map of the genome (IHGSC, 2001).",
+      "4 Assembling a View of the Human Genome Colin A. M. Semple Bioinformatics, MRC Human Genetics Unit, Edinburgh EH4 2XU, UK  4.1 Introduction The miraculous birth of the draft human genome sequence took place against the odds. It was only made possible by parallel revolutions in the technologies used to produce, store and analyse the sequence data, and by the development of new, large-scale consortia to organize and obtain funding for the work (Watson, 1990). The initial flood of human sequence has subsided as the sequencing centres have sequenced genomes from other mammalian orders and beyond."
+    ],
+    [
+      "The hierarchical organization of GN\u2019s main Select and Search menu is simple and makes it relatively easy to find relevant data sets (Fig. 1). To get data, after opening the browser, select the most appropriate Species from the dropdown menu. For an open-ended search of phenotypes you can also select All Species at the bottom of the menu. The next steps are to select the Group, Type, and Data Set from the drop-down menus. For many groups, a combination of phenotypes, genotypes, and molecular data are available.",
+      "GeneNetwork contains data from a wide range of species, from humans to soybeans, but most of the available phenotypic data is from mice. Within the mouse dataset there are groups of families, crosses, non-genetic groupings, and individual data. The type of dataset must be selected after defining the species and sample population. While genotypes, mRNA, methylated DNA, protein, metagenomic, and 2 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.23.424047; this version posted December 24, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. metabolome datasets are available (i.e.",
+      "The hierarchical organization of GN\u2019s main Select and Search menu is simple and makes it relatively easy to find relevant data sets (Fig. 1). To get data, after opening the browser, select the most appropriate Species from the dropdown menu. For an open-ended search of phenotypes you can also select All Species at the bottom of the menu. The next steps are to select the Group, Type, and Data Set from the drop-down menus. For many groups, a combination of phenotypes, genotypes, and molecular data are available.",
+      "Search and Data Retrieval Point your browser to www.genenetwork.org. This brings you by default to the Search page, from which you can retrieve data from many GN data sets. We will focus on the default data set, defined by Species: Mouse, Group: BXD, Type: Whole Brain, Database: INIA Brain mRNA M430 (Apr05) PDNN Enter \u201cKcnj*\u201d into the ALL or ANY field and click the Search button. Note the location and annotation of available potassium channel genes in the Search Results page that opens. Use the browser Back button to return to previous page.",
+      "Add information on data provenance by giving details in Investigation, Protocols and ProtocolApplications  Customize Customize \u2018my\u2019 XGAP database with extended variants of Trait and Subject. In the online XGAP demonstrator, Probe traits have a sequence and genome location and Strain subjects have parent strains and (in)breeding method. Describe extensions using MOLGENIS language and the generator automatically changes XGAP database software to your research Upload  Upload data from measurement devices, public databases, collaborating XGAP databases, or a public XGAP repository with community data.",
+      "However, a suitable and customizable integration of these elements to support high throughput genotype-to-phenotype experiments is still needed [34]: dbGaP, GeneNetwork and the model organism databases are designed as international repositories and not to serve as general data infrastructure for individual projects; many of the existing bespoke data models are too complicated and specialized, hard to integrate between profiling technologies, or lack software support to easily connect to new analysis tools; and customization of the existing infrastructures dbGaP, GeneNetwork or other international repositories [35,36] or assembly of Bioconductor and generic model organism database components to suit particular experimental designs, organisms and biotechnologies still requires many minor and sometimes major manual changes in the software code that go beyond what individual lab bioinformaticians can or should do, and result in duplicated efforts between labs if attempted.",
+      ", 2014; see Section 9). GeneNetwork is a database that enables searching for \u223c4000 phenotypes from multiple studies in the BXD, HXB, and in other recombinant inbred rodent families, as well as in other model organisms and even humans (Mulligan et al. , 2017). GeneNetwork employed a somewhat di\ufb00erent strategy than MPD in that it did not rely solely on researchers submitting their data. Instead the database operators extracted the data from the scienti\ufb01c literature and integrated them into a uniform format (Chesler et al. , 2003).",
+      "GeneNetwork contains data from a wide range of species, from humans to soybeans, but most of the available phenotypic data is from mice. Within the mouse dataset there are groups of families, crosses, non-genetic groupings, and individual data. The type of dataset must be selected after defining the species and sample population. While genotypes, mRNA, methylated DNA, protein, metagenomic, and metabolome datasets are available (i.e.",
+      "However, a suitable and customizable integration of these elements to support high throughput genotype-to-phenotype experiments is still needed[340]: dbGaP, GeneNetwork and the model organism databases are designed as international repositories and not to serve as general data infrastructure for individual projects; many of the existing bespoke data models are too complicated and specialized, hard to integrate between pro\ufb01ling technologies, or lack software support to easily connect to new analysis tools; and customization of the existing infrastructures dbGaP, GeneNetwork or other international repositories[384, 154] or assembly of Bioconductor and generic model organism database components to suit particular experimental designs, organisms and biotechnologies still requires many minor and sometimes major manual changes 38 2.1.",
+      "All data presented in this paper were deposited in the online database GeneNetwork (www.genenetwork.org), an open web resource that contains genotypic, gene expression, and phenotypic data from several genetic reference populations of multiple species (e.g. mouse, rat and human) and various cell types and tissues.35;36 It provides a valuable tool to integrate gene networks and phenotypic traits, and also allows cross-cell type and cross-species comparative gene expression and eQTL analyses.",
+      "There is a good chance that you will be able to apply these new techniques to specific problems, even while you read. If you have a computer with an Internet connection\u2014so much the better, and you can read and work along at the same time. This short review and primer will take you on a tour of a web site called GeneNetwork that embeds many large data sets that are relevant to studies of behavioral variation. GeneNetwork is an unusual site because it contains a coherent \"universe\" of data, as well as many powerful analytic tools.",
+      "The GeneNetwork database provides open access to BXD and other RI strain derived microarray data, single nucleotide polymorphism (SNP) data, and phenotypic data for quantitative trait loci analysis and gene expression correlation analyses. Gene expression data were exported for manually selected probes in the PDNN hippocampus database (Hippocampus Consortium M430v2), and the PDNN whole brain database (INIA Brain mRNA M430). The Hippocampus database was chosen as one of the most elaborate brain databases, as well as most highly recommended dataset on GeneNetwork itself (http://www.genenetwork.org/ webqtl/main.py?FormID=sharinginfo&GN_AccessionId=112).",
+      "2016) and can also be accessed in GeneNetwork by entering Record ID 18494 in the Get Any space on the Search page and clicking on the Search button. Alternatively, enter data by hand into the designated boxes provided by GeneNetwork. These latter options also allow for the inclusion of trait variance. It is a good idea to name the trait in the box provided. Then click Next, and manually enter the data for each RI strain, F1, and founder strain. 3  Author Manuscript  After entering the data, click on the blue plus sign button called Add.",
+      "To submit multiple phenotypes at the same time, select the option for Batch Submission under the Home tab. This allows users to submit up to 100 traits for analysis by GeneNetwork. Here, select BXD as the cross or RI set to analyze from the first pull-down menu. The phenotype file should follow the format described in the Sample text (http:// genenetwork.org/sample.txt). After uploading the appropriate file using the Browse button, enter a name for the file in the Dataset space. The data will be stored in the GeneNetwork server for 24 hours. Click Next.",
+      "Materials Here we will provide detailed instructions for using GeneNetwork along with some \u201cworked\u201d examples taken from the recent study of intravenous cocaine self-administration by Dickson et al. (2016) in BXD RI mice. A complete overview of GeneNetwork is beyond the scope of this protocol, but is extensively covered elsewhere (see Mulligan et al. 2016; Williams & Mulligan 2012 for excellent reviews on GeneNetwork). A computer with an internet connection and current web browser. See the GeneNetwork.org site for information on supported browser versions. Method Entering Data  1  Link to http://www.genenetwork.org.",
+      "Species in GenAge model organisms",
+      "Data are reviewed before entry in GeneNetwork by the senior author. Phenotypes are currently split into 15 broad phenotypic categories (Supplementary Data 1). Phenome curation and description was initiated by R.W.W. and Dr Elissa Chesler in 2002 by literature review and data extraction. The early work is described brie\ufb02y in Chesler et al.51,52. Most work over the past 5 years has been performed by two of the coauthors (R.W.W. and M.K.M.). We have used a controlled vocabulary and set of rules described here (http://www.genenetwork.org/faq.html#Q-22).",
+      "9) To bring your data to GeneWeaver, click on the GeneWeaver icon, making sure you are already logged in to your GeneWeaver account. You will be brought to the GeneSet upload page with the Genes Uploaded and the  Fig. 5 Default settings at GeneNetwork.org are set to search \u201cMouse\u201d, \u201cPhenotypes\u201d, from among the \u201cBXD Published Phenotypes\u201d data set. Here the term nociception was searched for  Fig. 6 The search results page in GeneNetwork showing the 33 records retrieved from the phenotype search for nociception.",
+      "Users may also share their data with other users selectively, make it public, or keep it restricted to a private account. Data can be imported by users, uploading their gene set data directly or exporting to GeneWeaver from within another online resource such as Neuro Informatics Framework (NIF) [8], Grappa [9], Mouse Phenome Database (MPD) [10] or GeneNetwork [11]. These datasets can then be added to your collection to be analyzed together with other gene sets retrieved from the GeneWeaver database. To begin a GeneWeaver analysis a user must collect \u201cGeneSets\u201d together in a \u201cProject\u201d.",
+      "Alternatively the spreadsheet can be saved as a .txt file and uploaded by clicking on \u201cSwitch to file upload.\u201d Once complete click on upload GeneSet. 7. Once completed you are taken to the GeneSet detail page. If there are errors in your uploaded data you can correct them by clicking on \u201cEdit\u201d. 8. Use the Add Selected to Project, and create a new project, e.g. \u201cChronic Cocaine\u201d. 9. Now using the Search function populate this project with additional gene sets related to this study trying Queries such as \u201cCocaine Addiction\u201d, \u201cChronic Cocaine\u201d."
+    ],
+    [
+      "Oxidative stress and mitochondrial DNA  Not long after it was discovered that mitochondria have their own genetic apparatus, Harman proposed that mitochondria play a central role in the free radical theory of aging [16].This idea was developed further by Miquel et al. [330], and the notion that mtDNA mutagenesis played a role in aging took hold.The phenotypical importance of mutations in mtDNA was demonstrated by Wallace et al. [331] and Holt et al. [332], who first showed that Leber's hereditary optic neuropathy and mitochondrial myopathies were caused by mtDNA mutations (reviewed in [333]).Because mtDNA is so close to the site of mitochondrial ROS production, it is exposed to considerably higher oxidative stress, resulting in 3-fold higher levels of DNA oxidative damage (the previously quoted 20-fold figure is apparently due to an isolation artifact [334,335]).In the 1990s a series of papers reported that the frequency of mitochondrial DNA deletions increases dramatically with age, being essentially undetectable in young individuals and reaching levels as high as 2% of mtDNA in old individuals.This age-related increase in mtDNA deletions was found in organisms as diverse as worms, mice, and humans (reviewed in [24,336]).The same is also true with mtDNA point mutations [337,338].Certain mtDNA polymorphisms have been found in increased frequency in centenarians, implying a protective effect during aging [339][340][341].Similar protective effects of mtDNA polymorphisms have been reported for the age-related neurodegenerative condition, Parkinson's disease [342].",
+      "Variation in the structure and function of mitochondria underlies variation in organismal energetics broadly (Seebacher et al., 2010) and evidence for the importance of mitochondrial function in the evolution of natural populations continues to accumulate (Ballard and Melvin, 2010;Glanville et al., 2012;Hicks et al., 2012;Kurbalija Novi\u010di\u0107 et al., 2015).For example, variation in mitochondrial DNA sequences (mtDNA) can determine whole-organism metabolism, i.e., the rate at which organisms process energy from their environment, a phenomenon widespread across animal taxa (Arnqvist et al., 2010;Ballard et al., 2007;Ballard and Pichaud, 2014;Havird et al., 2019;Hood et al., 2018;James et al., 2016;Wolff et al., 2014).Specifically, mtDNA sequence variants are linked to functional metabolic differences in fish (Chapdelaine et al., 2020;Flight et al., 2011;Healy et al., 2019), birds (Scott et al., 2011), and mammals (Fontanillas et al., 2005), including humans (Amo and Brand, 2007;Dato et al., 2004;Niemi et al., 2003;Tranah et al., 2011).These mtDNA variants are often correlated with environmental factors such as temperature and altitude (Storz et al., 2010).However, other studies attempting to link mitochondrial function to mitochondrial DNA (mtDNA) sequence variation or environmental factors have offered mixed reports (Amo and Brand, 2007;Flight et al., 2011;Fontanillas et al., 2005;Hicks et al., 2012).",
+      "The results here point to several potentially fruitful research directions.We have identified how nonsynonymous mutations in the mitochondrial genome associate with variation in whole-organism metabolism (including CytB, ND1, ND5 and ND6).A next step will be to characterize the molecular details of how these changes affect molecular function.It would also be beneficial to describe how variation in cellular oxygen consumption rate scales up to determine whole-organism metabolic rate across a range of temperatures, thus identifying potential mismatches across levels of organization that may impact organismal performance (Gangloff and Telemeco, 2018).While the interconnected processes that shape organismal and population-level responses to environmental variation do not lend themselves to simple narratives, and many molecular processes interact to produce the emergent ecotypic divergences at the phenotypic level, it is clear that the mitochondria play a central role even as that role may change across populations and ecological contexts (Fig. 1).Research within well-characterized natural systems, such as these garter snake populations, can offer illustrative case studies of how mitochondria respond to their environments, and thus impact physiological pathways and evolutionary patterns, creating variation in life histories and aging.",
+      "Despite the complexities underlying observed variation in mitochondrial function, recent work has demonstrated examples of how evolution and plasticity in mitochondrial function across populations within a species can shape life histories.For example, evidence from Drosophila has demonstrated the effect of temperature on components of the ETC and has linked mtDNA variants to metabolic thermosensitivity (Pichaud et al., 2012), to differences in whole-organism metabolic rates (Kurbalija Novi\u010di\u0107 et al., 2015), and to fitness-related traits (Ballard et al., 2007;Pichaud et al., 2011;Pichaud et al., 2010).In general, studies in birds and mammals demonstrate that mitochondria of longer-lived species are more efficient in ATP production, produce less reactive oxygen species, and demonstrate increased antioxidant capacities (Barja and Herrero, 2000;Ku et al., 1993;Lambert et al., 2007).While some studies in lizards and snakes demonstrate a similar pattern (Olsson et al., 2008;Robert et al., 2007), the extent to which these results are generalizable across vertebrate taxa is not yet known.The diversity of life-history traits and immense variation in longevity demonstrated by reptiles, both within and among species, make these taxa ideal candidates for understanding how variation in mitochondrial physiology drives this variation in whole-organism traits (reviewed in Hoekstra et al., 2019).Such work has moved to the forefront with a recent focus on the ecological and evolutionary significance of aging processes in wild populations (reviewed in Nussey et al., 2013;Fletcher and Selman, 2015;Gaillard and Lema\u00eetre, 2020).",
+      "Over evolutionary time, differential mortality rates are a selective force in shaping genetic structure.This results in divergence of a variety of physiological networks that shape, ultimately, patterns of aging and longevity in different habitats (Monaghan et al., 2008;Stojkovi\u0107 et al., 2017).Such selective pressures can have differential effects on the nuclear and mitochondrial genomes (McKenzie et al., 2019;Wolff et al., 2014).Genetic variation in the mitochondrial genome is known to drive mitochondrial function in many species (Ballard and Melvin, 2010;McKenzie et al., 2019;Novelletto et al., 2016) and we find this in our system as well.Whole organism metabolic rate varies with the mitochondrial genome haplogroups we identified in this study.T. elegans individuals with the introgressed T. sirtalis mitochondrial genome had the lowest metabolic rate and had 68 amino acid changes in the ETC genes relative to the T. elegans mitochondrial genomes.As species divergence are a continuation of population divergence, this introgression provides additional insight into how genetic variation can alter mitochondrial function.Whether the lower metabolic rate in our snakes with the introgressed mitochondrial genome is due to the fixed amino acid changes between the species or a mismatch between the coadapted nuclear and mitochondrially-encoded ETC proteins that could alter function of the mitochondria (Burton et al., 2013;Haenel, 2017;Rawson and Burton, 2002;Toews et al., 2014;Wolff et al., 2014) will require further comparisons to T. sirtalis individuals.",
+      "Building on previous work in this system, the current study tests three primary hypotheses about how variation in mtDNA and mitochondrial function relate to variation in life-history traits and aging within this system (Fig. 1): (1) First, we test whether rates of cellular oxygen consumption in isolated immune cells exhibit patterns that are consistent with the hypothesis that cellular processes drive whole-organism senescence and aging, and if these patterns differ between the SA and FA ecotypes and between sexes. By measuring basal, ATP-production associated, and maximal rates of cellular oxygen consumption, we further test for evidence that phenotypic divergence is dependent on a specific aspect of oxidative phosphorylation within immune cells. The energetics of these cells are particularly important given their essential role in modulating disease and infection, important factors contributing to senescence (Metcalf et al., 2019). We predict that SA snakes will maintain levels of cellular oxygen consumption across age, whereas the FA snakes will show a decline with age, especially in ATP-associated rates, possibly due to continual degradation of electron transport chain functionality from accumulating oxidative damage and reduced DNA repair mechanisms (Robert and Bronikowski, 2010; Schwartz and Bronikowski, 2013). (2) Second, we expand our mitochondrial genomics dataset to quantify mtDNA genetic structure across the landscape and test whether mtDNA haplotypes, and alleles at a nonsynonymous SNP in the Cytochrome B (CytB) gene correlate with aging ecotypes. (3) Third, we test the hypothesis that variation in mtDNA correlates with whole-organism variation in metabolic rates, suggesting a pathway linking mitochondrial genetic variation in mtDNA to whole-organism energetics. We first test whether different haplotypes differ in resting metabolic rate. Then, we test the effects of the nonsynonymous SNP in CytB on resting metabolic rate. The CytB gene encodes a component of complex III of the ETC, and was previously found to segregate between these life-history ecotypes (Schwartz et al., 2015). This SNP results in an amino acid substitution from isoleucine (aliphatic, hydrophobic) to threonine (hydrophilic) on a region that comes into close contact with a nuclear-encoded subunit (Schwartz et al., 2015). We combine previously published and new data on whole-organism resting metabolic rates (oxygen consumption) to test for the effects of this nonsynonymous mutation in three populations where we find heterogeneity at this nucleotide, thus allowing us to disentangle the effects of shared environment (population) from sequence variation (SNP). We predict that this SNP will correlate with variation in whole-organism metabolic rate, demonstrating a putatively adaptive difference between the derived and ancestral sequence. By utilizing this integrative data set, from genes to organelles to whole organisms to populations, in a known life-history context, we are able to test hypotheses across levels of organization to provide a more complete picture of the complicated story of mitochondria and life history (Havird et al., 2019).",
+      "mtDNA Diversity  Unlike the nuclear genome, which requires both paternal and maternal contributions, mtDNA is inherited solely from the maternal lineage. It is unclear what advantage a uniparental mtDNA transmission confers, but one possibility is to minimize the number of distinct genomes to maximize the efficiency of a multi-genomic system (Hill et al. 2019). In fact, humans have developed complex, redundant mechanisms to ensure uniparental inheritance of mtDNA (DeLuca and O'Farrell 2012; Rojansky et al. 2016). Paternal mitochondria from sperm that enter into the egg during fertilization are actively and selectively eliminated via mitophagy through two E3 ligases, PARKIN, and MUL1 (Rojansky et al. 2016). PARKIN and MUL1 serve redundant purposes, and mitophagy becomes insufficient to eliminate paternal mtDNA only in the absence of both (Rojansky et al. 2016). Even though oocytes have at least a thousand-fold more mitochondria than a sperm cell (Rojansky et al. 2016) and heteroplasmy levels would be very low if paternal mtDNA were to contaminate the embryo, the results can still be non-trivial. However, challenging this notion, a recent study provides evidence of potential paternal transmission (Luo et al. 2018), but awaits further corroborating studies (Lutz-Bonengel and Parson 2019).",
+      "MtDNA exhibits a higher mutation rate than nuclear DNA, leading to significant population-level mtDNA polymorphisms (van Oven and Kayser 2009; Wallace 1999; Wallace and Chalkia 2013). In fact, the co-evolution of the mitonuclear genomes has been proposed to be driven by mtDNA mutations that select for compensatory changes in the nuclear genome (Havird and Sloan 2016). Populations that share similar mtDNA polymorphisms can be clustered into distinct haplogroups that are designated using all letters of the alphabet (i.e., A through Z). The mtDNA haplogroups represent major branch points on the mitochondrial phylogenetic tree that have strong regional ties around the globe, thus supporting the concept of a 'mitochondrial eve' (Wallace 1999). Haplogroups present inherently different mitonuclear interactions (Zaidi and Makova 2019), which eventually affect the aging process (Wolff et al. 2016). For example, one haplogroup commonly found in Ashkenazi Jews can interact with a specific enrichment of an amino acid sequence in complex I, and result in altered susceptibility to type 2 diabetes mellitus (Gershoni et al. 2014). The effect of mitonuclear compatibility on lifespan is influenced by environmental cues in flies (Drummond et al. 2019). It is unclear if mitonuclear compatibility is invariable throughout an organism's life, or antagonistically pleiotropic during aging, making it a difficult moving target to understand.",
+      "Background: The accumulation of mitochondrial DNA (mtDNA) mutations, and the reduction of mtDNA copy number, both disrupt mitochondrial energetics, and may contribute to aging and age-associated phenotypes. However, there are few genetic and epidemiological studies on the spectra of blood mtDNA heteroplasmies, and the distribution of mtDNA copy numbers in different age groups and their impact on age-related phenotypes. In this work, we used whole-genome sequencing data of isolated peripheral blood mononuclear cells (PBMCs) from the UK10K project to investigate in parallel mtDNA heteroplasmy and copy number in 1511 women, between 17 and 85 years old, recruited in the TwinsUK cohorts.",
+      "Discussion  Two significant questions are raised by the findings that mitochondrial DNA can integrate into the nucleus. Firstly, is this an extraordinarily rare event or is it occurring continually and at high frequency? Secondly, can such an event have pathological consequences to the organism?",
+      "Phylogeny  The mtDNA is maternally inherited (120) by offspring through the oocyte cytoplasm; namely, the mother transmits her mtDNAs to all of her offspring, and her daughters transmit their mtDNAs to the next generation. This is the consequence of the fact that the mature oocyte such as mouse (304) or bovine (144) contains 100-1,000 times more mtDNA than is found in somatic cells. Hence, the few sperm mtDNAs that enter the egg (130) have little effect on the genotype. The maternal inheritance results in sequentially diverged mtDNA polymorphism of modern human, as shown in Figure 2. The polymorphism derives from the combinations of small deletions and additions of <14 bp in noncoding region and base substitutions including some point mutations in coding region.",
+      "There have been few reports on distinct correlation between mitochondrial morphology and human aging, except changes in number and size of mitochondria associated with age. Concerning the gross structure of mitochondria, the overwhelming importance of the cell nucleus in mitochondrial biogenesis should be noted, because the major parts of mitochondrial proteins are encoded by nuclear genes that are stable during life with the efficient repair mechanism for nDNA.",
+      "Early data on DNA polymorphism detected by restriction endonuclease (263) have suggested that the evolutionary change of mtDNA in higher animals occurs mainly by nucleotide substitution rather than by deletion and insertion. The mtDNA nucleotide sequence evolves 6-17 times faster than comparable nuclear DNA gene sequences (51,52,405). Rapid evolution of mtDNA of higher primates including human, 0.02 base substitutions per site per million years, was calculated from the restriction map of mtDNA (51). Because orthodox recombination mechanism appears to be absent in mtDNA (128), germline mutation seems to go down to posterity as maternal inheritance from our common ancestor (57).",
+      "A number of conclusions may be drawn from these results. Firstly, the data begin to answer the question of how closely mtDNA replication is kept in synchrony with nuclear DNA replication: it would appear to be regulated not by direct coupling to the nuclear DNA replication, but rather by the cell mass to be serviced by mitochondria.",
+      "It may be that high mtDNA levels are indeed indicative of compromised mitochondria, but that the underlying defects are unrelated to alterations in the DNA sequence. Alternatively, elevated quantities of mtDNA might be associated with increased metabolic requirements of the embryo, rather than organelles of suboptimal function. It is possible that embryos produced by older oocytes are under some form of stress and therefore have larger energy requirements. Functional experiments will be required to address these questions. Whatever the underlying basis, the current study has unequivocally demonstrated that female reproductive aging is associated with changes in the mtDNA content at the blastocyst stage.",
+      "Age-associated alterations of the mitochondrial genome occur in several different species; however, their physiological relevance remains unclear. The age-associated changes of mitochondrial DNA (mtDNA) include nucleotide point mutations and modifications, as well as deletions. In this review, we summarize the current literature on age-associated mtDNA mutations and deletions and comment on their abundance. A clear need exists for a more thorough evaluation of the total damage to the mitochondrial genome that accumulates in aged tissues. \u00a9 1997 Elsevier Science Inc.",
+      "Mitochondrial genetics  One underexplored avenue for determining maternal risk for preterm birth involves the influence of the mitochondrial genome. The high mutation rate of mitochondrial DNA (mtDNA), together with the fact that most of its encoded proteins are evolutionarily conserved, allowing for the selection of neutral or beneficial variants, has generated interest in defining human mtDNA variations and their roles in human biology [58].",
+      "Clearly, as mitochondrial metabolic and genetic therapies advance for treating mitochondrial disease, they will also be available to enhance the personal lives of others. However, mitochondrial genetic variation appears to have been one of the primary factors that permitted our ancestors to adapt to new environments, survive adverse conditions, and multiply throughout the globe. Is it possible that by taking over control of individual mtDNA variation, we might also be setting our species on the road to functional decline and ultimately extinction?",
+      "Mitochondrial therapeutics and performance enhancement  It is now clear that not all mtDNA variation is deleterious. Indeed, about 25% of all ancient mtDNA variation appears to have caused functional mitochondrial changes and thus been adaptive. Those mtDNA variants that are adapted to warm climates have mtDNA variants that result in tightly coupled OXPHOS, thus maximizing ATP output and minimizing heat production. The presence of these mtDNAs permits maximum muscle performance but also predisposes sedentary individuals that consume excess calories to multiple problems. They would be prone to be overweight and their mitochondria would generate excessive ROS, thus making them susceptible to a variety of degenerative diseases, cancer and premature aging. Partially uncoupled mitochondria generate more heat, but at the expense of ATP production. Individuals with these variants are better able to tolerate the cold, and are less prone to obesity. They also generate less ROS, making them resistant to degenerative diseases and aging. Finally, the mitochondria are why we breathe. Hence, mitochondrial variation might be an important factor in individual predisposition to altitude sickness.",
+      "Human mtDNA codes for 13 essential polypeptide components of the mitochondrial oxidative phosphorylation (OXPHOS) system. mtDNA undergoes strict maternal inheritance, resulting in the absence of bi-parental recombination (Elson et al., 2001) and has a high mutation rate (Tuppen et al., 2010). As such, the evolution of mtDNA is characterised by the emergence of distinct lineages (or haplogroups) (Hernstadt et al., 2002). This results in high levels of mtDNA variation at the population level despite its rather small size, which is also illustrated by the large number of sub-haplogroups (van Oven and Kayser, 2009). Africa"
+    ],
+    [
+      "Annotation, preprocessing and categorization of data  We used Ensembl (version 39) as the annotation reference database. Homology between human and mouse genes was derived via BioMart. The total number of genes under study comprises 15,277 Ensembl mouse genes representing the union of the homologue genes from all data sources. An overview about the T2DM specific datasets is given in Table 1.",
+      "But the four sites are not equivalent; there are important distinctions between them in terms of the data analysed, the analyses carried out and the way the results are displayed. 4.4.1 Ensembl Ensembl is a joint project between the EBI (http://www.ebi.ac.uk/) and the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk/). The Ensembl database (Hubbard et al. , 2002; http://www.ensembl.org/), launched in 1999, was the first to provide a window on the draft genome, curating the results of a series of computational analyses.",
+      "Until January 2002 (Release 3.26.1), Ensembl used the UCSC draft sequence assemblies as its starting point, but it is now based upon NCBI assemblies. The Ensembl analysis pipeline consists of a rule-based system designed to mimic decisions made by a human annotator. The idea is to identify \u2018confirmed\u2019 genes that are computationally predicted (by the GENSCAN gene prediction program) and also supported by a significant BLAST match to one or more expressed sequences or proteins. Ensembl also identifies the positions of known human genes from public sequence database entries, usually using GENEWISE to predict their exon structures.",
+      "Data retrieval is extremely well catered for in Ensembl, with text searches of all database entries, BLAST searches of all sequences archived, and the availability of bulk downloads of all Ensembl data and even software source code. Ensembl annotation can also be viewed interactively on one\u2019s local machine with the Apollo viewer (Lewis et al. , 2002; http://www.fruitfly.org/annot/apollo/). 4.4.2 The UCSC Human Genome Browser The UCSC Human Genome Browser (UCSC) bears many similarities to Ensembl; it, too, provides annotation of the NCBI assemblies, and it displays a similar array of features, including confirmed genes from Ensembl.",
+      "But the four sites are not equivalent; there are important distinctions between them in terms of the data analysed, the analyses carried out and the way the results are displayed. 4.4.1 Ensembl Ensembl is a joint project between the EBI (http://www.ebi.ac.uk/) and the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk/). The Ensembl database (Hubbard et al. , 2002; http://www.ensembl.org/), launched in 1999, was the first to provide a window on the draft genome, curating the results of a series of computational analyses.",
+      "Until January 2002 (Release 3.26.1), Ensembl used the UCSC draft sequence assemblies as its starting point, but it is now based upon NCBI assemblies. The Ensembl analysis pipeline consists of a rule-based system designed to mimic decisions made by a human annotator. The idea is to identify \u2018confirmed\u2019 genes that are computationally predicted (by the GENSCAN gene prediction program) and also supported by a significant BLAST match to one or more expressed sequences or proteins. Ensembl also identifies the positions of known human genes from public sequence database entries, usually using GENEWISE to predict their exon structures.",
+      "Data retrieval is extremely well catered for in Ensembl, with text searches of all database entries, BLAST searches of all sequences archived, and the availability of bulk downloads of all Ensembl data and even software source code. Ensembl annotation can also be viewed interactively on one\u2019s local machine with the Apollo viewer (Lewis et al. , 2002; http://www.fruitfly.org/annot/apollo/). 4.4.2 The UCSC Human Genome Browser The UCSC Human Genome Browser (UCSC) bears many similarities to Ensembl; it, too, provides annotation of the NCBI assemblies, and it displays a similar array of features, including confirmed genes from Ensembl.",
+      "Ensembl provides a DAS reference server giving access to a wide range of specialist annotations of the human genome (for more detail, see http://www.ensembl.org/das/). Data mining The ability to query very large databases in order to satisfy a hypothesis (\u2018top-down\u2019 data mining), or to interrogate a database in order to generate new hypotheses based on rigorous statistical correlations (\u2018bottom-up\u2019 data mining). Domain (protein) A region of special biological interest within a single protein sequence.",
+      "But the four sites are not equivalent; there are important distinctions between them in terms of the data analysed, the analyses carried out and the way the results are displayed. 4.4.1 Ensembl Ensembl is a joint project between the EBI (http://www.ebi.ac.uk/) and the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk/). The Ensembl database (Hubbard et al. , 2002; http://www.ensembl.org/), launched in 1999, was the first to provide a window on the draft genome, curating the results of a series of computational analyses.",
+      "Until January 2002 (Release 3.26.1), Ensembl used the UCSC draft sequence assemblies as its starting point, but it is now based upon NCBI assemblies. The Ensembl analysis pipeline consists of a rule-based system designed to mimic decisions made by a human annotator. The idea is to identify \u2018confirmed\u2019 genes that are computationally predicted (by the GENSCAN gene prediction program) and also supported by a significant BLAST match to one or more expressed sequences or proteins. Ensembl also identifies the positions of known human genes from public sequence database entries, usually using GENEWISE to predict their exon structures.",
+      "Data retrieval is extremely well catered for in Ensembl, with text searches of all database entries, BLAST searches of all sequences archived, and the availability of bulk downloads of all Ensembl data and even software source code. Ensembl annotation can also be viewed interactively on one\u2019s local machine with the Apollo viewer (Lewis et al. , 2002; http://www.fruitfly.org/annot/apollo/). 4.4.2 The UCSC Human Genome Browser The UCSC Human Genome Browser (UCSC) bears many similarities to Ensembl; it, too, provides annotation of the NCBI assemblies, and it displays a similar array of features, including confirmed genes from Ensembl.",
+      "Ensembl provides a DAS reference server giving access to a wide range of specialist annotations of the human genome (for more detail, see http://www.ensembl.org/das/). Data mining The ability to query very large databases in order to satisfy a hypothesis (\u2018top-down\u2019 data mining), or to interrogate a database in order to generate new hypotheses based on rigorous statistical correlations (\u2018bottom-up\u2019 data mining). Domain (protein) A region of special biological interest within a single protein sequence.",
+      "Ensembl  Ensembl is a publicly available web resource that contains automatically annotated genomes. It is integrated with other available biological databases like JASPAR for binding motifs. It is a much larger web resource than T1Dbase, and contains general information about the human genome including variants. These include SNPs, insertions, deletions and somatic mutations (alterations in DNA that occur after conception, meaning that they are not inherited) for several species. Data from Ensembl can be accessed in a number of ways. The names of all the SNPs that occur in the T1D susceptibility regions can be collected from Ensembl using the Biomart tool (Kinsella et al., 2011). To achieve this, the coordinates of the T1D regions obtained from T1Dbase are uploaded to the biomart query page which allows one to search the genome browser and retrieve data like the names, chromosomal positions, and genic positions (referred to as \"consequence to transcript\", in Ensembl) of the SNPs. The SNP genic positions tell if a SNP is located within a gene, adjacent to a gene or whether they occur in inter-genic positions between gene coding regions, as well as the particular genes in which they are located.",
+      "Advantages of Ensembl:  There are a number of advantages to using Ensembl. (i) It is a larger web resource than T1Dbase and integrates data from a wide range of biological research sources into its database. Therefore, available information is quite comprehensive. (ii) Genic positions for 99% of the variants obtained from T1Dbase could be retrieved. (iii) Ensembl contains quality checks for genetic variants in its variation pipeline. A variant is flagged as failed if certain quality criteria are not met, for instance if none of the variant alleles match the reference allele of the variant. Generally, Ensembl was found to give more detailed information regarding the genic positions of variants compared to T1Dbase.",
+      "Information about genes, including gene names, chromosomal coordinates, biotype (coding or non-coding), and number of splice variants, can also be retrieved from Ensembl.",
+      "doi:10.1093/nar/gkp858 Cunningham F, Amode MR, Barrell D, Beal K, Billis K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fitzgerald S, Gil L, Giron CG, Gordon L, Hourlier T, Hunt SE, Janacek SH, Johnson N, Juettemann T, Kahari AK, Keenan S, Martin FJ, Maurel T, McLaren W, Murphy DN, Nag R, Overduin B, Parker A, Patricio M, Perry E, Pignatelli M, Riat HS, Sheppard D, Taylor K, Thormann A, Vullo A, Wilder SP, Zadissa A, Aken BL, Birney E, Harrow J, Kinsella R, Muffato M, Ruffier M, Searle SM, Spudich G, Trevanion SJ, Yates A, Zerbino DR, Flicek P (2015) Ensembl 2015.",
+      "But the four sites are not equivalent; there are important distinctions between them in terms of the data analysed, the analyses carried out and the way the results are displayed. 4.4.1 Ensembl Ensembl is a joint project between the EBI (http://www.ebi.ac.uk/) and the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk/). The Ensembl database (Hubbard et al. , 2002; http://www.ensembl.org/), launched in 1999, was the first to provide a window on the draft genome, curating the results of a series of computational analyses.",
+      "Until January 2002 (Release 3.26.1), Ensembl used the UCSC draft sequence assemblies as its starting point, but it is now based upon NCBI assemblies. The Ensembl analysis pipeline consists of a rule-based system designed to mimic decisions made by a human annotator. The idea is to identify \u2018confirmed\u2019 genes that are computationally predicted (by the GENSCAN gene prediction program) and also supported by a significant BLAST match to one or more expressed sequences or proteins. Ensembl also identifies the positions of known human genes from public sequence database entries, usually using GENEWISE to predict their exon structures.",
+      "Data retrieval is extremely well catered for in Ensembl, with text searches of all database entries, BLAST searches of all sequences archived, and the availability of bulk downloads of all Ensembl data and even software source code. Ensembl annotation can also be viewed interactively on one\u2019s local machine with the Apollo viewer (Lewis et al. , 2002; http://www.fruitfly.org/annot/apollo/). 4.4.2 The UCSC Human Genome Browser The UCSC Human Genome Browser (UCSC) bears many similarities to Ensembl; it, too, provides annotation of the NCBI assemblies, and it displays a similar array of features, including confirmed genes from Ensembl.",
+      "Ensembl provides a DAS reference server giving access to a wide range of specialist annotations of the human genome (for more detail, see http://www.ensembl.org/das/). Data mining The ability to query very large databases in order to satisfy a hypothesis (\u2018top-down\u2019 data mining), or to interrogate a database in order to generate new hypotheses based on rigorous statistical correlations (\u2018bottom-up\u2019 data mining). Domain (protein) A region of special biological interest within a single protein sequence."
+    ]
+  ]
+}
\ No newline at end of file
diff --git a/gnqa/paper1_eval/src/data/datasets/old/experts_suga1_dataset.json b/gnqa/paper1_eval/src/data/datasets/old/experts_suga1_dataset.json
new file mode 100644
index 00000000..b9cfeb6d
--- /dev/null
+++ b/gnqa/paper1_eval/src/data/datasets/old/experts_suga1_dataset.json
@@ -0,0 +1,152 @@
+{
+  "question": [
+    "What genes are associated with diabetes?",
+    "Nutrition is a factor in diabetes. Construct an abstract about how genomics can be used to better understand the nutritional factors of diabetes.",
+    "Nutrition is a factor in diabetes. How can genomics be used to better understand the nutritional factors of diabetes?",
+    "Are any of the genes SH2B3, IFIH1, or ERBB3 related to diabetes?",
+    "How does genetics influence the emergence of diabetes?",
+    "How many types of diabetes exist?"
+  ],
+  "answer": [
+    "The genes associated with diabetes mentioned in the text include PPARG, KCNJ11, TCF7L2, SLC30A8, HHEX, CDKN2A, IGF2BP2, CDKAL1, IRS1, CCR5, FTO, NOTCH2, WFS1, JAZF1, ADIPOQ, AHSG, CAPN10, ENPP1, PPARGC1A, SREBF1, PDX1, PFAS, GCK, HNF4A, LOC387761, KCNQ1, MTNR1B, GCKR, BACH2, CTNNB1, CREB5, PARD3B, PARD6G, PRKCZ, EBF1, GIGYF1, ADCY5, PROX1, DGKB-TMEM195, CRY2, FADS1, GLIS3, and C2CD4B.",
+    "Genomics provides a comprehensive understanding of the genetic factors influencing the development and progression of diabetes. Nutritional genomics focuses on the interaction between genes and dietary patterns, which plays a crucial role in diabetes. By studying gene-nutrient interactions and genetic variations, researchers can develop tailored diets that maximize the use of nutrients and other functional ingredients present in food, aiding in the prevention and delay of diabetes and its complications. Furthermore, genomics can help identify how an individual's genetic makeup can affect nutrient metabolism and response to nutrient intake, potentially leading to diabetes. Thus, genomics offers a promising approach to understanding the nutritional factors of diabetes and developing personalized dietary interventions.",
+    "Genomics can be used to understand the nutritional factors of diabetes by studying nutrient-gene interactions and how an individual's genetic makeup can affect nutrient metabolism and response to nutrient intake. This approach can help develop tailored diets that maximize the use of nutrients and other functional ingredients present in food, aiding in the prevention and delay of diabetes and its complications. Additionally, genomics can identify gene variants that interact with specific nutrients, providing insights into how diet and genetics interact to influence diabetes risk.",
+    "Yes, the gene IFIH1 has been identified as contributing to susceptibility to type 1 diabetes. However, the text does not mention any association between the genes SH2B3 or ERBB3 and diabetes.",
+    "Genetics plays a significant role in the emergence of diabetes. In type 1 diabetes, gene variants in the human leukocyte antigen (HLA) locus contribute to 50-60% of the genetic risk by affecting protein binding and antigen presentation to T cells. Other gene variants that modulate immune regulation, viral responses, environmental signals, and endocrine function also contribute to the disease. The risk for type 1 diabetes is strongly influenced by multiple genetic loci. In type 2 diabetes, many common variants, largely with small effect size, contribute to overall disease risk. Monogenic forms of diabetes, such as MODY, result from mutations in a single gene. Family history also plays a role, suggesting that genetics conveys more than heritable information; it probably includes non-genetic familial behaviors and norms.",
+    "There are primarily four types of diabetes: Type 1, Type 2, gestational diabetes, and diabetes due to specific causes. However, there are also subtypes within these broad categories, such as latent autoimmune diabetes of adults (LADA) and maturity onset diabetes of the young (MODY)."
+  ],
+  "contexts": [
+    [
+      "To see which other significant genes were likely to have a role in diabetes we looked at all variant sets with a significant glucose, HbA1c, or T2D association and examined whether they had associations with additional diabetes traits (p \u2264 0.0016, correcting for 32 sets tested). Damaging missense variants in PDX1 and PFAS, which significantly associated with HbA1c levels in our primary analysis, associated with T2D diagnosis using this threshold (Table 3 and Supplementary Table 14).",
+      "Identification of genes with a biological role in diabetes. Variants in two genes, GCK and GIGYF1, significantly associated with glucose, HbA1c and T2D diagnosis, strongly suggesting a biological role in diabetes; GCK is involved in Mendelian forms of diabetes while GIGYF1 has not previously been implicated by genetics in the disease. Both GCK and GIGYF1 are located on chromosome 7 but are 56 Mb apart, strongly suggesting that these signals are independent; this independence was confirmed by conditional analysis (Supplementary Table 13). Two additional variant sets, HNF1A pLOF and TNRC6B pLOF, had genome-wide associations with both T2D diagnosis and HbA1c levels while G6PC2 damaging missense variants associated with decreased levels of both glucose and HbA1c but not T2D diagnosis (Table 3).",
+      "One obvious locus to consider is TCF7L2 in the context of type 2 diabetes. Common genetic variation located within the gene encoding transcription factor 7 like 2 (TCF7L2) has been consistently reported to be strongly associated with the disease. Such reports range from 2006, when we first published the association [3], to the recent transethnic meta-analysis GWAS of type 2 diabetes [4].",
+      "Testing of these loci for association with T2D as a dichotomous trait in up to 40,655 cases and 87,022 nondiabetic controls demonstrated that the fasting glucose-raising alleles at seven loci (in or near ADCY5, PROX1, GCK, GCKR and DGKB-TMEM195 and the known T2D genes TCF7L2 and SLC30A8) are robustly associated (P < 5 \u00d7 10\u22128) with increased risk of T2D (Table 2). The association of a highly correlated SNP in ADCY5 with T2D in partially overlapping samples is reported by our companion manuscript 29. We found less significant T2D associations (P < 5 \u00d7 10\u22123) for variants in or near CRY2, FADS1, GLIS3 and C2CD4B (Table 2). These data clearly show that loci with similar fasting glucose effect sizes may have very different T2D risk effects (see, for example, ADCY5 and MADD in Table 2).",
+      "Among the confirmed and potential type 2 diabetes risk genes described in Tables 1 and 2, eight genes influence whole-body or peripheral insulin sensitivity: ADIPOQ (47, 52, 250-257), AHSG (75, 258), CAPN10 (259-264), ENPP1 (265-271), PPARG (272-283), PPARGC1A (284, 285), SREBF1 (65), and TCF7L2 (133, 151, 286, 287).",
+      "Despite identification of many putative causative genetic variants, few have generated credible susceptibility variants for type 2 diabetes. Indeed, the most important finding using linkage studies is the discovery that the alteration of TCF7L2 (TCF-4) gene expression or function (33) disrupts pancreatic islet function and results in enhanced risk of type 2 diabetes. Candidate gene studies have also reported many type 2 diabetes-associated loci and the coding variants in the nuclear receptor peroxisome proliferator-activated receptor-\u03b3 (34), the potassium channel KCNJ11 (34), WFS1 (35), and HNF1B (TCF2) (36) are among the few that have been replicated (Table 2). Recently, there have been great advances in the analysis of associated variants in GWA and replication studies due to high-throughput genotyping technologies, the International HapMap Project, and the Human Genome Project. Type 2 susceptibility loci such as JAZF1, CDC123-CAMK1D, TSPAN8-LGR5, THADA, ADAMTS9, NOTCH2, and ADCY5 (37,38) are among some of the established loci (Table 2). CDKN2A/B, CDKAL1, SLC30A8, IGF2BP2, HHEX/IDE, and FTO are other established susceptibility loci for diabetes (Table 2) (34,39,40). GWA studies have also identified the potassium voltage-gated channel KCNQ1 (32) as an associated gene variant for diabetes. A recent GWA study reporting a genetic variant with a strong association with insulin resistance, hyperinsulinemia, and type 2 diabetes, located adjacent to the insulin receptor substrate 1 (IRS1) gene, is the C allele of rs2943641 (41). Interestingly, the parental origin of the single nucleotide polymorphism is of importance because the allele that confers risk when paternally inherited is protected when maternally transmitted. GWA studies for glycemic traits have identified loci such as MTNR1B (42), GCK (glucokinase) (42), and GCKR (glucokinase receptor) (42); however, further investigation of genetic loci on glucose homeostasis and their impact on type 2 diabetes is needed. Indeed, a recent study by Soranzo et al. (42) using GWA studies identified ten genetic loci associated with HbA1c. Genetic factors affecting expression, turnover, and abnormal glycation of hemoglobin may be associated with changes in levels of HbA1c.",
+      "Genome-wide association studies (GWAS) have identified several type 2 diabetes mellitus (T2DM) susceptibility loci including CDKAL1, CDKN2B, IGF2BP2, HHEX, SLC30A8, PKN2, LOC387761 (1-5), and KCNQ1, which was recently identified by similar GWAS approach in two independent Japanese samples (6,7). Although these associations have been well replicated in Japanese populations (8), the role of these loci in other East Asian populations remains less clear. For example, a study in China by Wu et al. (9) did not find significant associations between single-nucleotide polymorphisms (SNPs) in IGF2BP2 and SLC30A8 with T2DM, whereas an association between SNPs at the HHEX locus and T2DM was reported among Chinese living in Shanghai, but not among Chinese in Beijing. Another study in Hong Kong Chinese (10) also did not find an association with SNPs at the IGF2BP2 locus; however, they reported an association between T2DM with SNPs at the HHEX and SLC30A8 loci.",
+      "In studies where overt T2D has been the phenotype the majority of associated polymorphisms have encoded proteins known to be involved in \u03b2-cell metabolism; for example TCF7L2, KCNJ11 and HHEX have shown robust association [170,171].This suggests that these genes could prove useful in predicting \u03b2-cell preservation during the course of T2D.The glucokinase gene (GCK) coding for the initial glucose-sensing step in the \u03b2-cell can have activating mutations causing hypoglycemia that might provide structural and functional models leading to drug targets for treating T2D [172].In the GoDARTs study, investigators examined the medication response of metformin and sulphonylurea based on the TCF7L2 variants mainly affecting the \u03b2-cell.The carriers of the at risk 'T' allele responded less well to sulphonylurea therapy than metformin [173].Also it is of significant public health interest that in the Diabetes Prevention Program, lifestyle modifications were shown to reduce the risk of diabetes conferred by risk variants of TCF7L2 at rs7093146, and in placebo participants who carried the homozygous risk genotype (TT), there was 80% higher risk for developing diabetes compared to the lifestyle intervention group carrying the same risk genotypes [35].These findings could herald significant future progress in the field of T2D pharmacogenomics, possibly leading to the development and use of agents tailored on the basis of genotype.",
+      "Despite sharing only 9 loci (among 26 and 17 total in the two analyses, respectively), the separate analyses both identified genes involved in diabetes-related biological functions, including \"glucose homeostasis,\" \"pancreas development\" and \"insulin secretion\" (Supplementary Tables 3 and 5).Three of the top eleven scoring genes in our independent replication analysis have verified causal links to T2D, as annotated in the OMIM 41 .These include genes encoding transcription factors TCF7L2 (TCF4), which has extensive evidence of being causal in T2D 61,62 , and HNF1B, which is a known cause of maturity onset diabetes of the young 63 .Other high-ranking candidate genes have been identified as therapeutic targets in T2D (for example, CTBP1 (ref.64) and LEP 65 ), and the high-scoring gene HHEX has recently been shown to play a key role in islet function 66 .",
+      "Similar findings to AMD are now unfolding with type 2 DM. Grant et al. (24) first reported on a variant of the gene TCF7L2, which has been linked to reduced beta cell function and poor insulin response to oral glucose loads (51). Since its first discovery, this gene has been widely confirmed in independent studies as a pivotal susceptibility marker for type 2 DM (23, 25-28, 40). Recently, 6 genome-wide SNP association studies have identified and replicated in separate stages several additional novel genes conferring susceptibility to type 2 DM (23, 25-28, 40) (Table 2). Interestingly, these loci primarily include genes involved in pancreatic beta cell development and function as opposed to insulin resistance, the current accepted mechanism for type 2 DM. This development casts doubt on our traditional pathophysiological modeling of the type 2 diabetic patient and underscores the need for genomic studies to further define pathobiological processes of complex traits.",
+      "Of the 16 loci that have been associated with type 2 diabetes previously [8-15], we showed that 11 (TCF7L2, PPARG, FTO, KCNJ11, NOTCH2, WFS1, CDKAL1, IGF2BP2, SLC30A8, JAZF1, and HHEX) were associated with an enhanced risk of future diabetes. Many of the variants that we genotyped appear to influence beta-cell function, possibly through effects on proliferation, regeneration, and apoptosis. There was a time-dependent increase in the BMI and a decrease in insulin sensitivity in the subjects from the Botnia study, an increase in insulin resistance that was reflected by an increase in insulin secretion. However, this increase was inadequate to compensate for the increase in insulin resistance in carriers with a high genetic risk, which resulted in a markedly impaired disposition index. Only variants in FTO were associated with an increased BMI. Both FTO and PPARG together with TCF7L2 and KCNJ11 predicted transition from impaired fasting glucose levels or impaired glucose tolerance to manifest diabetes, which suggests that a combination of increased obesity and insulin resistance with a deterioration in beta-cell function contribute to the manifestation of diabetes in these subjects. Collectively, our findings emphasize the critical role of inherited defects in beta-cell function for the development of type 2 diabetes.",
+      "Type 2 Diabetes  Common variants in 11 genes were significantly associated with the risk of future type 2 diabetes in the MPP cohort, including TCF7L2 (odds ratio, 1.30; P = 9.5\u00d710 \u221213 ), PPARG (odds ratio, 1.20; P = 4.0\u00d710 \u22124 ), FTO (odds ratio, 1.14; P = 9.2\u00d710 \u22125 ), KCNJ11 (odds ratio, 1.13; P = 3.6\u00d710 \u22124 ), NOTCH2 (odds ratio, 1.13; P = 0.02), WFS1 (odds ratio, 1.12; P = 0.001), CDKAL1 (odds ratio, 1.11; P = 0.004), IGF2BP2 (odds ratio, 1.10; P = 0.008), SLC30A8 (odds ratio, 1.10; P = 0.008), JAZF1 (odds ratio, 1.08; P = 0.03), and HHEX (odds ratio, 1.07; P = 0.03) (Table 2).Although these findings could not be fully replicated in the smaller Botnia study, there was little heterogeneity between the studies with respect to the risk conferred by different genotypes.",
+      "To date, more than 70 genes have been identified as involved in T2DM, primarily by association analysis [34].In addition, via GWAS arrays, more than 100 SNPs have been identified for T2DM [35].From the 50 novel loci associated with T2DM previously identified, more than 40 loci have been associated with T2DM-related traits, including fasting proinsulin, insulin and glucose (Table 1) [36][37][38][39].However, for T2DM-related traits, such as the HOMA index or pancreatic \u03b2 cell function, there are virtually no published data examining the relationship between these traits or the genotype and environment interactions.Clinical investigations of some loci have suggested that the genetic components of T2DM risk act preferentially through \u03b2 cell function [40].Among all 40 loci associated with T2DM-related traits, only transcription factor-7-like 2 (TCF7L2) was shown to clearly contribute to T2DM risk [41].Several studies in white European [42], Indian [43], Japanese [44], Mexican American [45] and West African [46] individuals have shown a strong association between TCF7L2 and T2DM.It is also noteworthy that these populations represent the major racial groups with a high prevalence of T2DM.In all populations, TCF7L2 showed a strong association, with the odds of developing T2DM increased by 30%-50% for each allele inherited.This finding indicates an approximately double odds ratio compared to most other diabetes susceptibility polymorphisms.TCF7L2 is a transcription factor involved in the Wnt signaling pathway that is ubiquitously expressed, and it has been observed that TCF7L2 risk alleles result in the overexpression of TCF7L2 in pancreatic \u03b2 cells.This overexpression causes reduced nutrient-induced insulin secretion, which results in a direct predisposition to T2DM as well as an indirect predisposition via an increase in hepatic glucose production [47].",
+      "Most Relevant T2DM Susceptibility Genes  Gene and environment interaction studies have shown a nice association between variants in peroxisome proliferator-activated receptor gamma (PPARG), TCF7L2 and fat mass and obesity-associated protein (FTO) genes, a Western dietary pattern and T2DM.",
+      "One of these genes associated with type 2 diabetes is the insulin receptor substrate 1 (IRS1, OMIM association number, 147545) (Alharbi, Khan, Abotalib, & Al-Hakeem, 2014;Alharbi, Khan, Munshi et al., 2014;Brender et al., 2013;Brunetti, Chiefari, & Foti, 2014) and another is the C-C motif chemokine receptor5(CCR5, OMIM association number, 601373) (Balistreri et al., 2007;Mokubo et al., 2006;Muntinghe et al., 2009).",
+      "Genes boosted in type 2 diabetes  Before the Wellcome Trust study, PPARG, KCNJ11, and TCF7L2 had all been identified as genes involved in type 2 diabetes through genome-wide association studies and replicated in follow-up studies (for review, see Bonnefond et al. 2010).The strongest candidate gene for type 2 diabetes, TCF7L2, was also the strongest signal seen in the Wellcome trust study, although the others were not so strong.However, the exact mechanism by which TCF7L2 acts was not entirely clear.In our analysis (Fig. 5), we find it directly connected to the b-catenin/WNT signaling pathway by its functional connection to CTNNB1, as well as to BACH2, a gene that has been repeatedly implicated in type 1 diabetes (e.g., Cooper et al. 2008;Madu et al. 2009), but which has not yet been linked to type 2 diabetes.BACH2 is among the genes most strongly boosted by network linkages, deriving additional signal from CREB5 and PARD3B, which both score highly in the GWAS data.PARD6G, PARD3B, and CDC42 are also emphasized by the method.Notably, these genes form a complex with PRKCZ (Koh et al. 2008), a variant of which correlates with type 2 diabetes in Han Chinese (Qin et al. 2008).EBF1, a known regulator of adipocyte differentiation (Akerblad et al. 2005) is also strongly boosted by the network, supporting a possible role in type 2 diabetes.",
+      "RESULTS-We confirmed the associations of TCF7L2, SLC30A8, HHEX, CDKAL1, CDKN2A/CDKN2B, IGF2BP2, and FTO with risk for type 2 diabetes, with odds ratios ranging from 1.13 to 1.35 (1.3 \u00d7 10\u221212 < P unadjusted < 0.016). In addition, the A allele of rs8050136 at FTO was associated with increased BMI in the control subjects (P unadjusted = 0.008). However, we did not observe significant association of any genetic variants with surrogate measures of insulin secretion or insulin sensitivity indexes in a subset of 2,662 control subjects. Compared with subjects carrying zero, one, or two risk alleles, each additional risk allele was associated with 17% increased risk, and there was an up to 3.3-fold increased risk for type 2 diabetes in those carrying eight or more risk alleles. Despite most of the effect sizes being similar between Asians and Europeans in the meta-analyses, the ethnic differences in risk allele frequencies in most of these genes lead to variable attributable risks in these two populations.",
+      "OBJECTIVE-Recent genome-wide association studies have identified six novel genes for type 2 diabetes and obesity and confirmed TCF7L2 as the major type 2 diabetes gene to date in Europeans. However, the implications of these genes in Asians are unclear. RESEARCH DESIGN AND METHODS-We studied 13 associated single nucleotide polymorphisms from these genes in 3,041 patients with type 2 diabetes and 3,678 control subjects of Asian ancestry from Hong Kong and Korea. RESULTS-We confirmed the associations of TCF7L2, SLC30A8, HHEX, CDKAL1, CDKN2A/CDKN2B, IGF2BP2, and FTO with risk for type 2 diabetes, with odds ratios ranging from 1.13 to 1.35 (1.3 \u00d7 10\u221212 < P unadjusted < 0.016). In addition, the A allele of rs8050136 at FTO was associated with increased BMI in the control subjects (P unadjusted = 0.008). However, we did not observe significant association of any genetic variants with surrogate measures of insulin secretion or insulin sensitivity indexes in a subset of 2,662 control subjects. Compared with subjects carrying zero, one, or two risk alleles, each additional risk allele was associated with 17% increased risk, and there was an up to 3.3-fold increased risk for type 2 diabetes in those carrying eight or more risk alleles. Despite most of the effect sizes being similar between Asians and Europeans in the meta-analyses, the ethnic differences in risk allele frequencies in most of these genes lead to variable attributable risks in these two populations. CONCLUSIONS-Our findings support the important but differential contribution of these genetic variants to type 2 diabetes and obesity in Asians compared with Europeans. Diabetes 57: 2226-2233, 2008. Type 2 diabetes is a major health problem affecting more than 170 million people worldwide. In the next 20 years, Asia will be hit hardest, with the diabetic populations in India and China more than doubling (1). Type 2 diabetes is characterized by the presence of insulin resistance and pancreatic \u03b2-cell dysfunction, resulting from the interaction of genetic and environmental factors. Until recently, few genes identified through linkage scans or the candidate gene approach have been confirmed to be associated with type 2 diabetes (e.g., PPARG, KCNJ11, CAPN10, and TCF7L2). Under the common variant-common disease hypothesis, several genome-wide association (GWA) studies on type 2 diabetes have been conducted in large-scale case-control samples. Six novel genes (SLC30A8, HHEX, CDKAL1, CDKN2A and CDKN2B, IGF2BP2, and FTO) with modest effect for type 2 diabetes (odds ratio [OR] 1.14-1.20) had been reproducibly demonstrated in multiple populations of European ancestry. Moreover, TCF7L2 was shown to have the largest effect for type 2 diabetes (1.37) in the European populations to date (2-8). Although many of these genes may be implicated in the insulin production/secretion pathway (TCF7L2, SLC30A8, HHEX, CDKAL1, CDKN2A/B, and IGF2BP2) (6, 9-11), FTO is associated with type 2 diabetes through its regulation of adiposity (8,12,13). Moreover, two adjacent regions near CDKN2A/B are associated with type 2 diabetes and cardiovascular diseases risks, respectively (7, 14-16). Despite the consistent associations among Europeans, the contributions of these genetic variants in other ethnic groups are less clear. Given the differences in environmental factors (e.g., lifestyle), risk factor profiles (body composition and insulin secretion/resistance patterns), and genetic background (linkage disequilibrium pattern and risk allele frequencies) between Europeans and Asians, it is important to understand the role of these genes in Asians. A recent case-control study in 1,728 Japanese subjects revealed nominal association to type 2 diabetes for variants at the SLC30A8, HHEX, CDKAL1, CDKN2B, and FTO genes but not IGF2BP2 (17). In the present large-scale case-control replication study of 6,719 Asians, we aimed to test for the association of six novel genes from GWA studies and TCF7L2, which had the largest effect in Europeans, and their joint effects on type 2 diabetes risk and metabolic traits. RESEARCH DESIGN AND METHODS-All subjects were recruited from Hong Kong and Korea and of Asian ancestry. The subjects in the Hong Kong case-control study were of southern Han Chinese ancestry residing in Hong Kong. Participants for the case cohort consisting of 1,481 subjects with type 2 diabetes were selected from two",
+      "OBJECTIVE-Common variants in PPARG, KCNJ11, TCF7L2, SLC30A8, HHEX, CDKN2A, IGF2BP2, and CDKAL1 genes have been shown to be associated with type 2 diabetes in European populations by genome-wide association studies. We have studied the association of common variants in these eight genes with type 2 diabetes and related traits in Indians by combining the data from two independent case-control studies.",
+      "Introduction  Many genes have been evaluated as candidates for T2D susceptibility. However, only variants in the TCF7L2, PPARG, KCNJ11 and HNF4A genes have been extensively replicated in populations around the world, showing their indisputable association with T2D risk (Zeggini 2007). In the particular case of the HNF4A gene, it has been implicated in maturity-onset diabetes of the young type 1 (MODY 1) (Mitchell and Frayling 2002; Zhu et al. 2003). HNF4A is a member of the nuclear receptor super-family that plays a critical role in embryogenesis and metabolism, by regulating gene expression in pancreatic beta cells, liver and other tissues. The HNF4A gene is localized to chromosome 20q13, a region that has demonstrated evidence for linkage with T2D (Sladek et al. 1990; Ghosh et al. 1999). Several genetic studies, mainly in Caucasian and Asian populations, have provided evidence for the association of the variants in HNF4A with T2D (Ghosh et al. 1999; Silander et al. 2004; Winckler et al. 2005)."
+    ],
+    [
+      "Researchers are expanding our understanding of genetic risk factors for diabetes through ongoing discoveries.Genetic variants associated with increased susceptibility to type 2 diabetes, a disease that affects more than 200 million people worldwide, have been identified (NHGRI & NIDDK, 2007).Such discoveries accelerate efforts to understand genetic contributions to chronic illness, as well as facilitate greater investigation of how these genetic factors interact with each other and with lifestyle factors.Ultimately, once the association of these variants with diabetes are confirmed, genetic tests may be utilized to identify (even before escalating blood sugars) those individuals, like Vanessa, who may be able to delay or prevent diabetes with healthy lifestyle decisions and behaviors.Information to assist nurses in this challenge is available in a toolkit \"Your Game Plan for Preventing Type 2 Diabetes\" (Your Game Plan, n.d.).Would you have known whether or not genetic testing was available for Vanessa?If you had said no to this question but could have explained the progress currently being made in understanding diabetes, Vanessa would have had access to the best care possible today.",
+      "Genetic factors for many decades have been known to play a critical role in the etiology of diabetes, but it has been only recently that the specific genes have been identified. The identification of the underlying molecular genetics opens the possibility for understanding the genetic architecture of clinically defined categories of diabetes, new biological insights, new clinical insights, and new clinical applications. This article examines the new insights that have arisen from defining the etiological genes in monogenic diabetes and the predisposing polymorphisms in type 2 diabetes.",
+      "Genomics has contributed to a better understanding of many disorders including diabetes. The following article looks at the ethical, social and legal consequences of genomic medicine and predictive genetic testing for diabetes. This is currently a field in its nascent stage and developing rapidly all over the world. The various ethical facets of genomic medicine in diabetes like its effects on patient physician relationship, risk communication, genetic counseling and familial factors are explored and elucidated from a clinical, ethical, social and legal perspective.",
+      "Diabetes is a genetically complex multifactorial disease that requires sophisticated consideration of multigenic and phenotypic influences. As well as standard nonparametric methods, we used novel approaches to evaluate and identify locus heterogeneity. It has also proved productive to consider phenotypes such as age at type 2 diabetes onset and obesity, which may define a more homogeneous subgroup of families. A genome-wide scan of 247 African-American families has identified a locus on chromosome 6q and a region of 7p that apparently interacts with early-onset type 2 diabetes and low BMI, as target regions in the search for African-American type 2 diabetes susceptibility genes.",
+      "Genetic factors are known to play a role in T2D and an understanding of the genetic basis of T2D could lead to the development of new treatments (Frayling, 2007a,b;Frayling & Mccarthy, 2007;Frayling, 2008).With the increased prevalence of diabetes worldwide, the need for intensive research is of high priority.Sequencing of the human genome and development of a set of powerful tools has made it possible to find the genetic contributions to common complex diseases (Donnelly, 2011).Genome-wide association studies (GWAS) have been used to search for genetic risk factors for complex disease (Hindorff, Junkins et al., 2009;Hindorff, Sethupathy et al., 2009).Used in combination with the scaffold data of the human genome courtesy of the HUGO Project (2003) and the International HapMap Project (Thorisson et al., 2005), it is now possible to analyse the whole genome to identify genetic variants that contribute to common disease in a fast and efficient manner.",
+      "In conclusion, genome-wide studies have added valuable scientific data to our repertoire of diabetes knowledge. However, there have been few genomic nuggets that enable a more robust prediction of diabetes than is achieved by using common environmental risk factors and none that clarify the peculiar ethnic proclivities of type 2 diabetes. The latter realization ought to temper enthusiasm for the indiscriminate use of genetic testing for diabetes.",
+      "To date, studies of diabetes have played a major role in shaping thinking about the genetic analysis of complex diseases. Based on trends in genomic information and technology, combined with the growing public health importance of diabetes, diabetes will likely continue to be an important arena in which methods will be pioneered and lessons learned. It is with great enthusiasm that we look forward to this effort, and with avid curiosity we await to see whether the lessons of today will be supported by the data of tomorrow.",
+      "In recent years tremendous changes had occurred in the field of molecular genetics and personalized medicine especially on exploring novel genetic factors associated with complex diseases like T2D with the advancement of new and improved genetic techniques including the next generation sequencing (NGS).In this review, we summarize recent developments from studies on the genetic factors associated with the development of T2D in the Arab world published between 2015 and 2018, which were based on the latest available genetic technologies.Few such studies have been conducted in this region of the world.Therefore, our study will provide valuable contributions to advanced genetic research and a personalized approach to diabetes management.",
+      "Nonetheless, \"evidence\" for the genetics of diabetes risk is mounting, often at the expense of understanding the social context and determinants of the disease.Biogenetic views tend to trump sociological views in the diabetes research imaginary of consortium members.However, the genetic epidemiologists who make up part of the diabetes consortium are not ignorant of the effects of proper diet and adequate exercise. \"Take away the television and the automobile and diabetes would all but disappear,\" quipped the head of one lab.Neither are researchers unsympathetic to those who suffer from social inequality in the United States.Their career and intellectual interests lie in genetic explanations of diabetes, which, as I aim to show in this discussion, involves folding political and economic social relationships into biomedical discourse.In fact, the case of diabetes genetic epidemiology illustrates how, in spite of the sympathies of diabetes scientists, arrangements of racial inequality in the United States find their way into diabetes research publications and drug company promotional campaigns.To illustrate this phenomenon further, I present two tales from the field, one dealing with the naming of a publication article, the other with the marketing of a diabetes drug.",
+      "Discussion  Our study provides insight into the relative importance of clinical risk factors and those that are related to a panel of DNA variants associated with type 2 diabetes.Obesity was a strong risk factor for future diabetes, a risk that almost doubled in subjects with a family history of diabetes.However, the addition of data from genotyping of the known DNA variants to clinical risk factors (including a family history of diabetes) had a minimal, albeit statistically significant, effect on the prediction of future type 2 diabetes.Notably, the ability of genetic risk factors to predict future type 2 diabetes improved with an increasing duration of follow-up, suggesting that assessment of genetic risk factors is clinically more meaningful the earlier in life they are measured.",
+      "Diabetes mellitus (DM) is considered a global pandemic, and the incidence of DM continues to grow worldwide. Nutrients and dietary patterns are central issues in the prevention, development and treatment of this disease. The pathogenesis of DM is not completely understood, but nutrient-gene interactions at different levels, genetic predisposition and dietary factors appear to be involved. Nutritional genomics studies generally focus on dietary patterns according to genetic variations, the role of gene-nutrient interactions, gene-diet-phenotype interactions and epigenetic modifications caused by nutrients; these studies will facilitate an understanding of the early molecular events that occur in DM and will contribute to the identification of better biomarkers and diagnostics tools. In particular, this approach will help to develop tailored diets that maximize the use of nutrients and other functional ingredients present in food, which will aid in the prevention and delay of DM and its complications. This review discusses the current state of nutrigenetics, nutrigenomics and epigenomics research on DM. Here, we provide an overview of the role of gene variants and nutrient interactions, the importance of nutrients and dietary patterns on gene expression,",
+      "Diabetes mellitus (DM) is considered a global pandemic, and the incidence of DM continues to grow worldwide. Nutrients and dietary patterns are central issues in the prevention, development and treatment of this disease. The pathogenesis of DM is not completely understood, but nutrient-gene interactions at different levels, genetic predisposition and dietary factors appear to be involved. Nutritional genomics studies generally focus on dietary patterns according to genetic variations, the role of gene-nutrient interactions, gene-diet-phenotype interactions and epigenetic modifications caused by nutrients; these studies will facilitate an understanding of the early molecular events that occur in DM and will contribute to the identification of better biomarkers and diagnostics tools. In particular, this approach will help to develop tailored diets that maximize the use of nutrients and other functional ingredients present in food, which will aid in the prevention and delay of DM and its complications. This review discusses the current state of nutrigenetics, nutrigenomics and epigenomics research on DM. Here, we provide an overview of the role of gene variants and nutrient interactions, the importance of nutrients and dietary patterns on gene expression,",
+      "The aim of the present review was to provide insights regarding the role of nutrient-gene interactions in DM pathogenesis, prevention and treatment.In addition, we explored how an individual's genetic makeup can affect nutrient metabolism and the response to nutrient intake, potentially leading to DM.",
+      "It is important to promote greater research in this field because these findings will provide a framework for the development of genotype-dependent food health promotion strategies and the design of dietetic approaches for the prevention and management of DM.This knowledge has begun to provide evidence where specific targeted nutritional advice, such as following a Mediterranean Diet, helps to decrease cardiovascular risk factors and stroke incidence in people with polymorphisms strongly associated with T2DM [8].",
+      "A new generation of genetic studies of diabetes is underway.Following from initial genome-wide association (GWA) studies, more recent approaches have used genotyping arrays of more densely spaced markers, imputation of ungenotyped variants based on improved reference haplotype panels, and sequencing of protein-coding exomes and whole genomes.Experimental and statistical advances make possible the identification of novel variants and loci contributing to trait variation and disease risk.Integration of sequence variants with functional analysis is critical to interpreting the consequences of identified variants.We briefly review these methods and technologies and describe how they will continue to expand our understanding of the genetic risk factors and underlying biology of diabetes.",
+      "A new generation of genetic studies of diabetes is underway.Following from initial genome-wide association (GWA) studies, more recent approaches have used genotyping arrays of more densely spaced markers, imputation of ungenotyped variants based on improved reference haplotype panels, and sequencing of protein-coding exomes and whole genomes.Experimental and statistical advances make possible the identification of novel variants and loci contributing to trait variation and disease risk.Integration of sequence variants with functional analysis is critical to interpreting the consequences of identified variants.We briefly review these methods and technologies and describe how they will continue to expand our understanding of the genetic risk factors and underlying biology of diabetes.",
+      "In this review, we briefly outlined salient features of pathophysiology and results of the genetic association studies hitherto conducted on type 2 diabetes.Primarily focusing on the current status of genomic research, we briefly discussed the limited progress made during the post-genomic era and tried to identify the limitations of the post-genomic research strategies.We suggested reanalysis of the existing genomic data through advanced statistical and computational methods and recommended integrated genomics-metabolomics approaches for future studies to facilitate understanding of the gene-environment interactions in the manifestation of the disease.We also propose a framework for research that may be apt for determining the effects of urbanization and changing lifestyles in the manifestation of complex genetic disorders like type 2 diabetes in the Indian populations and offset the confounding effects of both genetic and environmental factors in the natural way.",
+      "In this review, we briefly outlined salient features of pathophysiology and results of the genetic association studies hitherto conducted on type 2 diabetes.Primarily focusing on the current status of genomic research, we briefly discussed the limited progress made during the post-genomic era and tried to identify the limitations of the post-genomic research strategies.We suggested reanalysis of the existing genomic data through advanced statistical and computational methods and recommended integrated genomics-metabolomics approaches for future studies to facilitate understanding of the gene-environment interactions in the manifestation of the disease.We also propose a framework for research that may be apt for determining the effects of urbanization and changing lifestyles in the manifestation of complex genetic disorders like type 2 diabetes in the Indian populations and offset the confounding effects of both genetic and environmental factors in the natural way.",
+      "In a nutshell, genomic and post-genomic approaches identified a large number of biomarkers to ponder over and explore further but we are yet to identify universally accepted biomarker which can be used for the successful management and prevention of type 2 diabetes. In order to understand environment related modifications of genetic susceptibility, it may be prudent to conduct studies with integrated genomic-metabolomic approach. It is also imperative to gather existing molecular genetic data and curate it into uniform format and analyze the same for understanding the present status of research. A few attempts were, however, made to develop type 2 diabetes informative databases. While the databases T2DGADB and T2D-DB are only a collection of publications related to type 2 diabetes genetic association studies, protein-protein interactions and expression studies, T2D@ZJU is a comprehensive collection of pathway databases, protein-protein interaction databases, and literature (Yang et al. 2013). Further, T2D@ZJU is a user-friendly interface database that provides graphical output of information organized in networks. These attempts may provide basis for studying type 2 diabetes utilizing systems biology, which is a better approach for understanding complex genetic diseases.",
+      "Genetic factors appear to play a role in determining an individual's risk of developing diabetes.It is hoped that genetic studies will ultimately identify key genetic elements that help determine susceptibility to diabetes, disease progression, and responsiveness to specific therapies, as well as help identify novel targets for future intervention.A substantial number of genetic loci, gene polymorphisms, and mutations have already been reported as having variable degrees of association with one or other type of diabetes (type 1, type 2, maturity onset diabetes of the young [MODY]), while others appear to be involved in response to antihyperglycemic agents.We have compiled the following glossary of genetic and genomic terms relating to diabetes, which we hope will prove a useful reference to researchers and clinicians with an interest in this disease.This is by no means an exhaustive list, but includes many of the genetic loci and variants that have been studied in association with diabetes.Gene encoding insulin-like growth factor 2 mRNA binding protein 2 (also known as IMP-2).SNPs in the gene have been associated with type 2 diabetes IFIH1"
+    ],
+    [
+      "Researchers are expanding our understanding of genetic risk factors for diabetes through ongoing discoveries.Genetic variants associated with increased susceptibility to type 2 diabetes, a disease that affects more than 200 million people worldwide, have been identified (NHGRI & NIDDK, 2007).Such discoveries accelerate efforts to understand genetic contributions to chronic illness, as well as facilitate greater investigation of how these genetic factors interact with each other and with lifestyle factors.Ultimately, once the association of these variants with diabetes are confirmed, genetic tests may be utilized to identify (even before escalating blood sugars) those individuals, like Vanessa, who may be able to delay or prevent diabetes with healthy lifestyle decisions and behaviors.Information to assist nurses in this challenge is available in a toolkit \"Your Game Plan for Preventing Type 2 Diabetes\" (Your Game Plan, n.d.).Would you have known whether or not genetic testing was available for Vanessa?If you had said no to this question but could have explained the progress currently being made in understanding diabetes, Vanessa would have had access to the best care possible today.",
+      "Genomics has contributed to a better understanding of many disorders including diabetes.The following article looks at the ethical, social and legal consequences of genomic medicine and predictive genetic testing for diabetes.This is currently a field in its nascent stage and developing rapidly all over the world.The various ethical facets of genomic medicine in diabetes like its effects on patient physician relationship, risk communication, genetic counseling and familial factors are explored and elucidated from a clinical, ethical, social and legal perspective.",
+      "Genomics has contributed to a better understanding of many disorders including diabetes.The following article looks at the ethical, social and legal consequences of genomic medicine and predictive genetic testing for diabetes.This is currently a field in its nascent stage and developing rapidly all over the world.The various ethical facets of genomic medicine in diabetes like its effects on patient physician relationship, risk communication, genetic counseling and familial factors are explored and elucidated from a clinical, ethical, social and legal perspective.",
+      "Genetic factors are known to play a role in T2D and an understanding of the genetic basis of T2D could lead to the development of new treatments (Frayling, 2007a,b;Frayling & Mccarthy, 2007;Frayling, 2008).With the increased prevalence of diabetes worldwide, the need for intensive research is of high priority.Sequencing of the human genome and development of a set of powerful tools has made it possible to find the genetic contributions to common complex diseases (Donnelly, 2011).Genome-wide association studies (GWAS) have been used to search for genetic risk factors for complex disease (Hindorff, Junkins et al., 2009;Hindorff, Sethupathy et al., 2009).Used in combination with the scaffold data of the human genome courtesy of the HUGO Project (2003) and the International HapMap Project (Thorisson et al., 2005), it is now possible to analyse the whole genome to identify genetic variants that contribute to common disease in a fast and efficient manner.",
+      "In conclusion, genome-wide studies have added valuable scientific data to our repertoire of diabetes knowledge.However, there have been few genomic nuggets that enable a more robust prediction of diabetes than is achieved by using common environmental risk factors and none that clarify the peculiar ethnic proclivities of type 2 diabetes.The latter realization ought to temper enthusiasm for the indiscriminate use of genetic testing for diabetes.",
+      "Genomics of T2D  Diet, lifestyle, environment, and even genetic variation influence an individual's response to disease therapy.Like GWAS which identify genetic variants conferring risk for a disease, studies have been carried out for identifying genetic variants responsible for patient differences in drug response.Pharmacogenomics in diabetes focuses on the study of gene polymorphisms which influence an individual's response to antidiabetic drugs.Such genetic variants influence the pharmacodynamics and/or pharmacokinetics of the drug, thus affecting its efficacy or toxicity in an individual.The difference in response to treatments and therapies across individuals on account of these factors strengthens the case for personalized medicine in diabetes.",
+      "It is possible that there are genes that because of their known metabolic involvement are likely to interact with specific nutrients. For example, SLC30A8 which encodes a zinc transporter localized in secretory granules, interacted with dietary zinc to affect fasting insulin levels [132]. However, the majority of GWAS variants have not shown interaction with environmental factors for effect on diabetes or related traits. Therefore, it is likely that prospective future studies will utilize improved assessment methods to increase power and avoid false interpretation [133,134]. This could be enhanced by prioritizing variants that are most likely to have effects [135] or selective sampling according to extremes of the environmental factor could reduce the requirement for sample size [136]. These and other strategies such as meta-analysis, nested case control and genotype-based studies have been recently reviewed [123,133] and the difficulties in measuring environmental exposures have been emphasized, including the application of analyses based on logistic regression [124] and problems with instruments such as physical activity questionnaires [137]. Validated food frequency questionnaires are popular instruments for evaluating diabetes risk and are often used in conjunction with food analysis software [138,139]. Similar methodology has been adapted to assess two predominant food consumption patterns by Prudent and Western [140], and demonstrated synergistic interaction with genotype and a less healthy Western dietary pattern in determining male risk for T2D by showing that the gene-diet interaction was higher in men with a high genetic risk score determined by a gene counting method [141]. Also the effects of diet may predominate at specific developmental periods [142] suggesting that age and associated physiological changes are important as well as differences between genders. It has also been observed that homogeneity of an environmental factor such as physical activity in an Asian Indian study, may reduce ability to detect interaction, but could be solved by subgrouping by the level of activity [143], but increased recruitment would be needed to maintain power.",
+      "To date, studies of diabetes have played a major role in shaping thinking about the genetic analysis of complex diseases.Based on trends in genomic information and technology, combined with the growing public health importance of diabetes, diabetes will likely continue to be an important arena in which methods will be pioneered and lessons learned.It is with great enthusiasm that we look forward to this effort, and with avid curiosity we await to see whether the lessons of today will be supported by the data of tomorrow.",
+      "In recent years tremendous changes had occurred in the field of molecular genetics and personalized medicine especially on exploring novel genetic factors associated with complex diseases like T2D with the advancement of new and improved genetic techniques including the next generation sequencing (NGS).In this review, we summarize recent developments from studies on the genetic factors associated with the development of T2D in the Arab world published between 2015 and 2018, which were based on the latest available genetic technologies.Few such studies have been conducted in this region of the world.Therefore, our study will provide valuable contributions to advanced genetic research and a personalized approach to diabetes management.",
+      "Discussion  Our study provides insight into the relative importance of clinical risk factors and those that are related to a panel of DNA variants associated with type 2 diabetes.Obesity was a strong risk factor for future diabetes, a risk that almost doubled in subjects with a family history of diabetes.However, the addition of data from genotyping of the known DNA variants to clinical risk factors (including a family history of diabetes) had a minimal, albeit statistically significant, effect on the prediction of future type 2 diabetes.Notably, the ability of genetic risk factors to predict future type 2 diabetes improved with an increasing duration of follow-up, suggesting that assessment of genetic risk factors is clinically more meaningful the earlier in life they are measured.",
+      "Diabetes mellitus (DM) is considered a global pandemic, and the incidence of DM continues to grow worldwide. Nutrients and dietary patterns are central issues in the prevention, development and treatment of this disease. The pathogenesis of DM is not completely understood, but nutrient-gene interactions at different levels, genetic predisposition and dietary factors appear to be involved. Nutritional genomics studies generally focus on dietary patterns according to genetic variations, the role of gene-nutrient interactions, gene-diet-phenotype interactions and epigenetic modifications caused by nutrients; these studies will facilitate an understanding of the early molecular events that occur in DM and will contribute to the identification of better biomarkers and diagnostics tools. In particular, this approach will help to develop tailored diets that maximize the use of nutrients and other functional ingredients present in food, which will aid in the prevention and delay of DM and its complications. This review discusses the current state of nutrigenetics, nutrigenomics and epigenomics research on DM. Here, we provide an overview of the role of gene variants and nutrient interactions, the importance of nutrients and dietary patterns on gene expression,",
+      "Diabetes mellitus (DM) is considered a global pandemic, and the incidence of DM continues to grow worldwide. Nutrients and dietary patterns are central issues in the prevention, development and treatment of this disease. The pathogenesis of DM is not completely understood, but nutrient-gene interactions at different levels, genetic predisposition and dietary factors appear to be involved. Nutritional genomics studies generally focus on dietary patterns according to genetic variations, the role of gene-nutrient interactions, gene-diet-phenotype interactions and epigenetic modifications caused by nutrients; these studies will facilitate an understanding of the early molecular events that occur in DM and will contribute to the identification of better biomarkers and diagnostics tools. In particular, this approach will help to develop tailored diets that maximize the use of nutrients and other functional ingredients present in food, which will aid in the prevention and delay of DM and its complications. This review discusses the current state of nutrigenetics, nutrigenomics and epigenomics research on DM. Here, we provide an overview of the role of gene variants and nutrient interactions, the importance of nutrients and dietary patterns on gene expression,",
+      "The aim of the present review was to provide insights regarding the role of nutrient-gene interactions in DM pathogenesis, prevention and treatment.In addition, we explored how an individual's genetic makeup can affect nutrient metabolism and the response to nutrient intake, potentially leading to DM.",
+      "Thus, studies performed during the last decade have provided strong evidence to support a diet-genome interaction as an important factor leading to the development of T2DM.",
+      "It is important to promote greater research in this field because these findings will provide a framework for the development of genotype-dependent food health promotion strategies and the design of dietetic approaches for the prevention and management of DM.This knowledge has begun to provide evidence where specific targeted nutritional advice, such as following a Mediterranean Diet, helps to decrease cardiovascular risk factors and stroke incidence in people with polymorphisms strongly associated with T2DM [8].",
+      "A new generation of genetic studies of diabetes is underway.Following from initial genome-wide association (GWA) studies, more recent approaches have used genotyping arrays of more densely spaced markers, imputation of ungenotyped variants based on improved reference haplotype panels, and sequencing of protein-coding exomes and whole genomes.Experimental and statistical advances make possible the identification of novel variants and loci contributing to trait variation and disease risk.Integration of sequence variants with functional analysis is critical to interpreting the consequences of identified variants.We briefly review these methods and technologies and describe how they will continue to expand our understanding of the genetic risk factors and underlying biology of diabetes.",
+      "A new generation of genetic studies of diabetes is underway.Following from initial genome-wide association (GWA) studies, more recent approaches have used genotyping arrays of more densely spaced markers, imputation of ungenotyped variants based on improved reference haplotype panels, and sequencing of protein-coding exomes and whole genomes.Experimental and statistical advances make possible the identification of novel variants and loci contributing to trait variation and disease risk.Integration of sequence variants with functional analysis is critical to interpreting the consequences of identified variants.We briefly review these methods and technologies and describe how they will continue to expand our understanding of the genetic risk factors and underlying biology of diabetes.",
+      "In a nutshell, genomic and post-genomic approaches identified a large number of biomarkers to ponder over and explore further but we are yet to identify universally accepted biomarker which can be used for the successful management and prevention of type 2 diabetes. In order to understand environment related modifications of genetic susceptibility, it may be prudent to conduct studies with integrated genomic-metabolomic approach. It is also imperative to gather existing molecular genetic data and curate it into uniform format and analyze the same for understanding the present status of research. A few attempts were, however, made to develop type 2 diabetes informative databases. While the databases T2DGADB and T2D-DB are only a collection of publications related to type 2 diabetes genetic association studies, protein-protein interactions and expression studies, T2D@ZJU is a comprehensive collection of pathway databases, protein-protein interaction databases, and literature (Yang et al. 2013). Further, T2D@ZJU is a user-friendly interface database that provides graphical output of information organized in networks. These attempts may provide basis for studying type 2 diabetes utilizing systems biology, which is a better approach for understanding complex genetic diseases.",
+      "The public health genomics approach to type 2 diabetes. So, while exciting gene discoveries are being made, what can we do? The answer may lie in the relatively new field of public health genomics, \"a multidisciplinary field concerned with the effective and responsible translation of genome-based knowledge and technologies to improve population health\" (12). Researchers, policymakers, and practitioners in public health genomics use population-based data on genetic variation and gene-environment interactions to develop, implement, and evaluate evidence-based tools for improving health and preventing disease. They also apply systematic evidence-based knowledge synthesis and appraisal of the clinical validity and utility of genomic applications in health practice. Validated genomic information is then integrated into disease control and prevention programs (13).",
+      "Genomics for Type 2 Diabetes  Many research studies have been carried out on genetic-based illness prediction.Incorporating machine learning approaches with genetic-based illness prediction could result in an accurate outcome.This has intensified the role of Artificial Intelligence (AI) in healthcare.It has been estimated that approximately $36 billion will be invested in AI by 2025 [48].Deep genomics through machine learning approaches has outperformed accuracy in predicting and diagnosing illnesses such as cancer with minimal inclusion of radiologists.It is desired to have sufficient biological knowledge to understand how genetics can help us predict various conditions and analyze each chromosome to identify the disease-causing gene.Pre-existing research studies have focused on genomics and gene interaction patterns of various persistent illnesses such as Alzheimer's, multiple cancers, and Parkinson's."
+    ],
+    [
+      "Figure 8 Molecular changes in the islets of patients with T2D mirror the processes altered in NOD mice. mRNA expression in human pancreatic islets from healthy individuals (n = 105) and those diagnosed with T2D (n = 14) was assessed through RNA-seq analysis. (a) Relationship between GLIS3 and MANF expression in healthy individuals (Spearman correlation P value = 0.043), individuals with T2D (Spearman correlation P value = 0.075) and all individuals (Spearman correlation P value = 0.028). (b-e) Expression of XRCC4 (b), LIG4 (c), H2AFX (d) and CDKN1A (e) in healthy islets as compared to islets from patients with T2D (P values shown after multiple-testing correction). The median and interquartile range (IQR; box) are shown, with error bars indicating 1.5 times the IQR. Individual values are shown if beyond 1.5 times the IQR. (f) Relationship between H2AFX and LIG4 expression in human islets (Spearman correlation P value = 5 \u00d7 10^\u22129).",
+      "All the genes involved in these pathways, as well as the genes involved in b-cells development and turnover, may be considered candidate genes for T2DM with predominant insulin deficiency.",
+      "One method of searching for the cause of NIDDM is via the candidate gene approach. Possible candidates for NIDDM include genes involved in specifying pancreatic islet \u03b2-cell phenotype and in directing \u03b2-cell development and \u03b2-cell responses of glucose-mediated insulin synthesis and secretion. The transcription factor islet-1 (Isl-1) has been shown to be a unique protein that binds to the mini-enhancer or Far-FLAT region (nucleotide -247 to -198) of the rat insulin I gene (7). Isl-1, a protein comprised of 349 residues (38 kD), is a member of the LIM/homeodomain family of proteins, named for the first three members described: lin-11, isl-1, and mec-3 (8,9). These proteins are comprised of three putative regulatory regions, two LIM domains (cysteine-rich motifs) in the amino terminus of the protein, a homeobox domain near the middle, and a glutamine-rich transcriptional activation domain at the carboxyl end (7,9). With the use of an antibody to Isl-1, expression was shown to be restricted to a subset of endocrine cells, including islets, neurons involved in autonomic and endocrine control, and selected other tissues in the adult rat (10-12).",
+      "Results  Impairment or alteration of the insulin-signaling pathway is a commonly recognized feature of type 2 diabetes. It is therefore notable that the IS-HD gene set (Dataset S4) was not detected to be significantly transcriptionally altered by application of either hypergeometric enrichment test, DEA or GSEA. In particular, applying GSEA to the transcriptional profile dataset of diabetic and normal glucose-tolerant skeletal muscle described in Mootha et al. [10] did not identify a significant level of alteration in the IS-HD gene set (p = 0.536), while DEA produced a comparably weak enrichment score (p = 0.607). The failure to detect a significant transcriptional alteration in IS-HD may be explained by a number of factors. The enrichment results depended on the specific choice of the IS-HD gene set, and it is possible that an alternatively defined insulin-signaling gene set would be determined as significantly enriched. Additionally, expression changes in a few critical genes in IS-HD may be sufficient to substantially alter insulin signaling, and running DEA on the large IS-HD set may miss the contributions from these few genes.",
+      "35 ABSTRACT 11 A GENE EXPRESSION NETWORK MODEL OF TYPE 2 DIABETES ESTABLISHES A RELATIONSHIP BETWEEN CELL CYCLE REGULATION IN ISLETS AND DIABETES SUSCEPTIBILITY MP Keller, YJ Choi, P Wang, DB Davis, ME Rabaglia, AT Oler, DS Stapleton, C Argmann, KL Schueler, S Edwards, HA Steinberg, EC Neto, R Klienhanz, S Turner, MK Hellerstein, EE Schadt, BS Yandell, C Kendziorski, and AD Attie Depts.",
+      "Second, we performed an extensive manual curation according to a previously described b-cell-targeted annotation (Kutlu et al, 2003; Ortis et al, 2010). In partial agreement with the IPA, we found these genes to fall into three broad categories: (1) genes related to b-cell dysfunction and death, (2) genes potentially facilitating the adaptation of the pancreatic islets to the altered metabolic situation in T2D and (3) genes whose role in disease pathogenesis remains to be unearthed (Figure 6B). The adaptation-related gene category contains few metabolism-associated genes (e.g., HK1, FBP2; Figure 6B, right part, Figure 7) and many more genes involved in signal transduction or encoding hormones, growth factors (e.g., EGF, FGF1, IGF2/IGF2AS; Figure 7), or transcription factors involved in important regulatory networks (for instance, FOXA2/HNF3B, PAX4 and SOX6) (Figure 6B, right part, Figure 7). In the b-cell dysfunction and death category, there were hypomethylated genes related to DNA damage and oxidative stress (e.g., GSTP1, ALDH3B1; Figure 7), the endoplasmic reticulum (ER) stress response (NIBAN, PPP2R4, CHAC1), and apoptosis (CASP10, NR4A1, MADD; Figure 6B, left part, Figure 7). Some genes of interest from the highlighted categories are depicted in Figure 7. Their annotated functions provide possible explanations of how the epigenetic dysregulation of these genes in diabetic islets is connected to T2D pathogenesis. Numerous genes that were identified by our methylation profiling approach have been functionally implicated in insulin secretion. Examination of the available literature on the function of these genes revealed three aspects of insulin secretion with which they interfere: some of these genes influence the expression of the insulin gene, like MAPK1 and SOX6, or its post-translational maturation, like PPP2R4 (cf. Figure 7 and references therein). Others can deregulate the process of insulin secretion itself (SLC25A5, Ahuja et al, 2007; RALGDS, Ljubicic et al, 2009) or influence synthesis as well as secretion (vitronectin, Kaido et al, 2006). A third group of differentially methylated genes affects (i) signalling processes in the b-cell leading to insulin secretion or (ii) glucose homeostasis in b-cells, thereby modulating insulin response upon stimulation. GRB10 (Yamamoto et al, 2008), FBP2 and HK1 (Figure 7) are examples for these genes. Additional genes found in our study have been implicated in the b-cells' capability to secrete insulin, though the mechanisms have not yet been fully established. The putative functions of these genes indicate a potential epigenetic impact on insulin secretion at multiple levels, namely signalling, expression/synthesis and secretion.",
+      "In summary, we have associated mutations in the SLC29A3 gene with diabetes mellitus in humans and the insulin signaling pathway in Drosophila. The mechanistic basis of these findings remains to be determined. This is strong evidence supporting the investment of resources to further investigate the role of SLC29A3 and its orthologs in diabetes and glucose metabolism in model systems.",
+      "DISCUSSION  We have identified mutations in the equilibrative nucleoside transporter 3 protein that are associated with an inherited syndrome of insulin-dependent DM, and provide prima facie evidence that the Drosophila ortholog of this protein interacts with the insulin signaling pathway. This is the first evidence that mutations in the human SLC29A3 gene can be associated with a diabetic phenotype.",
+      "These observations taken together suggest that molecules involved in innate immunity could serve as candidate genes that determine the susceptibility of sensitive strains of mice to virus-induced diabetes. Interestingly, deficiency of the Tyk2 gene results in a reduced antiviral response 24. In addition, the human TYK2 gene was mapped to the possible type 1 diabetes susceptibility locus 25.",
+      "A recent sequencing study provides an example of detection of rare variants in type 1 diabetes. Targeted sequencing in a series of candidate coding regions resulted in IFIH1 being identified as the causal gene in a region associated with type 1 diabetes by GWA studies (58). IFIH1 encodes a cytoplasmic helicase that mediates induction of the interferon response to viral RNA. The discovery of IFIH1 as a contributor to susceptibility to type 1 diabetes has strengthened the hypothesis (70) about a mechanism of disease pathogenesis involving virus-genetic interplay and raised type 1 interferon levels as a cofactor in \u03b2-cell destruction. Nonetheless, it should be recognized that a component of the missing heritability (familial aggregation) in type 1 diabetes could well be due to unrecognized intra-familial environmental factors. Disease pathogenesis. Contemporary models of pathogenesis of type 1 diabetes support the involvement of two primary dramatis personae: the immune system and the \u03b2-cell. The known and newly identified genetic risk factors for type 1 diabetes present exciting opportunities to build on to the current cast of disease mechanisms and networks. Most of the listed genes of interest (Table 2) and those in extended regions are assumed to regulate immune function. Some of these genes, however, may also have roles in the \u03b2-cell (insulin being the most obvious example). Another gene, PTPN2, encoding a protein tyrosine phosphatase, was identified as affecting the risk for type 1 diabetes as well as for Crohn disease (47,71). PTPN2 is expressed in immune cells, and its expression is highly regulated by cytokines. However, PTPN2 is expressed also in \u03b2-cells, where it modulates interferon (IFN)-\u03b3 signal transduction and has been shown to regulate cytokine-induced apoptosis (72). Other candidate genes, such as NOS2A, IL1B, reactive oxygen species scavengers, and candidate genes, identified in large GWA studies of type 2 diabetes, have not been found to be significant contributors to the susceptibility of type 1 diabetes (73).",
+      "Differential Expression Analyses of Type 1 Diabetes Mellitus Associated Genes  For the aforementioned 171 'novel' genes, we used a t-test to compare ribonucleic acid expression signals in PBMCs or monocytes between type 1 diabetes mellitus patients and healthy controls. We found that 37 genes, including 21 non-HLA genes (e.g. FAM46B, OLFML3 and HIPK1), were differentially expressed between type 1 diabetes mellitus patients and controls (Table 2). For the differential expression study, the significance level of P < 5.0E-02 was used.",
+      "In this study, we have correlated the function and genotype of human islets obtained from diabetic and nondiabetic (ND) donors. We have analyzed a panel of 14 gene variants robustly associated with T2D susceptibility identified by recent genetic association studies. We have identified four genetic variants that confer reduced \u03b2-cell exocytosis and six variants that interfere with insulin granule distribution. Based on these observations, we calculate a genetic risk score for islet dysfunction leading to T2D that involves decreased docking of insulin-containing secretory granules, impaired insulin exocytosis, and reduced insulin secretion.",
+      "At present, insulin [15], glucokinase [16], amylin [17], mitochondrial DNA [18], and several transcriptional factors [19][20][21][22] are recognized as diabetogenic genes in pancreatic \u03b2-cells. In the present study we used the candidate gene approach in the examination of genomic variation in the \u03b11D and Kir6.2 channel genes in type 2 diabetic patients.",
+      "In summary, we report AEIs that are consistent with type 2 diabetes-associated variation regulating the expression of cis-linked genes in human islets. For some of the genes where significant AEI was identified (e.g., SLC30A8, WFS1), there is strong evidence from human genetics that small changes in gene dosage may have significant consequences for the pancreatic \u03b2-cell. For other genes with significant AEI (e.g., ANPEP, HMG20A), their role is less well defined, and hence this study should provide a platform for further work examining the effects of carefully manipulating the expression of these genes in human islets.",
+      "Results. Pathway analysis of genes with differentially methylated promoters identified the top 3 enriched pathways as maturity onset diabetes of the young (MODY), type 2 diabetes, and Notch signaling. Several genes in these pathways are known to affect pancreatic development and insulin secretion.",
+      "The authors then used mouse liver and adipose expression data from several mouse crosses to construct causal expression networks for the ERBB3 and RPS26 orthologs in the mouse. They then showed that ERBB3 is not associated with any known Type I diabetes genes whereas RPS26 is associated with a network of several genes that are part of the KEGG Type I diabetes pathway (Schadt et al. 2008). This type of analysis demonstrates the power of combining human and mouse data with a network based approach that has been proposed for use in drug discovery (Schadt et al.",
+      "In conclusion, GWAS studies focusing on the causes of T2D have implicated islet dysfunction as a major contributing factor (18,71). By examining isolated islets for stress responses and cross-referencing gene hits with genes associated with glucose-stimulated insulin release in human populations with T2D, we identified 7 genes that may play a role in promoting or preventing islet decline in T2D. By further examining stress-induced expression changes in each of these genes, we identified 5 genes that stood out: F13a1 as a novel stress-inhibited gene in islets, Klhl6 and Pamr1 as induced genes specific to ER stress, Ripk2 as a broadly stress-induced gene, and Steap4 as an exceptionally cytokine-sensitive gene. These genes provide promising leads in elucidating islet stress responses and islet dysfunction during the development of T2D.",
+      "Genome-wide association studies in human type 2 diabetes (T2D) have renewed interest in the pancreatic islet as a contributor to T2D risk. Chronic low-grade inflammation resulting from obesity is a risk factor for T2D and a possible trigger of \u03b2-cell failure. In this study, microarray data were collected from mouse islets after overnight treatment with cytokines at concentrations consistent with the chronic low-grade inflammation in T2D. Genes with a cytokine-induced change of >2-fold were then examined for associations between single nucleotide polymorphisms and the acute insulin response to glucose (AIRg) using data from the Genetics Underlying Diabetes in Hispanics (GUARDIAN) Consortium. Significant evidence of association was found between AIRg and single nucleotide polymorphisms in Arap3 (5q31.3), F13a1 (6p25.3), Klhl6 (3q27.1), Nid1 (1q42.3), Pamr1 (11p13), Ripk2 (8q21.3), and Steap4 (7q21.12). To assess the potential relevance to islet function, mouse islets were exposed to conditions modeling low-grade inflammation, mitochondrial stress, endoplasmic reticulum (ER) stress, glucotoxicity, and lipotoxicity. RT-PCR revealed that one or more forms of stress significantly altered expression levels of all genes except Arap3. Thapsigargin-induced ER stress up-regulated both Pamr1 and Klhl6. Three genes confirmed microarray predictions of significant cytokine sensitivity: F13a1 was down-regulated 3.3-fold by cytokines, Ripk2 was up-regulated 1.5- to 3-fold by all stressors, and Steap4 was profoundly cytokine sensitive (167-fold up-regulation). Three genes were thus closely associated with low-grade inflammation in murine islets and also with a marker for islet function (AIRg) in a diabetes-prone human population. This islet-targeted genome-wide association scan identified several previously unrecognized candidate genes related to islet dysfunction during the development of T2D.",
+      "Genome-wide association studies in human type 2 diabetes (T2D) have renewed interest in the pancreatic islet as a contributor to T2D risk. Chronic low-grade inflammation resulting from obesity is a risk factor for T2D and a possible trigger of \u03b2-cell failure. In this study, microarray data were collected from mouse islets after overnight treatment with cytokines at concentrations consistent with the chronic low-grade inflammation in T2D. Genes with a cytokine-induced change of >2-fold were then examined for associations between single nucleotide polymorphisms and the acute insulin response to glucose (AIRg) using data from the Genetics Underlying Diabetes in Hispanics (GUARDIAN) Consortium. Significant evidence of association was found between AIRg and single nucleotide polymorphisms in Arap3 (5q31.3), F13a1 (6p25.3), Klhl6 (3q27.1), Nid1 (1q42.3), Pamr1 (11p13), Ripk2 (8q21.3), and Steap4 (7q21.12). To assess the potential relevance to islet function, mouse islets were exposed to conditions modeling low-grade inflammation, mitochondrial stress, endoplasmic reticulum (ER) stress, glucotoxicity, and lipotoxicity. RT-PCR revealed that one or more forms of stress significantly altered expression levels of all genes except Arap3. Thapsigargin-induced ER stress up-regulated both Pamr1 and Klhl6. Three genes confirmed microarray predictions of significant cytokine sensitivity: F13a1 was down-regulated 3.3-fold by cytokines, Ripk2 was up-regulated 1.5- to 3-fold by all stressors, and Steap4 was profoundly cytokine sensitive (167-fold up-regulation). Three genes were thus closely associated with low-grade inflammation in murine islets and also with a marker for islet function (AIRg) in a diabetes-prone human population. This islet-targeted genome-wide association scan identified several previously unrecognized candidate genes related to islet dysfunction during the development of T2D.",
+      "Finally, several of the linking nodes introduced into this islet network through their PPI connections represent interesting candidates for a role in T2D pathogenesis, and there are several examples where external data provides validation of those assignments. An interesting example involves the gene GINS4 which maps at the ANK1 locus. Though this gene generated a low PCS [0.03] and was not included in the set of seed genes for this locus, GINS4 knock-down has an impact in a human beta-cell line [14]. In addition, cyclin-dependent kinase 2 (CDK2) has been shown to influence beta-cell mass in a compensatory mechanism related to age- and diet-induced stress, connecting beta-cell dysfunction and progressive beta-cell mass deterioration [54]. YWHAG is a member of the 14-3-3 family, known to be signalling hubs for beta-cell survival [55], and disruption of SMAD4 drives islet hypertrophy [56]."
+    ],
+    [
+      "A. Genetic Screening  We have discussed above the genetic component of T1D. The genetic susceptibility to T1D is determined by genes related to immune function with the potential exception of the insulin gene (434). The genetic susceptibility component of T1D allows some targeting of primary preventive care to family members of diagnosed T1D patients, but there is no complete inheritance of the disease. Nevertheless, the risk for developing T1D compared with people with no family history is ~10-15 times greater. Although ~70% of individuals with T1D carry defined risk-associated genotypes at the HLA locus, only 3-7% of the carriers of such genetic risk markers develop diabetes (3).",
+      "II. THE GENETICS OF TYPE 1 DIABETES  A comprehensive overview of genetic data in mouse and human is beyond the scope of this article. Instead, we will focus on how the various susceptibility genes and environmental triggers can fit in a mechanistic model for T1D etiology.",
+      "The relative prevalence of mutations causal for monogenic forms of diabetes suggests that mutations in \u03b2-cell-related processes are a more frequent cause of severe early-onset diabetes than those influencing insulin action (see above). Studies of the relative heritabilities of indexes of \u03b2-cell function and insulin action in the general population also hint at a preponderance of \u03b2-cell effects (52).",
+      "In 1976, the noted human geneticist James Neel titled a book chapter \"Diabetes Mellitus: A Geneticist's Nightmare.\" 1 Over the past 30 years, however, the phenotypic and genetic heterogeneity of diabetes has been painstakingly teased apart to reveal a family of disorders that are all characterized by the disruption of glucose homeostasis but that have fundamentally different causes. Recently, the availability of detailed information on the structure and variation of the human genome and of new high-throughput techniques for exploiting these data has geneticists dreaming of unraveling the genetic complexity that underlies these disorders. This review focuses on type 1 diabetes mellitus and includes an update on recent progress in understanding genetic factors that contribute to the disease and how this information may contribute to new approaches for prediction and therapeutic intervention. Type 1 diabetes becomes clinically apparent after a preclinical period of varying length, during which autoimmune destruction reduces the mass of beta cells in the pancreatic islets to a level at which blood glucose levels can no longer be maintained in a physiologic range. The disease has two subtypes: 1A, which includes the common, immune-mediated forms of the disease; and 1B, which includes nonimmune forms. In this review, we focus on subtype 1A, which for simplicity will be referred to as type 1 diabetes. Although there are rare monogenic, immune-mediated forms of type 1 diabetes, 2,3 the common form is thought to be determined by the actions, and possible interactions, of multiple genetic and environmental factors. The concordance for type 1 diabetes in monozygotic twins is less than 100%, and although type 1 diabetes aggregates in some families, it does not segregate with any clear mode of inheritance. 4-7 Despite these complexities, knowledge of genetic factors that modify the risk of type 1 diabetes offers the potential for improved prediction, stratification of patients according to risk, and selection of possible therapeutic targets. As germ-line factors, genetic risk variants are present and amenable to study at all times - before, during, and after the development of diabetes. Thus, genetic information can serve as a potential predictive tool and provide insights into pathogenetic factors occurring during the preclinical phase of the disease, when preventive measures might be applied. Genetic Studies  Because of the uncertainty regarding the number and action of genes involved in type 1 diabetes, genetic studies have tended to focus on approaches that require few assumptions about the underlying model of disease risk. The two primary approaches have been linkage studies (using pairs of affected relatives, typically siblings) and association studies (using either case-control or family-based designs). Linkage studies using affected sibling pairs seek to identify regions of the genome that are shared",
+      "Environment  The second factor in Figure 1 is environmental aspects. An important concept is that the diabetes genotype typically causes only a predisposition for glucose intolerance (note the terminology susceptibility gene was used in the preceding paragraphs). Whether one develops the diabetes phenotype depends on environmental factors, some obvious in how they act, others less so. For instance, the Nurses Health Survey showed positive associations between obesity and lack of physical activity in the development of type 2 diabetes (as expected), but also protection by not smoking and moderate alcohol intake (14). As already discussed, many studies have shown an association between TV watching, high calorie diets, and lack of physical activity with risk of diabetes, i.e., our modern lifestyle, so it is not surprising that there is an explosion in the incidence of diabetes worldwide.",
+      "The genetics of type 1 diabetes  There is a strong genetic risk to T1D. This is exemplified by Redondo et al. (2001), who demonstrated a strong concordance of genetic inheritance (65%) and T1D susceptibility in monozygotic twin pairs. That is, when one sibling is afflicted, there is a high probability that the other twin will develop T1D by the age of 60 years. Additionally, autoantibody positivity and islet destruction were observed after a prospective long-term follow-up of monozygotic twins of patients with T1D, despite initial disease-discordance among the twins (Redondo et al., 2008).",
+      "Type 1 diabetes has unusual epidemiological features related to gender  Type 1 diabetes also displays unusual patterns of inheritance that may yield insights into etiology and provide clues to the best methods for analyzing genetic studies. The risk to the offspring is generally greater from a mother or father who was diagnosed at an early age (again suggesting that early-onset cases are more heavily genetically 'loaded'). However, the risk of diabetes is approximately two to four times higher for a child whose father has type 1 diabetes than one whose mother is affected [see (52,53) and references therein]. This parental difference is largely due to a low risk for offspring of mothers who were diagnosed at a later age (53). The difference could be explained by at least three different factors. First, the risk alleles could only be active when transmitted by the father (such as is seen in imprinting, where only one of the parental alleles is expressed). Alternatively, a maternal environmental factor during pregnancy could be protective. However, it is difficult to see how this protective effect would be restricted to mothers diagnosed at a later age, especially since the protective effect was unrelated to the mother's duration of diabetes or even diabetic status at delivery (53). Finally, mothers who are diagnosed at a later age could represent more 'environmental' cases of diabetes, and thus be less likely to pass on risk genes to their offspring.",
+      "Type 1 diabetes is a genetic disease  Family studies have indicated that genetic factors are important determinants of type 1 diabetes risk. First, the risk to a sibling of an affected individual is approximately 6%, as compared with an average risk of 0.4% (depending on the population), or a relative increased risk of 15-fold (17). The increased risk to siblings is referred to as \u03bbs (18) and is one measure of the degree of familial clustering of the disease.",
+      "Family and twin studies indicate that a substantial fraction of susceptibility to type 1 diabetes is attributable to genetic factors. These and other epidemiologic studies also implicate environmental factors as important triggers. Although the specific environmental factors that contribute to immune-mediated diabetes remain unknown, several of the relevant genetic factors have been identified using two main approaches: genome-wide linkage analysis and candidate gene association studies. This article reviews the epidemiology of type 1 diabetes, the relative merits of linkage and association studies, and the results achieved so far using these two approaches. Prospects for the future of type 1 diabetes genetics research are considered.",
+      "Family and twin studies indicate that a substantial fraction of susceptibility to type 1 diabetes is attributable to genetic factors. These and other epidemiologic studies also implicate environmental factors as important triggers. Although the specific environmental factors that contribute to immune-mediated diabetes remain unknown, several of the relevant genetic factors have been identified using two main approaches: genome-wide linkage analysis and candidate gene association studies. This article reviews the epidemiology of type 1 diabetes, the relative merits of linkage and association studies, and the results achieved so far using these two approaches. Prospects for the future of type 1 diabetes genetics research are considered.",
+      "Genes affecting type 1 diabetes diagnosis age / A. Syreeni et al.",
+      "Thus, the most likely scenario is that these genes are more poised for activation in the case group compared with the control group, contributing to various diabetes complications in the long term. This could be a consequence of the early exposure to hyperglycemia (measured by HbA1c level), which is known to be associated with increased rates of long-term diabetes complications.",
+      "The risk for T1D is strongly influenced by multiple genetic loci and environmental factors. The disease is heritable, with first-degree relatives of patients with T1D being at 15-fold greater risk for developing the condition than the general population.",
+      "Genetic Background and Environment  Both type 1 and 2 diabetes as well as other rare forms of diabetes that are directly inherited, including MODY and diabetes due to mutations in mitochondrial DNA, are caused by a combination of genetic and environmental risk factors. Unlike some traits, diabetes does not seem to be inherited in a simple pattern. Undoubtedly, however, some people are born prone to developing diabetes more so than others. Several epidemiological patterns suggest that environmental factors contribute to the etiology of T1D. Interestingly, the recent elevated number of T1D incidents projects a changing global environment, which acts either as initiator and/or accelerator of beta cell autoimmunity rather than variation in the gene pool. Several genetic factors are involved in the development of the disease [127]. There is evidence that more than twenty regions of the genome are involved in the genetic susceptibility to T1D.",
+      "Type 1 Diabetes  The higher type 1 diabetes prevalence observed in relatives implies a genetic risk, and the degree of genetic identity with the proband correlates with risk (22)(23)(24)(25)(26). Gene variants in one major locus, human leukocyte antigen (HLA) (27), confer 50-60% of the genetic risk by affecting HLA protein binding to antigenic peptides and antigen presentation to T cells (28). Approximately 50 additional genes individually contribute smaller effects (25,29). These contributors include gene variants that modulate immune regulation and tolerance (30)(31)(32)(33), variants that modify viral responses (34,35), and variants that influence responses to environmental signals and endocrine function (36), as well as some that are expressed in pancreatic \u03b2-cells (37). Genetic influences on the triggering of islet autoimmunity and disease progression are being defined in relatives (38,39). Together, these gene variants explain ~80% of type 1 diabetes heritability. Epigenetic (40), gene expression, and regulatory RNA profiles (36) may vary over time and reflect disease activity, providing a dynamic readout of risk.",
+      "Genetics  Both type 1 and type 2 diabetes are polygenic diseases where many common variants, largely with small effect size, contribute to overall disease risk. Disease heritability (h\u00b2), defined as sibling-relative risk, is 3 for type 2 diabetes and 15 for type 1 diabetes (17). The lifetime risk of developing type 2 diabetes is ~40% if one parent has type 2 diabetes and higher if the mother has the disease (18). The risk for type 1 diabetes is ~5% if a parent has type 1 diabetes and higher if the father has the disease (19). Maturity-onset diabetes of the young (MODY) is a monogenic disease and has a high h\u00b2 of ~50 (20). Mutations in any 1 of 13 different individual genes have been identified to cause MODY (21), and a genetic diagnosis can be critical for selecting the most appropriate therapy. For example, children with mutations in KCNJ11 causing MODY should be treated with sulfonylureas rather than insulin.",
+      "Type 1 diabetes as well as type 2 diabetes shows a genetic predisposition, although only type 1 diabetes is HLA dependent [32,33,36,40].",
+      "Genetic factors have an important role in the development of diabetes, with some forms of the disease resulting from mutations in a single gene. Others are multifactorial in origin. The monogenic forms of diabetes account for approximately 5% of cases and are caused by mutations in genes encoding insulin 3, the insulin receptor 4, the glycolytic enzyme glucokinase 5, and the transcription factors hepatocyte nuclear factor-1\u03b1 (HNF-1\u03b1), HNF-1\u03b2, HNF-4\u03b1, insulin promoter factor-1 and NeuroD1/BETA2 (refs 6-10). Mutations in maternally inherited mitochondrial genes can also cause diabetes, often in association with hearing loss 11.",
+      "Studies [71][72][73][74] in Mexican and Asian populations have identified several mutations associated with type 2 diabetes in young people. The high prevalence of type 2 diabetes in the parents of young people diagnosed with type 2 diabetes could reflect a stronger genetic predisposition, even when monogenic diabetes is excluded. This hypothesis suggests that efforts to define genes that cause type 2 diabetes by linkage might be more powerful if focused on young adults with diabetes, raising the question of whether type 2 diabetes in older populations has a relatively smaller genetic contribution and a stronger environmental contribution. 66",
+      "We found that the presence or absence of parental diabetes and the genotype score were independently associated with the risk of diabetes. This suggests that family history as a risk factor for diabetes conveys more than heritable genetic information; it probably includes nongenetic familial behaviors and norms. The lower relative risks for diabetes associated with observed parental diabetes as compared with those associated with self-reported family history (approximately 1.8 vs. approximately 2.2) support the contention that family history contains more risk information than is implied by inheritance of the diabetes phenotype alone. One of the limitations of our study is that the 18 SNPs we included are probably insufficient to account for the familial risk of diabetes. They account for a minority of diabetes heritability, and the SNP array platforms from which they were chosen capture only approximately 80% of common variants in Europeans. In addition, we have not considered structural variants that might confer a risk of diabetes. It is possible that the addition of rare risk alleles with large effects, or a much larger number of common risk alleles with small individual effects, could improve discrimination. 36 Indeed, as many as 500 loci may underlie the genetic risk of type 2 diabetes. 16 Also, we did not study interactions among genes or between genes and the environment that might alter the genetic risk in exposed persons. As more diabetes risk variants become known, their incorporation into the genotype score may explain more of the genetic risk implied by parental diabetes."
+    ],
+    [
+      "The prevalence of diabetes mellitus worldwide is sobering; the International Diabetes Federation estimates that 415 million people have diabetes mellitus, with 90% of these individuals having type 2 diabetes mellitus (T2DM) 1. T2DM occurs when pancreatic \u03b2-cells fail to release enough insulin to meet the demands of insulin-responsive tissues, which safely store and metabolize glucose. Driven by both genetic and environmental risk factors, T2DM is a complex, multifactorial disorder. Although the increasing prevalence of T2DM is driven by changes in our environment and mirrors the increase in obesity, the greater concordance for T2DM found in monozygotic compared with dizygotic twins has long provided evidence for a genetic component in T2DM risk 2.",
+      "In the UK alone, nearly 1.8 million people are already recognized to have this disorder (consuming ~5% of the total National Health Service budget), and the search is on to find the 'missing million' who are living with the condition but in whom the diagnosis has yet to be made. 3 In the USA, the situation appears to be even more serious with some commentators predicting that one in every three Americans born in the year 2000 will go on to develop diabetes during their lifetime, bringing unprecedented costs in terms of healthcare dollars as well as human morbidity and mortality. 4 The majority (~90%) of these cases will be type 2 in origin, reflecting a trend towards obesity and more sedentary lifestyles as the 'norm' rather than the exception in 'developed' societies. Indeed, the face of T2DM is changing, as a condition that was once considered the preserve of middle/old age is increasingly diagnosed in young adults and even children, reflecting the high rates of obesity (and, in particular, visceral adiposity) in these populations.",
+      "Table 1 lists the various subtypes of diabetes based on the classification suggested by the ADA [4].",
+      "The ADA lists four subtypes of diabetes based on the clinical symptoms at time of presentation [4], namely Type 1 diabetes, Type 2 diabetes (T2D), gestational diabetes, and diabetes due to specific causes (genetic defects causing deficient insulin secretion or action, diseases of pancreas, use of certain drugs such as steroids, thiazides among others). Of these, T2D is the most prevalent (close to 90% of all cases) and is the major cause of morbidity and mortality in both developed and developing nations [1]. At times it is difficult to assign a patient to a particular subtype due to the difference in conditions associated with hyperglycemia at the time of diagnosis [4,7]. For example, a lady diagnosed with gestational diabetes mellitus during pregnancy is highly susceptible to develop T2D later. Therefore, other than proper treatment during and post pregnancy, a regular follow-up is required for stratifying disease risk, and for timely management before progression to another subtype. It is clear that the classification of diabetes may not be as simple as just categorizing it into any one of the four given subtypes due to its miscellaneous nature. Every case needs to be considered at the time of presentation, on the basis of the risk factors or underlying cause of hyperglycemia, the clinical symptoms, and disease prognosis.",
+      "Introduction  Globally, diabetes affects more than 400 million people (World Health Organization, 2016), with Type 1 (insulin-dependent) diabetes (T1D) accounting for up to 10 percent of cases (American Diabetes Association, 2009). In the United States, T1D occurs at a rate of 15-30 cases per 100,000 children aged 0-14 years annually (International Diabetes Foundation, 2017; Maahs et al., 2010), with similar prevalence in Canada, Europe, Australia, and New Zealand (Fig. 1) (Derraik et al., 2012; International Diabetes Foundation, 2017; Maahs et al., 2010). By contrast, the estimated incidence rate of T1D among Asians, South Americans, and Africans is below 15 cases per 100,000 children (Fig. 1) (International Diabetes Foundation, 2017; Maahs et al., 2010). The global incidence of T1D has been rising by 3-5% per annum over the past two decades, with a notable increase in children below 10 years of age (Diamond Project, 2006; Patterson et al., 2009).",
+      "Animal Models  9.2% in women and 9.8% in men, with approximately 347 million people suffering from the disease worldwide in 2008 (Danaei et al., 2011). There are several different classifications of diabetes, the most common being type 1 and type 2 diabetes.",
+      "Type 2 diabetes is the most common type of diabetes with prevalence in the United Kingdom of around 4%. It is most commonly diagnosed in middle-aged adults, although more recently the age of onset is decreasing with increasing levels of obesity (Pinhas-Hamiel and Zeitler, 2005). Indeed, although development of the disease shows high hereditability, the risk increases proportionally with body mass index (Lehtovirta et al., 2010). Type 2 diabetes is associated with insulin resistance, and a lack of appropriate compensation by the beta cells leads to a relative insulin deficiency. Insulin resistance can be improved by weight reduction and exercise (Solomon et al., 2008). If lifestyle intervention fails, there are a variety of drugs available to treat type 2 diabetes (Krentz et al., 2008), which can be divided into five main classes: drugs that stimulate insulin production from the beta cells (e.g. sulphonylureas), drugs that reduce hepatic glucose production (e.g. biguanides), drugs that delay carbohydrate uptake in the gut (e.g. \u03b1-glucosidase inhibitors), drugs that improve insulin action (e.g. thiazolidinediones) or drugs targeting the GLP-1 axis (e.g. GLP-1 receptor agonists or DPP-4 inhibitors).",
+      "Introduction  Diabetes impacts the lives of approximately 200 million people worldwide [1], with chronic complications including accelerated development of cardiovascular disease. Over 90% of cases are of type 2 diabetes (T2D), with the bulk of the remainder presenting with type 1 diabetes (T1D).",
+      "Classification of Diabetes  On the basis of insulin deficiency, diabetes can be classified into the following types as follows.",
+      "| INTRODUCTION  Today, more than 265 million people are affected across the world. It is estimated that by the year 2030 this number will reach 366 million people (about 4.4 percent of the world's population), and now the cause of death is more than 1.1 million per year (including 50% of the population under-70 years of age and 55% of women). On the other hand, given its negative effect on the economic growth of developing countries, it calls for universal mobilization to combat this disease (Bhattacharya, Dey, & Roy, 2007). Diabetes or diabetes mellitus is referred to as a heterogeneous group of metabolic disorders characterized by chronic hyperglycemia and carbohydrate, fat and protein metabolism disorders that result from a defect in the secretion of insulin, or impairment in its function, or both. Types of diabetes mellitus include type 1, type 2 diabetes and other kind of diabetes, but the two most common types of diabetes mellitus are type 1 and type 2, which are different in several aspects (Meshkani, Taghikhani, Mosapour et al., 2007). Type 1 diabetes has been identified with autoimmune destruction of pancreatic beta cells (insulin secreting cells) and accounts for about 5% of all diabetic people, while type 2 diabetes is a predominant disorder characterized by insulin resistance or a relative decline in insulin production, and accounts for about 90% of all types of diabetes mellitus (Meshkani, Taghikhani, Al-Kateb et al., 2007). Important factors that predispose a person to type 2 diabetes are multifactorial, including genetic factors and environments. However, its inheritance has certainly not been proven, but it is believed that first-degree relatives of diabetic patients have a higher chance to develop the disease. In this regard, recognizing gene polymorphisms of this disease seems to be necessary (H\u00e4ring et al., 2014). Multiple genes have been studied in the pathogenesis of type 2 diabetes.",
+      "CONCLUSIONS  Diabetes is currently broadly classified as type 1, type 2, gestational, and a group of \"other specific syndromes.\" However, increasing evidence suggests that there are populations of individuals within these broad categories that have subtypes of disease with a well-defined etiology that may be clinically characterized (e.g., LADA, MODY). These developments suggest that perhaps, with more focused research in critical areas, we are approaching a point where it would be possible to categorize diabetes in a more precise manner that can inform individual treatment decisions.",
+      "Type 2 Diabetes  In the U.S., an estimated 95% of the nearly 30 million people living with diabetes have type 2 diabetes. An additional 86 million have prediabetes, putting them at high risk for developing type 2 diabetes (9). Among the demographic associations for type 2 diabetes are older age, race/ethnicity, male sex, and socioeconomic status (9).",
+      "Type 1 Diabetes  Between 2001 and 2009, there was a 21% increase in the number of youth with type 1 diabetes in the U.S. (7). Its prevalence is increasing at a rate of ~3% per year globally (8). Though diagnosis of type 1 diabetes frequently occurs in childhood, 84% of people living with type 1 diabetes are adults (9). Type 1 diabetes affects males and females equally (10) and decreases life expectancy by an estimated 13 years (11). An estimated 5-15% of adults diagnosed with type 2 diabetes actually have type 1 diabetes or latent autoimmune diabetes of adults (LADA) (12).",
+      "Background Diabetes is presently classified into two main forms, type 1 and type 2 diabetes, but type 2 diabetes in particular is highly heterogeneous. A refined classification could provide a powerful tool to individualise treatment regimens and identify individuals with increased risk of complications at diagnosis.",
+      "Diabetes mellitus now affects ~8% of the world's adult population [1], including ~3 000 000 individuals in the UK (with a further 600 000 people affected but presently undiagnosed) [2]. Of these cases, > 90% have Type 2 diabetes. Treatments of the complications of the disease, which range from stroke, blindness and kidney failure to lower limb amputations and cancer, presently consume ~10% of the National Health Service budget, some \u00a314 bn per year [3]. These figures are anticipated to increase further in the next 10 years, driven by increasingly sedentary lifestyles and increases in obesity; the collision between these 'environmental' factors and genetic susceptibility (see below) being the key underlying driver. Whilst existing treatments ameliorate the symptoms of the disease, notably hyperglycaemia, none target the underlying molecular aetiology. In particular, no available treatments tackle the progressive and largely irreversible loss of insulin production [4] which, in the face of insulin resistance, underlies the progressive deterioration in glucose control. Reductions in \u03b2-cell mass [5,6] and dysfunction [7] both contribute to this gradual impairment in insulin release. Recent years have seen an increase in the view that the former may play a less important role than the latter, with a 2008 study by Rahier et al. [6] reporting that \u03b2-cell mass (and insulin content) in people with Type 2 diabetes was on average ~35% lower than that of healthy control subjects. However, this difference was only ~24% within 5 years of diagnosis, far below levels likely to lead to the symptoms of diabetes. Indeed, given our present inability to monitor \u03b2-cell mass prospectively over the course of the disease, it is conceivable that the differences observed post mortem between healthy individuals and those with Type 2 diabetes [5,6] may reflect an increased predisposition to diabetes in those born with a lower than average \u03b2-cell mass.",
+      "INTRODUCTION  Type 2 diabetes (T2D) affects an estimated 425 million people worldwide, a number predicted to rise to 629 million by 2045 (1). The disease usually involves insulin resistance but is ultimately the result of pancreatic \u03b2 cell failure, a sine qua non for disease development (2). In contrast, Type 1 diabetes (T1D) affects a smaller proportion of people with diabetes and is chiefly the result of pancreatic \u03b2 cell destruction mediated by immune cells (3).",
+      "Introduction  Diabetes is a complex and heterogeneous disease with a staggering global impact and the most recent estimates indicate 346 million people worldwide suffer from this disease (WHO Diabetes Fact sheet No. 312, 2011). Type 2 diabetes mellitus (T2DM) is the most common form of diabetes, accounting for >90% of cases, and occurs when peripheral tissue insulin resistance accompanies insufficient \u03b2-cell insulin production. While >80% of diabetes deaths occur in low- and middle-income countries [1]. India and China have the highest reported prevalence of diabetes with 65 and 98 million in 2013, respectively [2].",
+      "The disease burden related to diabetes is high and rising in every country, fuelled by the global rise in the prevalence of obesity and unhealthy lifestyles. The latest estimates show a global prevalence of 382 million people with diabetes in 2013, expected to rise to 592 million by 2035. The aetiological classification of diabetes has now been widely accepted. Type 1 and type 2 diabetes are the two main types, with type 2 diabetes accounting for the majority (>85%) of total diabetes prevalence. Both forms of diabetes can lead to multisystem complications of microvascular endpoints, including retinopathy, nephropathy and neuropathy, and macrovascular endpoints including ischaemic heart disease, stroke and peripheral vascular disease. The premature morbidity, mortality, reduced life expectancy and financial and other costs of diabetes make it an important public health condition.",
+      "The disease burden related to diabetes is high and rising in every country, fuelled by the global rise in the prevalence of obesity and unhealthy lifestyles. The latest estimates show a global prevalence of 382 million people with diabetes in 2013, expected to rise to 592 million by 2035. The aetiological classification of diabetes has now been widely accepted. Type 1 and type 2 diabetes are the two main types, with type 2 diabetes accounting for the majority (>85%) of total diabetes prevalence. Both forms of diabetes can lead to multisystem complications of microvascular endpoints, including retinopathy, nephropathy and neuropathy, and macrovascular endpoints including ischaemic heart disease, stroke and peripheral vascular disease. The premature morbidity, mortality, reduced life expectancy and financial and other costs of diabetes make it an important public health condition.",
+      "Introduction: Is Type 2 Diabetes a Genetic Disorder? According to the World Health Organization (WHO), approximately 350 million people worldwide have diabetes, and this disorder is likely to be the seventh leading cause of death in 2030. Diabetes is an economic burden on healthcare systems, especially in developing countries (World Health Organization, 2013)."
+    ]
+  ]
+}
\ No newline at end of file
diff --git a/gnqa/paper1_eval/src/data/datasets/old/full_aging_dataset.json b/gnqa/paper1_eval/src/data/datasets/old/full_aging_dataset.json
new file mode 100644
index 00000000..9e2008f5
--- /dev/null
+++ b/gnqa/paper1_eval/src/data/datasets/old/full_aging_dataset.json
@@ -0,0 +1,248 @@
+{
+  "question": [
+    "List as many studies as you can that include rapamycin",
+    "Why is it so difficult to map gene loci that control aging in humans?",
+    "How would one evaluate a candidate gene for mouse lifespan?",
+    "Do we find significant quantitative trait loci in aging studies, if so give the top three examples with descriptions?",
+    "What are the most fundamental human life expectancy extending geroscience studies?",
+    "List and describe the grand challenges related to aging research in genetics and genomics.",
+    "How do gene-by-drug effects (GxD) modulate life expectancies and health?",
+    "What loci and gene variants modulate normal variation in lifespan, and in which cases is this sex-specific (GxS)?",
+    "What is the influence of rapamycin and aging on the in vivo physiological and metabolic states and cellular bioenergetics, and how do these relate to the rapamycin-induced health and lifespan modification for that genotype?",
+    "Describe the genotypes related to Alzheimer's and dementia which have commonalities with those for aging."
+  ],
+  "answer": [
+    "1. A study assessing the impact of a rapamycin derivative on healthy aging in people (Mannick et al. 2014). 2. A veterinary clinical trial testing rapamycin as a treatment for osteosarcoma (Paoloni et al. 2010). 3. A veterinary clinical trial assessing the side effect profile and effects on age-associated cardiac function following 10 weeks of rapamycin treatment in healthy elderly dogs (Kaeberlein 2015). 4. A study on the pharmacokinetic analysis of rapamycin treatment in healthy dogs (Larson et al. 2016). 5. A study on the effects of rapamycin in aging mice (Johnson et al. 2015). 6. A study on the effects of rapamycin in rodents (Johnson et al. 2013). 7. A study on the effects of rapamycin on the replicative life of yeast (Reference [57]). 8. A study on the effects of rapamycin in 20-month-old mice (Reference [58]). 9. A study on the combination of rapamycin with other compounds for anti-aging activity (Reference [59]).",
+    "Mapping gene loci that control aging in humans is difficult due to several reasons. Firstly, aging is a complex phenotype influenced by multiple genes and their interactions with each other and the environment. Secondly, the effects of common genetic variations on longevity are small, requiring large studies for identification. Thirdly, human studies often face issues such as environmental heterogeneity, genetic diversity, and lack of birth matched controls. Lastly, the influence of social, economic, and cultural factors, which are unique to humans, can't be replicated in traditional model organisms, adding to the complexity.",
+    "One would evaluate a candidate gene for mouse lifespan by conducting gene mapping methods and using mouse life span as a surrogate for aging. This involves studying the genetic contributions to mortality rates, conducting sex-specific analyses, and determining any changes in pathology associated with the loci. The study would involve a large population of mice, and the results would be compared to Mendelian expectations. Additionally, epigenetic predictors and differential methylation analysis could be used to assess the relationship between the gene and lifespan. The gene's influence on specific late-life diseases and its effect on the risk of these diseases would also be evaluated.",
+    "Yes, significant quantitative trait loci (QTLs) have been found in aging studies. Three examples include:  1. Lfdr1: This QTL was found on chromosome 7 and affects lifespan and fertility in response to dietary restriction. It also has suggestive effects on fuel efficiency (FE).  2. Fedr1 and Fedr2: These QTLs were identified on chromosomes 9 and 15 respectively. They have significant effects on fuel efficiency in response to dietary restriction.  3. QTLs associated with age-related thymic involution: These were identified in a study using C57BL/6 J X DBA/2 J (BXD) recombinant inbred (RI) strains of mice.",
+    "The most fundamental human life expectancy extending geroscience studies include genetic and genomic research into longevity. These studies use both hypothesis-driven candidate gene approaches and explorative genome-wide analyses. They also involve the study of biomarkers related to aging, deep phenotyping, and the use of novel technologies like next-generation sequencing. Additionally, they examine the role of environmental factors, lifestyle, and heritability in human lifespan variation.",
+    "The grand challenges related to aging research in genetics and genomics include:  1. Lack of awareness and interest: Health professionals, governments, and researchers often lack awareness and interest in genetic diseases, which hinders progress.  2. Limited funding and access to technology: Research funding is often limited, and access to high-end technology is not always available, which restricts the scope of research.  3. Economic and health disparities: These disparities can affect the application and benefits of genomic medicine, making it less accessible to those in lower income brackets.  4. Ethical, legal, and social issues: The interpretation, storage, and sharing of genetic data raise ethical and legal concerns, including issues of privacy and consent.  5. Rapid advancements: The fast pace of developments in genetics and genomics can outpace the ability to address emerging ethical, legal, and social questions.  6. Public knowledge and trust: There is a need to increase public knowledge about genomic research and address issues of trust in medicine and scientific research.  7. Integration of genomics into public health: This is a major future challenge for healthcare systems, requiring dialogue with all stakeholders in society.",
+    "Gene-by-drug effects (GxD) can modulate life expectancies and health by influencing how individuals respond to medications. Genetic variations can affect how drugs are absorbed, disseminated, metabolized, and excreted in the body, which can lead to differences in drug efficacy and the occurrence of adverse drug reactions. For example, a genetic variant in the HMG-CoA reductase gene can modify the LDL-C response to pravastatin. Additionally, certain genes are involved in DNA damage repair and oxidative stress, which can influence susceptibility to adverse drug effects. Therefore, understanding these genetic variations can help in personalizing treatment and potentially improving health outcomes.",
+    "The study identified 10 genomic loci which influence healthspan, parental lifespan, and longevity. These include loci near FOXO3, SLC4A7, LINC02513, ZW10, and FGD6. The life-extending variant near FOXO3 is associated with a delay in the age at menarche, suggesting a possible sex-specific effect. However, the study did not perform sex-stratified analyses, so it's unclear if there are other sex-specific longevity-related genetic variants.",
+    "Rapamycin, an mTOR inhibitor, has been shown to extend the lifespan of various organisms, including mice, by delaying or reversing aging in multiple cell types. It does this by regulating rates of protein synthesis and energy utilization, which are processes associated with aging. In the hematopoietic system, rapamycin limits age-related increases in stem cells and biomarkers of aging, enhancing the performance of these cells. However, rapamycin also has potential unwanted metabolic effects, such as insulin resistance and glucose intolerance, though these effects are controversial and potentially reversible. The mTOR pathway, which rapamycin targets, integrates signals from insulin, cytokines, nutrients, oxygen, and mitogenic stimuli, and its regulation has implications for longevity and against the negative effects of aging. Rapamycin also induces autophagy, a process important for cellular homeostasis and damage prevention. Despite these benefits, the exact mechanisms by which rapamycin extends lifespan and whether it delays aging or affects specific diseases remain unclear.",
+    "The genotypes related to Alzheimer's and dementia that have commonalities with those for aging include a locus on chromosome 17 associated with the age at onset, and a specific variant in CCL11. Other genes associated with Alzheimer's include APP, PSEN1, PSEN2, and APOE. These genes are also associated with early-onset Alzheimer's disease. The APOE gene is the strongest genetic risk factor for later onset Alzheimer's. The heritability of late-onset Alzheimer's disease (LOAD) is estimated to be ~60-80%, suggesting a large proportion of individual differences in LOAD risk is driven by genetics."
+  ],
+  "contexts": [
+    [
+      "Intervention trials and cell-based monotherapy",
+      "Rapamycin has been shown to be well tolerated in dogs, improves outcome in a glycogen storage disease model (Yi et al. 2014), and is currently being tested in veterinary clinical trials as a treatment for osteosarcoma (Paoloni et al. 2010). A veterinary clinical trial is underway to assess the side effect profile and effects on age-associated cardiac function following 10 weeks of rapamycin treatment in healthy elderly dogs (Kaeberlein 2015).",
+      "Rapamycin is used clinically to prevent organ transplant rejection, for some forms of cancer, and to prevent restenosis in cardiac stents (Kaeberlein 2013b). Short-term treatment with the rapamycin derivative RAD001 improves age-associated decline in immune function, as measured by antibody response to an influenza vaccine, in healthy elderly people (Mannick et al. 2014).",
+      "To date, only one study has been performed assessing the impact of a rapamycin derivative on healthy aging in people. In this trial, it was observed that 6 weeks of treatment with the rapamycin derivative RAD001 (everolimus) was sufficient to enhance function of the aged immune system, as assessed by response to an influenza vaccine (Mannick et al. 2014). This recapitulates what was observed in elderly mice (Chen et al. 2009), and suggests that at least some of the mechanisms by which rapamycin delays aging in mice work similarly in humans. Although both compounds have essentially identical biological activities, RAD001 was used in this study instead of rapamycin because the study was funded by Novartis, who holds the patent rights for RAD001 (rapamycin is now off patent and sold as a generic drug). The doses of RAD001 used in the human immune aging study were lower than those typically used to prevent organ transplant rejection and showed improved side effect profiles, although some adverse effects, including the presence of mouth sores in a subset of the patients, were noted.",
+      "This trial is designed to determine whether treatment with the drug rapamycin (see Table 1) can significantly reduce age-related disease and disability as well as mortality in middle-aged large dogs. The initial phase of this trial, which is in progress at the time of this writing, is intended to enroll at least 32 dogs 6 years of age or older and 40 lb in weight or greater. Each animal receives an initial veterinary exam and comprehensive blood work along with a cardiac exam including echocardiography (Fig. 3). Those dogs that do not present with any abnormalities or significant pre-existing health conditions are randomized into either placebo or rapamycin treatment groups for a 10-week treatment period. Initial rapamycin dosing regimens were determined, in part, based on pharmacokinetic analysis of rapamycin treatment in healthy dogs (Larson et al. 2016). After 10 weeks in the study, each dog receives another full exam and blood chemistry panel as well as repeat cardiac exam. The primary goals of this first phase are to establish appropriate dosing of rapamycin in the absence of significant adverse events and to determine whether similar improvements in cardiac function are achieved in aged dogs after 10 weeks of rapamycin treatment, as has been observed in laboratory mice (Dai et al. 2014; Flynn et al. 2013).",
+      "Fig. 3 Design of the current short-term rapamycin intervention trial. Dogs must weigh at least 40 pounds and be at least 6 years old at time of entry into the study. If no significant pre-existing health conditions are detected at the first exam, dogs are randomized into either placebo or one of the rapamycin treatment groups. Red indicates the 10-week period during which the dogs receive either rapamycin or placebo. Dogs receive the same generic rapamycin (sirolimus) pill that is provided to human patients. Asterisk: Serum and feces are collected at each appointment for future metabolomic and microbiome analyses and for quantitation of circulating rapamycin levels",
+      "Pending the outcome of phase 1, we anticipate enrolling several hundred additional dogs with similar entry criteria into a longer-term, 3-5 year study, to carefully assess the extent to which rapamycin improves health and reduces mortality in middle-age companion dogs. In addition to cardiac function, assessments of multiple age-related phenotypes will be performed including measures of cognitive function, muscle function, kidney function, glucose homeostasis, and cancer incidence. Many of these parameters are beneficially impacted by rapamycin in aging mice (Johnson et al. 2015), and we predict that rapamycin will induce similar improvements in aging dogs.",
+      "Rapamycin is currently the most effective pharmacological intervention for extending lifespan and delaying a broad range of age-related functional declines in rodents (Johnson et al. 2013). However, the doses used clinically to prevent organ transplant rejection are associated with side effects, such as impaired wound healing, edema, elevated circulating triglycerides, impaired glucose homeostasis, gastrointestinal discomfort, and mouth ulcers (Augustine et al. 2007; de Oliveira et al. 2011). These adverse side effects would likely preclude long-term use of rapamycin at these levels in otherwise healthy people. With the possible exception of impaired glucose homeostasis (Lamming et al. 2012), these side effects have not been observed at doses that are associated with increased lifespan and healthspan in mice, however, raising the possibility that lower doses of this drug could promote healthy aging with minimal adverse effects.",
+      "Rapamycin  Rapamycin is a macrolide isolated from Streptomyces hygroscopicus, a bacterium from Pascua Island (Rapa Nui). It has functions as an antibiotic, an immune suppressant drug, and it is also proposed as a CRM. After the first studies, it was found that rapamycin could induce the extension of the replicative life of yeast through the inhibition of TOR signaling [57]. This compound could extend the useful lifetime in 20-month-old mice in correlation with TOR activity [58]. These studies were the basis of the research to determine the function of rapamycin as a CRM, due to its modulating properties over proteostasis. In addition, studies suggest that rapamycin can be combined with other compounds (metformin, losartan, statins, propranolol, and aspirin among others) to potentiate their anti-aging activity [59].",
+      "Rapamycin  Rapamycin is a macrolide isolated from Streptomyces hygroscopicus, a bacterium from Pascua Island (Rapa Nui). It has functions as an antibiotic, an immune suppressant drug, and it is also proposed as a CRM. After the first studies, it was found that rapamycin could induce the extension of the replicative life of yeast through the inhibition of TOR signaling [57]. This compound could extend the useful lifetime in 20-month-old mice in correlation with TOR activity [58]. These studies were the basis of the research to determine the function of rapamycin as a CRM, due to its modulating properties over proteostasis. In addition, studies suggest that rapamycin can be combined with other compounds (metformin, losartan, statins, propranolol, and aspirin among others) to potentiate their anti-aging activity [59].",
+      "One out of the 25 FDA approved Breast cancer drugs (Gemcitabine), was found in the top 20 drug list from LINCS from breast cancer stage I (dark magenta). As shown in Fig. 12, one drug out of 25 FDA approved Breast cancer drugs, Gemcitabine, was found as repurposed drug from LINCS for breast cancer stage III. Letrozole (Breast cancer drug) has similar structure (greater than 60%) with Ruxolitinib (repurposed drug from LINCS) a drug for the treatment of intermediate or high-risk myelofibrosis (Fig. 13).",
+      "One out of the 25 FDA approved Breast cancer drugs (Palbociclib), was found in the top 20 drug list from LINCS from breast cancer stage II (deep pink). Scientific Reports | 6:20518 | DOI: 10.1038/srep20518  13 www.nature.com/scientificreports/  Figure 11. Highlighted target genes that physically interact with genes from the breast cancer stage II common network pattern and their corresponding repurposed drugs from LINCS, along with their structurally similar Breast cancer drugs. As shown in Figs 16\u201317 two target genes (TOP2A and TYMS) are also involved in the Triple Negative pattern.",
+      "Two of them (Gemcitabine and Palbociclib) are included in the list of the 25 known FDA-approved Breast cancer therapeutic drugs. We performed a Hypergeometric distribution test in order to find the statistical significance of this drug overlapping. More precisely, LINCS_L1000 database is comprised from 20,413 chemical reagents. Twenty two out of twenty five breast cancer drugs are also included in LINCS database. Finally, from the 105 drugs that were found from our analysis, the probability of finding two drugs to overlap with the Breast Cancer drugs in LINCS is 0.005471157, pointing out that there is statistical significance in their selection.",
+      "Two from the 25 FDA approved Breast cancer drugs (Gemcitabine and Palbociclib), was found in the top 20 drug list from LINCS from Luminal A breast cancer (dark magenta and deep pink respectively).",
+      "18 two drugs out of 25 FDA approved Breast cancer drugs \u2013 Gemcitabine and Palbociclib \u2013 were also found as repurposed drugs from LINCS for breast cancer Luminal A (Fig. 18). Two genes from the Luminal A network pattern physically interact with four genes that involved in Histone deacetylases class (HDAC1, HDAC2, HDAC3 and HDAC8), which are target genes of Vorinostat (repurposed drug from LINCS). Vorinostat is a member of a larger class of compounds that inhibit histone deacetylases (HDAC) and it is used to treat cutaneous T cell lymphoma (CTCL).",
+      "One out of the 25 FDA approved Breast cancer drugs (Gemcitabine), was found in the top 20 drug list from LINCS from breast cancer stage III (dark magenta). that was found from the drug repurposing analysis of HER2 pattern. It has similar structure - 75% with WZ-4002 repurposed drug, which is a novel mutant-selective inhibitor of EGFR. Finally, both Palbociclib and WZ-4002 are structurally similar to Dasatinib (more than 60%), which is a cancer drug used to treat acute lymphoblastic leukemia.",
+      "Network pattern for each breast cancer subtype and the common interactions across Luminal A and Luminal B. As shown in Fig. 8, one drug out of 25 FDA approved Breast cancer drugs, Gemcitabine, was proposed as repurposed drug by the LINCS for breast cancer stage I. Furthermore, Gemcitabine is quite similar (tanimoto31 similarity greater than 80%) with Clofarabine and Kinetin-riboside (repurposed drugs from LINCS). Clofarabine is also an anti-cancer, antineoplastic chemotherapy drug and is classified as an antimetabolite.",
+      "Hierarchical clustering using tanimoto similarity (Soergel distance) was applied to each of the top 20 drug list from LINCS and the 25 known FDA-approved Breast cancer therapeutic drugs (Supplementary Figs 54\u201361). LINCS Drug Names were transformed into ChemSpider IDs (see Supplementary Table 1) In synopsis, the unique drugs for the breast cancer stages were 63 and for the breast cancer subtypes 58, as we have located common drugs across them. Taking their union and removing the duplicates we conclude to a total of 105 repurposed drugs.",
+      "13, is also structurally similar (greater than 60%) with 6-(1,3-Benzodioxol-5-yl)-N-(cyclopentylmethyl)-4-quinazolinamine (repurposed small molecule from LINCS). As in breast cancer stages I and III one drug out of 25 FDA approved Breast cancer drugs \u2013 Gemcitabine \u2013 was found as repurposed drug from LINCS for breast cancer stage IV (Fig. 14). A repurposed drug from LINCS \u2013 Homoharringtonine was found to be structurally similar with Everolimus and Vinblastine Breast cancer drugs (greater than 70%). On the other hand, as shown in Fig.",
+      "Rapamycin has serious side effects, particularly as an immunosuppressor, and thus it is not suitable as an antiaging drug. As in sirtuins, however, these studies highlight the road from basic discovery on the biology of aging to antiaging interventions. Further studies of the TOR pathway and of repressors more specific of its downstream signaling pathway are ongoing. Whether rapamycin produces a change in another parameter related to energy uptake or utilization is unknown, and determining which of its effects modulate lifespan is an important unsolved question. Like resveratrol, TOR has attracted considerable attention from the pharmaceutical industry, particularly in the context of cancer (Meric-Bernstam and Gonzalez-Angulo, 2009)."
+    ],
+    [
+      "FUTURE DIRECTIONS: HIGHER RESOLUTION DATA VIA HIGHER THROUGHPUT ASSAYS  One inescapable conclusion of the aggregate results of genome-wide studies of aging to date (see summary Table 1) is that we have not come close to saturating the number of potentially lifespan-altering genes in any organism. This is in no small part because directly generating survival curves is a relatively time-consuming process in most model organisms using current methods. There are several possible ways to address this. One way that has been tried is by attempting to find surrogate phenotypes [72,73,126] that can be screened more rapidly, or even scored under selection. Another is mining candidates from the many whole-genome expression profiles. Results to date with these have been very fruitful, but have not suggested that these methods alone will rapidly saturate our search for lifespan- and healthspan-altering genes in tractable model organisms.",
+      "Genetic linkage studies of long-lived human families identified a longevity locus while candidate gene approaches have been used to identify and confirm the association between specific variants in the FOXO3A gene and human longevity [3\u20137]. Genome-wide association studies have also been used to identify the association of APOE with life span and have yielded insights into potential biological pathways and processes related to aging. Despite these successes, several problems are inherent in human longevity studies including potentially high degrees of environmental heterogeneity, genetic diversity, and lack of birth matched controls, among others [8].",
+      "Additional association studies with these families and replication of these results with an independent data set should facilitate the positional cloning of a gene that influences the ability to age well and achieve exceptional longevity. Identification of the genes in humans that allow certain individuals to live to extreme old age should lead to insights on cellular pathways that are important to the aging process.",
+      "The aging process most certainly is under highly polygenic controls\u2026 This should not discourage us from pursuing a search for those loci which may be of profound importance to human aging as it ordinarily occurs in most human beings.",
+      "In most experimentally modified animal model systems, single-gene mutations in many different genes have major life extension effects (Fontana et al., 2010; Kenyon, 2010). However, natural human and animal longevity is presumed to be a complex trait (Finch & Tanzi, 1997). In humans, both candidate gene and genome-wide genetic association approaches have been applied in an attempt to identify longevity loci. The frequency of genetic variants has been typically compared between nonagenarian cases and young controls, revealing loci at which genetic variants may contribute to a higher or lower probability of survival into old age. The initial candidate gene studies aimed at finding human longevity genes were dominated by contradictory results (Christensen et al., 2006). The more consistent evidence obtained by repeated observation in independent cohort studies for association with longevity has so far only been observed for three loci, the apolipoprotein E (APOE) locus (Schachter et al., 1994; Christensen et al., 2006), the FOXO3A locus (Willcox et al., 2008; Flachsbart et al., 2009; Pawlikowska et al., 2009; Soerensen et al., 2010), and the AKT1 locus (Pawlikowska et al., 2009). Thus, despite the expectation that longevity would be influenced by many genetic variants with small effect sizes, the effect of variants has consistently been shown in only three genes.",
+      "1993), and gene expression microarrays (Pletcher et al. 2002). Given the ambiguities and limitations of large-effect mutant studies of aging, discussed earlier, those publications do not provide very useful evidence with respect to the question of the number of loci that affect aging. At present, the best answer to the question of the number of genes controlling aging is many (Rose and Long 2002), in keeping with the original expectations of evolutionary biologists. However, studies of the genetics of the experimental evolution of aging are now amenable to the application of genomic methods.",
+      "Accepted Article  \u00a9 2013 The Authors Aging Cell \u00a9 2013 Blackwell Publishing Ltd/Anatomical Society of Great Britain and Ireland over 90 years and 1,955 controls between 55 and 80 years did not reveal genome-wide significant loci (Newman et al., 2010) and neither did the analyses of all-cause mortality and survival free of major disease in this cohort (Walter et al., 2011). A smaller Dutch study of 403 nonagenarians and 1,670 controls younger than 65 years identified the APOE gene as a mortality locus (Deelen et al., 2011), which was confirmed in a German study of 763 long-lived individuals and 1,085 younger controls (Nebel et al., 2011) and a longitudinal study of 1,606 Danes showed that the effect size of this association increases at the highest ages (Jacobsen et al., 2010). Apparently, the influence of the common genetic variation on longevity is small which requires large meta-GWA studies for identification. Alternatively, rare genetic variants may play a more important role in longevity. Since the previous linkage studies showed contradictory results potentially due to heterogeneity in the longevity phenotype, it is expected that longevity is influenced by many private rare variants.",
+      "The lack of success in the identification of genes related to aging in humans may be due to the complexity of the phenotype. One approach to investigate aging and longevity is to compare frequencies of genetic variants between nonagenarians or centenarians and the general population. This approach led to the discovery of an association between APOE (Deelen et al., 2011; Ewbank, 2007; Gerdes et al., 2000) and more recently FOXO3A (Anselmi et al., 2009; Flachsbart et al., 2009; Li et al., 2009a; Pawlikowska et al., 2009; Willcox et al., 2008) and human aging and longevity. However, a recent genome-wide association study (GWAS) of individuals reaching the age of 90 or older failed to identify genome-wide significant variants (Newman et al., 2010).",
+      "Several explanations are possible for the lack of genomewide significant findings. First, mortality is arguably 1 of the most complex phenotypes, and several trajectories toward extreme old age have been identified (Evert et al., 2003). Multiple genes could mediate the aging process but would have their effects through numerous different pathophysiological processes and diseases that act as intermediate factors on the pathway to death (de Magalhaes et al., 2010). Therefore, any common variation in genes associated with aging probably has a small effect.",
+      "Second, the largely negative findings of this and other studies contrast with the intriguing animal studies of longevity. Very large effects of single genes on lifespan have indeed been observed in laboratory animals, but humans often have several homologues of these genes which might significantly differ in function or compensate for mutated genes through redundant mechanisms (Kuningas et al., 2008). This could explain why our top findings did not include genes in these pathways found in animal models. Animal models also represent genetically homogenous populations and are exposed to controlled environmental influences. The lack of replication of animal model findings in humans suggests that the use of knockout animals may not provide the optimal approach to understanding the variation in survival in humans as interactions with environmental factors may obscure the associations and prevent the identification of loci in humans.",
+      "1993), and gene expression microarrays (Pletcher et al. 2002). Given the ambiguities and limitations of large-effect mutant studies of aging, discussed earlier, those publications do not provide very useful evidence with respect to the question of the number of loci that affect aging. At present, the best answer to the question of the number of genes controlling aging is many (Rose and Long 2002), in keeping with the original expectations of evolutionary biologists. However, studies of the genetics of the experimental evolution of aging are now amenable to the application of genomic methods.",
+      "The remarkable discoveries of the past 2 decades showing that single genes can regulate aging in model organisms demonstrate that aging can be genetically manipulated (Finch and Ruvkun, 2001; Kenyon, 2010). Hundreds of genes that modulate longevity have now been identified in model organisms (de Magalh\u00e3es et al., 2009a). In some cases (e.g., in worms), mutations in single genes can extend lifespan by almost 10-fold (Ayyadevara et al., 2008). Nonetheless, aging is a complex process that derives not from single genes but from the interactions of multiple genes with each other and with the environment. Evidence from animal systems shows a major impact of the environment on aging, yet environmental manipulations of aging act through genes and proteins, usually by triggering signaling pathways and modulating gene expression. In fact, some genes have been shown in model organisms to have varying effects on lifespan depending on diet (Heikkinen et al., 2009). Genes that can regulate aging in model organisms cannot be directly applied to humans through genetic manipulations for numerous legal, ethical, and technical reasons. If we could understand how the environment modulates these aging-related genes, we might be able to create antiaging therapies applicable to humans, potentially through diet, lifestyle, and even pharmacological interventions. Therefore, understanding genome-environment interactions in the context of aging can be a powerful approach to identify attractive targets for drug design.",
+      "Results from mutational analysis across eukaryote model organisms have shown unexpected conservation of genes and processes regulating aging. While unique properties exist within particular organisms that modulate these foundational networks, the conservation provides a tool to refine human genetic studies. As noted, GWAS for human longevity metrics suffer from large sample size requirements to obtain statistical resolution due to multiple hypothesis testing across the genome. Assuming that evolutionary genesets for longevity could be generated with confidence, an intersection of them with human variation data would increase the sensitivity of association studies. This would serve as a selective filter to refine the number of loci investigated for association in human populations. Similarly, such evolutionary filters could refine analysis of rare, unique variation within genome sequence data from extremely long-lived cohorts. A similar approach to refine human longevity GWAS used an intersection with age-related disease datasets. This 'disease-informed' GWAS helped refine candidates (iGWAS, Fortney et al., 2015), though it should be noted that this particular strategy would further blur the distinction between aging and longevity as discussed above. The definition of gene sets from evolutionary experiments in longevity, across clades, would similarly empower detection of networks previously hidden under GWAS in human population analyses (Figure 3).",
+      "ANALYSIS OF HUMAN VARIATION IN THE GENETIC CONTROL OF LONGEVITY  Heritability studies have convincingly demonstrated that at least some fraction of human lifespan is heritable. In tandem, large-scale genome-wide association studies (GWAS) have identified numerous loci associated with age-related traits (Buniello et al., 2019). While genetic studies have functionally shown an inverse effect of multiple age-related, disease-associated variants on lifespan regulation, the number of well-replicated longevity-conferring variants remains limited to variants in APOE (ApoE \u03b52), and more recently, CDKN2A/B and IL6 (see Table 1). To date, studies in humans have been hampered by the specific phenotype definitions used, sample sizes of the extreme phenotypes, and modest heritability of the longevity-related traits (Breitbach et al., 2019). This is due to the complex interplay of biological and social factors involved in human aging, as well as the limited power of GWAS, which require sampling thousands of subjects to achieve statistical significance (Breitbach et al., 2019). Genetic studies of aging have also been hindered by an inconsistent use of definitions of aging (reviewed in Baghdadi et al., 2020). The two main ways of conducting research on the genetics of longevity in human populations are by studying (i) the lifespan (continuous trait, years lived) and (ii) the longevity (dichotomous trait, i.e., being among the longest-lived individuals within a specific population). These complexities have limited the resolution and capability of broad association studies of human longevity. Importantly, these genomic analyses focus on a shift of survival in a population; these variables may be genetically distinct from the mechanisms establishing potential for longevity overall (Figure 1A). We argue that an understanding of this shift in lifespan as well as genetic mechanisms of regulating a species specific 'set points' (Figure 1B) will aid in the conceptual distinction of aging and longevity in humans.",
+      "TRANSLATION OF LONGEVITY MODEL ORGANISMS AND CORE AGING PATHWAYS  Genetic studies on lifespan have proven to be challenging. While longevity is a defining trait for a given species, the lifespan of individuals is of limited heritability, making analyses more difficult. Exceptional human life span, although a rare phenotype, is likely multifactorial; refined analyses are required to obtain statistically robust genomic signatures of longevity (Zhang et al., 2020) and these have proven elusive. Unlike laboratory models, the effect of environmental variance cannot be controlled in human studies, potentially masking purely biological aging mechanisms. Even laboratory models cannot replicate the complex \"environment\" of humans; it includes psychosocial, economic, and cultural factors, rather than strictly biological. These human-specific confounders are difficult or impossible to target in traditional model organisms. Despite these limitations, experimentally tractable model organisms have proven invaluable in deciphering the purely genetic contribution to lifespan, including genes and pathways conserved across the tree of life.",
+      "Our analyses show that it is extremely unlikely that there is a single gene harboring rare protein-altering variants shared by all supercentenarians but no controls. It is not surprising that a highly complex trait such as longevity is not explained by a single Mendelian gene.",
+      "With modern genomic technologies and large-scale data analysis methods, it is possible to sift through the genes of populations to find the loci that act to postpone aging [3]. There are uncertainties with the comparison of populations with different rates of aging. However, it is superior to experimental designs that only consider age-dependence or dietary-response, without determining causal mechanisms.",
+      "Although the models data set comprises all genes (to our knowledge) shown by the time of the latest update to statistically increase longevity or alter the aging process in a noticeable way, in the human data set we try to evaluate whether a given intervention is affecting the aging process itself or not. For example, many mutations may increase longevity by decreasing the incidence of specific diseases, rather than by altering the basic process of aging (de Magalh\u00e3es et al., 2005a, 2005b). Therefore, the human data set is not merely an extension of the work conducted in model organisms and of its bibliography, but a manually selected list of the most pertinent human aging candidate genes, each presented with a higher annotation level. We cite studies on whether the functions of aging-associated genes in model organisms are conserved in their human orthologues. Likewise, we cite flaws in previous studies based on new published observations, although we have a neutral stance on conflicting findings from different research groups. Our policy is to cite all conflicting reports and let visitors make their own decisions on how to interpret them. By contrast, each entry in GenAge model organisms has only one reference: the first publication reporting an association of the gene with longevity or aging. Moreover, one of the latest enhancements in the human data set was the inclusion of Gene Ontology annotation. Gene Ontology terms and annotation files were obtained from the Gene Ontology Consortium website (http://www.geneontology.org/) and provide an additional layer of description for the gene products in a cellular context (Ashburner et al., 2000).",
+      "Conclusions and prospects  Over the past two decades the human aging field has built up the necessary resources to study the biology of aging and longevity by establishing human populations with a diversity of designs. Meta-analyses integrating genetic and phenotypic datasets have successfully identified variants associated with a range of age-related traits and diseases. Despite these accomplishments, the number of novel leads contributing to human lifespan regulation is limited. Although positive regions of linkage and suggestive GWAS hits have been reported, the field has not yet identified the loci that explain the clustering of longevity in families and the variation in biological aging rate in the population. As for animal models, down-signaling of the IIS and mTOR pathway appeared to be relevant in humans. These findings are being followed up by molecular and physiological profiling using skin, fat and muscle tissue of long-lived family members and controls. Human studies now also include the response of nutrient sensing systems to the application of dietary and physical challenges.",
+      "Most of the human candidate gene studies were performed in cross-sectional designs (Box 1 and Fig. 1), comparing allele frequencies of potential longevity loci between highly aged individuals and young controls. The candidate gene studies based on single genes have pointed to a role for genes involved in, e.g., GH/insulin/IGF-1 signaling, immune regulation, and lipoprotein metabolism (Supporting Information Table S1), although most of these results have not (yet) been confirmed in sufficient independent studies. The most convincing human longevity loci today are APOE and FOXO3A, which have frequently been associated with longevity in cross-sectional studies (for a review, see [26]) and survival in prospective studies [27][28][29] (Fig. 3). APOE encodes the protein apolipoprotein E, which seems to play a role in, e.g., lipoprotein metabolism, cognitive function, and immune regulation [30]. FOXO3A encodes the protein forkhead box O3, which acts as a transcription factor for many different genes involved in processes like apoptosis and oxidative stress [31]."
+    ],
+    [
+      "Funding: See page 22 Preprinted: 24 June 2021 Received: 03 November 2021 Accepted: 01 April 2022 Published: 07 April 2022 Reviewing Editor: Joris Deelen, Max Planck Institute for Biology of Ageing, Germany Copyright Mozhui et al. This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited. Editor's evaluation This article used three newly generated epigenetic predictors to test how they differ between genetically diverse mice from the BXD family (by looking at metabolic traits and lifespan).",
+      "Longevity data was obtained from a parallel cohort of BXD mice housed in the same UTHSC colony, and members of this \u2018longevity cohort\u2019 were allowed to age until natural death (more detail on the longevity cohort can be found in Roy et al., 2021). Males were excluded and strain-by-diet lifespan summary statistics were derived. Only strain-by-diet groups with five or more observations for lifespan were included in the correlational analyses with the epigenetic predictors. Multivariable EWAS Site-by-site differential methylation analysis (EWAS) was performed on the 27,966 CpGs using a multivariable regression model.",
+      "Funding: See page 22 Preprinted: 24 June 2021 Received: 03 November 2021 Accepted: 01 April 2022 Published: 07 April 2022 Reviewing Editor: Joris Deelen, Max Planck Institute for Biology of Ageing, Germany Copyright Mozhui et al. This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited. Editor's evaluation This article used three newly generated epigenetic predictors to test how they differ between genetically diverse mice from the BXD family (by looking at metabolic traits and lifespan).",
+      "Longevity data was obtained from a parallel cohort of BXD mice housed in the same UTHSC colony, and members of this \u2018longevity cohort\u2019 were allowed to age until natural death (more detail on the longevity cohort can be found in Roy et al., 2021). Males were excluded and strain-by-diet lifespan summary statistics were derived. Only strain-by-diet groups with five or more observations for lifespan were included in the correlational analyses with the epigenetic predictors. Multivariable EWAS Site-by-site differential methylation analysis (EWAS) was performed on the 27,966 CpGs using a multivariable regression model.",
+      "Conclusions These results suggest a novel locus influencing survival in the B6/D2 genetic background, perhaps via a metabolic disorder that emerges by 200 days of age in male animals. Keywords Pathology, Longevity, Lifespan, Mouse, Linkage. Introduction Longevity, the quintessential complex trait, likely reflects all aspects of an organism\u2019s life history. In humans, the heritability of age at death is estimated at 25\u201333% [1]. Genetic contributions to mortality rates are thus of great interest and may aid in the understanding of disease etiology and the process of aging itself [2].",
+      "Leduc MS, Hageman RS, Meng Q et al (2010) Identification of genetic determinants of IGF-1 levels and longevity among mouse inbred strains. Aging Cell 9(5):823\u2013836. doi:10.1111/j.1474-9726.2010.00612.x 10. Lang DH, Gerhard GS, Griffith JW et al (2010) Quantitative trait loci (QTL) analysis of longevity in C57BL/6J by DBA/2J (BXD) recombinant inbred mice. Aging Clin Exp Res 22(1):8\u201319 11. Gelman R, Watson A, Bronson R et al (1988) Murine chromosomal regions correlated with longevity. Genetics 118(4):693\u2013704 12. Jackson AU, Galecki AT, Burke DT et al (2002) Mouse loci associated with life span exhibit sex-specific and epistatic effects.",
+      "Here, we have extended this analysis to search for genotypes related to survival to the age of 800 days in a population of a reciprocal F2 cross between (B6) and (D2) mice. Since QTL for longevity in mice have shown strong sex specificity [10, 12], we conducted sex-specific analyses. In addition, we also determined whether there were any changes in pathology associated with the loci that showed frequency distortions with aging. To confirm the associations of the loci of interest with longevity and pathology, we performed replication analyses on a panel of BXD recombinant inbred strains.",
+      "Methods We examined a population of 1200 mice that were F2 generation offspring of a 4-way reciprocal cross between C57BL6/J and DBA2/J strains. Animals were sacrificed at age 200, 500, or 800 days and genotyped at 96 markers. The 800 days old cohort, which were the survivors of a much larger breeding group, were examined for enriched frequency of alleles that benefit survival and depletion of alleles that reduce survival. Results Loci on Chr 13 in males and on Chr X in females were significantly distorted from Mendelian expectations, even after conservative correction for multiple testing.",
+      "Assessing epigenetic age in long-lived mice  The epigenetic-aging model was applied to the methylation profiles of long-lived mice and the age-matched controls not used for training (Additional file 2: Datasets used summary). Reductions in age were calculated by subtracting the epigenetic ages of the untreated, wild-type mice from those of the treated mice of the same genetic background. To assess the significance, we used an ANOVA for all 22-month-old mice or only 22-month-old UM-HET3 mice. We also compared the epigenetic ages between treatments with their age-matched controls from the same genetic background using a t-test (Additional file 4: Treatment vs wild type stats).",
+      "Editor's evaluation  This article used three newly generated epigenetic predictors to test how they differ between genetically diverse mice from the BXD family (by looking at metabolic traits and lifespan). The authors subsequently identified several quantitative trait loci for the different predictors, using linkage analysis, and performed transcriptome and proteome analyses of liver and adipose tissue. The described results provide some important new insights on the underlying biology of epigenetic mouse aging and may be used to inform future studies in other model organisms and humans focused on studying the relationship between epigenetic aging and metabolism.",
+      "352(6291): p. aad0189. Liao, C.Y., et al., Genetic variation in the murine lifespan response to dietary restriction: from life extension to life shortening. Aging Cell, 2010. 9(1): p. 92-5. Johnson, M., Laboratory Mice and Rats. Mater. Methods, 2012. 2: p. 113. Fontaine, D.A. and D.B. Davis, Attention to Background Strain Is Essential for Metabolic Research: C57BL/6 and the International Knockout Mouse Consortium. Diabetes, 2016. 65(1): p. 25-33. Simon, M.M., et al., A comparative phenotypic and genomic analysis of C57BL/6J and C57BL/6N mouse strains. Genome Biol, 2013. 14(7): p. R82. Lilue, J., et al.",
+      "Materials and Methods  Study Design. Female mice of the long-lived F1 hybrid strain C3B10RF1 were fed and maintained as described (7). Briefly, mice were weaned at 28 days, individually housed, given free access to water, and randomly assigned to study groups. Comparisons between five groups of mice were used to determine the effects of aging and CR on gene expression. Control young (7-month-old; n = 3) and old (27-month-old; n = 3) mice were fed 95 kcal of a semipurified control diet (Harlan Teklad, Madison, WI; no. TD94145) per week after weaning. Long-term CR (LT-CR) young (7-month-old; n = 3) and old (27-month-old; n = 3) mice were fed 53 kcal of a semipurified CR diet (Harlan Teklad; no. TD94146) per week after weaning. Short-term CR (ST-CR) mice were 34-month-old control mice that were switched to 80 kcal of CR diet for 2 weeks, followed by 53 kcal for 2 weeks (n = 3). The effects of age on gene expression in control mice were determined by comparison between results from the young control and the old control groups. The effects of LT-CR on gene expression were determined by comparison between results from the young control and the young LT-CR groups, and from the old control and the old LT-CR groups. The effects of ST-CR were determined by comparison between results from the old control and the ST-CR groups. Mice were fasted for 48 h before killing. Mice were killed by cervical dislocation, and the livers were rapidly excised and flash frozen in liquid nitrogen. No signs of pathology were detected in any of the animals used. All animal use protocols were approved by the institutional animal use committee of the University of California, Riverside.",
+      "Accessing data resources in the mouse phenome database for genetic analysis of murine life span and health span. J. Gerontol. A Biol. Sci. Med. Sci. 71 (2), 170\u2013177. Brown, R.E., Stanford, L., Schellinck, H.M., 2000. Developing standardized behavioral tests for knockout and mutant mice. ILAR J. 41 (3), 163\u2013174. Bubier, J.A., Jay, J.J., Baker, C.L., Bergeson, S.E., Ohno, H., Metten, P., Crabbe, J.C., Chesler, E.J., 2014. Identification of a QTL in Mus musculus for alcohol preference, withdrawal, and Ap3m2 expression using integrative functional genomics and precision genetics. Genetics 197 (4), 1377\u20131393. Burn, C.C., 2008.",
+      "Our own work has taken a different tack: we have attempted to determine whether mutations with differential effects on aging may be present within the many available populations of laboratory-adapted inbred mice. The goal is not so much to clone these genes (if indeed they exist), because positional cloning strategies of this kind require many thousands of animals and would be extremely expensive using an assay, age at death, that is itself so costly. Instead, the goal has been to use gene mapping methods to test hypotheses about aging and to develop new animal models that will be useful for testing well-specified hypotheses about the molecular basis for age-dependent changes. In the absence of a validated battery of biomarkers of aging, we (like most others) have reluctantly decided to use mouse life span as a crude surrogate for aging itself, reasoning that genetic alleles that extend life span well beyond the median for the tested population may be operating via an influence on aging itself. Work conducted using recombinant inbred mouse stocks (Gelman et al., 1988; de Haan and Van Zant, 1999) has suggested that life-span differences between pairs of inbred mouse lines might reflect the influence of as few as 4-7 polymorphic loci, providing some basis for hope that some of these would have an effect large enough to be detected by a genome scan experiment involving 300-1,200 mice.",
+      "The available dataset also provides examples in which genetic variants seem to influence the risk of specific late-life diseases. Figure 8-6, for example, shows longevity results for mice stratified by their inheritance at the 12th chromosome locus D12Mit167. This is a locus associated with differential longevity in both male and female mice, with the strongest effect (adjusted p < 0.01) seen in those mice living more than 657 days (Jackson et al., unpublished results). The longest-lived mice are those that inherit both the C57BL/6 allele from their mother and the C3H allele from their father; on average, they survive 93 days longer than siblings with the BALB plus C3H combination. Figure 8-6 shows that the D12Mit167, like the pair of loci illustrated in Figure 8-5, has significant and similar effects in mice dying of cancer (85 days) and in mice dying of non-neoplastic diseases (126 days). A more detailed analysis of the cancers, however, suggests that while lymphoma and hepatoma victims are equally protected by the favorable alleles (effect sizes of 93 and 167 days, respec-  mice of two subgroups: those dying of the urinary syndrome MUS, and those dying of all other causes. The genetic analysis contrasts mice with both the C57BL/6 allele at D4Mit84 and the C3H allele at D9Mit110 to mice with any of the three other allele combinations. In the males dying of causes other than MUS, this allele pair is associated with a 170-day increment in longevity (post-hoc p < 0.00003). But for males that do die of MUS, the same allele combination is associated with a 187-day decline in mean life span (post-hoc p < 0.03). This effect is thus pleiotropic, in that these alleles accelerate death in mice susceptible to MUS, while postponing death for all other males in the population. Although these loci are associated with differential longevity in mice that do develop MUS, they do not have a significant effect on the chances that MUS will indeed occur (not shown). The risk of developing MUS seems to be under control of a separate locus on chromosome 6. As shown in the bottom panel of Figure 8-7, males that inherit the C3H allele at D6Mit268 are far more likely to develop MUS (28 percent risk) than are their brothers who receive the DBA/2 allele at this locus (7 percent risk; p = 0.012 by two-tailed Fisher's exact test).",
+      "Previously, the methylation status of CpG sites within the genes Prima1, Hsf4, and Kcns1 was shown to qualify as a reliable predictor of chronological age of B6 mice.10 This same study also revealed enhanced epigenetic aging of the D2 strain in accordance with its general reduced mean life span, supporting the possibility that the panel might also serve as a marker for the biological age in mice. Applying this B6-trained marker panel to our (congenic) experimental strains, we observed that epigenetic age predictions correlated with chronological age in B6 (R² = 0.93) and line A mice (R² = 0.89).",
+      "34. Gelman R, Watson A, Bronson R & Yunis E Murine chromosomal regions correlated with longevity. Genetics 118, 693\u2013704 (1988). [PubMed: 3163317] 35. Houtkooper RH et al. The metabolic footprint of aging in mice. Sci. Rep. 1, (2011). 36. Houtkooper RH et al. Mitonuclear protein imbalance as a conserved longevity mechanism. Nature 497, 451\u2013457 (2013). [PubMed: 23698443] 37. Williams EG et al. An evolutionarily conserved role for the aryl hydrocarbon receptor in the regulation of movement. PLOS Genet. 10, e1004673 (2014). [PubMed: 25255223] 38. Lang DH et al. Quantitative trait loci (QTL) analysis of longevity in C57BL/6J by DBA/2J (BXD) recombinant inbred mice. Aging Clin. Exp. Res. 22, 8\u201319 (2010).",
+      "For females, hairs of the congenic mice grew 31% faster, also highly significant (P = 0.0006, 1-tailed). These results validated the presence of a gene in the differential region affecting FE. Discussion We report the outcomes of a quantitative genetic study on aging and longevity in the mouse. We studied an extant series of recombinant inbred strains (ILSXISS) that have been used both in DR aging studies as well as to study alcohol sensitivity (Williams et al., 2004).",
+      "FOURTH STEP: MEDICAL TESTING OF CANDIDATE DRUGS  Many genes are common between fruit flies and mammals, but by no means all. Therefore, it is important to test biochemical pathways that work in fruit flies in mammals. Mice are the system of choice, as they have relatively short lifespans (2-3 years) and a great deal is known of their genetics. Mortality rate measurements, like those studied in fruit flies, [10] might speed up mouse trials to just 6-12 months. Mouse trials would also help address issues of safety, such as liver and kidney toxicity, before going on to human trials.",
+      "Experimental Procedures  Mouse Breeding, Maintenance, and Longevity. Cdc42GAP \u03ea/\u03ea and p53 \u03ea/\u03ea mice were generated as previously described (6,35), and the mice used in the studies were mixed C57BL/6 \u03e9/\u03ea 129/Sv inbred. Littermates of different genotypes were housed and fed freely with standard mouse chow over their life span in a pathogen-free environment and were monitored for vitality and longevity. Mice exhibiting extreme morbidity were euthanized and subjected to necropsy. All animal procedures were approved by the Institutional Animal Care and Use Committee at the Children's Hospital Research Foundation."
+    ],
+    [
+      "Genetic associations for two biological age measures point to distinct aging phenotypes. Aging Cell 20:e13376. DOI: https://doi.org/10.1111/acel.13376, PMID: 34038024 Lang DH, Gerhard GS, Griffith JW, Vogler GP, Vandenbergh DJ, Blizard DA, Stout JT, Lakoski JM, McClearn GE. 2010. Quantitative trait loci (QTL) analysis of longevity in C57BL/6J by DBA/2J (BXD) recombinant inbred mice. Aging Clinical and Experimental Research 22:8\u201319. DOI: https://doi.org/10.1007/BF03324809, PMID: 20305363 Lappalainen T. 2015. Functional genomics bridges the gap between quantitative genetics and molecular biology. Genome Research 25:1427\u20131431.",
+      "Pharmacol Biochem Behav 81, 764\u2013768. Hsu, H.C., Lu, L., Yi, N., Van Zant, G., Williams, R.W. & Mountz, J.D. (2007) Quantitative trait locus (QTL) mapping in aging systems. Methods Mol Biol 371, 321\u2013348. Hurlin, P.J. & Huang, J. (2006) The MAX-interacting transcription factor network. Semin Cancer Biol 16, 265\u2013274. Jones, B.C., Tarantino, L.M., Rodriguez, L.A., Reed, C.L., McClearn, G.E., Plomin, R. & Erwin, V.G. (1999) Quantitative-trait loci analysis of cocaine-related behaviours and neurochemistry. Pharmacogenetics 9, 607\u2013617. Jones, B.C., Beard, J.L., Gibson, J.N., Unger, E.L., Allen, R.P., McCarthy, K.A. & Earley, C.J.",
+      "Genetic associations for two biological age measures point to distinct aging phenotypes. Aging Cell 20:e13376. DOI: https://doi.org/10.1111/acel.13376, PMID: 34038024 Lang DH, Gerhard GS, Griffith JW, Vogler GP, Vandenbergh DJ, Blizard DA, Stout JT, Lakoski JM, McClearn GE. 2010. Quantitative trait loci (QTL) analysis of longevity in C57BL/6J by DBA/2J (BXD) recombinant inbred mice. Aging Clinical and Experimental Research 22:8\u201319. DOI: https://doi.org/10.1007/BF03324809, PMID: 20305363 Lappalainen T. 2015. Functional genomics bridges the gap between quantitative genetics and molecular biology. Genome Research 25:1427\u20131431.",
+      "Interestingly, the correlation analysis indicates QTL Mapping in Aging Systems  333  Fig. 5. Basic statistics provided by the WebQTL GeneNetwork website. The strain distribution pattern (SDP) of the quantitative trait is presented in the basic statistics page of WebQTL in the following ways: (A) the raw data of the quantitative trait obtained from each BXD recombinant inbred (RI) strain, (B) data mean and distribution, (C) bar graph showing the mean and variable of each strain, and (D) the normal probability plot of the SDP.",
+      "23 Quantitative Trait Locus (QTL) Mapping in Aging Systems Hui-Chen Hsu, Lu Lu, Nengjun Yi, Gary Van Zant, Robert W. Williams, and John D. Mountz Summary Understanding the genetic basis of the effects of aging on the decline in the immune response is an enormous undertaking. The most prominent age-related change in the immune system is thymic involution. This chapter will focus on the use of C57BL/6J X DBA/2J (BXD) recombinant inbred (RI) strains of mice to map genetic loci associated with age-related thymic involution in mice.",
+      "For further prioritization, we converted the mouse QTL regions to the corresponding syntenic regions in the human genome and retrieved GWAS annotations for these intervals (Buniello et al., 2019). We specifically searched for the traits: epigenetic aging, longevity, age of menarche/menopause/puberty, Alzheimer's disease, and age-related cognitive decline and dementia. This highlighted five genes in Eaa11 and three genes in Eaa19 (Supplementary file 4c). We also identified a GWAS that found associations between variants near Myof-Cyp26a1 and human longevity (Yashin et al., 2018), and a meta-GWAS that found gene-level associations between Nkx2-3 and Cutc, and epigenetic aging (Supplementary file 4c; McCartney et al., 2021).",
+      "Jiang, C. and Zeng, Z. B. (1995). Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140, 1111\u20131127. Jin, W., Riley, R. M., Wolfinger, R. D. et al. (2001). The contributions of sex, genotype and age to transcriptional variance in Drosophila melanogaster. Nat Genet 29, 389\u2013395. Kempermann, G., Chesler, E. J., Lu, L. et al. (2006). Natural variation and genetic covariance in adult hippocampal neurogenesis. Proc Natl Acad Sci U S A 103, 780\u2013785. Kendziorski, C. M., Chen, M., Yuan, M. et al. (2006). Statistical methods for expression quantitative trait loci (eQTL) mapping. Biometrics 62, 19\u201327.",
+      "Hypothesis-free genome-wide approaches have also been undertaken. Genome-wide linkage scans reported evidence for linkage with longevity on chromosome 4q25 (Puca et al., 2001), 3p24-22, 9q31-34, and 12q24 (Boyden & Kunkel, 2010). However, the evidence for these loci is still very weak as the results, obtained in centenarians and their families, could not be replicated in nonagenarian sibling pairs (Beekman et al., 2006) or have yet to be tested in other studies. A meta GWAS of survival to 90 years or older in 1836 cases and 1955 controls did not find any significant genome-wide associations (Newman et al., 2010). Thus far, hypothesis-free approaches have not identified any loci involved in longevity.",
+      "Abiola O, Angel JM, Avner P, Bachmanov AA, Belknap JK, Bennett B, et al. The nature and identification of quantitative trait loci: a community\u2019s view. Nat Rev Genet. Nature Publishing Group; 2003; 4: 911\u2013916. https://doi.org/10.1038/nrg1206 PMID: 14634638  18. Grupe A, Germer S, Usuka J, Aud D, Belknap JK, Klein RF, et al. In silico mapping of complex disease-related traits in mice. Science. American Association for the Advancement of Science; 2001; 292: 1915\u20131918. https://doi.org/10.1126/science.1058889 PMID: 11397946  19. Pletcher MT, McClurg P, Batalov S, Su AI, Barnes SW, Lagler E, et al.",
+      "coid levels, etc. The mapping project should thus help to guide the search for human genes that regulate these interesting phenotypes and at the same time spark new investigations, in animal models, for the biochemical differences that mediate the genetic effects we detect. At the same time, the dataset that emerges should also allow us to test more general questions about the nature of aging and its genetic control. We may, for example, be able to identify QTLs that not only retard the development of one or more age-sensitive T-cell subsets, but also retard age-dependent changes in protein conformation, bone matrix turnover, and brain GFAP levels. Such a finding would imply that these changes are influenced, together, by a common biochemical pathway, and the corresponding QTLs would be excellent candidates for genes that regulate aging per se, rather than merely one among the many more age-sensitive traits. In the same way, it will be of particular interest to determine if QTLs that regulate age-sensitive traits also are associated with differences in life span, and conversely if QTLs identified on the basis of longevity effects modify one (or nearly all?) of the age-sensitive traits in our test battery.",
+      "The strategy for mapping such quantitative trait loci (QTL) involves looking for preferential segregation of specific alleles or allele combinations in mice that differ in life span (or, more generally, any age-sensitive trait of interest). Our test population, called UM-HET3, consisted of a group of mice bred as the progeny of females of the (BALB/c \u00d7 C57BL/6)F1 genotype and males of the (C3H/HeJ \u00d7 DBA/2)F1 genotype. Mice bred in this way are, from a genetic perspective, all siblings; each shares a random half of its alleles with every other animal in the UM-HET3 population. The current set of analyses was conducted when genotype and longevity data were available from a group of 110 virgin males and 143 virgin females. The analytical method adjusted, by permutation testing, for Type I errors attributable to the simultaneous evaluation of multiple linkage hypotheses, and also included gender as a covariate to look for instances of sex-specific genetic effects. Because we had particular interest in regulation of late-life diseases rather than in causes of premature death, and because of evidence that genetic influences on mouse longevity were particularly strong when early deaths were not considered (Covelli et al., 1989), we repeated each analysis after exclusion of those animals dying before 657 days of age, i.e., the age at which 20 percent of the animals had already died.",
+      "The proportion of the phenotypic variance accounted for by the QTL yield for Hbact and Hbrear was substantial and of the same order of magnitude as that contributed by age. A small number of age-dependent QTL were found in the midst of a majority of age-stable QTL (see discussion above). These age-sensitive loci point toward genes whose functions are correlated with important behavioral changes during aging.",
+      "Ageing genes and pathways. Assessing the loci of interest for colocalisation with gene expression quantitative trait loci (eQTL), we find strong evidence (FDR SMR < 5%; P HEIDI > 1%; see \"Methods\") of cis-acting eQTL colocalisation for eight out of 10 loci. In total, we highlight 27 unique genes acting across 32 tissues, especially whole blood (12 genes) and the tibial nerve (7 genes) (Supplementary Data 5). In blood, higher expression levels of BCL3 and CKM (near APOE); CTC-510F12.2, ILF3, KANK2 and PDE4A (near LDLR); USP28 and ANKK1 (near ZW10); and CDKN2B are linked to an increase in multivariate ageing traits (i.e., improved survival), while the opposite is true for EXOC3L2 (near APOE), TTC12 (near ZW10), and FOXO3. For the multivariate signal near SLC4A7 we find colocalisation with expression of NEK10 (liver); for the signal near LPA we find colocalisation with expression of SLC22A1/A3 (multiple tissues) and MAP3K4 (pituitary); and for the signal near FGD6 we find colocalisation with expression of FGD6 itself (adipose/arterial). Including trans-acting eQTL from blood, while keeping the same thresholds for colocalisation, we additionally discover that higher expression of FOXO3B colocalises with the life-extending signal near FOXO3. When we include genes which could not be tested for heterogeneity (N eQTL < 3), we identify one additional cis-acting and 49 additional trans-acting genes (of which 10 colocalise with the signal near LINC02513) (Table 2; Supplementary Data 5).",
+      "Jiang, C. and Zeng, Z. B. (1995). Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140, 1111\u20131127. Jin, W., Riley, R. M., Wolfinger, R. D. et al. (2001). The contributions of sex, genotype and age to transcriptional variance in Drosophila melanogaster. Nat Genet 29, 389\u2013395. Kempermann, G., Chesler, E. J., Lu, L. et al. (2006). Natural variation and genetic covariance in adult hippocampal neurogenesis. Proc Natl Acad Sci U S A 103, 780\u2013785. Kendziorski, C. M., Chen, M., Yuan, M. et al. (2006). Statistical methods for expression quantitative trait loci (eQTL) mapping. Biometrics 62, 19\u201327.",
+      "Quantitative trait loci (QTLs) can be identified in several ways, but is there a definitive test of whether a candidate locus actually corresponds to a specific QTL?  Much of the genetic variation that underlies disease susceptibility and morphology is complex and is governed by loci that have quantitative effects on the phenotype. Gene-gene and gene-environment interactions are common and make these loci difficult to analyse. Here, we present a community\u2019s view on the steps that are necessary to identify genetic loci that govern quantitative traits, along with a set of interpretive guidelines.",
+      "QTL Analysis in Hematopoiesis  47  3 Quantitative Trait Analysis in the Investigation of Function and Aging of Hematopoietic Stem Cells Hans-Willem Snoeck Summary Extensive genetically determined quantitative variation exists in the number and function of hematopoietic stem cells in inbred mouse strains. Furthermore, aging of hematopoietic stem cells is genetically determined. Gene identification of quantitative trait loci involved in the regulation and aging of hematopoietic stem cells would provide novel insights into regulatory mechanisms that are relevant in vivo and may be clinically important.",
+      "In order to find the causal loci for heritable differences in transcript levels and possible interactions between age and genotype, we applied a two-time-point model. In this model, we used three factors, (1) relative age, (2) genotype (marker), and (3) the interaction between factors 1 and 2, to explain the differences in gene expression between RILs and age groups. With this mapping procedure, we found almost 900 genes that had an eQTL or gxa eQTL in developing and/or aging worms (P < 0.0001; Fig. 2). Almost half of these genes with heritable transcript differences were found to have a genotype-by-age effect (396 at P < 0.0001; Table 1) allocated to a specific marker, which we coined genotype-by-age expression-QTL (gxa eQTL). One specific hotspot (trans-band) for gxa eQTL was found on chromosome IV for aging worms and a trans-band for eQTL on chromosome I was detected in developing worms (Fig. 2).",
+      "We found three significant QTLs (genetic regions harboring genes controlling these various aging traits, Supplementary Table 5). On chromosome 7, we found a QTL affecting lifespan and fertility after DR that we have named Lfdr1 for \u201clongevity and fertility response to dietary restriction\u201d QTL 1; this QTL also has suggestive effects on FE (Fig. 5D). Two QTLs having significant effects on FE were identified on chromosomes 9 and 15. These we have named Fedr1 and Fedr2, respectively, for \u201cfuel efficiency response to dietary restriction\u201d QTLs 1 and 2.",
+      "Quantitative trait locus (QTL) mapping in aging systems. Methods in Molecular Biology (Clifton, NJ). 2007; 371:321\u2013348. Hunter KW, Crawford NPS. The future of mouse QTL mapping to diagnose disease in mice in the age of whole-genome association studies. Annual Review of Genetics. 2008; 42:131\u2013141. Ito R, Robbins TW, Everitt BJ. Differential control over cocaine-seeking behavior by nucleus accumbens core and shell. Nature Neuroscience. 2004; 7:389\u2013397. [PubMed: 15034590] Kapp MB. Ethical and legal issues in research involving human subjects: do you want a piece of me? Journal of Clinical Pathology. 2006; 59:335\u2013339.",
+      "Jiang, C. and Zeng, Z. B. (1995). Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140, 1111\u20131127. Jin, W., Riley, R. M., Wolfinger, R. D. et al. (2001). The contributions of sex, genotype and age to transcriptional variance in Drosophila melanogaster. Nat Genet 29, 389\u2013395. Kempermann, G., Chesler, E. J., Lu, L. et al. (2006). Natural variation and genetic covariance in adult hippocampal neurogenesis. Proc Natl Acad Sci U S A 103, 780\u2013785. Kendziorski, C. M., Chen, M., Yuan, M. et al. (2006). Statistical methods for expression quantitative trait loci (eQTL) mapping. Biometrics 62, 19\u201327."
+    ],
+    [
+      "Introduction  With the development of human genomics research, a large number of studies of the genetics of longevity have been conducted. Scientists from various countries have proposed many different theories concerning the mechanisms of aging from different perspectives, involving oxidative stress, energy metabolism, signal transduction pathways, immune response, etc. [1,2]. These mechanisms interact with each other and are influenced by heredity to some degree [2,3]. The identification of longevity-related biological markers is critical to an in-depth understanding of the mechanisms of carrier protection against common disease and/or of the retardation of the process of aging.",
+      "INTRODUCTION  Human aging is affected by genes, life style, and environmental factors. The genetic contribution to average human aging can be modest with genes explaining \u223c20-25% of the variability of human survival to the mid-eighties (Herskind et al., 1996; Fraser and Shavlik, 2001). By contrast, genetic factors may have greater impact on survival to the ninth through eleventh decades (Tan et al., 2008). Notably, exceptional longevity is rare and may involve biological mechanisms that differ from those implicated in usual human aging.",
+      "Introduction  Geroscience refers to research aimed at understanding the mechanisms of biological aging (Kennedy et al. 2014). A major goal of geroscience is to define the genetic, epigenetic, and environmental features that determine individual rates of aging. From a translational perspective, a further goal is to use this knowledge to develop interventions that can slow or delay aging in order to promote healthy longevity and increase healthspan, the period of life spent in good health free from chronic disease and disability (Burch et al. 2014; Pitt and Kaeberlein 2015).",
+      "the maximum human life span.Several avenues to studying aging have placed us on Department of Biology Massachusetts Institute of Technology the threshold of understanding basic underlying mechanisms.These approaches include the identification of Cambridge, Massachusetts 02139 key genes and pathways important in aging; genetic studies of heritable diseases that cause the appearance of premature aging in affected people; physiological ex-Introduction periments that relate the pace of aging to caloric intake; Is aging the final act in the script of developmental bioland advances in human genetics, as well as cell and ogy?The characteristic changes that are part and parcel molecular biology leading to an understanding of the of aging appear similar to developmentally regulated basis of many diseases of aging.Strikingly, single gene programs.But why would aging mechanisms have been mutations have been found to significantly extend the evolutionarily selected as advantageous?Indeed, evolife span in C. 
elegans, yeast, and, most recently, Drolutionary biologists might argue that aging occurs by sophila, suggesting that aging may be relatively simple, default due to the absence of selection in the postreproat least in these organisms.Further, the limited replicaductive phase of life.By this view, the aging process is tion potential of human cells in culture has been attribnot programmed, but, rather, the detritus of the absence uted to a specific mechanism (i.e., the shortening of of selection for maintenance (Medawar, 1952; Kirkwood, telomeric ends of chromosomes).An important chal- 1977).However, it is quite reasonable that any mechalenge is now to relate these recent findings to the more nisms that sprang up to slow or regulate the pace of complex case of human aging.aging would be selected, because lucky individualsIn this review, we will discuss several important mocould potentially give rise to more progeny.Therefore, lecular models of aging that come from current research.it is reasonable to suppose that life span extending pro-These are damage by reactive oxygen species (ROS) cesses have been selected and that these can be viewed generated by metabolism, genome instability, genetias an elaboration of development itself.In principle, cally programmed extension mechanisms, cell death, such extension mechanisms may act to slow or forestall and systemic aging.Questions to be posed include the deleterious changes in an organism that progressively following.What evidence exists for and against these lead to death.The life span of an organism, therefore, models?Can more than one of these models apply to is the sum of deleterious changes and counteracting aging of different tissues in humans-specifically do repair and maintenance mechanisms that respond to organs with continually dividing cells age by the same the damage (Figure 1).mechanism as organs that are postmitotic?Finally, is A priori, one imagines such longevity mechanisms to aging amenable to therapeutic 
intervention, and would be much less complex than those regulating embryonic such intervention be advisable?development.The spatial and temporal constraints on embryonic development are many, while requirements Oxidative Damage for longevity mechanisms might be much more specific One theory of aging proposes that ROS which are generif there were a single process (or a few processes) whose ated by metabolism cause cumulative damage over a breakdown is the limiting event in longevity (i.e., the lifetime (Harman, 1981).Roughly two to three percent Achilles heel).of oxygen taken up is chemically reduced by the addition Aging is defined when two criteria are met.First, the of single electrons, which are sequentially converted probability of death at any point in time increases with into ROS, including the superoxide anion, hydrogen perthe age of the organism.This statistical definition applies oxide, and the hydroxyl radical.ROS have been shown from yeast to mammals and reflects the progressive to cause molecular damage relatively indiscriminately nature of aging.Second, characteristic changes in pheto proteins, lipids, and nucleic acids.In addition, specific notype occur in all individuals over time due to the limdamage has been observed in the mitochondrial DNA, iting processes.which we consider below in Genome Instability.The phenotypic definition is equally general and is What is the evidence that oxidative damage causes useful in distinguishing the aging process itself from aging?One category of study that is supportive of this diseases of aging, such as cancer and heart disease.view involves animals transgenic for genes encoding Phenotypes of aging affect all of the individuals in a antioxidants.Transgenic Drosophila overexpressing both population, while diseases of aging affect only a subset.Cu/Zn SOD and catalase live 34% longer than controls Both impact on life span, but in different ways.For exam-(Orr and Sohal, 1994).A more recent study shows that ple, the 
many advances in medicine and public health expression of human SOD1 exclusively in Drosophila in this century have caused a large increase in the averadult motor neurons leads to a 40% extension in life age life span of humans in developed countries.Howspan (Parkes et al., 1998).Further experiments are necever, because these advances have not altered the aging essary to clarify the nature of this primary role of motor neurons in life span.Conversely, mice knocked out for either GPX1 (encoding glutathione peroxidase), SOD1,",
+      "the maximum human life span.Several avenues to studying aging have placed us on Department of Biology Massachusetts Institute of Technology the threshold of understanding basic underlying mechanisms.These approaches include the identification of Cambridge, Massachusetts 02139 key genes and pathways important in aging; genetic studies of heritable diseases that cause the appearance of premature aging in affected people; physiological ex-Introduction periments that relate the pace of aging to caloric intake; Is aging the final act in the script of developmental bioland advances in human genetics, as well as cell and ogy?The characteristic changes that are part and parcel molecular biology leading to an understanding of the of aging appear similar to developmentally regulated basis of many diseases of aging.Strikingly, single gene programs.But why would aging mechanisms have been mutations have been found to significantly extend the evolutionarily selected as advantageous?Indeed, evolife span in C. elegans, yeast, and, most recently, Drolutionary biologists might argue that aging occurs by sophila, suggesting that aging may be relatively simple, default due to the absence of selection in the postreproat least in these organisms.Further, the limited replicaductive phase of life.By this view, the aging process is tion potential of human cells in culture has been attribnot programmed, but, rather, the detritus of the absence uted to a specific mechanism (i.e., the shortening of of selection for maintenance (Medawar, 1952; Kirkwood, telomeric ends of chromosomes).An important chal- 1977).However, it is quite reasonable that any mechalenge is now to relate these recent findings to the more nisms that sprang up to slow or regulate the pace of complex case of human aging.aging would be selected, because lucky individuals",
+      "Currently prevailing studies of genetic and biological origin of human health and longevity follow largely two approaches which focus on the aging-related diseases and on individuals with exceptionally long lives (Martin et al. 2007). This study provides de facto the rationale for a new approach. Specifically, Fig. 2 suggests that a promising strategy could be to focus on individuals who died prematurely. Studies of genetic profiles of short-lived subjects compared to those who aged more successfully (i.e., those who lived longer and perhaps healthier lives) can be a core of this strategy. Importantly, this strategy can be naturally implemented in longitudinal studies of aging and longevity by focusing on individuals who died first.",
+      "The average human life expectancy has been increasing for centuries 1. Based on twin studies, the heritability of human lifespan has been estimated to be ~25%, although this estimate differs among studies 2. On the other hand, the heritability of lifespan based on the correlation of the mid-parent (i.e., the average of the father and mother) and offspring difference between age at death and expected lifespan was estimated to be 12% 3. A recent study has indicated that the different heritability estimates may be inflated due to assortative mating, leaving a true heritability that is below 10% 4. The heritability of lifespan, estimated using the sibling relative risk, increases with age 5 and is assumed to be enriched in long-lived families, particularly when belonging to the 10% longest-lived of their generation 6. To identify genetic associations with human lifespan, several genome-wide association (GWA) studies have been performed [7][8][9][10][11][12][13][14][15][16][17][18][19][20]. These studies have used a discrete (i.e., older cases versus younger controls) or a continuous phenotype (such as age at death of individuals or their parents). The selection of cases for the studies using a discrete longevity phenotype has been based on the survival to ages above 90 or 100 years or belonging to the top 10% or 1% of survivors in a population. Studies defining cases using a discrete longevity phenotype often need to rely on controls from more contemporary birth cohorts, because all others from the case birth cohorts have died before sample collection. Previous GWA studies have identified several genetic variants, but the only locus that has shown genome-wide significance (P \u2264 5 \u00d7 10^\u22128) in multiple independent meta-analyses of GWA studies is apolipoprotein E (APOE) 21, where the ApoE \u03b54 variant is associated with lower odds of being a long-lived case.",
+      "Introduction  Worldwide human populations have shown an increase in mean life expectancy in the past two centuries (Oeppen & Vaupel, 2002). This is mainly because of environmental factors such as improved hygiene, nutrition, and health care. The large variation in healthy lifespan among the elderly has prompted research into the determinants of aging and lifespan regulation. The genetic contribution to human lifespan variation was estimated at 25-30% in twin studies (Gudmundsson et al., 2000; Skytthe et al., 2003; Hjelmborg et al., 2006). The most prominent genetic influence is observed in families in which the capacity to attain a long lifespan clusters (Perls et al., 2000; Schoenmaker et al., 2006). Exceptional longevity can be reached with a low degree of age-related disability (Christensen et al., 2008; Terry et al., 2008), raising the question whether protective mechanisms against disease exist in long-lived subjects.",
+      "Introduction  Human life expectancies are increasing almost everywhere in the world where socio-economic circumstances are permissive (Tuljapurkar et al., 2000) and there is no evidence that a limit to life is anywhere near (Oeppen and Vaupel, 2002). While this increase in life span would prevent a proposed compression of morbidity (Fries, 1980), there is no evidence that higher average life spans are associated with an extension of the period of increased morbidity (Manton and Gu, 2001). On the contrary, older individuals have never been so healthy and further improvements in life style, environmental conditions and medical care are likely to help this trend to continue. Especially the medical sciences now seem poised to push the biological limits of longevity further by a number of innovations that seem to affect basic mechanisms of ageing and disease rather than merely alleviating its symptoms. While in the past medicine contributed mainly to public health advances by reducing infectious diseases, thereby helping infant mortality to decline, more recent developments hold promise for a more basic intervention in the processes that underlie age-related decline. An example is atherosclerosis, a common problem in ageing and, along with hypertension, the cause of most cardiovascular disease. Basic medical research has likely contributed significantly to the current dramatic decline in cardiovascular disease by actively intervening in some of its main risk factors, i.e., lipid levels and hypertension (Levi et al., 2002). However, one could question whether age-related diseases should be seen as separate from ageing. In this respect, ageing has been considered as a process of cellular degeneration and death universal to all or most species, increasing the risk of fatal disease in humans and other mammals. Would it be possible to define such a process and ultimately understand it in terms of the time-dependent, coordinated action of the products of multiple genes 
interacting with the environment? If so, then ageing per se rather than the diseases associated with it, may offer a more logical starting point for further increasing healthy life expectancies through prevention and therapy. This is especially true now that we have a working draft of the human genome and are in a position to determine the functional significance of each gene as part of the dynamic network of all genes that ultimately determine the physiology of an organism. Termed 'Functional Genomics', this new discipline is now often called upon to solve the complex problems in biology, such as to understand functional control mechanisms and investigate the role that genotype and environment play in determining disease phenotypes. The question is then if this same approach would apply to ageing as a complex phenotype. What is ageing, how does it differ from its diametrical opposite, i.e., organismal development, and what role can functional genomics play in unraveling the basic causes of ageing and exploit such knowledge for developing new, rational strategies for extending healthy life span?",
+      "Introduction  As a result of improvements in health care and living conditions over the past two centuries, the average human life expectancy has dramatically increased in many regions of the world [1]. This major success reflects the great malleability of the ageing process. Unfortunately, for most people, ageing is accompanied with an increased risk of developing age-related illnesses/disabilities and frailty. Therefore new approaches are required to understand the genetic, cellular, and molecular factors controlling ageing to identify strategies to extend healthy life span.",
+      "The search for the genetic determinants of extreme human longevity has been challenged by the phenotype's rarity and its nonspecific definition by investigators. To address these issues, we established a consortium of four studies of extreme longevity that contributed 2,070 individuals who survived to the oldest one percentile of survival for the 1900 U.S. birth year cohort. We conducted various analyses to discover longevity-associated variants (LAV) and characterized those LAVs that differentiate survival to extreme age at death (eSAVs) from those LAVs that become more frequent in centenarians because of mortality selection (eg, survival to younger years). The analyses identified new rare variants in chromosomes 4 and 7 associated with extreme survival and with reduced risk for cardiovascular disease and Alzheimer's disease. The results confirm the importance of studying truly rare survival to discover those combinations of common and rare variants associated with extreme longevity and longer health span.",
+      "The search for the genetic determinants of extreme human longevity has been challenged by the phenotype's rarity and its nonspecific definition by investigators. To address these issues, we established a consortium of four studies of extreme longevity that contributed 2,070 individuals who survived to the oldest one percentile of survival for the 1900 U.S. birth year cohort. We conducted various analyses to discover longevity-associated variants (LAV) and characterized those LAVs that differentiate survival to extreme age at death (eSAVs) from those LAVs that become more frequent in centenarians because of mortality selection (eg, survival to younger years). The analyses identified new rare variants in chromosomes 4 and 7 associated with extreme survival and with reduced risk for cardiovascular disease and Alzheimer's disease. The results confirm the importance of studying truly rare survival to discover those combinations of common and rare variants associated with extreme longevity and longer health span.",
+      "Introduction  The recent, remarkable extension of life expectancy is largely attributed to the postponement of mortality at old age (Vaupel, 1997, 2010). The years of life gained in the older population residing in developed nations are a success story of public health measures and improved health care. In addition to such external factors, longevity and healthy aging consistently show a modest heritability between 20% and 50% and aging-associated genetic research may provide further insights into the mechanisms of aging (Herskind et al., 1996; McGue et al., 1993; Reed and Dick, 2003). It has been postulated that genes involved in pathways associated with aging identified in animal models, such as insulin-like growth factor (IGF)-insulin signaling, regulation of lipoprotein metabolism, the mTOR pathway, and the oxidative stress response may also influence survival to old or even exceptionally old age in humans (Christensen et al., 2006; Kenyon, 2010; Vellai et al., 2003). However, in humans, common variants within genes involved in these pathways have not been consistently associated with lifespan (Christensen et al., 2006; Kenyon, 2010; Kuningas et al., 2008; Vijg and Suh, 2005).",
+      "DESIGNS TO STUDY PARAMETERS OF HEALTHY AGEING, MORBIDITY, MORTALITY AND LONGEVITY  Human cohorts may vary considerably in their morbidity, mortality and longevity characteristics and yet they have shown a common increase in mean life expectancy in the past two centuries [5]. This is mainly due to improved hygiene, nutrition and healthcare. There is a large variation in healthy lifespan among the elderly and remarkably exceptional longevity (EL) can be reached with a low degree of age-related disability [6,7]. Heritability studies comparing the concordance of lifespan in monozygous and dizygous twins estimated a 25-30% genetic contribution to human lifespan variation [8-11], which becomes increasingly important at higher ages. The most prominent genetic influence is present in families in which survival to high ages clusters [12,13]. Unlike model systems where single-gene mutations have major life extension effects, human longevity is presumed to be a complex trait [14].",
+      "INTRODUCTION  Genomic studies into human longevity are inspired by the fact that, in animal models, healthy lifespan has proved to be remarkably plastic, and major pathways of lifespan regulation have been identified. Considerable lifespan extension has been induced in models as diverse as yeast, worms, fish, flies and rodents by applying genetic manipulation and dietary restriction (DR) (see [1] for review). Reduced activity of nutrient-sensing pathways such as insulin/insulin-like growth factor (IGF-1) signalling (IIS) and target of rapamycin (TOR) signalling mediated lifespan extension, and also the extension of lifespan by DR [2]. An interesting observation from the perspective of human ageing is that, in rodents and monkeys, diets restricted in glucose, fat or protein uptake reduced or delayed the risk of cancer and metabolic disease, thus extending the healthspan of the animals [2]. Following the discovery of genes and pathways involved in animal lifespan extension, human research has focused on the corresponding candidate human genes with genetic, genomic and epigenetic studies into ageing and longevity. The designs of these studies differ with respect to the selection of naturally occurring phenotypes and the study populations, which include population-based, patient-based, family-based and exposure-based cohorts.",
+      "GENETIC STUDIES OF HUMAN LONGEVITY  Genetic and genomic studies into longevity have been performed based on a hypothesis, referred to as a candidate gene approach. Alternatively, explorative genome-wide analyses have been applied in which genetic variation and gene transcription across the complete genome are being studied for associations with longevity and related traits. Genetic studies into human disease and longevity include candidate gene approaches, genome-wide association studies (GWASs) and genome-wide linkage studies.",
+      "ANALYSIS OF HUMAN VARIATION IN THE GENETIC CONTROL OF LONGEVITY  Heritability studies have convincingly demonstrated that at least some fraction of human lifespan is heritable. In tandem, large-scale genome-wide association studies (GWAS) have identified numerous loci associated with age-related traits (Buniello et al., 2019). While genetic studies have functionally shown an inverse effect of multiple age-related, disease-associated variants on lifespan regulation, the number of well-replicated longevity-conferring variants remains limited to variants in APOE (ApoE \u03b52), and more recently, CDKN2A/B and IL6 (see Table 1). To date, studies in humans have been hampered by the specific phenotype definitions used, sample sizes of the extreme phenotypes, and modest heritability of the longevity-related traits (Breitbach et al., 2019). This is due to the complex interplay of biological and social factors involved in human aging, as well as the limited power of GWAS, which require sampling thousands of subjects to achieve statistical significance (Breitbach et al., 2019). Genetic studies of aging have also been hindered by an inconsistent use of definitions of aging (reviewed in Baghdadi et al., 2020). The two main ways of conducting research on the genetics of longevity in human populations are by studying (i) the lifespan (continuous trait, years lived) and (ii) the longevity (dichotomous trait, i.e., being among the longest-lived individuals within a specific population). These complexities have limited the resolution and capability of broad association studies of human longevity. Importantly, these genomic analyses focus on a shift of survival in a population; these variables may be genetically distinct from the mechanisms establishing potential for longevity overall (Figure 1A). We argue that an understanding of this shift in lifespan as well as genetic mechanisms of regulating a species specific 'set points' (Figure 1B) will aid in the conceptual distinction of aging and longevity in humans.",
+      "Introduction  Human longevity is influenced by multiple genetic and environmental factors. Approximately 25-32% of the overall variation in adult lifespan is because of genetic variation that becomes particularly important for survival at advanced age (Hjelmborg et al., 2006). Epidemiological studies have revealed that long-lived individuals (LLI), that is, people surviving to the 95th percentile of the respective birth cohort-specific age distributions (Gudmundsson et al., 2000), frequently show a favorable ('healthy') course of the aging process, with the absence or a delayed onset of age-related diseases (Hitt et al., 1999). Hence, the LLI offer the key to elucidate the molecular mechanisms underlying the 'healthy aging' phenotype (Perls, 2006).",
+      "Conclusions and prospects  Over the past two decades the human aging field has built up the necessary resources to study the biology of aging and longevity by establishing human populations with a diversity of designs. Meta-analyses integrating genetic and phenotypic datasets have successfully identified variants associated with a range of age-related traits and diseases. Despite these accomplishments, the number of novel leads contributing to human lifespan regulation is limited. Although positive regions of linkage and suggestive GWAS hits have been reported, the field has not yet identified the loci that explain the clustering of longevity in families and the variation in biological aging rate in the population. As for animal models, down-signaling of the IIS and mTOR pathway appeared to be relevant in humans. These findings are being followed up by molecular and physiological profiling using skin, fat and muscle tissue of long-lived family members and controls. Human studies now also include the response of nutrient sensing systems to the application of dietary and physical challenges.",
+      "Human lifespan variation is mainly determined by environmental factors, whereas the genetic contribution is 25-30% and expected to be polygenic. Two complementary fields go hand in hand in order to unravel the mechanisms of biological aging: genomic and biomarker research. Explorative and candidate gene studies of the human genome by genetic, transcriptomic, and epigenomic approaches have resulted in the identification of a limited number of interesting positive linkage regions, genes, and pathways that contribute to lifespan variation. The possibilities to further exploit these findings are rapidly increasing through the use of novel technologies, such as next-generation sequencing. Genomic research is progressively being integrated with biomarker studies on aging, including the application of (noninvasive) deep phenotyping and omics data - generated using novel technologies - in a wealth of studies in human populations. Hence, these studies may assist in obtaining a more holistic perspective on the role of the genome in aging and lifespan regulation."
+    ],
+    [
+      "There is a great need for continuing efforts to increase public knowledge about genomic research. As individuals and communities from diverse social backgrounds become more aware of genomic research and the potential role of genetics in contributing to health outcomes, the public will hopefully be more informed about the implications of genomic research for personal medical care, public health and more broadly the public representation of diverse population groups based on genetic findings. This knowledge should reinforce the ability of potential participants to make informed choices about joining a genetic study. There are complicated issues underlying public trust in medicine as well as scientific and genetic research that must be addressed. Innovative strategies for public education and community engagement should take into account cultural settings and historical experiences that have contributed to distrust in the past.",
+      "The issues discussed in this section reflect key current concerns, but, given the rapid advances in genetic and genomic research, new issues will continue to confront families in the next few years. For example, major advances in the developing area of neuropsychiatric genetics, studies of the heritable nature of psychiatric and other nervous system disorders, characterized at the molecular, cellular, or behavioral levels, will challenge family members to address the potential role genes play in the development of schizophrenia, bipolar, or affective disorders (Genomics Network, n.d.).",
+      "Future Implications and Communication Research Directions  Given ever-expanding research on genetics and genomics, scholars interested in family interaction will be challenged to stay abreast of the implications for family disclosure and discussion of genetic health. We believe that the following issues will emerge as key concerns:",
+      "Conclusion  After more than four decades of work, genetics and genomic medicine still face considerable challenges. Lack of awareness among health professionals and government, lack of interest among researchers in genetic diseases, limited research funding, limited access to high technology, a low national health budget and low family incomes seem to be the main obstacles to be overcome in implementing genetics and genomic medicine. Despite these conditions, several research centers still managed to do some studies and a small number of genetic tests. Several collaborations with countries abroad have been undertaken to overcome some obstacles. Yet, Indonesia still has to accelerate this effort to be able to catch up. Mentoring and collaborations are needed to enable Indonesia to do so.",
+      "Opportunities for Population-Based Research on Aging Human Subjects:  Pathology and Genetics",
+      "Concluding remarks  The next decade will provide a window of opportunity to prepare health professionals, public health practitioners, the public and policy makers for the advent of genomics on health and health care. This will be a doable project but will require regional, national, European and global coordination on both the vertical and horizontal levels. We argue that there is an ethical obligation to prepare society to meet this challenge and to take up the opportunities provided by the science in a medically useful, effective, efficient, socially desirable and ethically justifiable manner. Here, health literacy, health communication and empowerment in managing risks are key for opening the doors to a truly beneficial Public Health Genomics practice. This can be facilitated by implementing ethical benchmarks and legal safeguards 70 such as respect for autonomy and social justice in the context of policy development.",
+      "Clarifying the general conditions under which genomic knowledge can be put to best practice in the field of public health, paying particular consideration to the ethical, legal and social implications 12,17,35 is currently the most pressing task in Public Health Genomics. Aiming the application of genetic and molecular science to the promotion of health and disease prevention through the organised efforts of society, integral to its activities is a dialogue with all stakeholders in society, including industry, governments, health professionals and the general public.18 Thus, the integration of genomics into public health research, policy and practice is one of the major future challenges for our health-care systems.36,37 Expertise is already feasible and can be clustered and evaluated for a socially accountable use.",
+      "Public health needs to prepare itself for the upcoming challenges, which derive from genomics. In this sense, it needs to strengthen the communication efforts among all sciences involved. Public health can serve as the umbrella, that spans the disciplines such as genetics, ethics, law and all other stakeholders.",
+      "Economic and health disparities related to genetics and genomics.",
+      "Capabilities and limitations of current genetic/genomic technologies.",
+      "Identify ethical, legal, and social issues associated with genetic/genomic information.",
+      "Ongoing research contributing to improved understanding of the genetic/genomic influences on health.",
+      "Economic and health disparities related to genetics and genomics. Integrate knowledge from psychology, history, politics, sociology and culture when delivering genetic and genomic care.",
+      "Ethical and legal issues surrounding genetic and genomic information and services.",
+      "Developments in genetics and genomics occur very rapidly and bring with them new ethical, legal and social questions that need swift, sensible and responsible responses (Pepper, 2011). Examples include next-generation sequencing, genetic cohort studies and biobanks, which have raised questions about data management, including quality of interpretation of data, data storage, data sharing, consent for re-use of data, as well as concerns about identifiability and privacy interests of those who provide samples (Kaye, 2012; Wolf, 2013; Pinxten and Howard, 2014). However, the rapidity of advancement poses difficulties for those who must determine the responses to these questions. They are often slow or even overtaken by further advancements. Ethical, legal and social-related challenges should be prioritised for policymakers, researchers, clinicians and public health practitioners to maximise the benefits of genomic and genetic applications while minimising the risk of harm to people (Geller et al., 2014). Any education strategy developed should therefore be dynamic.",
+      "Query 2. Perceptions of Genetics and Genomics  Awareness of Genetic and Genomic Advancements.",
+      "In addition, 4 scholarly commentaries in this issue provide insights into several current practical issues and developments in genetics and genomics. Feero and colleagues 11 describe advances in genomics science and explore many of the issues surrounding translation of these advances to routine \"personalized\" patient care. Offit 12 discusses the increasing availability of direct-to-consumer marketing of genomic and genetic testing and sounds an appropriately cautionary note about the need for standards, quality control, and appropriate regulation. Uhlmann and Guttmacher 13 present a useful collection of practical Internet genetics resources for clinicians and patients, including genetics information on specific diseases; guidelines for genetic testing; and educational resources to help clinicians integrate genetics into patient care. Ginsberg and colleagues 14 discuss the importance of centralized biorepositories for genetics and genomics research and emphasize the need to develop and implement standards for informed consent, informatics, and governance.",
+      "Key Themes Relevant To Genomic Research",
+      "A first step is to define the challenges that stand in the way of realizing the promise of genomic medicine. These include addressing gaps in the oversight of genetic testing (including regulation of companies providing test interpretation services), ensuring that realistic claims are made in promotional materials for genetic testing, determining the appropriate role of new genomic technologies in patient care, ensuring the privacy of patients' genomic data, and improving insurance coverage and reimbursement for genetic services. The Secretary's Advisory Committee on Genetics, Health, and Society (SACGHS), on which two of us serve, advises the secretary of health and human services and reports on these issues.",
+      "How can we maximize the benefits of these new developments and minimize the harms? How can we encourage patients' involvement and autonomy yet establish appropriate safeguards while avoiding inappropriate paternalism? How do we promote the understanding that interpretations of genomic information may evolve as research unravels the meaning of gene-gene and gene-environment interactions and the roles of noncoding DNA sequences, copy-number variants, epigenetic mechanisms, and behavioral factors in health and disease?"
+    ],
+    [
+      "A supervised (pathway driven) approach was used to specifically query three general gene ontology (GO) areas of interest, namely xenobiotic metabolism, DNA damage repair, and oxidative stress-related genes (Table 1). These gene categories are hypothesized to play important roles in sex- and age-related susceptibility to adverse drug effects [18,30]. Of the 122 genes included in the xenobiotic metabolism gene list in the Ingenuity Knowledge Base, 61 were differentially expressed. These included Cyp2d4, the rat ortholog of human gene CYP2D6, which is speculated to metabolize up to 25% of commonly prescribed drugs [31]. Genes involved in DNA Damage Repair, derived from Ingenuity, were combined with the list by Wood et al. [32] to give 222 genes involved in DNA damage repair. Sixty-five of these genes (approximately 25%) were found to be differentially expressed in the liver. Oxidative Stress genes were defined by 68 genes included in \"response to oxidative stress\" (IPA) of which 23 genes were differentially expressed (Table 1).",
+      "Pharmacogenomics has advanced the field of drug-response assessment. For example, the first experiences with guiding vitamin K antagonist therapy with the aid of CYP2C9 (cytochrome P450, family 2, subfamily C, polypeptide 9) or VKORC1 (vitamin K epoxide reductase complex, subunit 1) polymorphisms (93), and the use of cytochrome P450 polymorphisms for assessing clopidogrel response have entered US Food and Drug Administration recommendations (94). Disease prevention lags behind. Gene chips and modern sequencing approaches that allow large-scale interrogation of the genome at the population level will generate novel hypotheses of disease causation. Furthermore, with the continuing drop in the costs of whole-genome sequencing, the practicing physician may soon be faced with having to comment on the disease risks of a patient's >4 \u00d7 10^6 sequence variants before any clinical signs occur, a task that no certified genetic counselor could fulfill at present. With advent of GWASs, ethical and practical concerns of reporting genetic research results have become apparent. Initial efforts at defining rules of reporting large-scale association results and assessing the level of evidence also apply to next-generation large-scale genomics (95,96). Reports have suggested that on the consumer side, genomewide genetic profiling of employees of health and technology companies does not change anxiety symptoms, dietary fat intake, or exercise behavior (i.e., lifestyle factors) over a 6-month period (97); however, the association of genetic variation with risk and the dissection of objective markers of risk and risk factors that reside in the causal pathways of disease will need careful assessment before these approaches can enter clinical decision making (98). A data set containing 80 genes associated with coronary heart disease in GWASs was uploaded and overlaid onto the molecular networks developed from information contained in the Ingenuity Knowledge Base. Networks of Network Eligible Molecules were then algorithmically generated on the basis of their connectivity. The most substantially enriched network, as shown, comprises 36 genes, of which 20 are coronary heart disease genes.",
+      "19.3.1 An environmental or pharmacogenetic basis for drug efficacy and ADR? Before getting into the complexities of PGx, it is important to recognize that many non-genetic factors also influence the efficacy of medications, including the patient\u2019s age, sex and general health, but also environmental factors, such as concomitant therapies, drug interactions and diet. To give a seemingly innocuous example, grapefruit juice is an inhibitor of intestinal cytochrome P-450 3A4, which is responsible for the first-pass metabolism of many medications.",
+      "Finally, it is possible that other molecules (or drugs) might modulate the biological context within which the drug\u2013target interaction takes place. Variation in any of the elements that control these types of processes can lead to variability in drug action, which might well confound the search for causative genes among the usual ADME and target-related candidates. 19.3.5 Using bioinformatics to gain understanding of adverse drug reaction (ADR) One of the biggest concerns during the development of any medication is the possibility of unintended consequences in the patient.",
+      "19.3 Pharmacogenetics (PGx) It is well known that after exposure to a drug, almost any given cohort of patients show a wide variety of responses. In an ideal situation, patients show a beneficial response to the therapy, although they may also show no response or a weak response, and perhaps most worryingly, they may experience an adverse drug reaction (ADR), which in extreme situations could lead to serious illness or even death. ADR is an increasingly serious problem with a huge toll in lives and health-care costs every year.",
+      "A good understanding of disease biology and effective chemistry is not the only requirement for an efficacious drug; we also must understand how variation at the target affects drug action, and how variation in other genes affects the way drugs are absorbed, disseminated, metabolized and excreted. Genetic analysis in the drug development paradigm also faces some unique challenges; for example, the exquisite rarity of some adverse reactions makes collection of sufficient samples for well-powered genetic analysis almost impossible.",
+      "19.3.1 An environmental or pharmacogenetic basis for drug efficacy and ADR? Before getting into the complexities of PGx, it is important to recognize that many non-genetic factors also influence the efficacy of medications, including the patient\u2019s age, sex and general health, but also environmental factors, such as concomitant therapies, drug interactions and diet. To give a seemingly innocuous example, grapefruit juice is an inhibitor of intestinal cytochrome P-450 3A4, which is responsible for the first-pass metabolism of many medications.",
+      "Finally, it is possible that other molecules (or drugs) might modulate the biological context within which the drug\u2013target interaction takes place. Variation in any of the elements that control these types of processes can lead to variability in drug action, which might well confound the search for causative genes among the usual ADME and target-related candidates. 19.3.5 Using bioinformatics to gain understanding of adverse drug reaction (ADR) One of the biggest concerns during the development of any medication is the possibility of unintended consequences in the patient.",
+      "19.3 Pharmacogenetics (PGx) It is well known that after exposure to a drug, almost any given cohort of patients show a wide variety of responses. In an ideal situation, patients show a beneficial response to the therapy, although they may also show no response or a weak response, and perhaps most worryingly, they may experience an adverse drug reaction (ADR), which in extreme situations could lead to serious illness or even death. ADR is an increasingly serious problem with a huge toll in lives and health-care costs every year.",
+      "A good understanding of disease biology and effective chemistry is not the only requirement for an efficacious drug; we also must understand how variation at the target affects drug action, and how variation in other genes affects the way drugs are absorbed, disseminated, metabolized and excreted. Genetic analysis in the drug development paradigm also faces some unique challenges; for example, the exquisite rarity of some adverse reactions makes collection of sufficient samples for well-powered genetic analysis almost impossible.",
+      "Drug-Gene Interactions Predicting Efficacy  In 1 candidate gene study, a genetic variant in the HMG-CoA reductase gene, present in 6.7% of patients, modified the LDL-C response to pravastatin by 6.4 mg/dL. 244][247] However, these effect sizes are small and difficult to distinguish from random variation in individual patients. Indeed, the metformin finding is less important for its potential clinical applications than for the biological insight provided by this link between glucose control and a gene involved in the response to DNA damage. 245,246",
+      "Nutrition and metabolism  The power of these new experimental protocols, comparing gene expression profiles to understand spontaneous differences in phenotype due to disease, was extended by inducing phenotypic differences using creative molecular intervention. The first experiments to manipulate phenotype in this way used drugs. A comparison of the gene expression of a drug-induced phenotype with that of the normal phenotype was brilliantly executed in a single study that simultaneously identified a mechanism for the regulation of sterol uptake in the intestine and a genetic disease, sitosterolemia [17 \u2022], mice were treated with a lipid-metabolism altering compound and the expression profiles of various tissues compared with normal mice using gene arrays. Differentially expressed genes were evaluated 'in silico,' and an unknown gene was found using bioinformatic tools to be homologous to the ATP-binding cassette (ABC) family of genes. Members of the ABC family include cellular cholesterol transport proteins. Defects in a member of this family (ABCA1) form the basis for the poor cholesterol delivery to high-density lipoprotein (HDL) that underlies Tangiers disease [18], another cholesterol-related disease [19]. Through the use of a variety of in silico techniques, Berge et al. [17 \u2022\u2022] concluded that the proteins produced from the newly discovered genes, ABCG5 and ABCG8, were responsible for the regulated reverse transport of newly absorbed cholesterol and phytosterols out of the apical surface of intestinal cells. Using public gene databases, a human homolog of the putative mouse transporter was identified, cloned and used to screen sitosterolemic humans. Dysfunctional mutations were found in these genes in all individuals suffering from sitosterolemia. Thus, individuals suffering from sitosterolemia lack the machinery responsible for the selective and controlled transport of cholesterol, and therefore hyperabsorb various sterols (including plant sterols). This study illustrated many of the strengths of genomic experimentation: the identification of phenotypically important genes using global differential gene expression analysis; querying internet databases to deduce structure/function relationships from sequence comparison; and the characterization of individual variation (polymorphism) linked to health. These findings have transformed our understanding of lipid absorption and metabolism, begging the question: how long would this knowledge have waited to be discovered without genomics?",
+      "19.3.1 An environmental or pharmacogenetic basis for drug efficacy and ADR? Before getting into the complexities of PGx, it is important to recognize that many non-genetic factors also influence the efficacy of medications, including the patient\u2019s age, sex and general health, but also environmental factors, such as concomitant therapies, drug interactions and diet. To give a seemingly innocuous example, grapefruit juice is an inhibitor of intestinal cytochrome P-450 3A4, which is responsible for the first-pass metabolism of many medications.",
+      "Finally, it is possible that other molecules (or drugs) might modulate the biological context within which the drug\u2013target interaction takes place. Variation in any of the elements that control these types of processes can lead to variability in drug action, which might well confound the search for causative genes among the usual ADME and target-related candidates. 19.3.5 Using bioinformatics to gain understanding of adverse drug reaction (ADR) One of the biggest concerns during the development of any medication is the possibility of unintended consequences in the patient.",
+      "19.3 Pharmacogenetics (PGx) It is well known that after exposure to a drug, almost any given cohort of patients show a wide variety of responses. In an ideal situation, patients show a beneficial response to the therapy, although they may also show no response or a weak response, and perhaps most worryingly, they may experience an adverse drug reaction (ADR), which in extreme situations could lead to serious illness or even death. ADR is an increasingly serious problem with a huge toll in lives and health-care costs every year.",
+      "A good understanding of disease biology and effective chemistry is not the only requirement for an efficacious drug; we also must understand how variation at the target affects drug action, and how variation in other genes affects the way drugs are absorbed, disseminated, metabolized and excreted. Genetic analysis in the drug development paradigm also faces some unique challenges; for example, the exquisite rarity of some adverse reactions makes collection of sufficient samples for well-powered genetic analysis almost impossible.",
+      "19.3.1 An environmental or pharmacogenetic basis for drug efficacy and ADR? Before getting into the complexities of PGx, it is important to recognize that many non-genetic factors also influence the efficacy of medications, including the patient\u2019s age, sex and general health, but also environmental factors, such as concomitant therapies, drug interactions and diet. To give a seemingly innocuous example, grapefruit juice is an inhibitor of intestinal cytochrome P-450 3A4, which is responsible for the first-pass metabolism of many medications.",
+      "Finally, it is possible that other molecules (or drugs) might modulate the biological context within which the drug\u2013target interaction takes place. Variation in any of the elements that control these types of processes can lead to variability in drug action, which might well confound the search for causative genes among the usual ADME and target-related candidates. 19.3.5 Using bioinformatics to gain understanding of adverse drug reaction (ADR) One of the biggest concerns during the development of any medication is the possibility of unintended consequences in the patient.",
+      "19.3 Pharmacogenetics (PGx) It is well known that after exposure to a drug, almost any given cohort of patients show a wide variety of responses. In an ideal situation, patients show a beneficial response to the therapy, although they may also show no response or a weak response, and perhaps most worryingly, they may experience an adverse drug reaction (ADR), which in extreme situations could lead to serious illness or even death. ADR is an increasingly serious problem with a huge toll in lives and health-care costs every year.",
+      "A good understanding of disease biology and effective chemistry is not the only requirement for an efficacious drug; we also must understand how variation at the target affects drug action, and how variation in other genes affects the way drugs are absorbed, disseminated, metabolized and excreted. Genetic analysis in the drug development paradigm also faces some unique challenges; for example, the exquisite rarity of some adverse reactions makes collection of sufficient samples for well-powered genetic analysis almost impossible."
+    ],
+    [
+      "In one case, a gene identified by mutation recovered from a genetic screen in the laboratory, methuselah, may have variants in natural populations. In particular, the common ATATC haplotype has a sharp geographic (north-south) cline in U.S. populations, which, intriguingly, is associated with an 18% difference in life span (97). It would be interesting to examine these natural populations for differences in their reproductive schedule. Extensive studies show that life span can be rapidly selected as an indirect outcome of artificial selection for age at reproduction. Samples from natural populations of Drosophila contain genetic variants that can be rapidly selected, within 15 generations, for 50% or greater differences in life span on the basis of choosing individuals that are reproductive at early versus later ages (93). Selection was reversible, indicating that these life history variants depended on existing gene combinations, not new mutations. Among the genes that differed in quantitative expression between young- and old-selected lines were heat shock proteins, e.g., hsp 22 (60). An overarching conclusion from fly aging genetics is that stress resistance is coupled to longevity (94), as in C. elegans. Other gene candidates are being sought by QTL analysis and show complex interactions with gender and population density (17, 115).",
+      "Murabito JM, Yuan R, Lunetta KL (2012) The search for longevity and healthy aging genes: insights from epidemiological studies and samples of long-lived individuals. J Gerontol A Biol Sci Med Sci 67(5):470\u2013479. doi:10.1093/gerona/gls089 20. Nuzhdin SV, Pasyukova EG, Dilda CL et al (1997) Sex-specific quantitative trait loci affecting longevity in Drosophila melanogaster. Proc Natl Acad Sci USA 94(18):9734\u20139739 21. Gems D, Riddle DL (2000) Genetic, behavioral and environmental determinants of male longevity in Caenorhabditis elegans. Genetics 154(4):1597\u20131610",
+      "Somatic mutations with the inherited gene variations of each individual cumulatively or synergistically influence the health span and life span [11]. Very few genetic variants have been associated with human longevity, but those found include the transcription factor FOXO3 gene, the APOE/TOMM40 and the CDKN2B/ANRIL loci, which are associated with Alzheimer's disease and cellular senescence [12][13][14]. In fact, the heritability for human longevity has been estimated to be approximately 20-30%, according to studies of twins, suggesting that external factors such as diet, environment, physical activity and microbiomes are important factors that influence the health span [14][15][16]. The increase in the rate of retrotranscription reflects genome deregulation, creating additional mutations, DNA damage, and other forms of genome instability. For instance, the expression of several families of retrotransposable elements increases with age, as observed in mouse skeletal muscle and human fibroblasts, particularly the long interspersed nuclear element-1 (L1 LINE) [17,18].",
+      "Our study has several limitations. First, we did not analyse the sex and mitochondrial chromosomes, since we were unable to gather enough cohorts that could contribute to the analysis of these chromosomes. However, these chromosomes may harbour loci associated with longevity that we thus have missed. Second, although we included as many cohorts as possible, the sample size of our study is still relatively small (especially for the 99th percentile analysis) in comparison to GWA studies of age-related diseases, such as T2D and cardiovascular disease, and parental age at death 11,51,52. Hence, this limited our power to detect loci with a low MAF (<1%) that contribute to longevity. Third, we did not perform sex-stratified analyses and may thus have missed sex-specific longevity-related genetic variants. The reason for this is that (1) we only identified a limited number of suggestive significant associations in our unstratified 90th and 99th percentile analyses, (2) our sample size is modest (especially when stratified by sex), and (3) thus far, there has been no report of any genome-wide significant sex-specific longevity locus.",
+      "In most experimentally modified animal model systems, single-gene mutations in many different genes have major life extension effects (Fontana et al., 2010; Kenyon, 2010). However, natural human and animal longevity is presumed to be a complex trait (Finch & Tanzi, 1997). In humans, both candidate gene and genome-wide genetic association approaches have been applied in an attempt to identify longevity loci. The frequency of genetic variants has been typically compared between nonagenarian cases and young controls, revealing loci at which genetic variants may contribute to a higher or lower probability of survival into old age. The initial candidate gene studies aimed at finding human longevity genes were dominated by contradictory results (Christensen et al., 2006). The more consistent evidence obtained by repeated observation in independent cohort studies for association with longevity has so far only been observed for three loci, the apolipoprotein E (APOE) locus (Schachter et al., 1994; Christensen et al., 2006), the FOXO3A locus (Willcox et al., 2008; Flachsbart et al., 2009; Pawlikowska et al., 2009; Soerensen et al., 2010), and the AKT1 locus (Pawlikowska et al., 2009). Thus, despite the expectation that longevity would be influenced by many genetic variants with small effect sizes, the effect of variants has consistently been shown in only three genes.",
+      "Previously, it has been suggested that genetic variation in the FOXO1 gene is specifically contributing to human female longevity (reviewed in Chung et al., 2010). However, at chromosome 13q14.11, harboring the FOXO1 gene, we found no evidence for linkage with female longevity (LOD<0.05), and at the gene position of FOXO1 we found no evidence for association in the females-only meta-analysis (p-values>0.042) in the GEHA Study. Potentially, the effect of this locus is not only influenced by gender but also by genetic background.",
+      ", 2003), to study GXE and consequences of treatments as a function of age, diet, and sex (Fleet et al., 2016; Philip et al., 2010; Roy et al., 2020; Sandoval-Sierra et al., 2020; Williams et al., 2016, 2020), gene pleiotropy (Wang et al., 2016a), and to test behavioral predictions based on differences in brain architecture (Yang et al., 2008). Here we summarize the current status of this resource with a focus on genetic structure, and on the power and precision of mapping trait variance to loci and genes.",
+      "Somatic mutations with the inherited gene variations of each individual cumulatively or synergistically influence the health span and life span [11]. Very few genetic variants have been associated with human longevity, but those found include the transcription factor FOXO3 gene, the APOE/TOMM40 and the CDKN2B/ANRIL loci, which are associated with Alzheimer's disease and cellular senescence [12][13][14]. In fact, the heritability for human longevity has been estimated to be approximately 20-30%, according to studies of twins, suggesting that external factors such as diet, environment, physical activity and microbiomes are important factors that influence the health span [14][15][16]. The increase in the rate of retrotranscription reflects genome deregulation, creating additional mutations, DNA damage, and other forms of genome instability. For instance, the expression of several families of retrotransposable elements increases with age, as observed in mouse skeletal muscle and human fibroblasts, particularly the long interspersed nuclear element-1 (L1 LINE) [17,18].",
+      "The Height-Life Span Nexus  Several observations and lines of experimentation have raised the issue of whether interindividual differences in aging rate are influenced by genes that modulate body size and early-life growth patterns. These include (a) the association between small stature and exceptional longevity in calorically restricted rodents (Yu et al., 1985), methionine-restricted rats (Orentreich et al., 1993), and mutant dwarf mice (Brown-Borg et al., 1996; Miller, 1999); and (b) the association between small body size and longer life span in natural populations of mice (Falconer et al., 1978), flies (Hillesheim and Stearns, 1992), dogs (Li et al., 1996), and, possibly, people (Samaras and Storms, 1992). The correlation in dogs is particularly striking: selective breeding for dogs of different body size has produced breeds varying in size from Chihuahua to Irish wolfhound. These breeds also vary greatly in mean longevity, from approximately 7 to 10.5 years, and the correlation between breed longevity and breed body weight (Miller, 1999) is a remarkable R\u00b2 = 0.56. These differences are genetic and affect stature rather than obesity: no amount of overeating will convert a West Highland white terrier to a St. Bernard. The selective pressures applied were designed to create dogs of specific sizes and temperaments and were not intended to influence aging rate or life span. The clear implication is that the effects on longevity are pleiotropic, i.e., that genes selected for their effect on body size and conformation influenced life span as a side effect. It is of interest to note that the few analyses (Eigenmann et al., 1984, 1988) of the hormonal basis for interbreed differences in body size have shown that the genes in question influence levels of IGF-1, the most likely mediator of the life-span effects in the long-lived df/df and dw/dw mouse mutants. Could it be mere coincidence that long-lived mutant nematode worms (Kimura et al., 1997) also show mutations in genes related to insulin and IGF-1 receptors?",
+      "The antagonistic pleiotropy and hyperfunction theories of ageing predict the presence of genetic variants important for growth and development in early life with deleterious effects towards the end of the reproductive window 19,20. While we are unable to directly capture the genetic effects on individuals before age 40 due to the study design of our datasets, we found that the life-extending variant near FOXO3 is associated with a delay in the age at menarche and a decrease in intracranial volume and cognitive abilities. It thus appears that there are loci exhibiting antagonistic effects, although we are unable to discern whether this is due to true pleiotropy or due to linkage of causal variants within a region  Genes which showed a significant effect (FDR < 5%) of gene expression on ageing traits are displayed here. Gene names are annotated with the direction of effect, where + and \u2212 indicate whether the life-extending association of the locus is linked with higher or lower gene expression, respectively. Locus: nearest gene to lead variant in the multivariate analysis, Chr: chromosome, Position: base-pair position of lead variant (GRCh37), Cis-genes: genes in physical proximity (<500 kb) to the lead variant of the locus which colocalise with the multivariate signal, Trans-genes: genes located more than 500 kb from the lead variant of the locus.",
+      "Ageing phenotypes, such as years lived in good health (healthspan), total years lived (lifespan), and survival until an exceptional old age (longevity), are of interest to us all but require exceptionally large sample sizes to study genetically. Here we combine existing genome-wide association summary statistics for healthspan, parental lifespan, and longevity in a multivariate framework, increasing statistical power, and identify 10 genomic loci which influence all three phenotypes, of which five (near FOXO3, SLC4A7, LINC02513, ZW10, and FGD6) have not been reported previously at genome-wide significance. The majority of these 10 loci are associated with cardiovascular disease and some affect the expression of genes known to change their activity with age. In total, we implicate 78 genes, and find these to be enriched for ageing pathways previously highlighted in model organisms, such as the response to DNA damage, apoptosis, and homeostasis. Finally, we identify a pathway worthy of further study: haem metabolism.",
+      "Here, we assess the degree of genetic overlap between published GWAS of three different kinds of ageing phenotypes \u2013 healthspan, parental lifespan, and longevity (defined as survival to an age above the 90th percentile) \u2013 and perform a multivariate meta-analysis to identify genetic variants related to healthy ageing. We subsequently characterise the sex- and age-specific effects of loci which affect all three ageing traits and look up reported associations with age-related phenotypes and diseases. Finally, we link the observed signal in these loci to the expression of specific genes, including some that are currently studied in model organisms, and identify pathways involved in healthy ageing.",
+      "Ageing phenotypes, such as years lived in good health (healthspan), total years lived (lifespan), and survival until an exceptional old age (longevity), are of interest to us all but require exceptionally large sample sizes to study genetically. Here we combine existing genome-wide association summary statistics for healthspan, parental lifespan, and longevity in a multivariate framework, increasing statistical power, and identify 10 genomic loci which influence all three phenotypes, of which five (near FOXO3, SLC4A7, LINC02513, ZW10, and FGD6) have not been reported previously at genome-wide significance. The majority of these 10 loci are associated with cardiovascular disease and some affect the expression of genes known to change their activity with age. In total, we implicate 78 genes, and find these to be enriched for ageing pathways previously highlighted in model organisms, such as the response to DNA damage, apoptosis, and homeostasis. Finally, we identify a pathway worthy of further study: haem metabolism.",
+      "LongevityMap--human genetic variants associated with longevity  Variation in human lifespan has been found to be 20-30% heritable, with increasing heritability at advanced ages (27). As next-generation sequencing and genome-wide approaches advance, so does the capacity for performing longevity association studies. To catalog the increasing volume of data in genetic studies of human longevity, we created LongevityMap (http://genomics.senescence.info/longevity/), a database of genes, gene variants and chromosomal locations associated with longevity (28). This differs from the GenAge database, which focuses mostly on data from model organisms and the few genes associated with human ageing (e.g. genes causing progeroid syndromes).",
+      "Genes/loci identified by genome-wide association studies of longevity and lifespan traits.",
+      "ANALYSIS OF HUMAN VARIATION IN THE GENETIC CONTROL OF LONGEVITY  Heritability studies have convincingly demonstrated that at least some fraction of human lifespan is heritable. In tandem, large-scale genome-wide association studies (GWAS) have identified numerous loci associated with age-related traits (Buniello et al., 2019). While genetic studies have functionally shown an inverse effect of multiple age-related, disease-associated variants on lifespan regulation, the number of well-replicated longevity-conferring variants remains limited to variants in APOE (ApoE \u03b52), and more recently, CDKN2A/B and IL6 (see Table 1). To date, studies in humans have been hampered by the specific phenotype definitions used, sample sizes of the extreme phenotypes, and modest heritability of the longevity-related traits (Breitbach et al., 2019). This is due to the complex interplay of biological and social factors involved in human aging, as well as the limited power of GWAS, which require sampling thousands of subjects to achieve statistical significance (Breitbach et al., 2019). Genetic studies of aging have also been hindered by an inconsistent use of definitions of aging (reviewed in Baghdadi et al., 2020). The two main ways of conducting research on the genetics of longevity in human populations are by studying (i) the lifespan (continuous trait, years lived) and (ii) the longevity (dichotomous trait, i.e., being among the longest-lived individuals within a specific population). These complexities have limited the resolution and capability of broad association studies of human longevity. Importantly, these genomic analyses focus on a shift of survival in a population; these variables may be genetically distinct from the mechanisms establishing potential for longevity overall (Figure 1A). We argue that an understanding of this shift in lifespan as well as genetic mechanisms of regulating species-specific 'set points' (Figure 1B) will aid in the conceptual distinction of aging and longevity in humans.",
+      "Put more simply: What is the strength of evidence in favor of GXE effects on lifespan? We ask if youthful adult body weight (~120 days) predicts lifespan. Is the change in body weight in adults in response to a HFD a causal predictor of lifespan? Finally, we ask whether levels of classic serum metabolites or metabolic hormones measured in middle-age or old-age predict variation in lifespan? Our focus is both on overall effects and on strain-specific difference in effect of diet on lifespan and weight gain, rather than on specific genetic modifiers or loci of lifespan.",
+      "Studies in various models have revealed that genetic differences and somatic mutations underlie longevity, but non-genetic contributions also play a major role (Cournil and Kirkwood, 2001). Calorie restriction (Bordone and Guarente, 2005), lowering of basal metabolic rate (Ruggiero et al., 2008), upregulated stress response (Migliaccio et al., 1999), restoration of mitonuclear protein balance (Houtkooper et al., 2013), and reduced fertility (Westendorp and Kirkwood, 1998) have all been shown to correlate with lifespan extension. These observations illuminate the role of 'epi'-genetic mechanisms in modulating longevity pathways.",
+      "Introduction  Approximately 25-30% of the variation in adult lifespan is attributable to genetic factors that become more important with increasing age and exert their strongest effects in nonagenarians and centenarians (G\u00f6gele et al., 2010; Hjelmborg et al., 2006). As yet, however, only a few genetic variants have been found consistently to influence longevity. The first to be discovered was the e4 allele of the apolipoprotein E (APOE) gene, a mortality factor that predisposes to both Alzheimer's and cardiovascular diseases (Corder et al., 1993; Panza et al., 2004). APOE e4 is the only variant with a reportedly large adverse effect upon survival at advanced age (Sch\u00e4chter et al., 1994), and this association has been replicated in several populations (Christensen et al., 2006). Variation in the human forkhead box O3A gene (FOXO3A), in contrast, has been found to be associated with the ability to live long, an effect corroborated by studies in Japanese, German, Italian, US-American, Jewish, Chinese and Danish populations (Anselmi et al., 2009; Flachsbart et al., 2009; Li et al., 2009; Pawlikowska et al., 2009; Soerensen et al., 2010; Willcox et al., 2008). More recently, we have identified exonuclease 1 (EXO1) as a potential novel longevity gene (Nebel et al., 2009). All three genes were detected through candidate-gene approaches.",
+      "Studies of mono- and dizygous twins have revealed that the genetic contribution to the variation in human lifespan is about 25-30% [12,13], and is most prominent in families clustered for longevity [14,15]. This genetic contribution is mainly apparent after the age of 60 years and seems to increase with age [13,16]. Furthermore, human lifespan is a complex trait which is assumed to be determined by many genes with small individual effects [17], although the polygenic architecture still needs to be characterized [18,19]. The diverse health features of long-lived families illustrate that different age-related diseases have common determinants and imply that pathways can be identified that attenuate aging and delay age-related disease. From a genomic perspective, individuals from long-lived families are assumed to be characterized by a decreased prevalence of disease-promoting variants (referred to as disease-susceptibility alleles) and an increased prevalence of variants conferring maintenance of health and protection from disease, when compared to population controls. In the last 5 years, many disease-susceptibility alleles have been identified (National Human Genome Research Institute (NHGRI) genome-wide association study (GWAS) Catalog; http://www.genome.gov/gwastudies/) [20]. A first comparison between long-lived individuals, selected from both long-lived families (LLS) and the general population (Leiden 85-plus study), and young controls showed no difference in the distribution or frequency of disease-susceptibility alleles identified in cancer, coronary artery disease and type 2 diabetes [21]. The search for lifespan-regulating loci, contributing to longevity and population mortality, must therefore extend beyond a focus on disease-susceptibility alleles. We will first discuss the efforts to identify longevity loci by genetics approaches."
+    ],
+    [
+      "One surprising result of our experiment was the relatively weak support for involvement of the insulin/insulin-like signaling (IIS) or target-of-rapamycin (TOR) pathways in the evolution of late-life performance. Mutations in genes within these pathways can alter life span and fertility in flies and other organisms (Partridge and Gems 2002); natural genetic variation in expression of IIS/TOR-pathway genes has been reported to predict aging-related phenotypes (Nuzhdin et al. 2009), and natural clinal variation in the insulin receptor gene InR has been associated with variation in stress resistance and fecundity (Paaby et al. 2010). We therefore expected that some of these genes would contribute to the evolution of life span and late-life fecundity in our experiment. Only one gene previously annotated with the Gene Ontology biological function \"determination of adult life span\" (Cct1) was among the genes bearing the strongest signature of selection, no more than would be expected by chance (1/96 of the candidate genes that had some biological process annotation, compared to 116/10,792 of all genes with some biological-process annotation, \u03c7\u00b2[1] = 0.002, P > 0.96). Genes annotated with the functions \"aging\" or \"determination of adult life span\" were also significantly underrepresented among differentially expressed genes (43/215 transcripts with these annotations had P < 0.05 for line or line-by-age effects, compared to 4488/13,258 of all annotated transcripts, \u03c7\u00b2[1] = 18.1, P < 0.0001). Most of the genes we identified are therefore novel candidates for the regulation of life span and late-age performance.",
+      "Rapamycin  Rapamycin has been shown to robustly increase lifespan in at least three different mouse strains and to improve healthspan measures including cognitive function, cardiac function, immune function, obesity, and cancer incidence (Johnson et al. 2015; Kaeberlein 2014).",
+      "mTOR activates the kinase S6K, which phosphorylates S6, inhibiting autophagy [92]. Rapamycin can extend the life span of organisms from yeast to mammals in a dose-dependent manner [95]. However, some data suggest that rapamycin has unwanted metabolic effects, including insulin resistance, hyperlipidemia, glucose intolerance, and hypophosphatemia; however, whether rapamycin is responsible for these effects remains controversial, and some of the effects are reversible [96,97]. The mTOR pathway integrates different signals from insulin, cytokines, nutrients, oxygen, and mitogenic stimuli, and its regulation has important implications for longevity and against the negative effects of aging [92].",
+      "The molecular mechanisms that drive cellular senescence in proliferative and nonproliferative cells are being discovered. One of the metabolic pathways associated with aging is the growth-promoting mitogen/nutrient-sensing pathway, in which the target of rapamycin (mTOR) is considered a central signaling molecule that affects multiple cellular pathways associated with aging [137]. In particular, mTOR participates in the transition of cells from quiescence to senescence [138].",
+      "Inducers of Autophagy and its Impact on Aging  Autophagy has a role in homeostasis, which plays an essential role in the maintenance of cellular physiology and the prevention of cellular damage. The inducers of autophagy that have been described include the already-mentioned rapamycin, resveratrol, and polyamines; however, only polyamines have demonstrated results in clinical research in humans [65]. It is known that these compounds can induce the canonical autophagy pathway, which includes inactivation of the mammalian target of rapamycin complex 1 (mTORC1), allowing phosphorylation and activation of the Unc-51 complex (Ulk1/2), after which the other members of the ULK complex, such as FIP200 and ATG13, are activated in a cascade [65].",
+      "A third example illustrates that pharmacological targeting of pathways that have been implicated in promoting aging may also restore youthfulness at cellular and biochemical levels.Among the key regulators associated with interventions that extend life span is the enzyme mTOR, which senses cellular nutrient levels and in turn regulates rates of protein synthesis and energy utilization.Notably, administration of rapamycin, an mTOR inhibitor, starting at midlife can extend the life span of mice, suggesting that aging can be delayed or reversed in multiple cell types (Harrison et al., 2009).In the hematopoietic system, aging is associated with an increase in mTOR activation in stem cells and progenitors (Chen et al., 2009).Administration of rapamycin to old mice to inhibit mTOR not only limited the normal age-related increases in hematopoietic stem cells and biomarkers of aging in those cells, but also enhanced the performance of the stem cells to become as effective as young stem cells in heterochronic transplantation experiments (Chen et al., 2009) (Figure 1).",
+      "Rapamycin inhibits TOR signalling to alter nDNA translation, inducing mitonuclear protein imbalance35, and increases lifespan in various species, including mice33. Rapamycin also increased mean worm lifespan (by 16%)34 in a ubl-5-dependent manner, induced UPRmt, but not UPRER or heat shock response, and increased respiration (Fig. 6a, c and Supplementary Fig. 9a). This was associated with increased ATP levels, equal citrate synthase activity and altered nDNA/mtDNA oxidative phosphorylation protein ratio (Fig. 6d, e). Additionally, rapamycin changed the balance between nDNA- and mtDNA-encoded oxidative phosphorylation subunits in mouse hepatocytes in a dose dependent manner (Fig. 6f, g).",
+      "Zylber, E., Vesco, C. & Penman, S. Selective inhibition of the synthesis of mitochondria-associated RNA by ethidium bromide. J. Mol. Biol. 44, 195\u2013204 (1969). 33. Harrison, D. E. et al. Rapamycin fed late in life extends lifespan in genetically heterogeneous mice. Nature 460, 392\u2013395 (2009). 34. Robida-Stubbs, S. et al. TOR signaling and rapamycin influence longevity by regulating SKN-1/Nrf and DAF-16/FoxO. Cell Metab. 15, 713\u2013724 (2012). 35. Zid, B. M. et al. 4E-BP extends lifespan upon dietary restriction by enhancing mitochondrial activity in Drosophila. Cell 139, 149\u2013160 (2009). 36. Schulz, T. J. et al.",
+      "a, Rapamycin (Rapa, 1 nM) extends worm lifespan in a ubl-5-dependent manner; b, ubl-5-dependently induced UPRmt (hsp-6::GFP) but not UPRER (hsp-4::GFP) (n = 4). c\u2013e, Rapamycin increased respiration (c, n = 10) and ATP content but not citrate synthase activity (d, n = 3) and induced mitonuclear protein imbalance (e). f\u2013h, In mouse hepatocytes, rapamycin induces mitonuclear protein imbalance (f, g) and induces UPRmt as shown at the protein (f, g, n = 3), and transcriptional (h, n = 8) level. i, Resveratrol (Resv, 25 mM) induced mitonuclear protein imbalance in mouse hepatocytes (n = 4).",
+      "pivotal in this aspect providing molecular insights and having huge conceptual contributions in the field.Characterising the contribution of individual mutants in ageing is a continuously active and informative activity in the field.On top of these studies, genome-wide screens have provided insights on the role of evolutionarily conserved processes and signalling pathways in ageing such as nutrient response [17,18], protein translation, oxidative damage [19,20], mitochondrial function [21,22] and autophagy [22,23] opening new avenues for biogerontology research.Yeasts have proved informative and helped in understanding mechanisms of highly conserved pathways (from yeast to human) in physiology, health and disease such as the Target of Rapamycin (TOR) [24], glucose sensing (PKA) and stress response pathways (Sty1/p38) [25].",
+      "mTOR activates the kinase S6K, which phosphorylates S6, inhibiting autophagy [92].Rapamycin can extend the life span of organisms from yeast to mammals in a dose-dependent manner [95].However, some data suggest that rapamycin has unwanted metabolic effects, including insulin resistance, hyperlipidemia, glucose intolerance, and hypophosphatemia; however, whether rapamycin is responsible for these effects remains controversial, and some of the effects are reversible [96,97].The mTOR pathway integrates different signals from insulin, cytokines, nutrients, oxygen, and mitogenic stimuli, and its regulation has important implications for longevity and against the negative effects of aging [92].",
+      "The molecular mechanisms that drive cellular senescence in proliferative and nonproliferative cells are being discovered.One of the metabolic pathways associated with aging is the growth-promoting mitogen/nutrient-sensing pathway, in which the target of rapamycin (mTOR) is considered a central signaling molecule that affects multiple cellular pathways associated with aging [137].In particular, mTOR participates in the transition of cells from quiescence to senescence [138].",
+      "Inductors of Autophagy and its Impact on Aging  Autophagy has a role in homeostasis, which plays an essential role in the maintenance of cellular physiology and the prevention of cellular damage.Among the inducers of autophagy have been described the already-mentioned rapamycin, resveratrol, and polyamines; however, only polyamines have demonstrated results in clinical research in humans [65].It is known that these compounds can induce the canonical autophagy pathway, which includes inactivation of the mammalian objective of the rapamycin complex 1 (mTORC1), allowing phosphorylation and activation of the Unc-51 complex (Ulk1/2), where the cascade of the other members of the complex is subsequently activated, ULK as FIP200 and ATG13 [65].",
+      "Background  Genetic, dietary and drug interventions can enhance longevity and suppress age-associated disease, such as cancer.Prominent genetic interventions that robustly extend longevity and healthspan in mammals include those that decrease growth hormone (GH) and insulin-like growth factor (IGF) signalling; for example, Ames dwarf mice live more than 50% longer than their wild-type siblings [1].These diminutive mice result from a point mutation in a gene (Prop1 df/df ) that drives development of the pituitary gland, so that mutant mice are deficient in specific hormones.The GH deficiency, in particular, has been shown to underlie their enhanced health span and extended lifespan.Ames mice are highly insulinsensitive, resistant to some stresses and the incidence of cancer is delayed [2][3][4].Dietary and drug interventions that extend lifespan include calorie restriction (CR) and the mTOR inhibitor rapamycin [5].Like the Ames dwarf mutation, CR and rapamycin also suppress and/ or delay the incidence of cancer [5][6][7].A detailed understanding of how these interventions exert their beneficial effects is essential to develop strategies to promote healthy aging in humans [8].Currently, these interventions are thought to exert their effects by related and interconnected effects on some or all of the following: genome stability, the epigenome, telomere attrition and/or function, protein quality control, mitochondrial function, nutrient sensing, cellular senescence, stem cell exhaustion, cellular stress responses and altered intercellular communication [9].Of note, the effects of longevity promoting interventions on the epigenome, a key determinant of cell phenotype, are poorly understood.",
+      "The target of rapamycin (TOR) signaling pathway has also emerged as a major regulator of lifespan.TOR is a highly conserved kinase that transduces signals from nutrients to regulate cell size, cell growth, and metabolism (Martin & Hall, 2005).Genetic studies in yeast Saccharomyces cerevisiae have shown that reduced levels of nutrients, namely amino acids and sugars, can extend yeast lifespan through regulation of the TOR signaling pathway (Kaeberlein et al ., 2005;Powers et al ., 2006).In Drosophila , recent studies have shown that amino acid restriction, rather than 'calorie restriction', extends lifespan (Min & Tatar, 2006).In C. elegans , either inactivation of CeTOR/let-363 by RNAi, or mutations in Raptor/daf-15 , encoding a regulatory subunit of CeTOR, leads to lifespan extension (Vellai et al ., 2003;Jia et al ., 2004).",
+      "As mentioned above, a number of genes regulating longevity also control growth and development.Some of these, such as the insulin/IGF1/GH pathway, have been suggested to play a role in the mechanisms of CR (Fig. 1).An emerging critical player is the target of rapamycin (TOR) signaling pathway, which involves both nutrient sensing and regulation of growth.Several genes in the TOR pathway, and the TOR gene itself, regulate longevity in flies (Kapahi et al., 2004) and both longevity and dauer diapause in worms (Jia et al., 2004).Strikingly, not only have genetic manipulations of the TOR gene extended lifespan in yeast and worms (Stanfel et al., 2009) but also feeding rapamycin (which inhibits TOR and is also known as sirolimus) to middle-aged mice significantly (9 -14%) increased lifespan (Harrison et al., 2009).Whether rapamycin is extending lifespan by delaying of aging or by affecting a specific disease, such as cancer, remains unclear.More recent studies show that starting rapamycin administration earlier in life does AGING GENES AS TARGETS FOR DRUG DISCOVERY not result in a significantly greater increase in lifespan (10 -18%) than that obtained in middle-aged mice (Miller et al., 2011).",
+      "Replacement of the C/ebp\u03b1 gene with C/ebp\u03b2 increases lifespan by 20% [35,36], and may alter the rate of aging [37], indicating that altering the isoform expression of these genes can affect lifespan.Moreover, the life-extending drug rapamycin may affect isoform ratios of C/ebp\u03b2.Rapamycin has been shown to increase lifespan via the suppression of Mtor [38] which in turn controls the isoform ratios of C/ebp\u03b2 [39].Therefore, we speculate that rapamycin may in part exert its life extending effect through C/ebp\u03b2.",
+      "The genome-wide RNAi study conducted by the Ruvkun lab, authored by Hamilton et al. [88], identified a total of 89 additional aging genes with disparate functions including cell structure, cell surface proteins, cell signaling, cellular metabolism, and protein turnover.Of the 66 genes with previously known functions, 17 corresponded to various aspects of carbon metabolism, including citric acid cycle enzymes and subunits of complexes I, IV, and V of the ETC.Researchers also speculated that protein translation might play a role in lifespan regulation, based on the identification of iff-1 (T05G5.10),a gene that has homology to the translation initiation factor eIF5A.Other hits from this screen included two genes containing PH domains known to interact with phosphatidylinositol lipids, multiple G protein-coupled receptors, protein processing and degradation genes such as proteases and ubiquitin ligases/hydrolases, and chromatin modifying factors.",
+      "How cellular processes that regulate aging impact genome stability also remain unclear.Compelling evidence now exists that in all eukaryotes, aging is regulated by conserved insulin/insulin-like growth factor (I-(IFG-1)) pathways and growth-signaling pathways regulated by the target of rapamycin (TOR) family of kinases (4).In general, experimental manipulations that upregulate these pathways promote aging, and manipulations that downregulate these pathways-including mutational inactivation or caloric restriction-extend life span and mitigate age-related pathologies.Downregulation of these pathways often leads to a reduction in oxidative stress and oxidative damage to DNA and other cellular constituents.For the most part, however, the relationship between aging and changes in oxidative damage downstream of alterations in growth-signaling pathways remains correlative rather than causal.",
+      "The potential of interventional approaches targeted at aging has yet to be realized in part because aging is a complicated multisystem process that has remained enigmatic.However, research over the last two decades has led to significant excitement.One of the most striking findings is that it is possible to administer a clinically approved drug, rapamycin, to mice at 20 months of age and extend both their life span and health span (Harrison et al., 2009).Surprisingly, much of the recent success of aging research can be traced back to one of its simplest model organisms: yeast.Two of the major pathways studied in the context of aging and age-related disease are the sirtuin pathway and the TOR signaling pathway, and yeast was pivotal in their discovery."
+    ],
+    [
+      "We briefly comment on rare mutations that shorten life span through the early onset of diseases that are increasingly common during aging in the general population, e.g., familial forms of Alzheimer, breast cancer, coronary artery disease, type II diabetes, etc.The later onset forms of these diseases are associated with causes of death at later ages.A major question is what role the more common allelic variants of these same genes have in \"normal aging\".Although examination of this huge emerging topic goes beyond the present discussion, we may consider the example of Werner's syndrome, a rare autosomal recessive that causes adult onset progeria with a high incidence of cancer and atherosclerosis (70).The absence of Alzheimer-type dementia in Werner's syndrome illustrates the \"segmental\" nature of this and other progerias (70).Thus, heritable shortening of life span should not be considered as a simple acceleration of general aging processes.The Werner's lesion maps to a defective gene encoding a helicase and exonuclease, which also has several polymorphisms.In Japan, 1367Arg was associated with a lower risk of myocardial infarction (70), although it was not associated with longevity in Finland (14).In general, we know little of the genetic factors involved in frailty and morbidity at later ages, which are important to the geneenvironment interactions implied in the major longevity increase seen during the twentieth century.",
+      "Indicative diseases associated with the candidate aging genes",
+      "Dementia has an age- and sex-standardized prevalence of ~7.1% in Europeans 1, with Alzheimer's disease (AD) being the most common form of dementia (50-70% of cases) 2. AD is pathologically characterized by the presence of amyloid-beta plaques and tau neurofibrillary tangles in the brain 3. Most patients are diagnosed with AD after the age of 65, termed late-onset AD (LOAD), while only 1% of AD cases have an early onset (before the age of 65) 3. On the basis of twin studies, the heritability of LOAD is estimated to be ~60-80% (refs. 4,5), suggesting that a large proportion of individual differences in LOAD risk is driven by genetics. The heritability of LOAD is spread across many genetic variants; however, Zhang et al. 6 suggested that LOAD is more of an oligogenic than a polygenic disorder due to the large effects of APOE variants. Zhang et al. 6 and Holland et al. 7 predicted there to be ~100-10,000 causal variants contributing to LOAD; however, only a fraction have been identified. Increasing the sample size of genome-wide association studies (GWAS) will improve the statistical power to identify the missing causal variants and may highlight additional disease mechanisms. In combination with increasing the number of samples, it is beneficial to use different approaches to identify rare and private variation to help identify additional causal variants and increase understanding of disease mechanisms; however, we deem this to be out of the scope of the current analysis.",
+      "Dementia has an age-and sex-standardized prevalence of ~7.1% in Europeans 1 , with Alzheimer's disease (AD) being the most common form of dementia (50-70% of cases) 2 .AD is pathologically characterized by the presence of amyloid-beta plaques and tau neurofibrillary tangles in the brain 3 .Most patients are diagnosed with AD after the age of 65, termed late-onset AD (LOAD), while only 1% of AD cases have an early onset (before the age of 65) 3 .On the basis of twin studies, the heritability of LOAD is estimated to be ~60-80% (refs. 4,5 ), suggesting that a large proportion of individual differences in LOAD risk is driven by genetics.The heritability of LOAD is spread across many genetic variants; however, Zhang et al. 6 suggested that LOAD is more of an oligogenic than a polygenic disorder due to the large effects of APOE variants.Zhang et al. 6 and Holland et al. 7 predicted there to be ~100-10,000 causal variants contributing to LOAD; however, only a fraction have been identified.Increasing the sample size of genome-wide association studies (GWAS) will improve the statistical power to identify the missing causal variants and may highlight additional disease mechanisms.In combination with increasing the number of samples, it is beneficial to use different approaches to identify rare and private variation to help identify additional causal variants and increase understanding of disease mechanisms; however, we deem this to be out of the scope of the current analysis.The largest previous GWAS of LOAD, identified 29 risk loci from 71,880 (46,613 proxy) cases and 383,378 (318,246 proxy) controls 8 .Our current study expands this to include 90,338 (46,613 proxy) cases and 1,036,225 (318,246 proxy) controls.The recruitment of LOAD cases can be difficult due to the late age of onset, so proxy cases can allow for the inclusion of younger individuals by estimating their risk of LOAD using parental status.Proxy cases and controls were defined on the basis of known 
parental LOAD status weighted by parental age (Supplementary Information).In the current study, we identified 38 loci, including seven loci that have not been reported previously.Functional follow-up analyses implicated tissues, cell types and genes of interest through tissue and cell type enrichment, colocalization and statistical fine-mapping.This study highlights microglia, immune cells and protein catabolism as relevant to LOAD, while identifying previously unidentified genes of potential interest. ResultsGenome-wide inferences.We performed meta-analysis on data from 13 cohorts, totaling 1,126,563 individuals (Supplementary",
+      "Introduction  Alzheimer's disease (AD) is a complex disorder and is the most common form of dementia [1].After age, family history is the single greatest risk factor for AD.AD can be classified into early and late onset forms.Mutations in three genes: PSEN1/2 and APP are known to cause early onset AD in an autosomal dominant manner [2,3].The majority of AD cases, however, are late onset (LOAD) and the APOE e4 allele is the strongest known genetic risk factor.Many additional genetic polymorphisms have been identified, though with substantially lower risk estimates [1,4,5,6,7,8,9,10].LOAD appears to be inherited and/or sporadic and there is evidence of a maternal inheritance pattern [11].Current estimates suggest that more than 20% of inherited LOAD cases are maternally inherited [12].",
+      "INTRODUCTION  Many common noninfectious diseases exhibit a more severe clinical presentation in older individuals.These diseases often exhibit complex etiology and can affect different tissues and cell types, with a wide spectrum of clinical outcomes.Prominent aging-associated neurodegenerative diseases are Alzheimer's disease (AD), Parkinson's disease (PD), and age-related macular degeneration (AMD), all of which can severely compromise the quality of life and have serious repercussions on both the individual and society at large.These late-onset diseases generally result from the interplay between multiple genetic susceptibility factors and environmental components.Sequencing of the human genome, cataloging of millions of single nucleotide polymorphisms (SNPs) together with the development of a map of common haplotypes, and technological innovations in genotyping are among the major milestones that are facilitating exploration of the genetic basis of common diseases (1,7,50).In the field of AMD genetics, these advances have led to the identification of several genetic susceptibility factors and enabled us to start dissecting the relationship between environmental risk factors and the genetic constitution of each individual (66,118,148).As a result, new opportunities are emerging for improved understanding of disease pathogenesis that may lead to better management and treatment of AMD.Clinical aspects of AMD are discussed only briefly (for a more in-depth discussion, see Reference 79).",
+      "Aging-associated neurodegenerative diseases significantly influence the quality of life of affected individuals.Genetic approaches, combined with genomic technology, have provided powerful insights into common late-onset diseases, such as age-related macular degeneration (AMD).Here, we discuss current findings on the genetics of AMD to highlight areas of rapid progress and new challenges.We also attempt to integrate available genetic and biochemical data with cellular pathways involved in aging to formulate an integrated model of AMD pathogenesis.",
+      "Aging-associated neurodegenerative diseases significantly influence the quality of life of affected individuals.Genetic approaches, combined with genomic technology, have provided powerful insights into common late-onset diseases, such as age-related macular degeneration (AMD).Here, we discuss current findings on the genetics of AMD to highlight areas of rapid progress and new challenges.We also attempt to integrate available genetic and biochemical data with cellular pathways involved in aging to formulate an integrated model of AMD pathogenesis.",
+      "Genetics of Alzheimer Disease: Early-Onset AD  In the early to mid-1990s, genetic studies of AD focused on extended families with high burden of disease (two or more cases among first-degree relatives), and used linkage analysis of highly polymorphic genetic markers called short tandem repeats (STRs, or microsatellites) in order to identify genomic regions co-transmitting with disease in affected family members. This strategy, followed by \"fine mapping\"-the positional cloning of candidate genes-was used to identify genes and genetic variants contributing to AD risk. The first three genes known to cause AD were identified among families with multiple early-onset cases (age-at-onset <60 years): APP, encoding amyloid precursor protein [Goate et al., 1991], and PS1 and PS2, encoding presenilins I and II respectively [Levy-Lahad et al., 1995; Rogaev et al., 1995; Sherrington et al., 1995], each transmitting disease-causing variants in the predicted autosomal-dominant fashion.",
+      "Alzheimer's disease (AD) (MIM: 104300) is a highly heritable disease with great complexity in its genetic contributors, and represents the most common form of dementia. With the gradual aging of the world's population, leading to increased prevalence of AD, and the substantial cost of care for those afflicted, identifying the genetic causes of disease represents a critical effort in identifying therapeutic targets. Here we provide a comprehensive review of genomic studies of AD, from the earliest linkage studies identifying monogenic contributors to early-onset forms of AD to the genome-wide and rare variant association studies of recent years that are being used to characterize the mosaic of genetic contributors to late-onset AD (LOAD), and which have identified approximately 20 genes with common variants contributing to LOAD risk. In addition, we explore studies employing alternative approaches to identify genetic contributors to AD, including studies of AD-related phenotypes and multi-variant association studies such as pathway analyses. Finally, we introduce studies of next-generation sequencing, which have recently helped identify multiple low-frequency and rare variant contributors to AD, and discuss ongoing efforts with next-generation sequencing studies to develop statistically well-powered and comprehensive genomic studies of AD. Through this review, we help uncover the many insights the genetics of AD have provided into the pathways and pathophysiology of AD.",
+      "Alzheimer's disease (AD) (MIM: 104300) is a highly heritable disease with great complexity in its genetic contributors, and represents the most common form of dementia. With the gradual aging of the world's population, leading to increased prevalence of AD, and the substantial cost of care for those afflicted, identifying the genetic causes of disease represents a critical effort in identifying therapeutic targets. Here we provide a comprehensive review of genomic studies of AD, from the earliest linkage studies identifying monogenic contributors to early-onset forms of AD to the genome-wide and rare variant association studies of recent years that are being used to characterize the mosaic of genetic contributors to late-onset AD (LOAD), and which have identified approximately 20 genes with common variants contributing to LOAD risk. In addition, we explore studies employing alternative approaches to identify genetic contributors to AD, including studies of AD-related phenotypes and multi-variant association studies such as pathway analyses. Finally, we introduce studies of next-generation sequencing, which have recently helped identify multiple low-frequency and rare variant contributors to AD, and discuss ongoing efforts with next-generation sequencing studies to develop statistically well-powered and comprehensive genomic studies of AD. Through this review, we help uncover the many insights the genetics of AD have provided into the pathways and pathophysiology of AD.",
+      "Indeed, as age increases, there is an exponential increase in the incidence of AD, with a corresponding effect on healthcare costs and quality of life. AD is a complex disease involving several genetic and environmental components (Hardy, 1997; Munoz & Feldman, 2000), and 15% of patients have a genetic predisposition. Almost 100 candidate genes are currently known to be involved in the development of AD, and only 4 (APP, PSEN1, PSEN2, APOE) in humans have been proven to play a direct role in AD pathogenesis (Thomas & Fenech, 2007).",
+      "The genetics of Alzheimer disease (AD) to date support an age-dependent dichotomous model whereby earlier age of disease onset (<60 years) is explained by 3 fully penetrant genes (APP [NCBI Entrez gene 351], PSEN1 [NCBI Entrez gene 5663], and PSEN2 [NCBI Entrez gene 5664]), whereas later age of disease onset (\u226565 years) representing most cases of AD has yet to be explained by a purely genetic model. The APOE gene (NCBI Entrez gene 348) is the strongest genetic risk factor for later onset, although it is neither sufficient nor necessary to explain all occurrences of disease. Numerous putative genetic risk alleles and genetic variants have been reported. Although all have relevance to biological mechanisms that may be associated with AD pathogenesis, they await replication in large representative populations. Genome-wide association studies have emerged as an increasingly effective tool for identifying genetic contributions to complex diseases and represent the next frontier for furthering our understanding of the underlying etiologic, biological, and pathologic mechanisms associated with chronic complex disorders. There have already been success stories for diseases such as macular degeneration and diabetes mellitus. Whether this will hold true for a genetically complex and heterogeneous disease such as AD is not known, although early reports are encouraging. This review considers recent publications from studies that have successfully applied genome-wide association methods to investigations of AD by taking advantage of the currently available high-throughput arrays, bioinformatics, and software advances. The inherent strengths, limitations, and challenges associated with study design issues in the context of AD are presented herein.",
+      "Arch Neurol. 2008;65(3):329-334   Alzheimer disease (AD) is the most common cause of dementia and the most prevalent neurodegenerative disorder associated with aging. 1 Alzheimer disease is a heterogeneous disorder with a complex etiology owing to genetic and environmental influences as causal or risk modifiers. The neuropathologic hallmarks of disease are extracellular amyloid plaques and intracellular neurofibrillary tangles of hyperphosphorylated tau protein. 2 Only 10% of AD cases occurring before 60 years of age (early-onset AD) are due to rare, fully penetrant (autosomal dominant) mutations in 3 genes: A\u03b2 precursor protein (APP) on chromosome 21, 3 presenilin 1 (PSEN1) on chromosome 14, 4 and presenilin 2 (PSEN2) on chromosome 1. 5,6 In contrast, most cases of AD are later in onset (\u226565 years of age) (late-onset AD), are nonfamilial, and are likely the result of highly prevalent genetic variants with low penetrance. 7 To date, the only genetic risk factor for late-onset AD remains the apolipoprotein E gene (APOE), specifically the \u03b54 allele, which is moderately penetrant, accounting for up to 50% of cases. 8 However, a robust literature reports numerous putative genetic risk alleles and promising genetic variants. Recent reports from individual studies reveal significant associations with the sortilin-related receptor (SORL1 [NCBI Entrez gene 6653]) 9,10 and glycine-rich protein 2-associated binding protein 2 (GAB2 [NCBI Entrez gene 9846]) 11 on chromosome 11; death-associated protein kinase 1 (DAPK1 [NCBI Entrez gene 1612]), 12 ubiquilin 1 (UBQLN1 [NCBI Entrez gene 299798]), 13 and adenosine triphosphate-binding cassette transporter 1, subfamily A (ABCA1 [NCBI Entrez gene 19]), on chromosome 9 14 ; and low-density lipoprotein receptor-related protein 6 (LRP6 [NCBI Entrez gene 4040]) on chromosome 12. 15 All of these putative variants still lack replication in large representative populations but have relevance to neuropathologic mechanisms and pathways that may be associated with AD pathogenesis (   A large meta-analysis from the AlzGene database 16 17 All are associated with relevant biological mechanisms and pathways but await replication to further elucidate their utility as significant markers for AD.",
+      "Background  Alzheimer's disease (AD) is the most common neurodegenerative disorder and the leading cause of dementia in the elderly [1].Diagnosis of AD is based on the presence of neurofibrillary tangles and amyloid plaques [2], and symptoms typically include memory loss and impaired cognitive ability.Although the pathological hallmarks associated with dementia-related symptoms in AD appear largely similar between both the early-onset and late-onset forms of the disease, their underlying etiologies contrast [3].Whereas early-onset AD is a familial autosomal dominant disorder caused by rare, highly penetrant mutations in one of a small set of genes (APP, PSEN1, and PSEN2), the more common late-onset form of the disease (accounting for 90-95 % of cases) occurs sporadically, and risk is determined by complex underlying mechanisms [3][4][5][6].Estimates based on twin concordance rates suggest heritability of late-onset AD is as high as 70 %, implicating major roles for genetic as well as non-genetic factors [6].Indeed, through candidate gene studies, as well as more recent genome-wide association studies (GWASs) and whole-exome sequencing, both common and rare variants associated with the late-onset form of AD have been identified [7][8][9][10][11].Collectively, however, common GWAS variants account for only a modest proportion (~30 %) of the underlying variance in disease susceptibility [12].Several environmental factors are also thought to play a role [5,6], yet exactly how these contribute to risk, onset, and progression remains poorly defined.",
+      "Alzheimer's disease is the most common type of dementia, and it is characterized by a decline in memory or other thinking skills.The greatest risk factor for Alzheimer's disease is advanced age.A recent genome-wide study identified a locus on chromosome 17 associated with the age at onset, and a specific variant in CCL11 is probably responsible for the association.The association of a protective haplotype with a 10-year delay in the onset of Alzheimer's disease and the identification of a CCL11 variant with possible functional roles in this association might allow the future development of immunomodulators with the potential to halve disease incidence.",
+      "Alzheimer's disease is the most common type of dementia, and it is characterized by a decline in memory or other thinking skills.The greatest risk factor for Alzheimer's disease is advanced age.A recent genome-wide study identified a locus on chromosome 17 associated with the age at onset, and a specific variant in CCL11 is probably responsible for the association.The association of a protective haplotype with a 10-year delay in the onset of Alzheimer's disease and the identification of a CCL11 variant with possible functional roles in this association might allow the future development of immunomodulators with the potential to halve disease incidence.",
+      "INTRODUCTION  Alzheimer's disease (AD) is a common debilitating disorder with a prevalence that rises steeply with age from below 1% at 65 years to as high as 40% after the age of 90 [Bachman et al., 1992].Genes are known to play a role in the development of AD.Twin studies show heritabilities of around 60% [Bergem et al., 1997;Gatz et al., 1997].Indeed, variation in four genes has already been shown to cause rare forms of early-onset AD [the Amyloid Precursor Protein Gene (APP); Goate et al., 1991; Presenilin 1 (PS1); Sherrington et al., 1995; Presenilin 2 (PS2); Levy Lahad et al., 1995, Rogaev et al., 1995] or increase the general risk of disease development [Apolipoprotein E (APOE), Corder et al., 1993].As well as increasing disease susceptibility, APOE e4 alleles are associated with reduced age at onset (AAO) and appear to show their strongest effect below 70 years [Farrer et al., 1997].There is also evidence from both twin [Pedersen et al., 2001] and family studies [Tunstall et al., 2000;Li et al., 2002] that AAO in AD is heritable.Daw et al. [2000] have estimated that in addition to APOE, there are at least four loci with similar effect sizes, which contribute to AAO in AD.",
+      "Introduction  Alzheimer's disease (AD), a devastating neurodegenerative disease, is the most common form of dementia among the elderly. Genetically, AD is a complex and multifactorial disease with the possible involvement of multiple genes. The rare early-onset form of the disease usually follows an autosomal-dominant inheritance pattern and to date three genes have been identified: amyloid precursor protein (APP) and presenilin 1 and 2 (PSEN1 and PSEN2). The common late-onset form of the disease is much more complex than the early-onset form and until recently the apolipoprotein E (APOE) gene was the only major genetic factor accounting for 20-29% of the risk for late-onset AD. 1,2 Recent large genome-wide association studies (GWAS) have identified nine additional genes for late-onset AD, including CR1, BIN1, CLU (a.k.a.4][5][6][7] There is high heritability for AD risk (up to 80%), 8 but the total risk attributable to all confirmed loci is about 50%, indicating the presence of additional risk genes for late-onset AD.",
+      "ONE OF EVERY 5 PERSONS AGED 65 years is predicted to develop Alzheimer disease (AD) in their lifetime, and genetic variants may play an important part in the development of the disease. 1 The apparent substantial heritability of late-onset AD 2 is inadequately explained by genetic variation within the well-replicated genes (apolipoprotein E [APOE; RefSeq NG_007084], presenilin-1 [PSEN1; RefSeq NG_007386], presenilin-2 [PSEN2; RefSeq NG_007381], and amyloid beta precursor protein [APP;"
+    ]
+  ]
+}
diff --git a/gnqa/paper1_eval/src/data/datasets/old/full_general_dataset.json b/gnqa/paper1_eval/src/data/datasets/old/full_general_dataset.json
new file mode 100644
index 00000000..5e8a3b4c
--- /dev/null
+++ b/gnqa/paper1_eval/src/data/datasets/old/full_general_dataset.json
@@ -0,0 +1,248 @@
+{
+  "question": [
+    "What is the most cited environmental factor for the onset of asthma?",
+    "What is apoptosis?",
+    "How would one extract DNA from, say, mammals or plants?",
+    "What is a genetic marker?",
+    "Explain the process of finding a genetic marker followed by a quantitative trait locus.",
+    "Create a how-to guide for genetic sequencing.",
+    "What is the significance of the length of telomeres?",
+    "Once a sperm combines with an egg, what determines how traits are passed onto the resulting lifeform?",
+    "Why is genetic tracing matrilineal rather than patrilineal?",
+    "How does one tell the difference between X and Y DNA, with respect to DNA tracing and determining QTLs?"
+  ],
+  "answer": [
+    "Air pollution is the most cited environmental factor for the onset of asthma.",
+    "Apoptosis, also known as cell suicide or programmed cell death, is a biological process in multicellular organisms that allows specific cells to be removed during the development of complex tissues, or potentially dangerous damaged cells to be destroyed for the benefit of the whole organism. It is characterized by a sequence of well-defined events resulting in cell destruction and is necessary for normal cell turnover. It is also essential to various other biological processes.",
+    "For mammals, DNA can be extracted using a DNA Isolation Kit for Mammalian Blood, following the manufacturer's instructions. This process typically involves mixing blood with lysis buffer and Proteinase K, followed by the addition of isopropanol and Inhibitor Removal Buffer. The DNA is then washed, centrifuged, and the supernatant discarded. The remaining pellet, which contains the purified genomic DNA, is diluted in TE buffer.\n\nFor plants, the mixed alkyl trimethyl ammonium bromide (MATAB) procedure can be used. This involves grinding plant material in liquid nitrogen, incubating it in extraction buffer, and then purifying it with chloroform:isoamylalcohol. The DNA extracts are then precipitated with isopropanol and resuspended in a buffer. The extracts are further purified on anion exchange columns.",
+    "A genetic marker is a measurable polymorphic sequence of DNA whose chromosomal location is known. They often have no known functional significance but are used as pointers to a particular chromosomal location. They are used to track the inheritance of genes and can be a gene or some section of DNA with no known function.",
+    "The process of finding a genetic marker followed by a quantitative trait locus (QTL) involves several steps. First, a population is developed for genetic mapping. This population can be a segregating population or a permanent population. The population is then genotyped using molecular markers. Next, the population is phenotyped for a trait of interest. QTL analysis is then conducted using statistical procedures to find markers linked to the QTL. This involves single-marker regression across all chromosomes, where a hypothetical QTL is evaluated at the location of each marker locus. The significance of that QTL is estimated from a likelihood ratio statistic. A permutation test is then conducted to establish genome-wide significance criteria for the trait. The result is a list of marker loci that show a significant association with the trait. These loci are most likely to be near QTLs. The goal of QTL mapping is to identify regions of the genome that harbor genes relevant to a specified trait.",
+    "1. Start with an initial linkage or association: This is the first step in a genetic study where a trait or disease is linked to a specific region of the genome.\n\n2. Perform Laboratory Genotyping: This involves determining the genetic variation of an individual by examining their DNA sequence using biological assays and comparing it to a standard sequence.\n\n3. Use Genomic Sequence: The role of genomic sequence is crucial at every stage. It serves as a common thread that ties all the stages together.\n\n4. Establish True Orthology: Determine the true orthology between genes in different species. If two genes are truly orthologous, their evolution closely follows patterns of speciation.\n\n5. Apply Bioinformatics: Use bioinformatics tools and techniques to analyze the data. This can be applied to sequence data from any collection of organisms.\n\n6. Understand Genome Structure and Content: Provide an overview of genome structure and content to provide a context for subsequent discussions.\n\n7. Improve DNA Sequencing Methods: Constantly work on improving DNA sequencing methods with the ultimate goal of sequencing a human genome in a single day for a cost of about US $1,000.\n\n8. Interpret the Information: The major challenge ahead is the interpretation of this information. Understand how our genes interact with each other, and how the environment contributes to the development of health and disease.\n\n9. Consider the Implications: Understand the individual and societal implications of knowing our genome sequence.\n\n10. Keep Up with Technological Advances: With further technological advances, other previously unimaginable research approaches will become real.",
+    "The length of telomeres is significant as it is associated with aging, disease, and mortality. Longer telomeres are generally associated with better health outcomes, including protection from age-related diseases, better cognitive function, and healthier lipid profiles. Shorter telomeres, on the other hand, have been linked to higher risks of heart disease, infection-related death, and other diseases. Telomere length can also be influenced by factors such as stress, with high-stress situations potentially leading to faster telomere shortening and premature aging.",
+    "The traits are determined by the combination of genes from both the sperm and the egg. This process involves meiosis, where each gamete (sperm and egg) contributes one chromosome to each pair, resulting in a zygote with a full complement of 23 chromosome pairs. The process of recombination or crossing over, where similar DNA sequences from the paired chromosomes swap genetic material, also plays a crucial role in determining the traits of the offspring. This results in a shuffling of genetic material and contributes to the genetic variation seen among offspring.",
+    "Genetic tracing is matrilineal due to the inheritance of mitochondrial DNA (mtDNA), which is passed from mother to all her children without any admixture from the father. This allows for a clear lineage tracing through the maternal line. In contrast, Y-DNA is passed from father to son, allowing for patrilineal tracing, but it does not provide information about the genetic contributions of other ancestors in a family tree.",
+    "The difference between X and Y DNA is not explicitly discussed in the context of determining QTLs in the provided text. However, in general, X and Y chromosomes are different in size and gene content, which can be identified through genetic sequencing. In the context of QTL studies, these differences could potentially influence the traits being studied. However, the specific process of how this is done in QTL mapping is not detailed in the provided text."
+  ],
+  "contexts": [
+    [
+      "INTRODUCTION  Asthma is a chronic disease of the airways defined by its symptoms, which include reversible airflow obstruction, inflammation, and bronchial hyperresponsiveness. The ancient Egyptians, Greeks, and Romans made reference to the symptoms of asthma, and today the disease is estimated to affect 235-334 million people worldwide (44,53).",
+      "  The atopic triad. Perhaps the most widely recognized pattern of co-occurrence is the one of asthma, atopic dermatitis (eczema), and allergic rhinitis (hay fever), which together are referred to as the atopic triad and characteristically present clinically in a temporal sequence known as the atopic march. Within this sequence, atopic dermatitis is typically the first component to manifest, with approximately 20-30% of individuals with mild disease and 70% of those with severe disease going on to develop asthma. Individuals who undergo this distinctive sequence of disease progression frequently exhibit a more severe and persistent phenotype, with increased risk of allergen sensitization.",
+      "  Clinically, asthma is characterized by episodes of coughing, chest tightness, wheezing, dyspnea, or sputum production. Often, asthma sufferers experience a combination of these symptoms, or some symptoms more than others. Pulmonary breathing tests typically demonstrate variable airway obstruction and hyperreactivity, but may be normal, even in patients with severe and uncontrolled disease [8]. Thus, the diagnosis of asthma, which is based on general clinical symptoms and variable lung function testing, is non-specific and heavily dependent on clinical history. Within the \"umbrella\" diagnosis of asthma there exists a diverse array of differing clinical phenotypes [9]. For example, childhood asthma is often associated with personal and parental atopic diseases (i.e., atopic dermatitis, food allergy, eosinophilic esophagitis, allergic rhinitis), viral infections, and tobacco smoke exposure [10]. Alternatively, adult-onset asthma is less associated with atopic disease [11,12], but more associated with female sex [13], sinus disease [14], and preceding respiratory infections such as pneumonia [15]. In addition, adult-onset disease is often of higher severity [12,16] with a faster and more persistent decline in lung function [17]. Moreover, although severe patients are found in every demographic and age group, the most common phenotype is an adult female that is older and obese [18].",
+      "Introduction  An estimated 9% of children and 6% of adults in the United States have asthma [1]. The total number of asthma sufferers worldwide is estimated to be over 300 million, with an additional 100 million expected to develop asthma by 2025 [2][3][4][5]. Developed countries are the most affected, with some of the highest rates found in the United Kingdom, Australia, New Zealand and the Republic of Ireland [3]. Asthma prevalence is rising significantly in developing countries in transition to a more Western lifestyle [3]. In 2007, the cost of disease in the United States was estimated to be $56 billion in relation to medical expenses, missed days of work, and early deaths [1]. The rate of asthma deaths has likely plateaued, but is still as high as 250,000 per year worldwide [6]. Morbidity and mortality are particularly high in ethnic minorities living below or near the poverty line, and African American children had a death rate 10 times that of non-Hispanic white children in 2015 [7]. Thus, asthma is a costly, growing health problem associated with high morbidity and mortality.",
+      "Getting accurate estimates of exposures is difficult, whether this is air pollution or toxins in our food and drink, but these are important questions. Rutter: That is an important point. From the twin study data it is clear that environmental effects account for quite a lot of the variance on all the multifactorial disorders. Yet the kinds of measures that are used aren\u2019t terribly solid. They include broad things such as socio-economic status (SES). Even where there are good measures the care taken in testing for environmental mediation is usually poor.",
+      "Bronchiolitis, a disease that happens in the first year of life in many infants, is strongly associated with subsequent asthma. We ascertained it in the first years of life and have been following these people to age 25 now. For the people who had bronchiolitis and now have asthma, their parents recall much better that they had bronchiolitis than those who don\u2019t have asthma now. It is at least twice more. Extraordinarily, some of these latter parents don\u2019t recall that they took their child to the doctor in the first year of life.",
+      "If you arrive in the USA when you are young you have almost the same prevalence of asthma as an adult as those who are born in the USA and who are not Mexican. But if you arrive at older ages you have less asthma. If you arrive at the age of 20 you have the same asthma risk as those born in Mexico (Eldeirawi et al 2005). Kotb: This is extremely interesting. There is a relationship between depression and the immune system. This especially applies to natural killer (NK) cells, which are the main cells that fight cancers.",
+      "A colleague of mine in Georgia found this may have a protective effect against later development of asthma (Ownby et al 2002). Martinez: We find significantly decreased likelihood of asthma if you have a dog in a home, but not if you have a cat. The reason for this is not that I hate cats, which I do, but most likely because cats are stealth hunters, and they have to be very clean. Dogs are collective hunters and they don\u2019t care if they smell.",
+      "; Guffey, S.E. Investigation into pedestrian exposure to near-vehicle exhaust emissions. Environ. Health 2009, 8, 13. [CrossRef] [PubMed] Our World in Data.org. 2017. Available online: https://ourworldindata.org/data-review-air-pollution-deaths (accessed on 10 January 2022). Pope, C.A. , III. Respiratory disease associated with community air pollution and a steel mill, Utah Valley. Am. J. Public Health 1989, 79, 623\u2013628. [CrossRef] [PubMed] Pope, C.A. , III. What do epidemiologic findings tell us about the health effects of environmental aerosols? J. Aerosol. Med. 2000, 13, 335\u2013354. [CrossRef] [PubMed] Pope, C.A. , III.",
+      "Case for Support BBSRC Grant Application September 2005 \u201cIntegrative Analysis of the Genetic Factors behind Asthma and Atopic Dermatitis\u201d  Part I: Research Proposal Background A Introduction of topic of research and its academic and wider context Asthma is the most common disease of childhood, and affects one child in seven in the United Kingdom. Atopic Dermatitis (AD, eczema) affects similar numbers of children. About 60% of children with severe AD will have concomitant asthma. Treatments for both diseases are unsatisfactory. Abandonment of orthodox medical therapy for AD is common in many families who have children with the disease.",
+      "This is most common during the rainy season when aerosols are created, which results in repeated inhalation of Bp [43, 44]. Environmental sampling studies reveal there is a positive association between the prevalence of disease and the degree of environmental contamination [7]. In addition to environmental factors, data suggests that host factors play an important role in mounting an immune response against infectious diseases [45] such as melioidosis. While healthy persons can contract melioidosis, most patients in endemic regions have an underlying predisposition [28], which suggests that the immunological status of the patient can influence disease initiation and progression [15].",
+      "Sensitivity analysis  We did two sets of post-hoc sensitivity analyses to assess the effects of potential poor recall of age of onset among individuals with adult-onset asthma, and the effects of misclassification of COPD as asthma among the adult-onset cases, even with exclusion of cases with a reported diagnosis of COPD, emphysema, or chronic bronchitis. First, to assure that the adult-onset cases did not include a significant proportion of childhood-onset asthma in which symptoms remitted in early life but then relapsed in adulthood, we replaced adult-onset cases with increasing proportions of randomly selected childhood-onset cases, and then tested for association at the two most significant childhood onset-specific loci. This procedure was repeated 20 times for each proportion to quantify the sampling variability (appendix pp 7-8). Second, we did two analyses in which we removed either individuals with ages of asthma onset between 46 and 65 years or adult-onset cases and controls with FEV\u2081/FVC <0.70. For each, we compared p values and ORs with the GWAS including all adult-onset cases (appendix pp 8-9).",
+      "  We used data for British white individuals from UK Biobank data release July 19, 2017.8 We extracted disease status (asthma, allergic rhinitis, atopic dermatitis, food allergy, chronic obstructive pulmonary disease (COPD), emphysema, and chronic bronchitis), age of onset of asthma, and sex from self-reported questionnaires and hospital records (International Classification of Diseases 10th revision [ICD-10] codes) by querying our in-house protected UK Biobank database server.9 For our main case analysis, we included individuals who self-reported that they had doctor-diagnosed asthma. Further details of our research approach are provided in the appendix (pp 4-7).",
+      "; Guffey, S.E. Investigation into pedestrian exposure to near-vehicle exhaust emissions. Environ. Health 2009, 8, 13. [CrossRef] [PubMed] Our World in Data.org. 2017. Available online: https://ourworldindata.org/data-review-air-pollution-deaths (accessed on 10 January 2022). Pope, C.A. , III. Respiratory disease associated with community air pollution and a steel mill, Utah Valley. Am. J. Public Health 1989, 79, 623\u2013628. [CrossRef] [PubMed] Pope, C.A. , III. What do epidemiologic findings tell us about the health effects of environmental aerosols? J. Aerosol. Med. 2000, 13, 335\u2013354. [CrossRef] [PubMed] Pope, C.A. , III.",
+      "8 The socio-ecologic framework posits that various aspects of a child\u2019s environment directly and indirectly impact the child\u2019s health and development.9 Drawing on this framework, Beck and colleagues10 examined several biologic, social and ecologic variables to provide a greater understanding of factors influencing asthma-related hospital readmissions for black children compared to their white counterparts. The study revealed that black children were over two times as likely to be readmitted for an asthma-related illness compared to white children; this resulted from significant differences in almost every socio-ecologic variable measured, including disease management practices and access to primary care.",
+      "Specific Aims Asthma is the most common chronic pediatric medical condition in the United States, with a prevalence over 9.6% in children under 18 years of age.1, 2 Low-income, urban children incur a disproportionate share of asthma prevalence and morbidity;2-4 13% of children living below the poverty threshold are diagnosed with asthma compared to 8% of non-poor (>200% poverty),3 and poverty is associated with higher rates of asthma attacks.1 Living in an urban area confers additional risk for asthma and increased ED utilization.4, 5 Implementation of the National Asthma Education and Prevention Program\u2019s (NAEPP) Guidelines has contributed to reductions in asthma morbidity and mortality rates, and these guidelines emphasize establishing a partnership between healthcare providers and patients/families to promote effective asthma management.6 The NAEPP expert panel states, \u201cbuilding a partnership requires that clinicians promote open communication and ensure that patients have a basic and accurate foundation of knowledge about asthma\u2026\u201d (p.124),6 yet care partnerships also require that the patient/parent effectively communicate issues such as emerging symptoms or response to medications.",
+      "Vital & health statistics Series 3, Analytical and epidemiological studies. 2012(35):1-58. CDC. Current Asthma Prevalence. https://www.cdc.gov/asthma/most_recent_data.htm. 2015. Updated June 2017. Accessed March 9, 2018. Northridge J, Ramirez OF, Stingone JA, Claudio L. The role of housing type and housing quality in urban children with asthma. Journal of urban health : bulletin of the New York Academy of Medicine. 2010;87(2):211-224. Flores G, Snowden-Bridon C, Torres S, et al. Urban minority children with asthma: substantial morbidity, compromised quality and access to specialists, and the importance of poverty and specialty care.",
+      "Asthma Prevalence and Disparities Asthma is the most common chronic pediatric medical condition in the United States,1 affecting an estimated 6.2 million children annually.2 Poorly controlled pediatric asthma contributes to over 700,000 visits a year to emergency departments (ED).1 Children living in impoverished, urban settings are disproportionately affected by asthma,3 and the disparate impact of asthma is even worse among black and Latino children, and children whose parents have limited English proficiency (LEP) in these urban low-income areas.4-6 A 2017 longitudinal study revealed that black race and Latino ethnicity are significantly associated with worse asthma outcomes including 1) asthma knowledge, 2) asthma-related quality of life, 3) asthma severity, and 4) asthma control.",
+      "The Journal of asthma : official journal of the Association for the Care of Asthma. 2017:16. Inkelas M, Garro N, McQuaid EL, Ortega AN. Race/ethnicity, language, and asthma care: findings from a 4-state survey. Annals of allergy, asthma & immunology : official publication of the American College of Allergy, Asthma, & Immunology. 2008;100(2):120-127. National Asthma Education and Prevention Program. Expert Panel Report 3: Guidelines for the Diagnosis and Management of Asthma Bethesda, MD: National Institutes of Health, National Heart, Lung, and Blood Institute; 2007. Publication no. 08-045.1. NIH Consensus Group. Video report: What is mHealth?",
+      "Contact PD/PI: Coker, Tumaini Rucker  INTRODUCTION TO APPLICATION Research Plan Overview Childhood asthma is the most common pediatric medical condition in the United States, and disproportionately affects children living in low-income, urban settings. Many low-income, urban families rely on emergency department (ED) services as their source for sick care for their child. This is often due to not having a primary care provider or sufficient access to their primary care provider for asthma management."
+    ],
+    [
+      "  Apoptosis, or controlled cell death [62], is another major stressed-cell response, and was also represented in our results (Fig. 9e). A large body of direct evidence points to apoptosis as one of the main routes of RPE degeneration in AMD [63]. Induction of apoptosis upon stress is dictated by the action of master regulator p53, and it was recently shown that aging increases the activity of p53 in RPE cells and the likelihood for apoptotic cell death [64]. Consistent with this evidence, we found association with pathways in Transcriptional regulation by TP53 group (Fig. 9d). In particular, Regulation of TP53 activity through methylation was among the top pathway in our association analysis (Table 1), suggesting that p53 modification by methylation and the closely related histone modifications [Protein lysine methyltransferases (PKMTs) methylate histone lysine in Fig. 9e] play important roles in RPE apoptosis regulation. In the intrinsic apoptotic pathway induced by oxidative stress, cytochrome c is released from mitochondria into the cytosol, binding and activating caspases, the main proteases central to apoptotic action. We found association in pathways involving 'inhibitor of apoptosis' (IAP) and its negative regulator 'second mitochondrial activator of caspases' (SMAC) [65], which suggests that disruption to regulatory mechanisms preventing apoptosis in RPE cells may play roles in AMD.",
+      "Apoptosis  Persistent DNA damage",
+      "42 ABSTRACT 18 A MODULARIZED MODEL OF APOPTOSIS HA Harrington, KHo, Sk Ghosh, KC Tung , CY Kao, and B Aguda Imperial College London, Courant Institute of Mathematical Sciences New York University, University of Texas at Arlington, University of Texas Southwestern Medical Center, Mathematical Biosciences Institute, and Department of Mathematics, The Ohio State University Columbus, OH, USA Background: One of the key physiological mechanisms employed by the cell (during development and for maintenance of homeostasis) in multi-cellular organism is apoptosis, which is characterized by a sequence of well-defined events resulting in cell destruction.",
+      "14 Apoptosis is caused by the activation of the caspase cascade, which is initiated by two signaling routes (stress-induced death and death-domain receptor-induced death) (Domen 2001). This process can be prevented by antiapoptotic molecules, such as Bcl-2 (Domen and Weissman 2000). Direct evidence for the involvement of apoptosis in HSC number regulation came from the findings that overexpression of the anti-apoptotic gene bcl-2 led to increased numbers of Thy-1.1low, Sca-1+, c-kit+, Lin- cells, a population with long-term multi-lineage repopulation potential (Domen et al. 2000).",
+      "Several lines of evidence have indicated that apoptosis acts as an important regulator of stem cells. First of all, expression of some apoptosis-related genes were detected in human and/or murine HSCs (Domen 2001). Secondly, targeted disruption of some of these genes in null and dominant negative mutant mice interfered with normal apoptotic processes in HSCs. For example, overexpression of Bcl-2, a negative regulator of apoptosis, increased not only the numbers and competitive repopulation capabilities of HSCs, but also the resistance of HSCs to apoptosis induced by ionizing radiation (Domen and Weissman 2003).",
+      "Apoptosis  Cell suicide, or apoptosis, is a well-studied biological phenomenon in multicellular organisms that allows specific cells to be removed during the development of complex tissues, or potentially dangerous damaged cells to be destroyed for the benefit of the whole organism. The lack of an apparent evolutionary benefit for such a process in a single-celled organism initially caused controversy about the presence of an apoptotic pathway in yeast. Today, however, a number of yeast orthologues to mammalian apoptosis genes have been discovered and apoptotic-like cell death has been linked to mating, colony formation, and aging (Buttner et al. 2006; Eisenberg et al. 2007; Frohlich et al. 2007). With respect to aging, both replicatively and chronologically aged cells that die have increased ROS and display apoptotic phenotypes (Fabrizio et al. 2004a; Herker et al. 2004; Laun et al. 2001).",
+      "  The importance of apoptosis in yeast aging has yet to be fully characterized. At the very least, yeast apoptosis provides a useful pathway for studying genetic interactions for age-related diseases that affect humans, such as cancer. Readers interested in further information related to yeast apoptosis are referred to several in-depth reviews (Buttner et al. 2006; Eisenberg et al. 2007; Frohlich et al. 2007).",
+      "Early redistribution of plasma membrane phosphatidylserine is a general feature of apoptosis regardless of the initiating stimulus: inhibition by overexpression of Bcl-2 and Abl. J Exp Med 182: 1545-56. Mathew CG (2006). Fanconi anaemia genes and susceptibility to cancer. Oncogene 25: 5875-84. McBride MW, Carr FJ, Graham D, Anderson NH, Clark JS, Lee WK et al (2003). Microarray analysis of rat chromosome 2 congenic strains. Hypertension 41: 847-53. Merino-Trigo A, Kerr MC, Houghton F, Lindberg A, Mitchell C, Teasdale RD et al (2004).",
+      "  When a cell harbors such severe DNA damage that it is beyond repair, it is disposed of through apoptosis. Alternatively, DNA damage can induce cellular senescence, the irreversible cessation of mitosis. Both processes are critically dependent on p53, which is known as the guardian of the genome [3]. DNA damage may also trigger autophagy, a cellular catabolic process that maintains homeostasis [4]. It should be noted that under normal conditions cells are rarely exposed to very high doses of DNA-damaging agents, which may be the explanation why we do not age and die because we run out of cells. However, aging is associated with some atrophy [1] and it is conceivable that at older ages bursts of DNA damage, for example from free radical reactions associated with inflammation, do occur and give rise to an increasingly high rate of apoptosis or cellular senescence. While there is some evidence for increased apoptosis and cellular senescence at old age, it is doubtful that under normal conditions this would lead to a significant loss of functional cells.",
+      "  Apoptosis, or programmed cell death, literally eliminates cells at risk for neoplastic transformation. Senescence, by contrast, permanently arrests their growth. Both processes are controlled by the p53 tumor suppressor protein (Amundson, Myers, & Fornace, 1998; Bringold & Serrano, 2000; Hickman, Moroni, & Helin, 2002; Itahana, Dimri, & Campisi, 2001). p53 is a transcriptional regulator that both transactivates and transrepresses target genes in response to stress (Prives & Hall, 1999; Ryan, Phillips, & Voudsen, 2001). These target genes, in turn, stimulate DNA repair, transient cell cycle arrest, permanent cell cycle arrest (senescence) or cell death (apoptosis), depending on cell type, degree and type of damage, and other variables. In contrast, cells that lack normal p53 regulation or function (for example, tumor cells) tend to die in response to telomere dysfunction. Some normal human cells, on the other hand, undergo a senescence growth arrest. In either case, when present, p53 is crucial for mediating the cellular response to telomere dysfunction (Yaswen & Stampfer, 2002) (Fig. 4).",
+      "Cell death, and in particular apoptosis, can be caused by a number of mechanisms including loss of growth factors and excitotoxicity (e.g., Bhutta and Anand, 2002; Nikolic\u0301 et al., 2013). It is of interest therefore, that proximal to the region of the QTL there are several genes that are related to growth factors including the latent transforming growth factor protein 2 (ltbp2), placental growth factor (pgf), and transforming growth factor beta (Tgf beta).",
+      "  Apoptosis-related gene expression profiles",
+      "  Apoptosis. Programmed death of cells during embryogenesis and metamorphosis or during cell turnover in adult tissues.",
+      "14 Apoptosis is caused by the activation of the caspase cascade, which is initiated by two signaling routes (stress-induced death and death-domain receptor-induced death) (Domen 2001). This process can be prevented by antiapoptotic molecules, such as Bcl-2 (Domen and Weissman 2000). Direct evidence for the involvement of apoptosis in HSC number regulation came from the findings that overexpression of the anti-apoptotic gene bcl-2 led to increased numbers of Thy-1.1low, Sca-1+, c-kit+, Lin- cells, a population with long-term multi-lineage repopulation potential (Domen et al. 2000).",
+      "Several lines of evidence have indicated that apoptosis acts as an important regulator of stem cells. First of all, expression of some apoptosis-related genes was detected in human and/or murine HSCs (Domen 2001). Secondly, targeted disruption of some of these genes in null and dominant negative mutant mice interfered with normal apoptotic processes in HSCs. For example, overexpression of Bcl-2, a negative regulator of apoptosis, increased not only the numbers and competitive repopulation capabilities of HSCs, but also the resistance of HSCs to apoptosis induced by ionizing radiation (Domen and Weissman 2003).",
+      "  Fraction of cells displaying apoptosis",
+      "  It has been known that mitochondria play a central role in the life and death of cells (Kroemer & Reed, 2000). Apoptosis was observed in developmentally arrested embryos by 72 h, but not at 24 h after FCCP treatment, despite considerable telomere attrition at this early stage, suggesting that telomere attrition occurs prior to apoptosis and may serve as an intermediate step between mitochondrial dysfunction and apoptosis. These results also suggest that telomere shortening may signal apoptosis (Lee et al., 1998; Karlseder et al., 1999).",
+      "Cell Death  A form of programmed cell death, apoptosis is necessary for normal cell turnover and is essential to a plethora of other biological processes. Apoptosis can be executed via Bcl-2 activation of caspases, via signals from the death receptor on the plasma membrane, or via induction by granzyme B secreted from cytotoxic T cells (Tc cells) [35]. Endonucleases and proteases are activated by active caspases, eventually leading to the death of the cell. With age, however, apoptotic activity changes. In heart [36], kidney [37], skeletal muscle [38], and Tc cells [39], increased apoptosis has been reported, perhaps contributing to loss of cellularity in these tissues. This escalation across various tissues may be attributed to the increased production of free radicals [40] and furthermore exacerbated by the accumulation of DNA damage in the aged cells [41]. As the risk increases for cells to turn cancerous and dysfunctional with advancing age, increased apoptosis in aged cells is argued to be a defense strategy. In other tissues, such as the colon, apoptosis appears to decrease with age, perhaps contributing to the accumulation of senescent cells and age-associated carcinogenesis [42].",
+      "  The regulation and execution of apoptosis in endothelial cells is a complex process involving paracrine factors, membrane receptors, interaction of pro- and anti-apoptotic factors and cysteinyl aspartate-specific proteases (caspases). Recent studies suggest that in aging there is an imbalance in the expression of pro- and anti-apoptotic genes resulting in an enhanced apoptosis in the myocardium (19), central nervous system (24), skeletal muscle (10), lung (33), and liver (2,33). Yet, age-related alterations in the expression of pro- and anti-apoptotic genes in coronary arteries have not been elucidated.",
+      "Apoptosis modulating genes  Apoptosis or programmed cell death is associated with alterations in cell morphology, particularly the nucleus, with endonucleolytic cleavage of DNA into nucleosomal length fragments. Apoptosis may result from withdrawal of growth signals."
+    ],
+    [
+      "DNA and RNA extraction of tissues  Genomic DNA was extracted from frozen placentae (n = 3/group) and liver (n = 9/group) using a modified version of an established protocol (28,29). Total RNA was extracted from the remaining tissue using TRIzol, as per the manufacturer's instructions (Invitrogen Canada Inc). Genomic DNA and RNA purity and concentration were assessed using spectrophotometric analysis, and integrity was verified using agarose gel [1% (wt/vol)] electrophoresis.",
+      "Taxon Sampling and DNA Extractions  We extracted DNA from 72 pinned specimens from the National Museum of Natural History (NMNH) Entomology collection for this study. We plucked middle legs from the pinned bees using a pair of sterilized forceps and washed the tissue in 95% ethanol to remove dust, pollen, and other forms of accumulated debris on the bee legs. After evaporation of the ethanol (by drying the tissue on a clean Kimwipe\u2122), the samples were placed in a freezer for several hours. DNA was then extracted destructively by grinding the frozen tissue with a sterile pestle, using a DNeasy Blood and Tissue Kit (Qiagen, Valencia, CA, USA) and following the manufacturer's protocol, except the DNA was eluted in 130 \u03bcL ddH2O instead of the supplied buffer. We ran 10 \u03bcL of each extract for 60 min at 100 V on 1.5% agarose SB (sodium borate) gels to estimate the size of the genomic DNA.",
+      "DNA extraction  DNA was extracted from PBMCs using the QIAamp DNA Mini kit (Qiagen, CA, USA), following the manufacturer's instructions for the spin protocol. The DNA was eluted in 60 \u03bcl of AE elution buffer and stored at -20\u00b0C. The concentration and quality of the DNA were assessed with the Qubit dsDNA HS Assay (Invitrogen, Eugene, OR, USA).",
+      "Methods  Laboratory procedures. We initially screened 107 ancient samples (Supplementary Data 1) in dedicated clean facilities at the ancient DNA lab of Jilin University, China, following published protocols for DNA extraction and library preparation 36,37. Prior to sampling, we wiped all skeletal elements with 5% bleach and irradiated them with UV light for 30 min from each side. We drilled teeth to obtain fine powder using a dental drill (Dremel, USA). We sampled the dense part of petrous bones around the cochlea by first removing the outer part using the sandblaster (Renfert, Germany), and then grinding the clean inner part into fine powder with the mixer mill (Retsch, Germany). We digested the powder (50-100 mg) in 900 \u03bcl 0.5 M EDTA (Sigma-Aldrich), 16.7 \u03bcl of Proteinase K (Sigma-Aldrich), and 83.3 \u03bcl ddH2O (Thermo Fisher, USA) at 37 \u00b0C for 18 h. Then we transferred the supernatant to a MinElute silica spin column (QIAGEN, Germany) after mixing it thoroughly with the 13 ml custom binding buffer [5 M guanidine hydrochloride (MW 95.53), 40% Isopropanol, 90 mM Sodium Acetate (3 M), and 0.05% Tween-20], followed by two washes with PE buffer (80% ethanol). Then we eluted the DNA with 100 \u03bcl TET buffer (QIAGEN, Germany).",
+      "DNA Extraction  After blood was drawn into EDTA tubes, genomic DNA was extracted using a DNA Isolation Kit for Mammalian Blood Kit (Roche Applied Science, Indianapolis, IN, USA) according to the manufacturer's recommendations. Briefly, 300 \u03bcl of whole blood from each sample was mixed with 200 \u03bcl of lysis buffer (50 mM Tris pH 8.0, 100 mM EDTA, 100 mM NaCl, 1% SDS) and 40 \u03bcl of Proteinase K, followed by addition of 100 \u03bcl of isopropanol and 500 \u03bcl of Inhibitor Removal Buffer (5 M guanidine-HCl, 20 mM Tris-HCl pH 6.6). The DNA was washed with a buffer (20 mM NaCl; 2 mM Tris-HCl; pH 7.5), centrifuged twice at 2000 rpm, washed using cold 70% ethanol and centrifuged at 3000 rpm. The supernatant was discarded and the pellet containing purified genomic DNA was diluted in TE buffer (1 mM EDTA; 10 mM Tris-HCl, pH 7.5) to a concentration of approximately 50 ng/\u03bcl.",
+      "Genomic DNA extraction  Leukocytes were isolated from 5-ml peripheral blood samples. DNA was prepared by phenol extraction and chloroform extraction followed by isopropanol precipitation, washed with ethanol, and air-dried. Tris-EDTA buffer pH 8.0 was used to dissolve the final genomic DNA product.",
+      "  The pulled down DNA fragments were extracted and purified using phenol-chloroform extraction/ethanol precipitation. The samples were stored at -20 \u00b0C until use.",
+      "DNA extraction for genotyping  For the majority of samples, DNA was extracted from either spleen or the exocrine fraction of the islet isolation using the Tissue DNA Purification Kit according to manufacturer's instructions on an automated Maxwell 16 system (both Promega, USA). When no other tissue was available, DNA was extracted from human islets using the Trizol fraction remaining after extraction of RNA (see above). To precipitate the DNA, 300 \u03bcl 100% ethanol was added to the thawed solution. This mixture was incubated at room temperature for a minimum of 30 minutes. DNA was then pelleted by centrifugation at 4,000 x g for 5 minutes at 4\u00b0C. After removing the supernatant, the pellet was twice washed with 0.1 M trisodium citrate (Sigma Aldrich, UK) in 10% ethanol and left at room temperature for 30 minutes, followed by another wash step with 75% ethanol. After the final wash step, pellets were air-dried for 10 minutes to remove residual ethanol and re-suspended in a minimum of 100 \u03bcL 8 mM NaOH (Sigma Aldrich). Extracted DNA was stored at -20\u00b0C before further use.",
+      "DNA extraction  Tissue samples were incubated at 50\u00b0C overnight with shaking in DNA extraction buffer (100 mM NaCl, 10 mM Tris.HCl pH8, 25 mM EDTA, 0.5% (w/v) SDS), containing 200 \u03bcg/ml proteinase K. DNA was isolated by two rounds of phenol:chloroform extraction, followed by RNAse A treatment, precipitation in absolute ethanol containing 10% (v/v) sodium acetate (3 M, pH 5.2), and resuspended in 100 \u03bcl nuclease-free water (Ambion, Austin, TX, USA) or using salting out method followed by purification with Qiagen blood and tissue kit (Qiagen, Mississauga, ON, USA). DNA was stored at -20\u00b0C.",
+      "Methods  Human DNA samples DNA was extracted from human patient tissue samples acquired from the University of Minnesota Tissue Procurement Facility from BioNet (IRB#0805E32181). See Supplemental Table S4 for patient data. Briefly, 2 mg of tissue was digested overnight at 55\u00b0C on a rotating platform in 710 mL of digest buffer (1 M Tris at pH 8.0, 1 mM EDTA, 1\u00d7 SSC, 1% SDS, 1 mM NaCl, 10 mg/mL Proteinase K). Following digest, DNA was purified using phenol-chloroform-isoamyl alcohol (Life Sciences) isolation protocol.",
+      "3.2.2 Isolation of genomic DNA Genomic DNA was isolated from frozen liver tissue. The isolation was conducted using the Qiagen DNeasy\u00ae Blood & Tissue Kit (Qiagen) according to the manufacturer\u2019s protocol. DNA concentration was evaluated photometrically at a wavelength of 260 nm using the Fusion\u2122 Universal Microplate Analyzer. For nucleic acid quantification, the Beer-Lambert (A = \u03b5 * b * c) equation is modified to use an extinction coefficient with units of M-1 cm-1.",
+      "  Most typically, DNA is extracted from blood samples, dried blood spots, buccal swabs, saliva, tissue and even urine and stool samples. In forensic science, other sources have been validated, e.g., bone, tooth pulp, dandruff and others.",
+      "DNA isolation  High-molecular weight DNA was isolated from the samples by the organic solvent extraction method, followed by precipitation in cold ethanol [14].",
+      "Genomic DNA extraction  DNA from MEF cultures or mouse liver was isolated by phenol/chloroform extraction, as described [11].",
+      "DNA is usually recovered from cells by methods that include cell rupture but that prevent the DNA from fragmenting by mechanical shearing. This is generally undertaken in the presence of EDTA, which chelates the magnesium ions needed as cofactors for enzymes that degrade DNA, termed DNases. Ideally, cell walls, if present, should be digested enzymatically (e.g., by lysozyme treatment of bacterial cells). In addition, the cell membrane should be solubilized using detergent.",
+      "DNA solutions can be stored frozen, although repeated freezing and thawing tends to damage long DNA molecules by shearing. A flow diagram summarizing the extraction of DNA is given in Fig. 1.2. The above-described procedure is suitable for total cellular DNA. If the DNA from a specific organelle or viral particle is needed, it is best to isolate the organelle or virus before extracting its DNA, because the recovery of a particular type of DNA from a mixture is usually rather difficult.",
+      "Genomic DNA extraction  Genomic DNA was extracted by the mixed alkyl trimethyl ammonium bromide (MATAB) procedure. Briefly, 250 mg of plant material was ground in liquid nitrogen and immediately incubated in 2 ml of pre-warmed extraction buffer (100 mM Tris-HCl, pH 8, containing 20 mM EDTA, 1.4 M NaCl, 2% (w/v) MATAB, 1% (w/v) PEG6000 (polyethylene glycol), 0.5% (w/v) sodium sulfite, 20% (w/v) Igepal CA630, 20% (w/v) lithium dodecyl sulfate, and 20% (w/v) sodium deoxycholate) at 74 \u00b0C for 20 min. After purification with 2 ml of chloroform:isoamylalcohol (24:1, v/v), DNA extracts were precipitated with 1.6 ml of isopropanol then resuspended in 1 ml of buffer (50 mM Tris-HCl, pH 8, containing 10 mM EDTA and 0.7 M NaCl). The extracts were purified on anion exchange columns (QIAGEN-tip 20) following the manufacturer's instructions (QIAGEN, Valencia, CA).",
+      "DNA extraction and enzymatic digestion  Total DNA was isolated from whole blood and separated blood subtypes using a Qiagen DNeasy Blood & Tissue Kit following the manufacturer instructions. After extraction, DNA was quantified by NanoDrop (Thermo Scientific NanoDrop products, Wilmington, DE). The isolated genomic DNA was enzymatically digested according to a previously described method. Briefly, DNA (3 \u03bcg) was first denatured by heating at 95 \u00b0C for 5 min and then chilling on ice for 2 min. Then, 1/10 volume of S1 nuclease buffer (30 mM CH3COONa, pH 4.6, 280 mM NaCl, 1 mM ZnSO4) and 100 units of S1 nuclease were added before the mixture (20 \u03bcL) was incubated at 37 \u00b0C for 16 h. Subsequently, after 1/10 volume of alkaline phosphatase buffer (50 mM Tris-HCl, 10 mM MgCl2, pH 9.0), 0.002 units of venom phosphodiesterase I, and 10 units of alkaline phosphatase were added, the solution was incubated at 37 \u00b0C for an additional 4 h followed by two extractions with an equal volume of chloroform. The aqueous layer was collected and lyophilized to dryness and then reconstituted in 100 \u03bcL water. About 30 \u03bcL of the obtained samples were then subjected to liquid chromatography-electrospray ionization-tandem mass spectrometry (LC-ESI-MS/MS) analysis.",
+      "  The conventional DNA extraction procedure involved the homogenization of single D. magna in 400 \u03bcl of sperm lysis buffer (100 mM Tris-HCl, pH 8; 500 mM NaCl; 10 mM ethylenediaminetetraacetic acid [EDTA], pH 8; 1% SDS; 2% mercaptoethanol) followed by RNase treatment (40 \u03bcg, 37\u00b0C for 1.5 h). The DNA was then extracted in phenol (pH 8) and chloroform:isoamyl alcohol (1:1). The DNA was finally precipitated by two volumes of ice-cold ethanol in the presence of 3 M sodium acetate (1/10 of the DNA volume) and was incubated at -80\u00b0C overnight. Precipitated DNA was harvested by centrifugation, dried in air, and the final pellet dissolved in sterile analytic grade water.",
+      "DNA extraction and quantification  DNA was extracted from whole organs by standard techniques (34) with emphasis on minimizing shearing or nicking of DNA as nicked DNA has been shown to be refractory to LX-PCR (35). DNA from the brain was extracted from the right hemisphere. Extracted DNA was resuspended in 10 mM Tris 1 mM EDTA (pH 8) (TE) and stored at 4\u00b0C. A number of samples were normalized for mtDNA content by dot blotting and hybridization with digoxigenin-labeled full-length mtDNA and densitometry. In cases where mtDNA quantification was not carried out, the DNAs were normalized by A260 of total DNA."
+    ],
+    [
+      "Using genetic markers, the pattern of inheritance can be tracked through families. For example, by analyzing a marker linked to the eye color gene in several generations, it is possible to determine from which grandparents a child has inherited its eye color alleles. More importantly, \ufb01nding a marker linked to a disease can lead to location of the faulty gene causing the disease. Finding the gene is very valuable in the search for the cure. The distance between two loci can be expressed either as physical or genetic distance.",
+      "  It is well known, however, that not all genomic markers are independent (Frazer et al., 2007). Genetic variation is often inherited in contiguous segments of DNA, such that there tends to be correlation between the inheritance of alleles at markers close to each other on the same chromosome. This genetic correlation is called linkage disequilibrium (LD), and, as a result, the effective number of independent tests (M_eff) conducted is less than the total number of markers (M). By effective number of tests, we mean the number of independent tests that would have to be conducted to lead to a null distribution for the minimum P-values that was approximately the same as that obtained when conducting tests that are necessarily correlated due to LD.",
+      "Genetic mapping is a powerful strategy that exploits genomic information to dissect complex traits into Mendelian loci (quantitative trait loci or QTL) and identifies genetic determinants that may lead to crop improvement. As marker density ceases to be a limiting factor [3], our ability to discover specific genetic determinants in a single mapping study depends upon the availability of populations with high genetic diversity and recombination density [4]. (Correspondence: marioenrico.pe@sssup.it; Institute of Life Sciences, Scuola Superiore Sant\u2019Anna, Pisa, Italy.)",
+      "This capacity allows samples to be placed into meaningful genetic groups that reflect evolutionary relationships (more stable, lower diversity markers), while simultaneously permitting high levels of strain resolution (high diversity markers). From a clinical perspective, markers that accurately reflect broad evolutionary relationships are valuable for comparing the genetic similarity of an isolate to isolates on a regional or global scale, whereas high-resolution markers are valuable for detailed epidemiological tracking in an outbreak. Variable-number tandem repeats (VNTRs) are genetic markers that can span a range of variability and, therefore, can capture genetic relationships on multiple scales (18\u201319).",
+      "Identifying the genetic loci that modulate a trait based on correlation between variation in phenotype and variation in genotype is the essence of genetic mapping. This first involves systematically genotyping a genetically diverse population using microsatellite or SNP markers. The phenotype of interest is then measured and its variability in the population assessed. A statistical test is then carried out to identify chromosomal regions that segregate with the trait and show linkage with the trait, i.e., identify genetic regions that have the same genotype among individuals with similar trait values but differ between individuals with dissimilar trait values.",
+      "Using genetic markers, the pattern of inheritance can be tracked through families. For example, by analyzing a marker linked to the eye color gene in several generations, it is possible to determine from which grandparents a child has inherited its eye color alleles. More importantly, \ufb01nding a marker linked to a disease can lead to location of the faulty gene causing the disease. Finding the gene is very valuable in the search for the cure. The distance between two loci can be expressed either as physical or genetic distance.",
+      "Genetic variation  For decades researchers used single markers to elucidate clinal differentiation and spatial variation in allele frequencies. This approach revealed multiple markers with variation that tracked the clines, including some with the same allele at higher frequency at the same latitude in the Northern and Southern hemispheres. Examples include alcohol dehydrogenase (Adh), \u03b1-glycerol-3-phosphate dehydrogenase (Gpdh), glucose-6-phosphate dehydrogenase (G6pd), esterase-6 (Est-6), octanol dehydrogenase (Odh), and 6-phosphogluconate dehydrogenase (Pgd) [30][31][32][33] (Table 1). Perhaps the most heavily explored locus in D. melanogaster has been Adh, the first step in the ethanol detoxification pathway. The Adh-F allele encodes high catalytic activity of ADH, but this increase in activity trades off with enzyme stability at higher temperatures [34,35]. Unsurprisingly, the Adh-F allele is found at a higher frequency in cooler high-latitude populations, and differentiation has occurred in parallel along clines in",
+      "In the case of genetic markers, this easily runs in the several hundreds to thousands. Moreover, the optimal subset of markers is heavily dependent on how these markers are combined, i.e. dependent on the optimal Boolean function . Altogether, one frequently has to rely on greedy search strategies that easily get stuck in local optima or near exhaustive searches that are computationally too expensive, especially when employed in permutation procedures required to assess statistical significance. Our solution to this problem hinges upon two observations.",
+      "GENE MAPPING  The opportunity to merge advances in molecular genetic technology with advances in statistical techniques expanded in earnest with the development of DNA markers such as restriction fragment length polymorphisms (Lander and Botstein, 1989). Research exploded in the past decade with the continued refinement of molecular technology yielding a variety of DNA markers, e.g., short tandem repeats (STRs) or microsatellites; variable number of tandem repeats (VNTRs); single nucleotide polymorphisms (SNPs), and gene expression microarrays or gene chips. A genetic marker is a measurable polymorphic sequence of DNA whose chromosomal location is known. Markers often have no known functional significance but are used as pointers to a particular chromosomal location. The logic of gene mapping technology is simple: Determine if there is a relationship between variability in a phenotype and variability in an anonymous DNA marker of known chromosomal location. If there is a relationship, it is taken as evidence that there is a gene that influences the trait at or near the marker.",
+      "Genetic drift. Genetic changes in populations caused by random phenomena rather than by selection. Genetic marker. A segment of DNA with an identifiable physical location on a chromosome whose inheritance can be followed. A marker can be a gene, or it can be some section of DNA with no known function.",
+      "  Biological characteristics indicating initial resiliency or susceptibility of an organism include genetic profiles. As noted above, genetic markers need to have a high prevalence in the population and have a reasonably strong effect on common population health outcomes, or have an interaction effect with other health-affecting mechanisms, to be candidates for inclusion in population studies. At the moment, the only known genetic marker of clear value in a population survey is the apolipoprotein E gene (APOE), although this is likely to change in the very near future. APOE allele status is clearly related to a number of major health outcomes in older populations which are reasonably well measured in population surveys: mortality, heart disease, and cognitive functioning (Albert et al., 1995b; Corder et al., 1993; Evans et al., 1997; Ewbank, 1997; Hofman et al., 1997; Hyman et al., 1996; Luc et al., 1994; Saunders et al., 1993). Both the prevalence of alleles indicating higher risk and the size of the effect are large enough to be of importance in explaining variability in currently studied health outcomes. APOE allele status has been shown to have independent effects on health outcomes and to interact with other life circumstances such as sex and race in its effect on health outcomes (Jarvik et al., 1995; Maestre et al., 1995; Payami et al., 1992). Incorporation of information on this genetic indicator could lead to increased knowledge of the interactive mechanisms of this genetic marker and other social and behavioral variables and thus clarify some of the mechanisms leading to population differentials in cognition, heart disease, and mortality.",
+      "  As described by Hermalin (1999), if genetic markers are modeled as part of an individual's physiological structure, they can provide controls for predisposing factors that affect more proximate mid-level markers of function as well as downstream health outcomes. This potential benefit of genetic information (i.e., its power in explicating the black box of Figure 11-1) may outweigh, or at least precede, its near-term potential for discovering genetic links to chronic disease. As discussed by Weiss (1998b), the situation with chronic disease differs from single locus disorders that are inherited following well-identified Mendelian rules. In general, we cannot expect to find relationships that are even as straightforward as the APOE links to cardiovascular and Alzheimer's disease. Variation across populations, difficulty in identifying a small enough area on the chromosome to search for disease-associated genes, and the problems inherent in identifying continuous outcomes with particular genes may limit finding the connections.",
+      "This capacity allows samples to be placed into meaningful genetic groups that reflect evolutionary relationships (more stable, lower diversity markers), while simultaneously permitting high levels of strain resolution (high diversity markers). From a clinical perspective, markers that accurately reflect broad evolutionary relationships are valuable for comparing the genetic similarity of an isolate to isolates on a regional or global scale, whereas high-resolution markers are valuable for detailed epidemiological tracking in an outbreak. Variable-number tandem repeats (VNTRs) are genetic markers that can span a range of variability and, therefore, can capture genetic relationships on multiple scales (18\u201319).",
+      "These variations provide a species the ability to adapt to environmental change (Liu and Cordes, 2004). DNA markers are among the most powerful tools for revealing genetic variations in organisms. Historically, many different types of markers have been used for aquaculture studies. Functional Genomics in Aquaculture, First Edition. Edited by Marco Saroglia and Zhanjiang (John) Liu. \u00a9 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. Table 2.1: A summary of characteristics of various molecular markers used in aquaculture species.",
+      "For instance, mapping of a trait or a phenotype would require polymorphic DNA markers such as microsatellites (SSRs) or single nucleotide polymorphisms (SNPs); expression pro\ufb01ling would require genome annotation information; microarray design would require sequence information of genes, etc. The objective of this chapter is to provide a general review of genomic resources needed, and currently present for aquaculture species, for functional genomics studies. Polymorphic DNA Markers The key factor behind the signi\ufb01cant differences at the level of individuals, species, and higher order of taxonomic groups is genetic variation (polymorphism).",
+      "Functional genomics:  The study of genes, their resulting proteins, and the role played by the proteins in the biochemical processes of the body. Gene: A unit of inheritance; a working subunit of DNA. Each of the 20 000 to 25 000 genes in the body contains the code for a specific product, typically a protein such as an enzyme. Gene expression: The process by which the coded information of a gene is translated into the structures present and operating in the cell (either proteins or ribonucleic acids). Gene markers: Landmarks for a target gene, either detectable traits that are inherited along with the gene or distinctive segments of DNA. Gene map: A description of the relative positions of genes on a chromosome and the distance between them. Genetic counseling: A short-term educational counseling process for individuals and families who have a genetic disease or who are at risk for such a disease. Genetic counseling provides patients with information about their condition and helps them make informed decisions. Genetic linkage maps: DNA maps that assign relative chromosomal locations to genetic landmarks (either genes for known traits or distinctive sequences of DNA, ie, genetic markers) on the basis of how frequently they are inherited together. Genetic testing: Examining a sample of blood or other body fluid or tissue for biochemical, chromosomal, or genetic markers that indicate the presence or absence of genetic disease. Genetics: The scientific study of heredity, how particular qualities or traits are transmitted from parents to offspring. Genome: All the genetic material in the chromosomes of a particular organism. Genome-wide: Descriptor that indicates that the entire breadth of the genome has been examined in a study (eg, a linkage or association study). Genome-wide studies do not resequence the entire genome but type (an increasingly large set of) markers distributed throughout the genome. Genomics: A \"scaled-up\" version of the science of genetics that investigates the structure and function of large sections of the genome simultaneously. Genotype: The actual genes carried by an individual (as distinct from phenotype, ie, the physical, bodily characteristics into which genes are translated). Haplotype: A way of denoting the collective genotype of a number of closely linked loci on a chromosome. Heritability (h2): For any trait, the proportion of the phenotypic variability resulting from genetic variance. Note that heritability does not indicate the degree to which a trait is \"genetic.\" Nor does a high h2 mean that the trait cannot be influenced by environment. A heritability significantly >0, however, can provide a rationale for further genetic and genomic study of a trait of interest. Heterozygous: Possessing 2 different sequences (ie, genotypes) of a particular gene, 1 inherited from each parent. High-throughput genotyping: In contrast to the older labor- and time-intensive genotyping methods, high-throughput genotyping makes use of robots, computers, and other evolving technologies, thus enabling laboratories to type up to hundreds of thousands of polymorphisms in many samples in a relatively short period of time. Homozygous: Possessing 2 identical sequences of a particular gene, 1 inherited from each parent. Interaction: The differing effect of 1 independent variable on the dependent variable, depending on the particular level of another independent variable. For example, there would be an interaction between the factors sex and treatment if the effect of treatment was not the same for male and female subjects in a drug trial. Linkage analysis: A gene-hunting technique that traces patterns of heredity in large, high-risk families in an attempt to locate a disease-causing gene mutation by identifying traits that are coinherited with it. Linkage disequilibrium: Two alleles at different loci that occur together on the same chromosome more often than would be predicted by chance alone. It is a measure of cosegregation of alleles in a population.",
+      "Source: Kearsey and Pooni (1996). Genetic maps consist of a series of markers or identifiable features at known, or perhaps best described as estimated, locations on the genome (see Figure 9). For some discrete traits, simple Mendelian inheritance is followed and the phenotype has a one to one correspondence with the genes controlling it. These are so called morphological markers, which were then related to continuous or quantitative traits of interest. Examples are shape, colour, size or height in particular varieties of peas, as studied by Mendel. For another example, see Appendix A.2.",
+      "Genomic markers used in linkage mapping have evolved from restriction fragment length polymorphisms (RFLPs) to microsatellites (simple sequence repeat polymorphisms; SSRPs), to single-nucleotide polymorphisms (SNPs), with the more modern markers exhibiting higher frequencies in the genome (thus ensuring fuller coverage). Linkage mapping of a trait is in fact the demonstration of linkage between the phenotype and a genomic marker, followed by an inference of linkage between the genomic marker and the responsible DNA variant. Transitive logic ties the phenotype with the DNA variant, which is of course the point of the exercise. See Fig.",
+      "However, because of time constraints it is often more practicable to choose an appropriate mapping population that is already available through the current stock centers. Plant species chosen for study will depend largely on the availability of suitable plant resources. Obtain appropriate mapping population information to include information on markers/genotypes (see Note 4). A marker is an identifying factor; a gene or other DNA of known location that is used to track the inheritance and so on of other genes whose exact location is not yet known.",
+      "The closer two genes are together on a chromosome, the less likely it is for a recombination event to occur between the two, causing a non-random association. This is the basis for genetic linkage. The development of genetic markers allowed the theory of linkage disequilibrium (LD) to be used in mapping genes. Genetic markers are specific genetic differences between species or cultivars, and genetic linkage of these markers to particular morphological traits can allow genetic markers to be used to represent the gene of interest (Collard et al., 2005)."
+    ],
+    [
+      "The association between a complex phenotypic trait and genetic markers on the chromosomes can be detected through statistical analysis, leading to the identification of quantitative trait loci (QTL)\u2014regions of the chromosomes that appear to be associated with the phenotype. Quantitative trait loci (QTL) are expected to be associated with the genes controlling some aspects of the phenotype.",
+      "Nowadays many different cost-efficient genotyping solutions (including sequencing and Single Nucleotide Polymorphisms arrays) have opened the way to systematic genome-wide fine mapping of quantitative traits (Quantitative Trait Locus or QTL mapping). The process of QTL mapping (Figure 1) consists in searching for genome regions that influence the value of a given trait. For example, identifying a QTL for plant height means finding a DNA region at which the plants that carry a certain allele tend to be significantly higher or lower than those carrying another allele.",
+      "QTLs are regions within the genome whose genetic variation modulates quantitatively a phenotype characteristic of the particular trait under study (Lynch and Walsh, 1998). Determining the association between variations in specific disease phenotypes or a trait, with variations in genotypes of a reference population can be used to locate a QTL. One of the methods used for mapping QTLs associated with complex traits is genetic marker-trait association. Genetic markers associated with certain loci can be inherited in linkage disequilibrium. Generating populations with linked loci in disequilibrium is achieved through either crosses between inbred lines, or use of outbred populations.",
+      "Often, the first step in analysis of new trait data is single-marker regression across all chromosomes. A hypothetical QTL is evaluated at the location of each marker locus, and the significance of that QTL is estimated from a likelihood ratio statistic (LRS) (Haley and Knott, 1992). For this analysis, WebQTL automatically does a permutation test to establish genomewide significance criteria for the trait (Churchill and Doerge, 1994).",
+      "One possible approach to facilitate this endeavor is to identify quantitative trait loci (QTL) that contribute to the phenotype and consequently unravel the candidate genes within these loci. Each proposed candidate locus contains multiple genes and, therefore, further analysis is required to choose plausible candidate genes. One of such methods is to use comparative genomics in order to narrow down the QTL to a region containing only a few genes. We illustrate this strategy by applying it to genetic findings regarding physical activity (PA) in mice and human.",
+      "Elucidation of the molecular basis of these traits has proven difficult as they are under the control of multiple genes and genetic loci. The standard approach to gene identification involves mapping by linkage analysis in experimental crosses, and this has led to the localization in the rat genome of hundreds of quantitative trait loci (QTLs) underlying trait variation (68). We refer to these loci as physiological quantitative trait loci (pQTLs).",
+      "Often, the first step in analysis of new trait data is single-marker regression across all chromosomes. A hypothetical QTL is evaluated at the location of each marker locus, and the significance of that QTL is estimated from a likelihood ratio statistic (LRS) (Haley and Knott, 1992). For this analysis, WebQTL automatically does a permutation test to establish genomewide significance criteria for the trait (Churchill and Doerge, 1994). By default, it returns a list of marker loci that show greater than suggestive association with the trait according to standard criteria (Lander and Kruglyak, 1995), but it will also accept user-defined criteria. Local maxima in the LRS in this list identify loci that are most likely to be near QTLs. WebQTL provides this list within a few seconds.",
+      "QTLs can be identified through their genetic linkage to visible marker loci with genotypes that can be readily classified [94, 97]. As such, markers that are genetically linked to a quantitative trait will segregate more often with trait values, whereas unlinked markers will lack an association with the phenotype [94, 98]. The principal goal of a QTL analysis is to identify all QTLs linked to a trait and discern whether phenotypic differences are mainly due to a few loci with large effects, or many loci with small effects [98].",
+      "The association between a complex phenotypic trait and genetic markers on the chromosomes can be detected through statistical analysis, leading to the identification of quantitative trait loci (QTL)\u2014regions of the chromosomes that appear to be associated with the phenotype. Quantitative trait loci (QTL) are expected to be associated with the genes controlling some aspects of the phenotype.",
+      "The basic principle of classic QTL is trait segregation along with the markers, which necessitates the availability of two or more genetically different lines corresponding with the phenotypic trait. Markers like single nucleotide polymorphisms (SNPs) and microsatellites are used for genotypic distinctions (Vignal et al., 2002). QTL mapping is achieved in four basic steps; the first one is the measurement of variation for a trait in the individuals. It is a prerequisite to have the traits that show phenotypic variability among the individuals (inbred strains).",
+      "Often, the first step in analysis of new trait data is single-marker regression across all chromosomes. A hypothetical QTL is evaluated at the location of each marker locus, and the significance of that QTL is estimated from a likelihood ratio statistic (LRS) (Haley and Knott, 1992). For this analysis, WebQTL automatically does a permutation test to establish genomewide significance criteria for the trait (Churchill and Doerge, 1994). By default, it returns a list of marker loci that show greater than suggestive association with the trait according to standard criteria (Lander and Kruglyak, 1995), but it will also accept user-defined criteria. Local maxima in the LRS in this list identify loci that are most likely to be near QTLs. WebQTL provides this list within a few seconds.",
+      "Often, the first step in analysis of new trait data is single-marker regression across all chromosomes. A hypothetical QTL is evaluated at the location of each marker locus, and the significance of that QTL is estimated from a likelihood ratio statistic (LRS) (Haley and Knott, 1992). For this analysis, WebQTL automatically does a permutation test to establish genomewide significance criteria for the trait (Churchill and Doerge, 1994).",
+      "Quantitative Trait Locus (QTL) mapping. To map QTL, we used 934 AXB/BXA genetic informative markers obtained from http://www.genenetwork.org. For all the in vitro measurements and gene expression linkage analysis, a genome-wide scan was performed using R/qtl [57]. Significance of QTL logarithm-of-odds (LOD) scores was assessed using 1000 permutations of the phenotype data [114] and the corresponding p-values reported. For the cellular phenotypes, QTL significance was reported at a genome-wide threshold corresponding to p < 0.05.",
+      "Typically one may obtain a location known to derive from only one of the two parent strains that contains a chromosomal region that correlates with a trait of interest. Since the actual gene and gene product will frequently remain unknown, the region is referred to as a quantitative trait locus (QTL), and is simply named for the trait itself (Alberts & Schughart, 2010). Growing sets of strain-dependent marker locations in established RI strains are continually updated in online repositories.",
+      "By definition, a quantitative trait locus is a chromosomal region that contains a gene, or genes, that regulate a portion of the genetic variation for a particular phenotype (Wehner et al. 2001). The goal of QTL mapping is to identify regions of the genome that harbour genes relevant to a specified trait. QTL map locations are commonly determined by initial screening of mice with specific genetic characteristics, such as recombinant inbred strains, the F2 of two inbred strains, or recombinant congenic strains (Flint 2003).",
+      "Often, the first step in analysis of new trait data is single-marker regression across all chromosomes. A hypothetical QTL is evaluated at the location of each marker locus, and the significance of that QTL is estimated from a likelihood ratio statistic (LRS) (Haley and Knott, 1992). For this analysis, WebQTL automatically does a permutation test to establish genomewide significance criteria for the trait (Churchill and Doerge, 1994).",
+      "QTL linkage studies are conducted in order to map a region or regions of the genome which affect a continuous or quantitative trait. In agriculture, as soon as markers linked to QTL are found for economically important traits, these markers can be used for selecting individuals in breeding programmes. In human studies, the aim is often to identify markers indicating disease susceptibility. Current techniques for measuring markers are usually relatively slow and laborious. Newer DNA technology, such as SNP or single nucleotide polymorphisms (Kwok, 2001b; Patil et al.",
+      "Genomic regions linked to complex traits can be identified by genetic mapping and quantitative trait locus (QTL) analysis (Shehzad and Okuno 2014). QTL mapping with molecular markers is the first strategy in genetic studies. In plant breeding, QTL mapping is an essential step required for marker-assisted selection (Mohan et al. 1997; Shehzad and Okuno 2014). The fundamental idea underlying QTL analysis is to associate genotype and phenotype in a population exhibiting a genetic variation (Broman and Sen 2009).",
+      "Four steps of QTL mapping are (1) development of a population, (2) genotyping the population using molecular markers, (3) phenotyping the population for a trait of interest, and (4) QTL analysis using statistical procedures to find markers linked to the QTL (Bernardo 2002). Populations used for genetic mapping can be a segregating population (F2 and backcross) or a permanent population (double haploids or recombinant inbred lines). Recombinant inbred lines (RILs) are developed by selfing of individual progenies of the F2 plants until homozygosity is achieved (F7-F8).",
+      "This tool allows systems genetic analysis of single genes or small sets of genes using a bottom-up approach. relations define quantitative trait loci (QTLs). Because the marker is not typically the actual site of the polymorphism, interpolative methods have been developed to estimate the distance of the QTL from the marker and the strength of the association. Using multiple-regression and model-fitting methods, the true complexity of the phenotypic variation can be modeled through the consideration of multiple loci and environmental factors as predictors [13]."
+    ],
+    [
+      "To overcome the lack of phenotypic information in the 1000 Genomes Project, The ClinSeq Project was developed to pilot large-scale genome sequencing for research in genomic medicine at the National Institutes of Health Clinical Research Center in Bethesda, MD. 40 The study seeks to enroll 1000 individuals who will be evaluated for personal health status and family history. The project aims to:",
+      "We (Hein, Schierup and Wiuf) have published a 300 page book on molecular population genetics titled \u201cGene Genealogies, Sequence Variation and Evolution\u201d Oxford University Press, and are presently developing a tutorial in association mapping that we hope to publish as a booklet in 2006 and are also involved in a very large EU collaboration (Holland, Denmark, Iceland and UK) to find susceptibility genes for breast and prostate cancer. In comparative genomics, the most fundamental investigation is to find genes in a pair of aligned genomes.",
+      "Key bioinformatic steps to take a genetic study from an initial linkage or association to laboratory genotyping are illustrated. The reader should note the role of genomic sequence as a common thread through every stage  regions in man (see Chapter 5). Similar issues also exist in the establishment of true orthology between genes in different species, where one is identified to play a role in a disease model. If two genes are truly orthologous, their evolution closely follows patterns of speciation (Fitch, 2000).",
+      "In general terms, the approaches we describe can be applied to sequence data from any collection of organisms, but our emphasis here is primarily on questions of relevance to human genetics. We begin, in Section 6.2, by presenting an overview of genome structure and content, providing a context for the subsequent discussions. (Bioinformatics for Geneticists, Second Edition. Edited by Michael R. Barnes. 2007 John Wiley & Sons, Ltd. ISBN 978-0-470-02619-9 (HB), ISBN 978-0-470-02620-5 (PB). Chapter 6, Comparative Genomics.)",
+      "4 Assembling a View of the Human Genome Colin A. M. Semple Bioinformatics, MRC Human Genetics Unit, Edinburgh EH4 2XU, UK  4.1 Introduction The miraculous birth of the draft human genome sequence took place against the odds. It was only made possible by parallel revolutions in the technologies used to produce, store and analyse the sequence data, and by the development of new, large-scale consortia to organize and obtain funding for the work (Watson, 1990). The initial flood of human sequence has subsided as the sequencing centres have sequenced genomes from other mammalian orders and beyond.",
+      "This fully indexed but semi-intelligible \u2018book of life\u2019 immediately began to serve as a valuable framework for integration of genetic and biological data. However, knowledge of the genome sequence did not immediately clarify the nature and structure of human genetic variation.",
+      "Methods for DNA sequencing are constantly being improved, with the ultimate goal of sequencing a human genome in a single day for a cost of about US $1,000, an end that appears to be in sight (Hayden, 2014). In the very near future, whole-genome sequencing will be routinely available for clinical purposes, perhaps even beginning at birth. The major challenge ahead is the interpretation of this information. How do our genes interact with each other, and how does the environment contribute to the development of health and disease? What are the individual and societal implications of knowing our genome sequence? The answers to these and other important questions will unfold in the years ahead. Thus, we are truly in an era where precision medicine may soon become a reality.",
+      "  Characteristics of genotyping and sequencing technologies",
+      "Key bioinformatic steps to take a genetic study from an initial linkage or association to laboratory genotyping are illustrated. The reader should note the role of genomic sequence as a common thread through every stage  regions in man (see Chapter 5). Similar issues also exist in the establishment of true orthology between genes in different species, where one is identified to play a role in a disease model. If two genes are truly orthologous, their evolution closely follows patterns of speciation (Fitch, 2000).",
+      "In general terms, the approaches we describe can be applied to sequence data from any collection of organisms, but our emphasis here is primarily on questions of relevance to human genetics. We begin, in Section 6.2, by presenting an overview of genome structure and content, providing a context for the subsequent discussions.",
+      "4 Assembling a View of the Human Genome Colin A. M. Semple Bioinformatics, MRC Human Genetics Unit, Edinburgh EH4 2XU, UK  4.1 Introduction The miraculous birth of the draft human genome sequence took place against the odds. It was only made possible by parallel revolutions in the technologies used to produce, store and analyse the sequence data, and by the development of new, large-scale consortia to organize and obtain funding for the work (Watson, 1990). The initial flood of human sequence has subsided as the sequencing centres have sequenced genomes from other mammalian orders and beyond.",
+      "Introduction: Since the first human genome was sequenced at an estimated cost of $150 million, several advanced high-throughput techniques \u2013 some with lower costs \u2013 have come up. At the same time, this resulted in a data deluge and a critical need to connect the heterogeneous sequencing data and associated annotations \u2013 structural and functional \u2013 with the basic tenets of biology or molecular basis of development and disease.",
+      "Key bioinformatic steps to take a genetic study from an initial linkage or association to laboratory genotyping are illustrated. The reader should note the role of genomic sequence as a common thread through every stage  regions in man (see Chapter 5). Similar issues also exist in the establishment of true orthology between genes in different species, where one is identified to play a role in a disease model. If two genes are truly orthologous, their evolution closely follows patterns of speciation (Fitch, 2000).",
+      "In general terms, the approaches we describe can be applied to sequence data from any collection of organisms, but our emphasis here is primarily on questions of relevance to human genetics. We begin, in Section 6.2, by presenting an overview of genome structure and content, providing a context for the subsequent discussions.",
+      "4 Assembling a View of the Human Genome Colin A. M. Semple Bioinformatics, MRC Human Genetics Unit, Edinburgh EH4 2XU, UK  4.1 Introduction The miraculous birth of the draft human genome sequence took place against the odds. It was only made possible by parallel revolutions in the technologies used to produce, store and analyse the sequence data, and by the development of new, large-scale consortia to organize and obtain funding for the work (Watson, 1990). The initial flood of human sequence has subsided as the sequencing centres have sequenced genomes from other mammalian orders and beyond.",
+      "  Ample time was allotted to answer questions and a copy of \"A Guide to Your Genome\" (National Human Genome Research Institute 2007) was provided to further assist participants' understanding and ability to communicate results with family members or others.",
+      "Whether within 10 or 12 (or 8) years, such inexpensive sequencing will change both research and clinical care, and progress does not need to wait even that long. The National Human Genome Research Institute (NHGRI) plans to focus a significant portion of the sequencing capacity that it supports on medical sequencing. For instance, the NHGRI and the National Cancer Institute are actively considering a Human Cancer Genome Project, 22 which would use DNA sequencing and a host of other genome technologies to gather information about the mutations and functional abnormalities found in multiple samples from many major types of cancer. Medical sequencing should also provide important insight into many other diseases. For example, sequencing all exons in X-linked mental retardation syndromes may reveal much about their etiology. Sequencing candidate genes in the extremes of the distribution of quantitative traits should also reveal much of importance about common diseases, such as coronary atherosclerosis. 23 With further technological advances, other previously unimaginable research approaches will become real.",
+      "Key bioinformatic steps to take a genetic study from an initial linkage or association to laboratory genotyping are illustrated. The reader should note the role of genomic sequence as a common thread through every stage  regions in man (see Chapter 5). Similar issues also exist in the establishment of true orthology between genes in different species, where one is identified to play a role in a disease model. If two genes are truly orthologous, their evolution closely follows patterns of speciation (Fitch, 2000).",
+      "In general terms, the approaches we describe can be applied to sequence data from any collection of organisms, but our emphasis here is primarily on questions of relevance to human genetics. We begin, in Section 6.2, by presenting an overview of genome structure and content, providing a context for the subsequent discussions.",
+      "4 Assembling a View of the Human Genome Colin A. M. Semple Bioinformatics, MRC Human Genetics Unit, Edinburgh EH4 2XU, UK  4.1 Introduction The miraculous birth of the draft human genome sequence took place against the odds. It was only made possible by parallel revolutions in the technologies used to produce, store and analyse the sequence data, and by the development of new, large-scale consortia to organize and obtain funding for the work (Watson, 1990). The initial flood of human sequence has subsided as the sequencing centres have sequenced genomes from other mammalian orders and beyond."
+    ],
+    [
+      "In birds, where erythrocyte telomere length (ETL) is measured, the majority of species sampled have shown no sex difference (36). Nonetheless, bird telomere dynamics are complex and, as with humans, may be affected by environment and stress. For example, a longitudinal study of black-tailed gulls (Larus crassirostris) over 2-5 years found no correlation between ETL and age or sex. Rather, ETL attrition was correlated with reduced food availability and environmental stressors (55). In a captive zebra finch (Taeniopygia guttata) population, male and female mean telomere length decreased with increasing age of the animals, but did differ between sexes (56). As these examples illustrate, the relationship between telomere length, lifespan, and sex is likely to be complex in other vertebrates.",
+      "Comparative studies of age-related telomere attrition in other species also reveal a variety of patterns. Barrett and Richardson (36) recently summarized the comparative data available on sex differences in telomere length. They found a strong correlation between male-biased mortality and either shorter telomeres or greater telomere attrition in males across bird and mammal taxa. However, telomere length did not differ between males and females in species where females are shorter-lived than males (36), suggesting that telomere shortening is not associated with species-specific longevity in a simple linear fashion. These studies generally suffer from relatively small sample sizes and are largely cross-sectional. Further, the use of diverse assays, different tissues (eg, leukocytes in mammals vs erythrocytes in birds), and lack of standardized benchmarks for accuracy makes comparisons between studies difficult.",
+      "In some organisms, there is no clear relationship between telomere length and lifespan. Age-related telomere attrition could not be detected in Daphnia pulex (57) or sea urchin species (Strongylocentrotus franciscanus and Lytechinus variegatus) (58). Studies in C. elegans examining natural variation in telomere length and experimentally manipulated telomere length detect no correlation with lifespan (59, 60), and in Drosophila, which uses a telomerase-independent mechanism for telomere maintenance, there is a similar lack of correlation between longevity and telomere length (61). Similarly, data on sex differences in age-related telomere shortening are mixed. For example, in the ant species Lasius niger, the rate of telomere shortening is more rapid in short-lived males compared to longer-lived females. But mean telomere length does not differ between the two types of females, queens and workers, despite the fact that queens live much longer than workers (up to 28 years vs 2-3 months) (62). These findings suggest that the questions of how telomere shortening affects aging across species and how sex affects telomere attrition rates are complex.",
+      "With new methodologies to assess relative telomere length by Q-PCR, studies were designed to address the impact of telomere length on aging, aging associated pathologies, and mortality. One such study has correlated shorter leukocyte telomere lengths at age 60 with a three times higher risk of heart disease and an eightfold increase in risk of infection-related death (36), thereby associating measured relative cellular aging with disease and life expectancy. In a similar way, chronic stress was shown to correlate with short leukocyte telomere length, a phenomenon attributed to higher levels of oxidative stress at the cellular level (70). More recent studies have linked telomere length in smooth muscle cells with senescence and disease severity in patients with atherosclerosis (141, 150). Leukocyte telomere length was also short in a cohort of similar patients and associated with a higher risk of developing occult cardiovascular disease (71). More data are needed to understand and validate the use of leukocyte telomere length as a biomarker for cardiovascular and other diseases.",
+      "Shortening of the telomeres at the ends of chromosomes has been associated with age-related disease and mortality [16][17][18]. A recent study identified a common haplotype of four SNPs in the human telomerase reverse transcriptase gene (hTERT) that is enriched in centenarians and associated with longer telomere length [19]. It was also shown that centenarians and their offspring maintain longer telomeres compared with controls and that longer telomeres are associated with protection from age-related diseases, better cognitive function and lipid profiles of healthy ageing [19].",
+      "New research has indicated how social factors, such as subordination, may translate into biological effects (Epel et al. 2004; Chae et al. 2014). In a now classic study, Epel et al. (2004) examined the telomere lengths of fifty-eight healthy premenopausal women who either had a healthy child (n = 19) or were giving care to a chronically ill child (n = 39). They measured perceived stress, years of caregiving, telomere length, and oxidative stress. They found highly statistically significant differences in telomere length between women taking care of chronically ill children and those who had healthy children. They found highly statistically significant negative correlations between telomere length and perceived stress and years of caregiving. Telomerase activity had highly statistically significant negative correlations with perceived stress and years of caregiving. Oxidative stress was highly positively correlated with perceived stress and years of caregiving. They concluded that the telomere length shortening was equivalent to 9 to 17 years of aging in the high stress group. Telomere length is considered a biomarker of aging (Finch and Kirkwood 2000). Thus, this study showed that caregiver stress had essentially aged these women 9 to 17 years compared to women who had healthy children.",
+      "The single, consistent predictor of the rate of telomere attrition shown in multiple adult and the few child longitudinal studies is the baseline measurement of telomere length at the start of each study. This suggests the importance of understanding predictors of telomere length prior to adulthood, as it determines in part the rate of change (Revesz et al. 2014a, b; Nordfjall et al. 2009). Moreover, longitudinal studies in adults have found that telomere attrition rate is dependent on baseline telomere length independent of any phenotypic predictors of shortening, such as disease or demographic variables (Nordfjall et al. 2009), attesting to the importance of studies to evaluate risk factors for shortening prior to adulthood.",
+      "Rates of decline in childhood may be particularly relevant for later chronic disease risk as shorter telomere length has been implicated in disease progression through exposure to cellular senescence, inflammatory cytokines and adipocyte hypertrophy (Raschenberger et al. 2015; Willeit et al. 2014; Monickaraj et al. 2012; Fyhrquist et al. 2013).",
+      "Adult studies have also found a negative correlation with baseline telomere length, suggesting a negative feedback regulation of leukocyte telomere length (Farzaneh-Far et al. 2010; Aviv et al. 2009; Epel et al. 2008; Nordfjall et al. 2009). It is possible that while our follow-up period was shorter than Shalev et al. 2013 and adult studies, which had a minimum of 5 year intervals with the exception of Puterman et al. (2015) who followed for a one-year time period, there may be biological regulation of telomere length at 4 and 5 years of age such that shorter telomeres are more robustly maintained, whereas longer telomeres have greater rates of decline, over a short period of one year. It is unlikely that this relationship is due to assay error or regression to the mean given the consistency of our findings across studies. We have had similar findings of longer telomeres having greater rates of decline and shorter telomeres being maintained in our different studies (Farzaneh-Far et al. 2010; Epel et al. 2008; Puterman et al. 2015).",
+      "We found primarily maintenance and lengthening from 4 to 5 years of age in children, with minimal telomere attrition, indicating that most of the telomere loss happens in the first 4 years, plateauing by age 4. Lastly, we found close to 10 % of the variance in rate of change in children shared by mothers. While some of this shared variance is genetic, there are likely environmental factors that need to be further identified that impact rate of telomere length change.",
+      "  Abstract Telomeres are the protective complexes at the end of chromosomes, required for genomic stability. Little is known about predictors of attrition in young children or the relationship between parental and child patterns of telomere change. Telomere length was assessed twice over one year, at 4 and at 5 years of age, in Latino preschool children (n = 77) and their mothers (n = 70) in whole blood leukocytes. Maternal and child rates of attrition during the same time period were compared in 70 mother-child pairs. More children showed lengthened telomeres over one year compared to their mothers and very few children showed attrition (2.6 %). Approximately 31 % of children and 16 % of mothers displayed lengthening over one year while 66 % of children showed maintenance in contrast with 74 % of mothers. The strongest predictor for child telomere length change was child's baseline telomere length (r = \u22120.61, p < 0.01). Maternal rate of change was associated with child rate of change (r = 0.33, p < 0.01). After controlling for child baseline telomere length, the relationship between child and maternal rate of change trended towards significance (Coeff = 0.20, 95 % CI \u22120.03 to 0.43; p = 0.08).",
+      "  Blackburn and Epel, a health psychologist who did original research on how specific lifestyle and psychological habits can protect telomeres, published The Telomere Effect (Blackburn & Epel, 2017), in which they suggested that individuals with shorter telomeres developed diseases earlier in life (a shorter \"disease span\").What follows is the evidence from these authors, their colleagues, and other researchers describing how length of telomeres contributes to mind-body connection and healthy longevity.",
+      "  As early as at the time of birth, each of the 92 telomeres of the human genome has its own characteristic length. Additionally, each telomere shortens by its individual attrition rate. In general, longer telomeres at birth are associated with higher age-dependent attrition rates and vice versa. Overall, telomere shortening appears more dynamic in males.",
+      "  In conclusion, a combination of overall and chromosome-specific shorter telomeres and more pronounced age-dependent telomere erosion could be observed in males. There is a prospective clinical study strongly suggesting that longer telomeres decrease the risk of dying (Cawthon et al., 2003). With this in mind, the telomere length discrepancies between the sexes may indeed be a factor influencing the differences in their life expectancy.",
+      "  In every chromosome a linear decline of telomere length with age was observed, being more pronounced in men independent of the examined chromosome arm. This might suggest that telomere length on single chromosome arms may be influenced by the same factors which determine overall telomere length.",
+      "  S. Mayer (a), S. Br\u00fcderlein (a), S. Perner (a), I. Waibel (a), A. Holdenried (a), N. Ciloglu (a), C. Hasel (a), T. Mattfeldt (a), K.V. Nielsen (b), P. M\u00f6ller (a); (a) Institute of Pathology, University of Ulm, Ulm (Germany); (b) DakoCytomation A/S, Glostrup (Denmark). follow uniformity. In previous studies, sex-specific differences in telomere length and attrition rate of men and women were found (Benetos et al., 2001; Cawthon et al., 2003; Nawrot et al., 2004), suggesting gender differences in behavior of telomeres. In individual chromosome arms, telomere length was also shown not to be homogeneous (Lansdorp et al., 1996; Benn, 1997; Martens et al., 1998; Surralles et al., 1999; Hao and Tan, 2001; Londono-Vallejo et al., 2001; Graakjaer et al., 2003), some telomeres being significantly shorter, others longer than the average length.",
+      "  To date, these characteristics in telomere lengths could not be set in a biological context, as only a few groups have provided detailed information about chromosome-specific patterns of telomere distribution (Lansdorp et al., 1996; Graakjaer et al., 2003). Whether accumulation of short telomeres (Martens et al., 2000; Londono-Vallejo et al., 2001) or rather the shortest telomere of one specific chromosome arm (Hemann et al., 2001) elicits senescence, remains an open question so far.",
+      "  In recent literature, there are hints that the average telomere length may be higher in women and that their annual shortening rate may be somewhat lower (Vaziri et al., 1993; Rufer et al., 1998; Jeanclos et al., 2000), but these reported differences failed to reach statistical significance except for one study (Jeanclos et al., 2000). Here, we provide compelling evidence that this is indeed the case.",
+      "  It is generally accepted that telomeres shorten during DNA replication both in vitro and in vivo. In individuals, short telomeres are considered to be a sign of advanced age. Cawthon and coworkers (2003) showed that telomere shortening in humans likely contributes to mortality, supporting the hypothesis that they might act as a mitotic clock (Allsopp et al., 1992). Telomere length dynamics, however, does not seem to follow uniformity. Abstract. During aging, telomeres are gradually shortened, eventually leading to cellular senescence. By T/C-FISH (telomere/centromere-FISH), we investigated human telomere length differences on single chromosome arms of 205 individuals in different age groups and sexes. For all chromosome arms, we found a linear correlation between telomere length and donor age. Generally, males had shorter telomeres and higher attrition rates. Every chromosome arm had its individual age-specific telomere length and erosion pattern, resulting in an unexpected heterogeneity in chromosome-specific regression lines. This differential erosion pattern, however, does not seem to be accidental, since we found a correlation between average telomere length of single chromosome arms in newborns and their annual attrition rate. Apart from the above-mentioned sex-specific discrepancies, chromosome arm-specific telomere lengths were strikingly similar in men and women. This implies a mechanism that arm-specifically regulates the telomere length independent of gender, thus leading to interchromosomal telomere variations.",
+      "  Shortening of the telomeres at the ends of chromosomes has been associated with age-related disease and mortality [16][17][18].A recent study identified a common haplotype of four SNPs in the human telomerase reverse transcriptase gene (hTERT) that is enriched in centenarians and associated with longer telomere length [19].It was also shown that centenarians and their offspring maintain longer telomeres compared with controls and that longer telomeres are associated with protection from age-related diseases, better cognitive function and lipid profiles of healthy ageing [19]."
+    ],
+    [
+      "  Selection could occur at multiple levels, from germ cell generation and propagation to fertilization and early embryonic growth.Chromosomal abnormalities, including aneuploidy, were found in 10-20% of spermatozoa and oocytes (20) and in the cleaved embryo, with a 21% rate of abnormalities in preimplantation embryos (21).These findings led to a model for natural selection against chromosome abnormalities (21).Selection extends to the end of gestation: Only approximately 30% of all conceptions result in a live birth, with more than half of aborted fetuses containing chromosomal abnormalities (22), a number likely to be an underestimate because of technological limitations in measuring all possible mutations.But even in the very small fraction of germ cell duos that survive this withering genome attack and result in a live birth, a number of severe de novo mutations will still be found (23).The data on gross chromosomal alterations suggest that overall, mutation frequency early in life is very high.The functional consequence, however, is limited because of selection.Somewhat surprisingly, this picture points toward an initial decline in genomic alterations, allowing the adult individual to acquire a somatic genome optimally equipped to provide function.",
+      "The phenotype of the F1 hybrids is compared to those of the parental inbred strains to reveal dominance or semi-dominance relationships between the alleles that affect the phenotype. Phenotypic differences between reciprocal F1 hybrids indicate that one or more of the following factors may affect the trait: (1) sex linkage (X- or Y-linked traits), (2) genomic imprinting of QTLs that affect the phenotype, (3) prenatal maternal effects (effects of intrauterine environment), and/or (4) postnatal maternal or paternal effects (effects of maternal and/or paternal parenting behaviour on offspring).",
+      "Sex brings harmful alleles together into the same genetic background, allowing selection to more efficiently purge them from the population and potentially producing some offspring that are fitter than either parent. However, the benefit of recombining deleterious mutations may depend on the nature of the epistatic interactions between them. The mutational deterministic hypothesis (Kondrashov 1988) depends partly on this epistasis.",
+      "In most plants and animals, sex is a necessary component of reproduction, and the question for evolutionary biologists is why reproductive mechanisms have evolved that way. In one of the experiments described next, evolutionary geneticists have nevertheless devised a way to compare evolution with and without recombination in the obligately sexual fruit fly.",
+      "This disparity in investment is the basis for the twofold cost: asexual females hypothetically could transmit twice as many alleles at the same cost. In most plants and animals, mates tend to be unrelated, leading to outcrossing. But sex usually also involves the basic process of physical recombination: the breakage and reunion of two different DNA or RNA molecules. Of these two processes, recombination is clearly the more widespread feature of sexual reproduction. A variety of reproductive systems, such as selfing and automixis, involve recombination but not outcrossing. In contrast, relatively few reproductive systems have outcrossing without recombination.",
+      "  Crossing over-The swapping of genetic material that occurs in the germline.During the formation of egg and sperm cells, also known as meiosis, paired chromosomes from each parent align so that similar DNA sequences from the paired chromosomes cross over one another.Crossing over results in a shuffling of genetic material and is an important cause of the genetic variation seen among offspring.This process is also known as meiotic recombination.",
+      "  The reason for the rarity of these mutations is natural selection: If the mutations result in disorders that decrease health and reproductive fitness, they will eventually be eliminated from a population.In exceptional cases, mutations may cause both beneficial and detrimental consequences, resulting in opposing forces of positive selection and negative selection that may cause the mutations to be preserved at nonrare frequencies in a population.For example, the HbS mutation in the HBB gene (which produces the \u03b2 subunit of hemoglobin) causes sickle cell disease when present in both alleles, a detrimental consequence, but protects against malaria when present in 1 allele, a beneficial consequence, ensuring that the mutation persists in populations in areas of the world where malaria is endemic.Genes are passed from parents to offspring via the process of meiosis by which gametes, the egg cells in the mother and the sperm cells in the father, are generated.Ordinarily, each cell has 23 pairs of chromosomes; the gametes have 23 unpaired chromosomes.In meiosis, the 23 pairs are split so that each gamete receives 1 chromosome from each pair (Figures 8 and 9).Two gametes (egg and sperm) ultimately join into a single cell, the zygote, which has the full complement of 23 chromosome pairs restored.If all goes well, the zygote gives rise to a live offspring.",
+      "  Recombination (meiotic recombination)-The swapping of genetic material that occurs in the germline.During the formation of egg and sperm cells, also known as meiosis, paired chromosomes from each parent align so that similar DNA sequences from the paired chromosomes recombine with one another.Recombination results in a shuffling of genetic material and is an important cause of the genetic variation seen among offspring.Also known as crossing over.",
+      "  In the generation of gametes, crossing over regularly occurs, and genetic information is swapped between members of a chromosome pair.That doesn't matter within inbred animals, because the swapped parts are identical.In an F 1 animal, however, the chromosomes of a particular pair are genetically different, one each having come from each parent.Each gamete produced will be unique, as will be each F 2 zygote formed by uniting of the gametes from two F 1 parents.An F 2 group thus provides for expression of some genetic variability.This variability is limited to the allelic differences existing between the parent strains of the F 1 s, so that another F 2 , derived from different inbred strains, will express different genetic differences.",
+      "Sex brings harmful alleles together into the same genetic background, allowing selection to more efficiently purge them from the population and potentially producing some offspring that are fitter than either parent. However, the benefit of recombining deleterious mutations may depend on the nature of the epistatic interactions between them. The mutational deterministic hypothesis (Kondrashov 1988) depends partly on this epistasis.",
+      "In most plants and animals, sex is a necessary component of reproduction, and the question for evolutionary biologists is why reproductive mechanisms have evolved that way. In one of the experiments described next, evolutionary geneticists have nevertheless devised a way to compare evolution with and without recombination in the obligately sexual fruit fly.",
+      "This disparity in investment is the basis for the twofold cost: asexual females hypothetically could transmit twice as many alleles at the same cost. In most plants and animals, mates tend to be unrelated, leading to outcrossing. But sex usually also involves the basic process of physical recombination: the breakage and reunion of two different DNA or RNA molecules. Of these two processes, recombination is clearly the more widespread feature of sexual reproduction. A variety of reproductive systems, such as selfing and automixis, involve recombination but not outcrossing. In contrast, relatively few reproductive systems have outcrossing without recombination.",
+      "  Aberrant recombination patterns on chromosomes that have missegregated have also been identified as an important factor, in both male and female gametes (Table I).This is because recombination together with cohesion of sister chromatids establish the unique 'bivalent' chromosome structure where homologous partner chromosomes are tethered together, a configuration that is critical for their accurate segregation in meiosis I (Fig. 2A).The remarkable feature is that recombination occurs in foetal oocytes whereas chromosome segregation takes place decades later (Fig. 2A).Since mammalian oocytes are arrested at the G2/M transition (or dictyate stage), this raises the intriguing question of how the bivalent is maintained until the meiotic divisions.",
+      "Traditionally, it has been agreed that the final sex of an individual (phenotypic sex) depends on two sequential processes: the sex determination system of the species and the gonad differentiation process (Valenzuela, 2008). However, recently, these two seemingly distinct processes are viewed as part of a general process leading to gonad formation and sex ratios (Sarre et al., 2004; Quinn et al., 2011; Uller and Helantera\u0308, 2011).",
+      "However, we expect that only at this level, the most significant contributions brought by integrating epigenetics will be made. Concluding Remarks and Future Prospects Fish sex ratios are the result of a complex interaction between genetic, biochemical, and environmental interactions. The ultimate result of these interactions at the individual level is gender: male or female. However, at the population level, the combination of sex determination and differentiation sets the sex ratio. In turn, sex ratios define the reproductive capacity of populations and, if sex growth dimorphism exists, also the growth characteristics, something very important in an aquaculture context.",
+      "Obehav is, in turn, influenced by offspring genes and environment (Ogene and Oenvir respectively). Hence, indirect genetic effects (blue arrows) and direct genetic effects (red arrow) are important influencers of behaviour. B) Parent-offspring conflict theory predicts that parental resource investment and offspring solicitation behaviours are influenced by the fitness benefit to a focal individual (O), cost to a social partner such as a sibling (S1 and S2) or parent (P), and by their coefficient of relatedness (black arrows). Figure 2: Genomic imprinting can result in divergent phenotypes from the same genotype. A) A paternally imprinted gene, i.e. maternally expressed.",
+      "Because of the small contribution, through the sperm, of the paternal transcriptome to the fertilized zygote, and because of the stronger maternal contribution to child rearing in most model organisms, parental effects are typically thought of as synonymous with maternal effects, although true paternal effects are known to exist (Rando, 2012). Maternal effects have been shown to be important during embryonic development, leading to differences in the birth weight of mice depending on the genotype of the mother (Cowley et al., 1989; Wolf et al., 2011).",
+      "Therefore, the resulting phenotypic patterns lag a generation behind the genetic transmission of the causal variants. The most well-studied parental genetic effects are caused by deposition of maternal transcripts into the egg prior to fertilization, resulting in differences in early embryonic development depending on the genotype of the mother. Certain genes have also been shown to respond to maternal influence after birth through genetically defined maternal behaviors (Weaver et al., 2004).",
+      "The phenotype of the F1 hybrids is compared to those of the parental inbred strains to reveal dominance or semi-dominance relationships between the alleles that affect the phenotype. Phenotypic differences between reciprocal F1 hybrids indicate that one or more of the following factors may affect the trait: (1) sex linkage (X- or Y-linked traits), (2) genomic imprinting of QTLs that affect the phenotype, (3) prenatal maternal effects (effects of intrauterine environment), and/or (4) postnatal maternal or paternal effects (effects of maternal and/or paternal parenting behaviour on offspring).",
+      "It was believed by many that for each trait variant we should expect to find a corresponding genetic change, or \"gene for\" that trait. Through historical happenstance the relationship between genes and traits was set up and treated as if it were one-to-one. But the production of a trait involves not only genes, but also their interactions with each other and the environment, and chance."
+    ],
+    [
+      "distinguishing prenatal from postnatal maternal effects, see below). Maternal effects can account for a large proportion of phenotypic variance, especially during early life, and for some traits explain more variation than direct genetic effects [33, 97, 99, 100, 102\u2013115]. However, maternal and offspring genotype are correlated (i.e. half their genes are shared), and in inbred lines they are fully confounded, thus separating the effects of their respective genotypes is difficult. To remove this confounding effect cross-fostering has been used, both in the laboratory and in the field [119, 131].",
+      "Using genetic markers, the pattern of inheritance can be tracked through families. For example, by analyzing a marker linked to the eye color gene in several generations, it is possible to determine from which grandparents a child has inherited its eye color alleles. More importantly, finding a marker linked to a disease can lead to location of the faulty gene causing the disease. Finding the gene is very valuable in the search for the cure. The distance between two loci can be expressed either as physical or genetic distance.",
+      "  Although autosomal SNPs are commonly used as genetic markers to infer ancestry or race/ethnicity membership, haploid markers such as mitochondrial, Y-DNA, and X-linked markers are also important to provide separate stories of ancestry of individuals from paternal and maternal sides [42,43]. Therefore, genetic structure created due to autosomal markers could be different from those of lineage markers (often influenced by political, social, and migration history of individuals/populations). Mitochondrial DNA or mtDNA haploid is the maternally inherited mitochondrial genome (mtDNA) [44]. All children inherit mtDNA from their mother, with no admixture from the father. Like Y-line DNA, mtDNA is passed intact from one generation to the next but through the maternal line.",
+      "  a) Autosomal DNA (testing both sexes) markers: autosomal DNA tests utilize DNA from the 22 pairs of autosomal chromosomes. Autosomal DNA is inherited from both parents. Autosomal testing provides percentages of ethnicity using autosomal DNA SNP test (i.e., ancestry informative markers), and it is the most commonly used test to infer ancestry across diploid genome. b) Y-DNA or Y-SNPs (paternal line testing) markers: a haploid Y-DNA is the paternally inherited non-recombining portion of the Y chromosome, and it tests only for males. The Y-DNA testing tests the Y chromosome which is passed intact from father to son with no DNA from the mother. Y-DNA testing can then be used to trace direct paternal line. Y-DNA remains the same in each generation, allowing us to compare surnames from different regions to see if we are from the same family. Y-line testing does not indicate anything about the contributions of the other ancestors in a family tree. In other words, you could be 3/4th Native American, with only the direct paternal line being European, and this test would tell you nothing at all about those other three Native lines. When testing the Y-chromosome, there are two types of tests, short tandem repeat (STR) and SNP markers. STR tests are best for recent ancestry while SNP tests tell about more ancient ancestry. c) Mitochondrial DNA (maternal line testing) markers:",
+      "  Additional information about past breeding practices can be gleaned by quantifying the number of reproductive males and females in a population.This can be achieved by comparing levels of genetic diversity between sex chromosomes, autosomes and mtDNA 99 .In cattle, for example, gene flow from aurochs is evident in the autosomes but is absent in mtDNA 41 .This has been interpreted as a management strategy that may have involved allowing insemination of domesticated females by wild bulls 41,100 .In horses, a comparison of the levels of diversity of the Y chromosome and the autosomal chromosomes demonstrated that some cultures allowed fewer males to breed and instead selected specific stallion bloodlines 55 .This male-oriented breeding strategy was not practised by the Romans and only became increasingly prominent in the past 1,000 years as a result of the growing influence of Oriental stallions (Arabian, Persian and Turkmen) 101 .",
+      "  Dr Ring: What makes the maternal gene so peculiar compared to the paternal?Dr Cookson: If you look in the epidemiologic sense, many studies show that there is increased risk of allergic disease if the mother is affected.However, very few studies have actually set out to test that formally and most of them might suffer from some sort of selection bias because the mother is more likely to be aware of her symptoms and feel guilty, and so on.It is very difficult to explain.Is it genomic imprinting, where the gene is only active when transmitted through the mother?I do not think all of these genes would be imprinted, though it is possible.It also seems that there are effects of the maternal phenotype.The maternal phenotype, if the mother is affected or unaffected, determines the strength of the maternal effect.Again, if a gene was imprinted, you would not expect maternal phenotype to be important.So, I think that this has something to do with maternal/fetal interaction, either through the placenta or shortly after birth.There is the issue of immune conflict between mother and child.At the same time, the mother is trying to prime the infant's immune system.",
+      "Genetic and Genomic Discovery Using Family Studies. Ingrid B. Borecki, PhD; Michael A. Province, PhD. Genetic studies traditionally have been performed on sets of related individuals, that is, families. Mendel's early studies in sweet peas (Pisum sativum) on the inheritance patterns of discrete traits from parents with specific mating types to offspring has shed light on the basic mechanisms of inheritance, including the fundamental laws of segregation of discrete factors (genes) from parents to offspring and the cosegregation of genes that are closely located on a chromosome (linkage). The distribution of traits within families exhibited mathematical segregation ratios in offspring from known mating types. These expected segregation ratios have been used as an important discovery tool in the study of human diseases in pedigrees, providing evidence for a multitude of single-gene disorders. Furthermore, in some cases, trait cosegregation with genetic markers with known positions provides mapping information that enables localization and, ultimately, identification of the relevant causative gene.",
+      "In fact, this idea has been pursued before in the context of signatures of reproductive isolation and shown to reveal patterns consistent with epistatic gene interactions that arise in the shape of Dobzhansky-Muller incompatibilities [10,11]. In contrast to the mouse data, the available human genotypes were derived from outbred, ethnically distinct populations. In this case pairs of functionally interacting genes can be detected following a slightly different approach.",
+      "Family Structure. The first re-identification method (FAMILY) employs genealogical data accompanying genomic data. Genealogies, rich in depth and structure, permit the construction of complex familial relationships. Consider a simple family structure of two parents and one child. Since the parental genders are guaranteed, there exist 2 variants of this structure, since the child's gender is either male or female. When disease status is taken into account, it is represented as a Boolean variable: an individual is either afflicted or not afflicted. In this aspect, all three family members can be represented as three attributes {Father, Mother, Child}, and there exist (father's disease status)*(mother's disease status)*(child's disease status)*(child's gender) = 2*2*2*2 = 16 possible family-disease combinations. In reality, pedigrees are much more robust than a simple nuclear family. For example, a three-generation family of two children per family permits on the order of 10^5 distinct variants of the family-disease structure and 10^6 individuals that could be uniquely characterized. The number of combinations is larger when supplementary information, such as living status or medical/genetic features, is considered. The ability to determine unique family structures is only one part of the re-identification process. These structures must be linked to identifiable information, which, in many instances, is publicly available in the form of various genealogical databases. These databases are accessible both offline and via the World Wide Web. For example, genealogical records are available in many public databases, including <Ancestry.com>, <Infospace.com>, <RootsWeb.com>, <GeneaNet.com>, <FamilySearch.org>, and <Genealogy.com>. From such data, it is not difficult to construct family structures and, with such information in hand, an adversary can link disease-labeled family structures to named individuals.",
+      "  Fig. 3. Illustrations of the three CEU pedigrees (black) showing how genetic information from distant patrilineal relatives (arrow; red, patrilineal lines) can identify individuals.Filled squares represent sequenced individuals.To respect the privacy of these families, only abbreviated versions are presented.The sex of the CEU grandchildren was randomized.The numbers of grandchildren are not given.",
+      "  When I was in high school, I remember often trying to match my friends to their parents at various school functions and being surprised at how easy this was.As human geneticists, in spite of the enormous advances being made in our field, we still cannot answer many of the everyday questions that we are asked, such as: \"Why does he look just like his mother? \"Max Perutz [1], in a recent editorial comment in the New Scientist entitled \"The Molecular Biology of the Future,\" suggested some questions, for, as he put it, \"an examination in some future century. \"Here are two of them: (1) \"The time has come\" the Walrus said, \"To talk of many things ...And why the sea is boiling hot And whether pigs have wings. \"Calculate the amount of genetic information this would require in megacricks.",
+      "Using genetic markers, the pattern of inheritance can be tracked through families. For example, by analyzing a marker linked to the eye color gene in several generations, it is possible to determine from which grandparents a child has inherited its eye color alleles. More importantly, \ufb01nding a marker linked to a disease can lead to location of the faulty gene causing the disease. Finding the gene is very valuable in the search for the cure. The distance between two loci can be expressed either as physical or genetic distance.",
+      "In contrast, genomic imprinting is due to epigenetic changes within the individual causing differential gene expression characterized by either complete or partial silencing of one parental allele (Barlow, 2011; Abramowitz and Bartolomei, 2012; Ashbrook and Hager, 2013). As both mothers and fathers had contact with the pups in our study, our observed PGEs could come from either parent. Among quantitative USV traits only peak amplitude of call displayed a possible parent-of-origin effect. For call number, call duration, mean peak frequency, and all morphological traits, there were no significant parent-of-origin effect in reciprocal F1 females. In contrast, Thornton et al.",
+      "  Another way of avoiding stratification is to use family-based samples.This approach has several theoretical advantages: as well as being immune to stratification 114 , these samples can be used to determine whether an allele has different effects on disease when it is inherited maternally or paternally 115 , and DISCORDANT SIB designs [116][117][118] can control for the effects of shared environment.Furthermore, more complex family-based designs are possible 119 that might allow combined association and linkage analysis 120 , and family-based association tests have also been developed for quantitative traits [94][95][96][97][98] .However, pure sibship-based association studies are underpowered relative to case-control studies 107,116,117 , and the requirement for living parents might introduce an age-of-onset bias towards younger patients for diseases that usually arise late in life.Furthermore, family-based samples are often much more difficult to collect, particularly if larger pedigrees are sought.Finally, the most commonly used family-based design, the TRANSMISSION DISEQUILIBIRIUM TEST (TDT; see REF. 114) is susceptible to technical artefacts (see below).",
+      "  Because mtDNA is not subjected (as far as we know) to sexual recombination and crossover at the time of nuclear meiosis, nature must call on other means to ensure that inevitable germ plasm mtDNA mutations (Medvedev, 1981) are not transmitted. These mutations among primary oocytes, on the face of it, can be expected to increase with time, that is with maternal age. Empirical data on this question are incomplete and conflicting, being mostly confined so far to searches for deletions rather than point mutations (Chen et al., 1995; Keefe et al., 1995). It is inevitable, however, that there will be such mutations and that there must therefore be a reliable physiological mechanism (a) for giving an opportunity for back-mutations to occur, (b) for selecting in favor of those back-mutations (thus preserving the genome) and in favor of rare advantageous mutations, and (c) for preventing the spread of persistent harmful mutations through the population, mutations that are too slight (or too late in origin) to have escaped intraovarian culling. The sheer conservation of the mitochondrial genome over 0.5 billion years or more, despite a mutation rate estimated at 10-20 times that of nuclear DNA, is ample reason to conclude that such a physiological purification process must exist.",
+      "To scrutinize the polygenic networks underlying complex diseases, however, mouse resources that are optimized to study the actions of isolated genetic loci on a fixed background will be insufficient on their own. For example, predisposition to the metabolic syndrome is inherited in a non-Mendelian fashion stressing genetic heterogeneity and multigenetic pathogenesis (Nandi et al. , 2004). With the reawakening as to the extraordinary genetic resources and phenotypic diversity archived in extant inbred strains, however, a foundation is in place for tracking down these complex traits and quantitative trait loci (QTL).",
+      "Otherwise, tens of thousands of markers will appear significant in the genome-wide association studies using up to one million genetic markers. Approaches to control for stratification include using self-report of ancestry or genetically derived principal components in the analysis. For studies using inbred mouse lines, a cladogram, which is a hierarchical grouping based on phylogenetic analysis of strain relatedness, can be created to subdivide inbred strains into more genetically homogeneous subgroups.",
+      "  Although bilateral descent is the norm in Western societies, it is not universal and there is variation with cultural practices around lineage. In certain societies, individuals place greater importance on (and have greater knowledge about) one side of the family than another (unilineal descent). Thus, individuals in patrilineal groups trace relationships through males only so that your father's brother's children are members of your family, but not your father's sisters (Kottak, 2007). They are members of their husband's group or family. Efforts to create a family pedigree may be hampered if the participant is not familiar with her mother's relatives, but her mother's brother's children (her cousins) may be able to supplement her overall family history. Knowledge about the cultural system of unilineal descent avoids assuming the universality of bilateral descent. Cultural beliefs such as these also have implications in the conduct of genetic research in terms of confidentiality and autonomy (Benkendorf et al., 1997; Wertz, 1997). One cannot assume that the named proband is in a position to speak for the extended family in agreeing to participate in any genetic research (DudokdeWit et al., 1997).",
+      "In particular in polygynous species, a female\u2019s offspring may have different fathers and are thus more closely related through the maternal than the paternal line. Therefore, any fitness cost to mothers, such as increased provisioning and care, affect maternally derived genes more strongly than paternally derived genes, leading to the silencing of the maternal copy (i.e. paternal expression) of genes that increase resource transfer. 5. Coadaptation between offspring and maternal traits The genetics of the co-evolution of parental and offspring traits has been investigated using quantitative genetics models and in several empirical studies (Agrawal et al.",
+      "In this scenario, genes expressed in parents will be selected for their effects on parental behaviour while genes expressed in offspring will be selected for their effects on influencing parental behaviour. At the genetic level the predicted conflict between paternal and maternal genomes is thought to have led to the evolution of genomic imprinting (monoallelic gene expression). Genomic imprinting effects are good examples of offspring genetic effects on maternal care because of the impact on the quality of maternal care and level of resource provisioning (e.g. Li et al. , 1999)."
+    ],
+    [
+      "Genetic mapping in mouse strains enhances the power of detecting modifier genes and identifying complex genetic interactions. Genomewide quantitative trait locus (QTL) analysis, as described in more detail below, represents a promising approach to detect genetic variants that are associated with specific phenotypes and interact with each other. In experimental crosses of two (inbred) strains the first generation (F1) of offspring is genetically heterozygous but equal. Then in the next generation (F2) the strain-specific genetic information is distributed across the genomes of their progeny and each offspring is genetically unique.",
+      "Second, and perhaps more important, is the difference in the size and types of the genetic reference populations. In our previous study, we mapped the QTL with 36 F2 mice that were genotyped at 82 markers. In the current study, by comparison, we were able to map QTLs after examining 342 mice from 55 strains that were genotyped at approximately 4000 markers.",
+      "This contrast can be exploited to identify subregions that underlie the trans-QTLs [67]. SNPs were counted for all four pairs of parental haplotypes\u2014B vs D, B vs H, B vs C, and L vs S\u2014and SNP profiles for the four crosses were compared (figure 6). Qrr1 is a highly polymorphic Figure 5. QTL for aminoacyl-tRNA synthetases in distal Qrr1.",
+      "The traditional approach to QTL mapping is to use two strains that differ maximally in the phenotype as parental strains for genetic crosses, with the following caveats. QTL analysis based on a single cross will most likely reflect only a small portion of the net genetic variation, and QTL detection will be limited to regions where the two progenitor strains have functional polymorphisms. Data from multiple crosses, or from an HS, will overcome this limitation and can also be used to reduce QTL intervals [5,30].",
+      "These candidate genes are then sequenced in the two parental inbred strains looking for sequence differences in coding or regulatory regions. After fine mapping the QTL interval and shortening the list of plausible candidate polymorphisms, the major challenge remains proving definitively which nucleotide polymorphism underlies the QTL. The most direct proof would be replacing one strain\u2019s allele with another strain\u2019s allele (creating a FIG. 1. Intercross breeding strategy for mapping quantitative trait loci (QTLs). On the right, the parental, F1 hybrid, and intercross (F2) mouse generations are depicted.",
+      "Furthermore, splicing QTLs (sQTLs) rather than eQTLs could comprise the molecular mechanism linking DNA variants with YFP53; thus, sQTL analysis could uncover genes that would not normally be detected at the level of differential gene expression (DGE),53 and thus, a differentially  Figure 8.5 Schematic for immediate, rapid fine mapping in select F2 recombinants of the RCC-F2 cross. Top panel: Genome-wide significant QTL (green trace; red dashed line = significance threshold; blue vertical lines = Bayes credible interval).",
+      "Interval-specific haplotype analysis Approximately 97% of the genetic variation between inbred mouse strains is ancestral [22], so regions of identity by descent (IBD) between two strains used to detect a QTL are highly unlikely to contain the causal genetic polymorphism underlying the QTL [28]. For example, a cross between C57BL/6J and A/J mice detected a blood pressure QTL on Chr 1 [7].",
+      "Interval-specific haplotype analysis Approximately 97% of the genetic variation between inbred mouse strains is ancestral [22], so regions of identity by descent (IBD) between two strains used to detect a QTL are highly unlikely to contain the causal genetic polymorphism underlying the QTL [28]. For example, a cross between C57BL/6J and A/J mice detected a blood pressure QTL on Chr 1 [7].",
+      "At present, the BXD panel is composed of 80 different strains that all have been fully genotyped.26 Variation in any quantifiable trait can be associated with the segregation of parental alleles, and linkage genetics can map this variation to quantitative trait loci (QTLs), thereby identifying the genomic region(s) affecting that trait. An overview of the QTL mapping approach is depicted in Figure 2. Classical QTL analysis has permitted the identification of loci that are associated with variation in HSC traits.",
+      "In general, linking genetic variation with trait variation identifies QTL and a significant linkage of phenotype and genotype suggest that the DNA status helps to determine trait expression. As stated above, mouse QTL studies provide distinct advantages over human studies in the examination of genetic causes of a quantitative trait (e.g. alcoholism), even in the absence of specific hypotheses regarding its aetiology or candidate genes.",
+      "The progenitor mouse strains should have sufficient variation for the traits of interest and they should be genetically diverse enough to enable genetic mapping (BENNETT et al. 2006; FLINT 2003; GRISEL 2000). The sample size required for the identification of QTL depends largely on the effect size that a QTL contributes to phenotypes of interest. Inference about QTL can be made if one or more genetic markers are over- or underrepresented in the analysed individuals. Genotyping is often done by means of microsatellite markers, which contain mono-, di-, tri-, or tetranucleotide tandem repeats flanked by specific sequences (Figure 4a).",
+      "This comparison gives information about the reliability of the observed genotype information: The more the marker locations differ between the two maps (which signifies variation in marker positions), the higher the possibility of genotyping errors. QTL mapping was done in several stages to identify loci acting individually and QTL that interacted, either additively or epistatically. To determine individually-acting QTL, a single-QTL genome scan was conducted with the function scanone.",
+      "Importantly, whereas these studies required substantial labor, time, and resources, X-QTL is a quick and easy approach to achieve a comparable level of genetic dissection. The levels of complexity observed here (e.g. 14 loci explaining 70% of the genetic variance for 4-NQO resistance) are still dramatically lower than those seen for some human traits in GWAS (e.g. 40 loci explaining 5% of the variance for height 2,5). One obvious explanation is the difference in experimental designs (line crosses vs. population association studies), but differences in genetic architectures among species and traits may also contribute.",
+      "The method uses two pieces of information: mapping data from crosses that involve more than two inbred strains and sequence variants in the progenitor strains within the interval containing a quantitative trait locus (QTL). By testing whether the strain distribution pattern in the progenitor strains is consistent with the observed genetic effect of the QTL we can assign a probability that any sequence variant is a quantitative trait nucleotide (QTN). It is not necessary to genotype the animals except at a skeleton of markers; the genotypes at all other polymorphisms are estimated by a multipoint analysis.",
+      "The method uses two pieces of information: mapping data from crosses that involve more than two inbred strains and sequence variants in the progenitor strains within the interval containing a quantitative trait locus (QTL). By testing whether the strain distribution pattern in the progenitor strains is consistent with the observed genetic effect of the QTL we can assign a probability that any sequence variant is a quantitative trait nucleotide (QTN). It is not necessary to genotype the animals except at a skeleton of markers; the genotypes at all other polymorphisms are estimated by a multipoint analysis.",
+      "Genotyping all the individual progeny for markers that show allelic variation between the parental strains (either single nucleotide polymorphisms or simple sequence repeats) will allow the detection of associations between trait values and marker genotype, and in this way demonstrate to which set of markers a QTL is linked. To reduce the genotyping effort, selective genotyping of the individuals at the extremes of the phenotypic spectrum can be performed (20,23). Although these three approaches are in general considered to be the best to detect and map QTL, they have several disadvantages for quantitative traits involving HSC.",
+      "So, how do you go about planning and performing a QTL study, and how do you identify the responsible gene within a QTL that you have identified? Generally, one starts by performing a strain survey to find two parental inbred strains that have a markedly different trait. One can now look up many different traits of inbred mice online at the Mouse Phenome Database (http://phenome.jax.org/pub-cgi/phenome/mpdcgi?rtn=docs/home). However, the trait you may want to study may not be present in wild type mice, so you may want to cross a mutant (or genetically engineered) strain onto several inbred strains.",
+      "QTL Theory and Planning The theory behind the most basic form of QTL mapping is based upon intercrossing two inbred strains. The mouse genome consists of 19 pairs of autosomes (non sex-determining chromosome) and the X and Y chromosomes. In the example shown in Fig. 18.1, we are intercrossing strain A (shown with a black chromosome pair) with strain B (shown with a white chromosome pair). The initial F1 (filial generation 1) mice are true hybrids, with each individual  From: Molecular Biomethods Handbook, 2nd Edition.",
+      "These candidate genes are then sequenced in the two parental inbred strains looking for sequence differences in coding or regulatory regions. After fine mapping the QTL interval and shortening the list of plausible candidate polymorphisms, the major challenge remains proving definitively which nucleotide polymorphism underlies the QTL. The most direct proof would be replacing one strain\u2019s allele with another strain\u2019s allele (creating a FIG. 1. Intercross breeding strategy for mapping quantitative trait loci (QTLs). On the right, the parental, F1 hybrid, and intercross (F2) mouse generations are depicted.",
+      "QTL mapping studies then seek to detect the polymorphisms underlying the complex traits of interest by scanning for alleles that co-vary with the traits. Similar experiments also can be conducted with special derivatives of inbred strains known as recombinant inbred (RI) mice. These animals are derived by cross-breeding two or more distinct parental strains (which often diverge widely for the trait of interest), followed by inbreeding of the offspring for several generations (Bailey 1971). Given the correct breeding strategy, this method 1  This is an issue faced by GWASs researchers when classifying samples as cases or controls."
+    ]
+  ]
+}
diff --git a/gnqa/paper1_eval/src/data/datasets/old/full_test_dataset.json b/gnqa/paper1_eval/src/data/datasets/old/full_test_dataset.json
new file mode 100644
index 00000000..8818869f
--- /dev/null
+++ b/gnqa/paper1_eval/src/data/datasets/old/full_test_dataset.json
@@ -0,0 +1,248 @@
+{
+  "question": [
+    "List genes related to asthma with their phenotype descriptions.",
+    "Which types of mouse studies are popular in genenetwork, where a type is an ailment, e.g. asthma, allergies, aging, heart failure, diabetes, glaucoma?",
+    "Do we find significant quantitative trait loci in the studies for diabetes?",
+    "What environmental factors affect mice that also influence diabetes?",
+    "List mouse studies that involve insulin.",
+    "What is the typical weight of a mouse liver?",
+    "How are C57BL/6 and DBA/2 different?",
+    "What behavioural QTLs have been found in mice?",
+    "Where do I look for genes that regulate hippocampal neural stem cell proliferation?",
+    "What does QTL mean?"
+  ],
+  "answer": [
+    "The text mentions two severe asthma clusters, SC2 and SC3, but does not specify individual genes. SC2 is characterized by high T2 inflammation, high FeNO, bronchoalveolar lavage and blood eosinophilia, and up-regulated expression of T2 signature and mast cell genes. SC3 is characterized by lower levels of T2 clinical biomarkers and gene expression, a later age of onset, increased incidence of nasal polyposis, and higher levels of neutrophilic inflammation.",
+    "Mouse studies popular in genenetwork often focus on ailments such as diabetes, glaucoma, obesity-associated diabetes, and metabolic diseases. These studies utilize genetically modified mouse models, inbred mouse models, and polygenic models. They also use specific strains like DBA/2J, Goto-Kakizaki (GK) rat, and KK mice.",
+    "Yes, the studies identified significant quantitative trait loci associated with both type 1 and type 2 diabetes. These include loci such as TCF7L2, DNER, SRR, HNF1A, KCNQ1, CDKN2A/CDKN2B, JAZF1, CDC123, CAMK1D, TSPAN8, LGR5, THADA, ADAMTS9, NOTCH2, and others. However, the significance of some loci varied across different populations and studies.",
+    "Several environmental factors can affect mice and influence diabetes. These include the stress response, which can be greater in male mice and can confound feeding behavior, the unique gut microbiota of rodents from different production facilities, cage position, single versus group housing, the skill level of the researcher, ambient room temperature, and the type of cage bedding. Diet, particularly high-fat feeding, can also induce obesity and diabetes. Additionally, the intrauterine environment can impact the subsequent development of diabetes amongst offspring.",
+    "1. Diabetes incidence study where mice were tested for diabetes monthly by blood glucose and weekly by urine assessment.\n2. Studies involving transgenic/knockout animal models of type 2 diabetes to study the role of genes and their effects on peripheral insulin action.\n3. Study involving male type-2 diabetic db/db mice and genetic control non-diabetic db/ mice.\n4. Study involving young adult mice fed a high-fat or Western diet to elicit DIO and insulin resistance.\n5. Study involving Akita mouse genotyping.\n6. Study involving 12-week-old male type 2 diabetic db/db mice and contemporary control wild-type mice.\n7. Study involving transgenic mice to create specific models of type 1 and type 2 diabetes.\n8. Study involving AKITA mice derived from a C57BL/6NSlc mouse with a spontaneous mutation in the insulin 2 gene.\n9. Study monitoring mice for the development of diabetes.",
+    "The typical weight of a mouse liver can vary, but in the context provided, it can weigh up to 4.5 grams in a 40 gram mouse and 1.2 grams in a 20 gram normal mouse.",
+    "C57BL/6 and DBA/2 mice differ in several ways. C57BL/6 mice are more accurate in tasks with greater demands and make fewer impulsive responses. They also have larger brain weights and are more resistant to radiation-induced genomic instability. DBA/2 mice, on the other hand, make fewer omission errors, respond more vigorously to certain influenza viruses, and perform better in less spatially demanding tasks. They also have larger cortical areas when adjusted for brain weight and are more sensitive to acute nephrotoxicity. Genetic differences between the two strains affect various responses including activity, memory, learning tasks, and responses to alcohol and opioids.",
+    "Several behavioral QTLs have been found in mice. These include QTLs affecting aggressive behaviors, which have been identified in strains such as the Turku aggressive (TA) and non-aggressive (TNS) strains, the NC900 and NC100 strains, and the short attack latency (SAL) and long attack latency (LAL) strains. Other QTLs have been associated with traits like anxiety-related behavior, seizure, hippocampal volume, and alcohol preference. Additionally, QTLs related to alcohol, morphine or other drugs, painful arthritis, emotionality/anxiety, and seizure susceptibility have been mapped. There are also QTLs that influence exploratory behavior in mice.",
+    "You can look for genes that regulate hippocampal neural stem cell proliferation in the results of gene expression profiling studies, such as the one mentioned in the text. Specifically, the study identified LRP6 as a novel regulator of adult hippocampal neurogenesis. Other genes like Lef1, Mtdh, and Chchd8 may also be involved.",
+    "QTL stands for Quantitative Trait Locus. It is a region of the genome that is linked to a trait and contains a gene or genes that regulate a portion of the genetic variation for a particular phenotype. QTLs are associated with quantitative traits, which are traits that have measurable phenotypic variation due to genetic and/or environmental influences."
+  ],
+  "contexts": [
+    [
+      "  We present an analytical technique designed to test a priori defined gene sets (for example, pathways) for association with disease phenotypes. We apply this method to gene expression profiles of human diabetic muscle, identifying a set of genes whose expression is correlated with insulin resistance and aerobic capacity. These results suggest hypotheses about pathways contributing to human metabolic disease and, more generally, show the value of incorporating information about functional relationships among genes in the analysis of microarray data.",
+      "  Pathway and gene ontology analysis for select phenotypes and environmental factors showing GxE interactions.",
+      "  Next, the genes that correlated with FeNO (n = 549) were used to objectively cluster asthma subjects into subgroups. In agreement with Moore et al., most of the severe asthma patients clustered into 2 subject clusters (SCs) (SC2 and SC3). One severe asthma cluster (SC2) had high T2 inflammation, as evidenced by a high FeNO, bronchoalveolar lavage and blood eosinophilia, and up-regulated expression of T2 signature and mast cell genes. The other severe asthma cluster (SC3) had lower levels of T2 clinical biomarkers and gene expression, in addition to a later age of onset, increased incidence of nasal polyposis and higher levels of neutrophilic inflammation. Roughly 1/2 of all asthma subjects had evidence of high T2 inflammatory response (by clinical biomarkers and gene expression), confirming the prior findings of Woodruff et al. in a more severe and steroid-treated patient population. In general, both severe asthma clusters (SC2 and SC3) were older and more obese than the other non-severe subclusters. Further, both of the severe SCs demonstrated suppression of genes associated with cilia function, neuronal function, cell adhesion and wound repair. These findings suggested that airway epithelial defense, repair, neuronal function are an integral part of a healthy epithelial layer and perhaps prevention of severe asthma.",
+      "These genes are high priority candidates, although we acknowledge that causal variants may lie in non-coding regions. For each of these high priority candidates we then examined which GO:biological processes (Consortium, 2015) and KEGG pathways (Kanehisa et al. , 2012) the gene was annotated as being part of, and highlighted those which may relate to our phenotypes. We also reviewed known effects of mutations using the Mouse Genome Informatics (MGI) Phenotypes, Alleles and Disease Models Search (www.informatics.jax.org/allele) (Bello et al. , 2015).",
+      "Results were displayed as a matrix with all phenotypes/diseases associated with mouse models and human genes found for the candidate gene list. 2.6. Expression-phenotype correlations For each gene discovered after filtering, an adequate probe within the well-curated INIA Amygdala Cohort Affy MoGene 1.0ST (Mar11) RMA, Hippocampus Consortium M430v2 (Jun06) PDNN, VCU BXD Prefrontal Cortex M430 2.0 (Dec06) RMA, INIA Hypothalamus Affy MoGene 1.0ST (Nov10), and INIA Adrenal Affy MoGene 1.0ST (Jun12) RMA Databases was identified using GeneNetwork (http://www.genenetwork.org; Williams and Mulligan, 2012).",
+      "  The GeneNetwork website contains extensive phenotypic datasets ranging from behavioral to morphological to pharmacological. To identify phenotypes associated with Gsto1 variation, we queried the BXD phenotype database in GeneNetwork, which contains nearly 3000 phenotypes, to look for the phenotypes that are most closely related to hippocampal expression of Gsto1 (probe set 1416531_at).",
+      "  To examine known causal genes that have been reported in the literature, including related genes and pathways, a gene list was generated consisting of 6264 genes categorized by disorders, pathways, expression, AmiGO terms, and other into 26 sublists (supplemental data). This list was manually collected from different database sources covering all aspects of insulin- and glucose-related genes and disorders. This was done through an extensive literature review using PubMed, Ovid\u00ae, GeneCards\u00ae, and the National Center for Biotechnology Information (NCBI). Gene and protein expression databases such as BioGPS and The Human Protein Atlas were used. Protein interactions and gene network databases, such as AmiGO, BioGRID, GIANT, KEGG, and Reactome, were also used. Knockout mouse databases, such as MGI and IMPC, were also used. However, filtering against the gene list will not replace the manual screening for all variants called; therefore, we did not consider the results of our gene list alone. Once the raw data were obtained, they were filtered and investigated individually. As shown in Fig. 1, mutations went through serial steps ending up with a single nucleotide polymorphism mutation as a potential explanation. Pathogenicity scores were determined by SIFT, PolyPhen-2, PROVEAN, and PhD-SNP.",
+      "Chesler, E. J., Wang, J., Lu, L., Qu, Y., Manly, K. F., and Williams, R. W. (2003). Genetic correlates of gene expression in recombinant inbred strains: a relational model system to explore neurobehavioral phenotypes. Neuroinformatics 1, 343\u2013357. doi:10.1385/NI:1:4:343. Denny, J. C., Ritchie, M. D., Basford, M. A., Pulley, J. M., Bastarache, L., Brown-Gentry, K., et al. (2010). PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations. Bioinformatics 26, 1205\u20131210. doi:10.1093/bioinformatics/btq126. Farrar, C. A., Zhou, W., and Sacks, S. H. (2016). Role of the lectin complement pathway in kidney transplantation. Immunobiology 221, 1068\u20131072. doi:10.1016/j.imbio.2016.05.004. Gene Ontology Consortium (2015).",
+      "Exploring genes, molecules, and phenotypes is easily accomplished using GeneNetwork. In this manuscript we will outline some simple use cases, and show how a small number of plausible candidate genes can be identified for an immune phenotype. 1. Data Once you have navigated to genenetwork.org, there are two ways to search for data in GN. The first is to use the global search bar located at the top of the page (Figure 1). This is a new feature in GN that allows researchers to search for genes, mRNAs, or proteins across all of the datasets.",
+      "Protein interaction data: There is a growing body of protein-interaction data and this data is a useful extension to inferences of functional interaction between disease gene candidates and co-expressed genes. Ontologies for Functional Annotation: This project will lead to a small subset of genes of interest for asthma and AD. Ontologies are key in making automated and vocabulary controlled statements about function and it will be interesting to interface the analytical framework presented in the proposal with contemporary advances in gene ontology methodology.",
+      "A network or interaction model will be generated using methods of graphical modelling with both inhouse data and public databases to propose predictive models for epithelial cells and characterise critical molecular interactions within asthma and AD biology. Finally, supporting and extending methodologies from above will contribute to (E) Future Directions of the study and include interfacing and data exchange with contemporary public databases. D(a) Disease Association and eQTL Mapping Mapping the human genome for regions and positions that are responsible for disease susceptibility and differential gene expression is central to this project.",
+      "For example, time series data sets potentially capture relationships and dependencies of gene expression within and between time points which may suggest causative co-regulation. These dependencies and interactions could be better uncovered using statistical modelling approaches such as Bayesian model based methods that aim to identify co-expressed clusters of genes under a model of temporal dependence between observations, that is utilising gene expression measures in time to better judge cluster membership11,12. Secondly, the asthma and AD expression dataset of sibpairs inherently contains underlying structures of shared genetic disease risk.",
+      "Genes are arranged based on their genetic positions, and genes annotated to be involved in the module are colored red. Genes with absolute GMAS over 0.268 are considered significantly associated. DDT, BOLA3, and ARID1A are labeled. B, Venn diagram of novel genes associated with respiratory electron transport module in human, mouse and rat. 707 genes were predicted to be mito-proteins by G-MAD in all three species.",
+      "Chesler, E. J., Wang, J., Lu, L., Qu, Y., Manly, K. F., and Williams, R. W. (2003). Genetic correlates of gene expression in recombinant inbred strains: a relational model system to explore neurobehavioral phenotypes. Neuroinformatics 1, 343\u2013357. doi:10.1385/NI:1:4:343. Denny, J. C., Ritchie, M. D., Basford, M. A., Pulley, J. M., Bastarache, L., Brown-Gentry, K., et al. (2010). PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations. Bioinformatics 26, 1205\u20131210. doi:10.1093/bioinformatics/btq126. Farrar, C. A., Zhou, W., and Sacks, S. H. (2016). Role of the lectin complement pathway in kidney transplantation. Immunobiology 221, 1068\u20131072. doi:10.1016/j.imbio.2016.05.004. Gene Ontology Consortium (2015).",
+      "Exploring genes, molecules, and phenotypes is easily accomplished using GeneNetwork. In this manuscript we will outline some simple use cases, and show how a small number of plausible candidate genes can be identified for an immune phenotype. 1. Data Once you have navigated to genenetwork.org, there are two ways to search for data in GN. The first is to use the global search bar located at the top of the page (Figure 1). This is a new feature in GN that allows researchers to search for genes, mRNAs, or proteins across all of the datasets.",
+      "6  Phenotype-matched reports  7  The framework implementation we have presented uses only genomic information to generate a patient or research report. Of course, the clinical features of the sample o\ufb00er vital clues as to which gene is likely responsible for the disease. It would therefore make sense to include phenotype-based gene \ufb01ltering or prioritization to the report. To make this possible, associations of Human Phenotype Ontology (HPO) terms[292] to their known disease genes could be integrated into the system. Users can enter HPO terms that match the phenotypes observed in a patient to shorten their list of candidate genes.",
+      "Predicted transcriptome association test  We used the PrediXcan 16 framework to identify genes that might mediate associations between genetic variants and asthma risk. PrediXcan is a software tool that estimates tissue-specific gene expression profiles from an individual's SNP genotype profile by use of prediction models trained in large reference databases of genotypes and tissue-specific gene expression profiles. With these genotype-imputed expression profiles, PrediXcan can perform gene-based association tests that correlate predicted expression levels with phenotypes (eg, asthma) to identify candidate causal genes from GWAS data. We used a summary version of PrediXcan, which has high concordance with the individual-level version (r\u00b2>0.99). 17 For predictions, we downloaded elastic net models trained with reference transcriptome data from the Genotype-Tissue Expression consortium 18 for 49 tissues (appendix pp 9, 47).",
+      "  Gene selection was based on searches conducted using the Genetic Association Database (geneticassociationdb.nih.gov).Only genes with multiple, independent indicators of function were included.aPhenotype available for one cohort only.",
+      "The results from the phenotype-driven searches should then be linked to gene names associated with a given phenotype. These genes are presented as a list from which the user can choose the genes of interest and save them in a shopping cart. It is then possible to feed the genes into the gene-centric use-case and perform a more detailed data mining or meta-analysis. The description and further development of the phenotype-driven use-case may represent a very useful concept for scientists and clinicians outside the mouse community.",
+      "  As a demonstration of the utility of the web interface, we entered the 9 genes that reached suggestive significance in a recent genome-wide association study of opioid cessation (Cox et al. 2020).The graph view of the search results are shown in Fig. 3. Genes and keywords are all shown as circles and lines connecting them show the number of abstracts containing the 2 circles they connect.Keywords under the same main category are shown with the same color in the graphic output.Clicking on the lines brings up a new page that displays all sentences containing the keywords that line connects.An alternative tabular view of the same results is also available, where genes, the keywords, and number of abstracts are shown as separate columns."
+    ],
+    [
+      "A major advantage of the mouse as an animal model is the availability of well-characterized inbred strains that enable functional genomics on defined genetic backgrounds. Currently, however, exploiting the full utility of mice to study human diseases is hampered by the lack of gene targeting resources for multiple inbred mouse strains. DBA/2J is a common inbred mouse strain critical in studying a diverse range of human diseases. For example, it is widely used as an inherited model of glaucoma. Glaucoma is a neurodegenerative disorder that affects 70 million people worldwide.",
+      "The network is driven by a common regulator, Ebi2 (also known as Gpr183), which is conserved in rats and humans, is expressed in macrophages and is associated in GWASs with human type 1 diabetes48. Such systemsgenetics studies are possible in rats because of the ready availability of ex vivo tissues and the statistical power gained from studies of inbred strains in controlled environments. Overall, these vignettes provide clear examples of the translational focus of the rat genetics community in an era of unprecedented scientific opportunity enabled by ultra-high-throughput genomics and mathematical biology.",
+      "  Inbred animal models with homogeneous genetic backgrounds have been a powerful adjunct to human studies, providing a sufficiently large number of samples required for an unconstrained genetic analysis.Several polygenic NIDDM rodent models have been developed.These include the Goto-Kakizaki (GK) rat, the Otsuka Long-Evans Tokushima Fatty (OLETF) rat, the Nagoya Shibata Yasuda mouse, the New Zealand Obese mouse (reviewed in Kim et al., 1998), and the Tsumura-Suzuki Obese Diabetes mouse (Suzuki et al., 1999).The underlying genetic factors in these animal models have been studied by quantitative trait locus (QTL) mapping analysis, and several QTLs associated with glucose intolerance, defective insulin secretion, or parameters defining glucose homeostasis have been located (reviewed in Kim et al., 1998;Hirayama et al., 1999;Ueda et al., 1999).",
+      "In as much as it is quite difficult to conduct certain infectious disease studies in humans, there has been a critical need for small animal models for infectious diseases. Appreciating the limitations of existing models, we developed several novel and complementary mouse models that are ideal for use in systems genetics studies of complex diseases. These models not only allow biological validation of known genetic associations, but importantly they afford an unbiased tool for discovering novel genes and pathways contributing to disease outcomes, under different environments. 2008 Genetic effects on environmental vulnerability to disease.",
+      "Generalities  Mouse models have been developed to give new insights into human diseases.Mouse models can be classified into two main classes: 1) genetically modified mouse models, animals that lack (knockout) or overexpress a specific gene and the protein that is encoded for, 2) mice that acquire a disease/symptom following an experimental procedure, such as diet, chemical injections and specific surgery.",
+      "However, in other contexts, B6 mice are more likely than D2 to spontaneously develop diabetic syndromes, indicating that risk factors exist on both genetic backgrounds [29]. QTL mapping studies indicate that these murine metabolic traits have a complex genetic architecture that is not dominated by any single allele [29\u201331], much like humans [32, 33]. Prior work identified candidate genes on Chr 13 that might underlie diabetes-related traits, including RASA1, Nnt, and PSK1. RASA1 shows strong sequence differences between B6 and D2 strains [34]. Rasche et al.",
+      "In other cases, the rat phenotypes have proved more robust and consistent, such as pristane-induced arthritis as a model for rheumatoid arthritis (Holmdahl et al. 2001) and cresentic glomerulonephritis (Aitman et al. 2006). Decades of careful phenotyping and detailed analyses in rat experimental crosses have led to the localization of hundreds of rat physiological quantitative trait loci (pQTLs) containing genes that confer susceptibility to complex disease phenotypes, including hypertension, type 2 diabetes, autoimmune disorders, and cancer (Flint et al. 2005). The availability of the rat genome sequence in June 2003 (Gibbs et al.",
+      ", et al. , Harnessing Genetic Complexity to Enhance Translatability of Alzheimer's Disease Mouse Models: A Path toward Precision Medicine. Neuron, 2019. 101(3): p. 399-411 e5. Beura, L.K. , et al. , Normalizing the environment recapitulates adult human immune traits in laboratory mice. Nature, 2016. 532(7600): p. 512-6. Kleinert, M., et al. , Animal models of obesity and diabetes mellitus. Nat Rev Endocrinol, 2018. 14(3): p. 140-162. Kebede, M.A. and A.D. Attie, Insights into obesity and diabetes at the intersection of mouse and human genetics. Trends Endocrinol Metab, 2014. 25(10): p. 493-501. von Scheidt, M., et al.",
+      "Researchers have access to all the tissue samples in mice, especially those highly relevant in diseases, which is impossible in most human studies because of ethical issues. 8. Mouse models can be used to capture the disease progression stages in longitudinal studies. 9. Mouse genetic populations are able to model the genetic diversity of human populations, and require fewer individuals for genetic association analyses. 10. Unlike human genetic studies where data should always be kept highly confidential, data from mouse studies can be made public available to facilitate its re-analysis to the fullest extent.",
+      "Knock-out and transgenic mice in diabetes research  Transgenic mice have been used to create specific models of type 1 and type 2 diabetes, including hIAPP mice, humanized mice with aspects of the human immune system and mice allowing conditional ablation of beta cells, as outlined above.Beta cells expressing fluorescent proteins can also provide elegant methods of tracking beta cells for use in diabetes research (Hara et al., 2003).",
+      "  Polygenic models of obesity.Polygenic models of obesity may provide a more accurate model of the human condition.A variety of different polygenic mouse models of obesity, glucose intolerance and diabetes exist, allowing a variety of genotypes and susceptibilities to be studied.However, unlike the monogenic models, there are no wild-type controls.In addition, the male sex bias is more extreme in these models (Leiter, 2009).These polygenic models have been used in a wide variety of studies that have aimed to reverse the symptoms of type 2 diabetes (Chen et al., 2009;Fukaya et al., 2009;Guo et al., 2010;Mochizuki et al., 2011;Yoshinari and Igarashi, 2011), understand more about the interplay of obesity and glucose homeostasis (Kluth et al., 2011) (Jurgens et al., 2007) or study diabetic complications (Cheng et al., 2007;Fang et al., 2010;Buck et al., 2011;Lee et al., 2011a).KK mice.KK mice are a mildly obese and hyperleptinaemic strain derived from wild-derived ddY mice in Japan by Kondo in 1957 (Clee and Attie, 2007).They develop severe hyperinsulinaemia and demonstrate insulin resistance in both muscle and adipose tissue.The pancreatic islets are hypertrophic and degranulated.This mouse strain also shows signs of diabetic nephropathy (Ikeda, 1994).",
+      ", 2008) and specific genetic factors for predisposition to DN were recently identified in several diabetic sibling studies (Bleyer et al., 2008; Schelling et al., 2008; Tanaka et al., 2005). Similar to humans, inbred strains of mice exhibit differences in their susceptibility to diabetes, renal and cardiovascular diseases (Krolewski et al., 1996). More recently, differential susceptibilities to DN have also been observed in well-defined strains of",
+      "  The third advantage of the mouse model is that after identification of a candidate gene, direct genetic evidence for its involvement in a pathophysiology can be obtained in mice, but very rarely in humans. Thus, inbred mouse models are ideally suited for the investigation of the obesity-associated diabetes. However, the genetic homogeneity of the inbred strains is not only an advantage, it also limits their potential. Individuals of an inbred mouse line are genetically identical, and it cannot be expected that a single strain carries more than a small portion of all relevant gene variants. Currently, more than 2000 mouse QTL for different traits have been identified in crosses between inbred strains, but only about 1 % has been characterized on molecular level (Flint et al. 2005). Thus, more than one model and new resources, e.g., systems biology, may be required for a complete genetic analysis of complex traits. Previous and ongoing research supports the view that the combination of individual genomes, by intercross of inbred strains and by the generation of congenic lines, will reveal effects of many more genes and gene interactions than can be observed in a single inbred strain. Because the cross-breeding experiments are time consuming and expensive, selecting the ''right'' models of the obesity-associated diabetes is of crucial importance (Leiter 2009). Another advantage of mouse studies in comparison to human studies is the ability to control the environment and to investigate effects of diets, exercise, and intestinal microbiota.",
+      "Introduction Rodents, particularly mouse and rat have been widely used for biomedical research in models of human diseases since it is known that almost of all of genes in mouse and rat are similar to that of humans. However, not every genetic pathway or molecular mechanism of diseases or drugs discovered to be efficacious in these models can be extrapolated to human diseases. Thus, while much data from animal studies have been successfully applied to humans, some have not. The present study aims to explore the degrees of differences in the causal pathways for lung fibrosis between humans and mice.",
+      "  These limitations support the increasing need of experimental systems to characterize the fundamental biological mechanisms responsible for diabetes inheritance and the function of risk genes.In the context of diabetes pathogenesis, in vitro systems are useful but often limited, in particular to assess glucose tolerance, insulin sensitivity, islet architecture and function and diabetes complications.The laboratory mouse provides a wide range of experimental models for diabetes gene discovery and for in vivo post-GWAS studies of diabetes that develops either spontaneously or following gene editing [5].The laboratory rat is also a powerful system to implement phenotyping methods required to record biological variables relevant to common chronic diseases.The rat is the preferred model to perform phenotyping procedures that are often technically challenging in mice or require the collection of large volumes of blood or organs.For these reasons, rat models of type 2 diabetes or hypertension have been successfully used to localise in the genome genes controlling endophenotypes relevant to these complex diseases.This review addresses strategies used to map the genetic determinants of physiological and molecular phenotypes relevant to type 2 diabetes pathogenesis and to characterize their biological function in vivo through examples derived from genetic and genomic research in the Goto-Kakizaki (GK) rat strain.",
+      "However, many of the phenotypes of the homozygous null mutations were extreme and/or did not model the complexity of the metabolic syndrome. For example, IR knockout (IR-/-) mice died because of developmental effects (Accili et al., 1996), which precluded analysis of adult mice. Likewise, GLUT4-/- mice exhibited only moderate insulin resistance and were not overtly diabetic, suggesting compensatory mechanisms (Katz et al., 1995). Monogenic GEMMs furthermore ignore the polygenic nature of metabolic diseases, resulting from genetic and environmental factors impacting at multiple levels in signaling cascades. Oligogenic mouse models remedied some of these shortcomings.",
+      "Since glucokinase-/- mice are embryonic lethal, this collection of glucokinase mutants is useful for dissecting the pathogenesis of MODY2. Genetic reference populations (GRPs) Perhaps the most \u2018\u2018refreshing\u2019\u2019 mouse resource for investigating complex diseases is the construction of mouse crosses using inbred mice and the subsequent QTL mapping. Inbred mice have an inherent wealth of variation due to past spontaneous mutation events, which have been preserved through systematic and uninterrupted brother-sister matings (Paigen, 2003). Inbred mice are appealing since they are genetically identical within a strain but are diverse between strains.",
+      "Mouse Models of Oxidative Stress and Mitochondrial  Dysfunction in Aging.Genetically engineered mouse models provide great systems to directly dissect the complex relationship between oxidative damage, mitochondrial dysfunction, and aging.Although it is difficult to manipulate mitochondrial genome, genetic engineering of nuclear genes that are involved in oxidative stress response and mitochondrial function has been utilized to study mitochondrial biology and aging.",
+      "Rodent models of glaucoma have gained favor in the research community due to their ease of handling and the lower costs associated with acquisition and care. In particular, the mouse provides a number of useful genetic approaches to create models and to test specific molecular interactions associated with the disease process. Furthermore, the mouse genome is relatively conserved compared to the human genome.",
+      "Better Mouse Models. A key point to bear in mind in assessing the usefulness of mouse models is the relative plasticity displayed by rodents faced with gene deletions.Thus, differences between the penetrance of mutations in human genes linked to monogenic forms of diabetes, including maturity onset diabetes of the young (MODY), between humans and mice, are usually observed [114] with the mouse equivalents showing far less marked disturbances in glycemia or changes which are seen only after deletion of both alleles.This clearly reflects the limitations of the use of mice (weight \u223c25 g, life expectancy \u223c3 years) for comparisons with human subjects.Nonetheless, and although the phenotypes of the above murine models are thus often more subtle than the human counterparts, they remain useful models for the study of diabetes, allowing single-targeted gene deletions which are impossible in man.For example, human populations with different genetic backgrounds have different susceptibility to the R235W ZnT8 polymorphism.We should not, therefore, find surprising the results that different genetic backgrounds and different diet reveal different phenotypes in ZnT8 knockout models."
+    ],
+    [
+      "  Additional large-scale meta-analysis predicated upon increased sample size, were carried out on existing datasets from the WTCCC [18] and the Genetics of Kidneys in Diabetes (GoKinD) study [69][70][71] plus control data derived from the National Institute of Mental Health.These investigators observed significant association of previously observed loci.Importantly, they did not observe evidence of new T1D loci reaching the threshold for genome-wide significance.Instead they re-analyzed the most nominally significant associated SNP in an independent British cohort of approximately 6000 cases, 7000 controls and in 2800 families, where they uncovered four additional loci, BACH2 (previously reported [67]), 10p15 harboring protein kinase C theta (PRKCQ), 15q24 harboring nine genes including the cathepsin H (CTSH), complement 1q (C1q), tumor necrosis factor related protein 6 (C1QTNF6) and somatostatin receptor 3 (SSTR3) genes.Table 1 summarizes the 16 T1D loci reported to date.An example of a tag-SNP that captures the association with T1D in each instance is highlighted together with its relative minor allele frequency in controls and what magnitude of risk or protection it confers.Key references regarding the role of each locus in the context of the disease are included and along with the chromosomal band where each locus resides, the main candidate gene (symbol and full name) is highlighted.",
+      "Detection of established loci  We explored the extent to which previously reported type 2 diabetes association signals could be detected in African-descent individuals.Based on the previously reported effect sizes and the effect allele frequency and sample size from our African meta-analysis, we had sufficient power (80%) to detect three signals (TCF7L2, DNER and SRR) at genome-wide significance (p < 2.5 \u00d7 10 \u22128 ) (ESM Table 2).Only the TCF7L2 variant reached genome-wide significance in our study, whereas both variants in DNER (rs1861612) and SRR (rs391300), originally discovered in Pima Indians and East Asians, respectively, had p > 0.1 (ESM Table 2).",
+      "  On the basis of the combined stage 1-3 analyses, we found that six signals reached compelling levels of evidence (P = 5.0 \u00d7 10 \u22128 or better) for association with T2D (Table 2). As in all linkage disequilibrium (LD)-mapping approaches, characterization of the causal variants responsible, their effect sizes and the genes through which they act will require extensive resequencing and fine-mapping. However, on the basis of current evidence, we found that the most associated variants in each of these signals map to intron 1 of JAZF1, between CDC123 and CAMK1D, between TSPAN8 and LGR5, in exon 24 of THADA, near ADAMTS9 and in intron 5 of NOTCH2.",
+      "  Replication study of newly identified type 1 diabetes risk loci",
+      "  Although these are considered to be loci convincingly associated with susceptibility to type 2 diabetes in populations of European descent, other genes related to susceptibility to the disease are probably still unidentified, particularly those for populations of other ancestries. In order to uncover genetic variants that increase the risk of type 2 diabetes, we conducted a genome-wide association study in Japanese individuals with type 2 diabetes and unrelated controls. We first genotyped 268,068 SNPs, which covered approximately 56% of common SNPs in the Japanese, in 194 individuals with type 2 diabetes and diabetic retinopathy (case 1) and in 1,558 controls (control 1) collected in the BioBank Japan. We compared the allele frequencies of 207,097 successfully genotyped SNPs and selected the 8,323 SNPs showing the lowest P values. We then attempted to genotype these 8,323 SNPs in 1,367 individuals with type 2 diabetes and diabetic retinopathy (case 2) and for 1,266 controls (control 2) (stage 2), and successfully obtained data for 6,731 SNPs (the P value distribution in the second test is shown in Supplementary Fig. 1a online). The results of principal component analysis 8 in the stage 1 and 2 samples and HapMap samples revealed that there was no evidence for population stratification between the case and control groups throughout the present tests (Supplementary Fig. 1b,c). We selected the 9 SNP loci showing P values <0.0001 (additive model in stage 2, Table 1) and genotyped a third set of cases and controls comprising 3,557 Japanese individuals with type 2 diabetes (cases 3,4,5) and 1,352 controls (controls 3,4). We evaluated the differences in the population structure among these three sets of case and two sets of control groups by Wright's F test. As the results indicated that there was no difference in the population structure among these groups (Supplementary Table 1b online), we combined these populations for the third test of case-control study. The third set of analysis identified the significant associations for six SNPs (Table 1), including the CDKAL1 locus at 6p22.3 (rs4712524, rs9295475 and rs9460546), the IGF2BP2 locus at 3q27.2 (rs6769511 and rs4376068) and the KCNQ1 locus at 11p15.5 (rs2283228). The remaining three SNPs (rs13259803, rs612774 and rs10836097) had P values of >0.05 in the third test and were not further examined. CDKAL1 and IGF2BP2 were previously reported as susceptibility genes for type 2 diabetes in the Japanese population 9. Therefore, we focused on the KCNQ1 locus, which was highly associated with type 2 diabetes.",
+      "  We consider these data as an interesting preliminary result that surely requires additional independent studies including a higher number of patients in order to confirm and clarify the possible contribution of this locus to the development of T2DM complications.",
+      "DISCUSSION  Taken together, our full second-stage approach and combined meta-analysis have revealed additional loci associated with type 1 diabetes.Clearly the risks are relatively modest compared with previously described associations, and it was only with this sample size at our disposal that we could we detect and establish these signals as true positives through an independent validation effort.",
+      "Identification of susceptibility loci  The degree of evidence for all reported T2D loci was quantified as follows: a locus with a logarithm of odds ratio (LOD) score of 3 or more was considered significant, a LOD score between 2.2 and 3 was considered suggestive and a LOD score between 1 and 2.2 was considered nominal.For T2D, only those loci were included that were significant at least once, or were suggestive in at least one study and at least nominal in two or more studies.The inclusion of the second category of loci was based on a study by Wiltshire et al. [72], in which it was postulated that locus counting is a useful additional tool for the evaluation of genome scan data for complex trait loci.We used the same two criteria to determine the loci from the five papers published on obesity since 2004 and combined these loci with those from Bell et al. [7].As obesity phenotypes, BMI, serum leptin levels, abdominal subcutaneous and visceral fat, and percentage body fat were included.All of these phenotypes were used as continuous quantitative traits, as well as with various cut-off levels.",
+      "  Today, more than 100 loci for type 2 diabetes and glycemic traits have been identified through numerous GWA studies of common and rare variation in populations of diverse ancestral origins [31]; however, to date, very few GWA studies have been published in cohorts of Mexican ancestry.The first GWA study performed in a non-European cohort was published in 2007 and comprised 561 Mexican American type 2 diabetes cases and controls drawn from the Starr County Health Studies [32].Although no loci reached genome-wide significance, several loci identified in prior GWA studies in Europeans were replicated [32].This analysis was subsequently expanded (N = 1273) and meta-analyzed with a cohort from Mexico City (N = 1310) in 2011 [33,34].The most significant variants observed in this meta-analysis included known regions near HNF1A and KCNQ1.Top association signals were then meta-analyzed with the DIAGRAM and DIAGRAM+ datasets of European ancestry individuals, resulting in two regions reaching genome-wide significance: HNF1A and CDKN2A/CDKN2B (Table 1).Top association signals in both studies were annotated to explore their roles as expression quantitative trait loci (eQTL) in both adipose and muscle tissues, revealing a marked excess of transacting eQTL in top signals in both tissue types.",
+      "  75\u201379 The main conclusion is that there is no major locus for T2D (analogous to HLA in type 1 diabetes). This is not surprising given the modest \u03bbs for T2D (approximately 3.5 in Europeans), imposing a limit on the magnitude of any single gene effect. 4 Many scans have consequently been significantly underpowered to detect the modest gene effects anticipated. Certainly, few T2D scans have reported linkages meeting the established criteria for genomewide significance. 80 This modest power, combined with the diversity of the pedigrees sampled and the analytical techniques used, means that the replication of positive findings between data sets has been the exception rather than the rule.",
+      "Quantitative Trait Analysis  Exploration of putative T2DM variants with quantitative glycemic traits in a subset of African-American samples (n = 671 from the IRAS and IRASFS control samples, Table S5) revealed     limited insight into the biological mechanism associated with T2DM risk.In addition, the five putative African-American T2DM susceptibility loci were tested for association with quantitative measures of glucose homeostasis in the European Caucasian population, in silico, by the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC; [16]).These results did not provide further insight into the probable role these variants may have in disease susceptibility (Table S6).The most significantly associated SNP in African Americans, rs7560163, failed quality controls filters and was not included in analysis likely due to being monomorphic as seen in a representative Caucasian population from the HapMap project (Table S4).",
+      "  Discovery of novel loci for T2D susceptibility.We tested for T2D association with ~27 million variants passing quality-control filters, ~21 million of which had a minor allele frequency (MAF) < 5%.Our meta-analysis identified variants at 231 loci reaching genomewide significance (P < 5 \u00d7 10 \u22128 ) in the BMI-unadjusted analysis (N eff 231,436) and 152 in the smaller (N eff 157,401) BMI-adjusted analysis.Of the 243 loci identified across these two analyses, 135 mapped outside regions previously implicated in T2D risk (Methods, Fig. 1 and Supplementary Table 2).",
+      "  Genetic studies performed since 2012 have identified many additional T2D loci based on risk alleles common in one population but less common in others.Studies in African Americans identified RND3-RBM43 (28), HLA-B and INS-IGF2 (29).Studies in South Asians identified TMEM163 (30) and SGCG (31).One locus, SLC16A11-SLC16A13, was simultaneously identified in Japanese and Mexican Americans (32,33), and studies in East Asians identified ANK1 (34), GRK5 and RASGRP1 (35), LEP and GPSM1 (32), and CCDC63 and C12orf51 (36).A study of individuals from Greenland identified TBC1D4 (37), and a sequencing-based study of Danes with follow-up in other Europeans identified MACF1 (38).Finally, the largest GWAS to date in American Indians identified DNER at near genome-wide significance (P = 6.6 \u00d7 10 \u22128 ) (39).Three of these studies imputed GWAS data using the 1000 Genomes Project sequence-based reference panels, providing better genome coverage (29,32,33,40).Taken together, these studies highlight the value of diverse populations, including founder and historically isolated populations, to detect risk loci.",
+      "  Finally, a recent study identified additional susceptibility loci for type 2 diabetes by performing a meta-analysis of three published GWAs. 21 As acknowledged by the authors, GWAs are limited by the modest effect sizes of individual common variants and the need for stringent statistical thresholds. Thus, by combining data involving 10,128 samples, the authors found in the initial stages of the analysis highly associated variants (they followed only 69 signals out of over 2 million meta-analyzed SNPs) with P values <10 \u22124 in unknown loci, and 11 of these type 2 diabetes' associated SNPs were taken forward to further stages of analysis. Large stage replication testing allowed the detection of at least six previously unknown loci with robust evidence for association with type 2 diabetes.",
+      "  Surprisingly, data about previous published loci associated with type 2 diabetes were not sufficiently powerful to reach a significant P value in individual scans.For example, variants at SLC30A8 and PPARG were significantly associated with type 2 diabetes only when pooling all the GWAs data, whereas in a single genome scan (DGI), no gene showed a positive signal (P value: 0.92 and 0.83, respectively).Thus, this may suggest that GWAs are still underpowered to find SNPs with small effect size.",
+      " Background: The two genome-wide association studies published by us and by the Wellcome Trust Case-Control Consortium (WTCCC) revealed a number of novel loci, but neither had the statistical power to elucidate all of the genetic components of type 1 diabetes risk, a task for which larger effective sample sizes are needed. Methods: We analysed data from two sources: (1) The previously published second stage of our study, with a total sample size of the two stages consisting of 1046 Canadian case-parent trios and 538 multiplex families with 929 affected offspring from the Type 1 Diabetes Genetics Consortium (T1DGC); (2) the Rapid Response 2 (RR2) project of the T1DGC, which genotyped 4417 individuals from 1062 non-overlapping families, including 2059 affected individuals (mostly sibling pairs) for the 1536 markers with the highest statistical significance for type 1 diabetes in the WTCCC results. Results: One locus, mapping to a linkage disequilibrium (LD) block at chr15q14, reached statistical significance by combining results from two markers (rs17574546 and rs7171171) in perfect LD with each other (r\u00b2 = 1). We obtained a joint p value of 1.3 \u00d7 10 \u22126 , which exceeds by an order of magnitude the conservative threshold of 3.26 \u00d7 10 \u22125 obtained by correcting for the 1536 single nucleotide polymorphisms (SNPs) tested in our study. Meta-analysis with the original WTCCC genome-wide data produced a p value of 5.83 \u00d7 10 \u22129 . Conclusions: A novel type 1 diabetes locus was discovered. It involves RASGRP1, a gene known to play a crucial role in thymocyte differentiation and T cell receptor (TCR) signalling by activating the Ras signalling pathway.",
+      "  Finally, we examined whether genes identified using our association studies were enriched within diabetes-related pathways. We collated a list of 42 genes to which 53 CpG sites associated with T2D traits (CS score \u22651.77, combined P < 0.017) mapped. Even in this small dataset, pathway analysis (Supplementary Material, Table S12) indicated significant enrichment in 31 pathways (Fisher's exact P < 0.05), including those related to circadian clock (P = 0.005), adipocytokine signaling (P = 0.009), leptin pathway (P = 0.023), HDL-mediated lipid transport (P = 0.031) and insulin signaling (P = 0.033).",
+      "  In recent years, progress has been made in following up mechanistic studies of GWAS type 2 diabetes-association signals [6,7,9,25-30], but challenges remain in sifting through the many associated variants at a locus to identify those influencing disease. We hypothesized that a common variant with modest effect underlies the association at the CDC123/CAMK1D locus and evaluated the location of high LD variants (r^2 \u2265 0.7; n = 11) at the locus relative to known transcripts and to putative DNA regulatory elements. We identified two variants that overlapped putative islet and/or liver regulatory regions and none located in exons. We did not assess variants in lower LD (r^2 < 0.7), and additional functional SNPs may exist at this locus acting through alternate functional mechanisms untested in the current study.",
+      "  Meta-analysis results for T2D SNPs for insulin and glucose-related traits.",
+      "By combining genome-wide association data from 8,130 individuals with type 2 diabetes (T2D) and 38,987 controls of European descent and following up previously unidentified meta-analysis signals in a further 34,412 cases and 59,925 controls, we identified 12 new T2D association signals with combined P < 5 \u00d7 10^\u22128. These include a second independent signal at the KCNQ1 locus; the first report, to our knowledge, of an X-chromosomal association (near DUSP9); and a further instance of overlap between loci implicated in monogenic and multifactorial forms of diabetes (at HNF1A). The identified loci affect both beta-cell function and insulin action, and, overall, T2D association signals show evidence of enrichment for genes involved in cell cycle regulation. We also show that a high proportion of T2D susceptibility loci harbor independent association signals influencing apparently unrelated complex traits."
+    ],
+    [
+      "Methods  Mouse models of diabetes. All animal studies were conducted according to a protocol approved by the Institutional Animal Care and Use Committee at the Beckman Research Institute of City of Hope. Male type-2 diabetic db/db mice (T2D leptin receptor deficient; Strain BKS.Cg-m +/+ lepr db/J) and genetic control non-diabetic db/+ mice (10-12 weeks old), were obtained from The Jackson Laboratory (Bar Harbor, ME) 11,17. Male C57BL/6 mice (10 week old, The Jackson Laboratory) were injected with 50 mg kg^\u22121 of STZ intraperitoneally on 5 consecutive days. Mice injected with diluent served as controls. Diabetes was confirmed by tail vein blood glucose levels (fasting glucose >300 mg dl^\u22121). Each group was composed of five to six mice. Mice were sacrificed at 4-5 or 22 (ref. 17) weeks post-induction of diabetes. Glomeruli were isolated from freshly harvested kidneys by a sieving technique 11,17 in which renal capsules were removed, and the cortical tissue of each kidney separated by dissection. The cortical tissue was then carefully strained through a stainless sieve with a pore size of 150 \u03bcm by applying gentle pressure. Enriched glomerular tissue below the sieve was collected and transferred to another sieve with a pore size of 75 \u03bcm. After several washes with cold PBS, the glomerular tissue remaining on top of the sieve was collected. Pooled glomeruli were centrifuged, and the pellet was collected for RNA, protein extraction or for preparing MMCs 11,17. Male Chop-KO mice were also obtained from the Jackson Laboratory (B6.129S(Cg)-Ddit3 tm2.1Dron/J). Based on our previous experience, sample size was determined to have enough power to detect an estimated difference between two groups. With minimum sample size of 5 in each group, the study can provide at least 80% power to detect an effect size of 2 between diabetic and non-diabetic groups or treated and untreated groups at the 0.05 significance level using two-sided t-test. Since we expected larger variation between groups especially for the mice with oligo-injection, we used more than 5 mice in each group (with 6 mice in each group, we have 80% power to detect an effect size of 1.8 at the 0.05 confidence level). Our actual results with current sample size did show statistical significance for majority of the miRNAs in the cluster. Histopathological and biochemical analysis of tissues or cells derived from animal models were performed by investigators masked to the genotypes or treatments of the animals.",
+      "  In these models, adult offspring of diabetic animals were noted to have normal development of the endocrine pancreas (Aerts et al., 1997; Ma et al., 2012). However, they develop glucose intolerance and impaired insulin response to glucose challenge, and display insulin resistance, mainly in the liver and muscle, highlighting the presence of both insulin resistance and \u03b2-cell dysfunction (Aerts et al., 1988; Holemans et al., 1991a,b). The key role of the intrauterine environment was demonstrated by a series of embryo transfer experiments, which showed that the diabetes risk in a low genetic risk strain can be substantially increased by the hyperglycaemic environment of a dam with a high genetic risk of diabetes (Gill-Randall et al., 2004).",
+      "  Diabetes-obesity syndromes in rodents",
+      "However, in other contexts, B6 mice are more likely than D2 to spontaneously develop diabetic syndromes, Aging Clin Exp Res  indicating that risk factors exist on both genetic backgrounds [29]. QTL mapping studies indicate that these murine metabolic traits have a complex genetic architecture that is not dominated by any single allele [29\u201331], much like humans [32, 33]. Prior work identified candidate genes on Chr 13 that might underlie diabetes-related traits, including RASA1, Nnt, and PSK1. RASA1 show strong sequence differences between B6 and D2 strains [34]. Rasche et al.",
+      "  Other diet-induced rodent models of type 2 diabetes. Although rats and mice are the most commonly used models for studies of type 2 diabetes, other rodents have also been identified as useful models. These include the desert gerbil and the newly described Nile grass rat, both of which tend to develop obesity in captivity.",
+      "  Summary of rodent models of type 2 diabetes",
+      "  Since the obesity is induced by environmental manipulation rather than genes, it is thought to model the human situation more accurately than genetic models of obesity-induced diabetes. High fat feeding is often used in transgenic or knock-out models, which may not show an overt diabetic phenotype under normal conditions, but when the beta cells are 'pushed', the gene may be shown to be of importance. It should be noted that the background strain of the mice can determine the susceptibility to diet-induced metabolic changes, and thus, effects could be missed if a more resistant strain is used (Surwit et al., 1995; Bachmanov et al., 2001; Almind and Kahn, 2004). It has also been reported that there is heterogeneity of the response to high fat feeding within the inbred C57BL/6 strain, indicating that differential responses to a high-fat diet are not purely genetic (Burcelin et al., 2002).",
+      "Other considerations and limitations  A myriad of factors affect animal experiments. Men elicit a greater stress response in mice than women 292, likely confounding feeding behaviour. Rodents from different production facilities (for example, Jackson Laboratory and Taconic) have unique gut microbiotas 293, perhaps contributing to differences in their susceptibility to DIO and related diabetic complications 293. Similarly, cage position within a rack of cages, single versus group housing, the skill level of the researcher, ambient room temperature or the type of cage bedding can all affect experimental outcomes.",
+      "  We believe there are several factors that researchers should consider when conducting obesity and diabetes mellitus research in rodents (FIG. 2). Although our list is by no means exhaustive, it demonstrates the complexity and interconnectedness of the myriad of factors that can confound experimental outcomes. Although it is impossible to control for everything, researchers should accurately detail all experimental conditions and methods to allow for better interpretation of the results and, importantly, for better reproducibility.",
+      "  Figure 2 | Important experimental parameters and potential confounders of experimental outcomes in obesity and diabetes research and their interrelatedness. Countless factors influence experimental outcomes when using animal models, and what is enumerated here is by no means a complete list. This figure is one depiction of the multifactorial and interconnected genetic and environmental matrix that makes it virtually impossible to design the perfect experiment. For example, single-housing mice to obtain more accurate food intake data introduces a stress that in turn affects food intake. The severity of this stress response is both strain-specific and sex-dependent. What is important is to be aware of these challenges and to control for them in the most optimal manner. It is equally, if not more, important to accurately and comprehensively detail all experimental conditions in research papers, as these have bearing on the interpretation and reproducibility of the published results. DIO, diet-induced obesity.",
+      "  Another concern pertains to control mice. Compared with free-living mice in the wild, laboratory control mice with ad libitum access to food are sedentary, overweight, glucose intolerant and tend to die at a younger age 297. Comparisons between mice with DIO and control mice might be analogous to investigating the genetic cause of obesity-resistance by comparing humans who are overweight or obese. This potential problem with control mice could explain why the use of DIO diets that have 40% to 60% of total energy from fat is so prevalent, as this might be necessary to achieve divergent weight gains. With free access to running wheels, C57BL/6J mice voluntarily run 5-10 km per day 298,299. As is the case with humans 300, mice get health benefits from regular physical activity including weight loss, decreased adiposity and improved insulin sensitivity 301,302. Physical activity might also affect the epigenome over several generations 303. An enriched physical and social cage environment alone improves leptin sensitivity and energy expenditure in mice, independent of physical activity 304,305. Overall, these data suggest that with standard mouse husbandry, chow-fed laboratory mice are not the ideal healthy and lean control group for meaningful obesity research.",
+      "  To better address these points, various animal models have been developed. For example, using HFD-T2DM male rats, the F1 female offspring showed reduced \u03b2 cell area and insulin secretion, together with glucose intolerance, without changes in body weight [145]. The islets of the F1 female offspring showed differential expression of many genes involved in Ca2+, mitogen-activated protein kinase and Wnt signaling, apoptosis and cell cycle regulation [145]. Similarly, in pregnant C57BL6J mice, food deprivation resulted in \u03b2 cell mass reduction and an increased risk of \u03b2 cell failure in offspring [146].",
+      "They are probably typical of those few mice that develop diabetes more slowly and do not tax the pancreatic insulin supply as severely early in the course of the disease. Attempts at therapy. Attempts to keep the weight of diabetic mice within normal limits by total or partial food restriction resulted in premature deaths. After it was discovered that gluconeogenesis is greatly increased in diabetic mice, attempts were made to regulate blood sugar levels and also weight gain by feeding rations devoid of carbohydrate.",
+      "The degree of dependence of adiposity, hyperglycemia, and islet hypertrophy on food consumption varies among these mice, but in all, the increase in islet volume and consequent \u03b2-cell hyperplasia appears to be an effective means of maintaining blood sugar concentrations at near normal levels. In contrast, neither the diabetic sand rat [5] nor the diabetic mouse has hypertrophied islets and neither effectively controls blood sugar levels.",
+      "HUMMEL: Studies with the Mutation, Diabetes  almost undetectable. Similarly, the activities of citrate lyase and glucose-6-phosphate dehydrogenase were greatly decreased in these older diabetic as compared  Diabetologia  the diabetic mice have attained maximum weight, after which no further accumulation of adipose tissue is noted. Fig. 8.",
+      "Rodent models of monogenic obesity and diabetes  Obesity and the consequent insulin resistance is a major harbinger of Type 2 diabetes mellitus in humans. Consequently, animal models of obesity have been used in an attempt to gain insights into the human condition. Some strains maintain euglycaemia by mounting a robust and persistent compensatory \u03b2-cell response, matching the insulin resistance with hyperinsulinaemia. The ob/ob mouse and fa/fa rats are good examples of this phenomenon. Others, such as the db/db mouse and Psammomys obesus (discussed later) rapidly develop hyperglycaemia as their \u03b2-cells are unable to maintain the high levels of insulin secretion required throughout life. Investigation of these different animal models may help explain why some humans with morbid obesity never develop Type 2 diabetes whilst others become hyperglycaemic at relatively modest levels of insulin resistance and obesity.",
+      "  As with the KK mouse, the Israeli sand rat model is particularly useful when studying the effects of diet and exercise [120] on the development of Type 2 diabetes.",
+      "Animal models of diabetes in pregnancy and the role of intrauterine environment  Another important field of diabetes research that has relied heavily on animal experimentation is the study of diabetes in pregnancy and the role of the intrauterine environment on the subsequent development of diabetes amongst offspring.",
+      "  Animal models of Type 2 diabetes mellitus",
+      "Assessment of Diabetes  Mice were monitored for the development of diabetes as described previously (Wicker et al. 1994)."
+    ],
+    [
+      "Methods  Mouse models of diabetes. All animal studies were conducted according to a protocol approved by the Institutional Animal Care and Use Committee at the Beckman Research Institute of City of Hope. Male type-2 diabetic db/db mice (T2D leptin receptor deficient; Strain BKS.Cg-m +/+ lepr db/J) and genetic control non-diabetic db/+ mice (10-12 weeks old), were obtained from The Jackson Laboratory (Bar Harbor, ME) 11,17. Male C57BL/6 mice (10 week old, The Jackson Laboratory) were injected with 50 mg kg^\u22121 of STZ intraperitoneally on 5 consecutive days. Mice injected with diluent served as controls. Diabetes was confirmed by tail vein blood glucose levels (fasting glucose >300 mg dl^\u22121). Each group was composed of five to six mice. Mice were sacrificed at 4-5 or 22 (ref. 17) weeks post-induction of diabetes. Glomeruli were isolated from freshly harvested kidneys by a sieving technique 11,17 in which renal capsules were removed, and the cortical tissue of each kidney separated by dissection. The cortical tissue was then carefully strained through a stainless sieve with a pore size of 150 \u03bcm by applying gentle pressure. Enriched glomerular tissue below the sieve was collected and transferred to another sieve with a pore size of 75 \u03bcm. After several washes with cold PBS, the glomerular tissue remaining on top of the sieve was collected. Pooled glomeruli were centrifuged, and the pellet was collected for RNA, protein extraction or for preparing MMCs 11,17. Male Chop-KO mice were also obtained from the Jackson Laboratory (B6.129S(Cg)-Ddit3 tm2.1Dron/J). Based on our previous experience, sample size was determined to have enough power to detect an estimated difference between two groups. With minimum sample size of 5 in each group, the study can provide at least 80% power to detect an effect size of 2 between diabetic and non-diabetic groups or treated and untreated groups at the 0.05 significance level using two-sided t-test. Since we expected larger variation between groups especially for the mice with oligo-injection, we used more than 5 mice in each group (with 6 mice in each group, we have 80% power to detect an effect size of 1.8 at the 0.05 confidence level). Our actual results with current sample size did show statistical significance for majority of the miRNAs in the cluster. Histopathological and biochemical analysis of tissues or cells derived from animal models were performed by investigators masked to the genotypes or treatments of the animals.",
+      "Diabetes incidence study. Mice were kept for 20-28 weeks and tested for diabetes monthly by blood glucose and weekly by urine assessment, with a positive indication being followed by twice-weekly blood testing. Mice were diagnosed as diabetic when the blood glucose concentration was over 260 mg/dl (14.4 mM) after 2-3 h of fasting for two sequential tests. Glucose and insulin tolerance tests were performed by injecting glucose (2 g/kg body weight) or insulin (1 U/kg body weight) intraperitoneally in mice fasted for 6-7 h. Tail vein blood was tested by a Contour glucometer. Assessments of plasma insulin, proinsulin and C-peptide levels were performed using commercial ELISA kits, according to the manufacturer's instructions (insulin, proinsulin and C-peptide mouse ELISA kits, R&D Systems Quantikine). Assays were performed with blinding, with mice coded by number until experimental end.",
+      "Animal group and study design  First, one set of animals comprising 12-week-old male type 2 diabetic db/db (C57BL/KsJ-db\u2212/db\u2212, n = 8) and contemporary control wild-type (C57BL/KsJ-db+/db\u2212, n = 8) mice (Jackson Laboratories) were included in this study. Their weights and blood glucose levels were analysed to eliminate variation. Erectile functions of the animals were evaluated by the apomorphine-induced penile erection test, according to a previously described protocol (Pan et al. 2014). Afterwards, intracavernous pressure (ICP) investigations and histological measurements were applied to further confirm the results of the function tests. Then, all mice were sacrificed and the corpus cavernosum (CC) was collected from each mouse. Because the tissue of the CC is difficult to crush, we randomly collected the CCs from two mice and mixed them into one subgroup. As a result, four diabetic subgroups (DB groups) and four normal control subgroups (NC groups) were used for molecular measurements. Second, another set of animals, including three T2DMED and three normal control mice that were independent from the original set of animals, were included in the validation experiments using qRT-PCR. Third, another separate set of animals, including five T2DMED and five control mice, were used to verify one of the predicted targets, IGF-1, using ELISA. A luciferase reporter assay was performed to verify the binding of the differentially expressed miRNAs to the target gene IGF-1. All procedures were approved by the Institutional Animal Care and Use committee at Nanjing Medical University.",
+      "  Summary of rodent models of type 2 diabetes",
+      "  Summary of rodent models of type 1 diabetes",
+      "Knock-out and transgenic mice in diabetes research  Transgenic mice have been used to create specific models of type 1 and type 2 diabetes, including hIAPP mice, humanized mice with aspects of the human immune system and mice allowing conditional ablation of beta cells, as outlined above. Beta cells expressing fluorescent proteins can also provide elegant methods of tracking beta cells for use in diabetes research (Hara et al., 2003).",
+      "Genetically induced insulin-dependent diabetes  AKITA mice. The AKITA mouse was derived in Akita, Japan from a C57BL/6NSlc mouse with a spontaneous mutation in the insulin 2 gene preventing correct processing of proinsulin. This causes an overload of misfolded proteins and subsequent ER stress. This results in a severe insulin-dependent diabetes starting from 3 to 4 weeks of age, which is characterized by hyperglycaemia, hypoinsulinaemia, polyuria and polydipsia. Untreated homozygotes rarely survive longer than 12 weeks. The lack of beta cell mass in this model makes it an alternative to streptozotocin-treated mice in transplantation studies (Mathews et al., 2002). It has also been used as a model of type 1 diabetic macrovascular disease (Zhou et al., 2011) and neuropathy (Drel et al., 2011). In addition, this model is commonly used to study potential alleviators of ER stress in the islets and in this respect models some of the pathology of type 2 diabetes (Chen et al., 2011).",
+      "  To achieve a slow pathogenesis of T2DM, young adult mice 284 or rats 285 are fed a high-fat or Western diet to elicit DIO and insulin resistance. Single or multiple injections with low-dose streptozotocin (~30-40 mg/kg intraperitoneally) then elicit partial loss of \u03b2-cells, which results in hypoinsulinaemia and hyperglycaemia. Protocols are being continuously refined and likely differ between species and even strains 283. The HFD streptozotocin rat is sensitive to metformin, further demonstrating the utility of this model 285. Downsides of streptozotocin treatment include liver and kidney toxicity and mild carcinogenic adverse effects (TABLE 1).",
+      "Materials and methods 2.1 Mouse models 2.1.1 Mouse strains 2.1.2 Induction of type 1 diabetes 2.1.3 Insulin treatment on diabetic mice 2.1.4 Akita mouse genotyping 2.2 Characterization of diabetic nephropathy in mice 2.2.1 Proteinuria measurement 2.2.2 Glomerular cells quantification 2.2.3 Methenamine silver staining quantification",
+      "  ii) Rodent models of diabetic retinopathy",
+      "  There are some good reviews available in the literature describing the transgenic/knockout animal models of type 2 diabetes [114-118]. The transgenic and knockout models are developed for studying the role of genes and their effects on peripheral insulin action such as insulin receptor, IRS-1, IRS-2, glucose transporter (GLUT 4), peroxisome proliferator activated receptor-\u03b3 (PPAR-\u03b3) and tumour necrosis factor-\u03b1 (TNF-\u03b1) as well as in insulin secretion such as GLUT-2, glucokinase (GK), islet amyloid polypeptide (IAPP) and GLP-1 and in hepatic glucose production (expression of PEPCK) associated with development of type 2 diabetes. Further, combination or double knockout mouse models including defect in insulin action and insulin secretion (e.g., IRS-1+/\u2212/GK+/\u2212 double knockout) have been produced which clearly illustrate the mechanisms associated with development of insulin resistance and beta cell dysfunction leading to overt hyperglycaemic state in human type 2 diabetes. These above genetically modified animals exhibit various phenotypic features of type 2 diabetes varying from mild to severe hyperglycaemia, insulin resistance, hyperinsulinaemia, impaired glucose tolerance and others as explained in detail elsewhere 6,9,[114-118]. Very recently, tissue specific knockout mouse models have been achieved, allowing further insight into the insulin action with respect to particular target tissues (muscle, adipose tissue and liver) associated with insulin resistance and type 2 diabetes 115,117,118. The transgenic/knockout animals are currently used mostly for the mechanistic study in diabetes research and not usually recommended for screening programme as they are more complicated and costly.",
+      "Functional deficits refs  Non-Alzheimer-disease mouse [71][72][73][74]76,78,81,85,87 and rat 59,75,77 ,79,95,97  Mouse [81][82][83][84][85] and rat 79,111  Cerebral effects of inducing diabetes or insulin resistance in normal rodents (that is, non-Alzheimer-disease rodent models) and in rodents genetically modified to accumulate amyloid-\u03b2 in the brain (that is, rodent models of Alzheimer disease). Common interventions to induce diabetic conditions in rodents included recessive mutations in the leptin gene (Lep; also known as Ob), defects in the leptin receptor (LEPR; also known as OB-R), diet and administration of streptozotocin. Rodents with pancreatic overexpression of human amylin spontaneously develop both type 2 diabetes mellitus and dementia-like pathology.",
+      "  Animal models have been used extensively in diabetes research. Early studies used pancreatectomised dogs to confirm the central role of the pancreas in glucose homeostasis, culminating in the discovery and purification of insulin. Today, animal experimentation is contentious and subject to legal and ethical restrictions that vary throughout the world. Most experiments are carried out on rodents, although some studies are still performed on larger animals. Several toxins, including streptozotocin and alloxan, induce hyperglycaemia in rats and mice. Selective inbreeding has produced several strains of animal that are considered reasonable models of Type 1 diabetes, Type 2 diabetes and related phenotypes such as obesity and insulin resistance. Apart from their use in studying the pathogenesis of the disease and its complications, all new treatments for diabetes, including islet cell transplantation and preventative strategies, are initially investigated in animals. In recent years, molecular biological techniques have produced a large number of new animal models for the study of diabetes, including knock-in, generalized knock-out and tissue-specific knockout mice.",
+      "  Animal models of Type 2 diabetes mellitus",
+      "  As with the KK mouse, the Israeli sand rat model is particularly useful when studying the effects of diet and exercise [120] on the development of Type 2 diabetes.",
+      "  Animal models of Type 1 diabetes",
+      " Animal models have been used extensively in diabetes research. Early studies used pancreatectomised dogs to confirm the central role of the pancreas in glucose homeostasis, culminating in the discovery and purification of insulin. Today, animal experimentation is contentious and subject to legal and ethical restrictions that vary throughout the world. Most experiments are carried out on rodents, although some studies are still performed on larger animals. Several toxins, including streptozotocin and alloxan, induce hyperglycaemia in rats and mice. Selective inbreeding has produced several strains of animal that are considered reasonable models of Type 1 diabetes, Type 2 diabetes and related phenotypes such as obesity and insulin resistance. Apart from their use in studying the pathogenesis of the disease and its complications, all new treatments for diabetes, including islet cell transplantation and preventative strategies, are initially investigated in animals. In recent years, molecular biological techniques have produced a large number of new animal models for the study of diabetes, including knock-in, generalized knock-out and tissue-specific knockout mice.",
+      "Rodent models of monogenic obesity and diabetes  Obesity and the consequent insulin resistance is a major harbinger of Type 2 diabetes mellitus in humans. Consequently, animal models of obesity have been used in an attempt to gain insights into the human condition. Some strains maintain euglycaemia by mounting a robust and persistent compensatory \u03b2-cell response, matching the insulin resistance with hyperinsulinaemia. The ob/ob mouse and fa/fa rats are good examples of this phenomenon. Others, such as the db/db mouse and Psammomys obesus (discussed later) rapidly develop hyperglycaemia as their \u03b2-cells are unable to maintain the high levels of insulin secretion required throughout life. Investigation of these different animal models may help explain why some humans with morbid obesity never develop Type 2 diabetes whilst others become hyperglycaemic at relatively modest levels of insulin resistance and obesity.",
+      "Introduction  Animal experimentation has a long history in the field of diabetes research. The aim of this article is to review the commonly used animal models and discuss the recent technological advances that are being employed in the discipline. The review is based on an extensive literature search using the terms rodent, mouse, rat, animal model, transgenics, knockout, diabetes and pathogenesis, in scientific journal databases such as MEDLINE\u00ae. In addition, abstracts presented at meetings of Diabetes UK, the European Association for the Study of Diabetes and the American Diabetes Association over the last 5 years were examined in order to gain an appreciation of recent and ongoing research projects.",
+      "Assessment of Diabetes  Mice were monitored for the development of diabetes as described previously (Wicker et al. 1994)."
+    ],
+    [
+      "  FIG. 6. Hepatic steatosis during DIO is associated with loss of eAT mass. A: Liver weight (adjusted for body weight) of mice fed a HF diet for 1, 4, 8, 12, 16, and 20 weeks. B: Inverse association of eAT mass and liver weight (as in A) between DIO weeks 12 and 20. C: Representative micrographs of hematoxylin and eosin-stained liver sections demonstrating that hepatic macrosteatosis in HF-fed mice is initially evident at DIO week 12 and increases through week 20.",
+      "  RESEARCH DESIGN AND METHODS: Male C57BL/6 mice were fed a high-fat diet for 20 weeks to induce obesity. Every 4 weeks, insulin resistance was assessed by intraperitoneal insulin tolerance tests, and epididymal (eAT) and inguinal subcutaneous AT (iAT) and livers were harvested for histological, immunohistochemical, and gene expression analyses.",
+      "BXD and HMDP mouse strains, as well as HXB/BXH rat strains, with higher Cd36 expression had increased fat mass and body weight, as well as decreased VO2 and liver acid beta-glucosidase activity (Figure S2.4B-C), confirming the involvement of Cd36 in metabolism [126] and suggesting a potential role in Gaucher's disease, which results from the deficiency of acid beta-glucosidase [127]. An association between Abca8a liver transcripts and triglyceride levels was also revealed (Figure S2.4D).",
+      "The mice were sacrificed at 9 am after a 4-hour fast. (A-E) PARPi reduced body weight (A; *, #, and $ indicate significant differences between HFHS and CD, HFHS and PARPi-Prev, and HFHS and PARPi-Ther, respectively), liver weight (B), epididymal fat pad (C), liver triglyceride content (D), and cholesterol (E) in both preventive and therapeutic cohorts (n=8-10). (F,G) Representative images of livers (F) and liver sections stained with H&E and Oil Red O (lipid content appears in red) (G), (n= 4-5).",
+      "CD45 positive cells appear brown. (n=4). * P <0.05; ** P < 0.001; *** P< 0.0001. Data are expressed as the mean \u00b1 SEM. One-way ANOVA with a post-hoc Bonferroni test was used for all statistical analyses. Male mice were used in these experiments. Fig. 5. Liver damage in MCD diet-induced NAFLD was reversed by NAD+ repletion. C57BL/6J mice were fed with CD, MCD, or MCD+PARPi (PARPi, 50 mg/kg/day). The mice were sacrificed at 9 am after a 4-hour fast. (A) PARPi reduces global protein PARylation and (B) recovers NAD+ levels in liver tissue (n=6).",
+      "At 10 weeks of age, male C57BL/6J mice were challenged with an MCD diet for 5 weeks. Similar to the effects seen in mice on a HFHS diet, MCD-fed mice treated with PARPi in a preventive manner exhibited reduced PARylation and increased hepatic NAD+ levels (Fig. 5A and B). Mice fed with a MCD diet for 5 weeks showed classical pathophysiological characteristics of NAFLD, including hepatic steatosis, inflammation and fibrosis. MCD diet increased AST and ALT levels compared to a control diet, while PARPi treatment reduced their levels (Fig. 5C and D).",
+      "  The left inguinal, gonadal, and retroperitoneal fat pads were dissected and weighed individually. (Prior data showed that weights of left and right fat pads are highly correlated.) The mesenteric fat pad was also dissected and weighed. An adiposity index (AI) was computed for each mouse as follows: the left inguinal, gonadal, and retroperitoneal fat pad weights were summed, doubled, added to mesenteric fat pad weight, divided by body weight, and multiplied by 100. The ratios of the individual fat pad weights divided by body weight and expressed as a percentage (for example, 200\u00d7 left gonadal fat pad weight/body weight) were analyzed as separate traits, as were blood glucose level, plasma leptin level (log10 transformed), body weight, and body length.",
+      "Metabolic phenotypes were compared between mice in the upper (Lonp1-high) and lower (Lonp1-low) quartiles with respect to WAT Lonp1 expression (n=9\u201310 mice per Copyright \u00a9 2021 Korean Endocrine Society  VAT mRNA levels of OXPHOS-complex and UPRmt genes in relation to BMI Among 48 patients, 11 were obese (\u226525 kg/m2), 11 were overweight (23 to 24.9 kg/m2), and 26 were of normal or underweight (<22.9 kg/m2), according to the World Health Organization Asia-Pacific Obesity Classification [16]. Clinical characteristics of the participants stratified by BMI (<23 kg/m2 vs. \u226523 kg/m2) are summarized in Table 1.",
+      "In an F2 cohort derived from these parental strains, we have shown that the range of blood glucose, insulin levels, and body weight exceeds that of either the C57BL/6 (B6) leptinob/ob or BTBR leptinob/ob parental strains. We went on to identify several diabetes-related QTL in this F2 sample [21,22]. In the current study, we focused on a subset of 60 F2 mice that have previously been evaluated in detail with regard to liver gene expression profiles [24] to ask if the abundances of hepatic metabolic intermediates would show sufficient heritability to enable us to map metabolic QTL (mQTL).",
+      "(E\u2013G) Data from CTB6F2 (E) and HMDP (F) mouse cohorts, and the HXB/BXH rat cohort (G) indicate significant negative correlations between liver Rpl26 levels and body weight, and other metabolic traits. adipose tissue (subWAT) mass (Figure 2D), suggesting pleiotropic effects of Pten. The links between Pten and neurobiological and metabolic phenotypes have been confirmed by independent studies (Kwon et al., 2006; Ortega-Molina et al., 2012). Overall, PheWAS showed that 4,230 out of 11,548 genes were associated with at least one phenotypic trait and all genes had significant associated molecular traits after phenome-wide correction (Figures 2E; Table S3).",
+      "Curves of weight and blood sugar concentration with age in a less typical diabetic mouse. Aside from the large accumulations of fat, subcutaneously in axillary and inguinal regions and intraabdominally in mesenteric and gonadal fat pads, the most striking anatomical deviation is the size of the liver. The liver may weigh up to 4.5 grams in a 40 gram mouse, compared with 1.2 grams in a 20 gram normal mouse.",
+      "In mice, within hours after the last meal, the organs respond with changes in gene expression mainly in general metabolism (70). The role of the liver is to provide energy for glucose-dependent tissues, by glycogenolysis, gluconeogenesis, ketogenesis, and fatty-acid \u03b2-oxidation (71). The basic architecture of the lobules and the zonation are not affected, but the cell size declines in prolonged fasting, when murine liver restores partly its glycogen deposits, and much of gene expression returns to control values (72). In Abcb4-/- mice, collagens, fibronectin and vimentin, responsible for the structural integrity of the ECM, were strongly affected by fasting.",
+      "James SJ, Muskhelishvili L. Rates of apoptosis and proliferation vary with caloric intake and may influence incidence of spontaneous hepatoma in C57BL/6 x C3H F1 mice. Cancer Res 1994 Nov 1;54(21):5508-5510. 50. Hakvoort TB, Moerland PD, Frijters R, Sokolovic A, Labruyere WT, Vermeulen JL, et al. Interorgan coordination of the murine adaptive response to fasting. J Biol Chem 2011 May 6;286(18):16332-16343. 51. Lin S, Saxena NK, Ding X, Stein LL, Anania FA. Leptin increases tissue inhibitor of metalloproteinase I (TIMP-1) gene expression by a specificity protein 1/signal transducer and activator of transcription 3 mechanism. Mol Endocrinol 2006 Dec;20(12):3376-3388. 52.",
+      "  Characterization of lean and obese control and mGHRKO mice",
+      "  Consistent with the broad up-regulation of genes associated with fatty acid synthesis (Table 1), Oil Red O staining of liver sections from 15-d-old pups and naturally aged mice revealed enhanced accumulation of triacylglycerides in both compared to control littermates and 8-wk-old mice (Figure 7C), indicating hepatic steatosis. This and the absence of adipose tissue suggest that Csb m/m /Xpa -/- mice display generalized lipodystrophy (loss and abnormal redistribution of body fat) [31]., and Csb m/m /Xpa -/- mice (n = 6). The levels of IGF1 (ng/ml) and glucose (mmol/l) in the serum of Csb m/m /Xpa -/- mice are significantly lower than that of control littermates (p < 0.0004 and p < 0.04, respectively). (C) PAS staining for glycogen and Oil Red O staining for triglycerides in livers of 15-d-old wt and Csb m/m /Xpa -/- mice and 96-wk-old wt mice. Pictures were taken at 100\u00d7 magnification. Note the large polyploid nuclei in the 96-wk-old wt mouse liver and the reduced glycogen levels in the Csb m/m /Xpa -/- liver after overnight fasting. doi:10.1371/journal.pbio.0050002.g007",
+      "Association between lifespan and metabolic organ weights We measured weight of certain metabolic organs and tissues of a subsample of cases on both diets at ~500 days of age. HFD mice (n = 63) had 84% greater fat mass, 25% greater heart mass, 19% greater liver mass, and 18% greater kidney mass at ~500 days compared to controls (n = 71). However, HFD did not influence brain mass (Supplemental Table).",
+      "  Young adult dwarf mice have more body fat than normal mice. But, with age, normal mice from this line accumulate fat at a higher rate, and the percent body fat in old DF mice does not differ from that of normal mice, as measured by dual energy X-ray absorptiometry (DEXA) (29). Downregulation of lipid biosynthetic genes and upregulation of \u03b2-oxidation-related genes in the liver of DF mice may explain this slower rate of fat deposition.",
+      "(b) Serum levels of liver injury markers, triglyceride, and cholesterol profiles of 20-month-old WT (n = 6) and Gdf15 KO (n = 6) mice. (c) Serum levels of pro-inflammatory cytokines of 20-month-old WT (n = 6) and Gdf15 KO (n = 6) mice. (d) H&E staining for liver tissues of 20-month-old WT (n = 6) and Gdf15 KO (n = 6) mice. Scale bar, 200 \u03bcm. Arrows indicate fat accumulation. (e) Fixed adipose tissue from 20-month-old WT (n = 6) and Gdf15 KO (n = 6) mice was stained for F4/80 antibodies. Scale bar, 200 \u03bcm.",
+      "(12) studied liver gene expression changes in Stat5b knockout and wild-type mice, finding 1,603 differentially regulated genes, with 850 being male- and 753 female-biased (P < 0.05 and FC > 1.5). A large study consisting of 344 mice comprising an F2 cross between C57B/6J.apoE-/- and C3H/HeJ.apoE-/- strains (~50% from each sex) produced two reports (57, 61) that examined sexually dimorphic gene expression in adipose tissue, brain, liver, and muscle. It was reported that 9,250 genes are dimorphic in the liver (P < 0.01 and FC > 1).",
+      "2006) studied liver gene expression changes in Stat5b knockout and wild type mice, finding 1,603 differentially regulated genes, with 850 being male- and 753 female-biased (p<0.05 and FC>1.5). A large study consisting of 344 mice comprising an F2 cross between C57B/6J.apoE-/- and C3H/HeJ.apoE-/- strains (~50% from each sex) produced two reports (Wang et al. 2006; Yang et al. 2006) which examined sexually dimorphic gene expression in adipose tissue, brain, liver and muscle. It was reported that 9,250 genes are dimorphic in the liver (p<0.01 and FC>1)."
+    ],
+    [
+      "However, when the data were adjusted for brain weight, there was a significant (p = 0.008) difference between DBA/2J and C57BL/6J (2.14 \u00b1 0.06 mm2 and 1.96 \u00b1 0.03 mm2, respectively) making the DBA/2J larger by 8.50%. Total brain weight of DBA/2J animals was significantly (p < 0.0001) smaller than that of C57BL/6J animals (0.35 \u00b1 0.01 g and 0.42 \u00b1 0.01 g, respectively).",
+      "Phenotypes are often very different between mouse strains with diverse genetic backgrounds and the strain characteristics of DBA/2J are often contrasted with other genetically distinct inbred strains such as C57BL/6J. These defined genetic backgrounds provide an excellent system for mapping modifier genes [20,21,22]. To study these differences, a number of DBA/2J-relevant resources have been generated. For instance, a genome-wide panel of congenic strains has been created that contain portions of DBA/2J chromosomes on a C57BL/6J background [23]. These 65 strains contain more than 95% of the DBA/2J genome.",
+      "Well-documented behavioral differences between C57 and DBA, including enhanced closed-arm preference and deficits in conditional fear, were observed. This suggests at a minimum that the influence of previous testing in the two parental strains was comparable. The use of DBA/2J donor segments for the GTM panel may have implications for loci identified in tests involving auditory stimuli, as this strain is known to undergo progressive hearing loss with age. While no rigorous examination of hearing capacity in the GTM has been conducted, inspection of time course data for individual mice in both the general  Mol Psychiatry.",
+      "Particularly striking is the difference in their locomotor response: the C57BL/6J strain shows a marked locomotor activation following an acute opiate administration, which is virtually absent in DBA/2 mice [6, 25, 29]. After chronic morphine treatment, either tolerance or sensitization of the locomotor response was evidenced in C57BL/6J mice, depending on the treatment paradigm, whereas no altered responses were observed in the DBA/2J strain [1, 22, 29, 31]. Other inter-strain differences in reactions to opioids have also been reported, including a greater sensitivity to opioid reward and stronger withdrawal symptoms in the C57BL/6J strain [2, 6, 17, 30, 35].",
+      "Although no differences in attentional performance were detected between C57BL/6J and DBA/2J, in line with previous reports in the 5-CSRTT and five-choice CPT (Loos et al. 2010; Young et al. 2009), we observed significant differences among BXD recombinant inbred strains that transgressed beyond the phenotypes of the founders. This suggested the contribution of multiple genetic loci to these phenotypes, of which we detected a significant one on chromosome 16 for response variability.",
+      "Given the large differences that we found previously (Crusio 2013) between C57BL/6 and DBA/2, this is unexpected. One possible explanation for the lower than expected performance of the C57BL/6 and (at least some) BXD strains lies in the housing conditions. Our animal facility was built to house about 500 cages in one large breeding room. However, the cage-washing installation (and the available personnel) could not handle that many cages at a time. As a result, every day one or two racks of cages were changed. C57BL/6 mice are sensitive to such disruptions and, indeed, breeding results were only mediocre.",
+      "C57BL/6 and DBA/2 mice is not yet fully understood but involves multiple genetic differences between the two mouse lineages, affecting several pathways and processes (1). Certain influenza viruses grow to higher titers in DBA/2 mice (A/Hong Kong/213/2003 [H5N1] or A/Memphis/33/2008 [H1N1]) (data not shown) while others do not (H7N3 and H10N5) (this study). Irrespective of the difference in viral loads, DBA/2 mice respond more vigorously, producing larger quantities of certain proinflammatory molecules like TNF-\u03b1, which was shown to correlate with increased morbidity and mortality in humans (5).",
+      "Additionally, in this protocol the strains DBA/2J, A/J, NOD/ShiLtJ, C57BL/10J, SM/J, and C57BR/cdJ are AA sensitive; the strains CAST/EiJ and BTBR T+ tf/J are resistant; and the strains NZW/LacJ, KK/HlJ, and SWR/J have intermediate resistance to AA-induced acute nephrotoxicity (supplementary data; all supplementary material for this article is available online at the journal web site.). For this QTL study, C57BL/6J and DBA/2J mice were used as resistant and sensitive strains, respectively. Each strain has a complete genomic sequence available, and the genetic basis of differences in their ability to respond to xenobiotics is extensively studied (reviewed in Ref. 8).",
+      "The C57BL/6J X DBA/2J (BXD) recombinant inbred (RI) mouse strains, which are unique mosaics of alleles derived from the parental C57BL/6J (B6) and DBA/2J (D2) strains, have been constructed as a high precision genetic reference population for systems genetics in unraveling the genetic architecture of polygenic traits (Ashbrook et al., 2019). The BXD family consists of more than 150 BXD fully inbred strains that segregate for \u223c6 million genetic variants and thus can be used as an informative murine genetic reference panel.",
+      "Because we have now shown that the parental strains C57BL/6J and DBA/2J markedly differ in both quantitative measures of cortex area size [6] and shape, this assures variation in the derivative BXD lines, and provides an empirical basis for using the BXD panel to study cortical development. Conclusion C57BL/6J and DBA/2J have markedly different cortical area maps, in both size and shape. These differences suggest polymorphism in genetic factors underlying cortical specification, even between common isogenic strains. Comparing cortical phenotypes between normally varying inbred mice or between genetically modified mice can identify genetic contributions to cortical specification.",
+      "The C57BL/6 mice were more accurate than DBA/2 mice at the shorter SD where the task demands were greater, and they also made anticipatory (impulsive) responses at a lower rate. In contrast, the DBA/2 mice made fewer omission errors than the C57BL/6 but this effect was not seen until the final stages of the experimental procedures. These findings are in agreement with those of Greco et al. [18]. Although they used different breeders as well as different test chambers, training protocols and reinforcers, the results were similar: DBA/2 males were less accurate and made more anticipatory responses than C57BL/6 males.",
+      "DBA/2 mice perform poorly in other spatial tasks as well as in the 5-CSRTT (see Section 1) but this is by no means true for paradigms that are less spatially demanding. For instance, in the four-arm baited and cued versions of the radial maze, as well as in auditory fear conditioning, C57BL/6 and DBA/2 do not differ [1,30]; DBA/2 mice even perform better than C57BL/6 with regard to two-way active avoidance learning [37].",
+      "While the factorial structure of C57BL/6 mice remained the same as under low attentional demands (two factors), there was only one factor for DBA/2 mice. This factor was characterised by high positive loadings (>0.78) from the percent of correct responses and omission errors, and a high negative loading (0.87) from anticipatory responses. 4. Discussion The results indicated that both C57BL/6 and DBA/2 mice were able to learn the complex 5-CSRTT task but there were considerable quantitative differences in their performance.",
+      "It can be seen that at all SD, accuracy was greater for C57BL/6 than for DBA/2 mice. The clearest difference was at 1 s SD where C57BL/6 mice were responding at a mean accuracy of 80% compared with the DBA/2 group for which the mean was 59% (Fig. 1(A)). With a SD of 5 s there was no significant main effect for group (F1,28 = 3.13), whereas at 2 and 1 s SD significant group effects were achieved (F1,28 = 5.44 and 25.1; P < 0.05 and 0.001, respectively).",
+      "In marked contrast, the C57BL/6J strain was found to have the highest level of oral morphine consumption [6]. However, sensitivity to the reinforcing effects of morphine in conditioned place preference and intravenous self-administration paradigms was higher in DBA mice than in C57BL [10]. The two frequently used laboratory strains of mice C57BL/6J and DBA/2J show remarkable differences in analgesic response to morphine. Moreover, several studies have reported profound differences in morphine induced locomotor activity between the sensitive C57BL/6 and insensitive DBA/2 mice [3,7].",
+      ", increased exploration of the open areas) in both tests. One explanation is that DBA/2J is \u201csusceptible\u201d to this stressor, whereas C57BL/6J is \u201cresilient.\u201d However, a more circumscribed but potentially more accurate interpretation is that both strains react strongly to this particular stress regime, but differ in the manner in which the response manifests behaviorally. Thus, DBA/2J may develop a classic \u201cpassive\u201d anxiety-like suppression of approach behavior, whereas C57BL/6J may exhibit more of an \u201cactive\u201d response to stress. This could reflect an increased panic-like escape drive or manic-like reaction to stress in C57BL/6J, rather than a decrease in anxiety-like behavior.",
+      "Differences in radiation sensitivity between the BXD parental strains were first described by Roderick more than 45 years ago, with DBA/2J succumbing more quickly than C57BL/6J to a lethal dose of radiation (26). At more modest doses, C57BL/6J mice were shown to be more resistant to radiation-induced genomic instability than DBA/2J (38, 84, 85).",
+      "Genetic differences between C57 and DBA mice have been shown to translate into a broad spectrum of CNS related functional and molecular correlates, for example, differences in activity, impulsive action, hippocampal related memory and learning tasks, post- and pre-synaptic protein expression, and synaptic transmission and plasticity [27\u201340]. Through genetic linkage analyses, the genetic and phenotypic differences in the BXD panel of RI strains have resulted in identification of genes and loci involved in complex CNS functions, such as impulsivity [41], reversal learning [42], attention [43], neuronal oscillations [44], hearing loss [45], and fear and spatial learning [39,40].",
+      "For example, the C57BL/6J (B6) and DBA/2J (D2) inbred mice frequently are used in alcohol research because they clearly differ in various responses to alcohol, including development of functional tolerance (Grieve and Littleton 1979), locomotor activation (Phillips et al. 1998), and sensitivity to withdrawal symptoms (Metten and Crabbe 1994). Because the environmental conditions in these experiments can be controlled, any differences observed between the mouse strains in these phenotypes most likely can be attributed to genetic differences.",
+      "For example, when subjected to HFD, DBA/2J had 12.5% more body fat compared to C57BL/6J (P < 0.0001, Fig 1A). Additionally, the F1 offspring generated by DBA/2J dams (DBA/2J x C57BL/6J) had 10.6% more body fat (P < 0.001) compared to the F1 from C57BL/6J dams (C57BL/6J x DBA/2J). While the source of these latter effects appears to be maternal, further studies are needed to identify the molecular basis of these differences. In general, genetic differences between strains impacted body weight variation throughout the experiment (P < 0.05) (Fig 1B)."
+    ],
+    [
+      "  Quantitative trait locus (QTL) mapping has been carried out in numerous species to associate regions of the genome to phenotypes even before the structure of the genome was well understood (e.g., [3]). Rodents, especially mice, have been the species most prominently used for biomedically relevant traits. Amongst these, the BXD family of recombinant inbred (RI) strains derived from crossing two inbred strains (C57BL/6J and DBA/2J mice) have been extensively used for almost 50 years in fields such as neuropharmacology [4][5][6], immunology [7][8][9][10][11][12][13], behaviour [13][14][15][16][17][18][19][20][21], aging [21][22][23][24][25][26][27][28][29], neurodegeneration [30][31][32][33], and gut microbiome-host interactions [34].",
+      "Milhaud JM, Halley H, Lassalle JM (2002) Two QTLs located on chromosomes 1 and 5 modulate different aspects of the performance of mice of the BXD Ty RI strain series in the Morris navigation task. Behav Genet 32: 69\u201378. 16. Buck KJ, Rademacher BS, Metten P, Crabbe JC (2002) Mapping murine loci for physical dependence on ethanol. Psychopharmacology (Berl) 160: 398\u2013407. 17. Ferraro TN, Golden GT, Smith GG, Schork NJ, St Jean P, et al. (1997) Mapping murine loci for seizure response to kainic acid. Mamm Genome 8: 200\u2013208. 18.",
+      "Other aggression QTLs Several lines of mice have been selectively bred for high or low levels of offensive aggression, which confirms that a propensity for aggressive behaviours is partially heritable. These lines include the Turku aggressive (TA) and non-aggressive (TNS) strains bred in Finland, the NC900 and NC100 strains bred in North Carolina, and the short attack latency (SAL) and long attack latency (LAL) strains bred in the Netherlands (Miczek et al 2001). In wild mice, there is evidence for a QTL affecting aggressive behaviours in a region of chromosome 17, the t region.",
+      "QTL ANALYSIS OF AGGRESSIVE BEHAVIOURS IN MICE. Progress towards identifying QTLs that affect aggressive behaviours in mice. An example of aggression QTLs identified as part of a whole genome scan. One of the few studies to identify intermale aggression QTLs as part of a whole genome scan was published recently (Brodkin et al 2002). This study used NZB/BlNJ (extremely aggressive) and A/J (extremely unaggressive) inbred mice as parental strains. The methods chosen for housing and aggression testing were designed to control the effect of non-genetic factors on the phenotype.",
+      "Neuroscientist 4:317\u2013323 Brodkin ES, Goforth SA, Keene AH, Fossella JA, Silver LM 2002 Identification of quantitative trait loci that affect aggressive behavior in mice. J Neurosci 22:1165\u20131170 Chesler EJ, Lu L, Wang J, Williams RW, Manly KF 2004 WebQTL: rapid exploratory analysis of gene expression and genetic networks for brain and behavior. Nat Neurosci 7:485\u2013486 Darvasi A 1997 Interval-specific congenic strains (ISCS): an experimental design for mapping a QTL into a 1-centimorgan interval. Mamm Genome 8:163\u2013167 Darvasi A 1998 Experimental strategies for the genetic dissection of complex traits in animal models.",
+      "Brodkin: Such a coarse mapping study with only about 400 mice would be unlikely to detect a QTL that accounts for only 2.5% of the phenotypic variance, but it should detect a QTL that accounts for approximately 10% of the variance (Lynch & Walsh 1998, Darvasi 1998). QTLs of this magnitude of effect on neurobiological or behavioural traits have been found fairly commonly in crosses between inbred mouse strains (see e.g. Wehner et al 1997).",
+      "By correlating genotypes with phenotypes in quantitative trait locus (QTL) analysis, a large number of polymorphic regions harboring trait relevant allelic variation have been defined for a wide range of behavioral phenotypes [17]. At present, there are 340  549 QTLs for behavioral phenotypes in the Mouse Genome Informatics database, which are largely derived from crosses of 2 inbred strains of mice [18].",
+      "A search of the Mouse Genome Informatics database (www.informatics.jax.org, March 16, 2006) revealed 34 neurobehavioral- and/or pain-related QTLs mapped to >75 cM; these include seven traits related to alcohol, six to morphine or other drugs, two to painful arthritis, five to emotionality/anxiety, and one to seizure susceptibility. Several of these QTLs have been finely mapped near the peak of linkage of our analgesia QTL.",
+      "The behavioral QTLs were determined from the MGI database as of October 1, 2004. Alcrsp2 (Erwin et al., 1997); Ap3q (Bachmanov et al., 2002); Alcp12 (Gill et al., 1998). Behavioral QTLs have been mapped using other mouse strains, and their validity in the ILS and ISS strains has not been tested. Mb, megabases. Table 4.",
+      "In the fourth step, we sought to identify DNA sequence variants that influence both molecular phenotypes as well as phenotypes at the structural and behavioral level. A remarkable region located on the distal end of mouse Chr 1 (172\u2013178 Mb) was the ideal subject for such an integrative study. This region, which we have named as Qrr1 (QTL rich region on distal Chr 1), is known for its unusually high density of QTLs for neural and behavioral traits, e.g., traits like anxiety-related behavior, seizure, hippocampal volume, and alcohol preference consistently map to this region.",
+      "Overall, these studies reveal the existence of an extensive polygenic system influencing the exploratory behavior of mice similar to the kind of genetic architecture shown to influence behavior in tests of fear and anxiety (Caldarone et al. 1997; Flint et al. 1995; Gill & Boyle 2005; Henderson et al. 2004; Laarakker et al. 2008; Singer et al. 2005; Turri et al. 2001a,b). The significance of the QTL, and also of the polygenic system, is heightened by the finding that roughly the same set of genes has the potential to influence some behaviors from early adulthood to old age.",
+      "The behavioral phenotypes with QTLs on distal Chr 17 are (1) prepulse inhibition, assayed by McCaughran et al.41 in a panel of 21 BXD strains (trait ID on GeneNetwork is 10396), (2) anxiety trait measured by time spent in open quadrant of zero-maze, assayed in a larger panel of 57 BXD strains42 (trait ID 11696) and (3) handling induced convulsion as an index of ethanol withdrawal severity, measured in 25 BXD strains43 (trait ID 10065). Gene\u2013gene interaction analysis.",
+      "Quantitative trait locus (QTL) mapping has been carried out in numerous species to associate regions of the genome to phenotypes even before the structure of the genome was well understood (e.g. , [3]). Rodents, especially mice, have been the species most prominently used for biomedically relevant traits. Amongst these, the BXD family of recombinant inbred (RI) strains derived from crossing two inbred strains\u2014C57BL/6J and DBA/2J mice\u2014have been extensively used for almost 50 years in fields such as neuropharmacology [4\u20136], immunology [7\u201313], behaviour [13\u201321], aging [21\u201329], neurodegeneration [30\u201333], and gut microbiome\u2013host interactions [34].",
+      "Two QTLs located on chromosomes 1 and 5 modulate different aspects of the performance of mice of the BXD Ty RI strain series in the Morris navigation task. Behav Genet. 2002; 32:69\u201378. [PubMed: 11958544] Mozhui RT, Ciobanu DC, Schikorski T, Wang XS, Lu L, Williams RW. Dissection of a QTL hotspot on mouse distal chromosome 1 that modulates neurobehavioral phenotypes and gene expression. PLoS Genetics. 2008; 4:e1000260. [PubMed: 19008955] Mulligan MK, Wang X, Adler AL, Mozhui K, Lu L, Williams RW. Complex control of GABA(A) receptor subunit mRNA expression: variation, covariation, and genetic regulation. PLoS One. 2012; 7(4):e34586.",
+      "Type I and type II error rates for quantitative trait loci (QTL) mapping studies using recombinant inbred mouse strains. Behav Genet, 26(2): 149-160. Bidwell, L. C., Willcutt, E. G., Defries, J. C., & Pennington, B. F. 2007. Testing for neuropsychological endophenotypes in siblings discordant for attention-deficit/hyperactivity disorder. Biol Psychiatry, 62(9): 991-998. Bitanihirwe, B. K., Dubroqua, S., Singer, P., Feldon, J., & Yee, B. K. 2011. Sensorimotor gating and vigilance-dependent choice accuracy: a within-subject correlative analysis in wild-type C57BL/6 mice. Behav Brain Res, 217(1): 178-187. Bitsios, P., & Giakoumaki, S. G. 2005.",
+      "Quantitative trait locus (QTL) mapping has been carried out in numerous species to associate regions of the genome to phenotypes even before the structure of the genome was well understood (e.g. , [3]). Rodents, especially mice, have been the species most prominently used for biomedically relevant traits. Amongst these, the BXD family of recombinant inbred (RI) strains derived from crossing two inbred strains\u2014C57BL/6J and DBA/2J mice\u2014have been extensively used for almost 50 years in fields such as neuropharmacology [4\u20136], immunology [7\u201313], behaviour [13\u201321], aging [21\u201329], neurodegeneration [30\u201333], and gut microbiome\u2013host interactions [34].",
+      "Other aggression QTLs Several lines of mice have been selectively bred for high or low levels of offensive aggression, which confirms that a propensity for aggressive behaviours is partially heritable. These lines include the Turku aggressive (TA) and non-aggressive (TNS) strains bred in Finland, the NC900 and NC100 strains bred in North Carolina, and the short attack latency (SAL) and long attack latency (LAL) strains bred in the Netherlands (Miczek et al 2001). In wild mice, there is evidence for a QTL affecting aggressive behaviours in a region of chromosome 17, the t region.",
+      "Progress towards identifying QTLs that affect aggressive behaviours in mice. An example of aggression QTLs identified as part of a whole genome scan. One of the few studies to identify intermale aggression QTLs as part of a whole genome scan was published recently (Brodkin et al 2002). This study used NZB/BlNJ (extremely aggressive) and A/J (extremely unaggressive) inbred mice as parental strains. The methods chosen for housing and aggression testing were designed to control the effect of non-genetic factors on the phenotype.",
+      "Neuroscientist 4:317\u2013323 Brodkin ES, Goforth SA, Keene AH, Fossella JA, Silver LM 2002 Identification of quantitative trait loci that affect aggressive behavior in mice. J Neurosci 22:1165\u20131170 Chesler EJ, Lu L, Wang J, Williams RW, Manly KF 2004 WebQTL: rapid exploratory analysis of gene expression and genetic networks for brain and behavior. Nat Neurosci 7:485\u2013486 Darvasi A 1997 Interval-specific congenic strains (ISCS): an experimental design for mapping a QTL into a 1-centimorgan interval. Mamm Genome 8:163\u2013167 Darvasi A 1998 Experimental strategies for the genetic dissection of complex traits in animal models.",
+      "Brodkin: Such a coarse mapping study with only about 400 mice would be unlikely to detect a QTL that accounts for only 2.5% of the phenotypic variance, but it should detect a QTL that accounts for approximately 10% of the variance (Lynch & Walsh 1998, Darvasi 1998). QTLs of this magnitude of effect on neurobiological or behavioural traits have been found fairly commonly in crosses between inbred mouse strains (see e.g. Wehner et al 1997)."
+    ],
+    [
+      "Other cell cycle-related genes, such as p21, p18 and p27, were also reported to be involved in regulating different types of hematopoietic cells (Cheng 2004; Steinman 2002). For example, p21 and p18 specifically control HSC proliferation, whereas p27 only affects hematopoietic progenitor cells. Further study of the chromosome 3 QTL interval in the congenic mouse model may provide a platform leading to the discovery of novel cycle-active genes and/or functions of already known genes. The apoptotic analyses shown in Table 3.2 are novel.",
+      "Bystrykh L, Weersing E, Dontje B, Sutton S, Pletcher MT, Wiltshire T, Su AI, Vellenga E, Wang J, Manly KF, Lu L, Chesler EJ, Alberts R, Jansen RC, Williams RW, Cooke MP, de Haan G: Uncovering regulatory pathways that affect hematopoietic stem cell function using \u2018genetical genomics\u2019. Nat Genet 2005, 37(3):225-32. 29. Overall RW, Kempermann G, Peirce J, Lu L, Goldowitz D, Gage FH, Goodwin S, Smit AB, Airey DC, Rosen GD, Schalkwyk LC, Sutter TR, Nowakowski RS, Whatley S, Williams RW: Genetics of the hippocampal transcriptome in mouse: a systematic survey and online neurogenomics resource.",
+      "In summary, I have identified p107 and Snx5 as quantitative trait genes that regulate the number of HSCs in B6 and congenic mice. CAFC assays confirmed that increased expression of both genes increases HSC number in an in vitro setting. Although the increased expression of both Snx5 and p107 resulted in small increases in HSC number, the changes are biologically significant given the extensive proliferative potential of primitive stem cells.",
+      "The molecular mechanisms that regulate progenitor cell division and differentiation in the RMS remain largely unknown. Here, we surveyed the mouse genome in an unbiased manner to identify candidate gene loci that regulate proliferation in the adult RMS. We quantified neurogenesis in adult C57BL/6J and A/J mice and 27 recombinant inbred lines derived from those parental strains. We showed that the A/J RMS had greater numbers of bromodeoxyuridine-labeled cells than that of C57BL/6J mice with similar cell cycle parameters, indicating that the differences in the number of bromodeoxyuridine-positive cells reflected the number of proliferating cells between the strains.",
+      "Septin 9 (Sept9) and cyclin-dependent kinase 3 (cdk3) are two other genes that are worth mentioning because even though they are not directly linked to neurogenesis, they are both cell cycle regulatory genes. Sept9 is involved in the progression through G1 of the cell cycle and it is highly expressed throughout the adult mouse brain (Gonzalez et al., 2009). In contrast, cdk3 is expressed at low levels throughout the adult mouse brain and it is required for G1-S transition (Braun et al., 1998).",
+      "Bystrykh L, Weersing E, Dontje B, Sutton S, Pletcher MT, Wiltshire T et al. (2005). Uncovering regulatory pathways that affect hematopoietic stem cell function using \u2018genetical genomics\u2019. Nat Genet 37:225\u2013232. Cai L, Morrow EM, Cepko CL (2000). Misexpression of basic helix-loop-helix genes in the murine cerebral cortex affects cell fate choices and neuronal survival. Development 127:3021\u20133030. Caldarone B, Saavedra C, Tartaglia K, Wehner JM, Dudek BC, Flaherty L (1997). Quantitative trait loci analysis affecting contextual conditioning in mice. Nat Genet 17:335\u2013337. Calder AJ, Lawrence AD, Young AW (2001). Neuropsychology of fear and loathing. Nature Rev Neurosci 2:352\u2013363.",
+      "As a further step, this finding opens the door to study the molecular networks via which LRP6 acts to regulate proliferation. 5.2. Redox regulation of Adult Hippocampal Precursor Cells. 5.2.1. Hypoxia increases AHPCs proliferation and neuronal differentiation. Oxygen concentration plays an important role in cellular development and tissue homeostasis. In the brain, depending on the tissue, the oxygen concentration varies from 0.1 to 5% and in the rat hippocampus it is around 3.2% (Studer et al., 2000).",
+      "While this study covers only one part in the several conceptual levels of regulation we are confident that this work will lead to finding a central regulatory pathway that regulates adult hippocampal precursor cell proliferation. 5.1.1. Establishment of AHPCs. Isolating the precursor cells has become extremely important in order to study them in detail away from the influence of their in vivo niche. Once the cells are in culture they express their autonomous, intrinsic properties without the niche influences such as cell-cell contacts, blood vessels, known and unknown growth factors and network activities.",
+      "Gene expression profiling using RNA samples from proliferating cultures of the 20 BXD mice strains yielded two cis eQTL candidates that directly regulated proliferation, LRP6 and Chchd8. LRP6 is well known as a co-receptor of Wnt signaling, but the function of Chchd8 is not known. Further experimentation, using over-expression and gene silencing, demonstrated that LRP6 negatively regulates AHPCs proliferation. Thus, from this study using a system genetics approach, we were able to identify LRP6 as a novel regulator of adult hippocampal neurogenesis.",
+      "Gene expression profiling; 4.1.8. LRP6 is a novel regulator of AHPCs proliferation; 4.2. Redox regulation of Adult Hippocampal Precursor Cells; 4.2.1. AHPCs yield increased under hypoxic conditions; 4.2.2. More neuronal differentiation under hypoxic conditions; 5. DISCUSSION; 5.1. Systems genetic approach to identify genes regulating AHPCs proliferation; 5.1.1. Establishment of AHPCs; 5.1.2. Variation in proliferative and differentiative properties of AHPCs; 5.1.3. QTL analysis; 5.1.4. Candidate genes from gene expression profiling; 5.1.5. Lrp6 as negative regulator of AHPCs proliferation; 5.2. Redox regulation of Adult Hippocampal Precursor Cells; 5.2.1.",
+      "Mapping determinants of human gene expression by regional and genome-wide association. Nature 437, 1365-1369. Chiasson, B.J., Tropepe, V., Morshead, C.M., and van der Kooy, D. (1999). Adult mammalian forebrain ependymal and subependymal cells demonstrate proliferative potential, but only subependymal cells have neural stem cell characteristics. Journal of Neuroscience 19, 4462-4471. Cipolleschi, M.G., Dello Sbarba, P., and Olivotto, M. (1993). The role of hypoxia in the maintenance of hematopoietic stem cells. Blood 82, 2031-2037. Clarke, D.L., Johansson, C.B., Wilbertz, J., Veress, B., Nilsson, E., Karlstrom, H., Lendahl, U., and Frisen, J. (2000).",
+      "List of BXD AHPC lines stored. Table 3. List of eQTLs in 0.6 threshold range. Table 4. Cis acting genes regulating proliferation trait. Publications: A protocol for isolation and enriched monolayer cultivation of neural precursor cells from mouse dentate gyrus. Harish Babu*, Jan-Hendrik Claasen*, Suresh Kannan, Annette E. R\u00fcnker, Theo Palmer, Gerd Kempermann. Front. Neurosci. 5:89. doi: 10.3389/fnins.2011.00089. System genetics approach yields candidate genes regulating adult hippocampal precursor cells proliferation, Manuscript in preparation (first author paper). 1. SUMMARY Adult hippocampal neurogenesis is regulated at various levels and by various factors.",
+      "A recent study suggesting the role of mitochondria and cytochrome oxidase in enhancing hippocampal neurogenesis during inflammation (Voloboueva et al., 2010) may reveal the link for Chchd8 gene in adult neurogenesis. 5.1.5. Lrp6 as negative regulator of AHPCs proliferation. The results from our gene expression profiling suggest that high expression level of Lrp6 is associated with slow proliferating AHPCs and vice versa. We confirmed this result by over expressing LRP6 in AHPCs. This revealed that LRP6 over expression reduced the proliferation of AHPCs by more than 2-fold.",
+      "Two types of collagen and N-Cadherin were also in this pathway. The top upstream regulators of this gene set were Huntingtin (HTT) which regulates 32 of the 193 genes analyzed (p = 1.22 \u00d7 10^\u221215), and \u03b2-estradiol which may regulate 39 out of 193 genes in the set (p = 4.06 \u00d7 10^\u221210). 3.2.2. Genes regulated by ethanol in the NAC following CIE\u2014Three hundred seventy-eight probesets were exclusively altered by ethanol in the NAC only following CIE (Supplemental Fig. 2 and Table 5).",
+      "Expression of a subset of these neurogenesis-associated transcripts was controlled in cis across the BXD set. These self-modulating genes are particularly interesting candidates to control neurogenesis. Among these were musashi (Msi1h) and prominin1/CD133 (Prom1), both of which are linked to stem-cell maintenance and division. Twelve neurogenesis-associated transcripts had significant cis-acting quantitative trait loci, and, of these, six had plausible biological association with adult neurogenesis (Prom1, Ssbp2, Kcnq2, Ndufs2, Camk4, and Kcnj9). Only one cis-acting candidate was linked to both neurogenesis and gliogenesis, Rapgef6, a downstream target of ras signaling.",
+      "Other cell cycle-related genes, such as p21, p18 and p27, were also reported to be involved in regulating different types of hematopoietic cells (Cheng 2004; Steinman 2002). For example, p21 and p18 specifically control HSC proliferation, whereas p27 only affects hematopoietic progenitor cells. Further study of the chromosome 3 QTL interval in the congenic mouse model may provide a platform leading to the discovery of novel cycle-active genes and/or functions of already known genes. The apoptotic analyses shown in Table 3.2 are novel.",
+      "and Tgfbr3 (transforming growth factor beta receptor 3). Of the significant genes correlated with the hippocampal cell death phenotype, there were 107 genes that were significant for a strain \u00d7 treatment interaction. Four of these genes also showed an FC > 1.5: Gadd45g (growth arrest and DNA-damage-inducible, gamma), Kcnj13 (potassium inwardly rectifying channel, subfamily J, member 13), Plekhg1 (pleckstrin homology domain containing, family G (with RhoGef domain) member 1), and Sgms2 (sphingomyelin synthase 2).",
+      "Bystrykh, L., E. Weersing, et al. (2005). \"Uncovering regulatory pathways that affect hematopoietic stem cell function using 'genetical genomics'.\" Nat Genet 37(3): 225-32. Cashman, J., A. C. Eaves, et al. (1985). \"Regulated proliferation of primitive hematopoietic progenitor cells in long-term human marrow cultures.\" Blood 66: 1002-1005. Celeste, A., O. Fernandez-Capetillo, et al. (2003). \"Histone H2AX phosphorylation is dispensable for the initial recognition of DNA breaks.\" Nat Cell Biol 5(7): 675-9. Chen, J., B. A. Astle, et al. (1999). \"Development and aging of primitive hematopoietic stem cells in BALB/cBy mice.\" Exp. Hematol. 27: 928-935. Cheng, T., N. Rodrigues, et al.",
+      "The next category was Cellular Growth and Proliferation, which includes growth, proliferation, expansion and differentiation of cells and is also pertinent to the possible formation of new cells in this area of the hippocampus. 37 genes were associated with this function. Not surprisingly, in the Cell Cycle function (Supplementary Table 2) we found thirty genes involved in cell cycle progression indicating the activity of dividing cells in this region.",
+      "Lef1 is expressed in cultured hippocampal neural stem cells in response to activation of the Wnt signaling pathway (Cui et al., 2011). Our evidence and the literature both suggest that genes known to be involved in hippocampal adult neurogenesis are targets of Lef1, an important factor in generating granule cells in the dentate gyrus during development (Galceran et al., 2000). The only two genes not targeted by Lef1 can be closely associated with it: Mtdh regulates the expression of Lef1 (Hu et al., 2009; Yoo et al."
+    ],
+    [
+      "QTL Mapping and Identification of Candidate Genes A QTL is a region of the genome shown to be linked to a trait. The purpose of mapping this region is to identify a region of a genome that has a higher probability of harbouring the genetic variations controlling variability in trait values.",
+      "Often a local eQTL will be caused by allelic variation in the regulatory region of the gene or within the gene itself. mQTL A metabolite Quantitative Trait Locus is a region in the genome at which allelic variation correlates with the abundance variation of a certain metabolite. pQTL A protein Quantitative Trait Locus is a region in the genome at which allelic variation correlates with the abundance variation of a certain protein. Just like eQTL, pQTL can be local or distant according to the genomic position of the gene encoding for the protein relative to the QTL.",
+      "QTLs are regions within the genome whose genetic variation modulates quantitatively a phenotype characteristic of the particular trait under study (Lynch and Walsh, 1998). Determining the association between variations in specific disease phenotypes or a trait, with variations in genotypes of a reference population can be used to locate a QTL. One of the methods used for mapping QTLs associated with complex traits is genetic markers-trait association. Genetic markers associated with certain loci can be inherited in linkage disequilibrium. Generating populations with linked loci in disequilibrium is achieved through either crosses between inbred lines, or use of out-bred populations.",
+      "Quantitative trait locus-mapping is a statistical method used to map chromosomal intervals (loci) that contribute to heritable variance in phenotypes. The method simply compares the inheritance of allelic variants (B or D genotypes in our case) with differences in phenotypes. A QTL will generally cover a region that includes 10\u2013100 genes, and these positional candidates can then be ranked roughly on the basis of criteria such as the types of DNA variants, patterns of mRNA expression, data from complementary human genetic cohorts (GWAS and linkage) and relevant literature about gene effects on central nervous system structure and function.",
+      "Chromosomal regions containing a gene (or genes) that affect the level of a quantitative trait are called quantitative trait loci (QTLs). The relevant genes in these regions have been called quantitative trait genes (QTGs) (Hitzemann et al 2003). Quantitative trait locus (QTL) analysis is an experimental strategy for identifying QTLs, and ultimately QTGs, that affect quantitative traits. Because of the complexity of these traits, progress in identifying QTGs has been slow compared to that in cloning genes underlying Mendelian traits (Glazier et al 2002).",
+      "Expression QTL Next, we will examine expression quantitative trait loci (eQTLs). These are QTLs for gene expression traits, a subset of the molecular phenotypes mentioned above. Much like classical phenotypes, expression of transcripts can be influenced by variants within the genome. However, because we know the location of the gene, we can split these eQTL into two categories, trans- (or distal) or cis- (or local) eQTL. A trans-eQTL (or distal-eQTL) describes when the expression of a gene is influenced by a locus far away from that gene, and therefore indicates that the gene of interest is downstream of another gene.",
+      "These loci which are associated with changes in transcript expression are often termed expression QTL (eQTL): a variant (or variants) within the locus alters the expression of the gene of interest. An eQTL found near the location (~\u2264 1 Mbp) of the transcript is described as a local eQTL, and is often called a cis-eQTL. This is in contrast to trans-eQTL, which are found more distally. Cis-eQTL are interesting when they are found for a gene within a QTL for another phenotype (e.g.",
+      "The location of these genotypes are quantitative trait loci (QTLs) [Abiola et al., 2003]. Detected via statistical methods [Doerge, 2002], QTLs are stretches of DNA highly associated with a specific phenotype, analogous to genetic landmarks which roughly indicate the position of the active gene. QTLs are not defined at very fine granularity; they usually correspond to areas large enough to hold several genes. The genetic polymorphism (genotypes) in neighboring areas of a set of loci, as a group, influence structure and function on both molecular and organismic scales.",
+      "Quantitative trait loci (QTL) analysis is a means to query the entire genome for DNA variants (markers) that show significant associations with the phenotype (quantitative trait) under investigation. This is the first step to identify candidate genes whose variants (alleles) affect the value of the phenotype. QTL analysis was performed using WebQTL (http://www.genenetwork.org) for each PCA factor. WebQTL performs 2,000 or more permutations of the strain data and significant QTL are defined by the likelihood ratio statistic (LRS) score of correctly ordered data exceeding all other permutations 95% of the time, i.e.",
+      "Expression QTL Next, we will examine expression quantitative trait loci (eQTLs). These are QTLs for gene expression traits, a subset of the molecular phenotypes mentioned above. Much like classical phenotypes, expression of transcripts can be influenced by variants within the genome. However, because we know the location of the gene, we can split these eQTL into two categories, trans- (or distal) or cis- (or local) eQTL. A trans-eQTL (or distal-eQTL) describes when the expression of a gene is influenced by a locus far away from that gene, and therefore indicates that the gene of interest is downstream of another gene.",
+      "These are referred to as expression QTLs, or eQTLs (Schadt et al., 2003), which control a portion of expression variation of particular genes in a population. eQTLs result from genetic differences in regulatory elements close to or within the gene (apparent cis-acting eQTLs) as well as those that map elsewhere in the genome from the gene whose expression is modulated (trans-acting eQTLs). By combining microarray and QTL analysis on the same mice, much can be learned about the genetic underpinnings of particular alcohol traits (Hitzemann et al., 2004; Tabakoff et al., 2003).",
+      "Working with complex traits that typically vary in their manifestation across a continuous distribution, in contrast to the binary nature of monogenic traits, QTLs are discovered by simply identifying loci with alleles that consistently covary with a phenotype across a population. Genomic regions that show a sufficiently strong association with a phenotype are considered QTLs. The simplest, or most hopeful, interpretation of a mapped QTL is that the implicated region harbors a single gene affecting manifestation of the associated phenotype.",
+      "By definition, a quantitative trait locus is a chromosomal region that contains a gene, or genes, that regulate a portion of the genetic variation for a particular phenotype (Wehner et al. 2001). The goal of QTL mapping is to identify regions of the genome that harbour genes relevant to a specified trait. QTL map locations are commonly determined by initial screening of mice with specific genetic characteristics, such as recombinant inbred strains, the F2 of two inbred strains, or recombinant congenic strains (Flint 2003).",
+      "(2003) and others defined the expression QTLs (eQTLs) as either cis (mapping near the gene locus) or trans (mapping elsewhere in the genome). When behavioral QTLs (bQTLs) and cis-eQTLs overlap, the cis-eQTL genes are inferred as strong quantitative trait gene (QTG) candidates (see e.g. Farris et al. 2010). The situation for trans-eQTLs is more complicated since the QTL confidence interval is generally larger and any gene within the QTL interval could have a regulatory role. The application of genetical genomics to mouse has generally focused on segregating populations involving R. Hitzemann et al.",
+      "Definition of a QTL: A quantitative trait is one that has measurable phenotypic variation owing to genetic and/or environmental influences. This variation can consist of discrete values, such as the number of separate tumours in the intestine of a cancer-prone mouse, or can be continuous, such as measurements of height, weight and blood pressure. Sometimes a threshold must be crossed for the quantitative trait to be expressed; this is common among complex diseases. A QTL is a genetic locus, the alleles of which affect this variation.",
+      "When the phenotype of interest is a quantitative trait, such as blood pressure or cholesterol levels, the underlying genetic locus is referred to as a \u201cQTL\u201d. A common strategy investigates the association between quantitative traits of transcriptional responses and their underlying DNA loci called \u201cresponse QTLs\u201d (reQTLs) (Albert and Kruglyak 2015). Studies have provided clear evidence for the colocalization of reQTLs and disease-related loci (Caliskan et al. 2015).",
+      "Gene Expression Quantitative Trait Locus Analysis. Quantitative trait locus (QTL) mapping is a statistical technique that finds associations between phenotype and genotype in a genetically segregating population (Lander and Botstein 1989). Here, we performed eQTL mapping on the male and female data separately. There were 1,137 significant (q\u22640.5 and p\u22640.025) male and 1,232 female eQTLs. First, we explored differences in patterns of eQTL locations between sexes by plotting the genomic locations of each eQTL versus the transcript location (Figure 4.3a, b).",
+      "Chromosomal regions containing a gene (or genes) that affect the level of a quantitative trait are called quantitative trait loci (QTLs). The relevant genes in these regions have been called quantitative trait genes (QTGs) (Hitzemann et al 2003). Quantitative trait locus (QTL) analysis is an experimental strategy for identifying QTLs, and ultimately QTGs, that affect quantitative traits. Because of the complexity of these traits, progress in identifying QTGs has been slow compared to that in cloning genes underlying Mendelian traits (Glazier et al 2002).",
+      "1.4 Quantitative Trait Locus Mapping. Quantitative trait loci (QTLs) are genetic regions on a chromosome that control certain quantitative traits, such as crop yield or body fat. QTL mapping involves construction of genomic maps and testing for association between traits and polymorphic markers. A significant association provides evidence that a QTL is near the marker.",
+      "Current data processing capabilities have also made it possible to search genome-wide for QTL (quantitative trait loci) [109]. QTL mapping seeks to identify the relationship between various genomic locations and a set of quantitative traits, leading to a chromosomal location and ultimately to identification of gene(s) with the final goal of looking at gene expression. Among other things, this will lead to a better understanding of genetic mechanisms of variation and adaptation [121]. Results can then be applied to adjust conservation measures in response to rapid change, for example, by identifying the genetic adaptability potential of individuals to be used in assisted migration or reintroduction [122,123]."
+    ]
+  ]
+}
diff --git a/gnqa/paper1_eval/src/data/datasets/old/general1_dataset.json b/gnqa/paper1_eval/src/data/datasets/old/general1_dataset.json
new file mode 100644
index 00000000..b3e9c06d
--- /dev/null
+++ b/gnqa/paper1_eval/src/data/datasets/old/general1_dataset.json
@@ -0,0 +1,128 @@
+{
+  "question": [
+    "What is the most cited environmental factor for the onset of asthma?",
+    "What is apoptosis?",
+    "How would one extract the DNA, from say, mammals or plants?",
+    "What is a genetic marker?",
+    "Explain the process of finding a genetic marker followed by a quantitative trait loci."
+  ],
+  "answer": [
+    "Air pollution is the most cited environmental factor for the onset of asthma.",
+    "Apoptosis, also known as cell suicide or programmed cell death, is a biological process in multicellular organisms that allows specific cells to be removed during the development of complex tissues, or potentially dangerous damaged cells to be destroyed for the benefit of the whole organism. It is characterized by a sequence of well-defined events resulting in cell destruction and is necessary for normal cell turnover. It is also essential to various other biological processes.",
+    "For mammals, DNA can be extracted using a DNA Isolation Kit for Mammalian Blood, following the manufacturer's instructions. This process typically involves mixing blood with lysis buffer and Proteinase K, followed by the addition of isopropanol and Inhibitor Removal Buffer. The DNA is then washed, centrifuged, and the supernatant discarded. The remaining pellet, which contains the purified genomic DNA, is diluted in TE buffer.  For plants, the mixed alkyl trimethyl ammonium bromide (MATAB) procedure can be used. This involves grinding plant material in liquid nitrogen, incubating it in extraction buffer, and then purifying it with chloroform:isoamylalcohol. The DNA extracts are then precipitated with isopropanol and resuspended in a buffer. The extracts are further purified on anion exchange columns.",
+    "A genetic marker is a measurable polymorphic sequence of DNA whose chromosomal location is known. They often have no known functional significance but are used as pointers to a particular chromosomal location. They are used to track the inheritance of genes and can be a gene or some section of DNA with no known function.",
+    "The process of finding a genetic marker followed by a quantitative trait locus (QTL) involves several steps. First, a population is developed for genetic mapping. This population can be a segregating population or a permanent population. The population is then genotyped using molecular markers. Next, the population is phenotyped for a trait of interest. QTL analysis is then conducted using statistical procedures to find markers linked to the QTL. This involves single-marker regression across all chromosomes, where a hypothetical QTL is evaluated at the location of each marker locus. The significance of that QTL is estimated from a likelihood ratio statistic. A permutation test is then conducted to establish genome-wide significance criteria for the trait. The result is a list of marker loci that show a significant association with the trait. These loci are most likely to be near QTLs. The goal of QTL mapping is to identify regions of the genome that harbor genes relevant to a specified trait."
+  ],
+  "contexts": [
+    [
+      "INTRODUCTION  Asthma is a chronic disease of the airways defined by its symptoms, which include reversible airflow obstruction, inflammation, and bronchial hyperresponsiveness. The ancient Egyptians, Greeks, and Romans made reference to the symptoms of asthma, and today the disease is estimated to affect 235-334 million people worldwide (44,53).",
+      "The atopic triad. Perhaps the most widely recognized pattern of co-occurrence is the one of asthma, atopic dermatitis (eczema), and allergic rhinitis (hay fever), which together are referred to as the atopic triad and characteristically present clinically in a temporal sequence known as the atopic march. Within this sequence, atopic dermatitis is typically the first component to manifest, with approximately 20-30% of individuals with mild disease and 70% of those with severe disease going on to develop asthma. Individuals who undergo this distinctive sequence of disease progression frequently exhibit a more severe and persistent phenotype, with increased risk of allergen sensitization.",
+      "Clinically, asthma is characterized by episodes of coughing, chest tightness, wheezing, dyspnea, or sputum production. Often, asthma sufferers experience a combination of these symptoms, or some symptoms more than others. Pulmonary breathing tests typically demonstrate variable airway obstruction and hyperreactivity, but may be normal, even in patients with severe and uncontrolled disease [8]. Thus, the diagnosis of asthma, which is based on general clinical symptoms and variable lung function testing, is non-specific and heavily dependent on clinical history. Within the \"umbrella\" diagnosis of asthma there exists a diverse array of differing clinical phenotypes [9]. For example, childhood asthma is often associated with personal and parental atopic diseases (i.e., atopic dermatitis, food allergy, eosinophilic esophagitis, allergic rhinitis), viral infections, and tobacco smoke exposure [10]. Alternatively, adult-onset asthma is less associated with atopic disease [11,12], but more associated with female sex [13], sinus disease [14], and preceding respiratory infections such as pneumonia [15]. In addition, adult-onset disease is often of higher severity [12,16] with a faster and more persistent decline in lung function [17]. Moreover, although severe patients are found in every demographic and age group, the most common phenotype is an adult female that is older and obese [18].",
+      "Introduction  An estimated 9% of children and 6% of adults in the United States have asthma [1]. The total number of asthma sufferers worldwide is estimated to be over 300 million, with an additional 100 million expected to develop asthma by 2025 [2][3][4][5]. Developed countries are the most affected, with some of the highest rates found in the United Kingdom, Australia, New Zealand and the Republic of Ireland [3]. Asthma prevalence is rising significantly in developing countries in transition to a more Western lifestyle [3]. In 2007, the cost of disease in the United States was estimated to be $56 billion in relation to medical expenses, missed days of work, and early deaths [1]. The rate of asthma deaths has likely plateaued, but is still as high as 250,000 per year worldwide [6]. Morbidity and mortality are particularly high in ethnic minorities living below or near the poverty line, and African American children had a death rate 10 times that of non-Hispanic white children in 2015 [7]. Thus, asthma is a costly, growing health problem associated with high morbidity and mortality.",
+      "Getting accurate estimates of exposures is difficult, whether this is air pollution or toxins in our food and drink, but these are important questions. Rutter: That is an important point. From the twin study data it is clear that environmental effects account for quite a lot of the variance on all the multifactorial disorders. Yet the kinds of measures that are used aren\u2019t terribly solid. They include broad things such as socio-economic status (SES). Even where there are good measures the care taken in testing for environmental mediation is usually poor.",
+      "Bronchiolitis, a disease that happens in the first year of life in many infants, is strongly associated with subsequent asthma. We ascertained it in the first years of life and have been following these people to age 25 now. For the people who had bronchiolitis and now have asthma, their parents recall much better that they had bronchiolitis than those who don\u2019t have asthma now. It is at least twice more. Extraordinarily, some of these latter parents don\u2019t recall that they took their child to the doctor in the first year of life.",
+      "If you arrive in the USA when you are young you have almost the same prevalence of asthma as an adult as those who are born in the USA and who are not Mexican. But if you arrive at older ages you have less asthma. If you arrive at the age of 20 you have the same asthma risk as those born in Mexico (Eldeirawi et al 2005). Kotb: This is extremely interesting. There is a relationship between depression and the immune system. This especially applies to natural killer (NK) cells, which are the main cells that fight cancers.",
+      "A colleague of mine in Georgia found this may have a protective effect against later development of asthma (Ownby et al 2002). Martinez: We find significantly decreased likelihood of asthma if you have a dog in a home, but not if you have a cat. The reason for this is not that I hate cats, which I do, but most likely because cats are stealth hunters, and they have to be very clean. Dogs are collective hunters and they don\u2019t care if they smell.",
+      "; Guffey, S.E. Investigation into pedestrian exposure to near-vehicle exhaust emissions. Environ. Health 2009, 8, 13. [CrossRef] [PubMed] Our World in Data.org. 2017. Available online: https://ourworldindata.org/data-review-air-pollution-deaths (accessed on 10 January 2022). Pope, C.A. , III. Respiratory disease associated with community air pollution and a steel mill, Utah Valley. Am. J. Public Health 1989, 79, 623\u2013628. [CrossRef] [PubMed] Pope, C.A. , III. What do epidemiologic findings tell us about the health effects of environmental aerosols? J. Aerosol. Med. 2000, 13, 335\u2013354. [CrossRef] [PubMed] Pope, C.A. , III.",
+      "Case for Support BBSRC Grant Application September 2005 \u201cIntegrative Analysis of the Genetic Factors behind Asthma and Atopic Dermatitis\u201d  Part I: Research Proposal Background A Introduction of topic of research and its academic and wider context Asthma is the most common disease of childhood, and affects one child in seven in the United Kingdom. Atopic Dermatitis (AD, eczema) affects similar numbers of children. About 60% of children with severe AD will have concomitant asthma. Treatments for both diseases are unsatisfactory. Abandonment of orthodox medical therapy for AD is common in many families who have children with the disease.",
+      "This is most common during the rainy season when aerosols are created, which results in repeated inhalation of Bp [43, 44]. Environmental sampling studies reveal there is a positive association between the prevalence of disease and the degree of environmental contamination [7]. In addition to environmental factors, data suggests that host factors play an important role in mounting an immune response against infectious diseases [45] such as melioidosis. While healthy persons can contract melioidosis, most patients in endemic regions have an underlying predisposition [28], which suggests that the immunological status of the patient can influence disease initiation and progression [15].",
+      "Sensitivity analysis  We did two sets of post-hoc sensitivity analyses to assess the effects of potential poor recall of age of onset among individuals with adult-onset asthma, and the effects of misclassification of COPD as asthma among the adult-onset cases, even with exclusion of cases with a reported diagnosis of COPD, emphysema, or chronic bronchitis. First, to ensure that the adult-onset cases did not include a significant proportion of childhood-onset asthma in which symptoms remitted in early life but then relapsed in adulthood, we replaced adult-onset cases with increasing proportions of randomly selected childhood-onset cases, and then tested for association at the two most significant childhood onset-specific loci. This procedure was repeated 20 times for each proportion to quantify the sampling variability (appendix pp 7-8). Second, we did two analyses in which we removed either individuals with ages of asthma onset between 46 and 65 years or adult-onset cases and controls with FEV\u2081/FVC <0.70. For each, we compared p values and ORs with the GWAS including all adult-onset cases (appendix pp 8-9).",
+      "We used data for British white individuals from UK Biobank data release July 19, 2017.8 We extracted disease status (asthma, allergic rhinitis, atopic dermatitis, food allergy, chronic obstructive pulmonary disease (COPD), emphysema, and chronic bronchitis), age of onset of asthma, and sex from self-reported questionnaires and hospital records (International Classification of Diseases 10th revision [ICD-10] codes) by querying our in-house protected UK Biobank database server.9 For our main case analysis, we included individuals who self-reported that they had doctor-diagnosed asthma. Further details of our research approach are provided in the appendix (pp 4-7).",
+      "; Guffey, S.E. Investigation into pedestrian exposure to near-vehicle exhaust emissions. Environ. Health 2009, 8, 13. [CrossRef] [PubMed] Our World in Data.org. 2017. Available online: https://ourworldindata.org/data-review-air-pollution-deaths (accessed on 10 January 2022). Pope, C.A. , III. Respiratory disease associated with community air pollution and a steel mill, Utah Valley. Am. J. Public Health 1989, 79, 623\u2013628. [CrossRef] [PubMed] Pope, C.A. , III. What do epidemiologic findings tell us about the health effects of environmental aerosols? J. Aerosol. Med. 2000, 13, 335\u2013354. [CrossRef] [PubMed] Pope, C.A. , III.",
+      "8 The socio-ecologic framework posits that various aspects of a child\u2019s environment directly and indirectly impact the child\u2019s health and development.9 Drawing on this framework, Beck and colleagues10 examined several biologic, social and ecologic variables to provide a greater understanding of factors influencing asthma-related hospital readmissions for black children compared to their white counterparts. The study revealed that black children were over two times as likely to be readmitted for an asthma-related illness compared to white children; this resulted from significant differences in almost every socio-ecologic variable measured, including disease management practices and access to primary care.",
+      "Specific Aims Asthma is the most common chronic pediatric medical condition in the United States, with a prevalence over 9.6% in children under 18 years of age.1, 2 Low-income, urban children incur a disproportionate share of asthma prevalence and morbidity;2-4 13% of children living below the poverty threshold are diagnosed with asthma compared to 8% of non-poor (>200% poverty),3 and poverty is associated with higher rates of asthma attacks.1 Living in an urban area confers additional risk for asthma and increased ED utilization.4, 5 Implementation of the National Asthma Education and Prevention Program\u2019s (NAEPP) Guidelines has contributed to reductions in asthma morbidity and mortality rates, and these guidelines emphasize establishing a partnership between healthcare providers and patients/families to promote effective asthma management.6 The NAEPP expert panel states, \u201cbuilding a partnership requires that clinicians promote open communication and ensure that patients have a basic and accurate foundation of knowledge about asthma\u2026\u201d (p.124),6 yet care partnerships also require that the patient/parent effectively communicate issues such as emerging symptoms or response to medications.",
+      "Vital & health statistics Series 3, Analytical and epidemiological studies. 2012(35):1-58. CDC. Current Asthma Prevalence. https://www.cdc.gov/asthma/most_recent_data.htm. 2015. Updated June 2017. Accessed March 9, 2018. Northridge J, Ramirez OF, Stingone JA, Claudio L. The role of housing type and housing quality in urban children with asthma. Journal of urban health : bulletin of the New York Academy of Medicine. 2010;87(2):211-224. Flores G, Snowden-Bridon C, Torres S, et al. Urban minority children with asthma: substantial morbidity, compromised quality and access to specialists, and the importance of poverty and specialty care.",
+      "Asthma Prevalence and Disparities Asthma is the most common chronic pediatric medical condition in the United States,1 affecting an estimated 6.2 million children annually.2 Poorly controlled pediatric asthma contributes to over 700,000 visits a year to emergency departments (ED).1 Children living in impoverished, urban settings are disproportionately affected by asthma,3 and the disparate impact of asthma is even worse among black and Latino children, and children whose parents have limited English proficiency (LEP) in these urban low-income areas.4-6 A 2017 longitudinal study revealed that black race and Latino ethnicity are significantly associated with worse asthma outcomes including 1) asthma knowledge, 2) asthma-related quality of life, 3) asthma severity, and 4) asthma control.",
+      "The Journal of asthma : official journal of the Association for the Care of Asthma. 2017:16. Inkelas M, Garro N, McQuaid EL, Ortega AN. Race/ethnicity, language, and asthma care: findings from a 4-state survey. Annals of allergy, asthma & immunology : official publication of the American College of Allergy, Asthma, & Immunology. 2008;100(2):120-127. National Asthma Education and Prevention Program. Expert Panel Report 3: Guidelines for the Diagnosis and Management of Asthma Bethesda, MD: National Institutes of Health, National Heart, Lung, and Blood Institute; 2007. Publication no. 08-045.1. NIH Consensus Group. Video report: What is mHealth?",
+      "Contact PD/PI: Coker, Tumaini Rucker  INTRODUCTION TO APPLICATION Research Plan Overview Childhood asthma is the most common pediatric medical condition in the United States, and disproportionately affects children living in low-income, urban settings. Many low-income, urban families rely on emergency department (ED) services as their source for sick care for their child. This is often due to not having a primary care provider or sufficient access to their primary care provider for asthma management."
+    ],
+    [
+      "Apoptosis, or controlled cell death [62], is another major stressed-cell response, and was also represented in our results (Fig. 9e). A large body of direct evidence points to apoptosis as one of the main routes of RPE degeneration in AMD [63]. Induction of apoptosis upon stress is dictated by the action of master regulator p53, and it was recently shown that aging increases the activity of p53 in RPE cells and the likelihood for apoptotic cell death [64]. Consistent with this evidence, we found association with pathways in Transcriptional regulation by TP53 group (Fig. 9d). In particular, Regulation of TP53 activity through methylation was among the top pathways in our association analysis (Table 1), suggesting that p53 modification by methylation and the closely related histone modifications [Protein lysine methyltransferases (PKMTs) methylate histone lysine in Fig. 9e] play important roles in RPE apoptosis regulation. In the intrinsic apoptotic pathway induced by oxidative stress, cytochrome c is released from mitochondria into the cytosol, binding and activating caspases, the main proteases central to apoptotic action. We found association in pathways involving 'inhibitor of apoptosis' (IAP) and its negative regulator 'second mitochondrial activator of caspases' (SMAC) [65], which suggests that disruption to regulatory mechanisms preventing apoptosis in RPE cells may play roles in AMD.",
+      "Apoptosis  Persistent DNA damage",
+      "42 ABSTRACT 18 A MODULARIZED MODEL OF APOPTOSIS HA Harrington, KHo, Sk Ghosh, KC Tung , CY Kao, and B Aguda Imperial College London, Courant Institute of Mathematical Sciences New York University, University of Texas at Arlington, University of Texas Southwestern Medical Center, Mathematical Biosciences Institute, and Department of Mathematics, The Ohio State University Columbus, OH, USA Background: One of the key physiological mechanisms employed by the cell (during development and for maintenance of homeostasis) in multi-cellular organism is apoptosis, which is characterized by a sequence of well-defined events resulting in cell destruction.",
+      "14 Apoptosis is caused by the activation of the caspase cascade, which is initiated by two signaling routes (stress-induced death and death-domain receptor-induced death) (Domen 2001). This process can be prevented by antiapoptotic molecules, such as Bcl-2 (Domen and Weissman 2000). Direct evidence for the involvement of apoptosis in HSC number regulation came from the findings that overexpression of the anti-apoptotic gene bcl-2 led to increased numbers of Thy-1.1low, Sca-1+, c-kit+, Lin- cells, a population with long-term multi-lineage repopulation potential (Domen et al. 2000).",
+      "Several lines of evidence have indicated that apoptosis acts as an important regulator of stem cells. First of all, expression of some apoptosis-related genes was detected in human and/or murine HSCs (Domen 2001). Secondly, targeted disruption of some of these genes in null and dominant negative mutant mice interfered with normal apoptotic processes in HSCs. For example, overexpression of Bcl-2, a negative regulator of apoptosis, increased not only the numbers and competitive repopulation capabilities of HSCs, but also the resistance of HSCs to apoptosis induced by ionizing radiation (Domen and Weissman 2003).",
+      "Apoptosis  Cell suicide, or apoptosis, is a well-studied biological phenomenon in multicellular organisms that allows specific cells to be removed during the development of complex tissues, or potentially dangerous damaged cells to be destroyed for the benefit of the whole organism. The lack of an apparent evolutionary benefit for such a process in a single-celled organism initially caused controversy about the presence of an apoptotic pathway in yeast. Today, however, a number of yeast orthologues to mammalian apoptosis genes have been discovered and apoptotic-like cell death has been linked to mating, colony formation, and aging (Buttner et al. 2006; Eisenberg et al. 2007; Frohlich et al. 2007). With respect to aging, both replicatively and chronologically aged cells that die have increased ROS and display apoptotic phenotypes (Fabrizio et al. 2004a; Herker et al. 2004; Laun et al. 2001).",
+      "The importance of apoptosis in yeast aging has yet to be fully characterized. At the very least, yeast apoptosis provides a useful pathway for studying genetic interactions for age-related diseases that affect humans, such as cancer. Readers interested in further information related to yeast apoptosis are referred to several in-depth reviews (Buttner et al. 2006; Eisenberg et al. 2007; Frohlich et al. 2007).",
+      "Early redistribution of plasma membrane phosphatidylserine is a general feature of apoptosis regardless of the initiating stimulus: inhibition by overexpression of Bcl-2 and Abl. J Exp Med 182: 1545-56. Mathew CG (2006). Fanconi anaemia genes and susceptibility to cancer. Oncogene 25: 5875-84. McBride MW, Carr FJ, Graham D, Anderson NH, Clark JS, Lee WK et al (2003). Microarray analysis of rat chromosome 2 congenic strains. Hypertension 41: 847-53. Merino-Trigo A, Kerr MC, Houghton F, Lindberg A, Mitchell C, Teasdale RD et al (2004).",
+      "When a cell harbors such severe DNA damage that it is beyond repair, it is disposed of through apoptosis. Alternatively, DNA damage can induce cellular senescence, the irreversible cessation of mitosis. Both processes are critically dependent on p53, which is known as the guardian of the genome [3]. DNA damage may also trigger autophagy, a cellular catabolic process that maintains homeostasis [4]. It should be noted that under normal conditions cells are rarely exposed to very high doses of DNA-damaging agents, which may be the explanation why we do not age and die because we run out of cells. However, aging is associated with some atrophy [1] and it is conceivable that at older ages bursts of DNA damage, for example from free radical reactions associated with inflammation, do occur and give rise to an increasingly high rate of apoptosis or cellular senescence. While there is some evidence for increased apoptosis and cellular senescence at old age, it is doubtful that under normal conditions this would lead to a significant loss of functional cells.",
+      "Apoptosis, or programmed cell death, literally eliminates cells at risk for neoplastic transformation. Senescence, by contrast, permanently arrests their growth. Both processes are controlled by the p53 tumor suppressor protein (Amundson, Myers, & Fornace, 1998; Bringold & Serrano, 2000; Hickman, Moroni, & Helin, 2002; Itahana, Dimri, & Campisi, 2001). p53 is a transcriptional regulator that both transactivates and transrepresses target genes in response to stress (Prives & Hall, 1999; Ryan, Phillips, & Vousden, 2001). These target genes, in turn, stimulate DNA repair, transient cell cycle arrest, permanent cell cycle arrest (senescence) or cell death (apoptosis), depending on cell type, degree and type of damage, and other variables. In contrast, cells that lack normal p53 regulation or function - for example, tumor cells - tend to die in response to telomere dysfunction. Some normal human cells, on the other hand, undergo a senescence growth arrest. In either case, when present, p53 is crucial for mediating the cellular response to telomere dysfunction (Yaswen & Stampfer, 2002) (Fig. 4).",
+      "Cell death, and in particular apoptosis, can be caused by a number of mechanisms including loss of growth factors and excitotoxicity (e.g., Bhutta and Anand, 2002; Nikolic\u0301 et al., 2013). It is of interest therefore, that proximal to the region of the QTL there are several genes that are related to growth factors including the latent transforming growth factor protein 2 (ltbp2), placental growth factor (pgf), and transforming growth factor beta (Tgf beta).",
+      "Apoptosis-related gene expression profiles",
+      "Apoptosis. Programmed death of cells during embryogenesis and metamorphosis or during cell turnover in adult tissues.",
+      "14 Apoptosis is caused by the activation of the caspase cascade, which is initiated by two signaling routes (stress-induced death and death-domain receptor-induced death) (Domen 2001). This process can be prevented by antiapoptotic molecules, such as Bcl-2 (Domen and Weissman 2000). Direct evidence for the involvement of apoptosis in HSC number regulation came from the findings that overexpression of the anti-apoptotic gene bcl-2 led to increased numbers of Thy-1.1low, Sca-1+, c-kit+, Lin- cells, a population with long-term multi-lineage repopulation potential (Domen et al. 2000).",
+      "Several lines of evidence have indicated that apoptosis acts as an important regulator of stem cells. First of all, expression of some apoptosis-related genes was detected in human and/or murine HSCs (Domen 2001). Secondly, targeted disruption of some of these genes in null and dominant negative mutant mice interfered with normal apoptotic processes in HSCs. For example, overexpression of Bcl-2, a negative regulator of apoptosis, increased not only the numbers and competitive repopulation capabilities of HSCs, but also the resistance of HSCs to apoptosis induced by ionizing radiation (Domen and Weissman 2003).",
+      "Fraction of cells displaying apoptosis",
+      "It has been known that mitochondria play a central role in the life and death of cells (Kroemer & Reed, 2000). Apoptosis was observed in developmentally arrested embryos by 72 h, but not at 24 h after FCCP treatment, despite considerable telomere attrition at this early stage, suggesting that telomere attrition occurs prior to apoptosis and may serve as an intermediate step between mitochondrial dysfunction and apoptosis. These results also suggest that telomere shortening may signal apoptosis (Lee et al., 1998; Karlseder et al., 1999).",
+      "Cell Death  A form of programmed cell death, apoptosis is necessary for normal cell turnover and is essential to a plethora of other biological processes. Apoptosis can be executed via Bcl-2 activation of caspases, via signals from the death receptor on the plasma membrane, or via induction by granzyme B secreted from cytotoxic T cells (Tc cells) [35]. Endonucleases and proteases are activated by active caspases, eventually leading to the death of the cell. With age, however, apoptotic activity changes. In heart [36], kidney [37], skeletal muscle [38], and Tc cells [39], increased apoptosis has been reported, perhaps contributing to loss of cellularity in these tissues. This escalation across various tissues may be attributed to the increased production of free radicals [40] and furthermore exacerbated by the accumulation of DNA damage in the aged cells [41]. As the risk increases for cells to turn cancerous and dysfunctional with advancing age, increased apoptosis in aged cells is argued to be a defense strategy. In other tissues, such as the colon, apoptosis appears to decrease with age perhaps contributing to the accumulation of senescent cells and age-associated carcinogenesis [42].",
+      "The regulation and execution of apoptosis in endothelial cells is a complex process involving paracrine factors, membrane receptors, interaction of pro- and anti-apoptotic factors and cysteinyl aspartate-specific proteases (caspases). Recent studies suggest that in aging there is an imbalance in the expression of pro- and anti-apoptotic genes resulting in an enhanced apoptosis in the myocardium (19), central nervous system (24), skeletal muscle (10), lung (33), and liver (2,33). Yet, age-related alterations in the expression of pro- and anti-apoptotic genes in coronary arteries have not been elucidated.",
+      "Apoptosis modulating genes  Apoptosis or programmed cell death is associated with alterations in cell morphology, particularly the nucleus, with endonucleolytic cleavage of DNA into nucleosomal length fragments. Apoptosis may result from withdrawal of growth signals."
+    ],
+    [
+      "DNA and RNA extraction of tissues  Genomic DNA was extracted from frozen placentae (n = 3/group) and liver (n = 9/group) using a modified version of an established protocol (28,29). Total RNA was extracted from the remaining tissue using TRIzol, as per the manufacturer's instructions (Invitrogen Canada Inc). Genomic DNA and RNA purity and concentration were assessed using spectrophotometric analysis, and integrity was verified using agarose gel [1% (wt/vol)] electrophoresis.",
+      "Taxon Sampling and DNA Extractions  We extracted DNA from 72 pinned specimens from the National Museum of Natural History (NMNH) Entomology collection for this study. We plucked middle legs from the pinned bees using a pair of sterilized forceps and washed the tissue in 95% ethanol to remove dust, pollen, and other forms of accumulated debris on the bee legs. After evaporation of the ethanol (by drying the tissue on a clean Kimwipe\u2122), the samples were placed in a freezer for several hours. DNA was then extracted destructively by grinding the frozen tissue with a sterile pestle, using a DNeasy Blood and Tissue Kit (Qiagen, Valencia, CA, USA) and following the manufacturer's protocol, except the DNA was eluted in 130 \u03bcL ddH2O instead of the supplied buffer. We ran 10 \u03bcL of each extract for 60 min at 100 volts on 1.5% agarose SB (sodium borate) gels, to estimate size of the genomic DNA.",
+      "DNA extraction  DNA was extracted from PBMCs using the QIAamp DNA Mini kit (Qiagen, CA, USA), following the manufacturer's instructions for the spin protocol. The DNA was eluted in 60 \u03bcl of AE elution buffer and stored at -20\u00b0C. The concentration and quality of the DNA was assessed with the Qubit dsDNA HS Assay (Invitrogen, Eugene, OR, USA).",
+      "Methods  Laboratory procedures. We initially screened 107 ancient samples (Supplementary Data 1) in dedicated clean facilities at the ancient DNA lab of Jilin University, China, following published protocols for DNA extraction and library preparation 36,37. Prior to sampling, we wiped all skeletal elements with 5% bleach and irradiated them with UV light for 30 min from each side. We drilled teeth to obtain fine powder using a dental drill (Dremel, USA). We sampled the dense part of petrous bones around the cochlea by first removing the outer part using the sandblaster (Renfert, Germany), and then grinding the clean inner part into fine powder with the mixer mill (Retsch, Germany). We digested the powder (50-100 mg) in 900 \u03bcl 0.5 M EDTA (Sigma-Aldrich), 16.7 \u03bcl of Proteinase K (Sigma-Aldrich), and 83.3 \u03bcl ddH2O (Thermo Fisher, USA) at 37 \u00b0C for 18 h. Then we transferred the supernatant to a MinElute silica spin column (QIAGEN, Germany) after fully mixing it with the 13 ml custom binding buffer [5 M guanidine hydrochloride (MW 95.53), 40% Isopropanol, 90 mM Sodium Acetate (3 M), and 0.05% Tween-20] followed by two washes with PE buffer (80% ethanol). Then we eluted the DNA with 100 \u03bcl TET buffer (QIAGEN, Germany).",
+      "DNA Extraction  After blood was drawn into EDTA tubes, genomic DNA was extracted using a DNA Isolation Kit for Mammalian Blood Kit (Roche Applied Science, Indianapolis, IN, USA) according to the manufacturer's recommendations. Briefly, 300 \u03bcl of whole blood from each sample was mixed with 200 \u03bcl of lysis buffer (50 mM Tris pH 8.0, 100 mM EDTA, 100 mM NaCl, 1% SDS) and 40 \u03bcl of Proteinase K, followed by addition of 100 \u03bcl of isopropanol and 500 \u03bcl of Inhibitor Removal Buffer (5M guanidine-HCl, 20 mM Tris-HCl pH 6.6). The DNA was washed with a buffer (20 mM NaCl; 2 mM Tris-HCl; pH 7.5), centrifuged twice at 2000 rpm, washed using cold 70% ethanol and centrifuged at 3000 rpm. The supernatant was discarded and the pellet containing purified genomic DNA was diluted in TE buffer (1 mM EDTA; 10 mM Tris-HCl, pH 7.5) to a concentration of approximately 50 ng/\u03bcl.",
+      "Genomic DNA extraction  Leukocytes were isolated from 5-ml peripheral blood samples. DNA was prepared by phenol extraction and chloroform extraction followed by isopropanol precipitation, washed with ethanol, and air-dried. Tris-EDTA buffer pH 8.0 was used to dissolve the final genomic DNA product.",
+      "The pulled down DNA fragments were extracted and purified using phenol-chloroform extraction/ethanol precipitation. The samples were stored at -20 \u00b0C until use.",
+      "DNA extraction for genotyping  For the majority of samples, DNA was extracted from either spleen or the exocrine fraction of the islet isolation using the Tissue DNA Purification Kit according to manufacturer's instructions on an automated Maxwell 16 system (both Promega, USA). When no other tissue was available, DNA was extracted from human islets using the Trizol fraction remaining after extraction of RNA (see above). To precipitate the DNA, 300 \u03bcl 100% ethanol was added to the thawed solution. This mixture was incubated at room temperature for a minimum of 30 minutes. DNA was then pelleted by centrifugation at 4,000 x g for 5 minutes at 4\u00b0C. After removing the supernatant, the pellet was twice washed with 0.1M trisodium citrate (Sigma Aldrich, UK) in 10% ethanol and left at room temperature for 30 minutes, followed by another wash step with 75% ethanol. After the final wash step, pellets were air-dried for 10 minutes to remove residual ethanol and re-suspended in a minimum of 100 \u03bcL 8mM NaOH (Sigma Aldrich). Extracted DNA was stored at -20\u00b0C before further use.",
+      "DNA extraction  Tissue samples were incubated at 50\u00b0C overnight with shaking in DNA extraction buffer (100 mM NaCl, 10 mM Tris.HCl pH8, 25 mM EDTA, 0.5% (w/v) SDS), containing 200 \u03bcg/ml proteinase K. DNA was isolated by two rounds of phenol:chloroform extraction, followed by RNAse A treatment, precipitation in absolute ethanol containing 10% (v/v) sodium acetate (3 M, pH 5.2), and resuspended in 100 \u03bcl nuclease-free water (Ambion, Austin, TX, USA) or using salting out method followed by purification with Qiagen blood and tissue kit (Qiagen, Mississauga, ON, USA). DNA was stored at -20\u00b0C.",
+      "Methods  Human DNA samples DNA was extracted from human patient tissue samples acquired from the University of Minnesota Tissue Procurement Facility from BioNet (IRB#0805E32181). See Supplemental Table S4 for patient data. Briefly, 2 mg of tissue was digested overnight at 55\u00b0C on a rotating platform in 710 mL of digest buffer (1 M Tris at pH 8.0, 1 mM EDTA, 13 SSC, 1% SDS, 1 Mm NaCl, 10 mg/mL Proteinase K). Following digest, DNA was purified using phenol-chloroform-isoamyl alcohol (Life Sciences) isolation protocol.",
+      "3.2.2 Isolation of genomic DNA Genomic DNA was isolated from frozen liver tissue. The isolation was conducted using the Qiagen DNeasy\u00ae Blood & Tissue Kit (Qiagen) according to the manufacturer\u2019s protocol. DNA concentration was evaluated photometrically at a wavelength of 260 nm using the Fusion\u2122 Universal Microplate Analyzer. For nucleic acid quantification, the Beer-Lambert (A = \u03b5 * b * c) equation is modified to use an extinction coefficient with units of M-1 cm-1.",
+      "Most typically, DNA is extracted from blood samples, dried blood spots, buccal swabs, saliva, tissue and even urine and stool samples. In forensic science, other sources have been validated, e.g. bone, tooth pulp, dandruff and others.",
+      "DNA isolation  High-molecular weight DNA was isolated from the samples by organic solvent extraction method, followed by precipitation in cold ethanol [14].",
+      "Genomic DNA extraction  DNA from MEF cultures or mouse liver was isolated by phenol/chloroform extraction, as described [11].",
+      "DNA is usually recovered from cells by methods that include cell rupture but that prevent the DNA from fragmenting by mechanical shearing. This is generally undertaken in the presence of EDTA, which chelates the magnesium ions needed as cofactors for enzymes that degrade DNA, termed DNase. Ideally, cell walls, if present, should be digested enzymatically (e.g., lysozyme in the bacteria or bacterial cell). In addition the cell membrane should be solubilized using detergent.",
+      "DNA solutions can be stored frozen, although repeated freezing and thawing tends to damage long DNA molecules by shearing. A flow diagram summarizing the extraction of DNA is given in Fig. 1.2. The above-described procedure is suitable for total cellular DNA. If the DNA from a specific organelle or viral particle is needed, it is best to isolate the organelle or virus before extracting its DNA, because the recovery of a particular type of DNA from a mixture is usually rather difficult.",
+      "Genomic DNA extraction  Genomic DNA was extracted by the mixed alkyl trimethyl ammonium bromide (MATAB) procedure. Briefly, 250 mg of plant material was ground in liquid nitrogen and immediately incubated in 2 ml of pre-warmed extraction buffer (100 mM Tris-HCl, pH 8, containing 20 mM EDTA, 1.4 M NaCl, 2% (w/v) MATAB, 1% (w/v) PEG6000 (polyethylene glycol), 0.5% (w/v) sodium sulfite, 20% (w/v) Igepal CA630, 20% (w/v) lithium dodecyl sulfate, and 20% (w/v) sodium deoxycholate) at 74 \u00b0C for 20 min. After purification with 2 ml of chloroform:isoamylalcohol (24:1, v/v), DNA extracts were precipitated with 1.6 ml of isopropanol then resuspended in 1 ml of buffer (50 mM Tris-HCl, pH 8, containing 10 mM EDTA and 0.7 M NaCl). The extracts were purified on anion exchange columns (QIAGEN-tip 20) following the manufacturer's instructions (QIAGEN, Valencia, CA).",
+      "DNA extraction and enzymatic digestion  Total DNA was isolated from whole blood and separated blood subtypes using a Qiagen DNeasy Blood & Tissue Kit following the manufacturer's instructions. After extraction, DNA was quantified by NanoDrop (Thermo Scientific NanoDrop products, Wilmington, DE). The isolated genomic DNA was enzymatically digested according to a previously described method. Briefly, DNA (3 \u03bcg) was first denatured by heating at 95 \u00b0C for 5 min and then chilling on ice for 2 min. Then, 1/10 volume of S1 nuclease buffer (30 mM CH3COONa, pH 4.6, 280 mM NaCl, 1 mM ZnSO4) and 100 units of S1 nuclease were added before the mixture (20 \u03bcL) was incubated at 37 \u00b0C for 16 h. Subsequently, after 1/10 volume of alkaline phosphatase buffer (50 mM Tris-HCl, 10 mM MgCl2, pH 9.0), 0.002 units of venom phosphodiesterase I, and 10 units of alkaline phosphatase were added, the solution was incubated at 37 \u00b0C for an additional 4 h followed by two extractions with an equal volume of chloroform. The aqueous layer was collected and lyophilized to dryness and then reconstituted in 100 \u03bcL water. About 30 \u03bcL of the obtained samples were then subjected to liquid chromatography-electrospray ionization-tandem mass spectrometry (LC-ESI-MS/MS) analysis.",
+      "The conventional DNA extraction procedure involved the homogenization of single D. magna in 400 \u03bcl of sperm lysis buffer (100 mM Tris-HCl, pH 8; 500 mM NaCl; 10 mM ethylenediaminetetraacetic acid [EDTA], pH 8; 1% SDS; 2% mercaptoethanol) followed by RNase treatment (40 \u03bcg, 37 \u00b0C for 1.5 h). The DNA was then extracted in phenol (pH 8) and chloroform:isoamyl alcohol (1:1). The DNA was finally precipitated by two volumes of ice-cold ethanol in the presence of 3 M sodium acetate (1/10 of the DNA volume) and was incubated at -80 \u00b0C overnight. Precipitated DNA was harvested by centrifugation, dried in air, and the final pellet dissolved in sterile analytic grade water.",
+      "DNA extraction and quantification  DNA was extracted from whole organs by standard techniques (34) with emphasis on minimizing shearing or nicking of DNA as nicked DNA has been shown to be refractory to LX-PCR (35). DNA from the brain was extracted from the right hemisphere. Extracted DNA was resuspended in 10 mM Tris 1 mM EDTA (pH 8) (TE) and stored at 4 \u00b0C. A number of samples were normalized for mtDNA content by dot blotting and hybridization with digoxigenin-labeled full-length mtDNA and densitometry. In cases where mtDNA quantification was not carried out, the DNAs were normalized by A 260 of total DNA."
+    ],
+    [
+      "Using genetic markers, the pattern of inheritance can be tracked through families. For example, by analyzing a marker linked to the eye color gene in several generations, it is possible to determine from which grandparents a child has inherited its eye color alleles. More importantly, \ufb01nding a marker linked to a disease can lead to location of the faulty gene causing the disease. Finding the gene is very valuable in the search for the cure. The distance between two loci can be expressed either as physical or genetic distance.",
+      "It is well known, however, that not all genomic markers are independent (Frazer et al., 2007).Genetic variation is often inherited in contiguous segments of DNA, such that there tends to be correlation between the inheritance of alleles at markers close to each other on the same chromosome.This genetic correlation is called linkage disequilibrium (LD), and, as a result, the effective number of independent tests (M eff ) conducted is less than the total number of markers (M).By effective number of tests, we mean the number of independent tests that would have to be conducted to lead to a null distribution for the minimum P-values that was approximately the same as that obtained when conducting tests that are necessarily correlated due to LD.",
+      "Genetic mapping is a powerful strategy that exploits genomic information to dissect complex traits into Mendelian loci (quantitative trait loci or QTL) and identifies genetic determinants that may lead to crop improvement. As marker density ceases to be a limiting factor [3], our ability to discover specific genetic determinants in a single mapping study depends upon the availability of populations with high genetic diversity and recombination density [4].",
+      "This capacity allows samples to be placed into meaningful genetic groups that reflect evolutionary relationships (more stable, lower diversity markers), while simultaneously permitting high levels of strain resolution (high diversity markers). From a clinical perspective, markers that accurately reflect broad evolutionary relationships are valuable for comparing the genetic similarity of an isolate to isolates on a regional or global scale, whereas high-resolution markers are valuable for detailed epidemiological tracking in an outbreak. Variable-number tandem repeats (VNTRs) are genetic markers that can span a range of variability and, therefore, can capture genetic relationships on multiple scales (18\u201319).",
+      "Identifying the genetic loci that modulate a trait based on correlation between variation in phenotype and variation in genotype is the essence of genetic mapping. This first involves systematically genotyping a genetically diverse population using microsatellite or SNP markers. The phenotype of interest is then measured and its variability in the population assessed. A statistical test is then carried out to identify chromosomal regions that segregate with the trait and show linkage with the trait, i.e., identify genetic regions that have the same genotype among individuals with similar trait values but differ between individuals with dissimilar trait values.",
+      "Using genetic markers, the pattern of inheritance can be tracked through families. For example, by analyzing a marker linked to the eye color gene in several generations, it is possible to determine from which grandparents a child has inherited its eye color alleles. More importantly, \ufb01nding a marker linked to a disease can lead to location of the faulty gene causing the disease. Finding the gene is very valuable in the search for the cure. The distance between two loci can be expressed either as physical or genetic distance.",
+      "Genetic variation  For decades researchers used single markers to elucidate clinal differentiation and spatial variation in allele frequencies.This approach revealed multiple markers with variation that tracked the clines, including some with the same allele at higher frequency at the same latitude in the Northern and Southern hemispheres.Examples include alcohol dehydrogenase (Adh), a-glycerol-3-phosphate dehydrogenase (Gpdh), glucose-6-phosphate dehydrogenase (G6pd), esterase-6 (Est-6), octanol dehydrogenase (Odh), and 6-phosphogluconate dehydrogenase (Pgd) [30][31][32][33] (Table 1).Perhaps the most heavily explored locus in D. melanogaster has been Adh, the first step in the ethanol detoxification pathway.The Adh-F allele encodes high catalytic activity of ADH, but this increase in activity trades off with enzyme stability at higher temperatures [34,35].Unsurprisingly, the Adh-F allele is found at a higher frequency in cooler high-latitude populations, and differentiation has occurred in parallel along clines in",
+      "In the case of genetic markers, this easily runs in the several hundreds to thousands. Moreover, the optimal subset of markers is heavily dependent on how these markers are combined, i.e. dependent on the optimal Boolean function . Altogether, one frequently has to rely on greedy search strategies that easily get stuck in local optima or near exhaustive searches that are computationally too expensive, especially when employed in permutation procedures required to assess statistical significance. Our solution to this problem hinges upon two observations.",
+      "GENE MAPPING  The opportunity to merge advances in molecular genetic technology with advances in statistical techniques expanded in earnest with the development of DNA markers such as restriction fragment length polymorphisms (Lander and Botstein, 1989). Research exploded in the past decade with the continued refinement of molecular technology yielding a variety of DNA markers, e.g., short tandem repeats (STRs) or microsatellites; variable number of tandem repeats (VNTRs); single nucleotide polymorphisms (SNPs), and gene expression microarrays or gene chips. A genetic marker is a measurable polymorphic sequence of DNA whose chromosomal location is known. Markers often have no known functional significance but are used as pointers to a particular chromosomal location. The logic of gene mapping technology is simple: Determine if there is a relationship between variability in a phenotype and variability in an anonymous DNA marker of known chromosomal location. If there is a relationship, it is taken as evidence that there is a gene that influences the trait at or near the marker.",
+      "Genetic drift. Genetic changes in populations caused by random phenomena rather than by selection.Genetic marker.A segment of DNA with an identifiable physical location on a chromosome whose inheritance can be followed.A marker can be a gene, or it can be some section of DNA with no known function.",
+      "Biological characteristics indicating initial resiliency or susceptibility of an organism include genetic profiles.As noted above, genetic markers need to have a high prevalence in the population and have a reasonably strong effect on common population health outcomes, or have an interaction effect with other health-affecting mechanisms, to be candidates for inclusion in population studies.At the moment, the only known genetic marker of clear value in a population survey is the apolipoprotein E gene (APOE), although this is likely to change in the very near future.APOE allele status is clearly related to a number of major health outcomes in older populations which are reasonably well measured in population surveys: mortality, heart disease, and cognitive functioning (Albert et al., 1995b;Corder et al., 1993;Evans et al., 1997;Ewbank, 1997;Hofman et al., 1997;Hyman et al., 1996;Luc et al., 1994;Saunders et al., 1993).Both the prevalence of alleles indicating higher risk and the size of the effect are large enough to be of importance in explaining variability in currently studied health outcomes.APOE allele status has been shown to have independent effects on health outcomes and to interact with other life circumstances such as sex and race in its effect on health outcomes (Jarvik et al., 1995;Maestre et al., 1995;Payami et al., 1992).Incorporation of information on this genetic indicator could lead to increased knowledge of the interactive mechanisms of this genetic marker and other social and behavioral variables and thus clarify some of the mechanisms leading to population differentials in cognition, heart disease, and mortality.",
+      "As described by Hermalin (1999), if genetic markers are modeled as part of an individual's physiological structure, they can provide controls for predisposing factors that affect more proximate mid-level markers of function as well as downstream health outcomes.This potential benefit of genetic information-i.e., its power in explicating the black box of Figure 11-1-may outweigh, or at least precede, its near-term potential for discovering genetic links to chronic disease.As discussed by Weiss (1998b), the situation with chronic disease differs from single locus disorders that are inherited following well-identified Mendelian rules.In general, we cannot expect to find relationships that are even as straightforward as the APOE links to cardiovascular and Alzheimer's disease.Variation across populations, difficulty in identifying a small enough area on the chromosome to search for disease-associated genes, and the problems inherent in identifying continuous outcomes with particular genes may limit finding the connections.",
+      "This capacity allows samples to be placed into meaningful genetic groups that reflect evolutionary relationships (more stable, lower diversity markers), while simultaneously permitting high levels of strain resolution (high diversity markers). From a clinical perspective, markers that accurately reflect broad evolutionary relationships are valuable for comparing the genetic similarity of an isolate to isolates on a regional or global scale, whereas high-resolution markers are valuable for detailed epidemiological tracking in an outbreak. Variable-number tandem repeats (VNTRs) are genetic markers that can span a range of variability and, therefore, can capture genetic relationships on multiple scales (18\u201319).",
+      "These variations provide a species the ability of adapting to the environment change (Liu and Cordes, 2004). DNA markers are among the most powerful tools for revealing genetic variations in organisms. Historically, many different types of markers have been used for aquaculture studies. Table 2.1  A summary of characteristics of various molecular markers used in aquaculture species.",
+      "For instance, mapping of a trait or a phenotype would require polymorphic DNA markers such as microsatellites (SSRs) or single nucleotide polymorphisms (SNPs); expression pro\ufb01ling would require genome annotation information; microarray design would require sequence information of genes, etc. The objective of this chapter is to provide a general review of genomic resources needed, and currently present for aquaculture species, for functional genomics studies. Polymorphic DNA Markers The key factor behind the signi\ufb01cant differences at the level of individuals, species, and higher order of taxonomic groups is genetic variation (polymorphism).",
+      "Functional genomics:  The study of genes, their resulting proteins, and the role played by the proteins in the biochemical processes of the body. Gene: A unit of inheritance; a working subunit of DNA. Each of the 20 000 to 25 000 genes in the body contains the code for a specific product, typically a protein such as an enzyme. Gene expression: The process by which the coded information of a gene is translated into the structures present and operating in the cell (either proteins or ribonucleic acids). Gene markers: Landmarks for a target gene, either detectable traits that are inherited along with the gene or distinctive segments of DNA. Gene map: A description of the relative positions of genes on a chromosome and the distance between them. Genetic counseling: A short-term educational counseling process for individuals and families who have a genetic disease or who are at risk for such a disease. Genetic counseling provides patients with information about their condition and helps them make informed decisions. Genetic linkage maps: DNA maps that assign relative chromosomal locations to genetic landmarks-either genes for known traits or distinctive sequences of DNA (ie, genetic markers)-on the basis of how frequently they are inherited together. Genetic testing: Examining a sample of blood or other body fluid or tissue for biochemical, chromosomal, or genetic markers that indicate the presence or absence of genetic disease. Genetics: The scientific study of heredity, how particular qualities or traits are transmitted from parents to offspring. Genome: All the genetic material in the chromosomes of a particular organism. Genome-wide: Descriptor that indicates that the entire breadth of the genome has been examined in a study (eg, a linkage or association study). Genome-wide studies do not resequence the entire genome but type (an increasingly large set of) markers distributed throughout the genome. Genomics: A \"scaled-up\" version of the science of genetics that investigates the structure and function of large sections of the genome simultaneously. Genotype: The actual genes carried by an individual (as distinct from phenotype-ie, the physical, bodily characteristics into which genes are translated). Haplotype: A way of denoting the collective genotype of a number of closely linked loci on a chromosome. Heritability (h 2 ): For any trait, the proportion of the phenotypic variability resulting from genetic variance. Note that heritability does not indicate the degree to which a trait is \"genetic.\" Nor does a high h 2 mean that the trait cannot be influenced by environment. A heritability significantly >0, however, can provide a rationale for further genetic and genomic study of a trait of interest. Heterozygous: Possessing 2 different sequences (ie, genotypes) of a particular gene, 1 inherited from each parent. High-throughput genotyping: In contrast to the older labor- and time-intensive genotyping methods, high-throughput genotyping makes use of robots, computers, and other evolving technologies, thus enabling laboratories to type up to hundreds of thousands of polymorphisms in many samples in a relatively short period of time. Homozygous: Possessing 2 identical sequences of a particular gene, 1 inherited from each parent. Interaction: The differing effect of 1 independent variable on the dependent variable, depending on the particular level of another independent variable. For example, there would be an interaction between the factors sex and treatment if the effect of treatment was not the same for male and female subjects in a drug trial. Linkage analysis: A gene-hunting technique that traces patterns of heredity in large, high-risk families in an attempt to locate a disease-causing gene mutation by identifying traits that are coinherited with it. Linkage disequilibrium: Two alleles at different loci that occur together on the same chromosome more often than would be predicted by chance alone. It is a measure of cosegregation of alleles in a population.",
+      "Source: Kearsey and Pooni (1996). Genetic maps consist of a series of markers or identifiable features at known, or perhaps best described as estimated, locations on the genome (see Figure 9). For some discrete traits, simple Mendelian inheritance is followed and the phenotype has a one to one correspondence with the genes controlling it. These are so called morphological markers, which were then related to continuous or quantitative traits of interest. Examples are shape, colour, size or height in particular varieties of peas, as studied by Mendel. For another example, see Appendix A.2.",
+      "Genomic markers used in linkage mapping have evolved from restriction fragment length polymorphisms (RFLPs) to microsatellites (simple sequence repeat polymorphisms; SSRPs), to single-nucleotide polymorphisms (SNPs), with the more modern markers exhibiting higher frequencies in the genome (thus ensuring fuller coverage). Linkage mapping of a trait is in fact the demonstration of linkage between the phenotype and a genomic marker, followed by an inference of linkage between the genomic marker and the responsible DNA variant. Transitive logic ties the phenotype with the DNA variant, which is of course the point of the exercise. See Fig.",
+      "However, because of time constraints it is often more practicable to choose an appropriate mapping population that is already available through the current stock centers. Plant species chosen for study will depend largely on the availability of suitable plant resources. Obtain appropriate mapping population information to include information on markers/genotypes (see Note 4). A marker is an identifying factor; a gene or other DNA of known location that is used to track the inheritance and so on of other genes whose exact location is not yet known.",
+      "The closer two genes are together on a chromosome, the less likely it is for a recombination event to occur between the two, causing a non-random association. This is the basis for genetic linkage. The development of genetic markers allowed the theory of linkage disequilibrium (LD) to be used in mapping genes. Genetic markers are specific genetic differences between species or cultivars, and genetic linkage of these markers to particular morphological traits can allow genetic markers to be used to represent the gene of interest (Collard et al., 2005)."
+    ],
+    [
+      "This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 1. Introduction The association between a complex phenotypic trait and genetic markers on the chromosomes can be detected through statistical analysis, leading to the identification of quantitative trait loci (QTL)\u2014regions of the chromosomes that appear to be associated with the phenotype. Quantitative trait loci (QTL) are expected to be associated with the genes controlling some aspects of the phenotype.",
+      "Nowadays many different cost-efficient genotyping solutions (including sequencing and Single Nucleotide Polymorphisms arrays) have opened the way to systematic genome-wide fine mapping of quantitative traits (Quantitative Trait Locus or QTL mapping). The process of QTL mapping (Figure 1) consists in searching for genome regions that influence the value of a given trait. For example, identifying a QTL for plant height means finding a DNA region at which the plants that carry a certain allele tend to be significantly higher or lower than those carrying another allele.",
+      "QTLs are regions within the genome whose genetic variation modulates quantitatively a phenotype characteristic of the particular trait under study (Lynch and Walsh, 1998). Determining the association between variations in specific disease phenotypes or a trait, with variations in genotypes of a reference population can be used to locate a QTL. One of the methods used for mapping QTLs associated with complex traits is genetic markers-trait association. Genetic markers associated with certain loci can be inherited in linkage disequilibrium. Generating populations with linked loci in disequilibrium is achieved through either crosses between inbred lines or the use of outbred populations.",
+      "Often, the first step in analysis of new trait data is single-marker regression across all chromosomes. A hypothetical QTL is evaluated at the location of each marker locus, and the significance of that QTL is estimated from a likelihood ratio statistic (LRS) (Haley and Knott, 1992). For this analysis, WebQTL automatically does a permutation test to establish genomewide significance criteria for the trait (Churchill and Doerge, 1994).",
+      "One possible approach to facilitate this endeavor is to identify quantitative trait loci (QTL) that contribute to the phenotype and consequently unravel the candidate genes within these loci. Each proposed candidate locus contains multiple genes and, therefore, further analysis is required to choose plausible candidate genes. One of such methods is to use comparative genomics in order to narrow down the QTL to a region containing only a few genes. We illustrate this strategy by applying it to genetic findings regarding physical activity (PA) in mice and human.",
+      "Elucidation of the molecular basis of these traits has proven difficult as they are under the control of multiple genes and genetic loci. The standard approach to gene identification involves mapping by linkage analysis in experimental crosses, and this has led to the localization in the rat genome of hundreds of quantitative trait loci (QTLs) underlying trait variation (68). We refer to these loci as physiological quantitative trait loci (pQTLs).",
+      "Often, the first step in analysis of new trait data is single-marker regression across all chromosomes. A hypothetical QTL is evaluated at the location of each marker locus, and the significance of that QTL is estimated from a likelihood ratio statistic (LRS) (Haley and Knott, 1992). For this analysis, WebQTL automatically does a permutation test to establish genomewide significance criteria for the trait (Churchill and Doerge, 1994). By default, it returns a list of marker loci that show greater than suggestive association with the trait according to standard criteria (Lander and Kruglyak, 1995), but it will also accept user-defined criteria. Local maxima in the LRS in this list identify loci that are most likely to be near QTLs. WebQTL provides this list within a few seconds.",
+      "QTLs can be identified through their genetic linkage to visible marker loci with genotypes that can be readily classified [94, 97]. As such, markers that are genetically linked to a quantitative trait will segregate more often with trait values, whereas unlinked markers will lack an association with the phenotype [94, 98]. The principal goal of a QTL analysis is to identify all QTLs linked to a trait and discern whether phenotypic differences are mainly due to a few loci with large effects, or many loci with small effects [98].",
+      "This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 1. Introduction The association between a complex phenotypic trait and genetic markers on the chromosomes can be detected through statistical analysis, leading to the identification of quantitative trait loci (QTL)\u2014regions of the chromosomes that appear to be associated with the phenotype. Quantitative trait loci (QTL) are expected to be associated with the genes controlling some aspects of the phenotype.",
+      "The basic principle of classic QTL is trait segregation along with the markers and necessitated the availability of two or more genetically different lines corresponding with the phenotypic trait. Markers like single nucleotide polymorphisms (SNPs) and microsatellites are used for genotypic distinctions (Vignal et al., 2002). QTL mapping is achieved in four basic steps; the first one is the measurement of variation for a trait in the individuals. It is a prerequisite to have the traits that show phenotypic variability among the individuals (inbred strains).",
+      "Often, the first step in analysis of new trait data is single-marker regression across all chromosomes. A hypothetical QTL is evaluated at the location of each marker locus, and the significance of that QTL is estimated from a likelihood ratio statistic (LRS) (Haley and Knott, 1992). For this analysis, WebQTL automatically does a permutation test to establish genomewide significance criteria for the trait (Churchill and Doerge, 1994). By default, it returns a list of marker loci that show greater than suggestive association with the trait according to standard criteria (Lander and Kruglyak, 1995), but it will also accept user-defined criteria. Local maxima in the LRS in this list identify loci that are most likely to be near QTLs. WebQTL provides this list within a few seconds.",
+      "Often, the first step in analysis of new trait data is single-marker regression across all chromosomes. A hypothetical QTL is evaluated at the location of each marker locus, and the significance of that QTL is estimated from a likelihood ratio statistic (LRS) (Haley and Knott, 1992). For this analysis, WebQTL automatically does a permutation test to establish genomewide significance criteria for the trait (Churchill and Doerge, 1994).",
+      "Quantitative Trait Locus (QTL) mapping To map QTL, we used 934 AXB/BXA genetic informative markers obtained from http://www.genenetwork.org. For all the in vitro measurements and gene expression linkage analysis, a genome-wide scan was performed using R/qtl [57]. Significance of QTL logarithm-of-odds (LOD) scores was assessed using 1000 permutations of the phenotype data [114] and the corresponding p-values reported. For the cellular phenotypes, QTL significance was reported at a genome-wide threshold corresponding to p < 0.05.",
+      "Typically one may obtain a location known to derive from only one of the two parent strains that contains a chromosomal region that correlates with a trait of interest. Since the actual gene and gene product will frequently remain unknown, the region is referred to as quantitative trait locus (QTL), and is simply named for the trait itself (Alberts & Schughart, 2010). Growing sets of strain-dependent marker locations in established RI strains are continually updated in online repositories.",
+      "By definition, a quantitative trait locus is a chromosomal region that contains a gene, or genes, that regulate a portion of the genetic variation for a particular phenotype (Wehner et al. 2001). The goal of QTL mapping is to identify regions of the genome that harbour genes relevant to a specified trait. QTL map locations are commonly determined by initial screening of mice with specific genetic characteristics, such as recombinant inbred strains, the F2 of two inbred strains, or recombinant congenic strains (Flint 2003).",
+      "Often, the first step in analysis of new trait data is single-marker regression across all chromosomes. A hypothetical QTL is evaluated at the location of each marker locus, and the significance of that QTL is estimated from a likelihood ratio statistic (LRS) (Haley and Knott, 1992). For this analysis, WebQTL automatically does a permutation test to establish genomewide significance criteria for the trait (Churchill and Doerge, 1994).",
+      "QTL linkage studies are conducted in order to map a region or regions of the genome which affect a continuous or quantitative trait. In agriculture, as soon as markers linked to QTL are found for economically important traits, these markers can be used for selecting individuals in breeding programmes. In human studies, the aim is often to identify markers indicating disease susceptibility. Current techniques for measuring markers are usually relatively slow and laborious. Newer DNA technology, such as SNP or single nucleotide polymorphisms (Kwok, 2001b; Patil et al.",
+      "Genomic regions linked to complex traits can be identified by genetic mapping and quantitative trait locus (QTL) analysis (Shehzad and Okuno 2014). 7 QTL mapping QTL mapping with molecular markers is the first strategy in genetic studies. In plant breeding, QTL mapping is an essential step required for marker-assisted selection (Mohan et al. 1997; Shehzad and Okuno 2014). The fundamental idea underlying QTL analysis is to associate genotype and phenotype in a population exhibiting a genetic variation (Broman and Sen 2009).",
+      "Four steps of QTL mapping are (1) development of a population, (2) genotyping the population using molecular markers, (3) phenotyping the population for a trait of interest, and (4) QTL analysis using statistical procedures to find markers linked to the QTL (Bernardo 2002). Populations used for genetic mapping can be a segregating population (F2 and backcross) or a permanent population (double haploids or recombinant inbred lines). Recombinant inbred lines (RILs) are developed by selfing of individual progenies of the F2 plants until homozygosity is achieved (F7-F8).",
+      "This tool allows systems genetic analysis of single genes or small sets of genes using a bottom-up approach. relations define quantitative trait loci (QTLs). Because the marker is not typically the actual site of the polymorphism, interpolative methods have been developed to estimate the distance of the QTL from the marker and the strength of the association. Using multiple-regression and model-fitting methods, the true complexity of the phenotypic variation can be modeled through the consideration of multiple loci and environmental factors as predictors [13]."
+    ]
+  ]
+}
\ No newline at end of file
diff --git a/gnqa/paper1_eval/src/data/datasets/old/general2_dataset.json b/gnqa/paper1_eval/src/data/datasets/old/general2_dataset.json
new file mode 100644
index 00000000..ca8918f2
--- /dev/null
+++ b/gnqa/paper1_eval/src/data/datasets/old/general2_dataset.json
@@ -0,0 +1,128 @@
+{
+  "question": [
+    "Create a how-to guide for genetic sequencing.",
+    "What is the significance of the length of telomeres?",
+    "Once a sperm combines with an egg, what determines how traits are passed onto the resulting lifeform?",
+    "Why is genetic tracing matrilineal rather than patrilineal?",
+    "How does one tell the difference between X and Y DNA, with respect to DNA tracing and determining QTLs?"
+  ],
+  "answer": [
+    "1. Start with an initial linkage or association: This is the first step in a genetic study where a trait or disease is linked to a specific region of the genome.  2. Perform Laboratory Genotyping: This involves determining the genetic variation of an individual by examining their DNA sequence using biological assays and comparing it to a standard sequence.  3. Use Genomic Sequence: The role of genomic sequence is crucial at every stage. It serves as a common thread that ties all the stages together.  4. Establish True Orthology: Determine the true orthology between genes in different species. If two genes are truly orthologous, their evolution closely follows patterns of speciation.  5. Apply Bioinformatics: Use bioinformatics tools and techniques to analyze the data. This can be applied to sequence data from any collection of organisms.  6. Understand Genome Structure and Content: Provide an overview of genome structure and content to provide a context for subsequent discussions.  7. Improve DNA Sequencing Methods: Constantly work on improving DNA sequencing methods with the ultimate goal of sequencing a human genome in a single day for a cost of about US $1,000.  8. Interpret the Information: The major challenge ahead is the interpretation of this information. Understand how our genes interact with each other, and how the environment contributes to the development of health and disease.  9. Consider the Implications: Understand the individual and societal implications of knowing our genome sequence.  10. Keep Up with Technological Advances: With further technological advances, other previously unimaginable research approaches will become real.",
+    "The length of telomeres is significant as it is associated with aging, disease, and mortality. Longer telomeres are generally associated with better health outcomes, including protection from age-related diseases, better cognitive function, and healthier lipid profiles. Shorter telomeres, on the other hand, have been linked to higher risks of heart disease, infection-related death, and other diseases. Telomere length can also be influenced by factors such as stress, with high-stress situations potentially leading to faster telomere shortening and premature aging.",
+    "The traits are determined by the combination of genes from both the sperm and the egg. This process involves meiosis, where each gamete (sperm and egg) contributes one chromosome to each pair, resulting in a zygote with a full complement of 23 chromosome pairs. The process of recombination or crossing over, where similar DNA sequences from the paired chromosomes swap genetic material, also plays a crucial role in determining the traits of the offspring. This results in a shuffling of genetic material and contributes to the genetic variation seen among offspring.",
+    "Genetic tracing is matrilineal due to the inheritance of mitochondrial DNA (mtDNA), which is passed from mother to all her children without any admixture from the father. This allows for a clear lineage tracing through the maternal line. In contrast, Y-DNA is passed from father to son, allowing for patrilineal tracing, but it does not provide information about the genetic contributions of other ancestors in a family tree.",
+    "The difference between X and Y DNA is not explicitly discussed in the context of determining QTLs in the provided text. However, in general, X and Y chromosomes are different in size and gene content, which can be identified through genetic sequencing. In the context of QTL studies, these differences could potentially influence the traits being studied. However, the specific process of how this is done in QTL mapping is not detailed in the provided text."
+  ],
+  "contexts": [
+    [
+      "To overcome the lack of phenotypic information in the 1000 Genomes Project, The ClinSeq Project was developed to pilot large-scale genome sequencing for research in genomic medicine at the National Institutes of Health Clinical Research Center in Bethesda, MD. 40 The study seeks to enroll 1000 individuals who will be evaluated for personal health status and family history. The project aims to:",
+      "We (Hein, Schierup and Wiuf) have published a 300 page book on molecular population genetics titled \u201cGene Genealogies, Sequence Variation and Evolution\u201d Oxford University Press, and are presently developing a tutorial in association mapping that we hope to publish as a booklet in 2006 and are also involved in a very large EU collaboration (Holland, Denmark, Iceland and UK) to find susceptibility genes for breast and prostate cancer. In comparative genomics, the most fundamental investigation is to find genes in a pair of aligned genomes.",
+      "Key bioinformatic steps to take a genetic study from an initial linkage or association to laboratory genotyping are illustrated. The reader should note the role of genomic sequence as a common thread through every stage  regions in man (see Chapter 5). Similar issues also exist in the establishment of true orthology between genes in different species, where one is identified to play a role in a disease model. If two genes are truly orthologous, their evolution closely follows patterns of speciation (Fitch, 2000).",
+      "In general terms, the approaches we describe can be applied to sequence data from any collection of organisms, but our emphasis here is primarily on Bioinformatics for Geneticists, Second Edition. Edited by Michael R. Barnes 2007 John Wiley & Sons, Ltd ISBN 978-0-470-02619-9 (HB) ISBN 978-0-470-02620-5 (PB)  \u2402 C 106  CH 6 COMPARATIVE GENOMICS  questions of relevance to human genetics. We begin, in Section 6.2 by presenting an overview of genome structure and content, providing a context for the subsequent discussions.",
+      "4 Assembling a View of the Human Genome Colin A. M. Semple Bioinformatics, MRC Human Genetics Unit, Edinburgh EH4 2XU, UK  4.1 Introduction The miraculous birth of the draft human genome sequence took place against the odds. It was only made possible by parallel revolutions in the technologies used to produce, store and analyse the sequence data, and by the development of new, large-scale consortia to organize and obtain funding for the work (Watson, 1990). The initial flood of human sequence has subsided as the sequencing centres have sequenced genomes from other mammalian orders and beyond.",
+      "This fully indexed but semi-intelligible \u2018book of life\u2019 immediately began to serve as a valuable framework for integration of genetic and biological data. However, knowledge of the genome sequence did not immediately clarify the nature and structure of human genetic variation.",
+      "Methods for DNA sequencing are constantly being improved, with the ultimate goal of sequencing a human genome in a single day for a cost of about US $1,000, an end that appears to be in sight (Hayden, 2014). In the very near future, whole-genome sequencing will be routinely available for clinical purposes, perhaps even beginning at birth. The major challenge ahead is the interpretation of this information. How do our genes interact with each other, and how does the environment contribute to the development of health and disease? What are the individual and societal implications of knowing our genome sequence? The answers to these and other important questions will unfold in the years ahead. Thus, we are truly in an era where precision medicine may soon become a reality.",
+      "Characteristics of genotyping and sequencing technologies",
+      "Key bioinformatic steps to take a genetic study from an initial linkage or association to laboratory genotyping are illustrated. The reader should note the role of genomic sequence as a common thread through every stage  regions in man (see Chapter 5). Similar issues also exist in the establishment of true orthology between genes in different species, where one is identified to play a role in a disease model. If two genes are truly orthologous, their evolution closely follows patterns of speciation (Fitch, 2000).",
+      "In general terms, the approaches we describe can be applied to sequence data from any collection of organisms, but our emphasis here is primarily on questions of relevance to human genetics. We begin, in Section 6.2 by presenting an overview of genome structure and content, providing a context for the subsequent discussions.",
+      "4 Assembling a View of the Human Genome Colin A. M. Semple Bioinformatics, MRC Human Genetics Unit, Edinburgh EH4 2XU, UK  4.1 Introduction The miraculous birth of the draft human genome sequence took place against the odds. It was only made possible by parallel revolutions in the technologies used to produce, store and analyse the sequence data, and by the development of new, large-scale consortia to organize and obtain funding for the work (Watson, 1990). The initial flood of human sequence has subsided as the sequencing centres have sequenced genomes from other mammalian orders and beyond.",
+      "Introduction  Since the first human genome was sequenced at an estimated cost of $150 million, several advanced high-throughput techniques \u2013 some with lower costs - have come up. At the same time, this resulted in a data deluge and a critical need to connect the heterogeneous sequencing data and associated annotations \u2013 structural and functional with the basic tenets of biology or molecular basis of development and disease.",
+      "Key bioinformatic steps to take a genetic study from an initial linkage or association to laboratory genotyping are illustrated. The reader should note the role of genomic sequence as a common thread through every stage  regions in man (see Chapter 5). Similar issues also exist in the establishment of true orthology between genes in different species, where one is identified to play a role in a disease model. If two genes are truly orthologous, their evolution closely follows patterns of speciation (Fitch, 2000).",
+      "In general terms, the approaches we describe can be applied to sequence data from any collection of organisms, but our emphasis here is primarily on questions of relevance to human genetics. We begin, in Section 6.2 by presenting an overview of genome structure and content, providing a context for the subsequent discussions.",
+      "4 Assembling a View of the Human Genome Colin A. M. Semple Bioinformatics, MRC Human Genetics Unit, Edinburgh EH4 2XU, UK  4.1 Introduction The miraculous birth of the draft human genome sequence took place against the odds. It was only made possible by parallel revolutions in the technologies used to produce, store and analyse the sequence data, and by the development of new, large-scale consortia to organize and obtain funding for the work (Watson, 1990). The initial flood of human sequence has subsided as the sequencing centres have sequenced genomes from other mammalian orders and beyond.",
+      "Ample time was allotted to answer questions and a copy of \"A Guide to Your Genome\" (National Human Genome Research Institute 2007) was provided to further assist participants' understanding and ability to communicate results with family members or others.",
+      "Whether within 10 or 12 (or 8) years, such inexpensive sequencing will change both research and clinical care, and progress does not need to wait even that long. The National Human Genome Research Institute (NHGRI) plans to focus a significant portion of the sequencing capacity that it supports on medical sequencing. For instance, the NHGRI and the National Cancer Institute are actively considering a Human Cancer Genome Project, 22 which would use DNA sequencing and a host of other genome technologies to gather information about the mutations and functional abnormalities found in multiple samples from many major types of cancer. Medical sequencing should also provide important insight into many other diseases. For example, sequencing all exons in X-linked mental retardation syndromes may reveal much about their etiology. Sequencing candidate genes in the extremes of the distribution of quantitative traits should also reveal much of importance about common diseases, such as coronary atherosclerosis. 23 With further technological advances, other previously unimaginable research approaches will become real.",
+      "Key bioinformatic steps to take a genetic study from an initial linkage or association to laboratory genotyping are illustrated. The reader should note the role of genomic sequence as a common thread through every stage  regions in man (see Chapter 5). Similar issues also exist in the establishment of true orthology between genes in different species, where one is identified to play a role in a disease model. If two genes are truly orthologous, their evolution closely follows patterns of speciation (Fitch, 2000).",
+      "In general terms, the approaches we describe can be applied to sequence data from any collection of organisms, but our emphasis here is primarily on questions of relevance to human genetics. We begin, in Section 6.2 by presenting an overview of genome structure and content, providing a context for the subsequent discussions.",
+      "4 Assembling a View of the Human Genome Colin A. M. Semple Bioinformatics, MRC Human Genetics Unit, Edinburgh EH4 2XU, UK  4.1 Introduction The miraculous birth of the draft human genome sequence took place against the odds. It was only made possible by parallel revolutions in the technologies used to produce, store and analyse the sequence data, and by the development of new, large-scale consortia to organize and obtain funding for the work (Watson, 1990). The initial flood of human sequence has subsided as the sequencing centres have sequenced genomes from other mammalian orders and beyond."
+    ],
+    [
+      "In birds, where erythrocyte telomere length (ETL) is measured, the majority of species sampled have shown no sex difference (36). Nonetheless, bird telomere dynamics are complex and, as with humans, may be affected by environment and stress. For example, a longitudinal study of black-tailed gulls (Larus crassirostris) over 2-5 years found no correlation between ETL and age or sex. Rather, ETL attrition was correlated with reduced food availability and environmental stressors (55). In a captive zebra finch (Taeniopygia guttata) population, male and female mean telomere length decreased with increasing age of the animals, but did differ between sexes (56). As these examples illustrate, the relationship between telomere length, lifespan, and sex is likely to be complex in other vertebrates.",
+      "Comparative studies of age-related telomere attrition in other species also reveal a variety of patterns. Barrett and Richardson (36) recently summarized the comparative data available on sex differences in telomere length. They found a strong correlation between male-biased mortality and either shorter telomeres or greater telomere attrition in males across bird and mammal taxa. However, telomere length did not differ between males and females in species where females are shorter-lived than males (36), suggesting that telomere shortening is not associated with species-specific longevity in a simple linear fashion. These studies generally suffer from relatively small sample sizes and are largely cross-sectional. Further, the use of diverse assays, different tissues (eg, leukocytes in mammals vs erythrocytes in birds), and lack of standardized benchmarks for accuracy makes comparisons between studies difficult.",
+      "In some organisms, there is no clear relationship between telomere length and lifespan. Age-related telomere attrition could not be detected in Daphnia pulex (57) or sea urchin species (Strongylocentrotus franciscanus and Lytechinus variegatus) (58). Studies in C. elegans examining natural variation in telomere length and experimentally manipulated telomere length detect no correlation with lifespan (59,60), and in Drosophila, which uses a telomerase-independent mechanism for telomere maintenance, there is a similar lack of correlation between longevity and telomere length (61). Similarly, data on sex differences in age-related telomere shortening are mixed. For example, in the ant species Lasius niger, the rate of telomere shortening is more rapid in short-lived males compared to longer-lived females. But, mean telomere length does not differ between the two types of females, queens and workers, despite the fact that queens live much longer than workers (up to 28 years vs 2-3 months) (62). These findings suggest that the question of how telomere shortening affects aging across species and how sex affects telomere attrition rates are complex.",
+      "With new methodologies to assess relative telomere length by Q-PCR, studies were designed to address the impact of telomere length on aging, aging associated pathologies, and mortality. One such study has correlated shorter leukocyte telomere lengths at age 60 with a three times higher risk of heart disease and an eightfold increase in risk of infection-related death (36), thereby associating measured relative cellular aging with disease and life expectancy. In a similar way, chronic stress was shown to correlate with short leukocyte telomere length, a phenomenon attributed to higher levels of oxidative stress at the cellular level (70). More recent studies have linked telomere length in smooth muscle cells with senescence and disease severity in patients with atherosclerosis (141,150). Leukocyte telomere length was also short in a cohort of similar patients and associated with a higher risk of developing occult cardiovascular disease (71). More data are needed to understand and validate the use of leukocyte telomere length as a biomarker for cardiovascular and other diseases.",
+      "Shortening of the telomeres at the ends of chromosomes has been associated with age-related disease and mortality [16][17][18]. A recent study identified a common haplotype of four SNPs in the human telomerase reverse transcriptase gene (hTERT) that is enriched in centenarians and associated with longer telomere length [19]. It was also shown that centenarians and their offspring maintain longer telomeres compared with controls and that longer telomeres are associated with protection from age-related diseases, better cognitive function and lipid profiles of healthy ageing [19].",
+      "New research has indicated how social factors, such as subordination, may translate into biological effects (Epel et al. 2004; Chae et al. 2014). In a now classic study, Epel et al. (2004) examined the telomere lengths of fifty-eight healthy premenopausal women who either had a healthy child (n = 19) or were giving care to a chronically ill child (n = 39). They measured perceived stress, years of caregiving, telomere length, and oxidative stress. They found highly statistically significant differences in telomere length between women taking care of chronically ill children and those who had healthy children. They found highly statistically significant negative correlations between telomere length and perceived stress and years of caregiving. Telomerase activity had highly statistically significant negative correlations with perceived stress and years of caregiving. Oxidative stress was highly positively correlated with perceived stress and years of caregiving. They concluded that the telomere length shortening was equivalent to 9 to 17 years of aging in the high stress group. Telomere length is considered a biomarker of aging (Finch and Kirkwood 2000). Thus, this study showed that caregiver stress had essentially aged these women 9 to 17 years compared to women who had healthy children.",
+      "The single, consistent predictor of the rate of telomere attrition shown in multiple adult and the few child longitudinal studies is the baseline measurement of telomere length at the start of each study. This suggests the importance of understanding predictors of telomere length prior to adulthood, as it determines in part the rate of change (Revesz et al. 2014a, b; Nordfjall et al. 2009). Moreover, longitudinal studies in adults have found that telomere attrition rate is dependent on baseline telomere length independent of any phenotypic predictors of shortening, such as disease or demographic variables (Nordfjall et al. 2009), attesting to the importance of studies to evaluate risk factors for shortening prior to adulthood.",
+      "Rates of decline in childhood may be particularly relevant for later chronic disease risk as shorter telomere length has been implicated in disease progression through exposure to cellular senescence, inflammatory cytokines and adipocyte hypertrophy (Raschenberger et al. 2015; Willeit et al. 2014; Monickaraj et al. 2012; Fyhrquist et al. 2013).",
+      "Adult studies have also found a negative correlation with baseline telomere length, suggesting a negative feedback regulation of leukocyte telomere length (Farzaneh-Far et al. 2010; Aviv et al. 2009; Epel et al. 2008; Nordfjall et al. 2009). It is possible that while our follow-up period was shorter than Shalev et al. 2013 and adult studies, which had a minimum of 5 year intervals with the exception of Puterman et al. (2015) who followed for a one-year time period, there may be biological regulation of telomere length at 4 and 5 years of age such that shorter telomeres are more robustly maintained, whereas longer telomeres have greater rates of decline, over a short period of one year. It is unlikely that this relationship is due to assay error or regression to the mean given the consistency of our findings across studies. We have had similar findings of longer telomeres having greater rates of decline and shorter telomeres being maintained in our different studies (Farzaneh-Far et al. 2010; Epel et al. 2008; Puterman et al. 2015).",
+      "We found primarily maintenance and lengthening from 4 to 5 years of age in children, with minimal telomere attrition, indicating that most of the telomere loss happens in the first 4 years, plateauing by age 4. Lastly, we found close to 10 % of the variance in rate of change in children shared by mothers. While some of this shared variance is genetic, there are likely environmental factors that need to be further identified that impact rate of telomere length change.",
+      "Abstract Telomeres are the protective complexes at the end of chromosomes, required for genomic stability. Little is known about predictors of attrition in young children or the relationship between parental and child patterns of telomere change. Telomere length was assessed twice over one year, at 4 and at 5 years of age, in Latino preschool children (n = 77) and their mothers (n = 70) in whole blood leukocytes. Maternal and child rates of attrition during the same time period were compared in 70 mother-child pairs. More children showed lengthened telomeres over one year compared to their mothers and very few children showed attrition (2.6 %). Approximately 31 % of children and 16 % of mothers displayed lengthening over one year while 66 % of children showed maintenance in contrast with 74 % of mothers. The strongest predictor for child telomere length change was child's baseline telomere length (r = \u22120.61, p < 0.01). Maternal rate of change was associated with child rate of change (r = 0.33, p < 0.01). After controlling for child baseline telomere length, the relationship between child and maternal rate of change trended towards significance (Coeff = 0.20, 95 % CI \u22120.03 to 0.43; p = 0.08).",
+      "Blackburn and Epel, a health psychologist who did original research on how specific lifestyle and psychological habits can protect telomeres, published The Telomere Effect (Blackburn & Epel, 2017), in which they suggested that individuals with shorter telomeres developed diseases earlier in life (a shorter \"disease span\"). What follows is the evidence from these authors, their colleagues, and other researchers describing how length of telomeres contributes to mind-body connection and healthy longevity.",
+      "As early as at the time of birth, each of the 92 telomeres of the human genome has its own characteristic length. Additionally, each telomere shortens by its individual attrition rate. In general, longer telomeres at birth are associated with higher age-dependent attrition rates and vice versa. Overall, telomere shortening appears more dynamic in males.",
+      "In conclusion, a combination of overall and chromosome-specific shorter telomeres and more pronounced age-dependent telomere erosion could be observed in males. There is a prospective clinical study strongly suggesting that longer telomeres decrease the risk of dying (Cawthon et al., 2003). With this in mind, the telomere length discrepancies between the sexes may indeed be a factor influencing the differences in their life expectancy.",
+      "In every chromosome a linear decline of telomere length with age was observed, being more pronounced in men independent of the examined chromosome arm. This might suggest that telomere length on single chromosome arms may be influenced by the same factors which determine overall telomere length.",
+      "S. Mayer a S. Br\u00fcderlein a S. Perner a I. Waibel a A. Holdenried a N. Ciloglu a C. Hasel a T. Mattfeldt a K.V. Nielsen b P. M\u00f6ller a a Institute of Pathology, University of Ulm, Ulm (Germany); b DakoCytomation A/S, Glostrup (Denmark) follow uniformity. In previous studies, sex-specific differences in telomere length and attrition rate of men and women were found (Benetos et al., 2001; Cawthon et al., 2003; Nawrot et al., 2004), suggesting gender differences in behavior of telomeres. In individual chromosome arms, telomere length was also shown not to be homogeneous (Lansdorp et al., 1996; Benn, 1997; Martens et al., 1998; Surralles et al., 1999; Hao and Tan, 2001; Londono-Vallejo et al., 2001; Graakjaer et al., 2003), some telomeres being significantly shorter, others longer than the average length.",
+      "To date, these characteristics in telomere lengths could not be set in a biological context, as only a few groups have provided detailed information about chromosome-specific patterns of telomere distribution (Lansdorp et al., 1996; Graakjaer et al., 2003). Whether accumulation of short telomeres (Martens et al., 2000; Londono-Vallejo et al., 2001) or rather the shortest telomere of one specific chromosome arm (Hemann et al., 2001) elicits senescence, remains an open question so far.",
+      "In recent literature, there are hints that the average telomere length may be higher in women and that their annual shortening rate may be somewhat lower (Vaziri et al., 1993; Rufer et al., 1998; Jeanclos et al., 2000), but these reported differences failed to reach statistical significance except for one study (Jeanclos et al., 2000). Here, we provide compelling evidence that this is indeed the case.",
+      "It is generally accepted that telomeres shorten during DNA replication both in vitro and in vivo. In individuals, short telomeres are considered to be a sign of advanced age. Cawthon and coworkers (2003) showed that telomere shortening in humans likely contributes to mortality, supporting the hypothesis that they might act as a mitotic clock (Allsopp et al., 1992). Telomere length dynamics, however, does not seem to Abstract. During aging, telomeres are gradually shortened, eventually leading to cellular senescence. By T/C-FISH (telomere/centromere-FISH), we investigated human telomere length differences on single chromosome arms of 205 individuals in different age groups and sexes. For all chromosome arms, we found a linear correlation between telomere length and donor age. Generally, males had shorter telomeres and higher attrition rates. Every chromosome arm had its individual age-specific telomere length and erosion pattern, resulting in an unexpected heterogeneity in chromosome-specific regression lines. This differential erosion pattern, however, does not seem to be accidental, since we found a correlation between average telomere length of single chromosome arms in newborns and their annual attrition rate. Apart from the above-mentioned sex-specific discrepancies, chromosome arm-specific telomere lengths were strikingly similar in men and women. This implies a mechanism that arm specifically regulates the telomere length independent of gender, thus leading to interchromosomal telomere variations.",
+      "Shortening of the telomeres at the ends of chromosomes has been associated with age-related disease and mortality [16][17][18]. A recent study identified a common haplotype of four SNPs in the human telomerase reverse transcriptase gene (hTERT) that is enriched in centenarians and associated with longer telomere length [19]. It was also shown that centenarians and their offspring maintain longer telomeres compared with controls and that longer telomeres are associated with protection from age-related diseases, better cognitive function and lipid profiles of healthy ageing [19]."
+    ],
+    [
+      "Selection could occur at multiple levels, from germ cell generation and propagation to fertilization and early embryonic growth. Chromosomal abnormalities, including aneuploidy, were found in 10-20% of spermatozoa and oocytes (20) and in the cleaved embryo, with a 21% rate of abnormalities in preimplantation embryos (21). These findings led to a model for natural selection against chromosome abnormalities (21). Selection extends to the end of gestation: Only approximately 30% of all conceptions result in a live birth, with more than half of aborted fetuses containing chromosomal abnormalities (22), a number likely to be an underestimate because of technological limitations in measuring all possible mutations. But even in the very small fraction of germ cell duos that survive this withering genome attack and result in a live birth, a number of severe de novo mutations will still be found (23). The data on gross chromosomal alterations suggest that overall, mutation frequency early in life is very high. The functional consequence, however, is limited because of selection. Somewhat surprisingly, this picture points toward an initial decline in genomic alterations, allowing the adult individual to acquire a somatic genome optimally equipped to provide function.",
+      "The phenotype of the F1 hybrids is compared to those of the parental inbred strains to reveal dominance or semi-dominance relationships between the alleles that affect the phenotype. Phenotypic differences between reciprocal F1 hybrids indicate that one or more of the following factors may affect the trait: (1) sex linkage (X- or Y-linked traits), (2) genomic imprinting of QTLs that affect the phenotype, (3) prenatal maternal effects (effects of intrauterine environment), and/or (4) postnatal maternal or paternal effects (effects of maternal and/or paternal parenting behaviour on offspring).",
+      "Sex brings harmful alleles together into the same genetic background, allowing selection to more efficiently purge them from the population and potentially producing some offspring that are fitter than either parent. However, the benefit of recombining deleterious mutations may depend on the nature of the epistatic interactions between them. The mutational deterministic hypothesis (Kondrashov 1988) depends partly on this epistasis.",
+      "In most plants and animals, sex is a necessary component of reproduction, and the question for evolutionary biologists is why reproductive mechanisms have evolved that way. In one of the experiments described next, evolutionary geneticists have nevertheless devised a way to compare evolution with and without recombination in the obligately sexual fruit fly.",
+      "This disparity in investment is the basis for the twofold cost: asexual females hypothetically could transmit twice as many alleles at the same cost. In most plants and animals, mates tend to be unrelated, leading to outcrossing. But sex usually also involves the basic process of physical recombination: the breakage and reunion of two different DNA or RNA molecules. Of these two processes, recombination is clearly the more widespread feature of sexual reproduction. A variety of reproductive systems, such as selfing and automixis, involve recombination but not outcrossing. In contrast, relatively few reproductive systems have outcrossing without recombination.",
+      "Crossing over-The swapping of genetic material that occurs in the germline.During the formation of egg and sperm cells, also known as meiosis, paired chromosomes from each parent align so that similar DNA sequences from the paired chromosomes cross over one another.Crossing over results in a shuffling of genetic material and is an important cause of the genetic variation seen among offspring.This process is also known as meiotic recombination.",
+      "The reason for the rarity of these mutations is natural selection: If the mutations result in disorders that decrease health and reproductive fitness, they will eventually be eliminated from a population. In exceptional cases, mutations may cause both beneficial and detrimental consequences, resulting in opposing forces of positive selection and negative selection that may cause the mutations to be preserved at nonrare frequencies in a population. For example, the HbS mutation in the HBB gene (which produces the \u03b2 subunit of hemoglobin) causes sickle cell disease when present in both alleles, a detrimental consequence, but protects against malaria when present in 1 allele, a beneficial consequence, ensuring that the mutation persists in populations in areas of the world where malaria is endemic. Genes are passed from parents to offspring via the process of meiosis by which gametes, the egg cells in the mother and the sperm cells in the father, are generated. Ordinarily, each cell has 23 pairs of chromosomes; the gametes have 23 unpaired chromosomes. In meiosis, the 23 pairs are split so that each gamete receives 1 chromosome from each pair (Figures 8 and 9). Two gametes (egg and sperm) ultimately join into a single cell, the zygote, which has the full complement of 23 chromosome pairs restored. If all goes well, the zygote gives rise to a live offspring.",
+      "Recombination (meiotic recombination): The swapping of genetic material that occurs in the germline. During the formation of egg and sperm cells, also known as meiosis, paired chromosomes from each parent align so that similar DNA sequences from the paired chromosomes recombine with one another. Recombination results in a shuffling of genetic material and is an important cause of the genetic variation seen among offspring. Also known as crossing over.",
+      "In the generation of gametes, crossing over regularly occurs, and genetic information is swapped between members of a chromosome pair. That doesn't matter within inbred animals, because the swapped parts are identical. In an F1 animal, however, the chromosomes of a particular pair are genetically different, one each having come from each parent. Each gamete produced will be unique, as will be each F2 zygote formed by uniting of the gametes from two F1 parents. An F2 group thus provides for expression of some genetic variability. This variability is limited to the allelic differences existing between the parent strains of the F1s, so that another F2, derived from different inbred strains, will express different genetic differences.",
+      "Sex brings harmful alleles together into the same genetic background, allowing selection to more efficiently purge them from the population and potentially producing some offspring that are fitter than either parent. However, the benefit of recombining deleterious mutations may depend on the nature of the epistatic interactions between them. The mutational deterministic hypothesis (Kondrashov 1988) depends partly on this epistasis.",
+      "In most plants and animals, sex is a necessary component of reproduction, and the question for evolutionary biologists is why reproductive mechanisms have evolved that way. In one of the experiments described next, evolutionary geneticists have nevertheless devised a way to compare evolution with and without recombination in the obligately sexual fruit fly.",
+      "This disparity in investment is the basis for the twofold cost: asexual females hypothetically could transmit twice as many alleles at the same cost. In most plants and animals, mates tend to be unrelated, leading to outcrossing. But sex usually also involves the basic process of physical recombination: the breakage and reunion of two different DNA or RNA molecules. Of these two processes, recombination is clearly the more widespread feature of sexual reproduction. A variety of reproductive systems, such as selfing and automixis, involve recombination but not outcrossing. In contrast, relatively few reproductive systems have outcrossing without recombination.",
+      "Aberrant recombination patterns on chromosomes that have missegregated have also been identified as an important factor, in both male and female gametes (Table I). This is because recombination together with cohesion of sister chromatids establish the unique 'bivalent' chromosome structure where homologous partner chromosomes are tethered together, a configuration that is critical for their accurate segregation in meiosis I (Fig. 2A). The remarkable feature is that recombination occurs in foetal oocytes whereas chromosome segregation takes place decades later (Fig. 2A). Since mammalian oocytes are arrested at the G2/M transition (or dictyate stage), this raises the intriguing question of how the bivalent is maintained until the meiotic divisions.",
+      "Traditionally, it has been agreed that the final sex of an individual (phenotypic sex) depends on two sequential processes: the sex determination system of the species and the gonad differentiation process (Valenzuela, 2008). However, recently, these two seemingly distinct processes are viewed as part of a general process leading to gonad formation and sex ratios (Sarre et al., 2004; Quinn et al., 2011; Uller and Helantera\u0308, 2011).",
+      "However, we expect that only at this level, the most significant contributions brought by integrating epigenetics will be made. Concluding Remarks and Future Prospects Fish sex ratios are the result of a complex interaction between genetic, biochemical, and environmental interactions. The ultimate result of these interactions at the individual level is gender: male or female. However, at the population level, the combination of sex determination and differentiation sets the sex ratio. In turn, sex ratios define the reproductive capacity of populations and, if sex growth dimorphism exists, also the growth characteristics, something very important in an aquaculture context.",
+      "Obehav is, in turn, influenced by offspring genes and environment (Ogene and Oenvir respectively). Hence, indirect genetic effects (blue arrows) and direct genetic effects (red arrow) are important influencers of behaviour. B) Parent-offspring conflict theory predicts that parental resource investment and offspring solicitation behaviours are influenced by the fitness benefit to a focal individual (O), cost to a social partner such as a sibling (S1 and S2) or parent (P), and by their coefficient of relatedness (black arrows). Figure 2: Genomic imprinting can result in divergent phenotypes from the same genotype. A) A paternally imprinted gene, i.e. maternally expressed.",
+      "Because of the small contribution, through the sperm, of the paternal transcriptome to the fertilized zygote, and because of the stronger maternal contribution to child rearing in most model organisms, parental effects are typically thought of as synonymous with maternal effects, although true paternal effects are known to exist (Rando, 2012). Maternal effects have been shown to be important during embryonic development, leading to differences in the birth weight of mice depending on the genotype of the mother (Cowley et al., 1989; Wolf et al., 2011).",
+      "Therefore, the resulting phenotypic patterns lag a generation behind the genetic transmission of the causal variants. The most well-studied parental genetic effects are caused by deposition of maternal transcripts into the egg prior to fertilization, resulting in differences in early embryonic development depending on the genotype of the mother. Certain genes have also been shown to respond to maternal influence after birth through genetically defined maternal behaviors (Weaver et al., 2004).",
+      "The phenotype of the F1 hybrids is compared to those of the parental inbred strains to reveal dominance or semi-dominance relationships between the alleles that affect the phenotype. Phenotypic differences between reciprocal F1 hybrids indicate that one or more of the following factors may affect the trait: (1) sex linkage (X- or Y-linked traits), (2) genomic imprinting of QTLs that affect the phenotype, (3) prenatal maternal effects (effects of intrauterine environment), and/or (4) postnatal maternal or paternal effects (effects of maternal and/or paternal parenting behaviour on offspring).",
+      "It was believed by many that for each trait variant we should expect to find a corresponding genetic change, or \u201cgene for\u201d that trait. Through historical happenstance the relationship between genes and traits was set up and treated as if it were one-to-one. But the production of a trait involves not only genes, but also their interactions with each other and the environment, and chance."
+    ],
+    [
+      "distinguishing prenatal from postnatal maternal effects, see below). Maternal effects can account for a large proportion of phenotypic variance, especially during early life, and for some traits explain more variation than direct genetic effects [33, 97, 99, 100, 102\u2013115]. However, maternal and offspring genotype are correlated (i.e. half their genes are shared), and in inbred lines they are fully confounded, thus separating the effects of their respective genotypes is difficult. To remove this confounding effect cross-fostering has been used, both in the laboratory and in the field [119, 131].",
+      "Using genetic markers, the pattern of inheritance can be tracked through families. For example, by analyzing a marker linked to the eye color gene in several generations, it is possible to determine from which grandparents a child has inherited its eye color alleles. More importantly, finding a marker linked to a disease can lead to location of the faulty gene causing the disease. Finding the gene is very valuable in the search for the cure. The distance between two loci can be expressed either as physical or genetic distance.",
+      "Although autosomal SNPs are commonly used as genetic markers to infer ancestry or race/ethnicity membership, haploid markers such as mitochondrial, Y-DNA, and X-linked markers are also important to provide separate stories of ancestry of individuals from paternal and maternal sides [42,43]. Therefore, genetic structure created due to autosomal markers could be different from those of lineage markers (often influenced by political, social, and migration history of individuals/populations). Mitochondrial DNA, or the mtDNA haploid, is the maternally inherited mitochondrial genome (mtDNA) [44]. All children inherit mtDNA from their mother, with no admixture from the father. Like Y-line DNA, mtDNA is passed intact from one generation to the next but through maternal line.",
+      "a) Autosomal DNA (testing both sexes) markers: autosomal DNA tests utilize DNA from the 22 pairs of autosomal chromosomes.Autosomal DNA is inherited from both parents.Autosomal testing provides percentages of ethnicity using autosomal DNA SNP test (i.e., ancestry informative markers), and it is the most commonly used test to infer ancestry across diploid genome.b) Y-DNA or Y-SNPs (paternal line testing) markers: a haploid Y-DNA is the paternally inherited non-recombining portion of the Y chromosome, and it tests only for males.The Y-DNA testing tests the Y chromosome which is passed intact from father to son with no DNA from the mother.Y-DNA testing can then be used to trace direct paternal line.Y-DNA remains the same in each generation, allowing us to compare surname from different regions to see if we are from the same family.Y-line testing does not indicate anything about the contributions of the other ancestors in a family tree.In other words, you could be 3/4th Native American, with only the direct paternal line being European, and this test would tell you nothing at all about those other three Native lines.When testing the Y-chromosome, there are two types of tests, short tandem repeat (STR) and SNP markers.STR tests are best for recent ancestry while SNP tests tell about more ancient ancestry.c) Mitochondrial DNA (maternal line testing) markers:",
+      "Additional information about past breeding practices can be gleaned by quantifying the number of reproductive males and females in a population.This can be achieved by comparing levels of genetic diversity between sex chromosomes, autosomes and mtDNA 99 .In cattle, for example, gene flow from aurochs is evident in the autosomes but is absent in mtDNA 41 .This has been interpreted as a management strategy that may have involved allowing insemination of domesticated females by wild bulls 41,100 .In horses, a comparison of the levels of diversity of the Y chromosome and the autosomal chromosomes demonstrated that some cultures allowed fewer males to breed and instead selected specific stallion bloodlines 55 .This male-oriented breeding strategy was not practised by the Romans and only became increasingly prominent in the past 1,000 years as a result of the growing influence of Oriental stallions (Arabian, Persian and Turkmen) 101 .",
+      "Dr Ring: What makes the maternal gene so peculiar compared to the paternal?Dr Cookson: If you look in the epidemiologic sense, many studies show that there is increased risk of allergic disease if the mother is affected.However, very few studies have actually set out to test that formally and most of them might suffer from some sort of selection bias because the mother is more likely to be aware of her symptoms and feel guilty, and so on.It is very difficult to explain.Is it genomic imprinting, where the gene is only active when transmitted through the mother?I do not think all of these genes would be imprinted, though it is possible.It also seems that there are effects of the maternal phenotype.The maternal phenotype, if the mother is affected or unaffected, determines the strength of the maternal effect.Again, if a gene was imprinted, you would not expect maternal phenotype to be important.So, I think that this has something to do with maternal/fetal interaction, either through the placenta or shortly after birth.There is the issue of immune conflict between mother and child.At the same time, the mother is trying to prime the infant's immune system.",
+      "Genetic and Genomic Discovery Using Family Studies  Ingrid B. Borecki, PhD; Michael A. Province, PhD Genetic studies traditionally have been performed on sets of related individuals, that is, families. Mendel's early studies in sweet peas (Pisum sativum) on the inheritance patterns of discrete traits from parents with specific mating types to offspring has shed light on the basic mechanisms of inheritance, including the fundamental laws of segregation of discrete factors (genes) from parents to offspring and the cosegregation of genes that are closely located on a chromosome (linkage). The distribution of traits within families exhibited mathematical segregation ratios in offspring from known mating types. These expected segregation ratios have been used as an important discovery tool in the study of human diseases in pedigrees, providing evidence for a multitude of single-gene disorders. Furthermore, in some cases, trait cosegregation with genetic markers with known positions provides mapping information that enables localization and, ultimately, identification of the relevant causative gene.",
+      "In fact, this idea has been pursued before in the context of signatures of reproductive isolation and shown to reveal patterns consistent with epistatic gene interactions that arise in the shape of Dobzhansky-Muller incompatibilities [10,11]. In contrast to the mouse data, the available human genotypes were derived from outbred, ethnically distinct populations. In this case pairs of functionally interacting genes can be detected following a slightly different approach.",
+      "Family Structure  The first re-identification method (FAMILY) employs genealogical data accompanying genomic data. Genealogies, rich in depth and structure, permit the construction of complex familial relationships. Consider a simple family structure of two parents and one child. Since the parental genders are guaranteed, there exist 2 variants of this structure, since the child's gender is either male or female. When disease status is taken into account, it is represented as a Boolean variable; either an individual afflicted or not afflicted. In this aspect, all three family members can be represented as three attributes {Father, Mother, Child}, and there exist (father's disease status)*(mother's disease status)*(child's disease status)*(child's gender) = 2*2*2*2 = 16 possible family-disease combinations. In reality, pedigrees are much more robust than a simple nuclear family. For example, a three-generation family of two children per family permits on the order of 10^5 distinct variants of the family-disease structure and 10^6 individuals that could be uniquely characterized. The number of combinations is larger when supplementary information, such as living status or medical/genetic features, is considered.16 The ability to determine unique family structures is only one part of the re-identification process. These structures must be linked to identifiable information, which, in many instances, is publicly available in the form of various genealogical databases. These databases are accessible both offline and via the World Wide Web. For example, genealogical records are available in many public databases, including <Ancestry.com>, <Infospace.com>, <RootsWeb.com>, <GeneaNet.com>, <FamilySearch.org>, and <Genealogy.com>. From such data, it is not difficult to construct family structures and, with such information in hand, an adversary can link disease-labeled family structures to named individuals.",
+      "Fig. 3. Illustrations of the three CEU pedigrees (black) showing how genetic information from distant patrilineal relatives (arrow; red, patrilineal lines) can identify individuals. Filled squares represent sequenced individuals. To respect the privacy of these families, only abbreviated versions are presented. The sex of the CEU grandchildren was randomized. The numbers of grandchildren are not given.",
+      "When I was in high school, I remember often trying to match my friends to their parents at various school functions and being surprised at how easy this was.As human geneticists, in spite of the enormous advances being made in our field, we still cannot answer many of the everyday questions that we are asked, such as: \"Why does he look just like his mother? \"Max Perutz [1], in a recent editorial comment in the New Scientist entitled \"The Molecular Biology of the Future,\" suggested some questions, for, as he put it, \"an examination in some future century. \"Here are two of them: (1) \"The time has come\" the Walrus said, \"To talk of many things ...And why the sea is boiling hot And whether pigs have wings. \"Calculate the amount of genetic information this would require in megacricks.",
+      "Using genetic markers, the pattern of inheritance can be tracked through families. For example, by analyzing a marker linked to the eye color gene in several generations, it is possible to determine from which grandparents a child has inherited its eye color alleles. More importantly, finding a marker linked to a disease can lead to location of the faulty gene causing the disease. Finding the gene is very valuable in the search for the cure. The distance between two loci can be expressed either as physical or genetic distance.",
+      "In contrast, genomic imprinting is due to epigenetic changes within the individual causing differential gene expression characterized by either complete or partial silencing of one parental allele (Barlow, 2011; Abramowitz and Bartolomei, 2012; Ashbrook and Hager, 2013). As both mothers and fathers had contact with the pups in our study, our observed PGEs could come from either parent. Among quantitative USV traits only peak amplitude of call displayed a possible parent-of-origin effect. For call number, call duration, mean peak frequency, and all morphological traits, there were no significant parent-of-origin effect in reciprocal F1 females. In contrast, Thornton et al.",
+      "Another way of avoiding stratification is to use family-based samples.This approach has several theoretical advantages: as well as being immune to stratification 114 , these samples can be used to determine whether an allele has different effects on disease when it is inherited maternally or paternally 115 , and DISCORDANT SIB designs [116][117][118] can control for the effects of shared environment.Furthermore, more complex family-based designs are possible 119 that might allow combined association and linkage analysis 120 , and family-based association tests have also been developed for quantitative traits [94][95][96][97][98] .However, pure sibship-based association studies are underpowered relative to case-control studies 107,116,117 , and the requirement for living parents might introduce an age-of-onset bias towards younger patients for diseases that usually arise late in life.Furthermore, family-based samples are often much more difficult to collect, particularly if larger pedigrees are sought.Finally, the most commonly used family-based design, the TRANSMISSION DISEQUILIBIRIUM TEST (TDT; see REF. 114) is susceptible to technical artefacts (see below).",
+      "Because mtDNA is not subjected (as far as we know) to sexual recombination and crossover at the time of nuclear meiosis, nature must call on other means to ensure that inevitable germ plasm mtDNA mutations (Medvedev, 1981) are not transmitted.These mutations among primary oocytes, on the face of it, can be expected to increase with time, that is with maternal age.Empirical data on this question are incomplete and conflicting, being mostly confined so far to searches for deletions rather than point mutations (Chen et al., 1995;Keefe et al., 1995).It is inevitable, however, that there will be such mutations and that there must therefore be a reliable physiological mechanism (a) for giving an opportunity for back-mutations to occur, (b) for selecting in favor of those back-mutations (thus preserving the genome) and in favor of rare advantageous mutations, and (c) for preventing the spread of persistent harmful mutations through the population -mutations that are too slight (or too late in origin) to have escaped intraovarian culling.The sheer conservation of the mitochondrial genome over 0.5 billion years or more, despite a mutation rate estimated at 10 -20 times that of nuclear DNA, is ample reason to conclude that such a physiological purification process must exist.",
+      "To scrutinize the polygenic networks underlying complex diseases, however, mouse resources that are optimized to study the actions of isolated genetic loci on a fixed background will be insufficient on their own. For example, predisposition to the metabolic syndrome is inherited in a non-Mendelian fashion stressing genetic heterogeneity and multigenetic pathogenesis (Nandi et al. , 2004). With the reawakening as to the extraordinary genetic resources and phenotypic diversity archived in extant inbred strains, however, a foundation is in place for tracking down these complex traits and quantitative trait loci (QTL).",
+      "Otherwise, tens of thousands of markers will appear significant in the genome-wide association studies using up to one million genetic markers. Approaches to control for stratification include using self-report of ancestry or genetically derived principal components in the analysis. For studies using inbred mouse lines, a cladogram, which is a hierarchical grouping based on phylogenetic analysis of strain relatedness, can be created to subdivide inbred strains into more genetically homogeneous subgroups.",
+      "Although bilateral descent is the norm in Western societies, it is not universal and there is variation with cultural practices around lineage.In certain societies, individuals place greater importance on (and have greater knowledge about) one side of the family than another (unilineal descent).Thus, individuals in patrilineal groups trace relationships through males only so that your father's brother's children are members of your family, but not your father's sisters (Kottak, 2007).They are members of their husband's group or family.Efforts to create a family pedigree may be hampered if the participant is not familiar with her mother's relatives, but her mother's brother's children (her cousins) may be able to supplement her overall family history.Knowledge about the cultural system of unilineal descent avoids assuming the universality of bilateral descent.Cultural beliefs such as these also have implications in the conduct of genetic research in terms of confidentiality and autonomy (Benkendorf et al., 1997;Wertz, 1997).One cannot assume that the named proband is in a position to speak for the extended family in agreeing to participate in any genetic research (DudokdeWit et al., 1997).",
+      "In particular in polygynous species, a female\u2019s offspring may have different fathers and are thus more closely related through the maternal than the paternal line. Therefore, any fitness cost to mothers, such as increased provisioning and care, affect maternally derived genes more strongly than paternally derived genes, leading to the silencing of the maternal copy (i.e. paternal expression) of genes that increase resource transfer. 5. Coadaptation between offspring and maternal traits The genetics of the co-evolution of parental and offspring traits has been investigated using quantitative genetics models and in several empirical studies (Agrawal et al.",
+      "In this scenario, genes expressed in parents will be selected for their effects on parental behaviour while genes expressed in offspring will be selected for their effects on influencing parental behaviour. At the genetic level the predicted conflict between paternal and maternal genomes is thought to have led to the evolution of genomic imprinting (monoallelic gene expression). Genomic imprinting effects are good examples of offspring genetic effects on maternal care because of the impact on the quality of maternal care and level of resource provisioning (e.g. Li et al. , 1999)."
+    ],
+    [
+      "Genetic mapping in mouse strains enhances the power of detecting modifier genes and identifying complex genetic interactions. Genomewide quantitative trait locus (QTL) analysis, as described in more detail below, represents a promising approach to detect genetic variants that are associated with specific phenotypes and interact with each other. In experimental crosses of two (inbred) strains the first generation (F1) of offspring is genetically heterozygous but equal. Then in the next generation (F2) the strain-specific genetic information is distributed across the genomes of their progeny and each offspring is genetically unique.",
+      "Second, and perhaps more important, is the difference in the size and types of the genetic reference populations. In our previous study, we mapped the QTL with 36 F2 mice that were genotyped at 82 markers. In the current study, by comparison, we were able to map QTLs after examining 342 mice from 55 strains that were genotyped at approximately 4000 markers.",
+      "This contrast can be exploited to identify subregions that underlie the trans-QTLs [67]. SNPs were counted for all four pairs of parental haplotypes\u2014B vs D, B vs H, B vs C, and L vs S\u2014and SNP profiles for the four crosses were compared (figure 6). Qrr1 is a highly polymorphic [...] Figure 5. QTL for aminoacyl-tRNA synthetases in distal Qrr1.",
+      "The traditional approach to QTL mapping is to use two strains that differ maximally in the phenotype as parental strains for genetic crosses, with the following caveats. QTL analysis based on a single cross will most likely reflect only a small portion of the net genetic variation, and QTL detection will be limited to regions where the two progenitor strains have functional polymorphisms. Data from multiple crosses, or from an HS, will overcome this limitation and can also be used to reduce QTL intervals [5,30].",
+      "These candidate genes are then sequenced in the two parental inbred strains looking for sequence differences in coding or regulatory regions. After fine mapping the QTL interval and shortening the list of plausible candidate polymorphisms, the major challenge remains: proving definitively which nucleotide polymorphism underlies the QTL. The most direct proof would be replacing one strain\u2019s allele with another strain\u2019s allele (creating a FIG. 1. Intercross breeding strategy for mapping quantitative trait loci (QTLs). On the right, the parental, F1 hybrid, and intercross (F2) mouse generations are depicted.",
+      "Furthermore, splicing QTLs (sQTLs) rather than eQTLs could comprise the molecular mechanism linking DNA variants with YFP53; thus, sQTL analysis could uncover genes that would not normally be detected at the level of differential gene expression (DGE),53 and thus, a differentially [...] Figure 8.5 Schematic for immediate, rapid fine mapping in select F2 recombinants of the RCC-F2 cross. Top panel: Genome-wide significant QTL (green trace; red dashed line = significance threshold; blue vertical lines = Bayes credible interval).",
+      "Interval-specific haplotype analysis Approximately 97% of the genetic variation between inbred mouse strains is ancestral [22], so regions of identity by descent (IBD) between two strains used to detect a QTL are highly unlikely to contain the causal genetic polymorphism underlying the QTL [28]. For example, a cross between C57BL/6J and A/J mice detected a blood pressure QTL on Chr 1 [7].",
+      "Interval-specific haplotype analysis Approximately 97% of the genetic variation between inbred mouse strains is ancestral [22], so regions of identity by descent (IBD) between two strains used to detect a QTL are highly unlikely to contain the causal genetic polymorphism underlying the QTL [28]. For example, a cross between C57BL/6J and A/J mice detected a blood pressure QTL on Chr 1 [7].",
+      "At present, the BXD panel is composed of 80 different strains that all have been fully genotyped.26 Variation in any quantifiable trait can be associated with the segregation of parental alleles, and linkage genetics can map this variation to quantitative trait loci (QTLs), thereby identifying the genomic region(s) affecting that trait. An overview of the QTL mapping approach is depicted in Figure 2. Classical QTL analysis has permitted the identification of loci that are associated with variation in HSC traits.",
+      "In general, linking genetic variation with trait variation identifies QTL and a significant linkage of phenotype and genotype suggest that the DNA status helps to determine trait expression. As stated above, mouse QTL studies provide distinct advantages over human studies in the examination of genetic causes of a quantitative trait (e.g. alcoholism), even in the absence of specific hypotheses regarding its aetiology or candidate genes.",
+      "The progenitor mouse strains should have sufficient variation for the traits of interest and they should be genetically diverse enough to enable genetic mapping (BENNETT et al. 2006; FLINT 2003; GRISEL 2000). The sample size required for the identification of QTL depends largely on the effect size that a QTL contributes to phenotypes of interest. Inference about QTL can be made if one or more genetic markers are over- or underrepresented in the analysed individuals. Genotyping is often done by means of microsatellite markers, which contain mono-, di-, tri-, or tetranucleotide tandem repeats flanked by specific sequences (Figure 4a).",
+      "This comparison gives information about the reliability of the observed genotype information: The more the marker locations differ between the two maps (which signifies variation in marker positions), the higher the possibility of genotyping errors. QTL mapping was done in several stages to identify loci acting individually and QTL that interacted, either additively or epistatically. To determine individually-acting QTL, a single-QTL genome scan was conducted with the function scanone.",
+      "Importantly, whereas these studies required substantial labor, time, and resources, X-QTL is a quick and easy approach to achieve a comparable level of genetic dissection. The levels of complexity observed here (e.g. 14 loci explaining 70% of the genetic variance for 4-NQO resistance) are still dramatically lower than those seen for some human traits in GWAS (e.g. 40 loci explaining 5% of the variance for height 2,5). One obvious explanation is the difference in experimental designs (line crosses vs. population association studies), but differences in genetic architectures among species and traits may also contribute.",
+      "The method uses two pieces of information: mapping data from crosses that involve more than two inbred strains and sequence variants in the progenitor strains within the interval containing a quantitative trait locus (QTL). By testing whether the strain distribution pattern in the progenitor strains is consistent with the observed genetic effect of the QTL we can assign a probability that any sequence variant is a quantitative trait nucleotide (QTN). It is not necessary to genotype the animals except at a skeleton of markers; the genotypes at all other polymorphisms are estimated by a multipoint analysis.",
+      "The method uses two pieces of information: mapping data from crosses that involve more than two inbred strains and sequence variants in the progenitor strains within the interval containing a quantitative trait locus (QTL). By testing whether the strain distribution pattern in the progenitor strains is consistent with the observed genetic effect of the QTL we can assign a probability that any sequence variant is a quantitative trait nucleotide (QTN). It is not necessary to genotype the animals except at a skeleton of markers; the genotypes at all other polymorphisms are estimated by a multipoint analysis.",
+      "Genotyping all the individual progeny for markers that show allelic variation between the parental strains (either single nucleotide polymorphisms or simple sequence repeats) will allow the detection of associations between trait values and marker genotype, and in this way demonstrate to which set of markers a QTL is linked. To reduce the genotyping effort, selective genotyping of the individuals at the extremes of the phenotypic spectrum can be performed (20,23). Although these three approaches are in general considered to be the best to detect and map QTL, they have several disadvantages for quantitative traits involving HSC.",
+      "So, how do you go about planning and performing a QTL study, and how do you identify the responsible gene within a QTL that you have identified? Generally, one starts by performing a strain survey to find two parental inbred strains that have a markedly different trait. One can now look up many different traits of inbred mice online at the Mouse Phenome Database (http://phenome.jax.org/pub-cgi/phenome/mpdcgi?rtn=docs/home). However, the trait you may want to study may not be present in wild type mice, so you may want to cross a mutant (or genetically engineered) strain onto several inbred strains.",
+      "QTL Theory and Planning The theory behind the most basic form of QTL mapping is based upon intercrossing two inbred strains. The mouse genome consists of 19 pairs of autosomes (non sex-determining chromosome) and the X and Y chromosomes. In the example shown in Fig. 18.1, we are intercrossing strain A (shown with a black chromosome pair) with strain B (shown with a white chromosome pair). The initial F1 (filial generation 1) mice are true hybrids, with each individual  From: Molecular Biomethods Handbook, 2nd Edition.",
+      "These candidate genes are then sequenced in the two parental inbred strains looking for sequence differences in coding or regulatory regions. After fine mapping the QTL interval and shortening the list of plausible candidate polymorphisms, the major challenge remains: proving definitively which nucleotide polymorphism underlies the QTL. The most direct proof would be replacing one strain\u2019s allele with another strain\u2019s allele (creating a FIG. 1. Intercross breeding strategy for mapping quantitative trait loci (QTLs). On the right, the parental, F1 hybrid, and intercross (F2) mouse generations are depicted.",
+      "QTL mapping studies then seek to detect the polymorphisms underlying the complex traits of interest by scanning for alleles that co-vary with the traits. Similar experiments also can be conducted with special derivatives of inbred strains known as recombinant inbred (RI) mice. These animals are derived by cross-breeding two or more distinct parental strains (which often diverge widely for the trait of interest), followed by inbreeding of the offspring for several generations (Bailey 1971). Given the correct breeding strategy, this method 1  This is an issue faced by GWASs researchers when classifying samples as cases or controls."
+    ]
+  ]
+}
\ No newline at end of file