{
"question": [
"Which genes give a predisposition to developing T1D?",
"What is ensembl",
"Which database can I use for genetic, genomics, phenotype, and disease-related data generated from rat research?",
"What is RGD?",
"What resources can I use to do pathway analyses?"
],
"answer": [
"The genes that give a predisposition to developing T1D include the HLA class II genes, the insulin locus (INS) VNTR on 11p15, the cytotoxic T-lymphocyte-associated protein 4 (CTLA4) locus on 2q31, the PTPN22 gene on 1p13, and the interleukin 2 receptor alpha (IL2RA) on 10p15. Other genes associated with T1D include those in the HLA region on chromosome 6, specifically HLA alleles DR3-DQ2 or DR4-DQ8, and alleles at HLA-DQB1.",
"Ensembl is a joint project between the EBI and the Wellcome Trust Sanger Institute. It is a publicly available web resource that contains automatically annotated genomes and integrates data from a wide range of biological research sources. The Ensembl database, launched in 1999, was the first to provide a window on the draft genome, curating the results of a series of computational analyses. It provides detailed information about the human genome including variants like SNPs, insertions, deletions and somatic mutations for several species. Ensembl also provides a DAS reference server giving access to a wide range of specialist annotations of the human genome.",
"You can use the Rat Genome Database (RGD) for genetic, genomic, phenotype, and disease-related data generated from rat research.",
"The background text does not provide information on what RGD is.",
"You can use various resources for pathway analyses such as Ingenuity Pathway Analysis (IPA), Disease Association Protein-Protein Link Evaluator (DAPPLE), Pathway Studio, MetaCore, REACTOME, LitInspector, NetPath, Predictive Networks, GeneGo, Database for Annotation, Visualization and Integrated Discovery (DAVID), PATHWAYASSIST, E! Ensemble, Protonet, Pandora, Pubmatrix, KEGG, Reactome, g:Profiler, Gene Ontology, Biocarta, GenMAPP, WebGestalt, Gene Set Enrichment Analysis (GSEA), ClueGo, CluePedia, Cytoscape, RegulonDB, WIT.UM-BBD, EcoCyc, MetaCyc, Enzyme and metabolic pathways database, and Gene-Set Enrichment Analysis (GSEA)."
],
"contexts": [
[
"A. Genetic ScreeningWe have discussed above the genetic component of T1D.The genetic susceptibility to T1D is determined by genes related to immune function with the potential exception of the insulin gene (434).The genetic susceptibility component of T1D allows some targeting of primary preventive care to family members of diagnosed T1D patients, but there is no complete inheritance of the disease.Nevertheless, the risk for developing T1D compared with people with no family history is 10 -15 times greater.Although 70% of individuals with T1D carry defined risk-associated genotypes at the HLA locus, only 3-7% of the carriers of such genetic risk markers develop diabetes (3).II. THE GENETICS OF TYPE 1 DIABETESA comprehensive overview of genetic data in mouse and human is beyond the scope of this article.Instead, we will focus on how the various susceptibility genes and environmental triggers can fit in a mechanistic model for T1D etiology.",
"T1D risk is strongly influenced by multiple genetic loci and as yet poorly understood environmental factors.The disease is highly heritable, with first-degree relatives of cases displaying approximately 15 times greater risk than the general population.Concordance in monozygotic twins is also as high as 50%.A number of genetic determinants of T1D had already been established before the era of genome-wide association studies.The strongest genetic factors include the HLA class II genes, encoding highly polymorphic antigen-presenting proteins that account for almost 50% of the genetic risk for T1D.Other established loci confer more modest, but substantial effects, such as the insulin locus (INS) VNTR on 11p15 [44][45][46][47], the cytotoxic T-lymphocyte-associated protein 4 (CTLA4) locus on 2q31 [48][49][50][51] and the PTPN22 gene on 1p13 [52,53].More recently, convincing statistical support for an additional T1D susceptibility locus on 10p15 harboring the interleukin 2 receptor alpha (IL2RA) was uncovered, utilizing non-coding SNPs [54][55][56].However, the majority of other associations in the pre-GWA era remain controversial [57][58][59], and linkage studies have established the fact that there are no other loci with an effect size approaching that of HLA.",
"Clearly genetics play an important role in the T1D disease process as both MZ and DZ twins have the same environmental exposures but different concordance rates and length to diagnosis of the second twin.Numerous genes have been associated with T1D, the most significant being the HLA region on chromosome 6 [6].More than 90% of type 1 diabetics carry HLA alleles DR3-DQ2 or DR4-DQ8 compared to no more than 40% of the general population [7].Alleles at HLA-DQB1 are known to be, in part, protective [8].Single nucleotide polymorphisms (SNPs) are also associated with T1D.A recent genome-wide association study of approximately 2,000 patients with each of 7 common, chronic diseases, including T1D, and 7,000 shared controls confirmed the association of SNPs in 5 previously identified regions with T1D and discovered 5 novel associations.However, the authors concluded that these regions, with the exception of the HLA on chromosome 6, confer only modest effects on T1D, and ''the association signals so far identified account for only a small proportion of overall familiality'' [9].These results suggest that additional genetic variants contribute to inheritance of T1D.Type 1 diabetes (T1D) tends to cluster in families, suggesting there may be a genetic component predisposing to disease.However, a recent large-scale genome-wide association study concluded that identified genetic factors, single nucleotide polymorphisms, do not account for overall familiality.Another class of genetic variation is the amplification or deletion of .1 kilobase segments of the genome, also termed copy number variations (CNVs).We performed genome-wide CNV analysis on a cohort of 20 unrelated adults with T1D and a control (Ctrl) cohort of 20 subjects using the Affymetrix SNP Array 6.0 in combination with the Birdsuite copy number calling software.We identified 39 CNVs as enriched or depleted in T1D versus Ctrl.Additionally, we performed CNV analysis in a group of 10 monozygotic twin pairs discordant for 
T1D.Eleven of these 39 CNVs were also respectively enriched or depleted in the Twin cohort, suggesting that these variants may be involved in the development of islet autoimmunity, as the presently unaffected twin is at high risk for developing islet autoimmunity and T1D in his or her lifetime.These CNVs include a deletion on chromosome 6p21, near an HLA-DQ allele.CNVs were found that were both enriched or depleted in patients with or at high risk for developing T1D.These regions may represent genetic variants contributing to development of islet autoimmunity in T1D.Type 1 diabetes (T1D) tends to cluster in families, suggesting there may be a genetic component predisposing to disease.However, a recent large-scale genome-wide association study concluded that identified genetic factors, single nucleotide polymorphisms, do not account for overall familiality.Another class of genetic variation is the amplification or deletion of .1 kilobase segments of the genome, also termed copy number variations (CNVs).We performed genome-wide CNV analysis on a cohort of 20 unrelated adults with T1D and a control (Ctrl) cohort of 20 subjects using the Affymetrix SNP Array 6.0 in combination with the Birdsuite copy number calling software.We identified 39 CNVs as enriched or depleted in T1D versus Ctrl.Additionally, we performed CNV analysis in a group of 10 monozygotic twin pairs discordant for T1D.Eleven of these 39 CNVs were also respectively enriched or depleted in the Twin cohort, suggesting that these variants may be involved in the development of islet autoimmunity, as the presently unaffected twin is at high risk for developing islet autoimmunity and T1D in his or her lifetime.These CNVs include a deletion on chromosome 6p21, near an HLA-DQ allele.CNVs were found that were both enriched or depleted in patients with or at high risk for developing T1D.These regions may represent genetic variants contributing to development of islet autoimmunity in T1D.",
"Background: The immune system matures mainly during the postnatal period through breastfeeding, and is partly modified by nutritive factors.The manner by which early feeding practices influence the development of type 1 diabetes mellitus (TID) is not clear.Also the use of genetics in prognostic evaluation of the disease has not be studied intensely. Aim:To study the relationship between early infant feeding patterns and susceptibility to TID through the HLA-DRB1 and DQ allelic polymorphism and identify the genes of high predictive value in the prognostic model. Methods:The study included 24 diabetic children with TID matched with 21 controls.All the children were exposed to detailed history of the disease process and anthropometry for weight, height and body mass index.Blood samples were collected from all 45 cases for measuring HLA-DRB1and HLA-DQB1allelic polymorphism for the susceptible genes of HLA-DRB1 0301, 0302, 0401 and 0402 and HLA-DQB1*02 and for the protective genes HLA-DRB1 07,*13 by polymerase chain reaction sequence specific primer (PCR-SSP) done by genomic DNA extraction using Genomic DNA purification kits.Results: Allelic polymorphism for the susceptible genes of HLA-DRB1 were shown to be higher in the diabetic group compared to the control group especially for the 0302 and 0401 alleles at P<0.05, but was not significant for HLA-DRB1-0301 and 0402 at P>0.05.HLADRB1*07 and HLADRB1*13 were significantly higher in the breastfed healthy but not in the diseased or the formula fed groups (p<0.001)(p<0.05).The detection of HLADRB1 0401 allele was more with retinopathy and HLADRB1 0301 allele with microalbuminuria. Conclusions:The absence of protective genes is a strong predictor of TID.Susceptibility genes are influenced by early feeding patterns and in turn affect the clinical course of the disease that could be of prognostic value in TID.",
"More than 60 susceptibility loci have been identified (Table 1).The greatest genetic risk (50%) for T1D is conferred by alterations to immune genes, especially those encoding the classical HLAs (Ounissi-Benkalha and Polychronakos, 2008).Other genetic loci (Table 1) are believed to influence population-level risk for T1D, although it is poorly understood how these non-HLA loci contribute to disease susceptibility (Ram et al., 2016a).The genetics of type 1 diabetesThere is a strong genetic risk to T1D.This is exemplified by (Redondo et al., 2001) who demonstrated a strong concordance of genetic inheritance (65%) and T1D susceptibility in monozygotic twin pairs.That is, when one sibling is afflicted, there is a high probability that the other twin will develop T1D by the age of 60 years.Additionally, autoantibody positivity and islet destruction was observed after a prospective long-term follow-up of monozygotic twins of patients with T1D, despite initial disease-discordance among the twins (Redondo et al., 2008).",
"Family and twin studies indicate that a substantial fraction of susceptibility to type 1 diabetes is attributable to genetic factors.These and other epidemiologic studies also implicate environmental factors as important triggers.Although the specific environmental factors that contribute to immune-mediated diabetes remain unknown, several of the relevant genetic factors have been identified using two main approaches: genome-wide linkage analysis and candidate gene association studies.This article reviews the epidemiology of type 1 diabetes, the relative merits of linkage and association studies, and the results achieved so far using these two approaches.Prospects for the future of type 1 diabetes genetics research are considered.Family and twin studies indicate that a substantial fraction of susceptibility to type 1 diabetes is attributable to genetic factors.These and other epidemiologic studies also implicate environmental factors as important triggers.Although the specific environmental factors that contribute to immune-mediated diabetes remain unknown, several of the relevant genetic factors have been identified using two main approaches: genome-wide linkage analysis and candidate gene association studies.This article reviews the epidemiology of type 1 diabetes, the relative merits of linkage and association studies, and the results achieved so far using these two approaches.Prospects for the future of type 1 diabetes genetics research are considered.",
"CONCLUSIONThe greatest genetic risk (both increased risk, susceptible, and decreased risk, protective) for type 1 diabetes is conferred by specific alleles, genotypes, and haplotypes of the HLA class II (and class I) genes.There are currently about 50 non-HLA region loci that also affect the type 1 diabetes risk.Many of the assumed functions of the non-HLA genes of interest suggest that variants at these loci act in concert on the adaptive and innate immune systems to initiate, magnify, and perpetuate -cell destruction.The clues that genetic studies provide will eventually help lead us to identify how -cell destruction is influenced by environmental factors.While there is extensive overlap between type 1 diabetes and other immune-mediated diseases, it appears that type 1 and type 2 diabetes are genetically distinct entities.These observations may suggest ways to help identify causal gene(s) and, ultimately, a set of disease-associated variants defined on specific haplotypes.Unlike other complex human diseases, relatively little familial clustering remains to be explained for type 1 diabetes.The remaining missing heritability for type 1 diabetes is likely to be explained by as yet unmapped common variants, rare variants, structural polymorphisms, and gene-gene and/or gene-environmental interactions, in which we can expect epigenetic effects to play a role.The examination of the type 1 diabetes genes and their pathways may reveal the earliest pathogenic mechanisms that result in the engagement of the innate and adaptive immune systems to produce massive -cell destruction and clinical disease.The resources established by the international T1DGC are available to the research community and provide a basis for future discovery of genes that regulate the earliest events in type 1 diabetes etiology-potential targets for intervention or biomarkers for monitoring the effects and outcomes of potential therapeutic agents.",
"IntroductionOver 60 loci in the genome contribute to genetic predisposition to type 1 diabetes (T1D) [1][2][3][4][5] in which insulin deficiency results from an autoimmune attack against insulin-producing beta cells of the pancreatic islets.Heterogeneity in the disease aetiology is recently acknowledged and immunological processes leading to T1D in individuals diagnosed later in life appear different from the processes in individuals having disease onset in early childhood, in which B cells are involved in the pathological process in the pancreas [5].Different genes and genetic variants may thus affect disease course at varying ages, also suggested by the high diagnosis age correlation (r 2 = 0.95) in Finnish monozygotic twins concordant for T1D [6].Of the known T1D risk loci, however, only the HLA locus and a few non-HLA loci, have been associated with age at diagnosis [7][8][9][10].Genetic risk score combines risk-increasing alleles into a single score and the genetic risk score for T1D has already been suggested for clinical use for screening of infants at highest T1D risk [11].All disease-susceptibility variants are included in the score, but only a few known T1D variants have stronger effects in individuals with early-onset disease [10].Genes affecting type 1 diabetes diagnosis age / A. Syreeni et al.Genome-wide search for genes affecting the age at diagnosis of type 1 diabetes.",
"The risk for T1D is strongly influenced by multiple genetic loci and environmental factors.The disease is heritable, with first-degree relatives of patients with T1D being at 15-fold greater risk for developing the condition than the general population.",
"Type 1 DiabetesThe higher type 1 diabetes prevalence observed in relatives implies a genetic risk, and the degree of genetic identity with the proband correlates with risk (22)(23)(24)(25)(26). Gene variants in one major locus, human leukocyte antigen (HLA) (27), confer 50-60% of the genetic risk by affecting HLA protein binding to antigenic peptides and antigen presentation to T cells (28).Approximately 50 additional genes individually contribute smaller effects (25,29).These contributors include gene variants that modulate immune regulation and tolerance (30)(31)(32)(33), variants that modify viral responses (34,35), and variants that influence responses to environmental signals and endocrine function (36), as well as some that are expressed in pancreatic b-cells (37).Genetic influences on the triggering of islet autoimmunity and disease progression are being defined in relatives (38,39).Together, these gene variants explain ;80% of type 1 diabetes heritability.Epigenetic (40), gene expression, and regulatory RNA profiles (36) may vary over time and reflect disease activity, providing a dynamic readout of risk.",
"Type 1 diabetes risk stratification by T1D family history and HLA genotyping",
"Genetics. T1DM is a polygenic disease that is influ enced by environmental factors.Genetic risk factors are necessary but not sufficient for disease, as their pene trance is low.The concordance rate of T1DM among monozygotic twins is reported to be only 30%, although a recent study that involved longterm followup suggested that this percentage might be higher 47,48 .",
"Presently, 48 other genomic regions, referred to as susceptibility regions, have been found to also confer susceptibility to T1D (Burren et al., 2011;Steck and Rewers, 2011;Yang et al., 2011;Bluestone et al. 2010;Poicot et al., 2010;Todd et al., 2010;Todd et al., 2007).But their contribution is minimal in comparison to the HLA locus (Gillespie, 2014).Also, research has shown that less than 10% of individuals with HLA-conferred diabetes susceptibility actually progress to clinical disease (Knip andSiljandera, 2008, Wenzlau et al., 2008).This implies that additional factors are needed to trigger and drive -cell destruction in genetically predisposed persons (Knip and Siljandera, 2008).Environmental factors are believed to influence the expression of T1D.The reason being that in the case of identical twins, if one twin has T1D, the other twin only has it 30%-50% of the time, despite having the same genome.This means that other factors contribute to the prevalence or onset of this disease (Knip et al., 2005)."
],
[
"Zerbino, D. R., Achuthan, P., Akanni, W., Amode, M. R., Barrell,D., Bhai, J., Billis, K., Cummins, C., Gall, A., Girn, C. G., Gil,L., Gordon, L., Haggerty, L., Haskell, E., Hourlier, T., Izuogu, O.G., Janacek, S. H., Juettemann, T., To, J. K., Laird, M. R., Lavidas, I., Liu, Z., Loveland, J. E., Maurel, T., McLaren, W., Moore,B., Mudge, J., Murphy, D. N., Newman, V., Nuhn, M., Ogeh, D.,Ong, C. K., Parker, A., Patricio, M., Riat, H. S., Schuilenburg,H., Sheppard, D., Sparrow, H., Taylor, K., Thormann, A., Vullo,A., Walts, B., Zadissa, A., Frankish, A., Hunt, S. E., Kostadima,M., Langridge, N., Martin, F. J., Muffato, M., Perry, E., Ruffier,M., Staines, D. M., Trevanion, S. J., Aken, B. L., Cunningham,F., Yates, A., and Flicek, P.: Ensembl 2018, Nucl.",
"But the four sites are not equivalent; there are important distinctions between them in terms of the data analysed, the analyses carriedout and the way the results are displayed. 4.4.1 EnsemblEnsembl is a joint project between the EBI (http://www.ebi.ac.uk/) and the WellcomeTrust Sanger Institute (http://www.sanger.ac.uk/). The Ensembl database (Hubbardet al. , 2002; http://www.ensembl.org/), launched in 1999, was the first to provide awindow on the draft genome, curating the results of a series of computational analyses.Until January 2002 (Release 3.26.1), Ensembl used the UCSC draft sequenceassemblies as its starting point, but it is now based upon NCBI assemblies. TheEnsembl analysis pipeline consists of a rule-based system designed to mimic decisions made by a human annotator. The idea is to identify confirmed genes that arecomputationally predicted (by the GENSCAN gene prediction program) and alsosupported by a significant BLAST match to one or more expressed sequences orproteins. Ensembl also identifies the positions of known human genes from publicsequence database entries, usually using GENEWISE to predict their exon structures.Data retrieval is extremely well catered for in Ensembl, with text searches of alldatabase entries, BLAST searches of all sequences archived, and the availability of bulkdownloads of all Ensembl data and even software source code. Ensembl annotationcan also be viewed interactively on ones local machine with the Apollo viewer (Lewiset al. , 2002; http://www.fruitfly.org/annot/apollo/). 4.4.2 The UCSC Human Genome BrowserThe UCSC Human Genome Browser (UCSC) bears many similarities to Ensembl;it, too, provides annotation of the NCBI assemblies, and it displays a similar array offeatures, including confirmed genes from Ensembl.Ensembl provides a DAS referenceserver giving access to a wide range of specialist annotations of the humangenome (for more detail, see http://www.ensembl.org/das/). 
Data mining The ability to query very large databases in order to satisfy ahypothesis (top-down data mining), or to interrogate a database in order togenerate new hypotheses based on rigorous statistical correlations (bottom-updata mining). Domain (protein) A region of special biological interest within a single proteinsequence.",
"But the four sites are not equivalent; there are important distinctions between them in terms of the data analysed, the analyses carriedout and the way the results are displayed. 4.4.1 EnsemblEnsembl is a joint project between the EBI (http://www.ebi.ac.uk/) and the WellcomeTrust Sanger Institute (http://www.sanger.ac.uk/). The Ensembl database (Hubbardet al. , 2002; http://www.ensembl.org/), launched in 1999, was the first to provide awindow on the draft genome, curating the results of a series of computational analyses.Until January 2002 (Release 3.26.1), Ensembl used the UCSC draft sequenceassemblies as its starting point, but it is now based upon NCBI assemblies. TheEnsembl analysis pipeline consists of a rule-based system designed to mimic decisions made by a human annotator. The idea is to identify confirmed genes that arecomputationally predicted (by the GENSCAN gene prediction program) and alsosupported by a significant BLAST match to one or more expressed sequences orproteins. Ensembl also identifies the positions of known human genes from publicsequence database entries, usually using GENEWISE to predict their exon structures.Data retrieval is extremely well catered for in Ensembl, with text searches of alldatabase entries, BLAST searches of all sequences archived, and the availability of bulkdownloads of all Ensembl data and even software source code. Ensembl annotationcan also be viewed interactively on ones local machine with the Apollo viewer (Lewiset al. , 2002; http://www.fruitfly.org/annot/apollo/). 4.4.2 The UCSC Human Genome BrowserThe UCSC Human Genome Browser (UCSC) bears many similarities to Ensembl;it, too, provides annotation of the NCBI assemblies, and it displays a similar array offeatures, including confirmed genes from Ensembl.Ensembl provides a DAS referenceserver giving access to a wide range of specialist annotations of the humangenome (for more detail, see http://www.ensembl.org/das/). 
Data mining The ability to query very large databases in order to satisfy ahypothesis (top-down data mining), or to interrogate a database in order togenerate new hypotheses based on rigorous statistical correlations (bottom-updata mining). Domain (protein) A region of special biological interest within a single proteinsequence.",
"But the four sites are not equivalent; there are important distinctions between them in terms of the data analysed, the analyses carriedout and the way the results are displayed. 4.4.1 EnsemblEnsembl is a joint project between the EBI (http://www.ebi.ac.uk/) and the WellcomeTrust Sanger Institute (http://www.sanger.ac.uk/). The Ensembl database (Hubbardet al. , 2002; http://www.ensembl.org/), launched in 1999, was the first to provide awindow on the draft genome, curating the results of a series of computational analyses.Until January 2002 (Release 3.26.1), Ensembl used the UCSC draft sequenceassemblies as its starting point, but it is now based upon NCBI assemblies. TheEnsembl analysis pipeline consists of a rule-based system designed to mimic decisions made by a human annotator. The idea is to identify confirmed genes that arecomputationally predicted (by the GENSCAN gene prediction program) and alsosupported by a significant BLAST match to one or more expressed sequences orproteins. Ensembl also identifies the positions of known human genes from publicsequence database entries, usually using GENEWISE to predict their exon structures.Data retrieval is extremely well catered for in Ensembl, with text searches of alldatabase entries, BLAST searches of all sequences archived, and the availability of bulkdownloads of all Ensembl data and even software source code. Ensembl annotationcan also be viewed interactively on ones local machine with the Apollo viewer (Lewiset al. , 2002; http://www.fruitfly.org/annot/apollo/). 4.4.2 The UCSC Human Genome BrowserThe UCSC Human Genome Browser (UCSC) bears many similarities to Ensembl;it, too, provides annotation of the NCBI assemblies, and it displays a similar array offeatures, including confirmed genes from Ensembl.Ensembl provides a DAS referenceserver giving access to a wide range of specialist annotations of the humangenome (for more detail, see http://www.ensembl.org/das/). 
Data mining The ability to query very large databases in order to satisfy ahypothesis (top-down data mining), or to interrogate a database in order togenerate new hypotheses based on rigorous statistical correlations (bottom-updata mining). Domain (protein) A region of special biological interest within a single proteinsequence.",
"EnsemblEnsembl is a publicly available web resource that contains automatically annotated genomes.It is integrated with other available biological databases like Jasper for binding motifs.It is a much larger web resource than T1Dbase, and contains general information about the human genome including variants.These include SNPs, insertions, deletions and somatic mutations (Alterations in DNA that occur after conception, meaning that they are not inherited) for several species.Data from Ensembl can be accessed in a number of ways.The names of all the SNPs that occur in the T1D susceptibility regions can be collected from Ensembl using the Biomart tool (Kinsella et al., 2011).To achieve this, the coordinates of the T1D regions obtained from T1Dbase are uploaded to the biomart query page which allows one to search the genome browser and retrieve data like the names, chromosomal positions, and genic positions (referred to as \"consequence to transcript\", in Ensembl) of the SNPs.The SNP genic positions tell if a SNP is located within a gene, adjacent to a gene or whether they occur in inter-genic positions between gene coding regions, as well as the particular genes in which they are located.Advantages of Ensembl:There is a number of advantages to using Ensembl. (i) It is a larger web resource than T1Dbase and integrates data from a wide range of biological research sources into its database.Therefore, available information is quite comprehensive. (ii) Genic positions for 99% of the variants obtained from T1Dbase could be retrieved. 
(iii) Ensembl contains quality checks for genetic variants in its variation pipeline.A variant is flagged as failed if certain quality criteria are not met, for instance if none of the variant alleles match the reference allele of the variant.Generally, Ensembl was found to give more detailed information regarding the genic positions of variants compared to T1Dbase.Information about genes, including gene names, chromosomal coordinates, biotype (coding or non-coding), and number of splice variants, can also be retrieved from Ensembl.",
"But the four sites are not equivalent; there are important distinctions between them in terms of the data analysed, the analyses carriedout and the way the results are displayed. 4.4.1 EnsemblEnsembl is a joint project between the EBI (http://www.ebi.ac.uk/) and the WellcomeTrust Sanger Institute (http://www.sanger.ac.uk/). The Ensembl database (Hubbardet al. , 2002; http://www.ensembl.org/), launched in 1999, was the first to provide awindow on the draft genome, curating the results of a series of computational analyses.Until January 2002 (Release 3.26.1), Ensembl used the UCSC draft sequenceassemblies as its starting point, but it is now based upon NCBI assemblies. TheEnsembl analysis pipeline consists of a rule-based system designed to mimic decisions made by a human annotator. The idea is to identify confirmed genes that arecomputationally predicted (by the GENSCAN gene prediction program) and alsosupported by a significant BLAST match to one or more expressed sequences orproteins. Ensembl also identifies the positions of known human genes from publicsequence database entries, usually using GENEWISE to predict their exon structures.Data retrieval is extremely well catered for in Ensembl, with text searches of alldatabase entries, BLAST searches of all sequences archived, and the availability of bulkdownloads of all Ensembl data and even software source code. Ensembl annotationcan also be viewed interactively on ones local machine with the Apollo viewer (Lewiset al. , 2002; http://www.fruitfly.org/annot/apollo/). 4.4.2 The UCSC Human Genome BrowserThe UCSC Human Genome Browser (UCSC) bears many similarities to Ensembl;it, too, provides annotation of the NCBI assemblies, and it displays a similar array offeatures, including confirmed genes from Ensembl.Ensembl provides a DAS referenceserver giving access to a wide range of specialist annotations of the humangenome (for more detail, see http://www.ensembl.org/das/). 
Data mining The ability to query very large databases in order to satisfy ahypothesis (top-down data mining), or to interrogate a database in order togenerate new hypotheses based on rigorous statistical correlations (bottom-updata mining). Domain (protein) A region of special biological interest within a single proteinsequence."
],
[
"The database contains trait data for severalhundred phenotypes including common inbreds, consomics, 80 BXD recombinant inbreds,hybrids, and over 60,0000 mutagenised mice including ENU mutants and several knockoutlines. SOPs are employed for phenotypic data acquisition. This publicly accessible databaseis an excellent example of one that can be made significantly more valuable to thecommunity with a standard in place for the reporting of these protocols. PhenoSITE (http://www.gsc.riken.go.jp/Mouse/phenotype/top.htm) provides baselinephenotype data for three inbred strains and their F1 hybrids.",
"The MouseGenome Database (MGD) has structured their mouse genomic data in terms of the Mammalian Phenotype Ontology[10]. Similarly, the Rat Genome Database (RGD) [11] alsodeveloped a phenome database, integrated with its genomicdata. In humans, the GeneNetwork (WebQTL) provides adatabase of complex traits with mappings to quantitative traitloci [12]. And several studies have focused on integratinghuman phenome and genome resources. For example, Butteet al. created a large-scale phenomegenome network byintegrating the Unied Medical Language System with humanmicroarray gene expression data [13]; and Aerts et al.de la Cruz N, Bromberg S, Pasko D, Shimoyama M, Twigger S, et al. (2005)The Rat Genome Database (RGD): Developments towards a phenomedatabase. Nucleic Acids Res 33: D485D491. Wang J, Williams RW, Manly KF (2003) WebQTL: Web-based complex traitanalysis. Neuroinformatics 1: 299308. Butte AJ, Kohane IS (2006) Creation and implications of a phenomegenome network. Nat Biotechnol 24: 5562. Aerts S, Lambrechts D, Maity S, Van Loo P, Coessens B, et al. (2006) Geneprioritization through genomic data fusion. Nat Biotechnol 24: 537544.",
"Shur-Jen Wang provided an overview of the Rat Genome Database, which provides a platform to improve model selection.The database includes a quantitative phenotype tool that provides expected ranges for a phenotype of interest across strain groups, drawing from published literature and other deposited data and resources.This tool can also be used to link phenotypic variation to damaging genomic variants, which are shown in parallel.",
"This is apublicly available database that contains phenotypes from hundreds of studies and alsolists basal gene expression data for many tissues, including brain regions. 3.4. Why Mice? The European house mouse (Mus musculus) has served as human analogue in basicresearch for many decades. Ethical and logistic limitations preclude almost all toxicogeneticresearch in humans. Genome-wide association studies in humans have revealed the geneticbasis for individual differences in several diseases; however, the exact mechanisms for geneaction are difficult to ascertain. Thus, the use of animal models to uncover mechanismsbecomes the approach [61,62].",
"A number of public data resources are also being established to provide freelyaccessible microarray data on drug- and toxicity-related phenotypes. For example,the Chemical Effects in Biological Systems (CEBS) database (Mattes et al. , 2004) isa highly recommended resource that accommodates gene-expression profiles, andproteomics and metabolomics data and allows very complex queries across morethan 100 experiments, mostly performed in rat liver. These experiments include datagenerated after exposure to members of key drug classes, including the antidiabetic,troglitazone (Rezulin); the antiepileptic, valproic acid; and the antidepressive, fluoxetine (Prozac) among other drugs (Mattes et al. , 2004).",
"Although these as yet include only alimited number of laboratories and genotypes, they all try to enlist larger groupsof researchers and to expand the animalmodels covered, and they are publicly available. It will be beneficial for the redesign ofnew behavioral measures that raw behavioral data will be available as well in thesedatabases. Access to this information will allowexperimenters to extract from the databasethe size of the genotype-by-laboratory interaction relevant to their experiment.",
", 2014; see Section 9). GeneNetwork is a database that enables searching for 4000 phenotypes from multiple studies in the BXD, HXB, and in other recombinant inbred rodent families, as well as in other model organismsand even humans (Mulligan et al. , 2017). GeneNetwork employed asomewhat dierent strategy than MPD in that it did not rely solely onresearchers submitting their data. Instead the database operators extracted the data from the scientic literature and integrated them into auniform format (Chesler et al. , 2003).In the future, these two dataresources, the per strain phenotype data storage with thorough protocoldocumentation in MPD, the Rat Genome Database, and genetic analysissuite in GeneNetwork.org will be more closely integrated (Mulliganet al. , 2017). The public database of the International Mouse Phenotyping221Neuroscience and Biobehavioral Reviews 87 (2018) 218232N. Kafka et al. Consortium (IMPC) is intended to be the rst truly comprehensivefunctional catalogue of a mammalian genome (Morgan et al. , 2009;Koscielny et al. , 2014).",
"Useful Databases for the Exploration of Relationships Among Genetic Variations and Specific Phenotypes.",
"Shimoyama M, De Pons J, Hayman GT, Laulederkind SJ, Liu W, Nigam R, Petri V, Smith JR,Tutaj M, Wang S-J, The Rat Genome Database 2015: genomic, phenotypic and environmentalvariations and disease, Nucleic acids research 43(D1) (2014) D743D750. [PubMed: 25355511][24]. Dickinson ME, Flenniken AM, Ji X, Teboul L, Wong MD, White JK, Meehan TF, Weninger WJ,Westerberg H, Adissu H, High-throughput discovery of novel developmental phenotypes, Nature537(7621) (2016) 508. [PubMed: 27626380][25].",
"All data presented in this paper were deposited in the online databaseGeneNetwork (www.genenetwork.org), an open web resource that containsgenotypic, gene expression, and phenotypic data from several genetic referencepopulations of multiple species (e.g. mouse, rat and human) and various celltypes and tissues.35;36 It provides a valuable tool to integrate gene networks andphenotypic traits, and also allows cross-cell type and cross-species comparativegene expression and eQTL analyses.",
"This is apublicly available database that contains phenotypes from hundreds of studies and alsolists basal gene expression data for many tissues, including brain regions. 3.4. Why Mice? The European house mouse (Mus musculus) has served as human analogue in basicresearch for many decades. Ethical and logistic limitations preclude almost all toxicogeneticresearch in humans. Genome-wide association studies in humans have revealed the geneticbasis for individual differences in several diseases; however, the exact mechanisms for geneaction are difficult to ascertain. Thus, the use of animal models to uncover mechanismsbecomes the approach [61,62].",
"The Mouse Phenome Database would be a natural choice: it already provides acontrolled vocabulary for representing phenotype measurements and enforces correct strain nomenclature tofacilitate accurate comparisons across studies. Effectiveintegration of phenotypic and genetic data, facilitated bythe databases and analytical tools presented in this review,is critical to realizing the promise of the CC as it existstoday.",
"A number of public data resources are also being established to provide freelyaccessible microarray data on drug- and toxicity-related phenotypes. For example,the Chemical Effects in Biological Systems (CEBS) database (Mattes et al. , 2004) isa highly recommended resource that accommodates gene-expression profiles, andproteomics and metabolomics data and allows very complex queries across morethan 100 experiments, mostly performed in rat liver. These experiments include datagenerated after exposure to members of key drug classes, including the antidiabetic,troglitazone (Rezulin); the antiepileptic, valproic acid; and the antidepressive, fluoxetine (Prozac) among other drugs (Mattes et al. , 2004).",
"The GeneNetwork database provides open accessto BXD and other RI strain derived microarray data, single nucleotide polymorphism (SNP) data,and phenotypic data for quantitative trait loci analysis and gene expression correlation analyses. Gene expression data were exported for manually selected probes in the PDNN hippocampusdatabase (Hippocampus Consortium M430v2), and the PDNN whole brain database (INIA BrainmRNA M430). The Hippocampus database was chosen as one of the most elaborate brain databases,as well as most highly recommended dataset on GeneNetwork itself (http://www.genenetwork.org/webqtl/main.py?FormID=sharinginfo&GN_AccessionId=112).",
"The Mouse Phenome Database would be anatural choice: it already provides a controlled vocabulary for representing phenotypemeasurements and enforces correct strain nomenclature to facilitate accurate comparisonsacross studies. Effective integration of phenotypic and genetic data, facilitated by thedatabases and analytical tools presented in this review, is critical to realizing the promise ofthe CC as it exists today.",
"RGD database (www.rgd.mcw.edu) provides updated genetic,genomic, phenotype, and disease data generated from mouse, rat,and human. A total of 450 genes were downloaded using cardiomyocyte, myocyte, and cardiomyopathy as the keywords. GWAS Catalog (www.ebi.ac.uk/gwas) database provides published genome-wide association studies in human populations. Atotal of 126 genes associated with cardiomyopathy disease with pvalue 5 10 6 were downloaded using cardiomyopathy asthe key word. IMPC database (http://www.mousephenotype.org/) provides detailed phenotype data for the knockout mouse. A total of 636genes were downloaded using cardiomyocyte, myocyte, andcardiomyopathy as key words. collaborative eort [19].",
"A number of public data resources are also being established to provide freelyaccessible microarray data on drug- and toxicity-related phenotypes. For example,the Chemical Effects in Biological Systems (CEBS) database (Mattes et al. , 2004) isa highly recommended resource that accommodates gene-expression profiles, andproteomics and metabolomics data and allows very complex queries across morethan 100 experiments, mostly performed in rat liver. These experiments include datagenerated after exposure to members of key drug classes, including the antidiabetic,troglitazone (Rezulin); the antiepileptic, valproic acid; and the antidepressive, fluoxetine (Prozac) among other drugs (Mattes et al. , 2004).",
"A number of public data resources are also being established to provide freelyaccessible microarray data on drug- and toxicity-related phenotypes. For example,the Chemical Effects in Biological Systems (CEBS) database (Mattes et al. , 2004) isa highly recommended resource that accommodates gene-expression profiles, andproteomics and metabolomics data and allows very complex queries across morethan 100 experiments, mostly performed in rat liver. These experiments include datagenerated after exposure to members of key drug classes, including the antidiabetic,troglitazone (Rezulin); the antiepileptic, valproic acid; and the antidepressive, fluoxetine (Prozac) among other drugs (Mattes et al. , 2004)."
],
[
"d",
"Summary",
"b gg n n e e r c S",
"G",
"d",
"npg",
"Hence only G2D and Gentrepid will be discussed here.",
"F, forward; R, reverse.",
"~~~.",
"n.d.n.d.",
"3KR",
"What Is Relevant?",
"R5. Ubuntu philosophya)R5. Ubuntu philosophy (See page 66)",
"RSet in 10/12 pt Dutch801BT by Aptara\u0002Inc., New Delhi, IndiaDisclaimerThe publisher and the author make no representations or warranties with respect to the accuracy orcompleteness of the contents of this work and specically disclaim all warranties, including withoutlimitation warranties of tness for a particular purpose. No warranty may be created or extended bysales or promotional materials. The advice and strategies contained herein may not be suitable forevery situation. This work is sold with the understanding that the publisher is not engaged inrendering legal, accounting, or other professional services.",
"vid",
"npg",
"HG LG HG LG HG LG HG LG HG LG HG LG HG LG",
"rMZ"
],
[
"Pathway analysisSignificant over-representation of biochemical pathways from KEGG and Reactome as well as gene ontology terms were taken from the output of g:Profiler, http://biit.cs.ut.ee/gprofiler/ [15].Lists of genes (n > 10) pertaining to a given type of GxE interaction, i.e., either a particular phenotype or environmental factor, served as input to the pathway/ontology tool.g:Profiler was run with default settings.",
"Pathway EnrichmentPathway analyses were performed to explore possible biological mechanisms that may underlie the associations between the identified genes and aging pathways.We used The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, GO ontology, Pathway commons, and disease-associated genes from WebGestalt for our analyses (Wang et al. 2013).For each pathway, the hypergeometric test was used to detect the overrepresentation of our set of genes among all genes in the pathway.Lastly, FDR was controlled using the Benjamini-Hochberg procedure.In all cases, the complete set of proteincoding genes was used as the background.",
"Multiple exploratory dataanalysis will be used since different analysis can reveal different aspects of the data (Leung Y.F. ,Cavalieri D.). The program EASE (Expression Analysis Systematic Explorer) will furtheranalyze the data by looking at over-represented functional categories of genes in the network. Ingenuity Pathway Analysis will help to identify biological pathways that are relevant to thegenes of interest. The data will be analyzed using WebQTL which will link gene expressionwith behavioral data. Important specific genes found in the study will be further confirmed byreal time PCR.",
"Pathway analysisThe identified CpGs were annotated to nearest genes and evaluated for enrichment of gene-sets in the Reactome and the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways using Gene-Set Enrichment Analysis (GSEA) (http://www.broadinstitute.org/gsea/index.jsp).",
"Ingenuity Pathway Analysis (IPA)The IPA software (Ingenuity Systems, Inc.) was used to carry out the network composition analyses.The Ingenuity Canonical Pathways analysis was used to identify the most significant pathways that were set from the Ingenuity Pathway Analysis library.The significance of the association between a data set and the canonical pathway was measured in two ways: (1) a ratio of the number of molecules from the data set that map to the pathway divided by the total number of molecules that map to the canonical pathway was displayed, and (2) Fisher's exact test was used to calculate a p-value to determine the probability that the association between the genes in the dataset and the canonical pathway can be explained by chance alone [28].",
"Pathway analysisPathway analyses were carried out using the core analysis function of the Ingenuity Pathway Analysis software (IPA, Ingenuity Systems).We performed gene-based tests for association based on results from the PAR-dr and WL-dr discovery GWAS, using the Versatile Gene-based Association Study (VEGAS) software. (16) The full list of genes and gene-based p-values generated by VEGAS was uploaded into IPA for use as a reference set (16,965 genes were available for the PAR-dr analysis and 16,953 for the WL-dr analysis).From this list p-value cut-offs of 0.01 or 0.05 were used to identify IPA focus molecules (Supplemental Section 7).Networks generated by IPA provide insight into the molecular interactions of the focus molecules, independent of any predictions of biological function.",
"Inmetabolic pathways analysis , using bioinformatics toolssuch as RegulonDB, WIT.UM-BBD, EcoCyc,MetaCyc,Enzyme and metabolic pathways database, KEGG bythe researchers willprovide them with theencyclopaedic information about biochemical products ,substrates, catalysing enzymes,amino acids,carbohydrates, lipids and toxic compounds etc. and theirmetabolic pathways specific diseases related to thefailure in their functions. Bioinformatics tools likeKEGG, KEGG BRITE, Gene network database,Genepath help the researchers in analysis of genetic pathwaysand regulatory networks in such a ways that giveinformation about the genes, transcriptional factors,miRNA, genes encode enzymes involved in geneticrelated diseases.The techniques integrate the molecular information from thedatabases with simulation of metabolic networks. These methods also help in representation of genes, proteins andmetabolic pathways in combination with dynamic simulated environment. In this paper we reviewed someapplicable bioinformatics tools for analytical study of three types of pathways such as metabolic, genetics andsignalling pathways along with the information about their principle, work system and their direct access link to thedatabases and programs. This study helps scientists in fast, economic, high accuracy and large scale based outputs ofpathways analysis of their appropriate research involving the biochemical pathways.",
"Well-established methodologies such as Gene Set EnrichmentAnalysis (GSEA) [41] help in differentiating pathways as functionalunits from experimental populations. Manually curated pathwaysbased on expert knowledge and existing literature obtained fromthe Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg/pathway.html) are another alternative measure used for validation [21]. Biological Network Inference from Microarray Data, Current Solutions, and AssessmentsTo evaluate the biological significance of a inference method,researchers explored an alternative measure based on Gene Ontology (GO) against functional, biological enrichment of a group ofgenes derived from inferred network modules [34].",
"Pathway analyses.We used two different programs for pathway analysis: Ingenuity (see URLs), version August 2012, application build 172788, content version 14197757) and the Disease Association Protein-Protein Link Evaluator (DAPPLE) 39 .",
"PATHWAYASSIST includes an automatedtext-mining tool, which enables the software to generate pathways from the entire PubMed database and other publicsources. Thus, we surveyed all published work in PubMedand extracted data on each candidate gene relating to itstranscriptional regulation, its binding partners and any othergene/protein that modifies or interacts with it. This analysiswas presented graphically and colour-coding genes identifiedin our study enabled easy identification of the genes lying inoverlapping pathways.",
"For example, Gene Ontology [1], Biocarta [2], GenMAPP[3] and KEGG [4] all allow a list of genes to be crossedwith biological functions and genetic networks, includingmetabolic, signalling or other regulation pathways. Basicstatistical analysis (e.g. , [5,6]) can then determinewhether a pathway is over-represented in the list, andwhether it is over-activated or under-activated. However,one can argue that introducing information on the pathway at this point in the analysis process sacrifices somestatistical power to the simplicity of the approach.",
"Gene Ontology and Pathway analysisData sets were interrogated using the Ingenuity Pathways Analysis (IPA) application (Ingenuity Systems, Redwood City, CA; http://www.ingenuity.com).IPA was used to identify enriched canonical pathways, gene networks, functional classes, and toxicity lists (molecules involved in known toxicity processes).",
"Analysing participating pathways is an important aspectof any genes functional analysis strategy. In this view,REACTOME (http://www.reactome.org) [13] is a crossreferenced, manually curated and peer reviewed pathwaydatabase. LitInspector (http://www.litinspector.org) [14]and NetPath (http://www.netpath.org/index.html) [15]allow one to access curated signal transduction related literature and interaction pathways respectively. PredictiveNetworks (http://predictivenetworks.org/) [16] integratesgene interactions and networks information from PubMedliterature and other online biological databases and presents it in an accessible and efficient user interface. Twoother noteworthy commercial tools are GeneGo andIngenuity IPA.",
", 2011; Kim et al. , 2011b; Zhang et al. ,2011). A number of pathway analysis software packages are available such as PathwayStudio(http://www.ariadnegenomics.com/),and MetaCoreTM (http://www.genego.com/metacore.php). In such software packages, thealgorithms calculate the statistical signicanceof the expression changes across every group orpathway in the database, thus, allowing identication of groups or pathways most stronglyaffected by the observed expression changes(http://www.ariadnegenomics.com/technologyresearch/pathway-analysis/).",
"Network analyses.Network analyses were carried out using the Ingenuity Pathway Analysis tool 66 .P values for canonical pathways and functions were calculated from the observed number of candidate genes in the gene set, compared with the number expected under the null hypothesis and corrected (Bonferroni) for the number of pathways tested.",
"Pathway enrichment analysis.Pathway enrichment analysis for the predicted genomic key driver variants was performed using the ClueGo(v2.1.7) 74and CluePedia(v1.1.7) 75plugins in Cytoscape(v.3.1.0) 76with the GO database (29.02.2016 download).Pathways with a Bonferroni-corrected p-value are shown with full data in Supplementary Data 4. Pathway enrichment analysis for the coexpression modules from transcriptomic analysis was performed by R package goseq with default parameters 77 .",
"Pathway analysisFor the 85 learning-associated genes, we used a combination of bioinformatics software that included E! Ensemble, Protonet, Pandora, and Pubmed and Pubmatrix searches (Becker et al., 2003).We also used http://bind.cafor protein-protein interaction information.Using this approach (Burger et al., 2007;Velardo et al., 2004) we found information on 50 genes (Table 3 and Supplementary Table 3); the other 35 transcripts were expressed sequence tags (EST).",
"Finally, using the top 24 results, we conducted a pathway analysis with the Database for Annotation, Visualization and Integrated Discovery (http://david.abcc.ncifcrf.gov/).",
"Pathway analysis helps to add structure to the very large amount of data generated by microarrays.This type of analysis allows determining whether differentially methylated genes belong to predefined networks more than by chance alone.Gene ontology enrichment was performed using the Ingenuity Pathway Analysis (IPA) software (Ingenuity System).IPA compares a provided list of genes (differentially methylated genes in this case) to a reference list of genes included in various biological pathways.It provides a P value based on a hypergeometric test identifying over-represented gene ontology categories."
]
],
"task_id": [
"029A427CEEBABE644F12EE390469B134",
"7C028B1D0013EA11574B094986ABE4C2",
"55562016699AFE4B8AD9A7F29A806CB5",
"C9B1B98F9207B79EBBC98790A769CB51",
"242918F32291CC085DEB319A7EE3284B"
]
}
|