author Shelby Solomon Darnell 2024-10-17 12:24:26 +0300
committer Shelby Solomon Darnell 2024-10-17 12:24:26 +0300
commit 00cba4b9a1e88891f1f96a1199320092c1962343
tree 270fd06daa18b2fc5687ee72d912cad771354bb0 /gnqa/paper2_eval/data/dataset/human/intermediate_files/human_cs_gn_9
parent e0b2b0e55049b89805f73f291df1e28fa05487fe
Docker image built to run code, all evals run using R2R
Diffstat (limited to 'gnqa/paper2_eval/data/dataset/human/intermediate_files/human_cs_gn_9')
-rw-r--r-- gnqa/paper2_eval/data/dataset/human/intermediate_files/human_cs_gn_9 65
1 files changed, 65 insertions, 0 deletions
diff --git a/gnqa/paper2_eval/data/dataset/human/intermediate_files/human_cs_gn_9 b/gnqa/paper2_eval/data/dataset/human/intermediate_files/human_cs_gn_9
new file mode 100644
index 0000000..764ad85
--- /dev/null
+++ b/gnqa/paper2_eval/data/dataset/human/intermediate_files/human_cs_gn_9
@@ -0,0 +1,65 @@
+{
+ "titles": [
+ "2016 - Coming of age ten years of next.pdf",
+ "2020 - Precision and Personalized Medicine How Genomic.pdf",
+ "2016 - Coming of age ten years of next.pdf",
+ "2016 - Coming of age ten years of next.pdf",
+ "2016 - Coming of age ten years of next.pdf",
+ "2016 - Coming of age ten years of next.pdf",
+ "2008 - Gene Expression Profiling.pdf",
+ "2014 - Computational tools to aid the design and development of a genetic reference population.pdf",
+ "2015 -Pandey- Functional Analysis of Genomic Variation and Impact on Molecular.pdf",
+ "2015 - Functional Analysis of Genomic Variation and Impact on Molecular and Higher Order Phenotypes.pdf"
+ ],
+ "extraction_id": [
+ "cad1dcca-621d-5003-ba3a-81950819bc52",
+ "de09f30d-e9ba-5379-8c7a-85b2cd2ed6c8",
+ "4cdf13c0-c505-5ff9-9a6e-b10e5d1c8819",
+ "5da5fc5d-1fe6-58f0-9141-72b9b2996fff",
+ "cad1dcca-621d-5003-ba3a-81950819bc52",
+ "5da5fc5d-1fe6-58f0-9141-72b9b2996fff",
+ "c14d1c74-a14a-5037-8d3f-f32a60faa9a5",
+ "a744f8ce-7920-5fb8-acce-912f70112924",
+ "f4e989e5-c3d0-5d5c-b8c3-95894a14630b",
+ "fa426831-7c04-56c1-a191-1ebbc35342ed"
+ ],
+ "document_id": [
+ "9dd6e4e9-d136-507b-b628-68c8e1461bd0",
+ "cd11028a-933b-52a0-9534-c173323056ef",
+ "9dd6e4e9-d136-507b-b628-68c8e1461bd0",
+ "9dd6e4e9-d136-507b-b628-68c8e1461bd0",
+ "9dd6e4e9-d136-507b-b628-68c8e1461bd0",
+ "9dd6e4e9-d136-507b-b628-68c8e1461bd0",
+ "59f3b969-089b-5258-93ad-892dbc9ffa9c",
+ "70cbde25-6406-5a31-91ae-57f430e8f267",
+ "93381e23-494b-5bc2-9d09-ef315506601f",
+ "263d327b-f5db-54e4-a215-b3f8a51cd7d6"
+ ],
+ "id": [
+ "chatcmpl-ADZJdRAJz1ZAYluBcJm58tFEyX9Ng",
+ "98010acc-fd11-5d33-bced-626ef29f2896",
+ "8477a774-dddb-5541-b8d7-d51a7e56b0af",
+ "d2540614-9397-5e3e-8b5f-ad328ca973b2",
+ "13a6b6f9-4a9a-5eb9-ac79-d986d9e613f0",
+ "97f2aa12-623b-53ec-9793-5834311a37dd",
+ "1e324977-2ca5-5062-8a09-7659d516e899",
+ "3e782f01-a06e-51b6-ac8a-0e0a56939d08",
+ "da667832-cd2f-5af6-a0a8-a17542b0a2e2",
+ "d1158643-3625-5855-a03d-eec4ac96eb4d",
+ "cdf2b80f-1509-50a2-9cb2-a36dd6f3f2cc"
+ ],
+ "contexts": [
+ "FURTHER INFORMATION 10X Genomics: http://www.10xgenomics.com 454 Sequencing: http://www.454.com Advances in Genome Biology and Technology (AGBT): http://www.agbt.org BGISEQ500: http://seq500.com/en/portal/Sequencer.shtml Illumina: http://www.illumina.com Ion Torrent: https://www.thermofisher.com/us/en/home/brands/ion-torrent.html Oxford Nanopore Technologies: https://www.nanoporetech.com Pacific Biosciences: http://www.pacb.com Personal Genome Project: http://www.personalgenomes.org",
+ "36. Sequencing, H.G. Finishing the euchromatic sequence of the human genome. Nature 2004, 431, 931–945. 37. Heather, J.M.; Chain, B. The sequence of sequencers: The history of sequencing DNA. Genomics 2016, 107, 1–8. [CrossRef] 38. Rothberg, J.M.; Leamon, J.H. The development and impact of 454 sequencing. Nat. Biotechnol. 2008, 26, 1117–1124. [CrossRef] [PubMed] 39. Shendure, J.; Ji, H. Next-generation DNA sequencing. Nat. Biotechnol. 2008, 26, 1135–1145. [CrossRef] [PubMed]",
+ "sequencing. Genome Res. 20, 1165–1173 (2010). 64. English, A.C. et al. Assessing structural variation in a personal genome-towards a human reference diploid genome. BMC Genomics 16, 286 (2015). 65. Carneiro, M.O. et al. Pacific Biosciences sequencing technology for genotyping and variation discovery in human data. BMC Genomics 13, 375 (2012). 66. Quail, M.A. et al. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers.",
+ "22. Karow, J. Qiagen launches GeneReader NGS System at AMP; presents performance evaluation by broad. GenomeWeb [online], https://www.genomeweb.com/molecular-diagnostics/qiagen-launches-genereader-ngs-system-amp-presents-performance-evaluation (4 Nov 2015). 23. Smith, D.R. & McKernan, K. Methods of producing and sequencing modified polynucleotides. US Patent 8058030 (2011). 24. Margulies, M. et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437, 376–380 (2005).",
+ "160. Glenn, T.C. Field guide to next-generation DNA sequencers. Mol. Ecol. Resour. 11, 759–769 (2011). 161. Karow, J. At AGBT, 10X Genomics launches GemCode platform; shipments slated for Q2 as firm battles IP lawsuits. GenomeWeb [online], https://www.genomeweb.com/sample-prep/agbt-10x-genomics-launches-gemcode-platform-shipments-slated-q2-firm-battles-ip-lawsuits (2 Mar 2015). Competing interests statement The authors declare competing interests: see Web version for details. FURTHER INFORMATION",
+ "sequencing. Bioinformatics 31, 2040–2042 (2015). 46. Qiagen. Oncology insights enabled by knowledge base-guided panel design and the seamless workflow of the GeneReader NGS system Press Release. Qiagen [online], http://www.genereaderngs.com/PROM-9192-001_1100403_WP_GeneReader_NGS_0116_NA.pdf (2016). 47. Forgetta, V. et al. Sequencing of the Dutch elm disease fungus genome using the Roche/454 GS-FLX Titanium System in a comparison of multiple genomics core",
+ "DNA), and provide the means to link sequences containing applications. First, base-callers like Phred (4,5) extract raw sequences from raw data. There are also contig assemblers like Phrap (University of Washington, http://bozeman.mbt.washington.edu/phrap.docs/phrap.html) or CAP3 (6) that assemble fragments to contigs and packages like consed (7) or GAP4 (8), which are used to finish sequencing projects. These programs are not explained in detail here.",
+ "sequencing data to solutions from the genotyping array data. iv PREVIEW",
+ "11 BIOINFORMATIC CHALLENGES FOR GENOMIC MEDICINE Processing and managing of high-throughput sequence data High throughput sequencing offers several advantages relative to array-based genotyping or expression assays. First, unlike genotyping arrays, whole genome sequencing is not limited to interrogating only known sequence variants. Similarly, RNA-sequencing (RNA-seq) enables expression quantification of novel transcripts that are not",
+ "11 BIOINFORMATIC CHALLENGES FOR GENOMIC MEDICINE Processing and managing of high-throughput sequence data High throughput sequencing offers several advantages relative to array-based genotyping or expression assays. First, unlike genotyping arrays, whole genome sequencing is not limited to interrogating only known sequence variants. Similarly, RNA-sequencing (RNA-seq) enables expression quantification of novel transcripts that are not"
+ ],
+ "answer": "### How to Guide for Genetic Sequencing Genetic sequencing is a powerful tool used to determine the order of nucleotides in DNA. This guide will walk you through the basic steps involved in genetic sequencing, from sample preparation to data analysis. #### Step 1: Sample Preparation 1. **Collect DNA Sample**: Obtain a DNA sample from the organism or tissue of interest. This can be done using various methods such as blood draws, buccal swabs, or tissue biopsies. 2. **Extract DNA**: Use a DNA extraction kit to isolate DNA from the collected sample. Ensure the DNA is of high quality and free from contaminants. #### Step 2: Library Preparation 1. **Fragment DNA**: Break the DNA into smaller fragments. This can be done using mechanical shearing or enzymatic digestion. 2. **Add Adapters**: Ligate specific adapters to the ends of the DNA fragments. These adapters are necessary for the sequencing process and allow the fragments to bind to the sequencing platform. #### Step 3: Amplification 1. **PCR Amplification**: Amplify the DNA fragments using Polymerase Chain Reaction (PCR). This step increases the quantity of DNA to ensure there is enough material for sequencing. #### Step 4: Sequencing 1. **Choose a Sequencing Platform**: Select an appropriate sequencing platform based on your needs. Popular platforms include Illumina, Ion Torrent, Pacific Biosciences, and Oxford Nanopore Technologies [1], [5], [6]. 2. **Load the Sample**: Load the prepared DNA library onto the sequencing platform. Each platform has specific protocols for loading samples. 3. **Run the Sequencer**: Initiate the sequencing run. The platform will read the DNA fragments and generate raw sequence data. #### Step 5: Data Analysis 1. **Base Calling**: Use base-calling software to convert raw data into nucleotide sequences. Tools like Phred are commonly used for this purpose [7]. 2. **Sequence Assembly**: Assemble the short DNA sequences into longer contiguous sequences (contigs) using assemblers like Phrap or CAP3 [7]. 3. **Quality Control**: Assess the quality of the assembled sequences. Tools like consed or GAP4 can help in finishing sequencing projects and ensuring accuracy [7]. #### Step 6: Interpretation 1. **Variant Calling**: Identify genetic variants such as single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) using variant calling software. 2. **Annotation**: Annotate the identified variants to understand their potential impact on genes and biological functions. 3. **Data Management**: Store and manage the sequencing data using bioinformatics tools and databases. High-throughput sequencing generates large amounts of data that require efficient processing and management [9], [10]. #### Additional Resources - For more detailed protocols and information on specific sequencing technologies, refer to resources from companies like 10X Genomics, Illumina, and Pacific Biosciences [1]. - For historical context and advancements in sequencing technologies, see references such as Heather and Chain (2016) and Shendure and Ji (2008) [2]. By following these steps, you can successfully perform genetic sequencing and analyze the resulting data to gain insights into the genetic makeup of your sample.",
+ "question": "Create a how to guide for genetic sequencing"
+}
\ No newline at end of file
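
The added file stores its retrieved sources as parallel lists ("titles", "extraction_id", "document_id", "contexts"), and the bracketed citations in "answer" appear to be 1-based positions into "contexts" (for example, [7] matches the Phred/Phrap passage). The following is a minimal Python sketch of how a reader might pair those lists up for inspection. The function name `pair_sources` and the shortened sample record are illustrative assumptions, not part of the repository; the "id" list is left out because it holds one more entry than the other lists in this file.

```python
import json

# Per-context fields that appear as parallel lists in the record above.
PARALLEL_KEYS = ("titles", "extraction_id", "document_id", "contexts")


def pair_sources(record):
    """Return one dict per retrieved context, numbered from 1 like the [n] citations.

    Hypothetical helper: assumes the four lists are aligned by position.
    """
    lengths = {key: len(record[key]) for key in PARALLEL_KEYS}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"parallel lists differ in length: {lengths}")
    rows = []
    columns = (record[key] for key in PARALLEL_KEYS)
    for number, values in enumerate(zip(*columns), start=1):
        row = dict(zip(PARALLEL_KEYS, values))
        row["citation"] = number
        rows.append(row)
    return rows


# Stand-in record: the first two entries of the file, with contexts shortened.
sample = json.loads("""
{
  "titles": ["2016 - Coming of age ten years of next.pdf",
             "2020 - Precision and Personalized Medicine How Genomic.pdf"],
  "extraction_id": ["cad1dcca-621d-5003-ba3a-81950819bc52",
                    "de09f30d-e9ba-5379-8c7a-85b2cd2ed6c8"],
  "document_id": ["9dd6e4e9-d136-507b-b628-68c8e1461bd0",
                  "cd11028a-933b-52a0-9534-c173323056ef"],
  "contexts": ["FURTHER INFORMATION 10X Genomics: (shortened)",
               "36. Sequencing, H.G. (shortened)"]
}
""")

rows = pair_sources(sample)
for row in rows:
    print(row["citation"], row["titles"])
```

To run this against the real file, the sample could be replaced with `json.load(open(path))`, where `path` points at a local checkout of `human_cs_gn_9`. The length check is there so that a misaligned record fails loudly instead of silently pairing a context with the wrong title.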