general/datasets/Uthsc_bxdygeyernaseq_deseq2_rlog2_0422/processing.rtf



<p><b>Generation of RNA-seq data</b></p>

<p>cDNA libraries were constructed at Novogene from 1&nbsp;&micro;g of RNA using an NEBNext<sup>&reg;</sup>&nbsp;Ultra RNA Library Prep Kit for Illumina<sup>&reg;</sup>&nbsp;(cat# E7420S, New England Biolabs, Ipswich, MA, USA) according to the manufacturer&rsquo;s protocol. Briefly, mRNA was enriched using oligo(dT) beads, followed by two rounds of purification, and was then fragmented randomly by adding fragmentation buffer. First-strand cDNA was synthesized using random hexamer primers, after which a custom second-strand synthesis buffer (Illumina, San Diego, CA, USA), dNTPs, RNase H, and DNA polymerase I were added to generate the second strand (ds cDNA). After terminal repair, 3&prime; adenylation (A-tailing), and sequencing adaptor ligation, the double-stranded cDNA library was completed by size selection and PCR enrichment. The resulting libraries, with 250&ndash;350 bp inserts, were quantified using a Qubit 2.0 fluorometer (Thermo Fisher Scientific, Waltham, MA, USA) and quantitative PCR. Size distribution was analyzed using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Qualified libraries were sequenced on an Illumina NovaSeq platform (Illumina, San Diego, CA, USA) in paired-end 150 mode (2&times;150 bases). An average of 40 million raw reads was generated from each library.</p>

<p>&nbsp;</p>

<p><b>Read mapping and normalization</b></p>

<p>The <em>Mus musculus</em>&nbsp;(mouse) reference genome (GRCm39, UCSC mm39; Mus_musculus.GRCm39, Ensembl release 104) and gene model annotation files were downloaded from the Ensembl genome browser (<a href="https://useast.ensembl.org/">https://useast.ensembl.org/</a>). Rows without a gene symbol were removed. Indices of the reference genome were built using STAR version 2.5.2b, and paired-end reads were aligned to the reference genome.&nbsp;The featureCounts function from the Rsubread package (version 1.32.4) was used to count the number of reads mapped to each gene. Raw counts were then normalized and log2-transformed using the rlogTransformation function from the DESeq2 package (version 1.16.1), and an increment was added to the normalized values to make all values positive.</p>
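
<p>The two bookkeeping steps described above (removing rows without a gene symbol and adding an increment so that all values are positive) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used to build the dataset: the function names, the example values, and the <code>margin</code> parameter are hypothetical, the size of the increment actually used is not stated in the text, and the rlog normalization itself (an R function in DESeq2) is not reimplemented here.</p>

```python
def drop_unnamed_rows(rows):
    """Keep only (symbol, values) rows whose gene symbol is non-empty."""
    return [(symbol, values) for symbol, values in rows
            if symbol and symbol.strip()]


def shift_to_positive(rows, margin=1.0):
    """Add one constant to every value so that no value is <= 0.

    ASSUMPTION: the increment is chosen so the smallest value becomes
    `margin`; the source text does not say how the increment was chosen.
    """
    smallest = min(v for _, values in rows for v in values)
    increment = margin - smallest if smallest <= 0 else 0.0
    shifted = [(symbol, [v + increment for v in values])
               for symbol, values in rows]
    return increment, shifted


# Hypothetical rlog-like values for two samples per gene (not real data).
rows = [
    ("Gnai3", [9.5, 9.75]),
    ("", [3.0, 2.5]),          # no gene symbol: removed
    ("Cdc45", [-1.5, -0.25]),  # negative values: require a shift
    (None, [0.0, 0.0]),        # no gene symbol: removed
]

kept = drop_unnamed_rows(rows)
increment, shifted = shift_to_positive(kept)

for symbol, values in shifted:
    print(symbol, values)
```

<p>Because the same increment is added to every value, differences between samples and between genes on the log2 scale are unchanged; only the origin of the scale moves.</p>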