CTC 2013: Poster 41 (Beth Wilmot)

Poster 41
Presenter: Beth Wilmot
Wednesday, 3:00 – 5:00pm

Reading the Whole Transcriptome

Sunita Kawane¹, Christina L. Zheng²,³, Daniel W. Bottomly¹,³, Robert Searles⁴, Robert Hitzemann⁵,⁷, Shannon McWeeney¹,²,³,⁶, Beth Wilmot¹,²,³
¹Oregon Clinical and Translational Research Institute, ²Department of Medical Informatics and Clinical Epidemiology, ³the Knight Cancer Center, ⁴Integrated Genomics Laboratory, ⁵Department of Behavioral Neuroscience, ⁶Department of Public Health and Preventative Medicine, Oregon Health & Science University, Portland, Oregon, ⁷Veterans Affairs Research Service, Portland.

To realize the full power of RNA-seq, two aspects of transcriptomics need to be addressed: 1) assigning reads to the approximately 10% of genes that overlap and 2) developing a full annotation of reads that align to non-exonic regions. We have developed a framework that allows investigation of both the complex regions of genic overlap and the annotation of reads that align to unannotated regions of the genome. We used stranded libraries of polyA RNA or riboZero-treated RNA to develop and test this framework. Scenarios in which each of these protocols would be appropriate are discussed to guide discovery. In the resulting annotation files, the genome is partitioned into categories according to strand, exon and intron overlaps, and intergenic regions. Results are summarized at the gene, exon, transcript, and intron/intergenic levels. Complex regions of overlapping genes were further characterized. To help define regions where reads align to nongenic areas of the genome, Non-Code annotation and other genomic features such as histone marks and DNase I hypersensitive sites (DHS) can also be integrated. The framework is highly extensible to any RNA-seq pipeline and is freely available to the research community. This work was supported by grants AA010760, AA011034, MH051372, and AA013484.
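
The partitioning of the genome by strand, exon and intron overlap, and intergenic region can be illustrated with a minimal sketch. This is not the framework's actual code: the `Gene` record, the `classify_read` function, the category labels, and the toy coordinates are all hypothetical and chosen only to show how a stranded read might be placed into one category.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    strand: str    # "+" or "-"
    start: int
    end: int
    exons: tuple   # tuple of (start, end) pairs


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if two closed intervals share at least one base."""
    return a_start <= b_end and b_start <= a_end


def classify_read(read_start, read_end, read_strand, genes):
    """Assign a stranded read to one illustrative category.

    The category names are assumptions made for this sketch, not the
    labels used in the annotation files described in the abstract.
    """
    hits = [g for g in genes if overlaps(read_start, read_end, g.start, g.end)]
    same_strand = [g for g in hits if g.strand == read_strand]

    if not hits:
        return "intergenic"
    if not same_strand:
        return "antisense"            # overlaps genes on the other strand only
    if len(same_strand) > 1:
        return "overlapping_genes"    # complex region: assignment is ambiguous
    gene = same_strand[0]
    if any(overlaps(read_start, read_end, s, e) for s, e in gene.exons):
        return "exonic"
    return "intronic"


# Toy annotation (hypothetical): genes A and B overlap on the "+" strand.
GENES = [
    Gene("A", "+", 100, 500, ((100, 200), (400, 500))),
    Gene("B", "+", 450, 800, ((450, 800),)),
    Gene("C", "-", 1000, 1200, ((1000, 1200),)),
]

if __name__ == "__main__":
    for start, end in [(150, 180), (250, 300), (460, 480), (1050, 1100), (2000, 2050)]:
        print(start, end, classify_read(start, end, "+", GENES))
```

A real implementation would more likely use an interval index over a full annotation file than a linear scan. The sketch only shows why stranded libraries matter here: without the read strand, the antisense and same-strand cases could not be told apart.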