Poster 41
Presenter: Beth Wilmot
Wednesday, 3:00 – 5:00pm

Reading the Whole Transcriptome

Sunita Kawane1, Christina L. Zheng2,3, Daniel W. Bottomly1,3, Robert Searles4, Robert Hitzemann5,7, Shannon McWeeney1,2,3,6, Beth Wilmot1,2,3
1Oregon Clinical and Translational Research Institute, 2Department of Medical Informatics and Clinical Epidemiology, 3the Knight Cancer Center, 4Integrated Genomics Laboratory, 5Department of Behavioral Neuroscience, 6Department of Public Health and Preventative Medicine, Oregon Health & Science University, Portland, Oregon, 7Veterans Affairs Research Service, Portland.

To utilize the full power of RNA-seq, two aspects of transcriptomics need to be addressed: 1) assigning reads to the approximately 10% of genes that overlap and 2) developing a full annotation of reads aligning to non-exonic regions. We have developed a framework that allows investigation of both the complex regions of genic overlap and the annotation of reads that align to unannotated regions of the genome. We used stranded libraries of poly(A) RNA or Ribo-Zero-treated RNA in the development and testing of this framework. Scenarios in which each of these protocols is appropriate are discussed to guide discovery. In the resulting annotation files, the genome is partitioned into categories according to strand, exon and intron overlaps, and intergenic regions. The results are summarized at the gene, exon, transcript, and intron/intergenic-region levels. Complex regions of overlapping genes were further characterized. To help define regions where reads align to nongenic areas of the genome, NONCODE annotation and other genomic features such as histone marks and DNase I hypersensitive sites (DHS) can also be integrated. The framework is highly extensible to any RNA-seq pipeline and is freely available to the research community. This work was supported by grants AA010760, AA011034, MH051372, and AA013484.
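As a rough illustration of the kind of partition described above, the following Python sketch assigns a stranded read to one of a few categories (exonic, intronic, antisense, same-strand gene overlap, intergenic). It is not the authors' framework: the `Gene` class, the `classify_read` function, the category names, and the toy coordinates are all hypothetical, and a real annotation would be loaded from a GTF/GFF file and queried with an interval index rather than a linear scan.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene model; exons are half-open [start, end), sorted."""
    name: str
    chrom: str
    strand: str  # "+" or "-"
    exons: Tuple[Tuple[int, int], ...]

    @property
    def start(self) -> int:
        return self.exons[0][0]

    @property
    def end(self) -> int:
        return self.exons[-1][1]


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def classify_read(chrom: str, start: int, end: int, strand: str,
                  genes: List[Gene]) -> Tuple[str, List[str]]:
    """Return (category, gene names) for one stranded read (illustrative rules only)."""
    hits = [g for g in genes
            if g.chrom == chrom and overlaps(start, end, g.start, g.end)]
    if not hits:
        return "intergenic", []
    sense = [g for g in hits if g.strand == strand]
    if not sense:
        # The read lies within a gene span, but only on the opposite strand.
        return "antisense", sorted(g.name for g in hits)
    names = sorted(g.name for g in sense)
    if len(sense) > 1:
        # More than one same-strand gene: a complex overlap region.
        return "same_strand_overlap", names
    gene = sense[0]
    exonic = any(overlaps(start, end, s, e) for s, e in gene.exons)
    return ("exonic" if exonic else "intronic"), names


# Toy annotation with made-up coordinates (assumption, for illustration only).
genes = [
    Gene("A", "chr1", "+", ((100, 200), (300, 400))),
    Gene("B", "chr1", "-", ((350, 500),)),
    Gene("C", "chr1", "+", ((380, 450),)),
]

for read in [("chr1", 120, 150, "+"), ("chr1", 220, 260, "+"),
             ("chr1", 460, 490, "+"), ("chr1", 385, 395, "+"),
             ("chr1", 600, 650, "+")]:
    print(read, classify_read(*read, genes))
```

The sketch relies on strand information to separate sense from antisense overlap, which is why stranded libraries matter for this kind of assignment; with unstranded data the antisense and same-strand cases could not be distinguished.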