Transcript Profiling


Transcript profiling experiments characterize the expression of known mRNA transcripts in samples. Unlike transcript discovery experiments, they are usually not expected to uncover many novel transcripts. The GSL offers transcript profiling by arrays or next-generation sequencing. Arrays are often a good choice when many samples or groups of samples must be analyzed, as a well-designed array experiment with replicates is usually cheaper than RNA-Seq. RNA-Seq can be more advantageous when only a small number of samples are of interest and splice variant detection is desired. Sequencing platforms offered include Illumina, SOLiD, and Roche 454.

Platform Information

For transcript profiling, the GSL uses the Illumina Tru-Seq RNA Sample Prep Kit, which targets mRNA exclusively. Ribosomal RNAs, microRNAs, and other non-mRNA species will not be represented in the final sequencing library. The kit allows indexing of RNA-Seq samples so that many samples may be sequenced per lane. For most standard RNA-Seq experiments with a goal of transcript profiling, the GSL recommends indexing with the Tru-Seq kit and multiplexing 4 samples per lane on a 50bp paired-end HiSeq run. One HiSeq lane generates approximately 80 million paired-end reads, so each sample yields ~20 million paired-end reads. For RNA-Seq experiments needing extra coverage, sample prep remains the same, but each sample is run in its own lane. The following sequencing conditions are available: 36bp or 72bp (~25-40 million PE reads); 50bp or 100bp (~80 million PE reads).
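
The per-sample read estimate above is simple division of lane yield by the number of multiplexed samples. A minimal Python sketch of that arithmetic follows; the function and constant names are illustrative only, and the lane yield is the approximate figure quoted in the text, not a guaranteed output.

```python
def reads_per_sample(reads_per_lane: int, samples_per_lane: int) -> int:
    """Approximate paired-end reads per sample when a lane is shared evenly."""
    if samples_per_lane < 1:
        raise ValueError("samples_per_lane must be at least 1")
    return reads_per_lane // samples_per_lane


# Approximate yield of one 50bp paired-end HiSeq lane, as quoted in the text.
LANE_YIELD_PE_READS = 80_000_000

# Standard multiplexing: 4 indexed samples per lane.
standard_depth = reads_per_sample(LANE_YIELD_PE_READS, 4)

# Extra coverage: one sample per lane.
extra_depth = reads_per_sample(LANE_YIELD_PE_READS, 1)

print(f"4 per lane: ~{standard_depth:,} PE reads per sample")
print(f"1 per lane: ~{extra_depth:,} PE reads per sample")
```

Real runs rarely divide perfectly evenly between indexed samples, so these numbers are best treated as planning estimates.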

Sample Submission Requirements

RNA-Seq samples from different projects are batched together in groups of 24 or more to allow indexing, so if a project submits fewer than 24 samples, there may be a delay until additional samples fill the batch. Submission requirements are 2-4ug of total RNA in 55ul of 10mM Tris or Qiagen EB. If these conditions are not met, a fee of $25/sample will be charged. For best results, RNA should be intact and free of contaminants, with a Bioanalyzer RIN of 7 or higher. Under standard RNA-Seq sequencing conditions, indexed samples are run 4 per lane on a 50bp paired-end HiSeq run to yield ~20 million paired-end reads per sample. Additional sequencing can be accommodated by special request.
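
Before submitting, it may help to check a sample sheet against the requirements above. The following Python sketch is one way to do that; the RnaSample record, the function names, and the 0.5ul volume tolerance are assumptions made for illustration and are not part of any GSL submission system. The RIN threshold is treated as a warning because the text presents it as a recommendation rather than a fee-triggering requirement.

```python
from dataclasses import dataclass
from typing import List, Tuple

MIN_RNA_UG, MAX_RNA_UG = 2.0, 4.0            # total RNA, from the text
REQUIRED_VOLUME_UL = 55.0                    # from the text
VOLUME_TOLERANCE_UL = 0.5                    # assumption: pipetting tolerance
ACCEPTED_BUFFERS = {"10mM Tris", "Qiagen EB"}
MIN_RIN = 7.0                                # recommended, not required
BATCH_SIZE = 24


@dataclass
class RnaSample:
    """Hypothetical sample-sheet row."""
    name: str
    total_rna_ug: float
    volume_ul: float
    buffer: str
    rin: float


def check_sample(sample: RnaSample) -> Tuple[List[str], List[str]]:
    """Return (requirement failures, quality warnings) for one sample."""
    failures: List[str] = []
    warnings: List[str] = []
    if not MIN_RNA_UG <= sample.total_rna_ug <= MAX_RNA_UG:
        failures.append(f"{sample.name}: total RNA {sample.total_rna_ug}ug is outside 2-4ug")
    if abs(sample.volume_ul - REQUIRED_VOLUME_UL) > VOLUME_TOLERANCE_UL:
        failures.append(f"{sample.name}: volume {sample.volume_ul}ul is not 55ul")
    if sample.buffer not in ACCEPTED_BUFFERS:
        failures.append(f"{sample.name}: buffer '{sample.buffer}' is not 10mM Tris or Qiagen EB")
    if sample.rin < MIN_RIN:
        warnings.append(f"{sample.name}: RIN {sample.rin} is below the recommended 7")
    return failures, warnings


def samples_needed_to_fill_batch(n_submitted: int) -> int:
    """How many more samples a batch of 24 is waiting on (0 if already full)."""
    return max(0, BATCH_SIZE - n_submitted)


if __name__ == "__main__":
    sheet = [
        RnaSample("liver_1", 3.0, 55.0, "Qiagen EB", 8.4),
        RnaSample("liver_2", 1.2, 55.0, "water", 6.1),
    ]
    for s in sheet:
        failures, warnings = check_sample(s)
        for message in failures + warnings:
            print(message)
    print("Samples still needed for a full batch:", samples_needed_to_fill_batch(len(sheet)))
```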

GSL RNA-Seq Data

Demultiplexing (i.e. the sorting of indexed reads) is included in the cost of basic RNA-Seq, and users receive fastq files for each sample. Additional fees are charged for analysis, including alignment. The current GSL pipeline for RNA-Seq analysis can include alignment with Tophat to standard organisms (human, mouse, rat) and differential gene expression analysis with Cufflinks. If a non-standard organism is sequenced and analysis help is requested, an additional fee is charged to cover manipulation and installation of a new genome.
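
To illustrate what demultiplexing means in practice, the Python sketch below sorts FASTQ records into per-sample bins by index sequence. It is not the GSL's actual tooling. It assumes the index sequence is the last colon-separated field of each read header (as in newer Illumina FASTQ headers), uses exact matching only, and the index-to-sample mapping and read names are made up for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

FastqRecord = Tuple[str, str, str, str]  # header, sequence, separator, quality


def demultiplex(lines: Iterable[str],
                index_to_sample: Dict[str, str]) -> Dict[str, List[FastqRecord]]:
    """Group 4-line FASTQ records by the index found at the end of the header.

    Reads whose index is not in index_to_sample go to 'undetermined'.
    An incomplete trailing record (fewer than 4 lines) is ignored.
    """
    bins: Dict[str, List[FastqRecord]] = defaultdict(list)
    line_iter = iter(lines)
    # zip over the same iterator four times to walk the input 4 lines at a time.
    for header, seq, sep, qual in zip(line_iter, line_iter, line_iter, line_iter):
        header = header.rstrip()
        index = header.rsplit(":", 1)[-1]  # assumed header layout, see lead-in
        sample = index_to_sample.get(index, "undetermined")
        bins[sample].append((header, seq.rstrip(), sep.rstrip(), qual.rstrip()))
    return dict(bins)


if __name__ == "__main__":
    # Hypothetical mapping of index sequences to sample names.
    indexes = {"ATCACG": "sample_A", "CGATGT": "sample_B"}
    fastq_lines = [
        "@read1 1:N:0:ATCACG\n", "ACGTACGT\n", "+\n", "IIIIIIII\n",
        "@read2 1:N:0:CGATGT\n", "TTGCATGC\n", "+\n", "IIIIIIII\n",
        "@read3 1:N:0:GGGGGG\n", "CCCCAAAA\n", "+\n", "IIIIIIII\n",
    ]
    for sample, records in sorted(demultiplex(fastq_lines, indexes).items()):
        print(sample, len(records))
```

Production demultiplexers typically also tolerate a sequencing error or two in the index; this sketch omits that to keep the idea visible.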

Please note that analysis of RNA-Seq data, even when provided at the 'finished' level of Cufflinks output, is still a work in progress and is very computationally heavy. End users must be willing to delve into the provided output and interpret the results rather than simply relying on a fold-change Excel spreadsheet of differentially expressed genes.

©2010-11 HudsonAlpha Institute for Biotechnology
genomics@hudsonalpha.org