Whole Genome Sequencing
The cost and effort of resequencing an entire mammalian genome are dropping rapidly but remain too high for most applications. Most researchers working with mammalian genomes balance these costs by sequencing only a subset of the genome. For smaller genomes, however, whole genome sequencing can be an attractive option. As a rule of thumb, de novo assembly should rely primarily on long reads. For genomes with an available reference, short reads usually work just as well.
Platform Information
The GSL offers Illumina, 454, and SOLiD sequencing, any of which may be appropriate for whole genome sequencing depending on the size of the genome and the goals of the experiment. Samples can be indexed so that several of them run together in a single lane (or plate, etc.).
| Platform | Read Type | Sample Unit | Units/Run | Expected Reads | Read Length |
| Illumina HiSeq 2000 | short | lane | 16 | 200 million | 50 or 100bp |
| Illumina Genome Analyzer IIx | short | lane | 8 | 25 million | 36 or 72bp |
| Applied Biosystems SOLiD 4 | short | slide | 1 | 400 million | 50bp |
| Roche 454 Genome Sequencer FLX Titanium | long | region | 2 | 500 thousand | 400 - 600bp |
| Ion Torrent PGM | long | chip | 1 | 190 thousand | ~100bp |
The "Sample Unit" column in this table lists the most natural way to run a single sample on each platform; the units are not necessarily equivalent in cost. The expected read counts are for a single-end sequencing run. In paired-end or mate-pair mode, the number of reads doubles.
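To judge whether one sample unit is enough for a given genome, the read counts in the table can be turned into a rough depth-of-coverage estimate (total sequenced bases divided by genome size). The sketch below is only an illustration: the function name, its parameters, and the genome sizes (about 3.1 Gb for human, a round 140 Mb for a Drosophila-scale genome) are assumptions, not GSL figures, and real yields vary from run to run.

```python
def expected_coverage(reads, read_length_bp, genome_size_bp,
                      paired=False, samples_per_unit=1):
    """Approximate mean depth of coverage for one sample.

    reads            -- expected single-end reads per sample unit (from the table)
    read_length_bp   -- read length in bases
    genome_size_bp   -- genome size in bases (assumed value, supplied by the user)
    paired           -- paired-end or mate-pair mode doubles the read count
    samples_per_unit -- number of indexed samples sharing the unit equally
    """
    if genome_size_bp <= 0 or samples_per_unit < 1:
        raise ValueError("genome size and sample count must be positive")
    total_reads = reads * (2 if paired else 1)
    bases_per_sample = total_reads * read_length_bp / samples_per_unit
    return bases_per_sample / genome_size_bp


# One HiSeq 2000 lane, single-end 100 bp, human-sized genome (assumed 3.1 Gb).
human = expected_coverage(200_000_000, 100, 3_100_000_000)

# The same lane shared by 4 indexed samples of an assumed 140 Mb genome.
fly = expected_coverage(200_000_000, 100, 140_000_000, samples_per_unit=4)

print(f"human: {human:.1f}x, fly (4 per lane): {fly:.1f}x")
```

Under these assumptions a single lane gives only single-digit coverage of a mammalian genome, while four small genomes can share a lane and still each receive tens-fold coverage, which is the trade-off the indexing option is meant to address.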
Sample Submission Requirements
A total of 3µg of genomic DNA (or high-quality WGA DNA) should be provided in 55µl of 10mM Tris, Qiagen EB, TE, or dH2O. As little as 2.5µg can be used if sample quantity is limited, and amplification methods are available for very small inputs of gDNA. The DNA needs no special treatment, provided that it was NOT isolated from whole blood collected in heparin tubes. Any other standard DNA isolation method that yields clean, intact, high molecular weight DNA is appropriate. If possible, please send 5µg of sample, as we find that users' quantitation methods tend to over-estimate amounts by as much as 30-50%. The GSL will NOT pool samples if additional sample is needed to perform the assay.
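The reasoning behind the 5µg request can be checked with a little arithmetic. The sketch below assumes that "over-estimate by 30-50%" means the reported amount equals the true amount multiplied by 1.3 to 1.5; that interpretation, and the function name, are illustrative assumptions rather than a GSL definition.

```python
def true_mass_ug(reported_ug, overestimate_fraction):
    """True DNA mass if the reported value is inflated by the given fraction.

    Assumes reported = true * (1 + overestimate_fraction).
    """
    return reported_ug / (1.0 + overestimate_fraction)


REQUIRED_UG = 3.0  # amount requested in the submission requirements

for reported in (3.0, 5.0):
    for fraction in (0.30, 0.50):
        actual = true_mass_ug(reported, fraction)
        status = "meets" if actual >= REQUIRED_UG else "falls short of"
        print(f"reported {reported:.1f} ug, {fraction:.0%} over-estimate: "
              f"about {actual:.2f} ug, {status} the {REQUIRED_UG:.0f} ug request")
```

Under this assumption, a nominal 5µg submission still contains more than 3µg in the worst case, whereas a nominal 3µg submission may contain less than the 2.5µg minimum.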
GSL Whole Genome Sequencing Data
A quality assessment of the sequencing run and alignment of the reads with BWA to a standard genome (human, rat, mouse, Drosophila) are included in the cost. Because of the size of sequencing data, when an alignment is performed only the .bam files (and the accompanying .bai index files) are provided as results. Fastq files are not provided in addition, since they would duplicate data already in the .bam files. Users who wish to analyze the raw data themselves can regenerate the original fastq files using the bam2fastq program.
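To show why the fastq files are redundant, here is a minimal sketch of how reads can be recovered from alignment records. It is not the bam2fastq implementation: it works on SAM text lines (for example, the text form of a .bam file), the function name is hypothetical, and it ignores complications such as hard-clipped reads. Because SAM stores reverse-strand reads as their reverse complement, those reads are flipped back to their original orientation.

```python
# Translation table for complementing bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_to_fastq(sam_lines):
    """Yield one 4-line FASTQ record (as a string) per primary SAM record."""
    for line in sam_lines:
        if line.startswith("@"):           # SAM header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:               # not a complete alignment record
            continue
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if flag & 0x900 or seq == "*":     # skip secondary/supplementary or empty
            continue
        if flag & 0x10:                    # read aligned to the reverse strand
            seq = seq.translate(_COMPLEMENT)[::-1]
            qual = qual[::-1]
        if flag & 0x40:                    # first read of a pair
            name += "/1"
        elif flag & 0x80:                  # second read of a pair
            name += "/2"
        yield f"@{name}\n{seq}\n+\n{qual}\n"


example_sam = [
    "@HD\tVN:1.0\n",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t16\tchr1\t200\t60\t4M\t*\t0\t0\tAACG\tABCD\n",
]
records = list(sam_to_fastq(example_sam))
print("".join(records), end="")
```

In the example, read r2 is flagged as reverse-strand, so its sequence and qualities come out reversed (and the bases complemented) relative to what the alignment record stores.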
©2010-11 HudsonAlpha Institute for Biotechnology
genomics@hudsonalpha.org