Comparison and calibration of transcriptome data from RNA-Seq and tiling arrays.

Overview

abstract

BACKGROUND: Tiling arrays have been the tool of choice for probing an organism's transcriptome without prior assumptions about the transcribed regions, but RNA-Seq is becoming a viable alternative as the costs of sequencing continue to decrease. Understanding the relative merits of these technologies will help researchers select the appropriate technology for their needs.

RESULTS: Here, we compare these two platforms using a matched sample of poly(A)-enriched RNA isolated from the second larval stage of C. elegans. We find that the raw signals from these two technologies are reasonably well correlated but that RNA-Seq outperforms tiling arrays in several respects, notably in exon boundary detection and dynamic range of expression. By exploring the accuracy of sequencing as a function of depth of coverage, we found that about 4 million reads are required to match the sensitivity of two tiling array replicates. The effects of cross-hybridization were analyzed using a "nearest neighbor" classifier applied to array probes; we describe a method for determining potential "black list" regions whose signals are unreliable. Finally, we propose a strategy for using RNA-Seq data as a gold standard set to calibrate tiling array data. All tiling array and RNA-Seq data sets have been submitted to the modENCODE Data Coordinating Center.

CONCLUSIONS: Tiling arrays effectively detect transcript expression levels at a low cost for many species while RNA-Seq provides greater accuracy in several regards. Researchers will need to carefully select the technology appropriate to the biological investigations they are undertaking. It will also be important to reconsider a comparison such as ours as sequencing technologies continue to evolve.

authors

Agarwal, Ashish
Koppstein, David
Rozowsky, Joel
Sboner, Andrea
Habegger, Lukas
Hillier, Ladeana W
Sasidharan, Rajkumar
Reinke, Valerie
Waterston, Robert H
Gerstein, Mark

publication date

June 17, 2010

published in

BMC Genomics (Journal)

Research

keywords

Gene Expression Profiling
Oligonucleotide Array Sequence Analysis
RNA

Identity

PubMed Central ID

PMC3091629

Scopus Document Identifier

77953553548

Digital Object Identifier (DOI)

10.1186/1471-2164-11-383

PubMed ID

20565764

Additional Document Info

has global citation frequency

94

volume

11

VIVO Weill Cornell Medical College

Comparison and calibration of transcriptome data from RNA-Seq and tiling arrays. Academic Article
