This blog is also available as a recorded webinar!
Key considerations for experimental planning
Several factors influence the experimental setup. First, which type of data is needed to answer the question of your specific project: quantitative data (e.g., gene expression levels) or qualitative data (e.g., transcript resolution, splicing information, isoform abundance, structure, etc.)? Further, the type of RNA of interest is a decisive factor for the design of the lab workflow. If you are interested in small RNAs, you not only need a suitable small RNA-Seq library preparation kit but also an RNA extraction method or kit that retains small RNAs. If you are interested in non-coding RNAs, you should use ribosomal RNA depletion during preprocessing and avoid mRNA selection, as many lncRNAs are not polyadenylated and would be lost. In addition, choosing the right conditions, including controls, optimizing the sample plate layouts, and providing replicates are crucial for the success of the experiment. For more information on experimental design and planning, see our recent blog Planning for Success.
The distinction between quantitative and qualitative data also influences the choice of library prep in the lab. For quantitative gene expression information, 3′ mRNA-Seq library preps such as QuantSeq are ideal. For quantitative transcript-level information, whole transcriptome library preps such as CORALL RNA-Seq are used in combination with poly(A) selection or rRNA depletion, depending on the RNA species under investigation.
Choosing between 3' mRNA-Seq and Whole Transcriptome RNA-Seq Technologies
Both 3′ mRNA-Seq and whole transcriptome sequencing (WTS) can be used to assess expression information in RNA-Seq experiments. Before diving into practical examples of how these methods influence the results and conclusions of real-life studies, this section briefly introduces the methodologies, their advantages, and their differences.
3′ mRNA-Seq libraries are generated from total RNA using in-prep poly(A) selection through an initial oligo(dT) priming step, which streamlines library preparation and omits several steps needed for classical library preps. Sequencing reads are localized to the 3′ end of polyadenylated RNAs, which is sufficient to identify gene expression patterns even at a low sequencing depth of 1–5 M reads/sample (Fig. 2). As QuantSeq generates one fragment per transcript, data analysis is straightforward and results can be obtained quickly by read counting, without the need to normalize to transcript coverage or to estimate transcript concentrations (Moll et al., 2014). Due to the robustness of 3′ mRNA-Seq library preparation protocols, they are the method of choice for gene expression profiling from demanding samples, including degraded and FFPE material. Our blog article “Transcriptomics on FFPE samples” provides several examples of successful studies utilizing 3′ mRNA-Seq on this sample type.
Whole transcriptome library preps are the most common choice for RNA-Seq projects. For WTS, cDNA synthesis is initiated with random primers, which distributes sequencing reads across the entire transcript (Fig. 2). To prevent random primers from binding to highly abundant ribosomal RNA (rRNA), which would result in a high percentage of unnecessary sequencing reads, rRNA must be effectively removed prior to library preparation, either by selecting polyadenylated RNAs or by specifically depleting rRNA. As this enrichment or depletion step is done prior to library preparation, WTS workflows typically take longer than 3′ mRNA-Seq workflows. Higher read depth is also required to provide sufficient coverage across the entire transcript. During data analysis, reads are aligned and normalized, and individual transcript concentrations are estimated.
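The difference in normalization between the two approaches can be illustrated with a minimal Python sketch. It assumes counts-per-million (CPM) scaling for 3′ mRNA-Seq read counts and a TPM-style length normalization for WTS data. The gene names, read counts, and transcript lengths are invented for illustration, and the function names are hypothetical rather than part of any specific analysis pipeline.

```python
def cpm(counts):
    """Counts per million: scale raw read counts by library size only."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def tpm(counts, lengths_kb):
    """Transcripts per million: divide by length first, then scale to one million."""
    rate = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    total_rate = sum(rate.values())
    return {gene: 1e6 * r / total_rate for gene, r in rate.items()}


# Toy example (invented numbers): two genes present at the same molar level,
# but gene_long is four times longer than gene_short.
lengths_kb = {"gene_short": 1.0, "gene_long": 4.0}

# 3' mRNA-Seq: roughly one fragment per transcript, independent of length.
three_prime_counts = {"gene_short": 500, "gene_long": 500}

# WTS: reads scale with transcript length, so the long gene collects more reads.
wts_counts = {"gene_short": 500, "gene_long": 2000}

three_prime_cpm = cpm(three_prime_counts)
wts_cpm = cpm(wts_counts)
wts_tpm = tpm(wts_counts, lengths_kb)

print(three_prime_cpm)  # equal values after library-size scaling alone
print(wts_cpm)          # long gene appears four times higher without length correction
print(wts_tpm)          # length normalization restores equal abundance
```

In this toy case, library-size scaling is enough for the 3′ counts, while the WTS counts only reflect the equal abundance of both genes after dividing by transcript length.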
Compared to 3′ mRNA-Seq, WTS data offers additional information such as differences in expression between isoforms of a single gene, alternative splicing, gene fusions, etc. WTS also typically detects more differentially expressed genes. However, biological conclusions, e.g., the effects of various conditions or compounds on biological processes, networks, and (de)activated pathways, are highly similar between both methods. As a result, 3′ mRNA-Seq is often used for large-scale projects addressing expression patterns, for initial experiments to find conditions of interest, or to identify drug effects. Whole transcriptome sequencing is then employed on smaller sample sets to investigate the mode of action of compounds, unravel effects at the transcript level, or determine the influence of non-coding transcripts, fusions, or splice variations.
When to choose whole transcriptome sequencing (WTS) versus 3' mRNA-Seq
Ultimately, the choice between WTS and 3′ mRNA-Seq depends on your specific research question:
Choose WTS/Total RNA-Seq if you need:
- A global view of all RNA types (coding and non-coding).
- Information about alternative splicing, novel isoforms, or fusion genes.
- To work with samples where the poly(A) tail might be absent or highly degraded (e.g., prokaryotic RNA, some highly degraded clinical samples without good 3′ end preservation).
Choose 3′ mRNA-Seq if you need:
- Accurate and cost-effective gene expression quantification (i.e., how much of each gene is expressed).
- High-throughput screening of many samples.
- A streamlined workflow with simpler data analysis.
- To efficiently profile mRNA expression from degraded RNA and challenging sample types (like FFPE).
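The two checklists above can be condensed into a small decision helper. This is only a sketch of the reasoning, not a validated selection tool: the function name, its parameters, and the rule that any WTS-specific requirement takes priority are assumptions made for illustration.

```python
def suggest_method(needs_isoforms_or_fusions=False,
                   needs_noncoding_rna=False,
                   poly_a_tail_absent=False,
                   degraded_or_ffpe=False,
                   many_samples=False):
    """Suggest a library type and list the reasons (hypothetical helper)."""
    wts_reasons = []
    if needs_isoforms_or_fusions:
        wts_reasons.append("splicing, isoform, or fusion information required")
    if needs_noncoding_rna:
        wts_reasons.append("non-coding or non-polyadenylated RNA of interest")
    if poly_a_tail_absent:
        wts_reasons.append("no usable poly(A) tail for oligo(dT) priming")
    # Assumption: any requirement that only WTS can meet decides the choice.
    if wts_reasons:
        return "WTS", wts_reasons

    reasons = ["gene-level expression quantification"]
    if degraded_or_ffpe:
        reasons.append("degraded or FFPE input")
    if many_samples:
        reasons.append("high-throughput screening of many samples")
    return "3' mRNA-Seq", reasons


# Example: a large compound screen on FFPE material.
method, why = suggest_method(degraded_or_ffpe=True, many_samples=True)
print(method, why)
```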
A practical example: Comparing WTS and 3' mRNA-Seq data sets for differential expression, gene set enrichment and pathway analysis using Omics Playground
Several publicly available studies have assessed 3′ mRNA-Seq workflows in comparison with traditional whole transcriptome sequencing. The following study by Ma et al. (2019) compares both RNA-Seq methods to assess differential expression in murine livers following a normal or high-iron diet for 5 weeks. The authors conclude that both methods generate highly similar results and show similar levels of reproducibility between biological replicates.
Abstract
Background
3’ RNA sequencing provides an alternative to whole transcript analysis. However, we do not know a priori the relative advantage of each method. Thus, a comprehensive comparison between the whole transcript and the 3′ method is needed to determine their relative merits. To this end, we used two commercially available library preparation kits, the KAPA Stranded mRNA-Seq kit (traditional method) and the Lexogen QuantSeq 3’ mRNA-Seq kit (3′ method), to prepare libraries from mouse liver RNA. We then sequenced and analyzed the libraries to determine the advantages and disadvantages of these two approaches.
Results
We found that the traditional whole transcript method and the 3’ RNA-Seq method had similar levels of reproducibility. As expected, the whole transcript method assigned more reads to longer transcripts, while the 3′ method assigned roughly equal numbers of reads to transcripts regardless of their lengths. We found that the 3’ RNA-Seq method detected more short transcripts than the whole transcript method. With regard to differential expression analysis, we found that the whole transcript method detected more differentially expressed genes, regardless of the level of sequencing depth.
Conclusions
The 3’ RNA-Seq method was better able to detect short transcripts, while the whole transcript RNA-Seq was able to detect more differentially expressed genes. Thus, both approaches have relative advantages and should be selected based on the goals of the experiment.
As expected, WTS detects more differentially expressed genes and assigns more reads to longer transcripts, which requires stringent length normalization (Fig. 3). 3′ mRNA-Seq is less sensitive and detects fewer differentially expressed genes because reads localize to the less diverse 3′ UTR, which contains many common regulatory features such as protein binding motifs. Nevertheless, the conclusions for gene set enrichment and pathway analysis remain the same regardless of which method is chosen, with the additional benefit that 3′ mRNA-Seq requires less read depth and thus allows more samples to be sequenced in parallel.
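The effect of read depth on multiplexing is simple arithmetic. The sketch below uses the upper end of the 1–5 M reads/sample range quoted earlier for 3′ mRNA-Seq. The run output of 400 M reads and the WTS depth of 30 M reads/sample are assumptions chosen for illustration and do not come from this article or from a specific instrument.

```python
def samples_per_run(run_output_reads, reads_per_sample):
    """Number of samples that fit into one sequencing run at a given depth."""
    return run_output_reads // reads_per_sample


RUN_OUTPUT = 400_000_000    # assumed total reads per run (illustrative)
DEPTH_3PRIME = 5_000_000    # upper end of the 1-5 M reads/sample range for 3' mRNA-Seq
DEPTH_WTS = 30_000_000      # assumed WTS depth per sample (illustrative)

n_3prime = samples_per_run(RUN_OUTPUT, DEPTH_3PRIME)
n_wts = samples_per_run(RUN_OUTPUT, DEPTH_WTS)

print(f"3' mRNA-Seq: {n_3prime} samples per run")
print(f"WTS: {n_wts} samples per run")
```

Under these assumptions, the same run holds 80 samples for 3′ mRNA-Seq but only 13 for WTS. The actual numbers depend on the sequencer and on the depth your project requires.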
To use 3′ mRNA-Seq effectively, a well-curated 3′ annotation is crucial! Annotations for model organisms such as human and mouse are updated regularly and are well annotated at transcript end sites. For non-model organisms, it is important to ensure that transcript end site information is available and to improve the annotation if needed. Insufficient 3′ annotation leads to reduced mapping rates even if the library preparation and sequencing workflow were optimal and resulted in high-quality data. For more information on alignments and read mapping, please see our Lexicon chapter on secondary data analysis.
The authors focused mostly on differential expression for pathways associated with iron metabolism. Reanalysis of the dataset with Omics Playground confirmed the expected findings of increased detection of differentially expressed genes in WTS approaches (Fig. 4). Nevertheless, 3′ mRNA-Seq reliably captures the majority of key differentially expressed genes, providing highly similar results to WTS approaches and consistent biological conclusions at the level of enriched gene sets (Table 1) and differentially regulated pathways (Table 2). In addition to iron metabolism, reanalysis revealed further gene sets affected by an iron-rich diet, such as genes involved in the regulation of the circadian rhythm or inflammatory responses that are robustly detected with both methods. Tables 1 and 2 compare results from gene set enrichment and pathway analysis.
Table 1 | Top 15 most statistically significant upregulated gene sets in WTS and their rank in 3’ mRNA-seq.
| Gene set | Rank WTS | Rank 3′ mRNA-Seq |
| PATHWAY_REACTOME:Response Of EIF2AK1 (HRI) To Heme Deficiency | 1 | 1 |
| GO_BP:negative regulation of circadian rhythm | 2 | 4 |
| PATHWAY_WIKI:Photodynamic therapy-induced unfolded protein response | 3 | 6 |
| PATHWAY_WIKI:Cholesterol biosynthesis pathway | 4 | 11 |
| GO_BP:negative regulation of acute inflammatory response | 5 | 3 |
| PATHWAY_WIKI:Matrix metalloproteinases | 6 | 39 |
| GO_BP:myeloid dendritic cell chemotaxis | 7 | 7 |
| GO_BP:prostaglandin secretion | 8 | 82 |
| PATHWAY_BIOPLANET:PERK-regulated gene expression | 9 | 8 |
| PATHWAY_REACTOME:ATF4 Activates Genes In Response To Endoplasmic Reticulum Stress | 10 | 5 |
| GO_BP:eosinophil migration | 11 | 23 |
| GO_BP:leukocyte aggregation | 12 | 62 |
| PATHWAY_REACTOME:PERK Regulates Gene Expression | 13 | 9 |
| GO_BP:negative regulation of immune effector process | 14 | 172 |
| GO_BP:eosinophil chemotaxis | 15 | 24 |
Among the top 15 upregulated gene sets, the 3′ mRNA-Seq method captures all gene sets identified by the WTS approach, though with shifts in rank for specific categories, especially below the very top hits. This suggests that, for this dataset and analytical approach, the strongest signals detected by WTS were also detected by 3′ mRNA-Seq. However, the strength of association for lower-ranked gene sets differed, which may affect pathway prioritization and secondary biological inferences.
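The rank shifts in Table 1 can be summarized with a few lines of Python. The ranks are copied from the table. The cutoffs of 15 and 50 are arbitrary choices for illustration, and the helper name is hypothetical.

```python
import statistics

# 3' mRNA-Seq ranks of the top 15 WTS gene sets, listed in WTS rank order (Table 1).
ranks_3prime = [1, 4, 6, 11, 3, 39, 7, 82, 8, 5, 23, 62, 9, 172, 24]


def recovered_within(ranks, cutoff):
    """Count how many gene sets rank at or above the given cutoff."""
    return sum(1 for r in ranks if r <= cutoff)


in_top15 = recovered_within(ranks_3prime, 15)
in_top50 = recovered_within(ranks_3prime, 50)
median_rank = statistics.median(ranks_3prime)

print(f"{in_top15} of 15 WTS top hits are also in the 3' mRNA-Seq top 15")
print(f"{in_top50} of 15 are within the 3' mRNA-Seq top 50")
print(f"median 3' mRNA-Seq rank: {median_rank}")
```

With these values, 9 of the 15 gene sets stay within the top 15 and 12 stay within the top 50, with a median rank of 9.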
Table 2 | Top 15 most statistically significant upregulated Wikipathways in WTS and their rank in 3’ mRNA-seq.
| Pathway | Rank WTS | Rank 3′ mRNA-Seq |
| PATHWAY_WIKI:Photodynamic therapy-induced unfolded protein response | 1 | 1 |
| PATHWAY_WIKI:Cholesterol biosynthesis pathway | 2 | 8 |
| PATHWAY_WIKI:Matrix metalloproteinases | 3 | 26 |
| PATHWAY_WIKI:Mammary gland development pathway – Embryonic development (Stage 1 of 4) | 4 | 18 |
| PATHWAY_WIKI:Cholesterol synthesis disorders | 5 | 31 |
| PATHWAY_WIKI:Cholestasis_WikiPathways | 6 | – |
| PATHWAY_WIKI:Chronic hyperglycemia impairment of neuron function | 7 | – |
| PATHWAY_WIKI:Differentiation of white and brown adipocyte | 8 | – |
| PATHWAY_WIKI:NRF2-ARE regulation | 9 | 16 |
| PATHWAY_WIKI:Platelet-mediated interactions with vascular and circulating cells | 10 | – |
| PATHWAY_WIKI:Statin inhibition of cholesterol production | 11 | 4 |
| PATHWAY_WIKI:Unfolded protein response | 12 | 10 |
| PATHWAY_WIKI:Prostaglandin signaling | 13 | – |
| PATHWAY_WIKI:Cytokines and inflammatory response | 14 | – |
| PATHWAY_WIKI:nsp1 from SARS-CoV-2 inhibits translation initiation in the host cell | 15 | 36 |
Choosing suitable data analysis tools is of utmost importance for generating conclusive and comparable results and for drawing valid, scientifically sound conclusions. Cloud-based plug-and-play platforms such as Lexogen’s Kangooroo and Omics Playground offer curated differential expression analysis pipelines with interactive visualization, which facilitates analysis and offers a fast, easy, and straightforward view into the data set. In addition, a variety of options for differential expression and gene set enrichment analysis are typically available, and both platforms are suitable for different types of datasets, specifically 3′ mRNA-Seq and WTS data.
Replicates and choosing the right control conditions are the key determinants for sound biological conclusions
Control conditions are fundamental for robust and interpretable differential expression analyses. Neutral control conditions provide a meticulously defined baseline as transcriptomic reference, against which the profiles of experimentally perturbed model systems or disease states can be compared. Without such controls, it becomes impossible to attribute observed changes in gene expression to the specific experimental variable of interest. Instead, detected variations could merely reflect inherent stochastic biological variation, technical noise introduced during sample handling and processing, or the influence of uncontrolled confounding variables. Consequently, the inclusion of appropriate control groups enables the application of sound statistical inference and minimizes the risk of both false positive and false negative errors.
Technical and biological replicates ensure reliable results
The importance of replicates cannot be emphasized enough in scientific research, especially when conducting NGS experiments that produce complex data sets. Replicates are essential for ensuring the reliability, robustness, and statistical validity of an experiment’s conclusions. Without replicates, it’s virtually impossible to distinguish true biological signals from random noise or experimental artifacts (Table 3). Not only is the majority of information (up to 90%) lost, but the reliability of the experiment also suffers tremendously. While technical replicates capture variability in the workflow, including changes to the environment, equipment, or reagents used, biological replicates additionally capture the natural variance between individual biological samples.
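A quick check of the sample sheet before starting the lab work can catch missing controls or underpowered conditions. The following sketch assumes a simple list of (sample ID, condition) pairs. The label `control`, the minimum of three replicates per condition, the sample IDs, and the function name are assumptions for illustration, not fixed requirements.

```python
from collections import Counter


def check_design(samples, control_label="control", min_replicates=3):
    """Return a list of problems found in a simple sample sheet.

    `samples` is a list of (sample_id, condition) tuples.
    """
    problems = []
    per_condition = Counter(condition for _, condition in samples)
    if control_label not in per_condition:
        problems.append(f"no '{control_label}' condition found")
    for condition, n in sorted(per_condition.items()):
        if n < min_replicates:
            problems.append(
                f"condition '{condition}' has only {n} replicate(s), "
                f"expected at least {min_replicates}"
            )
    return problems


# Invented sample sheet: the treated group is one replicate short.
sheet = [
    ("S1", "control"), ("S2", "control"), ("S3", "control"),
    ("S4", "high_iron"), ("S5", "high_iron"),
]
issues = check_design(sheet)
print(issues)
```

An empty list means that the sheet passes both checks. Here the sketch reports that `high_iron` has only two replicates.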
Reducing the number of replicates in the experiment in silico resulted in more false positives and in different gene sets being reported as differentially regulated. Of the top 15 gene sets enriched in the data set containing three replicates, four are not found at all when only one replicate is used. In addition, four other gene sets drop in rank below 50th place and would also not be considered likely hits in an experiment (Table 3). Hence, the absence or presence of replicates can completely change the outcome, interpretation, and biological relevance of an NGS experiment.
Table 3 | Top 15 most statistically significant upregulated gene sets in WTS and their rank in the absence of biological replicates.
| Gene set | Rank – 3 replicates per condition | Rank – 1 replicate per condition |
| PATHWAY_REACTOME:Response Of EIF2AK1 (HRI) To Heme Deficiency | 1 | 1 |
| GO_BP:negative regulation of circadian rhythm | 2 | 7 |
| PATHWAY_WIKI:Photodynamic therapy-induced unfolded protein response | 3 | 2 |
| PATHWAY_WIKI:Cholesterol biosynthesis pathway | 4 | – |
| GO_BP:negative regulation of acute inflammatory response | 5 | 11 |
| PATHWAY_WIKI:Matrix metalloproteinases | 6 | 113 |
| GO_BP:myeloid dendritic cell chemotaxis | 7 | – |
| GO_BP:prostaglandin secretion | 8 | 53 |
| PATHWAY_BIOPLANET:PERK-regulated gene expression | 9 | 3 |
| PATHWAY_REACTOME:ATF4 Activates Genes In Response To Endoplasmic Reticulum Stress | 10 | 4 |
| GO_BP:eosinophil migration | 11 | 152 |
| GO_BP:leukocyte aggregation | 12 | – |
| PATHWAY_REACTOME:PERK Regulates Gene Expression | 13 | 5 |
| GO_BP:negative regulation of immune effector process | 14 | – |
| GO_BP:eosinophil chemotaxis | 15 | 201 |
Without biological replicates, roughly half of the top 15 upregulated gene sets from the three-replicate analysis are either missed entirely or severely downgraded in rank, highlighting the risk of false negatives or loss of statistical significance when replication is insufficient.
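The loss of gene sets in Table 3 can be tallied directly from the table. In the sketch below, the ranks are copied from Table 3 and `None` marks gene sets that were not found with one replicate. What counts as "severely downgraded" depends on the rank cutoff, which is an assumption here; the helper name is hypothetical.

```python
# Rank with one replicate per condition for the top 15 gene sets of the
# three-replicate analysis (Table 3); None marks gene sets that were not found.
ranks_single = [1, 7, 2, None, 11, 113, None, 53, 3, 4, 152, None, 5, None, 201]


def summarize_loss(ranks, cutoff):
    """Count missing gene sets and gene sets ranked worse than the cutoff."""
    missing = sum(1 for r in ranks if r is None)
    downgraded = sum(1 for r in ranks if r is not None and r > cutoff)
    fraction_affected = (missing + downgraded) / len(ranks)
    return missing, downgraded, fraction_affected


for cutoff in (50, 100):
    missing, downgraded, fraction = summarize_loss(ranks_single, cutoff)
    print(f"cutoff {cutoff}: {missing} missing, {downgraded} downgraded, "
          f"{fraction:.0%} of the top 15 affected")
```

With a cutoff of rank 50, four gene sets are missing and four are downgraded (8 of 15, about 53%). With a stricter cutoff of rank 100, three are downgraded (7 of 15, about 47%).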
Summary
- Whole Transcriptome Sequencing (WTS), often referred to as Total RNA-Seq, and 3' mRNA-Seq are both powerful RNA sequencing techniques, but they differ significantly in their approach, and consequently, in their advantages and applications.
- Choosing the right approach depends on your experimental aim, data requirements, study design, sample size and budget constraints.
- Cloud-based data analysis platforms like Kangooroo and Omics Playground significantly increase the speed of analysis and are suitable for both types of data sets, 3' mRNA-Seq and WTS. The choice of suitable data analysis tools is key for meaningful results.
- Proper controls, biological replicates, and the use of spike-in standards are essential to ensure data reliability, reproducibility, and meaningful interpretation — especially in large-scale or variable experiments.
References
Ma, F., Fuqua, B.K., Hasin, Y. et al. (2019) A comparison between whole transcript and 3’ RNA sequencing methods using Kapa and Lexogen library preparation methods. BMC Genomics 20. DOI: 10.1186/s12864-018-5393-3
Moll, P., Ante, M., Seitz, A. and Reda, T. (2014) QuantSeq 3′ mRNA sequencing for RNA quantification. Nat Methods 11, i–iii. DOI: 10.1038/nmeth.f.376
Written by Dr. Yvonne Göpel
