SIRV Set Selection Guide
The Spike-In RNA Variants of the isoform and ERCC modules (see Modular Design) are realized as defined mixes, and for some modules, different mixes are available.
Mixes, and combinations thereof, are available in the form of SIRV sets. Table 1 presents a SIRV set selection guide to help you find the right set of modules and mixes for your application.
Table 1 ǀ SIRV set selection guide. SIRV-Set 1 (Cat. No. 025.03) contains the isoform mixes E0, E1 and E2 of the isoform module, SIRV-Set 2 (Cat. No. 050.01 and 050.03) provides the Isoform Mix E0 only, whereas SIRV-Set 3 (Cat. No. 051.01 and 051.03) has the SIRV Isoform Mix E0 in a mixture with the ERCCs. Symbols mark each property as applicable, not applicable, or partly applicable (or applicable to parts of the set).
| | |SIRV-Set 1|SIRV-Set 2|SIRV-Set 3|
|Module(s)|Isoforms|Isoform Mixes E0, E1, E2|Isoform Mix E0|Isoform Mix E0|
| |ERCC| | |ERCC Mix 1|
|Property|Isoform detection and quantification| | | |
For more information please consult the respective User Guides in the Downloads section.
SIRV-Set 1: Isoform Mixes E0, E1 and E2
The isoform module is available in SIRV-Set 1 (Cat. No. 025) as 3 SIRV mixes, termed E0, E1 and E2. Each mix contains all 69 SIRV isoform transcripts (from 7 SIRV genes), but in different concentration ratios. E0 is ideal for assessing the detection capabilities of a given RNA-Seq workflow: all 69 transcripts are present in equimolar concentrations, so their detection should be unbiased and should not depend on read depth or similar factors. E1 contains a moderate concentration distribution of the transcript variants of a given gene. E2 represents the natural situation, in which a dominant, abundant transcript variant is transcribed from a gene together with up to 17 other variants present at lower expression levels (down to <1%). This situation is already quite challenging for correct transcript determination based on short-read assembly, and it also efficiently tests the linearity and sensitivity of long-read sequencing platforms and protocols that cannot rely on millions of reads (Figure 1).
Figure 1 ǀ Distribution of the 4 SubMixes in the 3 isoform mixes and the resulting intra- and inter-mix ratios. Left, the intra-mix concentration ratios provide three different concentration settings to evaluate accuracy in relative concentration measurements. Right, the resulting fold-changes allow for 3 possible inter-mix comparisons to evaluate differential gene expression measurements. SubMixes 1-4 are indicated by their respective colors, and the transcript isoforms of each of the 7 SIRV genes are distributed across all SubMixes.
Remarkably, by comparing the SIRV transcript quantifications in different mixes (E0 vs E1, E0 vs E2, or E1 vs E2), differential gene expression can be evaluated on the transcript isoform and variant level. This shows whether biases in transcript detection are similar in both mixes tested. If they are, they can significantly affect transcript quantification within a given mix while having no or only a limited effect on differential transcript or gene expression quantification. Combining the individual SIRV transcript expression quantifications yields a value for SIRV gene expression, and the accuracy of this value might differ significantly from the deviations seen on the transcript level.
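The point about shared biases can be made concrete with a minimal sketch. It assumes transcript quantifications for two mixes are already available as dictionaries and that the expected inter-mix ratios are taken from the mix documentation. The function name, the transcript IDs and all numbers below are hypothetical and purely illustrative; they are not actual SIRV values.

```python
import math

def log2_fold_change_error(quant_a, quant_b, expected_ratio):
    """Observed log2(b/a) minus expected log2 ratio, per transcript.

    quant_a, quant_b: measured abundances in two mixes (hypothetical inputs).
    expected_ratio:   known concentration ratio b/a per transcript.
    """
    errors = {}
    for tid, ratio in expected_ratio.items():
        observed = math.log2(quant_b[tid] / quant_a[tid])
        errors[tid] = observed - math.log2(ratio)
    return errors

# Illustrative, made-up numbers (not real SIRV concentrations).
# tx_1: true abundances 1.0 and 4.0, but detected at 50% in BOTH mixes.
# tx_2: true abundances 2.0 and 1.0, but over-detected 2x in mix B only.
mix_a = {"tx_1": 0.5, "tx_2": 2.0}
mix_b = {"tx_1": 2.0, "tx_2": 2.0}
expected = {"tx_1": 4.0, "tx_2": 0.5}

errors = log2_fold_change_error(mix_a, mix_b, expected)
for tid, err in errors.items():
    print(tid, err)
```

In this toy case the bias shared by both mixes cancels in the fold-change of `tx_1`, whereas the mix-specific bias of `tx_2` shows up as a log2 error of 1.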
SIRV-Set 2: Isoform Mix E0
The Isoform Mix E0 is available on its own as SIRV-Set 2 (Cat. No. 050), with all 69 isoforms present at equimolar concentrations. Its applications include RNA-Seq experiments that need to be validated for the detection of a complex mixture of isoforms without applying the high read depth needed to cover transcripts at different concentrations. Among these are sequencing runs on long-read NGS platforms as provided by Oxford Nanopore Technologies and Pacific Biosciences. These can produce full-length reads to identify the isoforms faithfully. However, unlike short-read platforms, they do not provide the millions of reads necessary to detect and quantify isoforms across a larger dynamic range, in particular if the spike-ins constitute only a very small fraction of the total RNA.
Deviations from the expected equimolar outcome can be quantified, which allows the performance of isoform-centered workflows to be evaluated. On the gene level, the quantifications of the individual SIRV isoforms can be summed up for each SIRV gene, which permits the validation of pipelines that work with data stemming from individual isoforms but focus on gene expression calculations only.
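As a rough illustration of both steps, the following sketch computes the log2 deviation of each isoform from the equimolar expectation and sums isoform abundances per gene. It assumes the expected value can be approximated by the mean measured abundance across all isoforms; the function names, the transcript-to-gene mapping and the numbers are hypothetical.

```python
import math
from collections import defaultdict

def log2_deviation_from_equimolar(abundances):
    """log2(observed / mean) per isoform; undetected isoforms give -inf."""
    expected = sum(abundances.values()) / len(abundances)
    return {
        tid: math.log2(value / expected) if value > 0 else float("-inf")
        for tid, value in abundances.items()
    }

def gene_totals(abundances, gene_of):
    """Sum isoform abundances per gene using a transcript -> gene mapping."""
    totals = defaultdict(float)
    for tid, value in abundances.items():
        totals[gene_of[tid]] += value
    return dict(totals)

# Made-up measurements for four hypothetical isoforms of two genes.
measured = {"tx_A1": 120.0, "tx_A2": 80.0, "tx_B1": 100.0, "tx_B2": 100.0}
gene_of = {"tx_A1": "gene_A", "tx_A2": "gene_A",
           "tx_B1": "gene_B", "tx_B2": "gene_B"}

deviations = log2_deviation_from_equimolar(measured)
totals = gene_totals(measured, gene_of)
print(deviations)
print(totals)
```

Here the two isoforms of `gene_A` deviate in opposite directions, so the gene-level total matches that of `gene_B` even though the isoform-level values do not.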
SIRV-Set 2 is very suitable for the calculation of concordance, since the experiments’ fingerprints depend solely on how they handle the complexity of the SIRV isoforms and not on input concentration differences between these isoform transcripts.
Reads from SIRV-Set 2 can be downsampled to emulate data from lower concentration ranges. Repeating the mapping and assignment on the downsampled reads provides adequate measures of the ability of the RNA-Seq experiment to detect variants and measure their concentrations in a different concentration range. With such an iterative approach, SIRV-Set 2 can be used to map the entire abundance spectrum.
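The downsampling idea can be sketched as follows. This is a simplification: instead of downsampling raw reads and repeating mapping and assignment, it randomly subsamples reads that have already been assigned to transcripts. The function name, the transcript IDs and the read numbers are hypothetical.

```python
import random
from collections import Counter

def downsample_counts(read_assignments, fraction, seed=0):
    """Keep a random fraction of reads (without replacement) and count
    the surviving reads per transcript.

    read_assignments: list with one transcript ID per assigned read.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    n_keep = round(len(read_assignments) * fraction)
    kept = rng.sample(read_assignments, n_keep)
    return Counter(kept)

# Made-up input: 1000 assigned reads for each of two hypothetical transcripts.
reads = ["tx_A"] * 1000 + ["tx_B"] * 1000

# Repeat the evaluation at decreasing depth to emulate lower concentrations.
for fraction in (1.0, 0.1, 0.01):
    counts = downsample_counts(reads, fraction)
    print(fraction, sum(counts.values()), sorted(counts))
```

Tracking which transcripts remain detected, and how far their counts drift from the equimolar expectation at each fraction, gives one possible readout per emulated concentration range.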