Frequently Asked Questions

Please find a list of the most frequently asked questions below. If you cannot find the answer to your question here or want to know more about our products, please contact support@lexogen.com.

The SIRV design is based on 7 human model genes. The annotated transcripts of these genes were extended by additional isoforms and variants to comprehensively cover alternative splicing, start- and end-site variations, antisense and overlapping transcripts. The exonic sequences of the resulting 69 transcript structures (6-18 per gene) were derived from database-derived genomes, which were then altered to completely lose alignment identity, and blasted against the NCBI database on the nucleotide and the protein level to ensure they are non-identical. Intronic sequences were generated randomly.

The SIRV sequences conform to the canonical exon-intron junction rule: 96.9 % of all SIRV junctions are GT-AG, with the less frequent variants being present at 1.7% (GC-AG) and 0.6% (AT-AC). Two non-canonical splice sites were included at 0.4% each (CT-AG and CT-AC).

The SIRVs are in vitro transcription products, and therefore all SIRV RNAs start with a 5’ ppp-G. Hence, cap-specific cDNA preparation methods are not feasible. The 3’ end of each SIRV RNA is polyadenylated (with 30 adenosines), enabling oligo(dT)-based selection and priming. The SIRVs User Guide contains a graphic representation of the SIRV design.

You can download the sequences: SIRV sequence design overview (XLSX)

In short
The SIRVs were produced by T7 transcription from synthetic genes which – while optimized – in some cases generates RNA of varying integrity. A series of tailored methods was applied to purify full-length SIRV RNAs with a minimal amount of any side products.

In detail
Synthetic gene constructs are produced (Bio Basic Inc, Markham, ON, Canada) that comprise, 5’ to 3’, a unique restriction site, a T7 RNA polymerase promoter whose 3’ G is the first nucleotide of the actual SIRV sequence, and the SIRV sequence itself, which is seamlessly followed by an (A30)-tail fused to an exclusive second restriction site. These gene cassettes are cloned into a vector, colony-amplified and singularized. All SIRV sequences in the purified plasmids are verified by Sanger sequencing to identify the correct clones. The E. coli cultures are grown in batches to obtain plasmids in the lower µg-scale. Double digestion of isolated plasmids with XhoI and NsiI must show correct insert size and complete restriction. Linearized, silica-purified plasmids serve as templates in in vitro transcription reactions using T7 transcription kits (AmpliScribe T7 High Yield Transcription Kit, and AmpliScribe T7 Flash Transcription Kit, Epicentre, Madison, WI). The DNase-treated, phenol-extracted and silica-purified in vitro transcription products are assessed for concentration and purity by spectrophotometry (NanoDrop, Thermo Fisher Scientific, Waltham, MA) and for integrity by capillary electrophoresis (2100 Bioanalyzer, RNA 6000 Pico Kit, Agilent Technologies, Santa Clara, CA).

In the context of variant verification, RNA integrity is a very important measure. Fragments arising from incomplete transcription might introduce errors into the determination of variants that share those sequences and thereby also affect the overall gene coverage. The integrity of the transcription products is very heterogeneous, as expected given the broad sequence variation and the length of the SIRV transcripts (average 1.1 kb with 14 RNAs between 2.0 and 2.5 kb). Therefore, a set of tailored purification procedures is applied to obtain full-length RNAs with a minimal amount of side products despite the broad sequence and length variation of the SIRVs. A majority of transcripts must be purified by at least one of two purification methods. Purification method one is selective for poly(A)-tails and excludes prematurely terminated transcription products. Purification method two is based on size-selective quantitative electrophoresis, separating the correctly sized main products from shorter fragments (transcription break-off and degraded products) and longer fragments (run-through transcription products). After purification, the 69 SIRV RNAs are assessed for the ratios of pre-peak fraction, main-peak fraction (corresponding to RNAs of correct length), and post-peak fraction. Finally, the SIRVs are quantified by absorbance spectroscopy to adjust all stock solutions to a base concentration of close to, but above, 50 ng/µl, and to monitor RNA purity via the 260/280 nm and 260/230 nm absorbance ratios.

The total molarity of each mix is 69.5 fmol/µl, and the total concentration is 25.3 ng/µl. The final total molarity and the final total concentration are the same in all 3 mixes. In Mix E0, all SIRVs are present at the same molarity.
Each of the mixes E0, E1, and E2 contains all 69 SIRVs in different concentration ratios. The concentration ratios in E0 are identical (1:1), E1 covers one order of magnitude (up to 1:8), and E2 extends over more than two orders of magnitude (up to 1:128). This makes it possible either to assess all SIRV transcripts spiked in at the same level or to mimic the transcript variant concentration distribution encountered in real samples. Moreover, the comparison of 3 samples, each spiked with either E0, E1, or E2, allows for a detailed assessment of differential gene expression on the transcript level. The inter-mix concentration ratios range from 1/64- to 16-fold. The final SIRV mixes have nearly identical mass concentrations of 25.2, 25.2, and 25.4 ng/µl, and molar concentrations of 69.0, 68.5, and 70.8 fmol/µl, for mixes E0, E1, and E2, respectively.
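As a plausibility check on the figures above, the mass and molar concentrations of each mix can be divided to obtain the molar-weighted mean molecular weight of the SIRVs in that mix. The following is only a sketch: the function and variable names are illustrative, and the only inputs are the per-mix values quoted above.

```python
# Per-mix concentrations as quoted in the text above.
MIXES = {
    "E0": {"ng_per_ul": 25.2, "fmol_per_ul": 69.0},
    "E1": {"ng_per_ul": 25.2, "fmol_per_ul": 68.5},
    "E2": {"ng_per_ul": 25.4, "fmol_per_ul": 70.8},
}


def mean_molar_mass_g_per_mol(ng_per_ul, fmol_per_ul):
    """Molar-weighted mean molecular weight of the RNAs in a mix.

    1 ng / 1 fmol = 1e-9 g / 1e-15 mol = 1e6 g/mol.
    """
    return ng_per_ul / fmol_per_ul * 1e6


def molar_mass_table(mixes=MIXES):
    """Return {mix name: mean molecular weight in g/mol}."""
    return {name: mean_molar_mass_g_per_mol(**values)
            for name, values in mixes.items()}


if __name__ == "__main__":
    for name, mass in molar_mass_table().items():
        print(f"{name}: {mass / 1000:.0f} kg/mol")
```

All three mixes come out at roughly 360 to 370 kg/mol, which is in the range one would expect for RNAs of about 1.1 kb.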
Each SIRV transcript enters the final mixes via one of eight PreMixes, allowing for the unique identification of each SIRV by capillary electrophoresis while entering the mixing scheme. Then, four pairs of PreMixes are combined in equal ratios to SubMixes, before those SubMixes are combined in defined ratios. The combination of accurate volumes of the stock solutions in sufficiently large batch sizes and a transparent monitoring of the sequential processing steps ensures that the mixes are prepared as accurately as the process-inherent error limits allow. Pipetting errors vary depending on transfer volumes, and range from ±4 % for 2 µl to ±0.8 % for >100 µl transfers. The precision was experimentally determined with blank solutions. Starting with the stock concentration measurement (NanoDrop accuracy ±2 ng/µl for 50 ng/µl), and accounting for the entire mixing pathway, the cumulative concentration error is expected to range between ±4.7 % and ±8 %. Therefore, the data evaluation has to account for this experimental uncertainty by allowing for accuracy tolerances of ±8 % on the linear scale, or ±0.11 on the log2-fold scale. The SIRV concentration ratios between two mixes are more precise because only one final pipetting step defines the concentration differences and synchronizes all SIRVs that belong to the same SubMix. Here, a maximal error of ±4 % (or ±0.057 on the log2-fold scale) can be expected between SubMixes, while all SIRVs of the same SubMix must propagate coherently into the final mixes. Bioanalyzer traces are used to monitor the relative propagation of the SIRVs, PreMixes and SubMixes during the mixing. In addition, the accurate pipetting of the 8 PreMixes is controlled by checksums of NanoDrop concentration measurements and by weighing on an analytical balance.
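The conversion between the linear and the log2-fold error figures quoted above can be reproduced with a few lines of Python. This is a minimal sketch with illustrative function names; it converts the upper-side deviation only (log2 of 1 + error). The lower side, log2 of 1 - error, is slightly larger in magnitude, so the symmetric ± notation is an approximation.

```python
import math


def percent_to_log2(percent_error):
    """Upper-side log2-fold deviation for a relative error given in percent."""
    return math.log2(1 + percent_error / 100)


def relative_error_percent(abs_error, value):
    """Relative error in percent, e.g. NanoDrop +/-2 ng/ul at 50 ng/ul."""
    return abs_error / value * 100


if __name__ == "__main__":
    print(f"NanoDrop: {relative_error_percent(2.0, 50.0):.1f} %")
    print(f"8 % -> {percent_to_log2(8):.3f} log2-fold")
    print(f"4 % -> {percent_to_log2(4):.3f} log2-fold")
```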
The spike-in ratios have to be chosen in accordance with the desired final SIRV content. For RNA to be poly(A) selected (starting from 100 ng of Human Brain Reference Total RNA, HBRR) we recommend using 2.4 µl of a 1:1000 dilution, and for RNA to be rRNA-depleted we recommend using 3.6 µl of a 1:1000 dilution, which in both cases results in 2.83 % SIRVs in the final mixture. For samples with different input amounts, mRNA content, or depletion or enrichment method, the amount of SIRVs has to be adjusted. For samples with unknown mRNA content we recommend using the 2.4 µl volume given above and then deriving the mRNA content by comparing the share of reads aligning to the reference genome with the share aligning to the “SIRVome”. For details, please consult the User Guide or design the experiment by using the SIRV Suite Experiment Designer.
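The arithmetic behind these recommendations can be sketched as follows. The nominal mix concentration of 25.3 ng/µl is taken from the text above; interpreting the 2.83 % as a mass fraction of SIRVs relative to the RNA that remains after selection or depletion is an assumption made for this illustration, as are all function names. For actual experiment planning, the User Guide and the SIRV Suite Experiment Designer remain the reference.

```python
MIX_NG_PER_UL = 25.3  # nominal mass concentration of an undiluted SIRV mix


def sirv_mass_ng(volume_ul, dilution_factor=1000, stock_ng_per_ul=MIX_NG_PER_UL):
    """Mass of SIRV RNA (ng) contained in a given volume of a diluted mix."""
    return volume_ul * stock_ng_per_ul / dilution_factor


def sirv_share(sirv_ng, target_rna_ng):
    """Mass fraction of SIRVs in the final mixture (assumed definition)."""
    return sirv_ng / (sirv_ng + target_rna_ng)


def implied_target_rna_ng(sirv_ng, share):
    """Target RNA mass (ng) that yields the given SIRV mass fraction."""
    return sirv_ng * (1 - share) / share


if __name__ == "__main__":
    for label, volume in (("poly(A) selection", 2.4), ("rRNA depletion", 3.6)):
        spike = sirv_mass_ng(volume)
        target = implied_target_rna_ng(spike, 0.0283)
        print(f"{label}: {spike * 1000:.1f} pg SIRVs, "
              f"implied target RNA {target:.2f} ng")
```

Under these assumptions, the 2.4 µl recommendation corresponds to about 2.1 ng of poly(A) RNA and the 3.6 µl recommendation to about 3.1 ng of rRNA-depleted RNA per 100 ng of total RNA input.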
The SIRV mixes can be used with crude cell extract, purified total RNA, rRNA-depleted RNA or poly(A)-enriched RNA. The spiked-in RNA can be used for all common RNA-Seq library preparations to be analyzed on any platform (Illumina, Ion Torrent, SOLiD, Pacific Biosciences, Oxford Nanopore, etc.).
The mRNAs range from 191 to 2528 nt with a GC content of 29.5 – 51.2 %, with the shortest mRNAs being antisense mono-exonic transcripts.
The ERCC Spike-In Controls (ERCCs, Ambion, Thermo Fisher) make it possible to assess dynamic range, dose response, lower limit of detection, and fold-change response of RNA sequencing pipelines within the limitation of their mono-exonic, single-isoform RNA sequences. Because the ERCCs contain no transcript variants, one of the main challenges of sequencing complex transcriptomes, namely identifying and distinguishing splice variants, cannot be evaluated using the ERCCs.

In contrast, the comprehensive and novel set of Spike-In RNA Variants, SIRVs, can be used to validate isoform-specific RNA sequencing workflows and to compare experiments by extrapolating the results from the well-defined isoform ground truth of a small fraction of control reads to the sample reads. Within the context of variant detection, the assessment of dynamic range, dose response, lower limit of detection, and fold-change response is possible as well.

The number of reactions depends on the spike-in amount required. You can draw 1 µl four times. This 1 µl should then be diluted stepwise to 1:1000. For a typical experiment using 100 ng total RNA input (for example, spiking of Human Brain Reference RNA (HBRR)), 3.6 µl of this dilution are required for an rRNA depletion experiment and 2.4 µl for an mRNA-Seq experiment. Hence, from 1 µl of the original SIRV Mix, around 300 to 400 samples can be spiked, depending on the RNA input.
In another example: if 10 ng of total RNA input is to be spiked, then 1 µl can be diluted 1:10000. See also Table 3, p. 10 of the User Guide.
In any case, we do not recommend keeping the dilution for very long, as diluted RNA solutions are increasingly unstable.
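The estimate of samples per microliter of stock follows directly from the dilution factor and the volume used per sample. The sketch below uses an illustrative function name and ignores dead volume and pipetting losses, so the results are an upper bound rather than a guaranteed yield.

```python
import math


def samples_per_stock(dilution_factor, volume_per_sample_ul, stock_ul=1.0):
    """Number of whole samples that can be spiked from a diluted stock.

    Assumes the full diluted volume is usable (no dead volume or losses).
    """
    diluted_volume_ul = stock_ul * dilution_factor
    return math.floor(diluted_volume_ul / volume_per_sample_ul)


if __name__ == "__main__":
    print("rRNA depletion (3.6 ul each):", samples_per_stock(1000, 3.6))
    print("mRNA-Seq (2.4 ul each):", samples_per_stock(1000, 2.4))
```

With a 1:1000 dilution this gives 277 samples at 3.6 µl each and 416 samples at 2.4 µl each, consistent with the rounded range of 300 to 400 stated above.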