TeloPrime
Detecting genetic variation and base modifications together in the same single molecules of DNA and RNA at base pair resolution using a magnetic tweezer platform
Zhen Wang, Jérôme Maluenda, Laurène Giraut, Thibault Vieille, Andréas Lefevre, David Salthouse, Gaël Radou, Rémi Moulinas, Sandra Astete-Morales, Pol d’Avezac, Geoff Smith, Charles André, Jean-François Allemand, David Bensimon, Vincent Croquette, Jimmy Ouellet, Gordon Hamilton
Accurate decoding of nucleic acid variation is important to understand the complexity and regulation of genome function. Here we introduce a single-molecule platform based on magnetic tweezer (MT) technology that can identify and map the positions of sequence variation and multiple base modifications together in the same single molecules of DNA or RNA at single base resolution. Using synthetic templates, we demonstrate that our method can distinguish the most common epigenetic marks on DNA and RNA with high sensitivity, specificity and precision. We also developed a highly specific CRISPR-Cas enrichment strategy to target genomic regions in native DNA without amplification. We then used this method to enrich native DNA from E. coli and characterized the differential levels of adenine and cytosine base modifications together in molecules of up to 5 kb in length. Finally, we enriched the 5′ UTR of FMR1 from cells derived from a Fragile X carrier and precisely measured the repeat expansion length and methylation status of each molecule. These results demonstrate that our platform can detect a variety of genetic, epigenetic and base modification changes concomitantly within the same single molecules.
The giant sequoia genome and proliferation of disease resistance genes
Alison D. Scott, Aleksey V. Zimin, Daniela Puiu, Rachael Workman, Monica Britton, Sumaira Zaman, Madison Caballero, Andrew C. Read, Adam J. Bogdanove, Emily Burns, Jill Wegrzyn, Winston Timp, Steven L. Salzberg, David B. Neale
The giant sequoia (Sequoiadendron giganteum) of California are massive, long-lived trees that grow along the U.S. Sierra Nevada mountains. As they grow primarily in isolated groves within a narrow range, conservation of existing trees has been a national goal for over 150 years. Genomic data are limited in giant sequoia, and the assembly and annotation of the first giant sequoia genome has been an important goal to allow marker development for restoration and management. Using Illumina and Oxford Nanopore sequencing combined with Dovetail chromosome conformation capture libraries, 8.125 Gbp of sequence was assembled into eleven chromosome-scale scaffolds. This giant sequoia assembly represents the first genome sequenced in the Cupressaceae family, and lays a foundation for using genomic tools to aid in giant sequoia conservation and management. Beyond conservation and management applications, the giant sequoia assembly is a resource for answering questions about the life history of this enigmatic and robust species. Here we provide an example by taking an inventory of the large and complex family of NLR type disease resistance genes.
HIV-1 spliced RNAs display transcription start site bias
Jackie M. Esquiaqui, Siahrei Kharytonchyk, Darra Drucker and Alice Telesnitsky
HIV-1 transcripts have three fates: to serve as genomic RNAs, unspliced mRNAs, or spliced subgenomic mRNAs. Recent structural studies have shown that sequences near the 5′ end of HIV-1 RNA can adopt at least two alternate 3-dimensional conformations, and that these structures dictate genome vs. unspliced mRNA fates. HIV-1’s use of alternate transcription start sites can influence which RNA conformer is generated, and this choice in turn dictates the fate of the unspliced RNA. The structural context of HIV-1’s major 5′ splice site differs in these two RNA conformers, suggesting that the conformers may differ in their ability to support HIV-1 splicing events. Here we tested the hypothesis that transcription start sites that shift the RNA monomer/dimer structural equilibrium away from the splice site sequestering dimer-competent fold would favor splicing. Consistent with this hypothesis, the results showed that the 5′ ends of spliced HIV-1 RNAs were enriched in 3GCap structures and depleted of 1GCap RNAs relative to the total intracellular RNA population. These findings expand the functional significance of HIV-1 RNA structural dynamics by demonstrating roles for RNA structure in defining all three classes of HIV-1 RNAs, and suggest that HIV-1 transcription start site choice initiates a cascade of molecular events that dictate the fates of nascent HIV-1 RNAs.
Inhibition of cytoplasmic cap methylation identifies 5′ TOP mRNAs as recapping targets and reveals recapping sites downstream of native 5′ ends
Daniel del Valle Morales, Jackson B Trotman, Ralf Bundschuh, Daniel R Schoenberg
Features TeloPrime Full-Length cDNA Amplification Kit and QuantSeq 3’ mRNA-Seq Library Prep Kit REV for Illumina
Long-read Assays Shed New Light on the Transcriptome Complexity of a Viral Pathogen and on Virus-Host Interaction
Dóra Tombácz, István Prazsák, Zoltán Maróti, Norbert Moldován, Zsolt Csabai, Zsolt Balázs, Béla Dénes, Tibor Kalmár, Michael Snyder, Zsolt Boldogkői
Characterization of global transcriptomes using conventional short-read sequencing is challenging because of the insensitivity of these platforms to transcript isoforms, multigenic RNA molecules, and transcriptional overlaps, etc. Long-read sequencing (LRS) can overcome these limitations by reading full-length transcripts. Employment of these technologies has led to the redefinition of transcriptional complexities in reported organisms. In this study, we applied LRS platforms from Pacific Biosciences and Oxford Nanopore Technologies to profile the dynamic vaccinia virus (VACV) transcriptome and assess the effect of viral infection on host gene expression. We performed cDNA and direct RNA sequencing analyses and revealed an extremely complex transcriptional landscape of this virus. In particular, VACV genes produce large numbers of transcript isoforms that vary in their start and termination sites. A significant fraction of VACV transcripts start or end within coding regions of neighboring genes. We distinguished five classes of host genes according to their temporal responses to viral infection. This study provides novel insights into the transcriptomic profile of a viral pathogen and the effect of the virus on host gene expression.
Dual-initiation promoters with intertwined canonical and TCT/TOP transcription start sites diversify transcript processing
Chirag Nepal, Yavor Hadzhiev, Piotr Balwierz, Estefanía Tarifeño-Saldivia, Ryan Cardenas, Joseph W. Wragg, Ana-Maria Suzuki, Piero Carninci, Bernard Peers, Boris Lenhard, Jesper B. Andersen & Ferenc Müller
Variations in transcription start site (TSS) selection reflect diversity of preinitiation complexes and can impact on post-transcriptional RNA fates. Most metazoan polymerase II-transcribed genes carry canonical initiation with pyrimidine/purine (YR) dinucleotide, while translation machinery-associated genes carry polypyrimidine initiator (5’-TOP or TCT). By addressing the developmental regulation of TSS selection in zebrafish we uncovered a class of dual-initiation promoters in thousands of genes, including snoRNA host genes. 5’-TOP/TCT initiation is intertwined with canonical initiation and used divergently in hundreds of dual-initiation promoters during maternal to zygotic transition. Dual-initiation in snoRNA host genes selectively generates host and snoRNA with often different spatio-temporal expression. Dual-initiation promoters are pervasive in human and fruit fly, reflecting evolutionary conservation. We propose that dual-initiation on shared promoters represents a composite promoter architecture, which can function both coordinately and divergently to diversify RNAs.
Template-switching artifacts resemble alternative polyadenylation
Zsolt Balázs, Dóra Tombácz, Zsolt Csabai, Norbert Moldován, Michael Snyder & Zsolt Boldogkői
Background
Alternative polyadenylation is commonly examined using cDNA sequencing, which is known to be affected by template-switching artifacts. However, the effects of such template-switching artifacts on alternative polyadenylation are generally disregarded, while alternative polyadenylation artifacts are attributed to internal priming.
Results
Here, we analyzed both long-read cDNA sequencing and direct RNA sequencing data of two organisms, generated by different sequencing platforms. We developed a filtering algorithm which takes into consideration that template-switching can be a source of artifactual polyadenylation when filtering out spurious polyadenylation sites. The algorithm outperformed the conventional internal priming filters based on comparison to direct RNA sequencing data. We also showed that the polyadenylation artifacts arise in cDNA sequencing at consecutive stretches of as few as three adenines. There was no substantial difference between the lengths of poly(A) tails at the artifactual and the true transcriptional end sites even though it is expected that internal priming artifacts have shorter poly(A) tails than genuine polyadenylated reads.
Conclusions
Our findings suggest that template switching plays an important role in the generation of spurious polyadenylation and support the need for more rigorous filtering of artifactual polyadenylation sites in cDNA data, or that alternative polyadenylation should be annotated using native RNA sequencing.
High-quality chromosome-scale assembly of the walnut (Juglans regia L) reference genome
Annarita Marrano, Monica Britton, Paulo A. Zaini, Aleksey V. Zimin, Rachael E. Workman, Daniela Puiu, Luca Bianco, Erica Adele Di Pierro, Brian J. Allen, Sandeep Chakraborty, Michela Troggio, Charles A. Leslie, Winston Timp, Abhaya Dandekar, Steven L. Salzberg, David B. Neale
The release of the first reference genome of walnut (Juglans regia L.) enabled many achievements in the characterization of walnut genetic and functional variation. However, it is highly fragmented, preventing the integration of genetic, transcriptomic, and proteomic information to fully elucidate walnut biological processes. Here we report the new chromosome-scale assembly of the walnut reference genome (Chandler v2.0) obtained by combining Oxford Nanopore long-read sequencing with chromosome conformation capture (Hi-C) technology. Relative to the previous reference genome, the new assembly features an 84.4-fold increase in N50 size, and the full sequence of all 16 chromosomal pseudomolecules, nine of which present telomere sequences at both ends. Using full-length transcripts from single-molecule real-time sequencing, we predicted 40,491 gene models, with a mean gene length higher than the previous gene annotations. Most of the new protein-coding genes (90%) are full-length, which represents a significant improvement compared to Chandler v1.0 (only 48%). We then tested the potential impact of the new chromosome-level genome on different areas of walnut research. By studying the proteome changes occurring during catkin development, we observed that the virtual proteome obtained from Chandler v2.0 presents fewer artifacts than the previous reference genome, enabling the identification of a new potential pollen allergen in walnut. Also, the new chromosome-scale genome facilitates in-depth studies of intraspecies genetic diversity by revealing previously undetected autozygous regions in Chandler, likely resulting from inbreeding, and 195 genomic regions highly differentiated between Western and Eastern walnut cultivars. Overall, Chandler v2.0 is a valuable resource to understand and explore walnut biology better.
Structural rearrangements drive extensive genome divergence between symbiotic and free-living Symbiodinium
Raúl A. González-Pech, Timothy G. Stephens, Yibi Chen, Amin R. Mohamed, Yuanyuan Cheng, David W. Burt, Debashish Bhattacharya, Mark A. Ragan, Cheong Xin Chan
Symbiodiniaceae are predominantly symbiotic dinoflagellates critical to corals and other reef organisms. Symbiodinium is a basal symbiodiniacean lineage and includes symbiotic and free-living taxa. However, the molecular mechanisms underpinning these distinct lifestyles remain little known. Here, we present high-quality de novo genome assemblies for the symbiotic Symbiodinium tridacnidorum CCMP2592 (genome size 1.3 Gbp) and the free-living Symbiodinium natans CCMP2548 (genome size 0.74 Gbp). These genomes display extensive sequence divergence, sharing only ~1.5% conserved regions (≥90% identity). We predicted 45,474 and 35,270 genes for S. tridacnidorum and S. natans, respectively; of the 58,541 homologous gene families, 28.5% are common to both genomes. We recovered a greater extent of gene duplication and higher abundance of repeats, transposable elements and pseudogenes in the genome of S. tridacnidorum than in that of S. natans. These findings demonstrate that genome structural rearrangements are pertinent to distinct lifestyles in Symbiodinium, and may contribute to the vast genetic diversity within the genus, and more broadly in Symbiodiniaceae. Moreover, the results from our whole-genome comparisons against a free-living outgroup support the notion that the symbiotic lifestyle is a derived trait in, and that the free-living lifestyle is ancestral to, Symbiodinium.
Illuminating the dark side of the human transcriptome with TAMA Iso-Seq analysis
Richard I. Kuo, Yuanyuan Cheng, Jacqueline Smith, Alan L. Archibald, David W. Burt
The human transcriptome is one of the most well-annotated of the eukaryotic species. However, limitations in technology biased discovery toward protein coding spliced genes. Accurate high throughput long read RNA sequencing now has the potential to investigate genes that were previously undetectable. Using our Transcriptome Annotation by Modular Algorithms (TAMA) tool kit to analyze the Pacific Biosciences Universal Human Reference RNA Sequel II Iso-Seq dataset, we discovered thousands of potential novel genes and identified challenges in both RNA preparation and long read data processing that have major implications for transcriptome annotation.
Polarella glacialis genomes encode tandem repeats of single-exon genes with functions critical to adaptation of dinoflagellates
Timothy G. Stephens, Raúl A. González-Pech, Yuanyuan Cheng, Amin R. Mohamed, David W. Burt, Debashish Bhattacharya, Mark A. Ragan, Cheong Xin Chan
Dinoflagellates are taxonomically diverse, ecologically important phytoplankton in marine and freshwater environments. Here, we present two draft diploid genome assemblies of the free-living dinoflagellate Polarella glacialis, isolated from the Arctic and Antarctica. For each genome, guided using full-length transcriptome data, we predicted >50,000 high-quality genes. About 68% of the genome is repetitive sequence; long terminal repeats likely contribute to intra-species structural divergence and distinct genome sizes (3.0 and 2.7 Gbp). Of all genes, ∼40% are encoded unidirectionally, ∼25% comprised of single exons. Multi-genome comparison unveiled genes specific to P. glacialis and a common, putatively bacterial, origin of ice-binding domains in cold-adapted dinoflagellates. Our results elucidate how selection acts within the context of a complex genome structure to facilitate local adaptation. Since most dinoflagellate genes are constitutively expressed, Polarella glacialis has enhanced transcriptional responses via unidirectional, tandem duplication of single-exon genes that encode functions critical to survival in cold, low-light environments.
Identification and characterisation of anti-Pseudomonas aeruginosa proteins in mucus of the brown garden snail, Cornu aspersum
SJ Pitt, JA Hawthorne, M Garcia-Maya, A Alexandrovich, RC Symonds & A Gunn
British Journal of Biomedical Science, doi:10.1080/09674845.2019.1603794
Background
Novel antimicrobial treatments are urgently needed. Previous work has shown that the mucus of the brown garden snail (Cornu aspersum) has antimicrobial properties, in particular against type culture collection strains of Pseudomonas aeruginosa. We hypothesised that it would also be effective against clinical isolates of the bacterium and that investigation of fractions of the mucus would identify one or more proteins with anti-pseudomonal properties, which could be further characterised.
Materials and methods
Mucus was extracted from snails collected from the wild. Antimicrobial activity against laboratory and clinical isolates of Ps. aeruginosa was determined in disc diffusion assays. Mucus was purified using size exclusion chromatography and fractions containing anti-pseudomonal activity identified. Mass spectroscopy and high performance liquid chromatography analysis of these fractions yielded partial peptide sequences. These were used to interrogate an RNA transcriptome generated from whole snails.
Results
Mucus from C. aspersum inhibited growth of type collection strains and clinical isolates of Ps. aeruginosa. Four novel C. aspersum proteins were identified; at least three are likely to have antimicrobial properties. The most interesting is a 37.4 kDa protein whilst smaller proteins, one 17.5 kDa and one 18.6 kDa also appear to have activity against Ps. aeruginosa.
Conclusions
The study has identified novel proteins with antimicrobial properties which could be used to develop treatments for use in human medicine.
Transcriptome profiling of mouse samples using nanopore sequencing of cDNA and RNA molecules
Camille Sessegolo, Corinne Cruaud, Corinne Da Silva, Marion Dubarry, Thomas Derrien, Vincent Lacroix, Jean-Marc Aury
Background
Our vision of DNA transcription and splicing has changed dramatically with the introduction of short-read sequencing. These high-throughput sequencing technologies promised to unravel the complexity of any transcriptome. Generally, gene expression levels are well-captured using these technologies, but there are still remaining caveats due to the limited read length and the fact that RNA molecules had to be reverse transcribed before sequencing. Oxford Nanopore Technologies has recently launched a portable sequencer which offers the possibility of sequencing long reads and most importantly RNA molecules.
Results
Here we generated a full mouse transcriptome from brain and liver using the Oxford Nanopore device. As a comparison, we sequenced RNA (RNA-Seq) and cDNA (cDNA-Seq) molecules using both long and short reads technologies. In addition, we tested the TeloPrime preparation kit, dedicated to the enrichment of full-length transcripts.
Conclusions
Using spike-in data, we confirmed that expression levels are efficiently captured by cDNA-Seq using short reads. More importantly, Oxford Nanopore RNA-Seq tends to be more efficient, while cDNA-Seq appears to be more biased. We further show that the cDNA library preparation of the Nanopore protocol induces read truncation for transcripts containing stretches of A’s. Furthermore, bioinformatics challenges remain ahead for quantifying at the transcript level, especially when reads are not full-length. Accurate quantification of processed pseudogenes also remains difficult, and we show that current mapping protocols which map reads to the genome largely over-estimate their expression, at the expense of their parent gene.
Features TeloPrime Full-Length cDNA Amplification Kit and SIRVs (Spike-in RNA Variant Control Mixes) – SIRV-Set 1
Long-read sequencing uncovers a complex transcriptome topology in varicella zoster virus
István Prazsák, Norbert Moldován, Zsolt Balázs, Dóra Tombácz, Klára Megyeri, Attila Szűcs, Zsolt Csabai and Zsolt Boldogkői
Background
Varicella zoster virus (VZV) is a human pathogenic alphaherpesvirus harboring a relatively large DNA molecule. The VZV transcriptome has already been analyzed by microarray and short-read sequencing analyses. However, both approaches have substantial limitations when used for structural characterization of transcript isoforms, even if supplemented with primer extension or other techniques. Among others, they are inefficient in distinguishing between embedded RNA molecules, transcript isoforms, including splice and length variants, as well as between alternative polycistronic transcripts. It has been demonstrated in several studies that long-read sequencing is able to circumvent these problems.
Results
In this work, we report the analysis of the VZV lytic transcriptome using the Oxford Nanopore Technologies sequencing platform. These investigations have led to the identification of 114 novel transcripts, including mRNAs, non-coding RNAs, polycistronic RNAs and complex transcripts, as well as 10 novel spliced transcripts and 25 novel transcription start site isoforms and transcription end site isoforms. A novel class of transcripts, the nroRNAs, is described in this study. These transcripts are encoded by the genomic region located in close vicinity to the viral replication origin. We also show that ORF63 exhibits a complex structural variation encompassing the splice sites of VZV latency transcripts. Additionally, we have detected RNA editing in a novel non-coding RNA molecule.
Conclusions
Our investigations disclosed a composite transcriptomic architecture of VZV, including the discovery of novel RNA molecules and transcript isoforms, as well as a complex meshwork of transcriptional read-throughs and overlaps. The results represent a substantial advance in the annotation of the VZV transcriptome and in understanding the molecular biology of the herpesviruses in general.
Improving nanopore read accuracy with the R2C2 method enables the sequencing of highly multiplexed full-length single-cell cDNA
Roger Volden, Theron Palmer, Ashley Byrne, Charles Cole, Robert J. Schmitz, Richard E. Green, and Christopher Vollmers
Dual Platform Long-Read RNA-Sequencing Dataset of the Human Cytomegalovirus Lytic Transcriptome
Zsolt Balázs, Dóra Tombácz, Attila Szűcs, Michael Snyder and Zsolt Boldogkői
RNA-sequencing has revolutionized transcriptomics and the way we measure gene expression (Wang et al., 2009). As of today, short-read RNA sequencing is more widely used, and due to its low price and high throughput, is the preferred tool for the quantitative analysis of gene expression. However, the annotation of transcript isoforms is rather difficult using only short-read sequencing data, because the reads are shorter than most transcripts (Steijger et al., 2013). Long-read sequencing, on the other hand, can provide full contig information about transcripts, including exon-connectivity, and its merits in transcriptome profiling are being increasingly acknowledged (Sharon et al., 2013; Abdel-Ghany et al., 2016; Wang et al., 2016; Kuo et al., 2017). Due to the relatively low throughput of current long-read sequencing technologies, they can only characterize smaller transcriptomes in high-depth (Weirather et al., 2017).
The Human cytomegalovirus (HCMV) is a ubiquitous betaherpesvirus, which can cause mononucleosis-like symptoms in adults (Cohen and Corey, 1985), and severe life-threatening infections in newborns (Wen et al., 2002). Latent HCMV infection has recently been implicated to affect cancer formation (Dziurzynski et al., 2012; Jin et al., 2014). Examining the transcriptome of the virus can go a long way in helping understand its molecular biology. Short-read RNA sequencing studies have discovered splice junctions and non-coding transcripts (Gatherer et al., 2011) and have shown that the most abundant HCMV transcripts are similarly expressed in different cell types (Cheng et al., 2017). Our long-read RNA sequencing experiments using the Pacific Biosciences (PacBio) RSII platform revealed a great number of transcript isoforms, polycistronic RNAs and transcriptional overlaps (Balázs et al., 2017a).
Data
Here, we present the dual-platform long-read RNA sequencing dataset of two HCMV-infected fibroblast samples. We have sequenced the same RNA population that we have previously sequenced with the PacBio RSII platform (Balázs et al., 2017b), but now using the PacBio Sequel and Oxford Nanopore Technologies (ONT) MinION platforms. These data, apart from providing a more profound picture of the lytic HCMV transcriptome, can also be used to compare the current technologies. A further sample was prepared, using lytic HCMV RNAs. This sample was subjected to ONT Cap-selected cDNA sequencing (Cap-Seq) in order to allow better characterization of the transcription start sites, and also to direct (d)RNA sequencing in order to avoid reverse-transcription (RT) and PCR artifacts. We report the sequencing of approximately 100 GB of raw data (Supplementary Table 1). Cap-Seq on the MinION platform yielded the highest read count, while the throughputs of the Sequel platform and of ONT dRNA sequencing both lagged behind (summarized in Figure 1A); both technologies nonetheless offer significant benefits. The Sequel platform is more accurate and the dRNA sequencing is free of RT and PCR artifacts. The read length distribution shows that the Sequel platform has a similar molecule-size preference to the RSII platform, while the MinION platform sequences more short reads (Figure 1B). The length distribution of the non-cap-selected cDNA sequencing reads is different from that of the other ONT reads, because this library was size-selected (>500 nt).
Comparative genome analysis of programmed DNA elimination in nematodes
Jianbin Wang, Shenghan Gao, Yulia Mostovoy, Yuanyuan Kang, Maxim Zagoskin, Yongqiao Sun, Bing Zhang, Laura K. White, Alice Easton, Thomas B. Nutman, Pui-Yan Kwok, Songnian Hu, Martin K. Nielsen and Richard E. Davis
Transcriptomic study of Herpes simplex virus type-1 using full-length sequencing techniques
Zsolt Boldogkői, Attila Szűcs, Zsolt Balázs, Donald Sharon, Michael Snyder & Dóra Tombácz
Lytic Transcriptome Dataset of Varicella Zoster Virus Generated by Long-read Sequencing
Dóra Tombácz, Donald Sharon, Attila Szűcs, Norbert Moldován, Michael Snyder, Zsolt Boldogkői
Introduction
Varicella zoster virus (VZV) belongs to the Alphaherpesvirinae subfamily of the Herpesviridae family. It is the etiological agent of chickenpox (varicella) caused by primary infection and shingles (zoster), which is due to reactivation of the virus from latency (Kennedy, 2002). Many countries have adopted recommendations for routine immunization of children and susceptible adults against VZV. The VZV virion is composed of an icosahedral nucleocapsid surrounded by a tegument layer, which is covered by an envelope derived from the host cell membrane with incorporated viral glycoproteins (Maresova et al., 2005). The genome of VZV consists of a linear double-stranded DNA molecule and is approximately 125-kbp in size, which contains more than 70 annotated open reading frames (ORFs) (Tyler et al., 2007). The transcription of the virus is strictly regulated by cascade-like processes. First, the immediate-early (IE) transcripts are expressed, which is then followed by the expression of the early (E), and then the late (L) kinetic classes of transcripts (Reichelt et al., 2009). The IE ORF62 gene of VZV encodes the major transactivator, which controls the expression of other viral genes. The viral E genes encode proteins that are used in DNA replication, while L genes code for the structural elements of the virus.
High-throughput short-read sequencing (SRS) techniques have revolutionized transcriptome research (Delseny et al., 2010). These techniques have also been utilized in the investigation of herpesvirus gene expression (e.g. Chambers et al., 1999; Ebrahimi et al., 2003; Baird et al., 2014; Oláh et al., 2015). However, the SRS approach has severe limitations in comparison to long-read sequencing (LRS), including Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) platforms. LRS techniques have been used before in transcriptome studies of the herpesviruses (Tombácz et al., 2016; O’Grady et al., 2016; Tombácz et al., 2017; Balázs et al., 2017a; 2017b; Moldován et al., 2018). These studies uncovered a very complex transcriptome, which included the identification of a large number of novel RNA molecules and transcript isoforms (Tombácz et al., 2015; Tombácz et al., 2017; Balázs et al., 2017a). Moreover, an extended meshwork of overlaps between the transcripts was also detected by these studies (Tombácz et al., 2016; Moldován et al., 2018).
The presented data report is aimed toward providing a new, comprehensive transcript catalog of VZV using an LRS approach for the first time. In this study, we applied the ONT MinION device and various full-length cDNA sequencing protocols that capture the entire poly(A)-transcriptome of VZV.
Transcriptome-wide survey of pseudorabies virus using next- and third-generation sequencing platforms
Dóra Tombácz, Donald Sharon, Attila Szűcs, Norbert Moldován, Michael Snyder & Zsolt Boldogkői
Multi-Platform Sequencing Approach Reveals a Novel Transcriptome Profile in Pseudorabies Virus
Norbert Moldován, Dóra Tombácz, Attila Szűcs, Zsolt Csabai, Michael Snyder and Zsolt Boldogkői
In the second part, I developed a novel technology, CAPTRE, to measure the translational status of distinct mRNA TL isoforms. In mouse fibroblasts, a total of 22,357 TSSs derived from 10,875 protein-coding genes were identified. Among 4153 genes expressing multiple TSSs, 745 exhibited a significant TE difference between their alternative TL isoforms. Longer isoforms were more frequently associated with lower TE, and the global impact of several regulatory elements was also revisited, such as uORFs, cap-adjacent stable RNA secondary structures, and the 5′-terminal oligopyrimidine tract. In addition, several novel sequence motifs that can affect translation activity were identified and their effect was validated using two reporter systems. Finally, quantitative models combining different features identified in this study explained approximately 60% of the variance of the TE difference observed between TL isoforms.
This study provides novel mechanistic insights into translational regulation and characterizes the potential coupling between translational and transcriptional regulation in mammalian cells.
Thyroglobulin Represents a Novel Molecular Architecture of Vertebrates
Guillaume Holzer, Yoshiaki Morishita, Jean-Baptiste Fini, Thibault Lorin, Benjamin Gillet, Sandrine Hughes, Marie Tohmé, Gilbert Deléage, Barbara Demeneix, Peter Arvan and Vincent Laudet
cDNA Library Enrichment of Full Length Transcripts for SMRT Long Read Sequencing
Maria Cartolano, Bruno Huettel, Benjamin Hartwig, Richard Reinhardt, Korbinian Schneeberger
Pervasive isoform‐specific translational regulation via alternative transcription start sites in mammals
Xi Wang, Jingyi Hou, Claudia Quedenau, Wei Chen
Molecular Systems Biology (2016) 12, 875, DOI 10.15252/msb.20166941