{"id":593614,"date":"2023-01-03T06:49:31","date_gmt":"2023-01-03T12:49:31","guid":{"rendered":"https:\/\/news.sellorbuyhomefast.com\/index.php\/2023\/01\/03\/accurate-isoform-discovery-with-isoquant-using-long-reads\/"},"modified":"2023-01-03T06:49:31","modified_gmt":"2023-01-03T12:49:31","slug":"accurate-isoform-discovery-with-isoquant-using-long-reads","status":"publish","type":"post","link":"https:\/\/newsycanuse.com\/index.php\/2023\/01\/03\/accurate-isoform-discovery-with-isoquant-using-long-reads\/","title":{"rendered":"Accurate isoform discovery with IsoQuant using long reads"},"content":{"rendered":"\n<div>\n<div id=\"Sec1-section\" data-title=\"Main\">\n<h2 id=\"Sec1\">Main<\/h2>\n<div id=\"Sec1-content\">\n<p>Long-read RNA sequencing is now widely used in bulk, sorted cells, single cells and spatial approaches. This wide field of applications has led to the development of multiple spliced alignment programs<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15\u201321 (2013).\" href=\"http:\/\/www.nature.com\/#ref-CR1\" id=\"ref-link-section-d324223e468\">1<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094\u20133100 (2018).\" href=\"http:\/\/www.nature.com\/#ref-CR2\" id=\"ref-link-section-d324223e468_1\">2<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Liu, B. et al. deSALT: fast and accurate long transcriptomic read alignment with de Bruijn graph-based index. Genome Biol. 
20, 274 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR3\" id=\"ref-link-section-d324223e468_2\">3<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\" title=\"Sahlin, K. &#038; M\u00e4kinen, V. Accurate spliced alignment of long RNA sequencing reads. Bioinformatics 37, 4643\u20134651 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR4\" id=\"ref-link-section-d324223e471\">4<\/a><\/sup>, transcript discovery methods<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Kovaka, S. et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 20, 278 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR5\" id=\"ref-link-section-d324223e475\">5<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Tung, L. H., Shao, M. &#038; Kingsford, C. Quantifying the benefit offered by transcript assembly with Scallop-LR on single-molecule long reads. Genome Biol. 20, 287 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR6\" id=\"ref-link-section-d324223e475_1\">6<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Wyman, D. et al. A technology-agnostic long-read analysis pipeline for transcriptome discovery and quantification. Preprint at bioRxiv \n                https:\/\/doi.org\/10.1101\/672931\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR7\" id=\"ref-link-section-d324223e475_2\">7<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Tang, A. D. et al. Full-length transcript characterization of SF3B1 mutation in chronic lymphocytic leukemia reveals downregulation of retained introns. Nat. 
Commun. 11, 1438 (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR8\" id=\"ref-link-section-d324223e475_3\">8<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Kuo, R. I. et al. Illuminating the dark side of the human transcriptome with long read transcript sequencing. BMC Genomics 21, 751 (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR9\" id=\"ref-link-section-d324223e475_4\">9<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Byrne, A. et al. Nanopore long-read RNAseq reveals widespread transcriptional variation among the surface receptors of individual B cells. Nat. Commun. 8, 16027 (2017).\" href=\"http:\/\/www.nature.com\/#ref-CR10\" id=\"ref-link-section-d324223e475_5\">10<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 11\" title=\"Chen, Y. et al. Context-aware transcript quantification from long read RNA-Seq data. Bioconductor \n                https:\/\/doi.org\/10.18129\/B9.bioc.bambu\n                \n               (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR11\" id=\"ref-link-section-d324223e478\">11<\/a><\/sup>, tools for transcript classification<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 12\" title=\"Tardaguila, M. et al. Corrigendum: SQANTI: extensive characterization of long-read transcript sequences for quality control in full-length transcriptome identification and quantification. Genome Res. 
28, 1096\u20131096 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR12\" id=\"ref-link-section-d324223e482\">12<\/a><\/sup>, annotation<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 13\" title=\"de la Fuente, L. et al. tappAS: a comprehensive computational framework for the analysis of the functional impact of differential splicing. Genome Biol. 21, 119 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR13\" id=\"ref-link-section-d324223e486\">13<\/a><\/sup> and visualization<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 14\" title=\"Reese, F. &#038; Mortazavi, A. Swan: a library for the analysis and visualization of long-read transcriptomes. Bioinformatics 37, 1322\u20131323 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR14\" id=\"ref-link-section-d324223e490\">14<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 15\" title=\"Stein, A. N., Joglekar, A., Poon, C.-L. &#038; Tilgner, H. U. ScisorWiz: visualizing differential isoform expression in single-cell long-read data. Bioinformatics 38, 3474\u20133476 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR15\" id=\"ref-link-section-d324223e493\">15<\/a><\/sup>. Additionally, several reference-free tools for RNA long-read correction and assembly have been developed<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 16\" title=\"Sahlin, K. &#038; Medvedev, P. Error correction enables use of Oxford Nanopore technology for reference-free transcriptome analysis. Nat. Commun. 
12, 2 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR16\" id=\"ref-link-section-d324223e498\">16<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 17\" title=\"Nip, K. M. et al. RNA-Bloom enables reference-free and reference-guided sequence assembly for single-cell transcriptomes. Genome Res. 30, 1191\u20131200 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR17\" id=\"ref-link-section-d324223e501\">17<\/a><\/sup>. Current community efforts address the problem of understanding performance, weaknesses and advantages of each approach for various applications<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Pardo-Palacios, F. et al. Systematic assessment of long-read RNA-seq methods for transcript identification and quantifican. Preprint at \n                https:\/\/doi.org\/10.21203\/rs.3.rs-777702\/v1\n                \n               (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR18\" id=\"ref-link-section-d324223e505\">18<\/a><\/sup>.<\/p>\n<p>Here we present IsoQuant\u2014a tool for transcript discovery and quantification with long RNA reads. IsoQuant takes as input a reference genome and a dataset containing PacBio or ONT (Oxford Nanopore Technologies) RNA reads. By default, IsoQuant maps input reads to the genome via minimap2 in splice mode<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 2\" title=\"Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094\u20133100 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR2\" id=\"ref-link-section-d324223e512\">2<\/a><\/sup>. 
Alternatively, a user may provide BAM files generated with a spliced aligner of their choice, for example STARlong<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 1\" title=\"Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15\u201321 (2013).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR1\" id=\"ref-link-section-d324223e516\">1<\/a><\/sup> for PacBio and uLTRA<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\" title=\"Sahlin, K. &#038; M\u00e4kinen, V. Accurate spliced alignment of long RNA sequencing reads. Bioinformatics 37, 4643\u20134651 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR4\" id=\"ref-link-section-d324223e520\">4<\/a><\/sup> or deSALT<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 3\" title=\"Liu, B. et al. deSALT: fast and accurate long transcriptomic read alignment with de Bruijn graph-based index. Genome Biol. 20, 274 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR3\" id=\"ref-link-section-d324223e524\">3<\/a><\/sup> for ONT reads. IsoQuant supports two modes: de novo, annotation-free transcript discovery and transcript discovery guided by a reference gene annotation.<\/p>\n<p>IsoQuant uses long-read spliced alignments to construct an intron graph, in which vertices are splice junctions, that is, pairs of splice sites (donor and acceptor), and two vertices are connected with a directed edge if the corresponding splice junctions are consecutive in at least one read (<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#Sec2\">Methods<\/a>). 
This graph is then used to construct paths that correspond to full-length transcripts (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#Fig1\">1a<\/a>). If a reference annotation is provided, IsoQuant first assigns reads to known isoforms via an inexact intron-chain matching algorithm that accounts for splice site shifts, which are typical for alignments of error-prone reads<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 19\" title=\"Mikheenko, A., Prjibelski, A. D., Joglekar, A. &#038; Tilgner, H. U. Sequencing of individual barcoded cDNAs using Pacific Biosciences and Oxford Nanopore Technologies reveals platform-specific error patterns. Genome Res. 32, 726\u2013737 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR19\" id=\"ref-link-section-d324223e537\">19<\/a><\/sup>. These assignments are then used for reference transcript quantification and for correcting inaccurately detected splice junctions and misalignments, such as skipped microexons.<\/p>\n<div data-test=\"figure\" data-container-section=\"figure\" id=\"figure-1\" data-title=\"IsoQuant pipeline outline and characteristics of novel transcripts generated from mouse simulated data.\">\n<figure><figcaption><b id=\"Fig1\" data-test=\"figure-caption-text\">Fig. 
1: IsoQuant pipeline outline and characteristics of novel transcripts generated from mouse simulated data.<\/b><\/figcaption><div>\n<div><a data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y\/figures\/1\" rel=\"nofollow\"><picture><source type=\"image\/webp\" ><img decoding=\"async\" aria-describedby=\"Fig1\" src=\"http:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41587-022-01565-y\/MediaObjects\/41587_2022_1565_Fig1_HTML.png\" alt=\" figure 1\" loading=\"lazy\" width=\"685\" height=\"349\"><\/picture><\/a><\/div>\n<div data-test=\"bottom-caption\" id=\"figure-1-desc\">\n<p><b>a<\/b>, Outline of the IsoQuant pipeline. When a reference gene annotation is provided, reads are assigned to annotated isoforms and alignment artifacts are corrected (top). The intron graph is constructed from read alignments (middle) and transcripts are discovered via path construction (bottom). <b>b<\/b>, F1-score for novel transcripts reported by different tools on simulated ONT (left) and PacBio data (right). <b>c<\/b>, Precision and recall for novel transcripts reported by different tools on simulated ONT data broken up by expression levels in TPM. TPM bins are presented by dot sizes. <b>d<\/b>, Precision (left) and recall (right) for novel transcripts reported by different tools on simulated ONT data. 
<b>e<\/b>, Same as <b>d<\/b>, but for simulated PacBio data.<\/p>\n<p><a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM3\">Source Data<\/a><\/p>\n<\/div>\n<\/div>\n<\/figure>\n<\/div>\n<p>To compare IsoQuant performance against existing transcript discovery tools, we first simulated mouse PacBio and ONT data using realistic gene expression profiles with IsoSeqSim (<a href=\"https:\/\/github.com\/yunhaowang\/IsoSeqSim\">https:\/\/github.com\/yunhaowang\/IsoSeqSim<\/a>) and Trans-NanoSim<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 20\" title=\"Hafezqorani, S. et al. Trans-NanoSim characterizes and simulates nanopore RNA-sequencing data. Gigascience 9, giaa061 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR20\" id=\"ref-link-section-d324223e592\">20<\/a><\/sup>, respectively. 
For more informative benchmarking, we simulated an ONT R9.4 dataset representing R9.4 chemistry and an ONT R10.4 dataset corresponding to a more accurate R10.4 chemistry (<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#Sec2\">Methods<\/a>).<\/p>\n<p>To mimic real-life datasets containing unannotated transcripts, we arbitrarily removed 5,311 (15%) of 35,684 expressed isoforms (the ones contributing to at least one read during the simulation) from the GENCODE<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 21\" title=\"Frankish, A. et al. GENCODE 2021. Nucleic Acids Res. 49, D916\u2013D923 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR21\" id=\"ref-link-section-d324223e603\">21<\/a><\/sup> gene annotation. These 5,311 hidden transcripts were further used as a ground truth for novel transcript discovery. The reduced GENCODE annotation was used as an input for all tools. Each output annotation was then separated into a set of known and a set of novel transcripts, which were compared against the respective baselines using gffcompare<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 22\" title=\"Pertea, G. &#038; Pertea, M. GFF utilities: GffRead and GffCompare. F1000Res. 
9, 304 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR22\" id=\"ref-link-section-d324223e607\">22<\/a><\/sup> (<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#Sec2\">Methods<\/a>).<\/p>\n<p>For known transcripts, IsoQuant has a higher F1-score (the harmonic mean of precision and recall) than TALON<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 7\" title=\"Wyman, D. et al. A technology-agnostic long-read analysis pipeline for transcriptome discovery and quantification. Preprint at bioRxiv \n                https:\/\/doi.org\/10.1101\/672931\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR7\" id=\"ref-link-section-d324223e617\">7<\/a><\/sup>, FLAIR<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 8\" title=\"Tang, A. D. et al. Full-length transcript characterization of SF3B1 mutation in chronic lymphocytic leukemia reveals downregulation of retained introns. Nat. Commun. 11, 1438 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR8\" id=\"ref-link-section-d324223e621\">8<\/a><\/sup>, Bambu<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 11\" title=\"Chen, Y. et al. Context-aware transcript quantification from long read RNA-Seq data. 
Bioconductor \n                https:\/\/doi.org\/10.18129\/B9.bioc.bambu\n                \n               (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR11\" id=\"ref-link-section-d324223e625\">11<\/a><\/sup> and StringTie<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 5\" title=\"Kovaka, S. et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 20, 278 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR5\" id=\"ref-link-section-d324223e629\">5<\/a><\/sup>, but the gains are modest (Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">1<\/a>\u2013<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">3<\/a>). However, IsoQuant produces novel transcripts with a 1.9-fold higher F1-score on ONT R10.4 data compared to the second-best tool, StringTie. In comparison to TALON, FLAIR and Bambu, the improvement in F1-score is even more noticeable (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#Fig1\">1b<\/a>, left). On PacBio data, IsoQuant again shows the best F1-score, but the difference from other tools is smaller than for ONT R10.4 data (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#Fig1\">1b<\/a>, right).<\/p>\n<p>Compared to most tools, IsoQuant\u2019s improvement in F1-score stems primarily from its very high precision for novel transcripts. 
Compared to TALON, FLAIR and StringTie, IsoQuant shows at least a fivefold lower false-positive rate on ONT R10.4 data, while still maintaining slight gains in recall (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#Fig1\">1d<\/a>). The situation is different for Bambu. IsoQuant has higher precision (86.3 versus 69.9%) and, more notably, substantially higher recall: while Bambu only reconstructs 73 out of 5,311 novel isoforms (1% recall), IsoQuant reconstructs 3,848 (62.6%). On ONT R9.4 simulated data IsoQuant similarly shows a notably lower false-positive rate compared to other tools (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">2<\/a>).<\/p>\n<p>On PacBio simulated data, similar trends can be observed for novel transcripts, although with a less drastic difference in specificity. Bambu shows slightly higher precision (95.8%) compared to IsoQuant (94.4%), but again has the lowest recall (18.7% for Bambu versus 76.8% for IsoQuant). StringTie, TALON and FLAIR again predict transcripts with comparable recall, but have at least a fivefold higher false-positive rate compared to IsoQuant (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#Fig1\">1e<\/a>; a detailed analysis of the false-positive transcripts is provided in Supplementary Note <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">8<\/a>).<\/p>\n<p>Further, we measured precision and recall for novel transcripts with different expression levels (Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#Fig1\">1c<\/a> and Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">1<\/a>). While all tools tend to show lower recall and precision for lowly expressed transcripts, IsoQuant yields highly specific transcript models (\u226580% precision) and maintains its advantage in novel transcript discovery regardless of the expression levels. Thus, IsoQuant is likely to be highly useful across many genes, including but not limited to lowly expressed long noncoding RNAs and marker genes of cell types.<\/p>\n<p>Among the five listed methods, only StringTie and IsoQuant support annotation-free transcript discovery. Thus, we compared these two tools on the same simulated datasets used above without providing any annotation (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">4<\/a>). On PacBio data both tools yield highly accurate transcript models. On ONT data StringTie shows higher recall, while IsoQuant generates transcripts with substantially lower false-positive rates (2.5-fold decrease for ONT R10.4 dataset and 3.7-fold for ONT R9.4). 
While the overall quality of transcripts discovered in reference-based mode is indeed higher than in annotation-free runs, the precision and recall for novel transcripts appear to be rather similar in both modes.<\/p>\n<p>To complement our benchmarks on simulated data, we also sequenced Lexogen spike-in RNA variant (SIRV) synthetic molecules on the Oxford Nanopore MinION using ONT R10.4 flowcells (<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#Sec2\">Methods<\/a>). Along with the complete SIRV annotation, Lexogen provides an incomplete annotation, missing 26 out of the total 69 SIRV isoforms, which allows the evaluation of novel transcript discovery, similar to the one we performed for simulated data with the reduced GENCODE annotation.<\/p>\n<p>Results on SIRV sequencing data resemble the ones obtained on simulated reads. When predicting novel isoforms, IsoQuant shows at least a fourfold higher F1-score and an eightfold lower false-positive rate than any other tool. In comparison to most tools, with the exception of TALON, IsoQuant shows high gains in both precision and recall. TALON has a better recall (42.3 versus 38.5%), but IsoQuant has tenfold higher precision (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#Fig2\">2a<\/a>). 
Similar to simulated data, all tools are able to accurately predict SIRV transcripts kept in the annotation, with Bambu, StringTie and IsoQuant having perfect precision for known isoforms alone (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">5<\/a>).<\/p>\n<p>To support our observations, we also applied all tools to the real human ONT complementary DNA (cDNA), ONT direct RNA (dRNA)<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 23\" title=\"Workman, R. E. et al. Nanopore native RNA sequencing of a human poly(A) transcriptome. Nat. Methods 16, 1297\u20131305 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR23\" id=\"ref-link-section-d324223e698\">23<\/a><\/sup> and PacBio public datasets, for which the ground truth is unknown. We used gffcompare to estimate the consistency of predictions by computing the number of identical transcript models reported by the different tools. On the human ONT dRNA dataset, IsoQuant shows the highest percentage of transcripts confirmed by at least three other methods (70.1%), while no other tool surpasses the 40% threshold. This suggests that IsoQuant transcript models are notably more consistent with other methods (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#Fig2\">2b<\/a>, middle). In comparison to the other approaches, IsoQuant also reports the lowest number of transcripts that are not predicted by any other method. If one interprets such transcript models as potential false positives, IsoQuant again stands out with the lowest false-discovery rate (3.5%, 1,162 transcripts). 
In contrast, other tools output annotations containing more than 33% of unconfirmed transcript models (varying from 18,000 to 48,000). Additionally, for each tool we computed the number of potentially missed transcripts that were reported by all other methods. While TALON has the lowest number of such transcripts (75), Bambu shows the second-best results of 1,089 possible false negatives and IsoQuant shows the third-best results of 1,521 such transcripts (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">6<\/a>).<\/p>\n<div data-test=\"figure\" data-container-section=\"figure\" id=\"figure-2\" data-title=\"Characteristics of transcripts obtained from real sequencing data.\">\n<figure><figcaption><b id=\"Fig2\" data-test=\"figure-caption-text\">Fig. 2: Characteristics of transcripts obtained from real sequencing data.<\/b><\/figcaption><div>\n<div><a data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y\/figures\/2\" rel=\"nofollow\"><picture><source type=\"image\/webp\" ><img decoding=\"async\" aria-describedby=\"Fig2\" src=\"http:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41587-022-01565-y\/MediaObjects\/41587_2022_1565_Fig2_HTML.png\" alt=\" figure 2\" loading=\"lazy\" width=\"685\" height=\"180\"><\/picture><\/a><\/div>\n<div data-test=\"bottom-caption\" id=\"figure-2-desc\">\n<p><b>a<\/b>, Precision, recall and F1-score for novel transcripts generated on real SIRV ONT cDNA sequencing data. 
<b>b<\/b>, Consistency of predictions made by different methods on real human ONT cDNA, ONT dRNA and PacBio data.<\/p>\n<p><a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM4\">Source Data<\/a><\/p>\n<\/div>\n<\/div>\n<\/figure>\n<\/div>\n<p>Similar trends can be observed in ONT cDNA and PacBio datasets, although the overall percentage of common transcripts appears to be lower compared to ONT dRNA data (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#Fig2\">2b<\/a>, left and right). IsoQuant again shows the highest fraction of transcripts predicted by at least three other tools (35.6% for ONT cDNA, 55.6% for PacBio), while other programs reach at most 25% and 40%, respectively. All four other tools produce annotations containing a high number of transcripts that are not confirmed by any other method (>\u200950% of all transcripts for ONT cDNA, >\u200930% for PacBio), while IsoQuant\u2019s potential false predictions are below 25% on the ONT cDNA dataset and below 10% on the PacBio dataset.<\/p>\n<p>Although these values cannot be explicitly treated as false positives and false negatives, they suggest that, unlike other tools, IsoQuant produces highly specific annotations that are strongly consistent with transcripts reported by several alternative approaches. 
Moreover, because IsoQuant typically misses very few isoforms predicted by all other tools simultaneously, it is likely to also be highly sensitive (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">6<\/a>, the number of potentially missed transcripts).<\/p>\n<p>Additionally, we used long-read RNA sequencing data from a mouse brain sample, in which a previous study reported 76 novel isoforms of high biological importance<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 24\" title=\"Joglekar, A. et al. A spatially resolved brain region- and cell type-specific isoform atlas of the postnatal mouse brain. Nat. Commun. 12, 463 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR24\" id=\"ref-link-section-d324223e752\">24<\/a><\/sup>, which were confirmed by manual annotation by the GENCODE team. Here, we compared IsoQuant only with StringTie, which has the second-best F1-score across all simulated datasets. On PacBio data, IsoQuant correctly reconstructs 71% of the confirmed novel isoforms, while StringTie recovers approximately half as many (37%) (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">7<\/a>). Similarly, on the single-cell ONT dataset from the same brain sample IsoQuant recovers almost 50% of these 76 novel isoforms, whereas StringTie reports 30%. 
Although specificity cannot be evaluated in this kind of experiment, the result confirms that IsoQuant can maintain high recall values on real sequencing data.<\/p>\n<p>Besides transcript discovery, IsoQuant implements additional functionality, such as read-to-isoform assignment and transcript quantification. Benchmarks of these supplementary features, information on computational performance, as well as IsoQuant results obtained with different spliced aligners can be found in Supplementary Notes <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">2<\/a>\u2013<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">7<\/a>.<\/p>\n<p>In summary, IsoQuant accurately predicts transcript models from PacBio or ONT RNA sequencing data. For known isoforms, IsoQuant has a higher F1-score than the other tested tools, but these differences are not dramatic. For unannotated isoforms, however, IsoQuant provides very strong increases in F1-score over other existing approaches. In comparison to most tools, it achieves this F1-score increase by maintaining higher recall, while substantially increasing precision. Thus, IsoQuant is a valuable tool for predicting novel alternatively spliced isoforms in the age of long-read sequencing.<\/p>\n<\/div>\n<\/div>\n<div id=\"Sec2-section\" data-title=\"Methods\">\n<h2 id=\"Sec2\">Methods<\/h2>\n<div id=\"Sec2-content\">\n<h3 id=\"Sec3\">Sequencing Lexogen SIRV transcripts<\/h3>\n<p>First, total RNA from HeLa cells was extracted using the miRNeasy Tissue\/Cells Advanced Mini Kit (Qiagen, 217604), and polyA transcripts were pulled down using the NEBNext Poly(A) messenger RNA Magnetic Isolation Module (NEB, E7490S). 
Next, the SIRV-Set 4 (Iso Mix E0\/ERCC\/Long SIRVs) (Lexogen, 141.01) was spiked into the RNA and reverse transcribed using the Maxima H Minus Reverse Transcriptase (Thermo Scientific, EP0752). The final concentrations in the reverse transcriptase reaction were as follows: 1.25\u2009ng\u2009\u03bcl<sup>\u22121<\/sup> polyA HeLa RNA, 0.33\u2009ng\u2009\u03bcl<sup>\u22121<\/sup> SIRV-Set 4, 0.5\u2009mM dNTP, 5\u2009\u03bcM dT-VN oligo, 5\u2009\u03bcM TSO, 1\u00d7 reverse transcriptase buffer, 2\u2009U\u2009\u03bcl<sup>\u22121<\/sup> RiboLock RNase Inhibitor (Thermo Scientific, EO0382) and 20\u2009U\u2009\u03bcl<sup>\u22121<\/sup> Maxima H Minus Reverse Transcriptase. The reaction was incubated for 30\u2009min at 50\u2009\u00b0C and 5\u2009min at 85\u2009\u00b0C. Then, 5\u2009\u03bcl of the reverse transcriptase reaction was amplified using the Platinum Superfi II Mastermix (ThermoFisher, 12368010) for 12 cycles, according to the manufacturer\u2019s instructions and using Forward- and Reverse-Amplification primers. Finally, the cDNA was cleaned up using SPRIselect beads at a 0.8\u00d7 ratio (Beckman Coulter, B23318) and used as input for Oxford Nanopore Technology sequencing with both the Kit 12 (SQK-LSK110 kit and FLO-MIN106D flowcells) and Q20+ (SQK-LSK112 kit and FLO-MIN112 flowcells) chemistries. Both were run for 72\u2009h and basecalled using the Super Accuracy model.<\/p>\n<h3 id=\"Sec4\">Data simulation<\/h3>\n<p>To simulate PacBio circular consensus sequencing (CCS) reads we used IsoSeqSim (<a href=\"https:\/\/github.com\/yunhaowang\/IsoSeqSim\">https:\/\/github.com\/yunhaowang\/IsoSeqSim<\/a>), which generates a read by truncating a transcript sequence according to given probabilities and randomly inserts sequencing errors at a specified rate with uniform distribution. As reported in previous studies<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 25\" title=\"Ono, Y. 
et al. PBSIM: PacBio reads simulator\u2014toward accurate genome assembly. Bioinformatics 29, S119\u2013S121 (2013).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR25\" id=\"ref-link-section-d324223e807\">25<\/a><\/sup>, a uniform error distribution is a realistic model for PacBio CCS reads. Here we used 5\u2032 and 3\u2032 truncation probabilities typical for PacBio Sequel II (provided within the package) and an overall error rate of 1.6%: 0.6% deletions, 0.6% insertions and 0.4% substitutions. While these discrepancies do not necessarily represent sequencing errors, they must nevertheless be modeled, as they can confuse transcript reconstruction. The above values were obtained by mapping real PacBio CCS reads to the reference genome<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Pardo-Palacios, F. et al. Systematic assessment of long-read RNA-seq methods for transcript identification and quantification. Preprint at \n                https:\/\/doi.org\/10.21203\/rs.3.rs-777702\/v1\n                \n               (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR18\" id=\"ref-link-section-d324223e811\">18<\/a><\/sup>.<\/p>\n<p>ONT reads were simulated with the NanoSim software in the transcriptome mode<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 20\" title=\"Hafezqorani, S. et al. Trans-NanoSim characterizes and simulates nanopore RNA-sequencing data. Gigascience 9, giaa061 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR20\" id=\"ref-link-section-d324223e818\">20<\/a><\/sup>. NanoSim is designed specifically for simulating ONT-specific sequencing errors and biases. 
It first constructs error-profile and length-distribution models, which are further used to mutate reference transcript sequences. We trained the model using the ONT R10.4 sequencing data (average error rate of 2.8%: 0.7% deletions, 1.1% insertions and 1% substitutions). To simulate ONT R9.4 chemistry, we used a pretrained model provided within the NanoSim package, which was obtained using publicly available ONT cDNA data<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 23\" title=\"Workman, R. E. et al. Nanopore native RNA sequencing of a human poly(A) transcriptome. Nat. Methods 16, 1297\u20131305 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR23\" id=\"ref-link-section-d324223e822\">23<\/a><\/sup> from the NA12878 human cell line and has an average error rate of 15.9%: 6% deletions, 5.1% insertions and 4.8% substitutions. In addition, we turned off the simulation of intron retention events and random unaligned reads representing the background noise.<\/p>\n<p>However, additional analysis of the simulated ONT data and NanoSim code revealed that NanoSim randomly selects a start position of a read in a transcript sequence with a uniform distribution, thus introducing no 5\u2032 or 3\u2032 bias. To simulate more realistic ONT reads, we aligned real ONT cDNA data obtained from the mouse brain sample to the reference transcriptome using minimap2 and derived empirical truncation probability distributions on both 5\u2032 and 3\u2032 ends. Further, we changed the NanoSim source code to enable sequence truncation with respect to obtained probabilities (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">2<\/a>). 
The modified version is available at <a href=\"https:\/\/github.com\/andrewprzh\/lrgasp-simulation\">https:\/\/github.com\/andrewprzh\/lrgasp-simulation<\/a>.<\/p>\n<p>For both ONT and PacBio simulation we used Mouse GENCODE v.26 and Human GENCODE v.36 basic annotations<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 21\" title=\"Frankish, A. et al. GENCODE 2021. Nucleic Acids Res. 49, D916\u2013D923 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR21\" id=\"ref-link-section-d324223e842\">21<\/a><\/sup>. Before simulation, we also attached a 30\u2009base pair (bp) polyA tail to every transcript sequence. To simulate realistic mouse data, a transcript expression profile was obtained using PacBio data from a mouse brain sample<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 24\" title=\"Joglekar, A. et al. A spatially resolved brain region- and cell type-specific isoform atlas of the postnatal mouse brain. Nat. Commun. 12, 463 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR24\" id=\"ref-link-section-d324223e846\">24<\/a><\/sup>. For human data, a gene expression profile was computed with PacBio GM12878 data. A complete description of every dataset used in this study is provided in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">8<\/a>.<\/p>\n<h3 id=\"Sec5\">Quality evaluation of predicted novel transcripts<\/h3>\n<p>To mimic real-life situations and assess the ability of an algorithm to predict novel transcripts, we created reduced gene annotations by removing a fraction of expressed isoforms. 
First, we define a subset of true expressed transcripts that contributed to at least one read during the simulation. Among this set, we select a fraction of transcripts to be excluded from the annotation. These transcripts are denoted as the true novel isoforms. The remaining transcripts (among the expressed) are defined as true known isoforms. To create a reduced gene annotation, we remove all true novel isoforms from the comprehensive GENCODE annotation. Here we created a reduced mouse annotation with 15% of expressed transcripts removed, and four human reduced annotations with 10, 15, 20 and 25% of excluded expressed isoforms (Supplementary Note <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">2<\/a>).<\/p>\n<p>To evaluate a transcript prediction tool, we provided the entire set of simulated reads and the reduced annotation as an input. Thus, true novel isoforms are hidden from the annotation, but present in the reads. We then compute precision and recall by running gffcompare<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 22\" title=\"Pertea, G. &#038; Pertea, M. GFF utilities: GffRead and GffCompare. F1000Res. 9, 304 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR22\" id=\"ref-link-section-d324223e867\">22<\/a><\/sup> for (1) the entire output annotation versus the complete set of expressed transcripts, (2) reported known isoforms versus the set of true known isoforms and (3) predicted novel transcript models versus the true novel set. The information on whether a transcript is known or novel is obtained from the output GTF file. 
The script for computing these metrics can be found in the IsoQuant repository in misc\/reduced_db_gffcompare.py.<\/p>\n<p>For the annotation-free benchmarks we simply compared the entire output annotation with the true set of expressed isoforms using gffcompare.<\/p>\n<p>To estimate how recall and precision of novel transcripts depend on the expression levels, predicted transcripts are grouped into bins by their transcripts per million (TPM) values. Computing recall requires the number of false-negative calls (undetected transcripts) in each TPM bin. We thus group transcripts by the TPM values used during the simulation. However, computing precision requires the number of false-positive predictions within each bin and thus only reported TPM values can be used (the true TPM for a false prediction is 0). Consequently, the same transcript may fall into different bins when benchmarking different tools. Although it is not possible to compute precision and recall exactly for an arbitrary TPM range, the bias has a minor effect as only a small number of bins (five) was used in this experiment. Therefore, despite being imperfect, these estimations can provide additional insights on whether a transcript discovery method has any bias toward highly or lowly expressed isoforms.<\/p>\n<p>To evaluate SIRV transcripts we used an incomplete SIRV annotation containing only 43 out of 69 SIRV transcripts. The output annotations were again split into known and novel transcripts, and compared against the respective reference set using gffcompare. The SIRV-Set 4 annotations are available at <a href=\"https:\/\/www.lexogen.com\/sirvs\/download\/\">https:\/\/www.lexogen.com\/sirvs\/download\/<\/a>.<\/p>\n<h3 id=\"Sec6\">Estimating consistency between annotations<\/h3>\n<p>Consistency between transcripts generated on real data was estimated using gffcompare (without providing a reference annotation). 
Based on gffcompare output, for each tool we computed how many of its transcripts are supported by (1) all four other tools, (2) exactly three other tools, (3) one or two other tools and (4) no other tool (possible false predictions). We also counted the number of potentially missed transcripts that were reported by all methods except the one being evaluated (possible false negatives). This approach is implemented in misc\/denovo_model_stats.py.<\/p>\n<h3 id=\"Sec7\">Command line options<\/h3>\n<p>For PacBio data minimap2 was launched with the \u2018splice:hq\u2019 preset; for ONT data we used <i>k<\/i>-mer size 14 with the usual \u2018splice\u2019 preset. We also provided annotated splice junctions in BED format as an input. In each experiment, all tools were provided with the same BAM file and the same reference annotation. IsoQuant was launched with the default parameters, setting the appropriate data type via the \u2018--data_type\u2019 option. StringTie2 was launched with the \u2018-L\u2019 option. All other tools were run with the default parameters using 20 threads. In contrast to all other tools, Bambu outputs all reference transcripts, including unexpressed ones. Thus, we filtered out all transcripts with read count values <1 from the Bambu output. As recommended in the user manual, we also ran TALON using preliminary alignment correction with TranscriptClean<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 26\" title=\"Wyman, D. &#038; Mortazavi, A. TranscriptClean: variant-aware correction of indels, mismatches and splice junctions in long-read transcripts. Bioinformatics 35, 340\u2013342 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR26\" id=\"ref-link-section-d324223e908\">26<\/a><\/sup> (<a href=\"https:\/\/github.com\/mortazavilab\/TALON\">https:\/\/github.com\/mortazavilab\/TALON<\/a>). 
However, as the results with and without correction were almost identical, we decided to use the annotations obtained from raw data for a fair comparison. Complete information on all options and software versions is provided in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">9<\/a>.<\/p>\n<h3 id=\"Sec8\">IsoQuant algorithm<\/h3>\n<p>To process long RNA reads, IsoQuant requires a reference genome and, optionally, a corresponding gene annotation. If the reads are provided in the FASTQ format, IsoQuant maps them to the reference with minimap2 in splice mode<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 2\" title=\"Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094\u20133100 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR2\" id=\"ref-link-section-d324223e930\">2<\/a><\/sup>. Alternatively, a user may provide a sorted and indexed BAM file generated with a spliced aligner of their choice. If the reference annotation is provided, the IsoQuant algorithm includes four main steps: (1) assigning mapped reads to known isoforms, (2) transcript quantification, (3) alignment correction and (4) transcript model construction. In the annotation-free mode, the pipeline simply proceeds to the transcript discovery step. Below, we describe the key aspects of all four procedures.<\/p>\n<h3 id=\"Sec9\">Assigning long reads to known isoforms<\/h3>\n<p>The algorithm for assigning long reads to annotated isoforms is based on intron-chain matching and detecting exonic overlaps. 
To assign reads, IsoQuant processes each gene individually by extracting reads that map to the respective region from the sorted BAM file.<\/p>\n<p>IsoQuant first processes the annotation to construct splice junction and exon profiles of all known isoforms. A set of annotated splice junctions in the gene is sorted according to their coordinates in the genome and enumerated from 1 to <i>N<\/i>. Thus, an annotated isoform can be represented as a vector of length <i>N<\/i>, in which the element at position <i>i<\/i> is set to 1 if this isoform includes the <i>i<\/i>th splice junction and \u22121 otherwise (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">3a<\/a>). This vector is henceforth referred to as an isoform splice junction profile. The exon profile is constructed in a similar manner: all annotated exons are first split into a minimal set of <i>M<\/i> nonoverlapping fragments, such that every exon can be represented as their combination, and these exonic fragments are sorted and enumerated. The exon profile for an annotated isoform is similarly denoted as a vector of length <i>M<\/i>, where the <i>i<\/i>th element is set to 1 if this isoform contains the <i>i<\/i>th exon fragment and \u22121 otherwise (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">3b<\/a>).<\/p>\n<p>To assign a read to an annotated isoform, each splice junction from the alignment is matched against annotated splice junctions from the current gene and a read splice junction profile is constructed (also a vector of length <i>N<\/i>). 
In this vector the <i>i<\/i>th element is set to 1 if the annotated splice junction with index <i>i<\/i> matches a splice junction from the read, \u22121 if it is overlapped or spanned by the read, but no match is detected, and 0 otherwise. A zero value indicates that the splice junction is located outside the alignment region and therefore no information can be derived, for example due to read truncation. Similarly, the exon profile of the read is constructed based on the <i>M<\/i> exonic fragments described above: 1 indicates that the respective exonic fragment is overlapped, \u22121 means it is spanned and 0 is set for exonic fragments outside the alignment region (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">4<\/a>).<\/p>\n<p>Due to sequencing errors, an aligner may detect splice site positions inaccurately<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 19\" title=\"Mikheenko, A., Prjibelski, A. D., Joglekar, A. &#038; Tilgner, H. U. Sequencing of individual barcoded cDNAs using Pacific Biosciences and Oxford Nanopore Technologies reveals platform-specific error patterns. Genome Res. 32, 726\u2013737 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR19\" id=\"ref-link-section-d324223e997\">19<\/a><\/sup>. To avoid treating such shifted splice sites as alternative or novel, the algorithm allows a small difference <i>\u0394<\/i> between annotated and alignment splice site coordinates when matching splice junctions. 
Formally speaking, an annotated splice junction (<i>x<\/i><sub>1<\/sub>, <i>x<\/i><sub>2<\/sub>) matches a read splice junction (<i>y<\/i><sub>1<\/sub>, <i>y<\/i><sub>2<\/sub>) if |<i>x<\/i><sub>1<\/sub>\u2009\u2212\u2009<i>y<\/i><sub>1<\/sub>|\u2009\u2264\u2009<i>\u0394<\/i> and |<i>x<\/i><sub>2<\/sub>\u2009\u2212\u2009<i>y<\/i><sub>2<\/sub>|\u2009\u2264\u2009<i>\u0394<\/i>. The default <i>\u0394<\/i> value varies for different types of input data: 4\u2009bp used for PacBio CCS reads and 6\u2009bp for ONT reads (can be set manually). Although an aligned read can be assigned to an isoform by simply comparing its intron chain and exonic coordinates to the annotation, vectorizing the alignment as described above allows one to easily implement inexact splice site comparison with a delta, and quickly detect candidate isoforms for read assignment.<\/p>\n<p>Further, to assign a read to an isoform, its exon and splice junction profiles are matched against the respective profiles of the annotated isoforms. The distance between two profiles is computed simply as the number of distinct elements in which the read profile has nonzero values. A read is said to be consistent with an isoform if the distances between their exon and splice junction profiles are 0, and the read has no unannotated splice junctions\/exons (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">4<\/a>). When a read is consistent with a single isoform, it is reported as a unique match. When a read is consistent with multiple isoforms simultaneously, it is classified as ambiguous, which may happen, for example, due to read truncation. If a read contains unannotated splice junctions\/exons, or its profiles are not consistent with any isoform, it is marked as inconsistent. 
For such alignments IsoQuant reports the most similar reference transcript and detected alternative splicing events.<\/p>\n<p>Some inconsistencies can be, however, caused by misalignments, rather than by real alternative splicing events<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 19\" title=\"Mikheenko, A., Prjibelski, A. D., Joglekar, A. &#038; Tilgner, H. U. Sequencing of individual barcoded cDNAs using Pacific Biosciences and Oxford Nanopore Technologies reveals platform-specific error patterns. Genome Res. 32, 726\u2013737 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR19\" id=\"ref-link-section-d324223e1057\">19<\/a><\/sup>: (1) skipped short exons, (2) intron shifts exceeding <i>\u0394<\/i>\u2009bp and (3) short unannotated exons at transcript ends (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">5<\/a>). If an inconsistent alignment contains only these types of discrepancy, the read is reclassified as conditionally consistent.<\/p>\n<h3 id=\"Sec10\">Transcript quantification<\/h3>\n<p>Once long reads are assigned to annotated isoforms, quantification becomes straightforward. Uniquely assigned reads are counted as a single detected transcript, while ambiguous reads are treated as multi-mappers and contribute to multiple assigned isoforms with lower weight. A transcript is reported as expressed only if it has at least one uniquely assigned read. Inconsistent reads are considered potential novel isoforms and are ignored during the quantification step. 
Besides genes and transcripts, IsoQuant can also count inclusion and exclusion abundances for separate exons and introns, which can be useful for computing percentage spliced-in values.<\/p>\n<p>IsoQuant implements additional functionality for barcoded long RNA reads, for example barcoded by single cell or spatial location<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 24\" title=\"Joglekar, A. et al. A spatially resolved brain region- and cell type-specific isoform atlas of the postnatal mouse brain. Nat. Commun. 12, 463 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR24\" id=\"ref-link-section-d324223e1078\">24<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 27\" title=\"Gupta, I. et al. Single-cell isoform RNA sequencing characterizes isoforms in thousands of cerebellar cells. Nat. Biotechnol. 36, 1197\u20131202 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR27\" id=\"ref-link-section-d324223e1081\">27<\/a><\/sup>. A user can provide information on how the reads are grouped, for example, as a TSV file that indicates a barcode or a cell type of origin for every read. Isoform and gene abundances are then calculated for every read group separately, which can facilitate an expression comparison between different groups or cell types.<\/p>\n<h3 id=\"Sec11\">Spliced alignment correction<\/h3>\n<p>IsoQuant corrects each uniquely assigned read individually. If a read contains misalignments described above (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">5<\/a>) or its intron chain is not identical to the intron chain of the assigned isoform, the alignment is corrected as follows. 
Short skipped exons are restored according to the annotation and minor splice junction shifts are replaced with the respective splice junctions from the assigned transcript. Unannotated terminal microexons are simply removed from the alignment. Finally, any unannotated splice site is substituted with the nearest site from the assigned transcript if (1) these splice sites are located within <i>\u0394<\/i>\u2009bp and (2) the read alignment contains sequencing errors near this splice site. Coordinates of corrected alignments are then saved in BED12 format.<\/p>\n<h3 id=\"Sec12\">Transcript model construction<\/h3>\n<p>The transcript reconstruction procedure implemented in IsoQuant includes four steps: (1) intron graph construction from read alignments, (2) intron graph simplification, (3) attaching terminal vertices and (4) construction of paths representing full-length transcripts. This stage does not require any information on reference transcripts and thus can be used for both de novo and annotation-based transcript discovery. Below we provide a detailed description of all algorithms and the intuition behind them.<\/p>\n<h4 id=\"Sec13\">Intron graph construction<\/h4>\n<p>To construct transcript models, IsoQuant implements a concept of an intron graph, which was influenced by the previously designed splice graph approach<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 28\" title=\"Heber, S. et al. Splicing graphs and EST assembly problem. Bioinformatics 18, S181\u2013S188 (2002).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR28\" id=\"ref-link-section-d324223e1114\">28<\/a><\/sup>, used, for example, in StringTie<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 5\" title=\"Kovaka, S. et al. 
Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 20, 278 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR5\" id=\"ref-link-section-d324223e1118\">5<\/a><\/sup>. For a given set of transcripts, an intron graph is constructed as follows. First, we define internal vertices as a set of all splice junctions from all transcripts. Thus, each vertex represents a pair of splice sites (donor and acceptor) or, more formally, an ordered pair of coordinates in the genome. Two vertices are connected with a directed edge if the respective splice junctions are consecutive in any transcript. Finally, for every first or last splice junction in a transcript, the corresponding vertex is connected with a terminating vertex that represents the transcript start and end positions (formally, a single integer). The intron graph is a directed acyclic graph since every edge connects only consecutive elements. Each transcript can now be represented as a path in the graph that traverses from the initial to the terminal vertex, where internal vertices denote its intron chain (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">6a<\/a>).<\/p>\n<p>The described approach can be used to construct an intron graph from read alignments. Similarly to the read-to-isoform assignment procedure, the genes are processed by IsoQuant individually. First, the algorithm constructs a set of internal vertices corresponding to splice junctions from the selected alignments. Two vertices are likewise connected when the respective splice junctions are consecutive in any read alignment. Due to the presence of inexactly detected splice sites, which may remain even after the alignment correction, such a graph may contain false vertices and connections. 
These false nodes typically form topological patterns, such as tips and bulges. A tip is defined as a dead end (dead start) edge that has a starting (ending) vertex with outdegree (indegree) at least 2. A bulge consists of two alternative paths having the same start and end vertices (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">6b<\/a>). Similar patterns are also typical for de Bruijn graphs, which are used for short read assembly, where bulges and tips are caused by sequencing errors. To remove tips and bulges, assemblers exploit various techniques broadly called graph simplification<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 29\" title=\"Zerbino, D. R. &#038; Birney, E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18, 821\u2013829 (2008).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR29\" id=\"ref-link-section-d324223e1131\">29<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 30\" title=\"Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 
19, 455\u2013477 (2012).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR30\" id=\"ref-link-section-d324223e1134\">30<\/a><\/sup>.<\/p>\n<h4 id=\"Sec14\">Intron graph simplification<\/h4>\n<p>Here we implement a graph simplification procedure based on the following observations: (1) a false splice junction is typically unannotated, (2) splice site shifts that cause a false intron are short and (3) the number of reads supporting the correct splice junction often exceeds read support of a false one. 
Formally, a bulge or tip is removed from the graph if it represents an unannotated splice junction whose read support is at most half that of the alternative vertex and whose splice sites lie within 20\u2009bp (10\u2009bp for PacBio) of those of the alternative vertex. In other cases, when an unannotated splice junction has high read support or no similar splice junction exists, a bulge or a tip is likely to represent a part of a novel isoform and thus should be preserved (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">6b<\/a>). Although intron graph simplification strongly resembles naive splice junction clustering, it has an important difference: a splice junction is removed based not only on its own properties, such as splice site positions and read support, but also on the graph topology, which takes adjacent splice junctions into account. Such a method allows one to, for example, preserve similar splice junctions from distinct isoforms. It is worth noting that the simplification procedure keeps track of all collapsed tips and bulges, so that alignments containing removed splice junctions can still be traversed through the graph later.<\/p>\n<h4 id=\"Sec15\">Collecting terminal positions<\/h4>\n<p>After the graph is simplified, the algorithm proceeds to attach starting and terminal vertices. In contrast to annotated transcripts, read alignments do not provide the exact terminal positions, as their sequences can be truncated. Thus, to avoid having an excessive number of terminal vertices, terminal positions are detected using the heuristics presented below. 
Without loss of generality here we assume that the gene of interest is on the forward strand and polyA tails are on the right.<\/p>\n<p>For every splice junction <i>V<\/i> in the graph, the algorithm selects only read alignments that contain <i>V<\/i> as a terminal splice junction and processes them as follows. First, the polyA sites are collected and clustered. Clustered polyA positions {<i>p<\/i><sub>1<\/sub>, \u2026, <i>p<\/i><sub>k<\/sub>} are added to the graph as terminal vertices and connected to vertex <i>V<\/i> (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">7a<\/a>). Further, the algorithm adds the rightmost non-polyA terminal position <i>P<\/i> as a terminal vertex if one of the conditions is satisfied: (1) <i>V<\/i> has no outgoing edges, (2) <i>V<\/i> has an outgoing edge to a splice junction (<i>u<\/i><sub>1<\/sub>, <i>u<\/i><sub>2<\/sub>) and <i>P<\/i>\u2009>\u2009<i>u<\/i><sub>1<\/sub>\u2009+\u2009<i>\u0394<\/i> or (3) <i>V<\/i> has adjacent polyA vertices {<i>p<\/i><sub>1<\/sub>, \u2026, <i>p<\/i><sub><i>k<\/i><\/sub>} and <i>P<\/i>\u2009>\u2009max(<i>p<\/i><sub>1<\/sub>, \u2026, <i>p<\/i><sub><i>k<\/i><\/sub>)\u2009+\u2009<i>\u0394<\/i> (where <i>\u0394<\/i> is the parameter defined above). Thus, a non-polyA terminal position can only be attached if it is located to the right of adjacent exons or polyA vertices. Starting positions are collected in a similar manner, but without looking for polyA sites (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">7b<\/a>). The described approach, however, may lose information when several isoforms share the same starting splice junction but have distinct transcription start and end sites. 
Thus, we also apply an additional transcript correction step, which is described below.<\/p>\n<h4 id=\"Sec16\">Transcript discovery via path construction<\/h4>\n<p>Once the intron graph is constructed and simplified, IsoQuant detects full-length paths that connect starting and terminal vertices. Paths entirely supported by at least a single read alignment (that is, a full-splice match) are marked as transcript prediction candidates (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">7c<\/a>). To filter out unreliable novel transcripts, IsoQuant applies read support cutoffs: at least five full-splice match reads (three for PacBio) and at least 2% of the maximum graph coverage. Since some isoforms may not have a full-splice matching alignment, IsoQuant also reports known transcripts that (1) have at least one uniquely assigned read and (2) can be traversed through the intron graph. It also reports known mono-exonic transcripts that have (1) a uniquely assigned read and (2) a confirmed polyA site.<\/p>\n<p>To correct terminal positions of a novel transcript, the algorithm selects all alignments consistent with this transcript and uses them to extract terminal positions using the approach described above (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">7d<\/a>). In contrast to detecting terminal vertices for the entire graph, where all alignments are used, the subset of consistent reads likely belongs specifically to this isoform and thus provides correct start and end positions. 
The resulting transcripts are saved in GTF format, along with additional information about transcript types and their reference genes.<\/p>\n<p>Although the previously proposed splice graph structure and the intron graph implemented in this work are both designed to represent alternatively spliced transcripts and are, in general, highly similar, a few differences are worth highlighting. First of all, the splice graph natively supports transcription start and polyA sites as well as mono-exonic transcripts. The intron graph, however, requires the introduction of additional types of \u2018terminal vertex\u2019 that denote transcript start and end positions. At the same time, any exonic overlap between alternative transcripts will lead to a merged node in the splice graph, while the intron graph requires an exact match of both splice sites between two transcripts to form a single connected component. Thus, the intron graph can potentially be less tangled for genes containing multiple alternatively spliced isoforms and, therefore, less complex to traverse. Moreover, the intron graph natively provides information on neighboring splice junctions, which makes it easy to identify incorrect splice sites caused by misalignments and to perform graph simplification. 
While this procedure can definitely be implemented within the splice graph concept, it seems to be more straightforward and native for the intron graph.<\/p>\n<p>To evaluate how different steps of the transcript model construction algorithm affect recall and precision of IsoQuant, we performed a separate experiment described in Supplementary Note <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM1\">1<\/a>.<\/p>\n<h3 id=\"Sec17\">Reporting summary<\/h3>\n<p>Further information on research design is available in the <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#MOESM2\">Nature Portfolio Reporting Summary<\/a> linked to this article.<\/p>\n<\/div>\n<\/div><\/div>\n<div>\n<div id=\"data-availability-section\" data-title=\"Data availability\">\n<h2 id=\"data-availability\">Data availability<\/h2>\n<div id=\"data-availability-content\">\n<p>Nanopore sequencing data obtained from the human NA12878 cell line is available at <a href=\"https:\/\/github.com\/nanopore-wgs-consortium\/NA12878\/blob\/master\/RNA.md\">https:\/\/github.com\/nanopore-wgs-consortium\/NA12878\/blob\/master\/RNA.md<\/a>. PacBio human GM12878 data is available at ENCODE (<a href=\"https:\/\/www.encodeproject.org\/search\">https:\/\/www.encodeproject.org\/search<\/a>) under the accession numbers ENCFF450VAU and ENCFF694DIE. Sequencing data obtained from mouse brain samples is available at NCBI Gene Expression Omnibus (<a href=\"https:\/\/www.ncbi.nlm.nih.gov\/geo\/\">https:\/\/www.ncbi.nlm.nih.gov\/geo\/<\/a>) under accession numbers <a href=\"https:\/\/www.ncbi.nlm.nih.gov\/geo\/query\/acc.cgi?acc=GSE158450\">GSE158450<\/a> and <a href=\"https:\/\/www.ncbi.nlm.nih.gov\/geo\/query\/acc.cgi?acc=GSE178175\">GSE178175<\/a>. 
ONT SIRV data, simulated data and reduced gene annotations are published at <a href=\"https:\/\/zenodo.org\/record\/7121404\">https:\/\/zenodo.org\/record\/7121404<\/a> (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 31\" title=\"Prjibelski, A., Mikheenko, A., Joglekar, A., Jarroux, J. &#038; Tilgner, H. U. Mouse SIRV and simulated data used in the IsoQuant publication. Zenodo \n                https:\/\/doi.org\/10.5281\/zenodo.7121404\n                \n               (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41587-022-01565-y#ref-CR31\" id=\"ref-link-section-d324223e1411\">31<\/a><\/sup>).<\/p>\n<\/p><\/div>\n<\/div>\n<div id=\"code-availability-section\" data-title=\"Code availability\">\n<h2 id=\"code-availability\">Code availability<\/h2>\n<div id=\"code-availability-content\">\n<p>IsoQuant and the supplementary scripts used for the evaluation are available at <a href=\"https:\/\/github.com\/ablab\/IsoQuant\">https:\/\/github.com\/ablab\/IsoQuant<\/a>. Scripts for data simulation are available at <a href=\"https:\/\/github.com\/andrewprzh\/lrgasp-simulation\">https:\/\/github.com\/andrewprzh\/lrgasp-simulation<\/a>.<\/p>\n<\/p><\/div>\n<\/div>\n<div id=\"MagazineFulltextArticleBodySuffix\" aria-labelledby=\"Bib1\" data-title=\"References\">\n<h2 id=\"Bib1\">References<\/h2>\n<div data-container-section=\"references\" id=\"Bib1-content\">\n<ol data-track-component=\"outbound reference\">\n<li data-counter=\"1.\">\n<p id=\"ref-CR1\">Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. 
<i>Bioinformatics<\/i> <b>29<\/b>, 15\u201321 (2013).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/bts635\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbts635\" aria-label=\"Article reference 1\" data-doi=\"10.1093\/bioinformatics\/bts635\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC38XhvV2gsbnF\" aria-label=\"CAS reference 1\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 1\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=STAR%3A%20ultrafast%20universal%20RNA-seq%20aligner&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbts635&#038;volume=29&#038;pages=15-21&#038;publication_year=2013&#038;author=Dobin%2CA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"2.\">\n<p id=\"ref-CR2\">Li, H. Minimap2: pairwise alignment for nucleotide sequences. 
<i>Bioinformatics<\/i> <b>34<\/b>, 3094\u20133100 (2018).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/bty191\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbty191\" aria-label=\"Article reference 2\" data-doi=\"10.1093\/bioinformatics\/bty191\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXhtVamu73J\" aria-label=\"CAS reference 2\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 2\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Minimap2%3A%20pairwise%20alignment%20for%20nucleotide%20sequences&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbty191&#038;volume=34&#038;pages=3094-3100&#038;publication_year=2018&#038;author=Li%2CH\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"3.\">\n<p id=\"ref-CR3\">Liu, B. et al. deSALT: fast and accurate long transcriptomic read alignment with de Bruijn graph-based index. 
<i>Genome Biol.<\/i> <b>20<\/b>, 274 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s13059-019-1895-9\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs13059-019-1895-9\" aria-label=\"Article reference 3\" data-doi=\"10.1186\/s13059-019-1895-9\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 3\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=deSALT%3A%20fast%20and%20accurate%20long%20transcriptomic%20read%20alignment%20with%20de%20Bruijn%20graph-based%20index&#038;journal=Genome%20Biol.&#038;doi=10.1186%2Fs13059-019-1895-9&#038;volume=20&#038;publication_year=2019&#038;author=Liu%2CB\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"4.\">\n<p id=\"ref-CR4\">Sahlin, K. &#038; M\u00e4kinen, V. Accurate spliced alignment of long RNA sequencing reads. 
<i>Bioinformatics<\/i> <b>37<\/b>, 4643\u20134651 (2021).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/btab540\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbtab540\" aria-label=\"Article reference 4\" data-doi=\"10.1093\/bioinformatics\/btab540\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB38Xit1Ohsr4%3D\" aria-label=\"CAS reference 4\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Accurate%20spliced%20alignment%20of%20long%20RNA%20sequencing%20reads&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbtab540&#038;volume=37&#038;pages=4643-4651&#038;publication_year=2021&#038;author=Sahlin%2CK&#038;author=M%C3%A4kinen%2CV\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"5.\">\n<p id=\"ref-CR5\">Kovaka, S. et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. 
<i>Genome Biol.<\/i> <b>20<\/b>, 278 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s13059-019-1910-1\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs13059-019-1910-1\" aria-label=\"Article reference 5\" data-doi=\"10.1186\/s13059-019-1910-1\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXisVSntb3I\" aria-label=\"CAS reference 5\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 5\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Transcriptome%20assembly%20from%20long-read%20RNA-seq%20alignments%20with%20StringTie2&#038;journal=Genome%20Biol.&#038;doi=10.1186%2Fs13059-019-1910-1&#038;volume=20&#038;publication_year=2019&#038;author=Kovaka%2CS\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"6.\">\n<p id=\"ref-CR6\">Tung, L. H., Shao, M. &#038; Kingsford, C. Quantifying the benefit offered by transcript assembly with Scallop-LR on single-molecule long reads. 
<i>Genome Biol.<\/i> <b>20<\/b>, 287 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s13059-019-1883-0\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs13059-019-1883-0\" aria-label=\"Article reference 6\" data-doi=\"10.1186\/s13059-019-1883-0\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXisVyiurrK\" aria-label=\"CAS reference 6\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 6\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Quantifying%20the%20benefit%20offered%20by%20transcript%20assembly%20with%20Scallop-LR%20on%20single-molecule%20long%20reads&#038;journal=Genome%20Biol.&#038;doi=10.1186%2Fs13059-019-1883-0&#038;volume=20&#038;publication_year=2019&#038;author=Tung%2CLH&#038;author=Shao%2CM&#038;author=Kingsford%2CC\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"7.\">\n<p id=\"ref-CR7\">Wyman, D. et al. A technology-agnostic long-read analysis pipeline for transcriptome discovery and quantification. Preprint at <i>bioRxiv<\/i> <a href=\"https:\/\/doi.org\/10.1101\/672931\">https:\/\/doi.org\/10.1101\/672931<\/a> (2020).<\/p>\n<\/li>\n<li data-counter=\"8.\">\n<p id=\"ref-CR8\">Tang, A. D. et al. Full-length transcript characterization of SF3B1 mutation in chronic lymphocytic leukemia reveals downregulation of retained introns. <i>Nat. 
Commun.<\/i> <b>11<\/b>, 1438 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41467-020-15171-6\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41467-020-15171-6\" aria-label=\"Article reference 8\" data-doi=\"10.1038\/s41467-020-15171-6\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3cXlt1Wisbc%3D\" aria-label=\"CAS reference 8\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 8\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Full-length%20transcript%20characterization%20of%20SF3B1%20mutation%20in%20chronic%20lymphocytic%20leukemia%20reveals%20downregulation%20of%20retained%20introns&#038;journal=Nat.%20Commun.&#038;doi=10.1038%2Fs41467-020-15171-6&#038;volume=11&#038;publication_year=2020&#038;author=Tang%2CAD\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"9.\">\n<p id=\"ref-CR9\">Kuo, R. I. et al. Illuminating the dark side of the human transcriptome with long read transcript sequencing. 
<i>BMC Genomics<\/i> <b>21<\/b>, 751 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s12864-020-07123-7\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs12864-020-07123-7\" aria-label=\"Article reference 9\" data-doi=\"10.1186\/s12864-020-07123-7\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3cXitlygtrbF\" aria-label=\"CAS reference 9\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 9\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Illuminating%20the%20dark%20side%20of%20the%20human%20transcriptome%20with%20long%20read%20transcript%20sequencing&#038;journal=BMC%20Genomics&#038;doi=10.1186%2Fs12864-020-07123-7&#038;volume=21&#038;publication_year=2020&#038;author=Kuo%2CRI\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"10.\">\n<p id=\"ref-CR10\">Byrne, A. et al. Nanopore long-read RNAseq reveals widespread transcriptional variation among the surface receptors of individual B cells. <i>Nat. 
Commun.<\/i> <b>8<\/b>, 16027 (2017).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/ncomms16027\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fncomms16027\" aria-label=\"Article reference 10\" data-doi=\"10.1038\/ncomms16027\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2sXht1WitbnJ\" aria-label=\"CAS reference 10\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 10\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Nanopore%20long-read%20RNAseq%20reveals%20widespread%20transcriptional%20variation%20among%20the%20surface%20receptors%20of%20individual%20B%20cells&#038;journal=Nat.%20Commun.&#038;doi=10.1038%2Fncomms16027&#038;volume=8&#038;publication_year=2017&#038;author=Byrne%2CA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"11.\">\n<p id=\"ref-CR11\">Chen, Y. et al. Context-aware transcript quantification from long read RNA-Seq data. <i>Bioconductor<\/i> <a href=\"https:\/\/doi.org\/10.18129\/B9.bioc.bambu\">https:\/\/doi.org\/10.18129\/B9.bioc.bambu<\/a> (2022).<\/p>\n<\/li>\n<li data-counter=\"12.\">\n<p id=\"ref-CR12\">Tardaguila, M. et al. Corrigendum: SQANTI: extensive characterization of long-read transcript sequences for quality control in full-length transcriptome identification and quantification. 
<i>Genome Res.<\/i> <b>28<\/b>, 1096\u20131096 (2018).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1101\/gr.239137.118\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1101%2Fgr.239137.118\" aria-label=\"Article reference 12\" data-doi=\"10.1101\/gr.239137.118\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1cXhsFelu7vE\" aria-label=\"CAS reference 12\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 12\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Corrigendum%3A%20SQANTI%3A%20extensive%20characterization%20of%20long-read%20transcript%20sequences%20for%20quality%20control%20in%20full-length%20transcriptome%20identification%20and%20quantification&#038;journal=Genome%20Res.&#038;doi=10.1101%2Fgr.239137.118&#038;volume=28&#038;pages=1096-1096&#038;publication_year=2018&#038;author=Tardaguila%2CM\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"13.\">\n<p id=\"ref-CR13\">de la Fuente, L. et al. tappAS: a comprehensive computational framework for the analysis of the functional impact of differential splicing. 
<i>Genome Biol.<\/i> <b>21<\/b>, 119 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s13059-020-02028-w\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs13059-020-02028-w\" aria-label=\"Article reference 13\" data-doi=\"10.1186\/s13059-020-02028-w\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 13\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=tappAS%3A%20a%20comprehensive%20computational%20framework%20for%20the%20analysis%20of%20the%20functional%20impact%20of%20differential%20splicing&#038;journal=Genome%20Biol.&#038;doi=10.1186%2Fs13059-020-02028-w&#038;volume=21&#038;publication_year=2020&#038;author=Fuente%2CL\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"14.\">\n<p id=\"ref-CR14\">Reese, F. &#038; Mortazavi, A. Swan: a library for the analysis and visualization of long-read transcriptomes. 
<i>Bioinformatics<\/i> <b>37<\/b>, 1322\u20131323 (2021).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/btaa836\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbtaa836\" aria-label=\"Article reference 14\" data-doi=\"10.1093\/bioinformatics\/btaa836\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3MXhvFGqu7vO\" aria-label=\"CAS reference 14\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 14\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Swan%3A%20a%20library%20for%20the%20analysis%20and%20visualization%20of%20long-read%20transcriptomes&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbtaa836&#038;volume=37&#038;pages=1322-1323&#038;publication_year=2021&#038;author=Reese%2CF&#038;author=Mortazavi%2CA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"15.\">\n<p id=\"ref-CR15\">Stein, A. N., Joglekar, A., Poon, C.-L. &#038; Tilgner, H. U. ScisorWiz: visualizing differential isoform expression in single-cell long-read data. 
<i>Bioinformatics<\/i> <b>38<\/b>, 3474\u20133476 (2022).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/btac340\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbtac340\" aria-label=\"Article reference 15\" data-doi=\"10.1093\/bioinformatics\/btac340\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB38XisVKmtrvM\" aria-label=\"CAS reference 15\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 15\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=ScisorWiz%3A%20visualizing%20differential%20isoform%20expression%20in%20single-cell%20long-read%20data&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbtac340&#038;volume=38&#038;pages=3474-3476&#038;publication_year=2022&#038;author=Stein%2CAN&#038;author=Joglekar%2CA&#038;author=Poon%2CC-L&#038;author=Tilgner%2CHU\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"16.\">\n<p id=\"ref-CR16\">Sahlin, K. &#038; Medvedev, P. Error correction enables use of Oxford Nanopore technology for reference-free transcriptome analysis. <i>Nat. 
Commun.<\/i> <b>12<\/b>, 2 (2021).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41467-020-20340-8\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41467-020-20340-8\" aria-label=\"Article reference 16\" data-doi=\"10.1038\/s41467-020-20340-8\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3MXnsVOjtg%3D%3D\" aria-label=\"CAS reference 16\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 16\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Error%20correction%20enables%20use%20of%20Oxford%20Nanopore%20technology%20for%20reference-free%20transcriptome%20analysis&#038;journal=Nat.%20Commun.&#038;doi=10.1038%2Fs41467-020-20340-8&#038;volume=12&#038;publication_year=2021&#038;author=Sahlin%2CK&#038;author=Medvedev%2CP\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"17.\">\n<p id=\"ref-CR17\">Nip, K. M. et al. RNA-Bloom enables reference-free and reference-guided sequence assembly for single-cell transcriptomes. 
<i>Genome Res.<\/i> <b>30<\/b>, 1191\u20131200 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1101\/gr.260174.119\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1101%2Fgr.260174.119\" aria-label=\"Article reference 17\" data-doi=\"10.1101\/gr.260174.119\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3MXltl2jsA%3D%3D\" aria-label=\"CAS reference 17\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 17\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=RNA-Bloom%20enables%20reference-free%20and%20reference-guided%20sequence%20assembly%20for%20single-cell%20transcriptomes&#038;journal=Genome%20Res.&#038;doi=10.1101%2Fgr.260174.119&#038;volume=30&#038;pages=1191-1200&#038;publication_year=2020&#038;author=Nip%2CKM\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"18.\">\n<p id=\"ref-CR18\">Pardo-Palacios, F. et al. Systematic assessment of long-read RNA-seq methods for transcript identification and quantification. Preprint at <a href=\"https:\/\/doi.org\/10.21203\/rs.3.rs-777702\/v1\">https:\/\/doi.org\/10.21203\/rs.3.rs-777702\/v1<\/a> (2021).<\/p>\n<\/li>\n<li data-counter=\"19.\">\n<p id=\"ref-CR19\">Mikheenko, A., Prjibelski, A. D., Joglekar, A. &#038; Tilgner, H. U. Sequencing of individual barcoded cDNAs using Pacific Biosciences and Oxford Nanopore Technologies reveals platform-specific error patterns. 
<i>Genome Res.<\/i> <b>32<\/b>, 726\u2013737 (2022).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1101\/gr.276405.121\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1101%2Fgr.276405.121\" aria-label=\"Article reference 19\" data-doi=\"10.1101\/gr.276405.121\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 19\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Sequencing%20of%20individual%20barcoded%20cDNAs%20using%20Pacific%20Biosciences%20and%20Oxford%20Nanopore%20Technologies%20reveals%20platform-specific%20error%20patterns&#038;journal=Genome%20Res.&#038;doi=10.1101%2Fgr.276405.121&#038;volume=32&#038;pages=726-737&#038;publication_year=2022&#038;author=Mikheenko%2CA&#038;author=Prjibelski%2CAD&#038;author=Joglekar%2CA&#038;author=Tilgner%2CHU\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"20.\">\n<p id=\"ref-CR20\">Hafezqorani, S. et al. Trans-NanoSim characterizes and simulates nanopore RNA-sequencing data. 
<i>Gigascience<\/i> <b>9<\/b>, giaa061 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/gigascience\/giaa061\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fgigascience%2Fgiaa061\" aria-label=\"Reference 18\"88 data-doi=\"10.1093\/gigascience\/giaa061\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 18\"99 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Trans-NanoSim%20characterizes%20and%20simulates%20nanopore%20RNA-sequencing%20data&#038;journal=Gigascience&#038;doi=10.1093%2Fgigascience%2Fgiaa061&#038;volume=9&#038;publication_year=2020&#038;author=Hafezqorani%2CS\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"21.\">\n<p id=\"ref-CR21\">Frankish, A. et al. GENCODE 2021. <i>Nucleic Acids Res.<\/i> <b>49<\/b>, D916\u2013D923 (2021).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/nar\/gkaa1087\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fnar%2Fgkaa1087\" aria-label=\"Reference 2\"00 data-doi=\"10.1093\/nar\/gkaa1087\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3MXntlejtbY%3D\" aria-label=\"Reference 2\"11>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 2\"22 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=GENCODE%202021&#038;journal=Nucleic%20Acids%20Res.&#038;doi=10.1093%2Fnar%2Fgkaa1087&#038;volume=49&#038;pages=D916-D923&#038;publication_year=2021&#038;author=Frankish%2CA\"><br \/>\n                    
Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"22.\">\n<p id=\"ref-CR22\">Pertea, G. &#038; Pertea, M. GFF utilities: GffRead and GffCompare. <i>F1000Res.<\/i> <b>9<\/b>, 304 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.12688\/f1000research.23297.1\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.12688%2Ff1000research.23297.1\" aria-label=\"Reference 2\"33 data-doi=\"10.12688\/f1000research.23297.1\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 2\"44 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=GFF%20utilities%3A%20GffRead%20and%20GffCompare&#038;journal=F1000Res.&#038;doi=10.12688%2Ff1000research.23297.1&#038;volume=9&#038;publication_year=2020&#038;author=Pertea%2CG&#038;author=Pertea%2CM\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"23.\">\n<p id=\"ref-CR23\">Workman, R. E. et al. Nanopore native RNA sequencing of a human poly(A) transcriptome. <i>Nat. 
Methods<\/i> <b>16<\/b>, 1297\u20131305 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41592-019-0617-2\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41592-019-0617-2\" aria-label=\"Reference 2\"55 data-doi=\"10.1038\/s41592-019-0617-2\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXitFOru7fL\" aria-label=\"Reference 2\"66>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 2\"77 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Nanopore%20native%20RNA%20sequencing%20of%20a%20human%20poly%28A%29%20transcriptome&#038;journal=Nat.%20Methods&#038;doi=10.1038%2Fs41592-019-0617-2&#038;volume=16&#038;pages=1297-1305&#038;publication_year=2019&#038;author=Workman%2CRE\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"24.\">\n<p id=\"ref-CR24\">Joglekar, A. et al. A spatially resolved brain region- and cell type-specific isoform atlas of the postnatal mouse brain. <i>Nat. 
Commun.<\/i> <b>12<\/b>, 463 (2021).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41467-020-20343-5\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41467-020-20343-5\" aria-label=\"Reference 2\"88 data-doi=\"10.1038\/s41467-020-20343-5\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3MXhvFaqsro%3D\" aria-label=\"Reference 2\"99>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 11\"0000 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=A%20spatially%20resolved%20brain%20region-%20and%20cell%20type-specific%20isoform%20atlas%20of%20the%20postnatal%20mouse%20brain&#038;journal=Nat.%20Commun.&#038;doi=10.1038%2Fs41467-020-20343-5&#038;volume=12&#038;publication_year=2021&#038;author=Joglekar%2CA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"25.\">\n<p id=\"ref-CR25\">Ono, Y. et al. PBSIM: PacBio reads simulator\u2014toward accurate genome assembly. 
<i>Bioinformatics<\/i> <b>29<\/b>, S119\u2013S121 (2013).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/bts649\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbts649\" aria-label=\"Reference 11\"0101 data-doi=\"10.1093\/bioinformatics\/bts649\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 11\"0202 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=PBSIM%3A%20PacBio%20reads%20simulator%E2%80%94toward%20accurate%20genome%20assembly&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbts649&#038;volume=29&#038;pages=S119-S121&#038;publication_year=2013&#038;author=Ono%2CY\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"26.\">\n<p id=\"ref-CR26\">Wyman, D. &#038; Mortazavi, A. TranscriptClean: variant-aware correction of indels, mismatches and splice junctions in long-read transcripts. 
<i>Bioinformatics<\/i> <b>35<\/b>, 340\u2013342 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/bty483\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbty483\" aria-label=\"Reference 11\"0303 data-doi=\"10.1093\/bioinformatics\/bty483\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXitVWgs77E\" aria-label=\"Reference 11\"0404>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 11\"0505 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=TranscriptClean%3A%20variant-aware%20correction%20of%20indels%2C%20mismatches%20and%20splice%20junctions%20in%20long-read%20transcripts&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbty483&#038;volume=35&#038;pages=340-342&#038;publication_year=2019&#038;author=Wyman%2CD&#038;author=Mortazavi%2CA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"27.\">\n<p id=\"ref-CR27\">Gupta, I. et al. Single-cell isoform RNA sequencing characterizes isoforms in thousands of cerebellar cells. <i>Nat. 
Biotechnol.<\/i> <b>36<\/b>, 1197\u20131202 (2018).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nbt.4259\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnbt.4259\" aria-label=\"Reference 11\"0606 data-doi=\"10.1038\/nbt.4259\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1cXhvFSqtLzK\" aria-label=\"Reference 11\"0707>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 11\"0808 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Single-cell%20isoform%20RNA%20sequencing%20characterizes%20isoforms%20in%20thousands%20of%20cerebellar%20cells&#038;journal=Nat.%20Biotechnol.&#038;doi=10.1038%2Fnbt.4259&#038;volume=36&#038;pages=1197-1202&#038;publication_year=2018&#038;author=Gupta%2CI\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"28.\">\n<p id=\"ref-CR28\">Heber, S. et al. Splicing graphs and EST assembly problem. 
<i>Bioinformatics<\/i> <b>18<\/b>, S181\u2013S188 (2002).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/18.suppl_1.S181\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2F18.suppl_1.S181\" aria-label=\"Reference 11\"0909 data-doi=\"10.1093\/bioinformatics\/18.suppl_1.S181\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 11\"1010 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Splicing%20graphs%20and%20EST%20assembly%20problem&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2F18.suppl_1.S181&#038;volume=18&#038;pages=S181-S188&#038;publication_year=2002&#038;author=Heber%2CS\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"29.\">\n<p id=\"ref-CR29\">Zerbino, D. R. &#038; Birney, E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. 
<i>Genome Res.<\/i> <b>18<\/b>, 821\u2013829 (2008).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1101\/gr.074492.107\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1101%2Fgr.074492.107\" aria-label=\"Reference 11\"1111 data-doi=\"10.1101\/gr.074492.107\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BD1cXlslChsLg%3D\" aria-label=\"Reference 11\"1212>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 11\"1313 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Velvet%3A%20algorithms%20for%20de%20novo%20short%20read%20assembly%20using%20de%20Bruijn%20graphs&#038;journal=Genome%20Res.&#038;doi=10.1101%2Fgr.074492.107&#038;volume=18&#038;pages=821-829&#038;publication_year=2008&#038;author=Zerbino%2CDR&#038;author=Birney%2CE\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"30.\">\n<p id=\"ref-CR30\">Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. <i>J. Comput. 
Biol.<\/i> <b>19<\/b>, 455\u2013477 (2012).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1089\/cmb.2012.0021\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1089%2Fcmb.2012.0021\" aria-label=\"Reference 11\"1414 data-doi=\"10.1089\/cmb.2012.0021\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC38XmsFOmt7k%3D\" aria-label=\"Reference 11\"1515>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 11\"1616 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=SPAdes%3A%20a%20new%20genome%20assembly%20algorithm%20and%20its%20applications%20to%20single-cell%20sequencing&#038;journal=J.%20Comput.%20Biol.&#038;doi=10.1089%2Fcmb.2012.0021&#038;volume=19&#038;pages=455-477&#038;publication_year=2012&#038;author=Bankevich%2CA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"31.\">\n<p id=\"ref-CR31\">Prjibelski, A., Mikheenko, A., Joglekar, A., Jarroux, J. &#038; Tilgner, H. U. Mouse SIRV and simulated data used in the IsoQuant publication. 
<i>Zenodo<\/i> <a href=\"https:\/\/doi.org\/10.5281\/zenodo.7121404\">https:\/\/doi.org\/10.5281\/zenodo.7121404<\/a> (2022).<\/p>\n<\/li>\n<\/ol>\n<p><a data-track=\"click\" data-track-action=\"download citation references\" data-track-label=\"link\" rel=\"nofollow\" href=\"https:\/\/citation-needed.springer.com\/v2\/references\/10.1038\/s41587-022-01565-y?format=refman&#038;flavour=references\">Download references<\/a><\/p>\n<\/div>\n<\/div>\n<div id=\"Ack1-section\" data-title=\"Acknowledgements\">\n<h2 id=\"Ack1\">Acknowledgements<\/h2>\n<p>We thank Nanopore WGS consortium and Ali Mortazavi\u2019s laboratory at the University of California, Irvine for making the ONT and PacBio data publicly available. This work was supported by St. Petersburg State University, Russia (grant ID no. PURE 93023437 to A.M., A.S., A.L.L. and A.D.P.). Scientific research was performed at the Research Park of St. Petersburg State University Computing Center.<\/p>\n<\/div>\n<div id=\"author-information-section\" aria-labelledby=\"author-information\" data-title=\"Author information\">\n<h2 id=\"author-information\">Author information<\/h2>\n<div id=\"author-information-content\">\n<p><span id=\"author-notes\">Author notes<\/span><\/p>\n<ol>\n<li id=\"na1\">\n<p>These authors contributed equally: Andrey D. Prjibelski, Alla Mikheenko.<\/p>\n<\/li>\n<\/ol>\n<h3 id=\"affiliations\">Authors and Affiliations<\/h3>\n<ol>\n<li id=\"Aff1\">\n<p>Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia<\/p>\n<p>Andrey D. Prjibelski,\u00a0Alla Mikheenko\u00a0&#038;\u00a0Alla L. Lapidus<\/p>\n<\/li>\n<li id=\"Aff2\">\n<p>Department of Computer Science, University of Helsinki, Helsinki, Finland<\/p>\n<p>Andrey D. 
Prjibelski<\/p>\n<\/li>\n<li id=\"Aff3\">\n<p>Tri-Institutional Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA<\/p>\n<p>Anoushka Joglekar<\/p>\n<\/li>\n<li id=\"Aff4\">\n<p>Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY, USA<\/p>\n<p>Anoushka Joglekar,\u00a0Julien Jarroux\u00a0&#038;\u00a0Hagen U. Tilgner<\/p>\n<\/li>\n<li id=\"Aff5\">\n<p>Center for Neurogenetics, Weill Cornell Medicine, New York, NY, USA<\/p>\n<p>Anoushka Joglekar,\u00a0Julien Jarroux\u00a0&#038;\u00a0Hagen U. Tilgner<\/p>\n<\/li>\n<li id=\"Aff6\">\n<p>Bioinformatics Institute, St. Petersburg, Russia<\/p>\n<p>Alexander Smetanin<\/p>\n<\/li>\n<\/ol>\n<h3 id=\"contributions\">Contributions<\/h3>\n<p>A.D.P., A.M. and A.S. designed and implemented the software. A.D.P., A.M. and A.J. performed the benchmarks. J.J. performed the sequencing experiments. A.D.P. and A.L.L. acquired funding. H.U.T. suggested the project. A.L.L. and H.U.T. supervised the project. A.D.P., A.M., A.J. and H.U.T. wrote the manuscript.<\/p>\n<h3 id=\"corresponding-author\">Corresponding authors<\/h3>\n<p id=\"corresponding-author-list\">Correspondence to<br \/>\n                <a id=\"corresp-c1\" href=\"http:\/\/www.nature.com\/mailto:an********@***il.com\" data-original-string=\"p+HTyYkdPQBJ0TKbyhhCWg==7f4XJ+hblR4Wk8Jnr6J5oZcH3HQ4512a+715HVVmFGcJJ8=\" title=\"This contact has been encoded by Anti-Spam by CleanTalk. Click to decode. To finish the decoding make sure that JavaScript is enabled in your browser.\">Andrey D. Prjibelski<\/a> or <a id=\"corresp-c2\" href=\"http:\/\/www.nature.com\/mailto:hu*****@*********ll.edu\" data-original-string=\"BUYIytuuWXWB96uU3oVXdg==7f4Si9kg0qPsPBoNX0ZhzbYDEoGvythW1oCNBy19jVmClE=\" title=\"This contact has been encoded by Anti-Spam by CleanTalk. Click to decode. To finish the decoding make sure that JavaScript is enabled in your browser.\">Hagen U. 
Tilgner<\/a>.<\/p>\n<\/div>\n<\/div>\n<div id=\"ethics-section\" data-title=\"Ethics declarations\">\n<h2 id=\"ethics\">Ethics declarations<\/h2>\n<div id=\"ethics-content\">\n<h3 id=\"FPar2\">Competing interests<\/h3>\n<p>The authors declare no competing interests.<\/p>\n<\/p><\/div>\n<\/div>\n<div id=\"peer-review-section\" data-title=\"Peer review\">\n<h2 id=\"peer-review\">Peer review<\/h2>\n<div id=\"peer-review-content\">\n<h3 id=\"FPar1\">Peer review information<\/h3>\n<p><i>Nature Biotechnology<\/i> thanks Heng Li and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.<\/p>\n<\/p><\/div>\n<\/div>\n<div id=\"additional-information-section\" data-title=\"Additional information\">\n<h2 id=\"additional-information\">Additional information<\/h2>\n<p><b>Publisher\u2019s note<\/b> Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.<\/p>\n<\/div>\n<div id=\"Sec19-section\" data-title=\"Supplementary information\">\n<h2 id=\"Sec19\">Supplementary information<\/h2>\n<\/div>\n<div id=\"Sec20-section\" data-title=\"Source data\">\n<h2 id=\"Sec20\">Source data<\/h2>\n<\/div>\n<div id=\"rightslink-section\" data-title=\"Rights and permissions\">\n<h2 id=\"rightslink\">Rights and permissions<\/h2>\n<div id=\"rightslink-content\">\n<p><b>Open Access<\/b>  This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article\u2019s Creative Commons license, unless indicated otherwise in a credit line to the material. 
If material is not included in the article\u2019s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit <a href=\"http:\/\/creativecommons.org\/licenses\/by\/4.0\/\" rel=\"license\">http:\/\/creativecommons.org\/licenses\/by\/4.0\/<\/a>.<\/p>\n<p><a data-track=\"click\" data-track-action=\"view rights and permissions\" data-track-label=\"link\" href=\"https:\/\/s100.copyright.com\/AppDispatchServlet?title=Accurate%20isoform%20discovery%20with%20IsoQuant%20using%20long%20reads&#038;author=Andrey%20D.%20Prjibelski%20et%20al&#038;contentID=10.1038%2Fs41587-022-01565-y&#038;copyright=The%20Author%28s%29&#038;publication=1087-0156&#038;publicationDate=2023-01-02&#038;publisherName=SpringerNature&#038;orderBeanReset=true&#038;oa=CC%20BY\">Reprints and Permissions<\/a><\/p>\n<\/div>\n<\/div>\n<div id=\"article-info-section\" aria-labelledby=\"article-info\" data-title=\"About this article\">\n<h2 id=\"article-info\">About this article<\/h2>\n<div id=\"article-info-content\">\n<p><a data-crossmark=\"10.1038\/s41587-022-01565-y\" target=\"_blank\" rel=\"noopener\" href=\"https:\/\/crossmark.crossref.org\/dialog\/?doi=10.1038\/s41587-022-01565-y\" data-track=\"click\" data-track-action=\"Click Crossmark\" data-track-label=\"link\" data-test=\"crossmark\"><img loading=\"lazy\" decoding=\"async\" width=\"57\" height=\"81\" alt=\" Verify currency and authenticity via CrossMark\" 
src=\"data:image\/svg+xml;base64,PHN2ZyBoZWlnaHQ9IjgxIiB3aWR0aD0iNTciIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyI+PGcgZmlsbD0ibm9uZSIgZmlsbC1ydWxlPSJldmVub2RkIj48cGF0aCBkPSJtMTcuMzUgMzUuNDUgMjEuMy0xNC4ydi0xNy4wM2gtMjEuMyIgZmlsbD0iIzk4OTg5OCIvPjxwYXRoIGQ9Im0zOC42NSAzNS40NS0yMS4zLTE0LjJ2LTE3LjAzaDIxLjMiIGZpbGw9IiM3NDc0NzQiLz48cGF0aCBkPSJtMjggLjVjLTEyLjk4IDAtMjMuNSAxMC41Mi0yMy41IDIzLjVzMTAuNTIgMjMuNSAyMy41IDIzLjUgMjMuNS0xMC41MiAyMy41LTIzLjVjMC02LjIzLTIuNDgtMTIuMjEtNi44OC0xNi42Mi00LjQxLTQuNC0xMC4zOS02Ljg4LTE2LjYyLTYuODh6bTAgNDEuMjVjLTkuOCAwLTE3Ljc1LTcuOTUtMTcuNzUtMTcuNzVzNy45NS0xNy43NSAxNy43NS0xNy43NSAxNy43NSA3Ljk1IDE3Ljc1IDE3Ljc1YzAgNC43MS0xLjg3IDkuMjItNS4yIDEyLjU1cy03Ljg0IDUuMi0xMi41NSA1LjJ6IiBmaWxsPSIjNTM1MzUzIi8+PHBhdGggZD0ibTQxIDM2Yy01LjgxIDYuMjMtMTUuMjMgNy40NS0yMi40MyAyLjktNy4yMS00LjU1LTEwLjE2LTEzLjU3LTcuMDMtMjEuNWwtNC45Mi0zLjExYy00Ljk1IDEwLjctMS4xOSAyMy40MiA4Ljc4IDI5LjcxIDkuOTcgNi4zIDIzLjA3IDQuMjIgMzAuNi00Ljg2eiIgZmlsbD0iIzljOWM5YyIvPjxwYXRoIGQ9Im0uMiA1OC40NWMwLS43NS4xMS0xLjQyLjMzLTIuMDFzLjUyLTEuMDkuOTEtMS41Yy4zOC0uNDEuODMtLjczIDEuMzQtLjk0LjUxLS4yMiAxLjA2LS4zMiAxLjY1LS4zMi41NiAwIDEuMDYuMTEgMS41MS4zNS40NC4yMy44MS41IDEuMS44MWwtLjkxIDEuMDFjLS4yNC0uMjQtLjQ5LS40Mi0uNzUtLjU2LS4yNy0uMTMtLjU4LS4yLS45My0uMi0uMzkgMC0uNzMuMDgtMS4wNS4yMy0uMzEuMTYtLjU4LjM3LS44MS42Ni0uMjMuMjgtLjQxLjYzLS41MyAxLjA0LS4xMy40MS0uMTkuODgtLjE5IDEuMzkgMCAxLjA0LjIzIDEuODYuNjggMi40Ni40NS41OSAxLjA2Ljg4IDEuODQuODguNDEgMCAuNzctLjA3IDEuMDctLjIzcy41OS0uMzkuODUtLjY4bC45MSAxYy0uMzguNDMtLjguNzYtMS4yOC45OS0uNDcuMjItMSAuMzQtMS41OC4zNC0uNTkgMC0xLjEzLS4xLTEuNjQtLjMxLS41LS4yLS45NC0uNTEtMS4zMS0uOTEtLjM4LS40LS42Ny0uOS0uODgtMS40OC0uMjItLjU5LS4zMy0xLjI2LS4zMy0yLjAyem04LjQtNS4zM2gxLjYxdjIuNTRsLS4wNSAxLjMzYy4yOS0uMjcuNjEtLjUxLjk2LS43MnMuNzYtLjMxIDEuMjQtLjMxYy43MyAwIDEuMjcuMjMgMS42MS43MS4zMy40Ny41IDEuMTQuNSAyLjAydjQuMzFoLTEuNjF2LTQuMWMwLS41Ny0uMDgtLjk3LS4yNS0xLjIxLS4xNy0uMjMtLjQ1LS4zNS0uODMtLjM1LS4zIDAtLjU2LjA4LS43OS4yMi0uMjMuMTUtLjQ5LjM2LS43OC42NHY0LjhoLTEuNjF6bTcuMzcgNi40NWMwLS41Ni4wOS0xLjA2LjI2LTEuNTEuMTgtLjQ1LjQyLS4
4My43MS0xLjE0LjI5LS4zLjYzLS41NCAxLjAxLS43MS4zOS0uMTcuNzgtLjI1IDEuMTgtLjI1LjQ3IDAgLjg4LjA4IDEuMjMuMjQuMzYuMTYuNjUuMzguODkuNjdzLjQyLjYzLjU0IDEuMDNjLjEyLjQxLjE4Ljg0LjE4IDEuMzIgMCAuMzItLjAyLjU3LS4wNy43NmgtNC4zNmMuMDcuNjIuMjkgMS4xLjY1IDEuNDQuMzYuMzMuODIuNSAxLjM4LjUuMjkgMCAuNTctLjA0LjgzLS4xM3MuNTEtLjIxLjc2LS4zN2wuNTUgMS4wMWMtLjMzLjIxLS42OS4zOS0xLjA5LjUzLS40MS4xNC0uODMuMjEtMS4yNi4yMS0uNDggMC0uOTItLjA4LTEuMzQtLjI1LS40MS0uMTYtLjc2LS40LTEuMDctLjctLjMxLS4zMS0uNTUtLjY5LS43Mi0xLjEzLS4xOC0uNDQtLjI2LS45NS0uMjYtMS41MnptNC42LS42MmMwLS41NS0uMTEtLjk4LS4zNC0xLjI4LS4yMy0uMzEtLjU4LS40Ny0xLjA2LS40Ny0uNDEgMC0uNzcuMTUtMS4wNy40NS0uMzEuMjktLjUuNzMtLjU4IDEuM3ptMi41LjYyYzAtLjU3LjA5LTEuMDguMjgtMS41My4xOC0uNDQuNDMtLjgyLjc1LTEuMTNzLjY5LS41NCAxLjEtLjcxYy40Mi0uMTYuODUtLjI0IDEuMzEtLjI0LjQ1IDAgLjg0LjA4IDEuMTcuMjNzLjYxLjM0Ljg1LjU3bC0uNzcgMS4wMmMtLjE5LS4xNi0uMzgtLjI4LS41Ni0uMzctLjE5LS4wOS0uMzktLjE0LS42MS0uMTQtLjU2IDAtMS4wMS4yMS0xLjM1LjYzLS4zNS40MS0uNTIuOTctLjUyIDEuNjcgMCAuNjkuMTcgMS4yNC41MSAxLjY2LjM0LjQxLjc4LjYyIDEuMzIuNjIuMjggMCAuNTQtLjA2Ljc4LS4xNy4yNC0uMTIuNDUtLjI2LjY0LS40MmwuNjcgMS4wM2MtLjMzLjI5LS42OS41MS0xLjA4LjY1LS4zOS4xNS0uNzguMjMtMS4xOC4yMy0uNDYgMC0uOS0uMDgtMS4zMS0uMjQtLjQtLjE2LS43NS0uMzktMS4wNS0uN3MtLjUzLS42OS0uNy0xLjEzYy0uMTctLjQ1LS4yNS0uOTYtLjI1LTEuNTN6bTYuOTEtNi40NWgxLjU4djYuMTdoLjA1bDIuNTQtMy4xNmgxLjc3bC0yLjM1IDIuOCAyLjU5IDQuMDdoLTEuNzVsLTEuNzctMi45OC0xLjA4IDEuMjN2MS43NWgtMS41OHptMTMuNjkgMS4yN2MtLjI1LS4xMS0uNS0uMTctLjc1LS4xNy0uNTggMC0uODcuMzktLjg3IDEuMTZ2Ljc1aDEuMzR2MS4yN2gtMS4zNHY1LjZoLTEuNjF2LTUuNmgtLjkydi0xLjJsLjkyLS4wN3YtLjcyYzAtLjM1LjA0LS42OC4xMy0uOTguMDgtLjMxLjIxLS41Ny40LS43OXMuNDItLjM5LjcxLS41MWMuMjgtLjEyLjYzLS4xOCAxLjA0LS4xOC4yNCAwIC40OC4wMi42OS4wNy4yMi4wNS40MS4xLjU3LjE3em0uNDggNS4xOGMwLS41Ny4wOS0xLjA4LjI3LTEuNTMuMTctLjQ0LjQxLS44Mi43Mi0xLjEzLjMtLjMxLjY1LS41NCAxLjA0LS43MS4zOS0uMTYuOC0uMjQgMS4yMy0uMjRzLjg0LjA4IDEuMjQuMjRjLjQuMTcuNzQuNCAxLjA0Ljcxcy41NC42OS43MiAxLjEzYy4xOS40NS4yOC45Ni4yOCAxLjUzcy0uMDkgMS4wOC0uMjggMS41M2MtLjE4LjQ0LS40Mi44Mi0uNzIgMS4xM3MtLjY0LjU0LTEuMDQuNy0uODEuMjQtMS4yNC4
yNC0uODQtLjA4LTEuMjMtLjI0LS43NC0uMzktMS4wNC0uN2MtLjMxLS4zMS0uNTUtLjY5LS43Mi0xLjEzLS4xOC0uNDUtLjI3LS45Ni0uMjctMS41M3ptMS42NSAwYzAgLjY5LjE0IDEuMjQuNDMgMS42Ni4yOC40MS42OC42MiAxLjE4LjYyLjUxIDAgLjktLjIxIDEuMTktLjYyLjI5LS40Mi40NC0uOTcuNDQtMS42NiAwLS43LS4xNS0xLjI2LS40NC0xLjY3LS4yOS0uNDItLjY4LS42My0xLjE5LS42My0uNSAwLS45LjIxLTEuMTguNjMtLjI5LjQxLS40My45Ny0uNDMgMS42N3ptNi40OC0zLjQ0aDEuMzNsLjEyIDEuMjFoLjA1Yy4yNC0uNDQuNTQtLjc5Ljg4LTEuMDIuMzUtLjI0LjctLjM2IDEuMDctLjM2LjMyIDAgLjU5LjA1Ljc4LjE0bC0uMjggMS40LS4zMy0uMDljLS4xMS0uMDEtLjIzLS4wMi0uMzgtLjAyLS4yNyAwLS41Ni4xLS44Ni4zMXMtLjU1LjU4LS43NyAxLjF2NC4yaC0xLjYxem0tNDcuODcgMTVoMS42MXY0LjFjMCAuNTcuMDguOTcuMjUgMS4yLjE3LjI0LjQ0LjM1LjgxLjM1LjMgMCAuNTctLjA3LjgtLjIyLjIyLS4xNS40Ny0uMzkuNzMtLjczdi00LjdoMS42MXY2Ljg3aC0xLjMybC0uMTItMS4wMWgtLjA0Yy0uMy4zNi0uNjMuNjQtLjk4Ljg2LS4zNS4yMS0uNzYuMzItMS4yNC4zMi0uNzMgMC0xLjI3LS4yNC0xLjYxLS43MS0uMzMtLjQ3LS41LTEuMTQtLjUtMi4wMnptOS40NiA3LjQzdjIuMTZoLTEuNjF2LTkuNTloMS4zM2wuMTIuNzJoLjA1Yy4yOS0uMjQuNjEtLjQ1Ljk3LS42My4zNS0uMTcuNzItLjI2IDEuMS0uMjYuNDMgMCAuODEuMDggMS4xNS4yNC4zMy4xNy42MS40Ljg0LjcxLjI0LjMxLjQxLjY4LjUzIDEuMTEuMTMuNDIuMTkuOTEuMTkgMS40NCAwIC41OS0uMDkgMS4xMS0uMjUgMS41Ny0uMTYuNDctLjM4Ljg1LS42NSAxLjE2LS4yNy4zMi0uNTguNTYtLjk0LjczLS4zNS4xNi0uNzIuMjUtMS4xLjI1LS4zIDAtLjYtLjA3LS45LS4ycy0uNTktLjMxLS44Ny0uNTZ6bTAtMi4zYy4yNi4yMi41LjM3LjczLjQ1LjI0LjA5LjQ2LjEzLjY2LjEzLjQ2IDAgLjg0LS4yIDEuMTUtLjYuMzEtLjM5LjQ2LS45OC40Ni0xLjc3IDAtLjY5LS4xMi0xLjIyLS4zNS0xLjYxLS4yMy0uMzgtLjYxLS41Ny0xLjEzLS41Ny0uNDkgMC0uOTkuMjYtMS41Mi43N3ptNS44Ny0xLjY5YzAtLjU2LjA4LTEuMDYuMjUtMS41MS4xNi0uNDUuMzctLjgzLjY1LTEuMTQuMjctLjMuNTgtLjU0LjkzLS43MXMuNzEtLjI1IDEuMDgtLjI1Yy4zOSAwIC43My4wNyAxIC4yLjI3LjE0LjU0LjMyLjgxLjU1bC0uMDYtMS4xdi0yLjQ5aDEuNjF2OS44OGgtMS4zM2wtLjExLS43NGgtLjA2Yy0uMjUuMjUtLjU0LjQ2LS44OC42NC0uMzMuMTgtLjY5LjI3LTEuMDYuMjctLjg3IDAtMS41Ni0uMzItMi4wNy0uOTVzLS43Ni0xLjUxLS43Ni0yLjY1em0xLjY3LS4wMWMwIC43NC4xMyAxLjMxLjQgMS43LjI2LjM4LjY1LjU4IDEuMTUuNTguNTEgMCAuOTktLjI2IDEuNDQtLjc3di0zLjIxYy0uMjQtLjIxLS40OC0uMzYtLjctLjQ1LS4yMy0uMDgtLjQ2LS4xMi0uNy0uMTI
tLjQ1IDAtLjgyLjE5LTEuMTMuNTktLjMxLjM5LS40Ni45NS0uNDYgMS42OHptNi4zNSAxLjU5YzAtLjczLjMyLTEuMy45Ny0xLjcxLjY0LS40IDEuNjctLjY4IDMuMDgtLjg0IDAtLjE3LS4wMi0uMzQtLjA3LS41MS0uMDUtLjE2LS4xMi0uMy0uMjItLjQzcy0uMjItLjIyLS4zOC0uM2MtLjE1LS4wNi0uMzQtLjEtLjU4LS4xLS4zNCAwLS42OC4wNy0xIC4ycy0uNjMuMjktLjkzLjQ3bC0uNTktMS4wOGMuMzktLjI0LjgxLS40NSAxLjI4LS42My40Ny0uMTcuOTktLjI2IDEuNTQtLjI2Ljg2IDAgMS41MS4yNSAxLjkzLjc2cy42MyAxLjI1LjYzIDIuMjF2NC4wN2gtMS4zMmwtLjEyLS43NmgtLjA1Yy0uMy4yNy0uNjMuNDgtLjk4LjY2cy0uNzMuMjctMS4xNC4yN2MtLjYxIDAtMS4xLS4xOS0xLjQ4LS41Ni0uMzgtLjM2LS41Ny0uODUtLjU3LTEuNDZ6bTEuNTctLjEyYzAgLjMuMDkuNTMuMjcuNjcuMTkuMTQuNDIuMjEuNzEuMjEuMjggMCAuNTQtLjA3Ljc3LS4ycy40OC0uMzEuNzMtLjU2di0xLjU0Yy0uNDcuMDYtLjg2LjEzLTEuMTguMjMtLjMxLjA5LS41Ny4xOS0uNzYuMzFzLS4zMy4yNS0uNDEuNGMtLjA5LjE1LS4xMy4zMS0uMTMuNDh6bTYuMjktMy42M2gtLjk4di0xLjJsMS4wNi0uMDcuMi0xLjg4aDEuMzR2MS44OGgxLjc1djEuMjdoLTEuNzV2My4yOGMwIC44LjMyIDEuMi45NyAxLjIuMTIgMCAuMjQtLjAxLjM3LS4wNC4xMi0uMDMuMjQtLjA3LjM0LS4xMWwuMjggMS4xOWMtLjE5LjA2LS40LjEyLS42NC4xNy0uMjMuMDUtLjQ5LjA4LS43Ni4wOC0uNCAwLS43NC0uMDYtMS4wMi0uMTgtLjI3LS4xMy0uNDktLjMtLjY3LS41Mi0uMTctLjIxLS4zLS40OC0uMzctLjc4LS4wOC0uMy0uMTItLjY0LS4xMi0xLjAxem00LjM2IDIuMTdjMC0uNTYuMDktMS4wNi4yNy0xLjUxcy40MS0uODMuNzEtMS4xNGMuMjktLjMuNjMtLjU0IDEuMDEtLjcxLjM5LS4xNy43OC0uMjUgMS4xOC0uMjUuNDcgMCAuODguMDggMS4yMy4yNC4zNi4xNi42NS4zOC44OS42N3MuNDIuNjMuNTQgMS4wM2MuMTIuNDEuMTguODQuMTggMS4zMiAwIC4zMi0uMDIuNTctLjA3Ljc2aC00LjM3Yy4wOC42Mi4yOSAxLjEuNjUgMS40NC4zNi4zMy44Mi41IDEuMzguNS4zIDAgLjU4LS4wNC44NC0uMTMuMjUtLjA5LjUxLS4yMS43Ni0uMzdsLjU0IDEuMDFjLS4zMi4yMS0uNjkuMzktMS4wOS41M3MtLjgyLjIxLTEuMjYuMjFjLS40NyAwLS45Mi0uMDgtMS4zMy0uMjUtLjQxLS4xNi0uNzctLjQtMS4wOC0uNy0uMy0uMzEtLjU0LS42OS0uNzItMS4xMy0uMTctLjQ0LS4yNi0uOTUtLjI2LTEuNTJ6bTQuNjEtLjYyYzAtLjU1LS4xMS0uOTgtLjM0LTEuMjgtLjIzLS4zMS0uNTgtLjQ3LTEuMDYtLjQ3LS40MSAwLS43Ny4xNS0xLjA4LjQ1LS4zMS4yOS0uNS43My0uNTcgMS4zem0zLjAxIDIuMjNjLjMxLjI0LjYxLjQzLjkyLjU3LjMuMTMuNjMuMi45OC4yLjM4IDAgLjY1LS4wOC44My0uMjNzLjI3LS4zNS4yNy0uNmMwLS4xNC0uMDUtLjI2LS4xMy0uMzctLjA4LS4xLS4yLS4yLS4zNC0uMjg
tLjE0LS4wOS0uMjktLjE2LS40Ny0uMjNsLS41My0uMjJjLS4yMy0uMDktLjQ2LS4xOC0uNjktLjMtLjIzLS4xMS0uNDQtLjI0LS42Mi0uNHMtLjMzLS4zNS0uNDUtLjU1Yy0uMTItLjIxLS4xOC0uNDYtLjE4LS43NSAwLS42MS4yMy0xLjEuNjgtMS40OS40NC0uMzggMS4wNi0uNTcgMS44My0uNTcuNDggMCAuOTEuMDggMS4yOS4yNXMuNzEuMzYuOTkuNTdsLS43NC45OGMtLjI0LS4xNy0uNDktLjMyLS43My0uNDItLjI1LS4xMS0uNTEtLjE2LS43OC0uMTYtLjM1IDAtLjYuMDctLjc2LjIxLS4xNy4xNS0uMjUuMzMtLjI1LjU0IDAgLjE0LjA0LjI2LjEyLjM2cy4xOC4xOC4zMS4yNmMuMTQuMDcuMjkuMTQuNDYuMjFsLjU0LjE5Yy4yMy4wOS40Ny4xOC43LjI5cy40NC4yNC42NC40Yy4xOS4xNi4zNC4zNS40Ni41OC4xMS4yMy4xNy41LjE3LjgyIDAgLjMtLjA2LjU4LS4xNy44My0uMTIuMjYtLjI5LjQ4LS41MS42OC0uMjMuMTktLjUxLjM0LS44NC40NS0uMzQuMTEtLjcyLjE3LTEuMTUuMTctLjQ4IDAtLjk1LS4wOS0xLjQxLS4yNy0uNDYtLjE5LS44Ni0uNDEtMS4yLS42OHoiIGZpbGw9IiM1MzUzNTMiLz48L2c+PC9zdmc+\"><\/a><\/p>\n<div>\n<h3 id=\"citeas\">Cite this article<\/h3>\n<p>Prjibelski, A.D., Mikheenko, A., Joglekar, A. <i>et al.<\/i> Accurate isoform discovery with IsoQuant using long reads.<br \/>\n                    <i>Nat Biotechnol<\/i>  (2023). 
https:\/\/doi.org\/10.1038\/s41587-022-01565-y<\/p>\n<p><a data-test=\"citation-link\" data-track=\"click\" data-track-action=\"download article citation\" data-track-label=\"link\" data-track-external rel=\"nofollow\" href=\"https:\/\/citation-needed.springer.com\/v2\/references\/10.1038\/s41587-022-01565-y?format=refman&#038;flavour=citation\">Download citation<\/a><\/p>\n<ul data-test=\"publication-history\">\n<li>\n<p>Received<span>: <\/span><span><time datetime=\"2022-04-19\">19 April 2022<\/time><\/span><\/p>\n<\/li>\n<li>\n<p>Accepted<span>: <\/span><span><time datetime=\"2022-10-13\">13 October 2022<\/time><\/span><\/p>\n<\/li>\n<li>\n<p>Published<span>: <\/span><span><time datetime=\"2023-01-02\">02 January 2023<\/time><\/span><\/p>\n<\/li>\n<li>\n<p><abbr title=\"Digital Object Identifier\">DOI<\/abbr><span>: <\/span><span>https:\/\/doi.org\/10.1038\/s41587-022-01565-y<\/span><\/p>\n<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div><\/div>\n<p><a href=\"https:\/\/www.nature.com\/articles\/s41587-022-01565-y\" class=\"button purchase\" rel=\"nofollow noopener\" target=\"_blank\">Read More<\/a><br \/>\n Andrey D. Prjibelski<\/p>\n","protected":false},"excerpt":{"rendered":"<p>MainLong-read RNA sequencing is now widely used in bulk, sorted cells, single cells and spatial approaches. This wide field of applications has led to the development of multiple spliced alignment programs1,2,3,4, transcript discovery methods5,6,7,8,9,10,11, tools for transcript classification12, annotation13 and visualization14,15. Additionally, several reference-free tools for RNA long-read correction and assembly have been developed16,17. 
Current<\/p>\n","protected":false},"author":1,"featured_media":593615,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[30323,104931,536],"tags":[],"class_list":{"0":"post-593614","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-accurate","8":"category-isoform","9":"category-science-nature"},"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts\/593614","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/comments?post=593614"}],"version-history":[{"count":0,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts\/593614\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/media\/593615"}],"wp:attachment":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/media?parent=593614"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/categories?post=593614"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/tags?post=593614"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}