{"id":611612,"date":"2023-02-24T20:56:51","date_gmt":"2023-02-25T02:56:51","guid":{"rendered":"https:\/\/news.sellorbuyhomefast.com\/index.php\/2023\/02\/24\/extending-and-improving-metagenomic-taxonomic-profiling-with-uncharacterized-species-using-metaphlan-4\/"},"modified":"2023-02-24T20:56:51","modified_gmt":"2023-02-25T02:56:51","slug":"extending-and-improving-metagenomic-taxonomic-profiling-with-uncharacterized-species-using-metaphlan-4","status":"publish","type":"post","link":"https:\/\/newsycanuse.com\/index.php\/2023\/02\/24\/extending-and-improving-metagenomic-taxonomic-profiling-with-uncharacterized-species-using-metaphlan-4\/","title":{"rendered":"Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4"},"content":{"rendered":"<p>Science &#038; Nature <\/p>\n<div>\n<div id=\"Sec1-section\" data-title=\"Main\">\n<h2 id=\"Sec1\">Main<\/h2>\n<div id=\"Sec1-content\">\n<p>Over the last 25 years, shotgun metagenomic sequencing<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 1\" title=\"Quince, C., Walker, A. W., Simpson, J. T., Loman, N. J. &#038; Segata, N. Shotgun metagenomics, from sampling to analysis. Nat. Biotechnol. 35, 833\u2013844 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR1\" id=\"ref-link-section-d30100885e766\">1<\/a><\/sup> and associated computational methods have developed as robust, efficient ways to study the taxonomic composition<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Segata, N. et al. Metagenomic microbial community profiling using unique clade-specific marker genes. Nat. 
Methods 9, 811\u2013814 (2012).\" href=\"http:\/\/www.nature.com\/#ref-CR2\" id=\"ref-link-section-d30100885e770\">2<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Truong, D. T. et al. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nat. Methods 12, 902\u2013903 (2015).\" href=\"http:\/\/www.nature.com\/#ref-CR3\" id=\"ref-link-section-d30100885e770_1\">3<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 (2021).\" href=\"http:\/\/www.nature.com\/#ref-CR4\" id=\"ref-link-section-d30100885e770_2\">4<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Lu, J., Breitwieser, F. P., Thielen, P. &#038; Salzberg, S. L. Bracken: estimating species abundance in metagenomics data. PeerJ Comput. Sci. 3, e104 (2017).\" href=\"http:\/\/www.nature.com\/#ref-CR5\" id=\"ref-link-section-d30100885e770_3\">5<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 6\" title=\"Milanese, A. et al. Microbial abundance, activity and population genomic profiling with mOTUs2. Nat. Commun. 10, 1014 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR6\" id=\"ref-link-section-d30100885e773\">6<\/a><\/sup> and functional potential<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\" title=\"Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. 
eLife 10, e65088 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR4\" id=\"ref-link-section-d30100885e777\">4<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 7\" title=\"Franzosa, E. A. et al. Species-level functional profiling of metagenomes and metatranscriptomes. Nat. Methods 15, 962\u2013968 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR7\" id=\"ref-link-section-d30100885e780\">7<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 8\" title=\"Nazeen, S., Yu, Y. W. &#038; Berger, B. Carnelian uncovers hidden functional patterns across diverse study populations from whole metagenome sequencing reads. Genome Biol. 21, 47 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR8\" id=\"ref-link-section-d30100885e783\">8<\/a><\/sup> of complex microbial communities populating human, animal and natural environments. Genome assembly methods developed for microbial isolates have been expanded to apply to shotgun metagenomes, but while they excel in identifying new organisms from communities, their sensitivity is often limited by such environments\u2019 complexity<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 9\" title=\"Ayling, M., Clark, M. D. &#038; Leggett, R. M. New approaches for metagenome assembly with short reads. Brief Bioinform. 21, 584\u2013594 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR9\" id=\"ref-link-section-d30100885e787\">9<\/a><\/sup>. 
Reference-based computational approaches complement assembly by relying on annotated reference sequence information to accurately identify and quantify the known taxa and genes present in a microbiome by homology instead<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 (2021).\" href=\"http:\/\/www.nature.com\/#ref-CR4\" id=\"ref-link-section-d30100885e791\">4<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Lu, J., Breitwieser, F. P., Thielen, P. &#038; Salzberg, S. L. Bracken: estimating species abundance in metagenomics data. PeerJ Comput. Sci. 3, e104 (2017).\" href=\"http:\/\/www.nature.com\/#ref-CR5\" id=\"ref-link-section-d30100885e791_1\">5<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Milanese, A. et al. Microbial abundance, activity and population genomic profiling with mOTUs2. Nat. Commun. 10, 1014 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR6\" id=\"ref-link-section-d30100885e791_2\">6<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 7\" title=\"Franzosa, E. A. et al. Species-level functional profiling of metagenomes and metatranscriptomes. Nat. Methods 15, 962\u2013968 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR7\" id=\"ref-link-section-d30100885e794\">7<\/a><\/sup>. 
This set of methods enabled deep exploration of human microbiomes and the discovery of microbial associations with multiple health conditions<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Qin, N. et al. Alterations of the human gut microbiome in liver cirrhosis. Nature 513, 59\u201364 (2014).\" href=\"http:\/\/www.nature.com\/#ref-CR10\" id=\"ref-link-section-d30100885e799\">10<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Tett, A. et al. Unexplored diversity and strain-level structure of the skin microbiome associated with psoriasis. NPJ Biofilms Microbiomes 3, 14 (2017).\" href=\"http:\/\/www.nature.com\/#ref-CR11\" id=\"ref-link-section-d30100885e799_1\">11<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Jie, Z. et al. The gut microbiome in atherosclerotic cardiovascular disease. Nat. Commun. 8, 845 (2017).\" href=\"http:\/\/www.nature.com\/#ref-CR12\" id=\"ref-link-section-d30100885e799_2\">12<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Schirmer, M. et al. Dynamics of metatranscription in the inflammatory bowel disease gut microbiome. Nat. Microbiol. 3, 337\u2013346 (2018).\" href=\"http:\/\/www.nature.com\/#ref-CR13\" id=\"ref-link-section-d30100885e799_3\">13<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Ye, Z. et al. A metagenomic study of the gut microbiome in Behcet\u2019s disease. Microbiome 6, 135 (2018).\" href=\"http:\/\/www.nature.com\/#ref-CR14\" id=\"ref-link-section-d30100885e799_4\">14<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Zhou, W. et al. 
Longitudinal multi-omics of host-microbe dynamics in prediabetes. Nature 569, 663\u2013671 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR15\" id=\"ref-link-section-d30100885e799_5\">15<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Thomas, A. M. et al. Metagenomic analysis of colorectal cancer datasets identifies cross-cohort microbial diagnostic signatures and a link with choline degradation. Nat. Med. 25, 667\u2013678 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR16\" id=\"ref-link-section-d30100885e799_6\">16<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Ghensi, P. et al. Strong oral plaque microbiome signatures for dental implant diseases identified by strain-resolution metagenomics. NPJ Biofilms Microbiomes 6, 47 (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR17\" id=\"ref-link-section-d30100885e799_7\">17<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Zhu, F. et al. Metagenome-wide association of gut microbiome features for schizophrenia. Nat. Commun. 11, 1612 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR18\" id=\"ref-link-section-d30100885e802\">18<\/a><\/sup> and dietary patterns<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Claesson, M. J. et al. Gut microbiota composition correlates with diet and health in the elderly. Nature 488, 178\u2013184 (2012).\" href=\"http:\/\/www.nature.com\/#ref-CR19\" id=\"ref-link-section-d30100885e806\">19<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"David, L. A. et al. Diet rapidly and reproducibly alters the human gut microbiome. 
Nature 505, 559\u2013563 (2014).\" href=\"http:\/\/www.nature.com\/#ref-CR20\" id=\"ref-link-section-d30100885e806_1\">20<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Hansen, L. B. S. et al. A low-gluten diet induces changes in the intestinal microbiome of healthy Danish adults. Nat. Commun. 9, 4630 (2018).\" href=\"http:\/\/www.nature.com\/#ref-CR21\" id=\"ref-link-section-d30100885e806_2\">21<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Asnicar, F. et al. Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals. Nat. Med. 27, 321\u2013332 (2021).\" href=\"http:\/\/www.nature.com\/#ref-CR22\" id=\"ref-link-section-d30100885e806_3\">22<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 23\" title=\"Wang, D. D. et al. The gut microbiome modulates the protective association between a Mediterranean diet and cardiometabolic disease risk. Nat. Med. 27, 333\u2013343 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR23\" id=\"ref-link-section-d30100885e809\">23<\/a><\/sup>, as well as the characterization of the evolution and transmission of microbial species and strains<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Asnicar, F. et al. Studying vertical microbiome transmission from mothers to infants by strain-level metagenomic profiling. mSystems 2, e00164-16 (2017).\" href=\"http:\/\/www.nature.com\/#ref-CR24\" id=\"ref-link-section-d30100885e813\">24<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Ferretti, P. et al. 
Mother-to-infant microbial transmission from different body sites shapes the developing infant gut microbiome. Cell Host Microbe 24, 133\u2013145 (2018).\" href=\"http:\/\/www.nature.com\/#ref-CR25\" id=\"ref-link-section-d30100885e813_1\">25<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Yassour, M. et al. Strain-level analysis of mother-to-child bacterial transmission during the first few months of life. Cell Host Microbe 24, 146\u2013154 (2018).\" href=\"http:\/\/www.nature.com\/#ref-CR26\" id=\"ref-link-section-d30100885e813_2\">26<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Brito, I. L. et al. Transmission of human-associated microbiota along family and social networks. Nat. Microbiol. 4, 964\u2013971 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR27\" id=\"ref-link-section-d30100885e813_3\">27<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Ianiro, G. et al. Faecal microbiota transplantation for the treatment of diarrhoea induced by tyrosine-kinase inhibitors in patients with metastatic renal cell carcinoma. Nat. Commun. 11, 4333 (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR28\" id=\"ref-link-section-d30100885e813_4\">28<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 29\" title=\"Chen, L. et al. The long-term genetic stability and individual specificity of the human gut microbiome. Cell 184, 2302\u20132315 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR29\" id=\"ref-link-section-d30100885e816\">29<\/a><\/sup>. 
However, reference-based methods can only detect cataloged microbial species included in available reference databases, which typically only represent a fraction of the community members across environments, thus limiting the interpretation of shotgun metagenomes<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 30\" title=\"Thomas, A. M. &#038; Segata, N. Multiple levels of the unknown in microbiome research. BMC Biol. 17, 48 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR30\" id=\"ref-link-section-d30100885e820\">30<\/a><\/sup>.<\/p>\n<p>Conversely, de novo metagenomic assembly to reconstruct draft genes and genomes\u2014called metagenome-assembled genomes (MAGs)\u2014has advanced to the point of very high specificity (albeit often low sensitivity) for recovery directly from metagenomes<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Li, D., Liu, C.-M., Luo, R., Sadakane, K. &#038; Lam, T.-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674\u20131676 (2015).\" href=\"http:\/\/www.nature.com\/#ref-CR31\" id=\"ref-link-section-d30100885e827\">31<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Nurk, S., Meleshko, D., Korobeynikov, A. &#038; Pevzner, P. A. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 27, 824\u2013834 (2017).\" href=\"http:\/\/www.nature.com\/#ref-CR32\" id=\"ref-link-section-d30100885e827_1\">32<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Kang, D. D. et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. 
PeerJ 7, e7359 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR33\" id=\"ref-link-section-d30100885e827_2\">33<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Wu, Y.-W., Simmons, B. A. &#038; Singer, S. W. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics 32, 605\u2013607 (2016).\" href=\"http:\/\/www.nature.com\/#ref-CR34\" id=\"ref-link-section-d30100885e827_3\">34<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 35\" title=\"Nissen, J. N. et al. Improved metagenome binning and assembly using deep variational autoencoders. Nat. Biotechnol. \n                https:\/\/doi.org\/10.1038\/s41587-020-00777-4\n                \n               (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR35\" id=\"ref-link-section-d30100885e830\">35<\/a><\/sup>. This allows recovery of microbial sequences that have not yet been isolated or characterized and are thus absent from reference databases<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 36\" title=\"Saheb Kashaf, S., Almeida, A., Segre, J. A. &#038; Finn, R. D. Recovering prokaryotic genomes from host-associated, short-read shotgun metagenomic sequencing data. Nat. Protoc. 16, 2520\u20132541 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR36\" id=\"ref-link-section-d30100885e834\">36<\/a><\/sup>. As metagenomic assembly and binning have improved dramatically in the last few years<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Li, D., Liu, C.-M., Luo, R., Sadakane, K. &#038; Lam, T.-W. 
MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674\u20131676 (2015).\" href=\"http:\/\/www.nature.com\/#ref-CR31\" id=\"ref-link-section-d30100885e838\">31<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Nurk, S., Meleshko, D., Korobeynikov, A. &#038; Pevzner, P. A. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 27, 824\u2013834 (2017).\" href=\"http:\/\/www.nature.com\/#ref-CR32\" id=\"ref-link-section-d30100885e838_1\">32<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Kang, D. D. et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 7, e7359 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR33\" id=\"ref-link-section-d30100885e838_2\">33<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Wu, Y.-W., Simmons, B. A. &#038; Singer, S. W. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics 32, 605\u2013607 (2016).\" href=\"http:\/\/www.nature.com\/#ref-CR34\" id=\"ref-link-section-d30100885e838_3\">34<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 35\" title=\"Nissen, J. N. et al. Improved metagenome binning and assembly using deep variational autoencoders. Nat. Biotechnol. 
\n                https:\/\/doi.org\/10.1038\/s41587-020-00777-4\n                \n               (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR35\" id=\"ref-link-section-d30100885e841\">35<\/a><\/sup>, large-scale MAG catalogs have been compiled and comprise a vast amount of unknown and uncultivated microbial species populating diverse environments<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Tully, B. J., Graham, E. D. &#038; Heidelberg, J. F. The reconstruction of 2,631 draft metagenome-assembled genomes from the global oceans. Sci. Data 5, 170203 (2018).\" href=\"http:\/\/www.nature.com\/#ref-CR37\" id=\"ref-link-section-d30100885e845\">37<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Manara, S. et al. Microbial genomes from non-human primate gut metagenomes expand the primate-associated bacterial tree of life with over 1000 novel species. Genome Biol. 20, 299 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR38\" id=\"ref-link-section-d30100885e845_1\">38<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Stewart, R. D. et al. Compendium of 4,941 rumen metagenome-assembled genomes for rumen microbiome biology and enzyme discovery. Nat. Biotechnol. 37, 953\u2013961 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR39\" id=\"ref-link-section-d30100885e845_2\">39<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Nayfach, S., Shi, Z. J., Seshadri, R., Pollard, K. S. &#038; Kyrpides, N. C. New insights from uncultivated genomes of the global human gut microbiome. 
Nature 568, 505\u2013510 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR40\" id=\"ref-link-section-d30100885e845_3\">40<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Almeida, A. et al. A new genomic blueprint of the human gut microbiota. Nature 568, 499\u2013504 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR41\" id=\"ref-link-section-d30100885e845_4\">41<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR42\" id=\"ref-link-section-d30100885e845_5\">42<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Nayfach, S. et al. A genomic catalog of Earth\u2019s microbiomes. Nat. Biotechnol. \n                https:\/\/doi.org\/10.1038\/s41587-020-0718-6\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR43\" id=\"ref-link-section-d30100885e845_6\">43<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Lesker, T. R. et al. An integrated metagenome catalog reveals new insights into the murine gut microbiome. Cell Rep. 30, 2909\u20132922 (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR44\" id=\"ref-link-section-d30100885e845_7\">44<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Almeida, A. et al. A unified catalog of 204,938 reference genomes from the human gut microbiome. Nat. Biotechnol. 
\n                https:\/\/doi.org\/10.1038\/s41587-020-0603-3\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR45\" id=\"ref-link-section-d30100885e845_8\">45<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 46\" title=\"Levin, D. et al. Diversity and functional landscapes in the microbiota of animals in the wild. Science 372, eabb5352 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR46\" id=\"ref-link-section-d30100885e848\">46<\/a><\/sup>. However, such metagenomic assembly techniques are typically able to capture only a limited fraction of the organisms in complex communities due to insufficient coverage for many taxa, the presence of genetically related taxa impeding or creating spurious assemblies and difficulties in quality control of the resulting MAGs<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 9\" title=\"Ayling, M., Clark, M. D. &#038; Leggett, R. M. New approaches for metagenome assembly with short reads. Brief Bioinform. 21, 584\u2013594 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR9\" id=\"ref-link-section-d30100885e852\">9<\/a><\/sup>.<\/p>\n<p>To leverage the best aspects of both reference- and assembly-based metagenome profiling, we present MetaPhlAn 4, a method that exploits an integrated extended compendium of microbial genomes and MAGs to define an expanded set of species-level genome bins<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. 
Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e859\">42<\/a><\/sup> (SGBs) and accurately profile their presence and abundance in metagenomes. SGBs represent either existing species (known, kSGBs) or yet-to-be-characterized species (unknown, uSGBs) defined solely based on the MAGs<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e863\">42<\/a><\/sup>. From a collection of 1.01\u2009M bacterial and archaeal MAGs and isolate genomes integrating the most recent genome catalogs<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Tully, B. J., Graham, E. D. &#038; Heidelberg, J. F. The reconstruction of 2,631 draft metagenome-assembled genomes from the global oceans. Sci. Data 5, 170203 (2018).\" href=\"http:\/\/www.nature.com\/#ref-CR37\" id=\"ref-link-section-d30100885e867\">37<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Manara, S. et al. Microbial genomes from non-human primate gut metagenomes expand the primate-associated bacterial tree of life with over 1000 novel species. Genome Biol. 20, 299 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR38\" id=\"ref-link-section-d30100885e867_1\">38<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Stewart, R. D. et al. 
Compendium of 4,941 rumen metagenome-assembled genomes for rumen microbiome biology and enzyme discovery. Nat. Biotechnol. 37, 953\u2013961 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR39\" id=\"ref-link-section-d30100885e867_2\">39<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Nayfach, S., Shi, Z. J., Seshadri, R., Pollard, K. S. &#038; Kyrpides, N. C. New insights from uncultivated genomes of the global human gut microbiome. Nature 568, 505\u2013510 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR40\" id=\"ref-link-section-d30100885e867_3\">40<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Almeida, A. et al. A new genomic blueprint of the human gut microbiota. Nature 568, 499\u2013504 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR41\" id=\"ref-link-section-d30100885e867_4\">41<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR42\" id=\"ref-link-section-d30100885e867_5\">42<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Nayfach, S. et al. A genomic catalog of Earth\u2019s microbiomes. Nat. Biotechnol. \n                https:\/\/doi.org\/10.1038\/s41587-020-0718-6\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR43\" id=\"ref-link-section-d30100885e867_6\">43<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Lesker, T. R. et al. 
An integrated metagenome catalog reveals new insights into the murine gut microbiome. Cell Rep. 30, 2909\u20132922 (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR44\" id=\"ref-link-section-d30100885e867_7\">44<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 45\" title=\"Almeida, A. et al. A unified catalog of 204,938 reference genomes from the human gut microbiome. Nat. Biotechnol. \n                https:\/\/doi.org\/10.1038\/s41587-020-0603-3\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR45\" id=\"ref-link-section-d30100885e870\">45<\/a><\/sup> and additional newly assembled MAGs spanning multiple environments, we first expanded the definition of 54,596 SGBs and then defined SGB-specific unique marker genes (that is, genes uniquely characterizing each SGB) for 21,978 kSGBs and 4,992 uSGBs. The resulting dataset expands the existing MetaPhlAn algorithm<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Segata, N. et al. Metagenomic microbial community profiling using unique clade-specific marker genes. Nat. Methods 9, 811\u2013814 (2012).\" href=\"http:\/\/www.nature.com\/#ref-CR2\" id=\"ref-link-section-d30100885e874\">2<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Truong, D. T. et al. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nat. Methods 12, 902\u2013903 (2015).\" href=\"http:\/\/www.nature.com\/#ref-CR3\" id=\"ref-link-section-d30100885e874_1\">3<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\" title=\"Beghini, F. et al. 
Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR4\" id=\"ref-link-section-d30100885e877\">4<\/a><\/sup> to enable deeper and more accurate quantitative taxonomic analyses of human, host-associated and environmental microbiomes and provides insights into a number of studies associating the microbiome with host conditions.<\/p>\n<\/div>\n<\/div>\n<div id=\"Sec2-section\" data-title=\"Results\">\n<h2 id=\"Sec2\">Results<\/h2>\n<div id=\"Sec2-content\">\n<h3 id=\"Sec3\">MetaPhlAn 4 profiling of species-level genome bins<\/h3>\n<p>MetaPhlAn 4 expands and improves existing capabilities to perform taxonomic profiling of metagenomes by exploiting a framework in which extensive metagenomic assemblies are integrated with existing bacterial and archaeal reference genomes. These are then jointly preprocessed to allow efficient metagenome mapping against millions of unique marker genes, ultimately quantifying both isolated and metagenomically assembled organisms in new communities. The algorithm augments that used by previous versions in four main ways as follows: (1) the adoption of SGBs<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. 
Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e893\">42<\/a><\/sup> as primary taxonomic units, each of which groups microbial genomes and MAGs into consistent existing species and newly defined genome clusters of roughly species-level diversity; (2) the integration of over 1\u2009M MAGs and genomes into this SGB structure to build one of the largest databases of confident microbial reference sequences currently available; (3) the curation of microbial taxonomic units based on the consistency of taxonomically labeled microbial genomes and the assignment of new taxonomic labels to SGBs solely defined on MAGs and (4) the improved procedure to extract unique marker genes out of each SGB for the MetaPhlAn reference-based mapping strategy<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Segata, N. et al. Metagenomic microbial community profiling using unique clade-specific marker genes. Nat. Methods 9, 811\u2013814 (2012).\" href=\"http:\/\/www.nature.com\/#ref-CR2\" id=\"ref-link-section-d30100885e897\">2<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Truong, D. T. et al. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nat. Methods 12, 902\u2013903 (2015).\" href=\"http:\/\/www.nature.com\/#ref-CR3\" id=\"ref-link-section-d30100885e897_1\">3<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\"11 title=\"Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR4\" id=\"ref-link-section-d30100885e900\">4<\/a><\/sup>. 
MetaPhlAn 4 thus leverages aspects of both metagenomic assembly, with its potential to uncover previously unseen taxa<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Nayfach, S., Shi, Z. J., Seshadri, R., Pollard, K. S. &#038; Kyrpides, N. C. New insights from uncultivated genomes of the global human gut microbiome. Nature 568, 505\u2013510 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR40\" id=\"ref-link-section-d30100885e904\">40<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Almeida, A. et al. A new genomic blueprint of the human gut microbiota. Nature 568, 499\u2013504 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR41\" id=\"ref-link-section-d30100885e904_1\">41<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\"22 title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e907\">42<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\"33 title=\"Almeida, A. et al. A unified catalog of 204,938 reference genomes from the human gut microbiome. Nat. Biotechnol. 
\n                https:\/\/doi.org\/10.1038\/s41587-020-0603-3\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR45\" id=\"ref-link-section-d30100885e910\">45<\/a><\/sup> and the sensitivity of reference-based profiling to provide accurate taxonomic identification and quantification.<\/p>\n<p>The adoption of SGBs as the primary unit of taxonomic analysis is central to this approach<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e917\">42<\/a><\/sup>. Briefly, an SGB<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e921\">42<\/a><\/sup> delineates a microbial species purely by clustering genomes at a 5% whole-genome genetic distance threshold<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 47\" title=\"Jain, C., Rodriguez-R, L. M., Phillippy, A. M., Konstantinidis, K. T. &#038; Aluru, S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat. Commun. 
9, 5114 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR47\" id=\"ref-link-section-d30100885e925\">47<\/a><\/sup> and a taxonomic label can then be assigned to the SGB based on the presence (or not) of characterized genomes from isolate sequencing. This definition permits arbitrary microbial genomes to be organized in a manner not unlike amplicons into operational taxonomic units (OTUs) and matches remarkably well the expected boundaries of the existing taxonomy<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\"77 title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e929\">42<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\"88 title=\"Jain, C., Rodriguez-R, L. M., Phillippy, A. M., Konstantinidis, K. T. &#038; Aluru, S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat. Commun. 9, 5114 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR47\" id=\"ref-link-section-d30100885e932\">47<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\"99 title=\"Parks, D. H. et al. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat. Biotechnol. 38, 1079\u20131086 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR48\" id=\"ref-link-section-d30100885e935\">48<\/a><\/sup>. 
Available microbial reference genomes and medium-to-high-quality MAGs are thus grouped into taxonomically well-defined species (\u2018known\u2019 SGBs or kSGBs when an isolate genome with available taxonomy is present in the SGB) or unknown equivalent clades (uSGBs).<\/p>\n<p>Following the SGB clustering approach, the database employed by MetaPhlAn 4 contains SGBs that result from the merging of species that were originally incorrectly taxonomically labeled as separate species. For example, genomes assigned in NCBI<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 7\"00 title=\"Schoch, C. L. et al. NCBI taxonomy: a comprehensive update on curation, resources and tools. Database 2020, baaa062 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR49\" id=\"ref-link-section-d30100885e942\">49<\/a><\/sup> to <i>Lawsonibacter asaccharolyticus<\/i> and <i>Clostridium phoceensis<\/i> are 98.7% identical, likely due to independent naming of members of a new species and were merged into the SGB15154 (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">1<\/a>). This merging also applies to taxonomic species that are genetically difficult or impossible to distinguish (for example, species of the <i>Bacillus cereus<\/i> group, genetically differentiated only by their plasmidic sequences<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 7\"11 title=\"Rasko, D. A., Altherr, M. R., Han, C. S. &#038; Ravel, J. Genomics of the Bacillus cereus group of organisms. FEMS Microbiol. Rev. 
29, 303\u2013329 (2005).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR50\" id=\"ref-link-section-d30100885e959\">50<\/a><\/sup>) and are thus clustered in the same SGB. Conversely, species with subclades diverging by more than 5% in genetic distance were split into multiple SGBs (for example, <i>Prevotella copri<\/i> is represented by four different SGBs<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 51\" title=\"Tett, A. et al. The Prevotella copri complex comprises four distinct clades underrepresented in westernized populations. Cell Host Microbe 26, 666\u2013679 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR51\" id=\"ref-link-section-d30100885e966\">51<\/a><\/sup>, and <i>Faecalibacterium prausnitzii<\/i> by SGBs representing its distinct (sub)species<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 52\" title=\"De Filippis, F., Pasolli, E. &#038; Ercolini, D. Newly explored faecalibacterium diversity is connected to age, lifestyle, geography and disease. Curr. Biol. 30, 4932\u20134943 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR52\" id=\"ref-link-section-d30100885e973\">52<\/a><\/sup>; Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">1<\/a>). 
Finally, incorrectly or partially taxonomically classified reference genomes were identified and amended by detecting outlier labels resulting from misspellings or incorrect assignments by NCBI genome submitters (for example, the <i>Staphylococcus epidermidis<\/i> SGB7865 is composed of 700 reference genomes, 32 of which have different or unspecified species labels in the NCBI database<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 49\" title=\"Schoch, C. L. et al. NCBI taxonomy: a comprehensive update on curation, resources and tools. Database 2020, baaa062 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR49\" id=\"ref-link-section-d30100885e984\">49<\/a><\/sup>; Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">1<\/a>).<\/p>\n<p>To derive the database of SGBs profiled by MetaPhlAn 4, we started from an isolate genome component comprising 236,620 bacterial and archaeal genomes available in NCBI<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 53\" title=\"NCBI Resource Coordinators. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 46, D8\u2013D13 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR53\" id=\"ref-link-section-d30100885e994\">53<\/a><\/sup> and labeled as \u2018reconstructed from isolate sequencing or single cells\u2019. 
These were integrated with 771,528 MAGs assembled from samples collected from humans (five main human body sites, 164 distinct human cohorts), animal hosts (including 22 nonhuman primate species) and nonhost-associated environments (including soil, fresh water and oceans; Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">2<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">3<\/a>). After removing reference genomes and MAGs that did not meet quality control criteria (that is, genome completeness above 50% and contamination below 5%; see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>), the catalog comprised 729,195 genomes (560,084 MAGs and 169,111 reference genomes) and was clustered with Mash<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 54\" title=\"Ondov, B. D. et al. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol. 17, 132 (2016).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR54\" id=\"ref-link-section-d30100885e1007\">54<\/a><\/sup> into SGBs at a 5% genetic distance threshold<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. 
Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e1012\">42<\/a><\/sup> for the final database of 70.9\u2009k SGBs, 47.6\u2009k of which are taxonomically unknown at the species level (uSGBs; Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig1\">1a<\/a>). This catalog spans 95 different phyla that are quite consistently enriched by uSGBs (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">4<\/a>). In comparison with the original SGB catalog<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 7\"88 title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e1022\">42<\/a><\/sup>, the current collection integrates 3.6 times more MAGs from highly diverse environments (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">3<\/a>) and resulted in the definition of 4.3 times more SGBs. While the repository can be used for genome-based studies at a larger scale than what has been described so far<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Nayfach, S., Shi, Z. J., Seshadri, R., Pollard, K. S. &#038; Kyrpides, N. C. New insights from uncultivated genomes of the global human gut microbiome. 
Nature 568, 505\u2013510 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR40\" id=\"ref-link-section-d30100885e1029\">40<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Almeida, A. et al. A new genomic blueprint of the human gut microbiota. Nature 568, 499\u2013504 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR41\" id=\"ref-link-section-d30100885e1029_1\">41<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 7\"99 title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e1032\">42<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 8\"00 title=\"Almeida, A. et al. A unified catalog of 204,938 reference genomes from the human gut microbiome. Nat. Biotechnol. \n                https:\/\/doi.org\/10.1038\/s41587-020-0603-3\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR45\" id=\"ref-link-section-d30100885e1035\">45<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 8\"11 title=\"Tett, A. et al. The Prevotella copri complex comprises four distinct clades underrepresented in westernized populations. Cell Host Microbe 26, 666\u2013679 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR51\" id=\"ref-link-section-d30100885e1038\">51<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Karcher, N. et al. 
Analysis of 1321 Eubacterium rectale genomes from metagenomes uncovers complex phylogeographic population structure and subspecies functional adaptations. Genome Biol. 21, 138 (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR55\" id=\"ref-link-section-d30100885e1041\">55<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Karcher, N. et al. Genomic diversity and ecology of human-associated Akkermansia species in the gut microbiome revealed by extensive metagenomic assembly. Genome Biol. 22, 209 (2021).\" href=\"http:\/\/www.nature.com\/#ref-CR56\" id=\"ref-link-section-d30100885e1041_1\">56<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 8\"22 title=\"Hall, A. B. et al. A novel Ruminococcus gnavus clade enriched in inflammatory bowel disease patients. Genome Med. 9, 103 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR57\" id=\"ref-link-section-d30100885e1044\">57<\/a><\/sup>, we focused here on the task of identification and quantification of taxa from metagenomes. To this end, and to decrease the potential rate of false-positive detection of SGBs without strong support or that are extremely rare, we retained only the uSGBs containing at least five MAGs from distinct samples for subsequent metagenome profiling, resulting in a final catalog of 29.4\u2009k quality-controlled SGBs (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>).<\/p>\n<div data-test=\"figure\" data-container-section=\"figure\" id=\"figure-1\" data-title=\"MetaPhlAn 4 integrates reference sequences from isolate and metagenome-assembled genomes for metagenome taxonomic profiling.\">\n<figure><figcaption><b id=\"Fig1\" data-test=\"figure-caption-text\">Fig. 
1: MetaPhlAn 4 integrates reference sequences from isolate and metagenome-assembled genomes for metagenome taxonomic profiling.<\/b><\/figcaption><div>\n<div><a data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w\/figures\/1\" rel=\"nofollow\"><picture><source type=\"image\/webp\" ><img decoding=\"async\" aria-describedby=\"Fig1\" src=\"http:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41587-023-01688-w\/MediaObjects\/41587_2023_1688_Fig1_HTML.png\" alt=\"Science &amp; Nature figure 1\" loading=\"lazy\" width=\"685\" height=\"210\"><\/picture><\/a><\/div>\n<p><b>a<\/b>, From a collection of 1.01\u2009M bacterial and archaeal reference genomes and metagenome-assembled genomes (MAGs) spanning 70,927 species-level genome bins (SGBs), our pipeline defined 5.1\u2009M unique SGB-specific marker genes that are used by MetaPhlAn 4 (avg., 189\u2009\u00b1\u200934 per SGB). <b>b<\/b>, The expanded marker database allows MetaPhlAn 4 to detect the presence and estimate the relative abundance of 26,970 SGBs, 4,992 of which are candidate species without reference sequences (uSGBs) defined by at least five MAGs. Profiling is performed by (1) aligning the reads of input metagenomes against the marker database, (2) discarding low-quality alignments, (3) calculating the robust average coverage of the markers in each SGB and (4) normalizing these coverages across SGBs to report the SGB relative abundances (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>). 
All data are presented as mean\u2009\u00b1\u2009s.d.<\/p>\n<\/div>\n<\/figure>\n<\/div>\n<p>From this SGB genome catalog, we built the pangenome of each SGB (the collection of all gene families found in at least one genome in the SGB) and used these pangenomes to identify species-specific marker genes for MetaPhlAn profiling. The pangenomes were built by categorizing the coding sequences of all the 729\u2009k genomes into UniRef90 clusters<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 58\" title=\"Suzek, B. E. et al. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 31, 926\u2013932 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR58\" id=\"ref-link-section-d30100885e1084\">58<\/a><\/sup> when a 90% amino acid identity match was found within the UniRef database, or by de novo clustering of all remaining sequences at 90% amino acid identity following the Uniclust90 criteria<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 59\" title=\"Mirdita, M. et al. Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Res. 
45, D170\u2013D176 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR59\" id=\"ref-link-section-d30100885e1088\">59<\/a><\/sup> (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>). From the resulting 50.6\u2009M UniRef90 identities and 77.7\u2009M new Uniclust90 gene families, we subsequently identified core gene families (that is those present in almost all genomes and MAGs of an SGB; see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>) and then screened them for their species-specificity by mapping against all sequences of all SGBs (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>). This procedure resulted in 5.1\u2009M total unique marker genes spanning 26,970 high-quality SGBs, with an average of 189\u2009\u00b1\u200934 unique marker genes per SGB. MetaPhlAn 4 taxonomic profiling uses these markers to detect the presence of an SGB (known or unknown) in new metagenomes based on the detection via read mapping of a sufficient fraction of SGB-specific marker genes (default 20%) and quantifies their relative abundance based on the within-sample-normalized average coverage estimations (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>; Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig1\">1b<\/a>).<\/p>\n<h3 id=\"Sec4\">MetaPhlAn 4 improves the performance of taxonomic profiling<\/h3>\n<p>To evaluate the taxonomic profiling performance of MetaPhlAn 4, we first assessed its ability to profile well-characterized species (that is, those belonging to kSGBs) in comparison with available methods by using 133 synthetic metagenomes (~4B total reads). Most of these synthetic samples (128) are from the CAMI 2 taxonomic profiling challenge<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 8\"66 title=\"Meyer, F. et al. Tutorial: assessing metagenomics software with the CAMI benchmarking toolkit. Nat. Protoc. \n                https:\/\/doi.org\/10.1038\/s41596-020-00480-3\n                \n               (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR60\" id=\"ref-link-section-d30100885e1116\">60<\/a><\/sup> representing host-associated and marine communities, whereas the other five are additional nonhuman synthetic metagenomes (derived from SynPhlAn; see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>) representing more diverse environments than in previous evaluations<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 8\"77 title=\"Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. 
eLife 10, e65088 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR4\" id=\"ref-link-section-d30100885e1123\">4<\/a><\/sup>.<\/p>\n<p>Using the OPAL benchmarking framework<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 61\" title=\"Meyer, F. et al. Assessing taxonomic metagenome profilers with OPAL. Genome Biol. 20, 51 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR61\" id=\"ref-link-section-d30100885e1130\">61<\/a><\/sup>, we evaluated MetaPhlAn 4 in comparison with MetaPhlAn 3 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\" title=\"Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR4\" id=\"ref-link-section-d30100885e1134\">4<\/a><\/sup>), mOTUs 2.6 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 6\" title=\"Milanese, A. et al. Microbial abundance, activity and population genomic profiling with mOTUs2. Nat. Commun. 10, 1014 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR6\" id=\"ref-link-section-d30100885e1138\">6<\/a><\/sup>) (latest database available as of March 2021) and Bracken 2.5 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 5\" title=\"Lu, J., Breitwieser, F. P., Thielen, P. &#038; Salzberg, S. L. Bracken: estimating species abundance in metagenomics data. PeerJ Comput. Sci. 
3, e104 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR5\" id=\"ref-link-section-d30100885e1142\">5<\/a><\/sup>) (with two databases, one built using the April 2019 RefSeq release<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 9\"22 title=\"O\u2019Leary, N. A. et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 44, D733\u2013D745 (2016).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR62\" id=\"ref-link-section-d30100885e1146\">62<\/a><\/sup> and another one built using the GTDB release 207 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 9\"33 title=\"Parks, D. H. et al. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res. 50, D785\u2013D794 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR63\" id=\"ref-link-section-d30100885e1151\">63<\/a><\/sup>)). Due to the high false-positive rates reported by Bracken 2.5, we decided to evaluate its performance by filtering out low-abundant hits (minimum relative abundance 0.01%; Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">1<\/a>). MetaPhlAn 4 outperformed the other tools when assessing the F1 score (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig2\">2a<\/a>) computed based on the common reference NCBI taxonomy. 
This was true even though OPAL does not consider SGB-defined species groups (that is, single species incorrectly taxonomically labeled as separate species and included in the same SGB), which penalizes MetaPhlAn 4 when its species-group labels cannot match the corresponding labels. The new version still detected more species correctly than MetaPhlAn 3 across all simulations (avg., 96.65\u2009\u00b1\u200966.08 and 85.32\u2009\u00b1\u200961.95 true positives, respectively) while maintaining a low number of false positives (avg., 16.09\u2009\u00b1\u200917.65 and 13.63\u2009\u00b1\u200916.56, respectively; Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">2a,b<\/a> and Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">5<\/a>). Most of the false positives (84.6%) were due to the new labels of SGB-defined species groups (for example, the <i>Marinilactibacillus<\/i> sp. 15R, present in almost all the CAMI 2 oral metagenomes, belongs to the <i>Marinilactibacillus piezotolerans<\/i> SGB7875 species group) and are thus not strictly false positives. In fact, further evaluation using single isolate sequences (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>) showed no false-positive hits when running MetaPhlAn 4 with default parameters, and no false negatives in any case with a coverage\u2009\u2265\u20090.5\u00d7. 
This coverage threshold means that MetaPhlAn 4 is guaranteed to detect all SGBs present at a relative abundance of at least 0.01% in a metagenomic sample sequenced at a standard depth of 10 Gbases, with detection at lower abundances frequently possible (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">6<\/a>). The improvement in recall is substantially explained by the expanded catalog of reference genomes included in MetaPhlAn 4 (169.1\u2009k genomes spanning 31.9\u2009k species in comparison with 99.2\u2009k genomes from 13.5\u2009k species in MetaPhlAn 3).<\/p>\n<div data-test=\"figure\" data-container-section=\"figure\" id=\"figure-2\" data-title=\"MetaPhlAn 4 improves sensitivity and specificity of metagenome taxonomic profiling.\">\n<figure><figcaption><b id=\"Fig2\" data-test=\"figure-caption-text\">Fig. 2: MetaPhlAn 4 improves sensitivity and specificity of metagenome taxonomic profiling.<\/b><\/figcaption><div>\n<div><a data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w\/figures\/2\" rel=\"nofollow\"><picture><source type=\"image\/webp\" ><img decoding=\"async\" aria-describedby=\"Fig2\" src=\"http:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41587-023-01688-w\/MediaObjects\/41587_2023_1688_Fig2_HTML.png\" alt=\"Science &amp; Nature figure 2\" loading=\"lazy\" width=\"685\" height=\"461\"><\/picture><\/a><\/div>\n<p><b>a<\/b>, To evaluate its performance in taxonomic profiling, MetaPhlAn 4 was applied to synthetic metagenomes representing host-associated communities from the CAMI 2 taxonomic profiling challenge<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 60\" title=\"Meyer, F. 
et al. Tutorial: assessing metagenomics software with the CAMI benchmarking toolkit. Nat. Protoc. \n                https:\/\/doi.org\/10.1038\/s41596-020-00480-3\n                \n               (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR60\" id=\"ref-link-section-d30100885e1194\">60<\/a><\/sup> (<i>n<\/i>\u2009=\u2009128 samples) and the SynPhlAn-nonhuman dataset (<i>n<\/i>\u2009=\u20095 samples), representing more diverse environments from previous evaluations<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\" title=\"Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR4\" id=\"ref-link-section-d30100885e1204\">4<\/a><\/sup>. Species-level evaluation using the OPAL framework<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 61\" title=\"Meyer, F. et al. Assessing taxonomic metagenome profilers with OPAL. Genome Biol. 20, 51 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR61\" id=\"ref-link-section-d30100885e1208\">61<\/a><\/sup> shows that MetaPhlAn 4 is more accurate than the available alternatives in both the detection of which taxa are present (the F1 score is the harmonic mean of the precision and recall of detection) and their quantitative estimation (the BC beta-diversity is computed between the estimated profiles and the abundances in the gold standard). 
Additional evaluations performed using genomes within the SGB organization (labeled \u2018SGB evaluation\u2019; see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>) show that MetaPhlAn 4 further improves accuracy at this more refined taxonomic level. See Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">5<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">7<\/a> for more details (GI, gastrointestinal; UT, urogenital tract). <b>b<\/b>, MetaPhlAn 4 was applied to synthetic metagenomes (<i>n<\/i>\u2009=\u200970 samples) modeling different host and nonhost-associated environments and containing, on average, 47 genomes from both kSGBs and uSGBs (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>). This evaluation directly on SGBs shows the reliability of MetaPhlAn 4 to quantify both known and unknown microbial species. Additional evaluation based on a mixture of new MAGs from samples not considered in the building of the genomic database (mixed evaluation, <i>n<\/i>\u2009=\u20095 samples) stresses its accuracy independently from the inclusion of the profiled data in the database. 
See Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">9<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">10<\/a> for more details (NHP\u2009=\u2009nonhuman primates, W = westernized, NW = nonwesternized). Box plots in <b>a<\/b> and <b>b<\/b> show the median (center), 25th\/75th percentile (lower\/upper hinges), 1.5\u00d7 interquartile range (whiskers) and outliers (points).<\/p>\n<\/div>\n<p xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\"><a data-test=\"article-link\" data-track=\"click\" data-track-label=\"button\" data-track-action=\"view figure\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w\/figures\/2\" data-track-dest=\"link:Figure2 Full size image\" aria-label=\"Full size image figure 2\" rel=\"nofollow\"><span>Full size image<\/span><\/a><\/p>\n<\/figure>\n<\/div>\n<p>We then evaluated the relative abundance quantification performance of MetaPhlAn 4 using Bray-Curtis (BC) dissimilarity and root-mean-square error (RMSE) with respect to synthetic reference community compositions. MetaPhlAn 4 outperformed the alternative methods (avg. BC, 0.13\u2009\u00b1\u20090.07; avg. RMSE, 0.016\u2009\u00b1\u20090.019), including the previous MetaPhlAn version 3 (avg. BC,\u20090.19\u2009\u00b1\u20090.12; avg. RMSE,\u20090.019\u2009\u00b1\u20090.018; Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">7<\/a> and Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig2\">2a<\/a>). 
The quality of the marker set is likely the driving factor of this improvement, a consequence of the phylogenetic consistency of the SGBs, which ensures that identically labeled taxa are genomically consistent. This avoids hard-to-detect taxonomic mislabeling in the original, manually assigned taxonomic labels and allowed us to obtain a set of marker genes that is (1) larger (avg., 189\u2009\u00b1\u200934 per SGB as compared to 84\u2009\u00b1\u200947 per species in MetaPhlAn 3), (2) more reliable (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">6<\/a> and Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">3<\/a>) and (3) more unique (99.3% of the markers in comparison with 72.7% in MetaPhlAn 3, and from 3.8\u00d7 to 15.55\u00d7 fewer randomly assigned reads depending on the environment; Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">8<\/a>).<\/p>\n<p>Because these evaluations could not account for the modifications of species taxonomy that avoid these issues, we then evaluated MetaPhlAn 4 on the same synthetic metagenomes, but using SGB-based taxonomy (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>). Taking the SGB to which each genome in the synthetic community belongs as its gold standard label, MetaPhlAn 4 achieved high accuracies when assessing both the F1 score (avg., 0.95\u2009\u00b1\u20090.06) and the BC dissimilarity (avg., 0.031\u2009\u00b1\u20090.023; Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig2\">2a<\/a>).<\/p>\n<p>Finally, we assessed the performance of MetaPhlAn 4 to specifically detect uSGBs representing clades without taxonomically characterized isolates. We constructed 65 synthetic metagenomes simulating microbiomes from 12 different human body sites, animal hosts and nonhost-associated environments, using both kSGBs and uSGBs that were found and reconstructed in real metagenomes in each of the environments via metagenomic assembly (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>). We also built five additional synthetic metagenomes using a mixture of MAGs and reference genomes from samples not included in our original genomic database (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>). MetaPhlAn 4 showed accuracies in the detection and quantification of uSGBs (avg. F1 score, 0.97\u2009\u00b1\u20090.02; Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig2\">2b<\/a> and Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">2c,d<\/a>) that were on par with those of known species (kSGBs; avg. F1 score, 0.96\u2009\u00b1\u20090.024; Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig2\">2a<\/a>). Both the F1 score and the BC similarity to the gold standard were consistent across all the different environments assessed. 
Synthetic samples based on the MAGs not available at the time when the MetaPhlAn 4 database was built yielded similar results (avg. F1 score, 0.98\u2009\u00b1\u20090.006; Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig2\">2b<\/a> and Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">9<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">10<\/a>). Altogether, MetaPhlAn 4 outperformed the other available tools on synthetic data and further provided quantification of yet-to-be-characterized species, while maintaining high accuracy for taxonomically well-defined species.<\/p>\n<h3 id=\"Sec5\">MetaPhlAn 4 expands the profiled fraction of metagenomes<\/h3>\n<p>The MetaPhlAn 4 database expands the number of quantifiable known microbial species (18.4\u2009k more species than in MetaPhlAn 3) and refines the resolution of many species described by kSGBs (21,978 kSGBs, with avg. 1.15 kSGBs per species), and includes 4,992 yet-to-be-characterized microbial species (uSGBs). We assessed its resulting increased ability to explain a larger fraction of the reads in a metagenome by profiling a total of 24.5\u2009k metagenomic samples (145 distinct studies, Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">11<\/a>) from different human, animal and nonhost-associated environments (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig3\">3a<\/a> and Supplementary Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">4<\/a>). We further divided the 19.5\u2009k human metagenomes based on the body site of origin and the lifestyle (that is, westernized or nonwesternized) of the donor (for a full description of westernization, see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>).<\/p>\n<div data-test=\"figure\" data-container-section=\"figure\" id=\"figure-3\" data-title=\"MetaPhlAn 4 expands observable microbial diversity, primarily by quantifying yet-to-be-characterized species (uSGBs).\">\n<figure><figcaption><b id=\"Fig3\" data-test=\"figure-caption-text\">Fig. 3: MetaPhlAn 4 expands observable microbial diversity, primarily by quantifying yet-to-be-characterized species (uSGBs).<\/b><\/figcaption><div>\n<div><a data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w\/figures\/3\" rel=\"nofollow\"><picture><source type=\"image\/webp\" ><img decoding=\"async\" aria-describedby=\"Fig3\" src=\"http:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41587-023-01688-w\/MediaObjects\/41587_2023_1688_Fig3_HTML.png\" alt=\"Science &amp; Nature figure 3\" loading=\"lazy\" width=\"685\" height=\"747\"><\/picture><\/a><\/div>\n<p><b>a<\/b>, We applied MetaPhlAn 4 profiling to a total of 24.5\u2009k metagenomic samples from diverse environments, highlighting its ability to detect microbiome compositions and clear differences between them, even when considering distinct human body sites and variable host lifestyles (Supplementary Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">5b<\/a> and Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">11<\/a>). <b>b<\/b>, The expanded genomic database of MetaPhlAn 4 substantially increases the estimated fraction of classified reads in comparison with the previous MetaPhlAn version across habitat types (<i>n<\/i>\u2009=\u200924,515 samples). <b>c<\/b>, MetaPhlAn 4 detects on average 48 unknown bacterial species (uSGBs) per human gut microbiome, and reaches up to more than 700 in other nonhuman environments (<i>n<\/i>\u2009=\u200924,515 samples). <b>d<\/b>, The most prevalent microbial species in the gastrointestinal tract of westernized populations are known species (kSGBs). The ten most prevalent kSGBs in westernized and nonwesternized lifestyles are shown ordered by their highest prevalence and reported together with the number of MAGs assembled from human gut metagenomes in the MetaPhlAn genome catalog. Species names are shown together with their SGB ID between brackets. <b>e<\/b>, The most prevalent SGBs in nonwesternized populations belong to yet-to-be-cultivated and named species. The ten most prevalent uSGBs of each lifestyle are shown ordered by their highest prevalence. <b>f<\/b>, In westernized populations, the most prevalent kSGBs and uSGBs vary across age categories. The two most prevalent SGBs for each age category are shown. <b>g<\/b>, The fraction of uSGBs relative to kSGB increases after infancy (<i>n<\/i>\u2009=\u200919,468). Box plots in <b>b, c<\/b> and <b>g<\/b> show the median (center), 25th\/75th percentile (lower\/upper hinges), 1.5\u00d7 interquartile range (whiskers) and outliers (points). 
NHP, nonhuman primates; W, westernized; NW, nonwesternized; A, ancient.<\/p>\n<\/div>\n<p xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\"><a data-test=\"article-link\" data-track=\"click\" data-track-label=\"button\" data-track-action=\"view figure\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w\/figures\/3\" data-track-dest=\"link:Figure3 Full size image\" aria-label=\"Full size image figure 3\" rel=\"nofollow\"><span>Full size image<\/span><\/a><\/p>\n<\/figure>\n<\/div>\n<p>In the resulting taxonomic profiles, MetaPhlAn 4 detected 11,132 SGBs present in at least 1% of the samples of one of the environments, 3,527 of which (31.68%) were taxonomically unknown at the species level (uSGBs). The new profiles explained a much larger fraction of the reads in the metagenomic samples compared to the previous version across all environments (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig3\">3b<\/a>). Within the human body sites, the improvement was high for the airways (avg. 1.95-fold increase of explainable reads), and substantially higher improvements were reached for samples from, for example, the gut microbiomes of nonhuman mammals, ranging from an average 3.26-fold increase in wild mice to a 14.15-fold increase in the rumen. For these animals, the average number of uSGBs detected surpassed that of the kSGBs (with the exception of the nonhuman primates; Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig3\">3c<\/a> and Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">5a<\/a>). 
These increases were consistent with the number of newly considered MAGs that defined new uSGBs from nonhuman microbiomes (90,606 MAGs defining 1,287 uSGBs).<\/p>\n<p>Metagenomes from environmental ecosystems were generally less well explained by the taxa considered in MetaPhlAn 4. Soil, in particular, remained poorly characterized owing to its remarkable microbial variability and the lack of large, systematic metagenomic efforts targeting it (only 2,495 MAGs defining 26 uSGBs in our database), whereas the ocean microbiome had a 6.65-fold increase, largely due to the inclusion of the Tara ocean MAGs<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 64\" title=\"Sunagawa, S. et al. Structure and function of the global ocean microbiome. Science 348, 1261359 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR64\" id=\"ref-link-section-d30100885e1409\">64<\/a><\/sup> in the SGB database (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig3\">3c<\/a>). Overall, uSGBs were instrumental in increasing the fraction of metagenomes that MetaPhlAn 4 can profile (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig3\">3b<\/a>), as they accounted for an average of 23.13% (s.d.: 17.89%) of the richness of the resulting profiles across all environments (Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig3\">3c<\/a>).<\/p>\n<h3 id=\"Sec6\">SGB profiling reveals species overlaps across environments<\/h3>\n<p>A key advantage of reference-based metagenomic profiling as compared to assembly is its ability to detect low-abundant and hard-to-assemble genomes<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 37\" title=\"Tully, B. J., Graham, E. D. &#038; Heidelberg, J. F. The reconstruction of 2,631 draft metagenome-assembled genomes from the global oceans. Sci. Data 5, 170203 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR37\" id=\"ref-link-section-d30100885e1430\">37<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 39\" title=\"Stewart, R. D. et al. Compendium of 4,941 rumen metagenome-assembled genomes for rumen microbiome biology and enzyme discovery. Nat. Biotechnol. 37, 953\u2013961 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR39\" id=\"ref-link-section-d30100885e1433\">39<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 40\" title=\"Nayfach, S., Shi, Z. J., Seshadri, R., Pollard, K. S. &#038; Kyrpides, N. C. New insights from uncultivated genomes of the global human gut microbiome. Nature 568, 505\u2013510 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR40\" id=\"ref-link-section-d30100885e1436\">40<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Pasolli, E. et al. 
Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e1439\">42<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 43\" title=\"Nayfach, S. et al. A genomic catalog of Earth\u2019s microbiomes. Nat. Biotechnol. \n                https:\/\/doi.org\/10.1038\/s41587-020-0718-6\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR43\" id=\"ref-link-section-d30100885e1442\">43<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 45\" title=\"Almeida, A. et al. A unified catalog of 204,938 reference genomes from the human gut microbiome. Nat. Biotechnol. \n                https:\/\/doi.org\/10.1038\/s41587-020-0603-3\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR45\" id=\"ref-link-section-d30100885e1445\">45<\/a><\/sup>. This allows the generation of confident ecological statistics regarding prevalent and rare taxa, which are difficult to quantify accurately in the presence of many technical nondetections in data solely from metagenome assemblies. On this dataset, MetaPhlAn 4 identified 1,657 SGBs found in at least 1% of the samples from the gut of nonwesternized human populations (550 of these being uSGBs), 331 SGBs at the same prevalence threshold in the typically low-diverse human vaginal microbiome (61 of which are uSGBs) and intermediate numbers in other environments (Supplementary Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">5b<\/a>).<\/p>\n<p>This confirmed that gut metagenomes retrieved from ancient samples (ranging from 5,300 to 150 years ago in the available datasets) possessed more SGBs in common with those at >1% prevalence in the gut microbiome of modern nonwesternized populations (1,039 SGBs) than of westernized ones (748 SGBs), despite the dominance in datasets and databases of data derived from westernized populations (~ten times more samples; Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">5b<\/a> and Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">3<\/a>). Similarly, and adopting the same prevalence threshold at 1%, the SGBs found in the gut of nonhuman primates (including those in captivity) overlapped more with gut samples from ancient microbiomes (879 SGBs) than with modern ones (668 SGBs), further highlighting the effect of lifestyle in shaping the human microbiome (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">5b<\/a>). A similar environmental adaptation can be observed in the gut microbiome of laboratory mice, in which many more modern human gut SGBs were found (481 SGBs) compared to those from wild mice (53 SGBs). 
Twenty-eight SGBs were present at >1% prevalence in all human body sites (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">12<\/a>), comprising typically oral microbes that can reach the lower gastrointestinal tract, can contaminate the skin and can colonize other mucosal sites such as the vagina, that is, the <i>Haemophilus parainfluenzae<\/i> group (SGB9712), the <i>Streptococcus salivarius<\/i> group (SGB8007), <i>Veillonella parvula<\/i> (SGB6939), <i>Rothia mucilaginosa<\/i> (SGB16971) and <i>Streptococcus oralis<\/i> (SGB8130).<\/p>\n<p>Species that overlap across environments at the same 1% prevalence threshold can also flag potential contamination, as is the case for the only nine SGBs shared between the modern human gut and ocean water samples (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">13<\/a>). These were predominantly skin and oral microbes likely to contaminate low-biomass water samples during laboratory procedures: <i>Cutibacterium acnes<\/i> (SGB16955), <i>Staphylococcus aureus<\/i> (SGB7852), <i>Streptococcus thermophilus<\/i> (SGB8002), <i>Escherichia coli<\/i> (SGB10068), <i>V. parvula<\/i> (SGB6939), <i>Staphylococcus epidermidis<\/i> (SGB7865), <i>Staphylococcus hominis<\/i> (SGB7858), <i>Streptococcus mitis<\/i> (SGB8163) and <i>R. mucilaginosa<\/i> (SGB16971). Overall, the new MetaPhlAn 4 profiling highlights that microbiomes from most nonhost-associated environments have little overlap between themselves and the human microbiome (Supplementary Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">5c<\/a>), and that, as expected, human microbiomes from different body sites have limited but relevant overlaps (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">12<\/a>).<\/p>\n<h3 id=\"Sec7\">MetaPhlAn 4 expands the panel of prevalent human gut species<\/h3>\n<p>We assessed the prevalence of SGBs in the gut microbiome of human individuals (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">14<\/a>) using 19.5\u2009k human gut metagenomes from 86 datasets, spanning different age categories, geographic locations and lifestyles (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">15<\/a>). The most prevalent SGBs in westernized populations were from known species (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig3\">3d<\/a>), specifically <i>Blautia wexlerae<\/i> (SGB4837, 89.2%), the <i>Bacteroides uniformis<\/i> group (SGB1836, 88.1%) and <i>Phocaeicola vulgatus<\/i> (previously <i>Bacteroides vulgatus<\/i>, SGB1814, 85.8%). Four distinct <i>F. prausnitzii<\/i> SGBs appeared within the top ten most prevalent species, and three of them had quite distinct prevalence in both lifestyles (Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig3\">3d<\/a>), highlighting the ability of SGB profiling to increase the resolution of species that are particularly genetically divergent<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 52\" title=\"De Filippis, F., Pasolli, E. &#038; Ercolini, D. Newly explored faecalibacterium diversity is connected to age, lifestyle, geography and disease. Curr. Biol. 30, 4932\u20134943 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR52\" id=\"ref-link-section-d30100885e1561\">52<\/a><\/sup>. <i>Cibionibacter quicibialis<\/i><sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e1568\">42<\/a><\/sup>, as well as several other species of interest considered kSGBs because they have a sequenced representative even though they remain largely uncharacterized (for example, <i>Oscillibacter<\/i> sp. ER4) were also found at high prevalence (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig3\">3d<\/a>).<\/p>\n<p>While most uSGBs had lower prevalence in this population, 4 uSGBs from the <i>Ruminococcaceae<\/i> family exceeded 75% prevalence, and many of them were substantially more prevalent in nonwesternized compared to westernized populations (Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig3\">3e<\/a>). The species with the highest prevalence in each specific age category displayed variable prevalence in the other age groups (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig3\">3f<\/a>, Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">6<\/a> and Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">14<\/a>), and uSGBs tended to be particularly common in childhood, which may be under-studied relative to infancy and adulthood (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig3\">3g<\/a> and Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">16<\/a>). 
Overall, the SGB prevalence newly established across populations and lifestyles (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">14<\/a>) expands, in both size and detail, on that established by prior metagenomic studies.<\/p>\n<h3 id=\"Sec8\">Biomarkers of diet in mice are dominated by uSGBs<\/h3>\n<p>MetaPhlAn 4 integrates 22,718 MAGs assembled from 1,906 mouse gut metagenomes (both research laboratory mice and wild mice) and defines 540 uSGBs, allowing greater resolution in profiling the mouse gut. When applied to a heterogeneous public dataset of 184 mouse gut microbiomes spanning eight genetic backgrounds and six different vendors (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">17<\/a>)<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 65\" title=\"Xiao, L. et al. A catalog of the mouse gut metagenome. Nat. Biotechnol. 33, 1103\u20131108 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR65\" id=\"ref-link-section-d30100885e1617\">65<\/a><\/sup>, MetaPhlAn 4 detected 632 different SGBs, 45.57% of which would not have been detected using only MAGs reconstructed from the same set of samples (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">18<\/a>). As already noted in recent studies<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 66\" title=\"Kieser, S., Zdobnov, E. M. &#038; Trajkovski, M. 
Comprehensive mouse microbiota genome catalog reveals major difference to its human counterpart. PLoS Comput. Biol. 18, e1009947 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR66\" id=\"ref-link-section-d30100885e1624\">66<\/a><\/sup> employing a metagenomic-assembly-based workflow<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\"00 title=\"Kieser, S., Brown, J., Zdobnov, E. M., Trajkovski, M. &#038; McCue, L. A. ATLAS: a Snakemake workflow for assembly, annotation, and genomic binning of metagenome sequence data. BMC Bioinf. 21, 257 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR67\" id=\"ref-link-section-d30100885e1628\">67<\/a><\/sup>, most of the detected SGBs in the mouse gut (60.8%) were uSGBs (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig4\">4a<\/a> and Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">7a<\/a>). In contrast, only 108 total species were detected by MetaPhlAn 3 from the same samples. Interestingly, of the 43 SGBs present in more than 75% of the samples, most are uSGBs; the 12 kSGBs themselves represent poorly characterized species such as <i>Lachnospiraceae<\/i> bacterium 28_4 (SGB7272), <i>Dorea<\/i> sp. 5_2 (SGB7275) and <i>Oscillibacter<\/i> sp. 1_3 (SGB7266), which were also the only ones detectable by MetaPhlAn 3. 
The poor mappability of many mouse microbiomes against isolate genomes is also reflected at taxonomic levels higher than species, as more than half of the families (that is, family-level genome bins (FGBs) defined similarly to SGBs but spanning up to 30% nucleotide divergence; see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>) present in more than 20% of the samples are still uncharacterized (uFGBs; Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig4\">4b<\/a> and Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">7b<\/a>).<\/p>\n<div data-test=\"figure\" data-container-section=\"figure\" id=\"figure-4\" data-title=\"MetaPhlAn 4 enables accurate metagenomic profiling of mouse microbiomes containing few cultured isolate taxa.\">\n<figure><figcaption><b id=\"Fig4\" data-test=\"figure-caption-text\">Fig. 
4: MetaPhlAn 4 enables accurate metagenomic profiling of mouse microbiomes containing few cultured isolate taxa.<\/b><\/figcaption><div>\n<div><a data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w\/figures\/4\" rel=\"nofollow\"><picture><source type=\"image\/webp\" ><img decoding=\"async\" aria-describedby=\"Fig4\" src=\"http:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41587-023-01688-w\/MediaObjects\/41587_2023_1688_Fig4_HTML.png\" alt=\"Science &amp; Nature figure 4\" loading=\"lazy\" width=\"685\" height=\"653\"><\/picture><\/a><\/div>\n<p><b>a<\/b>, MetaPhlAn 4 taxonomic profiling of a cohort of mouse gut microbiome samples (<i>n<\/i>\u2009=\u2009181 samples), spanning eight genetic backgrounds and six different vendors<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\"11 title=\"Xiao, L. et al. A catalog of the mouse gut metagenome. Nat. Biotechnol. 33, 1103\u20131108 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR65\" id=\"ref-link-section-d30100885e1675\">65<\/a><\/sup> revealed that the majority of detected microbial taxa are uncharacterized SGBs (uSGBs) that do not contain a sequenced isolate representative. <b>b<\/b>, Some of the most prevalent families in the mouse gut microbiome (<i>n<\/i>\u2009=\u2009181 samples) are still unclassified at the family level (uFGBs). FGBs detected in at least 20% of the samples (circles and right-side <i>y<\/i> axis) and with a median relative abundance above 1% (box plots and left-side <i>y<\/i> axis) are shown. <b>c<\/b>, Random effects models applied to the MetaPhlAn 4 profiles revealed that most of the high- and low-fat diet microbial biomarkers are uncharacterized species (FDR\u2009<\u20090.2). 
log<sub>10<\/sub>-transformed relative abundances of the microbial biomarkers are represented in the heatmap and their effect size (linear model beta coefficient) in the bar plots. For kSGBs, species names are shown together with their SGB ID between brackets. SGB41568 is reported in NCBI as assigned to an unclassified phylum, and we thus report only the kingdom label. SMUC\u2009=\u2009Southern Medical University in China, CMR\u2009=\u2009Craniofacial Mutant Resource at the Jackson Laboratory (Jax). Box plots in <b>a<\/b> and <b>b<\/b> show the median (center), 25th\/75th percentile (lower\/upper hinges), 1.5\u00d7 interquartile range (whiskers) and outliers (points).<\/p>\n<\/div>\n<p xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\"><a data-test=\"article-link\" data-track=\"click\" data-track-label=\"button\" data-track-action=\"view figure\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w\/figures\/4\" data-track-dest=\"link:Figure4 Full size image\" aria-label=\"Reference 18\"22 rel=\"nofollow\"><span>Full size image<\/span><\/a><\/p>\n<\/figure>\n<\/div>\n<p>To test the relevance of uSGBs in the context of typical mouse microbiome studies, we recapitulated prior statistical tests to identify taxonomic biomarkers of high-fat (HF) versus normal chow diets across host genetic backgrounds and vendors<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\"33 title=\"Xiao, L. et al. A catalog of the mouse gut metagenome. Nat. Biotechnol. 33, 1103\u20131108 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR65\" id=\"ref-link-section-d30100885e1714\">65<\/a><\/sup>. 
Applying linear mixed models on the MetaPhlAn 4 taxonomic profiles and controlling for sex, age, genetic background and vendor (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>; Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">19<\/a>), we identified 18 SGB biomarkers at FDR\u2009<\u20090.2 with an average relative abundance in the associated diet >1% (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig4\">4c<\/a>). Most of the over-abundant biomarkers of a hyper-caloric diet were uSGBs (13 uSGBs, 72% of the 18 biomarkers), in addition to three taxa that could be detected using MetaPhlAn 3 (<i>Lachnospiraceae<\/i> bacterium 28_4 SGB7272, <i>Lactobacillus johnsonii<\/i> SGB7041 and <i>Faecalibaculum rodentium<\/i> SGB4047) and 2 kSGBs representing poorly characterized species (<i>Lachnospiraceae<\/i> bacterium SGB41544 and Bacteroidales bacterium SGB27761). While other approaches are already available to exploit environment-specific MAG catalogs for metagenomic profiling<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Kieser, S., Brown, J., Zdobnov, E. M., Trajkovski, M. &#038; McCue, L. A. ATLAS: a Snakemake workflow for assembly, annotation, and genomic binning of metagenome sequence data. BMC Bioinf. 21, 257 (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR67\" id=\"ref-link-section-d30100885e1740\">67<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Wood, D. E., Lu, J. &#038; Langmead, B. Improved metagenomic analysis with Kraken 2. 
Genome Biol. 20, 257 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR68\" id=\"ref-link-section-d30100885e1740_1\">68<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\"44 title=\"Saenz, C., Nigro, E., Gunalan, V. &#038; Arumugam, M. MIntO: a modular and scalable pipeline for microbiome metagenomic and metatranscriptomic data integration. Front. Bioinform. 2, 846922 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR69\" id=\"ref-link-section-d30100885e1743\">69<\/a><\/sup>, MetaPhlAn 4\u2019s ability to rapidly and accurately profile species defined solely by MAGs (that is, uSGBs) appears particularly relevant for under-characterized microbial environments in which cultivated and sequenced taxa still represent a small fraction of overall microbial diversity.<\/p>\n<h3 id=\"Sec9\">Stronger links between gut microbiome, diet and metabolism<\/h3>\n<p>We used MetaPhlAn 4 to extend links between the gut microbiome, diet and host metabolism<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Claesson, M. J. et al. Gut microbiota composition correlates with diet and health in the elderly. Nature 488, 178\u2013184 (2012).\" href=\"http:\/\/www.nature.com\/#ref-CR19\" id=\"ref-link-section-d30100885e1755\">19<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"David, L. A. et al. Diet rapidly and reproducibly alters the human gut microbiome. Nature 505, 559\u2013563 (2014).\" href=\"http:\/\/www.nature.com\/#ref-CR20\" id=\"ref-link-section-d30100885e1755_1\">20<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Hansen, L. B. S. et al. A low-gluten diet induces changes in the intestinal microbiome of healthy Danish adults. 
Nat. Commun. 9, 4630 (2018).\" href=\"http:\/\/www.nature.com\/#ref-CR21\" id=\"ref-link-section-d30100885e1755_2\">21<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Asnicar, F. et al. Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals. Nat. Med. 27, 321\u2013332 (2021).\" href=\"http:\/\/www.nature.com\/#ref-CR22\" id=\"ref-link-section-d30100885e1755_3\">22<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\"55 title=\"Wang, D. D. et al. The gut microbiome modulates the protective association between a Mediterranean diet and cardiometabolic disease risk. Nat. Med. 27, 333\u2013343 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR23\" id=\"ref-link-section-d30100885e1758\">23<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\"66 title=\"Ley, R. E., Turnbaugh, P. J., Klein, S. &#038; Gordon, J. I. Microbial ecology: human gut microbes associated with obesity.Nature 444, 1022\u20131023 (2006).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR70\" id=\"ref-link-section-d30100885e1761\">70<\/a><\/sup> by re-analyzing metagenomes from 1,001 deeply phenotyped individuals in the ZOE PREDICT 1 study<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\"77 title=\"Asnicar, F. et al. Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals. Nat. Med. 27, 321\u2013332 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR22\" id=\"ref-link-section-d30100885e1765\">22<\/a><\/sup>. 
As in the original study, strengths of association between the microbiome and both dietary and cardiometabolic host variables were evaluated by testing the predictive power of random forest (RF) classifiers and regressors trained on the taxonomic profiles (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>). Among the 19 health and diet markers most strongly linked with the microbiome according to MetaPhlAn 3 in the original work, all but two were better predicted when incorporating MetaPhlAn 4 taxa (new median AUC\u2009=\u20090.74, 4.84% improvement; Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig5\">5a<\/a>). The highest improvement was found for the 10-year atherosclerotic cardiovascular disease (ASCVD) risk (0.106 higher AUC, 16.24% improvement), and the Healthy Eating Index (HEI) score<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\"88 title=\"Guenther, P. M. et al. Update of the healthy eating index: HEI-2010. J. Acad. Nutr. Diet. 113, 569\u2013580 (2013).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR71\" id=\"ref-link-section-d30100885e1775\">71<\/a><\/sup> achieved the strongest association (0.072 higher AUC, 10.05% improvement and 31% regression improvement).<\/p>\n<div data-test=\"figure\" data-container-section=\"figure\" id=\"figure-5\" data-title=\"MetaPhlAn 4 reveals strong links between the unknown fraction of the human gut microbiome and host diet and cardiometabolic markers.\">\n<figure><figcaption><b id=\"Fig5\" data-test=\"figure-caption-text\">Fig. 
5: MetaPhlAn 4 reveals strong links between the unknown fraction of the human gut microbiome and host diet and cardiometabolic markers.<\/b><\/figcaption><div>\n<div><a data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w\/figures\/5\" rel=\"nofollow\"><picture><source type=\"image\/webp\" ><img decoding=\"async\" aria-describedby=\"Fig5\" src=\"http:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41587-023-01688-w\/MediaObjects\/41587_2023_1688_Fig5_HTML.png\" alt=\"Science &amp; Nature figure 5\" loading=\"lazy\" width=\"685\" height=\"622\"><\/picture><\/a><\/div>\n<p><b>a<\/b>, Compared to the original results from the ZOE PREDICT 1 study based on the MetaPhlAn 3 taxonomic profiles<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 22\" title=\"Asnicar, F. et al. Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals. Nat. Med. 27, 321\u2013332 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR22\" id=\"ref-link-section-d30100885e1794\">22<\/a><\/sup>, random forest (RF) models trained on the MetaPhlAn 4 microbiome profiles (<i>n<\/i>\u2009=\u20091,001 samples) substantially improve classification (circles and right-side <i>y<\/i> axis) and regression (box plots and left-side <i>y<\/i> axis) results for a panel of 19 markers representative of nutritional and cardiometabolic health (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>). Box plots show the median (center), 25th\/75th percentile (lower\/upper hinges), 1.5\u00d7 interquartile range (whiskers) and outliers (points). 
<b>b<\/b>, Panel of the 20 unknown microbial species (uSGBs) showing the strongest overall correlations with the positive (top-half list) and negative (bottom-half list) dietary and cardiometabolic health markers, respectively (<sup><span>\u2217<\/span><\/sup>FDR\u2009<\u20090.2).<\/p>\n<\/div>\n<p xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\"><a data-test=\"article-link\" data-track=\"click\" data-track-label=\"button\" data-track-action=\"view figure\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w\/figures\/5\" data-track-dest=\"link:Figure5 Full size image\" aria-label=\"Reference 23\"00 rel=\"nofollow\"><span>Full size image<\/span><\/a><\/p>\n<\/figure>\n<\/div>\n<p>Microbiome links with dietary indices were particularly improved by considering uSGBs (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig5\">5a<\/a>); previously, visceral fat and blood lipid levels were generally more strongly microbiome-associated than dietary indices using MetaPhlAn 3 profiles. This was substantiated by the analysis of correlation between the abundance of each uSGB with all 19 host diet, anthropometric and physiology indices. Indeed, the strongest correlations (after accounting for age, sex and BMI; Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig5\">5b<\/a>) mostly involved uSGBs (6 of the 10 SGBs most associated with healthy conditions were uSGBs), and the three highest (absolute) correlations involved Alphaproteobacteria SGB4777, positively correlating with the alternate Mediterranean diet (aMED<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 23\"11 title=\"Fung, T. T. et al. 
Diet-quality scores and plasma concentrations of markers of inflammation and endothelial dysfunction. Am. J. Clin. Nutr. 82, 163\u2013173 (2005).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR72\" id=\"ref-link-section-d30100885e1833\">72<\/a><\/sup>, <i>\u03c1<\/i>\u2009=\u20090.21) and HEI (<i>\u03c1<\/i>\u2009=\u20090.19) scores, and negatively correlating with the unhealthy plant-based dietary index (uPDI; <i>\u03c1<\/i>\u2009=\u2009\u22120.25).<\/p>\n<p>We further compared the SGBs newly linked to diet and biometrics in the ZOE PREDICT 1 re-analysis to those associated with other health and disease conditions in the MetaPhlAn 4 profiles of our broader human gut dataset. Among the ten uSGBs most health-associated based on the average correlation ranks with the 19 reference markers selected from the ZOE PREDICT 1 study, <i>Lachnospiraceae<\/i> SGB4894 emerged as a particularly relevant taxon. This uSGB was prevalent both in contemporary human cohorts (44.33% prevalence in healthy individuals) and in nonhuman primates (41.36% prevalence). It was also present in 60% of the metagenomes available from ancient stool samples (Supplementary Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">8a<\/a>), suggesting that this taxon is an important, as-yet-uncharacterized member of the healthy human microbiome.<\/p>\n<p>When comparing the relative abundances of <i>Lachnospiraceae<\/i> SGB4894 in case\/control studies across datasets spanning 11 different human diseases (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>; Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">20<\/a>), we found statistically significant associations not only with conditions directly linked with cardiometabolic health such as ASCVD (<i>P<\/i>\u2009=\u20090.045) and cirrhosis (<i>P<\/i>\u2009=\u20099.20\u2009\u00d7\u200910<sup>\u22127<\/sup>) but also with the inflammatory bowel diseases (IBD; Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig6\">6a<\/a>). This included associations, replicated over three different cohorts, of a lower abundance and prevalence of <i>Lachnospiraceae<\/i> SGB4894 with both of the main IBD subtypes, Crohn\u2019s disease (<i>P<\/i>\u2009=\u20092.50\u2009\u00d7\u200910<sup>\u221228<\/sup>, 4.67\u2009\u00d7\u200910<sup>\u22126<\/sup> and 0.0016) and ulcerative colitis (<i>P<\/i>\u2009=\u20091.85\u2009\u00d7\u200910<sup>\u221222<\/sup>, 3.89\u2009\u00d7\u200910<sup>\u22126<\/sup> and 1.28\u2009\u00d7\u200910<sup>\u22128<\/sup>). 
Altogether, these results show the importance of profiling the unknown fraction of the microbiome even for relatively well-characterized environments, such as the human gut, as microbial links with cardiometabolic blood metabolites, dietary patterns and host diseases can also incorporate and shed light on newly defined uSGBs.<\/p>\n<div data-test=\"figure\" data-container-section=\"figure\" id=\"figure-6\" data-title=\"StrainPhlAn 4 accurately reconstructs large-scale strain-level phylogenies of uncharacterized microbial species.\">\n<figure><figcaption><b id=\"Fig6\" data-test=\"figure-caption-text\">Fig. 6: StrainPhlAn 4 accurately reconstructs large-scale strain-level phylogenies of uncharacterized microbial species.<\/b><\/figcaption><div>\n<div><a data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w\/figures\/6\" rel=\"nofollow\"><picture><source type=\"image\/webp\" ><img decoding=\"async\" aria-describedby=\"Fig6\" src=\"http:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41587-023-01688-w\/MediaObjects\/41587_2023_1688_Fig6_HTML.png\" alt=\"Science &amp; Nature figure 6\" loading=\"lazy\" width=\"685\" height=\"528\"><\/picture><\/a><\/div>\n<p><b>a<\/b>, Relative abundances (box plots and top-part <i>y<\/i> axis) and prevalences (bar plots and bottom-part <i>y<\/i> axis) of the uncharacterized species (uSGB) <i>Lachnospiraceae<\/i> SGB4894 are substantially higher in healthy individuals (<i>n<\/i>\u2009=\u2009738 samples) in comparison with patients suffering from several gastrointestinal related diseases (<i>n<\/i>\u2009=\u20091,183 samples), and this difference is reproducible across populations (one-sided Mann\u2013Whitney <i>U<\/i> test). Box plots show the median (center), 25th\/75th percentile (lower\/upper hinges), 1.5\u00d7 interquartile range (whiskers) and outliers (points). 
<b>b<\/b>, <i>Lachnospiraceae<\/i> SGB4894 shows within-species genetic diversity strongly linked to geographic origin and lifestyle. <b>c<\/b>, Pairwise geographic distances between strains of different countries correlate with their median genetic distances (Spearman\u2019s <i>\u03c1<\/i>\u2009=\u20090.505; see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>), suggesting that human <i>Lachnospiraceae<\/i> SGB4894 strains could have followed an isolation-by-distance pattern.<\/p>\n<\/div>\n<p xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\"><a data-test=\"article-link\" data-track=\"click\" data-track-label=\"button\" data-track-action=\"view figure\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w\/figures\/6\" data-track-dest=\"link:Figure6 Full size image\" aria-label=\"Reference 23\"22 rel=\"nofollow\"><span>Full size image<\/span><\/a><\/p>\n<\/figure>\n<\/div>\n<h3 id=\"Sec10\">StrainPhlAn 4 reconstructs large phylogenies of uSGBs<\/h3>\n<p>The unique clade-specific marker genes exploited by MetaPhlAn to detect and quantify microbial taxa can also be used to reconstruct the sample-specific genetic makeup of individual strains with the StrainPhlAn approach<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 23\"33 title=\"Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR4\" id=\"ref-link-section-d30100885e1968\">4<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 23\"44 title=\"Truong, D. T., Tett, A., Pasolli, E., Huttenhower, C. &#038; Segata, N. 
Microbial strain-level population structure and genetic diversity from metagenomes. Genome Res. 27, 626\u2013638 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR73\" id=\"ref-link-section-d30100885e1971\">73<\/a><\/sup>. MetaPhlAn 4 also extends StrainPhlAn 4 to be applicable to SGBs, and thus to uncharacterized species (uSGBs). StrainPhlAn 4 uses the MetaPhlAn 4 mapping of reads against markers to produce per-sample genotypes for the dominant strains per species (for all SGBs with sufficient coverage). Compared to StrainPhlAn 3, we improved the procedure to select and process markers and samples with a more robust and validated set of default parameters and a more stringent gap-trimming strategy. We also exploited the larger marker database of more phylogenetically consistent SGBs (avg., 189\u2009\u00b1\u200934 markers per SGB). This resulted in more accurate phylogenies compared to the previous version, with an average increase of 1.33% in correlation between StrainPhlAn phylogenetic distances and MAG-based phylogenies built on the fraction of samples in which high-quality MAGs could be reconstructed (evaluation done on 100 samples for the three most prevalent kSGBs with consistent MetaPhlAn 3 species; Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">21<\/a> and Supplementary Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">9a\u2013f<\/a>; see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>).<\/p>\n<p>To illustrate the potential of StrainPhlAn profiling for uSGBs, we continued our exploration of the health-linked <i>Lachnospiraceae<\/i> SGB4894 introduced above, exploiting the same collection of 19.5\u2009k gut metagenomic samples used for MetaPhlAn 4 (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">12<\/a>). This analysis incorporated all 5.8\u2009k samples in which MetaPhlAn 4 detected <i>Lachnospiraceae<\/i> SGB4894, including 79 nonhuman primates and 12 ancient human gut metagenomes (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">22<\/a>). 
StrainPhlAn 4 retained 37 SGB4894-specific marker genes (spanning 19,449 nucleotide positions after trimming the alignment to exclude nonvariable positions) across the 1,683 samples in which the target uSGB had enough coverage for strain profiling (samples with at least 20 <i>Lachnospiraceae<\/i> SGB4894 markers reconstructed with >80% breadth of coverage) and automatically built a phylogeny integrating all strain profiles across host types.<\/p>\n<p>The resulting phylogeny showed that <i>Lachnospiraceae<\/i> SGB4894 is composed of multiple subclades, including one comprising strains mostly from individuals from westernized populations and two others instead dominated by individuals from nonwesternized or Chinese populations, the latter also with higher intraclade diversity (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig6\">6b<\/a>). One strain reconstructed from a sample of palaeofaeces from ~1,300 years ago<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 74\" title=\"Hagan, R. W. et al. Comparison of extraction methods for recovering ancient microbial DNA from paleofeces. Am. J. Phys. Anthropol. 171, 275\u2013284 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR74\" id=\"ref-link-section-d30100885e2011\">74<\/a><\/sup> was also integrated within the <i>Lachnospiraceae<\/i> SGB4894 phylogeny and placed as basal for the subclade of mostly European and North American strains (Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig6\">6b<\/a>), whereas the strains from nonhuman primates tended to populate a common, divergent region of the tree.<\/p>\n<p><i>Lachnospiraceae<\/i> SGB4894\u2019s phylogeny further demonstrated genetic structure linked to the geographic origin of the hosts (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig6\">6b<\/a>). Indeed, when considering pairs of strains sampled in different countries, we found a correlation between geographic and median genetic distance (Spearman\u2019s <i>\u03c1<\/i>\u2009=\u20090.505) that can be used to hypothesize isolation-by-distance effects<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 75\" title=\"Wright, S. Isolation by distance. Genetics 28, 114\u2013138 (1943).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR75\" id=\"ref-link-section-d30100885e2032\">75<\/a><\/sup>, as previously shown for <i>Helicobacter pylori<\/i><sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 76\" title=\"Linz, B. et al. An African origin for the intimate association between humans and Helicobacter pylori. Nature 445, 915\u2013918 (2007).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR76\" id=\"ref-link-section-d30100885e2038\">76<\/a><\/sup> and <i>Eubacterium rectale<\/i><sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 55\" title=\"Karcher, N. et al. 
Analysis of 1321 Eubacterium rectale genomes from metagenomes uncovers complex phylogeographic population structure and subspecies functional adaptations. Genome Biol. 21, 138 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR55\" id=\"ref-link-section-d30100885e2045\">55<\/a><\/sup> (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig6\">6c<\/a>). Correspondingly, SGB4894 had higher intrapopulation genetic variability in nonwesternized populations (Mann\u2013Whitney <i>U<\/i> test, <i>P<\/i>\u2009<\u20092.22\u2009\u00d7\u200910<sup>\u221246<\/sup>; Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">8b<\/a>) and higher intrasubject polymorphism rates (calculated as the percentage of bases in the reconstructed markers with an allele dominance below 80%; Mann\u2013Whitney <i>U<\/i> test, <i>P<\/i>\u2009=\u20098.6\u2009\u00d7\u200910<sup>\u221214<\/sup>; Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">8c<\/a>). StrainPhlAn 4 thus readily enabled phylogenetic reconstruction and population genetics for uncultivated, yet-to-be-named species with high precision (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>; Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">21<\/a> and Supplementary Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">9g,h<\/a>).<\/p>\n<p>StrainPhlAn 4 also allows the analysis of strain sharing and transmission between communities<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\" title=\"Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR4\" id=\"ref-link-section-d30100885e2089\">4<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 25\" title=\"Ferretti, P. et al. Mother-to-infant microbial transmission from different body sites shapes the developing infant gut microbiome. Cell Host Microbe 24, 133\u2013145 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR25\" id=\"ref-link-section-d30100885e2092\">25<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 28\" title=\"Ianiro, G. et al. Faecal microbiota transplantation for the treatment of diarrhoea induced by tyrosine-kinase inhibitors in patients with metastatic renal cell carcinoma. Nat. Commun. 11, 4333 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR28\" id=\"ref-link-section-d30100885e2095\">28<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 73\" title=\"Truong, D. T., Tett, A., Pasolli, E., Huttenhower, C. &#038; Segata, N. Microbial strain-level population structure and genetic diversity from metagenomes. Genome Res. 
27, 626\u2013638 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR73\" id=\"ref-link-section-d30100885e2098\">73<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Shao, Y. et al. Stunted microbiota and opportunistic pathogen colonization in caesarean-section birth. Nature 574, 117\u2013121 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR77\" id=\"ref-link-section-d30100885e2101\">77<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Valles-Colomer, M. et al. Variation and transmission of the human gut microbiota across multiple familial generations. Nat. Microbiol. 7, 87\u201396 (2022).\" href=\"http:\/\/www.nature.com\/#ref-CR78\" id=\"ref-link-section-d30100885e2101_1\">78<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 79\" title=\"Ianiro, G. et al. Variability of strain engraftment and predictability of microbiome composition after fecal microbiota transplantation across different diseases. Nat. Med. 28, 1913\u20131923 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR79\" id=\"ref-link-section-d30100885e2104\">79<\/a><\/sup> for uncharacterized species, that is, uSGBs (see <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"section anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Sec12\">Methods<\/a>). Notably, StrainPhlAn 4 estimated that strains of <i>Lachnospiraceae<\/i> SGB4894 were not shared between mothers and their <1-year-old infants in all 21 cases in which it was reliably detected in both relatives (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">8d<\/a>). 
Similarly, only 5.63% of adults in the same household that were positive for <i>Lachnospiraceae<\/i> SGB4894 shared the same strain (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">8d<\/a>), suggesting that stable vertical and horizontal transmission of this species are both rare. There is some evidence for horizontal transmission between host species, however, as we found two captive nonhuman primates sharing closely related <i>Lachnospiraceae<\/i> SGB4894 strains with humans (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#Fig6\">6b<\/a>). Overall, this example shows that the extension of StrainPhlAn 4 to incorporate SGBs alongside MetaPhlAn 4 enables the analysis of highly resolved, subspecies-level phylogenies for both well-characterized and yet-to-be-cultivated microbial species.<\/p>\n<\/div>\n<\/div>\n<div id=\"Sec11-section\" data-title=\"Discussion\">\n<h2 id=\"Sec11\">Discussion<\/h2>\n<div id=\"Sec11-content\">\n<p>MetaPhlAn 4 provides a strategy for integrating metagenomic assembly with reference-based profiling approaches, achieving novelty by incorporating diverse high-quality metagenome assemblies, and sensitivity and specificity by using refined mapping to prescreened marker sequences. This strategy leverages multiple recent large efforts in metagenomically cataloging microbial diversity<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Tully, B. J., Graham, E. D. &#038; Heidelberg, J. F. The reconstruction of 2,631 draft metagenome-assembled genomes from the global oceans. Sci. 
Data 5, 170203 (2018).\" href=\"http:\/\/www.nature.com\/#ref-CR37\" id=\"ref-link-section-d30100885e2139\">37<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Manara, S. et al. Microbial genomes from non-human primate gut metagenomes expand the primate-associated bacterial tree of life with over 1000 novel species. Genome Biol. 20, 299 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR38\" id=\"ref-link-section-d30100885e2139_1\">38<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Stewart, R. D. et al. Compendium of 4,941 rumen metagenome-assembled genomes for rumen microbiome biology and enzyme discovery. Nat. Biotechnol. 37, 953\u2013961 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR39\" id=\"ref-link-section-d30100885e2139_2\">39<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Nayfach, S., Shi, Z. J., Seshadri, R., Pollard, K. S. &#038; Kyrpides, N. C. New insights from uncultivated genomes of the global human gut microbiome. Nature 568, 505\u2013510 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR40\" id=\"ref-link-section-d30100885e2139_3\">40<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Almeida, A. et al. A new genomic blueprint of the human gut microbiota. Nature 568, 499\u2013504 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR41\" id=\"ref-link-section-d30100885e2139_4\">41<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. 
Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/#ref-CR42\" id=\"ref-link-section-d30100885e2139_5\">42<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Nayfach, S. et al. A genomic catalog of Earth\u2019s microbiomes. Nat. Biotechnol. \n                https:\/\/doi.org\/10.1038\/s41587-020-0718-6\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR43\" id=\"ref-link-section-d30100885e2139_6\">43<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Lesker, T. R. et al. An integrated metagenome catalog reveals new insights into the murine gut microbiome. Cell Rep. 30, 2909\u20132922 (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR44\" id=\"ref-link-section-d30100885e2139_7\">44<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Almeida, A. et al. A unified catalog of 204,938 reference genomes from the human gut microbiome. Nat. Biotechnol. \n                https:\/\/doi.org\/10.1038\/s41587-020-0603-3\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/#ref-CR45\" id=\"ref-link-section-d30100885e2139_8\">45<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 46\" title=\"Levin, D. et al. Diversity and functional landscapes in the microbiota of animals in the wild. 
Science 372, eabb5352 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR46\" id=\"ref-link-section-d30100885e2142\">46<\/a><\/sup>, organizing over 1\u2009M prokaryotic sequences into species-level genome bins, improving the diversity of the microbiome types in comparison to the current biases in available databases and efficiently using them to profile new metagenomes using a marker-based strategy. This approach improved the resolution of health-associated biomarkers and enabled phylogenetic reconstruction and population genetics inference for both known and uncharacterized taxa across tens of thousands of shotgun metagenomes spanning dozens of distinct environments.<\/p>\n<p>Notably, even with the extended MetaPhlAn 4 SGB and marker set, further work remains to better profile under-characterized habitats. Environmental, nonhost-associated, and other under-studied microbial communities are still highly enriched for sequences not captured even by current uSGBs, although the algorithm and software architecture can be continuously updated as new MAGs become available. Indeed, we plan to release at least two new MetaPhlAn databases per year, substantially expanding the profilable microbial diversity. The current methods also do not extensively incorporate viral or eukaryotic microbial sequences, due to their unique genomic architectures and quality control requirements relative to bacterial and archaeal genomes. Interestingly, because SGBs represent essentially whole-genome OTU clusters<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 80\" title=\"Hamady, M. &#038; Knight, R. Microbial community profiling for human microbiome projects: tools, techniques and challenges. Genome Res. 
19, 1141\u20131152 (2009).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR80\" id=\"ref-link-section-d30100885e2149\">80<\/a><\/sup>, many related downstream statistical challenges also remain to be addressed; for example, the tradeoff between sensitivity and specificity when applying quality control measures to identify real but rare taxa. Another important aspect of increasing relevance in current metagenomic research is the phylogenetic and taxonomic contextualization of under-characterized species, specifically uSGBs. While MetaPhlAn 4 has been designed to provide taxonomic labels corresponding to the part of the taxonomy that can be confidently transferred from the closest (if any) reference genomes, and PhyloPhlAn<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 81\" title=\"Asnicar, F. et al. Precise phylogenetic analysis of microbial isolates and genomes from metagenomes using PhyloPhlAn 3.0. Nat. Commun. 11, 2500 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR81\" id=\"ref-link-section-d30100885e2153\">81<\/a><\/sup> provides specific workflows for phylogenetic characterization, further integration of isolate genomes and new methods for defining taxonomic clades above the level of the microbial family are still needed. We expect to continue addressing these challenges in future versions of the methodology, which will also form the basis for other MAG-aware updates of the bioBakery platform<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\" title=\"Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. 
eLife 10, e65088 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR4\" id=\"ref-link-section-d30100885e2157\">4<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 82\" title=\"McIver, L. J. et al. bioBakery: a meta\u2019omic analysis environment. Bioinformatics 34, 1235\u20131237 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR82\" id=\"ref-link-section-d30100885e2160\">82<\/a><\/sup>.<\/p>\n<\/div>\n<\/div>\n<div id=\"Sec12-section\" data-title=\"Methods\">\n<h2 id=\"Sec12\">Methods<\/h2>\n<div id=\"Sec12-content\">\n<h3 id=\"Sec13\">Overview of the approach<\/h3>\n<p>MetaPhlAn 4 taxonomic profiling relies on detecting the presence and estimating the coverage of a collection of species-specific marker genes to estimate the relative abundance of known and unknown microbial taxa in shotgun metagenomic samples. Since version 4, MetaPhlAn has relied on the concept of sequence-defined species-level genome bins (SGBs)<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. 
Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e2176\">42<\/a><\/sup>, which addresses many limitations of manual taxonomy assignment and encompasses both taxonomic units with available reference genomes from cultivation (kSGBs) and taxa defined solely on the basis of metagenome-assembled genomes (uSGBs).<\/p>\n<p>As a brief summary of the approach (details in the following subsections), to build the MetaPhlAn database of SGB-specific markers, we collected a catalog of 729,195 dereplicated and quality-controlled genomes (560,084 MAGs and 169,111 reference genomes) that was used to expand the SGB organization by Pasolli et al. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e2183\">42<\/a><\/sup>. This led to the definition of 21,373 family-level genome bins (FGBs), 47,643 genus-level genome bins (GGBs) and 70,927 SGBs, of which 23,737 contained at least one reference genome (kSGBs) and 47,190 contained only MAGs (uSGBs). To minimize the chance that SGBs incorporate assembly artifacts or chimeric sequences, we considered only those uSGBs with at least five MAGs (no filtering for kSGBs). The genome catalog was then annotated using the UniRef90 database<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 58\" title=\"Suzek, B. E. et al. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. 
Bioinformatics 31, 926\u2013932 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR58\" id=\"ref-link-section-d30100885e2187\">58<\/a><\/sup> (see below) and, within each SGB, the genes that could not be assigned to UniRef90 gene families were de novo clustered together using the UniClust90 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 59\" title=\"Mirdita, M. et al. Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Res. 45, D170\u2013D176 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR59\" id=\"ref-link-section-d30100885e2191\">59<\/a><\/sup>) criteria (>90% identity and >80% coverage of the cluster centroid). Using the resulting UniRef90 and UniClust90 annotations, we defined a set of core genes for each quality-controlled SGB (genes present in almost all genomes composing an SGB), and after mapping all core genes against the entire genomic catalog, we defined a set of 5.1\u2009M SGB-specific marker genes (core genes not present in any other SGB) for a total of 21,978 kSGBs and 4,992 uSGBs.<\/p>\n<p>For the taxonomic profiling step that uses the markers based on the SGB data, MetaPhlAn 4 maps metagenomic reads (preferably already quality controlled) against the marker database using Bowtie 2 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 83\" title=\"Langmead, B. &#038; Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357\u2013359 (2012).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR83\" id=\"ref-link-section-d30100885e2198\">83<\/a><\/sup>). 
From these mapping results, MetaPhlAn estimates the coverage of each marker and computes the clade\u2019s coverage as the robust average of the coverage across the markers of the same clade. Finally, the clade coverages are normalized across all detected clades to obtain the relative abundance of each taxon. Several downstream analyses are included in the MetaPhlAn package, including the strain-level phylogenetic profiling of SGBs by StrainPhlAn.<\/p>\n<h3 id=\"Sec14\">The starting catalog of reference genomes and MAGs<\/h3>\n<p>Starting from the original catalog of 154,723 human MAGs and 80,990 reference genomes collected by Pasolli et al. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e2210\">42<\/a><\/sup>, we retrieved an additional set of 616,805 MAGs spanning different human body sites, animal hosts and nonhost-associated environments (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">2<\/a>), and 155,767 new reference genomes available as of November 2020 in the NCBI GenBank database<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 84\" title=\"Benson, D. A. et al. GenBank. Nucleic Acids Res. 41, D36\u2013D42 (2012).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR84\" id=\"ref-link-section-d30100885e2217\">84<\/a><\/sup>. 
To ensure the quality of the downloaded sequences, we executed CheckM version 1.1.4 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 85\" title=\"Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. &#038; Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043\u20131055 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR85\" id=\"ref-link-section-d30100885e2221\">85<\/a><\/sup>) on the complete catalog of 1,008,148 genomes (that is, reference sequences and MAGs), filtering out those with completeness below 50% or contamination above 5%. To avoid multiple inclusions of the same strains, we computed the all-versus-all MASH distances<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 54\" title=\"Ondov, B. D. et al. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol. 17, 132 (2016).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR54\" id=\"ref-link-section-d30100885e2225\">54<\/a><\/sup> (version 2.0) on the quality-controlled sequences, followed by dereplication at 99.99% genetic identity. This resulted in a quality-controlled catalog of 729,195 genomes, comprising 560,084 MAGs and 169,111 reference genomes.<\/p>\n<h3 id=\"Sec15\">Building the expanded SGB catalog<\/h3>\n<p>Using the new genomic catalog, we expanded the SGB organization proposed by Pasolli et al. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. 
Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e2237\">42<\/a><\/sup>. First, we applied the \u2018<i>phylophlan_metagenomic<\/i>\u2019 subroutine of PhyloPhlAn 3 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 81\" title=\"Asnicar, F. et al. Precise phylogenetic analysis of microbial isolates and genomes from metagenomes using PhyloPhlAn 3.0. Nat. Commun. 11, 2500 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR81\" id=\"ref-link-section-d30100885e2244\">81<\/a><\/sup>) on the 493,482 new MAGs and reference genomes to identify their closest SGB, GGB and FGB and their MASH distances. Based on the reported distances, we assigned the genomes to the already existing SGBs, GGBs and FGBs according to the thresholds defined by Pasolli et al. (5%, 15% and 30% genetic distance, respectively)<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649\u2013662 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR42\" id=\"ref-link-section-d30100885e2248\">42<\/a><\/sup>. We then applied hierarchical clustering with average linkage on the all-versus-all MASH distances of the genomes not assigned to any existing SGB, using the \u2018fastcluster\u2019 Python package version 1.1.25. The resulting dendrogram was divided with cutoffs at 5%, 15% and 30% genetic distance to define 54,596 new SGBs, 37,546 new GGBs and 18,211 new FGBs, respectively. 
In short, from the initial filtered catalog of 729,195 MAGs and reference genomes, we defined 21,373 FGBs, 47,643 GGBs and 70,927 SGBs, of which 23,737 contained at least one reference genome (kSGBs) and 47,190 contained only MAGs (uSGBs; Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">1<\/a>). In comparison with the largest recent MAG collections<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 43\" title=\"Nayfach, S. et al. A genomic catalog of Earth\u2019s microbiomes. Nat. Biotechnol. \n                https:\/\/doi.org\/10.1038\/s41587-020-0718-6\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR43\" id=\"ref-link-section-d30100885e2256\">43<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 45\" title=\"Almeida, A. et al. A unified catalog of 204,938 reference genomes from the human gut microbiome. Nat. Biotechnol. \n                https:\/\/doi.org\/10.1038\/s41587-020-0603-3\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR45\" id=\"ref-link-section-d30100885e2259\">45<\/a><\/sup>, our genome catalog spans 5,092 more kSGBs and 19,121 more uSGBs.<\/p>\n<p>We assigned a taxonomic label to all 70,927 SGBs according to the NCBI taxonomy database (as of February 2021)<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 49\" title=\"Schoch, C. L. et al. NCBI taxonomy: a comprehensive update on curation, resources and tools. 
Database 2020, baaa062 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR49\" id=\"ref-link-section-d30100885e2266\">49<\/a><\/sup>. For kSGBs, we assigned taxonomy by applying a majority rule to the taxonomic labels of the reference genomes contained in each SGB. In case of a tie, the taxonomic label was resolved by choosing the alphabetically first taxon as the representative. For uSGBs, we applied a similar majority rule but on the taxonomies of the reference genomes contained at the GGB level, assigning a taxonomic label up to the genus level. If no reference genomes were present at the GGB level, we further applied the same procedure at the FGB level. If no reference genomes were found at the FGB level, we assigned the taxonomic labels only up to the phylum level by considering the phylum that is most recurrent within the set of taxonomic labels of the closest reference genome and, at most, up to one hundred reference genomes within 5% genomic distance to the closest as identified by \u2018<i>phylophlan_metagenomic<\/i>\u2019. For the taxonomic levels not receiving any taxonomy label, we assigned all the internal taxonomic nodes SGB, GGB and FGB identifiers to maintain the taxonomy with all its levels and to provide a categorization of uSGBs.<\/p>\n<h3 id=\"Sec16\">Genome annotation and pangenome generation<\/h3>\n<p>The filtered catalog of 729,195 MAGs and reference genomes was subjected to an annotation workflow, in which (1) the FASTA files were processed with Prokka (version 1.14)<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 86\" title=\"Seemann, T. Prokka: rapid prokaryotic genome annotation. 
Bioinformatics 30, 2068\u20132069 (2014).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR86\" id=\"ref-link-section-d30100885e2281\">86<\/a><\/sup> to detect and annotate the coding sequences (CDS) and (2) the CDS were subsequently assigned to a UniRef90 cluster<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 58\" title=\"Suzek, B. E. et al. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 31, 926\u2013932 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR58\" id=\"ref-link-section-d30100885e2285\">58<\/a><\/sup> using a DIAMOND-based pipeline (available at <a href=\"https:\/\/github.com\/biobakery\/uniref_annotator\">https:\/\/github.com\/biobakery\/uniref_annotator<\/a>). The DIAMOND-based pipeline performs a sequence search (DIAMOND version 0.9.24)<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 87\" title=\"Buchfink, B., Xie, C. &#038; Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59\u201360 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR87\" id=\"ref-link-section-d30100885e2296\">87<\/a><\/sup> of the protein sequences against the UniRef90 database (release 2019_06) and then applies the UniRef90 inclusion criteria on the mapping results to annotate the input sequences (>90% identity and >80% coverage of the cluster centroid). Within each SGB, protein sequences that were not assigned to any UniRef90 cluster were clustered using MMseqs2 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 88\" title=\"Steinegger, M. &#038; S\u00f6ding, J. 
MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat. Biotechnol. 35, 1026\u20131028 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR88\" id=\"ref-link-section-d30100885e2300\">88<\/a><\/sup>) following the Uniclust90 criteria (\u2018<i>-c 0.80 --min-seq-id 0.9<\/i>\u2019 parameters)<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 59\" title=\"Mirdita, M. et al. Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Res. 45, D170\u2013D176 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR59\" id=\"ref-link-section-d30100885e2308\">59<\/a><\/sup>.<\/p>\n<p>For each SGB, based on the UniRef90 and UniClust90 annotations, a pangenome was generated by collecting all the UniRef\/UniClust90 clusters present in at least one of the SGB\u2019s genomes. For each cluster, the representative sequence was randomly selected within all the genomes and a coreness value was calculated based on the cluster prevalence within the 2\u2009k highest quality genomes of the SGB. uSGBs containing fewer than five MAGs were discarded from the following steps. We implemented this restriction because we found evidence that some of the small uSGBs contained likely assembly artifacts or chimeric genomes, and they were also more likely to generate false positives by failing to omit potential markers that later proved to be ambiguous. 
In this step, 41,498 uSGBs of the 70,927 SGBs were discarded, while all kSGBs were retained as they are represented by theoretically more reliable sequences.<\/p>\n<h3 id=\"Sec17\">The MetaPhlAn 4 vJan21 markers database<\/h3>\n<p>From these pangenomes, the construction of the marker database for MetaPhlAn 4 is divided into two sequential steps: the identification of the core genes within each SGB and the screening of the core genes for their SGB-specificity.<\/p>\n<p>For the identification of the core genes, the procedure first defines a coreness percentage threshold (that is, the percentage prevalence of a gene within the SGB) based on the SGB pangenomes. Specifically, we selected the maximum coreness threshold that allowed the retrieval of at least 800 core genes (of length between 450 and 4,500 nucleotides). The minimum coreness threshold was bounded at 60% for SGBs with fewer than 100 genomes and at 50% for the others. For each SGB, a core gene set was generated using the inferred coreness thresholds. On average, we retrieved 2,985 core genes per SGB (median, 2,687; s.d., 1,861). SGBs with fewer than 200 core genes were discarded and not considered further (9 SGBs).<\/p>\n<p>To detect the SGB-specific marker genes, each set of core genes was then aligned against the genomes of the other SGBs using Bowtie 2 (version 2.3.5.1; <i>--sensitive<\/i> parameter)<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 83\" title=\"Langmead, B. &#038; Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357\u2013359 (2012).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR83\" id=\"ref-link-section-d30100885e2333\">83<\/a><\/sup>. For computational reasons, a subset comprising up to the 100 highest-quality genomes of each SGB was selected for the mapping. 
Each core gene was split into fragments of 150-nt length to simulate metagenomic reads, and these fragments were then mapped against the representative subset of the SGB\u2019s genomes. An alignment hit of a fragment was considered a hit for its corresponding core gene. Core genes hitting none (perfectly unique markers) or less than 1% (quasi-markers) of the genomes of any other SGB, and hitting a fraction of the genomes of their own SGB greater than or equal to their coreness threshold, were selected as marker genes. Crucially, this uniqueness procedure was substantially stricter than those used in previous MetaPhlAn versions owing to the improved consistency of the SGBs compared to original species taxonomic assignments.<\/p>\n<p>The small fraction of SGBs producing fewer than 100 marker genes (810 SGBs) was subjected to the following workflow:<\/p>\n<ol>\n<li>\n                  <span>1.<\/span><\/p>\n<p>If more than 200 core genes of the target SGB matched an external SGB (a kSGB belonging to the same species, or a uSGB) and if the external SGB had fewer than 10% of the number of genomes in the target SGB, then the external SGB was discarded (this occurred for 392 kSGBs and 150 uSGBs). This step was repeated every time an external SGB was removed until the target SGB produced 100 marker genes or there were no more external SGBs that could be evaluated. In the latter case, the removal of the external SGBs was rolled back.<\/p>\n<\/li>\n<li>\n                  <span>2.<\/span><\/p>\n<p>If fewer than ten marker genes could still be identified for the target SGB, external SGBs with low-quality species taxonomic labels were discarded (this occurred for 822 kSGBs and 286 uSGBs). 
Specifically, the regular expressions used to detect low-quality species taxonomic labels are<\/p>\n<p>\u2018(C|c)andidat(e|us) | _sp(_.*|$) | (.*_|^)(b|B)acterium(_.*|) |.*(eury|)archaeo(n_|te|n$).* |.*(endo|)symbiont.* |.*genomosp_.* |.*unidentified.* |.*_bacteria_.* |.*_taxon_.* |.*_et_al_.* |.*_and_.* |.*(cyano|proteo|actino)bacterium_.*\u2019. This step was repeated every time an external SGB was removed until the target SGB produced ten marker genes or there were no more external SGBs that could be evaluated. In the latter case, the removal of the external SGBs was rolled back.<\/p>\n<\/li>\n<li>\n                  <span>3.<\/span><\/p>\n<p>For the SGBs that still did not produce at least ten marker genes, a conflict graph was generated collecting all the core gene hits against external SGBs in which more than 200 core genes were in conflict. The graph was then processed by merging SGBs with a procedure that minimizes the number of merged SGBs and maximizes the number of markers retrieved. After this process, 849 SGBs were merged, producing 237 SGB groups.<\/p>\n<\/li>\n<\/ol>\n<p>Finally, for each SGB, a maximum of 200 marker genes were selected based first on their uniqueness and then on their size (longer first). SGBs that still had fewer than ten markers were discarded (188 SGBs). Each marker was associated with an entry in the MetaPhlAn 4 vJan21 database, which includes the SGB for which the sequence is a marker, the list of SGBs sharing the marker, the sequence length, and the taxonomy of the SGB. This produced a list of 5.1\u2009M marker genes for a total of 21,978 kSGBs and 4,992 uSGBs (4,863 kSGBs and 1,198 uSGBs still not captured by the latest and largest genome catalogs<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 43\" title=\"Nayfach, S. et al. A genomic catalog of Earth\u2019s microbiomes. Nat. Biotechnol. 
\n                https:\/\/doi.org\/10.1038\/s41587-020-0718-6\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR43\" id=\"ref-link-section-d30100885e2382\">43<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 45\" title=\"Almeida, A. et al. A unified catalog of 204,938 reference genomes from the human gut microbiome. Nat. Biotechnol. \n                https:\/\/doi.org\/10.1038\/s41587-020-0603-3\n                \n               (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR45\" id=\"ref-link-section-d30100885e2385\">45<\/a><\/sup>).<\/p>\n<h3 id=\"Sec18\">MetaPhlAn 4 taxonomic profiling<\/h3>\n<p>MetaPhlAn 4 taxonomic profiling uses read homology to, and coverage of, SGB-specific markers to estimate the relative abundance of taxonomic clades present in a metagenomic sample. The MetaPhlAn pipeline starts by mapping the raw reads of metagenomic samples against the SGB-specific markers\u2019 database using Bowtie 2 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 83\" title=\"Langmead, B. &#038; Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357\u2013359 (2012).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR83\" id=\"ref-link-section-d30100885e2397\">83<\/a><\/sup>). Input metagenomic reads can be provided as a single FASTQ file (compressed with several algorithms), multiple FASTQ files included in a single (compressed) archive, or as a precomputed mapping (<i>bowtie2out<\/i> format). By default, the Bowtie 2 mapping is performed using the \u2018<i>--very-sensitive<\/i>\u2019 preset. 
For read-mapping quality purposes, short reads (reads shorter than 70\u2009bp; \u2018<i>--read_min_len<\/i>\u2019 parameter) and low-quality alignments (alignments with a MAPQ value lower than 5; \u2018<i>--min_mapq_val<\/i>\u2019 parameter) are discarded.<\/p>\n<p>Using the quality-controlled mapping results, MetaPhlAn estimates the coverage of each marker and computes the clade\u2019s coverage as the robust average of the coverage across the markers of the same clade, excluding the top and bottom quantiles of the marker abundances (\u2018<i>--stat_q<\/i>\u2019 parameter). For the SGB profiling, this parameter is set by default to 0.2, thus excluding the 20% of markers with the highest abundance and the 20% of markers with the lowest abundance. The coverage of quasi-markers is excluded from this computation when at least 33% (default value, \u2018<i>--perc_nonzero<\/i>\u2019 parameter) of the markers of their respective external SGB were present. The clade\u2019s coverages are finally normalized across all detected clades to obtain the relative abundance of each taxon as previously described (refs. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 2\" title=\"Segata, N. et al. Metagenomic microbial community profiling using unique clade-specific marker genes. Nat. Methods 9, 811\u2013814 (2012).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR2\" id=\"ref-link-section-d30100885e2422\">2<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 3\" title=\"Truong, D. T. et al. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nat. 
Methods 12, 902\u2013903 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR3\" id=\"ref-link-section-d30100885e2425\">3<\/a><\/sup>).<\/p>\n<h3 id=\"Sec19\">MetaPhlAn 4 compatibility with the GTDB taxonomy<\/h3>\n<p>MetaPhlAn 4 supports additional taxonomies via genome and MAG matching against other systems. We specifically implemented the mapping of the MetaPhlAn 4 SGB-based taxonomic profiles to those based on the species in the GTDB<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 63\" title=\"Parks, D. H. et al. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res. 50, D785\u2013D794 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR63\" id=\"ref-link-section-d30100885e2437\">63<\/a><\/sup>. This is available via the utility script \u2018<i>sgb_to_gtdb_profile.py<\/i>\u2019 included in the version 4 release. To assign each SGB to a GTDB species, we used the GTDB-Tk taxonomic classification tool (version 2.1.1)<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 89\" title=\"Chaumeil, P.-A., Mussig, A. J., Hugenholtz, P. &#038; Parks, D. H. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. 
Bioinformatics \n                https:\/\/doi.org\/10.1093\/bioinformatics\/btz848\n                \n               (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR89\" id=\"ref-link-section-d30100885e2444\">89<\/a><\/sup> to assign a GTDB-defined species (release 207) to each centroid genome of the 26,970 SGBs included in the MetaPhlAn 4 database.<\/p>\n<h3 id=\"Sec20\">MetaPhlAn 4 unclassified reads calculation<\/h3>\n<p>MetaPhlAn 4 includes a feature for estimating the fraction of input reads that cannot be assigned to taxa in the database (\u2018<i>--unclassified_estimation<\/i>\u2019 parameter). This is calculated by subtracting from the total number of input reads the average read depth of each reported SGB normalized by its SGB-specific average genome length as follows:<\/p>\n<div id=\"Equa\">\n<p><span>$$\\begin{array}{l}\\%\\,\\mathrm{uncl.}\\,\\mathrm{reads} = \\\\ \\frac{\\mathrm{Total}\\,\\mathrm{reads} - \\left( \\sum\\nolimits_{\\mathrm{sp} = 0}^{n} \\left( \\mathrm{avg}\\,\\mathrm{nonzero}\\,\\mathrm{markers}\\,\\mathrm{coverage}_{\\mathrm{sp}} \\times \\mathrm{avg}\\,\\mathrm{genome}\\,\\mathrm{length}_{\\mathrm{sp}} \\right) \\right) \/ \\mathrm{avg}\\,\\mathrm{read}\\,\\mathrm{length}}{\\mathrm{Total}\\,\\mathrm{reads}}\\end{array}$$<\/span><\/p>\n<\/div>\n<div id=\"Equb\">\n<p><span>$$\\mathrm{sp} = \\mathrm{indices}\\,\\mathrm{of}\\,\\mathrm{all}\\,\\mathrm{the}\\,\\mathrm{SGBs}\\,\\mathrm{reported}\\,\\mathrm{in}\\,\\mathrm{the}\\,\\mathrm{MetaPhlAn}\\,\\mathrm{profile}$$<\/span><\/p>\n<\/div>\n<p>The average read depth of an SGB is calculated as the mean read depth of all its detected (nonzero) marker genes. 
The SGB-specific genome length for kSGBs is calculated using only the genome lengths of its reference genomes, while for uSGBs the average genome length is incremented by 7% (calculated to be the average difference between the genome sizes of reference genomes and MAGs within the same SGB).<\/p>\n<h3 id=\"Sec21\">Building the MetaPhlAn 4 tree of life<\/h3>\n<p>The MetaPhlAn 4 package includes the phylogenetic tree of all the SGBs available in the MetaPhlAn database (the \u2018microbial tree of life\u2019; Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">10<\/a>), enabling both the calculation of phylogeny-based beta-diversity estimates between samples such as UniFrac<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 90\" title=\"Lozupone, C. &#038; Knight, R. UniFrac: a new phylogenetic method for comparing microbial communities. Appl. Environ. Microbiol. 71, 8228\u20138235 (2005).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR90\" id=\"ref-link-section-d30100885e2725\">90<\/a><\/sup> (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM1\">4<\/a>), and the further exploration of phylogenetic relations between SGBs. To build the tree, we selected the highest quality genomes for each of the 26,970 SGBs based on CheckM quality estimates. We then executed PhyloPhlAn 3 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 81\" title=\"Asnicar, F. et al. Precise phylogenetic analysis of microbial isolates and genomes from metagenomes using PhyloPhlAn 3.0. Nat. Commun. 
11, 2500 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR81\" id=\"ref-link-section-d30100885e2732\">81<\/a><\/sup>) with the optimized set of parameters for very large phylogenies as described in (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 81\" title=\"Asnicar, F. et al. Precise phylogenetic analysis of microbial isolates and genomes from metagenomes using PhyloPhlAn 3.0. Nat. Commun. 11, 2500 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR81\" id=\"ref-link-section-d30100885e2736\">81<\/a><\/sup>). In particular, PhyloPhlAn performed a DIAMOND<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 87\" title=\"Buchfink, B., Xie, C. &#038; Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59\u201360 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR87\" id=\"ref-link-section-d30100885e2741\">87<\/a><\/sup> mapping (version 0.9.24) against PhyloPhlAn\u2019s database of 400 universal markers, used TrimAl version 1.4.rev15 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 91\" title=\"Capella-Guti\u00e9rrez, S., Silla-Mart\u00ednez, J. M. &#038; Gabald\u00f3n, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972\u20131973 (2009).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR91\" id=\"ref-link-section-d30100885e2745\">91<\/a><\/sup>) for the trimming, MAFFT version 7.475 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 92\" title=\"Katoh, K. &#038; Standley, D. M. 
MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772\u2013780 (2013).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR92\" id=\"ref-link-section-d30100885e2749\">92<\/a><\/sup>) to generate the multiple-sequence alignment, and IQ-TREE version 2.0.3 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 93\" title=\"Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. &#038; Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268\u2013274 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR93\" id=\"ref-link-section-d30100885e2753\">93<\/a><\/sup>) for the phylogenetic reconstruction, together with the PhyloPhlAn presets \u2018<i>--diversity high --fast<\/i>\u2019.<\/p>\n<h3 id=\"Sec22\">MetaPhlAn 4 synthetic evaluation<\/h3>\n<p>We evaluated MetaPhlAn 4 using different published and newly created synthetic metagenomes. First, we assessed the performance of MetaPhlAn 4 in comparison to several available alternatives, that is, MetaPhlAn 3 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\" title=\"Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR4\" id=\"ref-link-section-d30100885e2768\">4<\/a><\/sup>), mOTUs 2.6 (latest database available as of March 2021)<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 6\" title=\"Milanese, A. et al. 
Microbial abundance, activity and population genomic profiling with mOTUs2. Nat. Commun. 10, 1014 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR6\" id=\"ref-link-section-d30100885e2772\">6<\/a><\/sup> and Bracken 2.5 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 5\" title=\"Lu, J., Breitwieser, F. P., Thielen, P. &#038; Salzberg, S. L. Bracken: estimating species abundance in metagenomics data. PeerJ Comput. Sci. 3, e104 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR5\" id=\"ref-link-section-d30100885e2776\">5<\/a><\/sup>). Through the OPAL benchmarking framework<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 61\" title=\"Meyer, F. et al. Assessing taxonomic metagenome profilers with OPAL. Genome Biol. 20, 51 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR61\" id=\"ref-link-section-d30100885e2780\">61<\/a><\/sup>, we evaluated the performance of each tool by profiling the CAMI 2 taxonomic profiling challenge metagenomes<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 60\" title=\"Meyer, F. et al. Tutorial: assessing metagenomics software with the CAMI benchmarking toolkit. Nat. Protoc. \n                https:\/\/doi.org\/10.1038\/s41596-020-00480-3\n                \n               (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR60\" id=\"ref-link-section-d30100885e2784\">60<\/a><\/sup> and SynPhlAn-nonhuman synthetic metagenomes<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\" title=\"Beghini, F. et al. 
Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR4\" id=\"ref-link-section-d30100885e2789\">4<\/a><\/sup>. The CAMI 2 metagenomes include 128 samples representing five human body site-specific microbiomes (that is, airways, oral, the gastrointestinal tract, skin and the urogenital tract), the marine environment and the mouse gut microbiome, while the SynPhlAn-nonhuman metagenomes were designed to mirror the sequencing depth and community structure of the CAMI 2 metagenomes (that is, 30 million, 150-nt paired-end sequencing reads from genomes in kSGBs with a log-normal abundance distribution), but for environments other than the human body.<\/p>\n<p>We ran each tool using default parameters. For mOTUs 2.6, we considered two different settings, and it was thus run twice with parameters \u2018<i>-C recall<\/i>\u2019 and \u2018<i>-C precision<\/i>\u2019 to optimize for recall and precision, respectively. Both parameters are preset configurations of mOTUs 2 created by its developers for the CAMI 2 challenge. Results from Bracken 2.5 were filtered by discarding species reported with a relative abundance below 0.01%. Additionally, to better evaluate the SGB architecture, we performed an alternative evaluation assessing the detection and quantification of the genomes included in the synthetic metagenomes. To this end, we defined (1) \u2018true positive\u2019 as the detection of an SGB containing a genome present in the synthetic metagenome, (2) \u2018false positive\u2019 as the detection of an SGB that does not contain any genome in the metagenome and (3) \u2018false negative\u2019 as the nondetection of an SGB containing a genome present in the synthetic metagenome. Detection of an SGB that represents an overlapping SGB present in the community was also counted as \u2018true positive\u2019. 
For the gold standard, relative abundances were obtained by summing up the relative abundances of the genomes belonging to the same SGB. For MetaPhlAn 3, which contains markers describing species groups, we considered (1) \u2018true positive\u2019 a species group containing a species present in the synthetic metagenome and (2) \u2018false positive\u2019 a species group that does not contain any species present in the synthetic metagenome.<\/p>\n<p>To further assess the performance of MetaPhlAn 4 in profiling both known and unknown SGBs, complementing the synthetic samples from CAMI 2 and SynPhlAn, we constructed additional synthetic metagenomes from different environments, hosts and human body sites using ART<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 94\" title=\"Huang, W., Li, L., Myers, J. R. &#038; Marth, G. T. ART: a next-generation sequencing read simulator. Bioinformatics 28, 593\u2013594 (2012).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR94\" id=\"ref-link-section-d30100885e2805\">94<\/a><\/sup> with the Illumina HiSeq 2500 error model (available at <a href=\"http:\/\/segatalab.cibio.unitn.it\/tools\/metaphlan\/\">http:\/\/segatalab.cibio.unitn.it\/tools\/metaphlan\/<\/a>). For each environment, we simulated five metagenomes containing 30 million, 150-nt paired-end sequencing reads using randomly selected genomes from SGBs containing MAGs coming from that environment (with a restriction of one genome per SGB), and following a log-normal abundance distribution. MetaPhlAn 4 evaluation was then performed by assessing the detection and quantification of the genomes included in the synthetic metagenomes as described above. 
Additionally, to demonstrate that the evaluation was not biased by the usage of genomes included in the genomic catalog, we built, using the same procedure, another five metagenomes using a mixture of new MAGs and reference genomes not included in our genomic database. SGB assignment of the new genomes was performed using the \u2018<i>phylophlan_metagenomic<\/i>\u2019 subroutine of PhyloPhlAn 3 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 81\" title=\"Asnicar, F. et al. Precise phylogenetic analysis of microbial isolates and genomes from metagenomes using PhyloPhlAn 3.0. Nat. Commun. 11, 2500 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR81\" id=\"ref-link-section-d30100885e2819\">81<\/a><\/sup>) against the Jan21 database.<\/p>\n<p>Finally, to assess the minimum relative abundance at which MetaPhlAn 4 can confidently assign a species, we randomly selected five reference genomes and five MAGs from the mixture of genomes not included in our genomic database to simulate single-isolate synthetic metagenomes at different depths of coverage using ART<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 94\" title=\"Huang, W., Li, L., Myers, J. R. &#038; Marth, G. T. ART: a next-generation sequencing read simulator. Bioinformatics 28, 593\u2013594 (2012).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR94\" id=\"ref-link-section-d30100885e2826\">94<\/a><\/sup> with the Illumina HiSeq 2500 error model (available at <a href=\"http:\/\/segatalab.cibio.unitn.it\/tools\/metaphlan\/\">http:\/\/segatalab.cibio.unitn.it\/tools\/metaphlan\/<\/a>). 
For each genome, we generated reads at 0.01\u00d7, 0.05\u00d7, 0.1\u00d7, 0.5\u00d7, 1\u00d7, 5\u00d7, 10\u00d7, 50\u00d7 and 100\u00d7 coverage.<\/p>\n<h3 id=\"Sec23\">MetaPhlAn 4 application to human and nonhuman metagenomes<\/h3>\n<p>To measure the increase in the fraction of classified reads when compared with MetaPhlAn 3, we profiled 24,515 samples from 145 datasets spanning different human body sites (airways, gastrointestinal tract, oral, skin and urogenital tract) and lifestyles, animal hosts (nonhuman primates, mice and ruminants) and other nonhost-associated environments (soil, fresh water and ocean) (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">11<\/a>) with both MetaPhlAn 3 (version 3.0.12) and MetaPhlAn 4 (version 4.beta.1) using the unknown\/unclassified estimation feature (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">23<\/a>). Improvements were reported using only the samples in which both tools reported at least one species. An SGB was reported to be present in a specific environment if it was detected in at least 1% of the samples from that environment. 
Finally, to investigate the abundance and prevalence of gut-related SGBs across different age categories and lifestyles, we selected a subset of 19,468 human gut metagenomes from 86 datasets for which age information was available (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">15<\/a>) as reported and curated in the curatedMetagenomicData<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 95\" title=\"Pasolli, E. et al. Accessible, curated metagenomic data through ExperimentHub. Nat. Methods 14, 1023\u20131024 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR95\" id=\"ref-link-section-d30100885e2855\">95<\/a><\/sup> 3 package.<\/p>\n<h3 id=\"Sec24\">Westernization definition<\/h3>\n<p>The process of westernization brought about by industrialization and urbanization over the past two hundred years has had a significant impact on human populations. These changes include access to pharmaceuticals and healthcare, improved sanitation and hygiene, increased urban dwelling and decreased exposure to livestock, and changes in habitual diets (with westernized diets tending to consist of increased fat and animal proteins, high salt and simple carbohydrates). 
In this study, we characterize westernized or nonwesternized individuals or populations based on either the distinction given in the primary publication or an assessment based on the criteria outlined above.<\/p>\n<h3 id=\"Sec25\">Analysis of diet-related taxa in the mouse microbiome<\/h3>\n<p>We performed a differential abundance analysis of HF versus normal chow diets in a public cohort of 181 mouse gut microbiomes<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 65\" title=\"Xiao, L. et al. A catalog of the mouse gut metagenome. Nat. Biotechnol. 33, 1103\u20131108 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR65\" id=\"ref-link-section-d30100885e2875\">65<\/a><\/sup>. From the original cohort, we excluded ten samples missing age information, and we selected only the samples from genetic backgrounds tested for both types of diet. In total, we analyzed 43 HF-fed mice and 88 mice fed with normal control chow, further stratified into two genetic backgrounds and five vendors (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">17<\/a>). To correct for data compositionality, we first imputed the zero values with the minimum abundance value found in the dataset, then we applied the centered-log-ratio transformation to the SGB\u2019s relative abundance distribution (\u2018<i>scikit-bio<\/i>\u2019 Python package, version 0.5.6). We then built a random-intercept model for each feature (SGB) using the \u2018<i>statsmodels<\/i>\u2019 Python package version 0.11.1. We associated the diet (HF or chow, encoded as a binary factor) with the transformed abundance of the SGB, using the sex, age-in-days and genetic background of the mice as fixed effects and the vendor as a grouping variable. 
Significance was determined by the Wald test. <i>P<\/i> values were corrected according to Benjamini\u2013Hochberg (\u2018<i>statsmodels<\/i>\u2019 Python package, <i>Q<\/i>\u2009<\u20090.2). Before plotting, we selected the biomarkers having a mean abundance in the associated group greater than 1%. The reported heatmap was plotted using the \u2018<i>pheatmap<\/i>\u2019 R package version 1.0.12 (parameters \u2018<i>clustering_distance_cols<\/i>\u2009=\u2009<i>\u2018euclidean\u2019, clustering_method<\/i>\u2009=\u2009<i>\u2018complete\u2019, cluster_rows<\/i>\u2009=\u2009<i>FALSE<\/i>\u2019).<\/p>\n<h3 id=\"Sec26\">Re-analysis of the ZOE PREDICT 1 intervention study<\/h3>\n<p>We assessed the associations between microbiome and cardiometabolic health and dietary patterns using 1,001 deeply phenotyped individuals from the United Kingdom retrieved from the ZOE PREDICT 1 intervention study<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 22\" title=\"Asnicar, F. et al. Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals. Nat. Med. 27, 321\u2013332 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR22\" id=\"ref-link-section-d30100885e2922\">22<\/a><\/sup>. Machine learning (ML) analyses were performed using the \u2018<i>scikit-learn<\/i>\u2019 Python package (version 0.22.2) on a panel of 19 representative nutritional and cardiometabolic markers described in the original study<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 22\" title=\"Asnicar, F. et al. Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals. Nat. Med. 
27, 321\u2013332 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR22\" id=\"ref-link-section-d30100885e2929\">22<\/a><\/sup>. A cross-validation approach was implemented with a random 80\/20 split into training and testing sets, repeated for 100 bootstrap iterations, following the same approach as the original study. Because the ZOE PREDICT 1 cohort includes twins, to avoid overfitting, a twin was removed from the training set if their co-twin was present in the testing set. The ML models were based on RFs trained on SGB-level taxonomic relative abundances estimated by MetaPhlAn 4, with relative abundance values arcsin-sqrt transformed.<\/p>\n<p>For the RF classification task, continuous features were divided into two classes, the top and bottom quartiles. The \u2018<i>RandomForestClassifier<\/i>\u2019 function was used with parameters \u2018<i>n_estimators<\/i>=<i>1000, max_features<\/i>=<i>\u2018sqrt\u2019<\/i>\u2019. For the RF regression task, the \u2018<i>RandomForestRegressor<\/i>\u2019 function was used with parameters \u2018<i>n_estimators<\/i>=<i>1000, criterion<\/i>=<i>\u2018mse\u2019, max_features<\/i>=<i>\u2018sqrt\u2019<\/i>\u2019. A linear regressor (\u2018<i>LinearRegression<\/i>\u2019 function with default parameters) was also trained on training target values to calibrate the range of output values predicted by the RF regressor model. Pairwise Spearman\u2019s correlations were calculated between the relative abundance of uSGBs with a prevalence of at least 20% (at least 200 of 1,001 samples) and the panel of 19 nutritional and cardiometabolic markers, correcting for age, sex and body mass index. 
Correlations were computed using the \u2018<i>ppcor<\/i>\u2019 R package version 1.1 (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">24<\/a>) and <i>P<\/i> values were corrected through the Benjamini\u2013Hochberg procedure.<\/p>\n<h3 id=\"Sec27\">\n                        <i>Lachnospiraceae<\/i> SGB4894 association with health conditions<\/h3>\n<p>To investigate associations of <i>Lachnospiraceae<\/i> SGB4894 with host health conditions across several diseases, we collected 21 disease case\u2013control datasets available through curatedMetagenomicData<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 95\" title=\"Pasolli, E. et al. Accessible, curated metagenomic data through ExperimentHub. Nat. Methods 14, 1023\u20131024 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR95\" id=\"ref-link-section-d30100885e2991\">95<\/a><\/sup> (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">20<\/a>). For each dataset, we assessed the association of <i>Lachnospiraceae<\/i> SGB4894 with the subjects reported as healthy controls by computing a one-sided Mann\u2013Whitney <i>U<\/i> test on the arcsin-square-root-transformed relative abundance profiles using the \u2018<i>stats.mannwhitneyu<\/i>\u2019 function of the \u2018<i>scipy<\/i>\u2019 Python package version 1.5.2. Samples from westernized adults were used, and comparisons were performed only when at least ten healthy and ten disease samples were available. 
Statistically significant associations were defined as those with <i>P<\/i>\u2009<\u20090.05.<\/p>\n<h3 id=\"Sec28\">StrainPhlAn 4 profiling<\/h3>\n<p>StrainPhlAn profiling estimates strain-level, species-specific phylogenies and is based on the reconstruction of sample-specific consensus sequences of MetaPhlAn species-specific marker genes followed by multiple-sequence alignment and phylogenetic inference<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 4\" title=\"Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR4\" id=\"ref-link-section-d30100885e3022\">4<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 73\" title=\"Truong, D. T., Tett, A., Pasolli, E., Huttenhower, C. &#038; Segata, N. Microbial strain-level population structure and genetic diversity from metagenomes. Genome Res. 27, 626\u2013638 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR73\" id=\"ref-link-section-d30100885e3025\">73<\/a><\/sup>. Compared with StrainPhlAn 3, the accuracy and performance of StrainPhlAn 4 have been improved mostly because of (1) the redesigned procedure to select and process markers and samples to be considered in the phylogeny, and (2) the use of the same MetaPhlAn 4 database of markers from the extensive set of phylogenetically consistent SGBs.<\/p>\n<p>For item (1), StrainPhlAn 4 takes as input the reads-to-markers alignment results (in SAM format<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 96\" title=\"Li, H. et al. The sequence alignment\/map format and SAMtools. 
Bioinformatics 25, 2078\u20132079 (2009).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR96\" id=\"ref-link-section-d30100885e3032\">96<\/a><\/sup>) from the MetaPhlAn 4 profiling together with the MetaPhlAn 4 database. For each sample, StrainPhlAn 4 reconstructs consensus sequences of the species-specific marker genes by considering, for each position, the nucleotide with the highest frequency among the reads mapping against it. By default, consensus markers covered by fewer than eight reads or with a breadth of coverage (that is, the proportion of the marker covered by reads; \u2018<i>--breadth_threshold<\/i>\u2019 parameter) below 80% are discarded. For this step, ambiguous bases (that is, positions in the alignment with quality lower than 30 or with major allele dominance below 80%) are considered unmapped positions. After the reconstruction of the markers, StrainPhlAn discards samples with fewer than 80% of the available markers and markers present in fewer than 80% of the samples (\u2018<i>--sample_with_n_markers<\/i>\u2019 and \u2018<i>--marker_in_n_samples<\/i>\u2019 parameters, respectively). Then, markers are trimmed by removing the leading and trailing 50 bases (\u2018<i>--trim_sequences<\/i>\u2019 parameter), and a report of polymorphic rates is generated. Finally, the remaining samples and markers are processed by PhyloPhlAn<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 81\" title=\"Asnicar, F. et al. Precise phylogenetic analysis of microbial isolates and genomes from metagenomes using PhyloPhlAn 3.0. Nat. Commun. 11, 2500 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR81\" id=\"ref-link-section-d30100885e3049\">81<\/a><\/sup>. 
By default, multiple-sequence alignment is performed by MAFFT<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 92\" title=\"Katoh, K. &#038; Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772\u2013780 (2013).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR92\" id=\"ref-link-section-d30100885e3053\">92<\/a><\/sup>, gappy positions (that is, positions with more than 67% of gaps) are trimmed by trimAl<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 91\" title=\"Capella-Guti\u00e9rrez, S., Silla-Mart\u00ednez, J. M. &#038; Gabald\u00f3n, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972\u20131973 (2009).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR91\" id=\"ref-link-section-d30100885e3057\">91<\/a><\/sup> and phylogenetic trees are inferred by RAxML<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 97\" title=\"Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. 
Bioinformatics 30, 1312\u20131313 (2014).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR97\" id=\"ref-link-section-d30100885e3061\">97<\/a><\/sup>.<\/p>\n<h3 id=\"Sec29\">\n                        <i>Lachnospiraceae<\/i> SGB4894 strain-level analyses<\/h3>\n<p>For the <i>Lachnospiraceae<\/i> SGB4894 strain-level analysis, we selected 5,883 human gut metagenomic samples from 86 datasets in which <i>Lachnospiraceae<\/i> SGB4894 was reported to be present by MetaPhlAn 4 (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">15<\/a>). Seventy-nine nonhuman primate (NHP) and 12 ancient human gut metagenomic samples were also included from 12 different datasets (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">22<\/a>). SGB4894-specific marker genes were successfully reconstructed from 2,787 metagenomes, of which 2,738 were from contemporary human gut microbiome samples, five from ancient gut microbiome samples, and 44 from NHP gut microbiome samples. Strain-level profiling with StrainPhlAn 4 was performed using parameters \u2018<i>--marker_in_n_samples 70 --sample_with_n_markers 10 --phylophlan_mode accurate<\/i>\u2019. The phylogenetic tree generated by StrainPhlAn was plotted with GraPhlAn version 1.1.4 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 98\" title=\"Asnicar, F., Weingart, G., Tickle, T. L., Huttenhower, C. &#038; Segata, N. Compact graphical representation of phylogenetic data and metadata with GraPhlAn. 
PeerJ 3, e1029 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR98\" id=\"ref-link-section-d30100885e3094\">98<\/a><\/sup>). Phylogenetic distances were extracted based on the distance between samples in the tree and normalized by the total branch length of the tree. Geographic distances between countries were calculated using the \u2018<i>distGeo<\/i>\u2019 function of the \u2018<i>geosphere<\/i>\u2019 R package version 1.5-10. Spearman\u2019s correlation between genetic and geographic distance was then calculated using the \u2018<i>cor.test<\/i>\u2019 function of the \u2018<i>stats<\/i>\u2019 R package version 4.0.5. Finally, to assess the transmissibility of <i>Lachnospiraceae<\/i> SGB4894, we executed StrainPhlAn\u2019s \u2018<i>strain_transmission.py<\/i>\u2019 script with the phylogenetic tree as input (default parameters). The script, which is part of the StrainPhlAn release, can use the species-specific cutoffs on the normalized phylogenetic distances precomputed on the available datasets with longitudinal sampling.<\/p>\n<h3 id=\"Sec30\">StrainPhlAn 4 evaluation<\/h3>\n<p>The three most prevalent single-species kSGBs whose species were available in the MetaPhlAn 3 database, that is, <i>B. wexlerae<\/i> (SGB4837), <i>B. uniformis<\/i> (SGB1836) and <i>E. rectale<\/i> (SGB4933), were selected to evaluate the improvements included in StrainPhlAn 4 in comparison with the previous version. As a gold standard, for each species, we considered 100 high-quality MAGs randomly selected from the genomic catalog (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">25<\/a>) and obtained a phylogeny by processing the MAGs via Roary core gene alignment and RAxML tree reconstruction. 
Specifically, we computed a multiple-sequence alignment from each set of core genes (present in at least 90% of genomes) using Roary version 3.13.0 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 99\" title=\"Page, A. J. et al. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics 31, 3691\u20133693 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR99\" id=\"ref-link-section-d30100885e3137\">99<\/a><\/sup>) with parameters \u2018<i>-cd 90 -i 90 -e --mafft<\/i>\u2019, and launched RAxML version 8.2.4 (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 97\" title=\"Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312\u20131313 (2014).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR97\" id=\"ref-link-section-d30100885e3145\">97<\/a><\/sup>) with parameters \u2018<i>-f a -# 100 -p 12345 -x 12345 -m GTRGAMMA<\/i>\u2019. Using the metagenomic samples from which the considered MAGs were assembled, we executed StrainPhlAn 3 and 4 with their respective databases, default parameters and the \u2018<i>--mutation_rates<\/i>\u2019 option. Additionally, we executed a similar evaluation (but using the MetaPhlAn 4 database in the StrainPhlAn 3 call) on the uSGB <i>Lachnospiraceae<\/i> SGB4894, using the 170 MAGs from the genomic catalog with publicly available metagenomic samples. Pairwise phylogenetic distances normalized by the total branch length were calculated using the PyPhlAn package (<a href=\"https:\/\/github.com\/SegataLab\/pyphlan\">https:\/\/github.com\/SegataLab\/pyphlan<\/a>). 
Pearson correlations between StrainPhlAn and the gold standard results were calculated using the \u2018<i>stats.pearsonr<\/i>\u2019 function of the \u2018<i>scipy<\/i>\u2019 Python package version 1.5.2.<\/p>\n<h3 id=\"Sec31\">Reporting summary<\/h3>\n<p>Further information on research design is available in the <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM2\">Nature Portfolio Reporting Summary<\/a> linked to this article.<\/p>\n<\/div>\n<\/div><\/div>\n<div data-enable-entitlement-checks>\n<div id=\"data-availability-section\" data-title=\"Data availability\">\n<h2 id=\"data-availability\">Data availability<\/h2>\n<p>All metagenomic studies analyzed in this work are publicly available through the corresponding publications listed in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">11<\/a>. All reference genomes and taxonomic data are publicly available through the NCBI GenBank database (<a href=\"https:\/\/www.ncbi.nlm.nih.gov\/genbank\/\">https:\/\/www.ncbi.nlm.nih.gov\/genbank\/<\/a>). The GTDB release 207 is publicly available at <a href=\"https:\/\/gtdb.ecogenomic.org\/\">https:\/\/gtdb.ecogenomic.org\/<\/a>. The CAMI 2 Challenge synthetic metagenomes and gold standards are available at <a href=\"https:\/\/www.microbiome-cosi.org\/cami\/cami\/cami2\">https:\/\/www.microbiome-cosi.org\/cami\/cami\/cami2<\/a>. The SynPhlAn-nonhuman synthetic metagenomes and gold standards are available at <a href=\"http:\/\/segatalab.cibio.unitn.it\/tools\/biobakery\">http:\/\/segatalab.cibio.unitn.it\/tools\/biobakery<\/a>. 
The new synthetic metagenomes containing kSGBs and uSGBs and gold standards as well as the single-isolate synthetic metagenomes are available at <a href=\"http:\/\/segatalab.cibio.unitn.it\/tools\/metaphlan\/\">http:\/\/segatalab.cibio.unitn.it\/tools\/metaphlan\/<\/a>. Prevalences of the SGBs across environments, age categories and lifestyles are available in Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">13<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">14<\/a>. Metadata for the analyzed public human metagenomes are also available through the curatedMetagenomicData R package<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 95\" title=\"Pasolli, E. et al. Accessible, curated metagenomic data through ExperimentHub. Nat. Methods 14, 1023\u20131024 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR95\" id=\"ref-link-section-d30100885e3310\">95<\/a><\/sup>. 
The full list of metagenomic studies used for the strain-level analysis of <i>Lachnospiraceae<\/i> SGB4894 is reported in Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">15<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#MOESM3\">22<\/a>.<\/p>\n<\/div>\n<div id=\"code-availability-section\" data-title=\"Code availability\">\n<h2 id=\"code-availability\">Code availability<\/h2>\n<div id=\"code-availability-content\">\n<p>The MetaPhlAn 4 version described in this work is labeled as MetaPhlAn 4.beta.1 and is available at <a href=\"http:\/\/segatalab.cibio.unitn.it\/tools\/metaphlan\">http:\/\/segatalab.cibio.unitn.it\/tools\/metaphlan<\/a> with the open source code at <a href=\"https:\/\/github.com\/biobakery\/MetaPhlAn\">https:\/\/github.com\/biobakery\/MetaPhlAn<\/a> (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 100\" title=\"Blanco-Miguez, A. et al. MetaPhlAn 4 code repository. GitHub. \n                http:\/\/segatalab.cibio.unitn.it\/tools\/metaphlan\/\n                \n               (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR100\" id=\"ref-link-section-d30100885e3346\">100<\/a><\/sup>) together with StrainPhlAn 4. It is also available via Bioconda <a href=\"https:\/\/anaconda.org\/bioconda\/metaphlan\">https:\/\/anaconda.org\/bioconda\/metaphlan<\/a> (ref. <sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 101\" title=\"Blanco-Miguez, A. et al. MetaPhlAn 4 package. Bioconda. 
\n                https:\/\/anaconda.org\/bioconda\/metaphlan\n                \n               (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41587-023-01688-w#ref-CR101\" id=\"ref-link-section-d30100885e3357\">101<\/a><\/sup>) and PyPI <a href=\"https:\/\/pypi.org\/project\/MetaPhlAn\">https:\/\/pypi.org\/project\/MetaPhlAn<\/a>.<\/p>\n<\/p><\/div>\n<\/div>\n<div id=\"MagazineFulltextArticleBodySuffix\" aria-labelledby=\"Bib1\" data-title=\"References\">\n<h2 id=\"Bib1\">References<\/h2>\n<div data-container-section=\"references\" id=\"Bib1-content\">\n<ol data-track-component=\"outbound reference\">\n<li data-counter=\"1.\">\n<p id=\"ref-CR1\">Quince, C., Walker, A. W., Simpson, J. T., Loman, N. J. &#038; Segata, N. Shotgun metagenomics, from sampling to analysis. <i>Nat. Biotechnol.<\/i> <b>35<\/b>, 833\u2013844 (2017).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nbt.3935\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnbt.3935\" data-doi=\"10.1038\/nbt.3935\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2sXhsVOht7vJ\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=28898207\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" 
href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Shotgun%20metagenomics%2C%20from%20sampling%20to%20analysis&#038;journal=Nat.%20Biotechnol.&#038;doi=10.1038%2Fnbt.3935&#038;volume=35&#038;pages=833-844&#038;publication_year=2017&#038;author=Quince%2CC&#038;author=Walker%2CAW&#038;author=Simpson%2CJT&#038;author=Loman%2CNJ&#038;author=Segata%2CN\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"2.\">\n<p id=\"ref-CR2\">Segata, N. et al. Metagenomic microbial community profiling using unique clade-specific marker genes. <i>Nat. Methods<\/i> <b>9<\/b>, 811\u2013814 (2012).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nmeth.2066\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnmeth.2066\" data-doi=\"10.1038\/nmeth.2066\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC38XotlCksL0%3D\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=22688413\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3443552\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" 
href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Metagenomic%20microbial%20community%20profiling%20using%20unique%20clade-specific%20marker%20genes&#038;journal=Nat.%20Methods&#038;doi=10.1038%2Fnmeth.2066&#038;volume=9&#038;pages=811-814&#038;publication_year=2012&#038;author=Segata%2CN\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"3.\">\n<p id=\"ref-CR3\">Truong, D. T. et al. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. <i>Nat. Methods<\/i> <b>12<\/b>, 902\u2013903 (2015).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nmeth.3589\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnmeth.3589\" data-doi=\"10.1038\/nmeth.3589\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2MXhsFyqsL7I\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=26418763\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=MetaPhlAn2%20for%20enhanced%20metagenomic%20taxonomic%20profiling&#038;journal=Nat.%20Methods&#038;doi=10.1038%2Fnmeth.3589&#038;volume=12&#038;pages=902-903&#038;publication_year=2015&#038;author=Truong%2CDT\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"4.\">\n<p id=\"ref-CR4\">Beghini, F. et al. 
Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. <i>eLife<\/i> <b>10<\/b>, e65088 (2021).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.7554\/eLife.65088\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.7554%2FeLife.65088\" data-doi=\"10.7554\/eLife.65088\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3MXislantrzF\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=33944776\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC8096432\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Integrating%20taxonomic%2C%20functional%2C%20and%20strain-level%20profiling%20of%20diverse%20microbial%20communities%20with%20bioBakery%203&#038;journal=eLife&#038;doi=10.7554%2FeLife.65088&#038;volume=10&#038;publication_year=2021&#038;author=Beghini%2CF\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"5.\">\n<p id=\"ref-CR5\">Lu, J., Breitwieser, F. P., Thielen, P. &#038; Salzberg, S. L. Bracken: estimating species abundance in metagenomics data. 
<i>PeerJ Comput. Sci.<\/i> <b>3<\/b>, e104 (2017).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.7717\/peerj-cs.104\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.7717%2Fpeerj-cs.104\" data-doi=\"10.7717\/peerj-cs.104\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Bracken%3A%20estimating%20species%20abundance%20in%20metagenomics%20data&#038;journal=PeerJ%20Comput.%20Sci.&#038;doi=10.7717%2Fpeerj-cs.104&#038;volume=3&#038;publication_year=2017&#038;author=Lu%2CJ&#038;author=Breitwieser%2CFP&#038;author=Thielen%2CP&#038;author=Salzberg%2CSL\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"6.\">\n<p id=\"ref-CR6\">Milanese, A. et al. Microbial abundance, activity and population genomic profiling with mOTUs2. <i>Nat. 
Commun.<\/i> <b>10<\/b>, 1014 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41467-019-08844-4\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41467-019-08844-4\" data-doi=\"10.1038\/s41467-019-08844-4\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=30833550\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6399450\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Microbial%20abundance%2C%20activity%20and%20population%20genomic%20profiling%20with%20mOTUs2&#038;journal=Nat.%20Commun.&#038;doi=10.1038%2Fs41467-019-08844-4&#038;volume=10&#038;publication_year=2019&#038;author=Milanese%2CA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"7.\">\n<p id=\"ref-CR7\">Franzosa, E. A. et al. Species-level functional profiling of metagenomes and metatranscriptomes. <i>Nat. 
Methods<\/i> <b>15<\/b>, 962\u2013968 (2018).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41592-018-0176-y\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41592-018-0176-y\" data-doi=\"10.1038\/s41592-018-0176-y\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1cXitVCks7nF\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=30377376\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6235447\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Species-level%20functional%20profiling%20of%20metagenomes%20and%20metatranscriptomes&#038;journal=Nat.%20Methods&#038;doi=10.1038%2Fs41592-018-0176-y&#038;volume=15&#038;pages=962-968&#038;publication_year=2018&#038;author=Franzosa%2CEA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"8.\">\n<p id=\"ref-CR8\">Nazeen, S., Yu, Y. W. &#038; Berger, B. Carnelian uncovers hidden functional patterns across diverse study populations from whole metagenome sequencing reads. 
<i>Genome Biol.<\/i> <b>21<\/b>, 47 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s13059-020-1933-7\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs13059-020-1933-7\" data-doi=\"10.1186\/s13059-020-1933-7\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=32093762\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7038607\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Carnelian%20uncovers%20hidden%20functional%20patterns%20across%20diverse%20study%20populations%20from%20whole%20metagenome%20sequencing%20reads&#038;journal=Genome%20Biol.&#038;doi=10.1186%2Fs13059-020-1933-7&#038;volume=21&#038;publication_year=2020&#038;author=Nazeen%2CS&#038;author=Yu%2CYW&#038;author=Berger%2CB\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"9.\">\n<p id=\"ref-CR9\">Ayling, M., Clark, M. D. &#038; Leggett, R. M. New approaches for metagenome assembly with short reads. 
<i>Brief Bioinform.<\/i> <b>21<\/b>, 584\u2013594 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bib\/bbz020\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbib%2Fbbz020\" aria-label=\"Reference 6\" data-doi=\"10.1093\/bib\/bbz020\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3cXitF2ntr7K\" aria-label=\"Reference 6\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=30815668\" aria-label=\"Reference 6\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 6\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=New%20approaches%20for%20metagenome%20assembly%20with%20short%20reads&#038;journal=Brief%20Bioinform.&#038;doi=10.1093%2Fbib%2Fbbz020&#038;volume=21&#038;pages=584-594&#038;publication_year=2020&#038;author=Ayling%2CM&#038;author=Clark%2CMD&#038;author=Leggett%2CRM\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"10.\">\n<p id=\"ref-CR10\">Qin, N. et al. Alterations of the human gut microbiome in liver cirrhosis. 
<i>Nature<\/i> <b>513<\/b>, 59\u201364 (2014).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nature13568\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnature13568\" aria-label=\"Reference 6\" data-doi=\"10.1038\/nature13568\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2cXhsVyhurvE\" aria-label=\"Reference 6\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=25079328\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Alterations%20of%20the%20human%20gut%20microbiome%20in%20liver%20cirrhosis&#038;journal=Nature&#038;doi=10.1038%2Fnature13568&#038;volume=513&#038;pages=59-64&#038;publication_year=2014&#038;author=Qin%2CN\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"11.\">\n<p id=\"ref-CR11\">Tett, A. et al. Unexplored diversity and strain-level structure of the skin microbiome associated with psoriasis. 
<i>NPJ Biofilms Microbiomes<\/i> <b>3<\/b>, 14 (2017).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41522-017-0022-5\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41522-017-0022-5\" aria-label=\"Reference 4\" data-doi=\"10.1038\/s41522-017-0022-5\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=28649415\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5481418\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Unexplored%20diversity%20and%20strain-level%20structure%20of%20the%20skin%20microbiome%20associated%20with%20psoriasis&#038;journal=NPJ%20Biofilms%20Microbiomes&#038;doi=10.1038%2Fs41522-017-0022-5&#038;volume=3&#038;publication_year=2017&#038;author=Tett%2CA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"12.\">\n<p id=\"ref-CR12\">Jie, Z. et al. The gut microbiome in atherosclerotic cardiovascular disease. <i>Nat. 
Commun.<\/i> <b>8<\/b>, 845 (2017).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41467-017-00900-1\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41467-017-00900-1\" aria-label=\"Reference 4\" data-doi=\"10.1038\/s41467-017-00900-1\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=29018189\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5635030\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=The%20gut%20microbiome%20in%20atherosclerotic%20cardiovascular%20disease&#038;journal=Nat.%20Commun.&#038;doi=10.1038%2Fs41467-017-00900-1&#038;volume=8&#038;publication_year=2017&#038;author=Jie%2CZ\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"13.\">\n<p id=\"ref-CR13\">Schirmer, M. et al. Dynamics of metatranscription in the inflammatory bowel disease gut microbiome. <i>Nat. 
Microbiol.<\/i> <b>3<\/b>, 337\u2013346 (2018).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41564-017-0089-z\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41564-017-0089-z\" aria-label=\"Reference 4\" data-doi=\"10.1038\/s41564-017-0089-z\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1cXlvFSjug%3D%3D\" aria-label=\"Reference 4\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=29311644\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6131705\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Dynamics%20of%20metatranscription%20in%20the%20inflammatory%20bowel%20disease%20gut%20microbiome&#038;journal=Nat.%20Microbiol.&#038;doi=10.1038%2Fs41564-017-0089-z&#038;volume=3&#038;pages=337-346&#038;publication_year=2018&#038;author=Schirmer%2CM\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"14.\">\n<p id=\"ref-CR14\">Ye, Z. et al. A metagenomic study of the gut microbiome in Behcet\u2019s disease. 
<i>Microbiome<\/i> <b>6<\/b>, 135 (2018).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s40168-018-0520-6\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs40168-018-0520-6\" aria-label=\"Reference 4\" data-doi=\"10.1186\/s40168-018-0520-6\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=30077182\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6091101\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=A%20metagenomic%20study%20of%20the%20gut%20microbiome%20in%20Behcet%E2%80%99s%20disease&#038;journal=Microbiome&#038;doi=10.1186%2Fs40168-018-0520-6&#038;volume=6&#038;publication_year=2018&#038;author=Ye%2CZ\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"15.\">\n<p id=\"ref-CR15\">Zhou, W. et al. Longitudinal multi-omics of host-microbe dynamics in prediabetes. 
<i>Nature<\/i> <b>569<\/b>, 663\u2013671 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41586-019-1236-x\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41586-019-1236-x\" aria-label=\"Reference 4\" data-doi=\"10.1038\/s41586-019-1236-x\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXhtVOgtL3L\" aria-label=\"Reference 4\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=31142858\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6666404\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Longitudinal%20multi-omics%20of%20host-microbe%20dynamics%20in%20prediabetes&#038;journal=Nature&#038;doi=10.1038%2Fs41586-019-1236-x&#038;volume=569&#038;pages=663-671&#038;publication_year=2019&#038;author=Zhou%2CW\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"16.\">\n<p id=\"ref-CR16\">Thomas, A. M. et al. Metagenomic analysis of colorectal cancer datasets identifies cross-cohort microbial diagnostic signatures and a link with choline degradation. <i>Nat. 
Med.<\/i> <b>25<\/b>, 667\u2013678 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41591-019-0405-7\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41591-019-0405-7\" aria-label=\"Reference 4\" data-doi=\"10.1038\/s41591-019-0405-7\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXosFGqu74%3D\" aria-label=\"Reference 4\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=30936548\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC9533319\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Metagenomic%20analysis%20of%20colorectal%20cancer%20datasets%20identifies%20cross-cohort%20microbial%20diagnostic%20signatures%20and%20a%20link%20with%20choline%20degradation&#038;journal=Nat.%20Med.&#038;doi=10.1038%2Fs41591-019-0405-7&#038;volume=25&#038;pages=667-678&#038;publication_year=2019&#038;author=Thomas%2CAM\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"17.\">\n<p id=\"ref-CR17\">Ghensi, P. et al. Strong oral plaque microbiome signatures for dental implant diseases identified by strain-resolution metagenomics. 
<i>NPJ Biofilms Microbiomes<\/i> <b>6<\/b>, 47 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41522-020-00155-7\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41522-020-00155-7\" aria-label=\"Reference 4\" data-doi=\"10.1038\/s41522-020-00155-7\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3cXitlCqtbnN\" aria-label=\"Reference 4\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=33127901\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7603341\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Strong%20oral%20plaque%20microbiome%20signatures%20for%20dental%20implant%20diseases%20identified%20by%20strain-resolution%20metagenomics&#038;journal=NPJ%20Biofilms%20Microbiomes&#038;doi=10.1038%2Fs41522-020-00155-7&#038;volume=6&#038;publication_year=2020&#038;author=Ghensi%2CP\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"18.\">\n<p id=\"ref-CR18\">Zhu, F. et al. Metagenome-wide association of gut microbiome features for schizophrenia. <i>Nat. 
Commun.<\/i> <b>11<\/b>, 1612 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41467-020-15457-9\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41467-020-15457-9\" aria-label=\"Reference 4\" data-doi=\"10.1038\/s41467-020-15457-9\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=32235826\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7109134\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Metagenome-wide%20association%20of%20gut%20microbiome%20features%20for%20schizophrenia&#038;journal=Nat.%20Commun.&#038;doi=10.1038%2Fs41467-020-15457-9&#038;volume=11&#038;publication_year=2020&#038;author=Zhu%2CF\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"19.\">\n<p id=\"ref-CR19\">Claesson, M. J. et al. Gut microbiota composition correlates with diet and health in the elderly. 
<i>Nature<\/i> <b>488<\/b>, 178\u2013184 (2012).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nature11319\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnature11319\" aria-label=\"Reference 4\" data-doi=\"10.1038\/nature11319\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC38XhtFKiur%2FP\" aria-label=\"Reference 4\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=22797518\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Gut%20microbiota%20composition%20correlates%20with%20diet%20and%20health%20in%20the%20elderly&#038;journal=Nature&#038;doi=10.1038%2Fnature11319&#038;volume=488&#038;pages=178-184&#038;publication_year=2012&#038;author=Claesson%2CMJ\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"20.\">\n<p id=\"ref-CR20\">David, L. A. et al. Diet rapidly and reproducibly alters the human gut microbiome. 
<i>Nature<\/i> <b>505<\/b>, 559\u2013563 (2014).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nature12820\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnature12820\" aria-label=\"Reference 4\" data-doi=\"10.1038\/nature12820\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2cXhtFOls78%3D\" aria-label=\"Reference 4\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=24336217\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Diet%20rapidly%20and%20reproducibly%20alters%20the%20human%20gut%20microbiome&#038;journal=Nature&#038;doi=10.1038%2Fnature12820&#038;volume=505&#038;pages=559-563&#038;publication_year=2014&#038;author=David%2CLA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"21.\">\n<p id=\"ref-CR21\">Hansen, L. B. S. et al. A low-gluten diet induces changes in the intestinal microbiome of healthy Danish adults. <i>Nat. 
Commun.<\/i> <b>9<\/b>, 4630 (2018).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41467-018-07019-x\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41467-018-07019-x\" aria-label=\"Reference 4\" data-doi=\"10.1038\/s41467-018-07019-x\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=30425247\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6234216\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=A%20low-gluten%20diet%20induces%20changes%20in%20the%20intestinal%20microbiome%20of%20healthy%20Danish%20adults&#038;journal=Nat.%20Commun.&#038;doi=10.1038%2Fs41467-018-07019-x&#038;volume=9&#038;publication_year=2018&#038;author=Hansen%2CLBS\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"22.\">\n<p id=\"ref-CR22\">Asnicar, F. et al. Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals. <i>Nat. 
Med.<\/i> <b>27<\/b>, 321\u2013332 (2021).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41591-020-01183-8\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41591-020-01183-8\" aria-label=\"Reference 4\" data-doi=\"10.1038\/s41591-020-01183-8\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3MXhtVGrt74%3D\" aria-label=\"Reference 4\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=33432175\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC8353542\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Microbiome%20connections%20with%20host%20metabolism%20and%20habitual%20diet%20from%201%2C098%20deeply%20phenotyped%20individuals&#038;journal=Nat.%20Med.&#038;doi=10.1038%2Fs41591-020-01183-8&#038;volume=27&#038;pages=321-332&#038;publication_year=2021&#038;author=Asnicar%2CF\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"23.\">\n<p id=\"ref-CR23\">Wang, D. D. et al. The gut microbiome modulates the protective association between a Mediterranean diet and cardiometabolic disease risk. <i>Nat. 
Med.<\/i> <b>27<\/b>, 333\u2013343 (2021).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41591-020-01223-3\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41591-020-01223-3\" aria-label=\"Reference 4\" data-doi=\"10.1038\/s41591-020-01223-3\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3MXjvVOnsbo%3D\" aria-label=\"Reference 4\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=33574608\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC8186452\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=The%20gut%20microbiome%20modulates%20the%20protective%20association%20between%20a%20Mediterranean%20diet%20and%20cardiometabolic%20disease%20risk&#038;journal=Nat.%20Med.&#038;doi=10.1038%2Fs41591-020-01223-3&#038;volume=27&#038;pages=333-343&#038;publication_year=2021&#038;author=Wang%2CDD\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"24.\">\n<p id=\"ref-CR24\">Asnicar, F. et al. Studying vertical microbiome transmission from mothers to infants by strain-level metagenomic profiling. 
<i>mSystems<\/i> <b>2<\/b>, e00164-16 (2017).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1128\/mSystems.00164-16\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1128%2FmSystems.00164-16\" aria-label=\"Reference 4\" data-doi=\"10.1128\/mSystems.00164-16\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=28144631\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5264247\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Studying%20vertical%20microbiome%20transmission%20from%20mothers%20to%20infants%20by%20strain-level%20metagenomic%20profiling&#038;journal=mSystems&#038;doi=10.1128%2FmSystems.00164-16&#038;volume=2&#038;publication_year=2017&#038;author=Asnicar%2CF\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"25.\">\n<p id=\"ref-CR25\">Ferretti, P. et al. Mother-to-infant microbial transmission from different body sites shapes the developing infant gut microbiome. 
<i>Cell Host Microbe<\/i> <b>24<\/b>, 133\u2013145 (2018).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1016\/j.chom.2018.06.005\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1016%2Fj.chom.2018.06.005\" aria-label=\"Reference 4\" data-doi=\"10.1016\/j.chom.2018.06.005\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1cXhtlSqt7vI\" aria-label=\"Reference 4\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=30001516\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6716579\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Mother-to-infant%20microbial%20transmission%20from%20different%20body%20sites%20shapes%20the%20developing%20infant%20gut%20microbiome&#038;journal=Cell%20Host%20Microbe&#038;doi=10.1016%2Fj.chom.2018.06.005&#038;volume=24&#038;pages=133-145&#038;publication_year=2018&#038;author=Ferretti%2CP\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"26.\">\n<p id=\"ref-CR26\">Yassour, M. et al. Strain-level analysis of mother-to-child bacterial transmission during the first few months of life. 
<i>Cell Host Microbe<\/i> <b>24<\/b>, 146\u2013154 (2018).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1016\/j.chom.2018.06.007\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1016%2Fj.chom.2018.06.007\" aria-label=\"Reference 4\" data-doi=\"10.1016\/j.chom.2018.06.007\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1cXhtlSqt7jM\" aria-label=\"Reference 4\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=30001517\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6091882\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Strain-level%20analysis%20of%20mother-to-child%20bacterial%20transmission%20during%20the%20first%20few%20months%20of%20life&#038;journal=Cell%20Host%20Microbe&#038;doi=10.1016%2Fj.chom.2018.06.007&#038;volume=24&#038;pages=146-154&#038;publication_year=2018&#038;author=Yassour%2CM\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"27.\">\n<p id=\"ref-CR27\">Brito, I. L. et al. Transmission of human-associated microbiota along family and social networks. <i>Nat. 
Microbiol.<\/i> <b>4<\/b>, 964\u2013971 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41564-019-0409-6\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41564-019-0409-6\" aria-label=\"Reference 4\" data-doi=\"10.1038\/s41564-019-0409-6\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXmslaktr0%3D\" aria-label=\"Reference 4\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=30911128\" aria-label=\"Reference 4\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7450247\" aria-label=\"Reference 4\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 4\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Transmission%20of%20human-associated%20microbiota%20along%20family%20and%20social%20networks&#038;journal=Nat.%20Microbiol.&#038;doi=10.1038%2Fs41564-019-0409-6&#038;volume=4&#038;pages=964-971&#038;publication_year=2019&#038;author=Brito%2CIL\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"28.\">\n<p id=\"ref-CR28\">Ianiro, G. et al. Faecal microbiota transplantation for the treatment of diarrhoea induced by tyrosine-kinase inhibitors in patients with metastatic renal cell carcinoma. <i>Nat. 
Commun.<\/i> <b>11<\/b>, 4333 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41467-020-18127-y\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41467-020-18127-y\" data-doi=\"10.1038\/s41467-020-18127-y\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=32859933\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7455693\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Faecal%20microbiota%20transplantation%20for%20the%20treatment%20of%20diarrhoea%20induced%20by%20tyrosine-kinase%20inhibitors%20in%20patients%20with%20metastatic%20renal%20cell%20carcinoma&#038;journal=Nat.%20Commun.&#038;doi=10.1038%2Fs41467-020-18127-y&#038;volume=11&#038;publication_year=2020&#038;author=Ianiro%2CG\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"29.\">\n<p id=\"ref-CR29\">Chen, L. et al. The long-term genetic stability and individual specificity of the human gut microbiome. 
<i>Cell<\/i> <b>184<\/b>, 2302\u20132315 (2021).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1016\/j.cell.2021.03.024\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1016%2Fj.cell.2021.03.024\" data-doi=\"10.1016\/j.cell.2021.03.024\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3MXovV2itLk%3D\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=33838112\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=The%20long-term%20genetic%20stability%20and%20individual%20specificity%20of%20the%20human%20gut%20microbiome&#038;journal=Cell&#038;doi=10.1016%2Fj.cell.2021.03.024&#038;volume=184&#038;pages=2302-2315&#038;publication_year=2021&#038;author=Chen%2CL\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"30.\">\n<p id=\"ref-CR30\">Thomas, A. M. &#038; Segata, N. Multiple levels of the unknown in microbiome research. 
<i>BMC Biol.<\/i> <b>17<\/b>, 48 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s12915-019-0667-z\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs12915-019-0667-z\" data-doi=\"10.1186\/s12915-019-0667-z\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=31189463\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6560723\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Multiple%20levels%20of%20the%20unknown%20in%20microbiome%20research&#038;journal=BMC%20Biol.&#038;doi=10.1186%2Fs12915-019-0667-z&#038;volume=17&#038;publication_year=2019&#038;author=Thomas%2CAM&#038;author=Segata%2CN\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"31.\">\n<p id=\"ref-CR31\">Li, D., Liu, C.-M., Luo, R., Sadakane, K. &#038; Lam, T.-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. 
<i>Bioinformatics<\/i> <b>31<\/b>, 1674\u20131676 (2015).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/btv033\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbtv033\" data-doi=\"10.1093\/bioinformatics\/btv033\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC28XhtFyltL3N\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=25609793\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=MEGAHIT%3A%20an%20ultra-fast%20single-node%20solution%20for%20large%20and%20complex%20metagenomics%20assembly%20via%20succinct%20de%20Bruijn%20graph&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbtv033&#038;volume=31&#038;pages=1674-1676&#038;publication_year=2015&#038;author=Li%2CD&#038;author=Liu%2CC-M&#038;author=Luo%2CR&#038;author=Sadakane%2CK&#038;author=Lam%2CT-W\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"32.\">\n<p id=\"ref-CR32\">Nurk, S., Meleshko, D., Korobeynikov, A. &#038; Pevzner, P. A. metaSPAdes: a new versatile metagenomic assembler. 
<i>Genome Res.<\/i> <b>27<\/b>, 824\u2013834 (2017).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1101\/gr.213959.116\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1101%2Fgr.213959.116\" data-doi=\"10.1101\/gr.213959.116\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2sXhtFyjsrrJ\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=28298430\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5411777\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=metaSPAdes%3A%20a%20new%20versatile%20metagenomic%20assembler&#038;journal=Genome%20Res.&#038;doi=10.1101%2Fgr.213959.116&#038;volume=27&#038;pages=824-834&#038;publication_year=2017&#038;author=Nurk%2CS&#038;author=Meleshko%2CD&#038;author=Korobeynikov%2CA&#038;author=Pevzner%2CPA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"33.\">\n<p id=\"ref-CR33\">Kang, D. D. et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. 
<i>PeerJ<\/i> <b>7<\/b>, e7359 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.7717\/peerj.7359\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.7717%2Fpeerj.7359\" data-doi=\"10.7717\/peerj.7359\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=31388474\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6662567\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=MetaBAT%202%3A%20an%20adaptive%20binning%20algorithm%20for%20robust%20and%20efficient%20genome%20reconstruction%20from%20metagenome%20assemblies&#038;journal=PeerJ&#038;doi=10.7717%2Fpeerj.7359&#038;volume=7&#038;publication_year=2019&#038;author=Kang%2CDD\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"34.\">\n<p id=\"ref-CR34\">Wu, Y.-W., Simmons, B. A. &#038; Singer, S. W. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. 
<i>Bioinformatics<\/i> <b>32<\/b>, 605\u2013607 (2016).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/btv638\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbtv638\" data-doi=\"10.1093\/bioinformatics\/btv638\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC28XhsVWhur3F\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=26515820\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=MaxBin%202.0%3A%20an%20automated%20binning%20algorithm%20to%20recover%20genomes%20from%20multiple%20metagenomic%20datasets&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbtv638&#038;volume=32&#038;pages=605-607&#038;publication_year=2016&#038;author=Wu%2CY-W&#038;author=Simmons%2CBA&#038;author=Singer%2CSW\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"35.\">\n<p id=\"ref-CR35\">Nissen, J. N. et al. Improved metagenome binning and assembly using deep variational autoencoders. <i>Nat. Biotechnol.<\/i> <a href=\"https:\/\/doi.org\/10.1038\/s41587-020-00777-4\">https:\/\/doi.org\/10.1038\/s41587-020-00777-4<\/a> (2021).<\/p>\n<\/li>\n<li data-counter=\"36.\">\n<p id=\"ref-CR36\">Saheb Kashaf, S., Almeida, A., Segre, J. A. &#038; Finn, R. D. 
Recovering prokaryotic genomes from host-associated, short-read shotgun metagenomic sequencing data. <i>Nat. Protoc.<\/i> <b>16<\/b>, 2520\u20132541 (2021).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41596-021-00508-2\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41596-021-00508-2\" data-doi=\"10.1038\/s41596-021-00508-2\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3MXptFSqtrY%3D\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=33864056\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Recovering%20prokaryotic%20genomes%20from%20host-associated%2C%20short-read%20shotgun%20metagenomic%20sequencing%20data&#038;journal=Nat.%20Protoc.&#038;doi=10.1038%2Fs41596-021-00508-2&#038;volume=16&#038;pages=2520-2541&#038;publication_year=2021&#038;author=Saheb%20Kashaf%2CS&#038;author=Almeida%2CA&#038;author=Segre%2CJA&#038;author=Finn%2CRD\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"37.\">\n<p id=\"ref-CR37\">Tully, B. J., Graham, E. D. &#038; Heidelberg, J. F. The reconstruction of 2,631 draft metagenome-assembled genomes from the global oceans. <i>Sci. 
Data<\/i> <b>5<\/b>, 170203 (2018).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/sdata.2017.203\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fsdata.2017.203\" data-doi=\"10.1038\/sdata.2017.203\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1cXovFamug%3D%3D\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=29337314\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5769542\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=The%20reconstruction%20of%202%2C631%20draft%20metagenome-assembled%20genomes%20from%20the%20global%20oceans&#038;journal=Sci.%20Data&#038;doi=10.1038%2Fsdata.2017.203&#038;volume=5&#038;publication_year=2018&#038;author=Tully%2CBJ&#038;author=Graham%2CED&#038;author=Heidelberg%2CJF\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"38.\">\n<p id=\"ref-CR38\">Manara, S. et al. Microbial genomes from non-human primate gut metagenomes expand the primate-associated bacterial tree of life with over 1000 novel species. 
<i>Genome Biol.<\/i> <b>20<\/b>, 299 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s13059-019-1923-9\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs13059-019-1923-9\" data-doi=\"10.1186\/s13059-019-1923-9\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXisFSjs7rP\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=31883524\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6935492\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Microbial%20genomes%20from%20non-human%20primate%20gut%20metagenomes%20expand%20the%20primate-associated%20bacterial%20tree%20of%20life%20with%20over%201000%20novel%20species&#038;journal=Genome%20Biol.&#038;doi=10.1186%2Fs13059-019-1923-9&#038;volume=20&#038;publication_year=2019&#038;author=Manara%2CS\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"39.\">\n<p id=\"ref-CR39\">Stewart, R. D. et al. Compendium of 4,941 rumen metagenome-assembled genomes for rumen microbiome biology and enzyme discovery. <i>Nat. 
Biotechnol.<\/i> <b>37<\/b>, 953\u2013961 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41587-019-0202-3\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41587-019-0202-3\" data-doi=\"10.1038\/s41587-019-0202-3\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXhsFWqt7fN\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=31375809\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6785717\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Compendium%20of%204%2C941%20rumen%20metagenome-assembled%20genomes%20for%20rumen%20microbiome%20biology%20and%20enzyme%20discovery&#038;journal=Nat.%20Biotechnol.&#038;doi=10.1038%2Fs41587-019-0202-3&#038;volume=37&#038;pages=953-961&#038;publication_year=2019&#038;author=Stewart%2CRD\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"40.\">\n<p id=\"ref-CR40\">Nayfach, S., Shi, Z. J., Seshadri, R., Pollard, K. S. &#038; Kyrpides, N. C. New insights from uncultivated genomes of the global human gut microbiome. 
<i>Nature<\/i> <b>568<\/b>, 505\u2013510 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41586-019-1058-x\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41586-019-1058-x\" data-doi=\"10.1038\/s41586-019-1058-x\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXosVSgsbo%3D\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=30867587\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6784871\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=New%20insights%20from%20uncultivated%20genomes%20of%20the%20global%20human%20gut%20microbiome&#038;journal=Nature&#038;doi=10.1038%2Fs41586-019-1058-x&#038;volume=568&#038;pages=505-510&#038;publication_year=2019&#038;author=Nayfach%2CS&#038;author=Shi%2CZJ&#038;author=Seshadri%2CR&#038;author=Pollard%2CKS&#038;author=Kyrpides%2CNC\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"41.\">\n<p id=\"ref-CR41\">Almeida, A. et al. A new genomic blueprint of the human gut microbiota. 
<i>Nature<\/i> <b>568<\/b>, 499\u2013504 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41586-019-0965-1\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41586-019-0965-1\" data-doi=\"10.1038\/s41586-019-0965-1\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXmslKhu7s%3D\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=30745586\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6784870\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=A%20new%20genomic%20blueprint%20of%20the%20human%20gut%20microbiota&#038;journal=Nature&#038;doi=10.1038%2Fs41586-019-0965-1&#038;volume=568&#038;pages=499-504&#038;publication_year=2019&#038;author=Almeida%2CA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"42.\">\n<p id=\"ref-CR42\">Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. 
<i>Cell<\/i> <b>176<\/b>, 649\u2013662 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1016\/j.cell.2019.01.001\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1016%2Fj.cell.2019.01.001\" data-doi=\"10.1016\/j.cell.2019.01.001\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXhsVars7w%3D\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=30661755\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6349461\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Extensive%20unexplored%20human%20microbiome%20diversity%20revealed%20by%20over%20150%2C000%20genomes%20from%20metagenomes%20spanning%20age%2C%20geography%2C%20and%20lifestyle&#038;journal=Cell&#038;doi=10.1016%2Fj.cell.2019.01.001&#038;volume=176&#038;pages=649-662&#038;publication_year=2019&#038;author=Pasolli%2CE\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"43.\">\n<p id=\"ref-CR43\">Nayfach, S. et al. A genomic catalog of Earth\u2019s microbiomes. <i>Nat. 
Biotechnol.<\/i> <a href=\"https:\/\/doi.org\/10.1038\/s41587-020-0718-6\">https:\/\/doi.org\/10.1038\/s41587-020-0718-6<\/a> (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41587-020-0718-6\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41587-020-0718-6\" aria-label=\"Reference 7\"4242 data-doi=\"10.1038\/s41587-020-0718-6\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=33349699\" aria-label=\"Reference 7\"4343>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC8116208\" aria-label=\"Reference 7\"4444>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 7\"4545 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=A%20genomic%20catalog%20of%20Earth%E2%80%99s%20microbiomes&#038;journal=Nat.%20Biotechnol.&#038;doi=10.1038%2Fs41587-020-0718-6&#038;publication_year=2020&#038;author=Nayfach%2CS\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"44.\">\n<p id=\"ref-CR44\">Lesker, T. R. et al. An integrated metagenome catalog reveals new insights into the murine gut microbiome. 
<i>Cell Rep.<\/i> <b>30<\/b>, 2909\u20132922 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1016\/j.celrep.2020.02.036\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1016%2Fj.celrep.2020.02.036\" data-doi=\"10.1016\/j.celrep.2020.02.036\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3cXlt1ektrc%3D\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=32130896\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7059117\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=An%20integrated%20metagenome%20catalog%20reveals%20new%20insights%20into%20the%20murine%20gut%20microbiome&#038;journal=Cell%20Rep.&#038;doi=10.1016%2Fj.celrep.2020.02.036&#038;volume=30&#038;pages=2909-2922&#038;publication_year=2020&#038;author=Lesker%2CTR\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"45.\">\n<p id=\"ref-CR45\">Almeida, A. et al. A unified catalog of 204,938 reference genomes from the human gut microbiome. <i>Nat. 
Biotechnol.<\/i> <a href=\"https:\/\/doi.org\/10.1038\/s41587-020-0603-3\">https:\/\/doi.org\/10.1038\/s41587-020-0603-3<\/a> (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41587-020-0603-3\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41587-020-0603-3\" aria-label=\"Reference 7\"5151 data-doi=\"10.1038\/s41587-020-0603-3\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=32690973\" aria-label=\"Reference 7\"5252>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7801254\" aria-label=\"Reference 7\"5353>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 7\"5454 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=A%20unified%20catalog%20of%20204%2C938%20reference%20genomes%20from%20the%20human%20gut%20microbiome&#038;journal=Nat.%20Biotechnol.&#038;doi=10.1038%2Fs41587-020-0603-3&#038;publication_year=2020&#038;author=Almeida%2CA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"46.\">\n<p id=\"ref-CR46\">Levin, D. et al. Diversity and functional landscapes in the microbiota of animals in the wild. 
<i>Science<\/i> <b>372<\/b>, eabb5352 (2021).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1126\/science.abb5352\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1126%2Fscience.abb5352\" data-doi=\"10.1126\/science.abb5352\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3MXovVKksLs%3D\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=33766942\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Diversity%20and%20functional%20landscapes%20in%20the%20microbiota%20of%20animals%20in%20the%20wild&#038;journal=Science&#038;doi=10.1126%2Fscience.abb5352&#038;volume=372&#038;publication_year=2021&#038;author=Levin%2CD\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"47.\">\n<p id=\"ref-CR47\">Jain, C., Rodriguez-R, L. M., Phillippy, A. M., Konstantinidis, K. T. &#038; Aluru, S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. <i>Nat. 
Commun.<\/i> <b>9<\/b>, 5114 (2018).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41467-018-07641-9\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41467-018-07641-9\" aria-label=\"Reference 7\"5959 data-doi=\"10.1038\/s41467-018-07641-9\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=30504855\" aria-label=\"Reference 7\"6060>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6269478\" aria-label=\"Reference 7\"6161>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 7\"6262 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=High%20throughput%20ANI%20analysis%20of%2090K%20prokaryotic%20genomes%20reveals%20clear%20species%20boundaries&#038;journal=Nat.%20Commun.&#038;doi=10.1038%2Fs41467-018-07641-9&#038;volume=9&#038;publication_year=2018&#038;author=Jain%2CC&#038;author=Rodriguez-R%2CLM&#038;author=Phillippy%2CAM&#038;author=Konstantinidis%2CKT&#038;author=Aluru%2CS\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"48.\">\n<p id=\"ref-CR48\">Parks, D. H. et al. A complete domain-to-species taxonomy for Bacteria and Archaea. <i>Nat. 
Biotechnol.<\/i> <b>38<\/b>, 1079\u20131086 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41587-020-0501-8\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41587-020-0501-8\" aria-label=\"Reference 7\"6363 data-doi=\"10.1038\/s41587-020-0501-8\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3cXotVCisro%3D\" aria-label=\"Reference 7\"6464>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=32341564\" aria-label=\"Reference 7\"6565>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 7\"6666 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=A%20complete%20domain-to-species%20taxonomy%20for%20Bacteria%20and%20Archaea&#038;journal=Nat.%20Biotechnol.&#038;doi=10.1038%2Fs41587-020-0501-8&#038;volume=38&#038;pages=1079-1086&#038;publication_year=2020&#038;author=Parks%2CDH\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"49.\">\n<p id=\"ref-CR49\">Schoch, C. L. et al. NCBI taxonomy: a comprehensive update on curation, resources and tools. 
<i>Database<\/i> <b>2020<\/b>, baaa062 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/database\/baaa062\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fdatabase%2Fbaaa062\" aria-label=\"Reference 7\"6767 data-doi=\"10.1093\/database\/baaa062\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3cXitlaltbnE\" aria-label=\"Reference 7\"6868>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=32761142\" aria-label=\"Reference 7\"6969>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7408187\" aria-label=\"Reference 7\"7070>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 7\"7171 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=NCBI%20taxonomy%3A%20a%20comprehensive%20update%20on%20curation%2C%20resources%20and%20tools&#038;journal=Database&#038;doi=10.1093%2Fdatabase%2Fbaaa062&#038;volume=2020&#038;publication_year=2020&#038;author=Schoch%2CCL\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"50.\">\n<p id=\"ref-CR50\">Rasko, D. A., Altherr, M. R., Han, C. S. &#038; Ravel, J. Genomics of the <i>Bacillus cereus<\/i> group of organisms. <i>FEMS Microbiol. 
Rev.<\/i> <b>29<\/b>, 303\u2013329 (2005).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BD2MXivV2qurc%3D\" aria-label=\"Reference 7\"7272>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=15808746\" aria-label=\"Reference 7\"7373>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 7\"7474 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Genomics%20of%20the%20Bacillus%20cereus%20group%20of%20organisms&#038;journal=FEMS%20Microbiol.%20Rev.&#038;volume=29&#038;pages=303-329&#038;publication_year=2005&#038;author=Rasko%2CDA&#038;author=Altherr%2CMR&#038;author=Han%2CCS&#038;author=Ravel%2CJ\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"51.\">\n<p id=\"ref-CR51\">Tett, A. et al. The <i>Prevotella copri<\/i> complex comprises four distinct clades underrepresented in westernized populations. 
<i>Cell Host Microbe<\/i> <b>26<\/b>, 666\u2013679 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1016\/j.chom.2019.08.018\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1016%2Fj.chom.2019.08.018\" aria-label=\"Reference 7\"7575 data-doi=\"10.1016\/j.chom.2019.08.018\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXhvFKku7zE\" aria-label=\"Reference 7\"7676>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=31607556\" aria-label=\"Reference 7\"7777>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6854460\" aria-label=\"Reference 7\"7878>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 7\"7979 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=The%20Prevotella%20copri%20complex%20comprises%20four%20distinct%20clades%20underrepresented%20in%20westernized%20populations&#038;journal=Cell%20Host%20Microbe&#038;doi=10.1016%2Fj.chom.2019.08.018&#038;volume=26&#038;pages=666-679&#038;publication_year=2019&#038;author=Tett%2CA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"52.\">\n<p id=\"ref-CR52\">De Filippis, F., Pasolli, E. &#038; Ercolini, D. Newly explored <i>Faecalibacterium<\/i> diversity is connected to age, lifestyle, geography and disease. <i>Curr. 
Biol.<\/i> <b>30<\/b>, 4932\u20134943 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1016\/j.cub.2020.09.063\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1016%2Fj.cub.2020.09.063\" aria-label=\"Reference 7\"8080 data-doi=\"10.1016\/j.cub.2020.09.063\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=33065016\" aria-label=\"Reference 7\"8181>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 7\"8282 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Newly%20explored%20faecalibacterium%20diversity%20is%20connected%20to%20age%2C%20lifestyle%2C%20geography%20and%20disease&#038;journal=Curr.%20Biol.&#038;doi=10.1016%2Fj.cub.2020.09.063&#038;volume=30&#038;pages=4932-4943&#038;publication_year=2020&#038;author=Filippis%2CF&#038;author=Pasolli%2CE&#038;author=Ercolini%2CD\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"53.\">\n<p id=\"ref-CR53\">NCBI Resource Coordinators. Database resources of the National Center for Biotechnology Information. 
<i>Nucleic Acids Res.<\/i> <b>46<\/b>, D8\u2013D13 (2018).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/nar\/gkx1095\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fnar%2Fgkx1095\" aria-label=\"Reference 7\"8383 data-doi=\"10.1093\/nar\/gkx1095\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 7\"8484 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Database%20resources%20of%20the%20National%20Center%20for%20Biotechnology%20Information&#038;journal=Nucleic%20Acids%20Res.&#038;doi=10.1093%2Fnar%2Fgkx1095&#038;volume=46&#038;pages=D8-D13&#038;publication_year=2018\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"54.\">\n<p id=\"ref-CR54\">Ondov, B. D. et al. Mash: fast genome and metagenome distance estimation using MinHash. 
<i>Genome Biol.<\/i> <b>17<\/b>, 132 (2016).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s13059-016-0997-x\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs13059-016-0997-x\" aria-label=\"Reference 7\"8585 data-doi=\"10.1186\/s13059-016-0997-x\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=27323842\" aria-label=\"Reference 7\"8686>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4915045\" aria-label=\"Reference 7\"8787>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 7\"8888 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Mash%3A%20fast%20genome%20and%20metagenome%20distance%20estimation%20using%20MinHash&#038;journal=Genome%20Biol.&#038;doi=10.1186%2Fs13059-016-0997-x&#038;volume=17&#038;publication_year=2016&#038;author=Ondov%2CBD\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"55.\">\n<p id=\"ref-CR55\">Karcher, N. et al. Analysis of 1321 <i>Eubacterium rectale<\/i> genomes from metagenomes uncovers complex phylogeographic population structure and subspecies functional adaptations. 
<i>Genome Biol.<\/i> <b>21<\/b>, 138 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s13059-020-02042-y\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs13059-020-02042-y\" aria-label=\"Reference 7\"8989 data-doi=\"10.1186\/s13059-020-02042-y\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3cXhtFemtbjI\" aria-label=\"Reference 7\"9090>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=32513234\" aria-label=\"Reference 7\"9191>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7278147\" aria-label=\"Reference 7\"9292>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 7\"9393 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Analysis%20of%201321%20Eubacterium%20rectale%20genomes%20from%20metagenomes%20uncovers%20complex%20phylogeographic%20population%20structure%20and%20subspecies%20functional%20adaptations&#038;journal=Genome%20Biol.&#038;doi=10.1186%2Fs13059-020-02042-y&#038;volume=21&#038;publication_year=2020&#038;author=Karcher%2CN\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"56.\">\n<p id=\"ref-CR56\">Karcher, N. et al. Genomic diversity and ecology of human-associated <i>Akkermansia<\/i> species in the gut microbiome revealed by extensive metagenomic assembly. 
<i>Genome Biol.<\/i> <b>22<\/b>, 209 (2021).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s13059-021-02427-7\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs13059-021-02427-7\" aria-label=\"Reference 7\"9494 data-doi=\"10.1186\/s13059-021-02427-7\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3MXhvFKmurrN\" aria-label=\"Reference 7\"9595>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=34261503\" aria-label=\"Reference 7\"9696>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC8278651\" aria-label=\"Reference 7\"9797>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 7\"9898 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Genomic%20diversity%20and%20ecology%20of%20human-associated%20Akkermansia%20species%20in%20the%20gut%20microbiome%20revealed%20by%20extensive%20metagenomic%20assembly&#038;journal=Genome%20Biol.&#038;doi=10.1186%2Fs13059-021-02427-7&#038;volume=22&#038;publication_year=2021&#038;author=Karcher%2CN\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"57.\">\n<p id=\"ref-CR57\">Hall, A. B. et al. A novel <i>Ruminococcus gnavus<\/i> clade enriched in inflammatory bowel disease patients. 
<i>Genome Med.<\/i> <b>9<\/b>, 103 (2017).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s13073-017-0490-5\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs13073-017-0490-5\" aria-label=\"Reference 7\"9999 data-doi=\"10.1186\/s13073-017-0490-5\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=29183332\" aria-label=\"Reference 8\"0000>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5704459\" aria-label=\"Reference 8\"0101>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"0202 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=A%20novel%20Ruminococcus%20gnavus%20clade%20enriched%20in%20inflammatory%20bowel%20disease%20patients&#038;journal=Genome%20Med.&#038;doi=10.1186%2Fs13073-017-0490-5&#038;volume=9&#038;publication_year=2017&#038;author=Hall%2CAB\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"58.\">\n<p id=\"ref-CR58\">Suzek, B. E. et al. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. 
<i>Bioinformatics<\/i> <b>31<\/b>, 926\u2013932 (2015).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/btu739\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbtu739\" aria-label=\"Reference 8\"0303 data-doi=\"10.1093\/bioinformatics\/btu739\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC28Xht1Gntb7F\" aria-label=\"Reference 8\"0404>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=25398609\" aria-label=\"Reference 8\"0505>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"0606 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=UniRef%20clusters%3A%20a%20comprehensive%20and%20scalable%20alternative%20for%20improving%20sequence%20similarity%20searches&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbtu739&#038;volume=31&#038;pages=926-932&#038;publication_year=2015&#038;author=Suzek%2CBE\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"59.\">\n<p id=\"ref-CR59\">Mirdita, M. et al. Uniclust databases of clustered and deeply annotated protein sequences and alignments. 
<i>Nucleic Acids Res.<\/i> <b>45<\/b>, D170\u2013D176 (2017).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/nar\/gkw1081\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fnar%2Fgkw1081\" aria-label=\"Reference 8\"0707 data-doi=\"10.1093\/nar\/gkw1081\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1cXhslWgsb8%3D\" aria-label=\"Reference 8\"0808>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=27899574\" aria-label=\"Reference 8\"0909>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"1010 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Uniclust%20databases%20of%20clustered%20and%20deeply%20annotated%20protein%20sequences%20and%20alignments&#038;journal=Nucleic%20Acids%20Res.&#038;doi=10.1093%2Fnar%2Fgkw1081&#038;volume=45&#038;pages=D170-D176&#038;publication_year=2017&#038;author=Mirdita%2CM\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"60.\">\n<p id=\"ref-CR60\">Meyer, F. et al. Tutorial: assessing metagenomics software with the CAMI benchmarking toolkit. <i>Nat. Protoc.<\/i> <a href=\"https:\/\/doi.org\/10.1038\/s41596-020-00480-3\">https:\/\/doi.org\/10.1038\/s41596-020-00480-3<\/a> (2021).<\/p>\n<\/li>\n<li data-counter=\"61.\">\n<p id=\"ref-CR61\">Meyer, F. et al. Assessing taxonomic metagenome profilers with OPAL. 
<i>Genome Biol.<\/i> <b>20<\/b>, 51 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s13059-019-1646-y\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs13059-019-1646-y\" aria-label=\"Reference 8\"1111 data-doi=\"10.1186\/s13059-019-1646-y\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=30832730\" aria-label=\"Reference 8\"1212>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6398228\" aria-label=\"Reference 8\"1313>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"1414 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Assessing%20taxonomic%20metagenome%20profilers%20with%20OPAL&#038;journal=Genome%20Biol.&#038;doi=10.1186%2Fs13059-019-1646-y&#038;volume=20&#038;publication_year=2019&#038;author=Meyer%2CF\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"62.\">\n<p id=\"ref-CR62\">O\u2019Leary, N. A. et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. 
<i>Nucleic Acids Res.<\/i> <b>44<\/b>, D733\u2013D745 (2016).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/nar\/gkv1189\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fnar%2Fgkv1189\" aria-label=\"Reference 8\"1515 data-doi=\"10.1093\/nar\/gkv1189\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=26553804\" aria-label=\"Reference 8\"1616>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"1717 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Reference%20sequence%20%28RefSeq%29%20database%20at%20NCBI%3A%20current%20status%2C%20taxonomic%20expansion%2C%20and%20functional%20annotation&#038;journal=Nucleic%20Acids%20Res.&#038;doi=10.1093%2Fnar%2Fgkv1189&#038;volume=44&#038;pages=D733-D745&#038;publication_year=2016&#038;author=O%E2%80%99Leary%2CNA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"63.\">\n<p id=\"ref-CR63\">Parks, D. H. et al. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. 
<i>Nucleic Acids Res.<\/i> <b>50<\/b>, D785\u2013D794 (2022).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/nar\/gkab776\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fnar%2Fgkab776\" aria-label=\"Reference 8\"1818 data-doi=\"10.1093\/nar\/gkab776\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB38Xit1Grur8%3D\" aria-label=\"Reference 8\"1919>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=34520557\" aria-label=\"Reference 8\"2020>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"2121 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=GTDB%3A%20an%20ongoing%20census%20of%20bacterial%20and%20archaeal%20diversity%20through%20a%20phylogenetically%20consistent%2C%20rank%20normalized%20and%20complete%20genome-based%20taxonomy&#038;journal=Nucleic%20Acids%20Res.&#038;doi=10.1093%2Fnar%2Fgkab776&#038;volume=50&#038;pages=D785-D794&#038;publication_year=2022&#038;author=Parks%2CDH\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"64.\">\n<p id=\"ref-CR64\">Sunagawa, S. et al. Structure and function of the global ocean microbiome. 
<i>Science<\/i> <b>348<\/b>, 1261359 (2015).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1126\/science.1261359\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1126%2Fscience.1261359\" aria-label=\"Reference 8\"2222 data-doi=\"10.1126\/science.1261359\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=25999513\" aria-label=\"Reference 8\"2323>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"2424 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Structure%20and%20function%20of%20the%20global%20ocean%20microbiome&#038;journal=Science&#038;doi=10.1126%2Fscience.1261359&#038;volume=348&#038;publication_year=2015&#038;author=Sunagawa%2CS\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"65.\">\n<p id=\"ref-CR65\">Xiao, L. et al. A catalog of the mouse gut metagenome. <i>Nat. 
Biotechnol.<\/i> <b>33<\/b>, 1103\u20131108 (2015).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nbt.3353\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnbt.3353\" aria-label=\"Reference 8\"2525 data-doi=\"10.1038\/nbt.3353\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2MXhsFersLzP\" aria-label=\"Reference 8\"2626>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=26414350\" aria-label=\"Reference 8\"2727>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"2828 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=A%20catalog%20of%20the%20mouse%20gut%20metagenome&#038;journal=Nat.%20Biotechnol.&#038;doi=10.1038%2Fnbt.3353&#038;volume=33&#038;pages=1103-1108&#038;publication_year=2015&#038;author=Xiao%2CL\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"66.\">\n<p id=\"ref-CR66\">Kieser, S., Zdobnov, E. M. &#038; Trajkovski, M. Comprehensive mouse microbiota genome catalog reveals major difference to its human counterpart. <i>PLoS Comput. 
Biol.<\/i> <b>18<\/b>, e1009947 (2022).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1371\/journal.pcbi.1009947\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1371%2Fjournal.pcbi.1009947\" aria-label=\"Reference 8\"2929 data-doi=\"10.1371\/journal.pcbi.1009947\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB38XnvF2qtbs%3D\" aria-label=\"Reference 8\"3030>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=35259160\" aria-label=\"Reference 8\"3131>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC8932566\" aria-label=\"Reference 8\"3232>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"3333 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Comprehensive%20mouse%20microbiota%20genome%20catalog%20reveals%20major%20difference%20to%20its%20human%20counterpart&#038;journal=PLoS%20Comput.%20Biol.&#038;doi=10.1371%2Fjournal.pcbi.1009947&#038;volume=18&#038;publication_year=2022&#038;author=Kieser%2CS&#038;author=Zdobnov%2CEM&#038;author=Trajkovski%2CM\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"67.\">\n<p id=\"ref-CR67\">Kieser, S., Brown, J., Zdobnov, E. M., Trajkovski, M. &#038; McCue, L. A. 
ATLAS: a Snakemake workflow for assembly, annotation, and genomic binning of metagenome sequence data. <i>BMC Bioinf.<\/i> <b>21<\/b>, 257 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s12859-020-03585-4\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs12859-020-03585-4\" aria-label=\"Reference 8\"3434 data-doi=\"10.1186\/s12859-020-03585-4\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"3535 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=ATLAS%3A%20a%20Snakemake%20workflow%20for%20assembly%2C%20annotation%2C%20and%20genomic%20binning%20of%20metagenome%20sequence%20data&#038;journal=BMC%20Bioinf.&#038;doi=10.1186%2Fs12859-020-03585-4&#038;volume=21&#038;publication_year=2020&#038;author=Kieser%2CS&#038;author=Brown%2CJ&#038;author=Zdobnov%2CEM&#038;author=Trajkovski%2CM&#038;author=McCue%2CLA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"68.\">\n<p id=\"ref-CR68\">Wood, D. E., Lu, J. &#038; Langmead, B. Improved metagenomic analysis with Kraken 2. 
<i>Genome Biol.<\/i> <b>20<\/b>, 257 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1186\/s13059-019-1891-0\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1186%2Fs13059-019-1891-0\" aria-label=\"Reference 8\"3636 data-doi=\"10.1186\/s13059-019-1891-0\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXitlynsbvJ\" aria-label=\"Reference 8\"3737>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=31779668\" aria-label=\"Reference 8\"3838>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6883579\" aria-label=\"Reference 8\"3939>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"4040 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Improved%20metagenomic%20analysis%20with%20Kraken%202&#038;journal=Genome%20Biol.&#038;doi=10.1186%2Fs13059-019-1891-0&#038;volume=20&#038;publication_year=2019&#038;author=Wood%2CDE&#038;author=Lu%2CJ&#038;author=Langmead%2CB\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"69.\">\n<p id=\"ref-CR69\">Saenz, C., Nigro, E., Gunalan, V. &#038; Arumugam, M. MIntO: a modular and scalable pipeline for microbiome metagenomic and metatranscriptomic data integration. <i>Front. 
Bioinform.<\/i> <b>2<\/b>, 846922 (2022).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.3389\/fbinf.2022.846922\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.3389%2Ffbinf.2022.846922\" aria-label=\"Reference 8\"4141 data-doi=\"10.3389\/fbinf.2022.846922\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=36304282\" aria-label=\"Reference 8\"4242>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC9580859\" aria-label=\"Reference 8\"4343>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"4444 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=MIntO%3A%20a%20modular%20and%20scalable%20pipeline%20for%20microbiome%20metagenomic%20and%20metatranscriptomic%20data%20integration&#038;journal=Front.%20Bioinform.&#038;doi=10.3389%2Ffbinf.2022.846922&#038;volume=2&#038;publication_year=2022&#038;author=Saenz%2CC&#038;author=Nigro%2CE&#038;author=Gunalan%2CV&#038;author=Arumugam%2CM\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"70.\">\n<p id=\"ref-CR70\">Ley, R. E., Turnbaugh, P. J., Klein, S. &#038; Gordon, J. I. 
Microbial ecology: human gut microbes associated with obesity. <i>Nature<\/i> <b>444<\/b>, 1022\u20131023 (2006).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/4441022a\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2F4441022a\" aria-label=\"Reference 8\"4545 data-doi=\"10.1038\/4441022a\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BD28XhtlemtLvM\" aria-label=\"Reference 8\"4646>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=17183309\" aria-label=\"Reference 8\"4747>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"4848 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Microbial%20ecology%3A%20human%20gut%20microbes%20associated%20with%20obesity.&#038;journal=Nature&#038;doi=10.1038%2F4441022a&#038;volume=444&#038;pages=1022-1023&#038;publication_year=2006&#038;author=Ley%2CRE&#038;author=Turnbaugh%2CPJ&#038;author=Klein%2CS&#038;author=Gordon%2CJI\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"71.\">\n<p id=\"ref-CR71\">Guenther, P. M. et al. Update of the healthy eating index: HEI-2010. <i>J. Acad. Nutr. 
Diet.<\/i> <b>113<\/b>, 569\u2013580 (2013).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1016\/j.jand.2012.12.016\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1016%2Fj.jand.2012.12.016\" aria-label=\"Reference 8\"4949 data-doi=\"10.1016\/j.jand.2012.12.016\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=23415502\" aria-label=\"Reference 8\"5050>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"5151 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Update%20of%20the%20healthy%20eating%20index%3A%20HEI-2010&#038;journal=J.%20Acad.%20Nutr.%20Diet.&#038;doi=10.1016%2Fj.jand.2012.12.016&#038;volume=113&#038;pages=569-580&#038;publication_year=2013&#038;author=Guenther%2CPM\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"72.\">\n<p id=\"ref-CR72\">Fung, T. T. et al. Diet-quality scores and plasma concentrations of markers of inflammation and endothelial dysfunction. <i>Am. J. Clin. 
Nutr.<\/i> <b>82<\/b>, 163\u2013173 (2005).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/ajcn\/82.1.163\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fajcn%2F82.1.163\" aria-label=\"Reference 8\"5252 data-doi=\"10.1093\/ajcn\/82.1.163\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BD2MXmsVSgsL4%3D\" aria-label=\"Reference 8\"5353>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=16002815\" aria-label=\"Reference 8\"5454>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"5555 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Diet-quality%20scores%20and%20plasma%20concentrations%20of%20markers%20of%20inflammation%20and%20endothelial%20dysfunction&#038;journal=Am.%20J.%20Clin.%20Nutr.&#038;doi=10.1093%2Fajcn%2F82.1.163&#038;volume=82&#038;pages=163-173&#038;publication_year=2005&#038;author=Fung%2CTT\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"73.\">\n<p id=\"ref-CR73\">Truong, D. T., Tett, A., Pasolli, E., Huttenhower, C. &#038; Segata, N. Microbial strain-level population structure and genetic diversity from metagenomes. 
<i>Genome Res.<\/i> <b>27<\/b>, 626\u2013638 (2017).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1101\/gr.216242.116\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1101%2Fgr.216242.116\" aria-label=\"Reference 8\"5656 data-doi=\"10.1101\/gr.216242.116\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2sXmtlSqsbs%3D\" aria-label=\"Reference 8\"5757>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=28167665\" aria-label=\"Reference 8\"5858>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5378180\" aria-label=\"Reference 8\"5959>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"6060 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Microbial%20strain-level%20population%20structure%20and%20genetic%20diversity%20from%20metagenomes&#038;journal=Genome%20Res.&#038;doi=10.1101%2Fgr.216242.116&#038;volume=27&#038;pages=626-638&#038;publication_year=2017&#038;author=Truong%2CDT&#038;author=Tett%2CA&#038;author=Pasolli%2CE&#038;author=Huttenhower%2CC&#038;author=Segata%2CN\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"74.\">\n<p id=\"ref-CR74\">Hagan, R. W. et al. Comparison of extraction methods for recovering ancient microbial DNA from paleofeces. <i>Am. J. Phys. 
Anthropol.<\/i> <b>171<\/b>, 275\u2013284 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1002\/ajpa.23978\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1002%2Fajpa.23978\" aria-label=\"Reference 8\"6161 data-doi=\"10.1002\/ajpa.23978\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=31785113\" aria-label=\"Reference 8\"6262>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"6363 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Comparison%20of%20extraction%20methods%20for%20recovering%20ancient%20microbial%20DNA%20from%20paleofeces&#038;journal=Am.%20J.%20Phys.%20Anthropol.&#038;doi=10.1002%2Fajpa.23978&#038;volume=171&#038;pages=275-284&#038;publication_year=2020&#038;author=Hagan%2CRW\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"75.\">\n<p id=\"ref-CR75\">Wright, S. Isolation by distance. 
<i>Genetics<\/i> <b>28<\/b>, 114\u2013138 (1943).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/genetics\/28.2.114\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fgenetics%2F28.2.114\" aria-label=\"Reference 8\"6464 data-doi=\"10.1093\/genetics\/28.2.114\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:STN:280:DC%2BD2s%2FmsFSmsg%3D%3D\" aria-label=\"Reference 8\"6565>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=17247074\" aria-label=\"Reference 8\"6666>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC1209196\" aria-label=\"Reference 8\"6767>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"6868 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Isolation%20by%20distance&#038;journal=Genetics&#038;doi=10.1093%2Fgenetics%2F28.2.114&#038;volume=28&#038;pages=114-138&#038;publication_year=1943&#038;author=Wright%2CS\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"76.\">\n<p id=\"ref-CR76\">Linz, B. et al. An African origin for the intimate association between humans and Helicobacter pylori. 
<i>Nature<\/i> <b>445<\/b>, 915\u2013918 (2007).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nature05562\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnature05562\" aria-label=\"Reference 8\"6969 data-doi=\"10.1038\/nature05562\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=17287725\" aria-label=\"Reference 8\"7070>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC1847463\" aria-label=\"Reference 8\"7171>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"7272 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=An%20African%20origin%20for%20the%20intimate%20association%20between%20humans%20and%20Helicobacter%20pylori&#038;journal=Nature&#038;doi=10.1038%2Fnature05562&#038;volume=445&#038;pages=915-918&#038;publication_year=2007&#038;author=Linz%2CB\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"77.\">\n<p id=\"ref-CR77\">Shao, Y. et al. Stunted microbiota and opportunistic pathogen colonization in caesarean-section birth. 
<i>Nature<\/i> <b>574<\/b>, 117\u2013121 (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41586-019-1560-1\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41586-019-1560-1\" aria-label=\"Reference 8\"7373 data-doi=\"10.1038\/s41586-019-1560-1\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXhvVSgtbzJ\" aria-label=\"Reference 8\"7474>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=31534227\" aria-label=\"Reference 8\"7575>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6894937\" aria-label=\"Reference 8\"7676>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"7777 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Stunted%20microbiota%20and%20opportunistic%20pathogen%20colonization%20in%20caesarean-section%20birth&#038;journal=Nature&#038;doi=10.1038%2Fs41586-019-1560-1&#038;volume=574&#038;pages=117-121&#038;publication_year=2019&#038;author=Shao%2CY\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"78.\">\n<p id=\"ref-CR78\">Valles-Colomer, M. et al. Variation and transmission of the human gut microbiota across multiple familial generations. <i>Nat. 
Microbiol.<\/i> <b>7<\/b>, 87\u201396 (2022).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41564-021-01021-8\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41564-021-01021-8\" aria-label=\"Reference 8\"7878 data-doi=\"10.1038\/s41564-021-01021-8\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB38XhslCrug%3D%3D\" aria-label=\"Reference 8\"7979>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=34969979\" aria-label=\"Reference 8\"8080>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"8181 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Variation%20and%20transmission%20of%20the%20human%20gut%20microbiota%20across%20multiple%20familial%20generations&#038;journal=Nat.%20Microbiol.&#038;doi=10.1038%2Fs41564-021-01021-8&#038;volume=7&#038;pages=87-96&#038;publication_year=2022&#038;author=Valles-Colomer%2CM\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"79.\">\n<p id=\"ref-CR79\">Ianiro, G. et al. Variability of strain engraftment and predictability of microbiome composition after fecal microbiota transplantation across different diseases. <i>Nat. 
Med.<\/i> <b>28<\/b>, 1913\u20131923 (2022).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41591-022-01964-3\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41591-022-01964-3\" aria-label=\"Reference 8\"8282 data-doi=\"10.1038\/s41591-022-01964-3\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB38XisVWnsrjJ\" aria-label=\"Reference 8\"8383>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=36109637\" aria-label=\"Reference 8\"8484>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC9499858\" aria-label=\"Reference 8\"8585>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"8686 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Variability%20of%20strain%20engraftment%20and%20predictability%20of%20microbiome%20composition%20after%20fecal%20microbiota%20transplantation%20across%20different%20diseases&#038;journal=Nat.%20Med.&#038;doi=10.1038%2Fs41591-022-01964-3&#038;volume=28&#038;pages=1913-1923&#038;publication_year=2022&#038;author=Ianiro%2CG\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"80.\">\n<p id=\"ref-CR80\">Hamady, M. &#038; Knight, R. Microbial community profiling for human microbiome projects: tools, techniques and challenges. 
<i>Genome Res.<\/i> <b>19<\/b>, 1141\u20131152 (2009).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1101\/gr.085464.108\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1101%2Fgr.085464.108\" aria-label=\"Reference 8\"8787 data-doi=\"10.1101\/gr.085464.108\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BD1MXosVCktL8%3D\" aria-label=\"Reference 8\"8888>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=19383763\" aria-label=\"Reference 8\"8989>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3776646\" aria-label=\"Reference 8\"9090>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"9191 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Microbial%20community%20profiling%20for%20human%20microbiome%20projects%3A%20tools%2C%20techniques%20and%20challenges&#038;journal=Genome%20Res.&#038;doi=10.1101%2Fgr.085464.108&#038;volume=19&#038;pages=1141-1152&#038;publication_year=2009&#038;author=Hamady%2CM&#038;author=Knight%2CR\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"81.\">\n<p id=\"ref-CR81\">Asnicar, F. et al. Precise phylogenetic analysis of microbial isolates and genomes from metagenomes using PhyloPhlAn 3.0. <i>Nat. 
Commun.<\/i> <b>11<\/b>, 2500 (2020).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41467-020-16366-7\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41467-020-16366-7\" aria-label=\"Reference 8\"9292 data-doi=\"10.1038\/s41467-020-16366-7\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB3cXpvF2lt7o%3D\" aria-label=\"Reference 8\"9393>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=32427907\" aria-label=\"Reference 8\"9494>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7237447\" aria-label=\"Reference 8\"9595>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 8\"9696 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Precise%20phylogenetic%20analysis%20of%20microbial%20isolates%20and%20genomes%20from%20metagenomes%20using%20PhyloPhlAn%203.0&#038;journal=Nat.%20Commun.&#038;doi=10.1038%2Fs41467-020-16366-7&#038;volume=11&#038;publication_year=2020&#038;author=Asnicar%2CF\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"82.\">\n<p id=\"ref-CR82\">McIver, L. J. et al. bioBakery: a meta\u2019omic analysis environment. 
<i>Bioinformatics<\/i> <b>34<\/b>, 1235\u20131237 (2018).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/btx754\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbtx754\" aria-label=\"Reference 8\"9797 data-doi=\"10.1093\/bioinformatics\/btx754\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1cXitlGku7rM\" aria-label=\"Reference 8\"9898>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=29194469\" aria-label=\"Reference 8\"9999>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"0000 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=bioBakery%3A%20a%20meta%E2%80%99omic%20analysis%20environment&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbtx754&#038;volume=34&#038;pages=1235-1237&#038;publication_year=2018&#038;author=McIver%2CLJ\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"83.\">\n<p id=\"ref-CR83\">Langmead, B. &#038; Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. <i>Nat. 
Methods<\/i> <b>9<\/b>, 357\u2013359 (2012).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nmeth.1923\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnmeth.1923\" aria-label=\"Reference 9\"0101 data-doi=\"10.1038\/nmeth.1923\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"0202 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Fast%20gapped-read%20alignment%20with%20Bowtie%202&#038;journal=Nat.%20Methods&#038;doi=10.1038%2Fnmeth.1923&#038;volume=9&#038;pages=357-359&#038;publication_year=2012&#038;author=Longmead%2CB&#038;author=Salzberg%2CSL\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"84.\">\n<p id=\"ref-CR84\">Benson, D. A. et al. GenBank. <i>Nucleic Acids Res.<\/i> <b>41<\/b>, D36\u2013D42 (2012).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/nar\/gks1195\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fnar%2Fgks1195\" aria-label=\"Reference 9\"0303 data-doi=\"10.1093\/nar\/gks1195\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=23193287\" aria-label=\"Reference 9\"0404>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3531190\" aria-label=\"Reference 9\"0505>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"0606 
href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=GenBank&#038;journal=Nucleic%20Acids%20Res.&#038;doi=10.1093%2Fnar%2Fgks1195&#038;volume=41&#038;pages=D36-D42&#038;publication_year=2012&#038;author=Benson%2CDA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"85.\">\n<p id=\"ref-CR85\">Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. &#038; Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. <i>Genome Res.<\/i> <b>25<\/b>, 1043\u20131055 (2015).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1101\/gr.186072.114\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1101%2Fgr.186072.114\" aria-label=\"Reference 9\"0707 data-doi=\"10.1101\/gr.186072.114\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2MXht1SltbfE\" aria-label=\"Reference 9\"0808>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=25977477\" aria-label=\"Reference 9\"0909>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4484387\" aria-label=\"Reference 9\"1010>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"1111 
href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=CheckM%3A%20assessing%20the%20quality%20of%20microbial%20genomes%20recovered%20from%20isolates%2C%20single%20cells%2C%20and%20metagenomes&#038;journal=Genome%20Res.&#038;doi=10.1101%2Fgr.186072.114&#038;volume=25&#038;pages=1043-1055&#038;publication_year=2015&#038;author=Parks%2CDH&#038;author=Imelfort%2CM&#038;author=Skennerton%2CCT&#038;author=Hugenholtz%2CP&#038;author=Tyson%2CGW\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"86.\">\n<p id=\"ref-CR86\">Seemann, T. Prokka: rapid prokaryotic genome annotation. <i>Bioinformatics<\/i> <b>30<\/b>, 2068\u20132069 (2014).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/btu153\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbtu153\" aria-label=\"Reference 9\"1212 data-doi=\"10.1093\/bioinformatics\/btu153\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2cXhtFCrtLjI\" aria-label=\"Reference 9\"1313>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=24642063\" aria-label=\"Reference 9\"1414>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"1515 
href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Prokka%3A%20rapid%20prokaryotic%20genome%20annotation&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbtu153&#038;volume=30&#038;pages=2068-2069&#038;publication_year=2014&#038;author=Seemann%2CT\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"87.\">\n<p id=\"ref-CR87\">Buchfink, B., Xie, C. &#038; Huson, D. H. Fast and sensitive protein alignment using DIAMOND. <i>Nat. Methods<\/i> <b>12<\/b>, 59\u201360 (2015).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nmeth.3176\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnmeth.3176\" aria-label=\"Reference 9\"1616 data-doi=\"10.1038\/nmeth.3176\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2cXhvFKlsrzN\" aria-label=\"Reference 9\"1717>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=25402007\" aria-label=\"Reference 9\"1818>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"1919 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Fast%20and%20sensitive%20protein%20alignment%20using%20DIAMOND&#038;journal=Nat.%20Methods&#038;doi=10.1038%2Fnmeth.3176&#038;volume=12&#038;pages=59-60&#038;publication_year=2015&#038;author=Buchfink%2CB&#038;author=Xie%2CC&#038;author=Huson%2CDH\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"88.\">\n<p 
id=\"ref-CR88\">Steinegger, M. &#038; S\u00f6ding, J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. <i>Nat. Biotechnol.<\/i> <b>35<\/b>, 1026\u20131028 (2017).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nbt.3988\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnbt.3988\" aria-label=\"Reference 9\"2020 data-doi=\"10.1038\/nbt.3988\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2sXhs1GqsLzE\" aria-label=\"Reference 9\"2121>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=29035372\" aria-label=\"Reference 9\"2222>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"2323 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=MMseqs2%20enables%20sensitive%20protein%20sequence%20searching%20for%20the%20analysis%20of%20massive%20data%20sets&#038;journal=Nat.%20Biotechnol.&#038;doi=10.1038%2Fnbt.3988&#038;volume=35&#038;pages=1026-1028&#038;publication_year=2017&#038;author=Steinegger%2CM&#038;author=S%C3%B6ding%2CJ\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"89.\">\n<p id=\"ref-CR89\">Chaumeil, P.-A., Mussig, A. J., Hugenholtz, P. &#038; Parks, D. H. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. 
<i>Bioinformatics<\/i> <a href=\"https:\/\/doi.org\/10.1093\/bioinformatics\/btz848\">https:\/\/doi.org\/10.1093\/bioinformatics\/btz848<\/a> (2019).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/btz848\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbtz848\" aria-label=\"Reference 9\"2424 data-doi=\"10.1093\/bioinformatics\/btz848\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=31730192\" aria-label=\"Reference 9\"2525>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7703759\" aria-label=\"Reference 9\"2626>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"2727 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=GTDB-Tk%3A%20a%20toolkit%20to%20classify%20genomes%20with%20the%20Genome%20Taxonomy%20Database&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbtz848&#038;publication_year=2019&#038;author=Chaumeil%2CP-A&#038;author=Mussig%2CAJ&#038;author=Hugenholtz%2CP&#038;author=Parks%2CDH\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"90.\">\n<p id=\"ref-CR90\">Lozupone, C. &#038; Knight, R. UniFrac: a new phylogenetic method for comparing microbial communities. <i>Appl. Environ. 
Microbiol.<\/i> <b>71<\/b>, 8228\u20138235 (2005).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1128\/AEM.71.12.8228-8235.2005\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1128%2FAEM.71.12.8228-8235.2005\" aria-label=\"Reference 9\"2828 data-doi=\"10.1128\/AEM.71.12.8228-8235.2005\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BD2MXhtlehtb7K\" aria-label=\"Reference 9\"2929>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=16332807\" aria-label=\"Reference 9\"3030>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC1317376\" aria-label=\"Reference 9\"3131>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"3232 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=UniFrac%3A%20a%20new%20phylogenetic%20method%20for%20comparing%20microbial%20communities&#038;journal=Appl.%20Environ.%20Microbiol.&#038;doi=10.1128%2FAEM.71.12.8228-8235.2005&#038;volume=71&#038;pages=8228-8235&#038;publication_year=2005&#038;author=Lozupone%2CC&#038;author=Knight%2CR\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"91.\">\n<p id=\"ref-CR91\">Capella-Guti\u00e9rrez, S., Silla-Mart\u00ednez, J. M. &#038; Gabald\u00f3n, T. 
trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. <i>Bioinformatics<\/i> <b>25<\/b>, 1972\u20131973 (2009).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/btp348\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbtp348\" aria-label=\"Reference 9\"3333 data-doi=\"10.1093\/bioinformatics\/btp348\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=19505945\" aria-label=\"Reference 9\"3434>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2712344\" aria-label=\"Reference 9\"3535>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"3636 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=trimAl%3A%20a%20tool%20for%20automated%20alignment%20trimming%20in%20large-scale%20phylogenetic%20analyses&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbtp348&#038;volume=25&#038;pages=1972-1973&#038;publication_year=2009&#038;author=Capella-Guti%C3%A9rrez%2CS&#038;author=Silla-Mart%C3%ADnez%2CJM&#038;author=Gabald%C3%B3n%2CT\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"92.\">\n<p id=\"ref-CR92\">Katoh, K. &#038; Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. <i>Mol. Biol. 
Evol.<\/i> <b>30<\/b>, 772\u2013780 (2013).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/molbev\/mst010\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fmolbev%2Fmst010\" aria-label=\"Reference 9\"3737 data-doi=\"10.1093\/molbev\/mst010\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC3sXksFWisLc%3D\" aria-label=\"Reference 9\"3838>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=23329690\" aria-label=\"Reference 9\"3939>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3603318\" aria-label=\"Reference 9\"4040>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"4141 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=MAFFT%20multiple%20sequence%20alignment%20software%20version%207%3A%20improvements%20in%20performance%20and%20usability&#038;journal=Mol.%20Biol.%20Evol.&#038;doi=10.1093%2Fmolbev%2Fmst010&#038;volume=30&#038;pages=772-780&#038;publication_year=2013&#038;author=Katoh%2CK&#038;author=Standley%2CDM\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"93.\">\n<p id=\"ref-CR93\">Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. &#038; Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. <i>Mol. Biol. 
Evol.<\/i> <b>32<\/b>, 268\u2013274 (2015).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/molbev\/msu300\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fmolbev%2Fmsu300\" aria-label=\"Reference 9\"4242 data-doi=\"10.1093\/molbev\/msu300\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2MXivFGltrs%3D\" aria-label=\"Reference 9\"4343>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=25371430\" aria-label=\"Reference 9\"4444>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"4545 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=IQ-TREE%3A%20a%20fast%20and%20effective%20stochastic%20algorithm%20for%20estimating%20maximum-likelihood%20phylogenies&#038;journal=Mol.%20Biol.%20Evol.&#038;doi=10.1093%2Fmolbev%2Fmsu300&#038;volume=32&#038;pages=268-274&#038;publication_year=2015&#038;author=Nguyen%2CL-T&#038;author=Schmidt%2CHA&#038;author=Haeseler%2CA&#038;author=Minh%2CBQ\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"94.\">\n<p id=\"ref-CR94\">Huang, W., Li, L., Myers, J. R. &#038; Marth, G. T. ART: a next-generation sequencing read simulator. 
<i>Bioinformatics<\/i> <b>28<\/b>, 593\u2013594 (2012).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/btr708\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbtr708\" aria-label=\"Reference 9\"4646 data-doi=\"10.1093\/bioinformatics\/btr708\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=22199392\" aria-label=\"Reference 9\"4747>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"4848 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=ART%3A%20a%20next-generation%20sequencing%20read%20simulator&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbtr708&#038;volume=28&#038;pages=593-594&#038;publication_year=2012&#038;author=Huang%2CW&#038;author=Li%2CL&#038;author=Myers%2CJR&#038;author=Marth%2CGT\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"95.\">\n<p id=\"ref-CR95\">Pasolli, E. et al. Accessible, curated metagenomic data through ExperimentHub. <i>Nat. 
Methods<\/i> <b>14<\/b>, 1023\u20131024 (2017).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nmeth.4468\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnmeth.4468\" aria-label=\"Reference 9\"4949 data-doi=\"10.1038\/nmeth.4468\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2sXhslCms7%2FF\" aria-label=\"Reference 9\"5050>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=29088129\" aria-label=\"Reference 9\"5151>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5862039\" aria-label=\"Reference 9\"5252>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"5353 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Accessible%2C%20curated%20metagenomic%20data%20through%20ExperimentHub&#038;journal=Nat.%20Methods&#038;doi=10.1038%2Fnmeth.4468&#038;volume=14&#038;pages=1023-1024&#038;publication_year=2017&#038;author=Pasolli%2CE\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"96.\">\n<p id=\"ref-CR96\">Li, H. et al. The sequence alignment\/map format and SAMtools. 
<i>Bioinformatics<\/i> <b>25<\/b>, 2078\u20132079 (2009).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/btp352\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbtp352\" aria-label=\"Reference 9\"5454 data-doi=\"10.1093\/bioinformatics\/btp352\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=19505943\" aria-label=\"Reference 9\"5555>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2723002\" aria-label=\"Reference 9\"5656>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"5757 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=The%20sequence%20alignment%2Fmap%20format%20and%20SAMtools&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbtp352&#038;volume=25&#038;pages=2078-2079&#038;publication_year=2009&#038;author=Li%2CH\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"97.\">\n<p id=\"ref-CR97\">Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. 
<i>Bioinformatics<\/i> <b>30<\/b>, 1312\u20131313 (2014).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/btu033\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbtu033\" aria-label=\"Reference 9\"5858 data-doi=\"10.1093\/bioinformatics\/btu033\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2cXmvFCjsbc%3D\" aria-label=\"Reference 9\"5959>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=24451623\" aria-label=\"Reference 9\"6060>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3998144\" aria-label=\"Reference 9\"6161>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"6262 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=RAxML%20version%208%3A%20a%20tool%20for%20phylogenetic%20analysis%20and%20post-analysis%20of%20large%20phylogenies&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbtu033&#038;volume=30&#038;pages=1312-1313&#038;publication_year=2014&#038;author=Stamatakis%2CA\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"98.\">\n<p id=\"ref-CR98\">Asnicar, F., Weingart, G., Tickle, T. L., Huttenhower, C. &#038; Segata, N. Compact graphical representation of phylogenetic data and metadata with GraPhlAn. 
<i>PeerJ<\/i> <b>3<\/b>, e1029 (2015).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.7717\/peerj.1029\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.7717%2Fpeerj.1029\" aria-label=\"Reference 9\"6363 data-doi=\"10.7717\/peerj.1029\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=26157614\" aria-label=\"Reference 9\"6464>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4476132\" aria-label=\"Reference 9\"6565>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"6666 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Compact%20graphical%20representation%20of%20phylogenetic%20data%20and%20metadata%20with%20GraPhlAn&#038;journal=PeerJ&#038;doi=10.7717%2Fpeerj.1029&#038;volume=3&#038;publication_year=2015&#038;author=Asnicar%2CF&#038;author=Weingart%2CG&#038;author=Tickle%2CTL&#038;author=Huttenhower%2CC&#038;author=Segata%2CN\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"99.\">\n<p id=\"ref-CR99\">Page, A. J. et al. Roary: rapid large-scale prokaryote pan genome analysis. 
<i>Bioinformatics<\/i> <b>31<\/b>, 3691\u20133693 (2015).<\/p>\n<p><a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/bioinformatics\/btv421\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fbioinformatics%2Fbtv421\" aria-label=\"Reference 9\"6767 data-doi=\"10.1093\/bioinformatics\/btv421\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"cas reference\" href=\"http:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC28XhtlamsrfO\" aria-label=\"Reference 9\"6868>CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&#038;db=PubMed&#038;dopt=Abstract&#038;list_uids=26198102\" aria-label=\"Reference 9\"6969>PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4817141\" aria-label=\"Reference 9\"7070>PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click\" data-track-action=\"google scholar reference\" data-track-label=\"link\" rel=\"nofollow noopener\" aria-label=\"Reference 9\"7171 href=\"http:\/\/scholar.google.com\/scholar_lookup?&#038;title=Roary%3A%20rapid%20large-scale%20prokaryote%20pan%20genome%20analysis&#038;journal=Bioinformatics&#038;doi=10.1093%2Fbioinformatics%2Fbtv421&#038;volume=31&#038;pages=3691-3693&#038;publication_year=2015&#038;author=Page%2CAJ\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<\/li>\n<li data-counter=\"100.\">\n<p id=\"ref-CR100\">Blanco-Miguez, A. et al. MetaPhlAn 4 code repository. GitHub. 
<a href=\"http:\/\/segatalab.cibio.unitn.it\/tools\/metaphlan\/\">http:\/\/segatalab.cibio.unitn.it\/tools\/metaphlan\/<\/a> (2022).<\/p>\n<\/li>\n<li data-counter=\"101.\">\n<p id=\"ref-CR101\">Blanco-Miguez, A. et al. MetaPhlAn 4 package. Bioconda. <a href=\"https:\/\/anaconda.org\/bioconda\/metaphlan\">https:\/\/anaconda.org\/bioconda\/metaphlan<\/a> (2022).<\/p>\n<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<div id=\"Ack1-section\" data-title=\"Acknowledgements\">\n<h2 id=\"Ack1\">Acknowledgements<\/h2>\n<p>We would like to thank all the members of the Segata and Huttenhower labs for their insightful contributions to the work, and the users of past versions of MetaPhlAn for their suggestions and support. The work was supported by the European Research Council (ERC-STG project MetaPG-716575 and ERC-COG project microTOUCH-101045015) to N.S., the European H2020 program (ONCOBIOME-825410 project and MASTER-818368 project) to N.S., the National Cancer Institute of the National Institutes of Health (1U01CA230551) to N.S., the Premio Internazionale Lombardia e Ricerca 2019 to N.S., the Harvard Chan Microbiome Analysis Core (to C.H.), the National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health (R24DK110499) to C.H., Cancer Research UK Grand Challenge award C10674\/A27140 to W. Garrett (to C.H.) and the National Institute of Allergy and Infectious Diseases (U19AI110820) to D. 
Rasko (to C.H.).<\/p>\n<\/div>\n<div id=\"author-information-section\" aria-labelledby=\"author-information\" data-title=\"Author information\">\n<h2 id=\"author-information\">Author information<\/h2>\n<div id=\"author-information-content\">\n<h3 id=\"affiliations\">Authors and Affiliations<\/h3>\n<ol>\n<li id=\"Aff1\">\n<p>Department CIBIO, University of Trento, Trento, Italy<\/p>\n<p>Aitor Blanco-M\u00edguez,\u00a0Francesco Beghini,\u00a0Fabio Cumbo,\u00a0Moreno Zolfo,\u00a0Paolo Manghi,\u00a0Leonard Dubois,\u00a0Kun D. Huang,\u00a0Andrew Maltez Thomas,\u00a0Gianmarco Piccinno,\u00a0Elisa Piperni,\u00a0Michal Pun\u010doch\u00e1\u0159,\u00a0Mireia Valles-Colomer,\u00a0Adrian Tett,\u00a0Francesco Asnicar\u00a0&#038;\u00a0Nicola Segata<\/p>\n<\/li>\n<li id=\"Aff2\">\n<p>Harvard T.H. Chan School of Public Health, Boston, MA, USA<\/p>\n<p>Lauren J. McIver,\u00a0Kelsey N. Thompson,\u00a0William A. Nickols,\u00a0Eric A. Franzosa\u00a0&#038;\u00a0Curtis Huttenhower<\/p>\n<\/li>\n<li id=\"Aff3\">\n<p>The Broad Institute of MIT and Harvard, Cambridge, MA, USA<\/p>\n<p>Lauren J. McIver,\u00a0Kelsey N. Thompson,\u00a0William A. Nickols,\u00a0Eric A. Franzosa\u00a0&#038;\u00a0Curtis Huttenhower<\/p>\n<\/li>\n<li id=\"Aff4\">\n<p>IEO, European Institute of Oncology IRCCS, Milan, Italy<\/p>\n<p>Elisa Piperni\u00a0&#038;\u00a0Nicola Segata<\/p>\n<\/li>\n<li id=\"Aff5\">\n<p>Centre for Microbiology and Environmental Systems Science, University of Vienna, Vienna, Austria<\/p>\n<p>Adrian Tett<\/p>\n<\/li>\n<li id=\"Aff6\">\n<p>Zoe Global, London, UK<\/p>\n<p>Francesca Giordano,\u00a0Richard Davies\u00a0&#038;\u00a0Jonathan Wolf<\/p>\n<\/li>\n<li id=\"Aff7\">\n<p>Department of Nutritional Sciences, King\u2019s College London, London, UK<\/p>\n<p>Sarah E. Berry<\/p>\n<\/li>\n<li id=\"Aff8\">\n<p>Department of Twin Research, King\u2019s College London, London, UK<\/p>\n<p>Tim D. 
Spector<\/p>\n<\/li>\n<li id=\"Aff9\">\n<p>Department of Agricultural Sciences, University of Naples, Naples, Italy<\/p>\n<p>Edoardo Pasolli<\/p>\n<\/li>\n<\/ol>\n<h3 id=\"contributions\">Contributions<\/h3>\n<p>A.B.M. and N.S. conceived the study. A.B.M. wrote, validated and tested the code, and performed most of the analyses. F.B., F.C., L.J.M., K.N.T., M.Z., P.M., L.D., K.D.H., A.M.T., W.A.N., G.P., E. Piperni, M.P., M.V.C., A.T. and F.A. supported the development and validation of the method and of the software and contributed to the analyses. A.B.M., F.A., C.H. and N.S. wrote the paper with contributions and editing from all the authors. C.H. and N.S. supervised the work. All the authors read and approved the final version of the manuscript.<\/p>\n<h3 id=\"corresponding-author\">Corresponding author<\/h3>\n<p id=\"corresponding-author-list\">Correspondence to<br \/>\n                <a id=\"corresp-c1\" href=\"http:\/\/www.nature.com\/mailto:ni***********@***tn.it\" data-original-string=\"hVQqBj+F+Vr9Caodk5z\/xg==7f49Xe6tDGJDZ3UgQVmBbeZtYioj4Zjf4WcnwP0zx1Dx3w=\" title=\"This contact has been encoded by Anti-Spam by CleanTalk. Click to decode. To finish the decoding make sure that JavaScript is enabled in your browser.\">Nicola Segata<\/a>.<\/p>\n<\/div>\n<\/div>\n<div id=\"ethics-section\" data-title=\"Ethics declarations\">\n<h2 id=\"ethics\">Ethics declarations<\/h2>\n<div id=\"ethics-content\">\n<h3 id=\"FPar2\">Competing interests<\/h3>\n<p>S.E.B., T.D.S., F.A. and N.S. are consultants to Zoe Global. F.G., R.D. and J.W. are employees of Zoe Global. The other authors declare no competing interests.<\/p>\n<\/p><\/div>\n<\/div>\n<div id=\"peer-review-section\" data-title=\"Peer review\">\n<h2 id=\"peer-review\">Peer review<\/h2>\n<div id=\"peer-review-content\">\n<h3 id=\"FPar1\">Peer review information<\/h3>\n<p><i>Nature Biotechnology<\/i> thanks C. 
Titus Brown and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.<\/p>\n<\/p><\/div>\n<\/div>\n<div id=\"additional-information-section\" data-title=\"Additional information\">\n<h2 id=\"additional-information\">Additional information<\/h2>\n<p><b>Publisher\u2019s note<\/b> Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.<\/p>\n<\/div>\n<div id=\"Sec33-section\" data-title=\"Supplementary information\">\n<h2 id=\"Sec33\">Supplementary information<\/h2>\n<\/div>\n<div id=\"rightslink-section\" data-title=\"Rights and permissions\">\n<h2 id=\"rightslink\">Rights and permissions<\/h2>\n<div id=\"rightslink-content\">\n<p><b>Open Access<\/b>  This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article\u2019s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article\u2019s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. 
To view a copy of this license, visit <a href=\"http:\/\/creativecommons.org\/licenses\/by\/4.0\/\" rel=\"license\">http:\/\/creativecommons.org\/licenses\/by\/4.0\/<\/a>.<\/p>\n<p><a data-track=\"click\" data-track-action=\"view rights and permissions\" data-track-label=\"link\" href=\"https:\/\/s100.copyright.com\/AppDispatchServlet?title=Extending%20and%20improving%20metagenomic%20taxonomic%20profiling%20with%20uncharacterized%20species%20using%20MetaPhlAn%204&#038;author=Aitor%20Blanco-M%C3%ADguez%20et%20al&#038;contentID=10.1038%2Fs41587-023-01688-w&#038;copyright=The%20Author%28s%29&#038;publication=1087-0156&#038;publicationDate=2023-02-23&#038;publisherName=SpringerNature&#038;orderBeanReset=true&#038;oa=CC%20BY\">Reprints and Permissions<\/a><\/p>\n<\/div>\n<\/div>\n<div id=\"article-info-section\" aria-labelledby=\"article-info\" data-title=\"About this article\">\n<h2 id=\"article-info\">About this article<\/h2>\n<div id=\"article-info-content\">\n<p><a data-crossmark=\"10.1038\/s41587-023-01688-w\" target=\"_blank\" rel=\"noopener\" href=\"https:\/\/crossmark.crossref.org\/dialog\/?doi=10.1038\/s41587-023-01688-w\" data-track=\"click\" data-track-action=\"Click Crossmark\" data-track-label=\"link\" data-test=\"crossmark\"><img loading=\"lazy\" decoding=\"async\" width=\"57\" height=\"81\" alt=\"Verify currency and authenticity via CrossMark\" 
src=\"data:image\/svg+xml;base64,PHN2ZyBoZWlnaHQ9IjgxIiB3aWR0aD0iNTciIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyI+PGcgZmlsbD0ibm9uZSIgZmlsbC1ydWxlPSJldmVub2RkIj48cGF0aCBkPSJtMTcuMzUgMzUuNDUgMjEuMy0xNC4ydi0xNy4wM2gtMjEuMyIgZmlsbD0iIzk4OTg5OCIvPjxwYXRoIGQ9Im0zOC42NSAzNS40NS0yMS4zLTE0LjJ2LTE3LjAzaDIxLjMiIGZpbGw9IiM3NDc0NzQiLz48cGF0aCBkPSJtMjggLjVjLTEyLjk4IDAtMjMuNSAxMC41Mi0yMy41IDIzLjVzMTAuNTIgMjMuNSAyMy41IDIzLjUgMjMuNS0xMC41MiAyMy41LTIzLjVjMC02LjIzLTIuNDgtMTIuMjEtNi44OC0xNi42Mi00LjQxLTQuNC0xMC4zOS02Ljg4LTE2LjYyLTYuODh6bTAgNDEuMjVjLTkuOCAwLTE3Ljc1LTcuOTUtMTcuNzUtMTcuNzVzNy45NS0xNy43NSAxNy43NS0xNy43NSAxNy43NSA3Ljk1IDE3Ljc1IDE3Ljc1YzAgNC43MS0xLjg3IDkuMjItNS4yIDEyLjU1cy03Ljg0IDUuMi0xMi41NSA1LjJ6IiBmaWxsPSIjNTM1MzUzIi8+PHBhdGggZD0ibTQxIDM2Yy01LjgxIDYuMjMtMTUuMjMgNy40NS0yMi40MyAyLjktNy4yMS00LjU1LTEwLjE2LTEzLjU3LTcuMDMtMjEuNWwtNC45Mi0zLjExYy00Ljk1IDEwLjctMS4xOSAyMy40MiA4Ljc4IDI5LjcxIDkuOTcgNi4zIDIzLjA3IDQuMjIgMzAuNi00Ljg2eiIgZmlsbD0iIzljOWM5YyIvPjxwYXRoIGQ9Im0uMiA1OC40NWMwLS43NS4xMS0xLjQyLjMzLTIuMDFzLjUyLTEuMDkuOTEtMS41Yy4zOC0uNDEuODMtLjczIDEuMzQtLjk0LjUxLS4yMiAxLjA2LS4zMiAxLjY1LS4zMi41NiAwIDEuMDYuMTEgMS41MS4zNS40NC4yMy44MS41IDEuMS44MWwtLjkxIDEuMDFjLS4yNC0uMjQtLjQ5LS40Mi0uNzUtLjU2LS4yNy0uMTMtLjU4LS4yLS45My0uMi0uMzkgMC0uNzMuMDgtMS4wNS4yMy0uMzEuMTYtLjU4LjM3LS44MS42Ni0uMjMuMjgtLjQxLjYzLS41MyAxLjA0LS4xMy40MS0uMTkuODgtLjE5IDEuMzkgMCAxLjA0LjIzIDEuODYuNjggMi40Ni40NS41OSAxLjA2Ljg4IDEuODQuODguNDEgMCAuNzctLjA3IDEuMDctLjIzcy41OS0uMzkuODUtLjY4bC45MSAxYy0uMzguNDMtLjguNzYtMS4yOC45OS0uNDcuMjItMSAuMzQtMS41OC4zNC0uNTkgMC0xLjEzLS4xLTEuNjQtLjMxLS41LS4yLS45NC0uNTEtMS4zMS0uOTEtLjM4LS40LS42Ny0uOS0uODgtMS40OC0uMjItLjU5LS4zMy0xLjI2LS4zMy0yLjAyem04LjQtNS4zM2gxLjYxdjIuNTRsLS4wNSAxLjMzYy4yOS0uMjcuNjEtLjUxLjk2LS43MnMuNzYtLjMxIDEuMjQtLjMxYy43MyAwIDEuMjcuMjMgMS42MS43MS4zMy40Ny41IDEuMTQuNSAyLjAydjQuMzFoLTEuNjF2LTQuMWMwLS41Ny0uMDgtLjk3LS4yNS0xLjIxLS4xNy0uMjMtLjQ1LS4zNS0uODMtLjM1LS4zIDAtLjU2LjA4LS43OS4yMi0uMjMuMTUtLjQ5LjM2LS43OC42NHY0LjhoLTEuNjF6bTcuMzcgNi40NWMwLS41Ni4wOS0xLjA2LjI2LTEuNTEuMTgtLjQ1LjQyLS4
4My43MS0xLjE0LjI5LS4zLjYzLS41NCAxLjAxLS43MS4zOS0uMTcuNzgtLjI1IDEuMTgtLjI1LjQ3IDAgLjg4LjA4IDEuMjMuMjQuMzYuMTYuNjUuMzguODkuNjdzLjQyLjYzLjU0IDEuMDNjLjEyLjQxLjE4Ljg0LjE4IDEuMzIgMCAuMzItLjAyLjU3LS4wNy43NmgtNC4zNmMuMDcuNjIuMjkgMS4xLjY1IDEuNDQuMzYuMzMuODIuNSAxLjM4LjUuMjkgMCAuNTctLjA0LjgzLS4xM3MuNTEtLjIxLjc2LS4zN2wuNTUgMS4wMWMtLjMzLjIxLS42OS4zOS0xLjA5LjUzLS40MS4xNC0uODMuMjEtMS4yNi4yMS0uNDggMC0uOTItLjA4LTEuMzQtLjI1LS40MS0uMTYtLjc2LS40LTEuMDctLjctLjMxLS4zMS0uNTUtLjY5LS43Mi0xLjEzLS4xOC0uNDQtLjI2LS45NS0uMjYtMS41MnptNC42LS42MmMwLS41NS0uMTEtLjk4LS4zNC0xLjI4LS4yMy0uMzEtLjU4LS40Ny0xLjA2LS40Ny0uNDEgMC0uNzcuMTUtMS4wNy40NS0uMzEuMjktLjUuNzMtLjU4IDEuM3ptMi41LjYyYzAtLjU3LjA5LTEuMDguMjgtMS41My4xOC0uNDQuNDMtLjgyLjc1LTEuMTNzLjY5LS41NCAxLjEtLjcxYy40Mi0uMTYuODUtLjI0IDEuMzEtLjI0LjQ1IDAgLjg0LjA4IDEuMTcuMjNzLjYxLjM0Ljg1LjU3bC0uNzcgMS4wMmMtLjE5LS4xNi0uMzgtLjI4LS41Ni0uMzctLjE5LS4wOS0uMzktLjE0LS42MS0uMTQtLjU2IDAtMS4wMS4yMS0xLjM1LjYzLS4zNS40MS0uNTIuOTctLjUyIDEuNjcgMCAuNjkuMTcgMS4yNC41MSAxLjY2LjM0LjQxLjc4LjYyIDEuMzIuNjIuMjggMCAuNTQtLjA2Ljc4LS4xNy4yNC0uMTIuNDUtLjI2LjY0LS40MmwuNjcgMS4wM2MtLjMzLjI5LS42OS41MS0xLjA4LjY1LS4zOS4xNS0uNzguMjMtMS4xOC4yMy0uNDYgMC0uOS0uMDgtMS4zMS0uMjQtLjQtLjE2LS43NS0uMzktMS4wNS0uN3MtLjUzLS42OS0uNy0xLjEzYy0uMTctLjQ1LS4yNS0uOTYtLjI1LTEuNTN6bTYuOTEtNi40NWgxLjU4djYuMTdoLjA1bDIuNTQtMy4xNmgxLjc3bC0yLjM1IDIuOCAyLjU5IDQuMDdoLTEuNzVsLTEuNzctMi45OC0xLjA4IDEuMjN2MS43NWgtMS41OHptMTMuNjkgMS4yN2MtLjI1LS4xMS0uNS0uMTctLjc1LS4xNy0uNTggMC0uODcuMzktLjg3IDEuMTZ2Ljc1aDEuMzR2MS4yN2gtMS4zNHY1LjZoLTEuNjF2LTUuNmgtLjkydi0xLjJsLjkyLS4wN3YtLjcyYzAtLjM1LjA0LS42OC4xMy0uOTguMDgtLjMxLjIxLS41Ny40LS43OXMuNDItLjM5LjcxLS41MWMuMjgtLjEyLjYzLS4xOCAxLjA0LS4xOC4yNCAwIC40OC4wMi42OS4wNy4yMi4wNS40MS4xLjU3LjE3em0uNDggNS4xOGMwLS41Ny4wOS0xLjA4LjI3LTEuNTMuMTctLjQ0LjQxLS44Mi43Mi0xLjEzLjMtLjMxLjY1LS41NCAxLjA0LS43MS4zOS0uMTYuOC0uMjQgMS4yMy0uMjRzLjg0LjA4IDEuMjQuMjRjLjQuMTcuNzQuNCAxLjA0Ljcxcy41NC42OS43MiAxLjEzYy4xOS40NS4yOC45Ni4yOCAxLjUzcy0uMDkgMS4wOC0uMjggMS41M2MtLjE4LjQ0LS40Mi44Mi0uNzIgMS4xM3MtLjY0LjU0LTEuMDQuNy0uODEuMjQtMS4yNC4
yNC0uODQtLjA4LTEuMjMtLjI0LS43NC0uMzktMS4wNC0uN2MtLjMxLS4zMS0uNTUtLjY5LS43Mi0xLjEzLS4xOC0uNDUtLjI3LS45Ni0uMjctMS41M3ptMS42NSAwYzAgLjY5LjE0IDEuMjQuNDMgMS42Ni4yOC40MS42OC42MiAxLjE4LjYyLjUxIDAgLjktLjIxIDEuMTktLjYyLjI5LS40Mi40NC0uOTcuNDQtMS42NiAwLS43LS4xNS0xLjI2LS40NC0xLjY3LS4yOS0uNDItLjY4LS42My0xLjE5LS42My0uNSAwLS45LjIxLTEuMTguNjMtLjI5LjQxLS40My45Ny0uNDMgMS42N3ptNi40OC0zLjQ0aDEuMzNsLjEyIDEuMjFoLjA1Yy4yNC0uNDQuNTQtLjc5Ljg4LTEuMDIuMzUtLjI0LjctLjM2IDEuMDctLjM2LjMyIDAgLjU5LjA1Ljc4LjE0bC0uMjggMS40LS4zMy0uMDljLS4xMS0uMDEtLjIzLS4wMi0uMzgtLjAyLS4yNyAwLS41Ni4xLS44Ni4zMXMtLjU1LjU4LS43NyAxLjF2NC4yaC0xLjYxem0tNDcuODcgMTVoMS42MXY0LjFjMCAuNTcuMDguOTcuMjUgMS4yLjE3LjI0LjQ0LjM1LjgxLjM1LjMgMCAuNTctLjA3LjgtLjIyLjIyLS4xNS40Ny0uMzkuNzMtLjczdi00LjdoMS42MXY2Ljg3aC0xLjMybC0uMTItMS4wMWgtLjA0Yy0uMy4zNi0uNjMuNjQtLjk4Ljg2LS4zNS4yMS0uNzYuMzItMS4yNC4zMi0uNzMgMC0xLjI3LS4yNC0xLjYxLS43MS0uMzMtLjQ3LS41LTEuMTQtLjUtMi4wMnptOS40NiA3LjQzdjIuMTZoLTEuNjF2LTkuNTloMS4zM2wuMTIuNzJoLjA1Yy4yOS0uMjQuNjEtLjQ1Ljk3LS42My4zNS0uMTcuNzItLjI2IDEuMS0uMjYuNDMgMCAuODEuMDggMS4xNS4yNC4zMy4xNy42MS40Ljg0LjcxLjI0LjMxLjQxLjY4LjUzIDEuMTEuMTMuNDIuMTkuOTEuMTkgMS40NCAwIC41OS0uMDkgMS4xMS0uMjUgMS41Ny0uMTYuNDctLjM4Ljg1LS42NSAxLjE2LS4yNy4zMi0uNTguNTYtLjk0LjczLS4zNS4xNi0uNzIuMjUtMS4xLjI1LS4zIDAtLjYtLjA3LS45LS4ycy0uNTktLjMxLS44Ny0uNTZ6bTAtMi4zYy4yNi4yMi41LjM3LjczLjQ1LjI0LjA5LjQ2LjEzLjY2LjEzLjQ2IDAgLjg0LS4yIDEuMTUtLjYuMzEtLjM5LjQ2LS45OC40Ni0xLjc3IDAtLjY5LS4xMi0xLjIyLS4zNS0xLjYxLS4yMy0uMzgtLjYxLS41Ny0xLjEzLS41Ny0uNDkgMC0uOTkuMjYtMS41Mi43N3ptNS44Ny0xLjY5YzAtLjU2LjA4LTEuMDYuMjUtMS41MS4xNi0uNDUuMzctLjgzLjY1LTEuMTQuMjctLjMuNTgtLjU0LjkzLS43MXMuNzEtLjI1IDEuMDgtLjI1Yy4zOSAwIC43My4wNyAxIC4yLjI3LjE0LjU0LjMyLjgxLjU1bC0uMDYtMS4xdi0yLjQ5aDEuNjF2OS44OGgtMS4zM2wtLjExLS43NGgtLjA2Yy0uMjUuMjUtLjU0LjQ2LS44OC42NC0uMzMuMTgtLjY5LjI3LTEuMDYuMjctLjg3IDAtMS41Ni0uMzItMi4wNy0uOTVzLS43Ni0xLjUxLS43Ni0yLjY1em0xLjY3LS4wMWMwIC43NC4xMyAxLjMxLjQgMS43LjI2LjM4LjY1LjU4IDEuMTUuNTguNTEgMCAuOTktLjI2IDEuNDQtLjc3di0zLjIxYy0uMjQtLjIxLS40OC0uMzYtLjctLjQ1LS4yMy0uMDgtLjQ2LS4xMi0uNy0uMTI
tLjQ1IDAtLjgyLjE5LTEuMTMuNTktLjMxLjM5LS40Ni45NS0uNDYgMS42OHptNi4zNSAxLjU5YzAtLjczLjMyLTEuMy45Ny0xLjcxLjY0LS40IDEuNjctLjY4IDMuMDgtLjg0IDAtLjE3LS4wMi0uMzQtLjA3LS41MS0uMDUtLjE2LS4xMi0uMy0uMjItLjQzcy0uMjItLjIyLS4zOC0uM2MtLjE1LS4wNi0uMzQtLjEtLjU4LS4xLS4zNCAwLS42OC4wNy0xIC4ycy0uNjMuMjktLjkzLjQ3bC0uNTktMS4wOGMuMzktLjI0LjgxLS40NSAxLjI4LS42My40Ny0uMTcuOTktLjI2IDEuNTQtLjI2Ljg2IDAgMS41MS4yNSAxLjkzLjc2cy42MyAxLjI1LjYzIDIuMjF2NC4wN2gtMS4zMmwtLjEyLS43NmgtLjA1Yy0uMy4yNy0uNjMuNDgtLjk4LjY2cy0uNzMuMjctMS4xNC4yN2MtLjYxIDAtMS4xLS4xOS0xLjQ4LS41Ni0uMzgtLjM2LS41Ny0uODUtLjU3LTEuNDZ6bTEuNTctLjEyYzAgLjMuMDkuNTMuMjcuNjcuMTkuMTQuNDIuMjEuNzEuMjEuMjggMCAuNTQtLjA3Ljc3LS4ycy40OC0uMzEuNzMtLjU2di0xLjU0Yy0uNDcuMDYtLjg2LjEzLTEuMTguMjMtLjMxLjA5LS41Ny4xOS0uNzYuMzFzLS4zMy4yNS0uNDEuNGMtLjA5LjE1LS4xMy4zMS0uMTMuNDh6bTYuMjktMy42M2gtLjk4di0xLjJsMS4wNi0uMDcuMi0xLjg4aDEuMzR2MS44OGgxLjc1djEuMjdoLTEuNzV2My4yOGMwIC44LjMyIDEuMi45NyAxLjIuMTIgMCAuMjQtLjAxLjM3LS4wNC4xMi0uMDMuMjQtLjA3LjM0LS4xMWwuMjggMS4xOWMtLjE5LjA2LS40LjEyLS42NC4xNy0uMjMuMDUtLjQ5LjA4LS43Ni4wOC0uNCAwLS43NC0uMDYtMS4wMi0uMTgtLjI3LS4xMy0uNDktLjMtLjY3LS41Mi0uMTctLjIxLS4zLS40OC0uMzctLjc4LS4wOC0uMy0uMTItLjY0LS4xMi0xLjAxem00LjM2IDIuMTdjMC0uNTYuMDktMS4wNi4yNy0xLjUxcy40MS0uODMuNzEtMS4xNGMuMjktLjMuNjMtLjU0IDEuMDEtLjcxLjM5LS4xNy43OC0uMjUgMS4xOC0uMjUuNDcgMCAuODguMDggMS4yMy4yNC4zNi4xNi42NS4zOC44OS42N3MuNDIuNjMuNTQgMS4wM2MuMTIuNDEuMTguODQuMTggMS4zMiAwIC4zMi0uMDIuNTctLjA3Ljc2aC00LjM3Yy4wOC42Mi4yOSAxLjEuNjUgMS40NC4zNi4zMy44Mi41IDEuMzguNS4zIDAgLjU4LS4wNC44NC0uMTMuMjUtLjA5LjUxLS4yMS43Ni0uMzdsLjU0IDEuMDFjLS4zMi4yMS0uNjkuMzktMS4wOS41M3MtLjgyLjIxLTEuMjYuMjFjLS40NyAwLS45Mi0uMDgtMS4zMy0uMjUtLjQxLS4xNi0uNzctLjQtMS4wOC0uNy0uMy0uMzEtLjU0LS42OS0uNzItMS4xMy0uMTctLjQ0LS4yNi0uOTUtLjI2LTEuNTJ6bTQuNjEtLjYyYzAtLjU1LS4xMS0uOTgtLjM0LTEuMjgtLjIzLS4zMS0uNTgtLjQ3LTEuMDYtLjQ3LS40MSAwLS43Ny4xNS0xLjA4LjQ1LS4zMS4yOS0uNS43My0uNTcgMS4zem0zLjAxIDIuMjNjLjMxLjI0LjYxLjQzLjkyLjU3LjMuMTMuNjMuMi45OC4yLjM4IDAgLjY1LS4wOC44My0uMjNzLjI3LS4zNS4yNy0uNmMwLS4xNC0uMDUtLjI2LS4xMy0uMzctLjA4LS4xLS4yLS4yLS4zNC0uMjg
tLjE0LS4wOS0uMjktLjE2LS40Ny0uMjNsLS41My0uMjJjLS4yMy0uMDktLjQ2LS4xOC0uNjktLjMtLjIzLS4xMS0uNDQtLjI0LS42Mi0uNHMtLjMzLS4zNS0uNDUtLjU1Yy0uMTItLjIxLS4xOC0uNDYtLjE4LS43NSAwLS42MS4yMy0xLjEuNjgtMS40OS40NC0uMzggMS4wNi0uNTcgMS44My0uNTcuNDggMCAuOTEuMDggMS4yOS4yNXMuNzEuMzYuOTkuNTdsLS43NC45OGMtLjI0LS4xNy0uNDktLjMyLS43My0uNDItLjI1LS4xMS0uNTEtLjE2LS43OC0uMTYtLjM1IDAtLjYuMDctLjc2LjIxLS4xNy4xNS0uMjUuMzMtLjI1LjU0IDAgLjE0LjA0LjI2LjEyLjM2cy4xOC4xOC4zMS4yNmMuMTQuMDcuMjkuMTQuNDYuMjFsLjU0LjE5Yy4yMy4wOS40Ny4xOC43LjI5cy40NC4yNC42NC40Yy4xOS4xNi4zNC4zNS40Ni41OC4xMS4yMy4xNy41LjE3LjgyIDAgLjMtLjA2LjU4LS4xNy44My0uMTIuMjYtLjI5LjQ4LS41MS42OC0uMjMuMTktLjUxLjM0LS44NC40NS0uMzQuMTEtLjcyLjE3LTEuMTUuMTctLjQ4IDAtLjk1LS4wOS0xLjQxLS4yNy0uNDYtLjE5LS44Ni0uNDEtMS4yLS42OHoiIGZpbGw9IiM1MzUzNTMiLz48L2c+PC9zdmc+\"><\/a><\/p>\n<div>\n<h3 id=\"citeas\">Cite this article<\/h3>\n<p>Blanco-M\u00edguez, A., Beghini, F., Cumbo, F. <i>et al.<\/i> Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4.<br \/>\n                    <i>Nat Biotechnol<\/i>  (2023). 
https:\/\/doi.org\/10.1038\/s41587-023-01688-w<\/p>\n<p><a data-test=\"citation-link\" data-track=\"click\" data-track-action=\"download article citation\" data-track-label=\"link\" data-track-external rel=\"nofollow\" href=\"https:\/\/citation-needed.springer.com\/v2\/references\/10.1038\/s41587-023-01688-w?format=refman&#038;flavour=citation\">Download citation<\/a><\/p>\n<ul data-test=\"publication-history\">\n<li>\n<p>Received<span>: <\/span><span><time datetime=\"2022-06-07\">07 June 2022<\/time><\/span><\/p>\n<\/li>\n<li>\n<p>Accepted<span>: <\/span><span><time datetime=\"2023-01-20\">20 January 2023<\/time><\/span><\/p>\n<\/li>\n<li>\n<p>Published<span>: <\/span><span><time datetime=\"2023-02-23\">23 February 2023<\/time><\/span><\/p>\n<\/li>\n<li>\n<p><abbr title=\"Digital Object Identifier\">DOI<\/abbr><span>: <\/span><span>https:\/\/doi.org\/10.1038\/s41587-023-01688-w<\/span><\/p>\n<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div><\/div>\n<p><a href=\"https:\/\/www.nature.com\/articles\/s41587-023-01688-w\" class=\"button purchase\" rel=\"nofollow noopener\" target=\"_blank\">Read More<\/a><br \/>\n Aitor Blanco-M\u00edguez<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Over the last 25 years, shotgun metagenomic sequencing<sup>1<\/sup> and associated computational methods have developed as robust, efficient ways to study the taxonomic composition<sup>2,3,4,5,6<\/sup> and functional potential<sup>4,7,8<\/sup> of complex microbial communities populating human, animal and natural environments. 
Genome assembly methods developed for microbial isolates have been expanded to apply to shotgun metagenomes, but while they excel<\/p>\n","protected":false},"author":1,"featured_media":611613,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[33779,28369,536],"tags":[],"class_list":{"0":"post-611612","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-extending","8":"category-improving","9":"category-science-nature"},"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts\/611612","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/comments?post=611612"}],"version-history":[{"count":0,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts\/611612\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/media\/611613"}],"wp:attachment":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/media?parent=611612"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/categories?post=611612"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/tags?post=611612"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}