Integrative analysis of multimodal mass spectrometry data in MZmine 3

Science & Nature

Innovation in mass spectrometry (MS) and the rapidly increasing throughput and sensitivity of MS instrumentation require adaptations and innovations in data processing tools. Here, we introduce MZmine 3, a scalable MS data analysis platform that supports hybrid datasets from various instrumental setups, including liquid and gas chromatography (LC and GC)–MS, ion mobility spectrometry (IMS)–MS and MS imaging. In particular, the integration of IMS–MS imaging and LC–IMS–MS datasets provides opportunities for spatial metabolomics analyses with increased annotation confidence.

Over the past decade, the MZmine project has evolved into a community-driven, collaborative effort. As an open-source ecosystem for MS data processing, MZmine is cross-platform software (Supplementary Note 1) that can be tuned for robust, scalable and reproducible data analysis on personal computers as well as high-performance supercomputers. The project has seen continuous development since its inception in 2004 (refs. 1,2). Community additions (Fig. 1a) introduced various functions, such as performant feature detection workflows (refs. 3,4), modules for lipid annotation (ref. 5) and strong ties to other community projects (Fig. 1b). Here, data exchange formats and direct interfaces (listed under ‘Tool integration’ in the documentation) enable downstream analysis in external tools, such as compound annotation in SIRIUS (ref. 6) and statistical analysis in MetaboAnalyst (ref. 7), and directly bind MZmine results into the molecular networking ecosystem of the Global Natural Products Social Molecular Networking (GNPS) web platform (Supplementary Note 2; refs. 8–10).

Fig. 1: MZmine, an open-source community project for integrative LC–IMS–MS and IMS–MS data processing.

a, Overview of active developments and key additions to MZmine since the first publication, which led to over 180 modules that now drive interactive, reproducible and efficient data processing and visualization in MZmine 3. b, Data exchange formats and direct interfaces enable downstream analysis with strong ties to projects like GNPS, SIRIUS and MetaboAnalyst. c, The integrative LC–MS and IMS–MS imaging workflow applies feature detection in the retention time (RT), ion mobility and m/z dimensions to MS data stored in open or vendor formats. Comprehensive processing and annotation results are merged into an aligned feature list. d, An aligned feature list with one ion feature detected in LC–IMS–MS samples and aligned to one MALDI–IMS–MS ion feature image. Annotation results (‘Lipid annotation’ column) and interactive charts include the table columns ‘Shapes’ (extracted ion chromatograms), ‘Mobilograms’ (extracted ion mobilograms) and ‘Images’ (extracted ion images).

Recent advances in MS instrumentation push sensitivity, resolving power and data acquisition speed, resulting in increased data volume and complexity. Notably, IMS is gaining traction in the field by adding a separation dimension to LC–MS or imaging-based techniques like matrix-assisted laser desorption/ionization (MALDI)–MS. These advances introduce new acquisition modes (for example, parallel accumulation–serial fragmentation (PASEF); ref. 11) or enable the combination of IMS and imaging, which was shown to improve annotation quality in MS imaging (ref. 12). Furthermore, the number of large-scale cohort and multifactorial studies in clinical, environmental and other fields is growing, as registered in the three main metabolomics data repositories: MassIVE/GNPS (ref. 8), MetaboLights and Metabolomics Workbench (ref. 13). The need for scalable, reproducible and flexible data analysis workflows that can combine MS data from various sources remains unaddressed by existing tools. For example, to combine LC–(IMS–)MS and MS imaging results from the same sample, users are forced to master multiple software tools (ref. 12) that divide the workflow and are specialized for either chromatography–MS (for example, MS-DIAL, XCMS, OpenMS; refs. 14–16) or MS imaging (for example, METASPACE, rMSI, Cardinal MSI, SpectralAnalysis; ref. 17).

The integrative spatial metabolomics workflow in MZmine 3 (Fig. 1c) imports LC–IMS–MS and IMS–MS imaging datasets stored in either open or vendor-specific formats and processes them by non-targeted feature detection. This entails resolving peak shapes for ion features in both the retention time (RT) and ion mobility dimension in LC–IMS–MS and extracting mobility-resolved ion image features with spatial distributions in IMS–MS imaging (Supplementary Figs. 1–3). Individual features from both methodologies are subsequently represented and aligned by their RT (LC only), m/z and ion mobility values. The resulting aligned feature list combines the strengths of the individual analytical methods by integrating the compound annotation capabilities of modern chromatography-based MS with spatial metabolite distributions that can be mapped to histological data, addressing the issue of missing MS2 data in most imaging studies. For data evaluation, MZmine organizes annotations in a feature table with interactive charts, exemplified in Fig. 1d for one ion feature detected in LC–IMS–MS samples and aligned to an ion image from one MALDI–IMS–MS imaging dataset. An exemplary spatial metabolomics workflow leading to LC–IMS–MS-resolved molecular networks, enriched with spatial ion feature information, is described in Supplementary Note 2 and Supplementary Fig. 4. Additional visualization modules (Supplementary Fig. 5) connect all available data dimensions; a fast memory-mapped data back end enables interactive exploration.
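The alignment step described above can be illustrated with a minimal sketch. This is not MZmine's implementation: the `Feature` class, the `align` function, the tolerance defaults and all numeric values are hypothetical and chosen only for illustration. The sketch pairs each LC–IMS–MS feature with the closest imaging feature whose m/z (in ppm) and ion mobility both fall within a tolerance; RT is carried along but not used for matching, because imaging features have no RT.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feature:
    """Hypothetical, simplified ion feature (not MZmine's data model)."""
    name: str
    mz: float
    mobility: float             # e.g. 1/K0 for TIMS data (assumed unit)
    rt: Optional[float] = None  # minutes; None for imaging features


def ppm_error(mz_a: float, mz_b: float) -> float:
    """Relative m/z difference in parts per million."""
    return abs(mz_a - mz_b) / mz_b * 1e6


def align(lc_features, image_features, mz_tol_ppm=10.0, mobility_tol=0.02):
    """Pair each LC feature with the closest imaging feature inside both tolerances."""
    pairs = []
    for lc in lc_features:
        candidates = [
            img for img in image_features
            if ppm_error(lc.mz, img.mz) <= mz_tol_ppm
            and abs(lc.mobility - img.mobility) <= mobility_tol
        ]
        # Choose the candidate with the smallest m/z error, if any exists.
        best = min(candidates, key=lambda img: ppm_error(lc.mz, img.mz), default=None)
        pairs.append((lc.name, best.name if best else None))
    return pairs


# Made-up example values.
lc = [
    Feature("lc_1", 760.5851, 1.382, rt=12.4),
    Feature("lc_2", 496.3398, 1.050, rt=5.1),
]
img = [
    Feature("img_a", 760.5860, 1.390),  # close in m/z and mobility
    Feature("img_b", 760.5851, 1.500),  # same m/z, different mobility
]
aligned = align(lc, img)
print(aligned)
```

In this toy input, `img_b` has the same m/z as `lc_1` but is rejected on mobility, which is the kind of ambiguity the additional IMS dimension is meant to resolve.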

In MZmine 3, special attention was directed toward scalability due to the ever-increasing study sizes that lead to large volumes of raw data, particularly in the case of LC–IMS–MS datasets. Efficient memory management and parallelization removed bottlenecks, resulting in an 89% reduction in processing time for 250 dissolved organic matter samples when compared to MZmine 2. A stress test demonstrated high sample throughput, where the mean processing times amounted to 0.1% to 0.3% of the total data acquisition time for six different LC–MS datasets (Supplementary Note 3 and Supplementary Fig. 6). Further, MZmine 3 was benchmarked using 8,273 fecal LC–MS2 samples, requiring just 47 min of processing time (see hardware specifications in Supplementary Note 3).
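To put these figures in perspective, the short calculation below derives two quantities that the text implies but does not state: the average processing time per sample in the 8,273-sample benchmark, and the speed-up factor corresponding to an 89% reduction in processing time. It assumes that the 47 min is total wall-clock time for all samples and that the 89% is relative to the MZmine 2 processing time; the variable names are illustrative only.

```python
# Figures quoted in the text.
n_samples = 8273          # fecal LC-MS2 samples in the benchmark
total_minutes = 47        # reported processing time (assumed wall-clock total)
time_reduction = 0.89     # reduction versus MZmine 2 (250 DOM samples)

# Average processing time per sample, in seconds.
seconds_per_sample = total_minutes * 60 / n_samples

# An 89% reduction leaves 11% of the original time, i.e. a ~9x speed-up.
speedup = 1 / (1 - time_reduction)

print(f"{seconds_per_sample:.2f} s per sample on average")
print(f"{speedup:.1f}x faster than the baseline")
```

Under these assumptions the benchmark averages roughly a third of a second per sample, and the 89% reduction corresponds to roughly a ninefold speed-up. Both are averages and say nothing about individual samples or other hardware.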

The improved performance of MZmine 3 over previous MZmine versions now allows processing of large datasets, including large-volume LC–IMS–MS data. For new users, the MZmine website contains detailed manuals and video tutorials, and the new processing wizard in MZmine provides starting points for various standard workflows and mass spectrometer types. In addition, a development tutorial is available for potential new contributors, and the modular design of MZmine enables testing and implementing of new ideas within the MZmine framework.

Data availability

Datasets are available on MassIVE (ref. 8) with the following accession IDs: MSV000088054, human cohort study, LC–MS, neg; MSV000087728, diverse plant extracts, LC–MS2, top-3 DDA, pos; MSV000090079, dissolved organic matter, LC–MS2, top-5 DDA, pos; MSV000090328, sheep brain, LC–TIMS-MS, PASEF, pos; MSV000090327, Piper plant extracts, LC–TIMS-MS, PASEF, pos. IMS-resolved ion identity molecular networking results are available through GNPS: https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=7a06fa3dfadd4158bcb4ee300b574747

Code availability

The latest release of MZmine can be downloaded from https://www.mzmine.org. The complete source code is available at https://github.com/mzmine/mzmine3/ under the MIT license. The MZmine documentation is hosted on GitHub and available at https://www.mzmine.org/documentation.

References

  1. Katajamaa, M., Miettinen, J. & Oresic, M. Bioinformatics 22, 634–636 (2006).
  2. Pluskal, T., Castillo, S., Villar-Briones, A. & Oresic, M. BMC Bioinformatics 11, 395 (2010).
  3. Smirnov, A. et al. Anal. Chem. 91, 9069–9077 (2019).
  4. Du, X., Smirnov, A., Pluskal, T., Jia, W. & Sumner, S. Methods Mol. Biol. 2104, 25–48 (2020).
  5. Korf, A., Jeck, V., Schmid, R., Helmer, P. O. & Hayen, H. Anal. Chem. 91, 5098–5105 (2019).
  6. Dührkop, K. et al. Nat. Methods 16, 299–302 (2019).
  7. Pang, Z. et al. Nucleic Acids Res. 49, W388–W396 (2021).
  8. Wang, M. et al. Nat. Biotechnol. 34, 828–837 (2016).
  9. Nothias, L.-F. et al. Nat. Methods 17, 905–908 (2020).
  10. Schmid, R. et al. Nat. Commun. 12, 3832 (2021).
  11. Meier, F. et al. J. Proteome Res. 14, 5378–5387 (2015).
  12. Helmer, P. O. et al. Anal. Chem. 93, 2135–2143 (2021).
  13. Aksenov, A. A., da Silva, R., Knight, R., Lopes, N. P. & Dorrestein, P. C. Nat. Rev. Chem. 1, 0054 (2017).
  14. Smith, C. A., Want, E. J., O’Maille, G., Abagyan, R. & Siuzdak, G. Anal. Chem. 78, 779–787 (2006).
  15. Tsugawa, H. et al. Nat. Biotechnol. 38, 1159–1163 (2020).
  16. Röst, H. L. et al. Nat. Methods 13, 741–748 (2016).
  17. Weiskirchen, R., Weiskirchen, S., Kim, P. & Winkler, R. J. Cheminform. 11, 16 (2019).

Acknowledgements

We thank Christopher Jensen and Gauthier Boaglio for their contributions to the MZmine codebase. We thank Jianbo Zhang and Zachary Russ for their donations to MZmine development. The MZmine 3 logo was designed by the Bioinformatics & Research Computing group at the Whitehead Institute for Biomedical Research. T.P. is supported by Czech Science Foundation (GA CR) grant 21-11563M and by the European Union’s Horizon 2020 research and innovation programme under Marie Skłodowska-Curie grant agreement 891397. Support for P.C.D. was from US NIH U19 AG063744, P50HD106463, 1U24DK133658 and BBSRC-NSF award 2152526. T.S. acknowledges funding by Deutsche Forschungsgemeinschaft (441958208). M. Wang acknowledges the US Department of Energy Joint Genome Institute (https://ror.org/04xm1d337, a DOE Office of Science User Facility) and is supported by the Office of Science of the US Department of Energy operated under subcontract No. 7601660. E.R. and H.H. thank Wen Jiang (HILICON AB) for providing the iHILIC Fusion(+) column for HILIC measurements. M.F., K.D. and S.B. are supported by Deutsche Forschungsgemeinschaft (BO 1910/20). L.-F.N. is supported by the Swiss National Science Foundation (project 189921). D.P. was supported through the Deutsche Forschungsgemeinschaft (German Research Foundation) through the CMFI Cluster of Excellence (EXC-2124 — 390838134 project-ID 1-03.006_0) and the Collaborative Research Center CellMap (TRR 261 – 398967434). J.-K.W. acknowledges the US National Science Foundation (MCB-1818132), the US Department of Agriculture, and the Chan Zuckerberg Initiative. MZmine developers have received support from the European COST Action CA19105 — Pan-European Network in Lipidomics and EpiLipidomics (EpiLipidNET). We acknowledge the support of the Google Summer of Code (GSoC) program, which has funded the development of several MZmine modules through student projects. We thank Adam Tenderholt for introducing MZmine to the GSoC program.

Author information

Author notes

  1. These authors contributed equally: Robin Schmid, Steffen Heuckeroth, Ansgar Korf.

Authors and Affiliations

  1. Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA

    Robin Schmid, Zheng Zhang & Pieter C. Dorrestein

  2. Institute of Inorganic and Analytical Chemistry, University of Münster, Münster, Germany

    Robin Schmid, Steffen Heuckeroth, Ansgar Korf, Mark Wesner, Edward Rudt, Patrick O. Helmer, Heiko Hayen & Uwe Karst

  3. Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic

    Robin Schmid, Roman Bushuiev, Olena Mokshyna, Corinna Brungs, Kirill Ponomarov, Lana Mutabdžija, Tito Damiani & Tomáš Pluskal

  4. Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, USA

    Aleksandr Smirnov, Owen Myers & Xiuxia Du

  5. Steno Diabetes Center Copenhagen, Gentofte, Denmark

    Thomas S. Dyrlund

  6. Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota – Twin Cities, Minneapolis, MN, USA

    Kevin J. Murray

  7. Institute for Bio- and Geosciences (IBG-5), Forschungszentrum Jülich GmbH, Jülich, Germany

    Nils Hoffmann

  8. School of Engineering, Westlake University, Hangzhou, China

    Miaoshan Lu

  9. BlockLab, Center for Large Datasystems Research, San Diego Supercomputer Center, La Jolla, CA, USA

    Abinesh Sarvepalli

  10. Chair for Bioinformatics, Friedrich Schiller University Jena, Jena, Germany

    Markus Fleischauer, Kai Dührkop & Sebastian Böcker

  11. Agriculture and Agri-Food Canada, London Research and Development Centre, London, Ontario, Canada

    Shawn J. Hoogstra

  12. Datacraft Technologies, Mosman Park, Western Australia, Australia

    Chris J. Pudney

  13. Analytical Solutions Group, Product Technology and Engineering, Jealott’s Hill International Research Centre, Bracknell, UK

    Mark Earll

  14. Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA

    Timothy R. Fallon

  15. Department of Effect-Directed Analysis, Helmholtz Centre for Environmental Research – UFZ, Leipzig, Germany

    Tobias Schulze

  16. Ecology and Forest Genetics, Institute of Forest Sciences (ICIFOR-INIA-CSIC), Madrid, Spain

    Albert Rivas-Ubach

  17. Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA, USA

    Aivett Bilbao

  18. Clinic for Diagnostic Imaging, Diagnostic Imaging Research Unit (DIRU), University of Zurich, Zürich, Switzerland

    Henning Richter

  19. School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland

    Louis-Félix Nothias

  20. Department of Computer Science, University of California Riverside, Riverside, CA, USA

    Mingxun Wang

  21. School of Medical Sciences, Örebro University, Örebro, Sweden

    Matej Orešič

  22. Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland

    Matej Orešič

  23. Whitehead Institute for Biomedical Research, Cambridge, MA, USA

    Jing-Ke Weng

  24. Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA

    Jing-Ke Weng

  25. Institute of Neuropathology, University Hospital Münster, Münster, Germany

    Astrid Jeibmann

  26. CMFI Cluster of Excellence, University of Tuebingen, Tuebingen, Germany

    Daniel Petras

Contributions

R.S., S.H., T.P. are coordinating the MZmine open source project. S.H., R.S., P.C.D., T.P., A.K. wrote and edited the initial manuscript. S.H., R.S., A.K., T.P. conceived the combined workflow for MALDI–IMS–MS imaging and LC–IMS–MS, developed the code and tested the workflow. R.S., S.H., A.K., T.P., A. Smirnov, O. Myers, T.S.D., R.B., K.J.M., N.H., M.L., A. Sarvepalli, Z.Z., M.F., K.D., M. Wesner, M. Wang, S.J.H., O. Mokshyna, K.P., C.J.P., T.R.F., T.S. and more have contributed open source code to MZmine. C.B., T.D., S.H., L.M., O. Mokshyna, R.S., M.E. wrote the documentation for MZmine. L.-F.N., A.R.-U., A.B., R.S., S.H., A.K., M.O., P.C.D., D.P., U.K., J.-K.W., H.H., X.D., S.B. initiated and/or supervised projects related to MZmine development. T.S., A.K., S.H., R.S., T.P., A.R.-U., A.B., N.H., D.P. were involved in the supervision of students for the Google Summer of Code program. R.S., L.-F.N., D.P., A. Sarvepalli, Z.Z., M. Wang, P.C.D. contributed to the linking with GNPS to facilitate molecular networking in MZmine. R.S., D.P., L.-F.N., M. Wang conceived and developed the FBMN and IIMN workflows in MZmine. S.H., R.S., A.K. implemented imzML support and developed imaging feature detection. S.H. developed the ion mobility data support, native .tdf support, ion mobility gap filling; added ion mobility visualization modules; recreated project load/save. A.K. provided TDF-SDK for native .tdf import and supervised S.H. for its implementation. S.H., A.K. developed ion mobility feature detection. A.K., H.H. developed lipid annotation modules and workflows and made them IMS aware. R.S., M. Wang developed parallel gap-filling. S.H., R.S. developed parallel sample alignment. T.S.D. implemented mzTab, MGF and MSP support and various peak information (FWHM, tailing factor, asymmetry factor, RT start and RT end). R.S., C.B., A.K. worked on the mass spectral library creation and matching workflows. K.D., M.F., R.S., S.H., S.B. assisted with the integration of SIRIUS and data exchange. A.R.-U., T.P. conceived the exact mass calibration module. M.L. developed support for the open data format ‘Aird’. S.J.H. developed diagnostic fragmentation filtering. M. Wesner developed the mass-voltammogram module. R.S., S.H. profiled and optimized MZmine’s memory consumption and processing throughput. S.H. prepared sheep brain lipid extracts, prepared MALDI samples, acquired imaging data, analyzed imaging and chromatographic data. H.R. and A.J. planned and carried out animal study ZH235/17. A.J. prepared thin sections and histologic tissue staining of the sheep brain dataset and supplied the tissue samples for extraction. P.O.H., C.B. provided testing data and feedback for LC–MS and IMS–MS imaging workflows. E.R. acquired LC–IMS–MS2 lipid data. R.S., S.H., D.P. conducted the performance tests. All authors edited and approved the final manuscript.

Corresponding author

Correspondence to
Tomáš Pluskal.

Ethics declarations

Competing interests

A.K. is employed at Bruker Daltonics GmbH & Co. KG. S.B., K.D. and M.F. are co-founders of Bright Giant. P.C.D. is a scientific advisor for Cybele and is a scientific advisor and a co-founder of Enveda, Arome and Ometa with prior approval by the University of California San Diego. M. Wang is a co-founder of Ometa Labs LLC. J.-K.W. is a member of the Scientific Advisory Board and a shareholder of DoubleRainbow Biosciences, Galixir and Inari Agriculture, which develop biotechnologies related to natural products, drug discovery and agriculture.

Peer review

Peer review information

Nature Biotechnology thanks Xiaotao Shen and Zheng-Jiang Zhu for their contribution to the peer review of this work.

About this article

Cite this article

Schmid, R., Heuckeroth, S., Korf, A. et al. Integrative analysis of multimodal mass spectrometry data in MZmine 3.
Nat Biotechnol (2023). https://doi.org/10.1038/s41587-023-01690-2

  • DOI: https://doi.org/10.1038/s41587-023-01690-2
