CAMP: A modular metagenomics analysis system for integrated multi-step data exploration.

Overview

abstract

  • MOTIVATION: Computational analysis of large-scale metagenomics sequencing datasets has proven incredibly valuable for extracting isolate-level taxonomic and functional insights from complex microbial communities. However, due to an ever-expanding ecosystem of metagenomics-specific methods and file formats, designing seamless and scalable end-to-end workflows and exploring the massive amounts of output data have become studies unto themselves. One-click bioinformatics pipelines have helped to organize these tools into targeted workflows, but they suffer from general compatibility and maintainability issues, and preclude replication. METHODS: To address the gap in easily extensible yet robustly distributable metagenomics workflows, we have developed a module-based metagenomics analysis system, the "Core Analysis Modular Pipeline" (CAMP), written in Snakemake, a popular workflow management system, along with a standardized module and working directory architecture. Each module can be run independently or conjointly with a series of others to produce the target data format (e.g. short-read preprocessing alone, or short-read preprocessing followed by de novo assembly), and outputs aggregated summary statistics reports and semi-guided Jupyter notebook-based visualizations. RESULTS: We have applied CAMP to a set of ten metagenomics samples to demonstrate how a modular analysis system with built-in data visualization at intermediate steps facilitates rich and seamless inter-communication between output data from different analytic purposes. AVAILABILITY: The CAMP ecosystem (module template and analysis modules) can be found at https://github.com/Meta-CAMP.

authors

  • Mak, Lauren
  • Tierney, Braden T
  • Wei, Wei
  • Ronkowski, Cynthia
  • Toscan, Rodolfo Brizola
  • Turhan, Berk
  • Toomey, Michael
  • Martinez, Juan Sebastian Andrade
  • Fu, Chenlian
  • Lucaci, Alexander
  • Solano, Arthur Henrique Barrios
  • Setubal, João Carlos
  • Henriksen, James R
  • Zimmerman, Sam
  • Kopbayeva, Malika
  • Noyvert, Anna
  • Iwan, Zana
  • Kar, Shraman
  • Nakazawa, Nikita
  • Meleshko, Dmitry
  • Horyslavets, Dmytro
  • Kantsypa, Valeriia
  • Frolova, Alina
  • Kahles, Andre
  • Danko, David
  • Elhaik, Eran
  • Labaj, Pawel
  • Mangul, Serghei
  • Mason, Christopher E
  • Hajirasouliha, Iman

publication date

  • April 21, 2025

Identity

PubMed Central ID

  • PMC10104186

Digital Object Identifier (DOI)

  • 10.1101/2023.04.09.536171

PubMed ID

  • 37066359