Dadaist2: A Toolkit to Automate and Simplify Statistical Analysis and Plotting of Metabarcoding Experiments

Rebecca Ansorge; Giovanni Birolo; Stephen A James; Andrea Telatin

doi:10.3390/ijms22105309

Int J Mol Sci. 2021 May 18;22(10):5309. doi: 10.3390/ijms22105309.

Authors

Rebecca Ansorge¹, Giovanni Birolo², Stephen A James¹, Andrea Telatin¹

Affiliations

¹ Gut Microbes and Health Programme, Quadram Institute Bioscience, Norwich NR4 7UQ, UK.
² Medical Sciences Department, University of Turin, 10126 Turin, Italy.

Abstract

The taxonomic composition of microbial communities can be assessed using universal marker amplicon sequencing. The most common taxonomic markers are the 16S rDNA for bacterial communities and the internal transcribed spacer (ITS) region for fungal communities, but various other markers are used for barcoding eukaryotes. A crucial step in the bioinformatic analysis of amplicon sequences is the identification of representative sequences. This can be achieved using a clustering approach or by denoising raw sequencing reads. DADA2 is a widely adopted algorithm, released as an R library, that denoises marker-specific amplicons from next-generation sequencing and produces a set of representative sequences referred to as 'Amplicon Sequence Variants' (ASVs). Here, we present Dadaist2, a modular pipeline that provides a complete analysis suite, from raw sequencing reads to the statistics of numerical ecology. Dadaist2 implements a new approach that is specifically optimised for amplicons with variable lengths, such as the fungal ITS. The pipeline focuses on streamlining the data flow from the command line to R, with multiple options for statistical analysis and plotting, both interactive and automatic.

Keywords: amplicon sequence variant; bacterial taxonomy; bioinformatics; exact amplicon variant; metabarcoding; microbial communities; numerical ecology; statistical analysis; visualizations.

MeSH terms

Algorithms
Cluster Analysis
Computational Biology / methods
DNA Barcoding, Taxonomic / statistics & numerical data*
Data Interpretation, Statistical
High-Throughput Nucleotide Sequencing
Metadata
Metagenomics / statistics & numerical data*
Microbiota / genetics*
RNA, Ribosomal, 16S / genetics
Sequence Analysis, DNA
Software*

Substances

RNA, Ribosomal, 16S
