Research Article |
Corresponding author: François Keck ( francois.keck@gmail.com ) Corresponding author: Florian Altermatt ( florian.altermatt@eawag.ch ) Academic editor: Michael T. Monaghan
© 2022 François Keck, Samuel Hürlemann, Nadine Locher, Christian Stamm, Kristy Deiner, Florian Altermatt.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Keck F, Hürlemann S, Locher N, Stamm C, Deiner K, Altermatt F (2022) A triad of kicknet sampling, eDNA metabarcoding, and predictive modeling to assess richness of mayflies, stoneflies and caddisflies in rivers. Metabarcoding and Metagenomics 6: e79351. https://doi.org/10.3897/mbmg.6.79351
|
Monitoring biodiversity is essential to understand the impacts of human activities and for effective management of ecosystems. Thereby, biodiversity can be assessed through direct collection of targeted organisms, through indirect evidence of their presence (e.g. signs, environmental DNA, camera trap, etc.), or through extrapolations from species distribution and species richness models. Differences in approaches used in biodiversity assessment, however, may come with individual challenges and hinder cross-study comparability. In the context of rapidly developing techniques, we compared three different approaches in order to better understand assessments of aquatic macroinvertebrate diversity. Specifically, we compared the community composition and species richness of three orders of aquatic macroinvertebrates (mayflies, stoneflies, and caddisflies, hereafter EPT) obtained via eDNA metabarcoding and via traditional in situ kicknet sampling to catchment-level based predictions of a species richness model. We used kicknet data from 24 sites in Switzerland and compared taxonomic lists to those obtained using eDNA amplified with two different primer sets. Richness detected by these methods was compared to the independent predictions made by a statistical species richness model, that is, a generalized linear model using landscape-level features to estimate EPT diversity. Despite the ability of eDNA to consistently detect some EPT species found by traditional sampling, we found important discrepancies in community composition between the kicknet and eDNA approaches, particularly at a local scale. We found the EPT-specific primer set fwhF2/EPTDr2n, detected a greater number of targeted EPT species compared to the more general primer set mlCOIintF/HCO2198. Moreover, we found that the species richness measured by eDNA from either primer set was poorly correlated to the richness measured by kicknet sampling (Pearson correlation = 0.27) and that the richness estimated by eDNA and kicknet were poorly correlated with the prediction of the species richness model (Pearson correlation = 0.30 and 0.44, respectively). The weak relationships between the traditional kicknet sampling and eDNA with this model indicates inherent limitations in upscaling species richness estimates, and possibly a limited ability of the model to meet real world expectations. It is also possible that the number of replicates was not sufficient to detect ambiguous correlations. Future challenges include improving the accuracy and sensitivity of each approach individually, yet also acknowledging their respective limitations, in order to best meet stakeholder demands and address the biodiversity crisis we are facing.
Ephemeroptera, metabarcoding, Plecoptera, Trichoptera, water DNA
The role of biodiversity in maintaining ecosystem functions and services is widely recognized (
For a long time, macroinvertebrate monitoring of freshwater ecosystems has solely relied on the capture of individuals or their direct observation (
A pairwise comparison of methods, however, may be hard to resolve, as approaches come with their respective biases and fundamentally differ in the scale they represent. Thus, including a third approach, using a triad of comparisons (Fig.
A triad of methods (kicknet sampling, eDNA sampling, and statistical modelling) available to estimate macroinvertebrate diversity in river ecosystems. Each has its own specificities, particularly in terms of integrated spatial scale. Note that models always rely on underlying data used to train them; in this study those are independent kick-net samples.
In this study, we used a dataset of 24 streams located in Switzerland, for which macroinvertebrate communities have been sampled at one location, both by kicknet and eDNA, and for which independent predictions on species richness have been modelled. We specifically focus on the diversity of three orders of macroinvertebrates: mayflies (Ephemeroptera, E), stoneflies (Plecoptera, P), and caddisflies (Trichoptera, T). EPT taxa are commonly found in streams and rivers, and have proven to be useful and powerful indicators of water quality (
We hypothesize that the fwhF2/EPTDr2n primer set, due to its higher specificity, will detect more EPT species than the generic primer set, and that the species detected will be more consistent with the traditional approach in terms of richness and taxa detected. We also hypothesize a positive correlation of the EPT richness detected by the three different approaches tested. However, the fact that the kicknet approach measures diversity locally, while the model used predicts diversity at the watershed scale (
Water samples were collected from 24 streams in Switzerland in 2013 or 2014 (Fig.
At each location, all individuals of may-, stone-, and caddisflies (EPT) were identified to the species level (in a few cases to species complexes, subsequently treated as species) using expert taxonomists. Identification of all taxa followed pre-defined taxonomic lists, and all data from the two sites per location were pooled. For details see
Methods for filtration and extraction of DNA from water samples were previously published in
Library construction for each sample location followed a three step PCR process. The first PCR consisted of amplification of a 312 bp fragment of the 5’ end of the Cytochrom Oxidase I mitochondrial gene (COI) using the forward primer (mlCOIintF) from
The second PCR was conducted with the same PCR conditions above except the forward and reverse primers were modified to include the Nextera transposase adaptors and only 1 µL of cleaned PCR product was used in the reaction. Between the forward and reverse primer sequence and the transposase adaptor a different number of random bases were inserted to create products of varying length to allow more heterogeneity on the flow cell. The thermal-cycling regime was the same except that five cycles were used. PCR products from the four independent reactions for each sample were then pooled together and cleaned using a two-step method. First, we cleaned each pooled reaction with EXO I and SAP as described above except we adjusted proportionally the volumes of EXO I and SAP for a total cleaned volume of 30 µL rather than 7.5 µL. Second, we desalted, removed buffer components with the Illustra MicroSpin S-300 HR Columns (GE Healthcare Life Sciences, Little Chalfont, United Kingdom) following the manufacturer’s recommended protocol.
The third PCR indexed each pooled PCR using the duel-indexing strategy with the Nextera index kits A and D. Indexing PCR was carried out in a reaction volume of 50 µL where amplicons that showed a DNA concentration less than 0.1 ng/µL were added at 10 µL and all other greater than this were added at 5 µL. We used the KAPA Library Amplification Kit following the manufacturer’s recommended protocol (KAPA Biosystems, Wilmington, MA). Each of the pooled reactions were then cleaned using Agencourt AMPure XP beads following the recommended manufacturer’s protocol (Beckman Coulter, Brea, CA, USA).
Cleaned and indexed libraries were then assayed for DNA concentration using the Qubit (1.0) fluorometer following recommended protocols for the dsDNA HS Assay, normalized, then pooled at a 2 nM concentration. PHiX control was added at 1%. Paired-end sequencing was performed on an Illumina MiSeq (MiSeq Reagent kit v2, 250 cycles) at the Genomic Diversity Center at the ETH, Zurich, Switzerland following the manufacturer’s run protocols (Illumina, Inc., San Diego, CA, USA). The MiSeq Control Software Version 2.2 including MiSeq Reporter 2.2 was used for the primary analysis and the demultiplexing of the raw reads.
In order to amplify the 142 bp long fragment of the COI locus using the fwhF2 forward primer (
From the first PCR product, 10 µL was enzymatically cleaned by adding 0.11 U/µL Exonuclease I (E. coli), 0.2 U/µL Shrimp Alkaline Phosphatase (rSAP) (New England Biolabs) and 1.11 µL PCR grade water to each sample. The temperature cycling was carried out as recommended by the manufacturer.
In order to add the Nextera transposase sequences adaptors to the first PCR fragment, 4 µL cleaned PCR product was used in similar PCR condition as in the first PCR reaction. Thermal cycling regime was identical, except that the number of cycles was reduced. Amplification success was checked with the AM320 method on the QiAxcel Screening Cartridge (Qiagen, Germany). Most of the samples worked after 10 PCR cycles. However, the cycling number for 28 samples was adjusted up to 18 cycles, in order to see amplification success.
Before we attached the index adapters with the third PCR, additional cleaning steps were performed. This consisted of first pooling the replicates of the second PCR product and then running it on a 0.8% low melting point Agarose (Analytical grade, Promega) together with 100-bp ladders (Promega, Madison, WI, USA). Fragments with the correct size of 268 bp were cut out from gel, by using a fresh scalpel. Thereafter DNA was purified, using the Wizard SV Gel and PCR Clean-Up System (Promega, Madison, WI, USA). Exciseds DNA bands were dissolved in 250 µL Membrane Binding Solution at 65 °C shaken at 850 rpm for 2 minutes. After the column bind and washing steps, DNA was eluted in 20 µL PCR grade water.
Illumina Nextera XT Index set D (Illumina, Inc., San Diego, CA, USA) were attached to the purified amplicon by following the recommended protocol from the Illumina library preparation guide, except increasing cycle number from 8 to 10 cycles. After the Nextera index adapters successfully bound to the fragment, the individual samples were cleaned up with a MagJET NGS Cleanup and Size Selection Kit running on a KingFisher Flex Purification System (Thermo Fisher Scientific Inc., MA, USA).
Quantification of PCR products was conducted with a target selective fluorescence dye Qubit BR DNA Assay Kit (Life Technologies, Carlsbad, CA, USA). Fluorescence dye emission of the standard dilution series and samples were measured in replicates with a Spark Multimode Microplate Reader (Tecan, US Inc., USA). Samples, including filter, extraction and PCR controls were then merged in four equimolar pools (3nM), in relation to their concentration, with an automated liquid handling station (BRAND GMBH + CO KG, Wertheim, GE). Final pool was then three times manually purified, by using a 0.8× ratio of Agencourt AMPure XP (Beckman Coulter, Brea, CA, USA) beads, again following the recommended manufacturer’s protocol. Amplicon size was verified by an Agilent 4200 TapeStation (AgilentTechnologies, Inc., USA) run. Library was sequenced with a concentration of 10 pM in the flowcell on an Illumina MiSeq (Illumina, Inc. San Diego, CA, USA) at the Genetic Diversity Center (ETH, Zurich). The Sequencing run (MiSeq Reagent kit v2, 300 cycles, paired-ended) was spiked with 10% PHiX control.
Demultiplexed MiSeq (forward and reverse) reads were initially checked using FastQC (v. 0.11.9) and the software package DADA2 (v.1.16.0) was used to infer amplicon sequence variants (ASVs) following the methods described by
We translated the ASV sequences into amino acids starting from the 2nd nucleotide and using the invertebrate mitochondrial code. Since COI is a coding sequence, it is not expected to find stop codons in the barcode region (
Taxonomic assignment of ASV sequences was achieved using the RDP algorithm (
Replicates (sites) were merged by location. For five locations (Buttisholz, Hochdorf, Hornussen, Messen, and Niederdorf, see Fig.
For each sampling location, we predicted the EPT species richness using a statistical species richness model developed by
We used presence-absence data and species richness (i.e. the number of species) to characterize the diversity of EPT, both from the eDNA as well as the kicknet data. Diversity was studied both at local scale (i.e. locations after merging site replicates, alpha diversity), and at regional scale (i.e. all locations merged, gamma diversity). For both alpha and gamma diversity, we compared the number of species detected by kicknet only, by eDNA only, and the number of species detected both by kicknet and eDNA. For each location, the sampling effort (number of identified individuals and sequencing depth) was assessed with species accumulation curves. Finally, we computed and tested Pearson correlations between the richness found by eDNA (fwhF2/EPTDr2n and mlCOIintF/HCO2198 primers separately), found by kicknet and estimated by the predictive model. Analyses were conducted using R 4.0.3 (
All raw sequencing data are available at the European Nucleotide Archive (ENA) under the accession number PRJEB50083. The processed data and R scripts to reproduce the analyses and results are available at: https://github.com/fkeck/ecoimpact.
Library sequencing generated 4,638,809 sequences (mlCOIintF/HCO2198 primers) and 8,008,677 sequences (fwhF2/EPTDr2n primers). For sequences amplified using the mlCOIintF/HCO2198 primers, the pre-processed and quality-filtered data consists of 3,110,057 reads divided in 13,797 ASVs. For sequences amplified using the fwhF2/EPTDr2n primers, the pre-processed and quality-filtered data consists of 4,779,863 reads divided in 2,665 ASVs.
For the mlCOIintF/HCO2198 primers, taxonomic assignment failed for a significant number of ASVs for which identification was not possible, even at the highest taxonomic ranks (87% of unclassified Eukaryota). Assigned reads are dominated by insects (Diptera, Coleoptera and unclassified Insecta), Clitellata, Chromadorea and unclassified arthropods. The orders of interest (EPT) only represent a small proportion of assigned ASVs (7%), with 32 Ephemeroptera, 17 Plecoptera and 34 Trichoptera taxa detected. The relative proportion of EPT is even less important when accounting for the number of reads. In total, the EPT groups represent 3.1% of the assigned reads. In contrast, the fwhF2/EPTDr2n primers performed better with a lower proportion of unidentified Eukaryota (47.9%). Targeted orders were also more represented with 63 ASVs identified as Ephemeroptera, 37 as Plecoptera, and 42 as Trichoptera taxa, representing 10% of the assigned ASVs (8.6% of the assigned reads). The sampling depth (number of reads identified as EPT) was highly variable among locations (ranging from 7 at Aadorf with mlCOIintF/HCO2198 to 109,956 at Zullwil with fwhF2/EPTDr2n). The absolute number of reads identified as EPT was 10 to 100 times higher with the fwhF2/EPTDr2n primers than with the mlCOIintF/HCO2198 primers (Suppl. material
Across all sites (i.e., gamma diversity), kicknet was the method that detected the highest number of different EPT taxa (64), followed by eDNA amplified with the fwhF2/EPTDr2n primers (44 taxa). Results of the regional EPT species richness (across all locations) are shown on Fig.
Regional EPT species richness (diversity across all sampling locations) detected by eDNA (mlCOIintF/HCO2198 and fwhF2/EPTDr2n primers) and kicknet method in comparison to total EPT richness known from Switzerland and the subset of species included in the molecular reference database. Horizontal bars show the total number of species in each set. The vertical bars show the number of species in each intersection between sets.
The number of EPT taxa detected varied both across locations and methods (Fig.
Number of EPT taxa detected in each location by eDNA (mlCOIintF/HCO2198 and fwhF2/EPTDr2n primers) and kicknet methods. The total number of taxa detected is divided in three fractions (in green the taxa detected by the two methods, in orange the taxa detected by eDNA only, and in blue the taxa detected by kicknet only, respectively).
Some taxa commonly detected by kicknet sampling were never or rarely detected by eDNA (Fig.
Number of streams where each EPT taxon was detected by eDNA (mlCOIintF/HCO2198 and fwhF2/EPTDr2n primers) and kicknet methods. The total number of locations is divided in three fractions (in green the locations where the taxon was detected by the two methods, in orange the locations where the taxon was detected by eDNA only, and in blue by kicknet only, respectively). For clarity, only the taxa detected more than once (all streams and methods combined) are shown.
We found the correlation between the richness estimates provided by the different methods to be remarkably low (Fig.
Relationships between the EPT richness estimates provided by the four investigated methods. The diagonal panels show the density estimate of EPT richness for each method. For each combination of methods, panels located in the lower triangle show the scatterplot of the EPT richness estimated by each method (x- and y-axis) with linear regression (red lines). Panels located in the upper triangle provides the Pearson correlation values between each method (asterisk indicates p-value < 0.05).
The species richness of EPT can be assessed through direct collection of targeted organisms using kicknet sampling (e.g.
The study of diversity on a regional scale (gamma diversity) shows the ability of environmental DNA to detect many taxa also identified by the traditional kicknet method. This result is in line with previous studies that reported several EPT taxa detected by both methods (
The low congruence between the species detected by eDNA and kicknet can be explained by the numerous biases that can influence species detection probabilities at every step of data collection. For eDNA this can be caused by the complex dynamics of DNA in the environment (release rate by the organisms, degradation and dilution), manipulation of the DNA in the lab (conservation, extraction, PCR-amplification, sequencing), and the bioinformatics processing (
The main goal of our study, namely to use independent model predictions from a species richness model (
In conclusion, our results suggest that the three approaches investigated here can give different results about the species richness and the species composition of EPT communities. These differences are likely due to the respective biases of each method, but also to the different scales that they integrate. Kicknet sampling is carried out at one point and captures the organisms physically present at that location. In contrast, models typically provide estimates of macroinvertebrate diversity on a regular grid or at catchment level (
We thank Marta Reyes for help during field work, Francis J. Burdon and Rik Eggen for comments on the project and coordination, and Pascal Stucki for sampling and identification of the EPT taxa. We thank three anonymous reviewers for their valuable and constructive comments. Genetic data used were generated at the Genetic Diversity Centre (GDC), ETH Zurich. This work has been funded by the Swiss Federal Office for the Environment (BAFU/FOEN) and is part of the Eawag Ecoimpact initiative. Further funding is from The Swiss National Science Foundation (grant nr. 31003A_173074), the University of Zurich Research Priority Programme in Global Change and Biodiversity (URPP GCB) to FA and the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 852621) to KD.
Figures S1–S5
Data type: PDF file
Explanation note: Fig. S1. Species accumulation curves of the EPT taxa detected using the mlCOIintF/HCO2198 primers. Locations where EPT taxa were not detected are not shown. Fig. S2. Species accumulation curves of the EPT taxa detected using the fwhF2/EPTDr2n primers. Locations where EPT taxa were not detected are not shown. Fig. S3. Species accumulation curves of the EPT taxa detected using the kicknet method. Fig. S4. Relationships between the EPT richness estimates provided by the Kicknet, eDNA (mlCOIintF/HCO2198 and fwhF2/EPTDr2n primers merged), and the predictive model. The upper triangle provides the correlation values between each method (star indicates p-value < 0.05). Lower triangle shows the scatterplots with linear regressions (red lines). The diagonal shows the density estimate for each variable. Fig. S5. Relationships between the EPT richness estimates provided by the four investigated methods. The diagonal panels show the density estimate of EPT richness for each method. For each combination of methods, panels located in the lower triangle show the scatterplot of the EPT richness estimated by each method (x- and y-axis) with linear regression (red lines). Panels located in the upper triangle provides the Spearman correlation values between each method.
Tables S1–S4
Data type: excel file
Explanation note: Table S1. Dates of sampling. Table S2. Additional information on the DNA concentration of extracted DNA from water, the concentration of the prepared amplicon library and number of sequences obtained for each sample. Table S3. Description of all negative controls used to monitor contamination at any stage during filtration, extraction and library preparation. Table S4. Species presence (TRUE) or absence (FALSE) in each set.