Research Article
Print
Research Article
The power of metabarcoding: Can we improve bioassessment and biodiversity surveys of stream macroinvertebrate communities?
expand article infoJarno Turunen§, Heikki Mykrä§, Vasco Elbrecht|, Dirk Steinke, Thomas Braukmann#, Jukka Aroviita§
‡ Centre for Economic Development, Transport and the Environment, Oulu, Finland
§ Finnish Environment Institute, Oulu, Finland
| Department of Environmental Systems Science, ETH Zurich, Zürich, Switzerland
¶ University of Guelph, Guelph, Canada
# Stanford University, Stanford, United States of America
Open Access

Abstract

Most stream bioassessment and biodiversity surveys are currently based on morphological identification of communities. However, DNA metabarcoding is emerging as a fast and cost-effective alternative for species identification. We compared both methods in a survey of benthic macroinvertebrate communities across 36 stream sites in northern Finland. We identified 291 taxa of which 62% were identified only by DNA metabarcoding. DNA metabarcoding produced extensive species level inventories for groups (Oligochaeta, Chironomidae, Simuliidae, Limoniidae and Limnephilidae), for which morphological identification was not feasible due to the high level of expertise needed. Metabarcoding also provided more insightful taxonomic information on the occurrence of three red-listed vulnerable or data deficient species, the discovery of two likely cryptic and potentially new species to Finland and species information of insect genera at an early larval stage that could not be separated morphologically. However, it systematically failed to reliably detect the occurrence of gastropods that were easily identified morphologically. The impact of mining on community structure could only be shown using DNA metabarcoding data which suggests that the finer taxonomic detail can improve detection of subtle impacts. Both methods generally exhibited similar strength of community-environment relationships, but DNA metabarcoding showed better performance with presence/absence data than with relative DNA sequence abundances. Our results suggest that DNA metabarcoding holds a promise for future anthropogenic impact assessments, although, in our case, the performance did not improve much from the morphological species identification. The key advantage of DNA metabarcoding lies in efficient biodiversity surveys, taxonomical studies and applications in conservation biology.

Key Words

biomonitoring, Fennoscandia, mining, molecular data, species identification, taxonomic resolution

Introduction

Freshwater biomonitoring and biodiversity assessments are traditionally based on morphological identification of species and specimen count data. Particularly, benthic macroinvertebrates are used to assess the ecological state of streams and lakes. Such bioassessment programmes are often limited to genus or family level taxonomic resolution (e.g. Wright et al. 1995) for at least three reasons: (1) morphological species identification of several taxonomic groups requires a high level of expertise and financial resources, (2) many insect larval instars cannot be reliably identified to species or even to genus using morphological characteristics and (3) some species (or subspecies) form complexes which are indistinguishable by morphology (e.g. Ståhls and Savolainen 2008; Lucentini et al. 2011). This low taxonomic resolution hampers our ability to detect environmental impacts, because even closely-related species can vary in their tolerance to a given stressor and, thus, respond differently to changes in their environment (Resh and Unzicker 1975; Macher et al. 2016a). If this is accompanied by misidentifications, the detection of biological impairment is further obscured (Lenat and Resh 2001; Haase et al. 2010; Sweeney et al. 2011).

The past two decades saw the rapid development of DNA-based approaches to species identification. Particularly, systems based on the analysis of sequence variation in short, standardised gene regions (i.e. DNA barcodes) were shown to be very effective in species discrimination (Hebert et al. 2003). The DNA barcoding approach can be combined with high-throughput sequencing, to identify thousands of specimens in bulk in a process called metabarcoding. These methods have become cost-efficient and represent reliable alternatives to existing approaches used for freshwater bioassessments (Yu et al. 2012; Elbrecht et al. 2017a; Hering et al. 2018; Nichols et al. 2019) and biodiversity surveys (Baird and Hajibabaei 2012). Barcode reference libraries for freshwater benthic macroinvertebrates are fairly complete and continuous efforts to improve these databases are underway (Weigand et al. 2019).

Nevertheless, a key criterion for the adoption of DNA-based methods as state-of-the-art procedures in bioassessment and biodiversity surveys remains their reliability. There are known cases of false-negatives (a species is present in the sample, but its DNA is not detected either due to primer bias, insufficient specimen biomass, DNA degradation or incomplete barcode libraries) and false-positives (a species is not present in the sample, but is detected either due to the presence of trace DNA in, for example, gut content, cross-contamination or analytical errors; Hering et al 2018; Strand et al. 2019). It is unclear how metabarcoding errors compare to errors in morphological identifications. Metabarcoding methods are becoming well established and it is vital to evaluate potential discrepancies in species detection and to assess methodological limitations in the context of biomonitoring.

For this study, we conducted a bioassessment survey of macroinvertebrates in subarctic streams close to mining operations in northern Finland. Macroinvertebrates were identified by using both a standard morphology-based protocol and DNA metabarcoding. We explored differences in detecting different taxa for both methods and were particularly interested in the potential of metabarcoding to gain additional taxonomic information for the study region as we expected that it would provide a more comprehensive assessment of local biodiversity than morphological identifications. Furthermore, we anticipated that a more detailed level of identification would help to better understand community-environment relationships and provide higher sensitivity to potential mining impacts.

Materials and methods

Study area

Our study area is located in the Lapland region of northern Finland (between 66°N and 70°N). Streams in this region belong to northern boreal and subarctic ecoregions. The area is sparsely populated and the main anthropogenic pressure for freshwater ecosystems stems from forestry and mining operations. There is an increasing interest in the extraction of mineral deposits in the area with potentially negative impacts on freshwater biodiversity (Leppänen et al. 2017). We selected 36 stream sites from six regions with mining activities (Fig. 1). Regions comprised three locations with operational mines, two with closed mines and one (Sokli) where mining is planned, but not yet operational. From each region, we sampled three stream sites downstream of a mine and three reference sites without any mining influence (see Mykrä et al. 2021 for details). These control sites were either upstream of a mine or, if this were not possible (mining impact extended into the headwaters), we took samples from separate unaffected streams of the same drainage system. All sites from the region without an operational mine (Sokli) were classified as reference streams.

Figure 1.

Location of the 36 study stream sites in the six mining areas in northern Finland.

The key minerals extracted from the mining sites are gold (Kittilä, Kevitsa, Pahtavaara, Saattopora-Pahtavuoma), nickel (Kevitsa), copper (Kevitsa, Saattopora-Pahtavuoma) and chromium (Kemi). In Sokli, the key mineral is phosphate. The mines are mostly combinations of open pit and underground extraction sites. The main impacts on streams at all the mining-impacted study sites are elevated concentrations of nitrogen and sulphate which originate from the ore extraction and enrichment process, as well as use of explosives and rock tailings. Some sites also show elevated concentrations of arsenic, antimony and ions such as chloride salts (Kittilä, Kemi, Kevitsa). The impacts of this type of mining on stream biodiversity are poorly known; however, a recent study suggests that they are subtle (Mykrä et al. 2021) and could thus be better detected by using DNA-based methods rather than morphological species identification.

Stream habitat measurements

We made several measurements in both the habitat and the catchment of each stream site (Suppl. material 1: Table S1). Substrate structure and coverage of stream macrophytes were visually estimated by placing fifteen 0.25 m2 (0.5 * 0.5 m) plots across each site. Substrate size was classified following a modified Wentworth scale: organic matter = 0, sand (diameter 0.25–2 mm) = 1, fine gravel (2–6 mm) = 2, coarse gravel (6–16 mm) = 3, small pebble (16–32 mm) = 4, large pebble (32–64 mm) = 5, small cobble (64–128 mm) = 6, large cobble (128–256 mm) = 7, boulder (256–400 mm) = 8, large boulder and bed rock (> 400 mm) = 9. The percentage coverage of different size classes and macrophytes was estimated from the plots and averaged across the stream site. We collected water samples from each site to measure suspended solids, nutrients and various metals, metalloids and non-metals by using national standards (National Board of Waters 1981). Water pH and conductivity was measured in situ with an YSI Professional Plus meter (YSI Inc., Yellowsprings, Ohio, USA). Upstream catchment boundaries were delineated for each site using a Geographical Information System with the Digital Elevation Model (DEM) raster database of the National Land Survey of Finland (NLS) and vector data of Drainage Basins in Finland (SYKE). Land cover and land use were calculated for each catchment using Corine Land Use data.

Sample collection and processing

Benthic macroinvertebrates were sampled in autumn 2017 by taking four 30 second kick-net samples (mesh size 0.5 mm, area of disturbed stream bed = 1.2 m2) from swiftly flowing riffle sections at each site. The protocol followed the sampling guidelines for Finnish National Water Framework Directive (WFD) monitoring (Järvinen et al. 2019). Samples were preserved in 96% ethanol in the field. Ethanol was replaced the same day to ensure a high ethanol concentration for preservation of specimens and prevent DNA degradation. Samples were kept cool (8 °C) for subsequent molecular analyses.

Morphological identification of macroinvertebrates followed the requirements for national WFD bioassessments (Järvinen et al. 2019). Taxonomic levels are species or genus for groups that are feasible for identification by morphology in routing biomonitoring (Mollusca, Coleoptera, Ephemeroptera, Plecoptera, some Diptera, most Trichoptera). In Baetis mayflies, the specimens without gill and tracheal bristles and wide frons were identified as Baetis vernus-type. Specimens from groups where determination is laborious or too difficult were only identified to higher level taxonomy (Oligochaeta, Hydracarina, Chironomidae (Diptera), Ceratopogoniidae (Diptera), Simuliidae (Diptera), Limnephilidae (Trichoptera)). After identification, specimens were pooled back into their respective sampling lot for metabarcoding analysis.

DNA metabarcoding

For DNA extraction, sampling lots comprising all morphologically identified specimens were dried overnight at 40 °C. Large specimens (e.g. a dytiscid beetle) were removed from samples and a single leg of the specimen added back to the sample to prevent DNA of such a single specimen dominating the sample (Elbrecht et al. 2017a). We ground samples to a fine powder by either using the IKA ULTRA-TURRAX Tube Drive Control System (with ten 5 mm steel beads in 20 ml tubes, ground at 4,000 rpm for 30 min) or the IKA Tube Mill control system (for larger samples using 40 ml chambers, homogenising samples for 3 minutes at 25,000 rpm). On average, 14.91 mg (SD = 7.62 mg) tissue powder per sample was used for DNA extraction with the DNeasy 96 Blood & Tissue Kit (Qiagen, Hilden, Germany). For each sample, we extracted two aliquots of tissue, as well as 12 negative controls and 10 Lepidoptera specimens which served as positive controls (see Suppl. material 2: Table S2 for exact sample layout and Lepidoptera species added). To prevent cross-contamination between samples, tissue was digested according to manufacturer recommendations in individual 1.5 ml reaction tubes at 56 °C for 3 hours and then transferred to the spin column plate.

Metabarcoding was done using a two-step fusion primer strategy (Elbrecht and Steinke 2019). During the first PCR step, a 421 bp subset of the Cytochrome c oxidase subunit I gene (COI) was amplified using the BF2+BR2 primer set (Elbrecht and Leese 2017) and the Qiagen Multiplex PCR Plus Kit (Qiagen, Hilden, Germany). Two DNA extracts from an insect mock sample of a prior study (Braukmann et al. 2019) were used as additional positive controls. The reaction mix consisted of 0.5 μl DNA (concentration not quantified), 0.2 μM each primer, 2x Multiplex PCR Master Mix and ddH2O with a total reaction volume of 25 μl. PCRs were run on an Eppendorf mastercycler pro Thermocycler using the following programme: 95 °C for 5 min; 25 cycles of 95 °C for 30 s, 50 °C for 30 s and 72 °C for 50 s; and a final extension at 72 °C for 5 min. PCR success was evaluated on a 1% agarose gel. For the second PCR step, 0.5 μl amplicon of the previous PCR was used as template and each sample tagged with a unique fusion primer combination (see Suppl. material 2: Table S2 for details). The PCR setup was identical to the first, but the cycle number was reduced to 15 and the extension time in each cycle was increased to two minutes. PCR success was again checked with a 1% agarose gel and sample 33, which showed no band, was run for 15 additional PCR cycles. Amplicons were purified, left-side size-selected using SPRIselect with a ratio of 0.76x (Beckman Coulter, Brea, CA, USA) and then quantified on a Qubit Fluorometer (HS Kit, Thermo Fisher Scientific, Waltham, MA, USA). All amplicons, including positive and negative controls, were equimolar pooled. Negative controls were pooled by using the average volume of the other samples. The resulting library was sequenced on an Illumina MiSeq using the 300 bp PE v.3 Sequencing Kit with 5% Phi-X spike-in.

Raw sequencing data was quality checked using FastQC v.0.11.8 and then processed using the R stats based JAMP v.0.58 pipeline (https://github.com/VascoElbrecht/JAMP) which mostly relies on usearch v.11.0.667 (Edgar 2010). Commands for data processing are available as supporting information (Scripts S1). Reads were demultiplexed, based on fusion primer in-line tagging (Suppl. material 2: Table S2) and paired-end merged, using usearch, while allowing mismatches of up to 25%. We trimmed Primers from both sides, using default settings in Cutadapt v.1.18 (Martin 2011). Reads where a primer could not be detected at either end were discarded. In addition, only reads between 411–431 bp length were retained for further analysis. To further discard reads with poor read quality, we applied expected error (EE) filtering (Edgar and Flyvbjerg 2015), using a max EE of 1. Sequences of all samples were pooled, dereplicated (minuniquesize = 2) and clustered into molecular operational taxonomic units (MOTUs), using cluster OTUs with a 97% identity threshold (Edgar 2013). Individual reads (including singletons) were mapped against the OTU list, using usearch global with a minimum match of 97%. OTUs with < 0.01% read abundance in both replicates of a sample were discarded and the remaining reads mapped again against the OTU subset. We multiplied the highest read count for each individual OTU present in the negative controls by two and subtracted it from all other samples, to account for low level tag switching and cross contamination. OTU taxonomy was assigned using BOLD (Ratnasingham and Hebert 2007).

Taxonomic harmonisation

Identifications by morphology and DNA metabarcoding both had cases of higher and lower taxonomic level than species level identification for a given taxon. Therefore, to avoid overlapping higher level taxa, we harmonised the taxonomy in each sample separately for both identification methods as follows. Identifications from higher level (e.g. family or genus) were assigned to corresponding lower level (e.g. genus or species) identifications in the sample in accordance with the numerical abundance (number of specimens in morphological identifications and number of sequences in DNA metabarcoding) ratios of the corresponding lower level identifications. If lower level identifications were not present in a sample for a given taxon, the higher level identifications were left as is.

Statistical analyses

We compared both identification methods, using taxon accumulation curves, taxon occurrence information, multivariate analyses of community composition and community-environment-relationships, as well as statistical tests for potential effects of mining on biota. We explored community composition using Non-metric Multidimensional Scaling (NMDS) ordination. Two-dimensional NMDS ordinations were based on Bray-Curtis distances of log-transformed abundance data (morphological identifications) and proportional DNA sequence abundance data, as well as presence-absence data. Adding further dimensions did not considerably optimise stress. We correlated community composition (NMDS scores) with main environmental gradients by using Principal Component Analysis (PCA) to reduce multidimensionality of the environmental data to a few interpretable principal components (gradients) and subsequently fitted those to the NMDS scores. We used a Permutational Multivariate Analysis of Variance (PERMANOVA; Anderson 2001) to test for impact of mining on community composition. The mining region was used as group (strata) to constrain permutations. Prior to the PERMANOVA, we tested the homogeneity of multivariate dispersion within treatments with a Permutational Analysis of Multivariate Dispersions (PERMDISP; Anderson 2006). Finally, we compared community dissimilarity and environmental distance relationships between the two identification methods, using BIO-ENV (Clarke and Ainsworth 1993). In BIO-ENV, Bray-Curtis resemblance matrices are correlated to Euclidean distance matrices of environmental variables and Spearman´s rank correlation (rs) is used to assess the strength of the correlation. BIO-ENV searches for the best subset of environmental variables which maximises the correlation between community and environmental distance. All statistical analyses were done in R (R Core Team 2019) using the vegan package (Oksanen et al. 2019), in particular its functions metaMDS and envfit (NMDS), adonis (PERMANOVA), betadisper (PERMDISP) and bioenv.

Results

MiSeq sequencing generated 15,760,049 sequences. Raw data are available under the SRA accession PRJNA547646. Eleven of the 12 negative controls showed less than 500 reads and one negative control had around 3,000 reads. The macroinvertebrate samples had an average sequencing depth of 149,278 reads (SD = 17732). During bioinformatic processing and quality filtering, an average of 11.1% of the reads were discarded in each mining sample (SD = 2.4%). The 10 Lepidoptera species, added as positive controls, did appear in other samples, almost exclusively along the column in which the respective sample was added (figure S1). This might be caused by the decreased sequence quality in read two of Illumina sequencing or tag switching on the flow cell related to the second read.

We collected a total of 33,355 specimens from all 36 stream sites. The average number of specimens per sample was 927 (range: 201–2557). Overall, morphological analysis revealed 113 taxa, while metabarcoding unveiled 250 taxa (Fig. 2, Suppl. material 4: Table S4). Metabarcoding also produced more taxa per site (mean 45, range: 29–64) than morphological identification (31, range: 16–48). The cumulative number of taxa steadily increased with the number of sites, particularly metabarcoding did not reach an asymptote with increasing number of sampling sites (Fig. 2). By contrast, for morphological identification, the number of taxa did approach an asymptote at the total number of taxa detected.

Figure 2.

Cumulative number of taxa (mean ± 95% confidence interval) for metabarcoding (blue solid line) and morphologically identified (red dashed line) data.

Metabarcoding revealed more unique taxonomic units (181) than did morphological identification (40). Approximately half of the taxa detected only by metabarcoding were from groups that were not morphologically identified below family or lower taxonomic level using the routine protocol: Oligochaeta (16 species, genera or families detected only through metabarcoding), Arachnida (11 families or species), Chironomidae (72 species or genera), Simuliidae (11 species), other Dipteran families (seven species) and Limnephilidae (Trichoptera, seven species). Even after excluding these taxonomic groups, metabarcoding still identified more unique taxa (57 taxa not identified by morphology) than did morphological identification (22 taxa not identified by metabarcoding). Overall, metabarcoding detected 81% of the taxa that were detected by morphology, whereas morphological identification detected 54% of the taxa detected by metabarcoding (excluding the taxonomic groups that were not morphologically identified to lower levels).

In many cases, metabarcoding outperformed morphological identification by producing otherwise not possible species level information (Table 1). Such cases included the stonefly genera Leuctra, Nemoura, Diura and Isoperla which are notoriously difficult to identify to species level using early nymph instars. In mayflies, metabarcoding resolved the high diversity of the swimming mayfly genus Baetis (seven species) that morphologically could only be identified to either Baetis rhodani or Baetis vernus-type. Metabarcoding also provided unique species level information for Coleoptera (Agabus guttatus, Oreodytes sanmarkii, Hydroporus lapponium, Hydraena gracilis), for three red-listed or data deficient (DD) species (Baetis liebenauae (NT), Baetis scambus (DD), Dicranota robusta (NT)) and for species that are either new to Finland or represent cryptic species of unknown taxonomic status or, in some cases, likely represented errors in the reference library: Baetis phoebus, Heptagenia pulla, Sericostoma flavicorne (in morpho S. personatum). Metabarcoding also detected taxa that should have been readily identifiable by morphology such as Dixidae, Capnia atra, Nemurella pictetii and Sipholonurus alternatus.

Table 1.

Number of occurrences and mean abundances per site for taxa that were exclusively detected either by morphological identification or metabarcoding. Taxonomic resolution of the datasets was adjusted to national target level (Järvinen et al. 2019).

Taxon Order Number of occurrences Mean abundance
Morphology DNA No of specimens Sequence %
Radix peregra Mollusca 9 6
Lymnea stagnalis Mollusca 1 1
Valvata sp. Mollusca 2 6.5
Stagnicola sp. Mollusca 2 1
Dytiscus circumcinctus Coleoptera 2 1
Dixidae Diptera 1 0.18
Clinocera sp. Diptera 2 1.5
Hemerodromia adulatoria Diptera 3 0.08
Tipula sp. Diptera 1 2
Heptagenia pulla Ephemeroptera 21 0.66
Kageronia fuscogrisea Ephemeroptera 2 2
Sipholonurus alternatus Ephemeroptera 2 0.12
Sigara striata Heteroptera 1 1
Sialis lutaria Megaloptera 2 3
Somathochlora alpestris Odonata 1 5.7
Somathochlora metallica Odonata 1 1
Capnia atra Plecoptera 2 0.07
Nemurella pictetii Plecoptera 1 0.44
Apatania crymophila Trichoptera 4 0.12
Apatania wallengreni Trichoptera 9 2.1
Micrasema gelidum Trichoptera 9 8.7
Micrasema primoricum Trichoptera 9 0.35
Ceraclea nigranervosa Trichoptera 1 1
Neureclipsis bimaculata Trichoptera 3 2
Sericostoma flavicorne Trichoptera 15 5.6

Metabarcoding also had some shortcomings in species detection. Notably, it often did not detect Gastropoda, for example, Gyraulus sp. only at five out of nine sites compared to morphology and failed to detect any of the 14 occurrences of four other gastropod taxa (Table 1). Nine other non-gastropod taxa were also not detected by metabarcoding. In some cases, information for species presence mismatched results of morphological identification. For example, Micrasema gelidum was identified from nine sites by morphology, but from none by metabarcoding which indicated presence of Micrasema primoricum at the same sites (Table 1). M. gelidum is a common species in the region, whereas M. primoricum is unknown to Finland. A similar case involved Apatania wallengreni (observed at nine sites by morphology, but at none by metabarcoding) and A. crymophila (indicated presence at four sites only by metabarcoding). The latter is also unknown in Finland. These results suggest issues with the taxonomic assignment or the completeness of the reference library.

When datasets were harmonised to the taxonomic resolution required for national WFD bioassessments, both methods produced highly comparable results with 89 shared taxa and only 14 taxa identified solely by metabarcoding and only 18 exclusively by morphology (Fig. 2). The results were similar for key indicator species (Ephemeroptera, Plecoptera, Trichoptera, EPT) with 57 shared taxa and six and five taxa unique to metabarcoding and morphology, respectively.

The PCA reduced the environmental data to four components which together explain 73% of the variation in the environmental variables. The first component (PC1) represents a gradient of mining influence on water chemistry, specifically inorganic nitrogen, sulphate and metal and non-metal concentrations explaining most of the variation (38%, Suppl. material 5: Table S5). The second component (PC2) represents a gradient of peatland influence on water chemistry (21% explained) with iron, manganese, chemical oxygen demand, aluminium, dissolved organic carbon and organic nitrogen. The third component (PC3) relates to catchment area and arsenic concentration (7.3% explained). The fourth component (PC4) represents a phosphorus gradient (7.1% explained).

The community composition (NMDS), based on both morphological and metabarcoding data, showed significant relationships to the environment, but the magnitude varied between both datasets depending on whether abundance or presence-absence data were used. The relationship of community composition and the mining pollution gradient (PC1) was stronger with morphological identification (R2 = 0.54, P < 0.001) than with metabarcoding (R2 = 0.46, P < 0.001) (Fig. 4a). Morphological data also correlated with PC2 (R2 = 0.32, P = 0.002) and the DNA data with PC3 (R2 = 0.46, P < 0.001). By contrast, when using presence-absence data, the correlation of community composition with the mining pollution gradient (PC1) was stronger with metabarcoding (R2 = 0.59, P = 0.002) than with morphological identification (R2 = 0.46, P < 0.001) (Fig. 3b). Metabarcoding also correlated with PC2 (R2 = 0.39, P = 0.003) and PC3 (R2 = 0.28, P = 0.008) and the morphological data with PC2 (R2 = 0.39, P < 0.001).

Figure 3.

Venn diagrams showing the shared and unique number of taxa by methodology when the datasets are harmonised to the same taxonomic resolution of national WFD bioassessment. a) The whole macroinvertebrate community and b) Ephemeroptera, Plecoptera and Trichoptera (EPT) taxa.

The correlation of community dissimilarity with environmental distance was substantially weaker with metabarcoding data (rs = 0.50) than with morphological identifications (rs = 0.72) when abundance data were used. The results were also different in terms of environmental variables selected as only water iron (Fe) concentration was selected as the best variable using metabarcoding (Table 2). However, when using presence-absence data, correlations were comparable between both methods (metabarcoding: rs = 0.74, morphology: rs = 0.71) and the same environmental variables were selected as the best subset of environmental variables to construct environmental dissimilarity matrices (Table 2).

Table 2.

Summary of the BIO-ENV analysis and the best sets of environmental variables used to construct environmental dissimilarity matrices. Spearman (rs) correlations describe the strength of correlation between community dissimilarity and environmental distances matrices. p/a = Presence-absence data.

Dataset rs Best subset of environmental variables
DNA metabarcoding, abundance (%) 0.50 Fe
DNA metabarcoding, p/a 0.74 Current velocity, Shade, Bryophyte cover, Mn, Tot N, Tot P, Fe, Ca
Morphology, log(abundance) 0.72 Current velocity, Bryophyte cover, Tot N, Fe, Mn
Morphology, p/a 0.71 Current velocity, Shade, Bryophyte cover, Tot N, Fe, Mn
Figure 4.

Distribution of the study sites in a two-dimensional NMS-ordination space, based on macroinvertebrate identification by morphology (a and c) and metabarcoding (b and d) using abundance data (a and b) or presence/absence (c and d) data across the six mining areas. The vectors represent significantly related Principal Components (PC) NMDS axis scores. The colours represent mining areas.

The mining impact on the streams was subtle across the surveyed mining areas (see also Mykrä et al. 2021). We did not record any effect on invertebrate community composition neither with morphologically identified abundance (PERMANOVA, F1,35 = 1.01, R2 = 0.03, P = 0.37) nor with presence-absence data (F1,35 = 1.57, R2 = 0.03, P = 0.37). However, when using relative sequence abundance data, mining showed a significant effect (F1,35 = 1.89, R2 = 0.05, P = 0.04) on invertebrate community composition. Presence-absence data showed a similar, but not significant effect (F1,35 = 1.68, R2 = 0.05, P = 0.06).

Discussion

The rapid development of new DNA sequencing techniques and the increase in taxonomic coverage in reference sequence databases has made DNA metabarcoding an attractive tool for biodiversity surveys and bioassessments (e.g. Baird and Hajibabaei 2012; Hering et al. 2018). Our study shows that bulk metabarcoding produces finer taxonomic resolution than morphological identification which also helped to obtain information on species of conservation value. Thus, metabarcoding provided additional regional biodiversity information that could not be obtained using the routine morphological identification of freshwater bioassessments. In general, both methods showed similar community-environment-relationships. However, metabarcoding showed sensitivity to detect mining impact in our study areas, which suggests that the finer taxonomic resolution can indeed improve detection of subtle impacts on biota.

The added value of metabarcoding to biodiversity surveys

Metabarcoding generated more information for unique taxa than did morphological identification. In fact, in contrast to the morphological approach that uses coarser taxonomic resolution, the cumulative number of taxa found with metabarcoding did not reach an asymptote with increasing number of studied sites. This indicates that much of the regional benthic macroinvertebrate diversity has not yet been observed at the taxonomic resolution that can be achieved by metabarcoding.

Much of this unique taxonomic information comes from groups such as Chironomidae, Simuliidae, Limnephilidae and Oligochaeta, which could be expected as Fennoscandian DNA reference libraries are quite comprehensive for these groups. In our routine morphology-based surveys, they are also never identified to species level.

In our study, metabarcoding was particularly useful in resolving morphological identifications at genus level (especially beetles and stoneflies). Cryptic diversity in morphologically-inseparable species complexes is a common feature of many benthic macroinvertebrate groups (e.g. Rutschmann et al. 2014; Vitecek et al. 2017). This is problematic for bioassessments as taxa of the same genus can exhibit differing responses to environmental stressors (Macher et al. 2016a). A particularly problematic group in this regard is the mayfly family Baetidae (e.g. Ståhls and Savolainen 2008; Rutschmann et al. 2014) and instances of cryptic diversity have been suggested by several molecular studies (Lucentini et al. 2011; Stein et al. 2014; Pereira-da-Conceicoa et al. 2020). Our study also showed some evidence for regional unknown diversity. The metabarcoding data contained sequences matching Baetis phoebus mayflies; however, this species is known only from North America (Barber-James et al. 2013). This surprising result can have three possible explanations: the occurrence of an unknown Baetis sp. that is mistakenly assigned as B. phoebus, errors in the species assignment method or a match to a mislabelled record in the DNA reference library (Elbrecht et al. 2017a). Further analysis, using different species assignment methods and a more complete Beatis reference library, might help to resolve this conundrum. Metabarcoding (and morphological identifications) placed Sericostoma personatum at many sites, but at most of the sites, it also found S. flavicorne, a species that is unknown to the region. Sericostomatidae is a caddis family known to have an unresolved taxonomy (Sipahiler 2000; Malicky 2005; Macher et al. 2016b). Both species are not clearly separated with morphology in early larval stages and the COI barcode shows insufficient taxonomic resolution (Darschnik et al. 2019). Thus, reference sequences contain misidentified specimens of either species group. This demonstrates the need for further investigation of unusual or unexpected species occurrences in metabarcoding datasets (see also Darschnik et al. 2019). In our case, a closer look at the public database helped to resolve this issue and, therefore, we ruled out bioinformatics errors related to the species assignment method chosen. Another noteworthy metabarcoding discovery was the find of three species (Baetis liebenauae, Baetis scambus and Dicranota robusta) considered near-threatened or data deficient in the national Red List (Hyvärinen et al. 2019). This further highlights the promise and power of the approach in the detection of endangered species and assessment of the conservation status of species (Pawlowski et al. 2018).

Community-environment relationships and impact detection

It has been suggested that better resolved taxonomy results in stronger estimates of community-environment relationships and better detection of environmental impacts (Resh and Unzicker 1975; Lenat and Resh 2001). We found some evidence for this as differences in community composition between near-natural streams and mine-impacted streams were detected in mining regions only by the use of metabarcoding. Similarly, Stein et al. (2014) reported that metabarcoding enabled them to differentiate benthic invertebrate communities between bank armoured and unarmoured stream reaches, whereas morphological identification failed to detect any compositional differences. However, our study showed only marginally different community-environment relationships for both methods, suggesting that more specific identifications of taxa in morphologically unresolved groups, such as Chironomidae or Oligochaeta, do not increase resolution. It is, therefore, likely that environmental responses of these species in these groups are less predictable than for taxa (i.e. EPT), traditionally identified to species or genus level (Rabeni and Wang 2001). The inclusion of species abundance rather than the mere use of presence-absence resulted in better resolved community-environment-relationships for morphological data (log specimen abundance), which highlights the importance of accounting for individual abundances in studies of community patterns (e.g. Heino 2008). However, the inclusion of abundance information (relative sequence abundance) in the metabarcoding dataset resulted in a loss of community-environment-relationship signal strength. Indeed, the signal was notably weaker when community dissimilarities were correlated with environmental dissimilarities. On the other hand, the methods showed very similar performance when only presence-absence data were used. These results confirm that the use of presence-absence data is highly recommended when using metabarcoding because sequence abundances typically do not reflect the true abundance or biomass distribution of species (Piñol et al. 2014; Elbrecht et al. 2017b; Braukmann et al. 2019) and, thus, do not provide information similar to species abundance, based on counting of individuals. The poor correlation of DNA sequence abundance with species´ abundance is one of the key limitations for metabarcoding in biomonitoring applications for which abundance information is needed (Elbrecht and Leese 2015).

Shortcomings of metabarcoding

Metabarcoding did not detect all taxa found in morphological identifications. In particular, several molluscs, which we were able to identify by morphology, were missing from the DNA dataset. This could be the result of primer bias during PCR amplification (especially for mollusca; see Fernández et al. 2019), relatively low DNA abundance or DNA degradation (Elbrecht and Steinke 2019). Similarly, Beentjes et al. (2019) found that a majority of molluscs failed to amplify, highlighting the difficulties in recovering all taxonomic groups when using universal primers that do not specifically target molluscs. The choice of primers can have a substantial effect on species detection in benthic macroinvertebrates, although different primers can give similar ecological interpretations when the goal is to study broad-scale community patterns rather than the accurate detection of species (Hajibabaei et al. 2019). Alternative PCR-free approaches, such as metagenomics or metatranscriptomics, could be considered to overcome primer bias and to provide more reliable measures of species abundance (Leese et al. 2018; Cordier et al. 2020). The lack of mollusc matches in the metabarcoding dataset could also be the result of the generally limited coverage for freshwater molluscs in DNA barcode libraries (Weigand et al. 2019). There are a few more gaps in DNA reference libraries for some other taxa, such as Micrasema and Apatania caddisflies, which are common in our study area and in northern streams and rivers in general. Such library gaps could be the reason for some of the discrepancies between both methods and those can only be overcome by further parameterisation of the reference libraries used. There were a few more taxa detected by morphology, but missing from the metabarcoding results, but their abundances were very low to begin with and could be partly attributed to identification errors.

Conclusions

Our results show that bulk metabarcoding can provide comparable or superior results to traditional morphology-based species identification for stream bioassessment and biodiversity surveys. The benefits were most apparent for species groups that are tedious to identify and require high taxonomic expertise. However, the fact that DNA sequence abundances did not correlate with species abundances or biomass hampers the use of ecological metrics that rely on those. Primer bias or incomplete DNA reference libraries, by contrast, do not seem to pose a major roadblock if the goal of a study is impact assessment rather than a complete biodiversity survey. Given the fast growth of DNA reference libraries (Bush et al. 2019; Weigand et al. 2019; Carraro et al. 2020), the latter will also become more feasible. However, the limitations, described above, require caution in the interpretation of the results and taxonomic expertise is still needed to resolve conflicting information. Despite all this, metabarcoding has the potential to substantially increase the knowledge on distributions of poorly studied or endangered species and, thereby, greatly aid biodiversity conservation.

Acknowledgements

We thank Satu Maaria Karjalainen for insightful project management, study planning and site selection. We would also like to thank Minna Kuoppala, Kirsti Leinonen, Katri E. Tolonen and Mika Visuri for resilient assistance in fieldwork. We also thank K. E. Tolonen for initiating the present study and M. Kuoppala for drawing the map. This study was funded by the evaluation and management project of the cumulative environmental effects of the mining cluster in Lapland (A72032) through the EU’s Northern Finland Regional Development Fund and FRESHABIT LIFE IP (LIFE14 IPE/FI/23). This work was also supported by the Canada First Research Excellence Fund and represents a contribution to the University of Guelph’s “Food from Thought” programme.

References

  • Barber-James H, Sartori M, Gattolliat J-L, Webb J (2013) World checklist of freshwater Ephemeroptera species. World Wide Web electronic publication. http://fada.biodiversity.be/group/show/35 [date accessed: 1.14.2021]
  • Beentjes KK, Speksnijder AGCL, Schilthuizen M, Hoogeveen M, Pastoor R, van der Hoorn BB (2019) Increased performance of DNA metabarcoding of macroinvertebrates by taxonomic sorting. PLoS ONE 14(12): e0226527. https://doi.org/10.1371/journal.pone.0226527
  • Braukmann TWA, Ivanova NV, Prosser SWJ, Elbrecht V, Steinke D, Ratnasingham S, de Waard JR, Sones JE, Zakharov EV, Hebert PDN (2019) Metabarcoding a diverse arthropod mock community. Molecular Ecology Resources 19: 711–727. https://doi.org/10.1111/1755-0998.13008
  • Bush A, Compson ZG, Monk W, Porter TM, Steeves R, Emilson EJ, Gagne N, Hajibabaei M, Roy M, Baird D (2019) Studying ecosystems with DNA metabarcoding: Lessons from biomonitoring of aquatic macroinvertebrates. Frontiers in Ecology and Evolution 7: e434. https://doi.org/10.3389/fevo.2019.00434
  • Carraro L, Mächler E, Wüthrich R, Altermatt F (2020) Environmental DNA allows upscaling spatial patterns of biodiversity in freshwater ecosystems. Nature Communications 11: e3585. https://doi.org/10.1038/s41467-020-17337-8
  • Clarke KR, Ainsworth M (1993) A method of linking multivariate community structure to environmental variables. Marine Ecology Progress Series 92: 205–219. https://doi.org/10.3354/meps092205
  • Cordier T, Sáez LA, Apotheloz-Perret-Gentil L, Aylagas E, Bohan DA, Bouchez A, Chariton A, Creer S, Fruhe L, Keck F, Keeley N, Laroche O, Leese F, Pochon X, Stoeck T, Pawlowski J, Lanzén A (2020) Ecosystems monitoring powered by environmental genomics: a review of current strategies with an implementation roadmap. Molecular Ecology: online early. https://doi.org/10.20944/preprints202001.0278.v1
  • Darschnik S, Leese F, Weiss M, Weigand H (2019) When barcoding fails: development of diagnostic nuclear markers for the sibling caddisfly species Sericostoma personatum (Spence in Kirby & Spence, 1826) and Sericostoma flavicorne Schneider, 1845. ZooKeys 872: 57–68. https://doi.org/10.3897/zookeys.872.34278
  • Elbrecht V, Leese F (2015) Can DNA-based ecosystem assessments quantify species abundance? Testing primer bias and biomass-sequence relationships with an innovative metabarcoding protocol. PLoS ONE 10(7): e0130324. https://doi.org/10.1371/journal.pone.0130324
  • Elbrecht V, Leese F (2017) Validation and development of COI metabarcoding primers for freshwater macroinvertebrate bioassessment. Frontiers in Environmental Science 5: e11. https://doi.org/10.3389/fenvs.2017.00011
  • Elbrecht V, Vamos EE, Meissner K, Aroviita J, Leese F (2017a) Assessing strengths and weaknesses of DNA metabarcoding-based macroinvertebrate identification for routine stream monitoring. Methods in Ecology and Evolution 8: 1265–1275. https://doi.org/10.1111/2041-210X.12789
  • Elbrecht V, Peinert B, Leese F (2017b) Sorting things out: Assessing effects of unequal specimen biomass on DNA metabarcoding. Ecology and Evolution 7: 6918–6926. https://doi.org/10.1002/ece3.3192
  • Fernández S, Martínez-Rodríguez S, Martínez JL, Garcia-Vazquez E, Ardura A (2019) How can eDNA contribute in riverine macroinvertebrate assessment? A metabarcoding approach in the Nalón River (Asturias, Northern Spain). Environmental DNA 1: 385–401. https://doi.org/10.1002/edn3.40
  • Haase P, Pauls SU, Schindehütte K, Sundermann A (2010) First audit of macroinvertebrate samples from an EU Water Framework Directive monitoring program: human error greatly lowers precision of assessment results. Freshwater Science 29: 1279–1291. https://doi.org/10.1899/09-183.1
  • Hajibabaei M, Porter TM, Wright M, Rudar J (2019) COI metabarcoding primer choice affects richness and recovery of indicator taxa in freshwater systems. PLoS ONE 14: e0220953. https://doi.org/10.1371/journal.pone.0220953
  • Hebert PD, Cywinska A, Ball SL, Dewaard JR (2003) Biological identifications through DNA barcodes. Proceedings of the Royal Society of London Series B – Biological Sciences 270(1512): 313–321. https://doi.org/10.1098/rspb.2002.2218
  • Hering D, Borja A, Jones JI, Point D, Boets P, Bouchez A, Bruce K, Drakare S, Hänhfling B, Kahlert M, Leese F, Meissner K, Mergen P, Reyjol Y, Seguardo P, Vogler A, Kelly M (2018) Implementation options for DNA-based identification into ecological status assessment under the European Water Framework Directive. Water Research 138: 192–205. https://doi.org/10.1016/j.watres.2018.03.003
  • Hyvärinen E, Juslén A, Kemppainen E, Uddström A, Liukko U-M (2019) The 2019 Red List of Finnish Species. Ympäristöministeriö & Suomen ympäristökeskus, Helsinki, 704 pp. http://hdl.handle.net/10138/299501
  • Leese F, Bouchez A, Abarenkov K, Altermatt F, Borja Á, Bruce K, Ekrem T, Čiampor Jr F, Čiamporová-Zaťovičová Z, Costa FO, Duarte S, Elbrecht V, Fontaneto D, Franc A, Geiger MF, Hering D, Kahlert M, Stroil BK, Kelly M, Keskin E, Liska I, Mergen P, Meissner K, Pawlowski J, Penev L, Reyjol Y, Rotter A, Steinke D, der Wal B, Vitecek S, Zimmermann J, Weigand AM (2018) Why we need sustainable networks bridging countries, disciplines, cultures and generations for aquatic biomonitoring 2.0: A perspective derived from the DNAqua-Net COST Action. Advances in Ecological Research 58: 63–99. https://doi.org/10.1016/bs.aecr.2018.01.001
  • Lenat DR, Resh VH (2001) Taxonomy and stream ecology – The benefits of genus- and species-level identifications. Journal of the North American Benthological Society 20: 287–298. https://doi.org/10.2307/1468323
  • Lucentini L, Rebora M, Puletti ME, Gigliarelli L, Fontaneto D, Gaino E, Panera F (2011) Geographical and seasonal evidence of cryptic diversity in the Baetis rhodani complex (Ephemeroptera, Baetidae) revealed by means of DNA taxonomy. Hydrobiologia 673: 215–228. https://doi.org/10.1007/s10750-011-0778-1
  • Macher JN, Salis RK, Blakemore KS, Tollrian R, Matthaei CD, Leese F (2016a) Multiple-stressor effects on stream invertebrates: DNA barcoding reveals contrasting responses of cryptic mayfly species. Ecological Indicators 61: 159–169. https://doi.org/10.1016/j.ecolind.2015.08.024
  • Macher JN, Weiss M, Beermann AJ, Leese F (2016b) Diversity of Sericostomatidae (Trichoptera) caddisflies in the Montseny mountain range. BioRxiv. https://doi.org/10.1101/057844
  • Malicky H (2005) Ein kommentiertes Verzeichnis der Köcherfliegen (Trichoptera) Europas und des Mediterrangebiets. Linzer Biologische Beitrage 37: 533–596.
  • Mykrä H, Kuoppala M, Nykänen V, Tolonen K, Turunen J, Vilmi A Karjalainen SM (2021) Assessing mining impacts: The influence of background geochemical conditions on diatom and macroinvertebrate communities in subarctic streams. Journal of Environmental Management 278(part 2): 111532. https://doi.org/10.1016/j.jenvman.2020.111532
  • National Board of Waters (1981) The analytical methods used by National Board of Waters. Report 213. National Board of Waters, Helsinki.
  • Nichols SJ, Kefford BJ, Campbell CD, Bylemans J, Chandler E, Bray JP, Shackleton M, Robinson KL, Carew ME, Furlan EM (2019) Towards routine DNA metabarcoding of macroinvertebrates using bulk samples for freshwater bioassessment: Effects of debris and storage conditions on the recovery of target taxa. Freshwater Biology 65: 607–620. https://doi.org/10.1111/fwb.13443
  • Oksanen J, Blanchet FG, Friendly M, Kindt R, Legendre P, McGlinn D, Minchin PR, O´Hara RB, Simpson GL, Solymos P, Henry M, Stevens H, Szoecs E, Wagner H (2019) vegan: Community Ecology Package. R package version 2.5-6. https://CRAN.R-project.org/package=vegan
  • Pawlowski J, Kelly-Quinn M, Altermatt F, Apothèloz-Perret-Gentil L, Beja P, Boggero A, Borja A, Bouchez A, Cordier T, Domaizo I, Feio MJ, Filipe AF, Fornaroli R, Graf W, Herder J, van der Hoorn B, Jones JI, Sagova-Mareckova M, Kahlert M (2018) The future of biotic indices in the ecogenomic era: Integrating (e)DNAmetabarcoding in biological assessment of aquatic ecosystems. Science of the Total Environment 637–638: 1295–1310. https://doi.org/10.1016/j.scitotenv.2018.05.002
  • Pereira-da-Conceicoa L, Elbrecht V, Hall A, Briscoe A, Barber-James H, Price B (2020) Metabarcoding unsorted kick-samples facilitates macroinvertebrate-based biomonitoring with increased taxonomic resolution, while outperforming environmental DNA. Environmental DNA 3(2): 353–371. https://doi.org/10.1002/edn3.116
  • Piñol J, Mir G, Gomez-Polo P, Agustí N (2014) Universal and blocking primer mismatches limit the use of high-throughput DNA sequencing for the quantitative metabarcoding of arthropods. Molecular Ecology Resources 15: 1–12. https://doi.org/10.1111/1755-0998.12355
  • R Core Team (2019) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. https://www.R-project.org/
  • Rabeni CF, Wang N (2001) Bioassessment of streams using macroinvertebrates: are the Chironomidae necessary? Environmental Monitoring and Assessment 71: 177–185. https://doi.org/10.1023/A:1017523115381
  • Resh VH, Unzicker JD (1975) Water quality monitoring and aquatic organisms: the importance of species identification. Journal (Water Pollution Control Federation): 9–19.
  • Rutschmann S, Gattolliat J-L, Hughes SJ, Báez M, Sartori M, Monaghan MT (2014) Evolution and island endemism of morphologically cryptic Baetis and Cloeon species (Ephemeroptera, Baetidae) on the Canary Islands and Madeira. Freshwater Biology 59: 2516–2527. https://doi.org/10.1111/fwb.12450
  • Sipahiler F (2000) Redescription of Sericostoma flavicorne Schneider, 1845 and a new species of Sericostoma Latreille from Turkey (Trichoptera: Sericostomatidae). Braueria 27: 23–25.
  • Stein ED, White BP, Mazor RD, Jackson JK, Battle JM, Miller PE, Pilgrim EM, Sweeney BW (2014) Does DNA barcoding improve performance of traditional stream bioassessment metrics? Freshwater Science 33: 302–311. https://doi.org/10.1086/674782
  • Strand DA, Johnsen SI, Rusch JC, Agersnap S, Larsen WB, Knudsen SW, Møller PR, Vrålstad T (2019) Monitoring a Norwegian freshwater crayfish tragedy: eDNA snapshots of invasion, infections and extinction. Journal of Applied Ecology 56: 1661–1673. https://doi.org/10.1111/1365-2664.13404
  • Ståhls G, Savolainen E (2008) MtDNA COI barcodes reveal cryptic diversity in the Baetis vernus group (Ephemeroptera, Baetidae). Molecular Phylogenetics and Evolution 46: 82–87. https://doi.org/10.1016/j.ympev.2007.09.009
  • Sweeney BW, Battle JM, Jackson JK, Dapkey T (2011) Can DNA barcodes of stream macroinvertebrates improve descriptions of community structure and water quality? Freshwater Science 30: 195–216. https://doi.org/10.1899/10-016.1
  • Vitecek S, Vinҫon G, Wolfram G, Pauls SU (2017) High cryptic diversity in aquatic insects: an integrative approach to study the enigmatic Leuctra inermis species group (Plecoptera). Arthropod Systematics and Phylogeny 75: 497–521.
  • Weigand H, Beermann AJ, Čiampor F, Costa FO, Csabai Z, Duarte S, Geiger MF, Grabowski M, Rimet F, Rulik B, Strand M, Szucsich N, Weigand M, Willassen E, Wyler SA, Bouchez A, Borja A, Člamporová-Zaťovićová Z, Ekrem T (2019) DNA barcode reference libraries for the monitoring of aquatic biota in Europe: Gap-analysis and recommendations for future work. Science of the Total Environment 678: 499–524. https://doi.org/10.1016/j.scitotenv.2019.04.247
  • Wright IA, Chessman BC, Fairweather PG, Benson LJ (1995) Measuring the impact of sewage effluent on the macroinvertebrate community of an upland stream: The effect of different levels of taxonomic resolution and quantification. Austral Ecology 20: 142–149. https://doi.org/10.1111/j.1442-9993.1995.tb00528.x
  • Yu DW, Ji Y, Emerson BC, Wang X, Ye C, Yang C, Ding Z (2012) Biodiversity soup: metabarcoding of arthropods for rapid biodiversity assessment and biomonitoring. Methods in Ecology and Evolution 3: 613–623. https://doi.org/10.1111/j.2041-210X.2012.00198.x

Supplementary materials

Supplementary material 1 

Table S1

Author: Jarno Turunen, Heikki Mykrä, Vasco Elbrecht, Dirk Steinke, Thomas Braukmann, Jukka Aroviita

Data type: Environmental variables

Explanation note: Table S1. Mean and range environmental characteristics of the 36 study sites in northern Finland.

This dataset is made available under the Open Database License (http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Download file (75.34 kb)
Supplementary material 2 

Table S2

Author: Jarno Turunen, Heikki Mykrä, Vasco Elbrecht, Dirk Steinke, Thomas Braukmann, Jukka Aroviita

Data type: Bioinformatic information

Explanation note: Table S2. Bioinformatic information of DNA extraction including sample layout and in-line tagging.

This dataset is made available under the Open Database License (http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Download file (105.65 kb)
Supplementary material 3 

Table S3

Author: Jarno Turunen, Heikki Mykrä, Vasco Elbrecht, Dirk Steinke, Thomas Braukmann, Jukka Aroviita

Data type: Read counts and taxonomy of OTU sequences

Explanation note: Table S3. Raw OTU table showing read counts, as well as taxonomy and OTU sequences for each sample. In addition, processed and quality filtered OTU tables with taxonomy assigned are included.

This dataset is made available under the Open Database License (http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Download file (1.39 MB)
Supplementary material 4 

Table S4

Author: Jarno Turunen, Heikki Mykrä, Vasco Elbrecht, Dirk Steinke, Thomas Braukmann, Jukka Aroviita

Data type: Occurrences of taxa

Explanation note: Table S4. List of taxa observed by metabarcoding (DNA meta) and by routine morphological identification (Morpho) across 36 stream site samples in northern Finland. Taxon frequency occurrence (% of samples) is given for both methods. Mean relative abundance of reads for DNA metabarcoding and mean number of specimens for morphological identification across samples in which each taxon was present are also given.

This dataset is made available under the Open Database License (http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Download file (90.22 kb)
Supplementary material 5 

Table S5

Author: Jarno Turunen, Heikki Mykrä, Vasco Elbrecht, Dirk Steinke, Thomas Braukmann, Jukka Aroviita

Data type: Results of PCA of environmental variables

Explanation note: Table S5. Eigenvalues, percent explained variance and varimax rotated principal component loadings of the environmental variables. Loadings > 0.5 are in bold.

This dataset is made available under the Open Database License (http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Download file (75.98 kb)
Supplementary material 6 

Scripts S1

Author: Jarno Turunen, Heikki Mykrä, Vasco Elbrecht, Dirk Steinke, Thomas Braukmann, Jukka Aroviita

Data type: R-scripts

Explanation note: Commands for data processing are available as supporting information.

This dataset is made available under the Open Database License (http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Download file (4.44 kb)
login to comment