The power of metabarcoding: Can we improve bioassessment and biodiversity surveys of stream macroinvertebrate communities?

Most stream bioassessment and biodiversity surveys are currently based on morphological identification of communities. However, DNA metabarcoding is emerging as a fast and cost-effective alternative for species identification. We compared both methods in a survey of benthic macroinvertebrate communities across 36 stream sites in northern Finland. We identified 291 taxa of which 62% were identified only by DNA metabarcoding. DNA metabarcoding produced extensive species level inventories for groups (Oligochaeta, Chironomidae, Simuliidae, Limoniidae and Limnephilidae), for which morphological identification was not feasible due to the high level of expertise needed. Metabarcoding also provided more insightful taxonomic information on the occurrence of three red-listed vulnerable or data deficient species, the discovery of two likely cryptic and potentially new species to Finland and species information of insect genera at an early larval stage that could not be separated morphologically. However, it systematically failed to reliably detect the occurrence of gastropods that were easily identified morphologically. The impact of mining on community structure could only be shown using DNA metabarcoding data which suggests that the finer taxonomic detail can improve detection of subtle impacts. Both methods generally exhibited similar strength of community-environment relationships, but DNA metabarcoding showed better performance with presence/absence data than with relative DNA sequence abundances. Our results suggest that DNA metabarcoding holds a promise for future anthropogenic impact assessments, although, in our case, the performance did not improve much from the morphological species identification. The key advantage of DNA metabarcoding lies in efficient biodiversity surveys, taxonomical studies and applications in conservation biology.


Introduction
Freshwater biomonitoring and biodiversity assessments are traditionally based on morphological identification of species and specimen count data. Particularly, benthic macroinvertebrates are used to assess the ecological state of streams and lakes. Such bioassessment programmes are often limited to genus or family level taxonomic resolution (e.g. Wright et al. 1995) for at least three reasons: (1) morphological species identification of several taxonomic groups requires a high level of expertise and financial resources, (2) many insect larval instars cannot be reliably identified to species or even to genus using morphological characteristics and (3) some species (or subspecies) form complexes which are indistinguishable by morphology (e.g. Ståhls and Savolainen 2008;Lucentini et al. 2011). This low taxonomic resolution hampers our ability to detect environmental impacts, because even closely-related species can vary in their tolerance to a given stressor and, thus, respond differently to changes in their environment (Resh and Unzicker 1975;Macher et al. 2016a). If this is accompanied by misidentifications, the detection of biological impairment is further obscured (Lenat and Resh 2001;Haase et al. 2010;Sweeney et al. 2011).
The past two decades saw the rapid development of DNA-based approaches to species identification. Particularly, systems based on the analysis of sequence variation in short, standardised gene regions (i.e. DNA barcodes) were shown to be very effective in species discrimination (Hebert et al. 2003). The DNA barcoding approach can be combined with high-throughput sequencing, to identify thousands of specimens in bulk in a process called metabarcoding. These methods have become cost-efficient and represent reliable alternatives to existing approaches used for freshwater bioassessments (Yu et al. 2012;Elbrecht et al. 2017a;Hering et al. 2018;Nichols et al. 2019) and biodiversity surveys (Baird and Hajibabaei 2012). Barcode reference libraries for freshwater benthic macroinvertebrates are fairly complete and continuous efforts to improve these databases are underway ).
Nevertheless, a key criterion for the adoption of DNA-based methods as state-of-the-art procedures in bioassessment and biodiversity surveys remains their reliability. There are known cases of false-negatives (a species is present in the sample, but its DNA is not detected either due to primer bias, insufficient specimen biomass, DNA degradation or incomplete barcode libraries) and false-positives (a species is not present in the sample, but is detected either due to the presence of trace DNA in, for example, gut content, cross-contamination or analytical errors; Hering et al 2018; . It is unclear how metabarcoding errors compare to errors in morphological identifications. Metabarcoding methods are becoming well established and it is vital to evaluate potential discrepancies in species detection and to assess methodological limitations in the context of biomonitoring.
For this study, we conducted a bioassessment survey of macroinvertebrates in subarctic streams close to mining operations in northern Finland. Macroinvertebrates were identified by using both a standard morphology-based protocol and DNA metabarcoding. We explored differences in detecting different taxa for both methods and were particularly interested in the potential of metabarcoding to gain additional taxonomic information for the study region as we expected that it would provide a more comprehensive assessment of local biodiversity than morphological identifications. Furthermore, we anticipated that a more detailed level of identification would help to better understand community-environment relationships and provide higher sensitivity to potential mining impacts.

Study area
Our study area is located in the Lapland region of northern Finland (between 66°N and 70°N). Streams in this region belong to northern boreal and subarctic ecoregions. The area is sparsely populated and the main anthropogenic pressure for freshwater ecosystems stems from forestry and mining operations. There is an increasing interest in the extraction of mineral deposits in the area with potentially negative impacts on freshwater biodiversity (Leppänen et al. 2017). We selected 36 stream sites from six regions with mining activities (Fig. 1). Regions comprised three locations with operational mines, two with closed mines and one (Sokli) where mining is planned, but not yet operational. From each region, we sampled three stream sites downstream of a mine and three reference sites without any mining influence (see Mykrä et al. 2021 for details). These control sites were either upstream of a mine or, if this were not possible (mining impact extended into the headwaters), we took samples from separate unaffected streams of the same drainage system. All sites from the region without an operational mine (Sokli) were classified as reference streams.
The key minerals extracted from the mining sites are gold (Kittilä, Kevitsa, Pahtavaara, Saattopora-Pahtavuoma), nickel (Kevitsa), copper (Kevitsa, Saattopora-Pahtavuoma) and chromium (Kemi). In Sokli, the key mineral is phosphate. The mines are mostly combinations of open pit and underground extraction sites. The main impacts on streams at all the mining-impacted study sites are elevated concentrations of nitrogen and sulphate which originate from the ore extraction and enrichment process, as well as use of explosives and rock tailings. Some sites also show elevated concentrations of arsenic, antimony and ions such as chloride salts (Kittilä, Kemi, Kevitsa). The impacts of this type of mining on stream biodiversity are poorly known; however, a recent study suggests that they are subtle (Mykrä et al. 2021) and could thus be better detected by using DNA-based methods rather than morphological species identification.

Stream habitat measurements
We made several measurements in both the habitat and the catchment of each stream site (Suppl. material 1: Table S1). Substrate structure and coverage of stream macrophytes were visually estimated by placing fifteen 0.25 m 2 (0.5 * 0.5 m) plots across each site. Substrate size was classified following a modified Wentworth scale: organic matter = 0, sand (diameter 0.25-2 mm) = 1, fine gravel (2-6 mm) = 2, coarse gravel (6-16 mm) = 3, small pebble (16-32 mm) = 4, large pebble (32-64 mm) = 5, small cobble (64-128 mm) = 6, large cobble (128-256 mm) = 7, boulder (256-400 mm) = 8, large boulder and bed rock (> 400 mm) = 9. The percentage coverage of different size classes and macrophytes was estimated from the plots and averaged across the stream site. We collected water samples from each site to measure suspended solids, nutrients and various metals, metalloids and non-metals by using national standards (National Board of Waters 1981). Water pH and conductivity was measured in situ with an YSI Professional Plus meter (YSI Inc., Yellowsprings, Ohio, USA). Upstream catchment boundaries were delineated for each site using a Geographical Information System with the Digital Elevation Model (DEM) raster database of the National Land Survey of Finland (NLS) and vector data of Drainage Basins in Finland (SYKE). Land cover and land use were calculated for each catchment using Corine Land Use data.

Sample collection and processing
Benthic macroinvertebrates were sampled in autumn 2017 by taking four 30 second kick-net samples (mesh size 0.5 mm, area of disturbed stream bed = 1.2 m 2 ) from swiftly flowing riffle sections at each site. The protocol followed the sampling guidelines for Finnish National Water Framework Directive (WFD) monitoring (Järvinen et al. 2019). Samples were preserved in 96% ethanol in the field. Ethanol was replaced the same day to ensure a high ethanol concentration for preservation of specimens and prevent DNA degradation. Samples were kept cool (8 °C) for subsequent molecular analyses.
Morphological identification of macroinvertebrates followed the requirements for national WFD bioassessments (Järvinen et al. 2019). Taxonomic levels are species or genus for groups that are feasible for identification by morphology in routing biomonitoring (Mollusca, Coleoptera, Ephemeroptera, Plecoptera, some Diptera, most Trichoptera). In Baetis mayflies, the specimens without gill and tracheal bristles and wide frons were identified as Baetis vernus-type. Specimens from groups where determination is laborious or too difficult were only identified to higher level taxonomy (Oligochaeta, Hydracarina, Chironomidae (Diptera), Ceratopogoniidae (Diptera), Simuliidae (Diptera), Limnephilidae (Trichoptera)). After identification, specimens were pooled back into their respective sampling lot for metabarcoding analysis.

DNA metabarcoding
For DNA extraction, sampling lots comprising all morphologically identified specimens were dried overnight at 40 °C. Large specimens (e.g. a dytiscid beetle) were removed from samples and a single leg of the specimen added back to the sample to prevent DNA of such a single specimen dominating the sample (Elbrecht et al. 2017a). We ground samples to a fine powder by either using the IKA ULTRA-TUR-RAX Tube Drive Control System (with ten 5 mm steel beads in 20 ml tubes, ground at 4,000 rpm for 30 min) or the IKA Tube Mill control system (for larger samples using 40 ml chambers, homogenising samples for 3 minutes at 25,000 rpm). On average, 14.91 mg (SD = 7.62 mg) tissue powder per sample was used for DNA extraction with the DNeasy 96 Blood & Tissue Kit (Qiagen, Hilden, Germany). For each sample, we extracted two aliquots of tissue, as well as 12 negative controls and 10 Lepidoptera specimens which served as positive controls (see Suppl. material 2: Table S2 for exact sample layout and Lepidoptera species added). To prevent cross-contamination between samples, tissue was digested according to manufacturer recommendations in individual 1.5 ml reaction tubes at 56 °C for 3 hours and then transferred to the spin column plate.
Metabarcoding was done using a two-step fusion primer strategy . During the first PCR step, a 421 bp subset of the Cytochrome c oxidase subunit I gene (COI) was amplified using the BF2+BR2 primer set (Elbrecht and Leese 2017) and the Qiagen Multiplex PCR Plus Kit (Qiagen, Hilden, Germany). Two DNA extracts from an insect mock sample of a prior study (Braukmann et al. 2019) were used as additional positive controls. The reaction mix consisted of 0.5 μl DNA (concentration not quantified), 0.2 μM each primer, 2x Multiplex PCR Master Mix and ddH 2 O with a total reaction volume of 25 μl. PCRs were run on an Eppendorf mastercycler pro Thermocycler using the following programme: 95 °C for 5 min; 25 cycles of 95 °C for 30 s, 50 °C for 30 s and 72 °C for 50 s; and a final extension at 72 °C for 5 min. PCR success was evaluated on a 1% agarose gel. For the second PCR step, 0.5 μl amplicon of the previous PCR was used as template and each sample tagged with a unique fusion primer combination (see Suppl. material 2: Table S2 for details). The PCR setup was identical to the first, but the cycle number was reduced to 15 and the extension time in each cycle was increased to two minutes. PCR success was again checked with a 1% agarose gel and sample 33, which showed no band, was run for 15 additional PCR cycles. Amplicons were purified, left-side size-selected using SPRIselect with a ratio of 0.76x (Beckman Coulter, Brea, CA, USA) and then quantified on a Qubit Fluorometer (HS Kit, Thermo Fisher Scientific, Waltham, MA, USA). All amplicons, including positive and negative controls, were equimolar pooled. Negative controls were pooled by using the average volume of the other samples. The resulting library was sequenced on an Illumina MiSeq using the 300 bp PE v.3 Sequencing Kit with 5% Phi-X spike-in.
Raw sequencing data was quality checked using FastQC v.0.11.8 and then processed using the R stats based JAMP v.0.58 pipeline (https://github.com/VascoElbrecht/JAMP) which mostly relies on usearch v.11.0.667 (Edgar 2010). Commands for data processing are available as supporting information (Scripts S1). Reads were demultiplexed, based on fusion primer in-line tagging (Suppl. material 2: Table S2) and paired-end merged, using usearch, while allowing mismatches of up to 25%. We trimmed Primers from both sides, using default settings in Cutadapt v.1.18 (Martin 2011). Reads where a primer could not be detected at either end were discarded. In addition, only reads between 411-431 bp length were retained for further analysis. To further discard reads with poor read quality, we applied expected error (EE) filtering (Edgar and Flyvbjerg 2015), using a max EE of 1. Sequences of all samples were pooled, dereplicated (minuniquesize = 2) and clustered into molecular operational taxonomic units (MOTUs), using cluster OTUs with a 97% identity threshold (Edgar 2013). Individual reads (including singletons) were mapped against the OTU list, using usearch global with a minimum match of 97%. OTUs with < 0.01% read abundance in both replicates of a sample were discarded and the remaining reads mapped again against the OTU subset. We multiplied the highest read count for each individual OTU present in the negative controls by two and subtracted it from all other samples, to account for low level tag switching and cross contamination. OTU taxonomy was assigned using BOLD (Ratnasingham and Hebert 2007).

Taxonomic harmonisation
Identifications by morphology and DNA metabarcoding both had cases of higher and lower taxonomic level than species level identification for a given taxon. Therefore, to avoid overlapping higher level taxa, we harmonised the taxonomy in each sample separately for both identification methods as follows. Identifications from higher level (e.g. family or genus) were assigned to corresponding lower level (e.g. genus or species) identifications in the sample in accordance with the numerical abundance (number of specimens in morphological identifications and number of sequences in DNA metabarcoding) ratios of the corresponding lower level identifications. If lower level identifications were not present in a sample for a given taxon, the higher level identifications were left as is.

Statistical analyses
We compared both identification methods, using taxon accumulation curves, taxon occurrence information, multivariate analyses of community composition and community-environment-relationships, as well as statistical tests for potential effects of mining on biota. We explored community composition using Non-metric Multidimensional Scaling (NMDS) ordination. Two-dimensional NMDS ordinations were based on Bray-Curtis distances of log-transformed abundance data (morphological identifications) and proportional DNA sequence abundance data, as well as presence-absence data. Adding further dimensions did not considerably optimise stress. We correlated community composition (NMDS scores) with main environmental gradients by using Principal Component Analysis (PCA) to reduce multidimensionality of the environmental data to a few interpretable principal components (gradients) and subsequently fitted those to the NMDS scores. We used a Permutational Multivariate Analysis of Variance (PER-MANOVA; Anderson 2001) to test for impact of mining on community composition. The mining region was used as group (strata) to constrain permutations. Prior to the PER-MANOVA, we tested the homogeneity of multivariate dispersion within treatments with a Permutational Analysis of Multivariate Dispersions (PERMDISP; Anderson 2006). Finally, we compared community dissimilarity and environmental distance relationships between the two identification methods, using BIO-ENV (Clarke and Ainsworth 1993). In BIO-ENV, Bray-Curtis resemblance matrices are correlated to Euclidean distance matrices of environmental variables and Spearman´s rank correlation (r s ) is used to assess the strength of the correlation. BIO-ENV searches for the best subset of environmental variables which maximises the correlation between community and environmental distance. All statistical analyses were done in R (R Core Team 2019) using the vegan package (Oksanen et al. 2019), in particular its functions metaMDS and envfit (NMDS), adonis (PER-MANOVA), betadisper (PERMDISP) and bioenv.

Results
MiSeq sequencing generated 15,760,049 sequences. Raw data are available under the SRA accession PRJNA547646. Eleven of the 12 negative controls showed less than 500 reads and one negative control had around 3,000 reads. The macroinvertebrate samples had an average sequencing depth of 149,278 reads (SD = 17732). During bioinformatic processing and quality filtering, an average of 11.1% of the reads were discarded in each mining sample (SD = 2.4%). The 10 Lepidoptera species, added as positive controls, did appear in other samples, almost exclusively along the column in which the respective sample was added (figure S1). This might be caused by the decreased sequence quality in read two of Illumina sequencing or tag switching on the flow cell related to the second read.
We collected a total of 33,355 specimens from all 36 stream sites. The average number of specimens per sample was 927 (range: 201-2557). Overall, morphological analysis revealed 113 taxa, while metabarcoding unveiled 250 taxa (Fig. 2, Suppl. material 4: Table S4). Metabarcoding also produced more taxa per site (mean 45, range: 29-64) than morphological identification (31, range: 16-48). The cumulative number of taxa steadily increased with the number of sites, particularly metabarcoding did not reach an asymptote with increasing number of sampling sites (Fig. 2). By contrast, for morphological identification, the number of taxa did approach an asymptote at the total number of taxa detected.
Approximately half of the taxa detected only by metabarcoding were from groups that were not morphologically identified below family or lower taxonomic level using the routine protocol: Oligochaeta (16 species, genera or families detected only through metabarcoding), Arachnida (11 families or species), Chironomidae (72 species or genera), Simuliidae (11 species), other Dipteran families (seven species) and Limnephilidae (Trichoptera, seven species). Even after excluding these taxonomic groups, metabarcoding still identified more unique taxa (57 taxa not identified by morphology) than did morphological identification (22 taxa not identified by metabarcoding). Overall, metabarcoding detected 81% of the taxa that were detected by morphology, whereas morphological identification detected 54% of the taxa detected by metabarcoding (excluding the taxonomic groups that were not morphologically identified to lower levels).
In many cases, metabarcoding outperformed morphological identification by producing otherwise not possible species level information (Table 1). Such cases included the stonefly genera Leuctra, Nemoura, Diura and Isoperla which are notoriously difficult to identify to species level using early nymph instars. In mayflies, metabarcoding resolved the high diversity of the swimming mayfly genus Baetis (seven species) that morphologically could only be identified to either Baetis rhodani or Baetis vernus-type. Metabarcoding also provided unique species level information for Coleoptera (Agabus guttatus, Oreodytes sanmarkii, Hydroporus lapponium, Hydraena gracilis), for three red-listed or data deficient (DD) species (Baetis liebenauae (NT), Baetis scambus (DD), Dicranota robusta (NT)) and for species that are either new to Finland or represent cryptic species of unknown taxonomic status or, in some cases, likely represented errors in the reference library: Baetis phoebus, Heptagenia pulla, Sericostoma flavicorne (in morpho S. personatum). Metabarcoding also detected taxa that should have been readily identifiable Metabarcoding also had some shortcomings in species detection. Notably, it often did not detect Gastropoda, for example, Gyraulus sp. only at five out of nine sites compared to morphology and failed to detect any of the 14 occurrences of four other gastropod taxa (Table 1). Nine other non-gastropod taxa were also not detected by metabarcoding. In some cases, information for species presence mismatched results of morphological identification. For example, Micrasema gelidum was identified from nine sites by morphology, but from none by metabarcoding which indicated presence of Micrasema primoricum at the same sites (Table 1). M. gelidum is a common species in the region, whereas M. primoricum is unknown to Finland. A similar case involved Apatania wallengreni (observed at nine sites by morphology, but at none by metabarcoding) and A. crymophila (indicated presence at four sites only by metabarcoding). The latter is also unknown in Finland. These results suggest issues with the taxonomic assignment or the completeness of the reference library.
When datasets were harmonised to the taxonomic resolution required for national WFD bioassessments, both methods produced highly comparable results with 89 shared taxa and only 14 taxa identified solely by metabarcoding and only 18 exclusively by morphology (Fig. 2). The results were similar for key indicator species (Ephemeroptera, Plecoptera, Trichoptera, EPT) with 57 shared taxa and six and five taxa unique to metabarcoding and morphology, respectively.
The PCA reduced the environmental data to four components which together explain 73% of the variation in the environmental variables. The first component (PC1) represents a gradient of mining influence on water chemistry, specifically inorganic nitrogen, sulphate and metal and non-metal concentrations explaining most of the variation (38%, Suppl. material 5: Table S5). The second component (PC2) represents a gradient of peatland influence on water chemistry (21% explained) with iron, manganese, chemical oxygen demand, aluminium, dissolved organic carbon and organic nitrogen. The third component (PC3) relates to catchment area and arsenic concentration (7.3% explained). The fourth component (PC4) represents a phosphorus gradient (7.1% explained).
The community composition (NMDS), based on both morphological and metabarcoding data, showed significant relationships to the environment, but the magnitude varied between both datasets depending on whether abundance or presence-absence data were used. The relationship of community composition and the mining pollution gradient (PC1) was stronger with morphological identification (R 2 = 0.54, P < 0.001) than with metabarcoding (R 2 = 0.46, P < 0.001) (Fig. 4a). Morphological data also correlated with PC2 (R 2 = 0.32, P = 0.002) and the DNA data with PC3 (R 2 = 0.46, P < 0.001). By contrast, when using presence-absence data, the correlation of community composition with the mining pollution gradient (PC1) was stronger with metabarcoding (R 2 = 0.59, P = 0.002)  than with morphological identification (R 2 = 0.46, P < 0.001) (Fig. 3b). Metabarcoding also correlated with PC2 (R 2 = 0.39, P = 0.003) and PC3 (R 2 = 0.28, P = 0.008) and the morphological data with PC2 (R 2 = 0.39, P < 0.001). The correlation of community dissimilarity with environmental distance was substantially weaker with metabarcoding data (r s = 0.50) than with morphological identifications (r s = 0.72) when abundance data were used. The results were also different in terms of environmental variables selected as only water iron (Fe) concentration was selected as the best variable using metabarcoding (Table 2). However, when using presence-absence data, correlations were comparable between both methods (metabarcoding: r s = 0.74, morphology: r s = 0.71) and the same environmental variables were selected as the best subset of environmental variables to construct environmental dissimilarity matrices ( Table 2).
The mining impact on the streams was subtle across the surveyed mining areas (see also Mykrä et al. 2021).

Discussion
The rapid development of new DNA sequencing techniques and the increase in taxonomic coverage in reference sequence databases has made DNA metabarcoding an attractive tool for biodiversity surveys and bioassessments (e.g. Baird and Hajibabaei 2012;Hering et al. 2018). Our study shows that bulk metabarcoding produces finer taxonomic resolution than morphological identification which also helped to obtain information on species of conservation value. Thus, metabarcoding provided additional regional biodiversity information that could not be obtained using the routine morphological identification of freshwater bioassessments. In general, both methods showed similar community-environment-relationships. However, metabarcoding showed sensitivity to detect mining impact in our study areas, which suggests that the finer taxonomic resolution can indeed improve detection of subtle impacts on biota.

The added value of metabarcoding to biodiversity surveys
Metabarcoding generated more information for unique taxa than did morphological identification. In fact, in contrast to the morphological approach that uses coarser taxonomic resolution, the cumulative number of taxa found with metabarcoding did not reach an asymptote with increasing number of studied sites. This indicates that much of the regional benthic macroinvertebrate diversity has not yet been observed at the taxonomic resolution that can be achieved by metabarcoding.
Much of this unique taxonomic information comes from groups such as Chironomidae, Simuliidae, Limnephilidae and Oligochaeta, which could be expected as Fennoscandian DNA reference libraries are quite comprehensive for these groups. In our routine morphology-based surveys, they are also never identified to species level.
In our study, metabarcoding was particularly useful in resolving morphological identifications at genus level (especially beetles and stoneflies). Cryptic diversity in morphologically-inseparable species complexes is a common feature of many benthic macroinvertebrate groups (e.g. Rutschmann et al. 2014;Vitecek et al. 2017). This is problematic for bioassessments as taxa of the same genus can exhibit differing responses to environmental stressors (Macher et al. 2016a). A particularly problematic group in this regard is the mayfly family Baetidae (e.g. Ståhls and Savolainen 2008;Rutschmann et al. 2014) and instances of cryptic diversity have been suggested by several molecular studies (Lucentini et al. 2011;Stein et al. 2014;Pereira-da-Conceicoa et al. 2020). Our study also showed some evidence for regional unknown diversity.
The metabarcoding data contained sequences matching Baetis phoebus mayflies; however, this species is known only from North America (Barber-James et al. 2013). This surprising result can have three possible explanations: the occurrence of an unknown Baetis sp. that is mistakenly assigned as B. phoebus, errors in the species assignment method or a match to a mislabelled record in the DNA reference library (Elbrecht et al. 2017a). Further analysis, using different species assignment methods and a more complete Beatis reference library, might help to resolve this conundrum. Metabarcoding (and morphological identifications) placed Sericostoma personatum at many sites, but at most of the sites, it also found S. flavicorne, a species that is unknown to the region. Sericostomatidae is a caddis family known to have an unresolved taxonomy (Sipahiler 2000;Malicky 2005;Macher et al. 2016b). Both species are not clearly separated with morphology in early larval stages and the COI barcode shows insufficient taxonomic resolution (Darschnik et al. 2019). Thus, reference sequences contain misidentified specimens of either species group. This demonstrates the need for further investigation of unusual or unexpected species occurrences in metabarcoding datasets (see also Darschnik et al. 2019). In our case, a closer look at the public database helped to resolve this issue and, therefore, we ruled out bioinformatics errors related to the species assignment method chosen. Another noteworthy metabarcoding discovery was the find of three species (Baetis liebenauae, Baetis scambus and Dicranota robusta) considered near-threatened or data deficient in the national Red List (Hyvärinen et al. 2019). This further highlights the promise and power of the approach in the detection of endangered species and assessment of the conservation status of species ).

Community-environment relationships and impact detection
It has been suggested that better resolved taxonomy results in stronger estimates of community-environment relationships and better detection of environmental impacts (Resh and Unzicker 1975;Lenat and Resh 2001). We found some evidence for this as differences in community composition between near-natural streams and mine-impacted streams were detected in mining regions only by the use of metabarcoding. Similarly, Stein et al. (2014) reported that metabarcoding enabled them to differentiate benthic invertebrate communities between bank armoured and unarmoured stream reaches, whereas morphological identification failed to detect any compositional differences. However, our study showed only marginally different community-environment relationships for both methods, suggesting that more specific identifications of taxa in morphologically unresolved groups, such as Chironomidae or Oligochaeta, do not increase resolution. It is, therefore, likely that environmental responses of these species in these groups are less predictable than for taxa (i.e. EPT), traditionally identified to species or genus level (Rabeni and Wang 2001). The inclusion of species abundance rather than the mere use of presence-absence resulted in better resolved community-environment-relationships for morphological data (log specimen abundance), which highlights the importance of accounting for individual abundances in studies of community patterns (e.g. Heino 2008). However, the inclusion of abundance information (relative sequence abundance) in the metabarcoding dataset resulted in a loss of community-environment-relationship signal strength. Indeed, the signal was notably weaker when community dissimilarities were correlated with environmental dissimilarities. On the other hand, the methods showed very similar performance when only presence-absence data were used. These results confirm that the use of presence-absence data is highly recommended when using metabarcoding because sequence abundances typically do not reflect the true abundance or biomass distribution of species (Piñol et al. 2014;Elbrecht et al. 2017b;Braukmann et al. 2019) and, thus, do not provide information similar to species abundance, based on counting of individuals. The poor correlation of DNA sequence abundance with species´ abundance is one of the key limitations for metabarcoding in biomonitoring applications for which abundance information is needed (Elbrecht and Leese 2015).

Shortcomings of metabarcoding
Metabarcoding did not detect all taxa found in morphological identifications. In particular, several molluscs, which we were able to identify by morphology, were missing from the DNA dataset. This could be the result of primer bias during PCR amplification (especially for mollusca; see Fernández et al. 2019), relatively low DNA abundance or DNA degradation . Similarly, Beentjes et al. (2019) found that a majority of molluscs failed to amplify, highlighting the difficulties in recovering all taxonomic groups when using universal primers that do not specifically target molluscs. The choice of primers can have a substantial effect on species detection in benthic macroinvertebrates, although different primers can give similar ecological interpretations when the goal is to study broad-scale community patterns rather than the accurate detection of species . Alternative PCR-free approaches, such as metagenomics or metatranscriptomics, could be considered to overcome primer bias and to provide more reliable measures of species abundance Cordier et al. 2020). The lack of mollusc matches in the metabarcoding dataset could also be the result of the generally limited coverage for freshwater molluscs in DNA barcode libraries ). There are a few more gaps in DNA reference libraries for some other taxa, such as Micrasema and Apatania caddisflies, which are common in our study area and in northern streams and rivers in general. Such library gaps could be the reason for some of the discrepancies between both methods and those can only be overcome by further parameterisation of the reference libraries used. There were a few more taxa detected by morphology, but missing from the metabarcoding results, but their abundances were very low to begin with and could be partly attributed to identification errors.

Conclusions
Our results show that bulk metabarcoding can provide comparable or superior results to traditional morphology-based species identification for stream bioassessment and biodiversity surveys. The benefits were most apparent for species groups that are tedious to identify and require high taxonomic expertise. However, the fact that DNA sequence abundances did not correlate with species abundances or biomass hampers the use of ecological metrics that rely on those. Primer bias or incomplete DNA reference libraries, by contrast, do not seem to pose a major roadblock if the goal of a study is impact assessment rather than a complete biodiversity survey. Given the fast growth of DNA reference libraries (Bush et al. 2019;Weigand et al. 2019;Carraro et al. 2020), the latter will also become more feasible. However, the limitations, described above, require caution in the interpretation of the results and taxonomic expertise is still needed to resolve conflicting information. Despite all this, metabarcoding has the potential to substantially increase the knowledge on distributions of poorly studied or endangered species and, thereby, greatly aid biodiversity conservation.