Research Article
Corresponding author: Nina Röder ( nina-roeder@posteo.de ) Academic editor: Sarah J. Bourlat
© 2023 Nina Röder, Klaus Schwenk.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Röder N, Schwenk K (2023) Direct PCR meets high-throughput sequencing – metabarcoding of chironomid communities without DNA extraction. Metabarcoding and Metagenomics 7: e102455. https://doi.org/10.3897/mbmg.7.102455
Aquatic emergent insect communities form an important link between aquatic and terrestrial ecosystems, yet studying them is costly and time-consuming as they are usually diverse and superabundant. Metabarcoding is a valuable tool to investigate arthropod community compositions; however, high-throughput applications, such as biomonitoring, require cost-effective and user-friendly procedures. To investigate whether the time-consuming and labour-intensive DNA extraction step can be omitted in metabarcoding, we studied the difference in detection rates and individual read abundance using standard DNA extraction versus direct PCR protocols. Metabarcoding with and without DNA extraction was performed on artificially created communities of known composition as well as on natural communities, both of the dipteran family Chironomidae, to compare detection rates, individual read abundances and presence-absence community composition. We found that the novel approach of direct PCR metabarcoding presented here did not alter detection rates and had a minor effect on individual read abundances in artificially created communities. Furthermore, presence-absence community compositions of natural chironomid communities were highly comparable using both approaches. In conclusion, we showed that direct PCR protocols can be applied in chironomid metabarcoding approaches, with possible application for a wider range of arthropod taxa, enabling us to study communities more efficiently in the future.
cytochrome c oxidase I (COI or COX1), DNA isolation, metabarcoding, next-generation sequencing (NGS), presence-absence community composition, read abundance, size-sorting
A key global challenge in the 21st century is the attempt to reverse the ongoing global biodiversity decline and to mitigate its consequences. DNA metabarcoding for large-scale monitoring of species rich and abundant groups, such as insects, is labour- and cost-intensive, but has become more and more achievable since its development in the early 2000s (
Typically, a laborious DNA extraction step prior to PCR is performed in order to purify and concentrate DNA from tissue samples. However, tissue from different invertebrate species has been successfully added to PCR reactions without prior DNA extraction in so-called direct PCR (dPCR) approaches (e.g., flies and starfish:
To investigate the suitability of direct PCR protocols in insect community metabarcoding we compared purified DNA with direct application of homogenized tissue as a substrate for PCR and subsequent metabarcoding. We chose communities of the family Chironomidae, since they are a superabundant and very species rich group of merolimnic insects. Although chironomids are important components of biomonitoring programmes worldwide, their morphological identification to species level is very difficult. Therefore, they are suitable target organisms for developing efficient metabarcoding techniques. We subjected artificial (known species composition) and natural communities (highly variable species composition) to standard vs. dPCR metabarcoding using previously established cytochrome c oxidase I (COI) markers (
Chironomidae were retrieved from artificial ponds of the Eußerthal Ecosystem Research Station (EERES; 49°15′14″N, 7°57′42″E) near Landau, Germany, in 2019 and 2020. Adult specimens were collected from passive emergence traps once to twice a week during spring and summer. In 2020, the artificial ponds were simultaneously used to study the effect of the mosquito control agent Bacillus thuringiensis israelensis (Bti) on merolimnic insect communities (
Chironomid samples were sorted into four different size categories with known average weight per specimen (cf.
Size-sorting of natural chironomid samples led to cases with only one specimen per size group and sample (due to low sample size). All these single specimens were individually Sanger sequenced. We used a direct PCR approach following
We used 550 µl of the tissue-water mix to purify DNA from each sample with two technical replicates following an adapted high salt DNA extraction protocol after
To assess the detection rates and specimen-specific read abundances under different metabarcoding approaches (Experiments 1–3), we created artificial communities with known composition of different chironomid species. Tissue-water mixes of individual chironomid specimens were selected based on the specimens’ COI sequences. We aimed for taxonomically diverse artificial communities, while still being able to distinguish between specimens by their COI sequences targeted in the metabarcoding approach. We selected tissue-water mixes of 16 specimens from 2019 with at least 1.5% dissimilarity in the targeted region to create one artificial community (“Mock A”, Suppl. material
Schematic overview of the workflows during preparation of artificial chironomid communities. Two different sets of chironomid specimens (Mock A and Mock B) were analysed using different approaches of standard and direct PCR metabarcoding, to study the effect of direct PCR or DNA extraction protocols on detection rates and individual read abundances. Experimental setups included analysing the effects of variable mass per individual and different mock compositions (Experiment 1), variable or similar amounts of different input materials (Experiment 2), and variable concentrations of mock material to assess sensitivity (Experiment 3). Mocks were created by pipetting tissue-water mixes (Mocks A-1, A-2 and B-1), purified DNA extracts (Mock B-2) or PCR products (Mock B-3) of individual specimens. A subsample of Mocks A-1 (a, b) and A-2 (c, d) was used for sequential dilution (Mock A-3). Pie charts indicate if artificial communities were created under a size-sorting scenario (aiming for even masses per individual) or under a non-size-sorting scenario (where individual masses naturally vary). Purified DNA extracts were subjected to PCR, while non-purified tissue-water mixes were used in dPCR approaches. The labels ‘Figure
Mock A was used to assess the effect of size-sorting (prior to sample preparations) and two different metabarcoding approaches i.e., with and without DNA extraction, on read-abundance per species. Each of the 5 replicates of Mock A were created by pipetting either the same amount of tissue-water mix (34 µl per specimen, concentrations varying between 0.0001 and 0.0014 mg tissue per 1 µl) simulating a “without size-sorting” scenario (Mock A-1, n = 5) or an adapted amount of tissue-water mix (4–116 µl) with approx. 0.006 mg tissue per each of the 16 specimen (Mock A-2: “with size-sorting” scenario, n = 5). For DNA extraction of the resulting tissue-water mixes, we followed the approach described in section ‘DNA extraction’ (but using only 495 µl tissue-water mix, the rest was needed for direct PCR applications). We further ran a size-sorting plus dPCR approach with Mock B (Mock B-1; n = 3, see Experiment 2 for details) to test if results are consistent when different specimens are used.
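The two pipetting scenarios described above come down to simple arithmetic on the tissue concentration of each tissue-water mix. The following is a minimal sketch of that arithmetic, not the authors' protocol; the function names are hypothetical, and the example concentrations are merely values within the reported range.

```python
# Sketch of the two pooling scenarios for Mock A (hypothetical helper names).
# Values taken from the text: fixed volume of 34 µl per specimen (Mock A-1)
# and a target of approx. 0.006 mg tissue per specimen (Mock A-2).

def mass_in_fixed_volume(conc_mg_per_ul, volume_ul=34.0):
    """Tissue mass (mg) contributed by one specimen when a fixed volume is pipetted."""
    return conc_mg_per_ul * volume_ul


def volume_for_target_mass(conc_mg_per_ul, target_mass_mg=0.006):
    """Volume (µl) of tissue-water mix needed to contribute a fixed tissue mass."""
    if conc_mg_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_mass_mg / conc_mg_per_ul


if __name__ == "__main__":
    # Illustrative concentrations (mg tissue per µl), not measured values.
    for conc in (0.0001, 0.0005, 0.0014):
        print(
            f"conc={conc:.4f} mg/µl: "
            f"fixed 34 µl gives {mass_in_fixed_volume(conc):.4f} mg, "
            f"0.006 mg needs {volume_for_target_mass(conc):.1f} µl"
        )
```

With a fixed volume, the contributed mass scales directly with concentration, which is what makes Mock A-1 mimic unsorted samples; fixing the mass instead makes the volume inversely proportional to concentration.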
With Mock B we assessed the effect of DNA input variation, i.e., varying or equal amount per specimen of either total DNA (Mock B-2) or target fragments (Mock B-3), on read abundance per specimen. DNA was extracted and purified from the 13 individual specimens and quantified using a Qubit fluorometer (Qubit dsDNA HS Assay Kit, Thermo Fisher Scientific). Both equal (5 ng) and varying (0.7–5 ng) amounts of DNA per specimen were then pooled into artificial communities (Mock B-2; n = 4), the latter using each 1.2 µl of the purified DNA extract per specimen. In addition, direct PCR was performed with each of the specimens individually and target fragments were quantified using a Tapestation 4200 (D1000 DNA Screentape Analysis Kit; Agilent Technologies, Santa Clara, CA, USA). Consequently, we pooled either the same (14, 31, 50 or 70 ng) or varying (14–80 ng) amounts of target fragments per specimen into artificial communities (Mock B-3; n = 4) and subjected them to the second PCR for tagging and adding of Illumina adapters.
To assess sensitivity of the different methods, we determined specimen detection rates with different dilutions of the mock communities created for Experiment 1, but using only three of the five replicates per artificial mock (Mock A-3; n = 3). Subsamples of artificial communities were sequentially diluted 1:2 and PCR success was checked on a 1.5% TBE agarose gel. Due to limited sample capacity, five to six dilutions were chosen based on the quality of resulting bands to cover a representative range of dilutions that still yielded PCR success. Further, only one technical replicate was used. For mocks resulting from direct PCR, we used sequential dilutions of up to 1:32 (with prior size-sorting) or 1:64 (without prior size-sorting) of the original mock. For mocks based on purified DNA, we chose dilutions between 1:8 and 1:512 (with prior size-sorting) and between 1:16 and 1:1024 (without prior size-sorting) of the original mock.
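To make the dilution scheme concrete, the sketch below enumerates a sequential 1:2 series and the fraction of the original mock remaining at each step. It is an illustration only; the function names are hypothetical.

```python
# Sketch of a sequential 1:2 dilution series (hypothetical helper names).

def serial_dilution_factors(n_steps, fold=2):
    """Cumulative dilution factors after each of n_steps sequential dilutions."""
    return [fold ** step for step in range(1, n_steps + 1)]


def remaining_fraction(factor):
    """Fraction of the original template concentration left at a given dilution."""
    return 1.0 / factor


if __name__ == "__main__":
    for factor in serial_dilution_factors(10):
        print(f"1:{factor} -> {remaining_fraction(factor):.5f} of original")
```

Ten sequential 1:2 steps are needed to reach the highest dilution mentioned in the text (1:1024), at which point less than 0.1% of the original template concentration remains.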
To assess the applicability of the dPCR approach compared to common metabarcoding protocols on natural chironomid communities, we used a subsample (eight out of 12 artificial ponds, five out of 26 sampling dates, N = 40) of an ongoing ecotoxicological study. Half of the eight artificial ponds had been treated with the mosquito control agent Bti (for details see
We followed a two-step PCR metabarcoding approach using the primers BF2 & BR2 (5’-GCHCCHGAYATRGCHTTYCC-3’ & 5’-TCDGGRTGNCCRAARAAYCA-3’;
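The BF2 and BR2 primers contain IUPAC ambiguity codes, so each primer is in practice a pool of sequence variants. As a minimal sketch (the helper names are hypothetical; the primer sequences are those quoted above), the following counts and expands those variants using the standard IUPAC nucleotide code table.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

BF2 = "GCHCCHGAYATRGCHTTYCC"
BR2 = "TCDGGRTGNCCRAARAAYCA"


def degeneracy(primer):
    """Number of distinct unambiguous sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every unambiguous sequence encoded by a degenerate primer."""
    options = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*options)]


if __name__ == "__main__":
    print("BF2:", len(BF2), "nt,", degeneracy(BF2), "variants")
    print("BR2:", len(BR2), "nt,", degeneracy(BR2), "variants")
```

Both primers are 20 nt long; BF2 expands to 216 variants and BR2 to 192.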
Bioinformatic analysis was performed following the JAMP approach (https://github.com/VascoElbrecht/JAMP, package version 0.77). In short, raw data were demultiplexed and adapters were removed. Consequently, paired end-sequences were merged using usearch (https://drive5.com/usearch, version 11.0.667), then primers were removed using cutadapt (https://github.com/marcelm/cutadapt, version 3.5). After quality filtering, where sequences beyond the target length of 421 ± 10 bp (cutadapt) and more than 1 expected error (usearch) were discarded, operational taxonomic units (OTUs) were clustered with a 3% radius using usearch and vsearch (https://github.com/torognes/vsearch, version 2.21.1).
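The two quality filters can be illustrated with a short sketch. It is not the JAMP, cutadapt or usearch code; the function names are hypothetical, and it assumes the usual definition of expected errors as the sum of per-base error probabilities derived from Phred scores (p = 10^(-Q/10)).

```python
# Sketch of the length and expected-error filters (hypothetical helper names).
# Thresholds from the text: target length 421 ± 10 bp, at most 1 expected error.

def expected_errors(phred_scores):
    """Sum of per-base error probabilities, with p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def passes_filters(phred_scores, target_len=421, tolerance=10, max_ee=1.0):
    """True if a merged read is within the length window and error budget."""
    length_ok = abs(len(phred_scores) - target_len) <= tolerance
    return length_ok and expected_errors(phred_scores) <= max_ee


if __name__ == "__main__":
    good = [30] * 421   # Q30 throughout: about 0.42 expected errors
    noisy = [20] * 421  # Q20 throughout: about 4.2 expected errors
    long_read = [40] * 440  # high quality, but outside the length window
    print(passes_filters(good), passes_filters(noisy), passes_filters(long_read))
```

A read with uniformly high quality can still be discarded on length alone, and a read of the correct length can be discarded on accumulated error probability, so the two criteria act independently.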
In the case of Mock A, specimens were too similar in the target COI region to be identified by the OTU clustering approach, i.e., two specimens could be clustered into one OTU. Therefore, we used the dada2 pipeline (
Each of the 13 most abundant OTUs over all Mock B samples matched one of the Sanger sequences of the 13 individuals (100% identity). In each sample, 96.7 to 99.8% of the reads were assigned to the 13 OTUs; the remaining reads were mainly associated with spurious non-chironomid DNA. The exception was one sample in which one of the replicates showed contamination by another OTU (9.2% of reads), so that only 89.4% of the reads were assigned to the 13 OTUs. However, the number of reads per OTU was in the same range as in the other technical replicate, thus the replicate was not discarded. We calculated the mean number of reads (based on two technical replicates) for the 13 relevant OTUs; all other reads/OTUs were deleted.
The 160 OTUs resulting from bioinformatic processing were compared to the nt database of NCBI (last updated on 20.06.2022) using BLAST+ 2.12.0, and 41 putatively non-dipteran OTUs were discarded (i.e., 7 non-dipteran insects, 7 non-insect arthropods, 1 vertebrate, 1 parazoan, 13 non-matching and 12 non-animals, mainly fungi, bacteria and plants). Negative controls showed no sign of cross-contamination among samples, as the sum of reads per negative control was always low (< 112 reads, around 10 reads on average; all non-control samples contained > 1,500 reads, around 22,700 on average). However, we detected 209 low-read false positives in the negative control samples (95.7% of them with fewer than 10 reads), presumably derived from tag switching. Additionally, read abundances per OTU were generally in the same order of magnitude in each of the two technical replicates per sample (i.e., less than 1 order of magnitude apart in 97.8% of the comparisons). In 95.9% of cases where only one of the two technical replicates contained reads for an OTU, the read numbers were spurious (i.e., < 10). Therefore, for artificial communities, read abundances were calculated as the average of both technical replicates only when both technical replicates contained reads. To further prevent false positives that can be introduced in any step of the metabarcoding procedure (for example spurious contamination or tag-switching, cf.
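The replicate-averaging rule for artificial communities can be sketched as follows. This is an illustration rather than the authors' script: the function names and example counts are hypothetical, and the sketch assumes that an OTU with reads in only one technical replicate is assigned an abundance of zero.

```python
# Sketch of the replicate-averaging rule (hypothetical names and data).
# Assumption: an OTU present in only one technical replicate is set to 0.

def replicate_mean(reads_rep1, reads_rep2):
    """Mean read count, counted only when both technical replicates have reads."""
    if reads_rep1 > 0 and reads_rep2 > 0:
        return (reads_rep1 + reads_rep2) / 2
    return 0.0


def average_sample(replicate_counts):
    """Apply replicate_mean to a mapping of OTU name -> (rep1, rep2) read counts."""
    return {otu: replicate_mean(r1, r2) for otu, (r1, r2) in replicate_counts.items()}


if __name__ == "__main__":
    sample = {"OTU_1": (1500, 1700), "OTU_2": (4, 0), "OTU_3": (0, 0)}
    print(average_sample(sample))
```

In the example, OTU_2 has four reads in a single replicate and is therefore treated as absent, which mirrors the observation that such one-sided detections were mostly spurious.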
All statistical analyses were conducted in R version 4.2.0 (
Bioinformatic analysis resulted on average in 22,299 (± 4,068 SD) reads for Mock A samples (except one sample with 95,954 reads) and on average in 24,982 (± 3,803 SD) reads for Mock B samples (except two samples with 54,942 and 38,809 reads, respectively). The natural community samples contained on average 163 (± 48 SD) reads per individual (except two samples with 6 and 31 reads per individual, respectively, and one sample with 467 reads per individual).
In total, 13 out of 16 specimens of Mock A and all specimens of Mock B were reliably detected in each replicate of every method. Three specimens of Mock A, corresponding to OTU_J, OTU_K and OTU_P, were detected in 97.5, 95 and 95% of the samples, respectively. For these three OTUs we found no reads in one of the two technical replicates in one or two samples. Variation in read abundance was higher between taxa (OTUs) within mock communities than within taxa comparing both metabarcoding approaches (Fig.
Effect of size-sorting and tissue preparation on individual read abundance. A, C, E show read abundances (logarithmic scale) per operational taxonomic unit (OTU) resulting from two different metabarcoding approaches (orange – purified DNA extracts, blue – direct PCR) with varying (2A) or equal (2C, E) mass per specimen from two different mock communities (Mock A, n = 5; Mock B, n = 3). Dots represent read abundances; means and ranges are indicated by vertical lines. B, D, F show boxplots illustrating variation in read abundance within mocks and within OTUs. Variation in read abundance is calculated as standard deviation of the compared values. The lower and upper hinges of boxplots correspond to the first and third quartiles. The upper whisker extends from the hinge to the largest value no further than 1.5 times the inter-quartile range from the hinge. The lower whisker extends from the hinge to the smallest value at most 1.5 times the inter-quartile range from the hinge. Data beyond the end of the whiskers, i.e., outliers, are plotted individually.
When using the same amount of purified total DNA per specimen, read abundance varied significantly between the different OTUs (Kruskal-Wallis rank sum test, n = 4, p < 0.001; Fig.
Individual read abundances resulting from equal amounts of purified total DNA or target DNA fragments. Read abundance of specimens in artificial communities (n = 4) after pooling equimolar amounts of purified total DNA (orange) or equal amounts of target DNA fragments (grey) per specimen. Dots represent read abundances, means and ranges are indicated by vertical lines. Overall read abundance per method is illustrated by boxplots on the right. The lower and upper hinges of boxplots correspond to the first and third quartiles. The upper whisker extends from the hinge to the largest value no further than 1.5 times the inter-quartile range from the hinge. The lower whisker extends from the hinge to the smallest value at most 1.5 times the inter-quartile range from the hinge. Data beyond the end of the whiskers, i.e., outliers, are plotted individually.
Individual read abundances resulting from varying amounts of purified total DNA or target DNA fragments. Read abundance of specimens in artificial communities when using different amounts of purified DNA (orange colours represent four mock communities) or different amounts of target DNA fragments (grey, four mock communities) per specimen for community pools. DNA input units correspond to 0.05 ng of purified total DNA and 1 ng of target DNA fragments. Shown are regression lines and multiple R2 of the fitted linear regression models.
When diluting the purified DNA extracts or tissue-water mixes prior to PCR, the reliability of OTU detection dropped in all approaches (Fig.
Effect of DNA input dilutions on detection rates in artificial communities. Mean number of detected OTUs (± range, n = 3) in a chironomid mock community with 16 individuals, using different metabarcoding methods and dilutions (logarithmic scale) of purified DNA extracts (orange) or tissue-water mixes (blue). We tested both the common and the direct PCR approach with (triangles, dark colours) and without (squares, light colours) prior size-sorting.
Chironomid communities of eight different ponds in a mesocosm study were analysed using two different metabarcoding approaches, i.e., with and without DNA extraction. Ponds had been treated with Bti or left as a control, and chironomid emergence was sampled on 5 sampling dates over a period of 17 days. In total, we detected 114 chironomid OTUs, corresponding to 36 different genera. When each OTU was compared pairwise in each sample, the two metabarcoding approaches deviated in on average 8.3% (± 5.2 SD) of the OTUs per sample. In 6 out of 136 OTUs we saw a deviation between the two metabarcoding approaches in more than 4 samples (> 10%). These OTUs showed a relatively low mean read abundance (10 reads or fewer) compared to the overall mean read abundance (447 reads). Correspondence analysis showed high overlap in metabarcoding results from both approaches (Fig.
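The per-sample comparison between the two approaches can be sketched as a presence-absence mismatch rate. This is an illustration with hypothetical names and read counts; in particular, the denominator (OTUs detected by at least one approach in the sample) is an assumption, since the text does not state which set of OTUs the percentage refers to.

```python
# Sketch of a presence-absence comparison between two methods for one sample.
# Assumption: the percentage is taken over OTUs detected by at least one method.

def detected(read_counts):
    """Set of OTUs with at least one read."""
    return {otu for otu, reads in read_counts.items() if reads > 0}


def percent_deviating(extraction_counts, direct_pcr_counts):
    """Percentage of OTUs detected by exactly one of the two methods."""
    a = detected(extraction_counts)
    b = detected(direct_pcr_counts)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a ^ b) / len(union)


if __name__ == "__main__":
    extraction = {"OTU_1": 120, "OTU_2": 5, "OTU_3": 0}
    direct_pcr = {"OTU_1": 98, "OTU_2": 0, "OTU_4": 7}
    print(f"{percent_deviating(extraction, direct_pcr):.1f}% of OTUs deviate")
```

Because the comparison ignores read numbers, low-abundance OTUs that appear in only one approach contribute as much to the deviation as abundant ones.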
Results of the three-way PERMANOVA on natural communities. Assessed were the effects of Bti-treatment, sampling date, and DNA preparation method as well as their interactions on presence-absence chironomid community composition.
Source of variation | DF | sum of squares | F statistics | R² | p |
---|---|---|---|---|---|
Bti-Treatment | 1 | 0.870 | 3.730 | 0.046 | 0.001 *** |
Sampling Date | 4 | 2.534 | 2.715 | 0.135 | 0.001 *** |
DNA Preparation Method | 1 | 0.080 | 0.341 | 0.004 | 0.999 |
Bti-Treatment: Sampling Date | 4 | 1.177 | 1.261 | 0.063 | 0.058 |
Bti-Treatment: DNA Preparation Method | 1 | 0.004 | 0.018 | < 0.001 | 1 |
Sampling Date: DNA Preparation Method | 4 | 0.058 | 0.063 | 0.003 | 1 |
Bti-Treatment: Sampling Date: DNA Preparation Method | 4 | 0.007 | 0.007 | < 0.001 | 1 |
Residuals | 60 | 14.001 | | 0.747 | |
Total | 79 | 18.731 | | | |
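The F and R² columns of the table follow from the degrees of freedom and sums of squares. As a worked check (a sketch with hypothetical names, using the standard PERMANOVA definitions: pseudo-F as the term mean square divided by the residual mean square, and R² as the term sum of squares divided by the total), the values can be recomputed from the table:

```python
# Recompute pseudo-F and R² from the DF and sum-of-squares columns of the table.

TERMS = {
    "Bti-Treatment": (1, 0.870),
    "Sampling Date": (4, 2.534),
    "DNA Preparation Method": (1, 0.080),
    "Bti-Treatment: Sampling Date": (4, 1.177),
}
RESIDUAL_DF, RESIDUAL_SS = 60, 14.001
TOTAL_SS = 18.731


def pseudo_f(df, ss):
    """Term mean square divided by residual mean square."""
    return (ss / df) / (RESIDUAL_SS / RESIDUAL_DF)


def r_squared(ss):
    """Share of the total sum of squares attributed to a term."""
    return ss / TOTAL_SS


if __name__ == "__main__":
    for name, (df, ss) in TERMS.items():
        print(f"{name}: F = {pseudo_f(df, ss):.3f}, R2 = {r_squared(ss):.3f}")
    print(f"Residuals: R2 = {r_squared(RESIDUAL_SS):.3f}")
```

Small differences in the third decimal place relative to the table are expected because the sums of squares are themselves rounded.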
Correspondence analysis (CA) plot comparing standard versus dPCR metabarcoding approach for natural communities. Indicated are differences of chironomid communities from four Bti-treated (triangles, dark colours) and four control (squares, light colours) ponds of a mesocosm study over five sampling dates (N = 40). Two different metabarcoding approaches, i.e., with (orange) or without (blue) DNA extraction, were used for each community. Proportion explained per axis: CA1: 7.8%, CA2: 7.5%. Ellipsoids indicate standard deviations of points. Two very distant points are not visible in this figure.
In the current study, we showed for the first time that direct PCR protocols can be combined with insect community metabarcoding approaches. Metabarcoding of chironomid communities with and without DNA extraction produced highly comparable results. Both methods led to similar detection rates in artificially created communities with known chironomid composition. Furthermore, samples of natural communities from an ongoing mesocosm study (
We observed high similarity between the presence-absence chironomid community composition detected by metabarcoding with and without DNA extraction. We attribute the faster decrease of specimen detection rates in sequential dilutions of the dPCR approach, compared to the common approach, to the stochastic effect that arises when very small amounts of input material are pipetted (i.e., 1 µl of purified DNA extract or 5 µl of tissue-water mix). Stochasticity probably also explains the higher read abundance variation observed in the dPCR samples compared to the common approach. As DNA is purified and concentrated during DNA extraction, the probability of representing the whole community in a small amount of purified DNA extract is much higher than in a similar amount of tissue-water mix. This lower representativeness could indeed raise concern about the reliability of direct PCR metabarcoding when studying natural communities that are typically composed of few abundant and many rare species (e.g.,
While detection rates of the two approaches were largely similar, read abundance was influenced by the method used (Fig.
Investigating the reasons for differential individual read abundances, we found that adjusting the availability of purified DNA prior to PCR did not eliminate variation. At the same time, varying the availability of DNA target fragments after PCR had a much larger influence on read abundance (Figs
We tested the comparability of the two metabarcoding approaches using specimens from the family Chironomidae. However, dPCR protocols have been developed and successfully applied to several other invertebrate taxa, e.g., mosquitoes:
In conclusion, we showed that DNA extraction can be omitted in chironomid community metabarcoding while preserving the informative value of the presence-absence community composition. As the adoption of our proposed direct PCR protocols is relatively easy and inexpensive, direct PCR metabarcoding has the potential to become a standard procedure in chironomid community analysis. With modern PCR reagents being more robust to contamination by inhibitors, we assume that direct PCR metabarcoding, where DNA extraction is replaced by a purely mechanical tissue-breakdown step, is applicable to a wide range of arthropod taxa, and we encourage further comparative studies. The opportunity to avoid DNA extraction steps might even aid the ongoing efforts to miniaturize the instrumental requirements for PCR and metabarcoding (e.g., for application in the field or for live on-site monitoring). In addition, laboratories are increasingly automating their processes in order to reduce the costs of high-throughput metabarcoding. Eliminating the DNA extraction step entirely might contribute substantially to this development. In the end, faster and cheaper metabarcoding procedures will boost our ability to monitor the diversity of arthropod communities and appropriately target conservation actions to combat the ongoing global biodiversity loss.
The authors would like to cordially thank Sara Kolbenschlag and the Eußerthal Ecosystem Research Station Team for providing preserved chironomids. We are grateful to Sophie Stoll for fabulous work in the laboratory and to our many student helpers for size-sorting insects. We would like to thank Carola Greve, Damian Baranski and Alexander Ben Hamadou of TBG for laboratory support during the measurements of target fragment concentrations. Thanks to Pavel Bystřický for help in the lab during initial tests of the direct PCR approach. Special thanks to Thomas Mehner for constructive advice on earlier versions of the manuscript and continuous support throughout the writing process. We further thank Verena Gerstle, Sara Kolbenschlag, Sebastian Pietz, Alexis Roodt and Caroline Ganglo for a stimulating atmosphere while discussing scientific writing. Thank you very much to Dominik Buchner for his effort in providing a detailed and constructive review.
The authors declare they have no conflict of interest.
No ethical statement was reported.
This study was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Research Training Group SystemLink 326210499/GRK2360.
NR and KS conceived the ideas and designed methodology; NR collected the data; NR and KS analysed the data; NR led the writing of the manuscript. Both authors contributed critically to the drafts and gave final approval for publication.
Nina Röder https://orcid.org/0000-0002-9681-7538
Klaus Schwenk https://orcid.org/0000-0003-2427-4332
Raw sequences were deposited in GenBank SRA archive and are available with the BioProject accession number PRJNA989176. Data and R scripts are available from Zenodo https://doi.org/10.5281/zenodo.8074454 (
Overview of chironomid size classes
Data type: additional methodological information
Explanation note: Chironomid individuals were sorted into four different size categories according to their body length (from the anterior margin of the head between the antennae to the end of the posterior abdominal segment) and shape (thin – usually males or thick – usually females). They were counted and average weight per specimen was determined.
Composition of the two artificial chironomid communities
Data type: additional methodological information
Explanation note: Sanger sequences of specimens were compared to BOLD database.