Metabarcoding data allow for reliable biomass estimates in the most abundant animals on earth
expand article infoJanina Schenk, Stefan Geisen§, Nils Kleinboelting, Walter Traunspurger
‡ Bielefeld University, Bielefeld, Germany
§ Wageningen University and Research, Wageningen, Netherlands
Open Access


Microscopic organisms are the dominant and most diverse organisms on Earth. Nematodes, as part of this microscopic diversity, are by far the most abundant animals and their diversity is equally high. Molecular metabarcoding is often applied to study the diversity of microorganisms, but has yet to become the standard to determine nematode communities. As such, the information metabarcoding provides, such as in terms of species coverage, taxonomic resolution and especially if sequence reads can be linked to the abundance or biomass of nematodes in a sample, has yet to be determined. Here, we applied metabarcoding using three primer sets located within ribosomal rRNA gene regions to target assembled mock-communities consisting of 18 different nematode species that we established in 9 different compositions. We determined abundances and biomass of all species added to examine if relative sequence abundance or biomass can be linked to relative sequence reads. We found that nematode communities are not equally represented by the three different primer sets and we found that relative read abundances almost perfectly correlated positively with relative species biomass for two of the primer sets. This strong biomass-read number correlation suggests that metabarcoding reads can reveal biomass information even amongst more complex nematode communities as present in the environment and possibly can be transferred to better study other groups of organisms. This biomass-read link is of particular importance for more reliably assessing nutrient flow through food-webs, as well as adjusting biogeochemical models through user-friendly and easily obtainable metabarcoding data.

Key Words

Nematodes, quantification, community analysis, Illumina MiSeq, species diversity


Over the last centuries, researchers aimed at capturing the planet’s biodiversity that consists of about 1.9 million described species – a fraction compared to the diversity of undescribed species (Hajibabaei et al. 2011). Soil and sediments host a huge part of biodiversity and organism biomass on earth (Bar-On et al. 2018). Yet, much of this biodiversity, especially of microorganisms (e.g. bacteria, protists) and microscopic animals (e.g. rotifers, nematodes) remains undescribed and estimates on their numbers and biomass vary profoundly (Fierer et al. 2017; Bar-On et al. 2018). This is due to the fact that their identification is difficult, time-consuming and requires expert-knowledge (Coissac et al. 2012; Jörger et al. 2012). Amongst microscopic animals, nematodes are by far the most abundant (Traunspurger et al. 2012; van den Hoogen et al. 2019). In addition, the functional role of nematodes is diverse as they provide key ecosystem functioning, such as an important link in the food web between bacteria and larger organisms, as well as nutrient cycling (Ruiter et al. 1995; Traunspurger et al. 1997; Schmid and Schmid-Araya 2002).

Nematode communities (and those resulting from nematode abundance and diversity estimates) are characterised by morphological taxon identification using microscopy. This is in contrast to comparable studies focusing on microorganisms, which are entirely based on molecular approaches, particularly metabarcoding (Knight et al. 2018; Nilsson et al. 2019). Assessing the community structure of most animal groups, including nematodes using metabarcoding, is promising but not fully developed (Porazinska et al. 2009; Darby et al. 2013; Holovachov et al. 2017; Griffiths et al. 2018; Pafčo et al. 2018; Treonis et al. 2018; Weigand and Macher 2019). Differences in copy-numbers of targeted genes and differential taxon amplification, due to imperfect primer matches and variation in amplicon sizes between targeted taxa, are amongst the reasons that artificially change the observed community composition when metabarcoding is used (Prokopowich et al. 2003; Kembel et al. 2012; Bik et al. 2013; Darby et al. 2013). Taxon-assignments also suffer from inaccurate or incomplete reference databases (Cowart et al. 2015; Piñol et al. 2015; Leese et al. 2016). Amongst the only marker genes that are well covered for most microorganisms and nematodes are ribosomal genes (Peham et al. 2017). Currently 27,287 18S and 21,362 28S rRNA gene reads are available for nematodes in NCBI GenBank (Benson et al. 2013; Nov 2019).

Several studies investigated the performance of metabarcoding for community analyses, but mostly focusing on samples without an a priori knowledge of species composition initially present in the sample as reported for ciliates and microbial communities (Logares et al. 2014; Dong et al. 2017; Pitsch et al. 2019). Studies testing metabarcoding on nematode mock communities – communities with a known composition, often used a limited number of species (n < 10) or a single primer-pair for amplification (Porazinska et al. 2010; Darby et al. 2013; Macheriotou et al. 2019; Waeyenberge et al. 2019). These studies identified a potential link between sequence number and nematode abundance, but no satisfying consensus was found. It remains to be determined if other parameters, such as biomass, might be better linked to read abundance as shown for copepods (Hirai et al. 2015; Clarke et al. 2017), while others failed to find this link (Harvey et al. 2017). Most of the studies listed above were based on family-level, while an accurate inspection at higher taxonomic resolution is missing. A reliable approach to infer biomass data is, however, needed to incorporate into food-web models and implement in elemental flow measurements (Bittleston et al. 2015; Clarke et al. 2017). A positive link of relative read abundance with biomass, with larger, normally less abundant nematodes having amongst the highest read numbers, was recently suggested in a pan-European field survey of soil nematode communities using a 18S rDNA gene region (Wilschut et al. 2019).

In this study, we used mock communities, consisting of 18 different nematode species and applied Illumina MiSeq sequencing, targeting three commonly used barcoding regions within the rRNA gene region. We changed the composition of those species, such as by adding them in equal abundances, by compositions as found in nature and by replacing larger adult with smaller juvenile specimens to assess and calibrate abundance depiction efficiency. We hypothesised (1) that relative taxon composition and revealed diversity depend on the primer pair, due to differences in resolution and PCR-induced differential amplification, especially for communities. Additionally, we hypothesised (2) that relative read abundance is best reflected by relative biomass rather than abundance data due to the differences in ribosomal copy number variation and the resulting increases in barcoding gene numbers.

Material and methods

Nematode culturing and community design

Nematode species were raised on cultured agar plates (1.7%) dosed with cholesterol and E. coli and synchronised regarding life stages. In total, 18 species were used and organisms were individually removed with needles from plates and rinsed in a water drop in order to remove excess agar parts, bacteria and fungi. Organisms were then transferred into an Eppendorf tube containing lysis buffer (Machery & Nagel, Nucleo Spin Tissue XS Kit), according to predefined mock-community ratios (Table 1). Each tube contained 198–200 nematodes from the 18 species, resulting in 9 different mock-communities with 3 replicates for each mock-community. Mock-communities represented ratios that are found in nature (Peters and Traunspurger 2005; Kazemi-Dinan et al. 2014), as well as evenly distributed ratios in order to examine sequencing success. Subsequently, 8 µl of proteinase-K was added and samples were lysed at 56 °C within a rocking water bath for 8 hours. Subsequently, DNA was extracted, according to the NucleoSpin Tissue Kit XS protocol (Macherey & Nagel, Hilden). Prior to amplification, DNA concentration was checked with PicoGreen. Samples were amplified with three different primer pairs that had been commonly used for metabarcoding of microscopic animals previously. The 18S rRNA gene was amplified with F04/R22 (5′-GCTTGTCTCAAAGATTAAGCC-3′/5′-GCCTGCTGCCTTCCTTGGA-3′ (Fonseca et al. 2010), amplifying a ~350 bp fragment of the V1-V2 region and 3NDf/C_1132f (5’GGCAAGTCTGGTGCCAG 3’/5’TCCGTCAATTYCTTTAAGT 3’), amplifying a ~530 bp fragment of the V4 region (Geisen et al. 2018). A ~520 bp fragment of the D3-D5 region of the 28S rRNA gene was amplified with the primer pair 1274/706 (5’–GACCCGTCTTGAAACACGGA-3’/5’-GCCAGTTCTGCTTACC-3‘) introduced by Markmann and Tautz (2005), which has proven to work extremely well for freshwater nematodes (Ristau et al. 2013; Schenk et al. 2016). Primers pairs are subsequently simplified to 18S_V1 (F04/R22), 18S_V4 (3NDf/C_1132f) and 28S_D3 (1274/706). PCR conditions were 30 cycles with 60 s pre-denaturation at 96 °C, followed by 15 s at 96 °C, 30 s at 58 °C and 90 s at 70 °C. In a second PCR with the same conditions, Illumina indices were attached in 10 cycles. Metabarcoding was carried out at LGC Genomics (Berlin) on an Illumina MiSeq (V3 2 × 300 bp). About 20 ng of each sample were pooled for sequencing. Samples were delivered demultiplexed.

Mock-community composition of 18 nematode species with a total of 198–200 individuals present in each assembled community. Nine different communities were created, all with at least one of each species present. One species (Rhomborhabditis regina) was added either as adults or juveniles, while only adults of all other species were added. Furthermore, the wet weight (ww) of the 18 species is given in µg, together with the range of biomass. Each sample was replicated 3 times, resulting in 27 mock-communities.

Species Number of individuals
Sample Biomass ww Biomass range
1 2 3 4 5 6 7 8 9 (µg) (µg)
Rhomborhabditis regina adult (Schulte & Poinar, 1991) 11 0 6 2 5 10 5 10 2 20.29 18.82–34.59
Rhomborhabditis regina juvenile 0 11 6 0 0 0 0 0 0 6.09 3.25–16.24
Caenorhabditis elegans Maupas 1900 11 11 11 166 41 15 115 70 21 3.74 2.22–5.59
Panagrellus redivivus (Goodey, 1943) 11 11 11 2 4 10 5 10 23 3.83 3.44–5.21
Pristionchus pacificus Sommer et al. 1996 11 11 11 2 5 10 5 10 13 3.10 2.62–3.83
Pristionchus entomophagus (Steiner, 1929) 11 11 11 2 15 15 5 10 7 4.02 3.45–4.71
Plectus velox Bastian, 1865 11 11 11 2 5 10 5 5 2 4.62 3.39–7.79
Plectus cf. acuminatus Bastian, 1865 11 11 11 2 3 10 5 5 3 1.98 1.26–3.16
Plectus aquatilis Andrassy, 1985 11 11 11 2 2 5 5 5 2 1.50 1.15–2.06
Paroigolaimella bernensis (Steiner, 1914) 11 11 11 2 20 15 5 5 32 1.37 1.16–1-67
Panagrolaimus thienemanni Hirschmann, 1952 11 11 11 2 5 5 5 10 10 0.31 0.28–0.53
Acrostichus sp. 11 11 11 2 5 10 5 5 2 0.79 0.45–1.18
Acrostichus nudicapitatus (Steiner, 1914) 11 11 11 2 5 25 5 5 2 0.92 0.68–1.44
Poikilolaimus regenfussi Sudhaus, 1980 11 11 11 2 15 5 5 10 11 0.65 0.44–0.91
Poikilolaimus oxycerca (de Man, 1895) 11 11 11 2 5 5 5 5 2 0.73 0.70–0.98
Acrobeloides tricornus (Thorne, 1925) 11 11 11 2 25 10 5 10 9 0.82 0.73–1.13
Acrobeloides cf. nanus (de Man,1880) 11 11 11 2 10 15 5 10 23 0.77 0.62–1.04
Diploscapter coronatus (Cobb, 1893) 11 11 11 2 5 15 5 5 2 0.24 0.20–0.38
Aphelenchoides parietinus Steiner, 1932 11 11 11 2 25 10 5 10 34 0.14 0.09–0.22
Sum 198 198 199 200 200 200 200 200 200

Reference sequences

Individual species were amplified, Sanger-sequenced and used as reference in the bioinformatic pipeline (see below). PCR conditions followed Schenk et al. (2016). Sequences are deposited at NCBI GenBank under the Accession numbers: MK543220MK543233, MK541669MK541684, MK541653MK541668. The closely related species Acrobeloides tricornus and Acrobeloides cf. nanus, Plectus aquatilis and Plectus cf. acuminatus and Plectus velox, as well as Acrostichus nudicapitatus and Acrostichus sp. could not be distinguished, based on their sequences for the 18S rRNA gene regions. For the 28S rRNA gene region, Plectus aquatilis and Plectus cf. acuminatus, as well as Acrostichus nudicapitatus and Acrostichus sp. could not be distinguished. However, as natural communities contain closely related species as well, this was also maintained in our study. For two species, Panagrolaimus thienemanni and Aphelenchoides parietinus, no Sanger reference sequence could be generated for any of the markers (Suppl. material 1: Table S1). For the latter, NCBI sequences were available for the 28S rDNA gene region (MF325173.1, MF325174.1)

Bodyweight correlation

Mean biomass (wet weight) of every species was calculated following (Andrássy 1956) for several individuals (n = 10–15 adults) that were heat-fixated and measured (Table 1). Furthermore, the proportion of generated reads for 18S_V4 and 28S_D3 was plotted against the relative biomass and the relative abundance together with linear regression in MatLab (MATLAB User’s Guide 1998). The correlation for absolute biomass and absolute abundance against the read numbers are given in Suppl. material 2: Fig. S1.

Bioinformatic analysis

Except for the taxonomic classification and primer removal, the MiSeq standard operation procedure using mothur (Kozich et al. 2013; Schloss et al. 2009) was used. Demultiplexed reads were combined to reach a longer paired-end read for higher phylogenetic resolution by using the make.contigs function of mothur with default settings, thereby correcting sequencing errors in the overlapping region. Primer sequences were then removed from the combined reads using cutadapt with a default error rate of 0.1 (Martin 2011). Reads with ambiguous bases, homopolymers larger than 10 bases and of unexpected short or long length (allowed range: 333–367 for 18S_V1, 514–597 for 18S_V4 and 471–516 for 28S_D3) were removed from the dataset. An overview about distribution of read numbers in bioinformatics is given in Suppl. material 1: Table S2. The remaining sequences were aligned with the SILVA reference (release 132) alignment (Martin 2011) to determine the spanned 18S or 28S rRNA gene region within the alignment in order to remove sequences not spanning this region and to remove overhangs outside of this region as well. The sequences were then clustered with approximately 99% identity which corresponds to a maximum difference of 4 (18S_V1) or 5 (18S_V4 or 28S_D3) bases. An alternative clustering-free approach (using DADA2) by Callahan et al. (2016) was also tested, but as extremely similar results were obtained, this is not discussed further (Suppl. material 2: Fig. S2). Chimeras were then removed using UCHIME (Edgar et al. 2011) with a de novo constructed reference database. All these filter steps reduced the dataset from 448,687 to 404,187 (for 18S_V1), from 315,232 to 241,822 (18S_V4) and from 453,691 to 264,010 (28S_D3). The median of read length was 532-bp for 28S_D3, 583-bp for 18S_V4 and 405-bp for 18S_V1. OTUs were taxonomically classified based on top BLAST hits in the NCBI nt database using an identity cutoff of 97%. Contradicting equal scores were not encountered and OTUs represented by less than 10 reads were discarded. Besides reference sequences obtained from GenBank, Sanger-sequenced reference sequences of the mock-community species were added to support taxonomic classification of the taxa used in our mock-communities (via BLAST with 97% identity cutoff as well). These reference sequences were used instead of the NCBI nt based classification, when they had a higher similarity than the hits in the NCBI nt database or when the hits in the NCBI database were poorly annotated and contained the keywords “uncultured”, “environmental” or “unidentified”. Raw reads were deposited under PRJNA513984.

Community structure & Statistical community analysis

For each primer pair, read numbers for each species in each replicate were averaged to decrease variation and communities were compared to relative abundance and relative biomass, based on the number of inoculated specimens. For statistical analyses of communities, Bray Curtis similarity was applied with non-transformed data. Non-metric multidimensional scaling (NMDS) plots were created with MathLab (MATLAB User’s Guide 1998), while Permutational multivariate analysis of variance (PERMANOVA) was conducted with PRIMER-E v6 (Clarke and Gorley 2006). PERMANOVA was computed with 9,999 permutations and pairwise comparisons amongst relative abundance, biomass and the read proportions for the three primer pairs. The samples were used as the nested factor in the treatments (e.g. abundance, biomass or primer used). A list of all pairwise comparisons is given in Suppl. material 1: Table S3. The effect of amplicon length and mismatches on sequencing success was analysed with Analysis of Variance (ANOVA) in R (R Core Team 2013) in order to check if there was an significant effect by the two factors on the sequencing success. Sequencing success was defined as expected to encountered sequencing ratios based on biomass proportions.


Amplification success

The three primer pairs had a different amplification success (Table 2) with the 28S_D3 marker resulting in the highest recovery rate of species present in the mock-communities (n = 14), followed by 18S_V4 (n = 12), while the 18S_V1 returned the lowest OTU number (n = 10). Different amplification success could partly be attributed to mismatches in the primer regions, such as for Diploscapter coronatus, which could not be amplified within the metabarcoding approach for any of the markers, although it could be amplified with Sanger-sequencing (Suppl. material 1: Tables S1, S4). ANOVA showed that variations in length of the amplified region of different nematode taxa significantly affected amplification success of 18S_V4 (p < 0.001), the primer pair that differed most in length between nematode taxa (shortest amplicon of 492 bp in 531 bp). For the other two markers, amplicon length (28S_D3: p = 0.41, 18S_V4: p = 0.21) and mismatches (28S_D3: p = 0.51, 18S_V4: p = 0.6) did not significantly affect amplification success (28S_D3: 512 bp – 525 bp, 18S_V1: 333 bp – 364 bp, Suppl. material 1: Table S5). We focused our analyses on all sequences assigned to the nematode taxa that were added, as contamination with non-target sequences was low (0.5–2.1%, Table 2).

Species recovered by the three primer pairs. Given are the species added in the mock-communities and the performance of each marker. An “x” indicates that the species was found, an “n” means the species was missing and an (x) shows that the nucleotide sequence of this species was identical to another species, indicated in bold and, therefore, could not be experimentally verified at the species level in the metabarcoding approach. Furthermore, the total number of OTUs that could be recovered from the mock community species, including species with identical sequences, as well as without genetically indistinguishable species (in parentheses), are given, together with the average read length (bp) and the contamination (in %).

Species 18S_V1 18S_V4 28S_D3
Rhomborhabditis regina x x x
Panagrellus redivivus x x
Caenorhabditis elegans x x x
Pristionchus pacificus x x x
Pristionchus entomophagus x x x
Aphelenchoides parietinus x x
Acrobeloides cf. nanus x x x
Acrobeloides tricornus (x) (x) x
Panagrolaimus thienemanni x x
Poikilolaimus cf. regenfussi x x x
Poikilolaimus oxycerca x x
Paroigolaimella bernensis x x x
Acrostichus sp. x x
Acrostichus nudicapitatus (x) (x)
Diploscapter coronatus
Plectus aquatilis x x x
Plectus cf. acuminatus (x) (x) (x)
Plectus velox (x) (x) x
Number of OTUs 13(10) 16 (12) 16 (14)
Avg. read length (bp) 405 583 532
Avg. contamination (%) 0.53 0.81 2.14

Community structure

The community structure differed between the primer pairs, with 28S_D3 resulting in the most similar community depiction to the initial inoculated communities, while depiction for the 18S_V1 primer pair substantially differed. Nematode communities amplified with the 28S_D3 primer pair grouped together with nematode mock communities presented as biomass data in the NMDS plot (Fig. 1a), which was also supported by pairwise comparisons between 28S_D3 and biomass showing similar patterns (t-value = 0.8193, p = 0.5293). In contrast, nematode abundances differed from the community composition as depicted by the primer pair 28S_D3 (Fig. 1a) and supported statistically (t-value = 2.2339, p = 0.0076). The communities for the 18S_V4 primer pair were different from mock-community data presented as biomass (t-value = 1.9353, p = 0.0279) and even more to abundance (t-value = 3.6898, p < 0.001, Fig. 1b). Communities revealed with the primer pair 18S_V1 were most different to biomass and abundance (t-value = 5.9545, p < 0.001 and 6.7753, p < 0.001, Fig. 1c) and were therefore not evaluated further. The community structure of the inoculated communities differed when represented as biomass or abundance (t-value = 2.281, p = 0.0068). Primer pairs 28S_D3 and 18S_V4 revealed nematode communities in a similar way (t-value = 1.278, p = 0.1638). In contrast, all communities amplified with the primer pair 18S_V1 were strikingly different, being characterised by a clear dominance of few species, for example, Rhomborhabditis regina and Plectus aquatilis (Suppl. material 2: Fig. S3).

Figure 1.

NMDS plots for a) the 28S_D3 region, b) the 18S_V4 region and c) the 19S_V1 region depicting the 9 samples. Each sample is shown with three replicates, together with the relative proportion of biomass (black) and the relative abundance of the original community (cyan).


Relative biomass positively correlated with relative read numbers for the primer pair 28S_D3 (R2 = 0.90539, p < 0.001) and 18S_V4 (R2 = 0.81396, p < 0.001, Fig. 2 a–b). At the level of individual mock-communities, especially samples with dominant species were often positively correlated with relative biomass proportions for the 28S_D3 and 18S_V4 primer pair, for example, Sample 4 (28S_D3: R2 = 0.9995, p < 0.001; 18S_V4:R2 = 0.9656, p < 0.001). The same was found for Sample 7 (28S_D3: R2 = 0.9885, p < 0.001 18S_V4: R2 = 0.923, p < 0.001) or Sample 9 (28S_D3: R2 = 0.9162, p < 0.001; 18S_V4: R2 = 0.3069, p = 0.02597). In general, species with a high contribution to the total biomass resulted in the highest read proportions, while species with a low biomass were clearly less amplified. The biomass of juvenile and adult specimens differed by a factor of 5.00 (Suppl. material 1: Table S6 and Suppl. material 2: Fig. S4), while the amplification of different life stages of R. regina also resulted in varying read proportions, with the adult nematodes generating significantly more reads than the juvenile nematodes (Wilcoxon-Mann-Whitney test, W = 81, p < 0.001). For the 28S_D3 primer pair, 17.14 times more reads were generated, while the 18S_V4 primer pair resulted in 4.65 times more reads and 5.81 times

Relative species abundances were positively correlating with read numbers for the primer pairs 28S_D3 (R2 = 0.53694, p < 0.001) and 18S_V4 (R2 = 0.33077, p < 0.001), but less strong than for relative biomass-read number correlations (Fig. 2c–d). Correlations of absolute biomass and absolute abundance against absolute read counts showed a similar pattern, but with overall weaker correlation strengths (Suppl. material 2: Fig. S1).

Figure 2.

Correlations of relative biomass and relative abundance against relative read proportions. a–b Correlation of 239 relative biomass proportions in μg (x-axis) against the relative number of generated reads (y-axis). Given is the slope of the 240 correlation and the according R2-value for the 28S_D3 and the 18S_V4 marker. c–d Correlation of relative abundance 241 proportions in μg (x-axis) against the relative number of generated reads (y-axis). Given is the slope of the correlation and 242 the according R2-value for the 28S_D3 and the 18S_V4 marker. 24


In this study, we show according to our hypothesis (1) that taxon composition and revealed diversity depend on the applied primer pairs. As furthermore hypothesised, we could show (2) that the relative sequence abundance strongly correlates with the relative taxon biomass for two of the three primer pairs tested.

Resolution and species recovery

A total of 78% of species diversity present in the samples was recovered for the 28S_D3 primer pair, while, for the other primers, resolution was lower (18S_V1:55%; 18S_V4: 67%). A higher diversity was recovered considering indistinguishable taxa (28S_D3 and 18S_V4: 89%; 18S_V1: 78%). Those that could not be distinguished shared the same marker gene sequence and, as such, cannot be distinguished, based on sequencing (Table 2). The rRNA genes were recently suggested to be incapable of identifying many nematode taxa to species and genus level resolution was consequently suggested as a reliable lower-resolution alternative (Creer et al. 2016; Sahraean et al. 2017). It should, therefore, be considered in metabarcoding studies of unknown species composition, when the full species diversity is aimed at identification in a sample. However, species that are closely related and, as such, share the same genetic sequence commonly can share similar morphological, physiological and functional traits (Potapov et al. 2019). For example, nematode species have proven to be equally sensitive to pollutants within the same genus (Höss et al. 2017). Therefore, the relative-read abundance links at lower taxonomic resolution, which likely links to functional information, still holds. From a total of 18 species used, one species could not be recovered with any of the primer-pairs and other species such as Aphelenchoides parietinus were under-represented in relative sequence reads. This can be mainly explained by mismatches present in primer sites. Accordingly, a 10 fold drop in amplification efficiency was reported to be caused by only one nucleotide mismatch in the priming region (Piñol et al. 2015). Due to amplicon length variations, Panagrellus redivivus (531 bp) might not be amplified with 18S_V4, as the primer length is at the maximum of the capacity our Illumina approach can sequence (2× 300 bp). This is supported by the fact that P. redivivus was found in unmerged forward and reverse reads (Suppl. material 1: Table S7). Furthermore, we demonstrate that longer amplicon sizes negatively influenced amplification success for the 18S_V4 marker (Suppl. material 1: Table S4).

Community structure

The community depiction for this study varied due to primer-induced differences. The 18S_V1 marker was, in our case, not suitable for distinguishing communities due to a low species recovery and an extreme over-amplification of mainly two species (Rhomborhabditis regina and Plectus aquatilis) and is, therefore, not discussed further. In turn and in line with most accurate diversity representations, the 28S_D3 primer pair could reliably distinguish between communities, followed by the 18S_V4 marker. The difference for the V4-marker is likely introduced by higher length-variation in this marker (Suppl. material 1: Table S4). Especially, communities with dominant taxa were depicted accurately and uncertainties were mostly detected from less abundant taxa for both primer pairs. Many metabarcoding studies reported a problem in detecting rare taxa, which can be masked by the more abundant taxa during PCR or sequencing (Evans et al. 2016) or even filtered out during bioinformatics analyses due to too low read abundances (Elbrecht et al. 2018).


Relative biomass and read proportions for almost all mock-community species at the species level were not different for the 28S_D3 primer pair, suggesting that relative biomass can reliably be depicted by relative read numbers obtained by our nematode metabarcoding approach. In support, species that were missing or amplified at low proportions for the 28S_D3 and 18S_V4 markers in the dataset were those that also had a low overall biomass (Panagrolaimus thienemanni and Diploscapter coronatus, Table 1). These results imply that upscaling with more taxa and more complex communities might be possible. This study covered many of the extremely ubiquitous Rhabditiade, as well as other members of the nematode taxa as presented by van Megan et al. (2007), but more detailed studies using a wider phylogenetic range of nematode taxa are needed to confirm these findings across a wider species-range within Nematoda. If relative sequencing data actually represents relative biomass data, the information provided by metabarcoding might be implemented in studies investigating food-web structures (Sechi et al. 2018). This relationship, although by far less strong, has been observed for nematodes recently (Wilschut et al. 2019), but was limited to a 18S rDNA gene region. The here observed biomass-read number link was further confirmed by differences in read numbers between juvenile and adult specimen of the same species, such as shown before in fishes (Maruyama et al. 2014). As many ecosystem functions rely on biomass estimations, such as production, our results reveal that metabarcoding can be used to assess biomass distribution of previously hardly studied organisms, such as minute organisms living in soils and sediments. While we here show this potential for nematodes, we are convinced that this approach can be applied and provide biomass estimates also for other groups of organisms. Thorough calibration efforts are now needed to reliably link diversity and (relative) biomass of taxa in a sample.


We thank Steffi Gehner for raising nematodes on cultured plates at synchronised life stages and thank Professor Oliver Krüger for proving laboratory space for the molecular work. We also thank the German Federal Institute of Hydrology (BfG) for funding our research. Bioinformatic work was furthermore supported in parts by grants of the German Federal Ministry of Education and Research (BMBF) of the project “Bielefeld-Gießen Center for Microbial Bioinformatics – BiGi” (Grant-number 031A533) within the German Network for Bioinformatics Infrastructure (de.NBI). S. G. was supported by the NWO-VENI grant from the Netherlands Organisation for Scientific Research (016.Veni.181.078). We acknowledge the constructive comments by one anonymous referee and the editor.


  • Andrassy I (1956) Die Rauminhalts-und gewichtsbestimmung der Fadenwürmer (Nematoden). Acta Zoologica Hungarica 2(1): 1–5.
  • Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman D J, Ostell J, Sayers EW (2013) Genbank. Nucleic Acids Research 41: D36–D42.
  • Bittleston LS, Baker CCM, Strominger LB, Pringle A, Pierce NE (2015) Metabarcoding as a tool for investigating arthropod diversity in Nepenthes pitcher plants. Austral Ecology 41(2): 120–132.
  • Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP (2016) DADA2: High-resolution sample inference from Illumina amplicon data. Nature Methods 13: 581–583.
  • Clarke KR, Gorley RN (2006) PRIMER v6 User Manual and Program. PRIMER-E Ltd, Plymouth, UK.
  • Clarke LJ, Beard JM, Swadling KM, Deagle BE (2017) Effect of marker choice and thermal cycling protocol on zooplankton DNA metabarcoding studies. Ecology and Evolution 7(3): 873–883.
  • Cowart DA, Pinheiro M, Mouchel O, Maguer M, Grall J, Miné J, Arnaud-Haond S (2015) Metabarcoding is powerful yet still blind: a comparative analysis of morphological and molecular surveys of seagrass communities. PloS ONE 10(2): e0117562.
  • Creer S, Deiner K, Frey S, Porazinska D, Taberlet P, Thomas WK, Potter C, Bik HM, Freckleton R (2016) The ecologist’s field guide to sequence-based identification of biodiversity. Methods in Ecology and Evolution 7(9): 1008–1018.
  • Darby BJ, Todd TC, Herman MA (2013) High-throughput amplicon sequencing of rRNA genes requires a copy number correction to accurately reflect the effects of management practices on soil nematode community structure. Molecular Ecology 22(21): 5456–5471.
  • Dong X, Kleiner M, Sharp CE, Thorson E, Li C, Liu D, Strous M (2017) Fast and simple analysis of MiSeq amplicon sequencing data with MetaAmp. Frontiers in Microbiology 8: 1461.
  • Elbrecht V, Vamos EE, Steinke D, Leese F (2018) Estimating intraspecific genetic diversity from community DNA metabarcoding data. PeerJ 6 (8): e4644.
  • Evans NT, Olds BP, Renshaw MA, Turner CR, Li Y, Jerde CL, Mahon AR, Pfrender ME, Lamberti GA, Lodge DM (2016) Quantification of mesocosm fish and amphibian species diversity via environmental DNA metabarcoding. Molecular Ecology Resources 16(1): 29–41.
  • Fonseca VG, Carvalho GR, Sung W, Johnson HF, Power DM, Neill SP, Packer M, Blaxter ML, Lambshead P John D, Thomas WK, Creer S (2010) Second-generation environmental sequencing unmasks marine metazoan biodiversity. Nature Communications 1: 98.
  • Geisen S, Snoek LB, Hooven FC ten, Duyts H, Kostenko O, Bloem J, Martens H, Quist CW, Helder JA, van der Putten WH (2018) Integrating quantitative morphological and qualitative molecular methods to analyse soil nematode community responses to plant range expansion. Methods in Ecology and Evolution 9(6): 1366–1378.
  • Griffiths BS, Groot GA de, Laros I, Stone D, Geisen S (2018) The need for standardisation: Exemplified by a description of the diversity, community structure and ecological indices of soil nematodes. Ecological Indicators 87: 43–46.
  • Hajibabaei M, Shokralla S, Zhou X, Gregory ACS, Baird DJ (2011) Environmental barcoding: a next-generation sequencing approach for biomonitoring applications using river benthos. PloS ONE 6(4): e17497.
  • Harvey JBJ, Johnson SB, Fisher JL, Peterson WT, Vrijenhoek RC (2017) Comparison of morphological and next generation DNA sequencing methods for assessing zooplankton assemblages. Journal of Experimental Marine Biology and Ecology 487: 113–126.
  • Hirai J, Kuriyama M, Ichikawa T, Hidaka K, Tsuda A (2015) A metagenetic approach for revealing community structure of marine planktonic copepods. Molecular Ecology Resources 15(1): 68–80.
  • Holovachov O, Haenel Q, Bourlat SJ, Jondelius U (2017) Taxonomy assignment approach determines the efficiency of identification of OTUs in marine nematodes. Royal Society Open Science 4(8): 170315.
  • Höss S, Heininger P, Claus E, Möhlenkamp C, Brinke M, Traunspurger W (2017) Validating the NemaSPEAR[%]-index for assessing sediment quality regarding chemical-induced effects on benthic communities in rivers. Ecological Indicators. 73: 52–60.
  • Jörger KM, Norenburg JL, Wilson NG, Schrödl M (2012) Barcoding against a paradox? Combined molecular species delineations reveal multiple cryptic lineages in elusive meiofaunal sea slugs. BMC Evolutionary Biology 12(245): 1–18.
  • Kazemi-Dinan A, Schroeder F, Peters L, Majdi N, Traunspurger W (2014) The effect of trophic state and depth on periphytic nematode communities in lakes. Limnologica 44: 49–57.
  • Kembel SW, Wu M, Eisen JA, Green JL (2012) Incorporating 16S gene copy number information improves estimates of microbial diversity and abundance. PLoS computational Biology 8(10): e1002743.
  • Knight R, Vrbanac A, Taylor BC, Aksenov A, Callewaert C, Debelius J, Gonzalez A, Kosciolek T, McCall L-I, McDonald D, Melnik A, Morton JT, Navas J, Quinn RA, Sanders JG, Swafford AD, Thompson LR, Tripathi A, Xu Q, Zaneveld JR, Zhu Q, Caporoso JG, Dorrestein PC (2018) Best practices for analysing microbiomes. Nature Reviews Microbiology 16(7): 410–422.
  • Kozich JJ, Westcott SL, Baxter NT, Highlander SK, Schloss PD (2013) Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Applied and Environmental Microbiology 79(17): 5112–5120.
  • Leese F, Altermatt F, Bouchez A, Ekrem T, Hering D, Meissner K, Mergen P, Pawlowski J, Piggott J, Rimet F (2016) DNAqua-Net: Developing new genetic tools for bioassessment and monitoring of aquatic ecosystems in Europe. Research Ideas and Outcomes 2: e11321.
  • Logares R, Sunagawa S, Salazar G, Cornejo-Castillo FM, Ferrera I, Sarmento H, Hingamp P, Ogata H, Vargas C de, Lima-Mendez G, Raes J, Poulain J, Jaillon O, Wincker P, Kandels-Lewis S, Karsento E, Bork P, Acinas SG (2014) Metagenomic 16S rDNA Illumina tags are a powerful alternative to amplicon sequencing to explore diversity and structure of microbial communities. Environmental Microbiology 16(9): 2659–2671.
  • Macheriotou L, Guilini K, Bezerra TN, Tytgat B, Nguyen DT, Phuong Nguyen TX, Noppe F, Armenteros M, Boufahja F, Rigaux A, Vanreusel A, Derycke S (2019) Metabarcoding free-living marine nematodes using curated 18S and CO1 reference sequence databases for species-level taxonomic assignments. Ecology and Evolution 9(3): 1211–1226.
  • Markmann M, Tautz D (2005) Reverse taxonomy: an approach towards determining the diversity of meiobenthic organisms based on ribosomal RNA signature sequences. Philosophical Transactions of the Royal Society B: Biological Sciences 360(1462): 1917–1924.
  • MATLAB User’s Guide (1998) The Mathworks. Inc., Natick, MA 5, 333.
  • Nilsson RH, Anslan S, Bahram M, Wurzbacher C, Baldrian P, Tedersoo L (2019) Mycobiome diversity: high-throughput sequencing and identification of fungi. Nature Reviews Microbiology 17(2): 95–109.
  • Pafčo B, Čížková D, Kreisinger J, Hasegawa H, Vallo P, Shutt K, Todd A, Petrželková KJ, Modrý D (2018) Metabarcoding analysis of strongylid nematode diversity in two sympatric primate species. Scientific Reports 8(1): 5933.
  • Peham T, Steiner FM, Schlick-Steiner BC, Arthofer W (2017) Are we ready to detect nematode diversity by next generation sequencing? Ecology and Evolution 7(12): 4147–4151.
  • Peters L, Traunspurger W (2005) Species distribution of free-living nematodes and other meiofauna in littoral periphyton communities of lakes. Nematology 7: 267–280.
  • Piñol J, Mir G, Gomez-Polo P, Agustí N (2015) Universal and blocking primer mismatches limit the use of high-throughput DNA sequencing for the quantitative metabarcoding of arthropods. Molecular Ecology Resources 15(4): 819–830.
  • Pitsch G, Bruni EP, Forster D, Qu Z, Sonntag B, Stoeck T, Posch T (2019) Seasonality of planktonic freshwater ciliates: Are analyses based on V9 regions of the 18S rRNA gene correlated with morphospecies counts? Frontiers in Microbiology 10: 403.
  • Porazinska DL, Giblin-Davis RM, Faller L, Farmerie W, Kanzaki N, Morris K, Powers TO, Tucker AE, Sung W, Thomas WK (2009) Evaluating high‐throughput sequencing as a method for metagenomic analysis of nematode diversity. Molecular Ecology Resources 9(6): 1439–1450.
  • Potapov AM, Scheu S, Tiunov AV (2019) Trophic consistency of supraspecific taxa in belowground invertebrate communities: comparison across lineages and taxonomic ranks. Functional Ecology 33(6): 1172–1183.
  • Prokopowich CD, Gregory TR, Crease TJ (2003) The correlation between rDNA copy number and genome size in eukaryotes. Genome / National Research Council Canada = Génome / Conseil national de recherches Canada 46(1): 48–50.
  • R Core Team R (2013) R: A Language and Environment for Statistical Computing. R foundation for statistical computing, Vienna.
  • Ristau K, Steinfartz S, Traunspurger W (2013) First evidence of cryptic species diversity and significant population structure in a widespread freshwater nematode morphospecies (Tobrilus gracilis). Molecular Ecology 22(17): 4562–4575.
  • Sahraean N, van Campenhout J, Rigaux A, Mosallanejad H, Leliaert F, Moens T (2017) Lack of population genetic structure in the marine nematodes Ptycholaimellus pandispiculatus and Terschellingia longicaudata in beaches of the Persian Gulf, Iran. Marine Ecology 38(3): e12426.
  • Schenk J, Traunspurger W, Ristau K (2016) Genetic diversity of widespread moss-dwelling nematode species in German beech forests. European Journal of Soil Biology 74: 23–31.
  • Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, Lesniewski RA, Oakley BB, Parks DH, Robinson CJ (2009) Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and environmental Microbiology 75(23): 7537–7541.
  • Schmid PE, Schmid-Araya JM (2002) Trophic relationships in temporary and permanent freshwater meiofauna. In: Rundle SD, Robertson AL, Schmid-Araya JM (Ed.) Freshwater Meiofauna: Biology and Ecology. Backhuys Publishers, Leiden, 295–319.
  • Sechi V, Goede RGM de, Rutgers M, Brussaard L, Mulder C (2018) Functional diversity in nematode communities across terrestrial ecosystems. Basic and Applied Ecology 30: 76–86.
  • Traunspurger W, Bergtold M, Goedkoop W (1997) The effects of nematodes on bacterial activity and abundance in a freshwater sediment. Oecologia 112(1): 118–122.
  • Traunspurger W, Hoess S, Witthöft-Mühlmann A, Wessel M, Güde H (2012) Meiobenthic community patterns of Lake Constance: relationships to nutrients and abiotic parameters in an oligotrophic deep lake. Fundamental and Applied Limnology 180(3): 233–248.
  • Treonis AM, Unangst SK, Kepler RM, Buyer JS, Cavigelli MA, Mirsky SB, Maul JE (2018) Characterization of soil nematode communities in three cropping systems through morphological and DNA metabarcoding approaches. Scientific Reports 8(1): 2004.
  • Van den Hoogen J, Geisen S, Routh D, Ferris H, Traunspurger W, Wardle DA, de Goede Ron GM, Adams BJ, Ahmad W, Andriuzzi WS, Bardgett RD, Bonkowski M, Campos-Herrera R, Cares JE, Caruso T, Brito Caixeta L de, Chen X, Costa SR, Creamer R, Mauro da Cunha Castro José Dam M, Djigal D, Escuer M, Griffiths BS, Gutiérrez C, Hohberg K, Kalinkina D, Kardol P, Kergunteuil A, Korthals G, Krashevska V, Kudrin AA, Li Q, Liang W, Magilton M, Marais M, Martín JAR, Matveeva E, Mayad EH, Mulder C, Mullin P, Neilson R, Nguyen TAD, Nielsen UN, Okada H, Rius JEP, Pan K, Peneva V, Pellissier L, Carlos Pereira da Silva J, Pitteloud C, Powers TO, Powers K, Quist CW, Rasmann S, Moreno SS, Scheu S, Setälä H, Sushchuk A, Tiunov AV, Trap J, van der Putten W, Vestergård M, Villenave C, Waeyenberge L, Wall DH, Wilschut R, Wright DG, Yang JI, Crowther TW (2019) Soil nematode abundance and functional group composition at a global scale. Nature 572: 194–198.
  • Van Megen H, van den Elsen S, Holterman M, Karssen G, Mooyman P, Bongers T, Holovachov O, Bakker J, Helder J (2009) A phylogenetic tree of nematodes based on about 1200 full-length small subunit ribosomal DNA sequences. Nematology 11(6): 927–950.
  • Waeyenberge L, Sutter Nde, Viaene N, Haegeman A (2019) New insights into nematode DNA-metabarcoding as revealed by the characterization of artificial and spiked nematode communities. Diversity 11(4): 52.
  • Weigand H, Beermann AJ, Čiampor F, Costa FO, Csabai Z, Duarte S, Geiger MF, Grabowski M, Rimet F, Rulik B, Strand M, Szucsich N, Weigand AM, Willassen E, Wyler SA, Bouchez A, Borja A, Čiamporová-Zaťovičová Z, Ferreira S, Dijkstra K-D B, Eisendle U, Freyhof J, Gadawski P, Graf W, Haegerbaeumer A, van der Hoorn BB, Japoshvili B, Keresztes L, Keskin E, Leese F, Macher JN, Mamos T, Paz G, Pešić V, Pfannkuchen DM, Pfannkuchen MA, Price B, Rinkevich B, Teixeira MAL, Várbíró G, Ekrem T (2019) DNA barcode reference libraries for the monitoring of aquatic biota in Europe: Gap-analysis and recommendations for future work. The Science of the Total Environment 678: 499–524.
  • Wilschut RA, Geisen S, Martens H, Kostenko O, Hollander M de, Hooven FC ten, Weser C, Snoek LB, Bloem J, Caković D, Čelik T, Koorem K, Krigas N, Manrubia M, Ramirez KS, Tsiafouli MA, Vreš B, van der Putten WH (2019) Latitudinal variation in soil nematode communities under climate warming-related range-expanding and native plants. Global Change Biology 25(8): 2714–2726.