Research Article |
|
Corresponding author: Mathis Ventura ( mathis.ventura@u-pec.fr ) Academic editor: Florian Leese
© 2025 Mathis Ventura, Nicolas Mazuras, Shanèze Noël, Lise Dupont, Virginie Roy.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Ventura M, Mazuras N, Noël S, Dupont L, Roy V (2025) A DNA metabarcoding workflow for the detection of terrestrial gastropods from ingested DNA. Metabarcoding and Metagenomics 9: e161359. https://doi.org/10.3897/mbmg.9.161359
|
DNA metabarcoding is a powerful approach for assessing biodiversity and unravelling trophic interactions, particularly in the context of biological invasions where introduced predators may impact native communities. However, the analysis of predator digestive contents with DNA metabarcoding poses technical challenges including the specificity of primers used to amplify the target DNA, amplification bias and contamination issues, which must be carefully considered when working on new taxa. This study addresses these challenges by developing and validating a methodology for assessing the diversity of terrestrial gastropods predated by the exotic flatworm Obama nungara. Specifically, an optimized primer pair providing taxonomic resolution for at least 40 gastropod species and genetic lineages, including taxonomically complex and diverse slug genera such as Arion and Deroceras is proposed. The primer pair was optimized through in silico predictions, laboratory controls, and field testing. Mock communities, laboratory-fed individuals, and technical replicates from the field were used as metabarcoding controls to calibrate filter thresholds for Operational Taxonomic Units. Preliminary field results revealed significant dietary diversity in O. nungara, suggesting a broad feeding spectrum. Beyond O. nungara, the primers and the assay developed here can also be applied to study the diets of other terrestrial predators or to detect gastropods in environmental DNA samples, providing a useful tool for ecological and conservation research.
Diet analysis, field controls, laboratory controls, primer design, soil macrofauna, taxonomic resolution
Developed over a decade ago, DNA metabarcoding is a powerful tool for understanding biodiversity and interactions between organisms using environmental samples such as soil, water, feces or digestive contents (
Two main DNA metabarcoding strategies exist: a generalist approach using broadly conserved primer pairs, and a targeted one using primers designed for specific taxa or narrower groups, often covering the same marker region (i.e., COI or 16S) (
When initiating a dietary study on new predator or prey taxa using digestive DNA, several issues and technical limitations widely mentioned in the literature need to be considered, such as: 1) the species coverage of the selected markers for the target taxa and their taxonomic resolution (
The present study is dedicated to the development and validation of a DNA metabarcoding methodology for the detection of terrestrial gastropod species (Gastropoda, Stylommatophora) in the digestive contents of a predator, the exotic flatworm Obama nungara (Platyhelminthes, Geoplanidae). This flatworm, native to South America, feeds on earthworms, slugs, snails and other flatworms according to laboratory experiments (
In France, 422 species of terrestrial gastropods have been recorded in the Red List of Continental Molluscs of France (https://uicn.fr/liste-rouge-mollusques-continentaux/). However, identification of these species remains a major challenge due to the taxonomic complexity of certain genera. In the slug genus Arion, for instance, misidentification cases may arise due to complex hybridization patterns. These concerns notably involve Arion vulgaris and Arion ater sensu lato (i.e., Arion ater sensu stricto and the two morphotypes of A. rufus as described by
The aim of the present work was to develop primers and suitable filtering parameters for estimating the diversity of terrestrial gastropod species consumed by O. nungara individuals collected in the field. The validation of this methodology was of particular interest to understand the technical limitations of the approach in terms of species coverage, taxonomic resolution, detection bias, and potential contamination. To achieve this, a multistep methodology was proposed: 1) primer design based on an alignment of reference sequences of the ribosomal 16S rRNA gene, enriched with sequences obtained from gastropods collected from different sites infested by O. nungara, 2) validation of primers in silico and by conventional PCR on a selected panel of gastropod species, 3) metabarcoding analysis of mock controls (DNA mixtures of established specific composition and concentration), dietary controls (DNA from gut contents of O. nungara fed sequentially with known gastropod species), and DNA from gut contents of O. nungara collected from sites with known terrestrial gastropod diversity. The results allowed the selection of a pair of primers and analysis parameters that met the validation criteria established for the detection of terrestrial gastropods in O. nungara gut contents. Although not validated for this purpose, these primers may be of interest for studying other types of samples (e.g. digestive DNA from other terrestrial gastropod predators or environmental DNA (eDNA) from soil).
Gastropods were sampled twice a year from autumn 2021 to spring 2024 in anthropized green spaces invaded by O. nungara, across four regions of mainland France (Paris region, La Rochelle, Bordeaux region and Brittany). Sampling was carried out over a standardized 2-hour survey in all sheltering microhabitats within a 100 m2 area at each site. Samples were stored in absolute ethanol at 4 °C prior to DNA extraction. In a preliminary identification step, samples were assigned to morphospecies using identification keys from
To develop and validate specific primers for the amplification and sequencing of DNA from terrestrial gastropods in predator gut contents, available primers and sequences for several genes were collected from the literature and the National Center for Biotechnology Information (NCBI) database. To select the most appropriate gene for this study, several criteria were applied: the gene had to be well documented, with sufficient sequences available to build a reference database, and suitable for targeting taxa within terrestrial gastropods, which encompass more than ten families. Although COI is commonly used, preliminary in silico and PCR tests did not reveal regions suitable for designing exclusive primers for gastropods. Moreover, previous studies have shown that primer bias in COI can lead to the underrepresentation or non-detection of some taxa, even within closely related groups (
In order to build the mitochondrial 16S rDNA database, the primer pair 16sar-L (5’-CGCCTGTTTATCAAAAACAT-3’) (
To improve species coverage for the primer design, 244 sequences were retrieved from NCBI and added to the database. These sequences were selected to represent both species recorded at our study sites and those likely to occur in French habitats potentially colonized by O. nungara. This addition represents 20 new species and lineages, bringing the total to 58, which corresponds to 13% of the French terrestrial gastropod fauna, considering species that could potentially be encountered based on their geographical distribution. Sequences were then aligned using the MUSCLE algorithm in SeaView4 (
215 earthworm 16S sequences from
To assess the overall coverage and specificity of the designed primer pairs for terrestrial gastropods, in silico PCR were first performed using the ecoPCR program (
PCR specificity cross-tests were also conducted on DNA extracts (see below CTAB extraction procedure) from three gastropod species, three earthworm species and eleven samples of O. nungara that were fed differently (with an empty stomach, previously fed with terrestrial gastropods or earthworms, or with gastropods and earthworms). The PCR reaction was carried out using the same reagent composition and volumes as previously described. The following cycling conditions were used: an initial denaturing step at 94 °C for 3 min; 40 cycles of denaturation at 94 °C for 30 s; annealing at variable temperature depending on primer pairs for 30 s, extension at 72 °C for 1 min and final extension step at 72 °C for 10 min.
To ensure broad coverage of terrestrial gastropod species, promising primers were tested in silico using the PRIMERMINER 0.21 R package (
Finally, a Neighbor-Joining tree was reconstructed with the software MEGA (Kimura 2-parameter model, 1000 Bootstrap replicates) based on the short DNA sequences (i.e. 16S metabarcoding region) targeted by the retained primers, to verify whether the different species or lineages occur in distinct clades.
To discuss the potential of using the primer pair in future studies studying terrestrial gastropods, its coverage in silico was also tested on available 16S sequences of terrestrial gastropod species from France (sequences available for 206 species out of 422 described, across 43 families) (list extracted from https://uicn.fr/liste-rouge-mollusques-continentaux/) and from Europe (sequences available for 698 species out of 2480 described, across 53 families and including sequences from this study) (list extracted from https://portals.iucn.org/library/node/48439) using the PRIMERMINER package.
Two types of mock community controls were set up (See Suppl. material
Six “controlled-feeding” controls were DNA from digestive contents of O. nungara individuals fed for three weeks with one prey per week. Three individuals (“UNI-G1 to G3”) received the same species, Cornu aspersum (non-differentiated genetic lineages, as they could not be recognized based on the external morphology). The other three individuals (“MIX-G1 to G3”) received a different species every week in the following order: Hygromia cinctella, Cornu aspersum and Arion hortensis L1. Consumption was checked after three days, and prey was always eaten. After three weeks, O. nungara individuals were killed and stored in absolute ethanol for further DNA extraction. Both types of “controlled-feeding” controls were used to test if all given prey would be detected by sequencing digestive contents after three weeks of experimentation and if there were potential amplification biases between DNA from different species. They were also used to assess contamination levels and to set a filtering threshold for clusters (
Three negative controls (“BLANK 1 to 3”), from three independent PCR amplifications without added DNA, were set up to check for possible contamination during PCR and to set the threshold of stochasticity for technical replicates. A pairwise distance matrix between negative replicates was created using the Bray-Curtis dissimilarity index (
The last three controls were DNA from digestive contents of three O. nungara individuals collected on the field during sampling campaigns (V6, V29 and V53 samples) and tested positive for gastropod DNA. Each of them was replicated three times (i.e. three independent PCR amplifications and sequencing per DNA to create technical replicates). Replicated field controls were used to apply parameters and thresholds (DNA quantity, distance threshold between technical replicates, threshold for clusters filtering) calibrated with other controls mentioned above, to ensure their reliability on real digestive contents from the field (
Finally, 16 DNA extracts from O. nungara individuals (5 from Limeil-Brevannes, Paris region, 3 from Saint-Medard-en-Jalles, Bordeaux region, 3 from Cleguer, Brittany and 5 from Concarneau, Brittany) that tested positive for gastropod DNA were included in the experiment. By estimating the diversity of terrestrial gastropod species at these particularly rich collection sites, it was possible to experiment in vivo the retained primer pair and the filtering threshold for clusters calibrated from metabarcoding controls.
DNA from about 20 mg of gut content was extracted from the region corresponding to the opening of the mouth of O. nungara according to
DNA amplifications were then conducted on all 40 samples by using the validated primer pair and the corresponding PCR protocol (see Results). For each sample and replicate, a separate library was prepared with a target sequencing depth of 10,000 reads. Each library was assigned a unique index, following GenoScreen’s protocol for multiplexing. Sequencing was performed on the Illumina MiSeq platform (2 × 150 bp paired-end chemistry) with 15% PhiX.
The Migale platform (https://galaxy.migale.inrae.fr/) was used to trim, merge, filter and cluster sequences. Pair-end reads were first trimmed with FROGS SICKLE program, based on a Phred quality score Q < 30. Merging of pair-end reads followed by first filtering were realized with FROGS PRE-PROCESS program with the following parameters: maximum lengths of 127 bp for forward reads (R1) and 134 bp for reverse reads (R2) were selected for merging, based on reads length distribution reports from FASTQC and MULTIQC programs; >30 bp coverage zone and 97% nucleotide identity were required to assemble paired-end reads and merged amplicons not comprised between 119 and 135 bp were discarded.
Clustering was then performed using FROGS CLUSTERING SWARM program with d = 1 (a small local linking threshold, representing the maximum number of differences between sequences in each aggregation swarm step) and the fastidious option selected. Clustered sequences will be referred to and identified hereafter as Operational Taxonomic Units (OTUs) in this study. Sequences identified as chimera in all samples were discarded using FROGS REMOVE CHIMERA program. Finally, clusters were filtered using the FROGS CLUSTER FILTERS program: only singletons were discarded, and the OTU filtering threshold established from control analyses was applied to both the technical replicates and the sixteen samples from the field.
For the taxonomic assignment of OTUs, a double BLAST approach was used: first with the BLASTn program from NCBI and second, with the reference database. Only BLAST matches with at least 95% identity and 95% sequence coverage were used to assign a gastropod species or lineage to each OTU. Unassigned OTUs, including those under the threshold and those unknown from any reference database, were then discarded from the analysis to maintain clean assignments and real detections. To further validate the taxonomic assignments, a Neighbor-Joining phylogenetic tree was reconstructed including OTU sequences with the corresponding reference sequences. This tree was then inspected to confirm that OTUs assigned to the same nominal species were clustered together and that the reference sequences of each species were included within the same clade.
Sequencing results were finally analyzed in R 4.4.2 (R Core Team 2024) with the PHYLOSEQ 1.46.0 R package (
Ninety-six 16S sequences were obtained from DNA extracts of gastropods, representing 38 species and lineages described by morphology and/or the COI gene from French sites invaded by O. nungara. Including sequences from NCBI, the final database contained 340 sequences from 58 species/lineages (1 to 40 sequences per species or lineage) (See Suppl. material
Six primers, ranging from 14 to 20 bp, were designed by visualizing the 16S alignment of sequences from terrestrial gastropods, earthworm and O. nungara (See Suppl. material
Schema of 16S DNA sequence in terrestrial gastropods showing variable (white) regions, conserved (green) regions and positions of the primers designed and retrieved for this study, the lengths indicated include the primers; and table detailing names, sequence lengths, 5’-3’ sequences and references for each potential primer pair.
In silico PCR results indicated that none of the primer pairs amplified bacterial 16S sequences, with amplification restricted to eukaryotic taxa. Among them, three primer pairs (the two including the MVF primer and Gast01R-Gast01F) produced the highest number of amplicons. Gast01R-Gast01F amplified exclusively Animalia, while the MVF-containing pairs retrieved mostly Animalia but also some sequences from Fungi and a small number of sequences from Chromista, Plantae, Protozoa, and an unassigned Eukaryota taxon (NC_086778). The five remaining primer pairs amplified fewer sequences overall, all belonging to Animalia (Fig.
In silico evaluation of primer pairs across eukaryotic taxa. A Global amplification per primer pair. Each bar represents the number of species amplified per primer pair, colored by kingdom. B Amplification specifically within terrestrial gastropods (order Stylommatophora). Bars indicate the number of Stylommatophora species amplified per primer pair. Dashed vertical line indicates the maximum number of target species available (n = 45).
Specificity cross-tests showed that, contrary to in silico predictions, two primer pairs (VRF-VRR and MVF-VRR) amplified only the targeted gastropod taxa and O. nungara samples fed with gastropods. Two primer pairs (VRF-MVR and GastD-GastE) amplified some of the targeted gastropods but also produced unspecific amplification. Four other primer pairs showed broader cross-amplification: MVF-MVR co-amplified targeted gastropods together with O. nungara from empty or earthworm-fed stomachs and earthworm DNA; GastD-Gast01F co-amplified targeted gastropods and O. nungara from empty and earthworm-fed stomachs; and GastD-Gast01R-F amplified the targeted gastropods but also co-amplified O. nungara from empty stomachs and produced non-specific amplicons (Fig.
PCR specificity cross-tests results for the three taxonomic groups (O. nungara, earthworms, and gastropods) using different potential primer pairs for minibarcode design. Bars indicate the number of species and samples showing each amplification outcome. “Targeted amplification” includes the three gastropod species (A. parvipenis, C. aspersum L1, D. invadens), the three O. nungara fed with gastropod prey (g. prey) and the two fed with both gastropod and earthworm prey (both prey); “Co-amplification” includes non-targeted amplifications either with the three O. nungara with empty stomach (e.s.), the three O. nungara fed with earthworm prey (e. prey) or with earthworm species (A. icterica, L. terrestris, L. castaneus L1). Colors correspond to amplification categories as indicated in the legend, annealing temperatures are indicated for each primer pair.
In silico amplifications using PRIMERMINER 0.21 showed that DNA sequences from 30 out of the 38 lineages from the field could theoretically be amplified with the VRF-VRR primer pair (Table
Results of in silico (penalty score) and in vitro analysis (PCR amplification) for DNA sequences from each of the 38 terrestrial gastropod species found during sampling. A value of “0” means that the sequences perfectly match the primer pair, sequences with score below 120 are supposed to amplify, sequences above 120 are supposed to fail, and H-D (haplotype-dependent) means that results were variable between haplotypes from the same species. Abbreviations: “+” indicates amplification (PCR band on a gel), “-” indicates no amplification (no PCR band), and “unspecific” denotes an unspecific amplification with multiple bands.
| Family | Genus and Species | Lineage | Number of samples tested | in silico penalty score | PCR amplification |
|---|---|---|---|---|---|
| Agriolimacidae | Deroceras agreste | 3 | 152 | + | |
| D. invadens | 3 | 0 | + | ||
| D. panormitanum | 2 | 0 | + | ||
| D. reticulatum | 3 | 152 | + | ||
| Arionidae | Arion ater rufus | 5 | 0 | + | |
| Ar. distinctus | 3 | 0 | + | ||
| Ar. hortensis | L1 | 3 | 0 | + | |
| Ar. hortensis | L2 | 3 | H-D | + | |
| Ar. intermedius | 3 | 0 | + | ||
| Ar. subfuscus | S1 | 2 | 0 | + | |
| Ar. vulgaris | 3 | 0 | + | ||
| Boettgerillidae | Boettgerilla pallens | 3 | 0 | + | |
| Cochlicopidae | Cochlicopa lubrica | 1 | 0 | + | |
| Geomitridae | Backeljaia gigaxii | 1 | 0 | + | |
| Cochlicella barbara | 3 | 0 | + | ||
| Xeroplexa intersecta | 3 | 0 | + | ||
| Helicidae | Cepaea hortensis | 2 | 0 | + | |
| Ce. nemoralis | 3 | 0 | + | ||
| Cornu aspersum | L1 | 4 | 0 | + | |
| Co. aspersum | L2 | 2 | 0 | + | |
| Helix lucorum | 1 | 0 | + | ||
| Theba pisana | L1 | 2 | H-D | unspecific | |
| Th. pisana | L2 | 1 | 325 | unspecific | |
| Hygromiidae | Hygromia cinctella | 3 | 254 | + | |
| Monacha cartusiana | 1 | 0 | + | ||
| Trochulus hispidus | 2 | 249 | + | ||
| Limacidae | Ambigolimax parvipenis | 5 | 0 | + | |
| Am. valentianus | 1 | 0 | + | ||
| Limacus flavus | 3 | 0 | + | ||
| Limax maximus | 3 | 0 | + | ||
| Milacidae | Milax gagates | 2 | 36 | + | |
| Mi. nigricans | 2 | 36 | + | ||
| Tandonia budapestensis | 3 | 36 | + | ||
| Oxychilidae | Oxychilus draparnaudi | 4 | 0 | + | |
| Pomatiidae | Pomatias elegans | 1 | 211 | - | |
| Testacellidae | Testacella haliotidea | 3 | 51 | + | |
| Te. maugei | 2 | 0 | + | ||
| Te. scutulum | 3 | 0 | + |
The results of PCR assays on gastropod DNA showed that 35 out of the 38 species and lineages tested showed the expected amplification band. Pomatias elegans was the only species that did not amplify, and two other lineages from a morphospecies (Theba pisana L1 and L2) showed unspecific results (Table
Fifty-six out of 58 species and lineages were monophyletic in the NJ tree reconstructed based on the short fragment: they grouped into distinct clades or were distinct singletons (See Suppl. material
High-throughput sequencing resulted in 714,602 paired-end reads for the 40 samples (metabarcoding controls and field samples). Details of reads and OTU counts through the bioinformatic process before analysis of the clustering filter threshold are shown in Table
Overview of the consecutive reduction of analyzed reads, sequences and OTUs through the bioinformatic pipeline before cluster filtering; from sequencing results to singletons removal for each type of metabarcoding controls and field samples (Ntot = 40 DNA or gut samples). Indicated values are means and standard deviations across samples.
| Sample type (n) | Sequencing (N reads) | Trimming (N reads) | Merging/ Pre-processing (N sequences) | Clustering (N sequences; N OTUs) | Chimera removal (N sequences; N OTUs) | Singletons removal (N sequences; N OTUs) |
|---|---|---|---|---|---|---|
| Negative (3) | 23342 ± 1331 | 18621 ± 2046 | 7 ± 2 | 7 ± 2; 3 ± 1 | 7 ± 2; 3 ± 1 | 7 ± 2; 3 ± 1 |
| NREP mocks (3) | 6963 ± 213 | 5847 ± 230 | 4417 ± 203 | 4417 ± 203; 84 ± 2 | 4382 ± 202; 49 ± 1 | 4301 ± 183; 40 |
| POS mocks (3) | 6747 ± 168 | 5633 ± 138 | 4287 ± 147 | 4287 ± 147; 38 ± 8 | 4266 ± 151; 20 ± 4 | 4259 ± 147; 11 ± 2 |
| UNI feeding (3) | 7533 ± 247 | 3708 ± 316 | 927 ± 516 | 927 ± 516; 20 ± 2 | 916 ± 503; 17 ± 2 | 911 ± 498; 13 ± 2 |
| MIX feeding (3) | 7843 ± 334 | 4134 ± 439 | 1878 ± 541 | 1878 ± 541; 10 ± 1 | 1876 ± 540; 9 ± 1 | 1870 ± 540; 7 ± 2 |
| Technical replicates (9) | 14632 ± 1066 | 11578 ± 852 | 6301 ± 809 | 6301 ± 809; 40 ± 5 | 6287 ± 793; 30 ± 5 | 6254 ± 818; 20 ± 3 |
| Field samples (16) | 26566 ± 925 | 21561 ± 776 | 17561 ± 679 | 17561 ± 679; 20 ± 3 | 17502 ± 667; 15 ± 2 | 17339 ± 801; 13 ± 2 |
In all controls, OTUs were either assigned to terrestrial gastropod species (based on more than 95% identity and 95% coverage with NCBI sequences and confirmed by alignment of OTU sequences with the reference database), or showed “no significant similarity” in both BLAST analysis and the database (i.e. likely artefactual sequences). The number of OTUs per nominal species or lineage varied from 1 to 6 OTUs across all samples. Some species showed high OTU richness, such as Cochlicella barbara (6 OTUs) and Cornu aspersum L1 and L2 (4 OTUs each), but phylogenetic reconstruction confirmed the clustering of OTUs assigned to the same nominal species with reference sequences (data not shown). As expected, the 16S region amplified by the VRF-VRR primer pair only failed to distinguish Ambigolimax parvipenis from Ambigolimax valentianus.
In POS-G1 and POS-G3 mocks, all five expected gastropod species were recovered, with more than 99% of sequences from these species on average, followed by unassigned clusters and artefactual sequences (Fig.
Comparison of expected and observed sequencing results for both types of mock community: A–C (POS) and D–F (NREP). Results are displayed as relative read abundance (RRA, % of sequences) after applying a 0.1% threshold filter for unexpected and non-assigned OTUs (representing 99.9% of the target gastropod taxa) and clustering reads from the same expected species. For D–F (NREP), each point represents the observed abundance of a given taxon in one sample; the red dotted line indicates the theoretical 1:1 relationship between observed and expected abundances, and the gray dashed line shows the regression across all points, representing the observed trend. The color legend applies for both type of mock community.
The VRF-VRR primer pair amplified and discriminated 16S DNA sequences from 24 out of the 27 species and lineages included in the “NREP” mock communities (Fig.
In both “POS” and “NREP” mock types, Cornu aspersum L1 showed the highest disparity between expected and observed sequence counts, as the results showed that it was overrepresented in the mock communities: +29% in POS-G1 and +10% on average in NREP mocks. Conversely, in the POS-G1 and POS-G2 mocks, Testacella haliotidea (-16% in POS-G1 and -3% in NREP mocks on average) and Hygromia cinctella (-18% in POS-G2 and -4% in NREP mocks on average) sequences were the most underrepresented in the observed communities (Fig.
In the “controlled-feeding” controls UNI-G1 to G3 (fed only with Cornu aspersum), on average more than 99% of the sequences were assigned to Cornu aspersum L1 (one OTU for UNI-G1), to Cornu aspersum L2 (one OTU for UNI-G3) or to Cornu aspersum L1 and Cornu aspersum L2 (one OTU each for UNI-G2). In all three “controlled-feeding” controls MIX-G1 to G3, sequences from the last prey given (Arion hortensis L1, 1 OTU) were detected in up to 99% of the sequences in MIX-G3 and 54% on average in MIX-G1 and MIX-G2. MIX-G1 and MIX-G2 also showed sequences from the second prey (Cornu aspersum L1), whereas sequences from the first prey (Hygromia cinctella) were not detected in any sample. Again, sequences from the second prey Cornu aspersum L1 were highly abundant in the “MIX” samples, yielding 40% (MIX-G1) and 50% (MIX-G2) of the sequences compared to sequences from the last prey Arion hortensis L1, consumed and digested one week later (See Suppl. material
The results obtained for controls were used to set the filtering threshold for OTUs. The mock communities showed that unexpected or artefactual OTUs (i.e. assigned to unexpected species or returned as “no significant similarity”) had an average occurrence of 0.1% across samples. On the other hand, unexpected OTUs from non-given prey for the “controlled-feeding” controls averaged 0.4%. These two potential thresholds were tested on PCR technical replicates from the field (V6, V29 and V53 samples). Of the 70 OTUs detected in these samples (see Suppl. material
Negative controls (BLANK 1–3) amplified very few sequences (5–8 reads in total, scattered across six species after discarding non-assigned OTUs), indicating minimal background noise and no evidence of systematic contamination. Bray-Curtis distances were calculated among these controls to provide an estimate of stochastic variation (d = 0.7), consistent with previous work using similar approaches (
Gut contents from the 16 field samples showed 1 to 6 prey species per sample after applying the 0.4% OTU filtering threshold (See Suppl. material
Number of digestive contents in which each prey species was detected for the sixteen field samples, shown per sampling site. Site titles follow the nomenclature “Region, Site”. Each point corresponds to one prey species: red symbols indicate species also recorded during field sampling, whereas grey symbols indicate species not detected in the field but found in gut contents.
In this study, new primers were developed to amplify and sequence DNA of terrestrial gastropods from digestive samples, and a detailed procedure was proposed using metabarcoding controls to ensure their specificity, resolution, and coverage for the target taxa. The first results from field data were also obtained, by sequencing the gut contents of the flatworm O. nungara, an exotic predator of terrestrial gastropods that is widespread in Europe.
Universal primers detect a wide range of taxa but often lack specificity, leading to co-amplification of non-target organisms or contaminants. Their broad taxonomic range can also reduce resolution, sometimes limiting identification to family or genus level (
This study introduces a new primer pair specifically targeting short 16S DNA fragments of about 162 bp from terrestrial gastropods, representing a key advancement for the effective implementation of metabarcoding in trophic interaction and biodiversity studies. To date, only one other primer pair, developed by
A key step in primer development is ensuring broad taxonomic coverage, i.e. the ability to amplify as many species or lineages as possible within the target taxa. Two complementary approaches, in silico amplifications based on sequence alignments and PCR amplifications on DNA from a variety of identified samples, were performed in this study and the results from both methods were only slightly divergent: among the 38 gastropod species and lineages tested, DNA amplification was expected for 32 referring to in silico tests, and PCR results showed successful amplification in 35 cases.
DNA from Pomatias elegans, Theba pisana L1, Theba pisana L2, Deroceras agreste, Deroceras reticulatum, Hygromia cinctella, and Trochulus hispidus was predicted in silico to fail amplification due to mismatches between their sequences and the retained primer pair, particularly at the 3’ end of the reverse primer. This region is critical for primer binding, and mismatches or gaps in this area can significantly reduce amplification efficiency (
However, despite similar mismatches, PCR amplification tests successfully amplified DNA from Deroceras agreste, Deroceras reticulatum, Hygromia cinctella, Trochulus hispidus, and all tested specimens of Arion hortensis L2, though in silico predictions for the latter showed some haplotype-dependent variability in amplification potential. These results underscore the need for experimental validation, as in silico predictions may not fully capture amplification potential. They also highlight the need to account for both interspecific and intraspecific sequence variability when designing primers for eDNA studies, ensuring their applicability across a wide range of targeted taxa (
Another important step in marker development is their resolution i.e. the potential of the amplified fragment to unambiguously distinguish taxa (species or genetic lineages, as described for the reference database). The phylogenetic reconstruction based on the short 16S fragment revealed that nearly all species and lineages considered could be clearly distinguished. These promising results confirm the 16S database’s value for future research and expansion. Indeed, some species still lack sufficient or even any available sequences in the literature, as is the case for Ambigolimax valentianus.
Maintaining a taxonomically verified and updated reference database is crucial for deriving ecological insights. Here, we emphasize that BLAST-based identification can be limited not by the tool itself, but by incomplete or erroneous reference databases and unresolved taxonomy, which can hinder accurate species or lineage assignment from environmental DNA. Indeed, there are two types of constraints on the identification of terrestrial gastropods from mitochondrial data: the existence of highly differentiated genetic lineages within nominal species, suggesting species complexes or cryptic species, and the existence of hybrids and introgressed forms, possibly leading to morpho-molecular incongruences.
Regarding the existence of genetic lineages within species,
Beyond cases of taxonomic identification challenges at the genetic lineage level, possible cases of misidentification could occur due to complex patterns of hybridization. They concern, for example, Arion vulgaris and Arion ater s.l. (i.e. Arion ater s.s. and the two morphotypes of A. rufus according to
It should also be noted that studies of terrestrial gastropods have revealed remarkably high levels of both intra- and interspecific divergence in mtDNA, often exceeding 10% (
How to interpret high-throughput sequencing results has always been a concern in metabarcoding studies, especially when analyzing fragmented and/or partially digested DNA. Whether to rely on a quantitative or qualitative approach depends on several factors such as research aims, species-specific amplification biases, target taxa and dietary diversity, and should be considered when evaluating sequencing results (
Extracting DNA from digestive contents rather than feces implies that the sequencing results are highly dependent on the time elapsed since prey consumption, regardless of the species consumed. Consequently, the number of sequences per species does not necessarily reflect their abundance in the diet but rather indicates that the prey was consumed more recently – or conversely, earlier and more fully digested (
The results from mock communities obtained in this study are consistent with this trend. Indeed, Cornu aspersum L1 is systematically overrepresented in the samples. This suggests that this species may be preferentially amplified during the annealing step. In the MIX “controlled-feeding” controls, this species was consistently detected at a sequence number equivalent to that of Arion hortensis L1, which was consumed a week later. Interestingly, Arion hortensis L1 yielded sequencing results that were consistent with expectations in the mock communities, suggesting it is not negatively affected by amplification bias. This supports the idea that Cornu aspersum L1 DNA hybridizes more effectively and thus biases its true sequence counts in samples. Conversely, other species appear to show the opposite trend. For example, Hygromia cinctella was consistently underrepresented in mock communities, was completely absent from the “controlled-feeding” controls despite being the first prey eaten. It was also not detected in the field replicates or in the 16 field samples, despite being sampled in two of the four study sites. These results are consistent with an in silico binding score of 254, indicating that DNA from this species would theoretically not be expected to amplify efficiently with the VRF-VRR primers. However, PCR tests still showed amplification bands for this species, suggesting that although its DNA can be amplified, it is likely to be penalized during the annealing step due to mismatches. This likely results in low sequence counts, often insufficient to pass OTU filtering thresholds. The same thought process can be applied to other species such as Testacella haliotidea or Deroceras agreste, where the in silico score is either below 120 but not equal to 0, or above 120 and sequence counts in mock communities are significantly below expectations.
These results highlight a critical loss of diversity information as demonstrated by the “controlled-feeding” controls. The process is inherently multivariate, with the amplification of one taxon’s DNA influenced by the co-occurrence of others within the same samples (
Another critical element of metabarcoding analysis is the selection of an appropriate OTU filtering threshold. This choice directly affects species detection and quantification by influencing how sequences are grouped and background noise is removed. In metabarcoding, contamination and artefacts can arise at multiple stages, including field sampling (e.g., cross-contamination between samples), DNA extraction (e.g., introduction of external DNA from reagents), PCR amplification (e.g., carry-over contamination or sequencing errors), and sequencing (e.g., through tag jumping). Low-abundance OTUs often correspond to such contaminants or errors, making the filtering strategy essential (
Secondary predation, i.e., the detection of DNA from prey previously consumed by the predator’s prey, represents an additional source of potential bias (
In metazoan metabarcoding, a common filtering threshold of 1% of total read abundance is effective in reducing contaminants and rare erroneous sequences. Optimal thresholds, however, vary with target taxa, barcoding regions, sequencing depth, environment, and study goals (
Extensive controls—including positive and negative controls, mock communities, and technical replicates—were employed here to estimate artefact rates and refine thresholds. PCR replicates are generally recommended when feasible to increase reliability and allow stochastic amplification events to be identified, although they were not implemented in the present study (
The choice of OTU filtering thresholds primarily depends on the taxon and study objectives, as no universal value exists. In this study, a single threshold was applied across all samples, calibrated and validated with mock and controlled-feeding experiments to ensure that rare but ecologically relevant taxa were retained. The purpose of this threshold is not to provide a universal value, but to illustrate a rigorous approach to OTU filtering. Filtering thresholds are highly sensitive to the diversity of diets or ecosystems: in dietary metabarcoding, specialists with narrow diets may lose rare prey even under mild thresholds, exaggerating inter-individual variation, while generalists may appear artificially homogeneous if low-abundance taxa are removed (
Diet estimation remains a major challenge in ecology. Here are presented the first results on the dietary diversity of the invasive species O. nungara, from specimens collected directly in their invaded habitats, three private gardens and a greenhouse. Prey diversity surveys were conducted at these sites, allowing for a comparison of gut content sequencing results with field observations. Species observed in the field were expected in gut contents, though not necessarily all, as O. nungara could exhibit selective feeding. If the predator is highly selective, or if certain prey are rare, the gut contents should contain fewer species than observed in the field. Poor sampling or incomplete coverage could lead to the reverse pattern. The results obtained for Saint-Medard-en-Jalles suggest that O. nungara might show selectivity, which could be influenced by factors such as prey accessibility (e.g., life cycles, defensive mechanisms) or an active feeding preference, as indicated by the observed disparity between field and gut contents species diversity. In contrast, at Concarneau, the positive correlation between species detected in the field and those found in the gut contents suggests a more opportunistic feeding behavior. At Cleguer, the detection of species in gut contents that are not recorded in field surveys indicates a potential gap in sampling effort. Indeed, field inventories focused on macrofauna, which can lead to an underrepresentation of micro-snails such as Lauria cylindracea and other small species. High-throughput sequencing of digestive DNA can therefore enhance diversity estimates, serving as a complementary tool to field surveys (
These findings highlight the importance of considering sampling effort and PCR biases when interpreting the predator’s feeding behavior. The contradictions also highlight the limited number of specimens and sites studied so far, preventing definitive ecological and trophic conclusions. However, O. nungara demonstrates a remarkable dietary diversity at this stage. Discarding the unidentified OTU and the taxon assigned to Deroceras sp., 16 gastropod species and lineages were detected in the digestive contents of only sixteen individuals, while a total of 28 species and lineages were recorded across sampled sites. Among these, there are widespread and common species across the sites, such as Cornu aspersum, Deroceras invadens, and Ambigolimax parvipenis, as well as rarer and more sporadic species like Arion hortensis L2, Arion intermedius, and Cochlicopa lubrica. A broader-scale analysis is now a thrilling perspective to determine whether O. nungara primarily adopts an opportunistic or selective feeding strategy, and what ecological impact it can have on its prey and their ecosystems.
Beyond this system, studies using environmental DNA (eDNA) to investigate terrestrial gastropods remain scarce, in contrast to aquatic mollusks where eDNA is increasingly applied to detect diverse species (
The metabarcoding assay developed here, based on newly designed primers targeting gastropod DNA, provides a practical tool for both gut content analyses and environmental DNA studies. It can help investigate terrestrial gastropod consumption by different predator taxa, as well as explore gastropod presence in understudied habitats. While the primers were designed in the context of O. nungara predation, they could also be applied to other ecological situations involving terrestrial gastropods, offering additional opportunities for ecological research and conservation studies.
In this study, a workflow was designed and validated for the detection of terrestrial gastropods DNA and application in metabarcoding studies. A combination of in silico predictions, laboratory testing and field validation showed that it could be used to detect both common and rare gastropod species in anthropized areas. A key contribution of this work lies in the calibration of OTU filtering thresholds using mock communities and dietary controls, allowing to fine-tune detection sensitivity and reduce amplification artifacts. This methodological step significantly improved the reliability of dietary data recovered from digestive contents. Using this validated approach, we performed the first targeted dietary analysis of gastropod prey in the invasive flatworm O. nungara, revealing a wide range of consumed species. Interestingly, some prey species were detected despite not being inventoried during field surveys, highlighting the potential of digestive metabarcoding to improve biodiversity assessments, particularly in complex or under-sampled ecosystems. This framework may also be useful for investigating other predator-gastropod systems and for improving biodiversity assessments in terrestrial environments. Future studies will expand the dietary analysis of O. nungara to include a larger sample size, as well as exploring other predator-gastropod systems. This will enable us to gain a better understanding of the ecological interactions between introduced predators and their prey.
The authors would like to thank Agnès Gigon, Thomas Lerch and Yoan Fourcade for their help in collecting the samples, and Clémentine Loth for sequencing. Sampling missions and experiments were funded by the French National Research Agency, as part of the PLATWORM project n°ANR-21-CE02-0016.
The authors have declared that no competing interests exist.
No ethical statement was reported.
No use of AI was reported.
No funding was reported.
MV and VR designed the study. Collection of the samples was assured by MV, NM, SN, LD and VR. Laboratory work was performed by MV, NM and VR. MV conducted the analyses. MV wrote the first draft of the manuscript. All authors contributed to the final version and approved the submitted manuscript.
Mathis Ventura https://orcid.org/0009-0001-6831-9654
Nicolas Mazuras https://orcid.org/0009-0004-0888-6659
Shanèze Noël https://orcid.org/0009-0001-5380-1085
Lise Dupont https://orcid.org/0000-0003-4824-2685
Virginie Roy https://orcid.org/0000-0003-1908-0074
All of the data that support the findings of this study are available in the main text or Supplementary Information.
DNA composition
Data type: xlsx
Explanation note: Mock communities’ composition per species per volume. All DNA were diluted to the same concentration, volumes went from 1 to 10 µL in NREP mocks, and 4 µL only in POS mocks.
Reference database for 16S gene
Data type: xlsx
Explanation note: List of accession number, taxonomy, sequence fragment and reference for all terrestrial gastropods samples used to construct the reference database for 16S gene.
Sequence alignments
Data type: xlsx
Explanation note: Sequence alignments detailed for each primer pair to discriminate O. nungara and earthworm consensus sequence from terrestrial gastropods consensus sequence. Coloured bases indicate difference with the designed primer pair.
Reference database for all available reference sequences from NCBI projects
Data type: ods
Explanation note: List of accession, complete taxonomy and in silico PCR results of all available reference sequences from NCBI projects.
Species lists
Data type: xlsx
Explanation note: Species lists of terrestrial gastropod species described in France and Europe.
Bioinformatic
Data type: xlsx
Explanation note: Results of in silico PCR using the VRF-VRR primer pair for available 16S sequences of terrestrial gastropod species described in France and Europe.
Phylogenetic tree
Data type: pdf
Explanation note: Neighbor Joining tree reconstructed using 96 terrestrial gastropod 16S sequences obtained by amplifying a short metabarcode of 162 bp using VRF-VRR primer pair, and 244 sequences from the literature. Bootstrap values (1000 repetitions) > 70% are indicated at nodes. The 38 species and lineages indicated in bold were sampled in the field and had their DNA extracted for the various analyses conducted in this study.
High-throughput sequencing
Data type: xlsx
Explanation note: Per metabarcoding control and field sample, details of the raw OTUs data obtained from bioinformatic analyses. Details of seed sequence, length (bp), affiliated species, and results of BLAST criteria are displayed. Filtering outcomes for field replicates and field samples are highlighted.
Graph of results of gut contents sequencing
Data type: pdf
Explanation note: Results of gut contents sequencing by relative read abundance (RRA %) for the three samples replicated. The figure shows the results after applying the stochasticity threshold and merging the replicates to produce consensus single samples.