Corresponding author: André B. Nobile (andrenobile@hotmail.com). Academic editor: Filipe Costa
© 2019 André B. Nobile, Diogo Freitas-Souza, Francisco J. Ruiz-Ruano, Maria Lígia M. O. Nobile, Gabriela O. Costa, Felipe P. de Lima, Juan Pedro M. Camacho, Fausto Foresti, Claudio Oliveira.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Nobile AB, Freitas-Souza D, Ruiz-Ruano FJ, Nobile MLMO, Costa GO, de Lima FP, Camacho JPM, Foresti F, Oliveira C (2019) DNA metabarcoding of Neotropical ichthyoplankton: Enabling high accuracy with lower cost. Metabarcoding and Metagenomics 3: e35060. https://doi.org/10.3897/mbmg.3.35060
Knowledge of ichthyoplankton dynamics is extremely important for conservation management because it provides information about preferential spawning sites, reproductive period, migratory routes and recruitment success, which can be used to guide management and conservation efforts. However, identification of the eggs and larvae of Neotropical freshwater fish is a difficult task. DNA barcodes have emerged as an alternative and highly accurate approach for species identification, but DNA barcoding can be time-consuming and costly. To solve this problem, we aimed to develop a simple protocol based on DNA metabarcoding and to investigate whether it can detect and quantify all species present in a pool of organisms. To do this, 230 larvae were cut in half: one half was sequenced by the Sanger technique and the other half was used to compose six arrays, each a pool of larvae, that were sequenced using next-generation sequencing (NGS). The results of the Sanger sequencing allowed the identification of almost all larvae at species level, and the results from NGS showed high accuracy in species detection, ranging from 83% to 100%, with an average of 95% in all samples. No false positives were detected. The frequency of organisms in the two methods was positively correlated (Pearson), with low variation among species. In conclusion, this protocol represents a considerable advance in ichthyoplankton studies, allowing a rapid, cost-effective, quali-quantitative approach that improves the accuracy of identification.
DNA Barcoding, Fisheries management, Ichthyofauna, Upper Paraná River
Studies concerning ichthyoplankton can provide novel information about preferential spawning sites, reproductive period, migratory routes, and recruitment success (
The use of DNA barcodes in ichthyoplankton improved the quality of identifications, both for eggs and larvae, and emerged as an effective alternative for traditional identification methods (
To solve this problem, DNA metabarcoding based on next-generation sequencing (NGS) could offer an alternative approach, reducing both labor time and costs. Although Neotropical ichthyofauna metabarcoding studies are scarce (
Here we describe a novel DNA metabarcoding protocol for ichthyoplankton using NGS technology and test the accuracy of this protocol in identifying all species present in an array. We also test whether the proposed approach is valid for estimating the relative frequency of species present in a sample.
The larvae used in this experiment come from another study using DNA barcoding for species identification. Ichthyoplankton sampling was carried out in the Mogi-Guaçu River, an important tributary of the left bank of the Grande River, Upper Paraná River Basin. Samples were taken from two sites: site 1: 21°35.441’S, 47°57.244’W (Dec/16 and Jan/17) and site 2: 21°30.168’S, 48°2.534’W (Dec/15, Nov/16, Dec/16, and Jan/17) (Table
Number of larvae in each sample and their respective sites and month of capture.
Sample number | Number of larvae | Site | Month |
---|---|---|---|
S1 | 36 | 1 | Jan/17 |
S2 | 42 | 1 | Dec/16 |
S3 | 26 | 2 | Dec/16 |
S4 | 38 | 2 | Jan/17 |
S5 | 44 | 2 | Nov/16 |
S6 | 44 | 2 | Dec/15 |
Total | 230 |
Samples were collected with a conical-cylindrical net with a mesh size of 0.5 mm and immediately fixed in 96% ethanol. In the laboratory, eggs and larvae were separated from other materials present in the samples (e.g., leaves, sediment, and sticks) under a stereomicroscope and kept in 96% ethanol until the molecular procedure.
In this study, we used 230 larvae distributed randomly across six arrays, with the number of organisms per array ranging from 36 to 44. However, each sample contained larvae of only one site and month (Table
DNA extraction for Sanger sequencing was performed following the protocol proposed by
DNeasy Blood and Tissue Kit (Qiagen) was used for DNA extraction from bulk samples following the manufacturer’s protocol. Two PCR runs were performed: the first to amplify the COI region, using the same primer pair used for the Sanger sequencing coupled with Illumina MiSeq adapter, and the second to attach a multiplex identifier (MID).
In the first PCR, each reaction contained 2 μl of DNA template, 2.5 μl buffer, 1 μl MgCl2 (50 mM), 1 μl dNTP (2 mM), 0.5 μl forward adapter+FishF1 primer (5 mM), 0.5 μl reverse adapter+FishR1 primer (5 mM), 0.4 μl Taq Polymerase (5 U/μl) (Phoneutria), and 17.1 μl of molecular biology grade water in a final volume of 25 μl. PCR conditions were 95 °C for 3 min; 35 cycles of 95 °C for 30 s, 52 °C for 45 s, and 68 °C for 60 s; and a final extension at 68 °C for 10 min. We purified PCR products with the NucleoSpin Gel and PCR Clean-up kit (Macherey-Nagel). In the second PCR, the library was constructed: amplicons were dual-indexed with MIDs by Macrogen, using the Nextera XT Index Kit and following the instructions in 16S Metagenomic Sequencing Preparation. Amplicons were sequenced on the Illumina MiSeq platform, generating paired-end reads of 301 nucleotides in length.
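As a quick consistency check on the first-PCR recipe, the per-reaction volumes can be summed and scaled to a master mix. This is only a minimal sketch: the function name `master_mix`, the 10% pipetting overage, and the choice of six reactions (one per array) are illustrative assumptions and not part of the published protocol.

```python
from math import isclose

# Per-reaction volumes (µl) as listed for the first PCR.
REACTION_UL = {
    "DNA template": 2.0,
    "buffer": 2.5,
    "MgCl2 (50 mM)": 1.0,
    "dNTP (2 mM)": 1.0,
    "forward adapter+FishF1 primer": 0.5,
    "reverse adapter+FishR1 primer": 0.5,
    "Taq polymerase (5 U/µl)": 0.4,
    "water": 17.1,
}


def master_mix(n_reactions, overage=0.10):
    """Scale every reagent except the template to n reactions.

    `overage` is a hypothetical allowance for pipetting loss.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in REACTION_UL.items()
        if name != "DNA template"  # template is added to each tube separately
    }


# The listed components should add up to the stated 25 µl final volume.
total_ul = sum(REACTION_UL.values())
assert isclose(total_ul, 25.0, abs_tol=1e-9)

# Example: master mix for six reactions (assumed: one per array).
mix = master_mix(6)
```

The listed volumes do sum to the stated 25 μl, which the assertion confirms.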
For sequences generated with the Sanger method we used the Geneious Pro v4.8.5 software (
Metabarcoding data were analyzed following a bioinformatic protocol described in the Suppl. material
After this step, we used the OBITools toolkit version 1.01 22 (http://metabarcoding.org/obitools;
We constructed a custom database by downloading sequences from Project – FUPR Fishes from Upper Parana River, Brazil, present in the BOLD System (
As a final filtering step, all sequences generated in the present study (NGS + Sanger) were aligned with those sequences present in a custom database that had similarity greater than 95% with one of the sequencing methods, using the Muscle (
To validate the efficacy of metabarcoding, statistical analyses were conducted to evaluate whether this method has a quali-quantitative resolution. To achieve our qualitative aim, from the genetic distance (K2P) and neighbor-joining (NJ) dendrogram, we estimated richness (S = number of species) per sample in each method. Then, the richness between methods was compared to estimate the rate of species detection by metabarcoding, using the Sanger data as a reference.
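For readers unfamiliar with the K2P distance mentioned above, the following sketch computes it for a pair of aligned sequences using the standard formula d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions. The function name `k2p_distance`, the pairwise deletion of gaps and ambiguous bases, and the toy sequences are assumptions made for illustration; this is not the software used in the study.

```python
from math import log

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip gaps and ambiguous bases (pairwise deletion, an assumption).
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # Raises ValueError (math domain error) for saturated sequence pairs.
    return -0.5 * log(1 - 2 * p - q) - 0.25 * log(1 - 2 * q)


# Toy example (hypothetical sequences): one transition in ten sites.
d = k2p_distance("ACGTACGTAC", "GCGTACGTAC")
```

Pairwise distances of this kind are the input to a neighbor-joining dendrogram, from which the taxa in each sample can be delimited and counted.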
For the quantitative analysis, because the exact number of larvae per sample was known, the relative frequency per species (or taxon) was estimated for the Sanger data. This frequency was compared with the relative frequency obtained by metabarcoding (number of reads assigned to a species or taxon in relation to the total number of reads). The comparison was made for each sample individually (e.g., Sanger sample 1 × NGS sample 1). We then applied a Pearson correlation to test whether the Sanger and NGS frequencies were significantly and positively correlated, and a permutational analysis of variance (PERMANOVA) to test for significant differences between the relative abundances obtained from Sanger and NGS data.
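The quantitative comparison can be sketched as follows: counts are converted to relative frequencies for each method, and a Pearson coefficient is computed across taxa within one sample. The taxon names, larval counts and read counts below are hypothetical, and the helper names `relative_frequency` and `pearson_r` are illustrative; the PERMANOVA step is not reproduced here.

```python
from math import sqrt


def relative_frequency(counts):
    """Convert per-taxon counts (larvae or reads) to percentages."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observations")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant data")
    return sxy / sqrt(sxx * syy)


# Hypothetical sample: larvae identified by Sanger, reads assigned by NGS.
sanger_counts = {"Taxon A": 30, "Taxon B": 8, "Taxon C": 2}
ngs_reads = {"Taxon A": 70000, "Taxon B": 25000, "Taxon C": 0}  # C: false negative

sanger_pct = relative_frequency(sanger_counts)
ngs_pct = relative_frequency(ngs_reads)

# Compare the two methods taxon by taxon, using Sanger as the reference list.
taxa = sorted(sanger_counts)
r = pearson_r([sanger_pct[t] for t in taxa],
              [ngs_pct.get(t, 0.0) for t in taxa])
```

Taxa missed by NGS enter the comparison with a frequency of zero, so false negatives lower the coefficient rather than being dropped silently.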
Of the 230 larvae used in this experiment (Suppl. material deposited in: https://doi.org/10.6084/m9.figshare.6726956) 226 were retained for the analysis with Sanger sequencing. Four larvae were removed, due to low quality in Sanger sequencing and genetic divergence above 2% in the alignment with those downloaded sequences. Results from the Sanger sequencing showed a richness of 29 taxa, belonging to 12 families and three orders, varying from six species in sample S2 to 13 species in sample S1 (Table
NGS yielded 2,511,656 reads in the eight samples (Suppl. material deposited in: https://doi.org/10.6084/m9.figshare.6726956) and, after the bioinformatic analysis, 633,224 reads were kept. In total, 28 taxa were identified with metabarcoding NGS out of the 29 taxa identified by the Sanger sequencing. Only Astyanax schubarti was not identified by metabarcoding in any of the samples (Table
Species composition and relative abundance (%) of analyzed samples. S = Sanger sequences; NGS = next-generation sequencing metabarcoding; * = false negative.
Family | Species | S1 S | S1 NGS | S2 S | S2 NGS | S3 S | S3 NGS | S4 S | S4 NGS | S5 S | S5 NGS | S6 S | S6 NGS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Characiformes | |||||||||||||
Anostomidae | Leporellus vittatus | 5.6 | 10.4 | ||||||||||
Leporinus friderici | 13.9 | 23.2 | 2.8 | 1.5 | 9.3 | 11.6 | 2.3 | 1.2 | |||||
Leporinus paranensis | 2.3 | 3.7 | |||||||||||
Leporinus striatus | 5.6 | 2.5 | |||||||||||
Megaleporinus obtusidens | 7.3 | * | 3.8 | 3.3 | 9.3 | 0.8 | 2.3 | 0.1 | |||||
Megaleporinus piavussu | 2.8 | 0.3 | |||||||||||
Schizodon nasutus | 2.4 | 2.1 | |||||||||||
Characidae | Astyanax lacustris | 11.1 | 0.1 | 5.6 | 0.3 | 4.7 | * | ||||||
Astyanax schubarti | 2.8 | * | 5.6 | * | |||||||||
Cheirodontinae | 11.1 | 4.3 | 2.4 | 0.05 | 3.8 | 0.6 | 2.8 | 2.0 | 4.7 | 1.6 | |||
Cheirodontinae 1 | 3.8 | 6.2 | |||||||||||
Hyphessobrycon eques | 5.6 | 2.0 | |||||||||||
Curimatidae | Steindachnerina insculpta | 2.8 | 0.6 | ||||||||||
Erythrinidae | Hoplias malabaricus | 2.3 | 2.2 | ||||||||||
Parodontidae | Apareiodon affinis | 2.4 | 0.1 | ||||||||||
Prochilodontidae | Prochilodus lineatus | 2.8 | 15.2 | 4.7 | 9.2 | ||||||||
Gymnotiformes | |||||||||||||
Sternopygidae | Eigenmannia trilineata | 2.3 | 5.0 | ||||||||||
Siluriformes | |||||||||||||
Doradidae | Rhinodoras dorbignyi | 2.8 | 1.4 | ||||||||||
Heptapteridae | Heptapteridae | 5.6 | 2.2 | 2.3 | 1.9 | ||||||||
Iheringichthys labrosus | 2.3 | 4.7 | |||||||||||
Pimelodella meeki | 5.6 | 2.4 | 8.3 | 2.1 | |||||||||
Loricariidae | Rhinelepis aspera | 2.8 | 16.9 | ||||||||||
Pimelodidae | Pimelodus maculatus | 27.8 | 36.0 | 70.7 | 95.0 | 69.2 | 58.1 | 52.8 | 61.3 | 32.6 | 49.0 | 86.4 | 85.2 |
Pimelodus microstoma | 2.8 | 6.3 | 11.5 | 21.1 | |||||||||
Pinirampus pirinampu | 3.8 | 3.9 | |||||||||||
Pseudoplatystoma corruscans | 14.6 | 2.7 | 3.8 | 6.9 | 16.3 | 8.6 | 2.3 | 0.1 | |||||
Pseudoplatystoma reticulatum | 4.7 | 1.3 | |||||||||||
Sorubim lima | 2.8 | 3.0 | 9.3 | 13.8 | |||||||||
Pseudopimelodidae | Pseudopimelodus mangurus | 5.6 | 4.2 | 2.8 | 1.8 |
No false positives were observed in this study (Table
The relative abundances estimated by the Sanger and NGS methods differed little in all samples. We detected a considerable difference between the two methods in only a few cases: Rhinelepis aspera, Cheirodontinae sp. 1, and Astyanax lacustris in sample 1, and Prochilodus lineatus and Astyanax lacustris in sample 4. Nevertheless, the PERMANOVA showed no significant difference between the relative abundances of the two methods (Pseudo-F = 0.39293; P(perm) = 0.838), indicating that, despite some deviations in the values, the quantification of organisms with the NGS method can be considered reliable here (Table
Graphic representation of species abundance proportion (%) between Sanger (x-axis, left side) and NGS (x-axis, right side) for analyzed samples. A: sample 1, B: sample 2, C: sample 3, D: sample 4, E: sample 5, F: sample 6. * = false negative.
Species richness per sample and method of sequencing, detection rate, and Pearson correlation (r) between Sanger and NGS methods.
Sample | Richness (Sanger) | Richness (NGS) | Detection rate NGS/Sanger (%) | Pearson correlation (r) |
---|---|---|---|---|
S1 | 13 | 12 | 92.3 | 0.776 |
S2 | 6 | 5 | 83.3 | 0.987 |
S3 | 7 | 7 | 100.0 | 0.976 |
S4 | 12 | 11 | 91.7 | 0.955 |
S5 | 11 | 10 | 90.9 | 0.914 |
S6 | 7 | 7 | 100.0 | 0.998 |
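The detection rates in the table follow directly from the two richness columns. The sketch below recomputes them, assuming (as the text states) that there were no false positives, so that every taxon counted for NGS was also found by Sanger. The function name `detection_rate` is illustrative.

```python
# Richness per sample as (Sanger, NGS), copied from the table.
RICHNESS = {
    "S1": (13, 12),
    "S2": (6, 5),
    "S3": (7, 7),
    "S4": (12, 11),
    "S5": (11, 10),
    "S6": (7, 7),
}


def detection_rate(sanger, ngs):
    """Percentage of Sanger-identified taxa also recovered by NGS."""
    if sanger <= 0:
        raise ValueError("Sanger richness must be positive")
    return round(100.0 * ngs / sanger, 1)


rates = {sample: detection_rate(*pair) for sample, pair in RICHNESS.items()}
```

In every sample that falls short of 100%, the two methods differ by exactly one taxon, which corresponds to the false negatives marked with an asterisk in the species table.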
Despite some failures in species detection, we can consider that the use of NGS is functional for COI-based ichthyoplankton species detection, with an average detection rate higher than 95%, which is similar to previous findings that compared Sanger and NGS (
With our present approach, most individuals were identified to the species level, demonstrating that even using only the forward reads, the information was sufficient for species identification. Only three taxa were assigned to family or subfamily level, probably because they are cryptic species, are not yet described, or have no sequences deposited in any database. This observation reinforces the idea that more sequences of fish species must be deposited in databases, especially for river basins with rich fish fauna such as the Upper Paraná River. Another positive aspect is that, compared with DNA barcoding (
Despite some differences in obtaining DNA samples, both environmental DNA (eDNA) and DNA metabarcoding from bulk samples are subject to false negatives and false positives (
Quali- and quantitative studies involving metabarcoding and eDNA have increased significantly in recent years (
Our results clearly show that the NGS approach on bulk samples can detect ichthyoplankton diversity at the species level, and also gives good estimates of relative abundance of the larvae, thus allowing reliable environmental monitoring at reduced cost and labor (
This research received financial support from FAPESP grants 2015/19025-9, 2017/12758-6 (ABN), CNPq 165830/2015-8 (ABN), 150344/2016 (FPL), 141526/2015-7 (DFS), 144265/2017-6 (GOC), and Capes 180517/2018-01 (MLMON). We wish to thank Centro Nacional de Pesquisa e Conservação de Peixes Continentais – CEPTA, Pirassununga-SP, José Augusto Senhorini and Claudio Bock for help in the field, and Jorge Doña for advice on the bioinformatic analysis.