Corresponding author: Kristy Deiner (alpinedna@gmail.com) Academic editor: Xin Zhou
© 2018 Kristy Deiner, Jacqueline Lopez, Steve Bourne, Luke Holman, Mathew Seymour, Erin K. Grey, Anaïs Lacoursière, Yiyuan Li, Mark A. Renshaw, Michael E. Pfrender, Marc Rius, Louis Bernatchez, David M. Lodge.
This is an open access article distributed under the terms of the CC0 Public Domain Dedication.
Citation:
Deiner K, Lopez J, Bourne S, Holman LE, Seymour M, Grey EK, Lacoursière-Roussel A, Li Y, Renshaw MA, Pfrender ME, Rius M, Bernatchez L, Lodge DM (2018) Optimising the detection of marine taxonomic richness using environmental DNA metabarcoding: the effects of filter material, pore size and extraction method. Metabarcoding and Metagenomics 2: e28963. https://doi.org/10.3897/mbmg.2.28963
The analysis of environmental DNA (eDNA) using metabarcoding has increased in use as a method for tracking biodiversity of ecosystems. Little is known about eDNA in marine human-modified environments, such as commercial ports, which are key sites to monitor for anthropogenic impacts on coastal ecosystems. To optimise an eDNA metabarcoding protocol in these environments, seawater samples were collected in a commercial port and methodologies for concentrating and purifying eDNA were tested for their effect on eukaryotic DNA yield and subsequent richness of Operational Taxonomic Units (OTUs). Different filter materials [Cellulose Nitrate (CN) and Glass Fibre (GF)], with different pore sizes (0.5 µm, 0.7 µm and 1.2 µm) and three previously published liquid phase extraction methods were tested. The number of eukaryotic OTUs detected differed by a factor of three amongst the method combinations. The combination of CN filters with phenol-chloroform-isoamyl alcohol extractions recovered a higher amount of eukaryotic DNA and OTUs compared to GF filters and the chloroform-isoamyl alcohol extraction method. Pore size was not independent of filter material but did affect the yield of eukaryotic DNA. For the OTUs assigned to a highly successful non-indigenous species, Styela clava, the two extraction methods with phenol significantly outperformed the extraction method without phenol; other experimental treatments did not contribute significantly to detection. These results highlight that careful consideration of methods is warranted because the choice of filter material and extraction method can create false negative detections of marine eukaryotic OTUs and lead to underestimates of taxonomic richness from environmental samples.
eDNA, 18S ribosomal, seawater, high-throughput-sequencing, metazoan eukaryotes, non-indigenous species
Global biodiversity is being redistributed, with negative consequences for humanity (
Human-modified coastal habitats are key sites for understanding the effects of anthropogenic activities, such as pollution and the introduction of non-indigenous species (NIS) (
To date, there is some guidance on suitable filters or DNA extraction techniques for eDNA studies of both fresh and marine environments. Specifically, research has begun to identify optimal methods for filtration and extraction of eDNA in freshwater systems based on qPCR detection of single species and high-throughput sequencing of eukaryotic communities (
Using seawater sampled from a commercial port, and a factorial experiment (Figure
Experimental design used to test for an effect of filter material (Cellulose Nitrate, CN; Glass Fibre, GF), pore size and extraction method (PCI-1, PCI-2 and CI, see methods section for their description) on eukaryotic DNA yield and the number of Operational Taxonomic Units (OTUs) estimated from seawater samples from a commercial port. Numbers in parentheses are the number of replicates, with the target number first followed by the final number included in statistical analyses. Colours indicate experimental treatments where filter material is in blue (GF) or red (CN) and extraction methods are in variations of blue or red depending on filter material.
A power analysis (
The power analysis calculates sample size (n) or the number of 100 ml experimental replicates, for each response variable; including filter material, pore size and extraction type. We performed the test assuming a 90% confidence that the average DNA concentration after amplification or count of OTUs deviates (d), with a confidence interval (CV) no more than 10% from the true mean (µ) post-amplification DNA concentration or count of OTUs. The true mean is assumed equivalent between replicates. We used the Critical Values calculated from the t distribution assuming a one-tailed test (α=0.05,
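The exact formula used for the power analysis is not reproduced in this section, so the following is only a minimal sketch of one common textbook form, n = (z · CV / d)², where CV is the coefficient of variation and d is the tolerated relative deviation from the mean. It is an assumption, not the authors' calculation: the function name and the example CV of 0.20 are hypothetical, and the one-tailed critical value is taken from the normal distribution as an approximation to the t distribution described above.

```python
import math
from statistics import NormalDist


def required_replicates(cv: float, d: float, alpha: float = 0.05) -> int:
    """Approximate number of replicates needed so that the sample mean
    lies within a relative margin d of the true mean.

    cv    -- assumed coefficient of variation (sd / mean), hypothetical input
    d     -- tolerated relative deviation from the true mean (0.10 = 10%)
    alpha -- one-tailed significance level
    """
    if cv <= 0 or d <= 0:
        raise ValueError("cv and d must be positive")
    # Normal approximation to the one-tailed t critical value (assumption).
    z = NormalDist().inv_cdf(1 - alpha)
    return math.ceil((z * cv / d) ** 2)


# Hypothetical example: 20% variation between replicates, 10% tolerated deviation.
print(required_replicates(cv=0.20, d=0.10))
```

With small sample sizes the t critical value is larger than the normal one, so this approximation would tend to give a lower bound on the number of replicates.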
The Port of Southampton in the United Kingdom is amongst the busiest and oldest commercial ports in Europe (
To allow for broad taxonomic richness in the water used for the experiment, thirteen 1 litre seawater samples were collected from Empress Dock at two sites: seven from the harbour master’s pontoon (50°53’24.7128”N, 1°23’47.2704”W) in the southwest corner and six from the National Oceanography Centre pontoon (50°53’28.2732”N, 1°23’38.0724”W) in the northeast corner. The 1 litre Nalgene™ plastic bottles were decontaminated with a 4% bleach solution (0.58% NaOCl) and rinsed using sterile Milli-Q water. Plastic bottles were then used to collect water from the top 0.5 m of the water column, first being rinsed out twice in the water and, finally, collecting the 1 litre water sample with the third fill. After collection, samples were immediately transported to the laboratory directly near the dock where they were refrigerated at 4 °C. Samples were refrigerated for 12.5 hours until filtration could begin.
Prior to filtration, the thirteen 1 litre seawater samples were poured into a single container. Experimental replicates were created by taking 100 ml volumes of seawater from the single container (Figure
After each filtration, the filter was folded and placed into a 2 ml microcentrifuge tube filled with 700 µl of Longmire’s tissue lysis buffer (
A rigorous cleaning protocol, before and during the procedure, ensured minimal cross-contamination between filters. All equipment was decontaminated before the procedure using a 4% bleach solution (0.58% NaOCl). The three-piece stainless-steel manifold and working area were first wiped down with the 4% bleach solution and then rinsed with distilled water and dried. New gloves and forceps were used for every experimental replicate and the decontamination procedure was repeated between replicates.
Experimental replicates were randomly assigned by filter type to an extraction method as indicated in Figure
To remove potential inhibitors, extracted DNA was treated with the OneStep™ PCR Inhibitor Removal Kit (Zymo Research, Irvine, California, USA). Each replicate was quantified with 2 µl of sample using Qubit dsDNA High-Sensitivity Kit and Qubit 2.0 Fluorometer (Life Technologies, Grand Island, NY).
Our 18S ribosomal RNA gene primer set contained two sequences: Illumina adapter sequence and gene-specific target sequence. The Illumina adapter sequence provided the requisite flanking region for subsequent dual-indexing PCR and sequencing platform compatibility. The gene-specific sequence targeted a 378 bp fragment of the V4 variable region of the Eukaryote 18S ribosomal RNA gene.
The library preparation workflow is illustrated in Figure
The Illumina adapter and dual-index barcodes (i7 and i5 index primers) were added to the PCR-1 products by high fidelity PCR using iProof PCR reagents (Bio Rad Laboratories, Hercules, CA) in a 0.2 ml PCR tube with individually attached cap. The 50 µl-reaction contained 5.0 µl of PCR-1 Template (regardless of concentration), 22.0 µl of PCR-grade water, 10.0 µl of 5× HiFi Buffer (Bio Rad Laboratories, Hercules, CA), 1.5 µl of 50 mM magnesium chloride (Bio Rad Laboratories, Hercules, CA), 1.0 µl of 10 mM dNTP mix, (G-Bioscience, St. Louis, MO), 5.0 µl of 10 µM i7 Index primer (Integrated DNA Technologies, Coralville, IA), 5.0 µl of 10 µM i5 Index primer (Integrated DNA Technologies, Coralville, IA) and 0.5 µl of 2 U/µl iProof HiFi DNA Polymerase (Bio Rad Laboratories, Hercules, CA). PCR reactions were cycled in a Master Pro S (Eppendorf, Westbury, NY) with the following conditions: initial denaturation at 98.0 °C for 2 minutes, followed by 8 cycles of denaturation at 98.0 °C for 10 seconds, annealing at 60.0 °C for 20 seconds and extension at 72.0 °C for 30 seconds, with a final extension at 72.0 °C for 10 minutes. Reactions were purified with a 0.8× bead:sample volume ratio (40.0 µl) of Agencourt Ampure XP Beads (Beckman Coulter, Webster, TX) and separated with a DynaMag-2 magnet (Life Technologies, Grand Island, NY). Ampure XP Beads were resuspended in 32.5 µl PCR-grade water to elute the dual-indexed library. After clearing the solution of Ampure XP Beads with the DynaMag-2 magnet, only 30.0 µl was recovered to avoid Ampure XP Bead carryover. Library concentration was determined with 2.0 µl of sample using Qubit dsDNA High-Sensitivity Kit and Qubit 2.0 Fluorometer (Life Technologies, Grand Island, NY).
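As a quick consistency check on the recipe above, the component volumes can be summed and converted to final concentrations. This is an illustrative sketch only: the volumes and stock concentrations are those listed in the text, while the dictionary and function names are hypothetical. The magnesium value is the amount added and ignores any magnesium already present in the buffer.

```python
# Component volumes in µl for one PCR-2 reaction, as listed in the text.
PCR2_VOLUMES_UL = {
    "PCR-1 template": 5.0,
    "PCR-grade water": 22.0,
    "5x HiFi buffer": 10.0,
    "50 mM MgCl2": 1.5,
    "10 mM dNTP mix": 1.0,
    "10 uM i7 index primer": 5.0,
    "10 uM i5 index primer": 5.0,
    "2 U/ul iProof polymerase": 0.5,
}


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Dilution formula C_final = C_stock * V_added / V_total (same units as stock)."""
    return stock * volume_ul / total_ul


total_ul = sum(PCR2_VOLUMES_UL.values())                  # total reaction volume
bead_volume_ul = 0.8 * total_ul                           # 0.8x bead:sample ratio
buffer_x = final_concentration(5.0, 10.0, total_ul)       # buffer strength (x)
added_mgcl2_mM = final_concentration(50.0, 1.5, total_ul)
dntp_mM = final_concentration(10.0, 1.0, total_ul)
primer_uM = final_concentration(10.0, 5.0, total_ul)      # each index primer
polymerase_units = 2.0 * 0.5                              # U per reaction

print(total_ul, bead_volume_ul, buffer_x, added_mgcl2_mM, dntp_mM, primer_uM, polymerase_units)
```

The listed volumes add up to the stated 50 µl reaction, and the 0.8× bead ratio corresponds to the 40.0 µl of beads given in the text.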
No-template controls were introduced at multiple steps in the workflow (DNA extraction, eukaryotic 18S ribosomal RNA gene PCR-1 and PCR-2). No-template controls followed the same procedure as experimental replicates. In place of extracted DNA, 5.0 µl of PCR-grade water was used as the template for PCR-1 and PCR-2. A total of ten no-template controls were generated to detect contamination during seawater sampling and preparation of libraries.
Due to the project’s scale, 31 of 129 dual-indexed libraries (~24%), derived from experimental replicates, were randomly chosen for Bioanalyzer analysis to control overall cost of validation. To verify the final library fragment size and diagnose primer contamination, 1.0 µl of dual-indexed library was analysed using Bioanalyzer 2100 instrument and Bioanalyzer DNA 7500 kit (Agilent Technologies, Santa Clara, CA). All no-template controls (1.0 µl) were analysed using Bioanalyzer DNA High-Sensitivity kit (Agilent Technologies, Santa Clara, CA).
Due to the number of samples in the study, the libraries were split across two MiSeq runs. To avoid a run effect, libraries were randomly assigned to one of two pools by sub-group (explained below), extraction method and filter material. All ten no-template control libraries were added to both pools to test for consistency between runs. As PCR-2 DNA concentrations from each filter material were the response variables, the volume of template added to the eukaryotic 18S ribosomal RNA gene PCR and dual-index PCR was held constant without normalising the template concentration as this would remove the effect of our experiment (Fig.
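The random assignment of libraries to two sequencing pools, balanced by sub-group, extraction method and filter material, could be sketched as a stratified shuffle. This is a hypothetical illustration, not the authors' procedure: the function name, the record keys (`id`, `sub_group`, `extraction`, `filter`) and the seed are all assumptions, and sub-groups are treated as opaque labels.

```python
import random
from collections import defaultdict


def assign_pools(libraries, seed=0):
    """Randomly split libraries between pool 1 and pool 2 within each stratum.

    libraries -- list of dicts with hypothetical keys
                 'id', 'sub_group', 'extraction' and 'filter'
    Returns a dict mapping library id -> pool number (1 or 2).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for lib in libraries:
        key = (lib["sub_group"], lib["extraction"], lib["filter"])
        strata[key].append(lib["id"])

    pools = {}
    for key in sorted(strata, key=str):
        ids = strata[key]
        rng.shuffle(ids)            # random order within the stratum
        offset = rng.randrange(2)   # random pool for the first library
        for i, lib_id in enumerate(ids):
            # Alternate pools so each stratum is split as evenly as possible.
            pools[lib_id] = 1 + (i + offset) % 2
    return pools
```

Alternating pools after shuffling keeps each treatment combination represented in both runs, so that a run effect is not confounded with a treatment effect.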
The Genomics Core Facility (University of Notre Dame, Notre Dame, IN) validated the final library pools using a combination of Qubit dsDNA High-Sensitivity Kit and Qubit 2.0 Fluorometer (Life Technologies, Grand Island, NY), Bioanalyzer 2100 instrument and Bioanalyzer DNA 7500 Kit or Bioanalyzer DNA High-Sensitivity Kit (Agilent Technologies, Santa Clara, CA) and Kapa Illumina Library Quantification qPCR assay (Kapa Biosystems, Wilmington, MA). Final molar concentration of the final library pool was based on the average of the values determined by Qubit and qPCR analysis. For each of the final library pools, a solution containing 6 pM denatured library and 3 pM denatured PhiX Control v3 (Illumina, Inc., San Diego, CA) was sequenced on Illumina MiSeq Sequencer operating MiSeq Control Software v2.5 with MiSeq flowcell (v3) and MiSeq Reagent Kit (v3). Sequencing format was 300 cycles for read 1, 8 cycles for index 1 read, 8 cycles for index 2 read and 300 cycles for read 2. Each library pool was run on a different MiSeq flow cell. MiSeq Reporter v2.5 Real Time Analysis (RTA) v1.18.54 performed base calling. Illumina Bcl2fastq v2.18 demultiplexed the RTA output and converted the data from bcl to fastq format. Finally, BaseSpace Broker v2.1 reported the files to BaseSpace Sequencing Hub.
Lastly, one set of samples was unfortunately mixed during the second stage of DNA extraction. While they were fully processed and contributed to the total raw reads observed, we removed them from all statistical analyses because the individual replicates could no longer be distinguished. This error reduced the number of experimental replicates for the treatment of cellulose nitrate filters with a pore size of 0.65 µm (CN 0.65 µm) extracted with the PCI-2 protocol from ten to four. We also lost one sample during DNA extraction from the treatment of glass fibre filters with a pore size of 0.7 µm (GF 0.7 µm) extracted with PCI-1. This replicate could not be processed for sequencing, which reduced the number of replicates for this treatment from ten to nine.
Sequencing adapters were removed and reads were quality filtered using Trimmomatic v0.32 (
To explore the effect of method choice on a globally relevant NIS, we determined the detection of Styela clava in different technical replicates based on species assignments of OTUs. Styela clava, a solitary ascidian, is native to the NW Pacific. Over the last 100 years, it has become common in fouling communities in harbours and ports across the globe, including Europe, Australasia and North America (
All statistical analyses were performed using the programme R, version 3.3.1.1 (R Core Team 2016). R scripts and required data files are provided on Figshare (https://doi.org/10.6084/m9.figshare.c.4191167.v1). Least squares regression models were used to test the additive effect of filter material (GF or CN), extraction type (CI, PCI-1 and PCI-2), pore size (0.5 μm, 0.7 μm and 1.2 μm) and the nested effect of filter material on pore size on PCR-2 DNA concentrations and OTU richness (both eukaryotic and metazoans only). As mentioned above, GF 0.7 µm and CN 0.65 µm were considered equivalent pore sizes for statistical analysis of the experiment. Richness of an experimental replicate was calculated by summing all OTUs with a single read or more. Due to the possible violation of homogeneity with pore size, resulting from the imbalance in factor levels (0.5 μm = 10, 0.7 μm = 53, 1.2 μm = 60), we also tested the models after removing the smallest pore size (0.5 μm) from the dataset. In addition to the two response variables of PCR-2 DNA concentrations and OTU richness, the variables of DNA concentration after PCR-1, the number of raw merged reads, number of merged reads after quality filtering, the number of OTUs clustered at 95% and the subset of metazoan OTUs clustered at 95% were tested for a correlation to establish an association between each step in the laboratory workflow and the outcome of methods’ choices. To evaluate the detectability of the NIS Styela clava across the experimental treatments, reads from the five OTUs assigned to Styela clava were pooled (Suppl. material
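The richness calculation described above (counting every OTU with at least one read) and the pooling of reads across the OTUs assigned to Styela clava can be illustrated with a short sketch. The authors' analysis was done in R and their scripts are on Figshare; this Python version is only an illustration. The function names and OTU identifiers are hypothetical, and treating one or more pooled reads as a positive detection is an assumption.

```python
def richness(otu_reads):
    """Number of OTUs with a single read or more in one experimental replicate.

    otu_reads -- dict mapping OTU identifier -> read count for the replicate
    """
    return sum(1 for reads in otu_reads.values() if reads >= 1)


def pooled_reads(otu_reads, target_otus):
    """Total reads across the OTUs assigned to a target species."""
    return sum(otu_reads.get(otu, 0) for otu in target_otus)


def is_detected(otu_reads, target_otus):
    """Assumed detection rule: at least one pooled read for the target species."""
    return pooled_reads(otu_reads, target_otus) >= 1


# Hypothetical replicate with made-up OTU identifiers and read counts.
replicate = {"OTU_1": 120, "OTU_2": 0, "OTU_3": 1, "OTU_4": 7}
target = ["OTU_3", "OTU_9"]  # stand-ins for OTUs assigned to the target species
print(richness(replicate), pooled_reads(replicate, target), is_detected(replicate, target))
```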
Raw fastq files have been deposited in the Short Read Archive (PRJNA395904). The two MiSeq runs produced a total of 45,600,412 raw reads for the experimental replicates and 41,522 reads for the no-template controls. After merging paired reads and quality filtering, a total of 16,144,713 reads remained for the experimental replicates and 837 reads for the no-template controls. Experimental replicates had an average of 125,153 ± 111,631 reads and no-template controls had an average of 42 ± 25 reads.
The number of unique reads summed across both runs was 996,242. After removal of singletons, 260,711 unique reads remained. Clustering at 97% and 95% resulted in 4,554 and 3,634 OTUs, respectively (Suppl. material
DNA concentration varied as a function of filter material, pore size and extraction method (Fig.
Observed DNA concentrations or number of Operational Taxonomic Units by experimental treatment. Observed DNA concentrations after DNA extraction (a), indexed PCR (PCR-2) (b) and the number of estimated OTUs per experimental treatment clustered at 97% for all eukaryotes (c) and metazoans (d). The upper and lower whiskers indicate the maximum and minimum point within 1.5 times the Interquartile Range extended from the 75th and 25th percentile, respectively. Colours indicate filter material where red is Cellulose Nitrate (CN) and blue is Glass Fibre (GF). The three extraction methods are abbreviated as in Figure
Statistical results from the least squares regression models with response variables of PCR-2 DNA concentrations, OTUs with 97% sequence similarity and Styela clava detection. Provided for each explanatory variable (extraction type, filter material, pore size and pore size nested in filter material) are the degrees of freedom (DF), sum of squares (Sum Sq), mean squares (Mean Sq), F value and p-value (P value).
Explanatory variable | DF | Sum Sq | Mean Sq | F value | P value
---|---|---|---|---|---
Response: PCR-2 DNA Concentration | | | | |
Extraction type | 2 | 17018.2 | 8509.1 | 111 | < 0.001
Pore size | 2 | 1231.1 | 615.5 | 8.06 | < 0.001
Filter material | 1 | 2883.3 | 2883.3 | 37.8 | < 0.001
Pore size / Filter material | 1 | 347.8 | 347.8 | 4.55 | 0.035
Residuals | 116 | 8859.8 | 76.4 | |
Response: OTU 97 | | | | |
Extraction type | 2 | 1980701 | 990351 | 55 | < 0.001
Pore size | 2 | 378264 | 189132 | 10.5 | < 0.001
Filter material | 1 | 1152720 | 1152720 | 64 | < 0.001
Pore size / Filter material | 1 | 66101 | 66101 | 3.67 | 0.058
Residuals | 116 | 2090261 | 18019 | |
Response: Styela clava detection | | | | |
Extraction type | 2 | 4.148 | 2.074 | 10.595 | < 0.001
Pore size | 1 | 0.177 | 0.177 | 0.905 | 0.343
Filter material | 1 | 0.587 | 0.587 | 2.998 | 0.086
Pore size / Filter material | 1 | 0.007 | 0.007 | 0.033 | 0.855
Residuals | 123 | 24.074 | 0.196 | |
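For readers less familiar with ANOVA tables, the F values above follow from the other columns: each mean square is the sum of squares divided by its degrees of freedom, and each F value is the term's mean square divided by the residual mean square. The sketch below recomputes this for the PCR-2 DNA concentration model using the values reported in the table; the variable and function names are hypothetical and the code is only a worked illustration, not the authors' R script.

```python
# (degrees of freedom, sum of squares) for the PCR-2 DNA concentration model,
# copied from the table above.
TERMS = {
    "Extraction type": (2, 17018.2),
    "Pore size": (2, 1231.1),
    "Filter material": (1, 2883.3),
    "Pore size / Filter material": (1, 347.8),
}
RESIDUALS = (116, 8859.8)


def f_values(terms, residuals):
    """Return {term: F} where F = (SS_term / DF_term) / (SS_resid / DF_resid)."""
    resid_df, resid_ss = residuals
    resid_ms = resid_ss / resid_df  # residual mean square
    return {name: (ss / df) / resid_ms for name, (df, ss) in terms.items()}


for name, f in f_values(TERMS, RESIDUALS).items():
    print(f"{name}: F = {f:.2f}")
```

The recomputed ratios agree with the reported F values to within rounding.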
OTU detection for eukaryotes and the subset of metazoan eukaryotes varied with method choice (Figure
Based on the taxonomic assignment of OTUs, Styela clava was detected in 45 of 73 (62%) PCI replicates compared to 10 of 50 (20%) CI extraction replicates (Suppl. material
Observed DNA concentrations or number of Operational Taxonomic Units by experimental factor. Mean PCR-2 DNA concentrations (indexed libraries) (y-axis) for each set of explanatory variables including extraction type (A), filter pore size (B) and filter material (C). Extraction types include chloroform (CI), phenol chloroform 1 (PCI-1) and phenol chloroform 2 (PCI-2) extractions. Filter materials included Cellulose Nitrate (CN) and Glass Fibre (GF). The upper and lower whiskers indicate the maximum and minimum point within 1.5 times the Interquartile Range extended from the 75th and 25th percentile, respectively.
Pairwise correlations between non-independent response variables corresponding to quantified values taken during the analyses. Lower triangle consists of scatterplots with individual sample comparisons shown. The upper triangle shows the R correlation value corresponding with the lower triangle. The diagonal shows the distribution curve for the variable indicated in the corresponding column.
Detection rate of Styela clava by extraction method. The proportion of positive detections for Styela clava across experimental treatments (left). Photographic image of a Styela clava individual from Chichester Harbour, United Kingdom (right). Colours indicate the three different extraction methods, which are abbreviated as in Figure
The findings of this eDNA metabarcoding methods experiment show that filter material, pore size and extraction method affect the yield of marine eukaryotic eDNA. Additionally, the filter material and extraction method, but not pore size, influence the estimated richness of OTUs detected from seawater using eDNA metabarcoding. Specifically, as many as three times the number of OTUs could be detected by adjusting the filter material and extraction method, indicating that some materials and methods combinations have high false negative detection rates. While no other studies have tested for the effect of all three variables tested here, these results corroborate experiments in freshwater systems where cellulose nitrate filters yielded a higher DNA concentration compared to other filter materials (
An important observation from this study is that all materials and methods used produced biodiversity information, even when DNA yields were below detection limits (e.g. Figure
The logical conclusion from first principles, which infers that false negative detections decrease as DNA yield from a sample increases, suggests that the DNA extraction is an obvious step in the process to optimise. The experimental test here demonstrated that the phenol-chloroform-isoamyl DNA extraction methods produced a greater yield of DNA compared to the chloroform-isoamyl extraction method. The two phenol-chloroform-isoamyl DNA extraction methods tested, which differ only by slight modifications, did not differ significantly from each other. Previous work in freshwater systems has also shown that a phenol-chloroform-isoamyl DNA extraction method resulted in a higher yield of DNA compared to DNeasy Animal Tissue kits (Qiagen) (
This study is the first to compare filter material and similar pore sizes between filter materials. However, including a statistical term where pore size was nested within filter material provided a better model fit to the data compared to a purely additive model. The non-independence of filter material with pore size and estimated richness is likely because filter pore sizes are not uniform in structure between different filter materials. In fact, glass fibre filters are described by the manufacturer as having a nominal pore size of 0.7 μm, but the size of any one pore can vary. For a micrograph of pore structures of different filter materials, see the Supplemental material from figure S2 of
The choices made about the amount of replication in eDNA methodology experiments require more attention. Replication amongst eDNA methodology studies varies, with some including only three replicates per treatment (
The current practice in eDNA metabarcoding studies is to normalise DNA concentration of prepared libraries from each sample before pooling and sequencing, with the intention of producing a dataset with equal sequencing depth per sample. The normalisation of read depth per library can also be achieved bioinformatically (
While the advice and methods for detecting false positives have matured and researchers implement these practices (
This study adds to the growing literature that false negative detections are a consequence of how the sample is processed in the laboratory. While methods will continue to be optimised and because false negative detections are often the result of a sampling problem, all steps where the sample of eDNA is biased can create false negative detections. Continued work is needed to identify the most crucial steps where this bias is introduced and future research on the laboratory methods should focus on optimisation of steps where the most gain in species detection is possible. These include, but are not limited to, the sequencing depth (
This research was funded by the NSF Coastal SEES grant #1427157 (to DML, EKG). SB and LEH were supported by Natural Environment Research Council Grant #NE/L002531/1. LB and ALR were supported by Polar Knowledge Canada # NST-1617-0016B,