Research Article
Corresponding author: A. Bruce Cahoon (abc6c@uvawise.edu)
Academic editor: Jan Pawlowski
© 2018 A. Bruce Cahoon, Ashley G. Huffman, Megan M. Krager, Roseanna M. Crowell.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Cahoon AB, Huffman AG, Krager MM, Crowell RM (2018) A meta-barcoding census of freshwater planktonic protists in Appalachia – Natural Tunnel State Park, Virginia, USA. Metabarcoding and Metagenomics 2: e26939. https://doi.org/10.3897/mbmg.2.26939
The purpose of our study was to survey the freshwater planktonic protists within an inland natural preserve in the Ridge and Valley physiographic province of the Appalachian Region using metabarcoding. Microbial eukaryotes are essential primary producers and predators in small freshwater ecosystems, yet they are often overlooked due to the difficulty of identification. This has been remedied, in part, by the falling cost of high-throughput DNA sequencing and the growth of barcode databases, making the identification and analysis of microorganisms by way of metabarcoding surveys in complex ecosystems increasingly feasible. Water samples were collected from five sites at the Natural Tunnel State Park in Scott County, VA (USA), representing three common types of water bodies found in this region. Samples were initially collected during a Bioblitz event in April 2016 and then seven and fourteen weeks afterwards. Metabarcode analysis of the 23S and 18S genes identified 3663 OTUs representing 213 family level and 332 genus level taxa. This study provides an initial barcode census within a region that has a reputation as a temperate biodiversity “hotspot”. The overall protist diversity was high and comparable to that of other temperate systems, but not unusually high; the microalgal diversity, however, was higher than that reported for other temperate regions. The three types of water bodies had their own distinctive protist biomes despite close proximity.
Metabarcoding, microbial eukaryotes, algae, Bioblitz
The region of the Cumberland Mountains and Ridge and Valley physiographic provinces of Appalachia, located at the intersection of Virginia, Kentucky, Tennessee and North Carolina, is considered one of the most biodiverse temperate regions in North America and is home to a large number of endemic freshwater macro-invertebrates (
Morphological surveys have uncovered an astounding inventory of freshwater protists (
This paper is, to our knowledge, one of the few surveys of microbial eukaryotes undertaken in the Appalachian region of North America and the first application of eukaryotic microbiome metabarcoding in a Bioblitz event. A Bioblitz uses a combination of experts and citizen scientist volunteers to survey a natural area for pre-defined groups of organisms over the course of 24 hours. The survey presented in this paper provides an inventory of 3663 OTUs representing 213 family level and 332 genus level taxa from five sample sites within a state park. These data were used to test two hypotheses: (i) overall diversity would be high, perhaps higher than in comparable temperate regions, and (ii) the five sample sites would have their own distinct protist profiles.
Water samples were collected at five sites in the Natural Tunnel State Park (Scott County, VA, USA). These sites were chosen because they represent the most abundant types of natural water bodies found in this physiographic region:
1 – a lentic ephemeral pool in an abandoned Quarry. The floor of the quarry is relatively flat and occupies an area of 13,500 m² (estimate calculated using Google Earth Pro, https://www.google.com/earth/desktop/). The pool was estimated to be ~400–500 m² and depth was ~0.5 m at its deepest point.
2 – Stock Creek North (SC-N). Stock Creek is a stony stream running through agricultural areas and lined with riparian vegetation prior to entering the park, where it is surrounded by deciduous forest. It is responsible for carving the tunnel from which the park derives its name. SC-N samples were taken from a lotic area with no canopy cover ~750 m upstream of the North Portal of the tunnel. At this site, the creek is 9–10 m wide and 1–1.5 m in depth. The broad flat bank on the eastern side of this sample site is a popular destination for anglers.
3 – Stock Creek South (SC-S), ~400 m downstream of the south portal of the tunnel. This site is shallow (≤1 m in depth) and 9–10 m wide with many riffle pools and extensive deciduous canopy cover.
4 – a stream feeding into Stock Creek with a small Waterfall ~3 m in height. The stream has extensive deciduous canopy cover, is ~1–2 m wide and very shallow (5–10 cm deep) and runs down a rocky slope to where it enters Stock Creek. Samples were collected from the waterfall.
5 – a different stream, with extensive deciduous canopy cover, feeding into Stock Creek near an abandoned settlement with a derelict Waterwheel. This stream was 1–2 m wide and very shallow (5–10 cm depth).
The Waterfall and Waterwheel streams are intermittent and fed by a combination of artesian sources and rainfall. They were free-flowing during our sampling period but they can dry up during periods of low rainfall.
Photographs of each site are located in Suppl. material
The 18S V4 region barcode identified many more algal OTUs than 23S but the two barcodes lacked concordance. The number of identifiable OTUs (identifiable to the genus level) and sequence abundance of the 18S and 23S barcode sequences were compared. Algal estimates include genera within the phyla Chlorophyta, Glaucophyta, Rhodophyta, Charophyta, Cryptophyta, Haptophyta, Ochrophyta, Bacillariophyta and Dinoflagellata. Heterotroph estimates include genera within the phyla Apicomplexa, Intramacronucleata, Postciliodesmatophora, Cercozoa and Oomycota. The y-axis is the number of genera represented by each circle and the circle diameter represents the number of sequence reads detected for those genera. Green circles represent algal genera detected using 18S, blue represent algal genera detected using 23S and maroon represent heterotrophic genera detected using 18S. The ‘all sites’ graph represents all genera and sequence reads from the entire study.
This sampling and identification project was included in the Natural Tunnel Park Bioblitz held on 22 April 2016. During the event, the authors collected water samples from pre-selected sites within the park and then engaged citizen scientist volunteers in morphological identifications using microscopy. The three water samples collected from each site were combined and cells were harvested by filtration on to 0.45 micron membranes using a disposable microfunnel with filter (Daigger & Co., Vernon Hills, IL, USA) and an electric vacuum pump (Welch, Mt. Prospect, IL, USA, Model 2534B-01A). This pore size was chosen since it should capture all but the smallest prokaryotes and picoplankton. All samples collected on 22 April were filtered on site and the filters maintained on ice for 4–6 hours before eventual storage at -20 °C. It has been observed that the protist community is highly dynamic with a large seasonal variation (
DNA was extracted from cells collected on the filters using Qiagen’s Power Water kit (Germantown, MD, USA). DNA was quantified using a Nanodrop Lite apparatus (ThermoFisher Scientific, Waltham, MA, USA).
The intention of this study was to sample as broadly as possible so primers were chosen that would theoretically identify the highest number of taxa. The ~500bp V4 region of the nuclear 18S rRNA, considered a universal protist barcode (
Amplicons were produced from the environmentally derived DNA preparations using the primers listed in Suppl. material
Sequence information was delivered from Genewiz as de-multiplexed paired-end files from each sample site. Sequences were paired, sorted and Operational Taxonomic Units (OTUs) identified using a protocol developed for metabarcoding analysis using the commercial software Geneious (BioMatters Ltd, Auckland, New Zealand, http://www.geneious.com/tutorials/metagenomic-analysis, accessed December 2016). A total of 4,500,654 paired reads were produced using the three primer sets (Table
A summary of sequence reads, OTUs and taxon totals for each collection site and date and each barcode used in this study.
Collection Site and Date | Total Number of Paired Reads | 18S: Sequences with a 100% Primer Match | 18S: Sequences with a GenBank Match | 18S: Estimate of Unique OTUs via de novo Assembly* | 18S: Identifiable Families | 18S: Identifiable Genera | 23S: Sequences with a 100% Primer Match | 23S: Sequences with a GenBank Match | 23S: Estimate of Unique OTUs via de novo Assembly* | 23S: Identifiable Algal Families | 23S: Identifiable Algal Genera |
---|---|---|---|---|---|---|---|---|---|---|---|
Quarry – Week 0 | 380,808 | 10361 | 9923 | 2018 | 68 | 110 | 1317 | 995 | 309 | 16 | 18 |
Quarry – Week 7 | 148,452 | 6231 | 6144 | 1030 | 42 | 55 | 433 | 427 | 433 | 13 | 13 |
Quarry – Week 14 | 103,166 | 1160 | 1139 | 573 | 49 | 65 | 516 | 513 | 138 | 15 | 25 |
Stock Creek-North – Week 0 | 272,112 | 9293 | 8796 | 2313 | 60 | 95 | 821 | 725 | 130 | 16 | 18 |
Stock Creek-North – Week 7 | 477,600 | 5529 | 5264 | 2358 | 86 | 136 | 292 | 227 | 367 | 28 | 30 |
Stock Creek-North – Week 14 | 176,668 | 3370 | 3319 | 1200 | 52 | 75 | 229 | 216 | 72 | 12 | 12 |
Stock Creek-South – Week 0 | 238,990 | 4625 | 4257 | 1579 | 64 | 96 | 313 | 249 | 151 | 16 | 20 |
Stock Creek-South – Week 7 | 356,404 | 6946 | 6622 | 2675 | 83 | 129 | 335 | 294 | 146 | 18 | 23 |
Stock Creek-South – Week 14 | 312,426 | 7548 | 7396 | 2595 | 64 | 91 | 357 | 345 | 162 | 24 | 27 |
Waterfall – Week 0 | 298,584 | 5689 | 5396 | 1299 | 53 | 71 | 339 | 267 | 111 | 11 | 11 |
Waterfall – Week 7 | 254,362 | 4857 | 4607 | 1733 | 49 | 72 | 705 | 651 | 363 | 26 | 24 |
Waterfall – Week 14 | 382,956 | 5038 | 4769 | 1839 | 50 | 70 | 494 | 464 | 182 | 22 | 26 |
Waterwheel – Week 0 | 496,364 | 5225 | 4864 | 1667 | 65 | 102 | 1496 | 826 | 714 | 21 | 25 |
Waterwheel – Week 7 | 127,154 | 3176 | 3008 | 1363 | 47 | 63 | 125 | 121 | 70 | 10 | 9 |
Waterwheel – Week 14 | 474,608 | 5282 | 4970 | 2323 | 60 | 93 | 362 | 310 | 150 | 19 | 22 |
Column total [Σ] | 4,500,654 | 84,330 | 80,474 | 26,565 | 892 | 1,323 | 8,134 | 6,630 | 3,498 | 267 | 303 |
Column average [μ] ± SD | 300043.6 ± 128373.8 | 5622 ± 2313.6 | 5364.9 ± 2220.6 | 1771 ± 612.8 | 59.5 ± 12.7 | 88.2 ± 24 | 542.3 ± 393.5 | 442 ± 253.5 | 233.2 ± 174.8 | 17.8 ± 5.5 | 20.2 ± 6.5 |
non-redundant total | 3552 | 212 | 198 | 343 | 48 | 75 |
To estimate whether the number of sequences identified at each site captured the highest number of taxa possible, rarefaction was calculated and a graph generated using R software (
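As an illustration only, the idea behind a rarefaction curve can be sketched in a few lines of Python. This is not the R workflow used in the study: the function name `rarefied_richness` and the read counts are hypothetical, and the sketch assumes the standard expectation for sampling reads without replacement.

```python
from math import comb

def rarefied_richness(counts, n):
    """Expected number of taxa in a random subsample of n reads,
    drawn without replacement from the observed per-taxon read counts."""
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("subsample size must be between 0 and the total read count")
    denom = comb(total, n)
    # Each taxon contributes the probability that at least one of its reads is drawn.
    return sum(1 - comb(total - c, n) / denom for c in counts if c > 0)

# Hypothetical read counts per genus for one sample (not data from this study).
example_counts = [120, 45, 30, 3, 1, 1]

# Points of the curve: expected richness at increasing subsample sizes.
curve = [(n, rarefied_richness(example_counts, n))
         for n in range(0, sum(example_counts) + 1, 50)]
```

A curve that flattens before reaching the full read count suggests that additional sequencing would add few new taxa; a curve that is still rising suggests undersampling.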
The Simpson’s diversity index (1-D) was calculated according to
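For readers unfamiliar with these indices, the following Python sketch shows one common way to compute them from read counts. It assumes the form 1 − D with D = Σ pᵢ², which may differ from the exact variant used in the study; the Shannon-Wiener index mentioned in the Discussion is included for comparison. The function names are illustrative.

```python
from math import log

def simpson_index(counts):
    """Simpson's diversity index expressed as 1 - D, assuming D = sum(p_i ** 2)."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts if c > 0)

def shannon_index(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln(p_i))."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)

# Hypothetical read counts for four genera in one sample (not data from this study).
sample = [500, 300, 150, 50]
print(simpson_index(sample), shannon_index(sample))
```

Both indices rise as reads are spread more evenly across more taxa; 1 − D is bounded by 1, whereas H' is not.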
Multi-dimensional analyses were performed as Principal Coordinates Analyses (PCO) with scatter plots to compare the taxonomic compositions of each sample site. These were generated using genera and sequence reads with the software package PRIMER-E with the PERMANOVA add-on (
OTU richness, as defined by sequence diversity for each sample, ranged from 573–2675 (μ=1771±612.8) for 18S and 70–714 (μ=233±174.8) for 23S (Table
The OTUs found in the sampled waterways were distributed amongst 19 phyla (Figure
The major protistan phyla detected at the Natural Tunnel State Park during the spring-summer of 2016. Each phylum is represented by a circle. The x-axis represents the number of genera identified within each phylum and raw sequence abundance is represented by circle diameter. The number of genera (top number) and raw sequence abundance (bottom number) accompanies each circle. Phyla are organised into the Archaeplastida, Stramenopiles, Alveolata and Rhizaria (Rhi.) super groups according to
Alpha-diversity was estimated and rarefaction curves were produced to determine the success of the sampling strategy (Suppl. material
The relative abundance of each genus was calculated to arbitrarily rank read counts as “abundant” (>1% of total reads), “common” (0.95<>0.45%), “uncommon” (0.94<>0.1%) or “rare” (<0.1%) (Figure
The majority of genera identified in this study were part of the rare microbiome. Rank abundance distribution of genera identified from all sites and times. All genera (x-axis) are arranged according to their relative abundance (percentage) compared to all the identifiable reads (Percent Abundance, y-axis). The vertical white lines divide the genera into four arbitrary abundance categories: “abundant” (>1%), “common” (0.95<>0.45%), “uncommon” (0.94<>0.1%) or “rare” (<0.1%). The numbers associated with each category are the number of genera found in each.
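The four abundance categories can be sketched as a simple classifier. Because the printed ranges do not meet exactly at their edges, this sketch assumes cut-offs at 1%, 0.45% and 0.1%; the function names and genus labels are hypothetical.

```python
def abundance_category(percent):
    """Assign a relative abundance (in percent) to one of four categories.
    The cut-offs at 1, 0.45 and 0.1 are an assumption based on the stated ranges."""
    if percent > 1.0:
        return "abundant"
    if percent >= 0.45:
        return "common"
    if percent >= 0.1:
        return "uncommon"
    return "rare"

def categorise(read_counts):
    """Map each genus to its category from raw read counts."""
    total = sum(read_counts.values())
    return {genus: abundance_category(100 * n / total)
            for genus, n in read_counts.items()}

# Hypothetical read counts (not data from this study).
example = {"GenusA": 9000, "GenusB": 60, "GenusC": 20, "GenusD": 5}
print(categorise(example))
```

Sorting the genera by their percentage before plotting gives the rank abundance distribution described in the caption.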
The distribution of phyla amongst the five sites is represented in Figure
The majority of identifiable genera were found at multiple locations in the park but their relative abundance varied. (A) The relative distribution of the phyla at each sample site. (B) Three Venn diagrams were prepared to show the distribution of genera amongst the sample sites, divided into groupings of ‘all’, ‘photosynthetic’ and ‘heterotrophic’. See Figure
The co-incidence of all genera regardless of abundance was compared between sites (beta-diversity) using Venn diagrams. A comparison of all protist genera shows that 110 (33%) were found at all five sample sites (Figure
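The presence/absence comparison behind a Venn diagram reduces to set operations. The sketch below is illustrative only: the site lists are hypothetical and `shared_genera` is not a function from any package used in the study.

```python
def shared_genera(site_genera):
    """Return the genera found at every site, the genera found at any site
    and the percentage of all genera that are shared by every site."""
    sets = [set(genera) for genera in site_genera.values()]
    core = set.intersection(*sets)
    everything = set().union(*sets)
    return core, everything, 100 * len(core) / len(everything)

# Hypothetical genus lists for three sites (not data from this study).
sites = {
    "Quarry": {"g1", "g2", "g3"},
    "SC-N": {"g1", "g2", "g4"},
    "SC-S": {"g1", "g2", "g4", "g5"},
}
core, everything, percent_shared = shared_genera(sites)
print(sorted(core), percent_shared)
```

Applied to the reported counts, 110 shared genera out of the 332 identified genera gives 100 × 110 / 332 ≈ 33%, consistent with the percentage quoted above.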
Principal coordinates analyses of genera from the five collection sites and the three ecosystems were also completed. Three distinct profiles were obvious: the Stock Creek sample sites formed a group, the waterfall and waterwheel sites formed a second group and the quarry did not associate with either (Fig.
Each sample site has a recognisable protist profile primarily defined by the photosynthetic genera. Principal Coordinates Analysis (PCO) was completed to compare the normalised eukaryotic microbiomes of the five sample sites. A – all protist genera collected from the five sample sites over the course of the sampling period. B – Photosynthetic protist genera. C – Heterotrophic protist genera. See Fig.
The use of environmental DNA metabarcoding to identify diversity within eukaryotic microbiomes is a rapidly maturing field: the necessary technologies are becoming more affordable and the bioinformatics pipelines more accessible. In this study, we report its use during a Bioblitz event to describe the diversity of planktonic protists in a state preserve located in the Ridge and Valley physiographic province of the Appalachian region of the United States, an area with reported high biodiversity but with few published surveys of microscopic eukaryotes. Over the course of this study we tested two hypotheses.
Hypothesis 1: The overall protist diversity would be high; perhaps higher than in comparable temperate regions.
We found 3663 OTUs and over 95% of these sequences were identifiable and could be placed in 220 photoautotrophic and 112 heterotrophic genera. It is unknown how many potential taxa are represented by the remaining unidentifiable sequences but, if we consider that 332 identifiable taxa are represented within 84,330 sequences, then that produces an average of 254 sequences per taxon and allows a rough estimate of 15 unidentifiable taxa in the 3663 sequences with no match. The small proportion of sequences with no GenBank match is encouraging as it suggests the 18S microbial eukaryotic barcode database may be reaching saturation.
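The rough estimate above can be reproduced as a short worked calculation. The figures are those quoted in the paragraph; the variable and function names are illustrative, and rounding up to a whole taxon is an assumption about how the value of 15 was reached.

```python
from math import ceil

def estimate_unidentified_taxa(identified_reads, identified_taxa, unmatched_reads):
    """Scale the unmatched reads by the average number of reads per identified taxon."""
    reads_per_taxon = identified_reads / identified_taxa
    return reads_per_taxon, unmatched_reads / reads_per_taxon

# Values quoted in the text.
reads_per_taxon, raw_estimate = estimate_unidentified_taxa(84_330, 332, 3_663)

print(round(reads_per_taxon))   # average reads per identified taxon
print(round(raw_estimate, 1))   # unrounded estimate of unidentified taxa
print(ceil(raw_estimate))       # rounded up to a whole taxon
```

The calculation assumes that unidentified taxa are sequenced at the same average depth as identified ones; since most genera were rare, the true number could be higher.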
The most abundant taxa found in this study were common North American temperate freshwater protists, such as the green algae Scenedesmus and Desmodesmus (
There were also examples of unexpected taxa. Seventeen dinoflagellate genera (P: Miozoa) were detected in the lotic systems, which is a high number for a freshwater survey (
The overall distribution of phyla was very consistent with a metabarcoding survey of an inland temperate freshwater system in France (
This ‘high-protist biodiversity’ conclusion is confirmed by the Simpson and Shannon-Wiener biodiversity indices calculated for our survey (Suppl. material
We conclude that the diversity of this physiographic province can be considered high and may have a larger number of green algae than other comparable regions but the overall diversity is not unusually high compared to other temperate bodies of fresh water. This conclusion, however, must be tempered by several caveats. The first is the rare biosphere, which has become one of the hallmarks of metabarcoding studies and is defined as OTUs appearing at a very low relative abundance that tend to escape traditional morphological identification (
A second caveat is that the overall number of OTUs found during this census is very likely a gross underestimate. A challenge with environmental metabarcoding studies of this type is the choice of primer pairs and the barcode region, which can limit or bias the diversity of organisms observed. We chose three barcodes: the 18S V4 region amplified by ‘universal’ primers, 23S primers to identify photoautotrophic organisms and a third, diatom-specific 18S region. In our study, we found the ‘universal’ primers provided the greatest number of identifiable barcode sequences and, unexpectedly, identified a list of diatom taxa nearly identical to that obtained with the diatom-specific primers. Previous studies have demonstrated that the universal 18S V4 primer pair can severely underestimate the total number of taxa actually present.
A third caveat was our use of an alternative bioinformatics pipeline that used commercially available software and required building a custom searchable database from GenBank rather than using the curated databases PR2 (
Hypothesis 2: The five sample sites have distinct protist profiles.
Five sites were sampled, representing three common waterways found in this region (i) a perennial lotic creek/river, (ii) shallow lotic mountainside streams and (iii) a lentic ephemeral pool. Comparisons of genera from each site found that few taxa were unique to a single location (Figure
A second observation to stem from the PCO analyses was that the eukaryotic microbiome profiles are likely to be dominated by the autotrophic organisms. The photosynthetic protists were more abundant than heterotrophic ones in the Natural Tunnel environments with a ratio of approximately 6:1. This contradicts a previous molecular survey of freshwater ponds in France where the most abundant protists were planktonic ciliates (
The technique of metabarcoding, although imperfect, provides an attractive entry point for the study of microbial eukaryote biodiversity in small understudied freshwater ecosystems. The 213 families and 332 genera found in this study provide a base-line repository of protist information for a natural area in the Ridge and Valley province of the Appalachian region of North America, where protist biodiversity has not received much attention until now. Over the course of this study, we concluded that the region has a high number of green algae compared to other temperate regions, but not an unusually high overall protist biodiversity. We were also unable to reject the hypothesis that the three types of water bodies would have distinct protist biomes.
The authors wish to thank the Natural Tunnel State Park in Scott County, VA for generous access to the park and sampling privileges for the duration of this project. They also thank Bob VanGundy of UVa-Wise, Daniel Wolf of Ohio University and Karen Fawley and Marvin Fawley of the University of Arkansas at Monticello for critically reading the manuscript and providing invaluable insight and inspiration.
Figure 1. Sample collection sites
Figure 2. Rarefaction analysis estimates demonstrate that family and genus collections were approaching saturation
Table 1. All Detected Taxa
Table 2. Primers and PCR Conditions.
Table 3. Diversity Indices.
Table 4. The protist genera identified in Natural Tunnel State Park organised by read count.