Development of a new set of PCR primers for eDNA metabarcoding decapod crustaceans

The Decapoda is one of the largest orders within the class Malacostraca, comprising approximately 14,000 extant species and including many commercially important species. For biodiversity monitoring in a non-invasive manner, a new set of PCR primers was developed for metabarcoding environmental DNA (eDNA) from decapod crustaceans. The new primers (herein named “MiDeca”) were designed for two conservative regions of the mitochondrial 16S rRNA gene, which amplify a short, hyper-variable region (153–184 bp, 164 bp on average) with sufficient interspecific variations. With the use of MiDeca primers and tissue-derived DNA extracts, we successfully determined those sequences (154–189 bp) from 250 species, placed in 186 genera and 65 families across the suborder Dendrobranchiata and 10 of the 11 infraorders of the suborder Pleocyemata. We also preliminarily attempted eDNA metabarcoding from natural seawater collected at Banda, Tateyama, the Pacific coast of central Japan and detected 42 decapod species including 34 and 8 species with sequence identities of > 98% and 80–98%, respectively. The results suggest the usefulness of eDNA metabarcoding with MiDeca primers for biodiversity monitoring of the decapod species. It appears, however, that further optimisation of primer sequences would still be necessary to avoid possible PCR dropouts from eDNA extracts.


Introduction
Classical methods of biodiversity monitoring have been primarily based on the collection of specimens and subsequent morphology-based identification. Such biodiversity monitoring is costly and time-consuming and requires considerable expertise for various taxonomic groups. Recent technological developments in molecular ecology have provided a novel tool for species detection using DNA present in aquatic or terrestrial environments (environmental DNA or eDNA; Taberlet et al. 2012).
There are two major approaches to applying eDNA analysis: "eDNA barcoding", which aims at detecting a single species in the environment (species-specific approach); and "eDNA metabarcoding", which simultaneously detects multiple species from an environmental sample (multi-species approach). The latter approach has developed alongside rapid advances in high-throughput next-generation sequencing (NGS) (e.g. Taberlet et al. 2012, Thomsen et al. 2012, Miya et al. 2015, Valentini et al. 2016). Application of eDNA is now quite wide-ranging in studies of biodiversity, aquatic ecology and conservation biology (Bohmann et al. 2014, Díaz-Ferguson and Moyer 2014).
In particular, with regard to aquatic environments, the multi-species assessment and monitoring of the fauna using eDNA have focused mainly on vertebrates (e.g. Thomsen et al. 2012, Kelly et al. 2014, Miya et al. 2015, Andruszkiewicz et al. 2017, Ushio et al. 2017, 2018), which are known to release abundant DNA derived from faeces, body mucus, blood and sloughed tissue or scales (Bohmann et al. 2014). The applicability of eDNA metabarcoding to aquatic invertebrates, notably those with an exoskeleton (Crustacea), has received less attention (Thomsen et al. 2012, Rees et al. 2014). The vast majority of previous studies on crustaceans adopted a "species-specific" approach, detecting invasive species (e.g. Tréguier et al. 2014, Dougherty et al. 2016, Larson et al. 2017) or monitoring seasonal migrations of particular species (Wu et al. 2018).
The Decapoda is the largest order of the crustacean class Malacostraca (Arthropoda: Pancrustacea), comprising more than 14,000 extant species worldwide (De Grave et al. 2009, Ahyong et al. 2011), with continuing discovery of new species. The great majority of decapod crustaceans are marine, but other environments have also been colonised, such as lowland freshwaters, mountain rivers, estuaries and even land. Decapods are also highly diverse in ecology and include a number of commercially important species (such as shrimp, prawn, lobster, crayfish, king crab, snow crab etc.), attracting much scientific and economic interest.
The importance of marker selection in eDNA metabarcoding has recently been emphasised (Coissac et al. 2012, Deagle et al. 2014). As there is no ideal universal metabarcode (Riaz et al. 2011), marker selection could be specific to the target taxonomic group. The aim of this study was to develop new universal PCR primers for eDNA metabarcoding of decapod crustaceans, which enable detection of multiple species for biodiversity assessment and monitoring. The performance of newly developed primers (herein named "MiDeca") was tested using tissue-derived DNA extracts from 250 species and an eDNA sample from natural seawater collected from the near-shore environment.

Primer development
Mitochondrial rRNA genes have been recommended for identification of animal taxa because they have a similar taxonomic resolution to the COI marker and they present conserved regions that flank variable regions, which allows the design of primers with high resolving power for the target taxonomic group (Deagle et al. 2014). In order to identify a suitable region in the mitogenome for species identification based on eDNA, 267 whole mitogenome sequences of Decapoda registered in the databases were downloaded from NCBI (as of 17 October 2017).
Three requirements were considered for designing the new primers: 1) a target amplicon consisting of fewer than 200 bp is desirable because the eDNA will often be degraded; 2) the amplified regions include sufficient interspecific differences for all target species; and 3) conserved regions (20-30 bp) across all target species are located at both ends of the short hyper-variable region to simultaneously amplify the target sequences (Riaz et al. 2011, Miya et al. 2015, Valentini et al. 2016).
After removing problematic sequences (i.e. 18 sequences that could not be aligned with other sequences), the remaining 249 sequences from 203 species and five sequences of unidentified species of Cherax (Astacidea: Parastacidae) (Table 1) were subjected to multiple alignment using MUSCLE (Edgar 2004) implemented in MEGA7 (Kumar et al. 2016) with a default set of parameters. The aligned sequences were imported into MEGA7 for visual inspection of the conserved and hyper-variable regions. The visual search for a short hyper-variable region (up to 200 bp for paired-end sequencing using the Illumina MiSeq) flanked by two conservative regions (ca. 20-30 bp) was performed on the entire set of the two aligned 12S rRNA and 16S rRNA genes. Primers were designed using Primer3 (Rozen et al. 2000) accounting for G/C contents (40-60%) and melting temperature (Tm: 50-60 °C).
The new universal primers for decapod eDNA were designed on the 16S rRNA gene (for details, see Results and Discussion) and were named MiDeca-F/R (F and R represent forward and reverse, respectively).

In silico evaluation of variation in the target region
Interspecific differences within the amplified DNA sequences are required for accurate taxonomic assignments. To computationally evaluate levels of interspecific variation within the target region (hereafter called "MiDeca sequence") across different taxonomic groups of decapods, the 254 whole mitogenome sequences used for the primer development were subjected to calculation of pairwise edit distances at the intra-species, inter-species, inter-genus and inter-family levels. The edit distance is defined as the minimum number of single-nucleotide substitutions, insertions or deletions that are required to transform one sequence into the other (Jones and Pevzner 2004).
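The definition of the edit distance given above can be illustrated with a minimal Python sketch. It implements the standard dynamic-programming (Levenshtein) recurrence with unit costs for substitutions, insertions and deletions; the function name `edit_distance` is hypothetical and the code is not the software used in this study.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-character substitutions, insertions or
    deletions needed to transform sequence a into sequence b."""
    # prev[j] is the distance between the already processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, base_a in enumerate(a, start=1):
        curr = [i]  # transforming a[:i] into the empty string costs i deletions
        for j, base_b in enumerate(b, start=1):
            cost = 0 if base_a == base_b else 1
            curr.append(min(prev[j] + 1,          # delete base_a
                            curr[j - 1] + 1,      # insert base_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # One deletion separates the two toy sequences.
    print(edit_distance("ACGT", "AGT"))
```

Only two rows of the dynamic-programming table are kept in memory, which is sufficient for short amplicons of fewer than 200 bp.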

Test of primers with tissue extracted DNA
In order to examine the performance of MiDeca primers, DNA extracted from a single individual of each of 250 species across the suborder Dendrobranchiata and 10 of 11 infraorders of the suborder Pleocyemata was used to amplify MiDeca sequences (Table 2). Total genomic DNA was extracted from each tissue (thoracic or pleon muscle or pereopod or pleopod muscle), which was preserved in 70-99% ethanol for one to more than 10 years, using a DNeasy Blood & Tissue kit (Qiagen, Hilden, Germany) with an elution volume of 100 µl.
DNA concentrations were measured and recorded with a NanoDrop Lite spectrophotometer (Thermo Fisher Scientific, DE, USA). PCR was carried out with 30 cycles in an 8.0 µl reaction volume (divided from the original PCR mastermix) containing 2.2 µl sterile distilled water, 3.8 µl 2 × Gflex PCR buffer (Takara, Otsu, Japan), 0.4 µl of each primer (5 µM), 0.2 µl Tks Gflex DNA polymerase (Takara, Otsu, Japan) and 1.0 µl template. The thermal cycle profile after an initial 1 min denaturation at 94 °C was as follows: denaturation at 98 °C for 10 s, annealing at 50 °C for 10 s and extension at 68 °C for 10 s, with the final extension at the same temperature for 7 min. The PCR products were subjected to agarose gel electrophoresis using 2% L03 (Takara, Otsu, Japan) to confirm the amplifications. The PCR products were purified using ExoSAP-IT (USB, OH, USA) to remove excess dNTPs and primers.
Direct sequencing of the purified PCR products was performed with the ABI 3130xl Genetic Analyzer (Life Technologies, CA, USA) and dye-labelled terminators (BigDye terminator v. 1.1; Applied Biosystems, CA, USA) following the manufacturer's protocol. The DNA sequences were edited and assembled by GENETYX-MAC v. 17 (Genetyx, Tokyo, Japan) or MEGA7 and registered in the DDBJ/EMBL/NCBI database (Table 2).

Water sampling and filtration
In order to test the versatility of the newly designed primers (MiDeca-F/R), we used a filtered seawater sample collected at the rocky shore of Banda, Tateyama City, Chiba Prefecture (34.9758N, 139.7675E) on 14 September 2017 (Figs 1, 2), where the decapod crustacean fauna is well documented.
An on-site filtration method was employed to collect decapod eDNA. Disposable gloves were worn and changed between each sample. The sampling equipment (8 litre polyethylene bucket) was thoroughly decontaminated with a 10% bleach solution before use. Surface water was collected using 10 casts of the bucket fastened to a 15 m rope. In each cast, approximately 50 ml seawater was drawn into a disposable syringe with a Luer lock connector (50 ml; TERUMO, Tokyo, Japan), an inlet port of the 0.45 µm Sterivex filter cartridge (Merck Millipore, MA, USA) was attached to the syringe and the seawater was filtered on to the membrane by pushing the plunger. This step was repeated twice in a single cast and the final filtration volume reached 1000 ml with 10 casts of the bucket.
After the on-site filtration, an outlet port of the filter cartridge was sealed with Parafilm (LMS, Tokyo, Japan), RNAlater (1.6 ml; Thermo Fisher Scientific, DE, USA) was added into the cartridge using a disposable pipette to prevent eDNA degradation and an inlet port was sealed with the film. A filtration blank was made by filtering 500 ml of pure water in the same manner at the end of the water sampling. Filter cartridges were transported to the laboratory in a cooler with ice packs and then kept at -20 °C in the freezer prior to eDNA extractions.
eDNA was extracted from Sterivex cartridges using a DNeasy Blood & Tissue kit (Qiagen) following the method developed by Miya et al. (2016) with slight modifications. An inlet port of each Sterivex cartridge was connected with a 2.0 ml collection tube and the connection between the cartridge and collection tube was tightly sealed with Parafilm. The combined unit was inserted into a centrifuge adaptor for a 15 ml conical tube and was centrifuged at 6000 × g for 1 min to remove seawater and RNAlater for DNA extraction. In order to completely remove liquid remaining in the cartridge, an aspirator (QIAvac 24 Plus, Qiagen) was used. The Sterivex cartridge was subjected to lysis using proteinase K. Before the lysis, PBS (220 μl), proteinase K (20 μl) and buffer AL (200 μl) were mixed and the mixed solution was gently pipetted into the Sterivex cartridge from an inlet port. The Sterivex cartridge was again sealed and then incubated in a 56 °C preheated incubator for 20 min, using a rotator (Mini Rotator ACR-100, As One) with a rotation rate of 10 rpm. After the incubation, the Sterivex cartridge connected with a 2 ml tube (DNA LowBind tube, SARSTEDT), which was placed in a 50 ml conical tube, was centrifuged at 6000 × g for 1 min to collect the DNA. The collected DNA solution (ca. 900 μl) was purified using the DNeasy Blood & Tissue kit following the manufacturer's protocol.

Library preparation and MiSeq sequencing with eDNA sample
eDNA extracted from the seawater sample collected at Banda was subjected to the first-round PCR (1st PCR) and the second-round PCR (2nd PCR) in order to append amplified sequences with three kinds of adaptor sequences: 1) primer-binding sites for sequencing; 2) dual-index sequences to distinguish amplicons; and 3) sequences for binding to the flowcells of the Illumina MiSeq (Illumina, CA, USA).
The 1st PCR was carried out with 38 cycles in a 12 µl reaction volume containing 6.0 µl 2 × KAPA HiFi Hot-Start ReadyMix (KAPA Biosystems, MA, USA), 1.4 µl of each MiDeca primer (5 µM primer F/R), 1.2 µl sterile distilled H2O and 2.0 µl eDNA template. In order to minimise PCR dropouts, 8 replications were performed for the 1st PCR using a 0.2 ml 8-strip tube.
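As a bookkeeping aid, the reaction composition above can be checked and scaled to the 8 replicates with a short Python sketch. The per-reaction volumes are those stated in the text; the 10% pipetting overage, the dictionary keys and the function name `master_mix` are illustrative assumptions and do not describe how the mix was actually prepared in this study.

```python
from decimal import Decimal

# Per-reaction volumes in microlitres for the 1st PCR, as stated in the text.
FIRST_PCR = {
    "2x KAPA HiFi ReadyMix": Decimal("6.0"),
    "MiDeca-F (5 uM)": Decimal("1.4"),
    "MiDeca-R (5 uM)": Decimal("1.4"),
    "sterile distilled H2O": Decimal("1.2"),
    "eDNA template": Decimal("2.0"),
}


def master_mix(recipe, n_reactions, overage=Decimal("0.10"),
               template_key="eDNA template"):
    """Scale every non-template component to n_reactions plus an overage.

    The overage (10% by default) is an assumption made for this example.
    """
    factor = Decimal(n_reactions) * (1 + overage)
    return {name: volume * factor
            for name, volume in recipe.items()
            if name != template_key}


if __name__ == "__main__":
    print("per-reaction total:", sum(FIRST_PCR.values()), "ul")
    for name, volume in master_mix(FIRST_PCR, 8).items():
        print(f"{name}: {volume} ul")
```

Decimal arithmetic is used so that the component volumes sum exactly to the stated 12 µl rather than to a floating-point approximation.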
The thermal cycle profile after an initial 3 min denaturation at 95 °C was as follows: denaturation at 98 °C for 20 s, annealing at 60 °C for 15 s and extension at 72 °C for 15 s, with the final extension at the same temperature for 5 min. The 1st PCR products from the 8 tubes were pooled in a single 1.5 ml tube and the pooled products were purified using a GeneRead Size Selection kit (Qiagen, Hilden, Germany) in order to remove dimers and monomers following the manufacturer's protocol. Subsequently, the purified products were quantified using TapeStation 2200 (Agilent, Tokyo, Japan), diluted to 0.1 ng/µl using Milli Q water and the diluted products were used as a template for the 2nd PCR.
The 2nd PCR was conducted with 12 cycles in a 15 µl reaction volume containing 7.5 µl 2 × KAPA HiFi Hot-Start ReadyMix, 0.9 µl each primer (5 µM), 3.9 µl sterile distilled H2O and 1.9 µl template. The thermal cycle profile after an initial 3 min denaturation at 95 °C was as follows: denaturation at 98 °C for 20 s, annealing and extension combined at 72 °C (shuttle PCR) for 15 s, with the final extension at the same temperature for 5 min.
In order to monitor contamination during the process of PCRs, blank samples were prepared. During the 1st PCR, a filtration blank (FB), an extraction blank (EB) and a PCR blank (1B) with 2.0 µl Milli Q water instead of template eDNA were added; during the 2nd PCR, in addition to the three blanks used during the 1st PCR, one more PCR blank (2B) was added.
All the libraries containing the target region and the three adapter sequences were mixed in equal volume and the pooled libraries were size-selected at approximately 340 bp using a 2% E-Gel Size Select agarose gel (Invitrogen, CA, USA). The concentration of the size-selected libraries was measured using a Qubit dsDNA HS assay kit and a Qubit fluorometer (Life Technologies, CA, USA) and the libraries were sequenced on the MiSeq platform using a MiSeq v2 Reagent Kit for 2 × 150 bp PE (Illumina, CA, USA) following the manufacturer's protocol.

Data preprocessing and taxonomic assignment
All data preprocessing and analysis of MiSeq raw reads were performed using USEARCH v10.0.240 (Edgar 2010) according to the following steps.
1) Both forward and reverse reads were merged by aligning them using the fastq_mergepairs command. During this process, low-quality tail reads with a cut-off threshold set at a quality (Phred) score of 2, too short reads (< 64 bp) after tail trimming and those paired reads with too many differences (> 5 positions) in the aligned region (ca. 70 bp) were discarded; 2) primer sequences were removed from those merged reads using the fastx_truncate command; 3) those reads without the primer sequences underwent quality filtering using the fastq_filter command to remove low quality reads with an expected error rate of > 1% and too short reads of < 50 bp; 4) the preprocessed reads were dereplicated using the fastx_uniques command and all singletons, doubletons and tripletons were removed from the subsequent analysis following the recommendation by Edgar (2010); 5) the dereplicated reads were denoised using the unoise3 command and all putatively chimeric and erroneous sequences were excluded from the subsequent OTU assignment; 6) finally, all processed reads were assigned to OTUs with a sequence identity of > 98% (query coverage ≥ 90%, 2 or 3 nucleotide differences allowed) using the usearch_global command. Reads with a sequence identity of 80-98% were also assigned to "U98 OTU" and were subjected to clustering at the level of 0.98 using the cluster_smallmem command. All of these outputs were tabulated with read abundances.
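The filtering and dereplication criteria in steps 3 and 4 can be made concrete with a small Python sketch. It is not a reimplementation of USEARCH: it only illustrates the thresholds stated above, under the assumption that the "expected error rate" is the sum of per-base error probabilities (10^(-Q/10)) divided by the read length. All function names are hypothetical.

```python
from collections import Counter


def expected_error_rate(phred_scores):
    """Mean per-base error probability implied by a read's Phred scores."""
    if not phred_scores:
        raise ValueError("read has no quality scores")
    expected_errors = sum(10 ** (-q / 10) for q in phred_scores)
    return expected_errors / len(phred_scores)


def passes_filter(seq, phred_scores, max_rate=0.01, min_len=50):
    """Step 3: discard reads shorter than 50 bp or with an error rate above 1%."""
    return len(seq) >= min_len and expected_error_rate(phred_scores) <= max_rate


def dereplicate(seqs, min_size=4):
    """Step 4: collapse identical reads and drop singletons, doubletons
    and tripletons (i.e. keep unique sequences seen at least 4 times)."""
    counts = Counter(seqs)
    kept = [(seq, n) for seq, n in counts.items() if n >= min_size]
    # Most abundant first; ties broken alphabetically for a stable order.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    reads = ["ACGT"] * 5 + ["ACGA"] * 3 + ["TTGA"] * 4
    print(dereplicate(reads))
```

In this toy input the sequence observed three times is removed as a tripleton, while the two sequences observed four or more times are retained with their abundances.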
For reference sequences, decapod crustacean mitochondrial 16S rRNA gene sequences were downloaded from NCBI as of 23 June 2018 and MiDeca sequences were extracted using custom Perl scripts and used as the reference database during taxonomic assignment.In addition to those published sequences, we independently determined MiDeca sequences from 250 decapod crustaceans (Table 2) and added those sequences to the reference database (27,236 reference sequences in total as of 26 September 2018; those sequences are represented by 4005 decapod taxa identified to species from 1135 genera of 167 families of 11 infraorders/suborders).
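The extraction of MiDeca sequences from longer 16S rRNA gene sequences was done with custom Perl scripts that are not reproduced here. As a rough illustration of the idea only, the following Python sketch cuts out the region between the two primer-binding sites, using the primer sequences reported for MiDeca-F/R. Exact matching on the forward strand and the exclusion of the primers from the returned sequence are simplifying assumptions; a practical script would need to tolerate primer-template mismatches. The function names are hypothetical.

```python
FORWARD = "GGACGATAAGACCCTATAAA"  # MiDeca-F, 5'->3'
REVERSE = "ACGCTGTTATCCCTAAAGT"   # MiDeca-R, 5'->3'

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_amplicon(template, forward=FORWARD, reverse=REVERSE):
    """Return the sequence between the primer sites, or None if either
    primer site is not found by exact matching on the given strand."""
    template = template.upper()
    start = template.find(forward)
    if start == -1:
        return None
    insert_start = start + len(forward)
    # The reverse primer anneals to the opposite strand, so its site on
    # the given strand is its reverse complement.
    end = template.find(reverse_complement(reverse), insert_start)
    if end == -1:
        return None
    return template[insert_start:end]


if __name__ == "__main__":
    toy = "TTT" + FORWARD + "ACGTACGT" + reverse_complement(REVERSE) + "GGG"
    print(extract_amplicon(toy))
```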

MiDeca primers
By visual inspection of the two aligned mitochondrial rRNA genes from 254 sequences, two conservative regions (ca. 20 bp) that flank a hyper-variable region (154-184 bp with the exception of one extremely short sequence of 68 bp for a sergestid Acetes chinensis) were identified within the 16S rRNA gene. The new PCR primers for metabarcoding eDNA from decapods were designed on the basis of these two conservative regions and the primers were named MiDeca. The MiDeca-forward primer (MiDeca-F) comprised 5´-GGA CGA TAA GAC CCT ATA AA-3´ (20 mer), whereas the MiDeca-reverse primer (MiDeca-R) comprised 5´-ACG CTG TTA TCC CTA AAG T-3´ (19 mer). Melting temperatures and G/C contents of the MiDeca-F/R primers are 51.1 °C/51.6 °C and 40.0%/42.1%, respectively. Nucleotide variation in the primer region amongst the 207 species used to design MiDeca is summarised in Table 3.
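The reported G/C contents follow directly from the primer sequences, as the short Python sketch below shows. The function names and the check against the 40-60% design window are illustrative; the melting temperatures are not recomputed here because they depend on the thermodynamic model and salt conditions used by the design software.

```python
MIDECA_F = "GGACGATAAGACCCTATAAA"  # 20 mer
MIDECA_R = "ACGCTGTTATCCCTAAAGT"   # 19 mer


def gc_content(primer):
    """Percentage of G and C bases in a primer sequence."""
    primer = primer.upper()
    return 100.0 * sum(base in "GC" for base in primer) / len(primer)


def within_design_window(primer, low=40.0, high=60.0):
    """True if the G/C content lies in the 40-60% window used for design."""
    return low <= gc_content(primer) <= high


if __name__ == "__main__":
    for name, seq in (("MiDeca-F", MIDECA_F), ("MiDeca-R", MIDECA_R)):
        print(f"{name}: {len(seq)} mer, GC = {gc_content(seq):.1f}%")
```

Both primers contain eight G or C bases, giving 8/20 = 40.0% for MiDeca-F and 8/19 = 42.1% for MiDeca-R.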
With the use of tissue extracted DNA, MiDeca primers, without adapter sequences, were able to amplify each variable region of 250 decapod species from 10 suborders/infraorders (Table 2) and those nucleotide sequences were determined using the Sanger method. The lengths of MiDeca sequences vary from 148 bp to 189 bp (mean 165.7 bp). Those sequences were deposited in DDBJ/EMBL/GenBank databases (Table 2).

In silico evaluation of variation in MiDeca sequence
The pairwise edit distances from MiDeca sequences were calculated for 254 sequences distributed across 10 infraorders, 56 families, 123 genera and 207 species and the results were sorted into inter-family, inter-genus, inter-species and intra-species comparisons (Fig. 3). The median edit distances for these four categories were 41, 47, 35 and 4, respectively.
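One way to organise such a comparison is sketched below in Python. It assigns each pair of sequences to the highest taxonomic rank at which the two records differ and reports the median distance per category. This grouping rule, the record layout (dictionaries with `family`, `genus`, `species` and `seq` keys) and the function names are assumptions made for illustration; the distance function is passed in as a parameter so that any edit-distance implementation can be used.

```python
from itertools import combinations
from statistics import median


def comparison_level(a, b):
    """Classify a pair of records by the highest rank at which they differ."""
    if a["family"] != b["family"]:
        return "inter-family"
    if a["genus"] != b["genus"]:
        return "inter-genus"
    if a["species"] != b["species"]:
        return "inter-species"
    return "intra-species"


def median_distance_by_level(records, distance):
    """Median pairwise distance for each comparison level present in the data."""
    by_level = {}
    for a, b in combinations(records, 2):
        level = comparison_level(a, b)
        by_level.setdefault(level, []).append(distance(a["seq"], b["seq"]))
    return {level: median(values) for level, values in by_level.items()}


if __name__ == "__main__":
    # Toy records with equal-length sequences and a simple mismatch count.
    def mismatches(x, y):
        return sum(c1 != c2 for c1, c2 in zip(x, y))

    toy = [
        {"family": "A", "genus": "G1", "species": "G1 s1", "seq": "AAAA"},
        {"family": "A", "genus": "G1", "species": "G1 s1", "seq": "AAAT"},
        {"family": "A", "genus": "G1", "species": "G1 s2", "seq": "AATT"},
        {"family": "B", "genus": "G3", "species": "G3 s3", "seq": "TTTT"},
    ]
    print(median_distance_by_level(toy, mismatches))
```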

eDNA detection from natural seawater
The MiSeq paired-end sequencing yielded a total of 4,693,875 raw reads, with an average of 95.2% of base calls having Phred quality scores of 30 or more (Q30; error rate = 0.1% or base call accuracy = 99.9%). This run was highly successful considering that the quality scores specified by Illumina are more than 80% of bases higher than Q30 at 2 × 150 bp (Illumina Publication no. 770-2011-001 as of 27 May 2014).
Table 3. Nucleotide sequences of the universal primers (MiDeca) and base compositions in the selected 254 sequences (see Table 1). This forward (F) and reverse (R) primer pair amplifies the mid region of the mitochondrial 16S rRNA gene with a mean length of 164 bp (154-184 bp; except for one unusually short sequence from Acetes chinensis).
Of the 4,693,875 raw reads, our sample from Tateyama Bay comprised 185,690 raw reads and they were merged, quality-filtered, dereplicated and denoised, resulting in a total of 161,753 reads (87.1% of raw reads being retained). Preprocessed reads from the four blanks (FB, EB, 1B, 2B) were minor, comprising only 7-128 reads (0.004-0.07% of the reads from the seawater sample). We therefore considered that those reads from the four blanks were negligible and they were not used in the subsequent taxonomic assignment. The preprocessed reads from the Tateyama Bay sample were subjected to taxonomic assignment with the custom database. Finally, these reads were assigned to 35 crustacean species with the sequence identity of 98-100%, 10 crustacean species with the sequence identity of 80-98% (Table 4) and 69 no-hit taxa. Of the 35 species with the identity of 98-100%, 34 were decapods with the exception of the amphipod Caprella scaura. Of those 34 decapod species, the occurrence of the 31 species in the study site and its adjacent areas is confirmed by examination of the museum collection and recent field surveys (Table 5). Reads assigned to Lebbeus groenlandicus are considered to be cross-contamination apparently derived from previous experiments based on aquarium tank water (for details, see below). Sequences of the 69 no-hit taxa were subjected to BLAST search on the GenBank database. None of them was assigned to crustacean species. Of the 69 taxa, five were assigned to the following three known molluscan species with high sequence identity: Patelloida saccharina (Lesson 1831) (Gastropoda: Lottiidae; one haplotype, sequence identity 100%); Limnoperna fortunei (Dunker 1857) (Bivalvia: Mytilidae; three haplotypes, sequence identity 99-100%); Mytilus galloprovincialis Lamarck, 1819 (Bivalvia: Mytilidae; one haplotype, sequence identity 100%). One taxon was assigned to a bryozoan Beania klugei Cook, 1968 (Gymnolaemata: Cheilostomatida: Beaniidae) with low sequence identity (93%). One taxon was linked to two unidentified cyanobacteria taxa with low sequence identity, i.e. Synechococcus sp. WH 8109 (sequence identity 95.5%) or Synechococcus sp. CC9605 (sequence identity 94.2%).
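The three assignment categories used above (identity > 98%, identity 80-98% and no hit) can be expressed as a simple binning rule, sketched here in Python. The treatment of a hit at exactly 98% as belonging to the lower bin, the tuple layout of the input and the function names are assumptions made for illustration, and the toy taxa in the usage example are not data from this study.

```python
from collections import Counter


def identity_bin(identity):
    """Assign a best-hit identity (in percent, or None for no hit) to a bin."""
    if identity is None or identity < 80.0:
        return "no-hit"
    if identity > 98.0:
        return "OTU"      # assigned to a reference species
    return "U98 OTU"      # 80-98%: taxon linked to, but not matching, a reference


def tally(hits):
    """Count taxa and reads per bin.

    hits: iterable of (taxon_label, identity_percent_or_None, n_reads).
    """
    taxa, reads = Counter(), Counter()
    for _label, identity, n_reads in hits:
        category = identity_bin(identity)
        taxa[category] += 1
        reads[category] += n_reads
    return taxa, reads


if __name__ == "__main__":
    toy_hits = [("taxon 1", 100.0, 1200), ("taxon 2", 96.9, 85), ("taxon 3", None, 40)]
    taxa, reads = tally(toy_hits)
    print(dict(taxa), dict(reads))
```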

Usefulness of eDNA metabarcoding with MiDeca primers
It has been confirmed that the newly developed MiDeca primers are able to amplify the hyper-variable region of the mitochondrial 16S rRNA gene from the tissue-derived DNA extracts. We have successfully sequenced the target segment from 250 species from 186 genera and 65 families distributed across 10 suborders/infraorders. The edit distance between species was very high (Fig. 3), suggesting that the MiDeca sequence has sufficient interspecific variations for taxonomic assignment. A preliminary examination of eDNA from the natural seawater from Banda, Tateyama, Chiba Prefecture, detected 34 decapod species (sequence identity > 98%) (Table 4). In addition to those species, 10 unidentified species with lower identity (80-98%) were also detected (Table 4). There is little doubt that the eDNA metabarcoding with the MiDeca primers could provide information on the presence of particular decapod species without the requirement for capturing specimens or visual census.

Taxa detected from eDNA metabarcoding
It is remarkable that as many as 34 decapod species were detected from only one sample. Of the 34 species detected, the occurrence of 32 species in the study site and nearby areas was confirmed by examination of the museum collections and our field surveys (Tables 4 and 5). Although voucher specimens have not been collected, the occurrence of Metapenaeopsis lamellata is still likely. This penaeid species is sublittoral and nocturnal and, thus, collection of specimens at the study site is difficult. The detection of the sergestid shrimp Sergia lucens is remarkable, because it is a mesopelagic species, undergoing diurnal vertical migration along the continental shelf (Omori 1969). This species sometimes occurs in coastal areas in Boso Peninsula (TK, personal observation; voucher material CBM-ZC 2053) and, thus, its detection was due either to an accidental migration to the coastal water or to transport of the eDNA from the nearby oceanic water.
The detection of the macrophthalmid crab "Macrophthalmus boscii" needs explanation. Recent studies have shown that more than one species were confounded under the name M. boscii (cf. Naderloo and Türkay 2011, Teng et al. 2016) and M. boscii and allied taxa were transferred to the genus Chaenostoma. Teng et al. (2016) showed that two species, C. crassimanus Stimpson, 1858 and C. orientale, occur in the north-western Pacific, including Japan, with true C. boscii restricted to the western Indian Ocean. At the studied site, specimens of C. orientale have been collected and, thus, it is reasonable to consider that it was the eDNA of C. orientale that was detected and that the GenBank sequence identified as "Macrophthalmus" boscii has been misidentified.
On the other hand, the detection of the thorid shrimp Lebbeus groenlandicus is dubious, because it is a boreal, deep-sea benthic species that does not occur in or adjacent to the study area. However, we concurrently examined eDNA from tank water in Aquamarine Fukushima, where individuals of this species had been kept [for the species identity, see Komai (2015)]. This is, thus, the suspected source of "contamination". This again highlights the importance of safeguarding against cross-contamination during the eDNA metabarcoding (e.g. Bohmann et al. 2014).
Amongst the eight decapod taxa with a lower identity match (80-98%), eDNA assigned to the majid crab Micippa thalia (identity 96.9%) was also found. It is now known that the Japanese population that has been referred to as M. thalia in the literature is actually a separate species (P.K. L., personal communication). As such, it is not surprising that the detected sequence here does not fully match that of M. thalia registered in GenBank. Unfortunately, our attempts to sequence specimens identified as M. thalia deposited in the CBM collection were unsuccessful.
The species status of the other seven decapod taxa with lower identity (80-98%) cannot be determined with a high degree of confidence. As shown in Table 5, there are several local species for which MiDeca sequence data are still not available. On the other hand, it is highly likely that some species occur in the study area whose presence has not yet been confirmed. At the study site, the intertidal to subtidal area consists of fragile sandstone, which provides cryptic habitats for those decapod species. To reduce the number of unknown species, continuous efforts for the collection and accumulation of reference sequences are necessary.
It is fortuitous that eDNA of the other malacostracan taxa, including Amphipoda (a caprellid Caprella scaura and an unidentified taxon linked to the maerid Quadrimaera pacifica) and Euphausiacea (an unidentified taxon linked to euphausiid Euphausia similis), were also detected (Table 4). In addition to those, taxa other than Crustacea were also detected, as noted above. This suggests broad applicability of MiDeca metabarcoding to non-decapod Malacostraca and even to non-crustacean taxa with slight modifications of primer sequences.
A false negative (species is not detected where it is present), as well as a false positive (species is detected where it is absent), are important issues in eDNA metabarcoding, because they will cause under- or overestimation of species richness. In fact, considerable numbers of decapod species that are recorded from the study site and adjacent areas were not detected by the present metabarcoding exercise (Table 5). Two major factors could be considered for the false negatives: 1) no eDNA from those species was collected during sampling; 2) PCR amplification of eDNA from those species was not successful.
With regard to the first factor, water sampling at low tide, including various microhabitats, may be more effective for collecting more eDNA. With regard to the second factor, exploration of an optimal method that generates a greater species richness of MiDeca sequences to avoid PCR dropouts would be necessary. As PCR dropouts might be due to PCR bias derived from primer-template mismatches, an optimal number of PCR replicates and use of multiple annealing temperatures would be alternative approaches to comprehensively detect target eDNA. In fact, in a fungal metabarcoding study, pooling multiple repeated PCRs and using multiple annealing temperatures were recommended to facilitate the recovery of more accurate species richness (Schmidt et al. 2013). Furthermore, Majaneva et al. (2018) demonstrated that choice of the DNA extraction method affects DNA metabarcoding. Clarification of the optimal DNA extraction method for the target group might be advisable.
The other important point for accurate assessment of biodiversity is completeness of the reference sequence database, which is indispensable for satisfactory taxonomic assignments. Reference sequences, used in the present analyses, were primarily derived from the GenBank database with the addition of the sequence data generated by ourselves. The process of exploration of the GenBank decapod crustacean database highlighted the lack of sufficient sequence data. During this study, we have newly sequenced the target marker from 250 decapod species (including 4 unidentified species), but the number of species from Japanese waters for which sequences are currently available is still 1054, representing 36% of the decapod species presently known from the area (about 2,890 species placed in 950 genera distributed across 147 families). Furthermore, there are many species with uncertain taxonomy, including undescribed species and poorly defined species, for example, taxa in the snapping shrimp genus Alpheus (cf. Anker and De Grave 2016), swimming crabs of the genus Thalamita (cf. Spiridonov 2017) and several genera in the highly diverse crab family Pilumnidae (cf. Ng 1987). As with standard DNA barcoding, significantly more taxonomic work, including building of verified reference sequence databases, is necessary to optimise the effectiveness of eDNA approaches.

Data accessibility
MiDeca sequences from the 250 decapod crustaceans are available from DDBJ/EMBL/GenBank databases (Table 2). Raw reads from the MiSeq sequencing are available from the DDBJ Sequence Read Archive (DRA008193/DRR172676).

Figure 1. Map of Japan, showing the location of the sea water sampling site (Banda, Tateyama, Chiba Prefecture). This map is based on the aerial photography published by Geospatial Information Authority of Japan.

Figure 2. A, sampling site at Banda, Tateyama, Chiba Prefecture; B, sea water sampling operation with a bucket.

Figure 3. Summary of intraspecific, inter-species, inter-genus and inter-family edit distances of MiDeca sequences from 254 decapods used for the primer development.

Table 1. Mitogenome sequences used to design the MiDeca primers. Scientific names follow those as registered in GenBank database.

Table 2. A list of decapod species for testing MiDeca primers (without adapter sequences) using extracted DNA subsequently sequenced with a Sanger method. Institutional abbreviation: AMF, Aquamarine Fukushima, Iwaki, Japan.

Table 4. A list of malacostracan species detected from natural sea water sampled at Banda, Tateyama, Chiba Prefecture. Non-decapod taxa are marked in bold. Number of reads, degree of sequence identity, length of marker sequences and accession numbers are summarised.

Table 5. A list of 90 decapod species recorded from rocky intertidal to shallow subtidal zones (< 5 m) in Tateyama Bay (including Banda) and nearby areas.