Monitoring harmful microalgal species and their appearance in Tokyo Bay, Japan, using metabarcoding

During the recent decade, high-throughput sequencing (HTS) techniques, in particular, DNA metabarcoding, have facilitated increased detection of biodiversity, including harmful algal bloom (HAB) species. In this study, the presence of HAB species and their appearance patterns were investigated by employing molecular and light microscopy-based monitoring in Tokyo Bay, Japan. The potential co-appearance patterns between the HAB species, as well as with other eukaryotes and prokaryotes were investigated using correlation and association rule-based time-series analysis. In total, 40 unique HAB species were detected, including 12 tox-in-producing HAB species previously not reported from the area. More than half of the HAB species were present throughout the sampling season (summer to autumn) and no structuring or succession patterns associated with the environmental conditions could be detected. Statistically significant (p < 0.05, r S ranging from −0.88 to 0.90) associations were found amongst the HAB species and other eukaryotic and prokaryotic species, including genera containing growth-limiting bacteria. However, significant correlations between species differed amongst the years, indicating that variability in environmental conditions between the years may have a stronger influence on the microalgal community structure and interspecies interactions than the variability during the sampling season. The association rule-based time-series analysis allowed the detection of a previously reported negative relationship between Synechococcus sp. and Skeletonema sp. in nature. Overall, the results support the applicability of metabarcoding and HTS-based microalgae monitoring, as it facilitates more precise species identification compared to light microscopy, as well as provides input for investigating potential interactions amongst different species/groups through simultaneous detection of multiple species/genera.


Introduction
Microscopic algae are unicellular eukaryotic or prokaryotic organisms present in various aquatic environments. Microalgae may exist as individual cells or form chains or colonies. Some of them are important as primary producers, generating about 48% of the annual net primary production (Field et al. 1998). About 300 microalgal species are associated with harmful algal blooms (HABs; Hallegraeff 2004; Berdalet et al. 2016), a proliferation of microalgal cells, where cell abundances may reach up to millions of cells per litre (Zhang et al. 2012;Burson et al. 2014;Du et al. 2016;Viana et al. 2019). The main harmful effects are related to oxygen depletion, attenuation of light conditions, toxicity and clogging of fish gills (Hallegraeff 2004;Anderson 2009). HABs can also have a notable economic impact due to fish mortality and closure of fisheries, discolouration of seaweed (Pyropia yezoensis) cultures, negative influence on tourism, additional costs on monitoring and public health (Miyahara et al. 1996;Whyte et al. 2001;Imai et al. 2006;Lee et al. 2013;Yamaguchi et al. 2014;Clément et al. 2016;Du et al. 2016;Sanseverino et al. 2016). During recent decades, blooms, areas affected by HABs and distribution ranges of HAB species have expanded (Hallegraeff 2004;Anderson et al. 2012a;Kudela and Gobler 2012;Zhang et al. 2012;Koch et al. 2014;Shimada et al. 2016;Natsuike et al. 2019).
To prevent or minimise the adverse effects associated with HABs, information on HAB species distribution and factors supporting/hindering their growth, for example, environmental parameters and other organisms, is needed. Traditionally, microalgae, including the HAB species, have been identified and enumerated, based on their morphology using light microscopy (Nishikawa et al. 2010;Lefebvre et al. 2011;Lee et al. 2013;Yasakova 2013;Eriksen et al. 2019). However, the exact identification may be hampered by similar or variable morphology, small cell size as well as by the influence of fixatives on cells (Rhodes 1998;John et al. 2005;Zingone et al. 2006;Galluzzi et al. 2008;Gran-Stadniczeñko et al. 2017). Improved species detection can be achieved by employing high-throughput sequencing (HTS) methods, such as DNA metabarcoding. This approach allows amplifying the gene of interest from a mass collection of organisms or environmental DNA (eDNA) and thus target multiple species simultaneously (Moreno-Pino et al. 2018;Nagai 2018;Nagai et al. 2019). Metabarcoding has also been applied for investigating the diversity of microalgae, which has uncovered the presence of previously unknown HAB species/genera for the sampling area (e.g. Dzhembekova et al. 2017;Elferink et al. 2017;Nagai et al. 2017;Moreno-Pino et al. 2018;Gran-Stadniczeñko et al. 2019;Sildever et al. 2019Sildever et al. , 2021Sze et al. 2019;Liu et al. 2020).
Furthermore, eDNA metabarcoding has been successfully applied to reveal appearance patterns for several organism groups ranging from bacteria to mammals (Nagai et al. 2016;Hirai et al. 2017;Sigsgaard et al. 2017;Stoeckle et al. 2017;Berry et al. 2019;Zhang et al. 2020;Sildever et al. 2021;Alter et al. 2022), including the HAB species (Nagai et al. 2017Sildever et al. 2019). As several genes or markers can be amplified from the same sample, co-appearance patterns and associations amongst species and groups can also be investigated (Lima-Mendez et al. 2015;Sawaya et al. 2019;Djurhuus et al. 2020). This may be especially useful as changes in species diversity, presence of different bacteria and parasites can be monitored continuously over a longer time scale together with the HAB species to recognise changes in those parameters/organism groups as indicators of the state of the HABs (Yang et al. 2015;Hattenrath-Lehmann and Gobler 2017;Berdjeb et al. 2018;Shin et al 2018;Hattenrath-Lehmann et al. 2019;Nagai et al. 2019; Jankowiak and Gobler 2020; Yarimizu et al. 2020).
During recent decades, several bacteria displaying growth-limiting influence on the HAB causative microalgae have been isolated from the natural environment to find a method to control the HAB occurrence and duration (Imai et al. 1993;Lovejoy et al. 1998;Nagai and Imai 1999;Skerratt et al. 2002;Kim et al. 2008;Zheng et al. 2018;Inaba et al. 2019Inaba et al. , 2020. However, currently, no data are available to demonstrate the growth-limiting influence of bacteria on HAB causative species in nature (Mayali and Azam 2004;Meyer et al. 2017), although their influence on modifying the natural phytoplankton communities has been demonstrated in a laboratory experiment (Bigalke et al. 2019). Thus, both eukaryotes and prokaryotes were targeted in this study in Tokyo Bay from May/June to October 2017 and 2018 using metabarcoding.
Tokyo Bay is a semi-enclosed bay located in central Japan, connected with the Pacific Ocean through the Uraga Channel. The area of the Bay is 1500 km 2 (Hattori 1982) with an average depth of 45 m (Kokubu et al. 2013). In total, 36 rivers flow to the bay (Nihei et al. 2009) with higher discharges between June to September (Kokubu et al. 2013). Dominant seasonal winds are from the northeast during winter and from the southwest during summer (Oda and Kanda 2009). Water exchange between the ocean and Bay is driven by the estuarine circulation and is influenced by freshwater input and solar radiation (Nakayama 2006). During summer, the water residence time is around 15 to 35 days (Takao et al. 2004). The main currents in Tokyo Bay are tidal and residual currents, whereas the offshore current from the ocean rarely influences the circulation in the inner bay (Suzuki and Matsuyama 2000). There is a notable seasonal variation in residual current due to variations in river outflow, heating and cooling of the sea surface and prevailing winds (Guo and Yanagi 1996). The water column in the inner part of Tokyo Bay is strongly stratified from June to October, whereas from November to March, the water column is mixed through (Nakane et al. 2008;Bouman et al. 2010). In summer, diurnal amplitude in sea surface temperature is > 1 °C, with the maximum measured difference of 5.5 °C (Oda and Kanda 2009).
Microalgal abundance is the highest between March and October and the community is dominated by diatoms (Nakane et al. 2008). The Bay became eutrophicated between the 1950s and 1970s, which was also reflected in the microalgal community composition as a loss of oceanic/neritic species replaced by a few dominant species (Nomura 1998). The annual average occurrence of HABs also increased with eutrophication from an average of two HABs per year until the 1940s to 19 times per year in the 1980s (Nomura 1998). The postbloom accumulation of decaying algal cells in the seabed contributes to the development of hypoxia, resulting in losses in benthic communities (Toba et al. 2008;Kodama and Horiguchi 2011). The main HAB causative species in Tokyo Bay are a diatom Skeletonema costatum and a raphidophyte Heterosigma akashiwo (Nomura 1998) with 28 toxin-producing species also recorded from the Bay (Nomura and Yoshida 1997;Nomura 1998;Matsuoka et al. 2003;Yap-Dejeto et al. 2010;Yuki and Yoshimatsu 2012;Demura et al. 2014;Nagai et al. 2017).
This study focuses on the usefulness of metabarcoding for microalgal monitoring with emphasis on HAB species by using universal primers targeting the 18S ribosomal RNA gene. Additionally, patterns in the HAB species appearance and associations amongst HAB species, as well as between HAB species and environmental parameters, other eukaryotes and bacteria, were investigated.

Sampling and DNA extraction
Surface seawater was collected weekly between June 2017 and October 2017 (n = 21), as well as between May 2018 and October 2018 (n = 24) from Tokyo Bay, Japan (35°34.60'N, 139°65.72'E, Fig. 1) by using a bucket. Water temperature and salinity were measured during sampling by a YSI Pro30 conductivity and temperature sensor (Xylem Inc., USA). A 50 ml subsample was taken for nutrient and ion chemical analysis (NO 2 , NO 3 , PO 4 , Si, Na, Mg, S, Ca, K, Sr, B and Li). Inductively coupled plasma optical emission spectroscopy (ICP_OES, SPEC-TROBLUE, AMETEK Inc. USA), equipped with an autosampler ASX-260 (Cetac Technologies) and Smart Analyzer Vision 6.01.0945 software, was employed for ion analysis (Sekiyama et al. 2011;Ito et al. 2014;Ogawa et al. 2014;Oita et al. 2018;). The ICP-OES operating conditions were as follows: power 1.4 kW, plasma gas flow 12 l min −1 , auxiliary gas flow 1.0 l min −1 , nebuliser gas flow 1.00 l min −1 and peristaltic pump speed 30 rpm. From the obtained data, a matrix was built using the concentration in ppm for each element against the sampling points from the average result of the optimal dilution with the optimal wavelengths. Nutrient analysis was performed by a DR 3900 spectrometer (Hach, United States) using the following reagent kits: Total nitrogen -TNT826, ammonia -TNT830, nitrate -TNT835, nitrite -TNT839 and phosphate -TNT535. The analytical protocols followed the vendor's manual. For Chl a analysis, 100 ml of water was filtered through GF/F filter (Whatman, GE Healthcare, Tokyo, Japan) and stored in an N,N-Dimethylformamide at −20 °C until analysed following the workflow in Holm-Hansen et al. (1965) and using a Turner Designs fluorometer (10 AU).
Microalgae were identified and enumerated from 1 ml of raw seawater and also from 1 ml of Lugol (2%) fixed sample concentrated from 200 ml of water to 8-10 ml by using an inverted microscope (Eclipse Ti-U, Nikon, Tokyo, Japan) and a Sedgewick Rafter counting chamber. The cells in the entire chamber were identified and enumerated. Morphologically similar species, for example, belonging to Chaetoceros, Pseudo-nitzschia and Skeletonema, were identified to genus level. Microscopic count data on species/genus level is only available from 2018, whereas total microalgal cell count data is available for 2017. For metabarcoding, 200-1000 ml of seawater was filtered through 8 and 1 µm (in 2017) or 1 µm (in 2018, Figure 1. Sampling location in Tokyo Bay, Japan. The red square symbolises Tokyo Bay on the Pacific coast of Japan and the redfilled square indicates the sampling location in Tokyo Bay. 8 µm filtration was discontinued to reduce the number of steps in the DNA extraction) and 0.22 µm pore-size filters (Nuclepore membrane, GE Healthcare, Tokyo, Japan). DNA was extracted after filtration or stored at −20 °C until the extraction by using a 5% Chelex buffer (Chelex 100, Molecular Biology Grade Resin, Bio-Rad, Hercules, CA, USA), following the protocol described by Nagai et al. (2012). First, 250 µl of the 5% Chelex buffer was added to the 1.5 ml tubes containing the filters. A pellet pestle motor (Kontes Glass, Vineland, NJ, USA) was used for 30 s to break the cells on the filters. As the final step in the extraction, the samples were heated for 20 min at 97 °C.

Paired-end library preparation, sequencing and bioinformatics
For 2017 samples, DNA extracted from 1 and 8 µm filters, was mixed in equal volumes and used as a template for paired-end library preparation. For 2018, DNA extracted from 1 µm filters, was used as a template. Eukaryotic species were targeted by universal primers for the 18S rRNA gene ( Dzhembekova et al. (2017) with the exception that sequences longer than 300 bp were truncated to 300 bp by trimming the 3′ tails.
The demultiplexing and trimming were performed using Trimmomatic version 0.35 (http://www.usadellab. org/cms/?page=trimmomatic). Nucleotide sequences were demultiplexed according to the 5´-multiplex identifier (MID) tag and primer sequences were demultiplexed according to the default format in MiSeq. Sections containing: (1) palindrome clips longer than 30 bp and (2) monopolymers longer than 9 bp were trimmed from the sequences at both ends. 3´ tails with an average quality score lower than 30 at the end of the final 25-bp window were also trimmed from each sequence. 5´ and 3´ tails with an average quality score lower than 20 at the end of the final window were also trimmed. The remaining sequences were merged into paired reads using Usearch version 8.0.1517 (http://www.drive5.com/usearch/) with default settings (≥ 16 bp overlap, ≥ 90% similarity and mismatch ≤ 5 bp; http://www.drive5.com/usearch/manual/merge_options_html) resulting in a maximum sequence length of 584 bp. Singletons were also removed. Sequences were aligned using Clustal Omega v. 1.2.0. (http://www.clustal.org/omega/) and only sequences that were contained in more than 75% of the read positions were extracted. Filtering and a part of the multiple align-ment process were performed using the screen.seqs and filter. seqs commands in Mothur, as described in the MiSeq SOP (http://www.mothur.org./wiki/MiSeq_SOP; Schloss et al. 2011). Erroneous and chimeric sequences were detected and removed using the pre.cluster (diffs = 4) and chimera.uchime (minh = 0.1; http://drive5.com/usearch/ manual/uchime_algo.html; Edgar et al. 2011) commands in Mothur, respectively. Using the unique.seqs command of Mothur, the same sequences were collected into OTUs. The contig sequences were counted as OTUs by count. seqs and used for the subsequent taxonomic identification analysis. Demultiplexed, filtered, but untrimmed sequence are available in the DDBJ Sequence Read Archive (Accession number: DRA013265). Sequences were clustered to OTUs at ≥ 99% similarity level. The sequence database, used for assigning taxonomy to OTUs, was downloaded from GenBank on 22.03.2019. OTUs associated with multiple records from the same genus were merged with the OTU associated with a single species from the same genus if the multiple records consisted of a single species and other records identified as sp. of the same genus. For eukaryotes, only OTUs that had ≥ 99% similarity with the best match from the database were used for further analysis. For prokaryotes, only OTUs with > 50 total sequence reads for each fraction and sampling year were included in the further analysis.

HAB species selection
The presence of HAB species amongst the detected eukaryotic OTUs was investigated by comparing the list of OTUs associated with phytoplankton species against the IOC-UNESCO Taxonomic Reference List of Harmful Micro Algae (Moestrup et al. 2022), as well as with the known non-toxic HAB species from the Tokyo Bay (Takano 1956;Nomura 1998). To avoid the inclusion of OTUs with ambiguous identities, representative sequences of all OTUs associated with the HAB species were also manually BLAST-searched from the GenBank online database (https://blast.ncbi.nlm.nih.gov/Blast.cgi). The validity of the taxonomic names was checked against the AlgaeBase (http://www.algaebase.org/) and the World Register of Marine Species (http://www.marinespecies. org/). Only OTUs that had the ≥ 99% BLAST top hit similarity with the associated single species were included in the further analysis. For OTUs associated with Alexandrium catenella and Alexandrium tamarense, the nomenclature described by Litaker et al. (2018) was followed. To facilitate the comparison with the LM data, OTUs are referred to as species or genera. For example, species belonging to Pseudo-nitzchia were detected to the genus level by LM; however, they are also shown as single species detected by the metabarcoding and HTS to provide more detailed information. Species belonging to genera/higher taxonomic levels reported by Nomura (1998;Chaetoceros, Euglenophyceae, Gymnodinium, Naviculaceae, Prorocentrum, Pyramimonas, Rhizosolenia and Thalassiosira) are shown on a genus/higher taxonomic level. An exception to this is Cryptomonadaceae spp., which is represented by Goniomonas aff. amphinema as it was the only species from this group detected by metabarcoding and the HTS-approach, whereas no species belonging to this group were detected by LM in this study.

Statistical analysis
All data analysis was conducted in R v. 4.0.3 (R Core Team 2021) using the "vegan" (v.2.5-7, Oksanen et al. 2021) or "psych" packages (correlation analysis; Revelle 2020). The number of shared OTUs amongst different sampling years and methods was analysed and visualised as a Venn diagram. As LM did not allow species-level identification for some genera, the number of shared OTUs is compared both at species and genus levels. Species richness was calculated as the number of OTUs. Ordination analyses were applied to reveal the relationship between the presence/absence of OTUs associated with HAB species and several environmental parameters. First, a Detrended Correspondence Analysis was performed to choose between linear and unimodal ordination methods. The length of the first DCA axis was < 3 standard deviations for 2017 and 2018 data (Table 1) and, thus, linear methods were applied (PCA: Principal Component Analysis and RDA: Redundancy Analysis; Ter Braak and Prentice 1988). PCA was used to investigate general trends in the HAB species appearances with environmental parameters added as vectors using the envfit function. RDA was used to analyse the influence of environmental parameters on the HAB species' appearances. Monte Carlo Permutation tests with 999 unrestricted permutations were applied to test the significance of the variation in community composition explained by the environmental variables under a global model. When the result was statistically significant, further Monte Carlo permutation tests with 999 unrestricted permutations were performed to test the variation explained by the individual axes and environmental variables. Spearman correlations between environmental parameters and presence/absence of OTUs associated with each HAB species, between OTUs associated with HAB species, other eukaryotic species detected and with bacteria were calculated. P-value was adjusted for multiple comparisons by the Benjamini-Hochberg correction (Benjamini and Hochberg 1995). The correlation analysis results were visualised by heatmap.2 ("gplots" v.3.1.1; Warnes et al. 2020) between eukaryotes and by Gephi (v. 0.9.2; Bastian et al. 2009) between pro-and eukaryotes.

Association rule-based time-series analysis
Association rule-based time-series data analysis (Asano et al. 2019) was applied for investigating the changes in phytoplankton abundance following the changes in bacterial abundance. The method allows us to find association rules considering the potential lag effect (Agrawal et al. 1993), for example, changes in cell abundance or number of live cells may display a lagged response to a change in a parameter (Collos 1986;Nagai and Imai 1999;Śliwińska-Wilczewska et al. 2019). The number of sequence reads for Gymnodinium spp., Pseudo-nitzschia spp. and Skeletonema spp. were analysed, together with the bacterial sequence reads from 2017 and 2018 (non-rarefied data). Those groups were chosen due to their presence throughout most of the sampling season in both years. To obtain a minimum of four association rules per group analysed, the characteristic change in each group was set as follows: p = 0.1 for Gymnodinium spp. and Pseudo-nitzschia spp. and p = 0.3 for Skeletonema spp. The analysis was implemented in R v. 4.0.3 (R Core Team 2021).

Environmental parameters
During both years, salinity and water temperature displayed a similar pattern ranging from 24.5 to 32 and 18.2 °C to 27.9 °C in 2017 and from 23.9 to 32.6 and 17.4 °C to 28.9 °C in 2018, respectively (Fig. 2). The chlorophyll a pattern was more variable between the years with the values from 0.7 to 271.35 µg l -1 in 2017 and from 1.6 to 77.59 µg l -1 (Fig. 2). Nitrite concentrations ranged from 0.015 to 0.042 mg l -1 in 2017, but were mainly below the detection limit for 2018. Nitrate (0.23-0.54 mg l -1 ) and phosphate (0.02-0.58 mg l -1 ) data were only available for 2018. Silicate concentrations were almost constant throughout the sampling seasons, probably reflecting an analytical error and were thus excluded (data not shown). Ion concentrations fluctuated throughout the sampling seasons (Fig. 2). Na, Mg and Ca concentrations followed the fluctuations in salinity. Li concentrations had two peaks at the beginning of 2017 with maximum values reaching 103 parts per billion.

Overview of detected eukaryotes and prokaryotes
For eukaryotes, an average of 26,781 (SD: 3,801) sequences were detected per sample in 2017, whereas in 2018, it was 14,508 (SD: 3,593) sequences (after bioinformatics treatment, Suppl. material 2: Tables S2, S3). For prokaryotes, the average sequence reads were 30,974 (SD: 5,729) and 6,695 (SD: 2,276) for samples analysed from 1 µm filters in 2017 and 2018 (Suppl. material 2: Tables S4, S6). For samples from the 0.22 µm filters, the average number of sequences per sample was 43,532 (SD: 4,036) and 13,676 (SD: 2,116) in 2017 and 2018 (Suppl. material 2: Tables S5, S7). The OTU accumulation curves (Suppl. material 1: Figs S1-S2) indicate that the number of OTUs is nearly saturated for samples from 2017 and 2018, based on the 18S marker, but in the case of the 16S marker, the low number of OTUs and sequences for 2018 samples are visible.
In 2017, 673 unique eukaryotic OTUs (≥ 99% BLAST top-hit similarity) were detected, belonging to nine different supergroups and 32 phyla (Suppl. material 2: Table S2). In 2018, 616 eukaryote OTUs from nine supergroups and 34 phyla were present (Suppl. material 2: Table S3). In both years, the majority of the OTUs belonged to Metazoa, Dinophyceae or Bacillariophyta (Suppl. material 2: Tables S2, S3). The minimum number of unique OTUs were 120 and 105 and the maximum number of OTUs were 265 and 234 in 2017 and 2018, respectively (Fig. 2). The number of cells detected by microscopy ranged from 3 to 10,954 cells ml -1 in 2017 and 2018 (Fig. 2). Unfortunately, species-level information, based on light microscopy, is not available for 2017, but in 2018, the number of species detected per sample ranged from 17 to 50 with an average of 34 species (Fig. 2).

Number of HAB species/genera detected and appearance patterns
In total, 29 unique genera associated with HAB species and 40 unique HAB species were detected from Tokyo Bay during two years of monitoring, based on metabarcoding and the HTS approach and morphologybased identification under LM (Suppl. material 2: Table S1). Of the 40 species detected, 30 are known as toxin-producers (Moestrup et al. 2022) and 12 of these have not been previously reported from Tokyo Bay (Suppl.  Table S1). Metabarcoding and the HTSapproach allowed the detection of 26 genera, whereas 17 genera were detected, based on morphology using LM. However, three of those (Dactylisolen, Eucampia and Leptocylindrus) were not detected by metabarcoding and HTS (Fig. 3, Suppl. material 2: Table S1). The genera detected in 2017 were also present in 2018, whereas four genera were only registered in 2018 (Goniomonas, Phaeocystis, Pseudochattonella and Rhizosolenia; Suppl. material 2: Table S1). Based on metabarcoding and HTS data from both years, 36 HAB species were detected compared to 14 HAB species identified under LM in 2018 (Fig. 3, Suppl. material 2: Table S1). In 2018, five additional species (Alexandrium catenella, A. pacificum, Goniomonas aff. amphinema, Phaeocystis globosa and Pseudochattonella verruculosa) were detected by HTS compared to 2017. The total number of species/genera detected by metabarcoding and the HTS approach was 43, whereas LM allowed the detection of 20 species/genera. Based on metabarcoding and the HTS approach, 65% of the detected HAB species/genera (n = 37) were present in more than half of the sampling occasions in 2017 (June -October, n = 21) with Karlodinium veneficum, Chaetoceros spp., Prorocentrum spp., Skeletonema spp. and Thalassiosira spp. detected from all the samples (Fig. 4). At the same time, A. minutum and Azadinium poporum were detected only once or twice, respectively (Fig. 4). In 2018, 42% of the detected species/genera (n = 43) were present in more than half of the samples (May -October, n = 24). Similar to 2017, K. veneficum, Chaetoceros spp. Skeletonema spp. and Thalassiosira spp. were detected from all the samples, whereas A. catenella, Goniomonas spp., Mesodinium rubrum, P. verruculosa and Rhizosolenia spp. were present in one sample and A. pacificum in two samples (Fig. 4).
In morphology-based identification by LM, 50% of the detected species/genera were present in more than half of the sampling occasions in 2018 (Suppl. material 1: Fig. S3). Chaetoceros species were detected from all the samples and species belonging to Pseudo-nitzschia were present in more than 90% of the samples (Suppl. material 1: Fig. S3). Karenia mikimotoi, K.papilionaceae, Polykrikos hartmanii, Phaeocystis globosa and Prorocentrum triestinum were detected once, whereas, based on metabarcoding and the HTS approach, the species were present in 19, 4, 12, 3 and 21 samples, respectively (Fig. 4, Suppl. material 1: Fig. S3). On the contrary, Mesodinium rubrum was identified from 15 samples using LM, whereas, based on metabarcoding and HTS, the species was detected from only one sample in 2018.

Ordination
In the PCA, the first two axes explained 35.23% (2017) and 30.62% (2018) of the variability in the OTU relative abundances (Suppl. material 1: Fig. S4). In 2017, the water temperature was significantly (p < 0.05) associated with the variability represented by the first two axes, whereas no significant associations were detected, based on the 2018 dataset (Table 1). In the RDA, the environmental parameters explained 69.98% of the variability in the 2017 dataset and 55.79% in the 2018 dataset. The Monte Carlo permutation test indicated that, in 2017, significant amount of variation (p = 0.002) in the species composition can be explained by the environmental parameters; however, for the 2018 dataset, no significant influence was detected ( Table 1). The permutation test for individual axes in the 2017 dataset indicated the importance of the first axis (p = 0.003). Salinity, water temperature and magnesium (Mg) concentration were identified as parameters significantly influencing the community composition (p < 0.05; Table 1). In both years, the majority of OTUs clustered to the centre of the plot and some of those were placed together with a change in the environmental parameter in the RDA. A high number of OTUs clustered together with an increasing water temperature in both years, whereas for other parameters, the pattern was not as clear (Fig. 5). In the RDA from 2017, Prorocentrum cordatum was placed close to increasing Mg concentrations and Noctilluca scintillans with increasing water temperature (Fig.5A). In 2018, Phaeocystis globosa appeared together with increasing phosphate concentrations, whereas Phalacroma rotundatum and Pseudo-nitzschia pungens were placed together with increasing salinity and nitrate concentrations (Fig. 5B).

Environmental parameters
The majority of the OTUs associated with HAB species/ genera had no statistically significant (p > 0.05) correlations with the environmental parameters during both years (data not shown). An exception was Cerataulina pelagica displaying a significant positive correlation

Prokaryotes
In 2017, more than half of the OTUs associated with HAB species had significant (p < 0.05) correlations with bacteria belonging to 0.22-1 µm and ≥ 1 µm fractions (Suppl. material 2: Tables S10-S11, Fig. 6). In 2018, more than half of the HAB-associated OTUs were significantly correlated with bacteria from the ≥ 1 µm fraction, whereas 43% displayed a significant correlation with the 0.22-1 µm bacteria fraction. The correlations between OTUs associated with HAB and bacteria were between different OTUs in different years. Based on both size fractions and years, 35 OTUs associated with genera containing growth-limiting bacteria were detected that were significantly (p < 0.05) correlated with HAB-associated OTUs (Suppl. material 2: Table S12). In the case of the 0.22-1 µm bacteria fraction, most correlations with HAB species were positive in both years, whereas, in the case of the ≥ 1 µm bacteria fraction, the majority of the significant correlations were negative (Fig. 6). For most of the bacterial OTUs, the previous-ly reported target HAB species did not match with the HAB-associated OTUs that had a significant correlation.

Association rule-based analysis of time-series data
Based on the time-series analysis of selected phyla, several associations between bacteria and phytoplankton were detected (Suppl. material 2: Table S13). From the investigated genera, the highest number of associations was found between bacteria and Skeletonema spp.: 11 associations with two of those negative. For Gymnodinium spp. and Pseudo-nitzschia spp., nine and four associations were detected respectively and for both genera, one of the associations was negative.

Discussion
Two years of microalgae monitoring by metabarcoding and HTS in combination with light-microscopy allowed the detection of 29 unique genera with 40 unique species associated with harmful algal blooms (Nomura 1998;Moestrup et al. 2022) with 12 toxin-producing species recorded for the first time from Tokyo Bay. The majority of the harmful algal species/genera detected did not display significant correlations with the environmental parameters investigated, whereas significant correlations were detected with other HAB species/genera, other eukaryotes and prokaryotes, including genera containing growth-limiting bacteria.

HAB species/genera detected by metabarcoding and HTS and by light microscopy
Overall, the metabarcoding and HTS-based approach registered 28 HAB-associated species/genera not identified by LM, whereas LM allowed the detection of four species and three genera not found by metabarcoding and HTS. Differences in species/genera detected by the metabarcoding and HTS-approach vs. LM have been previously explained by a poor match between the sequences of targeted taxa and the primers used, a low number of rRNA gene copies per cell in some genera, pre-filtration of samples and availability of sequences in the reference databases ( As the concentrated samples were fixed by Lugol's iodine solution, identification may have also been hampered due to the changes in cell morphology (e.g. Fibrocapsa japonica, Vrieling et al. 1995) or due to discolouration of cells, which could mask the morphological features (Steidinger and Tangen 1996).
From the four species identified only by LM, Dinophysis caudata and Eucampia zodiacus were detected by metabarcoding and HTS, but the sequences also matched with other species from the database and were, thus, not included in further analysis. At the same time, Dactyliosolen fragilissimus and Leptocylindrus minimus were only detected by LM. As both species have been previously reported from Tokyo Bay as HAB causative species (Nomura 1998), the lack of detection by metabarcoding may be explained by the lack of sequences present in the public databases as no 18S rRNA gene sequences were available for D. fragilissimus and only one sequence covering the target region was available for L. minimus. The lack of sequences covering the target region in the database could also explain why Pseudo-nitzschia galaxiae and P. subfraudulenta that have been previously reported from the Bay (Nagai et al. 2017) were not detected. In a recent study from the Gulf of Naples, the Mediterranean Sea, P. galaxiae was successfully detected from eDNA samples by targeting the 18S rRNA gene V4 region (Ruggiero et al. 2022). Thus, until the database is improved, the V4 region of the 18S rRNA gene and other genes (e.g. 28S rRNA) could be used for improved detection of species belonging to Pseudo-nitzschia.
Two toxin-producing dinoflagellate species, Margalefidinium polykrikoides and Lingulodinium polyedra were detected by the region targeted, but their similarity scores with the best database match were lower than the 99% criteria used in this study. Thus, the availability of more sequences covering the target region might also be beneficial for improved identification of those species. Interestingly, another toxin-producing dinoflagellate species, Protoceratium reticulatum, was not detected in the samples, although the cysts of this species have been reported from the sediments throughout Tokyo Bay (Matsuoka et al. 2003). In a study from the Baltic Sea, the species could be successfully identified from eDNA samples using the same marker (Sildever et al. 2021). Thus, the lack of detection in this study could result from collecting data from only a single station, which does not capture the entire diversity of the Bay. In future studies, more stations should be included to detect as much of the diversity present as possible.
Some of the species identified by both metabarcoding and HTS, as well as by LM, were detected in notably different numbers of sampling occasions: Karenia mikimotoi, K. papilionaceae, Noctiluca scintillans, Prorocentrum cordatum and P. triestinum were registered in more samples by metabarcoding and HTS, whereas Mesodinium rubrum and Rhizosolenia spp. were recorded in a higher number of samples by LM. In the case of Karenia and Prorocentrum species, improved detection by metabarcoding and HTS may result from more precise taxonomic identification compared to LM, whereas for N. scintillans, the difference may be due to its low abundance in some sampling occasions as the species is large and easily recognisable. Further, the discrepancy in species identification between the two methods has also been explained by the differences in the sample volume analysed (Rodriguez-Ramos et al. 2014;Gran-Stadniczeñko et al. 2019): 200 to 1,000 ml by metabarcoding and HTS compared to 1 ml of raw and 1 ml of concentrated sample (total volume of 8-10 m from 200 ml) analysed by LM in this study.
In the case of M. rubrum and Rhizosolenia spp., the potential influence of the gene and the region targeted on the detection efficiency by metabarcoding and HTS is not supported by the continuous detection of M. rubrum by the metabarcoding and HTS in 2017 from Tokyo Bay, as well as by the several Rhizosolenia species recorded by the same primers from another locality in Japan (S. Nagai, unpubl. data). Interestingly, in 2017, Rhizosolenia spp. was identified from several samples by metabarcoding and HTS, but had a lower similarity (94%) with the best taxonomic match from the database. Therefore, targeting several genes and usage of genus-specific primers may improve species detection (Gran-Stadniczeñko et al. 2017;Nagai et al. 2017;Smith et al. 2017;Sildever et al. 2019Sildever et al. , 2021. The low identification of those taxa could also be related to the overall lower number of sequences obtained per sample in 2018 (average 14,508 ± 3,592 sequences after bioinformatics treatment) compared to 2017 (26,781 ± 3,709 sequences after bioinformatics treatment). This is further supported by the OTU accumulation curve displaying lower accumulation levels for the 2018 data. Thus, re-sequencing the samples from 2018 might improve the detection of those species.

HAB species/genera appearance patterns and correlation with environmental parameters
The majority of detected HAB species were present throughout the sampling season, indicating a wide tolerance range for salinity and temperature. This is further exemplified by Pseudo-nitzschia species appearing from March to October in variable temperature and salinity conditions (this study; Yap-Dejeto et al. 2010;Nagai et al. 2017). An exception to this was P. pseudodelicatissima, present mainly in July-October (this study; Yap-Dejeto et al. 2010), potentially due to the preference for higher surface water temperatures as its growth rate increases with rising temperature (Lundholm et al. 1997). In Tokyo Bay, an increase in water temperature and input of nutrients are mainly responsible for short-term fluctuations in the microalgal community composition and abundance as phosphate is limiting growth during the stratification period from summer to autumn (Nakane et al. 2008;Kubo et al. 2019). This was also visible, based on the ordination analyses from both years, where around half of the HAB-associated OTUs grouped with the increasing water temperature. In addition, based on the permutation test, changes in water temperature could explain a significant amount of variability in the HAB community composition in 2017. However, as the statistically significant (p < 0.05) correlations with the water temperature and chemical elements were not continuous between the years, a general pattern in the HAB species appearances, based on environmental parameters, could not be detected.
OTUs associated with A. catenella, Goniomonas aff. amphinema and P. verruculosa were detected only once during the study period. For example, A. catenella was registered in the second week of July 2018, when the water temperature was 26.6 °C and, two weeks later, the species was also detected from another sampling location in Tokyo Bay (data not shown), where it also remained the single detection in both years. In different coastal areas of Japan, the vegetative cells of A. catenella have been reported from a maximum of 20 °C (Sekiguchi et al. 1986;Itakura et al. 2002;Yamamoto et al. 2017). In the coastal waters of north-eastern Japan, the cells are present from January to June (Kaga et al. 2006) with the highest cell densities at 8-9 °C (Ichimi et al. 2001). In Tokyo Bay, the species formed a bloom in June of 1984 (Han and Terazaki 1993); however, the presence of vegetative cells has not been registered in later studies (Nomura and Yoshida 1997). Furthermore, the resting cysts detected from the Bay and belonging to either A. catenella or A. pacificum, lacked cell content, potentially indicating the absence of a local population (Matsuoka et al. 2003;Kotani et al. 2006). Thus, the detection of A. catenella only once throughout the sampling seasons and at high surface water temperature is potentially explained by the transport by currents from the neighbouring coastal areas as the species has caused paralytic shellfish poisoning events on the Pacific coast close to the entrance to Tokyo Bay (Kotani et al. 2006).
From other species registered only once during the study period, P. verruculosa was detected in the second week of May in 2018, when the water temperature was 18.7 °C and salinity 31.9. Interestingly, the temperature and salinity were in the previously-reported ranges suitable for the growth of Japanese P. verruculosa throughout the majority of sampling occasions in both years (Yamaguchi et al. 1997;Skjelbred et al. 2013). In southern Japan and the Seto Inland Sea, the species has been reported from January and February (Yamaguchi et al. 1997;Orita 2016), whereas in the Sea of Okhotsk, the species appears throughout the year with the highest number of occurrences in February-March and in August . As there are no previous published results on the presence of P. verruculosa in Tokyo Bay, further study is needed to investigate if the species could have been transported to the area as suggested for A. catenella or if the species occurs earlier in spring than the start of sampling periods in this study. In the case of Cryptomonadaceae, presence from June to August with the highest abundances in August in Tokyo Bay has been reported (Nomura 1998). In this study, metabarcoding and HTS could detect only one representative of this family, Goniomonas aff. amphinema, in October 2018, whereas other species from the same class (Cryptophyta) were registered continuously throughout the sampling season. Since it is not certain which species were considered as Cryptomonadaceae by Nomura (1998), for example, if other species from Cryptophyta were also included, it is not possible to compare the detected occurrence of Cryptomonadaceae only based on G. aff. amphinema with the previous information on the appearances of Cryptomonadaceae in Tokyo Bay.

HAB species/genera appearance patterns and correlation with other species/genera
Another important aspect of structuring a species' appearance is interaction with other organisms (Lima-Mendez et al. 2015;Sourisseau et al. 2017;Zhou et al. 2018;Sassenhagen et al. 2020). In both sampling seasons, HAB species displayed significant positive and negative associations with other HAB, eukaryotic and prokaryotic species; however, the majority of the detected correlations differed amongst the years, potentially explained by the variability in environmental conditions influencing the interspecific interactions (Mikhailov et al. 2019a). Interestingly, significant positive correlations between the same species in both years were detected for P. australis and P. multiseries, P. cuspidata and P. turgidula, as well as between P. australis and Eucampia spp. and T. rotula. For all of those, presence in a wide range of temperature and/or salinity conditions has been reported (Krawiec 1982;Yap-Dejeto et al. 2010;Nishikawa et al. 2011;Nagai et al. 2017;IOC-UNESCO 2021). In this study, the species appeared throughout the sampling seasons and no significant correlations with the environmental parameters were detected. An exception was P. cuspidata, showing a significant positive correlation with the water temperature in 2017, whereas in 2018, no significant correlation was detected and the species appeared already at lower temperatures (21.2 °C) than in 2017 (26.9 °C). Thus, currently, no clear explanation for the co-appearances in both years can be provided.
Detection of 35 OTUs linked with bacteria belonging to genera associated with growth-limiting effects on HAB species and detection of significant correlations between bacteria and HAB species further confirms the usefulness of metabarcoding and the HTS approach for investigating interactions amongst bacteria, phytoplankton and environmental parameters (Yang et al. 2015;Bunse et al. 2016Bunse et al. , 2019Wurzbacher et al. 2017;Zhou et al. 2018;Cui et al. 2019). Interestingly, the majority of significant correlations found between HAB and bacteria from the genera containing growth-limiting bacteria were positive. However, as the detected correlations are not between HAB species and the bacterial strains with known algicidal activity (Suppl. material 2: Table S12), but with other bacterial strains/species from the same genus, their growth-limiting activity is not certain as it can vary between the strains, species within the same genus and also in different environmental conditions (Skerratt et al. 2002;Meyer and Pohnert 2019). Thus, the significant positive correlations detected in this study may instead indicate mutualism/commensalism (Paver et al. 2013;Mikhailov et al. 2019b) or co-appearance due to indirect reasons, for example, environmental preferences (Weiss et al. 2016).
Bacteria belonging to Alpha-and Gammaproteobacteria, as well as to Flavobacteria are often dominant during the blooms (Buchan et al. 2014). Interestingly, in this study, an OTU associated with Marinobacter spp. was detected only once during the entire sampling season in 2017 (n = 21). In a laboratory study, a strain belonging to this genus impeded the growth of a toxin-producing dinoflagellate species Karenia mikimotoi (Zheng et al. 2018). At the same time, K. mikimotoi was not detected from the sample in which Marinobacter spp. was present although it was present in previous samples. In addition, by the same sampling date, the number of sequences associated with Gammaproteobacteria (class including Marinobacter spp.) had increased ≥ 10x compared to the previous week (435 to 7356 sequences/1% to 16% of the total sequence reads). Gammaproteobacteria have been reported to react later to the algal decay, with some species achieving their highest relative abundances two to four weeks after the peak of algal bloom (Teeling et al. 2012). Thus, the appearance of Marinobacter spp. might be related to the decay of an algal bloom possibly containing a high number of K. mikimotoi cells (hypothesised, based on sequence abundances, no cell count data available). This is further supported by a decrease in Chl a values (12.03 to 1.96 µg/l -1 ) coinciding with the decrease in K. mikimotoi sequence numbers. To confirm the growthlimiting activity, metabarcoding and HTS data should be combined with laboratory experiments, for example, most-probable number technique, microcosm studies, isolation and identification of the bacteria displaying growth-limiting activities towards the HAB species (Imai et al. 1993;Nagai and Imai 1999;Kim et al. 2008;Inaba et al. 2017;Bigalke et al. 2019).
Another useful method to detect potential growthlimiting bacteria is association rule-based time-series data analysis that also considers the time lag between the change in species abundances (Asano et al. 2019). In the present study, the method was tested with three phytoplankton genera and negative associations with bacteria were detected for all genera. Although further investigation and experiments are needed to confirm the negative influence of those bacteria are needed (e.g. Imai et al. 1995;Kim et al. 2008;Shi et al. 2018;Inaba et al. 2019), it provides a good indication of which bacteria to target. This is further exemplified by the decline in Skeletonema sp. one week after an increase in Synechococcus sp., based on sequence abundances. This coincides with the previously reported negative influence of Synechoccus sp. on Skeletonema sp. (S. marinoi) resulting in reduced growth rates, physical cell damage and a decline in photosynthetic capacity (Śliwińska-Wilczewska and Latała 2018; Śliwińska-Wilczewska et al. 2019). Thus, monitoring the increase of Synechococcus sp. by molecular tools could be used for forecasting the decline of Skeletonema sp. abundances. Identification of similar associations between bacteria and other HAB species may enable a near-real-time bloom forecasting system by combining molecular detection with hydrodynamical and ecological models (McGillicuddy 2010;Brown et al. 2013;Tian and Huang 2019).

Conclusions
A combination of light-microscopy-based monitoring with metabarcoding and the HTS approach allowed the detection of 40 HAB species from Tokyo Bay, including 12 toxin-producing species previously not reported from the area. Metabarcoding and the HTS approach allowed detection of twice as many HAB-associated species than light-microscopy; however, four species were detected only based on morphology under LM. This indicates the importance of using several markers to account for the differences in their identification power. Numerous statistically significant positive and negative associations could be detected amongst the HAB species, as well as amongst HAB, eukaryotic and prokaryotic species, including genera containing growth-limiting bacteria. As the majority of significant correlations were between different OTUs in different years, the interactions between different species were probably more influenced by the variability in environmental conditions between the years than within the sampling season. To understand the interactions between growth-limiting genera and microalgae, including HAB species, a further study combining time-series environmental monitoring by metabarcoding and the HTS approach and laboratory experiments are needed. The results of this study further confirm the applicability of metabarcoding and HTS-based microalgal monitoring, exemplified by the detection of several morphologically similar, small or fragile species previously not reported from Tokyo Bay.