BBI: an R package for the computation of Benthic Biotic Indices from composition data

The monitoring of the impacts of anthropogenic activities in marine environments, such as aquaculture, oil-drilling platforms or deep-sea mining, relies on Benthic Biotic Indices (BBI). Several indices have been formalised to reduce multivariate composition data to a single continuous value that is then ascribed to a discrete ecological quality status. Such composition data are traditionally obtained from macrofaunal inventories, which are time-consuming and demand taxonomic expertise. Important efforts are ongoing towards using high-throughput sequencing of environmental DNA (eDNA metabarcoding) to replace or complement morpho-taxonomic surveys for routine biomonitoring. Practitioners usually compute BBI from such composition data with Excel spreadsheets or custom scripts, which can hamper the updating of reference morpho-taxonomic tables and cross-study comparisons. Here we introduce the R package BBI for the computation of BBI from composition data, either obtained from traditional morpho-taxonomic inventories or from metabarcoding data. Its aim is to provide an open-source, transparent and centralised method to compute BBI for routine biomonitoring.


Introduction
Biodiversity monitoring is the standard approach for the environmental impact assessment of anthropogenic activities. In marine environments, impact assessments are carried out through benthic macro-invertebrate surveys, which involve the sorting and morpho-taxonomic identification of numerous specimens for a single site (Borja et al. 2009; Tavakoly et al. 2014). Identified taxa are ascribed ecological weights that are defined from empirical and experimental data. Each Benthic Biotic Index (BBI), such as AMBI (A Marine Biotic Index, Borja et al. 2000), ISI (Indicator Species Index, Rygg 2002), NSI or NQI1 (Norwegian Sensitivity Index and Norwegian Quality Index 1, Rygg and Norling 2013), has its own set of taxa entries in its database. The relative abundance of each taxon that has an ecological weight is used as input to a formula (usually a weighted sum) to compute the BBI value. This value is continuous, whereas the ecological assessment uses discrete categories (usually five ordered categories from "very good" to "very bad" quality status), so the continuous BBI value is often converted into a discrete category to make the assessment more "human readable" for regulating agencies and policy-makers. Each BBI has its own specificities and boundaries between classes, and addresses different aspects of biological quality elements. Such disparity prompted the development of a normalised Ecological Quality Ratio (nEQR) within the European Water Framework Directive (WFD), which allows the information provided by these BBI to be combined into a single integrative index, making impact assessments comparable across countries.
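The chain described above (ecological weights, a weighted sum, a discrete status and a normalised value) can be illustrated with a minimal base R sketch. Everything in it is an assumption made for illustration: the taxa, their weights, the 0 to 6 scale, the class boundaries and the piecewise-linear rescaling are hypothetical and do not reproduce the formula or the boundaries of any published index. The sketch also assumes that low index values indicate good status, which is not the case for every BBI.

```r
# Hypothetical relative abundances (proportions) of four taxa in one sample.
abundance <- c(taxon_a = 0.50, taxon_b = 0.25, taxon_c = 0.15, taxon_d = 0.10)

# Hypothetical ecological weights on an arbitrary 0-6 scale
# (assumption: a higher weight means a more disturbance-tolerant taxon).
weight <- c(taxon_a = 1.5, taxon_b = 3.0, taxon_c = 4.5, taxon_d = 6.0)

# Continuous index: abundance-weighted mean of the ecological weights.
index_value <- sum(abundance * weight) / sum(abundance)

# Hypothetical class boundaries splitting the 0-6 scale into five ordered classes.
breaks <- c(0, 1.2, 2.4, 3.6, 4.8, 6)
status_labels <- c("very good", "good", "moderate", "bad", "very bad")

# Discrete ecological quality status for the continuous value.
status <- cut(index_value, breaks = breaks, labels = status_labels,
              include.lowest = TRUE)

# Illustrative normalisation: map the class boundaries onto a common 0-1 scale
# (1 = best) and interpolate linearly within the class.
normalised <- approx(x = breaks, y = c(1, 0.8, 0.6, 0.4, 0.2, 0),
                     xout = index_value)$y

cat("index:", index_value, "| status:", as.character(status),
    "| normalised:", normalised, "\n")
```

Once several indices are rescaled to the same 0-1 range in this way, they can be combined for a sample, for example by averaging; whether a simple mean is appropriate depends on the assessment scheme in use.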
High-throughput amplicon sequencing of environmental DNA (eDNA metabarcoding) offers a fast and cost-effective method to describe biological communities (Taberlet et al. 2012). Such molecular tools have been successfully used to compute BBI in both freshwater (Visco et al. 2015) and marine ecosystems (Pawlowski et al. 2014; Chariton et al. 2015; Lejzerowicz et al. 2015). The composition data inferred by the molecular approach are used in the same way as morphological data to compute BBI, by considering read abundance as a proxy for species abundance (sensu gAMBI, see Aylagas et al. 2014). The direct comparison is not straightforward, however, because the abundance of reads does not necessarily reflect the abundance of the species accurately (Elbrecht and Leese 2015; Vivien et al. 2015; Dowle et al. 2016). Such discrepancies led to the development of correction factors, using the cell biovolume in the case of diatoms, to reduce the effect of this quantification bias (Vasselon et al. 2018), or to the use of machine learning algorithms to bypass the taxonomic assignment step when using metabarcoding data (Cordier et al. 2017).
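As a minimal illustration of treating read abundance as a proxy for species abundance, the base R sketch below converts a table of read counts into per-sample relative abundances. The taxa, samples and counts are hypothetical, and no correction for quantification bias is applied.

```r
# Hypothetical read counts: rows are taxa, columns are samples.
reads <- matrix(c(120, 30, 50,
                    0, 80, 20),
                nrow = 3,
                dimnames = list(c("taxon_a", "taxon_b", "taxon_c"),
                                c("sample_1", "sample_2")))

# Divide each column by its total so that every sample sums to 1.
# These proportions stand in for the relative abundance of each taxon.
rel_abund <- prop.table(reads, margin = 2)

print(rel_abund)
```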
Some of these BBI can be calculated with user-friendly software (AMBI from AZTI, available at: http://ambi.azti.es), including R packages in the case of freshwaters (see the 'biotic' package, Briers 2016) or marine ecosystems (see the 'BEQI2' package, van Loon et al. 2015). However, none of these packages includes the NSI, ISI, NQI1, Bentix or nEQR indices, so using the published reference ecological weights database (Rygg and Norling 2013) requires the development of custom scripts. Practitioners therefore need to develop their own solutions to make an ecological impact assessment, which could hamper the reproducibility and the cross-study comparison of the results. In addition, as these databases are occasionally updated by their maintainers, all such solutions would need to be updated as well, which can further hamper the comparison of results, both across time and across studies. An open-source software solution would therefore allow a transparent and centralised method to compute BBI.
Here we introduce the R package BBI for the computation of BBI from composition data, either obtained from traditional morpho-taxonomic inventories or from metabarcoding data. It provides an open-source tool for transparent BBI computation and aims to centralise available tools for BBI in various aquatic ecosystems.

Package description
The R package BBI (version 0.2.0) is available from the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/web/packages/BBI/index.html. It can be installed within the R environment, on any operating system (Linux, macOS or Windows), by using the command install.packages("BBI"). Instructions for installing the current release or the development version can be consulted on the GitHub repository page (https://github.com/trtcrd/BBI). The package requires the package "vegan" (Oksanen et al. 2018). A reference dataset is included in the package, containing 7822 metazoan taxonomic entries, covering ecological weights and groups for five BBI (Table 1). The BBI function searches for a match between the taxonomic assignments of the composition data used as input and the morpho-taxonomic reference database (BBI database) and returns a list of objects (Figure 1). These objects include the number of taxa that were found in the database, the values of each BBI per sample and the ecological quality status of samples for each BBI. It also outputs a subset of the original composition data that includes only taxa that had a match in the reference database. The nEQR function of the package can be used to compute the normalised Ecological Quality Ratio (nEQR index) over a set of indices for each sample. An example of usage is available on the GitHub repository page.
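The matching step described above can be sketched in base R without calling the package itself. This is not the package's implementation: the taxon names, the weights, the exact string matching and the names of the returned list elements are all assumptions made for illustration. The sketch shows the general idea of keeping only taxa found in a reference table and computing one index value per sample from them.

```r
# Hypothetical composition table: taxonomic assignment plus counts per sample.
composition <- data.frame(
  taxon    = c("Taxon alpha", "Taxon beta", "Taxon gamma"),
  sample_1 = c(40, 5, 10),
  sample_2 = c(2, 8, 60),
  stringsAsFactors = FALSE
)

# Hypothetical reference table with made-up ecological weights
# (not the values shipped with the BBI package).
reference <- data.frame(
  taxon  = c("Taxon gamma", "Taxon alpha"),
  weight = c(2, 5),
  stringsAsFactors = FALSE
)

# Exact name matching (an assumption; a real tool may match more flexibly).
idx <- match(composition$taxon, reference$taxon)
matched <- composition[!is.na(idx), ]
matched$weight <- reference$weight[idx[!is.na(idx)]]

# Abundance-weighted mean weight per sample, using matched taxa only.
sample_cols <- c("sample_1", "sample_2")
index <- sapply(matched[sample_cols],
                function(x) sum(x * matched$weight) / sum(x))

# Collect the outputs in a list, mirroring the kind of objects described above.
result <- list(found = nrow(matched), index = index, table = matched)
str(result)
```

Taxa without a reference entry ("Taxon beta" here) are dropped before the index is computed, so the number of matched taxa is worth reporting alongside the index values.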

Conclusion
We introduced the R package BBI for the computation of Benthic Biotic Indices from composition data. It provides two simple R functions that automate the search for matches between the taxonomic assignments and the reference morpho-taxonomic database of five BBI (Table 1), compute the continuous BBI values for each provided sample, return the discrete ecological quality status for each sample/BBI pair and, finally, provide the normalised Ecological Quality Ratio index, which normalises the assessment across the BBI.
The BBI package will be kept up-to-date with new entries in the morpho-taxonomic reference databases for the five BBI included here. Hence, the package aims to provide biomonitoring practitioners with a reliable, up-to-date and open-source tool for the computation of BBI from composition data, either obtained from morpho-taxonomic inventories or by eDNA metabarcoding.

Project description
Title: BBI: an R package for the computation of Benthic Biotic Indices from composition data
Study area description: Microscopy, Metabarcoding, Biomonitoring, Biotic Indices
Download page: https://cran.r-project.org/web/packages/BBI/index.html
Programming language: R
Licence: GNU Affero General Public Licence v3

Author contributions
Conceived and designed the study: TC; Wrote the R package: TC; Analysed the data: TC; Wrote the paper: TC, JP