Metadata standards and practical guidelines for specimen and DNA curation when building barcode reference libraries for aquatic life

DNA barcoding and metabarcoding are increasingly used to effectively and precisely assess and monitor biodiversity in aquatic ecosystems. As these methods rely on data availability and quality of barcode reference libraries, it is important to develop and follow best practices to ensure optimal quality and traceability of the metadata associated with the reference barcodes used for identification. Sufficient metadata, as well as vouchers, corresponding to each reference barcode must be available to ensure reliable barcode library curation and, thereby, provide trustworthy baselines for downstream molecular species identification. This document (1) specifies the data and metadata required to ensure the relevance, the accessibility and traceability of DNA barcodes and (2) specifies the recommendations for DNA harvesting and for the storage of both voucher specimens/samples and barcode data.


Introduction
Human well-being is intimately linked to freshwater and marine ecosystems (Dudgeon et al. 2006; Worm et al. 2006; Borja et al. 2020). Although these systems support critical services to humans, such as water supply, fisheries or tourism, they increasingly face stressor impacts (Carvalho et al. 2019). Natural and especially anthropogenic alterations are introducing important pressures on rivers, lakes, coasts and seas (European Environmental Agency 2019). Beyond the services provided by these ecosystems and the need to protect their sustainability for future generations, there is an ethical need for societies to commit themselves to nature conservation.
Only 10% of European rivers have very low concentrations of micropollutants (Loos et al. 2009), which jeopardises the supply of aquatic ecosystem services to citizens (Boulton et al. 2016). Another alteration is related to the increasing need of human society for energy: there has been a global boom in new hydropower dams in recent decades (Zarfl et al. 2015). Consequently, at global scale, only a few rivers remain free-flowing (Liermann et al. 2012; Grill et al. 2019). This increased habitat fragmentation results in a significant biodiversity loss in rivers which, in turn, impacts on ecosystem services (e.g. Arthington et al. 2010), such as food provisioning and cultural services. In marine ecosystems, new wind and wave farms, offshore drilling platforms and fish farms have been constructed in recent decades to meet energy needs. This has important impacts on biodiversity (Shields et al. 2011; Pawlowski et al. 2014c); these human activities destroy marine habitats (e.g. damage to sea floors due to trawling, coastal urban expansion, dredging and destruction of coral reefs and mangroves) or alter them (e.g. the construction of wind and wave farms, Shields et al. 2011). Human impact is particularly negative for coral reefs, mangroves and many other coastal regions (Halpern et al. 2008). However, millions of people depend upon services and functions of such ecosystems, including food, tourism and storm protection (Barbier et al. 2011). Overexploitation of marine and freshwater biodiversity, including overfishing and destruction of some habitats such as mangroves (e.g. Myers and Worm 2003), is particularly worrying. The spread of invasive species in freshwater and marine habitats has caused dramatic changes in ecosystems and biodiversity (Dextrase and Mandrak 2006; Molnar et al. 2008). Finally, global changes also impact aquatic ecosystems. In particular, global warming, in combination with other pressures, such as nutrient loading, causes other detrimental effects, such as harmful algal blooms in lakes and seas (Jacquet et al. 2005; Jeppesen et al. 2005; Wells et al. 2015) or global water scarcity (Schewe et al. 2014). More specifically in marine ecosystems, global warming and acidification negatively impact marine diversity (Kroeker et al. 2013). These concerns about the state of rivers and lakes led scientists to warn about a global threat to human water security and biodiversity since more than 80% of the world's population is exposed to such risk (Vörösmarty et al. 2010; Jenny et al. 2020).
Aquatic life and its biodiversity are crucial to ensure quality, quantity and delivery of aquatic ecosystem services (Cardinale et al. 2012; Stevenson 2014; Barbier 2017; Hammerschlag et al. 2019). For several decades, governments and transnational organisations have understood the necessity of monitoring aquatic life to set up political decisions that will help in improving or preserving aquatic ecosystems (Kopf et al. 2015). For instance, in the European legislation, several directives such as the Water Framework Directive (European Commission 2000), the European Marine Strategy Framework Directive (European Commission 2008), the European regulation on invasive alien species (European Commission 2014), the Habitats Directive (The Council of the European Communities 1992) and the European Biodiversity Strategy (European Commission 2020) were set up to protect the integrity of particular species, habitats and aquatic ecosystems. These directives are applied in each member state's legislation. In the USA, the Clean Water Act (Copeland 2016) also sets an ambitious framework for water ecosystems preservation and, in the marine realm, many countries have implemented legislation to protect them (Borja et al. 2008).
The implementation of such directives is generally based on methodologies that assess the presence and abundance of species in freshwater or marine ecosystems (Hering et al. 2006; Borja et al. 2010; Birk et al. 2012). Most of the classical methodologies used to detect and identify species rely on morphological differences. Visual observations or optical devices (binocular magnifiers or microscopes) enable experts to determine taxa and establish fauna or flora identification lists. However, such methods are time-consuming, require a high level of expertise to reliably identify the organisms and are, in some cases, unable to distinguish amongst closely-related or morphologically-indistinguishable species (i.e. cryptic species). Identification of immature stages (e.g. different ontogenic stages, such as eggs, juveniles, planktonic larvae) is also often impossible, especially for small organisms. The use of molecular, DNA-based tools to identify species overcomes this problem as long as DNA is preserved in the sample. DNA barcoding, the use of short, standardised gene sequences for species identification (Hebert et al. 2003), has become the most versatile and universally-applicable method since the adoption of standard DNA markers was agreed upon within the scientific community (e.g. COI in animals (Hebert et al. 2003), ITS in fungi (Schoch et al. 2012), matK and rbcL (CBOL Plant Working Group 2009) or ITS (Li et al. 2011) in plants). The development of open access analytical tools and comprehensive data repositories (e.g. BOLD, the Barcode of Life Data Systems and Barcode Index Numbers (BINs)) has fast-forwarded the advancement of biodiversity assessments using DNA barcoding (Ratnasingham and Hebert 2007; Ratnasingham and Hebert 2013). The method relies upon DNA reference libraries that link species-specific taxonomic classification to a reliable reference sequence (the barcode).
With the entry of new sequencing techniques, the concept has been extended to identify specimens present in bulk or environmental samples (rather than single individuals) and dubbed DNA metabarcoding (Pompanon et al. 2011). Both DNA barcoding and metabarcoding have been successfully applied to key aquatic organisms used for ecosystem assessment, such as diatoms (Kermarrec et al. 2013; Zimmermann et al. 2014b), Foraminifera (Pawlowski and Lecroq 2010; Pawlowski et al. 2014a, b, 2016; Frontalini et al. 2020), ciliates, macroinvertebrates and particularly aquatic insects (Hajibabaei et al. 2012; Elbrecht and Leese 2015), marine benthic fauna (Aylagas et al. 2014; Lobo et al. 2017), fish (Hänfling et al. 2016; Pont et al. 2018), aquatic oligochaetes (Vivien et al. 2016, 2019), macroalgae and aquatic angiosperms (Scriver 2015; Akita et al. 2019, 2020). The international initiative iBOL (International Barcode of Life), currently represented by 35 member nations, is a research alliance committed to building and extending existing DNA barcode reference libraries, informatics platforms, analytical protocols and pipelines for assessments of biodiversity using molecular tools (https://ibol.org/). An Achilles heel of DNA barcoding and metabarcoding is the taxonomic coverage and the data quality of the barcode reference libraries. Weigand et al. (2019) give an overview of gaps in the barcode libraries for species used in aquatic biomonitoring in Europe and show that, while more than 80% of all fish species are barcoded, only 26% of the marine invertebrates are covered. They also argue for the need for effective quality assurance and quality control of barcode reference libraries. With this in mind, it is important to develop and follow best practices to ensure that the quality and traceability of the metadata associated with the reference barcodes used for identification are optimised.
As early as 2005, data standards for reference DNA barcode records to be deposited in INSDC (International Nucleotide Sequence Database Collaboration) were given (Hanner 2005, revised in Hanner 2009). However, these standards were customised for INSDC, had a generalist nature and missed the specificity required for different organisms and ecosystems. Some guidelines for specific organisms were made for diatoms (e.g. Zimmermann et al. 2014a) and were standardised at European level (European Standardisation Committee 2018) and then translated into each member state's language (e.g. in French: Afnor 2018). Similarly, some specific standards for data and protocols for fish were proposed in the framework of the Fish Barcode of Life campaign (Ward et al. 2009; Ward 2012). We focus on quality standards before the molecular generation of DNA barcodes commences, i.e. pre-PCR and sequencing. The objectives of this document are: (1) to specify the data and metadata required to ensure the relevance, the accessibility and traceability of DNA barcodes and (2) to specify the recommendations for DNA harvesting and for the storage of both voucher specimens/samples and barcode data.

Procedures
This section details the procedures for the storage of vouchers, DNA material, DNA-barcodes and for the harvesting of DNA.

Storage of voucher specimens
The first step in establishing a reference barcode is the correct morphological identification of specimens from which DNA will be isolated. The physical vouchers of the identified biological material should be deposited in a recognised and accessible natural history collection and be accompanied by a unique collection number (Vollmar et al. 2010;Blagoderov et al. 2012). Duplicates of vouchers could be deposited in an alternative recognised collection(s) to reduce the risk of losing vouchers and increase accessibility. Labels containing obligatory metadata must be attached to all parts of the specimens or preparations with the specimens included (such as permanent microscopic slides). Depending on the organism group, the physical vouchers can have different forms and the following sections (2.1.1 to 2.1.10) give for each organism, the obligatory and the recommended vouchers for the different types of biological materials usually used for DNA harvesting.
In the following sections, depending on the organisms considered, we recommend that vouchers are stored frozen. Various temperatures are presently used (-20 °C, -40 °C, -50 °C, -80 °C), but there is a lack of long-term comparisons (e.g. 100 years) of the impact of freezing temperatures on DNA conservation. Thus, we recommend that frozen vouchers should be stored at -20 °C or colder (i.e. at temperatures from -20 °C to -80 °C).

Cyanobacteria
Different kinds of biological material can be used: -Dry specimens: The obligatory voucher must be the thallus or large colonial structure built by cyanobacteria. The recommended voucher can be dried treated samples. -Monoclonal isolate in culture: The obligatory voucher must be a living culture with a unique strain number kept in a culture collection with appropriate light source (quality and quantity of daylight). -Monoclonal and axenic isolate in culture: The obligatory voucher must be a culture deposited in two official collections (e.g. PCC [Pasteur Culture Collection] in Paris, France, ATCC [American Type Culture Collection] in the USA, NIES-MCC [Microbial Culture Collection at the National Institute for Environmental Studies] in Tsukuba, Japan); those collections will give a unique number that is requested for publishing in official journals.

Diatoms
Different kinds of biological material can be used for diatoms: -Culture: The obligatory voucher must be biomass of the monoclonal culture kept at -80 °C or fixed in ethanol or fixed in a buffer and a labelled permanent microscope preparation of cleaned culture (frustules and valves). The recommended voucher can be a living culture with a unique strain number (kept in a culture collection), dried treated material, scanning electron microscope (SEM) stubs, loose dried unoxidised material, permanent slide. The recommended documentation comprises photographs of the living cells and of valves from slides and SEM stubs. Some diatom taxa can also be cryopreserved (e.g. Stock et al. 2018). -Colony or filament: The obligatory voucher must be a labelled permanent microscope preparation of the material (frustules and valves). The recommended voucher can be a living culture with a unique strain number (kept in a culture collection), dried treated material, SEM stubs, loose dried unoxidised material, permanent slide. The recommended documentation comprises photographs of the living cells and of valves from slides and SEM stubs. -Single cell: The obligatory voucher must be light and/or electron microscope photographs showing diagnostic details of the cell. The recommended voucher can be a permanent microscopic slide with the frustule, if it has not been destroyed after extraction. -Environmental sample: The obligatory voucher must be raw material kept at -80 °C or fixed with ethanol (> 70%) or formaldehyde or buffer and a labelled permanent microscope preparation of the cleaned culture (frustule and valves). The recommended voucher can be dried treated material, SEM stubs, loose dried unoxidised material, photographs.

Other microalgae
We propose one kind of biological material that can be used for other microalgae (other than diatoms and cyanobacteria); however, other options may be considered: -Culture: The obligatory voucher must be a monoclonal culture kept at -80 °C or fixed in ethanol or fixed in a buffer. The recommended voucher can be a living culture with a unique strain number (kept in a culture collection), dried treated material, light and/or electron microscope photos showing diagnostic details of the cell. Some microalgae taxa can also be cryopreserved (e.g. Stock et al. 2018).

Macroalgae and aquatic angiosperms
One kind of biological material can be used for macroalgae and aquatic angiosperms: -Specimen kept dry or wet: The obligatory voucher must be a voucher dried on a herbarium sheet. The recommended vouchers can be parts of the plant preserved wet (ethanol or formaldehyde) for anatomical or detailed morphological observations and/or a living culture with a unique strain number (kept in a culture collection).

Foraminifera
Different kinds of biological material can be used for Foraminifera: -Single cell with mineral test: The obligatory voucher must be electron microscope photographs showing details of the test. Since tests are destroyed during extraction, paratypes should be kept on micropaleontological slides. The recommended voucher can be dried tests of paratypes stored in micropaleontological slides at room temperature. -Single cell with organic wall or naked: The obligatory voucher must be light microscope photographs showing important details of the cell. Since cells are destroyed during DNA extraction, paratypes should be fixed in formalin and, for long term storage, be transferred into 70% ethanol. The recommended voucher can be test fixed in 4% formalin and permanently stored in 70% ethanol.
-Environmental sample: The obligatory voucher must be untreated material of the environmental sample stored at -20 °C or -80 °C. The recommended voucher can be single cells isolated from the environmental sample, kept on slides or fixed and stored in 70% ethanol.

Macroinvertebrates, including crustaceans, echinoderms, insects, sipunculids and cnidarians
Different kinds of biological material can be used: -Specimen frozen, dried or preserved in ethanol: The obligatory voucher must be the hard exoskeleton dried for terrestrial taxa (e.g. pinned insects) or a specimen kept in ethanol (70-96%) or frozen (-20 °C). Slide mounts of part or of the whole specimen in permanent medium are recommended, if needed for morphological observations. The recommended voucher can also be tissue samples of specimens kept at -80 °C or -20 °C if the obligatory voucher is preserved in 70-96% ethanol.

Annelids
One kind of biological material can be used (Timm and Martin 2015;Vivien et al. 2017): -Specimen preserved in absolute ethanol at -20 °C: The obligatory voucher must be the anterior parts of specimen (at least the first 15 segments) kept in ethanol (> 80%) or in 4% formalin or mounted on a slide in a permanent medium. The recommended voucher used for subsequent genetic analyses can be a tissue sample of the obligatory voucher kept in absolute ethanol at -20 °C or -80 °C.

Molluscs
Two kinds of biological material can be used: -Specimen preserved dry: For shelled specimens, the obligatory voucher must be a shelled specimen preserved dry. The recommended voucher can be a separate tissue sample, high-resolution imaging data for shell surface and internal organisation (e.g. scanning electron microscopy, microtomography) and 3D model. For shell-less specimens (e.g. slugs), no ideal method for dry vouchers exists, but comprehensive imaging should be performed on living specimens prior to storing specimens in a wet collection. -Specimen preserved wet: The obligatory voucher must be a specimen stored wet in a preservative suitable for morphology and genetics (e.g. 80% ethanol, propylene glycol) or kept at -20 °C or -80 °C. The recommended voucher can be specimen tissues stored in a preservative suitable for morphology and genetics, comprehensive imaging data, permanent microscopic slide(s) with genital apparatus, mouth parts or other diagnostic morpho-taxonomic characters. Due to their high water content, wet preserved molluscs dilute the preservation liquid. As such, the initial preservative should be renewed after 24 h and a generally fair preservative to tissue ratio respected (e.g. > 5:1).

Fish
Two kinds of biological material can be used: -Entire specimen or part of body preserved in ethanol or frozen: The obligatory voucher must be the entire body or part of the body preserved in ethanol (> 80%) or frozen (-20 °C). -Tissue samples preserved in ethanol or frozen or scales preserved dry: The recommended voucher can be tissue samples of specimens kept at -80 °C or in ethanol (> 70%) or scales preserved dry.

Aquatic fungi
Two kinds of biological material can be used: -Population from environmental sample: The obligatory voucher must be a sample collected on a polycarbonate filter (0.6 μm pore-size) and stored at -20 °C or -80 °C. The recommended voucher can be a photograph. -Obligate parasites: The obligatory voucher must be a living culture isolate with its host. The recommended voucher can be raw culture material kept frozen (-20 °C or -80 °C), photographs.

DNA harvesting and banking
DNA can be extracted from a variety of biological materials depending on the organism group (cultures, tissues, populations, natural samples of mixed organisms etc.). As Table 1 illustrates, the preferred method to extract DNA differs amongst organism groups and, for some organisms, multiple methods are possible. Common to most groups is the downstream use of PCR and a choice of Sanger or High Throughput Sequencing (HTS) to generate the DNA barcode, depending on the state of the specimen (e.g. age, mixed sample etc.). However, shallow shotgun sequencing (i.e. genome skimming) is also used to generate reference barcodes (Alsos et al. 2020). It is of great importance to store an aliquot of extracted DNA permanently to make DNA available for future research. The aliquot should be stored permanently at -20 °C or -80 °C in an established DNA bank or a biological specimen repository, with links to the corresponding metadata (see Section 3 below). To secure its visibility and accessibility, we recommend that the repository used is affiliated with a national or international network such as RARe,

Table 1

Cyanobacteria
A basic phenol/chloroform extraction is feasible, but even boiling/freezing to release the DNA or boiling individual filaments in the reaction mix tube in the PCR machine prior to adding the DNA polymerase will work (Laamanen et al. 2001, 2002).
-Culture of monoclonal isolate, axenic or not: Several DNA extraction methods are available, but the method of Shih et al. (2013) was tested successfully on a wide diversity of cyanobacteria.

Diatoms
-Monoclonal culture from single algal cell isolation: Cell isolation from an environmental sample, which is then grown in culture. DNA extraction from the culture, PCR and Sanger sequencing. Culture does not need to be axenic, but must host a single diatom taxon. Adopt this methodology for cultivable species. Alternatively: high-throughput sequencing of the environmental sample, with subsequent bioinformatics and phylogenetic analysis to isolate the target species barcode. Adopt this method when target species are relatively well represented in an environmental sample, but are difficult or impossible to cultivate.

Other microalgae
-Culture of monoclonal isolate, pure or not: DNA extraction from the culture, PCR and Sanger sequencing. Culture does not need to be axenic, but must host a single algal taxon. Adopt this methodology for cultivable species.

Macroalgae
Specimen silica-dried, from herbarium specimen or preserved in ethanol (

Annelids
-Specimen preserved in ethanol (> 80%) at -20 °C or in neutral buffered formalin or fresh/frozen**: DNA extraction of entire specimens or a fragment of specimens using guanidine lysis buffer or a commercial kit, PCR and Sanger sequencing or high-throughput sequencing of genetically-tagged specimens (Vivien et al. 2017, 2020).

Storage of DNA barcodes
We strongly recommend DNA barcodes and associated metadata to be stored digitally in public, open-access databases. Examples include general databases like BOLD (Ratnasingham and Hebert 2007)

Associated metadata
We consider the DNA sequence as primary data and all accompanying information as metadata. (See supplementary file for tabular overview of the below listed metadata).

General remarks about metadata
A reference barcode must be accompanied by a minimum set of metadata. The metadata, including photographs, must be stored in digital and open-access databases, such as those listed in Section 2.2 or in a publicly accessible collection database of the storing institution. It is important that the metadata is stored in non-proprietary formats (e.g. text documents in .txt, images in .tif) and that such databases comply with FAIR Data principles (Findable, Accessible, Interoperable, Re-usable, see Wilkinson et al. 2016) and the Biodiversity Information Standards, such as the Darwin Core, Audubon Core, ABCD 2005 standard (https://www.tdwg.org).
The metadata must be linked to the barcode via a unique identifier assigned by the database where the sequence is stored (e.g. accession number in ENA or in GenBank). If voucher material or culture strains are available in natural history or institutional culture collections, these should be linked to the sequence.
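As an illustration of linking metadata to a barcode through a unique identifier and storing it in a non-proprietary text format, the following minimal Python sketch writes one record as tab-separated text. The field names are borrowed from Darwin Core terms; the choice of which fields to treat as required, the helper names and all values (including the accession `XX000000`) are assumptions made for this example, not prescriptions of this document.

```python
import csv
import io

# Hypothetical record: every value is a placeholder, not real data.
record = {
    "associatedSequences": "XX000000",   # placeholder sequence accession
    "institutionCode": "EXAMPLE-COLL",   # placeholder collection code
    "catalogNumber": "EX-0001",          # placeholder voucher number
    "scientificName": "Genus species",
    "identifiedBy": "A. Person",
    "dateIdentified": "2020-01-31",
}

# Assumed minimum for this sketch: the sequence link, the voucher and the name.
REQUIRED = ("associatedSequences", "catalogNumber", "scientificName")


def missing_fields(rec, required=REQUIRED):
    """Return the required fields that are absent or blank in rec."""
    return [key for key in required if not str(rec.get(key, "")).strip()]


def to_tsv(records):
    """Serialise records as tab-separated text with a header row."""
    fieldnames = sorted({key for rec in records for key in rec})
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(missing_fields(record))
    print(to_tsv([record]))
```

Plain tab-separated text was chosen here only because it can be opened without proprietary software; a database export in another open format would serve the same purpose.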
As detailed in the following sections, metadata should include information on the DNA marker, the strain cultivation, the natural sample, the taxonomic name, the identification, the sampling location, the voucher location and the barcode authors.

Table 1 (continued)
Disruption of filamentous fungal cell walls with glass bead method, followed by digestion using proteinase K (Gontia-Mishra et al. 2014).
Rinsing fungal mycelia or yeast cells with pure water to remove potential PCR inhibitors, followed by thermolysis at 85 °C in lysis buffer, DNA amplification and extraction (Zhang et al. 2010).
* Methodology is very difficult as it is challenging to get pictures and PCR products of the same cells.
** Fixation of aquatic oligochaetes and most soft-bodied organisms in a low concentration of ethanol can lead to a fragmentation/disintegration of specimens (Vivien et al. 2018, 2020b). For in situ fixation of large (> 2-3 L) sediment samples, it is recommended to use neutral buffered formalin (final concentration of 4% neutral formaldehyde) instead of ethanol to avoid the destruction of oligochaete specimens (Vivien et al. 2018, 2020b). Sieving should be performed within 4 weeks after sampling and material retained should be stored in absolute ethanol at -20 °C (Vivien et al. 2018).

Categories of metadata

Biological material metadata
The metadata listed below give the obligatory and recommended items that ensure the traceability of the biological material used for DNA harvesting (see Section 2.1).

Biological specimens and environmental samples
Obligatory metadata
1. Location of the sampling site
a) Geographical coordinates: for example, expressed in decimal values in WGS84 or in a different, specified geographical positioning system.
b) Country according to the ISO 3166 standard, accepted name of ocean or sea.
c) Name of the locality.
Remarks:
-For species of heritage interest, Red List species or endangered species, national or regional regulations might ask not to reveal precise location coordinates in order to protect their populations. These regulations should be followed and the exact locality information hidden.
-In some cases, exact coordinates are not available (e.g. when older museum specimens are used; here a georeference of the locality plus an estimated uncertainty in metres can be added).
(rock, macrophyte, sediment, hot vent, interstitial, etc.).
3. Habitat (e.g. plankton, epipelon, epilithon, epipsammon, tychoplankton, alluvial region, porous or karstic aquifer, sea floor, pelagic, benthic, intertidal, subtidal, etc.).
4. Sampling elevation (m a.s.l.).
5. Sampling depth (m).
6. Sampling device or sampling protocol.
7. Photos of the sampling site.
8. Environmental measurements: luminosity, pH, conductivity, salinity, temperature, sediment's grain size, organic matter content and redox potential.
9. Main ecological function(s) of the specimen (if known). For instance, already existing ecological classifications, such as FAPROTAX (Louca et al. 2016) and Tax4Fun (Aßhauer et al. 2015) for bacteria, the classification of Reynolds et al. (2002) and Padisák et al. (2009) for phytoplankton and Rimet and Bouchez (2012) for diatoms, for macro-invertebrates the classification of Usseglio-Polatera et al. (2000) or for plants, the one of Kattge et al. (2011).
10. Photos should carry the name of the photographer and associated licence, preferably a Creative Commons Licence that allows usage by third parties.
11. The FAO fishery area where the sampling was done (for marine taxa).
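The location requirements above lend themselves to simple automated checks. The sketch below, in Python, validates a location record and produces a generalised copy for protected taxa. It rests on several assumptions that are not part of this document: field names follow Darwin Core terms, the country is given as a two-letter ISO 3166-1 code (only the format is checked, not membership of the official list), and coordinates are hidden by rounding to one decimal place, which is just one possible way of withholding the exact locality. The function names and the example site are hypothetical.

```python
def validate_location(rec):
    """Return a list of problems found in a location record (empty if none)."""
    problems = []
    lat = rec.get("decimalLatitude")
    lon = rec.get("decimalLongitude")
    if not isinstance(lat, (int, float)) or not -90 <= lat <= 90:
        problems.append("decimalLatitude missing or outside [-90, 90]")
    if not isinstance(lon, (int, float)) or not -180 <= lon <= 180:
        problems.append("decimalLongitude missing or outside [-180, 180]")
    if not rec.get("geodeticDatum"):
        problems.append("geodeticDatum not specified (e.g. WGS84)")
    code = rec.get("countryCode", "")
    has_country = len(code) == 2 and code.isalpha() and code.isupper()
    if not has_country and not rec.get("waterBody"):
        problems.append("neither a two-letter countryCode nor a waterBody given")
    if not rec.get("locality"):
        problems.append("locality name missing")
    uncertainty = rec.get("coordinateUncertaintyInMeters")
    if uncertainty is not None and uncertainty <= 0:
        problems.append("coordinateUncertaintyInMeters must be positive")
    return problems


def generalise_for_protected(rec, decimals=1):
    """Return a copy with rounded coordinates and a note that detail is withheld."""
    out = dict(rec)
    out["decimalLatitude"] = round(rec["decimalLatitude"], decimals)
    out["decimalLongitude"] = round(rec["decimalLongitude"], decimals)
    out["informationWithheld"] = "precise coordinates withheld (protected taxon)"
    return out


# Hypothetical sampling site used only to exercise the functions.
site = {
    "decimalLatitude": 46.40712,
    "decimalLongitude": 6.58943,
    "geodeticDatum": "WGS84",
    "countryCode": "FR",
    "locality": "Example locality",
}

if __name__ == "__main__":
    print(validate_location(site))
    print(generalise_for_protected(site))
```

Whether rounding is acceptable, and to what precision, depends on the national or regional regulation that applies; the function only shows where such a rule would be enforced.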

Cultures
Obligatory
1. Metadata associated with the environmental sample which was used to establish the culture (see above section).
2. Name of person who isolated the starting cell.
3. Date of isolation (date the uni-algal culture was established by isolating a cell from the environmental sample).
4. Date of harvesting (date the culture was harvested to extract DNA).
5. Photo(s) showing diagnostic features of the organism.

Recommended
1. Culture medium (recipe of medium used for cultivation).
2. Culture condition (light intensity, light cycles, temperature, humidity etc.).
3. Strain identifier (name or number that uniquely identifies the cultured strain in the collection).
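Because both the date of isolation and the date of harvesting are recorded for cultures, a database can check that they are mutually consistent. The following Python sketch assumes dates are stored as ISO 8601 strings (YYYY-MM-DD), which this document does not prescribe; the function name is hypothetical.

```python
from datetime import date


def days_in_culture(date_of_isolation, date_of_harvesting):
    """Return the days between isolation and harvesting.

    Raises ValueError if harvesting is dated before isolation, or if
    either string is not a valid ISO 8601 calendar date.
    """
    isolated = date.fromisoformat(date_of_isolation)
    harvested = date.fromisoformat(date_of_harvesting)
    if harvested < isolated:
        raise ValueError("date of harvesting precedes date of isolation")
    return (harvested - isolated).days


if __name__ == "__main__":
    print(days_in_culture("2019-03-01", "2019-04-15"))
```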

Taxonomic information
To ensure universal practices across countries, taxonomic information must follow the international nomenclatural rules. Therefore, we recommend following the European standard (CEN 2014) for taxonomic identification. This standard includes, in particular, the following items:
Obligatory data
1. The most reliable identification to the lowest possible taxonomic rank.
Recommended data
1. Genus names should include the name of the author(s) with year of publication of its original valid (algae, bacteria, fungi and plants) or available (animals) publication. Genus names should be written in italics.
2. Species epithets should include the name of the author(s) with year of publication of its description or combination (see above). Species epithets should be written in italics, while "sp." should not.
3. If applicable, include name of infraspecific taxon (subspecies, variety, forma etc.) and citation of the author(s) with year of publication of its description or combination (see above).
4. Further notes on taxon status (e.g. phylogenetic affiliation or statements on the taxon concept adopted, i.e. whether narrow/strict (sensu stricto) or broader (sensu lato)). If necessary, a taxonym should be given including link to a published circumscription (Berendsohn 1995).
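The italicisation rules above (genus and species epithet in italics, "sp." and the authorship in roman type) can be applied mechanically when names are rendered from a database. The Python sketch below is one way to do so; it marks italics with asterisks purely as an illustrative convention, and the function name and parameters are hypothetical.

```python
def format_taxon(genus, epithet=None, authorship=None):
    """Render a taxon name, wrapping the italic part in asterisks.

    The genus and species epithet are italicised; "sp." and the
    authorship string are left in roman type.
    """
    if epithet is None or epithet == "sp.":
        name = f"*{genus}* sp."
    else:
        name = f"*{genus} {epithet}*"
    if authorship:
        name = f"{name} {authorship}"
    return name


if __name__ == "__main__":
    print(format_taxon("Salmo", "trutta", "Linnaeus, 1758"))
    print(format_taxon("Salmo"))
```

Infraspecific ranks are deliberately left out of this sketch, since their rendering differs between the botanical and zoological codes.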

Identification metadata
Obligatory data
1. Name of the person who identified the specimen.
2. Date on which identification was made.
3. Identification method (e.g. morphology, BOLD ID engine, NCBI BLAST, etc.).
4. If applicable, use accepted terms to indicate uncertainty in the identification (e.g. aff., cf., sp., etc.).

Weigand et al. (2019) highlighted that particular care must be taken regarding the quality assurance/quality control of the reference barcode records to be produced, as failure to do so will limit their application, render them useless or even introduce wrong outcomes. The procedure and description of metadata associated with reference sequences given in this document are steps in this direction. We do not give a procedure to control the quality of the barcode, but we hope our descriptive overview and recommendations will enable standardisation and best-practice in the production and curation of reference DNA barcodes. The metadata provide key information for users of barcode reference libraries which could be time-saving in data-analysis processes. Some of them, such as ecological functions or ecological classifications, can lead to more direct utilisation of metabarcoding in functional ecological studies. This document might appear to set ambitious targets for those producing barcodes; however, we believe that this minimum is necessary to ensure quality in barcode reference libraries and thus provide trustworthy results for DNA barcoding and metabarcoding. Moreover, a general focus of sustainable development and energy-saving measures should be considered by the collections hosting the vouchers.

Conclusions
Finally, as an overall philosophy, we wish to encourage forward thinking on the format and the contents of barcode libraries and on the need for a secure access to the invaluable genetic information therein, including the information linked to specimens from which the DNA originated.

Glossary
Arthropods: Multicellular animals of the phylum Arthropoda, including insects, arachnids, myriapods and crustaceans.
Barcode: See DNA barcode.
Base pair: Pair of complementary cross-linked nucleotides that are the building blocks of the DNA double helix.
Biological specimen: An organism of any kingdom (animal, plant or fungi) or part of an organism. Can be living ('living specimen'), frozen, dried (e.g. herbarium material, pinned insects, fish scales) or preserved in liquid preservatives (e.g. entire fish in ethanol: 'preserved specimen').
BOLD: Barcode of Life Data Systems (www.boldsystems.org).
Culture: Ex-situ cell culture. As a clonal culture derived from one isolated cell from the environment.
Cultivator: Person responsible for the cultivation of a strain.
Cyanobacteria: Phylum of free-living photosynthetic bacteria.
Diatoms (Bacillariophyta): Group of unicellular algae, some of which form filaments or colonies, with cell walls made of silica. They are major contributors to primary productivity worldwide and are often used in ecological assessment.
DNA barcode: A stretch of DNA from a universally-accepted DNA marker that uniquely identifies specimens to species, in the context of DNA-based identification, often called just 'barcode'.
DNA marker: Name of the coding or non-coding region (e.g. gene, spacer region) within the genome from which the barcode has been sequenced. The naming of the coding or non-coding regions should follow standard scientific practice.
ENA: European Nucleotide Archive (www.ebi.ac.uk/ena).
Environmental sample: Collection of a portion of a natural environment (water, sediment, soil or air). It contains DNA from organisms living in this environment.
Fish: A group of vertebrates containing jawless fish (Agnatha), cartilaginous fish (Chondrichthyes) and bony fish (Osteichthyes).
Fungi: Group of heterotrophic eukaryotes including zoosporic, filamentous and yeast forms.
Foraminifera: Group of unicellular heterotrophic or mixotrophic eukaryotes with organic theca or agglutinated or mineral test (rarely naked) living in all marine environments and also found in freshwater and soil. Foraminifera are used as bioindicators and can give information on pre-anthropogenic conditions as they fossilise.
GenBank: National Institutes of Health (USA) genetic sequence database, an annotated collection of all publicly available DNA sequences.
Habitat: Specific environment in which an organism lives.
HGNC: HUGO Gene Nomenclature Committee.
Insects (class Insecta): Hexapod invertebrates within the phylum Arthropoda (e.g. beetles, flies, odonates), characterised by a chitinous exoskeleton, a three-part body (head, thorax and abdomen), three pairs of jointed legs, compound eyes and one pair of antennae.
Isolate: A population of cells isolated from a natural population in order to be cultured and studied. The term is usually applied in microbiology.
Isolator: Person responsible for the isolation of the cell from which the clonal culture was established.
Macroalgae: Macroscopic algae, comprising red (Rhodophyta), green (Viridiplantae) and brown (Phaeophyceae) lineages and forming ecologically-important primary producers in marine (all three lineages) and freshwater (green algae, mainly) ecosystems.
Metabarcoding: An identification method that enables identification of a mixture of organisms in a sample using short DNA sequences and high-throughput sequencing.
Molluscs: An organism group referring to the taxa Gastropoda (snails and slugs), Bivalvia (e.g. clams, scallops, mussels), Polyplacophora (chitons), Cephalopoda (e.g. squids, octopus), Scaphopoda (tusk shells), Aplacophora and Monoplacophora.
NCBI: National Center for Biotechnology Information (www.ncbi.nlm.nih.gov).
Obligate parasite: A living organism which depends on a host to complete its lifecycle.
Oligochaetes: Class of the phylum Annelida. The principal aquatic oligochaete families are Naididae (Naidinae and Tubificinae), Enchytraeidae, Lumbriculidae, Haplotaxidae and Propappidae. In addition, the Lumbricidae family includes aquatic and amphibious species.
Pherogram: Graphical account of the results from Sanger sequencing where each nucleotide is represented by a single peak and the sequence of peaks correlates to the DNA sequence of the sample analysed.
Primer: Strand of nucleic acids that serves as starting point for DNA replication.
PCR: Polymerase Chain Reaction; process used for the amplification of a target region of DNA.
PCR primer: synthesised short single-stranded nucleic acids that serve as starting point for DNA synthesis in the PCR-reaction. Taxon (plural taxa): Taxonomic unit, for example family, genus or species. In systematics, it designates a unit to which living beings are assigned according to certain criteria. Each known taxon has a scientific name, nomenclatural type and a circumscription. Taxonym: Taxonomic concept specified by the scientific name and the reference in which its name is used.