Forum Paper
Corresponding author: Martyn Kelly (mgkelly@bowburn-consultancy.co.uk) Academic editor: Agnès Bouchez
© 2019 Martyn Kelly.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Kelly M (2019) Adapting the (fast-moving) world of molecular ecology to the (slow-moving) world of environmental regulation: lessons from the UK diatom metabarcoding exercise. Metabarcoding and Metagenomics 3: e39041. https://doi.org/10.3897/mbmg.3.39041
Development of effective metabarcoding-based tools for ecological assessment requires more than just detailed knowledge of ecology and molecular genetics. There is also a need to understand the context within which the tools will be used, and for the organisation that uses them to understand the techniques involved and, more especially, how the data produced differ from those generated by traditional ecological methods. Lessons learnt during the development of a metabarcoding tool for phytobenthos in the UK are set out in this paper. The project attempted to develop a molecular “mirror” of the existing light microscopy-based approach to ecological assessment. Although this conservative approach does not exploit the full potential of metabarcoding data, it does mean that benchmarks exist against which performance and data can be judged. However, the pace of developments within molecular ecology means that regulators will need to find ways of incorporating new scientific insights whilst, at the same time, ensuring a stable regulatory regime. Installation of a metabarcoding technique within a regulatory organisation, in other words, is more than a transaction in which one approach is switched for another. A deeper transformation of the organisation is required.
diatoms, implementation, metabarcoding, phytobenthos, Water Framework Directive
In 2017, the Environment Agency in England took a decision to adopt a metabarcoding approach for ecological status assessment using diatoms, in order to fulfil their obligations under the European Union (EU)’s Water Framework Directive (WFD; European Union 2000). This was the first time that an approach based on Next Generation Sequencing (NGS) had been adopted by a state regulator for WFD assessments in Europe, and the decision provoked considerable interest amongst both academic ecologists and other environmental regulators. The science that underpinned this decision has been published in reports (
Broadly speaking, there was an unspoken assumption, made by both academic scientists involved and managers within the organisation, that implementing metabarcoding methods would involve a like-for-like replacement. A method of generating ecological classifications based on traditional morphology-based identification using light microscopy (LM) would, in other words, be replaced by one which used genomic technology (NGS). The expectation was that, with a few deft shuffles of those parts of organisational flow charts relating to how ecological data were obtained, a Brave New World of ecological assessment would be ripe for the taking.
What we know now is that the changes need to be deeper and that these may extend to the culture of the organisation itself. Key decisions, in this case, were made by experienced managers who, mostly, had little understanding of molecular biology. A lesson learnt from the UK experience is that, rather than think in terms of simple transactions to produce ecological data more efficiently, the organisation needs to think more boldly in terms of transformation and this is the subject of this essay. In particular, the current mode of implementation, with analytical work focussed in a few “High Throughput Sequencing (HTS) laboratories” raises questions about the “product” of ecological assessment. Should we think solely in terms of a metric value or status class assignment produced from a “black box”? Is there still a role for a specialist ecologist to explain and interpret results? If so, how does the organisation support and develop them? In blunt terms, the shift from traditional means of gathering ecological data to a reliance on HTS laboratories is part of a broader trend in which ecological data and evidence are gathered and processed by automated means, with the risk that professional ecologists become marginalised.
Ecological assessment requires summary information about spatially and temporally variable ecological communities in a form that can be communicated within organisations and with stakeholders in a manner that supports decision making. Most of those who use the data will not be specialists in the organisms that are being assessed; many may not even be ecologists. At its most reductionist, data are collapsed to summary metrics (“Ecological Quality Ratios”, or EQRs, in the case of the WFD) which are further classified into one of five status classes. In theory, such concentrated nuggets of transferable ecological information can be reconstituted into a “guiding image” of the community or ecosystem in question (
Couple this with a trend towards metrics that are based upon complicated algorithms (
Engagement with one of the first instances in which an ecological metabarcoding approach was taken right through from conception to operational deployment offers some perspectives on the potential of this method. Whilst there is no shortage of academic papers hyping the potential of genomic approaches (
Whilst the WFD sets out a broad ambition for sustainable use of aquatic resources within the EU, it is the responsibility of each individual Member State to implement the Directive within their territory as they see fit. The result is a wide diversity of methods (
The implementation of the metabarcoding system in England also needs to be viewed within a broader political context. In particular, responsibility for the environment is devolved to the four constituent members of the United Kingdom: England, Scotland, Wales and Northern Ireland. Political and policy control rests with elected representatives but day-to-day regulation is the responsibility of agencies that are answerable to the administrations, but which retain some flexibility in the way that they manage the environment. At the same time, when dealing with the European Union on issues relating to the WFD, the United Kingdom presents a single face, via a semi-independent body, the UK Technical Advisory Group for the Water Framework Directive (UK TAG: http://www.wfduk.org). Whilst UK TAG often works smoothly, there are cases (this was one) when tensions amongst the national environment agencies can arise.
Finally, the development of the UK metabarcoding approach needs to be understood against the background of the financial crisis that engulfed the UK (and much of the rest of the world) in 2009. This had major implications for the financing of the public sector in the UK, with environmental regulation being one area that had to operate with reduced budgets. This, in turn, led to a desire to seek economies whilst still fulfilling their legal obligations with respect to EU legislation. The internal buzzword in the Environment Agency was “more for less”; in practice, several areas of activity (including monitoring) were reduced and searches for more efficient means of collecting environmental data were initiated. Claims that metabarcoding would reduce costs compared to conventional ecological analyses (
By comparison, despite dominating political debate in the UK for the past three years and generating considerable uncertainty, Brexit had a relatively small effect on the project. Following the referendum, the UK environment agencies adopted a “business as usual” approach with regard to their responsibilities to the European Commission, with the expectation that there would be a transition period between formally leaving the EU and full political independence. In any case, the principles of the WFD have been transposed to UK law and will continue to be applied for the foreseeable future, albeit outside the jurisdiction of bodies such as the European Court of Justice.
The WFD has had two profound effects on the way ecological assessment is performed in UK environment agencies: one positive and one negative. The positive effect is that the requirement to evaluate a broad range of ecosystem components initially encouraged the diversification of biological skills amongst ecologists. In 2010, for example, there were about 40 biologists distributed throughout the UK environment agencies with diatom identification skills, most of whom were also able to analyse invertebrate samples and perform macrophyte surveys, as well as respond to a wide range of enquiries about local phenomena from the public. They collected their samples from a limited geographical area that they grew to know well and, over time, developed a holistic awareness of freshwater systems in this area. On the other hand, the sophisticated nature of the calculations which underpin WFD assessments, along with the need for a harmonised approach to regulation within the country, has contributed to the centralisation of decision-making within the agencies and an increasing reliance on standardised assessment tools in place of local ecological expertise.
Following the preliminary literature review (
Once these basic components of a metabarcoding system were in place, attention shifted to implementation and implications for regulation. The means for computing “expected” values of the TDI (i.e. the denominator in Ecological Quality Ratio calculations) was recognised as a weakness of the LM approach used in the UK, and the metabarcoding project presented an opportunity to re-examine this, resulting in an improved reference model based on stronger conceptual foundations (Kelly et al. 2018). This also precipitated a new look at how phytobenthos and macrophyte data should be combined to provide an integrated assessment of “macrophytes and phytobenthos”, as required by the WFD. However, by the time this work was completed, planning for the third cycle of the WFD was too far advanced for a wholesale shift in the approaches to assessment to be contemplated, and the revised approach was not adopted.
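The role of the “expected” TDI in classification can be illustrated with a minimal sketch. The formulation below (subtracting TDI from 100 so that higher values indicate better quality before taking the ratio of observed to expected) is one common way of deriving an Ecological Quality Ratio from a TDI-style metric; it is a simplified illustration rather than the operational UK calculation, and the numbers used are invented.

```python
def tdi_to_eqr(tdi_observed: float, tdi_expected: float) -> float:
    """Derive an Ecological Quality Ratio (EQR) from observed and expected TDI.

    TDI runs from 0 (best) to 100 (worst), so both values are subtracted
    from 100 before taking the ratio; the EQR is capped at 1.0.
    Illustrative sketch only -- the operational calculation is defined
    in the reports cited in the text.
    """
    eqr = (100.0 - tdi_observed) / (100.0 - tdi_expected)
    return min(eqr, 1.0)

# Invented values: a site with an observed TDI of 55 against an
# expected (reference) TDI of 20 predicted by a reference model.
print(tdi_to_eqr(55.0, 20.0))  # 0.5625
```

The reference model itself, which predicts the expected value from environmental predictors, is reduced here to a single supplied number; in practice it is the part of the calculation that the project re-examined.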
| Output | Reference and notes |
|---|---|
| RbcL primer optimised for Illumina Mi-Seq | |
| Bioinformatic pipeline (QIIME2) | |
| Barcode library (1232 strains / 346 taxa) | |
| Optimised version of TDI for LM (TDI5LM) and recalibrated version of TDI for metabarcoding (TDI5NGS) | Principle and method described in |
| Comparison of reproducibility and repeatability for LM and NGS at different scales | |
| Guidance on sampling strategies to minimise contamination when sampling diatoms for metabarcoding | |
| Preliminary guidance on interpreting results from diatom metabarcoding analyses | Annex to |
| Revised reference models for estimating ecological status using diatoms (applicable to both LM and NGS versions of TDI) | |
| Revised combination rules for combining macrophytes and phytobenthos for ecological status assessments | |
Ecological assessment produces three separate “products”:
1. The nuggets of transferable information (EQRs and ecological status classifications in the case of the WFD);
2. Underpinning data (lists of taxa and their relative abundances); and,
3. Collateral benefits, as engagement with samples and surveys increases an individual’s knowledge and understanding of the ecosystems under consideration.
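The first of these products — the reduction of a variable community to a single “nugget” — can be made concrete with a short sketch in which a continuous EQR is banded into one of the WFD’s five status classes. The boundary values below are placeholders for illustration only; operational boundaries are metric-specific and are set through intercalibration.

```python
# WFD status classes, ordered from best to worst.
STATUS_CLASSES = ["High", "Good", "Moderate", "Poor", "Bad"]

def eqr_to_status(eqr: float, boundaries=(0.80, 0.60, 0.40, 0.20)) -> str:
    """Band a continuous EQR (0-1 scale) into one of five status classes.

    The default boundaries are illustrative placeholders, not the
    operational values used for any particular metric.
    """
    for status, boundary in zip(STATUS_CLASSES, boundaries):
        if eqr >= boundary:
            return status
    return STATUS_CLASSES[-1]  # below the lowest boundary

print(eqr_to_status(0.92))  # High
print(eqr_to_status(0.55))  # Moderate
```

The point of the sketch is how much information is discarded at this step: two very different assemblages can yield the same class, which is why the second and third “products” matter for interpretation.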
The focus of the UK phytobenthos metabarcoding project was primarily on the first of these. Most of those managing the projects had “head office” functions with responsibility for national classifications and strategy, and for reporting UK compliance. Although taxa lists played an important role in interpreting assessment results in the past, their primary role for statutory assessment is as raw material for metric calculation. Budget constraints have led to a greater focus on standardised approaches to reporting and risk assessment, with fewer opportunities for local staff to interpret diatom outputs.
There are substantial differences in the relative abundance of diatom species in lists produced by conventional and metabarcoding approaches (
The ability of metabarcoding to detect species is dependent, to a large extent, on the quality of barcode libraries (
Both the absence of taxa from barcode libraries and differences in their representation in light microscopy and metabarcoding data need to be understood by those engaged in interpretation. This is important because, although classifications are broadly comparable, it is possible that individual sites within a data cloud may be classified differently using light microscopy and metabarcoding (see below). Responsibility for explaining such shifts often falls to local staff who will, in turn, need to understand and explain the reasons, for which a list of taxa and abundance assessments are necessary. The taxa list also provides a link to a wider pool of knowledge, including ecological traits (
Whilst the previous paragraph ended with a call for a “structured training programme” as part of the transition to metabarcoding, this one starts by lamenting the decline of a widescale “unstructured training programme” arising from the engagement of individual ecologists in the UK with all stages of assessment, from sampling to interpretation, within a relatively small geographic area, often for several years. A business strategy that prioritises ‘efficiency’ over depth of technical knowledge and resilience in its experts has led to fragmentation of the data chain in recent years. This, in turn, leaves ecologists increasingly frustrated in their desire to develop the detailed local ecological understanding needed to properly assess and interpret environmental data. Because sampling phytobenthos for metabarcoding is a straightforward process that can be delegated to non-specialist technicians, after which all aspects of metabarcoding analysis take place in remote laboratories, an ecologist’s first encounter with a sample may be as a list of taxa in a spreadsheet, or even as a final metric or classification result. As a result, it is increasingly difficult for ecologists to contextualise data, as they are no longer so familiar with the locality from which the data are derived. The implications of this may take a few years to become apparent, as the first generation of ecologists who interpret metabarcoding data will have this experience on which to draw. After that, will there be any capacity for an informed “reality check” of the outcomes of bioinformatic pipelines?
• Option 1: Like-for-like replacement of existing methods. In other words, use NGS as alternative means of data acquisition, but continuing with existing principles behind metrics, reference conditions and status class boundaries;
• Option 2: Adopt the “Biomonitoring 2.0” philosophy propounded by
The UK phytobenthos method fits “option 1”. When development started, there were no studies that had used NGS to evaluate the response of diatom assemblage composition along environmental gradients, so the foundation of evidence that existed for LM studies provided a strong hypothesis for determining what taxa we should expect to detect in our metabarcoding output. In brief, if a species is present in an LM analysis, you should expect to find it in NGS output. If not, then you should take the opportunity to understand why this is not the case. This hypothesis also prompted us to ask questions about quantification (why is a particular taxon less abundant in NGS output than in an LM analysis, and vice versa?) and about the breadth of our taxa dictionary. The opposite also applies: if a species is present in NGS output, then does this mean that it has previously been overlooked using traditional analyses (
A further point is that a project that sets out to develop a practical tool for ecological assessment is constrained by the wording of current legislation and regulations that it will be used to enforce. In the case of the WFD, the normative definitions for ecological status refer to “composition” and “abundance” and Member States are required to intercalibrate their methods in order to ensure that all share a common level of ambition with respect to key WFD objectives (
On the other hand, the UK experience has, to date, not yielded a method that is appreciably better than the existing one, leading some to ask whether our conservative “option 1” approach was justified. In particular, uncertainties associated with both LM and NGS approaches mean that the classification of some water bodies will change solely as a result of the switch in method (see below). Taking a longer perspective, the current NGS method has laid foundations upon which methods that better exploit the potential of NGS can be developed. However, at the time of writing, no-one has proposed a strong assessment concept for phytobenthos that could improve upon, and potentially replace, the present “option 1” implementation.
Moving from methods that identify organisms by their morphology to identification based on genetic traits involves a major paradigm shift, not just in our taxonomic understanding, but also in how data are obtained and in their ecological meaning. Most of those responsible for making decisions about the implementation of metabarcoding trained in the era of morphology-based taxonomy and ecology, and had little or no understanding of the laboratory and bioinformatic stages involved in producing NGS output. Conversely, those responsible for primer design, DNA extraction and PCR, high-throughput sequencing and development of the bioinformatics pipeline were excellent molecular biologists but had little awareness of phytobenthos ecology or the regulatory framework within which the method would be used.
A further complication was that diatom taxonomy is also in the midst of its own paradigm shift, leading to reviews of species and generic limits (
Development of metabarcoding-based methods for ecological assessment inevitably calls for multidisciplinary teams, with individuals playing to their strengths. There is also a need for a measure of mutual understanding: regulators and “old school” ecologists need an awareness of how NGS data are produced; bioinformaticians need an awareness of the underlying legal context; and those whose focus is either high-level catchment management and nationwide strategy or, on the academic side, analysis of “big data”, need an awareness of the “natural history” of the organisms involved. With the benefit of hindsight, there was not enough overlap between the skills of the different specialists involved in the development of the UK diatom metabarcoding tool, and some basic education in each other’s perspectives would have been fruitful. This was particularly obvious at later stages, when a wider pool of individuals needed to be engaged yet where, due to pressures of time and budget, there were more teleconferences and fewer face-to-face meetings. The former, typically constrained to no more than a couple of hours, gave scant opportunity to understand and learn.
There should be a general expectation that the process of environmental regulation is stable and that, when a change does happen, due warning and explanation are given. A utility company, for example, may spend some years evaluating the case for, and design of, an improvement to a wastewater treatment plant, with an expectation that the investment would be recouped over a set time period. They would not anticipate the regulator adjusting targets during that period due to a change in the way that ecological status was evaluated. When a change does occur, they would expect this to be supported by sound evidence.
Yet genetic technologies promise greatly enhanced capacities to evaluate the environment and, in these early years of implementation, we are learning all the time. Is it not in the public interest to incorporate this new knowledge into regulatory regimes as soon as possible? Finding the balance between the competing demands for “stability” and “improvement” (and, in the latter case, sifting the genuinely useful from the over-hyped) is going to be one of the major challenges facing those charged with managing the transition to molecular approaches for ecological assessment over the next decade and beyond.
Two extreme views expressed during this project were that molecular methods should not be adopted until their development had “stabilised” or that they should be adopted as soon as they were fit-for-purpose but be “locked down” so that regulation was based on a single, unambiguous implementation of the method. The former argument is not tenable because of the rapid evolution of metabarcoding technologies. When development of the UK diatom metabarcoding method started, the Roche 454 was the state-of-the-art platform for NGS and primers were designed around its capabilities. During the course of the project, this machine became obsolete and primers had to be re-designed to meet the requirements of the Illumina Mi-Seq. There is no reason to assume that this platform will not, in turn, follow the Roche 454 into obscurity at some point, requiring the method to be fitted to yet another platform. There are, in addition, exciting new technologies available, such as Oxford Nanopore’s MinION (Grey et al. 2015), which might well make molecular analysis more accessible to field-based ecologists without the need to engage a remote HTS laboratory. The argument for waiting for methods to stabilise is simply not realistic in the era of molecular ecology.
In practice, the relationship between phytobenthos assessments based on LM and NGS is not perfect (see
At the heart of this lies the WFD’s controversial ‘one out, all out’ rule, by which the final status of a water body is determined by the lowest status of any of the measured components of ecological status (Carvalho et al. 2018). In practice, a change in overall status due to a switch to NGS should only occur in those situations where phytobenthos either was, or becomes, the trigger for a downgrade. In many cases, assessments of other organism groups will either confirm or overrule any change that is purely due to the shift in phytobenthos classification. But this will still leave some situations where a change in classification may occur due to the method switch. Sites will be more vulnerable to this if the number of criteria monitored has been pared back to save money, but also where the number of samples contributing to a phytobenthos assessment is small (leading to high standard errors and, thereby, greater uncertainty).
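The ‘one out, all out’ rule itself is simple to express: overall status is the worst status among the assessed components. In the sketch below the component names and statuses are invented for illustration; it shows how a single downgraded element, here the macrophytes-and-phytobenthos component, drags down overall status regardless of the other results.

```python
# Status classes ranked from best (0) to worst (4).
RANK = {"High": 0, "Good": 1, "Moderate": 2, "Poor": 3, "Bad": 4}

def one_out_all_out(component_statuses: dict) -> str:
    """Return the overall status: the worst status of any assessed component."""
    return max(component_statuses.values(), key=lambda s: RANK[s])

# Invented example: phytobenthos alone triggers a Moderate overall status.
statuses = {
    "invertebrates": "Good",
    "macrophytes_phytobenthos": "Moderate",
    "fish": "High",
}
print(one_out_all_out(statuses))  # Moderate
```

This also makes the vulnerability described above visible: the fewer components in the dictionary, the more weight any single method switch carries.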
After this initial jolt to the classification regime, we should expect a period of incremental improvements in the science and the next question is whether these should be incorporated into assessments or not. The approach adopted in the UK was to take the version of the metabarcoding tool that emerged after three iterations of the prototype and to “lock down” all aspects of procedure (including bioinformatics pipeline, taxa dictionary and species sensitivities) to ensure consistency in output. For some organism groups this may not be a problem but, given the rate at which diatom taxonomy is developing (see lesson 3), adherence to a fixed list soon diverges from “best practice”. “Locking down” the taxa dictionary and bioinformatics may seem like a means of ensuring stability but, given that over 40% of reads cannot be assigned to sequences in the barcode library, this also acknowledges a large level of uncertainty in outputs. At the point at which the UK barcode library was “locked down”, the rate of unassigned reads was slightly lower than that obtained by French colleagues; two years later, they (who have continued to develop their barcode library and bioinformatics) have halved their rate of unassigned reads (author, Rachel Glover, Frédéric Rimet, Valentin Vasselon, unpublished data).
The decision to “lock down” the method in the name of regulatory stability could well prove to be the biggest mistake in the implementation process. A general principle in quality management is that variation that can be controlled should be controlled whilst that which cannot be controlled should be quantified and incorporated into the decision-making process. In effect, “fixing” the metabarcoding procedure treats a source of potentially controllable variation as uncontrollable. Filling a few gaps in the barcode library is unlikely to have a noticeable effect on the overall strength of fit between the LM and NGS methods, or between the NGS method and principal pressure gradients. However, it might correct the position of a few points within the data cloud and result in better regulation at the locations these represent.
Would this be at the expense of the “stability” which is, clearly, an important requirement of a regulatory regime? In theory (and depending on the means by which key regulatory boundaries are set), any alteration to the means by which a metric is produced could have implications for regulation. In practice, however, such alterations are likely to be minor and, in any case, methods already exist to compensate for this. Over the course of WFD implementation, the intercalibration process has ensured that Member States all share a similar level of ambition when setting ecological status boundaries. These have generated a set of methods (
The credibility of methods is strengthened when there is broad international agreement, underpinned by international standards – endorsed either by the International Organization for Standardization (ISO) or the Comité Européen de Normalisation (CEN). Conversely, the absence of such a standard may be seen as a barrier to implementation. In the case of diatoms, informal discussions with colleagues performing similar work elsewhere in Europe led to an approach to the relevant technical committee of CEN and, eventually, publication of two Technical Reports (
A second area where international co-operation has been invaluable has been in the development and curation of a barcode library (
At face value, the outcome of five years’ research was a metabarcoding assessment tool that had a similar level of performance to the light microscopy-based tool that it replaced, at a slightly lower cost per sample. For internal reasons, some components of the new system (e.g. a revised means for computing “expected” values of metrics:
Either the glass is half full or it is half empty: the optimistic interpretation is that, although the phytobenthos metabarcoding method is no better than the approach it replaced, having taken these first steps into the unknown, the UK’s environment agencies are now better placed to exploit metabarcoding technologies, not just for phytobenthos but for other groups of organisms too. The lessons learnt are partly technical but also wider, addressing how large organisations make decisions about emerging technologies.
It is, however, relatively easy to be cynical about how large and ponderous government agencies respond to new technology. The best we can hope for is that the lessons above are food for thought so that mistakes are not repeated. Once again, this is less to do with the intricacies of molecular biology and more to do with managing change, which is never an easy task. It is easier, perhaps, to recognise the need for change in others than it is to examine one’s own shortcomings, so I end with a personal “mea culpa”: I entered this process with 20 years’ experience of using phytobenthos to address questions raised by environmental legislation but largely ignorant of molecular ecology. “Lesson 3” is particularly heartfelt for me: I should have gone on a course to fill gaps in my understanding about molecular technologies (and, in particular, to have been able to engage more fully with bioinformatics) much sooner. I might then have been able to play a more effective part in debates about how to handle barcodes that could not be linked to diatom sequences in the barcode library.
A second “wish” that was out of my hands would have been to insist on a phased introduction of the new method, rather than a single blanket imposition across England in 2017, along with close interaction with the first cohorts of ecologists handling the end-products in order to improve interpretation skills and have a conversation about where the method was working well or less well.
And, one final “wish” would be that the UK had not decided to leave the European Union at the time these decisions about metabarcoding were being made. Collaboration with scientists from around the EU (not least via DNAqua-net) has proved to be invaluable, partly as their research agendas often helped answer questions that our administrators deemed unworthy of funding, and partly because the scrutiny of EU institutions provides a valuable counterbalance to the flaws of domestic environmental policy.
Many thanks to Rosetta Blackman (EAWAG), Richard Chadd (Environment Agency) and Tim Jones (Environment Agency) for comments on drafts, and to Pieter Boets (Universiteit Gent) and Estelle Lefrançois for reviews of the manuscript. Thanks, too, to Kerry Walsh (Environment Agency), who steered the early development of the diatom metabarcoding approach in England. This paper was developed from talks given at two workshops funded by the COST Action DNAqua-Net (CA15219), supported by the COST (European Cooperation in Science and Technology) programme.