Metabarcoding and Metagenomics :
Research Article
Corresponding author: Timothy R. Lee (timothy.lee@austmus.gov.au)
Academic editor: Emre Keskin
Received: 23 Nov 2017 | Accepted: 17 Feb 2018 | Published: 14 Mar 2018
© 2018 Timothy Lee, Yohannes Alemseged, Andrew Mitchell
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Lee T, Alemseged Y, Mitchell A (2018) Dropping Hints: Estimating the diets of livestock in rangelands using DNA metabarcoding of faeces. Metabarcoding and Metagenomics 2: e22467. https://doi.org/10.3897/mbmg.2.22467
The introduction of domesticated animals into new environments can lead to considerable ecological disruption, and it can be difficult to predict their impact on the new ecosystem. In this study, we use faecal metabarcoding to characterize the diets of three ruminant taxa in the rangelands of south-western New South Wales, Australia. Our study organisms included goats (Capra aegagrus hircus) and two breeds of sheep (Ovis aries): Merinos, which have been present in Australia for over two hundred years, and Dorpers, which were introduced in the 1990s. We used High-Throughput Sequencing methods to sequence the rbcL and ITS2 genes of plants in the faecal samples, and identified the samples using the GenBank and BOLD online databases, as well as a reference collection of sequences from plants collected in the study area. We found that the diets of all three taxa were dominated by the family Malvaceae, and that the Dorper diet was more diverse than the Merino diet at both the family and the species level. We conclude that Dorpers, like Merinos, are potentially a threat to some vulnerable species in the rangelands of New South Wales.
Metabarcoding, rbcL, ITS2, Ovis aries, diet metabarcoding, barcode references
Estimating the diet of an animal by analysing its faeces has many advantages: it is non-invasive and does not require time-consuming observation of the animal. Microscopic analysis of plant fragments in animal faeces has long been used to investigate diets (
In order to determine an animal's diet using DNA recovered from its faeces, there must exist a reference database of sequences recovered from potential dietary species, against which sequences recovered from faeces can be compared. DNA barcoding (
To overcome these issues, the ITS2 region of the nuclear ribosomal RNA genes was proposed as a supplementary DNA barcode marker for plants (
Once a database of standard marker sequences from potential dietary taxa has been assembled, the same marker must be recovered and sequenced from faecal samples, where DNA is fragmented and species-mixed. This requirement is being met by high-throughput sequencing (HTS) technologies, which have proliferated since the turn of the century. Compared to the older Sanger sequencing method (
Combining these two technology-driven fields, DNA barcoding and HTS, gives us DNA metabarcoding (
This project was designed to test the utility and practicality of DNA metabarcoding for dietary analysis of different livestock breeds. We focused on the Merino and Dorper, as well as goats as a control, in south-western New South Wales (NSW).
Our aims were:
1. To use a DNA metabarcoding approach to compare the diets of three ruminants in the rangelands of south-western NSW: Merino sheep (Australia's most popular breed, introduced over 200 years ago), Dorper sheep (a recently introduced breed with a less well characterised diet in Australia) and goats (a control).
2. To determine whether Dorpers are a potential threat to vulnerable flora.
Dorper, Merino and goat faeces samples were collected from 16 different sheep or goat rearing properties in south-western New South Wales, Australia (Suppl. material
We opted to focus on the ITS2 and rbcL markers in this study. We included rbcL because it is a standard plant barcoding gene for which a large online database of sequences already exists. We included ITS2 because it evolves more quickly than rbcL, making it more variable, and because it has a different mode of inheritance (nuclear rather than chloroplast). We chose not to focus on matK because of its smaller online sequence database, and because the high variability of its primer binding sites can decrease the likelihood of successful primer binding, particularly in species-mixed samples with high taxonomic diversity (
Plant samples used for DNA barcode library construction. Botanical name is the initial identification before DNA sequencing. See DNA barcode results in Suppl. material
ID | Common name | Botanical name | Date collected | Site | Lat (-) | Long
T01 | Streaked Poverty-Bush | Sclerolaena tricuspis | 26/07/2011 | K-Tank | 31.86261 | 141.83269
T02 | Manna Wattle | Acacia microcarpa | 27/07/2011 | C-lake | 33.08461 | 143.47333
T03 | Black Bluebush | Maireana pyramidata | 25/05/2011 | Mazar | 32.76723 | 141.06483
T04 | Quena | Solanum esuriale | 25/05/2011 | Coombah (goat) | 33.03057 | 141.73007
T05 | Rosewood | Heterodendrum oleifolium | 27/07/2011 | C-lake | 33.09926 | 143.55813
T06 | Erect Mallee Bluebush | Maireana pentatropis | 27/07/2011 | C-lake | 33.09926 | 143.55813
T07 | Turpentine | Eremophila sturtii | 27/07/2011 | C-lake | 33.08461 | 143.47333
T08 | Pearl Bluebush | Maireana sedifolia | 28/07/2011 | near Ivanhoe | 33.29779 | 143.928669
T09 | Common Bottlewashers | Enneapogon avenaceus | 26/07/2011 | K-tank | 31.86261 | 141.83269
T10 | Copper burr | Sclerolaena sp. | 28/07/2011 | Baymore | 33.46041 | 143.17698
T11 | Bladder Saltbush | Atriplex vesicaria | 28/07/2011 | Baymore | 33.46041 | 143.17698
T12 | Prickly Wattle | Acacia victoriae | 25/05/2011 | Mazar | 32.76723 | 141.06483
T13 | Box Grass | Paspalidium constrictum | 25/05/2011 | Coombah | 33.03057 | 141.73007
T14 | Harlequin Mistletoe | Lysiana exocarpi | 25/05/2011 | Coombah | 33.03057 | 141.73007
T15 | Twiggy Sida | Sida intricata | 13/07/2011 | Warrananga | 33.70348 | 141.73955
T16 | Kerosene Grass | Aristida contorta | 12/07/2011 | Bunnerungee | 33.54223 | 141.73538
T17 | Ruby Saltbush | Enchylaena tomentosa | 12/07/2011 | Aston Merino | 33.2956 | 142.326488
T18 | Ward's Weed | Carrichtera annua | 7/06/2011 | Orana | 32.76421 | 144.04387
T19 | Austrostipa sp. | Austrostipa sp. | 7/06/2011 | Eurella | 32.63605 | 144.24124
T20 | Cannon-Ball | Sclerolaena paradoxa | 7/06/2011 | Orana | 32.76421 | 144.04387
T21 | Crows foot | Erodium sp. | 27/06/2012 | Eurella | 32.6 | 144.27
T22 | Medic | Medicago sp. | 27/06/2012 | Eurella | 32.6 | 144.27
T23 | Ptilotus | Ptilotus sp. | 29/05/2012 | Kimberley | 32.85 | 141.15
T24 | Pop saltbush | Atriplex holocarpa | 28/06/2012 | Baymore | 33.44 | 143.15
Of the 24 reference samples collected, five could only be identified to genus. These five genera contain species that occur in the study area and are listed in NSW as threatened on at least one of two lists: the Australian Environment Protection and Biodiversity Conservation Act 1999 national threatened flora list (
Although we did not attempt to sequence matK in the faeces samples, the matK sequences of reference plant samples were included in this study because they could facilitate species identification at a future date. Published PCR primers for rbcL, matK and ITS2 (Table
PCR primers used. Notes: 1 = used for DNA barcode library data collection, 2 = used for high-throughput sequencing of faecal pellets, 3 = used in combination with rbcLajf634r.
Gene | Primer name | Amplicon size (excl. primers) | Note | Sequence (5'-3') | Source
matK | 1RXkim | ~800 bp | 1 | ACCCAGTCCATCTGGAAATCTTGGTNC | Ki-Joong Kim (unpublished)
matK | 3FXkim | | 1 | CGTACAGTACTTTTGTGTTTACGNG | Ki-Joong Kim (unpublished)
rbcL | rbcLaf | 607 bp | 1 | ATGTCACCACAAACAGAGACTAAAGC |
rbcL | rbcLajf634r | | 1, 2 | GAAACGGTCTCTCCAACGCAT |
rbcL | rbcL-AM2f | 247 bp | 2, 3 | AAYGTYTTTGGKTTCAARGC | This study
ITS2 | S2F | ~460 bp | 1, 2 | ATGCGATACTTGGTGTGAAT |
ITS2 | S3R | | 1, 2 | GCTTCTCCAGACTACAAT |
For amplifying rbcL and matK, each 15 μL PCR contained: 2 μL of DNA, 0.025 μL MilliQ water, 1.5 μL of 10x reaction buffer at 10 mM, 1.5 μL of MgCl2 at 25 mM, 0.3 μL of dNTPs at 10 mM, 0.075 μL of Platinum Taq at 5 U/μL and 0.3 μL each of forward and reverse primer at 5 μM. Cycling conditions for rbcL and matK were as follows: 94°C for 2 min; 5 cycles of 94°C for 30 s, 55°C for 30 s and 72°C for 1 min; 30 cycles of 94°C for 30 s, 54°C for 30 s and 72°C for 1 min; and 72°C for 7 min. The ITS2 amplification followed the same protocol, except that 35 PCR cycles were performed, and the annealing temperature was 55°C. Sanger sequencing was carried out by Macrogen Inc. (Seoul, South Korea) using a high-throughput Applied Biosystems 3730XL sequencer. Peak calling and contig assembly were carried out in Geneious 10.0.2 (
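The reaction recipe above can be checked with the dilution formula C1 x V1 = C2 x V2. The following Python sketch is only an illustrative calculation using the volumes and stock concentrations exactly as listed; the variable and function names are ours. The listed volumes add up to 6.0 μL, so the sketch reports the remaining 9.0 μL of the 15 μL reaction as unaccounted rather than guessing which component it belongs to.

```python
# Volumes (microlitres) and stock concentrations copied from the text.
REACTION_VOLUME_UL = 15.0

components_ul = {
    "DNA": 2.0,
    "MilliQ water": 0.025,
    "10x reaction buffer": 1.5,
    "MgCl2": 1.5,
    "dNTPs": 0.3,
    "Platinum Taq": 0.075,
    "forward primer": 0.3,
    "reverse primer": 0.3,
}


def final_concentration(stock, volume_ul, total_ul=REACTION_VOLUME_UL):
    """Dilution formula C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock * volume_ul / total_ul


mgcl2_mM = final_concentration(25.0, components_ul["MgCl2"])          # 25 mM stock
dntp_mM = final_concentration(10.0, components_ul["dNTPs"])           # 10 mM stock
primer_uM = final_concentration(5.0, components_ul["forward primer"])  # 5 uM stock
taq_units = 5.0 * components_ul["Platinum Taq"]                        # 5 U/uL stock

listed_total_ul = sum(components_ul.values())
unaccounted_ul = REACTION_VOLUME_UL - listed_total_ul

print(f"MgCl2 {mgcl2_mM:.2f} mM, dNTPs {dntp_mM:.2f} mM, "
      f"each primer {primer_uM:.2f} uM, Taq {taq_units:.3f} U")
print(f"listed volumes: {listed_total_ul:.3f} uL; "
      f"unaccounted: {unaccounted_ul:.3f} uL")
```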
We estimated phylogenetic trees for all three genes using the sequences recovered from the reference samples, in order to explore the diversity of the three markers among our reference sequences and to check for any sequence clusters from different taxa with low divergence. All three trees were rooted with a single outgroup sequence from the Scots Pine (Pinus sylvestris), downloaded from GenBank. Phylogenetic analysis was carried out in Geneious 10.0.2 using the MrBayes v3.2.6 plugin (
Twenty-four faecal samples were chosen to include an approximately equal number of the three treatments (Merino, Dorper or goat) from across the study area (Table
Sheep faecal pellet samples used. Locations of properties can be found in Suppl. material
Sample ID | Property (Paddock if different to stock) | Stock (Notes)
1 | Warrananga | Merino
2 | Orana | Goat
3 | Mazar (Merino) | Goat (wet + dry)
4 | Mazar (Merino) | Merino (wet + dry)
5 | K-Tank | Goat
6 | Eurella | Dorper
7 | C-Lake | Merino
8 | Kimberley | Dorper
9 | Kimberley | Goat
10 | Coombah | Goat
11 | Popiltah | Dorper
12 | Coombah | Merino
13 | C-Lake | Goat
14 | Bunnerungee | Dorper
15 | Bunnerungee | Merino
16 | Clevedale | Dorper
17 | Baymore | Dorper
18 | Bonton | Merino
19 | Bunnerungee | Dorper (repeat of 14)
20 | Bunnerungee | Merino (repeat of 15)
21 | Avoca | Dorper
22 | Aston | Goat
23 | Aston | Merino
24 | Avondale | Merino
Published plant DNA metabarcoding studies either have sampled only fresh plant material to generate large PCR amplicons for DNA sequencing, or have utilised alternative chloroplast genes, such as trnL, for which tried and tested primers yielding small fragments were available (
HTS was performed by the Ramaciotti Centre using the Illumina MiSeq platform. The Nextera XT 24-sample preparation kit was used because it facilitates indexing of 24 separate samples, i.e. each sample is labelled with a unique DNA sequence “index” which allows all the sequences from a particular sample to be separated after the sequencing run. The Illumina MiSeq Nano version 2 sequencing kit was used, allowing sequencing of 250 bp from each end of an amplicon. As both amplicons were less than 500 bp long, this allowed for bidirectional sequencing of each template, ensuring that high-quality sequence data were obtained.
The Illumina high-throughput sequencing output consisted of 1,110,113 forward and reverse reads over the 24 samples, averaging 46,255 reads per sample (number per sample can be found in Suppl. material
Raw Illumina reads were imported into Geneious 10.0.2 and filtered for length (>35 bp), and read ends with stretches of Ns or quality lower than 14 were trimmed. In Geneious, bidirectional reads were merged, and any reads that failed to merge were discarded. Reads were then separated into rbcL and ITS2 reads by searching for ITS2 primer sequences within the reads in Geneious. The separated reads were then pooled across all samples and screened for chimeras using the de novo uchime2 algorithm implemented in USEARCH v9.2.64 (
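The trimming, length filtering and marker separation steps were performed in Geneious; the Python sketch below is not that implementation, only a minimal illustration of the logic under stated assumptions. It uses the thresholds given in the text (quality 14, length >35 bp) and the S2F and S3R primer sequences from the primer table. The function names, the exact-match primer search and the rule that any read without an ITS2 primer is treated as rbcL are our simplifying assumptions.

```python
MIN_QUALITY = 14  # quality threshold stated in the text
MIN_LENGTH = 35   # reads must be longer than this many bases

ITS2_FORWARD = "ATGCGATACTTGGTGTGAAT"  # S2F, from the primer table
ITS2_REVERSE = "GCTTCTCCAGACTACAAT"    # S3R, from the primer table

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def trim_ends(seq, quals):
    """Trim bases from both ends while they are N or below MIN_QUALITY."""
    start, end = 0, len(seq)
    while start < end and (seq[start] == "N" or quals[start] < MIN_QUALITY):
        start += 1
    while end > start and (seq[end - 1] == "N" or quals[end - 1] < MIN_QUALITY):
        end -= 1
    return seq[start:end], quals[start:end]


def passes_length_filter(seq):
    """Keep only reads longer than MIN_LENGTH."""
    return len(seq) > MIN_LENGTH


def assign_marker(merged_read):
    """Label a merged read ITS2 if it contains an ITS2 primer in either
    orientation (exact match, an assumption); otherwise label it rbcL."""
    primers = (
        ITS2_FORWARD,
        ITS2_REVERSE,
        reverse_complement(ITS2_FORWARD),
        reverse_complement(ITS2_REVERSE),
    )
    return "ITS2" if any(p in merged_read for p in primers) else "rbcL"


# Hypothetical read: two low-quality N bases at each end are trimmed away.
trimmed_seq, trimmed_quals = trim_ends("NNACGTACGTNN", [2, 2] + [30] * 8 + [2, 2])
print(trimmed_seq, passes_length_filter(trimmed_seq))
print(assign_marker(ITS2_FORWARD + "A" * 50))
```

A production workflow would normally allow mismatches in the primer search, which this sketch does not attempt.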
The non-chimeric reads were assembled into contigs in Geneious. Sequences had primers trimmed with a maximum of 2% gaps and maximum 2 bp gap size. Maximum mismatches per read was 2% for ITS2 and 1% for rbcL, due to the higher degree of variation expected in the ITS2 gene (
Sequence identification using the GenBank database was conducted online, using BLASTN 2.6.0 (
Match data from GenBank, consisting of the 100 closest matches for each contig, were then imported into MEGAN6 Community Edition v6.7.0 (
All sequences identified from each database were then collated into an alignment, one for BOLD identifications and one for GenBank, and neighbour-joining trees were produced using the Geneious Tree Builder with default settings. These trees were used to help determine the identity of OTUs that received no hits in either the BOLD or GenBank databases.
IDs assigned to taxa were compared against the Atlas of Living Australia (ALA) website (
We performed statistical tests on the GenBank sheep diet data to investigate similarities and differences between the breeds. We performed two-tailed t-tests in R v3.3.3 (
The metabarcoding datasets generated and analysed during the current study are available at Figshare: 10.6084/m9.figshare.5309935
The plant reference database generated and analysed during the current study is available at BOLD: dx.doi.org/10.5883/DS-SMPRS
Sequences were obtained for the rbcL, matK and ITS2 genes for 18, 16 and 15 of the 24 plant reference samples respectively.
Phylogenetic trees were derived from the slower evolving rbcL gene (Fig.
Bayesian tree of rbcL gene sequence data. Numbers at nodes are posterior probability values. Scale bar indicates substitutions per site. The tree was rooted with a single Pinus sylvestris sequence downloaded from GenBank (accession number AB097775.1).
Bayesian tree of matK gene sequence data. Numbers at nodes are posterior probability values. Scale bar indicates substitutions per site. The tree was rooted with a single Pinus sylvestris sequence downloaded from GenBank (accession number AB097781.1).
De novo chimeric sequence detection using the UCHIME algorithm flagged 21.4% of reads as chimeric in ITS2 and 42.1% as chimeric in rbcL. Following their removal, the non-chimeric reads were assembled into contigs. These ranged from 66-761 contigs per sheep for ITS2 and 119-1060 contigs per sheep for rbcL.
A number of matches on both the GenBank and BOLD databases, particularly in the rbcL gene, were for unicellular algae that were likely present in the diet as water contaminants. In all analyses, algal sequences were excluded. In the following, singletons (species or families appearing in only one sample) were also excluded.
Some specimen identifications were changed to “Genus sp.” based on available information about taxon ranges from ALA. This accounted for the incompleteness of the reference databases, so that matches to species that did not occur within the study zone were instead listed as matching only to genus level. All such cases are clearly marked in Table
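The renaming rule described above can be sketched in a few lines of Python. This is a hedged illustration, not the procedure actually used: the function name, the form of the range list and the species epithet `Sida hypothetica` are invented for the example, and in the study the range information came from ALA records rather than a hard-coded set.

```python
def resolve_identification(matched_species, species_in_study_zone):
    """Keep a species-level match only if the species is known from the
    study zone; otherwise fall back to a genus-level label."""
    if matched_species in species_in_study_zone:
        return matched_species
    genus = matched_species.split()[0]
    return f"{genus} sp."


# Hypothetical range list for illustration only.
known_in_zone = {"Sida fibulifera", "Carrichtera annua"}

print(resolve_identification("Sida fibulifera", known_in_zone))
print(resolve_identification("Sida hypothetica", known_in_zone))  # invented name
```

The second call returns a genus-level label because the matched name is absent from the range list, which mirrors how a best database hit to an out-of-range species is reported only to genus.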
Species level taxa (GenBank Data). Names of underlined taxa were changed based on the distribution of taxa in the study zone.
Merino Only |
Dorper Only |
Goat Only |
Dorper and Merino |
Merino and Goat |
Dorper and Goat |
Dorper, Merino and Goat |
Calotis hispidula |
Boerhavia sp. |
Chenopodium auricomum |
Austrostipa nodosa |
Medicago laciniata |
Erodium cygnorum |
Austrostipa nitida |
Cullen australasicum |
Brachyscome ciliaris |
Haloragis aspersa |
Lotus sp. |
Sclerolaena obliquicuspis |
Tetragonia sp. |
Calotis sp. |
Sclerolaena sp. 1 |
Erodium cicutarium |
Haloragis sp. |
Maireana sp. |
Carrichtera annua |
||
Sclerolaena sp. 2 |
Leiocarpa semicalva |
Haloragis glauca |
Salvia sp. |
Convolvulus clementii |
||
Medicago polymorpha |
Vittadinia sulcata |
Medicago minima |
||||
Minuria cunninghamii |
Sclerolaena diacantha |
|||||
Silene sp. |
Sida fibulifera |
|||||
Sisymbrium erysimoides |
Sida sp. |
|||||
Sonchus sp. |
Tetragonia tetragonioides |
|||||
Spergularia tasmanica |
Vittadinia eremaea |
Generally, Dorpers were found to have more diverse diets than Merinos. Based on the GenBank results, at the family level, there were 11 families present in the diets of both breeds, three families found in Merinos not Dorpers, and seven families found in Dorpers not Merinos (Table
Family level taxa (GenBank data) (‘*’ indicates that the column contains no taxa).
Merino Only |
Dorper Only |
Goat Only |
Dorper and Merino |
Merino and Goat* |
Dorper and Goat |
Dorper, Merino and Goat |
Casuarinaceae |
Caryophyllaceae |
Haloragaceae |
Lamiaceae |
Geraniaceae |
Aizoaceae |
|
Elatinaceae |
Euphorbiaceae |
Marsileaceae |
Amaranthaceae |
|||
Solanaceae |
Nyctaginaceae |
Asteraceae |
||||
Plumbaginaceae |
Brassicaceae |
|||||
Zygophyllaceae |
Convolvulaceae |
|||||
Fabaceae |
||||||
Goodeniaceae |
||||||
Malvaceae |
||||||
Myrtaceae |
||||||
Poaceae |
The list of species found in the Dorper and Merino faeces based on GenBank results was compared to the national list of Australian threatened taxa (
BOLD database searches yielded very similar results to the GenBank results (Suppl. materials
To estimate the completeness of taxon sampling, taxon accumulation curves at the family and species level were constructed for each taxon separately based on GenBank results. The species and family accumulation curves indicate that more samples are needed to estimate the total diversity of Dorper (Suppl. material
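A taxon accumulation curve of this kind can be sketched as follows. This is a minimal Python illustration under our own assumptions: samples are added in random order, the cumulative number of distinct taxa is recorded, and the result is averaged over many random orderings. The function name, the number of permutations and the toy data are hypothetical and do not reproduce the software or settings used in the study.

```python
import random


def accumulation_curve(samples, permutations=100, seed=0):
    """Mean cumulative taxon count after 1, 2, ..., n samples,
    averaged over random sample orderings."""
    rng = random.Random(seed)
    totals = [0.0] * len(samples)
    for _ in range(permutations):
        order = samples[:]
        rng.shuffle(order)
        seen = set()
        for i, taxa in enumerate(order):
            seen |= taxa              # add this sample's taxa to the running set
            totals[i] += len(seen)
    return [t / permutations for t in totals]


# Hypothetical taxa detected in three faecal samples.
toy_samples = [{"A", "B"}, {"B", "C"}, {"C", "D", "E"}]
curve = accumulation_curve(toy_samples)
print(curve)
```

A curve that is still rising at the last sample, rather than flattening, suggests that additional samples would detect further taxa.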
Dorper and Merino diet data at the species and family levels were found to be normally distributed and with equal variance (Suppl. material
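For data that are normally distributed with equal variance, the two-sample comparison reduces to a pooled-variance t statistic. The Python sketch below computes only the statistic and its degrees of freedom using the standard library; the p-value would still have to be taken from the t distribution (for example with R's `t.test` and `var.equal = TRUE`). The function name and the input values are hypothetical, not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's two-sample t statistic with pooled variance.
    Returns (t, degrees_of_freedom)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical numbers of taxa per individual for two breeds.
t_stat, dof = pooled_t_statistic([1, 2, 3], [2, 3, 4])
print(f"t = {t_stat:.4f}, df = {dof}")
```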
Among the breeds, the ratio of the total number of contigs to the number of contigs that could be identified to species in each individual was compared. For the ITS2 data, the Kruskal-Wallis test was used, as the data failed the Shapiro-Wilk test for normality, while ANOVA was used for the rbcL data, as these data passed normality and equality of variance tests (Suppl. material
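The Kruskal-Wallis test compares groups by ranking all observations together. As a hedged illustration, the Python sketch below computes the H statistic with average ranks for ties but without the tie correction factor, which is a simplification on our part. With three groups the statistic has two degrees of freedom, for which the chi-square upper tail is exactly exp(-H/2). The function names and the input ratios are hypothetical.

```python
from math import exp


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (average ranks for ties, no tie correction)."""
    pooled = sorted((value, gi) for gi, group in enumerate(groups) for value in group)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        i = j
    spread = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    return 12 / (n_total * (n_total + 1)) * spread - 3 * (n_total + 1)


def p_value_three_groups(h):
    """Chi-square upper tail for 2 degrees of freedom (valid for 3 groups only)."""
    return exp(-h / 2)


# Hypothetical ratios for three groups of animals.
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"H = {h_stat:.2f}, p = {p_value_three_groups(h_stat):.4f}")
```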
Previous studies have used the proportion of reads from different families recovered in faecal metabarcoding as a proxy for the quantity of different foods in the subjects’ diets (
Our results are highly variable among individuals, but Malvaceae appears to be the dominant family in all three taxa, accounting for between 11% and 34% of the reads (width of standard error, Fig.
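The proportional-read summary can be illustrated with a short Python sketch: each sample's read counts are converted to proportions, and the mean and standard error of one family's proportion are then taken across samples. The read counts below are invented purely for illustration, and the function names are ours.

```python
from math import sqrt
from statistics import mean, stdev


def family_proportions(read_counts):
    """Convert per-family read counts for one sample into proportions."""
    total = sum(read_counts.values())
    return {family: n / total for family, n in read_counts.items()}


def mean_and_standard_error(values):
    """Mean and standard error (sample SD divided by sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical read counts for three samples from one taxon.
toy_reads = [
    {"Malvaceae": 40, "Poaceae": 30, "Fabaceae": 30},
    {"Malvaceae": 10, "Poaceae": 60, "Fabaceae": 30},
    {"Malvaceae": 25, "Poaceae": 25, "Fabaceae": 50},
]

malvaceae = [family_proportions(s).get("Malvaceae", 0.0) for s in toy_reads]
avg, se = mean_and_standard_error(malvaceae)
print(f"Malvaceae: mean {avg:.1%}, interval {avg - se:.1%} to {avg + se:.1%}")
```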
Decreasing the minimum read depth to 3 increased the number of taxa recovered from the diets of all three study organisms. In the GenBank analysis at a minimum read depth of 3, the sheep have 18 species in common, 12 unique to Merinos and 15 unique to Dorpers; in the BOLD analysis at a read depth of 3, the sheep have 20 species in common, 6 unique to Merinos and 15 unique to Dorpers (Suppl. materials
Both species and family level results show that, when considered at the level of the breed, in aggregate the diet of Dorpers is more varied than that of the Merinos, with family and species level diversity being 29% and 24% higher in Dorpers respectively, based on GenBank results. However, t-test results show that, at the individual sheep diet level, Dorpers do not have a more varied diet than Merinos. This result means that, on average, while an individual Dorper does not have more plant taxa in its diet than an individual Merino, the diet of Dorpers varies more from individual to individual than does that of Merinos. This is consistent with a behavioural model in which Dorpers eat whatever food is easily available without discriminating, while Merinos are more likely to seek out a more restricted set of food plants. This corroborates the results reviewed by
Threatened taxa were found in the diets of both Merinos and Dorpers. Species level matches were found for two taxa in the Dorper diet (Minuria cunninghamii and Sida fibulifera) and one taxon from the Merino diet (S. fibulifera). Genus level matches were also found for five species in the Merino diet, and four species in the Dorper diet, indicating that the diets of these sheep may include more threatened taxa. This highlights the need for barcode sequences from threatened taxa to be uploaded to online databases to facilitate their detection in metabarcoding studies.
The rbcL, matK and ITS2 sequences could only be successfully amplified in a subset of reference samples. Poor DNA preservation in the plant tissue samples may have been a factor in PCR failure, since trial PCRs of the small amplicons designed for Illumina sequencing were successful for rbcL with all 24 samples. However, possible primer-template mismatch cannot be discounted either.
There were several instances of zero or low variation between reference samples from different species, although this is not uncommon in plant DNA barcoding studies (
Although the proportion of reads in the diet metabarcoding dataset is unlikely to match exactly the proportions by volume of plant matter in the diet, there is a relationship between the two (
The goat diet was found to be less diverse than the diets of either sheep, and broadly in line with results of previous studies. Goats are known to consume more woody plants and fewer herbaceous plants than sheep (
The range of plants available in the study area at the time the samples were taken may go some way towards explaining these results. The region received more than double the median rainfall during the two years preceding the collection of the dung samples (2010 and 2011). For this reason, plant growth and vegetation diversity were very high, with both Malvaceae and Poaceae being abundant. Merinos may therefore have been able to select only their preferred species; they are known to consume some of the species found only in Dorper dung when their preferred species are not available, such as during drought conditions.
It should be noted that, while our proportional reads analysis based on GenBank data showed that Malvaceae was the largest component of the diet of all three taxa, Malvaceae was almost entirely absent from the BOLD results. This was due to the difference in analytical approach between the two parts of our study. The Malvaceae component of the diets of all three taxa mainly comprised species of the genus Sida. The Sida sequences on BOLD were shorter than the contigs in our dataset, and as they had less than 100% query coverage, under the method we used in our BOLD analysis those matches were excluded. Matches to Abutilon, which was less common, did result in a BOLD match. Care must be taken in interpreting metabarcoding results, as difficulties arise when one is working with reference databases where some but not all of the species that occur in the study area are represented (
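The effect of a 100% query coverage requirement can be illustrated with a small Python sketch. Query coverage is taken here as the aligned length divided by the query (contig) length, so a reference sequence shorter than the contig can never reach full coverage, however similar it is. The function names and the sequence lengths are hypothetical and are not taken from the BOLD results.

```python
def query_coverage(aligned_length, query_length):
    """Fraction of the query (contig) covered by the alignment."""
    return aligned_length / query_length


def keep_match(aligned_length, query_length, min_coverage=1.0):
    """Retain a database match only if it covers enough of the query."""
    return query_coverage(aligned_length, query_length) >= min_coverage


# Hypothetical lengths: a 247 bp contig against a full-length and a
# shorter reference sequence.
print(keep_match(aligned_length=247, query_length=247))  # full coverage
print(keep_match(aligned_length=200, query_length=247))  # reference too short
```

Relaxing `min_coverage` would retain matches to shorter reference sequences, at the cost of basing identifications on less of the contig.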
Duplicate samples were not highly similar. Based on the GenBank data, the two repeated Dorper samples had 10 species in common, and 18 species possessed by only one sample (Sørensen similarity index = 52.6%). The two repeated Merino samples had 5 species in common, and 11 species possessed by only one sample (Sørensen similarity index = 47.6%). This highlights the value in taking multiple samples from a single individual, as diets are likely to be highly variable. This has been observed previously in another herbivore, the Pacific pocket mouse (
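The Sørensen similarity values quoted above follow from the formula 2a / (2a + b), where a is the number of species shared by the two samples and b is the number found in only one of them. The short Python check below uses the counts given in the text; the function name is ours.

```python
def sorensen_index(shared, unshared):
    """Sørensen similarity: 2a / (2a + b), with a = shared species and
    b = species present in only one of the two samples."""
    return 2 * shared / (2 * shared + unshared)


dorper_similarity = sorensen_index(shared=10, unshared=18)
merino_similarity = sorensen_index(shared=5, unshared=11)

print(f"Dorper repeats: {dorper_similarity:.1%}")
print(f"Merino repeats: {merino_similarity:.1%}")
```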
As sequencing technologies become cheaper and provide more data, the bottleneck in metabarcoding studies becomes whether there exist complete and reliable databases of sequence data from target organisms in the study area. Diet metabarcoding studies on Australian herbivores would be greatly improved by increased taxon sampling and sequencing of Australia’s flora. Our results show that much of the diet of the sheep and goats in the area was not covered by our small reference collection, as few barcode IDs matched to our reference database. Incomplete reference collections remain a challenge in carrying out metabarcoding studies (
Plant barcoding remains challenging for several reasons. Firstly, while the COI gene in animals, being variable, easily amplified and easily aligned, is an excellent standard barcoding region, no single such region exists in plants. There is some diversity of opinion on what plant barcode region is most useful, and most researchers use more than one region at once (
Genome skimming is a relatively new technique that has the potential to circumvent some of the present difficulties in plant metabarcoding (
The diet of Dorper sheep was found to be more diverse than the diet of Merino sheep. Although the diet of Dorpers generally was more diverse, individual Dorper sheep do not seem to eat more taxa than Merino sheep, but they do appear to be less selective. The diets of the two sheep breeds were more similar to one another than they were to the goat diet. The proportions of different plant families present in the diets of all three animals show a “core” diet of four plant families, common to all three taxa and accounting for over 55% of the reads. The diversity in the Dorper diet is mainly accounted for by taxa consumed in low quantities. More sampling is required to get a full picture of the diversity of these animals’ diets.
Our preparation of a small reference database, and its comparison against the online BOLD and GenBank databases, emphasizes that online databases need to be more complete to give an accurate picture of the diversity present. A more complete database would also increase the usefulness of the ITS2 dataset, making it generally more useful than rbcL for species-level discrimination.
Finally, Dorpers might be a threat to the vulnerable plant species Minuria cunninghamii and Sida fibulifera, with Merinos also being a potential threat to S. fibulifera. The diets of both sheep breeds also contained taxa identified only to genus, which might include threatened taxa. This highlights that Dorpers could have as much of an impact as Merinos on threatened species.
This project is jointly funded by the Lower Murray-Darling Catchment Management Authority (now Western Local Land Services) and the NSW Department of Primary Industries. We are grateful to the many landholders who generously allowed us to undertake the field work on their properties.
Location of sample sites in south-western New South Wales, Australia. Location of study area shown as a rectangle on the map of Australia (inset). Names of states and territories are marked. Solid lines indicate state boundaries. Dashed line indicates the course of the Darling River. Dotted line indicates the course of the Great Darling Anabranch. Circles indicate sampling locations, squares indicate towns. Created using Inkscape 0.92.0 (https://inkscape.org/en/). Based on information from Geoscience Australia, Commonwealth of Australia 'National base map with external territories', (http://www.ga.gov.au/interactive-maps/#/theme/national-location-information/map/nationalmap) published under the Creative Commons license CC-By-Au.
Table displaying the closest matches on the BOLD database for the 24 reference samples for matK, rbcL and ITS2.
DNA barcode results (GenBank) for plant reference samples.
Number of paired-end reads for each sample resulting from the Illumina MiSeq Nano run.
Family level taxa (BOLD Data) (‘*’ indicates that the column contains no taxa).
Species level taxa (BOLD Data). Underlined taxa were changed based on the distribution of taxa in the study zone (‘*’ indicates that the column contains no taxa).
Taxon accumulation curves for Dorper specimens.
Taxon accumulation curves for Merino specimens.
Taxon accumulation curves for goat samples.
Tests for normality and equality of variance to establish whether conducting t-tests on the Dorper and Merino species and family level diversity data is appropriate.
Shapiro-Wilk and Levene's tests exploring the appropriateness of the data for use in ANOVA or Kruskal-Wallis tests.
Family level taxa (GenBank data), at 3 minimum read depth.
Species level taxa (GenBank Data), at 3 minimum read depth. Underlined taxa were changed based on the distribution of taxa in the study zone.
Family level taxa (BOLD Data), at 3 minimum read depth.
Species level taxa (BOLD Data), at 3 minimum read depth. Underlined taxa were changed based on the distribution of taxa in the study zone.