From study design to data
Planning
The first step in eDNA study design is to define the study objectives and how eDNA approach(es) may be useful to address them. eDNA approaches are particularly relevant to describing biotic assemblages, or detecting elusive or cryptic species, and species at low density without the need for taxonomic expertise. By reducing fieldwork costs and sampling time, these methods may also allow users to cover larger sampling areas or sample more frequently. One can target: i) one species of interest, ii) multiple species from a single sample using metabarcoding or iii) different single target species reusing the same samples for multiple projects. Here are a few scenarios illustrating the applications of eDNA.
Example 1: eDNA would allow me to simultaneously monitor multiple species across multiple locales without need to capture the individuals. This would decrease the cost and labour associated with capture and mitigate the need for identification of individuals by taxonomists.
Example 2: eDNA tools would be a cost-effective way to assess the spread of an invasive species. This would be especially useful at the ‘invasion front’ where invasive species are usually at low densities and less likely to be captured.
Example 3: eDNA surveys using qPCR, ddPCR or LAMP could help me to rapidly survey multiple sites for the presence of a species at risk, allowing me to focus conservation efforts on these sites or to undertake further research.
Example 4: eDNA from faeces of a species at risk would enable me to quantify its diet noninvasively and potentially include recommendations for management of prey species in future conservation programs.
Example 5: eDNA assays for an emerging pathogen in air, water, soil or on environmental surfaces (e.g. tree leaves) might allow early detection and enable mitigation strategies.
It is also very important to consider the limits of eDNA approaches in your study. Data derived solely from eDNA may not contain the information needed to address particular questions. For example, eDNA may not provide estimates of absolute individual abundance, reproductive status, or population size. eDNA may also be insufficient to assess the health status of individuals within a population, or to provide direct insights on interactions among species. Having refined your objectives and considered both the value and the limits of eDNA approaches, you should then weigh the costs of labour and molecular consumables, and access to laboratory instruments (e.g. qPCR machine) or appropriate spaces (e.g. a clean lab dedicated to eDNA) needed to process the samples, ensure data quality and minimize contamination. It is often easy to propose to use eDNA to address important questions in conservation, ecology or epidemiology, but it can be much harder to deliver high-quality, validated eDNA data. Clean labs ideally have positive air pressure to minimize contamination, HEPA filtration, a one-way workflow, and are easily cleaned and sterilized. Because of the price and the need for a clean eDNA dedicated space (Table 1), it is often more cost-effective to collaborate with eDNA researchers or use services provided by companies. See Fig. 12 for rough estimates on costs and labour requirements.
Examples of academic eDNA research labs in Canada: Lougheed Lab (Queen’s University, ON), Clare Lab (York University, ON), Cristescu Lab (McGill University, QC), Hanner Lab (University of Guelph, ON), Poesch Lab (University of Alberta, AB), Helbing Lab (University of Victoria, BC). There are many such facilities in the USA, European and Asian countries, and other regions of the world.
Examples of organizations offering eDNA services in Canada: Nature Metrics (https://www.naturemetrics.co.uk/northamerica/), eDNATec (https://ednatec.com/solutions/our-services/), Bureau Veritas (https://www.bvna.com/sites/g/files/zypfnx386/files/media/document/eDNA%20Testing.pdf; qPCR only), UNBC Genetics Lab (https://www2.unbc.ca/genetics/pricing).
| Item | Cost | Notes | Reference |
|---|---|---|---|
| Biosafety Cabinet with HEPA filter and UV light | $25,000 | Models vary in price between under $5000 for a benchtop model to over $30,000 | |
| 4-pack of micropipettes | $1,556 | Dedicated set for clean space | https://www.fishersci.ca/shop/products/eppendorf-pipette-pick-a-pack-sets-5/p-4344412 |
| Mini-centrifuge | $385 | Dedicated model for clean space | https://www.fishersci.ca/shop/products/fisherbrand-standard-mini-centrifuge/12006901 |
| Vortex | $343 | Dedicated model for clean space | https://www.fishersci.ca/shop/products/fisher-scientific-mini-vortex-mixer-3/14955151 |
| Incubator | $3,853 | Dedicated model for clean space | https://www.fishersci.ca/shop/products/boekel-bambino-hybridization-oven/13245121 |
| Microfuge | $5,089 | Dedicated model for clean space | https://www.fishersci.ca/shop/products/sorvall-legend-micro-17-microcentrifuge-4/p-811129 |
Fig. 12 Flowchart with rough costs and processing time for potential stages of an eDNA workflow. All costs are approximate estimates taken from our experiences and are in CAD.
eDNA Lab Design
Minimizing contamination is critical in eDNA studies for improving signal to noise ratio, accurate detection of rare or low abundance species, and avoiding false positives. As it is prohibitively expensive to completely retrofit or build a new facility exclusively for eDNA, many labs repurpose their existing spaces for eDNA. Therefore, some of the most common sources of false positives in eDNA studies are cross-contamination from other samples present in an active lab, particularly from tissue DNA extractions or PCR.
Avoiding contamination does not require a dedicated facility but can be achieved through trained personnel who are highly familiar with precautionary protocols, having dedicated equipment for eDNA extraction and pre-PCR steps, proper sample handling and storage, and having dedicated fieldwork prep, pre-PCR, and post-PCR workspaces with minimal back-traffic. Although it can be impractical to renovate a lab to the highest level of accreditation, ISO/IEC 17025, some simple practices can significantly improve your studies. Contamination can occur at any stage in eDNA workflows but is typically most critical in pre-PCR stages such as when collecting samples, filtering samples, and extracting DNA from filters. In this section, we describe lab design and workflow precautions that can help you avoid contamination and get the best possible results from your samples.
Personnel Training
Personnel involved in eDNA fieldwork and lab work should receive dedicated training specifically for eDNA. Although eDNA protocols can be simpler and faster than traditional sampling approaches, they are also more prone to contamination and degradation. Personnel should be aware of potential contamination from both in-lab and out-of-lab sources, how to prevent contamination (through proper use of PPE, decontamination with mechanical cleaning, bleach/DNA destroying cleaning solutions, and UV light, physical isolation, and proper workflow), and the sensitivity of eDNA to degradation (through keeping samples cold, in the dark, and/or dry or in storage solutions/buffers). It is easy to become negligent or sloppy in following these protocols, as any degradation or contamination will not become evident until lab work and analysis have been completed, and it can be difficult to trace the step at which it occurred. Therefore, personnel should also take site notes when it is safe to do so, ideally with a standardized sampling sheet or app. See Nicholson et al. (2020) for a framework on reporting standards.
Sampling Equipment
Prior to sampling, the use of an equipment and task checklist can reduce errors and improve workflow. Equipment should be stored away from areas where DNA extraction or PCR is performed and should be cleaned before and after sampling with DNA destroying methods (typically dilute bleach and UV-C light). In particular, sampling bottles or any reusable equipment that comes in direct contact with the sampled water should be carefully cleaned with dilute bleach solution (0.5% to 1% sodium hypochlorite, or a 1:10 or 1:5 dilution of household bleach) (Goldberg et al. 2016). During sampling, personnel should use adequately sterilized PPE, including nitrile gloves that are changed between sites, waders cleaned with brushing, dilute bleach, and dH2O between sites, and face masks that are changed between sites. Used and unused PPE should be separated between sites. Bulk sampling and tissue sampling should ideally not be conducted at the same time as eDNA sampling, or should at least be kept as isolated from it as possible. Collected samples should be kept as physically isolated from both each other and unsterilized surfaces as possible through the use of resealable bags and containers. Finally, negative controls should be collected at the field and filtration stages of your workflow to measure the level of contamination at those steps (Sepulveda et al. 2020).
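As a worked example of the dilution arithmetic, the sketch below applies C1 × V1 = C2 × V2 to work out how much stock bleach and water give a chosen working concentration. The 5% sodium hypochlorite stock strength is an assumption (it is what the 1:10 and 1:5 ratios above imply), and the function name is hypothetical; always check the concentration on the product label.

```python
def bleach_dilution(stock_percent, target_percent, final_volume_ml):
    """Return (stock_ml, water_ml) for a target concentration using C1*V1 = C2*V2."""
    if not 0 < target_percent <= stock_percent:
        raise ValueError("target must be positive and no stronger than the stock")
    stock_ml = target_percent * final_volume_ml / stock_percent
    return stock_ml, final_volume_ml - stock_ml


# Assumption: household bleach at 5% sodium hypochlorite (check the label).
STOCK_PERCENT = 5.0

for target in (0.5, 1.0):
    stock_ml, water_ml = bleach_dilution(STOCK_PERCENT, target, 1000)
    print(f"{target}% solution: {stock_ml:.0f} mL bleach + {water_ml:.0f} mL water")
```

With a 5% stock, one litre of 0.5% solution works out to 100 mL of bleach topped up with 900 mL of water, which matches the 1:10 dilution quoted above.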
Dedicated eDNA Lab Apparatus
Having dedicated equipment for eDNA lab work is critical to avoiding contamination from other samples in the lab. Any equipment for liquid handling (e.g. pipettors), that can create aerosols, or occupy the same surface is a source of potential cross contamination (Scherczinger et al. 1999).
In general, there should be a set of dedicated equipment for anything involved in eDNA extraction or handling eDNA samples prior to any PCR based steps. This includes a biosafety cabinet with positive airflow, easy to clean surfaces, and UV-C light for sterilizing the surface, a set of dedicated micropipettes, 1.5-2 mL tube racks, tweezers (and anything else used to handle filters), lab coats, a microfuge/mini-centrifuge, and vortex. There should also be a dedicated supply of lab consumables such as microfuge tubes, falcon tubes, gloves, face masks, extraction reagents or kit, and ddH2O. If possible, there should be a dedicated incubator, -20°C freezer (or at least dedicated shelves), shaker table, PCR machine, and glassware (Mifflin 2007).
Apparatus dedicated solely for eDNA work should be used for steps prior to PCR. After eDNA has gone through PCR, cross-contamination is much less likely (as there are now many more copies of your target DNA), and general post-PCR lab equipment can be used on the PCR product. In other words, do not use pre-PCR, eDNA specific apparatus for PCR 1 products in your metabarcoding library preparations, or for handling qPCR product for Sanger sequencing verification.
Sample handling and Proper Storage
Proper sample (both pre-extraction filters and extracted eDNA) handling and storage is critical to successful and reproducible eDNA studies. In general, eDNA is more stable later in the workflow. It is imperative to filter samples as quickly as possible, ideally on-site, or within 1-2 hours if transportation off-site is required. The half-life of eDNA in natural water samples is as little as hours due to microbial activity and other decay-causing mechanisms (Mauvisseau et al. 2022). If quick filtration is not possible and the sample must be preserved, it should be kept at fridge temperatures in the dark if it can be filtered on the same day, or frozen at -20°C if it cannot be filtered for several days (Kumar et al. 2019). Note that freezing will degrade the DNA through freeze-thaw mechanisms, will not preserve the sample indefinitely, and the sample may take a long time to thaw. eDNA is typically more stable once it is filtered and kept in appropriate conditions (typically in a 4°C fridge in preservation/lysis buffer or ethanol if it’s to be extracted within a week, -20°C freezer conditions for longer term storage, or fully dried with silica beads or other desiccating agents) (Kumar et al. 2019). If the filter is not kept in appropriate conditions (for example, if it’s exposed to sunlight, kept at room temperature, or is not in an appropriate buffer or fully dry), degradation can still rapidly occur, resulting in low yields and false negatives.
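To see why the delay before filtration matters, the following sketch estimates the fraction of eDNA remaining after a given delay. It assumes simple first-order (exponential) decay, which is a simplification, and the 6-hour half-life is a purely hypothetical value chosen for illustration; real decay rates depend on temperature, light, microbial activity and the target.

```python
def fraction_remaining(hours_elapsed, half_life_hours):
    """Fraction of the starting eDNA left after a delay, assuming exponential decay."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    if hours_elapsed < 0:
        raise ValueError("elapsed time cannot be negative")
    return 0.5 ** (hours_elapsed / half_life_hours)


# Hypothetical half-life for illustration only; measure or cite one for your system.
HALF_LIFE_HOURS = 6.0

for delay in (1, 2, 12, 24):
    remaining = fraction_remaining(delay, HALF_LIFE_HOURS)
    print(f"{delay:>2} h before filtration: {remaining:.0%} of eDNA remaining")
```

Under these assumptions each additional half-life halves the remaining signal, so a sample already near the detection limit can drop below it after a delay of only a few hours.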
eDNA is far more stable after filtration and extraction, as it is now purer and does not have biological activity. After extraction, keep eDNA samples in 4°C fridge conditions for immediate use (within a few days), in a -20°C freezer for storage and use within a year, and in -80°C ultracold conditions or liquid nitrogen for archival storage. To avoid degradation from freeze-thaw cycles, aliquot the eDNA into multiple tubes for separate uses.
All samples should be stored away from tissue samples or PCR products, ideally in their own fridge/freezers. If dedicated fridges and freezers are not possible, have dedicated shelf space and isolate samples with bags, plastic boxes, or other physical separation.
Workspace Separation and Design
Optimizing your workspace is critical to maintaining contamination free eDNA samples and obtaining accurate results. Organization and foot traffic in your lab should be based on workflow. Areas for storing sampling equipment, recently collected samples, and for filtration should be separate from areas for molecular labwork. The most important separation in your lab workspace is the divide between pre-PCR and post-PCR processes. PCR products are often the worst source of contamination in a molecular lab, due to their persistence and the high number of copies of DNA PCR produces. Foot traffic should be unidirectional. If separate rooms are not possible, separate bench counter areas should be maintained for different purposes, ideally separated by dividers (Mifflin 2007).
eDNA extraction and handling eDNA samples is typically the most contamination sensitive part of eDNA workflows and should occur separately from areas for PCR-based assays, work on tissue samples, or non-eDNA related labwork. A room dedicated to eDNA extraction should ideally have positive airflow to reduce airborne contaminants.
Each lab space should have its own lab coats and supply of masks and gloves. More sensitive rooms should also have separate footwear/shoe protections and hairnets.
The following figures, made by Dr. Bojian Chen, depict a design for a new lab for eDNA and other environmental monitoring protocols to be deployed in eastern Ontario biomonitoring.
Fig. 13 General layout of an eDNA lab. Note how users are directed towards a unidirectional workflow.
Fig. 14 Rendering of the sample filtration/equipment room.
Fig. 15 Rendering of the foyer room, for storing personal items and for donning PPE.
Fig. 16 Rendering of the extraction and pre-PCR room.
Fig. 17 Rendering of the post-PCR and library prep room.
Fig. 18 Rendering of the general storage area.
Primers and Probes Design & Validation
The choice of primers and markers is a crucial step in eDNA studies. In general, the marker of choice should have a large number of existing, verified entries in online databases such as BOLD or GenBank, be conserved enough to have low intraspecific variation (low number of nucleotide substitutions between members of the same species), but variable enough to have high interspecific variation (have nucleotide substitutions between members of different species, particularly co-occurring ones). Marker choice(s) will typically be based on data availability, and extensive testing is required for a high quality assay (see Freeland 2017 for more details). Always explore the literature to determine whether primers for the focal species/group(s) already exist. This can easily be done using keywords on Google Scholar or Web of Science. If primers are already available, it is important to i) check whether they have been validated properly (Langlois et al. 2021; Elbrecht et al. 2016; Thalinger et al. 2021b; Tournayre et al. 2023) and ii) test them yourself using positive controls (e.g. species of interest, mock communities) and negative controls. The fact that primers worked in one lab does not automatically translate into a working assay in your lab (different instruments, different source samples, potentially different versions of your consumables). Amplification efficiency is specific to the qPCR/dPCR platform and reagents it was tested on.
Ideally, primers must be validated in silico (predicted amplification success using reference sequences), in vitro (DNA from tissues), and in situ (eDNA samples with known presence and known absence of the target species) (Fig. 19). Various software and online tools for in silico testing exist. The specificity of species-specific primers is usually evaluated “by-eye” (i.e. counting the number of mismatches between primers/probe and template sequence in an alignment) or ‘blasted’ against National Center for Biotechnology Information (NCBI) GenBank. The first approach is fairly rudimentary and does not account for the type and spacing of mismatches, which may play a role in non-target amplification or target species exclusion. The second approach has the advantage of comparing the primer/probe sequences to a massive number of reference sequences available in public databases. However, it does not allow the simultaneous evaluation of the primers and probe. A recent online machine learning tool, eDNAssay, has been developed to overcome these limitations (Kronenberger et al. 2022; https://nationalgenomicscenter.shinyapps.io/eDNAssay/). Initially developed to predict qPCR cross-amplification (e.g. Katz et al. 2023, Nordstrom et al. 2023), it has also been used in ddPCR (Tournayre et al. 2023) and metabarcoding (Vanderpool et al. 2024) studies. In general, using existing primers from the literature requires significantly less testing than creating an assay de novo. For more details on qPCR/dPCR assay development, see:
Fig. 19 Flowchart of eDNA single species assay (qPCR or dPCR) development and validation.
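The “by-eye” specificity check described above can be approximated in a few lines. This is a minimal sketch, not a replacement for tools such as eDNAssay: it assumes the primer and the binding site have already been aligned, are the same length, and are written in the same 5'→3' orientation. The sequences and function names are made up for illustration.

```python
def mismatch_positions(primer, site):
    """Return 0-based positions (from the 5' end) where primer and binding site differ."""
    if len(primer) != len(site):
        raise ValueError("primer and binding site must be aligned to the same length")
    return [i for i, (p, s) in enumerate(zip(primer.upper(), site.upper())) if p != s]


def three_prime_mismatches(primer, site, window=5):
    """Count mismatches falling within the last `window` bases at the 3' end."""
    cutoff = len(primer) - window
    return sum(1 for pos in mismatch_positions(primer, site) if pos >= cutoff)


# Hypothetical sequences for illustration only.
PRIMER = "ACGTTGCAAGCTTAGGCTCA"
TARGET_SITE = "ACGTTGCAAGCTTAGGCTCA"      # target species: perfect match
NONTARGET_SITE = "ACGATGCAAGCTTAGGCTTA"   # co-occurring species: two mismatches

for label, site in (("target", TARGET_SITE), ("non-target", NONTARGET_SITE)):
    positions = mismatch_positions(PRIMER, site)
    near_3p = three_prime_mismatches(PRIMER, site)
    print(f"{label}: {len(positions)} mismatches at {positions}, {near_3p} near the 3' end")
```

Reporting where the mismatches sit, and not only how many there are, addresses part of the limitation noted above, since mismatches close to the 3' end generally have a stronger effect on amplification.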
Metabarcoding primer development requires significant in silico and in vitro testing. Unlike other tools such as the commonly used ecoPCR (Ficetola et al. 2010; Bellemain et al. 2010), PrimerMiner provides a more realistic evaluation of metabarcoding primers by taking into account the number and type of mismatches, their position, and whether they are adjacent (Elbrecht and Leese, 2017a). One may include ‘degenerate’ bases to increase the diversity of species that may be detected (Tournayre et al. 2020; Elbrecht and Leese 2017b). Primers with degenerate bases have an equimolar mix of nucleotides (Table 2). Thus, if a primer sequence is generally conserved across its length, but exhibits variation at one particular key nucleotide, we could address this by ordering a mix of primers with slightly different versions of the same sequence. For example, if the variable site contained either a ‘C’ or a ‘T’ we would code this as ‘Y’ and the resulting primer will comprise an equimolar mix of C (50%) and T (50%) allowing binding to both C and T at the same base position.
| Degenerate base | Mix |
|---|---|
| R | A + G |
| Y | C + T |
| S | G + C |
| W | A + T |
| K | G + T |
| M | A + C |
| B | C + G + T |
| D | A + G + T |
| H | A + C + T |
| V | A + C + G |
| N/I | A + T + C + G |
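To make the idea of an equimolar mix concrete, the sketch below expands a degenerate primer into every sequence it represents, using the codes in the table above. The example primer is invented for illustration, and inosine (I) is left out because it is a separate base analogue, not a mix of standard nucleotides.

```python
from itertools import product

# IUPAC degenerate codes, matching the table above (inosine omitted).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "GC", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """List every non-degenerate sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


# Hypothetical primer with a single C/T (Y) site.
variants = expand_degenerate("ACGYTT")
print(variants)       # ['ACGCTT', 'ACGTTT']
print(len(variants))  # 2
```

The number of variants multiplies with each degenerate site (a primer with one Y and one N already represents 2 × 4 = 8 sequences), so each individual sequence makes up a smaller share of the mix as degeneracy increases.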
To our knowledge, as of writing this text in 2024, only two tools have been developed to facilitate primer selection: the in silico-based MultiBarcodeTools (https://multibarcode.k.u-tokyo.ac.jp/; Zhu and Iwasaki, 2023), and the real metabarcoding data-based SNIPe (https://snipe.dlougheed.com/; Tournayre et al. 2024). The latter provides a comparison of the primer pairs based on the number of detected taxa, the taxonomic resolution of these identifications, and the number of off-target detections. Because metabarcoding studies target many taxa simultaneously, it would be very challenging to test species one by one. Instead, it is possible to use mock communities (mixtures of DNA from suites of taxa to mimic what we may find in nature). We illustrate the usefulness of SNIPe below.
Primers and probes can be ordered from the following companies (a non-exhaustive list biased towards our own experiences): Integrated DNA Technology (IDT), Eurofins Genomics, ThermoFisher or Applied Biological Materials. Primers and probes should be stored at -20°C, aliquoted into separate tubes, and kept isolated from sources of DNA (e.g. samples, tubes of DNA extracts, PCR plates) to limit degradation and possibilities for contamination.
An example, using SNIPe to select three primer pairs to survey fish
Tournayre et al. (2024) tested 14 primer pairs (from the literature, modified from the literature, and of their own design) in silico, using DNA from tissues, using mock communities comprising different mixtures of DNA, using samples from aquaria with known species, and using eDNA samples from natural water bodies. The different primer pairs (see the original paper for details) were originally designed to target particular taxa (e.g. flowering plants including macrophytes, fish, macroinvertebrates, phytoplankton) but also detect other taxa for which they were not designed (so-called ‘off-target’ taxa; see glossary). In total, Tournayre et al. (2024) detected 461 taxa including bacteria, invertebrates, vertebrates across classes, rotifers, tardigrades, bryophytes, angiosperms and microfungi. Using this dataset and SNIPe we can explore what subsets of these assayed primers might be most useful for surveying particular taxa of interest (note that SNIPe does allow for users to upload their own validated eDNA dataset using multiple primers).
To illustrate SNIPe, let’s suppose that we wish to select the best combination of three primer pairs for surveying fish communities. Using the 14-primer pair dataset, we find two combinations of three primer pairs that detect 30 ‘on-target’ taxa (i.e. fish species) (Fig. 20). Each set of three primer pairs also adds three additional taxa that would otherwise not be detected using two of the three primer pairs. How might we decide then which three primer pair set to deploy for actual surveys of fish communities? Here we can note that the 16SMOL-COIFish-COIFishdegen combination yields 39 ‘off-target’ taxa including crustaceans, birds, mammals, worms and even bacteria and bryozoans, whereas the COIFish-COIFishdegen-MiFish primer pair combination yields only 9 ‘off-target’ taxa, all vertebrates. In this instance then, we may wish to go with the second set of primer pairs (noting of course that there may be other factors at play in one’s lab including whether one has data from previous studies from one or more of these primer sets).
Fig. 20 SNIPe finds two three-primer-pair combinations yielding 30 ‘on-target’ taxa using the Tournayre et al. (2024) dataset.
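The selection logic in this example (maximize on-target detections, then prefer fewer off-target detections) can be sketched as a brute-force search over primer-pair combinations. This is an illustration of the reasoning only and is not SNIPe's implementation; the primer names, taxa and function name below are all hypothetical toy data.

```python
from itertools import combinations


def rank_primer_sets(detections, target_taxa, k):
    """Rank every k-primer combination by on-target count (high) then off-target count (low)."""
    ranked = []
    for combo in combinations(sorted(detections), k):
        detected = set().union(*(detections[name] for name in combo))
        on_target = detected & target_taxa
        off_target = detected - target_taxa
        ranked.append((combo, len(on_target), len(off_target)))
    ranked.sort(key=lambda row: (-row[1], row[2]))
    return ranked


# Hypothetical detections per primer pair (toy data, not from Tournayre et al. 2024).
detections = {
    "P1": {"perch", "bass", "daphnia", "mallard"},
    "P2": {"perch", "catfish"},
    "P3": {"bass", "catfish", "beaver"},
}
fish = {"perch", "bass", "catfish"}

for combo, n_on, n_off in rank_primer_sets(detections, fish, k=2):
    print(f"{'-'.join(combo)}: {n_on} on-target, {n_off} off-target")
```

In this toy dataset all three pairs of primers recover the same three fish, so the off-target count acts as the tie-breaker, mirroring the choice between the two three-primer sets above.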
SNIPe provides many other outputs that may prove useful. For example, Fig. 21 shows the Euler diagram for the COIFish-COIFishdegen-MiFish primer pairs showing overlap in species detections (e.g. all three primer pairs detect ten fish species, COIFishdegen and MiFish both detect an additional fish species not detected by COIFish, and MiFish detects three fish species not detected by the other two primer pairs).
Fig. 21 Euler diagram of species detections for the COIFish-COIFishdegen-MiFish primer pair set.
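The regions of an Euler diagram like this one can be computed directly from per-primer detection lists by grouping each taxon under the exact set of primer pairs that detected it. The sketch below is a generic illustration with hypothetical primer labels and placeholder species names; it does not reproduce SNIPe's code or the counts in Fig. 21.

```python
def exclusive_regions(detections):
    """Map each exact combination of primers to the taxa detected by exactly that combination."""
    regions = {}
    all_taxa = set().union(*detections.values())
    for taxon in all_taxa:
        detectors = frozenset(name for name, taxa in detections.items() if taxon in taxa)
        regions.setdefault(detectors, set()).add(taxon)
    return regions


# Hypothetical detections (placeholder species names).
detections = {
    "A": {"sp1", "sp2"},
    "B": {"sp1", "sp2", "sp3"},
    "C": {"sp1", "sp3", "sp4"},
}

for detectors, taxa in sorted(exclusive_regions(detections).items(), key=lambda kv: sorted(kv[0])):
    print(f"{' + '.join(sorted(detectors))}: {sorted(taxa)}")
```

Each key corresponds to one region of the diagram, so the size of each set is the number that would be printed in that region.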
For the 16SMOL-COIFish-COIFishdegen primer pair combination we can visualize the accumulation of new taxa across primer pairs but also identify which of the primer pair(s) is responsible for ‘off-target’ detections, in this case 16SMOL (Fig. 22).
Fig. 22 SNIPe view showing taxa coverage by taxon ‘supergroup’.
Finally, we can visualize the lists of taxa detected (on-target, off-target or both) and can download a CSV file of ‘on-target’ taxa as shown below for the COIFish-COIFishdegen-MiFish primer pairs (Table 3). Two of the detections are not to species level but rather to genus (Lepomis sp., visible in Table 3, and Catostomus sp.), revealing a modest lack of resolution for these primers.
| Supergroup | Taxa_group | Phylum | Order | Family | Genus | Final_ID | Primer |
|---|---|---|---|---|---|---|---|
| Vertebrate | Fish | Chordata | Perciformes | Centrarchidae | Lepomis | Lepomis_sp | COIFish |
| Vertebrate | Fish | Chordata | Perciformes | Centrarchidae | Lepomis | Lepomis_sp | COIFishdegen |
| Vertebrate | Fish | Chordata | Perciformes | Centrarchidae | Lepomis | Lepomis_gibbosus | COIFish |
| Vertebrate | Fish | Chordata | Perciformes | Centrarchidae | Lepomis | Lepomis_gibbosus | COIFishdegen |
| Vertebrate | Fish | Chordata | Perciformes | Centrarchidae | Lepomis | Lepomis_macrochirus | MiFish |
| Vertebrate | Fish | Chordata | Perciformes | Centrarchidae | Ambloplites | Ambloplites_rupestris | COIFish |
| Vertebrate | Fish | Chordata | Perciformes | Centrarchidae | Ambloplites | Ambloplites_rupestris | COIFishdegen |
| Vertebrate | Fish | Chordata | Perciformes | Centrarchidae | Ambloplites | Ambloplites_rupestris | MiFish |
| Vertebrate | Fish | Chordata | Perciformes | Centrarchidae | Micropterus | Micropterus_salmoides | COIFish |
| Vertebrate | Fish | Chordata | Perciformes | Centrarchidae | Micropterus | Micropterus_salmoides | COIFishdegen |
| Vertebrate | Fish | Chordata | Perciformes | Centrarchidae | Micropterus | Micropterus_salmoides | MiFish |
| Vertebrate | Fish | Chordata | Perciformes | Centrarchidae | Pomoxis | Pomoxis_nigromaculatus | COIFish |
| Vertebrate | Fish | Chordata | Perciformes | Percidae | Perca | Perca_flavescens | COIFish |
| Vertebrate | Fish | Chordata | Perciformes | Percidae | Perca | Perca_flavescens | COIFishdegen |
| Vertebrate | Fish | Chordata | Perciformes | Percidae | Etheostoma | Etheostoma_olmstedi | COIFishdegen |
| Vertebrate | Fish | Chordata | Siluriformes | Ictaluridae | Ameiurus | Ameiurus_natalis | COIFish |
| Vertebrate | Fish | Chordata | Siluriformes | Ictaluridae | Ameiurus | Ameiurus_natalis | COIFishdegen |
| Vertebrate | Fish | Chordata | Siluriformes | Ictaluridae | Ameiurus | Ameiurus_natalis | MiFish |
| Vertebrate | Fish | Chordata | Siluriformes | Ictaluridae | Ameiurus | Ameiurus_nebulosus | COIFish |
| Vertebrate | Fish | Chordata | Siluriformes | Ictaluridae | Ameiurus | Ameiurus_nebulosus | COIFishdegen |
| Vertebrate | Fish | Chordata | Siluriformes | Ictaluridae | Ameiurus | Ameiurus_nebulosus | MiFish |
| Vertebrate | Fish | Chordata | Siluriformes | Ictaluridae | Noturus | Noturus_gyrinus | COIFish |
| … | | | | | | | |
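Once downloaded, a CSV like the one above can be summarized with the standard library. The sketch below assumes the file uses the same column names as the table (this is an assumption about the export format) and that genus-level identifications carry the `_sp` suffix seen in `Lepomis_sp`. The embedded text is a small excerpt of rows from the table, used in place of a real file.

```python
import csv
import io
from collections import defaultdict

# A few rows copied from the table above, standing in for the downloaded file.
CSV_TEXT = """Supergroup,Taxa_group,Phylum,Order,Family,Genus,Final_ID,Primer
Vertebrate,Fish,Chordata,Perciformes,Centrarchidae,Lepomis,Lepomis_sp,COIFish
Vertebrate,Fish,Chordata,Perciformes,Centrarchidae,Lepomis,Lepomis_gibbosus,COIFish
Vertebrate,Fish,Chordata,Perciformes,Centrarchidae,Lepomis,Lepomis_macrochirus,MiFish
Vertebrate,Fish,Chordata,Perciformes,Percidae,Perca,Perca_flavescens,COIFish
Vertebrate,Fish,Chordata,Perciformes,Percidae,Perca,Perca_flavescens,COIFishdegen
"""


def summarize_detections(csv_text):
    """Return (primers detecting each taxon, sorted list of genus-level-only IDs)."""
    primers_by_taxon = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        primers_by_taxon[row["Final_ID"]].add(row["Primer"])
    genus_only = sorted(taxon for taxon in primers_by_taxon if taxon.endswith("_sp"))
    return dict(primers_by_taxon), genus_only


primers_by_taxon, genus_only = summarize_detections(CSV_TEXT)
for taxon, primers in sorted(primers_by_taxon.items()):
    print(f"{taxon}: {', '.join(sorted(primers))}")
print("Genus-level only:", genus_only)
```

To run this on a real export, replace `io.StringIO(csv_text)` with an open file handle.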
eDNA Sampling & Storage
Please refer to Section 2.1. Sampling strategy, and Figure 2 and Table 1 of Bruce et al. (2021) as they provide relevant, detailed guidance for water sampling including when and where to sample and sample number/volume in lentic, lotic and marine ecosystems: https://ab.pensoft.net/book/68634/. It is critical that you report your definition of sampling region, sites, stations, and replicates in whatever materials you produce from your eDNA research (Table 4).
In general, your sampling scheme should consider your biological questions, the life history (especially phenology) of your target species, the hydrological characteristics of your target system (if sampling aquatic systems) or airflow patterns (for sampling aerial eDNA), and the logistics of sampling. For maximum detectability, sample when your target species is most active in the area (such as during breeding). A difference of a few weeks before and during a breeding season can cause significant differences in ease of detection (Chen et al. 2023). Weather events such as precipitation may also dilute eDNA or increase inhibition from turbidity (Chen et al. 2023; Osathanunkul and Suwannapoom 2024). Many optimal sampling locales in a system may not be easily accessible, or may be on private property, and a pilot study and site scouting may be needed. Finally, hydrological properties must be a key consideration of your study design (Table 5).
| Term | Definition |
|---|---|
| Region | Broad area that describes the context of the study (watershed, management area, etc.), e.g. Bay of Quinte, St. Lawrence River, Rideau Canal. |
| Sites | Independent sampling sites that cover different aspects of the region (distinct waterbodies or parts of waterbodies, segments of a river). |
| Stations | Sampling locales within a site (spatial replicates) that are used to improve species detection or evaluate DNA variation within systems or habitats (grid/transect sampling, multiple locations within a lake, etc.). |
| Field Replicates | Separate sampling units collected at the same station at the same time, stored in separate containers, analyzed independently (to measure sampling-based variation). Usually a minimum of 3. |
| Filter Replicates | Obtained by cutting a filter into separate pieces and independently testing each piece, often for separate types of analysis. |
| Technical Replicates | Separate qPCR or PCR reactions from the same DNA extract/sample, minimum of 3. |
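Because the replicate levels defined above multiply, it helps to work out the number of samples and reactions before committing to a design. The sketch below is a simple planning aid under stated assumptions: every site has the same number of stations, one field control is collected per site, and filtration and extraction controls are ignored. The function name and the numbers are hypothetical.

```python
def plan_sampling(sites, stations_per_site, field_reps, technical_reps, field_controls_per_site=1):
    """Count field samples, field controls and PCR reactions for a balanced design."""
    field_samples = sites * stations_per_site * field_reps
    field_controls = sites * field_controls_per_site
    reactions = (field_samples + field_controls) * technical_reps
    return {
        "field_samples": field_samples,
        "field_controls": field_controls,
        "reactions": reactions,
    }


# Hypothetical design: 4 sites, 3 stations each, 3 field and 3 technical replicates.
plan = plan_sampling(sites=4, stations_per_site=3, field_reps=3, technical_reps=3)
print(plan)  # {'field_samples': 36, 'field_controls': 4, 'reactions': 120}
```

Even this modest design needs 120 reactions before any filtration or extraction controls are added, which is worth checking against consumable and labour budgets.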
Precautions to avoid contamination
Regardless of sampled media (e.g. soil, air, water), equipment (e.g. tweezers for handling filters, waders, cooler, reusable sampling bottles) must be decontaminated using 10% bleach and rinsed with distilled water before and between sampling bouts. To check that bottles have been properly decontaminated, use a field control (i.e. bottle filled with distilled water). For aquatic sampling, it is important to rinse the equipment with distilled water before entering any water body as bleach solution that has not dissipated could harm organisms. Use disposable latex or nitrile gloves to collect the samples and change gloves between sites or if contamination is suspected. Wear a mask during sample collection to prevent contamination from breath. Using controls at all stages of your workflow is crucial for measuring contamination at each stage (Table 6). Positive signals within controls may be used to diagnose protocol issues and as a threshold criterion for positive detection.
Field control |
|
Filtration control |
|
Extraction control |
|
Metadata
Record essential information such as location, geographical coordinates, date of sampling and identity of people who sampled. Also record any supplementary metadata that could be useful to interpreting your results later (e.g. weather, water, air or soil temperature, pH, turbidity, wind speed, visual observations of species or habitat structure). One should undertake eDNA sampling first before (for example) using probes to collect water physicochemistry data to avoid cross-contamination. The Molecular Detection Mapping and Analysis Platform for R (MDMAPR; Yu et al. 2020) can be used to merge raw qPCR fluorescence data and metadata together to facilitate the spatial visualisation of species presence/absence detections.
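One way to make sure the essential fields are never skipped is to capture each sample as a structured record that refuses incomplete or implausible entries. The sketch below is only an illustration of that idea; the class, field names and example values are hypothetical and are not part of MDMAPR or any reporting standard.

```python
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class SampleRecord:
    """Essential metadata for one eDNA sample, plus optional supplementary fields."""
    sample_id: str
    location: str
    latitude: float
    longitude: float
    sampled_on: date
    collectors: list
    extras: dict = field(default_factory=dict)  # e.g. weather, temperature, pH, turbidity

    def __post_init__(self):
        if not -90 <= self.latitude <= 90:
            raise ValueError("latitude must be between -90 and 90")
        if not -180 <= self.longitude <= 180:
            raise ValueError("longitude must be between -180 and 180")
        if not self.collectors:
            raise ValueError("record at least one collector")


# Hypothetical example record.
record = SampleRecord(
    sample_id="S01-ST1-R1",
    location="Example Lake",
    latitude=44.5,
    longitude=-76.5,
    sampled_on=date(2024, 6, 1),
    collectors=["A. Sampler"],
    extras={"water_temp_c": 18.2, "weather": "overcast"},
)
print(asdict(record))
```

Records of this kind convert directly to dictionaries, so they can be written to a CSV or spreadsheet alongside the lab results.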
Storage until further processing
Warm temperatures and exposure to UV light degrade eDNA. As eDNA degrades quickly, it is important to reduce the time between sampling and filtering (water) or sampling and storage (e.g. soil, fecal, blood, or hair samples). Between collection and filtration, water samples should be stored in a cooler with ice packs so that they are protected from sunlight and high temperatures (two factors that degrade DNA). Other types of samples such as fecal pellets should be directly stored in the freezer (-20°C) as is, in 95% ethanol or in sterile bags with silica gel.
eDNA capture: Filtration vs precipitation (water samples)
Isolation of eDNA from water samples can be done by precipitation or filtration. Filtration involves passage of water samples through a filter to capture the DNA whereas the precipitation method uses ethanol to precipitate DNA in the water sample. Both approaches can be used, but filtration is preferable as it allows processing larger volumes of water. Filtration can be done either on- or off-site, but always as soon as possible after sampling (< 24 h) to minimize degradation that may compromise eDNA signals. Filtering on-site reduces the risk of external contamination (e.g. DNA present in the lab) and reduces risk of degradation during transport. Water can be filtered using a syringe (small volumes), vacuum (small to medium volumes) or a peristaltic pump (large volumes). Turner et al. (2014) recommended using 0.2-μm-pore-size filters for macro-organisms, but noted that filters clogged even with small throughput volumes (e.g. 250 mL). Two solutions to this conundrum have been proposed: i) increasing pore size and processing larger volumes; and ii) adding a pre-filtration step to turbid water to prevent clogging (Takasaki et al. 2021). However, sometimes using several filters per sample is inevitable (Sengupta et al. 2019). Filters should be preserved either dry or in a lysis buffer in the freezer (Majaneva et al. 2018).
Equipment (e.g. pump, tubes, filter holders, tweezer) must be thoroughly bleached (10%; >=20min) and rinsed with distilled water between each sample. A negative control for filtration (i.e. distilled water filtered along with the samples) must be included in each filtration session to measure contamination during the filtration process. Disposable gloves should be worn and changed when contamination is suspected.
eDNA Processing – Lab Work
Lab work involves handling chemicals and potentially harmful reagents. Follow the safety recommendations for each reagent carefully and read the Material Safety Data Sheets (MSDS) if you are using reagents new to you. For example, if using a chloroform-based DNA extraction protocol, do not use chloroform outside of a working fume hood, and wear nitrile gloves. For your own safety and to avoid contaminating the samples, wear disposable gloves (latex or nitrile depending on the reagents), a clean lab coat, and close-toed shoes, and tie back your hair. Keep track of your work: note the sample ID, the protocol and any information that could be relevant to interpreting the data, including suspected contamination between samples or human error during processing. We highly recommend that you keep a lab book.
DNA Extraction
Ideally, DNA extractions should be done in a dedicated lab space with no PCR-based work going on, because amplified DNA (millions of copies of the target) can easily contaminate your samples. Equipment (e.g. bench, pipettes, centrifuge) must be bleached (≥10%) and if possible decontaminated using UV-C light (at least 20 min). If working with tubes, it is important not to touch the inside of the cap to avoid contamination between samples. A no-template control (NTC) of extraction (one tube filled with extraction reagents but no DNA) must be included in each set of extractions.
Many methods and kits are used for eDNA extraction, the most commonly used being the QIAGEN Blood and Tissue kit (e.g. Thomsen et al., 2012, Hinlo, Gleeson, and Furlan 2017, Walz, Yamahara, and Chavez 2019, Qiagen N.V.), and the cheaper alternative based on chloroform-phenol reactions (e.g. Turner et al. 2014, Feng, Bulté, and Lougheed, 2020, Chen et al. 2023). See Fig. 23 for a general eDNA extraction workflow.
Fig. 23 General steps in DNA extraction noting myriad protocols and variations therein.
DNA Amplification
The use of technical replicates and multiple controls is necessary to obtain robust data. Indeed, such practices are required if the results are to be published or used to guide policy. Technical replicates (i.e. each PCR reaction is repeated three times or more using the same conditions and reagents) are used to control for PCR stochasticity and contamination. The recommended minimum number of technical replicates is three: a species is considered present only if it is detected in at least two out of three replicates. In metabarcoding studies, if time and budget do not allow for separate processing of replicates, replicates can be pooled before proceeding to PCR2 (indexation), but the data will be less robust as it will not be possible to track the origin of cross-contamination or to account for PCR stochasticity (Lawson et al. 2019).
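The two-out-of-three rule described above can be written down as a small decision function. The following is a minimal sketch only: the function name, the input layout (a dictionary of per-replicate detections) and the choice to return no call when fewer than three replicates are available are illustrative assumptions, not part of any published pipeline.

```python
def call_presence(replicate_results, min_positive=2, min_replicates=3):
    """Apply a 'detected in at least min_positive replicates' rule per sample.

    replicate_results maps a sample ID to a list of booleans, one per
    technical replicate (True = target amplified in that replicate).
    Returns True (present), False (not detected) or None (too few
    replicates to make a call) for each sample.
    """
    calls = {}
    for sample_id, results in replicate_results.items():
        if len(results) < min_replicates:
            calls[sample_id] = None  # assumption: no call without enough replicates
        else:
            calls[sample_id] = sum(bool(r) for r in results) >= min_positive
    return calls


# Hypothetical example data
example = {
    "site_A": [True, True, False],   # 2 of 3 positive
    "site_B": [True, False, False],  # 1 of 3 positive
    "site_C": [True, True],          # only two replicates were run
}
calls = call_presence(example)
print(calls)
```

Samples with a single positive replicate are reported as not detected here; in practice such samples are often flagged for re-analysis rather than silently discarded.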
One must also include ‘no-template’ controls (i.e. only reagents, no addition of DNA) at each step of the process to test for contamination: field controls, filtration controls (water only), DNA extraction controls, and qPCR/ddPCR/PCR controls.
For qPCR/ddPCR studies, it is recommended that one use DNA of the species of interest as a positive control to check the efficiency of the reaction. In metabarcoding studies, the positive control should be a non-resident species (i.e. a species not present in the focal geographical region) because of false-assignment errors during sequencing. A false-assignment error occurs when a sequence is attributed to the wrong sample, leading to a false positive detection (the species is detected as present but is absent) in that sample. Using a non-resident control therefore allows one to calculate the rate of false assignment and to correct the data accordingly. DNA amplification success can be verified by running the PCR/qPCR product in an agarose gel (Fig. 24).
List of all controls: NCfield, NCfiltration, NCextraction, NCPCR1, NCPCR2 (for 2 step PCR only), Positive control and technical replicates (Table 4, Table 6).
Fig. 24 Photo of a 1% agarose gel. L = DNA Ladder (100 to 1,500 bp), 1 = No-template control, 2 = Positive control (tissue DNA), 3 to 7 and 12 = failed eDNA samples (no band), 8 to 11 and 13: successful eDNA samples (bright band at the expected amplicon size).
DNA Sequencing (metabarcoding)
DNA can be sequenced as single-end (i.e. in only one direction) or as paired-end (sequencing the amplicon forward and backward). Paired-end sequencing usually generates an overlap that provides high-quality data because the amplicons are sequenced twice in the overlap region. While Illumina platforms (MiSeq, HiSeq, NextSeq and NovaSeq) dominate the next-generation sequencing market (lowest error rate and least expensive, short amplicons), other sequencing platforms such as ThermoFisher Scientific (Ion Torrent), Oxford Nanopore Technologies (MinION) and PacBio exist as well.
Outputs
qPCR
Here we present the outputs of the Biorad CFX96 Real-Time PCR detection system using Biorad CFX Maestro® software. Note that outputs and options may vary from one software package to another, so please refer to relevant user manuals.
Amplification chart
The amplification chart displays the fluorescence intensity (relative fluorescence units or RFU) plotted against the number of cycles. There is one curve per fluorophore per well. The curves of technical replicates should overlap; if one replicate clearly deviates from the others, that outlier can be excluded from the analysis.
The Cq value will remain the same regardless of RFU value. When manually changing the threshold value we recommend using the log scale display mode as the curves are visually less flattened. The Cq value can also be determined automatically by the software with two modes: the regression and the single threshold modes. The user guide indicates that the regression mode applies “… a multivariable, nonlinear regression model to individual well traces and then uses this model to compute an optimal Cq value” and the single threshold mode “… uses a single threshold value to calculate the Cq value based on the threshold crossing point of individual fluorescence traces”.
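To make the idea of a threshold crossing point concrete, the sketch below finds the fractional cycle at which a single fluorescence trace first rises through a fixed threshold, using linear interpolation between the two neighbouring cycles. This is an illustration of the general principle only and is not the algorithm implemented in the Bio-Rad software; the function name is hypothetical and the trace is assumed to be baseline-subtracted already.

```python
def cq_single_threshold(rfu, threshold):
    """Return the fractional cycle at which a trace first crosses the threshold.

    rfu[0] is the fluorescence at cycle 1, rfu[1] at cycle 2, and so on.
    Returns None if the trace never rises through the threshold
    (i.e. no amplification was detected).
    """
    for i in range(1, len(rfu)):
        below, above = rfu[i - 1], rfu[i]
        if below < threshold <= above:
            # Linear interpolation between cycle i (below) and cycle i + 1 (above)
            fraction = (threshold - below) / (above - below)
            return i + fraction
    return None


# Hypothetical trace that doubles every cycle, with the threshold set at 25 RFU
trace = [1, 2, 4, 8, 16, 32, 64]
cq = cq_single_threshold(trace, 25)
print(cq)  # 5.5625: the threshold is crossed between cycles 5 and 6
```

Raising or lowering the threshold shifts every Cq value in the run, which is why the same threshold should be used for all samples and standards that are compared with one another.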
Fig. 25 Example amplification curve chart. The Y-axis is in relative fluorescence units (RFUs), while the X-axis is in cycles. The horizontal line at approximately 25 RFU is the threshold. The intersection of the amplification curve and threshold line is the Cq value for that sample. Taken from Bio-Rad CFX Manager 3.1 software (Bio-Rad Laboratories, Inc).
Standard curve
The vertical axis shows the Cq value and the horizontal axis shows the log of the starting concentration (log starting quantity). The legend shows the type of DNA template (standard or target sample), the colour of the fluorophore (e.g. FAM or HEX), efficiency (%; how much is being produced with each cycle), \(R^2\) (goodness-of-fit), slope of the standard curve, and y-intercept values (where the curve intercepts the y-axis).
Note: It is possible to obtain an efficiency (E) value higher than 100%. This can be explained by an excess of starting template or the presence of inhibitors that prevent Cq values from shifting into earlier cycles as template concentration increases. It can also be explained by non-specific primers when using intercalating dyes like SYBR Green. This can be checked by looking at the melting curve (Fig. 27): if only one peak is observed then the primers are specific; however, if multiple peaks are observed the primers may have amplified different fragments. This blog post provides detailed information on reasons and solutions for efficiency values that are too low or too high: https://biosistemika.com/blog/qpcr-efficiency-over-100/.
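The quantities shown in the standard curve legend can be reproduced from the standards themselves. The sketch below fits an ordinary least-squares line of Cq against log10 starting quantity and derives efficiency from the slope with the commonly used relation E = 10^(-1/slope) - 1. It is a minimal illustration with hypothetical names and idealised data, and the software may differ in detail (for example in how replicates are weighted).

```python
import math


def fit_standard_curve(starting_quantities, cq_values):
    """Fit Cq = slope * log10(SQ) + intercept by ordinary least squares.

    Returns (slope, intercept, r_squared, efficiency_percent), where
    efficiency_percent = (10 ** (-1 / slope) - 1) * 100.
    """
    x = [math.log10(q) for q in starting_quantities]
    y = list(cq_values)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    efficiency_percent = (10 ** (-1 / slope) - 1) * 100
    return slope, intercept, r_squared, efficiency_percent


# Idealised ten-fold dilution series in which the product doubles every cycle,
# so the expected slope is about -3.32 and the expected efficiency about 100%.
ideal_slope = -1 / math.log10(2)
quantities = [1e5, 1e4, 1e3, 1e2, 1e1]
cqs = [40 + ideal_slope * math.log10(q) for q in quantities]
slope, intercept, r_squared, efficiency_percent = fit_standard_curve(quantities, cqs)
print(round(slope, 2), round(r_squared, 3), round(efficiency_percent, 1))
```

A slope shallower than about -3.32 (for example -3.1) gives an efficiency above 100%, which is the situation discussed in the note above.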
Fig. 26 Example amplification curve for standards (left) and standard curve (right). The standard curve on the right has Cq value plotted against known starting quantity (log10 transformed). Taken from Bio-Rad CFX Manager 3.1 software (Bio-Rad Laboratories, Inc).
Melting curve
Melting curves are a low-cost, within-assay method for determining whether your intercalating dye (SYBR Green) based qPCR has produced a single product. Intercalating dyes fluoresce when they bind to double-stranded DNA, but are not sequence specific. Double-stranded DNA dissociates into single strands as temperature increases, typically between 70°C and 90°C, releasing the intercalating dye and reducing the fluorescent signal. This temperature of dissociation, or melting temperature, varies between sequences (regions with higher G/C content have greater binding energy and therefore a higher melting temperature). Therefore, by increasing the temperature in small intervals and measuring fluorescence at each interval, you generate a melting curve of your qPCR product, plotting RFU (relative fluorescence units) against temperature. By taking the first derivative of this curve, we can find the temperatures at which the rates of dissociation are greatest, which form peaks (Ririe, Rasmussen, Wittwer, 1997). This is all automated within the software packages of most qPCR platforms. These peaks can help you assess whether there is non-specific amplification or primer dimers in your reaction. For more details on melt curve analysis, read: https://www.idtdna.com/pages/education/decoded/article/interpreting-melt-curves-an-indicator-not-a-diagnosis.
Fig. 27 Example melt curve (left) and first derivative of melt curve (right). The sample with a peak at approximately 82°C is the desired amplification product. The sample with a smaller peak at approximately 76°C is a primer dimer. Taken from Bio-Rad CFX Manager 3.1 software (Bio-Rad Laboratories, Inc).
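As an illustration of the derivative step, the sketch below computes the negative first derivative (-dRFU/dT) of a melt curve by finite differences and reports its local maxima as melt peaks. It is a simplified sketch, not the vendor's implementation: the function name, the synthetic two-product curve and the minimum peak height are all assumptions made for the example, and real data would normally be smoothed first.

```python
import math


def melt_peaks(temperatures, rfu, min_height=0.0):
    """Return (temperature, -dRFU/dT) pairs at local maxima of the negative derivative.

    The derivative is estimated between consecutive readings and assigned
    to the midpoint temperature of each interval.
    """
    mids, neg_deriv = [], []
    for i in range(1, len(temperatures)):
        d_temp = temperatures[i] - temperatures[i - 1]
        mids.append((temperatures[i] + temperatures[i - 1]) / 2)
        neg_deriv.append(-(rfu[i] - rfu[i - 1]) / d_temp)

    peaks = []
    for i in range(1, len(neg_deriv) - 1):
        is_peak = neg_deriv[i] > neg_deriv[i - 1] and neg_deriv[i] >= neg_deriv[i + 1]
        if is_peak and neg_deriv[i] > min_height:
            peaks.append((mids[i], neg_deriv[i]))
    return peaks


def synthetic_rfu(temp):
    """Hypothetical melt curve: main product (Tm 82.25) plus a weaker primer dimer (Tm 76.25)."""
    product = 100 / (1 + math.exp((temp - 82.25) / 0.8))
    dimer = 30 / (1 + math.exp((temp - 76.25) / 0.8))
    return product + dimer


temps = [70 + 0.5 * i for i in range(41)]  # 70.0 to 90.0 in 0.5 degree steps
signal = [synthetic_rfu(t) for t in temps]
peaks = melt_peaks(temps, signal, min_height=1.0)
for temp, height in peaks:
    print(f"peak at {temp:.2f} C, -dRFU/dT = {height:.1f}")
```

Two peaks in the output, as in this synthetic example, would suggest that more than one product was amplified and that the assay needs further optimisation or confirmation by sequencing or gel electrophoresis.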
Data table
The data table displays the Cq value of each curve, Cq mean and Standard Deviation per group of replicates, Starting quantity (SQ; select the unit in Settings), Log SQ and SQ mean (select the unit in Settings) per group of replicates.
ddPCR
The first ddPCR output to check is the number of droplets generated for each sample. The number of droplets must be at least 10,000 and uniform among samples to allow comparison (Fig. 28). The second main output is the number of positive and negative droplets (Fig. 29). The threshold is automatically calculated by the software but can be adjusted manually. Separation of positive and negative droplets can be improved by incubating the PCR product at fridge temperature (4°C) for 3 hours to overnight before droplet reading (personal communication, Bio-Rad). The third output (calculated from the other two) is the concentration of the target species (number of DNA copies/μL) (Fig. 30). The lower and upper limits of concentration are 0.25 copies/μL and 5,000 copies/μL, respectively. The observed concentration can be converted into the number of copies present in the starting material.
Fig. 28 Example droplet count graph. The number of droplets in each well is on the Y-axis. Well labels are on the X-axis. Taken from Bio-Rad QX Manager 2 software (Bio-Rad Laboratories, Inc).
Fig. 29 Example droplet amplitude graph. The RFU of each droplet is on the Y-axis. Well labels are on the X-axis. The red line indicates the threshold (dividing line between positive and negative droplets). Taken from Bio-Rad QX Manager 2 software (Bio-Rad Laboratories, Inc).
Fig. 30 Concentration graph. The concentration in copies/μL is on the Y-axis. Well labels are on the X-axis. Concentrations were calculated by the software with Poisson statistics. Taken from Bio-Rad QX Manager 2 software (Bio-Rad Laboratories, Inc).
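The Poisson calculation behind the concentration graph can be sketched as follows. Because a positive droplet may contain more than one target copy, the mean number of copies per droplet is estimated from the fraction of negative droplets as -ln(negative/total), then divided by the droplet volume. The droplet volume of 0.85 nL used here is an assumption (a nominal value often quoted for Bio-Rad droplets); check the documentation for your instrument. Function names are hypothetical.

```python
import math

MIN_DROPLETS = 10_000  # minimum accepted droplet count per well, as stated above


def ddpcr_concentration(positive, total, droplet_volume_nl=0.85):
    """Estimate target concentration (copies/uL of reaction) from droplet counts.

    positive: number of droplets above the threshold
    total:    number of accepted droplets in the well
    droplet_volume_nl: assumed droplet volume in nanolitres
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total; a fully positive well cannot be quantified")
    negative_fraction = (total - positive) / total
    copies_per_droplet = -math.log(negative_fraction)  # Poisson correction
    droplet_volume_ul = droplet_volume_nl / 1000.0
    return copies_per_droplet / droplet_volume_ul


def has_enough_droplets(total, minimum=MIN_DROPLETS):
    """Return True if the well meets the minimum droplet count."""
    return total >= minimum


# Hypothetical well: 15,000 accepted droplets, 100 of them positive
conc = ddpcr_concentration(positive=100, total=15_000)
print(has_enough_droplets(15_000), round(conc, 2))
```

At low occupancy, as in this example, the Poisson-corrected estimate is close to the naive ratio of positive droplets to total volume; the correction matters more as the proportion of positive droplets increases.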
The following example is provided by Bio-Rad in the Droplet Digital PCR Application guide to understand how to convert copies/μL into copies in the starting material (from: https://www.bio-rad.com/webroot/web/pdf/lsr/literature/Bulletin_6407.pdf ):
“Mix 10 μl of sample with 12.5 μl of ddPCR Supermix for Probes and 2.5 μl of assay (primer and probe mix), for a total volume of 25 μl. Load 20 μl of this mix into a DG8™ DropletGenerator Cartridge and run ddPCR. The software reports that the concentration is 8 copies/μl. Two equivalent methods illustrate how many total copies and how many copies/μl of the target DNA were present in the original 10 μl sample.
Method #1: The ratio of sample to total volume is 10/25 = 2/5. Since there were 8 copies/μl in the final PCR mix, there were 8 x (5/2) = 20 copies/μl in the original sample. In the full 10 μl of the original sample, there were 10 x 20 = 200 copies of the target DNA.
Method #2: Since there were 8 copies/μl in the PCR mix and a total of 25 μl of the PCR mix was made, there were 8 x 25 = 200 copies of the target DNA in the PCR mix. This mix contained 10 μl of the original sample, so there were 200 copies of target DNA in the full 10 μl of starting sample, and 200/10 = 20 copies/μl of target in the starting sample”
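The same conversion can be scripted so that it is applied consistently across samples. The sketch below follows Method #2 from the quoted example; the function and argument names are illustrative only.

```python
def copies_in_starting_sample(reported_copies_per_ul, sample_volume_ul, reaction_volume_ul):
    """Convert a ddPCR concentration in the reaction mix back to the original sample.

    reported_copies_per_ul: concentration reported by the software (copies/uL of PCR mix)
    sample_volume_ul:       volume of sample added to the PCR mix
    reaction_volume_ul:     total volume of the PCR mix that was prepared
    Returns (copies per uL of original sample, total copies in the sample volume).
    """
    total_copies = reported_copies_per_ul * reaction_volume_ul
    copies_per_ul_sample = total_copies / sample_volume_ul
    return copies_per_ul_sample, total_copies


# Values from the quoted example: 8 copies/uL reported, 10 uL sample in a 25 uL mix
per_ul, total = copies_in_starting_sample(8, 10, 25)
print(per_ul, total)  # 20.0 200
```

To express the result per volume of water filtered, a further step would be needed to account for the elution volume of the DNA extract and the volume of water passed through the filter; those values are study specific and are not included here.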
Metabarcoding (paired-end sequencing)
Most sequencing platforms provide data that are already demultiplexed: the library has been split up into different files for each sample (i.e. each read has been assigned to a sample). The end-user will receive two compressed fastq files (Box 1) per sample, one for Read 1 and one for Read 2 (see paired-end sequencing above). The two files for a sample have the same name except for the short form codes “R1” (Read 1) or “R2” (Read 2): nameofthesample_R1.fastq.gz and nameofthesample_R2.fastq.gz. For example, if you have sequenced four samples A, B, C, D, then you will have eight files: sampleA_R1.fastq.gz, sampleA_R2.fastq.gz, sampleB_R1.fastq.gz, sampleB_R2.fastq.gz, sampleC_R1.fastq.gz, sampleC_R2.fastq.gz, sampleD_R1.fastq.gz, sampleD_R2.fastq.gz.
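Before starting a bioinformatic pipeline it is worth confirming that every sample has both of its read files. The sketch below groups file names by sample and reports any sample that is missing its R1 or R2 file. It assumes the simple naming scheme described above (nameofthesample_R1.fastq.gz); file names delivered by a sequencing facility often carry extra fields, in which case the pattern would need to be adapted. All names in the code are illustrative.

```python
import re
from collections import defaultdict

# Assumed naming scheme: <sample>_R1.fastq.gz and <sample>_R2.fastq.gz
PAIR_PATTERN = re.compile(r"^(?P<sample>.+)_(?P<read>R[12])\.fastq\.gz$")


def pair_fastq_files(filenames):
    """Group fastq file names into (R1, R2) pairs per sample.

    Returns (complete, incomplete): a dict of sample -> (R1 file, R2 file)
    and a sorted list of samples for which one of the two files is missing.
    File names that do not match the pattern are ignored.
    """
    by_sample = defaultdict(dict)
    for name in filenames:
        match = PAIR_PATTERN.match(name)
        if match:
            by_sample[match["sample"]][match["read"]] = name

    complete = {}
    incomplete = []
    for sample, reads in by_sample.items():
        if "R1" in reads and "R2" in reads:
            complete[sample] = (reads["R1"], reads["R2"])
        else:
            incomplete.append(sample)
    return complete, sorted(incomplete)


files = [
    "sampleA_R1.fastq.gz", "sampleA_R2.fastq.gz",
    "sampleB_R1.fastq.gz", "sampleB_R2.fastq.gz",
    "sampleC_R1.fastq.gz", "sampleC_R2.fastq.gz",
    "sampleD_R1.fastq.gz", "sampleD_R2.fastq.gz",
]
complete, incomplete = pair_fastq_files(files)
print(len(complete), incomplete)  # 4 []
```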
The Sequencing Analysis Viewer (SAV) is free software to check the quality of Illumina sequencing runs. https://support.illumina.com/sequencing/sequencing_software/sequencing_analysis_viewer_sav.html
The three main metrics to check are:
Cluster density (K/mm²). During sequencing, sequences are gathered in clusters on the flow cell and read by the instrument. The optimal cluster density depends on the reagent kit that was used. The optimal density for a MiSeq 500v2 kit is between 700 and 800 K/mm², while for a MiSeq 600v3 kit, it is between 1200 and 1400 K/mm². If cluster density is too low (called under-clustering), data quality is maintained but fewer sequences are produced. If cluster density is too high (over-clustering), the image analysis will be affected, resulting in both lower quality (lower % of reads passing filter, PF) and lower quantity of sequences. In the case of extreme over-clustering, the sequencing run will fail because the camera of the instrument will not be able to distinguish the clusters from each other.
Reads passing filter, PF (%). This indicates the percentage of sequences that pass the Illumina image quality filter. Expected PF is usually >70-80%.
Global percentage of bases whose Q score > 30 (global index of sequencing quality). A Q score of 30 corresponds to a probability of 1 in 1,000 that a given base has been called incorrectly (i.e. 99.9% base call accuracy).
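The relationship between a Q score and the probability of an incorrect base call is P = 10^(-Q/10). The sketch below applies this relationship and computes the percentage of bases at or above a chosen Q score from fastq quality strings. It assumes the standard Phred+33 encoding used by current Illumina fastq files and counts bases with Q of 30 or more, which is how this metric is usually reported; the function names and example strings are illustrative.

```python
def phred_to_error_probability(q_score):
    """Convert a Phred quality score into the probability of an incorrect base call."""
    return 10 ** (-q_score / 10)


def percent_bases_at_or_above(quality_strings, min_q=30, offset=33):
    """Percentage of bases whose Q score is at least min_q.

    quality_strings: iterable of fastq quality lines (the fourth line of each record)
    offset: ASCII offset of the encoding (33 for Phred+33, assumed here)
    """
    total = 0
    passing = 0
    for quality in quality_strings:
        for char in quality:
            total += 1
            if ord(char) - offset >= min_q:
                passing += 1
    return 100.0 * passing / total if total else 0.0


print(phred_to_error_probability(30))  # about 0.001, i.e. 1 error in 1,000 bases
print(phred_to_error_probability(20))  # about 0.01, i.e. 1 error in 100 bases

# Two hypothetical reads: 'I' encodes Q40 and '#' encodes Q2 in Phred+33
pct_q30 = percent_bases_at_or_above(["IIII", "####"])
print(pct_q30)  # 50.0
```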