Abstract

Quantifying the biomass, or number of individuals, diversity, and distribution of marine species is a critical aspect of understanding and managing marine ecosystems. In recent years, there has been growing interest in using environmental DNA (eDNA) for marine ecosystem management and biodiversity assessment. However, the main challenge hindering eDNA applicability has been the inability to infer absolute species abundances from multispecies analysis (eDNA metabarcoding). In this study, we demonstrate a way forward by estimating the abundance of commercially important fish species in a Norwegian fjord using a joint Bayesian statistical model of traditional trawl-catch data and molecular data derived from eDNA. Using this model, we accurately predict out-of-sample trawl catches using eDNA alone. Moreover, our model provides empirical estimates for key processes linking marine eDNA concentration to the fish population abundance estimated from trawl observations, including trawl catchability, DNA shedding, degradation, dilution, transport, recovery rate, and isolation efficiency. These processes, including amplification efficiencies correcting for Polymerase Chain Reaction (PCR) bias, are species-specific and enable the translation of eDNA metabarcoding data into abundances. These findings have broad implications for the use of eDNA in marine ecosystem management and conservation efforts.

Introduction

Quantifying the abundance, diversity, and distribution of species is a cornerstone of fisheries management and ecosystem conservation (Preston 1948, Jetz et al. 2012, Hill et al. 2014, Thomsen et al. 2016, Callaghan et al. 2021, Farr et al. 2022). Traditional survey methods, including scientific, and commercial trawl catches, close kin mark-recapture, telemetry, hydro-acoustics, and electrofishing surveys, have been widely used to gather ecological data and form the foundation for most fisheries management and conservation (Fraser et al. 2007, Neebling and Quist 2011, Crossin et al. 2017, Lees et al. 2021). However, these methods are invasive, and a few have a net negative ecological impact (Biju Kumar and Deepthi 2006, Eigaard et al. 2017). Furthermore, trawl and acoustic surveys are cost-intensive, requiring a large number of crew to sort the trawl catches, taxonomists onboard to identify species, and may be unsuitable for specific habitats, such as rocky bottoms (Lacoursière-Roussel et al. 2016).

Environmental DNA (eDNA), free genetic material shed by organisms which can be captured from environmental samples and sequenced, has emerged as a promising molecular approach that offers a less invasive method to identify and quantify species (Taberlet et al. 2012, Thomsen et al. 2012). Through “eDNA metabarcoding”—molecular identification of multiple species—researchers and resource managers can infer community composition, biodiversity changes, and ecological shifts due to anthropogenic impacts (Hansen et al. 2018, Jeunen et al. 2019, Atienza et al. 2020, Larson et al. 2022, Turon et al. 2022, Guri et al. 2023). Despite the recent technological advancements, eDNA methods have not yet reached the level of accuracy to fully replace traditional surveys for fish ecosystem monitoring or stock assessments. However, the integration of eDNA surveys with traditional methods can increase our understanding of marine ecosystems (Valdivia‐Carrillo et al. 2021, Pont et al. 2023), fisheries management (Stoeckle et al. 2021), and conservation (Hill et al. 2014, He et al. 2023). By combining these two survey types, we can obtain more comprehensive and accurate information on the species distribution and biological factors related to it (Veron et al. 2023). This becomes particularly pivotal when considering the ecological repercussions of trawl surveys, including adverse effects on non-target species and vulnerable benthic habitats (Lacoursière-Roussel et al. 2016). The integration of eDNA and trawl data can help to address some of the drawbacks of traditional survey methods, such as their high costs, invasive nature, inability to access restricted or vulnerable areas, and reliance on taxonomic expertise (Closek et al. 2019, Pont et al. 2023, He et al. 2023, Veron et al. 2023), thereby mitigating the dependence solely on the latter survey method (Valdivia‐Carrillo et al. 2021).

Despite the potential of eDNA as a method for understanding marine ecosystems, there are still several challenges that need to be addressed (Ramírez-Amaro et al. 2022). Metabarcoding methods do not reflect absolute concentrations of eDNA in environmental samples due to a combination of PCR stochasticity and species-specific amplification efficiencies, especially when using universal markers (12S, 16S, or COI). Furthermore, the compositional nature of metabarcoding data provides only inferable proportions (Gloor et al. 2017). Consequently, quantitatively assessing biomass or the abundance of targeted organisms through metabarcoding remains elusive (Rourke et al. 2022). Such constraints have hindered the broader adoption and integration of eDNA metabarcoding methods in cross-disciplinary fields such as fisheries, conservation biology, and ecosystem-based management (Ramírez-Amaro et al. 2022). Species-specific quantitative PCR methods (real-time PCR—qPCR or digital droplet PCR—ddPCR; Pont et al. 2023) offer an attractive alternative to metabarcoding because they are quantitative, hence reflect absolute eDNA concentrations. For example, Salter et al. (2019) showed a strong correlation (R2 = 0.66; P = 0.008) between absolute eDNA quantities and the biomass of Atlantic cod, and Shelton et al. (2022) showed reliable hake distributions from eDNA on a continental scale. More recently, Maes et al. (2023) employed ddPCR and found eDNA concentrations can represent fish abundance and biomass for two commercial flatfishes (Solea solea and Pleuronectes platessa). In the context of ecosystem-based management of multiple species, however, species-specific quantitative methods can be more expensive and less efficient than metabarcoding multi-species, primarily due to the requirement for prior knowledge or isolation of the target species (Schneider et al. 2016).

eDNA metabarcoding has rapidly advanced as a tool for exploring fish communities. Several studies have shown positive relationships between eDNA metabarcoding data and trawl data (Thomsen et al. 2016, Afzali et al. 2021, Stoeckle et al. 2021, Yates et al. 2022). Stoeckle et al. (2021) compared log proportional metabarcoding reads to log proportional abundances from bottom trawls and observed a positive correlation when comparing across different months (mean R2 = 0.565). Taken together, the available evidence indicates that metabarcoding reads without any type of transformation accounting for PCR biases cannot yield absolute abundance but may reveal information of rank, relative, or semiquantitative abundances if differences in amplification rates among species are small (Salter et al. 2019, Stoeckle et al. 2021, Guri et al. 2023, Veron et al. 2023). Pont et al. (2023) and Allan et al. (2023) addressed this issue by combining eDNA metabarcoding with qPCR (multiplying metabarcoding relative read-abundances with total qPCR fish DNA concentration) and found a strong correlation between total DNA concentrations and total fish biomass measured through traditional electrofishing in river systems. However, when comparing metabarcoding with the species-specific DNA concentration, they found roughly two orders of magnitude variation, which may have resulted from differences in species-specific amplification efficiencies. Alternatively, Shelton et al. (2023) developed a model to estimate initial DNA proportions in metabarcoding samples while accounting for species-specific amplification biases by sequencing mock community samples of known concentration of DNA extracts for a given list of taxa alongside environmental samples. To further improve the application of these models and methods, additional information is required to advance beyond proportions and provide absolute species concentrations from metabarcoding data. Addressing all aforementioned challenges will be critical to the effectively applying eDNA metabarcoding in ecological research and further integrating it into ecosystem-based management and fisheries stock assessments.

In this study, we developed a comprehensive and reliable framework for analyzing eDNA data in conjunction with other ecological data sources, such as trawl catch data, to improve the quantitative accuracy of metabarcoding analyses and provide a more holistic understanding of aquatic ecosystems. Our Bayesian joint model framework integrated eDNA concentrations from eDNA metabarcoding and ddPCR with fish density from trawl surveys (Fig. 1) to estimate of biological parameters of interest, such as abundance, and parameters describing the links between eDNA and traditional sampling data. Additionally, using these links, we show how we can predict out-of-sample trawl catches solely using eDNA observations.

Simplified schematic overview of the joint Bayesian model (three model compartments) workflow including all the inferences and processes. The model indicates θ as the conversion between eDNA concentrations and fish densities (θ, with units: fish/km2/copies/µl). Processes (dashed lines) are shown for schematic understanding, thus are circumvented (not measured) in this model.
Figure 1.

Simplified schematic overview of the joint Bayesian model (three model compartments) workflow including all the inferences and processes. The model indicates θ as the conversion between eDNA concentrations and fish densities (θ, with units: fish/km2/copies/µl). Processes (dashed lines) are shown for schematic understanding, thus are circumvented (not measured) in this model.

Methods

We developed and joined five distinct datasets into a common analysis: trawl-catch counts, ddPCR observations of known concentration from standard samples for a reference species, ddPCR observations of unknown concentration from environmental samples for the reference species, metabarcoding of a known mixture of DNA extracts from multiple species of interest, and metabarcoding reads of environmental samples for target fish species. Below, we describe how each dataset was produced before providing detailed information on the joint statistical model and subsequent analyses.

In brief, the model uses species-specific ddPCR and a mock community to both calibrate metabarcoding results and to derive eDNA concentrations in environmental samples. We use this molecular dataset in combination with traditional trawl data to estimate the abundances of several species and rigorously test the resulting joint model against out-of-sample data to show its reliability.

Study area and samples

Balsfjord, northern Norway (Supplementary Fig. S1), is 40 km long and averages 150 m in depth with a sill at the fjord’s entrance, and the archipelago limits the water exchange between the fjord and the Norwegian Sea (open sea). Nearly all high-latitude Norwegian fjords, including Balsfjord, are ice-free and characterized by an Arctic light regime (Reigstad and Wassmann 1996). GPS coordinates and other metadata of sampling stations including GPS coordinates are provided in Supplementary Table S1.

Trawl surveys

Study samples were collected on research cruises in October 2019, 2020, and 2021 on the R/V Kristine Bonnevie as part of the Norwegian coastal annual surveys. The main aim of the survey was to collect data for abundance estimation of Norwegian coastal cod (Gadus morhua) and Northeast Arctic saithe (Pollachius virens) and catch count and weight of all fish species are recorded. Bottom trawl surveys were conducted annually in four distinct stations in Balsfjord alongside eDNA samples (Supplementary Fig. S1). Trawl surveys used a standard Campelen 1800 sampling trawl with an 80 mm (stretched) mesh size in the front section and 22 mm in the cod-end. The trawl sweeps were 40 m in length, and rockhopper gear was employed. Samples of the catch were sorted, weighed, and measured according to the recommended instructions (Mjanger et al. 2019) by the Institute of Marine Research crew.

eDNA water sampling, filtration, and extraction

Water was collected prior to trawling in three different 5-l Niskin bottles attached to a rosette which was raised and lowered using a deck-mounted winch at the established sampling station (Supplementary Fig. S1). At each station, we filtered triplicates of 2 l of seawater through 0.22 μm Sterivex filters (MerckMillipore) from three depths, surface (10 m), pycnocline (depth of highest density, ∼50 m) and bottom (10 m above bottom). Each filter was thereafter transferred to a sterile 50 ml Falcon centrifuge tubes and directly stored at −20°C. DNA extraction of water samples followed the manufacturer’s protocol using the DNeasy PowerWater Sterivex Kit (Qiagen GmbH, Hilden Germany), with minor adjustments (excluding PowerBead Tubes steps). For more details regarding these workflows see Guri et al. (2023). Negative controls such as field water and air blanks, laboratory blanks, and PCR blanks were taken throughout the workflow as described in Shu et al. (2020).

Droplet digital PCR

Quantitative molecular assays targeting the mitochondrial DNA (mtDNA), specifically 103-bp region of the ATPase gene of Atlantic cod (Gadus morhua, Table 1) and the cytochrome b sequence of Atlantic herring 86-bp (Clupea harengus, Table 1), were performed using ddPCR on seawater samples. Prior to running the environmental samples, assays were optimized using different primer/probe concentrations and temperature gradient to give the greatest fluorescence contrast between positive and negative droplets. Herring and cod ddPCR assays were run in duplex reactions using 6-FAM and VIC dedicated channels, respectively. All ddPCR runs were conducted using a QX200 ddPCR system (Bio-Rad) where ca. 20 000 droplets were generated in 20 µl reaction volumes. Each reaction used 5 µl of DNA template from environmental samples, 10 μl of ddPCR Supermix with no dUTP (Bio-Rad), 900 nM of final concentration of forward and reverse primers (Table 1), and 250 nM final concentration of TaqMan probe for each assay (cod and herring, respectively). The thermocycler reactions were run in C1000 Touch Thermal Cycler with 96-Deep Well Reaction Module (Bio-Rad) using the PCR program as follows: 10 min at 95°C for enzyme activation, followed by 44 cycles of denaturation for 1 min at 95°C and primer annealing and elongation for 2 min at 55.6°C, with a ramp rate of 2°C per s, followed by 10 min at 98°C and stored at 4°C. Alongside the environmental samples, we ran standard samples of known concentration from 10−2–104 copies/µl for each species to construct the relationship between positive droplets and known concentration (Supplementary Table S2).

Table 1.

Sequences for ddPCR assays targeting Gadus morhua and Clupea harengus (targeting 103-bp region of the ATPase gene and cytochrome b sequence of the mitochondrial DNA, respectively).

TargetPrimers and probeSequence5' DyeSize
Gadus morhuaForwardGAD-FIIGCAATCGAGTYGTATCYCTHCAAGGAT103 bp(Taylor et al. 2002)
ReverseGAD-R IIIGCAAGWAGYGGHGCRCADTTGTG(Nash et al. 2012)
ProbeCustom MGBNFQCTTTTTACCTCTAAATGTGGGAGGVIC
Clupea harengusForwardCluhar_CYBF14928CCCATTTGTGATTGCAGGGG86 bp(Knudsen et al. 2019)
ReverseCluhar_CYBR15013CTGAGTTAAGTCCTGCCGGG(Knudsen et al., 2019)
ProbeCluhar_CYBP14949TACTATTCTCCACCTTCTGTTCCTC-BHQ16FAM
TargetPrimers and probeSequence5' DyeSize
Gadus morhuaForwardGAD-FIIGCAATCGAGTYGTATCYCTHCAAGGAT103 bp(Taylor et al. 2002)
ReverseGAD-R IIIGCAAGWAGYGGHGCRCADTTGTG(Nash et al. 2012)
ProbeCustom MGBNFQCTTTTTACCTCTAAATGTGGGAGGVIC
Clupea harengusForwardCluhar_CYBF14928CCCATTTGTGATTGCAGGGG86 bp(Knudsen et al. 2019)
ReverseCluhar_CYBR15013CTGAGTTAAGTCCTGCCGGG(Knudsen et al., 2019)
ProbeCluhar_CYBP14949TACTATTCTCCACCTTCTGTTCCTC-BHQ16FAM
Table 1.

Sequences for ddPCR assays targeting Gadus morhua and Clupea harengus (targeting 103-bp region of the ATPase gene and cytochrome b sequence of the mitochondrial DNA, respectively).

TargetPrimers and probeSequence5' DyeSize
Gadus morhuaForwardGAD-FIIGCAATCGAGTYGTATCYCTHCAAGGAT103 bp(Taylor et al. 2002)
ReverseGAD-R IIIGCAAGWAGYGGHGCRCADTTGTG(Nash et al. 2012)
ProbeCustom MGBNFQCTTTTTACCTCTAAATGTGGGAGGVIC
Clupea harengusForwardCluhar_CYBF14928CCCATTTGTGATTGCAGGGG86 bp(Knudsen et al. 2019)
ReverseCluhar_CYBR15013CTGAGTTAAGTCCTGCCGGG(Knudsen et al., 2019)
ProbeCluhar_CYBP14949TACTATTCTCCACCTTCTGTTCCTC-BHQ16FAM
TargetPrimers and probeSequence5' DyeSize
Gadus morhuaForwardGAD-FIIGCAATCGAGTYGTATCYCTHCAAGGAT103 bp(Taylor et al. 2002)
ReverseGAD-R IIIGCAAGWAGYGGHGCRCADTTGTG(Nash et al. 2012)
ProbeCustom MGBNFQCTTTTTACCTCTAAATGTGGGAGGVIC
Clupea harengusForwardCluhar_CYBF14928CCCATTTGTGATTGCAGGGG86 bp(Knudsen et al. 2019)
ReverseCluhar_CYBR15013CTGAGTTAAGTCCTGCCGGG(Knudsen et al., 2019)
ProbeCluhar_CYBP14949TACTATTCTCCACCTTCTGTTCCTC-BHQ16FAM

Mock communities

To calibrate metabarcoding observations and account for amplification bias (Gold et al. 2023, Shelton et al. 2023a), we selected ten fish species that were common in previous metabarcoding or trawl surveys in the study area (Guri et al. 2023). For the species Maurolicus muelleri (silvery lightfish), Gadus morhua (cod), Leptoclinus maculatus (daubed shanny), Hippoglossoides platessoides (long rough dab), Myoxocephalus scorpius (shorthorn sculpin), Cyclopterus lumpus (lumpsucker), Pholis gunnellus (rock gunnel), Brosme brosme (cusk), we used tissue samples collected by and DNA extracted at the University of Tromsø. The remaining species tissue samples, Mallotus villosus (capelin) and Pleuronectes platessa (European plaice), were purchased from the local fish market in Tromsø (Dragøy Fisk AS). Genomic DNA from bought fish muscle tissue samples was thereafter extracted using the DNeasy Blood & Tissue Kit (Qiagen GmbH, Hilden Germany) following the manufacturer’s protocol. A mock community of representative fish species was constructed from these genomic DNA samples by amplifying ca. 172 bp-long region of the 12S rRNA gene using MiFish-U universal primer set (Forward: 5′′-GTCGGTAAAACTCGTGCCAGC-3′′; Reverse: 5′′-CATAGTGGGGTATCTAATCCCAGTTTG-3′′; Miya et al. 2015). Reactions were run in 20 µl volume containing 12.5 µl of TaqMan Environmental Master Mix 2.0, 1.25 µl of each forward and reverse primers (10 nM concentration each), 8 µl of dH2O and 2 µl of DNA template of above-listed species. The thermocycler program consisted of an initial polymerase activation for 10 min at 95°C, followed by 40 cycles of denaturation for 1 min at 95°C, annealing for 1 min at 60°C and primer extension for 1 min at 72°C, followed by a final extension for 10 min at 72°C and lastly storage at 4°C. PCR products were purified using the MinElute PCR Purification Kit (Qiagen GmbH, Hilden Germany) to ensure complete removal of primer-dimers. All samples were measured for concentration of dsDNA in purified PCR products using the High Sensitivity dsDNA assay with Qubit quantification system. PCR products were diluted in 10-fold steps to achieve a final concentration of ∼105 copies/µl.

In total, six separate (but similar in concentration) mock community samples (250 µl each) were created by combining 25 µl of amplicon DNA from all 10 species in equimolar concentrations resulting in ca. 104 copies/µl for each species (Supplementary Table S4). Mock community samples were amplified and sequenced alongside the environmental samples following the same workflow and protocol. Only six species in the mock community were also caught in trawl surveys (Supplementary Table S5) hence were selected for downstream analysis.

Library preparation, and NGS sequencing

In total, 109 samples divided into two libraries, including eDNA samples (n = 84), PCR blanks (n = 2), positive controls (n = 6), extraction blanks (n = 5), fieldwork water and air blank (n = 9 and 3, respectively), were amplified using the same primer set used for mock community samples (see above). We used fusion primers (primers containing adaptor and index), allowing for a one-step PCR protocol, and followed a thermocycler program and all following steps as described in Guri et al. (2023). Libraries were sequenced on GeneStudio S5 sequencer (Thermo Fisher Scientific) using the Ion 540-sequencing chip with the 200 bp protocol (Thermo Fisher Scientific).

Bioinformatics

Sequences were automatically demultiplexed and quality filtered after the sequencing process using Torrent Suite™ software inbuilt in the sequencer following their default settings. The sampled sequence dataset was thereafter filtered for chimeric sequences using a uchime-denovo algorithm in VSEARCH (Rognes et al. 2016). The sequences were clustered into Molecular Operational Taxonomic Units (MOTU) using SWARM 2 (Mahé et al. 2014) with a distance of d = 3, and subsequently the singletons were removed. MOTUs with a larger number of reads than 50 of the total library reads were retained and annotated by performing BLAST search against NCBI nucleotide (nt) database (23 August 2023) using the BLASTn algorithm. We used an E-value threshold of 1e−30, maximum target sequences of 50 and a minimal percentage identity of 90. Lastly, we removed all taxa that were not assigned to fish (class: Actinopterygii or Chondrichthyes) and sequences whose read abundance was larger than 10% in negative controls (cumulatively) compared to the environmental samples.

Bayesian model

We developed a Bayesian joint model to synthesize information from trawl catches, ddPCR, and metabarcoding, to estimate fish quantities. We used mock community samples (box 3a in Fig. 1; only six species used) with known initial DNA concentrations to estimate species-specific amplification efficiencies. Knowing the amplification efficiencies enables the model to determine the species’ initial proportion and thus a species’ relative abundance in environmental samples of metabarcoding data (box 3b in Fig. 1; Shelton et al. 2023). Simultaneously, the model determines a link between known concentrations and positive droplets observed by the ddPCR system (box 2a in Fig. 1). Using this link-function, the ddPCR model estimates the absolute concentration of the reference (one) species arbitrarily chosen in ddPCR and mock community data in the environmental samples (box 2b in Fig. 1). Having both the relative abundance of species and the absolute concentration of one species in the environmental samples, the model expands proportion into absolute concentration for all species, or at least all species for which amplification efficiency can be estimated. Lastly, the model assumed a species-specific correlation between trawl catches and absolute DNA concentration which is a conversion parameter, and primary parameter of interest (θ) linking the two methods. See Fig. 1 for a simplified schematic representation of the joint Bayesian model.

In detail, in this model, we defined true fish density as N (units: fish/km2) and q for the catchability parameter on a scale of 0 to 1 (Zhang et al. 2020). Subscripts i, j, y, d, b, r, m, and p indicate species, station, year, depth category, biological replicate, technical replicate, observation in mock community aliquot samples, and observation in ddPCR standard aliquot samples, respectively.

The process (unmeasured biological parameters, Fig. 1)

Let X be the unobserved fish density detectable by trawling (fish/km2 for species i in sample j in year y. Based on equations 1 and 3 in Mahévas et al. (2011), we model unobserved fish density (X) as a function of catchability (q) and true fish density (N):

(1.1)

Let C be the unobserved DNA concentration (copies/µl) from the same subscripts as X and let λ be the “integrated eDNA factor”—the conversion factor between the true fish density and unobserved DNA concentration. By analogy to catchability in the context of trawl data, we model the concentration of environmental samples as a function of the true fish density (N) and an integrated eDNA factor (λ) expressed as:

(1.2)

Note that we assume a simple proportional species-specific form for both catchability (qi) and the integrated DNA factor (λi). These proportionalities are a common and reasonable assumption both in fisheries (Beare et al. 2005, Fraser et al. 2007, Mahévas et al. 2011) and environmental DNA (Stoeckle et al. 2021, Yates et al. 2022). However, different, and more complicated functional forms may certainly be reasonable and are worth further exploration (see more in discussion).

Biological parameters (Fig. 1)

Trawl model compartment (MC 1)

We model species-specific trawl catch count observations (Zijy) as negative binomial observations that are a function of annual mean fish density (Viy) for each species (i) and year (y) across all the j stations (j = 1, 2, …, Jy), known fishing effort (Ejy, area-swept, km2; Fraser et al. 2007), and an overdispersion term φiy:

(2.1)

The annual geometric mean fish density (V) provides a link between the trawl observations (Z) and the unobserved fish density (X):

(2.2)

allowing X to be flexible at each station (j) but constrained for each year (across all the stations Jy; see more in discussion why this flexibility is beneficial).

ddPCR model compartment (MC 2)

Let S be the DNA concentration (units: copies/µl) from standards and C be the latent DNA concentration (Equation 1.2). The number of positive droplets (W) in ddPCR follows a binomial distribution (Guri et al. 2024, submitted), where the probability of a droplet being positive (ω) is a linear function of loge DNA concentration (S or C) with an intercept (β0) and slope (β1) using clog-log (link-function), given the total number of droplets generated (U) expressed as:

(3.1.1)
(3.1.2)
(3.1.3)
(3.1.4)

Metabarcoding model compartment (MC 3)

Based on Shelton et al. (2023) we established that the number of observed reads (Y) from metabarcoding is a draw from a multinomial distribution given the proportions for each species ψ and the total number of reads per sample R:

(4.1.1)
(4.1.2)

Equations 4.2.1 and 4.2.2 link proportions ψ for each species i to observations, via a softmax transformation of the ratios of species abundances γ.

(4.2.1)
(4.2.2)

The known initial concentrations of all species i from mock community samples is converted into additive log ratios (alri) relative to a reference species of choice (G. morhua in this study, see Shelton et al. 2023), using the following equation:

(4.3.1)

where NPCR is the number of PCR cycles used to amplify the amplicons, α is the species-specific amplification efficiency (again, expressed as a log-ratio relative to that of a reference species; see Shelton et al. 2023), and η is the parameter that allows for overdispersion in the counts beyond the variability provided by the multinomial distribution, capturing the substantial variance among replicates often observed in metabarcoding data.

(4.4.1)
(4.4.2)

Having estimated the species-specific amplification efficiencies in Equation 4.3.1, we can derive the absolute DNA quantities for each species, station, and year (Cijy), given the known concentration of the reference species (Ci=ref, G. morhua) estimated in Equation 3.2.2 for the same samples.

(4.3.2)

Note that because we have information about the absolute concentration of DNA from some species the term alr in Equation 4.3.1 can be replaced by the explicit ratio (difference in log-space) of DNA concentrations (eq. 4.3.2).

Joining of the model compartments

Using Equations 1.1 and 1.2, we can derive the relationship between unobserved fish density and the eDNA concentration as following:

(5.1)

Without additional information, we can estimate neither the catchability (q) nor the integrated eDNA factor in the marine environment (λ; these two parameters are unidentifiable). Therefore, we introduce a new parameter (θ) that encapsulates both the integrated eDNA factor and catchability,

(5.2)

For species detected by both trawl and eDNA we treat unobserved fish density (X) as a function of eDNA concentration (C) and the new parameter (θ), which is interpreted as the conversion factor between fish density observed by trawl toward eDNA concentrations (units: fish/km2/copies/µl):

(5.3)

By extension of Equation 5.2 and 5.3, where a species is observed in only one data stream, indicating no information on λ or q, θ inferences are uncertain (see the “Results” section, Fig. 3b and d). The integrated eDNA factor (λ) is a parameter that encapsulates multiple biological processes such as DNA shedding and degradation rate, DNA transport and dilution, and DNA isolation and extraction efficiency (θ, φ, ψ, and ξ parameters in Shelton et al. 2016).

The joint model (Supplementary Fig. S2) was implemented using the Stan language as implemented in R (package: Rstan) running four independent MCMC chains using 4000 warmup and 6000 sampling iterations (Table 2 for parameters and their priors). The posterior predictions were diagnosed using |$\hat{R}$| statistics (Gelman and Rubin 1992) and considered convergence for |$\hat{R}$| values less than 1.1 and effective sample size (ESS) greater than 1000 for all parameters.

Table 2.

Data, state processes, parameters, and subscripts employed in the joint Bayesian model and their prior distributions (N stands for normal distribution with mean and standard deviation; Γ stands for gamma distribution with shape and rate).

DescriptionPrior
Data
Zobserved number trawl catch countN/A
Etrawling effort estimated in km2N/A
Wpositive droplets observed through ddPCRN/A
Ptotal number of droplets accepted by ddPCRN/A
Sknown DNA concentration in c/µlN/A
Yobserved number of reads through metabarcodingN/A
Rtotal number of reads for all I speciesN/A
alradditive-log-ratio of initial concentration for all I species relative to the reference species prior to sequencingN/A
NPCRnumber of PCR cycles runN/A
State processes
Ntrue (unobserved) fish density in log of fish count/km2Not estimated
Xunobserved estimated trawled fish density in log of fish count/km2N(0,10)
Cestimated eDNA concentration in log of copies/µlN(0,10)
Vtrawled fish density in log of fish count/km2 (averaged across stations)N/A
Parameters
θconversion factor between trawled fish and eDNA concentrationN(0,10)
φnegative binomial distribution overdispersion parameterΓ(50,1)
β0intercept of the linear relation between positive droplets and DNA concentrationN(0,10)
β1regression slope of the linear relation between positive droplets and DNA concentrationN(0,10)
αamplification efficiencyN(0,0.01)
ηmultinomial distribution overdispersion parameterN(0, τ)
τstandard deviation parameter for ηΓ(100,1000)
ψvector of probabilities for multinomial distributionN/A
γvector the log-concentration of all I species relative to the reference species after sequencingN/A
λintegrated eDNA factorN/A
qtrawl catchabilityN/A
Subscripts
ispecies (n = 7)
jstation (n = 4)
yyear (n = 3)
ddepth (n = 3)
bbiological replicate (n = 2 for year 2019 and 2020, n = 3 for 2021)
rtechnical replicate (n = 3 for ddPCR runs)
pstandard aliquote sample (n = 160)
mmock community aliquote sample (n = 6)
DescriptionPrior
Data
Zobserved number trawl catch countN/A
Etrawling effort estimated in km2N/A
Wpositive droplets observed through ddPCRN/A
Ptotal number of droplets accepted by ddPCRN/A
Sknown DNA concentration in c/µlN/A
Yobserved number of reads through metabarcodingN/A
Rtotal number of reads for all I speciesN/A
alradditive-log-ratio of initial concentration for all I species relative to the reference species prior to sequencingN/A
NPCRnumber of PCR cycles runN/A
State processes
Ntrue (unobserved) fish density in log of fish count/km2Not estimated
Xunobserved estimated trawled fish density in log of fish count/km2N(0,10)
Cestimated eDNA concentration in log of copies/µlN(0,10)
Vtrawled fish density in log of fish count/km2 (averaged across stations)N/A
Parameters
θconversion factor between trawled fish and eDNA concentrationN(0,10)
φnegative binomial distribution overdispersion parameterΓ(50,1)
β0intercept of the linear relation between positive droplets and DNA concentrationN(0,10)
β1regression slope of the linear relation between positive droplets and DNA concentrationN(0,10)
αamplification efficiencyN(0,0.01)
ηmultinomial distribution overdispersion parameterN(0, τ)
τstandard deviation parameter for ηΓ(100,1000)
ψvector of probabilities for multinomial distributionN/A
γvector the log-concentration of all I species relative to the reference species after sequencingN/A
λintegrated eDNA factorN/A
qtrawl catchabilityN/A
Subscripts
ispecies (n = 7)
jstation (n = 4)
yyear (n = 3)
ddepth (n = 3)
bbiological replicate (n = 2 for year 2019 and 2020, n = 3 for 2021)
rtechnical replicate (n = 3 for ddPCR runs)
pstandard aliquote sample (n = 160)
mmock community aliquote sample (n = 6)

N/A stands for data or processes that do not require prior distributions.

Table 2.

Data, state processes, parameters, and subscripts employed in the joint Bayesian model and their prior distributions (N stands for normal distribution with mean and standard deviation; Γ stands for gamma distribution with shape and rate).

DescriptionPrior
Data
Zobserved number trawl catch countN/A
Etrawling effort estimated in km2N/A
Wpositive droplets observed through ddPCRN/A
Ptotal number of droplets accepted by ddPCRN/A
Sknown DNA concentration in c/µlN/A
Yobserved number of reads through metabarcodingN/A
Rtotal number of reads for all I speciesN/A
alradditive-log-ratio of initial concentration for all I species relative to the reference species prior to sequencingN/A
NPCRnumber of PCR cycles runN/A
State processes
Ntrue (unobserved) fish density in log of fish count/km2Not estimated
Xunobserved estimated trawled fish density in log of fish count/km2N(0,10)
Cestimated eDNA concentration in log of copies/µlN(0,10)
Vtrawled fish density in log of fish count/km2 (averaged across stations)N/A
Parameters
θconversion factor between trawled fish and eDNA concentrationN(0,10)
φnegative binomial distribution overdispersion parameterΓ(50,1)
β0intercept of the linear relation between positive droplets and DNA concentrationN(0,10)
β1regression slope of the linear relation between positive droplets and DNA concentrationN(0,10)
αamplification efficiencyN(0,0.01)
ηmultinomial distribution overdispersion parameterN(0, τ)
τstandard deviation parameter for ηΓ(100,1000)
ψvector of probabilities for multinomial distributionN/A
γvector the log-concentration of all I species relative to the reference species after sequencingN/A
λintegrated eDNA factorN/A
qtrawl catchabilityN/A
Subscripts
ispecies (n = 7)
jstation (n = 4)
yyear (n = 3)
ddepth (n = 3)
bbiological replicate (n = 2 for year 2019 and 2020, n = 3 for 2021)
rtechnical replicate (n = 3 for ddPCR runs)
pstandard aliquote sample (n = 160)
mmock community aliquote sample (n = 6)
DescriptionPrior
Data
Zobserved number trawl catch countN/A
Etrawling effort estimated in km2N/A
Wpositive droplets observed through ddPCRN/A
Ptotal number of droplets accepted by ddPCRN/A
Sknown DNA concentration in c/µlN/A
Yobserved number of reads through metabarcodingN/A
Rtotal number of reads for all I speciesN/A
alradditive-log-ratio of initial concentration for all I species relative to the reference species prior to sequencingN/A
NPCRnumber of PCR cycles runN/A
State processes
Ntrue (unobserved) fish density in log of fish count/km2Not estimated
Xunobserved estimated trawled fish density in log of fish count/km2N(0,10)
Cestimated eDNA concentration in log of copies/µlN(0,10)
Vtrawled fish density in log of fish count/km2 (averaged across stations)N/A
Parameters
θconversion factor between trawled fish and eDNA concentrationN(0,10)
φnegative binomial distribution overdispersion parameterΓ(50,1)
β0intercept of the linear relation between positive droplets and DNA concentrationN(0,10)
β1regression slope of the linear relation between positive droplets and DNA concentrationN(0,10)
αamplification efficiencyN(0,0.01)
ηmultinomial distribution overdispersion parameterN(0, τ)
τstandard deviation parameter for ηΓ(100,1000)
ψvector of probabilities for multinomial distributionN/A
γvector the log-concentration of all I species relative to the reference species after sequencingN/A
λintegrated eDNA factorN/A
qtrawl catchabilityN/A
Subscripts
ispecies (n = 7)
jstation (n = 4)
yyear (n = 3)
ddepth (n = 3)
bbiological replicate (n = 2 for year 2019 and 2020, n = 3 for 2021)
rtechnical replicate (n = 3 for ddPCR runs)
pstandard aliquote sample (n = 160)
mmock community aliquote sample (n = 6)

N/A stands for data or processes that do not require prior distributions.

To test the robustness of the parameters and the ability to use eDNA observations to estimate fish density, we conducted an out-of-sample analysis using two years of data as a training subset (part I) and the remaining year as a test subset (part II). In part I, the model estimated the parameters (θ, φ, β0, β1, α, η, ψ, γ, and τ) using observations (trawl, ddPCR, and metabarcoding) from only two years. Subsequently, for part II, we predicted fish density in the third year using the estimated parameters from part I in the trawl model (MC 1) and the eDNA model (MC 2 and 3). We defined delta (δ) as the difference between natural log fish density estimated by eDNA models and trawl models thus the log-fold difference for the out-of-sample year. Delta values closer to 0 indicate that trawl observations were well predicted using the eDNA model (or vice versa). This process was iterated three times separately with a different year data left out each time.

We included the seven species in the Bayesian model for which we had joint presence between trawl surveys or eDNA observations and ddPCR standard or mock community samples (Supplementary Table S5).

Results

The joint model successfully linked disparate datasets in a comprehensive and quantitative framework, yielding both meaningful estimates of fish abundances and derived parameters of substantial value. The posterior distribution of model parameters resulted in |$\hat{R}$| values <1.005 indicating that the chains converged to a common distribution, ensuring the stability of the outcomes. Additionally, the model had weak autocorrelation thus indicating efficient mixing as the effective sample sizes (ESS) were larger than ca. 2000. Posterior summaries of parameters together with |$\hat{R}$| and ESS can be found in Supplementary Table S6. The results regarding trawl catches, metabarcoding, ddPCR and mock community samples can be found in Supplementary Material S1.

Posterior predictive checks consistently indicated substantial agreement between predicted and observed data across all three observation methods: ddPCR, metabarcoding, and trawl surveys (Supplementary Fig. S3 a, b, and c, respectively). This result indicated the model’s capacity to accurately capture the characteristics of the observed data and their ability to extrapolate meaningful predictions forecasting future outcomes and deepening the understanding of the connections between these three observation methods.

The DNA concentration data generated by the eDNA models (ddPCR and metabarcoding) indicated a strong correlation with the unobserved fish density (X) on the trawl data (Fig. 2) given the species-specific conversion factor (θ). Such correlation indicated the reliability of the parameters (θ, β0, β1, α, η, and τ) in accurately representing the intricate relationships between the underlying processes and the observed data. This empirical evidence reinforced the dependency of the species-specific conversion factor (θ; Supplementary Table S7) for linking eDNA observation with other conventional methods (i.e. trawl catches) as the explanatory variable for underlying biological mechanisms between organisms and their shed and captured DNA.

The joint Bayesian model workflow translating metabarcoding reads (lower left corner) and ddPCR droplets (lower right corner) into fish density (expressed as DNA concentration * conversion factor) and their correlation with the unobserved fish density (X; indicated as crosses) and estimated mean yearly fish density from trawl observations (V; indicated as filled dots) in the y-axis. C = DNA concentration and θ = conversion factor between DNA concentrations to fish density. Metabarcoding and ddPCR models indicated here are model compartments within the joint Bayesian model. The plot indicates model fit and reliable parameter estimation for linking trawl and eDNA observations.
Figure 2.

The joint Bayesian model workflow translating metabarcoding reads (lower left corner) and ddPCR droplets (lower right corner) into fish density (expressed as DNA concentration * conversion factor) and their correlation with the unobserved fish density (X; indicated as crosses) and estimated mean yearly fish density from trawl observations (V; indicated as filled dots) in the y-axis. C = DNA concentration and θ = conversion factor between DNA concentrations to fish density. Metabarcoding and ddPCR models indicated here are model compartments within the joint Bayesian model. The plot indicates model fit and reliable parameter estimation for linking trawl and eDNA observations.

The model output of the conversion factor (θ) ranged considerably from e5.95 for C. harengus to e30 for P. platessa (with θ fish/km2 equalling 1 copy/µl observed in a sample extract; Table 3). The posterior distribution of the conversion factor differed significantly between the remaining species and varied between e8 and e14 fish/km2 per copies/µl (Table 3). We noted that we observed no sequences of P. platessa in the metabarcoding data (Supplementary Table S7), thus the θ values for that species was estimated at the lowest boundary possible by the model. Similarly, due to extremely small catch counts, the conversion parameter of C. lumpus was hard for the model to estimate, thus a large standard deviation of the parameter's posterior distribution was observed. Rather than being reasonable estimates of θ, these two species instead represented lack-of-data scenarios in which insufficient information–in one case, no sequencing observations, and in the other, no trawl observations–results in unreasonable estimates. We included them specifically to illustrate some of the pitfalls and model prerequisites with respect to observation data.

Table 3.

Species-specific conversion parameter (θ) estimated by the joint model (10 000 iterations), with the mean, standard deviation (SD), and 95% credibility intervals (CI).

Log(θ) (fish/km2/copies/µl)
SpeciesEstimateSD95% CILog(C)Log(X)
Species(mean)(lower and upper)(copies/µl)(fish/km2)
Gadus morhua8.090.098.068.40-2.046.20
Clupea harengus5.950.135.696.22-1.494.76
Mallotus villosus12.960.1412.6713.24-3.6110.20
Cyclopterus lumpus8.931.136.3310.72-9.210.40
Leptoclinus maculatus14.560.1714.2214.89-9.085.76
Hippoglossoides platessoides12.280.1512.0012.57-2.589.73
Pleuronectes platessa30.843.9024.4339.70-27.403.89
Log(θ) (fish/km2/copies/µl)
SpeciesEstimateSD95% CILog(C)Log(X)
Species(mean)(lower and upper)(copies/µl)(fish/km2)
Gadus morhua8.090.098.068.40-2.046.20
Clupea harengus5.950.135.696.22-1.494.76
Mallotus villosus12.960.1412.6713.24-3.6110.20
Cyclopterus lumpus8.931.136.3310.72-9.210.40
Leptoclinus maculatus14.560.1714.2214.89-9.085.76
Hippoglossoides platessoides12.280.1512.0012.57-2.589.73
Pleuronectes platessa30.843.9024.4339.70-27.403.89

For perspective, the mean values of unobserved fish density (X) and eDNA concentration (C) across stations and years are shown. Values are presented in natural logarithm scale for easier interpretation hence log(X) = log(C) + log(θ) (from Equation 5.3).

Table 3.

Species-specific conversion parameter (θ) estimated by the joint model (10 000 iterations), with the mean, standard deviation (SD), and 95% credibility intervals (CI).

Log(θ) (fish/km2/copies/µl)
SpeciesEstimateSD95% CILog(C)Log(X)
Species(mean)(lower and upper)(copies/µl)(fish/km2)
Gadus morhua8.090.098.068.40-2.046.20
Clupea harengus5.950.135.696.22-1.494.76
Mallotus villosus12.960.1412.6713.24-3.6110.20
Cyclopterus lumpus8.931.136.3310.72-9.210.40
Leptoclinus maculatus14.560.1714.2214.89-9.085.76
Hippoglossoides platessoides12.280.1512.0012.57-2.589.73
Pleuronectes platessa30.843.9024.4339.70-27.403.89
Log(θ) (fish/km2/copies/µl)
SpeciesEstimateSD95% CILog(C)Log(X)
Species(mean)(lower and upper)(copies/µl)(fish/km2)
Gadus morhua8.090.098.068.40-2.046.20
Clupea harengus5.950.135.696.22-1.494.76
Mallotus villosus12.960.1412.6713.24-3.6110.20
Cyclopterus lumpus8.931.136.3310.72-9.210.40
Leptoclinus maculatus14.560.1714.2214.89-9.085.76
Hippoglossoides platessoides12.280.1512.0012.57-2.589.73
Pleuronectes platessa30.843.9024.4339.70-27.403.89

For perspective, the mean values of unobserved fish density (X) and eDNA concentration (C) across stations and years are shown. Values are presented in natural logarithm scale for easier interpretation hence log(X) = log(C) + log(θ) (from Equation 5.3).

The out-of-sample analysis showed that molecular data alone reliably predicted trawl catches (Fig. 3a and c). Five of seven species demonstrated a strong correlation between the predicted and observed fish densities, thus resulting in small δ-values (Fig. 3a and c). Among these, four species (C. harengus, G. morhua, L. maculatus, and M. villosus) resulted in δ-values of less than, or equal to, one for each predicted year, underscoring the robustness of the parameters governing the underlying relationships. Notably, H. platessoides displayed greater variability than the other four species, thus suggested slightly less robust and reproducible parameters among years. This variation may be attributed to an outlier year, although it was challenging to definitively ascertain this due to the limited dataset of n = 3 years in total (and therefore n = 2 years in training-data subsets). Increased availability of coupled data (trawl and eDNA) could enhance the robustness of θ and δ parameter estimation.

Leave-one-year-out analysis indicating the correlation between the predicted (x-axis) and observed fish densities (y-axis; a and b). The species-specific posterior distribution of δ variable (± 95% confidence interval of the posterior distribution) indicates the difference between predicted and observed fish densities (c and d). Due to the infrequent presence in the observed data species Cyclopterus lumpus and Pleuronectes platessa are considered outliers and thus presented separately (b and d).
Figure 3.

Leave-one-year-out analysis indicating the correlation between the predicted (x-axis) and observed fish densities (y-axis; a and b). The species-specific posterior distribution of δ variable (± 95% confidence interval of the posterior distribution) indicates the difference between predicted and observed fish densities (c and d). Due to the infrequent presence in the observed data species Cyclopterus lumpus and Pleuronectes platessa are considered outliers and thus presented separately (b and d).

The remaining two species, C. lumpus and P. platessa, offered a useful illustration: when insufficient data were available, no meaningful inferences could be made. These species showed a notable average discrepancy of two orders of magnitude between the eDNA model-predicted fish densities and those derived from trawl observations (Fig. 3b and d). Furthermore, the mean within-year-out variability spanned over eight orders of magnitude, signifying the data insufficiency in either the trawl or the eDNA modules yielded challenges in estimating parameters.

Discussion

The current study has successfully bridged the gap between eDNA and trawl surveys, quantifying their bias alongside the biological relationships (jointly) between fish abundance and DNA concentration in the marine environment. Even more significantly, this bridge has been extended to multiple species, thereby enhancing the versatility and applicability of eDNA methods for stock assessments and marine biological monitoring. Extensive discussion regarding the outputs of metabarcoding, ddPCR and mock community can be found in Supplementary Material S2.

Biological interpretation of the model

Given the inextricable nature of DNA shedding and degradation, transport and dilution, and recovery and isolation processes, we unified them under the comprehensive term “integrated DNA factor–λ” in this study (framework in Shelton et al. 2016). We emphasize that the processes of catchability (q) and the integrated DNA factor (λ) are not assessed through this model but have been circumvented using the trawl to DNA conversion parameter (θ). Below, we discuss how these processes biologically interplay with the latent parameters estimated in this model.

The strong replicability of θ over years (for species with available data on both survey methods), as indicated by small values of δ in leave-one-out analysis, demonstrates that trawl metabarcoding and ddPCR observations yielded measurable and reliable results using the established model. Replicable results support the assumptions that trawl catchability (q) and the integrated DNA factor (λ) have remained constant throughout the years of survey period. Our rationale for assuming constant q is rooted in the standardized nature of the annual trawl surveys conducted by the IMR (Institute of Marine Research) and λ in similar environmental and biological factors (Zhang et al. 2020) during the sampling period.

The joint modelling of trawl and eDNA observations allows us to enhance the spatial resolution of fish densities. Following established research protocols, we model fish density as the annual average (V), based on four trawling catch count observations (assuming homogenous fish densities among stations), given the trawling effort. This approach becomes necessary because trawls at a specific location lack replication thus catches are commonly averaged over larger survey areas (Beare et al. 2005). Conversely, when we complement trawl observations using eDNA data, we gain the ability to develop models at the site level (X) and potentially future models at the depth level, rather than restricting eDNA analysis to the large aggregated, fjord level in our study. This flexibility offers a twofold advantage. First, it enables the ability to incorporate replicates of observation at different unit levels (site, area, station, and depth) thus increasing the accuracy of estimates. Second, it enables the model to establish the link between eDNA concentrations and fish densities without requiring a fixed relationship at every individual station, a factor that has been the downfall of previous studies attempting to establish this connection (Pont et al. 2023; Veron et al. 2023). This matches our understanding of the ecology of eDNA, in which DNA can be transported among stations (Rourke et al. 2022). This model advantage is thus particularly important as it mitigates issues related to lateral transport and dilution of eDNA (Andruszkiewicz et al. 2019), corresponding in maintaining the consistency of λ. Additionally, factors of eDNA recovery and isolation efficiency must have remained the same as the sampling technique and the laboratory workflow remained unchanged among the entire set of samples. Lastly, since all observations were conducted in October of different years, we posit that the behaviours of the fish population must have remained largely consistent, as did the relevant abiotic factors, both responsible for DNA shedding and degradation (Goldberg et al. 2011, Strickler et al. 2015, Jo et al. 2019, Mauvisseau et al. 2022). These results lead us to conclude that the integrated DNA factor (λ) has likely remained unchanged, particularly for species with small δ values.

For species with large δ values, several factors may explain variations of the θ parameter among years, which in turn could indicate either changes in q or λ between those years of observation. In the case of C. lumpus, the metabarcoding data suggest relatively small quantities, but trawl surveys observed minimal to no catches, suggesting small fish densities, or consequently pointing to an exceptionally small catchability term. This catchability term has been previously observed and indicated that bottom trawl surveys rarely catch C. lumpus, especially at depths greater than 60 m (Kennedy et al. 2016). Conversely pelagic trawls have been indicating larger catch rates of such species (Eriksen et al. 2014). Consequently, an unidentifiable catchability term can hinder the parametrization of θ. Conversely, flatfish species like H. platessoides and P. platessa exhibit strong catchability due to their preference for sandy and muddy bottoms. Trawl observations revealed relatively large fish densities for these species. However, in the case of P. platessa, no reads were observed through metabarcoding data, indicating an apparent absence of eDNA in the water, thus rendering the λ term ambiguous (hence also θ). An additional cause to this could be our filtering pipeline, as we removed amplicons with less than 50 reads, thus considered rare species (Jerde et al. 2016). Nevertheless, H. platessoides exhibited substantial read numbers and relatively large eDNA concentrations, but its δ values were substantial, indicating variation in the θ parameter among years of observation. This presents a more complex problem, and additional observations may be necessary to identify the source of this issue. We hypothesize that two potential errors could contribute to this problem. First, there may be an abnormality in the catchability terms in one of the years, making the outlier year unidentifiable given only three years of observations. Second, water samples instead of sediment samples could introduce biases for benthic species in the processes encompassed by the integrated DNA factor, such as DNA vertical transport and DNA recovery.

The variability among species highlights a general point that biological knowledge matters for the interpretation and use of eDNA observations, and that the use of eDNA for biomass estimation is likely species- and context-specific. Additional applications using eDNA information need to be developed to understand the kinds and magnitudes of uncertainty associated with using eDNA information in different contexts and with different species. Having a deeper understanding of these parameters and estimating their empirical value can enhance our biological understanding of the molecular sampling tools, thus boosting the applicability of eDNA to fisheries management and stock assessments (Jo et al. 2019). However, our work provides a roadmap for moving away from simple sample-to-sample correlations among sampling methods and toward a method that can inform and improve management and conservation actions. In this study, the specific conversion parameters are associated with the catchability of bottom trawling (the method’s inherent bias), and therefore, their applicability cannot be broadly extended to other methods or study areas. However, this statistical framework can be adapted to other sampling methods.

The future of eDNA and the trawl joint model

To our knowledge, this is the first empirical demonstration of species-specific conversion factors between eDNA metabarcoding and bottom trawl datasets. Future refinements should expand upon the existing model to include alternative sampling techniques and further generalize the framework and parameters we have presented here.

One potential area for improvement is the external estimation of trawl catchability (q; Fraser et al. 2007), which could lead to a more robust conversion parameter θ and also inform estimates of the integrated DNA factor (λ). Making λ a function of biological or environmental variables could allow for complex relationships between the trawl catchability and λ. However, exploring alternative forms is beyond the scope of this paper. One aspect of interest is understanding the effect sample depth, which could affect spatial- and species-specific DNA recovery and inform apparent differences between eDNA and trawl sampling (Jeunen et al. 2019, Canals et al. 2021). A further point of interest is accounting for the potential influence of allometric scaling on DNA shedding, metabolic rate, consumption, and excretion, and how potential interactions with environmental factors may affect the interpretation of eDNA survey results (Urban et al. 2023). An understanding of the effect of allometric scaling seems likely to be important for enabling molecular tools to estimate the size distribution of populations in marine environments (Yates et al. 2022). This could significantly further enhance the application of eDNA surveys into stock and population assessments.

The model can also be adapted to accommodate alternative monitoring techniques and data inputs. For example, catch-release, acoustic surveys, and trawl biomass data could either replace or be used in conjunction with the trawl model by adding another model compartment similarly to the presented model compartments (MC1, MC2, and MC3). Incorporating these additional methods would expand the range of available data collection techniques and provide further insight into the biological processes that underpin the molecular tools used in ecological assessments. Although this may alter the interpretation of parameters such as θ or λ, it would contribute to a more comprehensive understanding of their underlying biology. Our work shows how eDNA can inform and be included in the class of integrated analyses which are the foundation for modern stock assessments and fisheries management (Maunder and Punt 2013).

In summary, the future of this model holds significant potential for development and improvement. By meticulously documenting and mapping all parameters used in the model, along with the specific biological processes they represent, we can increase transparency and facilitate comparisons with other studies. This, in turn, enhances the replicability and reliability of surveys, a capability that conventional surveys alone cannot offer. Ultimately, these advances will contribute to the development of molecular sampling tools and their applicability in fisheries management and stock assessments.

Management applications

Trawl surveys provide consistent time-series data essential for stock and ecosystem assessments. However, the reliability and continuity of these time series can be influenced by various factors. Area accessibility (i.e. closed areas for conservation or energy transition), the necessity to explore new areas as fish stocks migrate, funding limitations, vessel availability, and adverse weather conditions can all impact the consistency of data collection. In response to these challenges, embracing innovative molecular tools and coupling them with conventional methods can provide more flexibility and resilience to changing circumstances while significantly reducing the ecological impact and maximizing the efficiency of surveys (Schneider et al. 2016, Di Muri et al. 2020, Veron et al. 2023). This strategic fusion can enhance the robustness and adaptability of survey programs, thereby strengthening our capacity for effective stock and ecosystem assessments in an ever-changing marine environment.

Conclusions

This study establishes the framework for species quantification and community composition of eDNA metabarcoding. We stress the need for species-specific conversion parameters to accurately estimate species abundances from eDNA metabarcoding. Furthermore, accounting for amplification efficiencies is necessary for ensuring the accuracy and replicability of metabarcoding data. These insights hold considerable implications for the application of eDNA in the management and conservation of marine ecosystems.

Acknowledgements

We thank all the technicians at IMR for the contribution during the field work and Agneta Hansen for the contribution during the field and lab work.

Author contributions

G.G., T.J., N.Y., J.I.W., R.K., J.R., A.S., T.H., J.F., and K.P. conceptualized the study; G.G., T.H., and J.I.W. conducted the fieldwork; G.G. and T.H. did the lab work; G.G., T.H., and J.I.W. did the bioinformatic analyses; G.G., A.S., and R.K. did the statistical analyses; G.G. with support from T.J., R.K., J.R., J.F., A.S., J.I.W., N.Y., and K.P. interpreted the statistical and ecological results; G.G. wrote the manuscript draft with contributions from all co-authors, and all authors approved the submitted version.

Conflict of interest

Authors declare no conflict of interest.

Funding

This research was supported by the project FISHDIV (NRC301691) funded by the Norwegian Research Council (NRC) and the program “Marine Ecosystem processes” at Institute of Marine Research.

Data availability

The data and the model’s script used in this article are available in Zenodo repository. (Dataset reference https://zenodo.org/records/12721619).

References

Afzali
 
SF
,
Bourdages
 
H
,
Laporte
 
M
 et al.  
Comparing environmental metabarcoding and trawling survey of demersal fish communities in the Gulf of St. Lawrence, Canada
.
Environmental DNA
.
2021
;
3
:
22
42
.

Allan
 
EA
,
Kelly
 
RP
,
D'Agnese
 
ER
 et al.  
Quantifying impacts of an environmental intervention using environmental DNA
.
Ecol Appl
.
2023
;
33
:
e2914
.

Andruszkiewicz
 
EA
,
Koseff
 
JR
,
Fringer
 
OB
 et al.  
Modeling environmental DNA transport in the coastal ocean using lagrangian particle tracking
.
Front Mar Sci
.
2019
;
6
:
477
.

Atienza
 
S
,
Guardiola
 
M
,
Præbel
 
K
 et al.  
DNA metabarcoding of deep-sea sediment communities using COI: community assessment, spatio-temporal patterns and comparison with the 18S rDNA marker
.
Diversity
.
2020
;
12
:
123
.

Beare
 
DJ
,
Needle
 
CL
,
Burns
 
F
 et al.  
Using survey data independently from commercial data in stock assessment: an example using haddock in ICES Division VIa
.
ICES J Mar Sci
.
2005
;
62
:
996
1005
.

Biju Kumar
 
A
,
Deepthi
 
GR
.
Trawling and by-catch: implications on marine ecosystem
.
Curr Sci
.
2006
;
90
:
922
31
.

Callaghan
 
CT
,
Nakagawa
 
S
,
Cornwell
 
WK
.
Global abundance estimates for 9,700 bird species
.
Proc Natl Acad Sci
.
2021
;
118
:
e2023170118
.

Canals
 
O
,
Mendibil
 
I
,
Santos
 
M
 et al.  
Vertical stratification of environmental DNA in the open ocean captures ecological patterns and behavior of deep-sea fishes
.
Limnol Oceanogr Lett
.
2021
;
6
:
339
47
.

Closek
 
CJ
,
Santora
 
JA
,
Starks
 
HA
 et al.  
Marine vertebrate biodiversity and distribution within the central California Current using environmental DNA (eDNA) metabarcoding and ecosystem surveys
.
Front Mar Sci
.
2019
;
6
:
732
.

Crossin
 
GT
,
Heupel
 
MR
,
Holbrook
 
CM
 et al.  
Acoustic telemetry and fisheries management
.
Ecol Appl
.
2017
;
27
:
1031
49
.

Di Muri
 
C
,
Handley
 
LL
,
Bean
 
CW
 et al.  
Read counts from environmental DNA (eDNA) metabarcoding reflect fish abundance and biomass in drained ponds
.
Metabarcod Metagenom
.
2020
;
4
:
e56959
.

Eigaard
 
OR
,
Bastardie
 
F
,
Hintzen
 
NT
 et al.  
The footprint of bottom trawling in European waters: distribution, intensity, and seabed integrity
.
ICES J Mar Sci
.
2017
;
74
:
847
65
.

Eriksen
 
E
,
Durif
 
CMF
,
Prozorkevich
 
D
.
Lumpfish (Cyclopterus lumpus) in the Barents Sea: development of biomass and abundance indices, and spatial distribution
.
ICES J Mar Sci
.
2014
;
71
:
2398
402
.

Farr
 
MT
,
O'Brien
 
T
,
Yackulic
 
CB
 et al.  
Quantifying the conservation status and abundance trends of wildlife communities with detection–nondetection data
.
Conserv Biol
.
2022
;
36
:
e13934
.

Fraser
 
HM
,
Greenstreet
 
SPR
,
Piet
 
GJ
.
Taking account of catchability in groundfish survey trawls: implications for estimating demersal fish biomass
.
ICES J Mar Sci
.
2007
;
64
:
1800
19
.

Gelman
 
A
,
Rubin
 
DB
.
Inference from iterative simulation using multiple sequences
.
Statist Sci
.
1992
;
7
:
457
72
.

Gloor
 
GB
,
Macklaim
 
JM
,
Pawlowsky-Glahn
 
V
 et al.  
Microbiome datasets are compositional: and this is not optional
.
Front Microbiol
.
2017
;
8
:
2224
.

Gold
 
Z
,
Shelton
 
AO
,
Casendino
 
HR
 et al.  
Signal and noise in metabarcoding data
.
PLoS One
.
2023
;
18
:
e0285674
.

Goldberg
 
CS
,
Pilliod
 
DS
,
Arkle
 
RS
 et al.  
Molecular detection of vertebrates in stream water: molecular detection of vertebrates in stream water: a demonstration using Rocky Mountain tailed frogs and Idaho giant salamanders
.
PLoS One
.
2011
;
6
:
e22746
.

Guri
 
G
,
Shelton
 
AO
,
Kelly
 
RP
 et al.  
Quantifying the sensitivity between qPCR and ddPCR mechanisms for eDNA samples
.
Ecol Evol
.
2024
;
(under review)
.

Guri
 
G
,
Westgaard
 
J
,
Yoccoz
 
N
 et al.  
Maximizing sampling efficiency to detect differences in fish community composition using environmental DNA metabarcoding in subarctic fjords
.
Environmental DNA
.
2024
;
6
:
e409
.

Hansen
 
BK
,
Bekkevold
 
D
,
Clausen
 
LW
 et al.  
The sceptical optimist: challenges and perspectives for the application of environmental DNA in marine fisheries
.
Fish Fisher
.
2018
;
19
:
751
68
.

He
 
X
,
Jeffery
 
NW
,
Stanley
 
RRE
 et al.  
eDNA metabarcoding enriches traditional trawl survey data for monitoring biodiversity in the marine environment
.
ICES J Mar Sci
.
2023
;
80
:
1529
38
.

Hill
 
NA
,
Barrett
 
N
,
Lawrence
 
E
 et al.  
Quantifying fish assemblages in large, offshore marine protected areas: an Australian case study
.
PLoS One
.
2014
;
9
:
e110831
.

Jerde
 
CL
,
Olds
 
BP
,
Shogren
 
AJ
 et al.  
Influence of stream bottom substrate on retention and transport of vertebrate environmental DNA
.
Environ Sci Technol
.
2016
;
50
:
8770
9
.

Jetz
 
W
,
McPherson
 
JM
,
Guralnick
 
RP
.
Integrating biodiversity distribution knowledge: toward a global map of life
.
Trends Ecol Evol
.
2012
;
27
:
151
9
.

Jeunen
 
GJ
,
Knapp
 
M
,
Spencer
 
HG
 et al.  
Environmental DNA (eDNA) metabarcoding reveals strong discrimination among diverse marine habitats connected by water movement
.
Mol Ecol Resour
.
2019
;
19
:
426
38
.

Jo
 
T
,
Murakami
 
H
,
Yamamoto
 
S
 et al.  
Effect of water temperature and fish biomass on environmental DNA shedding, degradation, and size distribution
.
Ecol Evol
.
2019
;
9
:
1135
46
.

Kennedy
 
J
,
Jónsson
 
,
Ólafsson
 
HG
 et al.  
Observations of vertical movements and depth distribution of migrating female lumpfish (Cyclopterus lumpus) in Iceland from data storage tags and trawl surveys
.
ICES J Mar Sci
.
2016
;
73
:
1160
9
.

Knudsen
 
SW
,
Ebert
 
RB
,
Hesselsøe
 
M
 et al.  
Species-specific detection and quantification of environmental DNA from marine fishes in the Baltic Sea
.
J Exp Mar Biol Ecol
.
2019
;
510
:
31
45
.

Lacoursière-Roussel
 
A
,
Côté
 
G
,
Leclerc
 
V
 et al.  
Quantifying relative fish abundance with eDNA: a promising tool for fisheries management
.
J Appl Ecol
.
2016
;
53
:
1148
57
.

Larson
 
WA
,
Barry
 
P
,
Dokai
 
W
 et al.  
Leveraging eDNA metabarcoding to characterize nearshore fish communities in Southeast Alaska: do habitat and tide matter?
.
Environ DNA
.
2022
;
4
:
868
80
.

Lees
 
KJ
,
MacNeil
 
MA
,
Hedges
 
KJ
 et al.  
Estimating demographic parameters for fisheries management using acoustic telemetry
.
Rev Fish Biol Fisher
.
2021
;
31
:
25
51
.

Maes
 
SM
,
Desmet
 
S
,
Brys
 
R
 et al.  
Detection and quantification of two commercial flatfishes (Solea solea and Pleuronectes platessa) in the North Sea using environmental DNA
.
Environ DNA
.
2023
;
6
:
426
.

Mahé
 
F
,
Rognes
 
T
,
Quince
 
C
 et al.  
Swarm: robust and fast clustering method for amplicon-based studies
.
PeerJ
.
2014
;
2
:
e593
.

Mahévas
 
S
,
Trenkel
 
VM
,
Doray
 
M
 et al.  
Hake catchability by the French trawler fleet in the Bay of Biscay: estimating technical and biological components
.
ICES J Mar Sci
.
2011
;
68
:
107
18
.

Maunder
 
MN
,
Punt
 
AE
.
A review of integrated analysis in fisheries stock assessment
.
Fish Res
.
2013
;
142
:
61
74
.

Mauvisseau
 
Q
,
Harper
 
LR
,
Sander
 
M
 et al.  
The multiple states of environmental DNA and what is known about their persistence in aquatic environments
.
Environ Sci Technol
.
2022
;
56
:
5322
33
.

Miya
 
M
,
Sato
 
Y
,
Fukunaga
 
T
 et al.  
MiFish, a set of universal PCR primers for metabarcoding environmental DNA from fishes: detection of more than 230 subtropical marine species
.
R Soc Open Sci
.
2015
;
2
:
150088
.

Mjanger
 
H
,
Svendsen
 
BV
,
Senneset
 
H
 et al.  
2019
;
Håndbok for Prøvetaking av Fisk, Krepsdyr og Andre Evertebrater
.
Versjon 5.0
. (
In Norwegian
).

Nash
 
RDM
,
Wright
 
PJ
,
Matejusova
 
I
 et al.  
Spawning location of Norway pout (Trisopterus esmarkii Nilsson) in the North Sea
.
ICES J Mar Sci
.
2012
;
69
:
1338
46
.

Neebling
 
TE
,
Quist
 
MC
.
Comparison of boat electrofishing, trawling, and seining for sampling fish assemblages in Iowa's non-wadeable Rivers
.
North Am J Fisher Manage
.
2011
;
31
:
390
402
.

Pont
 
D
,
Meulenbroek
 
P
,
Bammer
 
V
 et al.  
Quantitative monitoring of diverse fish communities on a large scale combining eDNA metabarcoding and qPCR
.
Mol Ecol Resour
.
2023
;
23
:
396
409
.

Preston
 
FW
.
The commonness, and rarity, of species
.
Ecology
.
1948
;
29
:
254
83
.

Ramírez-Amaro
 
S
,
Bassitta
 
M
,
Picornell
 
A
 et al.  
Environmental DNA: state-of-the-art of its application for fisheries assessment in marine environments
.
Front Mar Sci
.
2022
;
9
:
1004674
.

Reigstad
 
M
,
Wassmann
 
P
.
Importance of advection for pelagic-benthic coupling in north Norwegian fjords
.
Sarsia
.
1996
;
80
:
245
57
.

Rognes
 
T
,
Flouri
 
T
,
Nichols
 
B
 et al.  
VSEARCH: a versatile open source tool for metagenomics
.
Peer J
.
2016
;
2016
:
1
22
.

Rourke
 
ML
,
Fowler
 
AM
,
Hughes
 
JM
 et al.  
Environmental DNA (eDNA) as a tool for assessing fish biomass: a review of approaches and future considerations for resource surveys
.
Environ DNA
.
2022
;
4
:
9
.

Salter
 
I
,
Joensen
 
M
,
Kristiansen
 
R
 et al.  
Environmental DNA concentrations are correlated with regional biomass of Atlantic cod in oceanic waters
.
Commun Biol
.
2019
;
2
:
461
.

Schneider
 
J
,
Valentini
 
A
,
Dejean
 
T
 et al.  
Detection of invasive mosquito vectors using environmental DNA (eDNA) from water samples
.
PLoS One
.
2016
;
11
:
e0162493
.

Shelton
 
AO
,
Gold
 
ZJ
,
Jensen
 
AJ
 et al.  
Toward quantitative metabarcoding
.
Ecology
.
2023
;
104
:
e3906
.

Shelton
 
AO
,
O'Donnell
 
JL
,
Samhouri
 
JF
 et al.  
A framework for inferring biological communities from environmental DNA
.
Ecol Appl
.
2016
;
26
:
1645
59
.

Shelton
 
AO
,
Ramón-Laca
 
A
,
Wells
 
A
 et al.  
Environmental DNA provides quantitative estimates of Pacific hake abundance and distribution in the open ocean
.
Proc R Soc B Biol Sci
.
2022
;
289
:
20212613
.

Shu
 
L
,
Ludwig
 
A
,
Peng
 
Z
.
Standards for methods utilizing environmental DNA for detection of fish species
.
Genes
.
2020
;
11
:
296
.

Stoeckle
 
MY
,
Adolf
 
J
,
Charlop-Powers
 
Z
 et al.  
Trawl and eDNA assessment of marine fish diversity, seasonality, and relative abundance in coastal New Jersey, USA
.
ICES J Mar Sci
.
2021
;
78
:
293
304
.

Strickler
 
KM
,
Fremier
 
AK
,
Goldberg
 
CS
.
Quantifying effects of UV-B, temperature, and pH on eDNA degradation in aquatic microcosms
.
Biol Conserv
.
2015
;
183
:
85
92
.

Taberlet
 
P
,
Coissac
 
E
,
Hajibabaei
 
M
 et al.  
Environmental DNA
.
Mol Ecol
.
2012
;
21
:
1789
93
.

Taylor
 
MI
,
Fox
 
C
,
Rico
 
I
 et al.  
Species-specific TaqMan probes for simultaneous identification of (Gadus morhua L.), haddock (Melanogrammus aeglefinus L.) and whiting (Merlangius merlangus L.)
.
Mol Ecol Notes
.
2002
;
2
:
599
601
.

Thomsen
 
PF
,
Kielgast
 
J
,
Iversen
 
LL
 et al.  
Detection of a diverse marine fish fauna using environmental DNA from seawater samples
.
PLoS One
.
2012
;
7
:
e41732
.

Thomsen
 
PF
,
Møller
 
PR
,
Sigsgaard
 
EE
 et al.  
Environmental DNA from seawater samples correlate with trawl catches of subarctic, deepwater fishes
.
PLoS One
.
2016
;
11
:
e0165252
.

Turon
 
M
,
Nygaard
 
M
,
Guri
 
G
 et al.  
Fine-scale differences in eukaryotic communities inside and outside salmon aquaculture cages revealed by eDNA metabarcoding
.
Front Genet
.
2022
;
13
:
957251
.

Urban
 
P
,
Bekkevold
 
D
,
Degel
 
H
 et al.  
Scaling from eDNA to biomass: controlling allometric relationships improves precision in bycatch estimation
.
ICES J Mar Sci
.
2023
;
80
:
1066
78
.

Valdivia-Carrillo
 
T
,
Rocha-Olivares
 
A
,
Reyes-Bonilla
 
H
 et al.  
Integrating eDNA metabarcoding and simultaneous underwater visual survey to describe complex fish communities in a marine biodiversity hotspot
.
Mol Ecol Resour
.
2021
;
21
:
1558
74
.

Veron
 
P
,
Rozanski
 
R
,
Marques
 
V
 et al.  
Environmental DNA complements scientific trawling in surveys of marine fish biodiversity
.
ICES J Mar Sci
.
2023
;
80
:
2150
65
.

Yates
 
CM
,
Wilcox
 
MT
,
Stoeckle
 
MY
 et al.  
Interspecific allometric scaling in eDNA production among northwestern Atlantic bony fishes reflects physiological allometric scaling
.
Environ DNA
.
2022
;
5
:
1105
15
.

Zhang
 
C
,
Chen
 
Y
,
Xu
 
B
 et al.  
Evaluating the influence of spatially varying catchability on multispecies distribution modelling
.
ICES J Mar Sci
.
2020
;
77
:
1841
53
.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Handling Editor: W Stewart Grant
W Stewart Grant
Handling Editor
Search for other works by this author on:

Supplementary data