Abstract

Understanding industry agglomeration and its driving forces is critical for the formulation of industrial policy in developing countries. Crucial to this process is the definition and measurement of agglomeration. We construct a new coagglomeration index based purely on the location of firms. We examine what this index reveals about the importance of transport costs, labour market pooling and technology transfer for agglomeration processes, controlling for overall industry agglomeration. We compare the results based on our new measure to existing measures in the literature and find very different underlying stories at work. We conclude that, in analyses of this kind, both considering the source of agglomeration economies (employees or entrepreneurs) and finding an appropriate measure of agglomeration are crucial to identifying agglomerative forces.

1. Introduction

Manufacturing production tends to be highly geographically concentrated, particularly in developing countries. Within countries, the pattern of concentration of economic activity affects economic development, with areas of dense economic activity prospering while others are left behind. The geographic clustering of economic activity has received considerable attention in the economics literature. The seminal theoretical work by Marshall (1920) postulated three reasons why firms locate in close proximity to each other. First, the cost of transporting goods is reduced when firms are located close to their customers or suppliers. Second, when firms locate in a cluster, a pool of workers emerges. This makes it easier to hire new workers when labour demand increases, and also facilitates better matching of workers to jobs. 1 Third, knowledge spillovers, in particular informal exchanges of ideas, are more likely when firms are in close geographic proximity (Krugman, 1991; Fujita et al., 1999). Essentially, firms can reduce costs by locating in geographic clusters as proximity to other firms reduces the cost of transporting goods, people and ideas.

These are the agglomerative forces that drive clustering, and the reasons why we expect firms in clusters to be more productive. In a developing country context, where firms operate far away from the ‘best practice’ technology frontier, productivity spillovers in particular are likely to have a large effect on firm performance. There is considerable scope for improvements in technology and practices. Indeed, a number of recent empirical contributions have found evidence of positive benefits associated with agglomeration; in India and Indonesia ( Diechmann et al., 2008 ), in Chile, China and Malaysia ( Collier and Page, 2009 ), and in Ethiopia ( Bigsten et al., 2011 ). 2 It is not surprising therefore that policymakers in developing countries have focussed on industrial policies that foster clusters to promote growth, and reduce spatial inequality. 3

In spite of the importance attached to agglomeration as a force in economic transformation and development, few attempts have been made in the empirical literature to explicitly define and measure the extent of clustering. A notable exception is Ellison and Glaeser (1997) . They construct an (EG) index which measures the extent of concentration or coagglomeration of two industries. Their measure is closely related to the covariance of the area industry employment shares of the two industries, rescaled to eliminate sensitivity to the size of geographic area considered. It captures correlations in the relative size of two industries in an area, in terms of employment shares, compared with the relative size of all other industries in the area. The index sums over all areas and therefore allows for analysis of the factors underlying the geographic concentration of industry over and above that which can be attributed to urbanisation (see, for example, Ellison and Glaeser (1999) and Ellison et al. (2010) ).

In choosing an index to measure coagglomeration, a key consideration is the source of potential agglomeration economies. As indicated, in a developing country context, productivity spillovers in particular are likely to have a large effect on firm performance. Studies of agglomeration are often focussed on identifying whether technology transfers/productivity spillovers are an agglomerative force. In settings where the workforce is highly educated and highly skilled it would seem sensible to consider employees as a potential source of this type of agglomeration economy. However in developing countries where the vast majority of workers have low levels of education and are low skilled it may be more sensible to consider the entrepreneur as the source of agglomeration economies. There is a large literature supporting the idea that there are significant knowledge spillovers associated with entrepreneurship and that agglomeration economies are bigger for smaller firms ( Glaeser et al., 2010 ; Rosenthal and Strange, 2010 ; Faggio et al., 2014 ). 4 In this article, we propose that focussing on the firm (the entrepreneur) in measuring the extent of agglomeration is more relevant in developing country contexts.

A potential caveat of the EG index of coagglomeration when applied in developing country settings is that it focuses on employees as the source of agglomeration economies and as such may not capture agglomeration economies in clusters of small-sized firms. This is particularly an issue when using the index to identify technology transfers as an agglomerative force in cases where clusters of industries are small in terms of number of employees, but large in terms of the number of firms. A second potential caveat relates to the way in which the EG index controls for urbanisation. It compares each area industry employment share to the mean industry employment share in that area. In doing so it controls for population density or urbanisation in that area , rather than controlling for the overall population distribution. This approach may lead to the EG index over-weighting clusters in low population (rural) areas. 5

In this article, we propose an alternative measure of the extent to which firms from two different industries locate in the same place, or coagglomerate. We term this measure the excess colocation (XCL) index. The measure overcomes the potential shortcomings of the EG measure by focusing on individual firms, rather than employees, as the source of agglomeration economies. This distinction, we argue, is particularly relevant in contexts where the majority of workers are low skilled, value added in production is low, and high-tech firms are small in terms of the number of employees, while low-tech firms are generally very large. 6 In this setting, a concentration of firms ( entrepreneurs ) in different industries rather than employees of different industries is more likely to be a source of innovation or knowledge transfers. Moreover, our approach differs from the EG index as the XCL index controls for the overall distribution of firms rather than controlling for urbanisation in each area separately. Constructing the XCL index in this way allows for the possibility that clusters of firms in urban areas may be as important as other clusters, equivalent in size, but located in rural areas. We control for the existing spatial distribution of firms in the index using a counterfactual distribution constructed via a simple bootstrap. Our index then captures the extent to which firms from two industries choose to locate in the same (or different) areas, controlling for natural advantage and urbanisation.

We compare the merits of our XCL index to the EG index by examining the agglomerative forces at work in Vietnam. Despite the policy focus, empirical evidence on the factors driving the location choice of firms in developing country contexts remains weak. With some limited exceptions, the majority of empirical work on the factors driving agglomeration has concentrated on developed countries. 7 As such, an additional contribution of our article is that it adds to the evidence on the agglomerative forces behind the clustering of firms in developing country settings.

Vietnam is a particularly interesting country context in which to investigate industry clustering. In the mid-1980s, Vietnam began to move from a centrally planned to a market-oriented economy, and economic transformation has resulted in a structural shift from agriculture to manufacturing. Industrialisation has been accompanied by rapid economic growth; however, spatial disparity has become a major concern (Son, 2009). Substantial inflows of foreign direct investment (FDI), which have been a driving force for growth in Vietnam, have also contributed to the spatial disparity; three quarters of FDI is concentrated around the two main urban centres, Hanoi and Ho Chi Minh City (Huang and Bocchi, 2009). Moreover, Howard et al. (2011) find that manufacturing firms in Vietnam exhibit a high degree of clustering.

We use firm-level data aggregated to the industry level and focus on between-industry agglomeration. 8 We use each of the coagglomeration indices, our XCL index and the EG index, to test the relative importance of each of Marshall’s three theoretical agglomerative forces (input–output links, labour market pooling and technology transfers). A priori we expect to find very different agglomerative forces at work in a developing country context such as Vietnam compared with what has been found for developed countries. Markets in developing countries tend to be less integrated, the value added from production is lower, and firms are less technologically advanced. For example, in our data only about 20% of firms are middle-high-tech or high-tech. Moreover, using our data, we are able to capture informal channels of technology diffusion between firms which adds an important dimension to existing studies that have so far focussed on formal or contracted technology transfers ( Jaffe, 1986 ; Ellison et al., 2010 ).

Using our measure, we find that technology transfers are the most important agglomerative force in Vietnam. Using the EG measure of coagglomeration we find very different underlying stories at work. In particular, using the EG index we find no role for technology transfers. We explain this difference by the fact that the focus of the source of agglomeration economies is an important consideration in the construction of a measure of coagglomeration. While both indices are valid, we argue that in a developing country context, where productivity spillovers or technology transfers are of crucial importance, the firm or the entrepreneur is the more appropriate focus. The contrasting results highlight how finding an appropriate measure for agglomeration is crucial to the process of identifying agglomerative forces.

The remainder of the article is organised as follows. In Section 2, we present our measure of coagglomeration and compare it to the EG index commonly applied in studies of this kind. We also present alternative specifications of both measures when the focus of the source of agglomeration economies is switched. In Section 3, we present our data and provide evidence on the extent of coagglomeration of industry pairings in Vietnam using the different indices. Section 4 describes each of the agglomerative forces considered and presents the measures used in our analysis along with descriptive statistics. Section 5 presents and discusses the results while Section 6 concludes the article.

2. Definition and measurement of agglomeration

2.1. The EG measure of coagglomeration

The EG index is derived from a location choice model which assumes that agglomeration is a result of a sequence of profit maximising location decisions by individual firms. 9 The EG coagglomeration index for two industries A and B is given by Equation (1).
\[ \mathrm{EG}_{AB} = \frac{\sum_{m} (s_{mA} - x_m)(s_{mB} - x_m)}{1 - \sum_{m} x_m^{2}} \qquad (1) \]
where $m$ indexes administrative areas; $s_{mA}$ is the share of industry A’s employment in area $m$; $s_{mB}$ is the share of industry B’s employment in area $m$; and $x_m$ is the mean employment share in area $m$ across all industries.
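
To make the construction of Equation (1) concrete, the following Python sketch computes an EG-style index from an area-by-industry employment table. It assumes the standard form of the index in Ellison et al. (2010), that is, the sum over areas of the products of the two industries' deviations from $x_m$, divided by one minus the sum of squared $x_m$. The function name `eg_coagglomeration`, the data layout and the toy numbers are hypothetical and serve only as an illustration.

```python
def eg_coagglomeration(employment, a, b):
    """EG-style coagglomeration index for industries a and b.

    employment[m][k] is the employment of industry k in area m
    (hypothetical layout chosen for this sketch).
    """
    n_areas = len(employment)
    n_industries = len(employment[0])
    # Total employment of each industry over all areas
    totals = [sum(row[k] for row in employment) for k in range(n_industries)]
    # shares[m][k]: share of industry k's employment located in area m
    shares = [[row[k] / totals[k] for k in range(n_industries)]
              for row in employment]
    # x[m]: mean employment share of area m across all industries
    x = [sum(shares[m]) / n_industries for m in range(n_areas)]
    numerator = sum((shares[m][a] - x[m]) * (shares[m][b] - x[m])
                    for m in range(n_areas))
    denominator = 1.0 - sum(xm ** 2 for xm in x)
    return numerator / denominator


# Toy data (hypothetical): 3 areas (rows) and 3 industries (columns).
# Industries 0 and 1 concentrate in the first area, industry 2 in the last.
toy_employment = [
    [80, 70, 10],
    [10, 20, 10],
    [10, 10, 80],
]
eg_01 = eg_coagglomeration(toy_employment, 0, 1)
eg_02 = eg_coagglomeration(toy_employment, 0, 2)
```

In this toy example the index is positive for the pair that shares a location pattern and negative for the pair that does not. Passing firm counts instead of employment in the same table would give a firm-based variant of the same calculation.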

As highlighted in the introduction, by using employment shares in the construction of the coagglomeration measure, the EG index implicitly focuses on individual employees as the source of agglomeration economies. In a developing country context individual firms (entrepreneurs) are more likely to be the source of agglomeration economies, particularly in relation to technology transfers, given that the majority of employment is unskilled.

Moreover, a potential caveat to using the EG index is that it compares each area industry employment share to the mean industry employment share in that area. In doing so, it controls for population density or urbanisation in that area separately, rather than controlling for the overall employment distribution (as the XCL measure does). This can lead to the EG index over-weighting clusters in low population areas. Indeed, Billings and Johnson (2015) use generated data and find that the EG index may generate statistically significant localisation for an industry with high spatial variance due to high industry concentration in low population density (rural) areas. This is likely to be an issue particularly in Vietnam where the government actively encourages labour intensive firms to locate in rural areas where there are smaller populations.

2.2. A new measure of coagglomeration: The Excess Colocation (XCL) Index

Our XCL measure is constructed using two steps. First, we consider every possible set of two industries A and B and calculate a colocation (CL) index which measures the extent to which the two industries are located in the same area. More precisely, for $p$ firms in industry A and $q$ firms in industry B we take each firm $i$ in industry A and sum the number of firms in industry B located in the same area. We then take the number of colocated pairings as a proportion of all possible pairings across the two industries (i.e. $p \times q$). 10 This measure is bounded by 0 and 1. The formula for the CL index is given by Equation (2).
\[ \mathrm{CL}_{AB} = \frac{1}{p\,q} \sum_{i=1}^{p} \sum_{j=1}^{q} C_{ij} \qquad (2) \]
where $C_{ij} = 1$ if firms $i$ and $j$ are located in the same area, and 0 otherwise. The sum of the $C_{ij}$ gives the number of colocated pairings of firms from industry A with firms from industry B. A simple numerical example is provided in the Appendix.
Our measure also needs to control for the density of manufacturing to avoid confounding the general tendency of manufacturing and economic activity to cluster with the tendency of particular industries to cluster. Therefore, in the second step, we control for density by constructing a counterfactual via a simple bootstrap. For each set of two industries A and B, with p firms in industry A and q firms in industry B, we randomly sample p firms from the population of firms and assign them to industry A. Similarly, we randomly sample q firms from the population of firms and assign them to industry B. We then calculate the CL index for industries A and B based on this random sample. This process is repeated 50 times and the mean CL index, based on the random samples, is calculated. This mean random CL index for each industry pairing is then subtracted from the actual CL index calculated for each industry pairing. The resulting colocation measure controls for the existing spatial distribution of firms and therefore captures the extent to which firms from different industries locate in the same area, controlling for natural advantages and the general tendency for economic activity to agglomerate. The formula for this measure, the excess colocation (XCL) index is given by Equation (3).
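\[ \mathrm{XCL}_{AB} = \mathrm{CL}_{AB} - \frac{1}{50} \sum_{r=1}^{50} \mathrm{CL}_{AB}^{(r)} \qquad (3) \]
where $\mathrm{CL}_{AB}^{(r)}$ is the CL index calculated on the $r$-th random sample. The measure is bounded by −1 and +1. Positive values indicate that the firms from industries A and B locate in the same area more often than one would expect, given the general distribution of manufacturing activity; negative values imply that they locate in the same area less often than one would expect.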
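
The two steps described above can be sketched in Python as follows. This is a minimal illustration and not the code used for the analysis: the function names, the representation of each firm by the label of its area, and the toy data are all hypothetical. The text does not specify whether the random samples are drawn with or without replacement; the sketch assumes that each pseudo-industry is drawn independently and without replacement from the full population of firms.

```python
import random


def colocation_index(areas_a, areas_b):
    """CL index: share of all p*q cross-industry firm pairs in the same area.

    areas_a and areas_b list the area of each firm in industries A and B.
    """
    colocated = sum(1 for area_i in areas_a for area_j in areas_b
                    if area_i == area_j)
    return colocated / (len(areas_a) * len(areas_b))


def excess_colocation(areas_a, areas_b, all_areas, draws=50, seed=0):
    """XCL index: actual CL minus the mean CL over random pseudo-industries."""
    rng = random.Random(seed)
    p, q = len(areas_a), len(areas_b)
    random_cl = []
    for _ in range(draws):
        # Assumption: sample p and q firms independently, without
        # replacement, from the locations of all firms in the population.
        pseudo_a = rng.sample(all_areas, p)
        pseudo_b = rng.sample(all_areas, q)
        random_cl.append(colocation_index(pseudo_a, pseudo_b))
    return colocation_index(areas_a, areas_b) - sum(random_cl) / draws


# Toy data (hypothetical): 12 firms, four of them in district "d1".
population = ["d1"] * 4 + ["d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9"]
industry_a = ["d1", "d1"]  # both firms of industry A are in d1
industry_b = ["d1", "d1"]  # both firms of industry B are in d1
cl_toy = colocation_index(industry_a, industry_b)
xcl_toy = excess_colocation(industry_a, industry_b, population)
```

Here every cross-industry pairing is colocated, so the CL index equals 1, while the random pseudo-industries are colocated far less often, which leaves a positive XCL value. The nested loop is quadratic in the number of firms; for large industries, counting firms per area first would give the same result more cheaply.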

2.3. Employment versus firm: XCL and EG indices

One key difference between the XCL and the EG index is that the former considers agglomeration in terms of firms or entrepreneurs while the latter considers agglomeration in terms of employment. We argue, as others have previously, that it is reasonable to consider the source of agglomeration economies to be individual entrepreneurs/firms rather than employees ( Fujita and Ogawa, 1982 ; Henderson, 2003 ; Glaeser et al., 2010 ; Rosenthal and Strange, 2010 ). The identification of the driving forces behind agglomeration may depend on whether the focus is firms or employment. We therefore also consider an amended XCL index which focuses on employees rather than firms, and an amended EG index which focuses on firms rather than employees.

The amended XCL index for employment is given by Equation (4)
(4)
where $C_{ij} = 1$ if firms $i$ and $j$ are located in the same area, and 0 otherwise, $e_i$ is the number of employees in firm $i$, $e_j$ is the number of employees in firm $j$, and the denominator is the total employment in both industries. As with the XCL index, for $p$ firms in industry A and $q$ firms in industry B we take each firm $i$ in industry A and sum the number of firms in industry B located in the same area. However, this summation is now weighted by the number of employees in the two firms. This measure is bounded by −1 and +1 and controls for the existing spatial distribution of firms.
The amended EG measure simply replaces the shares of employment by the share of firms. It is given by Equation (5)
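\[ \mathrm{EG}^{\mathrm{firms}}_{AB} = \frac{\sum_{m} (s_{mA} - x_m)(s_{mB} - x_m)}{1 - \sum_{m} x_m^{2}} \qquad (5) \]
where $m$ indexes administrative areas; $s_{mA}$ is industry A’s share of firms located in area $m$; $s_{mB}$ is industry B’s share of firms located in area $m$; and $x_m$ is the mean firm share in area $m$ across all industries.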

3. Coagglomeration of manufacturing industries in Vietnam

To compare the coagglomeration indices, we use a dataset of 31,065 manufacturing firms taken from the Vietnamese Enterprise Survey (VES) for the year 2007 provided by the General Statistics Office (GSO, 2010). The dataset includes all registered manufacturing firms with more than 30 employees, and a random sample of 15% of small registered enterprises with fewer than 30 employees. 11 Along with the standard financial information on firms the data include the name of the commune in which each firm is located.

The data are collected as follows. The VES instrument is mailed to firms which submit the completed questionnaires by return post to the Provincial Statistics Office. Under the Law on Statistics all firms are legally required to comply. Any firms that do not respond are contacted by provincial authorities by mail, by phone or through face-to-face visits. All data gathered are checked by the General Statistics Office for internal consistency and cross-checked with the administrative provincial data before being made available. As such the data are as complete a record as possible on the economic activities of firms in Vietnam. We use the firm-level data to compute pairwise coagglomeration measures for 43 manufacturing industries. 12

The use of a geographic region in the calculation of a measure of agglomeration can lead to problems with firms located on the border of regions (see, for example, Gorman and Kulkarni (2004) ). In Vietnam, the setting for our empirical analysis, provinces are the largest in area (there are 64 provinces). They are administered by provincial councils, elected by the inhabitants of each province. This structure is repeated at the district level. Zoning and planning are determined at each level of government down to the district level. Provinces and districts can therefore be very different in terms of their laws and regulations. Communes are quite small and the communes within a district are identical in regulatory terms. Defining clusters at the commune level may dissect important clusters while defining clusters at the district level is unlikely to do so. Special economic zones do not cross district borders due to the separation of administrative powers between districts. For these reasons, we rely in our analysis on clusters defined at the district level. We present summary statistics and the results of our analysis when clustering is defined at the commune and province level as robustness checks. 13

Table 1 presents the top five pairings on the basis of the XCL index. This measure controls for natural advantages and urbanisation in its construction and so we identify industries that are colocated to an extent beyond what we would expect purely on the basis of these factors. We find a high level of coagglomeration between vertically linked industries. For example, the domestic appliance industry is closely coagglomerated with the plastic and by-products industry. It is likely that the latter provides important inputs to the former. Similarly, we find that the special purpose machinery industry is coagglomerated with the ready-made apparel industry and the leather and leather products industry. These pairings suggest that transport costs of inputs from upstream to downstream industries may be a motivating factor in the location choice of firms. It is also possible that the coagglomeration of these industries facilitates transfers of technology through the supply chain. Other vertically linked industries that are coagglomerated include the pairing of the printing and publishing industry with industries that are likely to require information booklets including regulations or instructions for the products that they produce, for example the manufacture of medical and surgical equipment. We also present, for illustrative purposes, the top five coagglomeration pairs on the basis of the EG measure.

Table 1. Highest coagglomeration pairings

XCL Index
Industry 1 | Industry 2
29: Plastic and by-products | 36: Domestic appliances
18: Leather and Leather products | 35: Special purpose machinery
40: Medical and surgical equipment | 21: Publishing
35: Special purpose machinery | 17: Ready-made apparel
40: Medical and surgical equipment | 22: Printing

EG Index
Industry 1 | Industry 2
37: Electrical machinery | 41: Precision and optical equipment
05: Milk and dairy products | 21: Publishing
08: Prepared feeds for animals | 37: Electrical machinery
08: Prepared feeds for animals | 41: Precision and optical equipment
01: Production, processing and preserving of meat and meat products | 05: Milk and dairy products

Direct comparison between the coagglomeration of the industry pairs measured using the XCL and EG indices is problematic given the key differences in their construction, and considering the XCL index controls for both urbanisation and natural advantages. We would therefore not necessarily expect the measures to identify similar industries as highly coagglomerated. The percentage of pairings each measure defines as coagglomerated is similar for all indices; the EG measure indicates coagglomeration in 40% of industry pairings, while the XCL index indicates coagglomeration in 43% of pairings. The results are similar when the focus of the indices is switched; 43% of pairings are identified as coagglomerated using the EG by firms measure while 45% of pairings are coagglomerated according to the XCL by employment measure.

As indicated, it is likely that the EG index over-weights clusters in low population areas. To illustrate this point, we use the Vietnam Household Living Standards Survey (VHLSS) for 2006 to identify the industries that have the highest percentage of rural employment. We take the top 10 sectors and compare the values given by each index for pairings consisting of these industries (45 pairings). 14 The percentage of rural pairings the XCL and the XCL by employment measures define as coagglomerated is similar to the percentages for the full sample; 42% and 40% respectively. Interestingly, both the EG and the EG by firms measures report a much higher percentage of the rural sector pairings as coagglomerated; 64% and 56% respectively. This supports our contention that the EG index, in controlling for urbanisation in each area separately, over-weights clusters in rural areas. The issue remains when the EG by firms measure is used. Although the focus of agglomeration economies is switched, the way in which urbanisation is controlled for in the construction of the index remains the same.

4. Determinants of coagglomeration

In this section, we describe the variables used to measure each of the three drivers of coagglomeration and the data sources relied on in their construction. We also explain how we control for natural advantages in the empirical analysis and present descriptive statistics.

4.1. Transport costs

Firms wish to minimise their costs in order to maximise profits. The further the distance inputs and outputs need to be transported, the greater the transportation costs. Therefore, firms will have an incentive to locate near their suppliers and/or their customers. To capture the extent to which transport costs matter in the location decisions of firms we use the input–output linkages between different industries.

Following the approach of Ellison et al. (2010), we use the Vietnam Supply and Use Tables (SUT) for 2007 (CIEM, 2007). The SUT provides information on the use of 138 commodities in 112 production activities. Commodities and production activities are mapped to the four-digit ISIC codes used in the Vietnamese Enterprise Survey and 43 manufacturing industry codes that are comparable across the two datasets are generated. Details of these codes are provided in the Appendix. The information in the SUT is aggregated to the new industry codes. We consider three different measures of input–output linkages as in Ellison et al. (2010). First, we consider the flow of inputs between each industry pairing. We compute the proportion of total inputs that industry A buys from industry B and vice versa and take the maximum as a measure of the degree of input linkages between the two industries. Second, we consider the proportion of total output that industry A provides to industry B and vice versa and take the maximum as a measure of the degree of output linkages between the two industries. Third, we take the maximum of these two input and output measures to produce the variable InputOutput_AB. Finally, we take the maximum of the input and output measures in absolute terms, measured as the maximum of the total value of inputs/outputs that A buys from or sells to B and vice versa. We expect that firms that are highly linked on the basis of these measures are more likely to cluster.
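
The share-based linkage measures described above can be illustrated with a short Python sketch. The function name, the square flow matrix and the toy values are hypothetical. As a simplifying assumption, total inputs and total outputs are computed from the inter-industry flows in the matrix alone, whereas totals taken from an actual supply and use table would also include purchases from, and sales to, sectors outside the matrix.

```python
def input_output_linkages(flows, a, b):
    """Share-based input, output and combined linkage measures for a pair.

    flows[i][j] is the value of goods that industry i sells to industry j
    (hypothetical layout chosen for this sketch).
    """
    n = len(flows)
    # Total purchases of each industry (column sums) and total sales (row sums)
    total_inputs = [sum(flows[i][j] for i in range(n)) for j in range(n)]
    total_outputs = [sum(flows[i][j] for j in range(n)) for i in range(n)]
    # Proportion of A's inputs bought from B, and of B's inputs bought from A
    input_ab = max(flows[b][a] / total_inputs[a],
                   flows[a][b] / total_inputs[b])
    # Proportion of A's output sold to B, and of B's output sold to A
    output_ab = max(flows[a][b] / total_outputs[a],
                    flows[b][a] / total_outputs[b])
    # Combined measure: the larger of the input and output linkages
    return input_ab, output_ab, max(input_ab, output_ab)


# Toy flow matrix (hypothetical values) for three industries
toy_flows = [
    [0, 30, 10],
    [5, 0, 20],
    [15, 10, 0],
]
input_01, output_01, input_output_01 = input_output_linkages(toy_flows, 0, 1)
```

In the toy matrix, industry 1 buys 30 of its 40 units of inputs from industry 0, and industry 0 sells 30 of its 40 units of output to industry 1, so both linkage measures, and hence the combined measure, equal 0.75 for this pair.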

4.2. Labour market pooling

To assess the importance of labour market pooling, we examine the correlation in the type of workers employed in each of the different industries. We use the VHLSS for 2006 which is a representative sample of all households in Vietnam (GSO, 2006). The database contains information on the occupation, level of education and experience of household members, together with the two-digit industry in which they are employed. 15 In order to measure the extent to which two different industries employ workers with similar skill sets, and therefore have incentives to locate near a common ‘pooled’ labour market, we calculate the correlations between the employment patterns of all sets of two industries.

The VHLSS includes information on the highest school grade completed by the worker according to the 12-grade system. 16 We calculate the share of each industry’s employment at each education level (E), Share_AE. We then calculate the correlation between Share_AE and Share_BE for each pair of industries A and B, which we term EducationCorrelation_AB.

The VHLSS also includes information on the experience of workers. Specifically, the respondents are asked which is the most time consuming job the worker has been engaged in for the last 12 months and, ‘For how many years has [name] been doing this work?’ We define six categories of experience levels and place workers in groups according to the number of years reported. 17 We calculate the share of each industry’s employment at each experience level (X), Share_AX. We then calculate the correlation between Share_AX and Share_BX for each pair of industries A and B, which we term ExperienceCorrelation_AB.

Information on the occupation of each worker is also included in the VHLSS. There are 32 different occupation groups specified. We calculate the share of each industry’s employment in each occupation (O), Share_AO. We then calculate the correlation between Share_AO and Share_BO for each pair of industries A and B, which we term OccupationCorrelation_AB.

In order to obtain our measure of the extent to which firms in industries A and B hire similar types of workers, we simply take the average of the three variables, EducationCorrelation_AB, ExperienceCorrelation_AB and OccupationCorrelation_AB, to obtain the variable SkillsCorrelation_AB.
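
The construction of the labour pooling measure can be sketched as follows. The text does not state which correlation coefficient is used; the sketch assumes the Pearson correlation. The function names, the dictionary layout and the worker counts are hypothetical, and the toy data use only three categories per dimension for brevity.

```python
from math import sqrt


def employment_shares(counts):
    """Convert worker counts by category into shares of industry employment."""
    total = sum(counts)
    return [c / total for c in counts]


def pearson(u, v):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    var_u = sum((ui - mean_u) ** 2 for ui in u)
    var_v = sum((vi - mean_v) ** 2 for vi in v)
    return cov / sqrt(var_u * var_v)


def skills_correlation(workers_a, workers_b):
    """Average of the education, experience and occupation correlations.

    workers_a and workers_b map each dimension to worker counts by category.
    """
    dimensions = ("education", "experience", "occupation")
    correlations = [
        pearson(employment_shares(workers_a[d]), employment_shares(workers_b[d]))
        for d in dimensions
    ]
    return sum(correlations) / len(correlations)


# Toy worker counts (hypothetical) for two industries
industry_a = {"education": [10, 20, 70],
              "experience": [50, 30, 20],
              "occupation": [5, 15, 80]}
industry_b = {"education": [70, 20, 10],
              "experience": [20, 30, 50],
              "occupation": [80, 15, 5]}
skills_ab = skills_correlation(industry_a, industry_b)
skills_aa = skills_correlation(industry_a, industry_a)
```

An industry compared with itself has a correlation of one in every dimension, whereas the two toy industries, which draw on opposite ends of each distribution, have a negative average correlation.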

4.3. Technology spillovers

Of the three main agglomerative forces, technology spillovers are arguably the hardest to quantify or measure. Measuring the flow of ideas between industries is difficult, although a number of different proxies have been used in the literature. Audretsch and Feldman (1996) use industry R&D, university R&D and skilled labour as measures of knowledge spillovers. Greenstone et al. (2010) quantify agglomeration spillovers as the change in total factor productivity (TFP) experienced by incumbent manufacturing firms when a large manufacturing firm locates in the same area. Ellison et al. (2010) use two measures. The first is a technology matrix for the USA, similar to the input–output matrix, which captures how R&D activity in one industry flows out to benefit another industry. The second uses a patent database to construct measures of patents in and out of pairs of industries. These measures, however, only capture official exchanges of technology. It is likely that many technology exchanges are more informal or accidental, particularly in developing country contexts.

In our analysis, we use a specially designed module on technology usage among Vietnamese manufacturing firms that was included in the 2010 round of the Vietnamese Enterprise Survey (GSO, 2010). A representative sample of over 8000 manufacturing firms was asked the question ‘Do most contracts include technology transfer from the supplier to the enterprise?’ The firms were also asked the same question in relation to their customers. If the answer to this question was yes, they were asked whether the technology transfer is mainly ‘intentional and part of the legal contract’, ‘intentional but not part of the legal contract’ or ‘unintentional’. The question therefore captures both formal and accidental or informal technology transfers from the supplier or customer to the enterprise.

We construct our technology transfer variable by calculating the proportion of firms in each industry reporting that most contracts include a transfer from the supplier to the enterprise and a separate variable representing the proportion of firms whose contracts include a transfer from the customer to the enterprise. We therefore have two variables: technology transfers from suppliers and technology transfers from customers. We need to pair these measures with supplier and customer industries; we would like to know the extent to which firms in industry A receive technology transfers from their suppliers or customers in industry B. For each industry, we map technology transfers received from suppliers to its input supply industries by interacting the variable with a measure of the proportion of inputs it buys from each of the other industries. In other words, for industry A we multiply the measure of technology transfer from suppliers by the proportion of industry A’s inputs that come from industry B to produce the measure SupplierTransfer AB . Similarly, we interact technology transfer from customers with the proportion of outputs it sells to other industries to produce the measure CustomerTransfer AB .

The technology transfer variables are not symmetric: SupplierTransfer AB is not necessarily equal to SupplierTransfer BA and, similarly, CustomerTransfer AB is not necessarily equal to CustomerTransfer BA . In order to make both variables symmetric we take the mean of the transfer variables AB and BA. Finally, to create one measure of technology transfers between industries we take the mean of these two symmetric variables ( SupplierTransfer AB and CustomerTransfer AB ) to obtain our variable TechTransfer AB . 18
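The steps from the survey proportions to TechTransfer AB can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure used with the survey data: the dictionary layout, the function names and all numerical values are hypothetical.

```python
def directional(transfer_share, trade_share, a, b):
    """Share of firms in industry a reporting transfers, weighted by
    the proportion of a's inputs (or outputs) traded with industry b."""
    return transfer_share[a] * trade_share[a][b]

def symmetrise(transfer_share, trade_share, a, b):
    """Mean of the AB and BA directional measures."""
    ab = directional(transfer_share, trade_share, a, b)
    ba = directional(transfer_share, trade_share, b, a)
    return 0.5 * (ab + ba)

def tech_transfer(sup_share, cus_share, input_share, output_share, a, b):
    """Mean of the symmetric supplier and customer transfer measures."""
    supplier = symmetrise(sup_share, input_share, a, b)
    customer = symmetrise(cus_share, output_share, a, b)
    return 0.5 * (supplier + customer)

# Hypothetical values for two industries (illustration only).
sup_share = {"A": 0.10, "B": 0.20}      # firms reporting transfers from suppliers
cus_share = {"A": 0.05, "B": 0.00}      # firms reporting transfers from customers
input_share = {"A": {"B": 0.50}, "B": {"A": 0.10}}   # inputs bought from the other industry
output_share = {"A": {"B": 0.40}, "B": {"A": 0.10}}  # outputs sold to the other industry

tt_ab = tech_transfer(sup_share, cus_share, input_share, output_share, "A", "B")
tt_ba = tech_transfer(sup_share, cus_share, input_share, output_share, "B", "A")
print(round(tt_ab, 4), round(tt_ba, 4))
```

In this example the directional supplier measures differ (0.05 for AB against 0.02 for BA), while the final measure is identical for AB and BA, as the symmetrisation step intends.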

If technology transfers from suppliers and/or from customers to firms are common in an industry, we would expect firms in that industry to locate near their suppliers and/or customers. Therefore, a positive relationship between our technology transfer variable and the coagglomeration index would provide evidence that technology transfers are an important agglomerative force.

4.4. Natural advantages

Some geographic areas may have natural advantages over others that result in cluster formation; for example, areas that are rich in minerals will attract clusters of mining companies. It is important to control for such advantages in our analysis. When using the XCL measure (Equation (3)) as the dependent variable, natural advantages are controlled for in the construction of the counterfactual. However, when the EG index (Equation (1)) is used we need to include a measure for natural advantage as an explanatory variable in the analysis. Following Ellison and Glaeser (1999) , we develop a predicted spatial distribution of firms based upon cost differences between regions (commune/district/province). We construct a set of probabilities for each region which captures the probability that a firm will locate in that region if cost is the only factor in its location decision. We use data from the VES on the average wage paid by firms and the percentage tax they pay (calculated as tax paid divided by total revenue). We then express the cost per region as a percentage of the total costs faced by firms in Vietnam. As firms are more likely to locate in a region with lower costs, we take the reciprocal of this percentage and compute location probabilities. We then randomly allocate firms to regions in Vietnam using these probabilities. We calculate the EG index for this predicted spatial distribution of firms and use this variable as a control for differences in costs or natural advantages across regions in the analysis.
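A minimal sketch of the probability and allocation steps is given below. It assumes a single cost figure per region and that the reciprocals of the cost shares are normalised to sum to one; how wages and taxes are combined into that cost figure is not modelled here, and the region names and values are hypothetical.

```python
import random

def location_probabilities(region_costs):
    """Turn regional costs into location probabilities that fall as cost rises."""
    total = sum(region_costs.values())
    # Reciprocal of each region's share of total cost.
    inverse = {r: total / c for r, c in region_costs.items()}
    norm = sum(inverse.values())
    # Assumption: normalise the reciprocals so the probabilities sum to one.
    return {r: w / norm for r, w in inverse.items()}

def allocate_firms(n_firms, probabilities, seed=0):
    """Randomly assign firms to regions using the location probabilities."""
    rng = random.Random(seed)
    regions = list(probabilities)
    weights = [probabilities[r] for r in regions]
    draws = rng.choices(regions, weights=weights, k=n_firms)
    return {r: draws.count(r) for r in regions}

# Hypothetical regional costs (illustration only).
costs = {"region_1": 100.0, "region_2": 200.0, "region_3": 400.0}
probs = location_probabilities(costs)
counts = allocate_firms(1000, probs)
print(probs)
print(counts)
```

The predicted distribution in `counts` plays the role of the cost-only counterfactual on which the EG index is then computed.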

4.5. Descriptive statistics

Table 2 presents summary statistics for the dependent and explanatory variables used in our analysis. The coagglomeration measures are calculated at the three levels of administrative area for comparison purposes. Negative and positive values have the same interpretation across the measures: negative values imply the two industries locate in different areas, while positive values imply the two industries locate in the same areas or coagglomerate. The mean and the maximum values of the XCL measure increase as expected as the size of the administrative area increases. The mean of the EG index is approximately zero. This is largely by definition as x m , the measure of an area’s ‘size’, is the share of manufacturing employment, so the deviation of each industry from the benchmark will be approximately uncorrelated with the average of the deviations of all other industries ( Ellison et al., 2010 ).

Table 2.

Descriptive statistics

                                  Mean     Std. Dev   Min      Max
XCL Index
    XCL (Commune)                 0.000    0.001      −0.002   0.004
    XCL (District)                0.000    0.006      −0.009   0.021
    XCL (Province)                0.010    0.074      −0.108   0.298
Emp XCL Index
    Emp XCL (Commune)             0.000    0.002      −0.002   0.016
    Emp XCL (District)            −0.001   0.006      −0.010   0.037
    Emp XCL (Province)            0.007    0.058      −0.097   0.257
EG Index
    EG (Commune)                  0        0.013      −0.026   0.224
    EG (District)                 0        0.149      −0.374   0.208
    EG (Province)                 0        0.338      −0.108   0.247
EG Firms Index
    EG (Commune)                  0        0.001      −0.004   0.002
    EG (District)                 0        0.003      −0.009   0.013
    EG (Province)                 0        0.028      −0.121   0.129
Natural advantage EG
    Cost advantage (Commune)      0        0.003      −0.010   0.049
    Cost advantage (District)     0        0.003      −0.010   0.049
    Cost advantage (Province)     0        0.006      −0.021   0.061
Natural advantage EG Firms
    Cost advantage (Commune)      0        0.001      −0.004   0.000
    Cost advantage (District)     0        0.001      −0.007   0.006
    Cost advantage (Province)     0        0.001      −0.014   0.016
Marshallian factors
    Input–output maximum          0.046    0.113      0.000    0.893
    Technology transfer           0.004    0.014      0.000    0.221
    Skills correlation            0.377    0.282      −0.328   1.000

The input–output maximum is expressed as a fraction and so is bounded by 0 and 1. It has a mean of 0.046. This means that, on average, approximately 5% of inputs/outputs are supplied/purchased between industry pairs. This is much higher than the Ellison et al. (2010) measure for the USA of 0.007, which suggests that, on average, less than 1% of inputs/outputs are traded between three-digit industry pairs there. Our data also reveal that there are a number of industries that do not buy or sell any goods to one another. The maximum, however, is high at 0.893, suggesting that there is a lot of variation between industry pairs in the extent of input–output linkages and that some industries are particularly well integrated. 19

The skills correlation measure is bounded by −1 and 1, where 1 is perfect positive correlation. The mean value is 0.38, suggesting a relatively high degree of correlation in the types of workers that different industries employ. This measure is also comparable to the Ellison et al. (2010) mean of 0.47 in the correlation between occupation types among industries. Finally, the technology transfer variable has a mean of 0.004, a minimum of zero (implying that no technology transfers occur between some industries) and a maximum of 0.221. The higher this value, the greater the technology linkages between industries.

5. Empirical model and results

Our core empirical model is given in Equation (6). 20
(6)
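A specification consistent with the regressors reported in Tables 3 and 4 can be sketched as follows. The coefficient symbols and the name Coagg AB for the dependent variable are illustrative notation assumed here; the natural advantage term enters only when an EG-type index is the dependent variable.

```latex
\begin{equation*}
\mathit{Coagg}_{AB} = \alpha
  + \beta_{NA}\,\mathit{NaturalAdvantage}_{AB}
  + \beta_{IO}\,\mathit{InputOutput}_{AB}
  + \beta_{T}\,\mathit{TechTransfer}_{AB}
  + \beta_{S}\,\mathit{SkillsCorrelation}_{AB}
  + \varepsilon_{AB}
\end{equation*}
```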
Each of the variables is transformed to have unit standard deviation, both for ease of comparison of the estimated coefficients on the different variables and to assess the relative importance of each factor in explaining overall coagglomeration patterns. As our unit of analysis is industry pairings, the residuals may be correlated. To correct for the cross-observation correlation in the error terms involving the same industries, we report bootstrapped standard errors. We use a non-parametric approach which involves drawing 50 random samples, with replacement, from the data. For each of these random samples, we estimate the regression coefficients. We then calculate the standard deviation of each coefficient across the samples to obtain the bootstrapped standard errors. Bootstrapping also deals with the econometric issues arising from the use of the generated regressors ( Pagan, 1984 ; Ellison et al., 2010 ).
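The rescaling and bootstrap steps can be illustrated with a minimal sketch. It simplifies the model to a single regressor and assumes that individual industry-pair observations are the resampling unit; the data are simulated and the function names are hypothetical, so this is not the estimation code behind the reported tables.

```python
import random
from statistics import mean, stdev

def unit_sd(values):
    """Rescale a variable to have unit (sample) standard deviation."""
    s = stdev(values)
    return [v / s for v in values]

def ols_slope(x, y):
    """Slope coefficient from a one-regressor OLS regression with intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def bootstrap_se(x, y, reps=50, seed=0):
    """Standard deviation of the slope across `reps` resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(ols_slope([x[i] for i in idx], [y[i] for i in idx]))
    return stdev(slopes)

# Simulated data standing in for one regressor and a coagglomeration index.
rng = random.Random(1)
x_raw = [rng.gauss(0, 2) for _ in range(200)]
y_raw = [0.1 * v + rng.gauss(0, 1) for v in x_raw]

x, y = unit_sd(x_raw), unit_sd(y_raw)
beta = ols_slope(x, y)     # standard deviations of y per standard deviation of x
se = bootstrap_se(x, y)    # bootstrapped standard error from 50 resamples
print(beta, se)
```

With both variables scaled to unit standard deviation, the slope reads directly as the standard deviation change in the dependent variable associated with a one standard deviation change in the regressor.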

5.1. Determinants of coagglomeration: XCL index

The top panel of Table 3 presents the results of the regressions with the XCL index as the dependent variable in the analysis. We present the results where clusters are defined by districts. The results for the other geographic levels of aggregation, commune and province, are presented in the Appendix.

Table 3.

Determinants of coagglomeration

(1)  (2)  (3)  (4)  (5)  (6)
XCL Index
Input–output maximum  0.032  −0.016  −0.016
    (0.029)  (0.034)  (0.036)
Technology transfer  0.092**  0.100**  0.096*
    (0.043)  (0.048)  (0.055)
Skills correlation  0.065**  0.061*
    (0.032)  (0.035)
R²  0.001  0.008  0.003  0.006  0.012
Observations  903  903  903  903  903

EG Index
Natural advantage  0.118*  0.119**  0.111
    (0.068)  (0.057)  (0.080)
Input–output maximum  0.003  0.010  0.011
    (0.027)  (0.028)  (0.031)
Technology transfer  −0.014  −0.024  −0.029
    (0.020)  (0.038)  (0.040)
Skills correlation  0.092***  0.082*
    (0.035)  (0.044)
R²  0.013  0.000  0.000  0.008  0.011  0.017
Observations  903  903  903  903  903  903

Bootstrapped standard errors are presented in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1. Variables are transformed to have unit standard deviation for ease of interpretation. Results for the commune and province levels of aggregation are presented in Table A1 of the Appendix.

Input–output links do not emerge as an agglomerative force. This result also holds when the output–index, the input–index and the absolute input/output values are considered as alternative measures. Technology transfers are the most important force; a one standard deviation increase in the technology transfer variable is associated with a 0.096 standard deviation increase in the XCL index. Labour pooling is also a significant force; a one standard deviation increase in the skills correlation variable is associated with a 0.061 standard deviation increase in the XCL index.

5.2. Determinants of coagglomeration: EG index

We now examine how the results of the analysis using the XCL measure of agglomeration differ from the results that would emerge based on the EG approach. We use the EG index ( EG AB ) given in Equation (1) as the dependent variable in the empirical model given in Equation (6). The results are presented in the bottom panel of Table 3 .

As before, all variables are normalised to have a standard deviation of one for ease of comparison and bootstrapped standard errors are reported in parentheses. Skills correlations are found to be a statistically significant agglomerative force, and their coefficient is larger in magnitude when the EG index is the dependent variable than when the XCL index is. We find no evidence that input–output linkages are a significant agglomerative force using either measure of coagglomeration. 21 This does not necessarily mean that transport costs and proximity to suppliers or customers are unimportant factors in firms’ location decisions. Our results only suggest that firms do not locate near domestic suppliers or customers in other industries. Input–output linkages may, however, be a force for within-industry agglomeration. Additionally, firms that import their inputs or export their outputs may consider transport costs in their location decisions and so locate near to ports or airports; this is not captured by the input–output measure but is captured by the measure for natural advantages.

The most notable difference in the results for the EG index compared with the XCL index is that technology transfers do not appear to be a significant agglomerative force when using the EG measure. We return to this point below.

5.3. Determinants of coagglomeration: employment versus firm

One of the key differences between the two measures of coagglomeration considered in the previous section is their implicit focus on the source of agglomeration economies. The XCL index focuses on individual firms, representing entrepreneurs, as the source of these economies while the EG index focuses on the employees as the source. We therefore consider alternative specifications of both indices where the focus of the agglomeration economies is switched.

First, we repeat our analysis using the XCL index measured using employees rather than firms ( EMP_XCL ) given by Equation (4) as the dependent variable in the analysis. To construct this index, rather than counting the number of firms in industries A and B located in the same area and controlling for the total number of firms in both industries, we count the number of employees in the two industries located in the same area, controlling for the total number of employees in both industries. The results of the analysis are presented in the top panel of Table 4 .

Table 4.

Determinants of coagglomeration: firm versus employment

(1)  (2)  (3)  (4)  (5)  (6)
EMP_XCL Index
Input–output maximum  0.036  0.042  0.042
    (0.034)  (0.034)  (0.032)
Technology transfer  0.009  −0.012  −0.013
    (0.030)  (0.025)  (0.033)
Skills correlation  0.024  0.024
    (0.031)  (0.039)
R²  0.00  0.00  0.00  0.00  0.00
Observations  0.001  0.000  0.001  0.001  0.002

EG by firms Index
Natural advantage  0.164***  0.164***  0.095**
    (0.052)  (0.058)  (0.045)
Input–output maximum  −0.004  −0.033  −0.028
    (0.028)  (0.025)  (0.032)
Technology transfer  0.034  0.039  0.028
    (0.026)  (0.028)  (0.038)
Skills correlation  0.293***  0.267***
    (0.038)  (0.035)
R²  0.027  0.000  0.001  0.086  0.028  0.095
Observations  903  903  903  903  903  903

Bootstrapped standard errors are presented in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1. Variables are transformed to have unit standard deviation for ease of interpretation. Results for the commune and province levels of aggregation are presented in Table A2 of the Appendix.

When the XCL measure is calculated in this way, no agglomerative forces emerge as significant. In particular, the loss in significance of the technology transfers variable suggests that, as expected, individual firms or entrepreneurs rather than employees are the source of technology transfers between firms. Additionally, the analysis does not identify labour market pooling as a significant agglomerative force. This implies that, when employees are the focus of agglomeration economies, controlling for the employment in each area separately in the construction of the index may be important. By simply counting the number of employees the two industries have in the same location, labour market pooling is incorporated into the index and cannot be separately identified as an agglomerative force.

Second, we amend the EG index so it focuses on firms rather than employees as the source of agglomeration economies. We use the amended EG by firms index given by Equation (5) as the dependent variable in our analysis. The results are presented in the bottom panel of Table 4 . Interestingly, technology transfers do not appear as an important agglomerative force when the EG by firms index is used. This is consistent with our initial hypothesis that the EG index may overweight clusters in rural areas due to the fact that it controls for urbanisation separately in each area. As outlined in the introduction, there are two key differences between the XCL and EG indices. The EG by firms index removes one of these differences by focussing the index on firms as the source of agglomeration economies, consistent with the XCL measure. However, the EG by firms measure controls for firm density or urbanisation in the same way as the standard EG measure. Consequently, the issue becomes one of over-weighting clusters in areas of low firm density, which again are likely to be rural areas. We therefore would not expect the EG by firms results to converge to the XCL results; although both indices are now focused on firms, they still control for urbanisation in different ways and the EG by firms index will still overweight clusters in rural areas.

5.4. Limitations and caveats

A limitation of analyses of the type conducted in this article is that there are potential sources of endogeneity. It is possible, for example, that causation runs in the opposite direction for some of the agglomerative forces or that there are omitted factors driving the results. While we cannot rule out all possible confounding factors, the steps taken in our analysis to control for natural advantages, urbanisation and the density of firms, in addition to the fact that our results are robust across different measures and levels of regional aggregation, go some way to alleviating these concerns. Nonetheless, we urge some caution in inferring causality. We do not believe that this detracts from our overall story, given that the central aim of the article is to present a new measure of coagglomeration and to compare it to the EG index.

There are two caveats in relation to the data that limit our analysis. First, we do not have access to GPS data. We know in which commune the firm is located, which is a very small geographic area, but not the exact location. This means that we cannot consider other measures of coagglomeration, such as that developed by Duranton and Overman (2005) , which use information on the exact location of the firm. Second, we only have a representative sample of small firms, defined as those with fewer than 30 employees, and this may have some impact on the results. The main difference in the analysis using the two alternative measures is the identification of technology transfers as an agglomerative force. As noted throughout the article, there are two key differences between the indices: the focus of the source of agglomeration economies, and the different way in which the two measures control for urbanisation. There are therefore two possible channels through which differences in the results can be explained. Although the fact that we have only a sample of small firms might affect our results through the second channel, it will not have any effect through the first. So while it is possible that only having a sample of smaller firms explains some of the difference between the results for the XCL indices and the EG index, it is extremely unlikely that it explains it all.

6. Conclusion

Understanding the agglomeration of industries is analytically challenging and the existing empirical evidence is fragmented and scarce. We have explored the implications of a new alternative coagglomeration measure that may be more relevant in developing country contexts than standard measures. We also developed a measure of technology transfers that encompasses both formal and informal channels, noting that the latter is often not considered in empirical analysis.

The major difference between the analysis using the more commonly applied EG index and the analysis using the XCL measure developed here is the identification of technology transfers as an important agglomerative force. Using our measure, we find that technology transfers are the most important agglomerative force in Vietnam. When the EG measure is used, technology transfers play no role. This notable contrast in the results can be explained by two key differences in the measures: the focus of the source of agglomeration economies implicit in the construction of each measure, and the difference in how each measure controls for urbanisation.

The EG measure focuses on employees as the source of agglomeration economies while the XCL index focuses on firms/entrepreneurs. One can argue the case for either, but we contend that in a developing country context, where productivity spillovers or technology transfers are of crucial importance, the firm is the more appropriate focus. Technology spillovers need not necessarily depend on the number of employees but could take place between many small firms that form a cluster. In fact, in developing countries, firms that are more technologically advanced may have fewer employees. Large firms tend to be labour intensive simply because they use low levels of technology and employ large amounts of unskilled labour. Therefore the EG measure, by construction, will not capture the role of technology spillovers on the coagglomeration choices of these firms. Additionally, in controlling for urbanisation in each area separately, rather than controlling for the overall distribution of economic activity, the EG measure may overweight clusters in rural areas. The XCL index overcomes this issue by controlling for the overall distribution of firms.

Understanding the driving forces behind agglomeration is critical for governments in the formulation of industrial policy. We have shown that the definition and measurement of agglomeration are crucial to analytical outcomes when agglomeration is linked to underlying driving forces. The EG index of coagglomeration captures correlations in the relative size of two industries, in terms of employment, across areas. While this may explain an important part of the agglomeration story, in a developing country context, entrepreneurs are more likely to be the source of agglomeration economies. In this setting, the EG index may underestimate the relative importance of some clusters. This is certainly the case in Vietnam. It is left for future research to establish the extent to which the differences we observe reflect underlying characteristics of developed compared with developing country contexts or are embedded in the way in which agglomeration is defined and measured.

1 See Helsley and Strange (1990) . Another interpretation is that there is a risk sharing aspect to a large pool of labour and therefore labour market pooling makes workers and firms better off when firms face idiosyncratic demand shocks ( Krugman, 1991 ; Overman and Puga, 2010 ).

2 It is also possible that agglomeration has negative effects. Economic concentrations can lead to spatial inequalities. For example, Lee and Rodriguez-Pose (2012) find a link between innovation centres and inequality in European regions. There is evidence from many developing countries that spatial inequality is not only occurring, but increasing over time ( Felkner and Townsend, 2011 ).

3 See Pack and Saggi (2006) for a full discussion of industrial policy in developing countries.

4 Chatterji et al. (2013) provide an overview of the literature on the spatial concentration of entrepreneurship and the existing evidence for associated knowledge spillovers focussing on the USA. For a theoretical model which specifically links knowledge spillovers to entrepreneurship see Acs et al. (2009) .

5 A third possible caveat to using the EG index is that it is a model-based measure of coagglomeration, and the underlying model assumes that industries that can benefit from agglomerating will coagglomerate. This assumption is challenging in countries like Vietnam where there is a high concentration of state ownership of firms, and where there are restrictions on where foreign-owned firms can locate.

6 The majority of workers in Vietnam are low skilled; only 6% of workers have achieved higher than an upper secondary school diploma. Moreover, in our data the mean number of employees in relatively high-tech firms is 92, compared with 143 employees in low-tech firms.

7 For early empirical work see, for example, Kim (1999) , Ellison and Glaeser (1999) , and Knarvik and Steen (1999) and for more recent studies see Devereux et al. (2004) , Greenstone et al. (2010) , Overman and Puga (2010) and, most related to the current paper, Ellison et al. (2010) .

8 This focus is motivated by recent empirical literature which suggests that competition effects can reduce firm profitability in dense clusters of economic activity and that these losses outweigh productivity gains at the firm-level ( Siba et al., 2012 ; Chhair and Newman, 2014 ). Coagglomeration of different industries is therefore expected to be a more important source of agglomeration externalities.

9 See Ellison and Glaeser (1997) for a full derivation of the model and the EG index.

10 We also consider an absolute version of the CL index that does not control for the total number of firms in industries A and B. This absolute index therefore takes larger values for larger industries. Results of the analysis using the absolute measure are similar and available on request.

11 The fact that we only have a sample (albeit representative) of small firms in our data is a limitation. In particular, it may exacerbate the difference between the EG and CL/XCL indices. Small firms, however, account for a small proportion of overall manufacturing output so this feature of the data is unlikely to explain a large part of the difference between the indices. We discuss this limitation further in the empirical results section.

12 The full set of industries is listed in the Appendix. It should be noted that while it is possible to use our data to construct coagglomeration indices for four-digit ISIC industry pairings, we are constrained by the level of industry disaggregation available for the other variables used in our analysis. For this reason, we aggregate the four-digit industry codes into a common set of industry codes that are available for all measures. The level of aggregation lies somewhere between the four-digit and two-digit level.

13 The results of the analysis with areas of very different size are consistent with those presented here. We are therefore confident that our results are not purely driven by the choice of geographic area.

14 The VHLSS contains information on a representative sample of workers, including the two-digit sector they work in and whether or not this is in a rural area. Some mapping is required to translate the two-digit sectors into our sector codes given in the Appendix. This mapping is described in footnote 15. We calculate the proportion of workers in a sector that are in a rural location and take the top 10 sectors by this measure. These are sectors 39, 19, 14, 18, 30, 31, 32, 16, 15 and 43. The percentage of rural employment in these sectors ranges from 30% to 50%.

15 Some mapping is required to match the two-digit industry information in the VHLSS with the four-digit industry information provided in the Enterprise Survey. The two-digit information provided in the VHLSS is apportioned to four-digit industries using the proportion of two-digit employment employed by that four-digit industry. For example, suppose the VHLSS indicates that 20% of all engineers are employed by two-digit industry 10, and there are two four-digit industries within this two-digit industry: 1010 and 1020. The Enterprise Survey tells us the number of workers employed by each industry. Let us say industry 1010 employs 40% of all industry 10 workers, so industry 1020 employs 60%. The four-digit industry shares of employment of engineers are then 8% for industry 1010 and 12% for industry 1020.
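
The apportioning rule described in this footnote can be sketched in a few lines of Python. This is only an illustration using the footnote's own numbers; the function name `apportion` and the dictionary layout are assumptions made for the example, not the code used in the analysis.

```python
def apportion(two_digit_share, four_digit_employment_shares):
    """Split a two-digit industry's share of an occupation across its
    four-digit industries in proportion to their employment shares."""
    total = sum(four_digit_employment_shares.values())
    return {
        code: two_digit_share * share / total
        for code, share in four_digit_employment_shares.items()
    }


# Numbers from the footnote: industry 10 employs 20% of all engineers;
# industries 1010 and 1020 hold 40% and 60% of industry 10 employment.
shares = apportion(0.20, {"1010": 0.40, "1020": 0.60})
print({code: round(value, 2) for code, value in shares.items()})
```

The four-digit shares sum back to the two-digit share (8% + 12% = 20%), so the apportioning preserves the information reported in the VHLSS.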

16 Grades 1–5 are levels in primary school, 6–9 are levels in lower secondary and 10–12 are levels in upper secondary. The mean grade completed is 7.3 with a standard deviation of 3.6. The VHLSS also contains a question on the highest diploma obtained; however, for 94% of workers this is an upper secondary school diploma or lower. There is therefore much more variation and information in the highest grade completed measure, so it is superior for measuring the education level of workers.

17 We define experience = 1 if workers have 1 year of experience or less; 2 if they have between 2 and 5 years of experience; 3 if they have between 6 and 10 years of experience; 4 if they have between 11 and 20 years of experience; 5 if they have between 21 and 30 years of experience; and 6 if they have more than 30 years of experience. When we classify experience according to this scale (1–6) the mean experience level is 3.2 and the standard deviation is 1.4.
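
As a rough sketch, the six-point experience scale can be written as a simple lookup. The function name `experience_category` is hypothetical, and the code assumes experience is recorded in whole years; a fractional value between two bands is assigned to the higher band here, which is an assumption and not something the footnote specifies.

```python
def experience_category(years):
    """Map years of experience to the 1-6 scale described in the footnote.

    Assumes whole years; upper bounds of each band are inclusive.
    """
    if years <= 1:
        return 1
    if years <= 5:
        return 2
    if years <= 10:
        return 3
    if years <= 20:
        return 4
    if years <= 30:
        return 5
    return 6


categories = [experience_category(y) for y in (0, 1, 3, 8, 15, 25, 40)]
print(categories)
```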

18 We also make the technology transfer variables symmetric by taking the maximum of the two asymmetric measures (AB and BA). We then combine the two technology transfer variables (from suppliers and from customers) by taking the maximum of the two variables for each industry pairing AB. We obtain very similar results when using this alternative measure.
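
The two maximum operations in this footnote can be sketched as follows. The function names and all numerical values are hypothetical and serve only to illustrate the order of operations: first symmetrise each directed measure over the pairings AB and BA, then take the maximum across the supplier and customer measures.

```python
def symmetrise(directed):
    """Collapse directed pair values (A, B) and (B, A) into one value per
    unordered pair by taking the maximum of the two."""
    symmetric = {}
    for (a, b), value in directed.items():
        key = tuple(sorted((a, b)))
        symmetric[key] = max(value, symmetric.get(key, float("-inf")))
    return symmetric


def combine(first, second):
    """Take the pairwise maximum of two symmetric measures."""
    keys = set(first) | set(second)
    return {
        key: max(first.get(key, float("-inf")), second.get(key, float("-inf")))
        for key in keys
    }


# Hypothetical technology transfer values for one industry pairing.
from_suppliers = {("A", "B"): 0.10, ("B", "A"): 0.30}
from_customers = {("A", "B"): 0.25, ("B", "A"): 0.05}

combined = combine(symmetrise(from_suppliers), symmetrise(from_customers))
print(combined)
```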

19 Ellison et al. (2010) find a maximum of 0.823 for their measure of input–output linkages between three-digit industry pairs in the USA.

20 This model is estimated using OLS. We also estimate a generalized linear model which imposes a logistic distribution on the dependent variable. The results are unaffected.

21 We run the analysis using each of the three relative input–output measures and the absolute input–output measure and find no significant effect on coagglomeration.

Acknowledgements

We are most grateful to two anonymous referees and the editor for constructive critique and helpful comments. We are grateful as well to colleagues at the Central Institute of Economic Management (CIEM), Hanoi, Vietnam for collaboration on the issues addressed here for more than a decade. We also acknowledge comments and critique from participants at various seminars and conferences. The same goes for collaborators in the UNU-WIDER/Brookings ‘Learning to compete (L2C)’ project that helped shape our analysis and sharpen our research focus. We would finally like to thank Chris Adam, Arne Bigsten, Justin Lin, Howard Pack, John Page, John Rand, Måns Söderbom and Abebe Shimeless for comments and encouragement. The usual caveats apply.

References

Acs, Z. J., Braunerhjelm, P., Audretsch, D. B., Carlsson, B. (2009) The knowledge spillover theory of entrepreneurship. Small Business Economics, 32: 15–30.

Audretsch, D. B., Feldman, M. P. (1996) R&D spillovers and the geography of innovation and production. American Economic Review, 86: 630–640.

Bigsten, A., Gebreeyesus, M., Siba, E., Söderbom, M. (2011) The effects of agglomeration and competition on prices and productivity: evidence for Ethiopia’s manufacturing industry. Mimeo, University of Gothenburg.

Billings, S. B., Johnson, E. B. (2015) Measuring agglomeration: which estimator should we use? Unpublished.

Chatterji, A., Glaeser, E., Kerr, W. (2013) Clusters of entrepreneurship and innovation. NBER Working Paper Number 19013.

Chhair, S., Newman, C. (2014) Clustering, productivity and spillover effects: evidence from Cambodia. UNU-WIDER Working Paper Number WP2014-065.

Collier, P., Page, J. (2009) Industrial Development Report 2009: Breaking in and Moving Up – New Industrial Challenges for the Bottom Billion and the Middle Income Countries. Vienna: UNIDO.

Deichmann, U., Lall, V. S., Redding, S. J., Venables, A. J. (2008) Industrial location in developing countries. World Bank Research Observer, 23: 219–246.

Devereux, M., Griffith, R., Simpson, H. (2004) The geographic distribution of production activity in the UK. Regional Science and Urban Economics, 34: 533–564.

Duranton, G., Overman, H. G. (2005) Testing for localization using micro-geographic data. Review of Economic Studies, 72: 1077–1106.

Ellison, G., Glaeser, E. L. (1997) Geographic concentration in U.S. manufacturing industries: a dartboard approach. Journal of Political Economy, 105: 889–927.

Ellison, G., Glaeser, E. L. (1999) The geographic concentration of industry: does natural advantage explain agglomeration? American Economic Review, 89: 311–316.

Ellison, G., Glaeser, E. L., Kerr, W. R. (2010) What causes industry agglomeration? Evidence from coagglomeration patterns. American Economic Review, 100: 1195–1213.

Faggio, G., Silva, O., Strange, W. C. (2014) Heterogeneous agglomeration. SERC Discussion Papers. SERCDP0152.

Felkner, J., Townsend, R. M. (2011) The geographic concentration of enterprise in developing countries. Quarterly Journal of Economics, 126: 2005–2061.

Fujita, M., Krugman, P. R., Venables, A. J. (1999) The Spatial Economy: Cities, Regions and International Trade. Cambridge, MA: MIT Press.

Fujita, M., Ogawa, H. (1982) Multiple equilibria and structural transition of non-monocentric urban configurations. Regional Science and Urban Economics, 12: 161–196.

Glaeser, E., Kerr, W., Ponzetto, G. (2010) Clusters of entrepreneurship. Journal of Urban Economics, 67: 150–168.

Gorman, S. P., Kulkarni, R. (2004) Spatial small worlds: new geographic patterns for an information economy. Environment and Planning B: Planning and Design, 31: 273–296.

Greenstone, M., Hornbeck, R., Moretti, E. (2010) Identifying agglomeration spillovers: evidence from winners and losers of large plant openings. Journal of Political Economy, 118: 536–598.

Helsley, R. W., Strange, W. C. (1990) Matching and agglomeration economies in a system of cities. Regional Science and Urban Economics, 20: 189–212.

Henderson, J. V. (2003) Marshall’s scale economies. Journal of Urban Economics, 53: 1–28.

Howard, E., Newman, C., Thijssen, J. (2011) Are spatial networks of firms random? Evidence from Vietnam. UNU-WIDER Working Paper No. 87.

Huang, Y., Bocchi, A. M. (2009) Lessons from experience: reshaping economic geography in East Asia. In Huang, Y., Bocchi, A. M. (eds) Reshaping Economic Geography in East Asia. Washington, DC: World Bank, EAP Companion Volume to the WDR 2009.

Jaffe, A. B. (1986) Technological opportunity and spillovers of R&D: evidence from firms' patents, profits, and market value. American Economic Review, 76: 984–1001.

Kim, S. (1999) Regions, resources and economic geography: sources of U.S. regional comparative advantage, 1880–1987. Regional Science and Urban Economics, 29: 1–32.

Knarvik, K. H. M., Steen, F. (1999) Self-reinforcing agglomerations? An empirical industry study. Scandinavian Journal of Economics, 101: 515–532.

Krugman, P. R. (1991) Increasing returns and economic geography. Journal of Political Economy, 99: 483–499.

Lee, N., Rodríguez-Pose, A. (2012) Innovation and spatial inequality in Europe and USA. Journal of Economic Geography, 13: 1–22.

Marshall, A. (1920) Principles of Economics. London: Macmillan and Co. Ltd.

Overman, H. G., Puga, D. (2010) Labour pooling as a source of agglomeration: an empirical investigation. In Glaeser, E. L. (ed.) Agglomeration Economics. Chicago, IL: University of Chicago Press.

Pack, H., Saggi, K. (2006) The case for industrial policy: a critical survey. Policy Research Working Paper Series 3839: The World Bank.

Pagan, A. (1984) Econometric issues in the analysis of regressions with generated regressors. International Economic Review, 25: 221–247.

Rosenthal, S., Strange, W. (2010) Small establishments/big effects: agglomeration, industrial organization and entrepreneurship. In Glaeser, E. (ed.) Agglomeration Economics. Chicago, IL: University of Chicago Press, pp. 277–302.

Son, D. K. (2009) Rural development and issues in Vietnam: spatial disparities and some recommendations. In Huang, Y., Bocchi, A. M. (eds) Reshaping Economic Geography in East Asia. Washington, DC: World Bank, EAP Companion Volume to the WDR 2009.

Siba, E., Söderbom, M., Bigsten, A., Gebreeyesus, M. (2012) Enterprise agglomeration, output prices and physical productivity: firm-level evidence from Ethiopia. Mimeo, University of Gothenburg.

Appendix

Numerical example of construction of CL index

Suppose there are six firms in total, two industries and two locations, as per the table below.

| Firm | Industry | Location |
| --- | --- | --- |
| 1 | A | X |
| 2 | B | X |
| 3 | A | X |
| 4 | B | Y |
| 5 | A | Y |
| 6 | B | X |

The aim of the CL index is to measure the extent to which firms from industries A and B locate in the same area.

First, take industry A. The index begins with the first firm in industry A, firm 1, and checks how many of the firms in industry B are located in the same area as firm 1.

Recall that $C_{ij} = 1$ if firms $i$ and $j$ are in the same area, and zero otherwise. This implies that $C_{12} = 1$, $C_{14} = 0$, $C_{16} = 1$, and the sum $\sum C_{ij}$ for $i = 1$ is 2.

The index then moves to the next firm in industry A, firm 3, and checks how many of the firms in industry B are located in the same area as firm 3, and so $C_{32} = 1$, $C_{34} = 0$, $C_{36} = 1$, and the sum $\sum C_{ij}$ for $i = 3$ is 2.

The index then moves to the last firm in industry A, firm 5, and checks how many of the firms in industry B are located in the same area as firm 5. The appropriate entries into the index will be $C_{52} = 0$, $C_{54} = 1$, $C_{56} = 0$, and the sum $\sum C_{ij}$ for $i = 5$ is 1.

The total sum is $\sum C_{ij} = 5$.

The index controls for the size of both industries (each industry has 3 firms) by dividing by the total possible number of pairings, $3 \times 3 = 9$, so the CL index for industries A and B in this example is $5/9$, or approximately 0.56.

We define this colocation value as $CL_{AB}$, but $CL_{BA}$ would give us an identical result. $C_{ij}$ must equal $C_{ji}$; the relationship is symmetric. If firm $i$ is located in the same area as firm $j$ then firm $j$ must also be located in the same area as firm $i$.

Additionally, note that if all the firms from both industries were located in the same area, then the total sum $\sum C_{ij}$ would equal 9, and so the value of the CL index would be 1.
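
The numerical example can be reproduced with a short Python sketch. The function name `cl_index` and the tuple layout for firms are assumptions made for illustration; the sketch covers only the case of two distinct, non-empty industries described in this example.

```python
from itertools import product

# (firm id, industry, location) rows from the example table.
firms = [
    (1, "A", "X"),
    (2, "B", "X"),
    (3, "A", "X"),
    (4, "B", "Y"),
    (5, "A", "Y"),
    (6, "B", "X"),
]


def cl_index(firms, industry_a, industry_b):
    """Share of cross-industry firm pairs located in the same area."""
    locations_a = [loc for _, ind, loc in firms if ind == industry_a]
    locations_b = [loc for _, ind, loc in firms if ind == industry_b]
    # C_ij = 1 when firm i (industry A) and firm j (industry B) share an area.
    colocated = sum(
        1 for loc_i, loc_j in product(locations_a, locations_b) if loc_i == loc_j
    )
    return colocated / (len(locations_a) * len(locations_b))


cl_ab = cl_index(firms, "A", "B")
print(round(cl_ab, 2))
```

Swapping the industry arguments gives the same value, reflecting the symmetry of the colocation relationship, and placing every firm in one area returns 1.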

Description of manufacturing industry codes

  1. Production, processing and preserving of meat and meat products.

  2. Processing and preserving of fish and fish products.

  3. Processing and preserving of fruit and vegetables.

  4. Manufacture of vegetable and animal oils and fats.

  5. Manufacture of milk and dairy products.

  6. Processing of rice and flour.

  7. Other food manufacturing.

  8. Manufacture of prepared feeds for animals.

  9. Manufacture of cakes, jams, candy, cocoa, chocolate products.

  10. Manufacture of sugar.

  11. Manufacture of alcohol and liquors.

  12. Manufacture of beer.

  13. Manufacture of alcohol-free beverages e.g. soft drinks, mineral waters.

  14. Manufacture of cigarettes and other tobacco products.

  15. Manufacture of fibre (all kinds).

  16. Manufacture of textile products (all kinds).

  17. Manufacture of ready-made apparel (all kinds).

  18. Manufacture of leather and leather products.

  19. Manufacture of wood and by-products.

  20. Manufacture of pulp, paper and by-products.

  21. Publishing.

  22. Printing.

  23. Manufacture of coke, coal and other by-products.

  24. Manufacture of gasoline and lubricants.

  25. Manufacture of fertilizers.

  26. Manufacture of other chemical products.

  27. Manufacture of pharmaceuticals, medicinal chemicals and botanical products.

  28. Manufacture of processed rubber and by-products.

  29. Manufacture of plastic and by-products.

  30. Manufacture of glass and by-products.

  31. Manufacture of other non-metallic mineral products.

  32. Manufacture of cement and cement products.

  33. Manufacture of metal and metal products.

  34. Manufacture of general purpose machinery.

  35. Manufacture of special purpose machinery.

  36. Manufacture of domestic appliances.

  37. Manufacture of electrical machinery.

  38. Manufacture of electrical equipment.

  39. Manufacture of machinery used for broadcasting, television and information activities.

  40. Manufacture of medical and surgical equipment.

  41. Manufacture of precision and optical equipment.

  42. Manufacture of transportation machinery and equipment.

  43. Manufacture of other goods.

Results for other levels of geographic aggregation

Table A1.

Determinants of coagglomeration: communes and province

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | Commune level | | | | | | Province level | | | | | |
| EMP_XCL | | | | | | | | | | | | |
| Input–output maximum | | 0.058 | | | 0.020 | 0.021 | | 0.012 | | | −0.026 | −0.026 |
| | | (0.036) | | | (0.040) | (0.045) | | (0.029) | | | (0.033) | (0.038) |
| Technology transfer | | | 0.088** | | 0.078** | 0.074** | | | 0.065 | | 0.078* | 0.078** |
| | | | (0.039) | | (0.036) | (0.038) | | | (0.042) | | (0.046) | (0.035) |
| Skills correlation | | | | 0.074** | | 0.069** | | | | 0.005 | | 0.002 |
| | | | | (0.030) | | (0.034) | | | | (0.029) | | (0.032) |
| R² | | 0.003 | 0.016 | 0.011 | 0.008 | 0.013 | | 0.000 | 0.004 | 0.000 | 0.005 | 0.005 |
| EG by firms | | | | | | | | | | | | |
| Natural advantage | 0.176* | | | | 0.177* | 0.166 | −0.042 | | | | −0.040 | −0.042 |
| | (0.100) | | | | (0.097) | (0.105) | (0.034) | | | | (0.034) | (0.040) |
| Input–output maximum | | −0.003 | | | −0.013 | −0.012 | | 0.019 | | | 0.035 | 0.035 |
| | | (0.027) | | | (0.026) | (0.028) | | (0.029) | | | (0.030) | (0.032) |
| Technology transfer | | | 0.007 | | 0.006 | −0.000 | | | −0.017 | | −0.034 | −0.036* |
| | | | (0.016) | | (0.019) | (0.022) | | | (0.017) | | (0.024) | (0.021) |
| Skills correlation | | | | 0.125*** | | 0.109*** | | | | 0.046 | | 0.049 |
| | | | | (0.045) | | (0.034) | | | | (0.045) | | (0.043) |
| R² | 0.031 | 0.000 | 0.000 | 0.015 | 0.031 | 0.043 | 0.001 | 0.000 | 0.000 | 0.002 | 0.003 | 0.054 |

Note: These results are comparable to those presented in Table 3. Each model is estimated using 903 observations. Bootstrapped standard errors are presented in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1. Variables are transformed to have unit standard deviation for ease of interpretation.

Table A2.

Determinants of coagglomeration: EMP_XCL and EG by firms measures—communes and provinces

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | Commune level | | | | | | Province level | | | | | |
| EMP_XCL | | | | | | | | | | | | |
| Input–output maximum | | 0.054** | | | 0.056* | 0.056** | | 0.009 | | | 0.011 | 0.011 |
| | | (0.023) | | | (0.031) | (0.028) | | (0.030) | | | (0.036) | (0.038) |
| Technology transfer | | | 0.023 | | −0.005 | −0.006 | | | 0.001 | | −0.004 | −0.005 |
| | | | (0.036) | | (0.030) | (0.036) | | | (0.035) | | (0.036) | (0.031) |
| Skills correlation | | | | 0.035 | | 0.034 | | | | 0.008 | | 0.008 |
| | | | | (0.040) | | (0.029) | | | | (0.033) | | (0.031) |
| R² | | 0.002 | 0.001 | 0.001 | 0.003 | 0.004 | | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| EG by firms | | | | | | | | | | | | |
| Natural advantages | 0.722*** | | | | 0.721*** | 0.661*** | 0.040 | | | | 0.040 | −0.002 |
| | (0.037) | | | | (0.035) | (0.029) | (0.029) | | | | (0.031) | (0.036) |
| Input–output maximum | | 0.041 | | | 0.021 | 0.022 | | −0.031 | | | −0.050** | −0.048** |
| | | (0.034) | | | (0.025) | (0.024) | | (0.024) | | | (0.020) | (0.024) |
| Technology transfer | | | 0.065** | | 0.007 | 0.001 | | | 0.016 | | 0.038 | 0.027 |
| | | | (0.025) | | (0.019) | (0.025) | | | (0.031) | | (0.0229) | (0.024) |
| Skills correlation | | | | 0.397*** | | 0.189*** | | | | 0.231*** | | 0.231*** |
| | | | | (0.037) | | (0.025) | | | | (0.031) | | (0.031) |
| R² | 0.521 | 0.002 | 0.004 | 0.159 | 0.521 | 0.554 | 0.002 | 0.001 | 0.000 | 0.054 | 0.004 | 0.056 |

Note: These results are comparable to those presented in Table 4. Each model is estimated using 903 observations. Bootstrapped standard errors are presented in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1. Variables are transformed to have unit standard deviation for ease of interpretation.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/4.0/ ), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]