Bogumił Kamiński, Paweł Misiorek, Paweł Prałat, François Théberge, Modularity based community detection in hypergraphs, Journal of Complex Networks, Volume 12, Issue 5, October 2024, cnae041, https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/comnet/cnae041
Abstract
In this paper, we propose h–Louvain, a scalable community detection algorithm based on a hypergraph modularity function. It is an adaptation of the classical Louvain algorithm to the context of hypergraphs. We observe that a direct application of the Louvain algorithm to optimize a hypergraph modularity function often fails to find meaningful communities. We propose a solution to this issue by adjusting the initial stage of the algorithm via a carefully and dynamically tuned linear combination of the graph modularity function of the corresponding 2-section graph and the desired hypergraph modularity function. The process is guided by Bayesian optimization of the hyper-parameters of the proposed procedure. Experiments on synthetic as well as real-world networks show that this process yields improved results in various regimes.
1 INTRODUCTION
Many networks that are currently modelled as graphs would be more accurately modelled as hypergraphs. One example is the collaboration network [1], in which nodes correspond to researchers and hyperedges correspond to papers, each consisting of the nodes associated with the researchers co-authoring that paper. Social events may include more than two people, which is not equivalent to pairwise social interactions among all pairs of participants. Hypergraphs have shown promise in modelling systems such as protein complexes and metabolic reactions [2]. Other natural examples are co-purchase hypergraphs, and there are plenty of other real-world hypergraphs.
After many years of intense research using graph theory to model and mine complex networks [3–6], hypergraphs have started gaining considerable traction [1, 7–9]. A lot of higher-order network data has been collected in recent years (see, e.g. [1]). It has become clear to both researchers and practitioners that dyadic relationships are insufficient in many real-world scenarios. Higher-order network analysis, using the ideas of hypergraphs, simplicial complexes, multilinear and tensor algebra, and more, is needed to study complex systems and to make an impact across many important applications [8, 10–12]. Indeed, the inherent expressiveness of hypergraphs has led to their application across a diverse range of fields such as recommendation systems [13], computer vision [14], natural language processing [15], social network analysis [16], financial analysis [17], bioinformatics [2], and circuit design [18]. Standard but important questions in network science are currently being revisited in the context of hypergraphs. However, hypergraphs also create brand new questions which have no counterparts for graphs. For example, how do hyperedges overlap in empirical hypergraphs [19]? Or how do the existing patterns in a hypergraph affect the formation of new hyperedges [20]?
In this paper, we concentrate on the classical problem of community detection in networks that can be represented using hypergraphs [21–30]. Community detection is a challenging, NP-hard problem even for graphs [31–33], so obtaining an optimal solution is computationally infeasible even for small networks represented as graphs. Dealing with hypergraphs is clearly much more difficult, so, despite the vivid ongoing discussion around hypergraphs, the theory and tools are still not sufficiently developed to tackle this problem directly in this context. Indeed, for lack of proper solutions for hypergraphs, researchers and practitioners often create the 2-section graph of the hypergraph of interest (i.e. replace each hyperedge with a clique, a process also known as clique expansion). Given the 2-section graph representation, one can directly apply a graph clustering algorithm such as Louvain [34] or Leiden [35]. Another approach is to perform agglomerative clustering via some definition of distance between nodes, such as the derivative graph defined in Contreras-Aso et al. [36], and then select the partition that maximizes the 2-section graph modularity. However, with the 2-section graph, one clearly loses some information about hyperedges of size greater than two. In the experiments presented in Section 5, we use the Louvain algorithm on the 2-section graph representations as our baseline for comparison with hypergraph-based algorithms.
As mentioned earlier, there are some recent attempts to deal with hypergraphs in the context of clustering. For example, Kumar et al. [27, 28] still reduce the problem to graphs but use the original hypergraph to iteratively adjust edge weights, encouraging some hyperedges to be included in a cluster while discouraging others (this process can be viewed as separating signal from noise). In Chodrow et al. [24], a hypergraph stochastic block model is defined, leading to a Louvain-type clustering algorithm, in particular for the ‘all or nothing’ (AON) regime, where a hyperedge must have all of its nodes in the same community to improve the objective function. We provide more details about these two algorithms at the beginning of Section 5.
Many of the successful graph clustering algorithms use the modularity function to benchmark partitions and guide the associated optimization heuristics. Two widely used algorithms from this family are the Louvain and Leiden algorithms mentioned earlier. Building on this spectacular success, a number of extensions of the classical graph modularity function to hypergraphs have been proposed [25, 26] that can potentially be used by true hypergraph algorithms. In this paper, we concentrate on this approach.
Unfortunately, there are many ways such an extension of the modularity function to hypergraphs can be defined, depending on how often nodes in one community are expected to share hyperedges with nodes from other communities. We believe that the underlying process governing the purity of community hyperedges varies between networks and potentially also depends on the hyperedge sizes. Let us come back to the collaboration network discussed earlier. Hyperedges associated with papers written by mathematicians might be more homogeneous and smaller than those written by medical doctors, who tend to work in large and multidisciplinary teams. Moreover, in general, papers with a large number of co-authors tend to be less homogeneous, and other patterns can be identified [20]. The algorithm we propose in this paper, h–Louvain, is flexible and can use any such hypergraph modularity function. In other words, there is no unique way of extending the concept of modularity from graphs to hypergraphs. For this reason, we consider a family of such extensions parametrized by the user’s preference for homogeneity of within-community hyperedges. At the same time, we recognize that there can be situations in which it is not clear to the user what homogeneity level is desired. Therefore, in Section 5.1 we provide some suggestions to help the user make the right choice.
A significant challenge in optimizing modularity functions is that these objective functions have their domains defined over all partitions of the set of nodes and are known to be extremely difficult to optimize. As already mentioned, one of the most popular and efficient heuristic methods for modularity optimization for graphs is the Louvain algorithm [34]. In this paper, we show how this algorithm can be adapted to optimize hypergraph modularity. One of the main challenges is that, when hyperedges of size two (edges) or three are absent from the hypergraph, the Louvain algorithm immediately gets stuck in a local optimum. Moreover, even if a few hyperedges of size two or three are present, the algorithm may still get stuck almost immediately and yield a solution that is heavily biased toward small edges. Hence, in such situations, one cannot simply start optimizing the hypergraph modularity right from the beginning. More importantly, we observe that even if hyperedges of size two are present in the hypergraph, the algorithm often converges to a local optimum of low quality. In order to address these two problems, we propose a method that works well in practice, in which we optimize a weighted average of the 2-section graph modularity function and the hypergraph modularity function. To this end, we adjust the Louvain algorithm so that the weight of the hypergraph modularity function increases during the optimization process. The pace of this weight change is governed by two hyper-parameters of the procedure, which we tune using Bayesian optimization.
The paper is structured as follows. We first introduce the necessary notation; in particular, we state the definitions of the graph and hypergraph modularity functions (Section 2). The synthetic and real-world hypergraphs used in our experiments are introduced in Section 3. Section 4 is devoted to explaining the details of the proposed algorithm, h–Louvain. First, we discuss the classical Louvain algorithm for graphs (Section 4.1) and explain why it is difficult to adjust it to directly optimize hypergraph modularity (Section 4.2). Following this, we describe our solution, which considers a linear combination of the 2-section graph modularity and the hypergraph modularity as the objective function (Section 4.3), and explain its implementation challenges (Section 4.4). In particular, the main challenge is to tune the two hyper-parameters responsible for the speed of convergence to the hypergraph modularity function. To find a ‘sweet spot’ in an unsupervised way, Bayesian optimization is used (Section 4.5). Section 5 highlights the results of numerical experiments with the proposed algorithm on synthetic hypergraphs (Sections 5.2 and 5.3) as well as real-world hypergraphs (Section 5.4). We also highlight important implications of the choice of the modularity function to optimize (Section 5.1). The paper concludes with a summary and an outlook on further research in this area (Section 6).
Finally, let us mention that this paper is an extended, journal version of the short proceedings paper [37], which contained some preliminary experiments with a much simpler algorithm. The algorithm, as well as notebooks containing all experiments included in this paper, can be found in the GitHub repository.1
2 MODULARITY FUNCTIONS
Let us start with some basic definitions. In a hypergraph |$H = (V, E)$|, each hyperedge |$e \in E$| is a multiset of V of any cardinality |$d \in {\mathbb{N}}$|, called its size. Multisets in the context of hypergraphs are a natural generalization of loops in the context of graphs. Hypergraphs are, in turn, a natural generalization of graphs, in which each edge is a multiset of size two. Even though H does not always contain multisets, it is convenient to allow them, as they may appear in the random hypergraph that will be used as the null model to ‘benchmark’ the edge contribution component of the modularity function. It will be convenient to partition the hyperedge set E into |$\{ E_1 , E_2 , \ldots \}$|, where Ed consists of hyperedges of size d. As a result, hypergraph H can be expressed as the disjoint union of d-uniform hypergraphs |$H = \bigcup H_d$|, where |$H_d = (V, E_d )$|. As for graphs, |$\deg _H (v)$| is the degree of node v, i.e. the number of hyperedges v is a part of (taking into account the fact that hyperedges are multisets). Finally, the volume of a subset of nodes |$A \subseteq V$| is |${\rm vol}_H (A) = \sum_{v \in A} \deg_H (v)$|.
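To make these definitions concrete, here is a minimal Python sketch (our own illustration, not code from the paper’s repository) of computing degrees and volumes for a hypergraph stored as a list of hyperedges:

```python
from collections import Counter

# A hypergraph as a list of hyperedges; each hyperedge is a tuple of nodes
# (repetitions allowed, since hyperedges are multisets).
E = [(1, 2, 3), (2, 3, 4, 5), (1, 2), (3, 3, 5)]

def degrees(E):
    """deg_H(v): number of hyperedges containing v, counted with multiplicity."""
    deg = Counter()
    for e in E:
        deg.update(e)  # a node appearing twice in e contributes 2 to its degree
    return deg

def volume(E, A):
    """vol_H(A) = sum of deg_H(v) over v in A."""
    deg = degrees(E)
    return sum(deg[v] for v in A)

print(degrees(E)[3], volume(E, {1, 2}))  # degree of node 3 is 4; vol({1, 2}) = 5
```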
2.1 Graph modularity
The definition of modularity for graphs was first introduced by Newman and Girvan in [38]. Despite some known issues with this function, such as the ‘resolution limit’ reported in [39], many popular algorithms for partitioning nodes of large graphs use it [40–42] and perform very well. The two prominent ones from this family are Louvain [34] and Leiden [35]. The modularity function favours partitions of the set of nodes of a graph G in which a large proportion of the edges fall entirely within the parts (often called clusters), but benchmarks this against the expected number of edges one would see in those parts in the corresponding Chung-Lu random graph model [43], which generates random graphs whose expected degree sequence follows exactly the degree sequence of G.
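For reference, in the notation above, the modularity of a partition |${\bf A}$| of V takes the standard Newman–Girvan form (a reconstruction consistent with the edge-contribution and degree-tax terminology used throughout):

$$q_G(\mathbf{A}) = \sum_{A \in \mathbf{A}} \left( \frac{e(A)}{|E|} - \left( \frac{\mathrm{vol}_G(A)}{\mathrm{vol}_G(V)} \right)^2 \right), \tag{1}$$

where e(A) denotes the number of edges with both endpoints in A; the first term is the edge contribution and the second is the degree tax.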
The maximum modularity |$q^* (G)$| is defined as the maximum of |$q_G ({\bf A})$| over all possible partitions A of V; i.e. |$q^* (G) = \max _{\bf A} q_G ({\bf A})$|. In order to maximize |$q_G ({\bf A})$|, one wants to find a partition with a large edge contribution subject to a small degree tax. If |$q^* (G)$| approaches 1 (the trivial upper bound), we observe a strong community structure; conversely, if |$q^* (G)$| is close to zero (the trivial lower bound), there is no community structure. The definition in (1) can be generalized to weighted edges (with weight function |$w:E \to {\mathbb{R}}_ +$|) by replacing edge counts with sums of the corresponding edge weights.
2.2 Using graph modularity for hypergraphs
Given a hypergraph |$H = (V, E)$|, it is common to transform its hyperedges into complete graphs (cliques), a process known as forming the 2-section of H, or clique expansion: the graph |$H_{[2]}$| on the same set of nodes as H. For each hyperedge |$e \in E$| with |$|e| \ge 2$| and weight w(e), |$\binom{|e|}{2}$| edges are formed, each of them with weight |$w(e)/\binom{|e|}{2}$|. This choice preserves the total weight. There are other natural choices for the weights, e.g. the scheme assigning weight |$w(e)/(|e| - 1)$|, which ensures that the degree distribution of the created graph matches that of the original hypergraph H [27, 28]. As hyperedges in H usually overlap, this process creates a multigraph. In order for |$H_{[2]}$| to be a simple graph, if the same pair of vertices appears in multiple hyperedges, the corresponding edge weights are summed.
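As an illustration, here is a minimal sketch of this construction in Python (using networkx; the total-weight-preserving variant, with our own function name):

```python
import networkx as nx
from itertools import combinations
from math import comb

def two_section(hyperedges, weights=None):
    """Minimal sketch: build the weighted 2-section graph H_[2] from a list of
    hyperedges. Each hyperedge e of weight w(e) spreads w(e)/C(|e|,2) onto every
    pair of its distinct nodes; weights of parallel edges are summed, so the
    result is a simple weighted graph."""
    G = nx.Graph()
    for i, e in enumerate(hyperedges):
        w = 1.0 if weights is None else weights[i]
        nodes = sorted(set(e))  # treat e as a set of distinct nodes here
        if len(nodes) < 2:
            continue
        share = w / comb(len(nodes), 2)
        for u, v in combinations(nodes, 2):
            if G.has_edge(u, v):
                G[u][v]["weight"] += share  # sum overlapping contributions
            else:
                G.add_edge(u, v, weight=share)
    return G
```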
One of the approaches for finding communities in hypergraphs that practitioners use is to apply one of the algorithms that aim to maximize the original graph modularity function (such as Louvain, Leiden, or ECG) to the graph |$H_{[2]}$|. Although this procedure is simple, it has the drawback that the 2-section graph loses some potentially useful information. Therefore, it is desirable to define a modularity function tailored explicitly for hypergraphs and to optimize it directly.
2.3 Hypergraph modularity
For hyperedges of size greater than two, several definitions can be used to quantify the edge contribution for a given partition A of the set of nodes. As a result, the choice of hypergraph modularity function is not unique. It depends on how strongly one believes that a hyperedge is an indicator that some of its vertices fall into one community. The fraction of nodes of a given hyperedge that belong to one community is called its homogeneity (provided it is more than 50%). In one extreme case, all vertices of a hyperedge have to belong to one of the parts in order to contribute to the modularity function; this is the strict variant, assuming that only homogeneous hyperedges provide information about the underlying community structure. In the other natural extreme variant, the majority one, edges are not assumed to be homogeneous, and a hyperedge contributes to one of the parts if more than 50% of its vertices belong to it; in this case, being over 50% is the only information considered relevant for community detection. All variants in between guarantee that hyperedges contribute to at most one part. This is an important difference from the modularity on |$H_{[2]}$|, where a single original hyperedge is split into multiple graph edges that could be considered as contributing to multiple different parts (communities). Once the variant is fixed, one needs to benchmark the corresponding edge contribution using the degree tax computed for the generalization of the Chung-Lu model to hypergraphs proposed in [25].
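In the notation above, a form of this function consistent with the hypergraph modularity of [25] (and with the (c, d) ‘slices’ discussed in Section 4.6) is:

$$q_H(\mathbf{A}) = \frac{1}{|E|} \sum_{A \in \mathbf{A}} \sum_{d \ge 2} \sum_{c = \lfloor d/2 \rfloor + 1}^{d} \eta_{c,d} \left( e_A^{c,d} - |E_d| \cdot \Pr\!\left[ \mathrm{Bin}\!\left(d, \frac{\mathrm{vol}_H(A)}{\mathrm{vol}_H(V)}\right) = c \right] \right), \tag{2}$$

where |$e_A^{c,d}$| is the number of hyperedges of size d with exactly c members in A, and the hyper-parameters |$\eta_{c,d} \in [0, 1]$| weight each (c, d) slice.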
Hyper-parameters |$\eta _{c, d}$| give us a lot of flexibility and allow one to value some hyperedges more than others, depending on their size and homogeneity. However, there is a natural family of hyper-parameters that one might consider, namely, |$\eta _{c, d} = (c/d)^\tau$| for some constant |$\tau \in [0, \infty )$|. We will refer to the corresponding modularity function as the τ-modularity function. This family has only one parameter to tune, τ, but it still covers a wide range of possible scenarios. For example, one might want to value all hyperedges equally (τ = 0) or value more homogeneous hyperedges more (|$\tau \gt 0$|), including the extreme situation in which only fully homogeneous hyperedges are counted (|$\tau \to \infty$|). In particular, we get the following four natural parameterizations of the modularity function to optimize:
strict modularity (|$\tau \to \infty$|): |$\eta _{d, d} = 1$| and |$\eta _{c, d} = 0$| for |$\left\lfloor {d/2} \right\rfloor + 1 \le c \lt d$|,
quadratic modularity (τ = 2): |$\eta _{c, d} = (c/d)^2$| for |$\left\lfloor {d/2} \right\rfloor + 1 \le c \le d$|,
linear modularity (τ = 1): |$\eta _{c, d} = c/d$| for |$\left\lfloor {d/2} \right\rfloor + 1 \le c \le d$|,
majority modularity (τ = 0): |$\eta _{c, d} = 1$| for |$\left\lfloor {d/2} \right\rfloor + 1 \le c \le d$|.
Note that regardless of the parameter τ, the weights are normalized so that |$\max _c \eta _{c, d} = 1$| for all d. This ensures that the modularity function is normalized to be between 0 and 1.
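A tiny helper capturing this one-parameter family (our own naming; the special cases match the list above):

```python
def eta(c, d, tau):
    """Weight eta_{c,d} = (c/d)**tau for a community hyperedge of type (c, d).

    tau = 0   -> majority modularity (all admissible slices weighted equally)
    tau = 1   -> linear modularity
    tau = 2   -> quadratic modularity
    tau = inf -> strict modularity (only pure hyperedges count)
    """
    assert d // 2 + 1 <= c <= d, "only community hyperedges (c > d/2) contribute"
    if tau == float("inf"):
        return 1.0 if c == d else 0.0
    return (c / d) ** tau

print(eta(4, 5, 2))  # 0.64
```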
As already mentioned above, the choice of the parameter τ should be made depending on how much more homogeneous hyperedges are valued compared to inhomogeneous ones. However, in the absence of any external intuition about the nature of the ground-truth communities, our suggestion is to use τ = 2. This choice is justified by the connection to |$H_{[2]}$|, the corresponding 2-section graph of H. Indeed, hyperedges of size d in H that have exactly c members in one of the communities contribute a |$\binom{c}{2}/\binom{d}{2} = \frac{c(c - 1)}{d(d - 1)} \approx (c/d)^2$| fraction of their original weight to the graph modularity function of |$H_{[2]}$|.
Having said that, let us stress that optimizing the hypergraph 2-modularity function of H is not equivalent to optimizing the graph modularity function of |$H_{[2]}$|, since hyperedges with |$c \le d/2$| members in one of the communities do not contribute to the hypergraph modularity, whereas they still do in the graph counterpart.
Indeed, this observation highlights the key difference, already indicated when introducing hypergraph modularity, between extracting communities from the hypergraph directly and doing it via the corresponding 2-section graph |$H_{[2]}$|. Our assumption is that there exists an underlying set of latent communities in the hypergraph (commonly referred to as the ground-truth). A given set of nodes appears as a hyperedge with a probability that depends on whether the majority of them are from one of the communities or not. As a result, hyperedges of size d in H that have at most |$d/2$| members in one of the communities are considered noise that unnecessarily influences the modularity function of |$H_{[2]}$|. Indeed, the hypergraph modularity is guaranteed to count a single hyperedge toward at most one community (as we require |$c \gt d/2$|). On the other hand, the graph modularity of |$H_{[2]}$| potentially treats a single hyperedge as a positive signal contributing to multiple communities.
It is well known that optimizing the modularity function in large networks might fail to resolve small communities, even when they are well defined. This potential problem is a consequence of applying a global null model and is often referred to as the resolution limit [39]. A standard approach to address the resolution limit is to multiply the degree tax in the definition of the modularity function by a parameter |$\gamma \gt 0$|. This additional parameter controls the relative importance of the edge contribution and the degree tax. The hypergraph modularity function may be tuned the same way, if needed.
Finally, let us mention that, for a given partition A, the values of different modularity functions should not be compared, as they are scaled differently; rather, the same modularity function should be used to rank various partitions of a given graph.
3 HYPERGRAPHS USED IN OUR EXPERIMENTS
In this section, we introduce the hypergraphs we use in our experiments, both synthetic and real-world ones. Despite the fact that there are several real-world hypergraphs with large hyperedges, we restrict ourselves to hypergraphs with relatively small hyperedges. Large hyperedges, spanning multiple communities, are usually rare and do not provide any strong signal about the underlying communities. (Such hyperedges could be useful for other tasks such as node classification.) From that perspective, even if present, they are typically removed before running a clustering algorithm, as part of the exploratory data analysis (EDA).
3.1 Synthetic hypergraph model: h–ABCD
There are very few hypergraph datasets with ground-truth communities identified and labelled. Synthetic networks are extremely useful for testing various scenarios, such as the level of noise, via tuneable and interpretable parameters. As a result, there is a need for synthetic random graph models with community structure that resemble real-world networks, in order to benchmark and tune clustering algorithms, which are unsupervised by nature.
It is worth mentioning that the family of clustering algorithms we are interested in aims to find partitions that maximize a given modularity function, not to find the ground-truth partition. These are often very similar partitions (but not always). Note that ground-truth partitions typically influence the creation of a hypergraph in a noisy way, which means that, simply as a consequence of this randomness, a good community in a graph (after the randomness is resolved) does not have to match the ground-truth community exactly.
In particular, algorithm A is considered better than algorithm B if it finds a partition yielding larger modularity. Selecting the right modularity function to optimize is crucial for making sure that the outcome of the algorithm is close to the ground-truth (or some specific requirements of the user), but once the function is selected, the algorithm should aim to maximize it. We propose a simple, unsupervised method for making such a selection (see Section 5.1), but this paper focuses on the optimization algorithm.
The standard for the generation of synthetic graphs is rather clear. The LFR (Lancichinetti, Fortunato, Radicchi) model [44, 45] generates networks with communities and, at the same time, allows for heterogeneity in the distributions of both node degrees and community sizes. It has become a standard and extensively used method for generating artificial networks. The Artificial Benchmark for Community Detection (ABCD) [46] was recently introduced and implemented,2 including a fast implementation3 that uses multiple threads (ABCDe) [47]. The undirected variants of LFR and ABCD produce graphs with comparable properties, but ABCD/ABCDe is faster than LFR and can be easily tuned to allow the user to make a smooth transition between the two extremes: pure (disjoint) communities and a random graph with no community structure. Moreover, it is easier to analyze theoretically; e.g. in [48, 49], various theoretical asymptotic properties of the ABCD model are investigated, including the modularity function and self-similarity of the ground-truth communities.
The situation for hypergraphs is not as clear as for graphs. Not only are there few real-world datasets (with ground-truth) available, but there are also not many synthetic hypergraph models. Fortunately, the building blocks of the ABCD model are flexible and may be adjusted to satisfy different needs. For example, the model was adjusted to include potential outliers in [50], resulting in the ABCD+o model. Adjusting the model to hypergraphs is more complex, but it was also done recently [51], resulting in the h–ABCD model. We will use this model for our experiments.
The h–ABCD model generates a hypergraph on n nodes. The degree distribution follows a power-law with exponent γ, with minimum and maximum values equal to δ and D, respectively. Community sizes are between s and S and also follow a power-law distribution, this time with exponent β. Parameter ξ is responsible for the level of noise. If ξ = 0, then each hyperedge is a community hyperedge, meaning that the majority of its nodes belong to one community. At the other extreme, if ξ = 1, then communities do not play any role and hyperedges are simply ‘sprinkled’ across the entire hypergraph, which we will refer to as the background hypergraph. The vector |$(q_1 , \ldots , q_L )$| determines the distribution of the number of hyperedges of a given size, where L is the size of the largest hyperedge.
Finally, parameters |$w_{c, d}$| specify how many nodes from its own community a given community hyperedge should have. We say a community hyperedge is of type (c, d) if it has size d and exactly c of its nodes belong to one of the communities. Note that, in light of the discussion at the end of the previous section, we require that a community hyperedge has more than half of its nodes from the community. Therefore, |$w_{c, d}$| is defined for |$d/2 \lt c \le d$|, where |$d \in [L]$|.
The model is flexible and may accept any family of parameters |$w_{c, d}$| satisfying the specific needs of the user, but here is a list of three standard options implemented in the code (see the sketch after this list):
majority model: |$w_{c, d}$| is uniform for all admissible values of c, ie for any |$d/2 \lt c \le d$|, |$w_{c, d} = {1 \over {(d - \left\lfloor {d/2} \right\rfloor )}} = {1 \over {\left\lceil {d/2} \right\rceil }},$|
linear model: |$w_{c, d}$| is proportional to c for all admissible values of c, ie for any |$d/2 \lt c \le d$|, |$w_{c, d} = {{2c} \over {(d + \left\lfloor {d/2} \right\rfloor + 1)(d - \lfloor d/2 \rfloor )}} = {{2c} \over {(d + \lfloor d/2 \rfloor + 1)\left\lceil {d/2} \right\rceil }},$|
strict model: only ‘pure’ hyperedges are allowed, ie |$w_{d, d} = 1$| and |$w_{c, d} = 0$| for |$d/2 \lt c \lt d$|.
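The following sketch (our own code, mirroring the formulas above) computes these three distributions for a fixed hyperedge size d:

```python
from math import floor

def w_cd(d, model):
    """Distribution {c: w_{c,d}} over admissible c (d/2 < c <= d) of the number
    of community nodes in a community hyperedge of size d, for the three
    standard h-ABCD options."""
    cs = list(range(floor(d / 2) + 1, d + 1))
    if model == "majority":   # uniform: w_{c,d} = 1/ceil(d/2)
        return {c: 1 / len(cs) for c in cs}
    if model == "linear":     # proportional to c
        total = sum(cs)       # equals (d + floor(d/2) + 1) * ceil(d/2) / 2
        return {c: c / total for c in cs}
    if model == "strict":     # only 'pure' hyperedges
        return {c: 1.0 if c == d else 0.0 for c in cs}
    raise ValueError(f"unknown model: {model}")

print(w_cd(5, "linear"))  # {3: 0.25, 4: 0.333..., 5: 0.416...}
```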
Let us note that the parameterizations of |$w_{c, d}$| in h–ABCD and of |$\eta _{c, d}$| in the definition of the hypergraph modularity function carry the same names (due to matching functional forms), but they are not equivalent. Parameters |$w_{c, d}$| determine the composition of hyperedges in the generated synthetic hypergraph, whereas parameters |$\eta _{c, d}$| specify the objective function that the analyst decided to optimize while looking for communities in the hypergraph at hand.
Specifically, we used the following parameters for our experiments:
n = 300 nodes,
power-law degree exponent |$\gamma = 2.5$|, with degrees in the range |$[5, 30]$|,
power-law community size exponent |$\beta = 1.5$|, with community sizes in the range |$[80, 120]$|.
We generated 6 families of h–ABCD hypergraphs, namely:
linear_2to5: linear model for |$w_{c, d}$|, with edge sizes 2 to 5 (with respective probabilities |$0.1, 0.4, 0.4, 0.1$|),
majority_2to5: majority model for |$w_{c, d}$|, with edge sizes 2 to 5 (with respective probabilities |$0.1, 0.4, 0.4, 0.1$|),
strict_2to5: strict model for |$w_{c, d}$|, with edge sizes 2 to 5 (with respective probabilities |$0.1, 0.4, 0.4, 0.1$|),
linear_5: linear model for |$w_{c, d}$|, with all edges of size 5,
majority_5: majority model for |$w_{c, d}$|, with all edges of size 5, and
strict_5: strict model for |$w_{c, d}$|, with all edges of size 5.
3.2 Real-world hypergraphs
To illustrate various aspects of hypergraph modularity-based clustering, we analyze a few real-world hypergraphs. The first is a contact hypergraph in which nodes correspond to primary school children or teachers and hyperedges represent close physical proximity between individuals within a prescribed time period (see [24, 52, 53] for more details). There are 242 nodes labelled with respect to their class (there are 10 classes), plus another label for the teachers. There are 12 704 hyperedges of size up to 5. In Table 1 (left), we show the distribution of edge composition for this dataset with respect to the ground-truth communities. We see that there are many edges between communities, in particular edges of size 2. Community edges (with |$c \gt d/2$|) are mostly ‘pure’ edges (with c = d), but there is a significant number of edges of type |$(c, d) = (2, 3)$|, so it is unclear whether hypergraph τ-modularity functions with a large parameter τ would do well, or whether a small value of τ should be used.
Table 1. The number of hyperedges of type (c, d) (the top-10 most frequent ones) for the primary-school dataset (left) and the cora co-citation dataset (right). (Combinations contributing to hypergraph modularity are highlighted in grey.)

primary-school:

| d | c | Purity | Frequency |
|---|---|--------|-----------|
| 2 | 1 | 50% | 5202 |
| 2 | 2 | 100% | 2546 |
| 3 | 3 | 100% | 2434 |
| 3 | 2 | 67% | 1751 |
| 3 | 1 | 33% | 415 |
| 4 | 4 | 100% | 158 |
| 4 | 2 | 50% | 93 |
| 4 | 3 | 75% | 84 |
| 4 | 1 | 25% | 12 |
| 5 | 3 | 60% | 6 |

cora:

| d | c | Purity | Frequency |
|---|---|--------|-----------|
| 2 | 2 | 100% | 472 |
| 3 | 3 | 100% | 307 |
| 4 | 4 | 100% | 175 |
| 2 | 1 | 50% | 151 |
| 3 | 2 | 67% | 118 |
| 4 | 3 | 75% | 91 |
| 5 | 5 | 100% | 83 |
| 5 | 4 | 80% | 55 |
| 4 | 2 | 50% | 42 |
| 3 | 1 | 33% | 39 |
The other dataset we use for our experiments is the co-citation dataset of scientific publications, each belonging to one of seven classes (cora); see [54] for more details. Hyperedges consist of co-cited scientific publications, and we only keep hyperedges of size 2 or more. There are 1434 nodes appearing in at least one hyperedge (cited publications) and 1579 hyperedges. In Table 1 (right), we show the top-10 distribution of edge composition with respect to the true communities. We see that there are many pure community edges (with c = d), so we can expect that hypergraph τ-modularity functions with a large parameter τ will do well.
Having node attributes that represent a proxy of the corresponding ground-truth communities is convenient, as one can easily test the quality of various algorithms by measuring the similarity between the partitions they return and the ‘ground-truth’ partition. However, it is important to keep in mind that, in general, the ground-truth is unknown and one can only hypothesize that the attributes used are correlated with it. Indeed, it is often impossible to ensure that the observed discrete-valued node attributes used as labels are a good representation of the underlying mechanism responsible for the creation of communities [55].
4 HYPERGRAPH MODULARITY OPTIMIZATION ALGORITHM: H–LOUVAIN
Let us fix the hypergraph modularity function |$q_H ({\bf A})$|, either by restricting ourselves to the τ-modularity function with some specific value of τ (such as τ = 2, recommended as the default value) or by specifying the more general hyper-parameters |$\eta _{c, d}$|. The goal of this section is to highlight the challenges in designing a heuristic algorithm aiming to optimize |$q_H ({\bf A})$| and to describe our solution that overcomes these challenges, producing an algorithm that we will refer to as h–Louvain.
4.1 Louvain algorithm
Let us start by introducing one of the most popular algorithms for detecting communities in graphs, namely, the Louvain algorithm [34]. It is a hierarchical clustering algorithm that tries to optimize the modularity function we described in Section 2.
In the first pass of this algorithm, small communities are found by optimizing the graph modularity function locally on all nodes. Then, each small community is grouped together into a single node that we will refer to as super-node. This process is repeated recursively on those smaller graphs consisting of super-nodes (the subsequent passes) until no improvement on the modularity function can be further achieved.
One pass of the algorithm consists of two phases that are repeated iteratively. In the first phase, each node in the network is assigned to its own community. For each node v, we consider all neighbours u of v and compute the change in the modularity function if v is removed from its own community and moved into the community of u. It is important to mention that this value can be easily and efficiently calculated without the need to recompute the modularity function from scratch. Once all the communities that v could belong to are considered, v is placed into the community that resulted in the largest increase of the modularity function. If no increase is possible, v remains in its original community. The process is repeated for the remaining nodes following a given (typically random) permutation of nodes, possibly multiple times, until a local maximum value is achieved and the first phase ends.
During the second phase, the algorithm contracts all nodes that belong to one community into a single super-node. All edges within that community are replaced by a single weighted loop. Similarly, all edges between two communities are replaced by a single weighted edge. Once the new network is created, the second phase ends. The resulting graph is typically much smaller than the original graph. As a result, the first pass is typically the most time consuming part of the algorithm.
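To make the first phase concrete, here is a minimal sketch in Python; it is a simplification of [34] that assumes a networkx-style graph interface and a hypothetical modularity_gain(G, part, v, c) helper in place of the efficient incremental formula mentioned above:

```python
import random

def louvain_one_level(G, modularity_gain):
    """One pass of phase 1: greedily move nodes between communities until no
    single move improves modularity. modularity_gain(G, part, v, c) is assumed
    to return the change in modularity if node v is moved to community c."""
    part = {v: v for v in G.nodes()}           # each node starts in its own community
    improved = True
    while improved:
        improved = False
        nodes = list(G.nodes())
        random.shuffle(nodes)                  # random visiting order
        for v in nodes:
            best_c, best_gain = part[v], 0.0
            for u in G.neighbors(v):           # only neighbouring communities matter
                gain = modularity_gain(G, part, v, part[u])
                if gain > best_gain:
                    best_c, best_gain = part[u], gain
            if best_c != part[v]:              # move only on strict improvement
                part[v] = best_c
                improved = True
    return part
```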
4.2 Challenges with adjusting the algorithm to hypergraphs
One could try to directly apply the Louvain algorithm to optimize hypergraph modularity, since in both cases the goal is to find a partition of the set of nodes. However, as the algorithm moves only one node at a time, this creates a problem in the case of hypergraphs.
Consider, e.g. a hypergraph in which all hyperedges have size at least four. In this case, regardless of which two nodes u and v are considered for a possible merge into one community, the edge contribution would not change (i.e. it would stay equal to zero), even if u and v are part of some hyperedge. (Recall that only hyperedges with a majority of nodes from the same community may affect the edge contribution.) On the other hand, the degree tax would increase after such a move and, as a result, the modularity function would decrease. Therefore, no move would be made and the algorithm would get stuck immediately. We will refer to this issue as the lift-off from the ground problem.
The above, extreme, situation is not the only problem one should be aware of. This time consider a hypergraph that consists of a mixture of hyperedges of various sizes, including edges of size two. In this scenario there is no problem with lifting off from the ground but small hyperedges clearly play a much more important role than large ones during the initial merging in the first phase of the algorithm. On the other hand, very large hyperedges would be mostly ignored. This behaviour is not desirable either. In order to illustrate a potential danger, consider a hypergraph representing interactions between researchers at some institution. Nodes in this hypergraph correspond to researchers and hyperedges correspond to meetings of some groups of people. For simplicity, assume that there are two communities, say, faculty of science and faculty of engineering. Many hyperedges within the two communities are large (e.g. hyperedges associated with departmental meetings) whereas hyperedges between the two communities are mostly of size two (e.g. two members of different teams meet individually from time to time). In this scenario, the algorithm would start merging people from different communities during the first phase.
Finally, let us note that one could alternatively consider modifying the algorithm to allow merging not only two nodes but entire hyperedges into one community in a single move. Again, this does not seem desirable, as hyperedges might consist of members from different communities, and such operations would generate many incorrect merges too fast.
4.3 Our approach to hypergraph modularity optimization
To understand the motivation behind this approach, let us observe the following. The hypergraph modularity, equation (2), is flexible and may approximate well the graph modularity of the corresponding 2-section graph |$H_{[2]}$|. Indeed, if c vertices of a hyperedge e of size d and weight w(e) fall into one part of the partition A, then the contribution to the graph modularity is |$w(e)\binom{c}{2}/\binom{|e|}{2} \approx w(e)(c/|e|)^2$| (in the variant of the 2-section where the total weight is preserved) or |$w(e)\binom{c}{2}/(|e| - 1)$| (if the degrees are preserved). Hence, the hyper-parameters of the hypergraph modularity can be adjusted to approximate the |$H_{[2]}$| modularity. The only difference is that (2) does not include contributions from parts that contain at most |$d/2$| vertices, which still contribute to the graph modularity of |$H_{[2]}$|.
The observation justifies using |$q({\bf A}, \alpha )$| for optimizing the hypergraph modularity. It is a linear combination of the actual hypergraph modularity we want to optimize, |$q_H ({\bf A})$|, and |$q_{H_{[2]} } ({\bf A})$|, an approximation of the hypergraph modularity for a special value of the hyper-parameter (τ = 2) and without the restriction on hyperedge contribution. The benefit of the second part is that it is sensitive to merging two nodes, and so it always gives some indication of how nodes should be merged (even if the first part, |$q_H ({\bf A})$|, gives no such indication). In short, it resolves the lift-off from the ground problem. If α is close to zero, then we concentrate mostly on the approximation part, while if α is close to one, then we concentrate mostly on the actual hypergraph modularity we aim to optimize.
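Explicitly, the combined objective (3) is the linear combination

$$q(\mathbf{A}, \alpha) = \alpha \, q_H(\mathbf{A}) + (1 - \alpha)\, q_{H_{[2]}}(\mathbf{A}), \qquad \alpha \in [0, 1]. \tag{3}$$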
The above discussion leads us to the conclusion that the parameter |$\alpha \in [0, 1]$| should be appropriately tuned during the run of the algorithm. The main questions are: (a) when should the change be made, and (b) what values of this parameter should be used? In [37], we performed various experiments and made the following observations. The optimization process should start with low values of the parameter α (to let the process lift off from the ground), and α should then be gradually increased until it reaches one by the end of the process. The algorithm should start increasing the parameter α when the communities induce enough edges that merging additional nodes makes a difference in the edge contribution of the qH function. In particular, since the strict hypergraph modularity pays attention only to pure hyperedges (all members belong to one community), in this case the algorithm needs to start with lower values of α and increase it more slowly than for the majority or linear counterparts of the hypergraph modularity, for which it is enough that over 50% of the nodes of some hyperedge are captured in one community.
Based on these observations, we propose the following schema for setting the successive values of α used in the objective function (3), which leads to monotonic (non-decreasing) sequences (|$\alpha _1 , \alpha _2 , \ldots$|). The schema is guided by the following two parameters: |$p_b \in [0, 1]$| and |$p_c \in (0, 1)$|. The parameter pb is used to determine the values of αi, while pc governs when the algorithm switches from |$\alpha _{i - 1}$| to αi (for |$i \ge 2$|) as the optimization progresses.
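One schedule with all of the properties listed next (|$\alpha_1 = 0$|; |$\alpha_i \to 1$| unless pb = 0; |$\alpha_i \equiv 0$| when pb = 0) is

$$\alpha_i = 1 - (1 - p_b)^{\,i-1}, \qquad i = 1, 2, \ldots$$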
(We use the convention that |$0^0 = 1$|.) Note that |$\alpha _1 = 0$| and |$\alpha _i \to 1$| as |$i \to \infty$|, unless pb = 0. In the degenerate case pb = 0, we have |$\alpha _i = 0$| for all i. The algorithm switches from |$\alpha _{i - 1}$| to αi (for |$i \ge 2$|) when the number of communities drops to |$np_c^{i - 1}$| or below for the first time (the number of communities typically decreases, but not always; as usual, n denotes the number of nodes). In summary, the two parameters have the following interpretation: pb controls the magnitude of the increments of α (values close to zero make αi converge to one slowly, values close to one make the convergence fast), while pc controls how quickly the algorithm switches to the next value of α.
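Putting the two rules together, a short sketch of the schedule logic (assuming the reconstructed formula above; variable names are ours):

```python
def alpha_schedule(p_b, i):
    """alpha_i = 1 - (1 - p_b)**(i - 1); Python also uses the 0**0 == 1 convention."""
    return 1.0 - (1.0 - p_b) ** (i - 1)

def current_alpha(num_communities, n, p_b, p_c):
    """Return alpha_i, where i is the largest index such that the number of
    communities has dropped to n * p_c**(i - 1) or below."""
    i = 1
    while num_communities <= n * p_c ** i:  # threshold for switching to alpha_{i+1}
        i += 1
    return alpha_schedule(p_b, i)
```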
There are two possible endings once the algorithm reaches a partition in which no improvement of the modularity function is possible via local changes. (Note that this might happen when the value of αi is still far from one.) By default, we fix |$\alpha _i = 1$| and continue optimizing the hypergraph modularity function on the small graph consisting of super-nodes until no further improvement can be achieved. Alternatively, the local optimization can be performed on the original graph consisting of nodes. The pseudo-code of h–Louvain can be found in the Appendix (see Section A). As discussed in Section 4.5, we use the default ending when doing Bayesian optimization to select a good pair (pb, pc) of parameters because it is faster. Once the final pair is chosen, we perform an additional tuning pass with local optimization enabled, which typically yields better values of the objective function.
4.4 Parameters of the h–Louvain algorithm
In this section, we investigate the quality of the h–Louvain algorithm for different pairs of parameters (pb, pc). To that end, we analyzed the performance of the algorithm using 9 different h–ABCD hypergraphs on 1000 nodes. For each of the three options for community hyperedges in the h–ABCD model (namely, strict, linear, and majority), we used the following three settings with respect to different levels of noise and sizes of hyperedges:
small level of noise (|$\xi = 0.15, \;\xi _{emp} = 0.29$|), hyperedges of size between 2 and 5, the degree distribution following power-law with exponent |$\gamma = 2.5$|, minimum and maximum degree 5 and 20,
large level of noise (|$\xi = 0.6, \;\xi _{emp} = 0.62$|), hyperedges of size between 2 and 5, the degree distribution following power-law with exponent |$\gamma = 2.5$|, minimum and maximum degree 5 and 20,
large hyperedges (sizes between 5 and 8), large level of noise (|$\xi = 0.3, \;\xi _{emp} = 0.63$|), the degree distribution following power-law with exponent |$\gamma = 2.5$|, minimum and maximum degree 5 and 60.
(In the above description, ξemp refers to the actual level of noise in the produced hypergraph. The model ensures that |$\xi _{emp} \approx \xi$| for graphs without small communities. In our scenario this is not the case, but the generated hypergraphs still have drastically different levels of noise.) In all three settings, the distribution of community sizes follows a power-law with exponent |$\beta = 1.5$|, with minimum and maximum sizes 10 and 30. The distribution of hyperedge sizes is |$(0.4, 0.3, 0.2, 0.1)$|, i.e. there are slightly more hyperedges of smaller sizes.
Figure 1 presents the performance of the algorithm for three selected hypergraphs out of the 9 we experimented with. (Results for the remaining six hypergraphs can be found in the associated GitHub repository.) For each hypergraph, we present the quality of the algorithm optimizing the corresponding modularity function (e.g. for strict hypergraph we optimize the strict modularity function) as a function of (pb, pc). The parameters were tested from the 2-dimensional grid (pb, pc), where |$p_b \in \{ 0.05, 0.1, 0.3, 0.5, 0.7, 0.9, 0.95\}$| and |$p_c \in \{ 0.1, 0.2, \ldots , 0.9\}$|. For each pair of parameters, the average modularity function is reported over 10 independent runs with different random seeds. (Recall that h–Louvain is a randomized algorithm.)

Fig. 1. Quality of h–Louvain on h–ABCD as a function of the parameters pb, pc. Optimal combinations of the two parameters depend on the choice of h–ABCD variant and hypergraph modularity function: strict, large level of noise (left); majority, large level of noise (middle); linear, small level of noise (right).
The general conclusion is that the optimal choice of parameters depends on many factors: properties of the hypergraph (such as the composition of community hyperedges, the level of noise, and the sizes of hyperedges) as well as the modularity function that one aims to maximize. However, not surprisingly, it is not recommended to set both parameters close to zero (the case of slow and small increases of the parameter α, i.e. optimizing mainly the graph modularity of the corresponding 2-section graph |$H_{[2]}$|) or close to one (the case of fast and significant changes of the parameter α, i.e. optimizing the hypergraph modularity qH almost from the beginning of the execution). The optimal values are often obtained for settings with balanced values of both parameters, namely with |$p_b + p_c \approx 1$|. In order to find a ‘sweet spot’ in an unsupervised way, we use Bayesian optimization, which we discuss next. Let us remark that one can expect to see large regions of the parameter space yielding low-quality solutions (blue and light red zones in Fig. 1). Therefore, a simpler hyperparameter tuning technique such as grid search would be less efficient: the algorithm would spend a lot of time investigating regions of the hyperparameter space that are not promising. Bayesian optimization, on the other hand, avoids scanning such regions in detail and instead performs a denser search in more promising regions.
4.5 Bayesian optimization: selecting the parameters
In order to find a good pair of parameters (pb, pc) guiding the h–Louvain algorithm that yields a large hypergraph modularity, we use the Bayesian optimization approach [56]. We chose this tool because it is best suited for optimizing objective functions that take a long time to evaluate over continuous domains of fewer than 20 dimensions, and it tolerates well the non-negligible local variability of the function evaluations. It builds a surrogate for the objective function, quantifies the uncertainty in that surrogate using a Bayesian machine learning technique (Gaussian process regression), and then uses an acquisition function defined from this surrogate to decide where to sample the domain in an on-line fashion. Specifically, we use the package of [57], which provides the Bayesian optimization procedure; the general time-complexity and performance properties follow from this implementation. Below we describe in detail how we prepare the input for this optimization in our specific problem.
In our case, the Bayesian optimization explores the two-dimensional space (pb, pc) with |$p_b \in [0, 1]$| and |$p_c \in (0, 1)$|. The target function is defined as the average modularity of the outcome partitions over 10 independent executions of the h–Louvain algorithm with different (but fixed across runs) random number generator seeds. Note that in this setting we maximize a deterministic function (since the seeds are fixed). We take the average over 10 different seeds because we aim to identify the region of the (pb, pc) domain that leads to good values of the obtained modularity, and averaging reduces the noise present in the modularity values observed in single runs of the algorithm.
Because tuning the hyper-parameters is computationally intensive, we initially use the default ending of the algorithm, i.e. without the local-optimization approach for the last phase (see Section 4.3 for an explanation of how the optional ending works). The reason for this choice is that in this phase of the process we are mostly interested in capturing the shape of the response surface (recall that we take the average of 10 runs of the algorithm for the very same reason, namely, to smooth out the results and better capture the shape of the response surface). The default ending is sufficient for this purpose and is substantially faster. After finding an approximation of the optimal (pb, pc) combination, the algorithm returns to the partition obtained with these parameters, but this time the local-optimization approach is used during the last phase. It is more computationally expensive, but at this stage of the procedure we are interested in finding the maximum value of the hypergraph modularity, so this additional computational cost is justified.
We configured the Bayesian optimization procedure so that it starts by evaluating 5 initial pairs of parameters selected randomly from the domain, and at least 10 pairs are tested in total. Once the Bayesian optimization converges, the algorithm returns the partition with the largest modularity among all partitions generated during the entire process. Note that this partition might not be one of the 10 partitions that contributed to the largest value of the target function; those partitions only have the best average modularity.
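For illustration only (the paper relies on the package of [57], whose interface may differ), this setup maps naturally onto a Python Bayesian-optimization library such as bayes_opt; h_louvain_modularity below is a stand-in for running h–Louvain on the hypergraph at hand:

```python
from bayes_opt import BayesianOptimization

SEEDS = range(10)  # fixed seeds make the target function deterministic

def h_louvain_modularity(pb, pc, seed):
    # Stand-in for one run of h-Louvain returning the hypergraph modularity of
    # the partition it finds; replace with a call to the actual implementation.
    # This toy surface peaks near pb + pc = 1, echoing the observation above.
    return 1.0 - (pb + pc - 1.0) ** 2 - 0.001 * seed

def target(pb, pc):
    # Average modularity over 10 runs with different (but fixed) seeds.
    return sum(h_louvain_modularity(pb, pc, s) for s in SEEDS) / len(SEEDS)

optimizer = BayesianOptimization(
    f=target,
    pbounds={"pb": (0.0, 1.0), "pc": (0.01, 0.99)},  # p_c in (0, 1)
    random_state=0,
)
optimizer.maximize(init_points=5, n_iter=5)  # 5 random pairs first, >= 10 in total
print(optimizer.max)  # best (pb, pc) found and its average modularity
```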
In order to visualize the Bayesian optimization procedure, we performed the following experiment. We selected one of the nine h–ABCD hypergraphs we experimented with (namely, the linear hypergraph with a small level of noise, but this time with only n = 300 nodes) and one of the three modularity functions (namely, the linear one) to define our target function. For a cleaner visualization, we fixed |$p_b = 0.9$| and used the procedure to find the optimum value of pc that maximizes the selected modularity function. Figure 2 presents the situation at step k of the algorithm for |$k \in \{ 8, 9, 10, 11\}$|. The blue curve is the target function, which we computed independently but which is not available to the Bayesian optimization. The orange curve is a surrogate for the target function based on the k - 1 sampled points marked on this curve. The level of uncertainty is represented by the shaded area around this curve. Based on this information, the Bayesian optimization selects the next point to sample at this step, depicted as a green point lying off the orange curve together with a green vertical line. Note that the blue curve is deterministic (as we use fixed seeding of the random number generator), as discussed above. It still has visible local variability, although it is possible to identify the region of good values of the parameter pc (in this case, around 0.2). This variability is expected and is the reason why we compute the target as the average of 10 independent evaluations of the algorithm (this approach significantly reduces the level of noise).

Fig. 2. Visualization of the Bayesian optimization approach optimizing the modularity function returned by the h–Louvain algorithm.
4.6 Computational complexity
The computational complexity of the proposed algorithm is the same as for the Louvain algorithm (both in terms of the number of nodes |$|V|$| and the number of hyperedges |$|E|$|), as we follow the same two-phase process. The difference is that computing the hypergraph modularity function involves additional parameters that affect the runtime of our algorithm.
The first one is the maximum hyperedge size |$d_{\max } = \max _{e \in E} |e|$|, in which the worst-case complexity of the algorithm is quadratic, i.e. |$O(d_{\max }^2 )$|; recall that the hypergraph modularity function consists of |$O(d_{\max }^2 )$| ‘slices’ (see (2)). However, in practice, this cost is lower, as the implementation caches intermediate results of the computations. This approach has empirically proven to significantly lower the run-time of the algorithm.
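For instance, the binomial terms of the degree tax depend only on (c, d) and the integer volumes, so they can be memoized; a minimal sketch of what such caching might look like (the cached quantity here is our illustration, not necessarily the exact one in the implementation):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def degree_tax_slice(d, c, vol_A, vol_V):
    """P(Bin(d, vol_A/vol_V) = c), cached on the exact integer volumes so that
    repeated (c, d) slices across node moves are computed only once."""
    p = vol_A / vol_V
    return comb(d, c) * p**c * (1 - p) ** (d - c)
```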
The second one is associated with the selection of the hyperparameters of the hypergraph optimization procedure. This part only adds a constant multiplicative term to the total cost of computations. This constant depends on the user’s decision of when to stop the Bayesian optimization and, in general, can be significant.
5 EXPERIMENTS
In this section, we present several experiments aimed at testing our h–Louvain clustering algorithm as well as comparing outcomes of selecting various hypergraph modularity functions. One general observation is that the choice of the objective function to optimize, here the hypergraph τ-modularity, typically has an enormous impact on the quality of the results (see Section 5.1). Fortunately, one should be able to make a reasonable selection of a good value of τ in an unsupervised way.
In general, in most of our experiments, we compare results obtained with our h–Louvain algorithm with results obtained by classical Louvain algorithm on the corresponding (weighted) 2-section graphs, as well as results using Kumar’s algorithm; see [28]. Kumar’s algorithm is a modification of the Louvain algorithm on 2-section graphs in which edges are re-weighted by taking into account the underlying hypergraph structure, and the hyperedge composition (with respect to the communities).
We consider synthetic hypergraphs with community structure, obtained via the h–ABCD benchmark, as well as two real-life hypergraphs, all of them described earlier in Section 3. Synthetic hypergraphs allow us to investigate the performance of the algorithms in various scenarios (see Section 5.2), from hypergraphs with a low level of noise (ξ close to 0), which are easy to deal with, to noisy hypergraphs (ξ close to 1), in which it is challenging to find communities. We also investigate a challenging case in which there are many hyperedges between two small communities (see Section 5.3). It is known that many networks exhibit a self-similar, ‘fractal-type’ structure (see [49, 58] and references therein), so this example aims to reproduce typical scenarios. This example highlights the power of our h–Louvain algorithm, which in this particular setting clearly outperforms its competitors.
For the real hypergraphs, we additionally consider the ‘all or nothing’ (AON) variant of the Louvain algorithm. Specifically, we consider the version aiming to optimize the strict modularity, referred to as AON, as described in the associated GitHub repository.4 Note that, unless we start from some non-trivial partition such as the one obtained from the 2-section graph with Louvain, this algorithm requires 2-edges to be present. This is the case for the real hypergraphs we considered (but not for several of the h–ABCD benchmark hypergraphs).
In general, the experiments on real-world hypergraphs (see Section 5.4) show that, for an appropriate value of τ affecting the choice of the objective function to optimize, one can improve the quality of the clusters as measured by the AMI score with respect to the ground-truth.
5.1 Selecting the modularity function to optimize
Selecting an appropriate hypergraph modularity function to optimize is an important part of the process. The choice depends on how strongly one believes that a hyperedge is an indicator that at least some fraction of its nodes fall into one of the communities. In some situations, a reasonable assumption could be that not all members of a hyperedge must be in a single community, but the majority should be (in such situations, the quadratic modularity function might work well). On the other hand, some situations might have underlying physical constraints that make one believe that all members should belong to one community, and that any hyperedge violating this is simply noise (this time, the strict modularity might be the one to optimize). If an analyst has some reasonable assumptions about the underlying process that created a hypergraph, then the decision of which modularity function to use should be made based on this expert knowledge. In this section, we provide a general strategy for selecting a modularity function when such expert knowledge is not available, based on the structure of the hypergraph, which can be detected in an unsupervised way.
Let us start by highlighting important implications of the choice of the modularity function one decides to optimize. Recall that a community hyperedge (a hyperedge with more than 50% of its members from one of the communities) of size d that has exactly c members from one of the communities is said to be of type (c, d). In the absence of the ground truth, one way to compare partitions returned by algorithms optimizing different modularity functions is to look at the distribution of edges of certain types. In our first experiment, we consider synthetic h–ABCD hypergraphs with only edges of size 5, generated with the strict or the majority model (strict_5 and majority_5) and with noise parameters |$\xi = 0.2$| and |$\xi = 0.3$|. We run this experiment to show how the (c, d) hyperedge composition changes depending on the modularity function being optimized. We expect a visible difference in this distribution between strict_5 and majority_5 hypergraphs, independent of the choice of the objective function for community detection.
In Table 2, we compare the distribution of edge types for 5 different partitions, namely, (i) the ground-truth partition, and the partitions returned by (ii) 2-section Louvain, (iii) h–Louvain with τ = 2 (quadratic modularity), (iv) h–Louvain with τ = 3 (cubic modularity), and (v) h–Louvain with |$\tau \to \infty$| (strict modularity). We count the number of edges with 5, 4 or 3 nodes, respectively, in the most frequent community; the remaining edges are considered to be noise.
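The (c, d) type of every hyperedge is cheap to compute from any partition. A minimal sketch, assuming hyperedges are given as iterables of node ids and the partition as a node-to-community dict:

```python
from collections import Counter

def edge_type_counts(hyperedges, part):
    """Count hyperedges by type (c, d): d is the hyperedge size and c is the
    largest number of its nodes falling into a single community of `part`."""
    counts = Counter()
    for e in hyperedges:
        d = len(e)
        c = max(Counter(part[v] for v in e).values())
        counts[(c, d)] += 1
    return counts
```

Grouping the counts as in Table 2 (c = 5, 4, 3, and c ≤ 2 for d = 5) then only requires summing over the relevant types.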
Table 2. The number of hyperedges of each type for h–ABCD hypergraphs with 5-edges generated with the strict (strict_5) or majority (majority_5) assignment rule.

| class size | Ground-truth communities | Louvain 2-section | h–Louvain τ = 2 | h–Louvain τ = 3 | h–Louvain τ → ∞ (strict) |
|---|---|---|---|---|---|
| *Strict with noise parameter ξ = 0.2* | | | | | |
| 5 | 352 | 352 | 349 | 352 | 352 |
| 4 | 36 | 36 | 41 | 36 | 36 |
| 3 | 92 | 92 | 91 | 92 | 92 |
| ≤ 2 | 42 | 42 | 41 | 42 | 42 |
| *Strict with noise parameter ξ = 0.3* | | | | | |
| 5 | 314 | 314 | 311 | 314 | 314 |
| 4 | 30 | 30 | 35 | 30 | 30 |
| 3 | 123 | 123 | 122 | 123 | 123 |
| ≤ 2 | 55 | 55 | 54 | 55 | 55 |
| *Majority with noise parameter ξ = 0.2* | | | | | |
| 5 | 169 | 170 | 137 | 169 | 264 |
| 4 | 175 | 165 | 171 | 174 | 154 |
| 3 | 137 | 138 | 161 | 137 | 104 |
| ≤ 2 | 41 | 49 | 53 | 42 | 0 |
| *Majority with noise parameter ξ = 0.3* | | | | | |
| 5 | 158 | 129 | 88 | 158 | 206 |
| 4 | 145 | 140 | 148 | 144 | 147 |
| 3 | 158 | 151 | 196 | 161 | 169 |
| ≤ 2 | 61 | 102 | 90 | 59 | 0 |
In the second column of Table 2 we show the hyperedge composition of the ground truth. As expected, for hypergraphs generated with the strict model, the (5, 5) hyperedges are the most common, while for hypergraphs generated with the majority model, the (5, 5), (4, 5), and (3, 5) hyperedges have similar frequencies (this holds both for |$\xi = 0.2$| and |$\xi = 0.3$|). The crucial observation is that, regardless of which modularity function is optimized (Louvain on the 2-section graph or h–Louvain with varying τ), the recovered hyperedge composition is similar to the ground truth. This observation suggests the following approach in cases where the user does not have a prior preference for the τ parameter in the h–Louvain algorithm.
As a rule of thumb, running a quick clustering (for example, with 2-section Louvain) as part of Exploratory Data Analysis (EDA) and looking at the composition of edge types is a recommended first step that can be used to decide on the value(s) of τ to use as the objective τ-modularity function for h–Louvain. In general, there are two major scenarios the user could encounter: seeing mostly ‘pure’ edges suggests using a large value of τ (or the strict modularity), while the opposite suggests using smaller values of τ such as τ = 2 or τ = 3.
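As an illustration only, this rule of thumb can be mechanized on top of the edge_type_counts helper sketched above; the 80% purity threshold below is hypothetical, not a value calibrated in this paper.

```python
def suggest_tau(counts):
    """Heuristic sketch: among community hyperedges (c > d/2), if almost all
    are pure (c == d), lean towards strict modularity; otherwise prefer a
    small tau such as 2 or 3. The 0.8 threshold is illustrative."""
    community = {cd: n for cd, n in counts.items() if cd[0] > cd[1] / 2}
    total = sum(community.values())
    if total == 0:
        return 2.0  # no community edges detected; default to a small tau
    pure = sum(n for (c, d), n in community.items() if c == d)
    return float("inf") if pure / total > 0.8 else 2.0
```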
5.2 Synthetic h–ABCD hypergraphs
We ran a series of experiments using the synthetic h–ABCD benchmark hypergraphs. For each family of hypergraphs, we considered a wide range of values for the noise parameter ξ, and for each ξ, we generated 30 independent copies of h–ABCD hypergraphs. For each hypergraph, we obtained clusterings in various ways:
taking the 2-section (weighted) graph and applying the Louvain algorithm several times, keeping the partition with the largest (graph) modularity (see the sketch after this list);
running Kumar’s algorithm;
running our h–Louvain algorithm with Bayesian optimization for τ = 2 and τ = 3, and
running our h–Louvain algorithm with Bayesian optimization using the strict modularity (|$\tau \to \infty$|) as the objective function.
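For the first of these approaches (repeated Louvain on the weighted 2-section graph), here is a minimal sketch using the Louvain implementation shipped with networkx; our experiments do not necessarily use this particular implementation.

```python
import networkx as nx

def best_louvain(G, runs=30, seed=0):
    """Run Louvain `runs` times on the weighted 2-section graph and keep the
    partition with the largest (graph) modularity."""
    best_part, best_q = None, float("-inf")
    for r in range(runs):
        comms = nx.community.louvain_communities(G, weight="weight", seed=seed + r)
        q = nx.community.modularity(G, comms, weight="weight")
        if q > best_q:
            best_part, best_q = comms, q
    return best_part, best_q
```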
In the analysis of the results, we computed the AMI of each partition with respect to the ground-truth communities. The plots in Figs 3–5 show the difference between the AMI of a given algorithm and the AMI of the 2-section result. In other words, we measure how much gain/loss is obtained by switching from the standard 2-section approach to our algorithms designed specifically for hypergraphs.
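Concretely, each point in the plots is a difference of two AMI scores; a sketch using scikit-learn, where the label arrays (one community label per node, in a fixed node order) are assumed inputs:

```python
from sklearn.metrics import adjusted_mutual_info_score

def ami_gain(labels_true, labels_2section, labels_alg):
    """AMI of the evaluated algorithm minus AMI of the 2-section baseline,
    both measured against the ground-truth labels."""
    base = adjusted_mutual_info_score(labels_true, labels_2section)
    return adjusted_mutual_info_score(labels_true, labels_alg) - base
```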
Fig. 3. Results with h–ABCD hypergraphs with the strict model for the community edge composition (strict_5 and strict_2to5), showing the AMI difference between the 2-section communities and the considered algorithms. Positive values indicate an increase in AMI for a given algorithm.
Fig. 4. Results with h–ABCD hypergraphs with the linear model for the community edge composition (linear_5 and linear_2to5), showing the AMI difference between the 2-section communities and the considered algorithms. Positive values indicate an increase in AMI for a given algorithm.
Fig. 5. Results with h–ABCD hypergraphs with the majority model for the community edge composition (majority_5 and majority_2to5), showing the AMI difference between the 2-section communities and the considered algorithms. Positive values indicate an increase in AMI for a given algorithm.
Here are some general remarks from those experiments:
The hypergraph-specialized algorithms (h–Louvain and Kumar’s) give the most substantial benefits for moderately noisy hypergraphs. The reason is that for hypergraphs with a very low level of noise (values of ξ close to zero) all algorithms produce similar results, as the community-finding problem is simple, while for very noisy hypergraphs (values of ξ close to one) the noise itself creates spurious communities that the algorithms start to recover (this effect has been previously studied and analytically analysed for the ABCD graphs in [48]).
Our h–Louvain algorithm with τ = 2 and τ = 3 outperforms Kumar’s algorithm most of the time. The exceptions are a few cases with a large amount of noise in the hypergraph (values of ξ close to one).
The strict h–Louvain modularity function (|$\tau \to \infty$|) may work poorly for hypergraphs that have many non-pure community hyperedges. For this reason, in the absence of a prior preference, users should follow the initial verification procedure described in Section 5.1 before using this parameterization. The reason is that for |$\tau \to \infty$|, hyperedges of type (c, d) with c < d are not counted as community hyperedges, which would lose potentially valuable information in cases when they are indeed informative.
5.3 More challenging case—synthetic hypergraphs with localized noise
In this experiment, we simulate an example in which the difficulty in recovering communities is due to the fact that several ‘noise’ edges touch a small number of communities instead of being sprinkled over many communities. To that end, we generated an h–ABCD hypergraph with n = 300 nodes, degree exponent |$\alpha = 2.5$| in the range |$[5, 30]$|, community size exponent |$\beta = 1.5$| in the range |$[40, 60]$|, and edges of size 5 with the purity distribution for community hyperedges set to (0.7, 0.2, 0.1), ie 70% of them have 3 community nodes (type (3, 5)), 20% have 4 (type (4, 5)), and 10% have 5 (pure hyperedges, type (5, 5)). The overall noise is set to |$\xi = 0.2$|. To this hypergraph, we added 35 5-edges whose nodes are randomly sampled only from the two smallest communities. This simulates ‘localized’ noise, which should make community detection more challenging.
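The localized-noise construction itself is simple; a sketch, assuming communities are given as lists of node ids (the h–ABCD generation step is not shown):

```python
import random

def add_localized_noise(hyperedges, communities, k=35, size=5, seed=0):
    """Add k extra hyperedges of a given size whose nodes are sampled
    uniformly from the union of the two smallest communities only."""
    rng = random.Random(seed)
    pool = [v for comm in sorted(communities, key=len)[:2] for v in comm]
    extra = [tuple(rng.sample(pool, size)) for _ in range(k)]
    return hyperedges + extra
```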
First, simulating a real-life application of the procedure proposed in Section 5.1, we look at the edge composition obtained when running a 2-section clustering; it is reported in Table 3 along with the edge composition for the ground-truth communities. This quick experiment indicates that smaller values of τ are likely to be a better choice than the strict modularity version, since there are not that many ‘pure’ edges.
Table 3. Number of edges of each type for the h–ABCD hypergraph with 5-edges and localized noise added, for the ground-truth partition and for a partition obtained by running Louvain on the weighted 2-section graph.

| d | c | Frequency (ground-truth) | Frequency (Louvain) |
|---|---|---|---|
| 5 | 5 | 58 | 73 |
| 5 | 4 | 158 | 150 |
| 5 | 3 | 247 | 215 |
| 5 | 2 | 120 | 146 |
| 5 | 1 | 7 | 6 |
We performed 100 runs for each choice of the τ-modularity for our h–Louvain: strict (|$\tau \to \infty$|) and |$\tau \in \{ 0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4\}$|. We also performed 100 runs using the 2-section modularity with Louvain, and using Kumar’s algorithm. The results are presented in Fig. 6. We see that clustering the 2-section graph or using Kumar’s algorithm yields good results, but these can be improved by optimizing the hypergraph τ-modularity with |$\tau \approx 2$|. As expected from the preliminary EDA we performed earlier, both small values of τ (close to zero) and large values (including the strict modularity, |$\tau \to \infty$|) are bad choices in this case.
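The experiment is a plain grid sweep; here is a hypothetical driver in which h_louvain stands in for our implementation (its exact signature is not part of this paper) and the labels are per-node community ids:

```python
import numpy as np
from sklearn.metrics import adjusted_mutual_info_score

def tau_sweep(H, labels_true, taus=(0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4), runs=100):
    """For each tau, run the (hypothetical) h_louvain solver `runs` times and
    record the mean and standard deviation of the AMI scores."""
    results = {}
    for tau in taus:
        scores = [adjusted_mutual_info_score(labels_true, h_louvain(H, tau=tau))
                  for _ in range(runs)]
        results[tau] = (np.mean(scores), np.std(scores))
    return results
```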
Fig. 6. Results of 100 runs for several choices of the hypergraph τ-modularity for h–Louvain (strict, and with |$0 \le \tau \le 4$|), as well as using Louvain on the 2-section modularity and Kumar’s algorithm.
5.4 Real-world hypergraphs
We consider two real-world hypergraphs: the primary-school contact hypergraph and the cora hypergraph. We first run the EDA procedure suggested in Section 5.1 on both by looking at the hyperedge composition associated with the corresponding 2-section clusterings. The results are presented in Table 4. We can see that the primary-school hypergraph has relatively more non-pure hyperedges than the cora hypergraph. This indicates that the optimal τ for the primary-school case should be smaller than the one for the cora hypergraph.
Table 4. The number of edges of each type (top-10 most frequent) for the primary-school contact and cora hypergraphs. The corresponding partitions were obtained by running Louvain on the weighted 2-section graph.

primary-school:

| d | c | Purity | Frequency |
|---|---|---|---|
| 2 | 1 | 50% | 4051 |
| 2 | 2 | 100% | 3697 |
| 3 | 3 | 100% | 3385 |
| 3 | 2 | 67% | 1054 |
| 3 | 1 | 33% | 161 |
| 4 | 4 | 100% | 240 |
| 4 | 3 | 75% | 58 |
| 4 | 2 | 50% | 47 |
| 5 | 4 | 80% | 3 |
| 5 | 3 | 60% | 3 |

cora:

| d | c | Purity | Frequency |
|---|---|---|---|
| 2 | 2 | 100% | 512 |
| 3 | 3 | 100% | 354 |
| 4 | 4 | 100% | 212 |
| 5 | 5 | 100% | 108 |
| 3 | 2 | 67% | 85 |
| 4 | 3 | 75% | 62 |
| 2 | 1 | 50% | 60 |
| 5 | 4 | 80% | 41 |
| 4 | 2 | 50% | 32 |
| 5 | 3 | 60% | 21 |
5.4.1 Contact hypergraph
Let us first consider the primary-school contact hypergraph described in Section 3.2. The results are shown in Fig. 7, where we compare 2-section (graph) clustering with Louvain, Kumar’s, and AON clustering, as well as our h–Louvain using different values of τ for the modularity function. The AMI scores are averaged over 30 runs; the variance is negligible and is not shown. From this experiment, we see that one can get some improvement over the 2-section Louvain or Kumar’s algorithms when using small values of τ in our h–Louvain.
Fig. 7. Results for several choices of the hypergraph τ-modularity for h–Louvain (strict, and with |$0 \le \tau \le 4$|), as well as using the 2-section modularity with the Louvain, Kumar’s, and AON algorithms.
5.4.2 Co-citation hypergraphs
Next, we consider the cora co-citation hypergraph described in Section 3.2, in which nodes are publications belonging to 7 categories and hyperedges represent co-citations. Since the hypergraph has several small disconnected components, we restrict ourselves to the giant component, which has 1330 nodes and 1503 hyperedges.
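Restricting to the giant component can be done through the connectivity of any graph that links the nodes of each hyperedge; a sketch using networkx (a path per hyperedge suffices, since only connectivity matters):

```python
import networkx as nx

def giant_component(hyperedges):
    """Keep only the hyperedges whose nodes all lie in the largest connected
    component, with connectivity inherited from the hypergraph."""
    G = nx.Graph()
    for e in hyperedges:
        nodes = list(e)
        G.add_nodes_from(nodes)
        G.add_edges_from(zip(nodes, nodes[1:]))  # a path is enough for connectivity
    giant = max(nx.connected_components(G), key=len)
    return [e for e in hyperedges if set(e) <= giant]
```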
We ran each clustering algorithm 50 times, with the results reported in Fig. 8. The results with AON were worse in this case (with AMI around 0.21; not reported in Fig. 8). Instead, we report the results when AON is started from a partition returned by the 2-section graph clustering, which gives better results. As expected from the EDA comparing the primary-school and cora hypergraphs, for the cora hypergraph we get good results running h–Louvain with larger values of τ than for the primary-school hypergraph, slightly improving on the results of 2-section graph clustering with Louvain, Kumar’s algorithm, and AON.
Fig. 8. Results for the cora co-citation hypergraph.
6 CONCLUSIONS
In this paper, we proposed h–Louvain, a modification of the classical Louvain algorithm that allows us to optimize the hypergraph modularity. Our approach is to optimize a weighted average of the 2-section graph modularity and the hypergraph modularity, with an increasing weight on the hypergraph modularity component as the optimization process progresses. We presented both theoretical arguments and empirical evidence that this approach is efficient. Since there are several ways to update this weight, we developed a method allowing for automatic selection of the hyperparameters of this process using Bayesian optimization. We have shown that the h–Louvain algorithm is competitive and, in particular, that it can outperform both Louvain on the 2-section graph and Kumar’s algorithm in terms of recovering ground-truth communities, both for synthetic and for real networks.
Additionally, let us mention another important and interesting aspect. Since the optimization process in h–Louvain is stochastic by nature, the results of a single optimization pass can easily be improved by running many such optimizations in parallel. Therefore, an important extension of the algorithm would be to allow it to learn how to dynamically set the tuneable parameters when multiple optimization processes are executed.
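A sketch of this embarrassingly parallel pattern, again with h_louvain as a hypothetical stand-in that returns a (partition, modularity) pair:

```python
from concurrent.futures import ProcessPoolExecutor

def parallel_best(H, n_runs=8):
    """Run several independent stochastic optimizations and keep the
    partition achieving the highest hypergraph modularity."""
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(h_louvain, [H] * n_runs))
    return max(results, key=lambda pair: pair[1])
```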
A APPENDIX—PSEUDO-CODE OF THE H-LOUVAIN ALGORITHM
h-Louvain(H, Γ)
Input: H = (V, E) – input hypergraph; Γ – policy to control α ∈ [0, 1], defined using parameters pb and pc
Output: A – partition of V; q_H(A) – hypergraph modularity of A

    Initialize: build G = (V, E_G) (the 2-section graph) and set partition A with every vertex v ∈ V in its own cluster
    modified ← True
    initialize α according to the policy Γ
    while modified do
        modified ← False
        improved ← True
        while improved do
            improved ← False
            randomize the order of vertices in V
            for v ∈ V do
                bestCommunity ← A_i (the current community of v)
                bestDelta ← 0
                for all neighbouring communities A_j of v do
                    deltaModularity ← (1 − α) Δq_G(A′) + α Δq_H(A′)   (A′: the partition with v moved from A_i to A_j)
                    if deltaModularity > bestDelta then
                        bestDelta ← deltaModularity
                        bestCommunity ← A_j
                        improved ← True
                    end if
                end for
                if bestDelta > 0 then
                    change A by moving v to bestCommunity
                    modified ← True
                    α ← UpdateAlpha(A, n, α, Γ)
                end if
            end for
        end while
        if modified then
            update H = (V, E) (merge current communities into supernodes and update edges accordingly)
        else if α < 1 then
            α ← 1
            revert the last community-merging step
            modified ← True
        end if
    end while
    return A, q_H(A)
UpdateAlpha(A, n, α, Γ)
Input: A – current partition; n – total number of nodes; α – its previous value; Γ – policy to control α ∈ [0, 1], defined using parameters pb and pc
Output: new value of α

    if α < 1 then
        |A| ← number of communities in the current partition A
        compute the new value of α from |A|, n, and the policy Γ (parameters pb and pc)
        return α
    else
        return 1
    end if
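The move criterion at the heart of the pseudo-code is a convex combination of the two modularity gains; a sketch, where delta_qG and delta_qH are hypothetical callables returning the gain of moving v from A_i to A_j under each objective:

```python
def blended_delta(delta_qG, delta_qH, v, A_i, A_j, alpha):
    """(1 - alpha) * dqG + alpha * dqH: early in the optimization (alpha near
    zero) the 2-section gain dominates; as alpha grows towards one, the
    hypergraph modularity gain takes over."""
    return (1.0 - alpha) * delta_qG(v, A_i, A_j) + alpha * delta_qH(v, A_i, A_j)
```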