How `cliquematch` works¶

cliquematch implements an exact algorithm to find maximum cliques, with a lot of pruning optimizations for large sparse graphs. The algorithm used is similar to FMC [1] and PMC [2], but also uses the concept of bitset-compression from BBMC [3] to save memory. It is implemented in C++11, does not use any additional libraries apart from the C++ STL, and is single-threaded.

Avoiding memory allocations¶

In earlier versions of cliquematch, the clique search methods were occasionally slowed due to many requests of small amounts from heap memory. In version 2.1, cliquematch does exactly one heap allocation – a std::vector large enough to hold a maximum clique – at the start of every search method, be it the heuristic search, finding one clique or enumerating cliques. During the clique search, it performs zero memory allocations. After a clique is found, it requests memory to return the clique to the user. Thus the number of heap allocations is mostly constant: a set number of allocations to initialize the Graph object (changes slightly depending on size of the graph), zero allocations during the search, and one allocation per clique returned.

This is possible because clique size always has an upper bound (degree, core-number, or the measure used in cliquematch all provide such a bound) and because bitsets are used for representing branches of the clique search. Thus cliquematch computes an upper bound for clique size (ie how deep a search can go), the maximum bitset space required for a vertex ( \(\lceil \frac{d_{max}}{64} \rceil\) ) and allocates space accordingly. The allocated space is reused throughout the search, thus avoiding any heap allocations during the most used part of the program.

Benchmarking the `cliquematch` algorithm¶

The below table shows how cliquematch compares to FMC and PMC.

\(name\)	\(V\)	\(E\)	\(w\)	\(t_{cm}\)	\(t_{fmc}\)	\(t_{pmc}\)	\(t_{cm\_heur}\)	\(w_{cm\_heur}\)	\(t_{fmc\_heur}\)	\(w_{fmc\_heur}\)
Erdos02	6927	8472	7	0.0001	0.0015	0.0022	0.0001	6	0.0009	7
Erdos972	5488	7085	7	0.0001	0.0004	0.0016	0.0001	7	0.0002	6
Erdos982	5822	7375	7	0.0001	0.0004	0.0018	0.0	7	0.0002	7
Erdos992	6100	7515	8	0.0001	0.0004	0.0018	0.0	8	0.0002	8
Fault_639	638802	14626683	18	2.0622	13.8244		0.5864	18	2.4625	18
brock200_2	200	9876	12	0.217	0.6421	0.0016	0.0019	10	0.0023	9
c-fat200-5	200	8473	58	0.0023	0.4169	0.0003	0.0001	58	0.0102	58
ca-AstroPh	18772	198110	57	0.0004	0.0789	0.0132	0.0003	56	0.0277	57
ca-CondMat	23133	93497	26	0.0003	0.0048	0.0083	0.0002	26	0.0042	26
ca-GrQc	5242	14496	44	0.0001	0.0005	0.0017	0.0001	43	0.0012	44
ca-HepPh	12008	118521	239	0.0034	0.0131	0.0149	0.0012	239	0.2518	239
ca-HepTh	9877	25998	32	0.0001	0.0007	0.0035	0.0	32	0.0001	32
caidaRouterLevel	192244	609066	17	0.0257	0.2098	0.0774	0.0084	17	0.073	15
coPapersCiteseer	434102	16036720	845	0.0248	0.8344	1.4019	0.0144	845	14.6734	845
com-Youtube	1134890	2987624	17	1.1393	9.618		0.1412	16	0.3515	13
cond-mat-2003	31163	120029	25	0.0007	0.0118	0.0096	0.0004	25	0.0038	25
cti	16840	48232	3	0.0014	0.0036	0.0046	0.0009	3	0.0013	3
hamming6-4	64	704	4	0.0002	0.0003	8.5115	0.0	4	5.6982	4
johnson8-4-4	70	1855	14	0.0405	0.1459	0.0003	0.0001	11	0.0005	14
keller4	171	9435	11	3.4486	15.4211		0.0016	9	0.0027	9
loc-Brightkite	58228	214078	37	5.0651	2.7146	0.0277	0.0038	36	0.0151	31

fmc -t 0 -p was used to run the FMC branch-and-bound algorithm.
pmc -t 1 -r 1 -a 0 -h 0 -d (single CPU thread, reduce wait time of 1 second, full algorithm, skip heuristic, search in descending order) was used to run the PMC branch-and-bound algorithm.
fmc -t 1 was used to run the FMC heuristic algorithm.
A small script was used to run the cliquematch algorithm. cliquematch was compiled with BENCHMARKING=1.

The benchmark graphs were taken from the UFSparse [4], SNAP [5], and DIMACS [6] collections. \(|V|\) and \(|E|\) denote the number of nodes and edges in the graph. \(\omega\) denotes the size of the maximum clique found. All the branch-and-bound methods agreed on the maximum clique size in every benchmark. \(t_{cm}\), \(t_{fmc}\), and \(t_{pmc}\) denote the time taken by cliquematch, FMC, and PMC respectively in the branch-and-bound search: the least time is in bold text. \(w_{cm-heur}\), \(t_{cm-heur}\) denote the size of clique and time taken by the heuristic method in cliquematch; and similarly for \(w_{fmc-heur}\) and \(t_{fmc-heur}\). A minus sign (-) indicates that the program returned an error without completing the calculation.

Correspondence Graphs in `C++`¶

The correspondence graph classes are generated using simple C++ template programming (simple because it’s basically a typed macro substitution). cliquematch can be extended with custom correspondence graphs: they can be prototyped using the existing classes, and/or implemented in C++ for performance. The source code of IsoGraph or AlignGraph are good places to start if you’re trying to implement a custom correspondence graph class.

References¶

1: Bharath Pattabiraman, Md Mostofa Ali Patwary, Assefaw H Gebremedhin, Wei-keng Liao, and Alok Choudhary. Fast algorithms for the maximum clique problem on massive graphs with applications to overlapping community detection. Internet Mathematics, 11(4-5):421–448, 2015.
2: Ryan A Rossi, David F Gleich, and Assefaw H Gebremedhin. Parallel maximum clique algorithms with applications to network analysis. SIAM Journal on Scientific Computing, 37(5):C589–C616, 2015.
3: Pablo San Segundo, Diego Rodríguez-Losada, and Agustín Jiménez. An exact bit-parallel algorithm for the maximum clique problem. Computers & Operations Research, 38(2):571–581, 2011.
4: Timothy A Davis and Yifan Hu. The university of florida sparse matrix collection. ACM Transactions on Mathematical Software (TOMS), 38(1):1–25, 2011.
5: Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford large network dataset collection. https://snap.stanford.edu/data, June 2014.
6: David S Johnson and Michael A Trick. Cliques, coloring, and satisfiability: second DIMACS implementation challenge, October 11-13, 1993. Volume 26. American Mathematical Soc., 1996.

How cliquematch works¶

Avoiding memory allocations¶

Benchmarking the cliquematch algorithm¶

Correspondence Graphs in C++¶

References¶

How `cliquematch` works¶

Benchmarking the `cliquematch` algorithm¶

Correspondence Graphs in `C++`¶