Union Find
Description
This is yet another important graph analytics algorithm. By using a disjoint-set - a data structure that keeps track of non-overlapping sets, the algorithm enables the user to quickly check whether a pair of given nodes is in the same or different connected component. The benefit of having this structure is that a check on a pair of nodes is effectively executed in O(1) time.
The implementation of the disjoint-set data structure and its operations uses the union by rank and path splitting optimizations described in "Worst-case Analysis of Set Union Algorithms", developed by Robert Tarjan and Jan van Leeuwen.
Structure of the disjoint set on the right side, and graph example on the left. When checking the components, the algorithm only checks the shallow tree on the left
Materials
Implementation
Union Find is implemented as part of the MAGE project. Be sure to check it out in the link above. ☝️
Use cases
Social graphs are often enormous in size. Searching through them and determining whether something is in the same component might take a lot of time. By organizing such structures in disjoint sets, and using the union-find algorithm, it is possible to determine this connectivity extremely efficiently.