TL;DR: Iterate through the edges and use a Union-Find (Disjoint Set Union) data structure to track connected components; the first edge that connects two nodes already within the same component is the answer.
The LeetCode 684 problem, "Redundant Connection," asks us to identify an edge in a given graph that, if removed, would turn the graph into a tree. We are given a graph with N nodes and N edges. By definition, a tree with N nodes must have exactly N - 1 edges. The input graph is essentially a tree with one additional edge added, which inevitably creates a cycle. Our goal is to find this extra edge. If there are multiple edges that could be removed to restore the tree property, we must return the one that appears last in the input array.
This is a classic problem for testing understanding of graph connectivity and cycle detection, making it a popular interview question at companies like Amazon.
A naive approach to solving the Redundant Connection problem involves checking the connectivity of the graph incrementally using standard traversal algorithms like Depth-First Search (DFS) or Breadth-First Search (BFS).
The logic proceeds as follows:
1. Iterate through the edges one by one.
2. For each edge [u, v], before adding it to the graph, perform a DFS or BFS to check if a path already exists between node u and node v.
3. If a path already exists, adding [u, v] would create a cycle. Since we are processing edges in the order given, this edge is the one that completes the cycle and is "redundant" according to the problem constraints (specifically the "last occurring" requirement).

```text
function findRedundantConnection(edges):
    graph = empty adjacency list
    for each edge [u, v] in edges:
        if hasPath(u, v, graph):  // Perform DFS/BFS
            return [u, v]
        add edge [u, v] to graph
    return []
```
The time complexity of this brute force approach is O(N²). In the worst case, for every edge we process, we might traverse the entire graph built so far to check for connectivity. Since there are N edges and each DFS/BFS takes up to O(N) time, the total work is quadratic.
While O(N²) might pass for small constraints (N is at most 1000 for this problem), it is inefficient compared to the optimal solution. The primary failure of this approach is that it repeats the work of traversing the graph for every new edge, failing to leverage the state of connected components efficiently.
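For reference, the brute force described above can be sketched in Python. This is a minimal sketch, not the original author's code; the function name and helper `has_path` are illustrative:

```python
from collections import defaultdict

def find_redundant_connection_naive(edges):
    """Brute force: before adding each edge, run a DFS to see if its
    endpoints are already connected. O(N^2) overall."""
    graph = defaultdict(list)

    def has_path(src, dst, visited):
        if src == dst:
            return True
        visited.add(src)
        return any(
            nxt not in visited and has_path(nxt, dst, visited)
            for nxt in graph[src]
        )

    for u, v in edges:
        # A path can only exist if both endpoints have appeared before
        if u in graph and v in graph and has_path(u, v, set()):
            return [u, v]
        graph[u].append(v)
        graph[v].append(u)
    return []
```

Each call to `has_path` re-explores the graph built so far, which is exactly the repeated work the optimal solution avoids.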
The key intuition for the optimal solution lies in the definition of a tree and a cycle. A tree is a connected graph with no cycles. If we start with N isolated nodes and add edges one by one, we are essentially merging separate trees (or components) together.
As long as an edge connects two nodes that are currently in different components, that edge is necessary for connectivity; it acts as a bridge merging two separate sets. However, if we encounter an edge [u, v] where u and v are already in the same component (i.e., they share a common root or ancestor), adding this edge creates a path back to a node already reachable, thus forming a cycle.
Because the problem asks for the edge occurring last in the input that can be removed, and we process the input array linearly, the very first time we encounter an edge connecting two already-connected nodes, that edge is the one that "closes" the loop. In the context of the input order, this is the redundant connection.
Visual Description: Imagine the nodes as distinct sets.
- Processing [1, 2], we verify the roots of 1 and 2. If they are different, we merge them. Now {1, 2} is one set.
- Processing [2, 3], we merge node 3 into the set {1, 2}. Now {1, 2, 3} is one set.
- Processing [1, 3], we check the roots. Both 1 and 3 belong to the same set. This implies a path already exists between them (via 2). Therefore, [1, 3] is the redundant edge.
We will implement the Union-Find data structure to solve this efficiently. The strategy relies on maintaining a parent array where parent[i] points to the parent of node i.
1. Initialization: Create a parent array of size N + 1 (since nodes are 1-indexed). Initially, every node is its own parent (parent[i] = i).
2. Union: Given u and v, we find their roots. If the roots are different, we set the parent of one root to be the other. We can use union by rank (attaching the shorter tree to the taller tree) to further optimize, though path compression alone is usually sufficient for interview code.
3. Iteration: Iterate through edges. For each edge [u, v]:
   - Compute rootU = find(u) and rootV = find(v).
   - If rootU == rootV, it means u and v are already connected. This edge closes a cycle. Return [u, v].
   - Otherwise, call union(rootU, rootV) to merge the sets.

Let's trace the algorithm with Example 1: edges = [[1,2], [1,3], [2,3]].
Setup:
- parent array: [0, 1, 2, 3] (indices 1, 2, 3 are relevant).

Process Edge [1, 2]:

- find(1) returns 1; find(2) returns 2.
- Roots differ, so we merge: parent[2] = 1.
- parent: [0, 1, 1, 3]. Set {1, 2} is connected.

Process Edge [1, 3]:

- find(1) returns 1; find(3) returns 3.
- Roots differ, so we merge: parent[3] = 1.
- parent: [0, 1, 1, 1]. Set {1, 2, 3} is connected.

Process Edge [2, 3]:

- find(2): 2 -> 1. Returns 1.
- find(3): 3 -> 1. Returns 1.
- Roots match (1 == 1): 2 and 3 are already connected.
- Return [2, 3].

The algorithm relies on the invariant that the parent array correctly partitions nodes into disjoint sets based on the edges processed so far. If find(u) == find(v), there exists a path between u and v using previously processed edges. Thus, the first edge that fails the union check is the correct answer.
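The trace above can be reproduced with a short standalone script (a sketch; the `trace` helper is illustrative and uses plain root-walking without path compression so the parent array matches the walkthrough exactly):

```python
def trace(edges):
    n = len(edges)
    parent = list(range(n + 1))  # parent[i] = i initially

    def find(i):
        # Walk parent pointers up to the root
        while parent[i] != i:
            i = parent[i]
        return i

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            print(f"edge [{u}, {v}] closes a cycle -> answer")
            return [u, v]
        parent[rv] = ru  # merge v's root under u's root
        print(f"edge [{u}, {v}] merged, parent = {parent}")

trace([[1, 2], [1, 3], [2, 3]])
```

Running it prints the same intermediate parent arrays as the walkthrough and returns [2, 3].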
```cpp
class Solution {
    vector<int> parent;

    // Find function with path compression
    int find(int i) {
        if (parent[i] == i) {
            return i;
        }
        // Path compression: point node directly to root
        return parent[i] = find(parent[i]);
    }

    // Union function
    void unite(int i, int j) {
        int rootI = find(i);
        int rootJ = find(j);
        if (rootI != rootJ) {
            parent[rootI] = rootJ;
        }
    }

public:
    vector<int> findRedundantConnection(vector<vector<int>>& edges) {
        int n = edges.size();
        // Initialize parent array. Nodes are 1-indexed.
        parent.resize(n + 1);
        for (int i = 1; i <= n; i++) {
            parent[i] = i;
        }

        for (const auto& edge : edges) {
            int u = edge[0];
            int v = edge[1];
            // If roots are the same, this edge creates a cycle
            if (find(u) == find(v)) {
                return edge;
            }
            // Otherwise, merge the sets
            unite(u, v);
        }
        return {}; // Should not reach here given problem constraints
    }
};
```
```java
class Solution {
    private int[] parent;

    private int find(int i) {
        if (parent[i] == i) {
            return i;
        }
        // Path compression
        parent[i] = find(parent[i]);
        return parent[i];
    }

    private void union(int i, int j) {
        int rootI = find(i);
        int rootJ = find(j);
        if (rootI != rootJ) {
            parent[rootI] = rootJ;
        }
    }

    public int[] findRedundantConnection(int[][] edges) {
        int n = edges.length;
        parent = new int[n + 1];
        // Initialize parent pointers
        for (int i = 1; i <= n; i++) {
            parent[i] = i;
        }

        for (int[] edge : edges) {
            int u = edge[0];
            int v = edge[1];
            // If they share the same root, we found the cycle
            if (find(u) == find(v)) {
                return edge;
            }
            union(u, v);
        }
        return new int[0];
    }
}
```
```python
from typing import List

class Solution:
    def findRedundantConnection(self, edges: List[List[int]]) -> List[int]:
        parent = list(range(len(edges) + 1))

        def find(i):
            if parent[i] != i:
                # Path compression
                parent[i] = find(parent[i])
            return parent[i]

        def union(i, j):
            root_i = find(i)
            root_j = find(j)
            if root_i != root_j:
                parent[root_i] = root_j
                return True
            return False

        for u, v in edges:
            # If union returns False, roots were already the same -> cycle detected
            if not union(u, v):
                return [u, v]

        return []
```
Time Complexity:
We process N edges. For each edge, we perform two find operations and at most one union. With path compression and union by rank, the amortized time complexity of these operations is O(α(N)), where α is the Inverse Ackermann function. For all practical values of N, α(N) is at most 4, making each operation effectively constant and the overall algorithm effectively O(N).
Space Complexity: O(N). We require an array of size N + 1 to store the parent pointers (and optionally rank/size) for the Union-Find structure.
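The solutions above use path compression alone; as noted earlier, union by rank is an optional further optimization. A minimal standalone sketch combining both (the `DSU` class name and layout are illustrative, not from the original solutions):

```python
class DSU:
    """Union-Find with both path compression and union by rank."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n  # upper bound on each tree's height

    def find(self, i):
        if self.parent[i] != i:
            # Path compression: point node directly at its root
            self.parent[i] = self.find(self.parent[i])
        return self.parent[i]

    def union(self, i, j):
        ri, rj = self.find(i), self.find(j)
        if ri == rj:
            return False  # already connected: this edge would close a cycle
        # Union by rank: attach the shorter tree under the taller one
        if self.rank[ri] < self.rank[rj]:
            ri, rj = rj, ri
        self.parent[rj] = ri
        if self.rank[ri] == self.rank[rj]:
            self.rank[ri] += 1
        return True
```

With this class, the main loop of the solution reduces to returning the first edge for which `union` reports False.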
The Union-Find pattern is highly versatile for graph connectivity problems.
Many related problems, such as counting connected components or checking whether a graph is a valid tree, reuse the same parent array logic.

Why does the naive DFS solution get Time Limit Exceeded (TLE)?

Because it rebuilds connectivity information from scratch for every edge: each check is an O(N) traversal of the graph built so far, giving O(N²) total work. Union-Find instead maintains component membership incrementally and answers each connectivity query in near-constant amortized time.
Why is Path Compression necessary in Union-Find?
Without path compression, repeated unions can leave the internal trees shaped like long chains. Each call to find then walks the chain without updating the parent pointers, making a single find operation O(N) in the worst case. Path compression flattens the tree as a side effect of find, keeping the amortized cost near constant.

Can we solve this using Topological Sort?

Topological sort is defined for directed acyclic graphs, while this problem's graph is undirected, so it does not apply directly. Cycle detection via Union-Find (or DFS) is the natural tool here.
Does the order of edges matter?

Yes. The problem asks for the redundant edge that occurs last in the input, and processing edges in input order guarantees that the first edge found to connect two already-connected nodes is exactly that edge.
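The path-compression point above can be made concrete with a small experiment on an adversarial chain-shaped input (all helper names are illustrative):

```python
def build_chain(n):
    """Worst case: parent pointers form a single chain 0 <- 1 <- ... <- n-1."""
    parent = list(range(n))
    for i in range(1, n):
        parent[i] = i - 1
    return parent

def find_no_compression(parent, i):
    """Walk to the root, counting hops; never rewrites pointers."""
    steps = 0
    while parent[i] != i:
        i = parent[i]
        steps += 1
    return i, steps

def find_with_compression(parent, i):
    """Two-pass iterative path compression: locate the root, then
    point every node on the path directly at it."""
    root = i
    while parent[root] != root:
        root = parent[root]
    while parent[i] != root:
        parent[i], i = root, parent[i]
    return root

parent = build_chain(1000)
_, steps_before = find_no_compression(parent, 999)  # walks all 999 links
find_with_compression(parent, 999)                  # flattens the chain
_, steps_after = find_no_compression(parent, 999)   # now a single hop
```

After one compressed find, the deepest node reaches its root in one step instead of 999, which is why amortized costs drop so sharply.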