Algorithm: Minimum Spanning Tree
This blog is about a very common algorithm applied on a connected undirected graph. So the algorithm is known as minimum spanning tree. In this blog we will talk about what it is, how to find it, its applications, which algorithm is better. Lets get familiarised with background before getting into further specifics.
So what is a graph?
Graph is a non-linear data structure which comprises nodes and edges. It is a pictorial portrayal of a set of objects where a few sets of items are associated by links.The points that interface the interconnected items are called vertices, and the edges are the associations that interface the vertices.
Graph is broadly distributed in Directed graph and Undirected graph.
What is a Spanning Tree?
To start with Minimum spanning tree (MST) one must get a hunch of what is a spanning tree.
Spanning tree is the subset of graph. Say u have a graph G(V,E) then your spanning tree will contain the same number of vertices ( in this case V) but the minimum number of edges (in this case edges will be ≤ E).
Certain Properties of spanning tree..
•More than one spanning tree can exist
•All possible spamming trees have same no. of edges and vertices
•Don’t have cycles
•Minimally connected
•Adding any edge will create cycle
•Spanning trees edges are equal to (no. of vertices)-1
What is a MST?
Minimum spanning tree is a variant of spanning tree. It is defined as a spanning tree with minimum number of weights in an unweighted graph. Whereas in a weighted graph it is the spanning tree which have minimal number of edges along with the total weights of all the edges being minimum.
Why do we need MST?
Ok we have gone through what is MST but here a major question arises that why do we need a MST. So to understand that let's take a simple example.
Say, you have started a new telecommunications company and you have towers that spread across a large area. Now you want to connect those towers from each other so that the data can be passed. Connecting different towers involve in different costs. So now the problem is that you have to minimize the cost. How will you do that?
So, here the solution would be finding a MST because if its not a spanning tree, you can remove a connection and then also it would be still be connected. So to minimise the cost, it has to be a spanning tree with minimum total cost.
How can we find MST in a graph?
For finding minimum spanning tree in a graph, we majorly use two popular algorithm which are:
- Kruskal’s Algorithm
- Prim’s Algorithm
Lets first start understanding Kruskal’s Algorithm….
Kruskal's Algorithm:
It is an algorithm based on greedy approach . It basically focus on edges. It adds the least weighted edge one by one until the spanning tree is being formed.
So the steps to get a MST from Kruskal’s Algorithm are as follows:
- Sort the edges in ascending order (in reference to edge’s weight).
- Start by adding edges one by one from least to most weighted.
- If adding any edge creates a cycle skip that edge and go to the next.
As we can see in Fig 4 we were given a weighted graph with 7 vertices. Hence the resulting minimum spanning tree would be of 6 edges (|E|=|V - 1|= 7-1 = 6). In this examples these r the following steps:
Step 1: Start with a weighted graph.
Step 2: Choose the edge with the least weight, if there are more than 1, choose anyone.
Step 3: Choose the next shortest edge that doesn’t create a cycle and add it
Step 4: Repeat until you have a spanning tree.
Kruskal’s Algorithm Pseudocode
Any minimum spanning tree algorithm revolves around determining whether adding an edge would result in a loop.
The most widely recognized approach to track down this out is an algorithm called Union Find. The Union-Find algorithm divides vertices into clusters, allowing us to determine if two vertices belong to the same cluster and thus if adding an edge would result in a loop.
Complexity of Kruskal's Algorithm
Since, the fundamental time spent in arranging the edges so the time complexity is O(E log(V)) or O(E log(E)), where E is the quantity of edges and V is the quantity of vertices.
Both of these are comparable as E ≤ V*V, hence log(E) ≤ 2*log(V).
Prim’s Algorithm:
This algorithm, like Kruskal’s Algorithm, takes a greedy approach. It begins with a minimum spanning tree that is empty. There are two sets of vertices in it. One set contains vertices that have been included in the minimum spanning tree, while the other set contains vertices that have not yet been included in the minimum spanning tree. It is complete when all vertices in the set containing MST are included.
So the steps to get a MST from Prim’s Algorithm are as follows:
- Make a set of vertices that are used in MST.
- Make a list of edges that connect vertices in the set to vertices that aren’t in the set.
- Select a vertex and add it to the set.
- To the list generated in step 2, add all the edges that contain a vertex.
- Choose the smallest weighted edge and search to see if one of the vertex is missing from the set. Remove edge from list and repeat the process if both of them are in the set.
- Change the list of edges by adding the vertices to the set.
- Proceed until no vertices are left in the sequence.
As we can see in Fig 6we were given a weighted graph with 7 vertices. Hence the resulting minimum spanning tree would be of 6 edges (|E|=|V — 1|= 7–1 = 6). In this examples these r the following steps:
Step 1: Start with a weighted graph.
Step 2: Choose a vertex
Step 3: Choose the shortest edge from this vertex and add it
Step 4: Choose the nearest vertex not yet in the solution
Step 5: Choose the nearest edge not yet in the solution, if there are multiple choices, choose one at random.
Step 6: Repeat until you have a spanning tree.
Prim’s Algorithm Pseudocode
The pseudocode for prim’s algorithm exhibits how two arrangements of vertices, U and V-U, are made. V-U is a bunch of vertices that haven’t been visited, and U is a rundown of vertices that haven’t been visited. By associating the least weight vertex, we change vertices from set V-U to set U individually.
Time complexity of Prim’s Algorithm
The time complexity of Prim’s algorithm is O((V+E)(log(V)). As an addition in minimum priority queue takes log(V). Priority Queue is needed as we need to get least weighted edges.
Which is better Prim’s or Kruskal’s algorithm?
If we compare them with time complexity, they both have same time complexity. Both the algorithms take O(E log(V)) time.
But, Kruskal is easy to implement when compared to Prim’s. So, now the question arises, when is it appropriate to use Prim’s? The answer is in case of dense graphs. In the case of dense graphs, the number of edges are high. So, creating the minimum spanning tree while focusing on vertices are better than doing it with edges as it is very difficult to tackle all the edges while it would be easy to just focus on near by edges of a particular vertice.