UNIT V
Hashing: Introduction – Hash function – methods - Hash table implementation - rehashing.
Graph: Directed and undirected graph – representation of graphs – graph traversals: Depth first
search – Breadth first search – transitive closure – spanning trees – application - topological
sorting.
Objective:
Outcomes:
The student will have a strong background of graph theory which has diverse applications
in the areas of computer science, biology, chemistry, physics, sociology, and engineering.
5.1 Hashing:
Hashing is used for storing relatively large amounts of data in a table called a hash table.
Hashing is a technique for performing insertions, deletions, and finds in constant average
time.
Hash table:
A hash table is a data structure in which keys are mapped to array positions by a hash
function.
The hash table usually has a fixed size M, which is larger than the number of items that we
want to store.
Hash function:
A hash function is a mathematical formula that, applied to a key, produces an integer which can
be used as the index for that key in the hash table.
Perfect hash function - Each key is transformed into a unique storage location.
Imperfect hash function - More than one key is mapped to the same storage location.
5.1.1 Division-method:
In this method we use modular arithmetic: the key value is divided by an integer divisor m,
and the remainder gives the location where the element can be placed.
L = (k mod m)
L -> location in table
k -> key value
m -> table size
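A minimal sketch of the division method in Python. The function name division_hash and the sample keys are illustrative assumptions, not part of the notes.

```python
def division_hash(k: int, m: int) -> int:
    """Return the table location L = k mod m for key k and table size m."""
    if m <= 0:
        raise ValueError("table size m must be positive")
    return k % m


# Illustrative keys (assumed) hashed into a table of size m = 10.
locations = {k: division_hash(k, 10) for k in (72, 27, 36)}
print(locations)
```

With m = 10 the location is simply the last decimal digit of the key, which is why a prime table size is often preferred in practice.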
The square is 256, so the address is 56 (two digits starting from the middle of 256).
In this method the key is partitioned into a number of parts, each part having the same length
as the required address. Add the values of the parts, ignoring the final carry, to get the
required address.
Partitioned: 12, 34, 56, 78
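The folding step can be sketched as follows. This assumes the partitioned parts 12, 34, 56, 78 come from the key 12345678 and that the required address has two digits; the function name fold_hash is hypothetical.

```python
def fold_hash(key: int, address_digits: int) -> int:
    """Split the key into parts of address_digits digits, add them,
    and drop the final carry by keeping only the low-order digits."""
    digits = str(key)
    parts = [int(digits[i:i + address_digits])
             for i in range(0, len(digits), address_digits)]
    return sum(parts) % (10 ** address_digits)


# 12 + 34 + 56 + 78 = 180; ignoring the carry leaves 80.
address = fold_hash(12345678, 2)
print(address)
```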
If the element to be inserted is mapped to the same location where an element is already
stored, then we have a collision, and it must be resolved.
Separate chaining – used with open hashing
Open addressing – used with closed hashing
For example, consider the following keys, which are to be placed in their home buckets:
131, 3, 4, 21, 61, 24, 7, 97, 8, 9
We apply the hash function
H(key) = key mod D
where D is the size of the table. Here D = 10.
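As a sketch, assuming this example is meant to illustrate separate chaining, the table for these keys can be built as below. Each bucket is modelled as a Python list standing in for a linked chain; the function name build_chained_table is hypothetical.

```python
def build_chained_table(keys, d):
    """Place each key in bucket (key mod d); colliding keys share a chain."""
    table = [[] for _ in range(d)]
    for key in keys:
        table[key % d].append(key)
    return table


keys = [131, 3, 4, 21, 61, 24, 7, 97, 8, 9]
table = build_chained_table(keys, 10)
for index, chain in enumerate(table):
    print(index, chain)
```

Keys 131, 21 and 61 all hash to bucket 1, so they end up in the same chain; likewise 4 and 24 share bucket 4, and 7 and 97 share bucket 7.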
Linear probing
Quadratic probing
Double hashing
5.2.2.1 Linear Probing
Example: Consider a hash table with size = 10. Using linear probing insert the keys 72, 27, 36, 24,
63, 81 and 92 into the table.
Step1: Key = 72
h(72, 0) = (72 mod 10 + 0) mod 10
= (2) mod 10
=2
Since, T[2] is vacant, insert key 72 at this location
Step2: Key = 27
h(27, 0) = (27 mod 10 + 0) mod 10
= (7) mod 10
=7
Since, T[7] is vacant, insert key 27 at this location
Step3: Key = 36
h(36, 0) = (36 mod 10 + 0) mod 10
= (6) mod 10
=6
Since, T[6] is vacant, insert key 36 at this location
Step4: Key = 24
h(24, 0) = (24 mod 10 + 0) mod 10
= (4) mod 10
=4
Since, T[4] is vacant, insert key 24 at this location
Step5: Key = 63
h(63, 0) = (63 mod 10 + 0) mod 10
= (3) mod 10
=3
Since, T[3] is vacant, insert key 63 at this location
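Step6: Key = 81
h(81, 0) = (81 mod 10 + 0) mod 10
= (1) mod 10
=1
Since, T[1] is vacant, insert key 81 at this location
Step7: Key = 92
h(92, 0) = (92 mod 10 + 0) mod 10
= (2) mod 10
=2
Now, T[2] is occupied, so we cannot store the key 92 in T[2]. Therefore, try again for the next
location. Thus probe, i = 1, this time.
Key = 92
h(92, 1) = (92 mod 10 + 1) mod 10
= (3) mod 10
=3
Now, T[3] is occupied, so we cannot store the key 92 in T[3]. Therefore, try again for the next
location. Thus probe, i = 2, this time.
Key = 92
h(92, 2) = (92 mod 10 + 2) mod 10
= (4) mod 10
=4
Now, T[4] is occupied, so we cannot store the key 92 in T[4]. Therefore, try again for the next
location. Thus probe, i = 3, this time.
Key = 92
h(92, 3) = (92 mod 10 + 3) mod 10
= (5) mod 10
=5
Since, T[5] is vacant, insert key 92 at this location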
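The worked example can be checked with a short Python sketch of linear probing, using the probe function h(key, i) = (key mod m + i) mod m from the steps above. The function name linear_probe_insert and the use of None for a vacant slot are assumptions made for illustration.

```python
def linear_probe_insert(table, key):
    """Insert key using h(key, i) = (key mod m + i) mod m; return the slot used."""
    m = len(table)
    for i in range(m):                 # probe number i = 0, 1, 2, ...
        slot = (key % m + i) % m
        if table[slot] is None:        # vacant slot found
            table[slot] = key
            return slot
    raise OverflowError("hash table is full")


table = [None] * 10
slots = [linear_probe_insert(table, k) for k in (72, 27, 36, 24, 63, 81, 92)]
print(slots)
print(table)
```

Key 92 collides at T[2], T[3] and T[4] and is finally stored in T[5], the first vacant slot found by probing.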
5.3 Rehashing: Rehashing is a technique in which the table is resized: a new table of about
double the size is created and the existing elements are reinserted into it. It is preferable for
the table size to be a prime number. There are situations in which rehashing is required.
Now this table is almost full, and if we try to insert more elements, collisions will occur and
eventually further insertions will fail. Hence we rehash by doubling the table size.
The old table size is 10, so doubling it gives 20 for the new table. But 20 is not a prime
number, so we prefer to make the table size 23. The new hash function will be
H(key) = key mod 23
Rehashing Example
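A minimal sketch of rehashing, under these assumptions: the new size is the first prime at or above double the old size (23 for an old size of 10), collisions in the new table are resolved by linear probing, and the sample keys are illustrative. The helper names is_prime, next_prime and rehash are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small table sizes."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime(n):
    """Return the smallest prime greater than or equal to n."""
    while not is_prime(n):
        n += 1
    return n


def rehash(old_table):
    """Reinsert every key into a new table of prime size about twice as large."""
    new_size = next_prime(2 * len(old_table))
    new_table = [None] * new_size
    for key in old_table:
        if key is None:
            continue
        slot = key % new_size              # new hash function: key mod new_size
        while new_table[slot] is not None:
            slot = (slot + 1) % new_size   # assumed: linear probing on collision
        new_table[slot] = key
    return new_table


old_table = [None, 81, 72, 63, 24, 92, 36, 27, None, None]  # illustrative keys
new_table = rehash(old_table)
print(len(new_table))
```

Every key is hashed again with the new function; positions from the old table are not reused.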
5.3 Graph
• A data structure that consists of a set of nodes (vertices) and a set of edges that relate
the nodes to each other
• The set of edges describes relationships among the vertices
• Trees are special cases of graphs.
Definition of graph.
Graph G
Undirected graph- When the edges in a graph have no direction, the graph is called undirected
Directed graph- When the edges in a graph have a direction, the graph is called directed (or
digraph)
Adjacent nodes: two nodes are adjacent if they are connected by an edge
Length of path of graph: The length of a path in a graph is the number of edges in the path
Subgraph: A subgraph of G is a graph G' such that V(G') ⊆ V(G) and E(G') ⊆ E(G).
Complete graph: a graph in which every vertex is directly connected to every other vertex. A
complete graph can be directed or undirected.
Weighted graph: a graph in which each edge carries a cost for traveling between the nodes.
Graph Connectivity: An undirected graph is said to be connected if there is a path between every
pair of nodes. Otherwise, the graph is disconnected
In a directed graph, if there is a path from every vertex to every other vertex, then the graph
is called strongly connected.
If a directed graph is not strongly connected, but the underlying graph (without direction on
the arcs) is connected, then the graph is said to be weakly connected.
Forest in graph: A forest is an acyclic undirected graph (not necessarily connected), i.e., each
connected component is a tree.
Advantage:
– Simple to implement
– Easy and fast to tell if a pair (i,j) is an edge: simply check if A[i][j] is 1 or 0
Disadvantage:
– Even if there are few edges, the matrix takes O(n^2) memory
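The trade-off can be seen in a small adjacency-matrix sketch. Vertices are assumed to be numbered 0 to n-1, and the function name build_adjacency_matrix and the sample edges are illustrative.

```python
def build_adjacency_matrix(n, edges, directed=False):
    """Return an n x n matrix A with A[i][j] = 1 when (i, j) is an edge."""
    matrix = [[0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i][j] = 1
        if not directed:
            matrix[j][i] = 1      # undirected edges are stored symmetrically
    return matrix


A = build_adjacency_matrix(4, [(0, 1), (1, 2), (2, 3)])
has_edge = A[1][2] == 1                   # constant-time edge test
cells = sum(len(row) for row in A)        # n * n cells, however few edges exist
print(has_edge, cells)
```

Here only three edges are stored, yet the matrix still occupies 16 cells.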
Backtracking is a general algorithm for finding all (or some) solutions to a computational
problem: it incrementally builds candidates to the solutions and abandons a candidate as soon
as it cannot be completed to a valid solution. DFS uses a stack data structure.
// VISITED(v) = 1 marks vertex v as discovered
procedure DFS(G, v):
    label v as discovered
    for all edges from v to w in G.adjacentEdges(v) do
        if vertex w is not labeled as discovered then
            recursively call DFS(G, w)
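A runnable Python sketch of the recursive procedure is given here. The graph is assumed to be a dictionary mapping each vertex to its adjacency list, and the sample graph is illustrative; the recursion uses the call stack as the DFS stack.

```python
def dfs(graph, v, discovered=None, order=None):
    """Recursive depth-first search; returns vertices in order of discovery."""
    if discovered is None:
        discovered, order = set(), []
    discovered.add(v)                 # label v as discovered
    order.append(v)
    for w in graph[v]:                # every edge from v to w
        if w not in discovered:
            dfs(graph, w, discovered, order)
    return order


graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}
visit_order = dfs(graph, "A")
print(visit_order)
```

Starting from A, the search goes as deep as possible (A, B, D, C) before backtracking.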
Computing time
If G is represented by adjacency lists, then the vertices w adjacent to v can be
determined by following a chain of links. Since the algorithm DFS examines each
node in the adjacency lists at most once and there are 2e list nodes, the time to complete
the search is O(e).
If G is represented by its adjacency matrix, then the time to determine all vertices adjacent to
v is O(n). Since at most n vertices are visited, the total time is O(n^2).
Output of a depth-first search: The depth first search of a graph outputs a spanning tree of the
vertices reached during the search.
Note: The BFS algorithm uses a queue data structure to store intermediate results as it
traverses the graph
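A minimal Python sketch of breadth-first search using a queue. The adjacency-list dictionary and the sample graph are assumptions for illustration.

```python
from collections import deque


def bfs(graph, start):
    """Breadth-first search; returns vertices in the order they are visited."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        v = queue.popleft()           # take the oldest vertex in the queue
        order.append(v)
        for w in graph[v]:
            if w not in visited:      # each vertex enters the queue only once
                visited.add(w)
                queue.append(w)
    return order


graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}
visit_order = bfs(graph, "A")
print(visit_order)
```

All neighbours of A are visited before any vertex two edges away, giving the order A, B, C, D.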
Computing Time
Each vertex visited gets into the queue exactly once, so the main loop is iterated at most n
times. If an adjacency matrix is used, then the for loop takes O(n) time for each vertex visited.
The total time is, therefore, O(n^2).
If adjacency lists are used, the for loop has a total cost of d1 + … + dn = O(e), where di =
degree(vi). Again, all vertices visited, together with all edges incident to them, form a connected
component of G.
Warshall's algorithm is an efficient method for computing the transitive closure of a relation.
Warshall's algorithm takes as input the matrix M_R representing the relation R and outputs the
matrix M_R* of the relation R*, the transitive closure of R.
Warshall's algorithm determines whether there is a path between any two nodes in the graph.
It does not give the number of paths between two nodes.
Algorithm Warshall(A[1..n, 1..n])
{
    R(0) = A
    for k = 1 to n          // k is the intermediate vertex and must be the outermost loop
    {
        for i = 1 to n
        {
            for j = 1 to n
            {
                R(k)[i,j] = R(k-1)[i,j] or (R(k-1)[i,k] and R(k-1)[k,j])
            }
        }
    }
    return R(n)
}
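A runnable Python sketch of Warshall's algorithm, with the intermediate vertex k in the outermost loop. It updates a single matrix in place rather than keeping separate R(k) matrices, which is a common simplification; the sample adjacency matrix is illustrative.

```python
def warshall(adjacency):
    """Return the transitive closure of a 0/1 adjacency matrix."""
    n = len(adjacency)
    reach = [row[:] for row in adjacency]     # plays the role of R(0) = A
    for k in range(n):                        # allowed intermediate vertex
        for i in range(n):
            for j in range(n):
                # i reaches j directly, or by going through k
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach


A = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
closure = warshall(A)
print(closure)
```

Vertex 0 reaches vertex 2 through vertex 1, so the closure gains the entry [0][2] even though A has no such edge.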
A Minimum Spanning Tree (MST) is a subgraph of an undirected graph such that the
subgraph spans (includes) all nodes, is connected, is acyclic, and has minimum total edge weight.
Note: The minimum spanning tree may not be unique. However, if the weights of all the edges are
pairwise distinct, it is indeed unique.
There are two popular techniques for constructing a minimum cost spanning tree.
Prim’s algorithm
Kruskal’s algorithm
Prim’s Algorithm.
Prim's algorithm for finding an MST is a greedy algorithm.
• Start by picking any vertex r to be the root of the tree.
• While the tree does not contain all vertices in the graph, find the minimum-weight edge
leaving the tree and add it to the tree.
Algorithm Complexity: The running time is O(|V|^2) without heaps, which is optimal for dense
graphs, and O(|E| log |V|) using binary heaps, which is good for sparse graphs.
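A minimal sketch of the binary-heap version of Prim's algorithm, using Python's heapq module. The graph is assumed to be connected and given as a dictionary mapping each vertex to a list of (neighbour, weight) pairs; the function name prim_mst and the sample graph are illustrative.

```python
import heapq


def prim_mst(graph, root):
    """Grow a minimum spanning tree from root; returns a list of (u, v, weight)."""
    in_tree = {root}
    mst_edges = []
    heap = [(w, root, v) for v, w in graph[root]]   # edges leaving the tree
    heapq.heapify(heap)
    while heap and len(in_tree) < len(graph):
        weight, u, v = heapq.heappop(heap)          # cheapest candidate edge
        if v in in_tree:
            continue                                # edge no longer leaves the tree
        in_tree.add(v)
        mst_edges.append((u, v, weight))
        for nxt, w in graph[v]:
            if nxt not in in_tree:
                heapq.heappush(heap, (w, v, nxt))
    return mst_edges


graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("A", 1), ("C", 2), ("D", 5)],
    "C": [("A", 4), ("B", 2), ("D", 3)],
    "D": [("B", 5), ("C", 3)],
}
mst = prim_mst(graph, "A")
total_weight = sum(weight for _, _, weight in mst)
print(mst, total_weight)
```

The heap may hold stale edges whose far endpoint has already joined the tree; they are skipped when popped rather than removed eagerly.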
A directed graph that contains a cycle cannot be topologically sorted. The figure below shows
an example of a cyclic graph on which topological sorting cannot be performed.
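One way to perform a topological sort and detect the cyclic case is Kahn's algorithm, which repeatedly removes vertices of in-degree zero. The notes do not name a specific method, so this choice is an assumption; the graph is assumed to be a dictionary in which every vertex appears as a key, and the sample graphs are illustrative.

```python
from collections import deque


def topological_sort(graph):
    """Kahn's algorithm; raises ValueError if the directed graph has a cycle."""
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indegree[w] += 1
    queue = deque(v for v in graph if indegree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:            # removing v lowers its successors' in-degree
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    if len(order) != len(graph):      # some vertices never reached in-degree 0
        raise ValueError("graph contains a cycle; no topological order exists")
    return order


dag = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
order = topological_sort(dag)
print(order)
```

For a graph with a cycle, such as A -> B -> A, no vertex ever reaches in-degree zero and the function raises ValueError.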
1. Define Hashing.
Hashing is the transformation of a string of characters into a usually shorter fixed-length value
or key that represents the original string. Hashing is used to index and retrieve items in a
database because it is faster to find the item using the short hashed key than to find it using
the original value.
11. What are the types of collision resolution strategies in open addressing?
• Linear probing
• Quadratic probing
• Double hashing
22. Classify the hashing functions based on the various methods by which the key value is
found.
Direct method,
Subtraction method,
Modulo-Division method,
Digit-Extraction method,
Mid-Square method,
Folding method,
Pseudo-random method.
35. What is a minimum spanning tree and list two algorithms to find minimum spanning
tree?
A minimum spanning tree of an undirected graph G is a tree formed from graph
edges that connects all the vertices of G at the lowest total cost.
Two algorithms to find minimum spanning tree
Kruskal's algorithm
Prim's algorithm
36. Define graph traversals and write two graph traversal techniques
Traversing a graph is an efficient way to visit each vertex and edge exactly once.
The two graph traversal techniques are
DFS
BFS
37. List the two important key points of depth first search.
i) If path exists from one node to another node, walk across the edge – exploring
the edge.
ii) If path does not exist from one specific node to any other node, return to the previous
node where we have been before – backtracking.
39. Differentiate BFS and DFS. (November 2015)
Assignment Questions
1. Given input {4371, 1323, 6173, 4199, 4344, 9679, 1989} and a hash function h(x) = x mod 10,
show the resulting
a. separate chaining hash table
b. hash table using linear probing
c. hash table using quadratic probing
d. hash table with second hash function h2(x) = 7 − (x mod 7)
What are the advantages and disadvantages of the various collision resolution strategies?
2. a. Find a minimum spanning tree for the graph in Figure using both Prim's and Kruskal's
algorithms. b. Is this minimum spanning tree unique? Why?
4. Perform the BFS and DFS graph traversal on the following graph
5. Consider the directed acyclic graph D given in Figure. Sort the nodes of D by applying
topological sort on D.