What is Hashing ? - data structure tutorial



Hashing is an important topic of Data Structures which confuses students a little bit but let me tell you that it is one of the easiest topics of Data Structures. I will explain you hashing in a very easy way so stay tuned (What is Hashing ? - data structure tutorial).

 What is Hashing ?

Hashing is a process of mapping/storing large amount of data into small tables with the help of some algorithms called Hashing Functions.
Arrays are used in form of tables to store data.
In hashing we convert data or key values into a range of indexes of an array.
Hashing makes data insertion and retrieval in constant time.

What is Hash Function ?

Hash Function is a function which converts data or key values into hash keys. This function takes a value and maps it to a value of a certain length which is called Hash Value.

What is Hashing ? - data structure tutorial


The figure shown in the above picture is a Hash Table of size 10. Each block in the hash table is called a Slot. The hash function takes the data the maps them to any one of the locations 0 to 9.


Let's say we have some data values {26,70,18,31,54,93} that we want to store in the hash table, then how can we store them ? We have several methods to determine hash key values.

1. Modulo Division Method

Hash key = Hash Value % Total num. of slots in hash table

Now let's calculate hash key for each hash value using modulo division hashing:

Hash Value                         Value % num. of slots                              Hash Key
 26                                               26%10                                                  6
 70                                               70%10                                                  0
 18                                               18%10                                                  8
 31                                               31%10                                                  1
 54                                               54%10                                                  4
 93                                               93%10                                                  3

What is Hashing ? - data structure tutorial




The above picture shows which hash value goes to which index.
Observe that only 6 slots have been assigned out of 10 slots. This is refereed as load factor. In this case load factor is 6/10.
                                  load factor= num of items / table size
 If table size is a prime number then chances of collision is very less. 

Collision in Hashing
Collision means when 2 key values corresponds to same hash value.
eg:
   Our hash table has 10 slots and has entries as follows:
    10   _    32   23   _   55  _   17   _  99
   Then another value comes x=22. So,
     22%10 = 2
   but 2th index is already occupied by a hash value 32. This is collision.


*** When a number is divided by a Prime number then it results in unique values.

2. Direct Method

  In this we directly apply a hash function on a value which gives us the corresponding key.

What is Hashing ? - data structure tutorial


As you can see hash function are applied to key values which corresponds to random address in the memory.
Chances of collision is very less in this method.
This method is not suitable for large number of key values.


3. Middle Square Method

In this we square the hash value and then select the key from middle of the squared hash value.
What is Hashing ? - data structure tutorial


eg:  (15)^2 = 225 
           middle value of 225 is 2 so, 15 goes to slot 2.
In middle square method there is non uniform distribution of keys.


Techniques to avoid collision in Hashing:

1. Linear Probing

Let say, our hash table has 10 slots and has entries as follows:
    10   _    32   23   _   55  _   17   _  99
   Then another value comes x=22. So,
     22%10 = 2
   but 2th index is already occupied by a hash value 32.

 In linear probing we will move 1 slot ahead and check until we get an empty slot.
                                    Hash key = (Hash Value % Total num. of slots in hash table)+n
where n=1,2,....size_of_table
   22 will go to slot 2 but it is already occupied by 32, so
                                  hask key= (22%10) +1 = 3
   The 3rd slot has value 23, so
                                  hask key= (22%10) +2 = 4
   The 4th slot is empty so 22 will go into that slot.

2. Quadratic Probing

Let say, our hash table has 10 slots and has entries as follows:
    10   _    32   23   _   55  _   17   _  99
   Then another value comes x=22. So,
     22%10 = 2
   but 2th index is already occupied by a hash value 32.

Quadratic probing is similar to linear probing. The difference is that if you try to insert hash value then first you will add 1^2 ,then 2^2, then 3^2 and so on.
                      Hash key = (Hash Value % Total num. of slots in hash table)+n^2  
where n=1,2,3.....size_of_table
   22 will go to slot 2 but it is already occupied by 32 so
                                hash key= (22%10) + 1^2 = 3rd index
   22 will go to slot 3 but it is also occupied by 23 so
                                hash key= (22%10) + 2^2 = 6th index
   22 will go to slot 6 and it is empty so 22 will be assigned to slot 6.

If you have any queries about What is Hashing ? - data structure tutorial please mention it in the comment section below.


1 Comments

  1. Sir you made my day! It's the best article about hashing i've seen, it helped me a lot, thank you!!!

    ReplyDelete
Post a Comment
Previous Post Next Post