Introduction to Hashing Function in Java
Hashing function in Java was created as a solution to define & return the value of an object in the form of an integer, and this return value obtained as an output from the hashing function is called as a Hash value. Every Hashing function returns an integer of 4 bytes as a return value for the object. Any two objects of the same type will have the same integer value as Hashing function’s output, and similarly, different objects will have different hash values. One cannot derive the objects from the hash value, and this makes the hashing function an irreversible function.
What is a Hashing Function?
A Hash Function can be defined as a function that returns an integer value corresponding to an Object. Hash Function always returns the same integer value for the same object. The integer value returned by the hash function is called Hash Value. Following are the important points regarding the Hash function:
- Always returns an integer (4 bytes) for an Object.
- We cannot compute object state from the hash value that is hash functions are irreversible in nature.
- Two equal objects will have the same hash value.
- Two unequal objects do not always have different Hash values.
Applications of Hash Function
Here are the common applications of hash functions:
1. Data Structures
Almost every programming language contains hash-based data structures. For example, java contains a Hash table, Hash Map, Hash Set, Tree Set data structures which are based on the Hash function. These data structures are the Key-Value design, where each key is unique, whereas the same value can exist for multiple keys.
2. Message Digest
This algorithm is used in a data integrity check. This algorithm takes a message of any length as input and produces a fixed-length (128-bit) data as output. Examples of the message-digest algorithms include MD2, MD4, MD5, and MD6.
3. Secure Hash Algorithm
This algorithm is used for data security and is used in applications and protocols like Secure Socket Layer (SSL). SHA-0, SHA-1, SHA-2, and SHA-3 are common categories of the Secure Hash Algorithm.
4. Password Verification and Storage
Let’s consider a login scenario in which when a password is entered to authenticate a user, a hash value of the entered password is computed and is sent over the network to the server where the hash of the original is stored. This is done to ensure that no sniffing is done when a password is sent from the client to the server.
5. Compiler Operation
Since different keywords are used in a programming language, in order to differentiate between these keywords and identifiers, the compiler uses a hash set that is implemented using a hash table to store all these keywords and identifiers.
6. Rabin- Karp Algorithm
It is a searching algorithm that makes use of hashing to search for one or more patterns in a given string. It is one of the most commonly used Algorithms.
7. Comparable and Comparator Interfaces
These interfaces contain functions that are used to compare two objects at a time. The return value of these functions may be negative, zero or positive based on whether a given object is less than, equal to, or greater than the object we are comparing to. Internally comparator and comparable interfaces use a hash function to compare objects from one another.
8. Priority Queue
The priority queue is unlike a normal queue that follows FIFO (First in First out) order. In priority queue elements are arranged in custom order based on their priority, which is internally implemented using comparable and comparator which interns are based on Hash Functions.
Designing Hash Functions
Here are some general design principles for creating hash functions:
- A hash function should be efficiently evaluated.
- Hash values computed from hash functions should be uniformly distributed; this helps to avoid collisions.
- The Java programming language provides a general hashing function with the hashCode () method in the Object superclass.
public int hashCode (){
//Logic goes here
}
Hash Collision in Java
A hash collision occurs when two or more objects return the same hash value. Let us take an example of a java hash map that stores data in key-value pairs. When we put an object in a hash map, the key’s hash value is computed and based on this hash value bucket location to store the value object is found. Objects having different hash values must go into different buckets. When two or more objects have the same hash value, they are stored in the same bucket location using an additional data structure called a linked list. All objects having the same hash value are chained together using a linked list. This mechanism is called chaining. Following are the ways to handle collisions is a hash function:
- Chaining: As already covered, the idea behind chaining is to create a linked list of objects having the same hash value. Chaining is a simple technique but requires additional memory overhead.
- Open Addressing: In this technique, all the elements are stored in a hash table in which each entry contains either a record or NULL. When an element is searched, each entry in the hash table is searched for the desired record until the required record is found, or it is concluded that the record does not exist in the table.
Advantages of Hashing
The following are the advantages of hashing:
- Compare the contents of two files easily and efficiently, without opening them.
- Hash functions are used in checking the integrity of a file.
- With the help of hashing, search operation in data structures has become faster.
- Hash functions play a vital role in data security as most security algorithms and protocols use hashing.
- Hashing converts data into a shorter fixed-length value or key, which represents the original string that can be sent over the network.
Disadvantages of Hashing
Apart from advantages, there are also some limitations of hashing:
- Hashing cannot be implemented to sort data.
- The hash collision cannot be practically avoided, which in turn leads to inefficiency.
Recommended Articles
This is a guide to Hashing Function in Java. Here we discuss applications of hash function along with advantages and disadvantages. You may also look at the following articles to learn more –