Hash table data structure (aka dictionary, hash map, associate array) is a key-value pairs mapping backed by a resizeable array data structure
Attributes
- Hash table uses a load factor to decide when to grow the internal array capacity
- A hash function is used to generate a hash code and array index to properly distribute entries across the array buckets
- Hash codes generated by the hash function may be duplicated. Separate chaining through a linked list or red-black tree data structure is the most common solution to resolve the problem
Public operations
- Search, aka Get: retrieve the value by key
- Add, aka Put: add a key-value entry into the hash table
- Delete, aka Remove: remove an entry from the hash table by key
Internal operations
- Hashing: properly distribute entries across the hash table
- Resize: auto grow hash table capacity
- Rehash: re-execute hashing operation for every entry in the buckets
Time complexity
- Hash table offers constant-time O(1) on average and linear-time O(n) in worst case performance for search, add and delete operations
Let’s walk through this tutorial to explore them in more details
The load factor
Hash table is backed by an array and it uses the load factor to decide when to grow the array capacity
Load factor is defined as below
As a general rule, loadFactor = 0.75
offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup costs and vice versa
Hashing operation
Hashing is the hash table internal operation which distributes entries across an array
Hash table uses a hash function to compute the key object to a hash code number
Hash code is reduced to array index via modulo operator %
Then the computed index is used to assign value object to the array