
1. Setup
From previous post: Hash Tables - Part 1
thesaurus_arr = [None]*16 # create empty thesaurus
letters_dict = {chr(i): i - 96 for i in range(97, 97 + 26)}
print(f"letters_dict: \n{letters_dict}")
letters_dict:
{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6, 'g': 7, 'h': 8, 'i': 9, 'j': 10, 'k': 11, 'l': 12, 'm': 13, 'n': 14, 'o': 15, 'p': 16, 'q': 17, 'r': 18, 's': 19, 't': 20, 'u': 21, 'v': 22, 'w': 23, 'x': 24, 'y': 25, 'z': 26}
def alphabet_multiplicative_hash_fn(input_word: str):
# convert some string into hash_value by multiplying all letters value
hash_value = 1
for character in input_word:
# dict_val = dict[char]
letter_val = letters_dict[character]
hash_value *= letter_val # cumulative product: char1 * char2 ...
return hash_value
PretendDictMemoryList = [None]*16
print(f"PretendDictMemoryList: {PretendDictMemoryList}")
PretendDictMemoryList: [None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]
print(f"Test getting 'd': {letters_dict['d']}") # 4
print(f"Test hash_fn('bad'): {alphabet_multiplicative_hash_fn('bad')}") # 2*1*4 = 8
print(f"Test hash_fn('cab'): {alphabet_multiplicative_hash_fn('cab')}") # 3*1*2 = 6
entry_pairs_list = [("bad","evil"),("cab","taxi"),("aaa","best")]
entry_pairs_list
for pair in entry_pairs_list:
hash_value = alphabet_multiplicative_hash_fn(pair[0])
PretendDictMemoryList[hash_value-1]=pair[1]
# print(hash_value, PretendDictMemoryList)
thesaurus_app = PretendDictMemoryList
print(f"\nThesaurus app:\n{thesaurus_app}")
Test getting 'd': 4
Test hash_fn('bad'): 8
Test hash_fn('cab'): 6
Thesaurus app:
['best', None, None, None, None, 'taxi', None, 'evil', None, None, None, None, None, None, None, None]
2. Collisions (same hash values)
2.1 New Entry with Same Hash Value
Notice new entry ["dab","tap"]
will result in a hash_value
of 8
:
print(f"Test hash_fn('dab'): {alphabet_multiplicative_hash_fn('dab')}") # 4*1*2 = 8
2.2 A Collision Has Occured 💥!
Index 8
in memory already exists from a previous entry:
Index 8
will be over-ridden by the new entry:
- from
["bad","evil"]
- to
["dab","tap"]
The current approach is sub-optimal.
entry_pairs_list = [("dab","tap")]
for pair in entry_pairs_list:
hash_value = alphabet_multiplicative_hash_fn(pair[0])
PretendDictMemoryList[hash_value-1]=pair[1]
thesaurus_app = PretendDictMemoryList
print(f"\nThesaurus app:\n{thesaurus_app}")
Thesaurus app:
['best', None, None, None, None, 'taxi', None, 'tap', None, None, None, None, None, None, None, None]
2.3 Updated Solution: Sub-Arrays
Create an array with sub-arrays at the index
For the example: * create an array within index 8, * with each item of the array, * represented by the [input_word,synonym]
2.4 Search Time-Complexity: Previous versus Updated Solution
Previous Solution:
PretendDictMemoryList[8] = "taxi"
Updated Solution:
PretendDictMemoryList[8] = [["cab","taxi"],["dab","tap"]]
- Search \(O(1)\) to get value at index (or
hash_value
)
- Then, search array \(O(n)\) to find our
cab
or dab
key
- Thus, Total Search \(O(n+1) \to O(n)\)