Optimizing Pair Search in Lists: A Hash Map Approach

Introduction

When working with lists in Python, a common problem is finding two numbers that sum to a given target. A naive approach uses nested loops to check every pair, which takes O(n^2) time. By using a hash map (a dictionary in Python), we can reduce this to O(n) on average.

In this post, we will explore an efficient solution using dictionaries and the enumerate() function.

Problem Statement

Given a list of integers and a target sum, find the indices of two numbers in the list that add up to the target.

Example:

arr = [1, 4, 6, 7]
target = 10

Expected Output:

indices are 2 1

Explanation: arr[2] + arr[1] = 6 + 4 = 10.

Optimized Approach: Using a Dictionary

We can solve this problem efficiently using a dictionary (hash map) to store previously visited elements. The idea is:

Iterate through the list.
Compute the complement (the value needed to reach the target).
Check if the complement exists in the dictionary.
If it does, print the current index and the stored index of the complement.
Store the current element and its index in the dictionary, so that later elements can match against it.

Code Implementation

arr = [1, 4, 6, 7]
target = 10
num_dict = {}  # Maps each value seen so far to its index

for index, num in enumerate(arr):
    complement = target - num  # Value needed to reach the target
    if complement in num_dict:
        print('indices are', index, num_dict[complement])
    num_dict[num] = index  # Store the index of the current number

Explanation

We initialize an empty dictionary, num_dict.
As we iterate, we compute the complement (target - num).
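If this complement is already in num_dict, we print the index of the current number followed by the stored index of the complement.
In every iteration, we then store the current number in num_dict with its index. Because the lookup happens before the store, an element is never paired with itself.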
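The same logic can be wrapped in a function that returns the indices instead of printing them. This is a minimal sketch, not part of the original code: the name find_pair and the choice to return None when no pair exists are assumptions made for illustration.

```python
def find_pair(arr, target):
    """Return (earlier_index, later_index) of the first pair found, or None."""
    seen = {}  # Maps each value seen so far to its index
    for index, num in enumerate(arr):
        complement = target - num
        if complement in seen:
            # Stop at the first match instead of scanning the rest of the list
            return seen[complement], index
        seen[num] = index
    return None


print(find_pair([1, 4, 6, 7], 10))  # (1, 2)
print(find_pair([1, 2], 10))        # None
```

Returning a value makes the result easy to test and reuse, and the early return stops the scan as soon as one pair is found.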

Why is This Efficient?

Time Complexity: O(n) on average, as each element is processed once and each dictionary lookup or insertion takes O(1) on average.
Space Complexity: O(n), due to storing elements in the dictionary.
Avoids Nested Loops: A brute-force approach would take O(n^2), making this method significantly faster.
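For comparison, the brute-force approach might look like the sketch below. The function name find_pair_brute_force is hypothetical; it checks every pair of positions, which is where the O(n^2) cost comes from.

```python
def find_pair_brute_force(arr, target):
    """Check every pair (i, j) with i < j; return the first match or None."""
    for i in range(len(arr)):
        # The inner loop starts at i + 1 so that no element is paired with itself
        for j in range(i + 1, len(arr)):
            if arr[i] + arr[j] == target:
                return i, j
    return None


print(find_pair_brute_force([1, 4, 6, 7], 10))  # (1, 2)
```

For a list of n elements the inner comparison runs n * (n - 1) / 2 times in the worst case, while the dictionary version performs one lookup and one insertion per element.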

Understanding Dictionary Key Lookup Complexity

A dictionary (dict) in Python is implemented as a hash table (or hash map). This provides O(1) average time complexity for key lookups, insertions, and deletions.

Dictionary Lookup Mechanism

When checking if a key exists using if key in dict: or retrieving dict[key], Python performs:

Hashing the Key: The key is passed through a hash function (hash(key)) to compute a hash value.
Index Calculation: The hash value determines the index in an internal array.
Direct Access: If no collision occurs, the key-value pair is found instantly in O(1) time.
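The hashing and index-calculation steps can be illustrated with a toy model. This is a simplification, not CPython's actual implementation: the function toy_slot and the table size of 8 are assumptions chosen for the example, and the real dictionary derives slots differently and handles collisions by probing.

```python
def toy_slot(key, table_size=8):
    """Map a key to a slot in a hypothetical fixed-size table."""
    hash_value = hash(key)          # Step 1: hash the key
    return hash_value % table_size  # Step 2: reduce the hash to a table index


# The same key always maps to the same slot within one run of the program
print(toy_slot(42) == toy_slot(42))  # True

# Keys that compare equal must have equal hashes, so they share a slot
print(hash(1) == hash(1.0))  # True
```

Because the slot is computed directly from the key, a lookup does not need to scan the other entries, which is what makes the average cost independent of the dictionary's size.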

When Does Dictionary Lookup Become O(n)?

Though dictionary key lookup is O(1) on average, it can degrade to O(n) in the worst case due to hash collisions.

Hash Collisions: If multiple keys map to the same slot, CPython resolves the collision with open addressing: it probes a sequence of other slots rather than chaining keys in a linked list. A lookup must compare against each colliding key it meets along that probe sequence.
Resizing: As the dictionary fills up, Python allocates a larger table and reinserts the existing entries. This makes the occasional insertion O(n), although the cost averages out to O(1) per insertion. Lookups themselves do not trigger a resize.
Worst-Case Scenario: If all keys have the same hash value due to poor hashing, searching for a key may require comparing against all n keys, making it O(n).

However, Python's hashing function and dynamic resizing minimize collisions, ensuring lookups remain O(1) in most cases.
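The worst case can be provoked on purpose. The sketch below uses a hypothetical key class, CollidingKey, whose hash is deliberately constant, and counts how many equality comparisons a single failed lookup needs. It is meant only as an illustration; well-behaved key types such as int and str do not collide like this in practice.

```python
class CollidingKey:
    """Hypothetical key type: every instance has the same hash value."""
    comparisons = 0  # Class-level counter of __eq__ calls

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 1  # Deliberately poor: all keys collide

    def __eq__(self, other):
        CollidingKey.comparisons += 1
        return isinstance(other, CollidingKey) and self.value == other.value


def comparisons_for_missing_lookup(n):
    """Build a dict of n colliding keys, then count comparisons for one lookup."""
    table = {CollidingKey(i): i for i in range(n)}
    CollidingKey.comparisons = 0  # Ignore comparisons made while building
    _ = CollidingKey(-1) in table  # Look up a key that is not present
    return CollidingKey.comparisons


print(comparisons_for_missing_lookup(20))
print(comparisons_for_missing_lookup(200))
```

The second count should be noticeably larger than the first, since the lookup has to be compared against the colliding keys before it can conclude that the key is absent.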

Edge Cases to Consider

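No valid pairs: If no two numbers sum to the target, the loop finishes without printing anything.
Multiple pairs exist: Because the loop has no break, the code prints a pair for every element whose complement has already been seen. Add a break after the print to stop at the first valid pair.
Duplicate numbers: A pair of equal values is found correctly, for example [5, 5] with a target of 10, because the lookup happens before the current element is stored. However, the dictionary keeps only the most recent index for each value, so earlier indices of a repeated value are overwritten.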
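These cases are easier to see when the pairs are collected rather than printed. The following is a sketch under the same logic as the code above; the function name find_pairs is hypothetical, and it reports at most one pair per element, so it does not enumerate every combination when values repeat.

```python
def find_pairs(arr, target):
    """Return (earlier_index, later_index) for each element whose complement was seen."""
    seen = {}  # Maps each value to the most recent index where it appeared
    pairs = []
    for index, num in enumerate(arr):
        complement = target - num
        if complement in seen:
            pairs.append((seen[complement], index))
        seen[num] = index  # Overwrites any earlier index for a repeated value
    return pairs


print(find_pairs([1, 2, 3], 100))   # [] (no valid pairs)
print(find_pairs([1, 9, 2, 8], 10)) # [(0, 1), (2, 3)] (multiple pairs)
print(find_pairs([5, 5, 5], 10))    # [(0, 1), (1, 2)] (duplicates)
```

In the last call the pair (0, 2) is missing: by the time index 2 is processed, the stored index for 5 has already been overwritten with 1.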

Conclusion

Using dictionaries allows us to find pairs in O(n) time, making this approach highly efficient for large datasets. By leveraging Python's built-in enumerate() function, we also make the code more readable and concise.

mlTutor