Rabin–Karp Algorithm: Efficient String Searching with Hashing

📘 Introduction

String matching is one of the most fundamental problems in computer science, with applications ranging from search engines to DNA sequencing. While brute-force approaches can be slow, the Rabin–Karp Algorithm introduces a clever use of hashing to make pattern searching more efficient.

🧠 The Idea Behind Rabin–Karp Algorithm

Instead of comparing the pattern with every substring character by character, Rabin–Karp converts both the pattern and substrings into hash values.

If the hash values match, only then a direct comparison is performed to confirm.
This reduces unnecessary comparisons and speeds up the search process.

Think of it like checking the fingerprint of text chunks before doing a full identity check.

⚙️ Algorithmic Steps

Here’s how Rabin–Karp Algorithm works step-by-step:

1.Compute the hash of the pattern.

2.Compute the hash of the first substring of the text (same length as the pattern).

3.Slide the window one character at a time:

Update the hash efficiently (rolling hash).
Compare with the pattern’s hash.
If hashes match, verify with direct comparison.

4.Continue until the end of the text.

💻 Rabin–Karp Algorithm Implementation

def rabin_karp(text, pattern, prime=101):
    """
    Rabin-Karp Algorithm for string matching.
    text: The main string where we search.
    pattern: The substring we want to find.
    prime: A prime number used for hashing (to reduce collisions).
    """

    m, n = len(pattern), len(text)   # m = length of pattern, n = length of text
    pattern_hash = 0                 # hash value for the pattern
    text_hash = 0                    # hash value for current window of text
    h = 1                            # multiplier for rolling hash

    d = 256  # number of possible characters (ASCII range)

    # Precompute h = pow(d, m-1) % prime
    # This helps in removing the leading character during rolling hash
    for i in range(m-1):
        h = (h * d) % prime

    # Initial hash values for pattern and first window of text
    for i in range(m):
        pattern_hash = (d * pattern_hash + ord(pattern[i])) % prime
        text_hash = (d * text_hash + ord(text[i])) % prime

    # Slide the pattern over text one by one
    for i in range(n - m + 1):
        # If hash values match, check characters one by one
        if pattern_hash == text_hash:
            if text[i:i+m] == pattern:
                print(f"Pattern found at index {i}")

        # Calculate hash for next window (rolling hash)
        if i < n - m:
            text_hash = (d * (text_hash - ord(text[i]) * h) + ord(text[i+m])) % prime

            # Python can give negative values, so adjust
            if text_hash < 0:
                text_hash += prime


# Example
text1 = "ABCCDDAEFG"
pattern1 = "CDD"

rabin_karp(text1, pattern1)

# Output
# Pattern found at index 3

📊 Time & Space Complexity

Best Case: O(n+m) (when hash mismatches are frequent)
Worst Case: O(n⋅m) (when many hash collisions occur)
Average Case: O(n+m)
Space Complexity: O(1) (only a few extra variables for hashing)

⚖️ Strengths & Weaknesses

✅ Strengths:

Efficient for multiple pattern searches.
Uses hashing to skip unnecessary comparisons.
Simple and elegant rolling hash mechanism.

❌ Weaknesses:

Hash collisions can degrade performance.
Requires careful choice of hash function and prime number.

🔑 Key Takeaways

Rabin–Karp is a hash-based string matching algorithm that shines when searching for multiple patterns.
It balances efficiency with simplicity, though hash collisions can be a drawback.
Understanding Rabin–Karp builds strong intuition for how hashing can optimize algorithms beyond just data storage.

Rabin–Karp Algorithm: Efficient String Searching with Hashing

Dharmendra Sharma

📘 Introduction

🧠 The Idea Behind Rabin–Karp Algorithm

⚙️ Algorithmic Steps

💻 Rabin–Karp Algorithm Implementation

📊 Time & Space Complexity

⚖️ Strengths & Weaknesses

🔑 Key Takeaways

More articles by Dharmendra Sharma

Explore content categories

📘 Introduction

🧠 The Idea Behind Rabin–Karp Algorithm

⚙️ Algorithmic Steps

💻 Rabin–Karp Algorithm Implementation

📊 Time & Space Complexity

⚖️ Strengths & Weaknesses

🔑 Key Takeaways

More articles by Dharmendra Sharma

Tower of Hanoi Algorithm: A Classic Puzzle with Timeless Lessons

Databases Aren’t Just Storage — They’re Architecture

Aho–Corasick Algorithm: Multi‑Pattern Matching Made Simple

Load Balancers – The Backbone of Scalable Systems

Manacher’s Algorithm: Cracking Palindromes in Linear Time

CAP Theorem: Where Engineering Meets Reality

Boyer–Moore Algorithm: The Algorithm That Skips Ahead

Latency vs Throughput: Speed vs Capacity Explained

Z-Algorithm: Fast String Matching for Modern Problems

Availability: The Backbone of Reliable Systems

Explore content categories