Boyer–Moore Algorithm: The Algorithm That Skips Ahead

Dharmendra Sharma

Published Jan 31, 2026

📘 Introduction

In the world of computer science, efficient string searching is a cornerstone of text processing, compilers, and even everyday applications like search engines or word processors. Among the many algorithms designed for this purpose, the Boyer–Moore Algorithm, introduced by Robert S. Boyer and J Strother Moore in 1977, stands out as one of the most efficient and widely adopted

🧠 The Idea Behind Boyer–Moore Algorithm

Unlike the naive approach that checks every character sequentially, Boyer–Moore leverages two powerful heuristics:

Bad Character Rule: If a mismatch occurs, the algorithm shifts the pattern to align with the last occurrence of the mismatched character in the pattern.
Good Suffix Rule: If a suffix of the pattern matches but a mismatch occurs earlier, the algorithm shifts the pattern to align with another occurrence of that suffix.

Together, these heuristics allow the algorithm to skip large portions of text, making it significantly faster in practice than many alternatives.

⚙️ Algorithmic Steps

Here’s how Boyer–Moore Algorithm works step-by-step:

Preprocess the pattern to build bad character and good suffix tables.
Align the pattern with the text and compare characters from right to left.
On mismatch, use the precomputed tables to decide how far to shift the pattern.
Repeat until the pattern is found or the text is exhausted.

💻 Boyer–Moore Algorithm Implementation

# Boyer-Moore String Search Algorithm Implementation 

# Step 1: Preprocess for Bad Character Rule
def bad_char_heuristic(pattern):
    """
    Creates a dictionary storing the last occurrence of each character in the pattern.
    Used for the Bad Character Rule.
    """
    bad_char = {}
    for i in range(len(pattern)):
        bad_char[pattern[i]] = i
    return bad_char


# Step 2: Boyer-Moore Search Function
def boyer_moore_search(text, pattern):
    """
    Searches for occurrences of 'pattern' in 'text' using Boyer-Moore algorithm.
    Returns a list of starting indices where pattern is found.
    """
    m = len(pattern)
    n = len(text)
    bad_char = bad_char_heuristic(pattern)  # Preprocess pattern

    result = []  # Stores indices of matches
    s = 0  # Shift of the pattern with respect to text

    while s <= n - m:
        j = m - 1

        # Compare pattern with text from right to left
        while j >= 0 and pattern[j] == text[s + j]:
            j -= 1

        # If pattern is found
        if j < 0:
            result.append(s)
            # Shift pattern so next character in text aligns with last occurrence in pattern
            s += (m - bad_char.get(text[s + m], -1)) if s + m < n else 1
        else:
            # Shift pattern using Bad Character Rule
            s += max(1, j - bad_char.get(text[s + j], -1))

    return result


# Step 3: Example
text = "ABAAABCDABCABCDABCD"
pattern = "ABCD"

matches = boyer_moore_search(text, pattern)

if matches:
     print("Pattern found at indices:", matches)
else:
     print("Pattern not found.")

# Output
# Pattern found at indices: [4, 11, 15]

📊 Time & Space Complexity

Best Case: O(n/m) — when mismatches allow large jumps.
Average Case: Sublinear, often faster than other algorithms in practice.
Worst Case: O(mn), though rare.
Space Complexity: O(k+m), where k is the alphabet size and m is the pattern length

⚖️ Strengths & Weaknesses

✅ Strengths:

Efficient in practice, especially with large alphabets.
Skips unnecessary comparisons using heuristics.
Widely used in text editors, search engines, and bioinformatics.

❌ Weaknesses:

Preprocessing overhead for small patterns.
Worst-case performance can degrade to O(mn).
More complex to implement compared to simpler algorithms like Naive or Rabin–Karp.

🔑 Key Takeaways

Boyer–Moore is a landmark algorithm in string searching, combining clever heuristics to achieve practical efficiency.
It’s best suited for applications where large texts and complex patterns are common.
While not always optimal in worst-case scenarios, its average-case performance makes it a go-to choice in real-world systems.

To view or add a comment, sign in

Boyer–Moore Algorithm: The Algorithm That Skips Ahead

Dharmendra Sharma

📘 Introduction

🧠 The Idea Behind Boyer–Moore Algorithm

⚙️ Algorithmic Steps

Recommended by LinkedIn

💻 Boyer–Moore Algorithm Implementation

📊 Time & Space Complexity

⚖️ Strengths & Weaknesses

✅ Strengths:

❌ Weaknesses:

🔑 Key Takeaways

More articles by Dharmendra Sharma

Others also viewed

DAA + DSA: A Unified Framework to Master Algorithmic Thinking

Dimensionality Reduction with t-SNE: A Mathematical and Python Approach

KISSing Algorithms: Graph Matching

Limitations of the Linear Kalman Filter

AdS/CFT String Theory Correspondence and Boltzmann Machines-2

The math of similarity - Cohesion

Fine-Tuning LLaMA2 with Alpaca Dataset Using Alpaca-LoRA

CLIP by OpenAI — by first running the colab

Executing command line binaries in Node.js

Explore content categories

📘 Introduction

🧠 The Idea Behind Boyer–Moore Algorithm

⚙️ Algorithmic Steps

Recommended by LinkedIn

💻 Boyer–Moore Algorithm Implementation

📊 Time & Space Complexity

⚖️ Strengths & Weaknesses

✅ Strengths:

❌ Weaknesses:

🔑 Key Takeaways

More articles by Dharmendra Sharma

Tower of Hanoi Algorithm: A Classic Puzzle with Timeless Lessons

Databases Aren’t Just Storage — They’re Architecture

Aho–Corasick Algorithm: Multi‑Pattern Matching Made Simple

Load Balancers – The Backbone of Scalable Systems

Manacher’s Algorithm: Cracking Palindromes in Linear Time

CAP Theorem: Where Engineering Meets Reality

Latency vs Throughput: Speed vs Capacity Explained

Z-Algorithm: Fast String Matching for Modern Problems

Availability: The Backbone of Reliable Systems

Rabin–Karp Algorithm: Efficient String Searching with Hashing

Others also viewed

DAA + DSA: A Unified Framework to Master Algorithmic Thinking

Dimensionality Reduction with t-SNE: A Mathematical and Python Approach

KISSing Algorithms: Graph Matching

Limitations of the Linear Kalman Filter

AdS/CFT String Theory Correspondence and Boltzmann Machines-2

The math of similarity - Cohesion

Fine-Tuning LLaMA2 with Alpaca Dataset Using Alpaca-LoRA

CLIP by OpenAI — by first running the colab

Executing command line binaries in Node.js

Explore content categories