Boyer–Moore Algorithm: The Algorithm That Skips Ahead
📘 Introduction
In the world of computer science, efficient string searching is a cornerstone of text processing, compilers, and even everyday applications like search engines or word processors. Among the many algorithms designed for this purpose, the Boyer–Moore Algorithm, introduced by Robert S. Boyer and J Strother Moore in 1977, stands out as one of the most efficient and widely adopted
🧠 The Idea Behind Boyer–Moore Algorithm
Unlike the naive approach that checks every character sequentially, Boyer–Moore leverages two powerful heuristics:
Together, these heuristics allow the algorithm to skip large portions of text, making it significantly faster in practice than many alternatives.
⚙️ Algorithmic Steps
Here’s how Boyer–Moore Algorithm works step-by-step:
Recommended by LinkedIn
💻 Boyer–Moore Algorithm Implementation
# Boyer-Moore String Search Algorithm Implementation
# Step 1: Preprocess for Bad Character Rule
def bad_char_heuristic(pattern):
"""
Creates a dictionary storing the last occurrence of each character in the pattern.
Used for the Bad Character Rule.
"""
bad_char = {}
for i in range(len(pattern)):
bad_char[pattern[i]] = i
return bad_char
# Step 2: Boyer-Moore Search Function
def boyer_moore_search(text, pattern):
"""
Searches for occurrences of 'pattern' in 'text' using Boyer-Moore algorithm.
Returns a list of starting indices where pattern is found.
"""
m = len(pattern)
n = len(text)
bad_char = bad_char_heuristic(pattern) # Preprocess pattern
result = [] # Stores indices of matches
s = 0 # Shift of the pattern with respect to text
while s <= n - m:
j = m - 1
# Compare pattern with text from right to left
while j >= 0 and pattern[j] == text[s + j]:
j -= 1
# If pattern is found
if j < 0:
result.append(s)
# Shift pattern so next character in text aligns with last occurrence in pattern
s += (m - bad_char.get(text[s + m], -1)) if s + m < n else 1
else:
# Shift pattern using Bad Character Rule
s += max(1, j - bad_char.get(text[s + j], -1))
return result
# Step 3: Example
text = "ABAAABCDABCABCDABCD"
pattern = "ABCD"
matches = boyer_moore_search(text, pattern)
if matches:
print("Pattern found at indices:", matches)
else:
print("Pattern not found.")
# Output
# Pattern found at indices: [4, 11, 15]
📊 Time & Space Complexity
⚖️ Strengths & Weaknesses
✅ Strengths:
❌ Weaknesses:
🔑 Key Takeaways