The Viterbi algorithm is a dynamic programming algorithm used to find the most probable sequence of hidden states—called the Viterbi path—that results in a sequence of observed events in a Hidden Markov Model (HMM). This algorithm is widely used in speech recognition, bioinformatics, and natural language processing.
Introduction
Hidden Markov Models (HMMs) are statistical models where the system being modeled is assumed to follow a Markov process with hidden states. In many real-world scenarios, we observe a sequence of outputs or emissions, but not the sequence of internal states that generated them.
The Viterbi algorithm helps solve this by efficiently computing the most probable sequence of hidden states that explains a sequence of observed events, using known state transition and emission probabilities.
Real-world applications include:
- Speech and handwriting recognition
- Gene prediction and sequence alignment in bioinformatics
- Part-of-speech tagging in NLP
- Error correction in digital communication systems
Core Concepts
An HMM is defined by:
- States: Possible hidden conditions of the system.
- Observations: Measurable events or symbols.
- Transition probabilities: Probability of moving from one state to another.
- Emission probabilities: Probability of observing a symbol from a state.
- Initial state distribution: Probability of starting in each state.
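A common compact way to write these components is as a model \( \lambda = (A, B, \pi) \), where

\[
a_{s s'} = P(q_{t+1} = s' \mid q_t = s), \qquad
b_s(o) = P(o_t = o \mid q_t = s), \qquad
\pi_s = P(q_1 = s).
\]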
The Viterbi algorithm calculates:
- The most probable sequence of states (the Viterbi path) given a sequence of observations, as made precise by the recurrence below.
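Concretely, the algorithm keeps, for every state \( s \) and time step \( t \), the probability \( \delta_t(s) \) of the single best state sequence that ends in \( s \) after the first \( t \) observations \( o_1, \dots, o_t \):

\[
\delta_1(s) = \pi_s \, b_s(o_1), \qquad
\delta_t(s) = \max_{s'} \big[ \delta_{t-1}(s') \, a_{s' s} \big] \, b_s(o_t) \quad \text{for } t = 2, \dots, T.
\]

The probability of the best overall path is \( \max_s \delta_T(s) \), and the Viterbi path itself is recovered by remembering which predecessor \( s' \) achieved each maximum. The implementation below follows this recurrence directly.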
Viterbi Algorithm Implementation
1. Define the HMM and Observations
states = ['Rainy', 'Sunny']
observations = ['walk', 'shop', 'clean']
start_probability = {'Rainy': 0.6, 'Sunny': 0.4}
transition_probability = {
    'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
    'Sunny': {'Rainy': 0.4, 'Sunny': 0.6},
}
emission_probability = {
    'Rainy': {'walk': 0.1, 'shop': 0.4, 'clean': 0.5},
    'Sunny': {'walk': 0.6, 'shop': 0.3, 'clean': 0.1},
}
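A quick optional sanity check, assuming the dictionaries above: the initial distribution and every transition and emission row should each sum to 1. This is only a small sketch, and the check_distributions helper name is just for illustration.

```python
import math

def check_distributions(start_p, trans_p, emit_p):
    # The initial distribution and every transition/emission row should sum to 1.
    assert math.isclose(sum(start_p.values()), 1.0), "initial distribution"
    for state, row in trans_p.items():
        assert math.isclose(sum(row.values()), 1.0), f"transition row for {state}"
    for state, row in emit_p.items():
        assert math.isclose(sum(row.values()), 1.0), f"emission row for {state}"

check_distributions(start_probability, transition_probability, emission_probability)
```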
2. Implement the Viterbi Algorithm
def viterbi(obs, states, start_p, trans_p, emit_p):
    V = [{}]
    path = {}

    # Initialization
    for s in states:
        V[0][s] = start_p[s] * emit_p[s][obs[0]]
        path[s] = [s]

    # Recursion
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for curr_state in states:
            (prob, prev_state) = max(
                (V[t - 1][y0] * trans_p[y0][curr_state] * emit_p[curr_state][obs[t]], y0)
                for y0 in states
            )
            V[t][curr_state] = prob
            new_path[curr_state] = path[prev_state] + [curr_state]
        path = new_path

    # Termination
    (prob, state) = max((V[-1][s], s) for s in states)
    return (prob, path[state])
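One design note: this version carries a complete best path for every state (path[prev_state] + [curr_state] copies the list at each step), which is easy to follow but does redundant copying. A common alternative is to store only a backpointer per state and time step and reconstruct the single best path once at the end.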
Example Usage
obs_sequence = ['walk', 'shop', 'clean']
probability, state_sequence = viterbi(obs_sequence, states, start_probability, transition_probability, emission_probability)
print("Most probable state sequence:", state_sequence)
print("Probability of the sequence:", probability)
Visualization (Optional)
Visualizing HMMs and Viterbi paths can be useful for understanding the process. Here’s a basic approach using networkx:
import networkx as nx
import matplotlib.pyplot as plt

def visualize_hmm(states, trans_p):
    G = nx.DiGraph()
    for src in states:
        for dst in states:
            G.add_edge(src, dst, weight=trans_p[src][dst])
    pos = nx.circular_layout(G)
    edge_labels = nx.get_edge_attributes(G, 'weight')
    nx.draw(G, pos, with_labels=True, node_color='lightblue', node_size=2000)
    nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels)
    plt.title("HMM State Transition Graph")
    plt.show()

visualize_hmm(states, transition_probability)
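If this runs in a script or another headless environment where no window can open, replacing plt.show() with plt.savefig("hmm_transitions.png") writes the figure to a file instead; the filename here is only an example.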
Time and Space Complexity
| Step | Time Complexity |
|---|---|
| Initialization | \( O(N) \) |
| Recursion | \( O(N^2 \cdot T) \) |
| Termination | \( O(N) \) |
Where:
- \( N \): Number of states
- \( T \): Length of the observation sequence

The recursion dominates: for each of the \( T \) observations, every one of the \( N \) states takes a maximum over all \( N \) possible predecessors. Space usage is \( O(N \cdot T) \) for the dynamic programming table and the stored paths in the implementation above.
Applications
- Speech Recognition: Decoding spoken words based on acoustic features.
- Bioinformatics: Aligning DNA sequences and gene prediction.
- POS Tagging: Assigning part-of-speech labels in NLP tasks.
- Error Correction: Decoding convolutionally coded data sent over noisy communication channels.
Conclusion
The Viterbi algorithm is a cornerstone in sequence decoding problems where direct observation of states isn’t possible. By leveraging dynamic programming, it efficiently recovers the most likely sequence of hidden states from observed outputs.