
CS3016/08. Implementing the Viterbi algorithm for Hidden Markov Models (HMMs)


The Viterbi algorithm is a dynamic programming algorithm used to find the most probable sequence of hidden states—called the Viterbi path—that results in a sequence of observed events in a Hidden Markov Model (HMM). This algorithm is widely used in speech recognition, bioinformatics, and natural language processing.


Introduction

Hidden Markov Models (HMMs) are statistical models where the system being modeled is assumed to follow a Markov process with hidden states. In many real-world scenarios, we observe a sequence of outputs or emissions, but not the sequence of internal states that generated them.

The Viterbi algorithm helps solve this by efficiently computing the most probable sequence of hidden states that explains a sequence of observed events, using known state transition and emission probabilities.

Real-world applications include speech recognition, bioinformatics, and natural language processing; the Applications section below gives concrete examples.

Core Concepts

An HMM is defined by:

  1. A set of hidden states (in the running example below, Rainy and Sunny).
  2. A set of possible observations (walk, shop, clean).
  3. Start probabilities: how likely the chain is to begin in each state.
  4. Transition probabilities: how likely each state is to follow each other state.
  5. Emission probabilities: how likely each observation is, given the current state.

The Viterbi algorithm calculates:

  1. The most probable sequence of hidden states (the Viterbi path) that explains the observations.
  2. The joint probability of that path together with the observation sequence.
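
In symbols (the notation here is mine; the code below uses dictionaries named start_p, trans_p, and emit_p for the same quantities), write \( \pi_s \) for the start probability of state \( s \), \( a_{s's} \) for the probability of moving from \( s' \) to \( s \), and \( b_s(o_t) \) for the probability of emitting observation \( o_t \) in state \( s \). The algorithm fills a table \( V_t(s) \) holding the probability of the best state sequence that ends in state \( s \) at time \( t \):

\( V_1(s) = \pi_s \, b_s(o_1) \)

\( V_t(s) = \max_{s'} \left[ V_{t-1}(s') \, a_{s's} \right] b_s(o_t) \quad \text{for } t > 1 \)

The probability of the best overall path is \( \max_s V_T(s) \), and the path itself is recovered by remembering which predecessor \( s' \) achieved each maximum.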

Viterbi Algorithm Implementation

1. Define the HMM and Observations

states = ['Rainy', 'Sunny']
observations = ['walk', 'shop', 'clean']
start_probability = {'Rainy': 0.6, 'Sunny': 0.4}
transition_probability = {
    'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
    'Sunny': {'Rainy': 0.4, 'Sunny': 0.6},
}
emission_probability = {
    'Rainy': {'walk': 0.1, 'shop': 0.4, 'clean': 0.5},
    'Sunny': {'walk': 0.6, 'shop': 0.3, 'clean': 0.1},
}
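
Each of these tables should form a valid probability distribution: the start probabilities, every row of the transition table, and every row of the emission table should each sum to 1. A quick defensive check (not part of the original example, just a habit worth adopting before debugging odd results):

import math

# Each distribution should sum to 1 (isclose tolerates floating-point rounding)
assert math.isclose(sum(start_probability.values()), 1.0)
for state in states:
    assert math.isclose(sum(transition_probability[state].values()), 1.0)
    assert math.isclose(sum(emission_probability[state].values()), 1.0)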

2. Implement the Viterbi Algorithm

def viterbi(obs, states, start_p, trans_p, emit_p):
    # V[t][s] = probability of the best path ending in state s at time t
    # path[s] = best state sequence ending in state s at the current time step
    V = [{}]
    path = {}

    # Initialization: probability of starting in s and emitting the first observation
    for s in states:
        V[0][s] = start_p[s] * emit_p[s][obs[0]]
        path[s] = [s]

    # Recursion: extend the best path into each state, one observation at a time
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}

        for curr_state in states:
            # Best predecessor: maximize (prob so far) * (transition) * (emission)
            (prob, prev_state) = max(
                (V[t - 1][prev] * trans_p[prev][curr_state] * emit_p[curr_state][obs[t]], prev)
                for prev in states
            )
            V[t][curr_state] = prob
            new_path[curr_state] = path[prev_state] + [curr_state]

        path = new_path

    # Termination: pick the state with the highest final probability
    (prob, state) = max((V[-1][s], s) for s in states)
    return (prob, path[state])
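
One practical caveat: for long observation sequences, multiplying many probabilities below 1 underflows to zero in floating point. The standard remedy is to work with log probabilities and add them instead of multiplying. A minimal sketch of the same recursion in log space (my variant, not part of the original code; note that math.log(0) raises ValueError, so zero-probability entries would need special handling, e.g. mapping them to negative infinity):

import math

def viterbi_log(obs, states, start_p, trans_p, emit_p):
    # Same algorithm as above, but sums of logs replace products of probabilities
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    path = {s: [s] for s in states}

    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for curr in states:
            logp, prev = max(
                (V[t - 1][p] + math.log(trans_p[p][curr]) + math.log(emit_p[curr][obs[t]]), p)
                for p in states
            )
            V[t][curr] = logp
            new_path[curr] = path[prev] + [curr]
        path = new_path

    logp, state = max((V[-1][s], s) for s in states)
    # Returns a log probability; math.exp(logp) recovers the probability itself
    return (logp, path[state])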

Example Usage

obs_sequence = ['walk', 'shop', 'clean']
probability, state_sequence = viterbi(obs_sequence, states, start_probability, transition_probability, emission_probability)

print("Most probable state sequence:", state_sequence)
print("Probability of the sequence:", probability)

Visualization (Optional)

Visualizing HMMs and Viterbi paths can be useful for understanding the process. Here’s a basic approach using networkx:

import networkx as nx
import matplotlib.pyplot as plt

def visualize_hmm(states, trans_p):
    # Build a directed graph with one weighted edge per state transition
    G = nx.DiGraph()
    for src in states:
        for dst in states:
            G.add_edge(src, dst, weight=trans_p[src][dst])
    # Lay the states out on a circle and label each edge with its probability
    pos = nx.circular_layout(G)
    edge_labels = nx.get_edge_attributes(G, 'weight')
    nx.draw(G, pos, with_labels=True, node_color='lightblue', node_size=2000)
    nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels)
    plt.title("HMM State Transition Graph")
    plt.show()

visualize_hmm(states, transition_probability)
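
One caveat with this quick sketch: Rainy→Rainy and Sunny→Sunny are self-loops, and nx.draw_networkx_edge_labels positions each label along the line between an edge's endpoints, so self-loop labels end up drawn on top of the node itself. For a two-state model it is often clearer to simply print the transition table alongside the figure.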

Time and Space Complexity

Step            Time Complexity
Initialization  \( O(N) \)
Recursion       \( O(N^2 \cdot T) \)
Termination     \( O(N) \)

Where:

  - \( N \) is the number of hidden states.
  - \( T \) is the length of the observation sequence.

The recursion step dominates: each of the \( T - 1 \) remaining observations requires comparing every state against every predecessor state. The implementation above also keeps the full table V and one candidate path per state, so its space complexity is \( O(N \cdot T) \).

Applications

  1. Speech Recognition: Decoding spoken words based on acoustic features.
  2. Bioinformatics: Aligning DNA sequences and gene prediction.
  3. POS Tagging: Assigning part-of-speech labels in NLP tasks.
  4. Error Correction: Decoding convolutionally coded data sent over noisy communication channels (Viterbi decoding).

Conclusion

The Viterbi algorithm is a cornerstone in sequence decoding problems where direct observation of states isn’t possible. By leveraging dynamic programming, it efficiently recovers the most likely sequence of hidden states from observed outputs.

