Markov Model
MARKOV MODEL:
Markov models incorporate the Markov property, formulated by the Russian mathematician Andrey Markov in 1906. In short, the prediction of the next outcome is based solely on the information provided by the current state, not on the sequence of events that occurred before. The four main forms of Markov models are the Markov chain, the Markov decision process, the hidden Markov model, and the partially observable Markov decision process. Which of these models applies depends on two factors: whether or not the system state is fully observable, and whether the system is controlled or fully autonomous.
A Markov model is a statistical model used to model sequences of events or states, where the probability of transitioning from one state to another depends only on the current state, and not on any of the previous states. Markov models are commonly used in deep learning to model sequences of inputs, such as words in a sentence or frames in a video.
In a Markov model, each state is represented by a node, and the transitions between states are represented by directed edges between nodes. Each edge is assigned the probability of moving from its source state to its target state; this value is known as the transition probability.
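As a minimal sketch of this graph view, a chain can be stored as a dictionary of outgoing-edge probabilities and stepped by sampling (the numbers are taken from the weather example later in these notes):

```python
import random

# Transition probabilities as a nested dict: outer key = current state;
# inner dict maps each next state to its transition probability.
# Each row (inner dict) sums to 1.
transitions = {
    "sunny":  {"sunny": 0.6, "rainy": 0.2,  "cloudy": 0.2},
    "rainy":  {"sunny": 0.4, "rainy": 0.3,  "cloudy": 0.3},
    "cloudy": {"sunny": 0.5, "rainy": 0.25, "cloudy": 0.25},
}

def next_state(state):
    """Sample the next state using the current state's outgoing edge weights."""
    choices = list(transitions[state])
    weights = [transitions[state][s] for s in choices]
    return random.choices(choices, weights=weights)[0]
```

Each call to `next_state` simulates one directed edge being followed, so repeated calls generate a random walk through the chain.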
Markov models can be either first-order or higher-order, depending on whether the probability of transitioning to a new state depends only on the current state (first-order), or on the current state and a fixed number of previous states (higher-order). Higher-order Markov models can capture more complex dependencies between states, but require more data and computation to train.
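A higher-order model simply conditions on more history. A sketch of fitting a second-order model by counting, where the next state depends on the pair (previous state, current state); the toy sequence here is invented for illustration:

```python
from collections import Counter, defaultdict

def fit_second_order(sequence):
    """Estimate second-order transition probabilities from an observed sequence:
    the next state is conditioned on the pair (previous, current)."""
    counts = defaultdict(Counter)
    for a, b, c in zip(sequence, sequence[1:], sequence[2:]):
        counts[(a, b)][c] += 1
    return {pair: {s: n / sum(ctr.values()) for s, n in ctr.items()}
            for pair, ctr in counts.items()}

probs = fit_second_order(list("ABABABAB"))
# In this toy sequence, the pair ("A", "B") is always followed by "A".
```

Note that the number of conditioning contexts grows exponentially with the order, which is why higher-order models need more data to estimate reliably.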
In deep learning, Markov models are often used in combination with recurrent neural networks (RNNs). RNNs are a type of neural network that can process sequences of inputs by maintaining an internal state that depends on the previous inputs. Markov models can be used to model the transitions between the internal states of an RNN, allowing the RNN to capture more complex dependencies between inputs.
Overall, Markov models are a powerful tool for modeling sequences of inputs in deep learning, and can be used in a variety of applications, including natural language processing, speech recognition, and video analysis.
PROCESS:
The process of constructing and using a Markov model typically involves the following steps:
1. Define the states of the system.
2. Estimate the transition probabilities between states, typically from historical data.
3. Represent the model as a directed graph, with states as nodes and transition probabilities as weighted edges.
4. Use the model for prediction or analysis, for example by computing multi-step transition probabilities or the steady-state distribution.
5. Refine the model as needed to improve its accuracy.
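The steps above can be sketched end-to-end in Python; the observed weather sequence here is invented purely for illustration:

```python
from collections import Counter, defaultdict

# A hypothetical observed weather history (invented data for illustration).
observed = ["sunny", "sunny", "rainy", "cloudy", "sunny", "sunny", "cloudy", "sunny"]

# Steps 1-2: the states are the distinct observations; estimate transition
# probabilities by counting consecutive pairs.
counts = defaultdict(Counter)
for cur, nxt in zip(observed, observed[1:]):
    counts[cur][nxt] += 1
probs = {cur: {s: n / sum(ctr.values()) for s, n in ctr.items()}
         for cur, ctr in counts.items()}

# Step 4: use the model for prediction, e.g. the most likely successor of "sunny".
prediction = max(probs["sunny"], key=probs["sunny"].get)  # -> "sunny" (probability 0.5)
```

Step 5 (refinement) would amount to re-running the counting as more observations arrive.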
EXAMPLE:
Suppose we want to build a Markov model to predict the weather in a certain city, where the weather can be either sunny, rainy, or cloudy. We have historical data showing the following transition probabilities:

From \ To   Sunny   Rainy   Cloudy
Sunny       0.6     0.2     0.2
Rainy       0.4     0.3     0.3
Cloudy      0.5     0.25    0.25
We start with a sunny day on day 1.
a) What is the probability that it will be rainy on day 3?
b) What is the probability that it will be sunny on day 4?
c) What is the steady-state probability distribution for the weather conditions?
Solution:
a) To find the probability of rainy weather on day 3, we sum over all possible day-2 states. Starting from a sunny day 1, there are three sequences ending in rain on day 3: sunny-sunny-rainy, sunny-rainy-rainy, and sunny-cloudy-rainy. Their probabilities are:
0.6 * 0.2 = 0.12
0.2 * 0.3 = 0.06
0.2 * 0.25 = 0.05
Therefore, the total probability of rainy weather on day 3 is:
0.12 + 0.06 + 0.05 = 0.23
b) To find the probability of sunny weather on day 4, rather than enumerating all nine three-step sequences, we first compute the full day-3 distribution by the same method as in part a):
P(sunny on day 3) = 0.6 * 0.6 + 0.2 * 0.4 + 0.2 * 0.5 = 0.54
P(rainy on day 3) = 0.23
P(cloudy on day 3) = 0.6 * 0.2 + 0.2 * 0.3 + 0.2 * 0.25 = 0.23
Taking one more step into sunny weather (with transition probabilities 0.6, 0.4, and 0.5 from sunny, rainy, and cloudy respectively):
P(sunny on day 4) = 0.54 * 0.6 + 0.23 * 0.4 + 0.23 * 0.5 = 0.324 + 0.092 + 0.115 = 0.531
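These multi-step probabilities can be checked numerically by repeated matrix-vector multiplication, using the transition matrix P from part c):

```python
# Transition matrix, rows/columns ordered (sunny, rainy, cloudy).
P = [[0.6, 0.2, 0.2],
     [0.4, 0.3, 0.3],
     [0.5, 0.25, 0.25]]

def step(dist, P):
    """Advance the state distribution by one day: new_dist = dist * P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

day1 = [1.0, 0.0, 0.0]         # day 1 is sunny with certainty
day3 = step(step(day1, P), P)  # two transitions -> day 3
day4 = step(day3, P)           # three transitions -> day 4
print(round(day3[1], 4))  # probability of rain on day 3 -> 0.23
print(round(day4[0], 4))  # probability of sun on day 4 -> 0.531
```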
c) To find the steady-state probability distribution for the weather conditions, we need to find the vector pi that satisfies the equation pi * P = pi, with the entries of pi summing to 1, where P is the transition probability matrix (rows and columns ordered sunny, rainy, cloudy):
P = [0.6 0.2 0.2; 0.4 0.3 0.3; 0.5 0.25 0.25]
Writing pi = [s r c], the balance equations for rainy and cloudy are identical (r = 0.2s + 0.3r + 0.25c and c = 0.2s + 0.3r + 0.25c), so r = c. The sunny equation, s = 0.6s + 0.4r + 0.5c = 0.6s + 0.9r, gives 0.4s = 0.9r, i.e. s = 2.25r. Normalizing with s + r + c = 1:
pi = [9/17 4/17 4/17] ≈ [0.529 0.235 0.235]
Therefore, the steady-state probability distribution is that the weather is sunny with probability about 0.53, and rainy or cloudy each with probability about 0.24. This means that over the long term, the weather in this city is most likely to be sunny.
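The steady-state vector can also be found numerically by power iteration: start from any distribution and apply P repeatedly until it stops changing (a pure-Python sketch):

```python
# Transition matrix, rows/columns ordered (sunny, rainy, cloudy).
P = [[0.6, 0.2, 0.2],
     [0.4, 0.3, 0.3],
     [0.5, 0.25, 0.25]]

dist = [1.0, 0.0, 0.0]  # any starting distribution works for this chain
for _ in range(100):    # iterate dist <- dist * P until it converges
    dist = [sum(dist[i] * P[i][j] for i in range(3)) for j in range(3)]

print([round(p, 4) for p in dist])  # -> [0.5294, 0.2353, 0.2353], i.e. [9/17, 4/17, 4/17]
```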