Markov Model
MARKOV MODEL:
Markov models incorporate the Markov property, formulated by the Russian mathematician Andrey Markov in 1906. In short, the prediction of the next outcome is based solely on the information provided by the current state, not on the sequence of events that occurred before. The four main forms of Markov models are the Markov chain, the Markov decision process, the hidden Markov model, and the partially observable Markov decision process. Which of these models applies depends on two factors: whether or not the system state is fully observable, and whether the system is controlled or fully autonomous.
A Markov model is a statistical model used to model sequences of events or states, where the probability of transitioning from one state to another depends only on the current state, and not on any of the previous states. Markov models are commonly used in deep learning to model sequences of inputs, such as words in a sentence or frames in a video.
In a Markov model, each state is represented by a node, and the transitions between states are represented by directed edges between nodes. Each edge is assigned the probability of moving from its source state to its target state; this value is known as the transition probability.
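As a minimal sketch of this graph view, a chain can be stored as a dictionary of outgoing-edge probabilities and stepped by sampling (the numbers are taken from the weather example later in these notes):

```python
import random

# Transition probabilities as a nested dict: outer key = current state;
# inner dict maps each next state to its transition probability.
# Each row (inner dict) sums to 1.
transitions = {
    "sunny":  {"sunny": 0.6, "rainy": 0.2,  "cloudy": 0.2},
    "rainy":  {"sunny": 0.4, "rainy": 0.3,  "cloudy": 0.3},
    "cloudy": {"sunny": 0.5, "rainy": 0.25, "cloudy": 0.25},
}

def next_state(state):
    """Sample the next state using the current state's outgoing edge weights."""
    choices = list(transitions[state])
    weights = [transitions[state][s] for s in choices]
    return random.choices(choices, weights=weights)[0]
```

Each call to `next_state` simulates one directed edge being followed, so repeated calls generate a random walk through the chain.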
Markov models can be either first-order or higher-order, depending on whether the probability of transitioning to a new state depends only on the current state (first-order), or on the current state and a fixed number of previous states (higher-order). Higher-order Markov models can capture more complex dependencies between states, but require more data and computation to train.
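A higher-order model simply conditions on more history. A sketch of fitting a second-order model by counting, where the next state depends on the pair (previous state, current state); the toy sequence here is invented for illustration:

```python
from collections import Counter, defaultdict

def fit_second_order(sequence):
    """Estimate second-order transition probabilities from an observed sequence:
    the next state is conditioned on the pair (previous, current)."""
    counts = defaultdict(Counter)
    for a, b, c in zip(sequence, sequence[1:], sequence[2:]):
        counts[(a, b)][c] += 1
    return {pair: {s: n / sum(ctr.values()) for s, n in ctr.items()}
            for pair, ctr in counts.items()}

probs = fit_second_order(list("ABABABAB"))
# In this toy sequence, the pair ("A", "B") is always followed by "A".
```

Note that the number of conditioning contexts grows exponentially with the order, which is why higher-order models need more data to estimate reliably.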
In deep learning, Markov models are often used in combination with recurrent neural networks (RNNs). RNNs are a type of neural network that can process sequences of inputs by maintaining an internal state that depends on the previous inputs. Markov models can be used to model the transitions between the internal states of an RNN, allowing the RNN to capture more complex dependencies between inputs.
Overall, Markov models are a powerful tool for modeling sequences of inputs in deep learning, and can be used in a variety of applications, including natural language processing, speech recognition, and video analysis.
PROCESS:
The process of constructing and using a Markov model typically involves the following steps:
1. Define the states of the system.
2. Estimate the transition probabilities between states, typically from historical data.
3. Represent the model as a directed graph, with states as nodes and transition probabilities as weighted edges.
4. Use the model for prediction or analysis, for example by computing multi-step transition probabilities or the steady-state distribution.
5. Refine the model as needed to improve its accuracy.
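The steps above can be sketched end-to-end in Python; the observed weather sequence here is invented purely for illustration:

```python
from collections import Counter, defaultdict

# A hypothetical observed weather history (invented data for illustration).
observed = ["sunny", "sunny", "rainy", "cloudy", "sunny", "sunny", "cloudy", "sunny"]

# Steps 1-2: the states are the distinct observations; estimate transition
# probabilities by counting consecutive pairs.
counts = defaultdict(Counter)
for cur, nxt in zip(observed, observed[1:]):
    counts[cur][nxt] += 1
probs = {cur: {s: n / sum(ctr.values()) for s, n in ctr.items()}
         for cur, ctr in counts.items()}

# Step 4: use the model for prediction, e.g. the most likely successor of "sunny".
prediction = max(probs["sunny"], key=probs["sunny"].get)  # -> "sunny" (probability 0.5)
```

Step 5 (refinement) would amount to re-running the counting as more observations arrive.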
EXAMPLE:
Suppose we want to build a Markov model to predict the weather in a certain city, where the weather can be either sunny, rainy, or cloudy. We have historical data showing the following transition probabilities:

From \ To   Sunny   Rainy   Cloudy
Sunny       0.6     0.2     0.2
Rainy       0.4     0.3     0.3
Cloudy      0.5     0.25    0.25
We start with a sunny day on day 1.
a) What is the probability that it will be rainy on day 3?
b) What is the probability that it will be sunny on day 4?
c) What is the steady-state probability distribution for the weather conditions?
Solution:
a) To find the probability of rainy weather on day 3, we sum over all possible day-2 states. Starting from a sunny day 1, there are three sequences ending in rain on day 3: sunny-sunny-rainy, sunny-rainy-rainy, and sunny-cloudy-rainy. Their probabilities are:
0.6 * 0.2 = 0.12
0.2 * 0.3 = 0.06
0.2 * 0.25 = 0.05
Therefore, the total probability of rainy weather on day 3 is:
0.12 + 0.06 + 0.05 = 0.23
b) To find the probability of sunny weather on day 4, rather than enumerating all nine three-step sequences, we first compute the full day-3 distribution by the same method as in part a):
P(sunny on day 3) = 0.6 * 0.6 + 0.2 * 0.4 + 0.2 * 0.5 = 0.54
P(rainy on day 3) = 0.23
P(cloudy on day 3) = 0.6 * 0.2 + 0.2 * 0.3 + 0.2 * 0.25 = 0.23
Taking one more step into sunny weather (with transition probabilities 0.6, 0.4, and 0.5 from sunny, rainy, and cloudy respectively):
P(sunny on day 4) = 0.54 * 0.6 + 0.23 * 0.4 + 0.23 * 0.5 = 0.324 + 0.092 + 0.115 = 0.531
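These multi-step probabilities can be checked numerically by repeated matrix-vector multiplication, using the transition matrix P from part c):

```python
# Transition matrix, rows/columns ordered (sunny, rainy, cloudy).
P = [[0.6, 0.2, 0.2],
     [0.4, 0.3, 0.3],
     [0.5, 0.25, 0.25]]

def step(dist, P):
    """Advance the state distribution by one day: new_dist = dist * P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

day1 = [1.0, 0.0, 0.0]         # day 1 is sunny with certainty
day3 = step(step(day1, P), P)  # two transitions -> day 3
day4 = step(day3, P)           # three transitions -> day 4
print(round(day3[1], 4))  # probability of rain on day 3 -> 0.23
print(round(day4[0], 4))  # probability of sun on day 4 -> 0.531
```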
c) To find the steady-state probability distribution for the weather conditions, we need to find the vector pi that satisfies the equation pi * P = pi, with the entries of pi summing to 1, where P is the transition probability matrix (rows and columns ordered sunny, rainy, cloudy):
P = [0.6 0.2 0.2; 0.4 0.3 0.3; 0.5 0.25 0.25]
Writing pi = [s r c], the balance equations for rainy and cloudy are identical (r = 0.2s + 0.3r + 0.25c and c = 0.2s + 0.3r + 0.25c), so r = c. The sunny equation, s = 0.6s + 0.4r + 0.5c = 0.6s + 0.9r, gives 0.4s = 0.9r, i.e. s = 2.25r. Normalizing with s + r + c = 1:
pi = [9/17 4/17 4/17] ≈ [0.529 0.235 0.235]
Therefore, the steady-state probability distribution is that the weather is sunny with probability about 0.53, and rainy or cloudy each with probability about 0.24. This means that over the long term, the weather in this city is most likely to be sunny.
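The steady-state vector can also be found numerically by power iteration: start from any distribution and apply P repeatedly until it stops changing (a pure-Python sketch):

```python
# Transition matrix, rows/columns ordered (sunny, rainy, cloudy).
P = [[0.6, 0.2, 0.2],
     [0.4, 0.3, 0.3],
     [0.5, 0.25, 0.25]]

dist = [1.0, 0.0, 0.0]  # any starting distribution works for this chain
for _ in range(100):    # iterate dist <- dist * P until it converges
    dist = [sum(dist[i] * P[i][j] for i in range(3)) for j in range(3)]

print([round(p, 4) for p in dist])  # -> [0.5294, 0.2353, 0.2353], i.e. [9/17, 4/17, 4/17]
```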