Maximum Likelihood Estimation and the Expectation–Maximization Algorithm
Two critical techniques underpin many powerful machine learning algorithms: Maximum Likelihood Estimation (MLE) and the Expectation-Maximization (EM) algorithm. MLE provides a structured approach to selecting model parameters, while EM handles hidden variables; together they let ML models uncover deeper patterns.
Maximum Likelihood Estimation: Finding the Best Model Fit
Maximum Likelihood Estimation (MLE) is a statistical method for choosing the parameters of a model that maximize the likelihood of generating the observed data. Formally, given observed data X and a model with parameters θ, MLE selects the θ that maximizes the likelihood L(θ) = P(X | θ).
In essence, MLE picks the model parameters under which the observed data is most probable.
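A minimal sketch of this idea, using a hypothetical coin-flip dataset: we score each candidate bias p by the log-likelihood of the observed flips and keep the best one. (A grid search is used purely for illustration; in practice the Bernoulli MLE has a closed form, the sample mean.)

```python
import math

# Hypothetical observed data: 1 = heads, 0 = tails.
flips = [1, 0, 1, 1, 0, 1, 1, 1]

def log_likelihood(p, data):
    """Log-likelihood of a Bernoulli(p) model for the observed flips."""
    return sum(math.log(p) if x == 1 else math.log(1 - p) for x in data)

# Try a grid of candidate parameters and keep the one that gives the
# observed data the highest likelihood.
candidates = [i / 100 for i in range(1, 100)]
p_mle = max(candidates, key=lambda p: log_likelihood(p, flips))

print(p_mle)  # matches the sample mean, 6/8 = 0.75
```

Note that maximizing the log-likelihood is equivalent to maximizing the likelihood itself, and is numerically far more stable.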
Expectation-Maximization: Filling in the Blanks
The Expectation–Maximization (EM) algorithm comes into play when data has missing values or hidden variables that impact the observations. It works iteratively in two steps:
E-Step: Estimate hidden/missing variables given current model parameters.
M-Step: Update model parameters to maximize likelihood based on the observed data and the E-Step estimates.
These two steps are repeated, progressively refining the estimates of both the hidden variables and the model parameters. The intuition: make the best inference possible given what we know, then update what we know based on those inferences.
Example: Predicting Human Activity from Sensor Data
Consider wearable devices with motion sensors that record physical activity over time. The exact nature of each activity (walking, exercising, and so on) is hidden. EM can help predict the activity by analyzing patterns across the sensor streams:
E-Step: Infer probable activity labels for timestamped sensor data using an initial activity classifier.
M-Step: Re-train the classifier based on sensor data and predicted labels to enhance accuracy.
Through multiple iterations, EM continually boosts the performance of the human activity classifier despite lacking direct activity labels in the sensor data.
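The iteration above can be sketched as a self-training loop. Everything here is a hypothetical stand-in: the "sensor readings" are a single synthetic feature (mean acceleration magnitude per time window), and the "classifier" is a nearest-centroid model, giving the hard-assignment variant of EM.

```python
import random

random.seed(1)

# Hypothetical sensor feature per time window. Low values correspond to
# walking, high values to exercising, but no labels are ever observed.
readings = ([random.gauss(1.0, 0.3) for _ in range(50)]
            + [random.gauss(4.0, 0.3) for _ in range(50)])

# Initial "classifier": a guessed feature centroid per activity.
centroids = {"walking": 0.0, "exercising": 2.0}

for _ in range(10):
    # E-Step: infer the most probable activity label for each window
    # using the current classifier.
    labels = [min(centroids, key=lambda a: abs(x - centroids[a]))
              for x in readings]
    # M-Step: re-train the classifier (recompute each centroid) from the
    # sensor data and the predicted labels.
    for activity in centroids:
        members = [x for x, lab in zip(readings, labels) if lab == activity]
        if members:
            centroids[activity] = sum(members) / len(members)

print({a: round(c, 1) for a, c in centroids.items()})
```

With hard label assignments and centroid updates, this loop is essentially k-means clustering; replacing the hard labels with per-activity probabilities would recover the soft EM updates described earlier.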