Expectation Maximization (EM): Algorithm
The Problem: You have a mixed list of heights from two groups, but you don’t know who belongs to which group. If you knew the group memberships, you could easily calculate the average height for each group. And if you knew the average height of each group, you could guess who belongs to which group.
This is a classic “chicken-and-egg” problem. We need the group labels to find the averages, but we need the averages to find the group labels.
The EM Solution: EM solves this by “cheating” a little. It’s an iterative two-step process:
1. The “Expectation” Step (E-Step): The Guessing Step.
- Start with a random guess for the model parameters (e.g., guess the average heights for Group A and Group B).
- Based on these guesses, calculate the probability (the “expectation”) that each person belongs to Group A vs. Group B. A very tall person is probably in the “taller” group, and a shorter person is probably in the “shorter” group.
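The E-step above can be sketched in a few lines. This is a minimal illustration, assuming two equally sized groups whose heights follow normal distributions with a shared, known spread; the function name `e_step` and the numeric values are made up for the example:

```python
from statistics import NormalDist

def e_step(height, mu_a, mu_b, sigma):
    """Return the probability that a person of this height is in Group A."""
    pa = NormalDist(mu_a, sigma).pdf(height)   # how plausible under Group A's guess
    pb = NormalDist(mu_b, sigma).pdf(height)   # how plausible under Group B's guess
    return pa / (pa + pb)                      # Bayes' rule, equal prior group sizes

# Hypothetical starting guesses: Group A averages 160 cm, Group B 180 cm.
print(round(e_step(185.0, 160.0, 180.0, 7.0), 3))  # tall person: low P(Group A)
print(round(e_step(158.0, 160.0, 180.0, 7.0), 3))  # short person: high P(Group A)
```

A height halfway between the two guessed means would come out at exactly 0.5, which matches the intuition that such a person could belong to either group.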
2. The “Maximization” Step (M-Step): The Updating Step.
- Now, treat those probabilities you just calculated as if they were certain.
- Update your model parameters by calculating a new “weighted” average height for each group. For example, a person who is 90% likely to be in Group A will contribute 90% of their height to Group A’s new average.
- This step is called “Maximization” because we choose new parameters that maximize the expected likelihood of the data, given the membership probabilities from the E-step.
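The weighted-average update in the M-step can be sketched directly. The heights and probabilities below are hypothetical stand-ins for what an E-step might produce:

```python
# Hypothetical E-step output: (height in cm, probability of being in Group A).
people = [(155.0, 0.95), (162.0, 0.90), (178.0, 0.10), (183.0, 0.05)]

# Each person contributes to a group's mean in proportion to their membership
# probability: someone 90% likely to be in Group A adds 90% of their "weight" there.
mu_a = sum(h * p for h, p in people) / sum(p for _, p in people)
mu_b = sum(h * (1 - p) for h, p in people) / sum(1 - p for _, p in people)

print(round(mu_a, 1), round(mu_b, 1))  # → 160.0 179.0
```

Note that no one is assigned to a group outright; every person contributes a little to both means, weighted by how confident the E-step was.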
3. Repeat. You simply alternate these two steps (E-step, then M-step) over and over. With each iteration your membership probabilities get a little better, and your calculated averages get closer to the true values. You stop when the parameters barely change between iterations.
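Putting the pieces together, the full loop might look like the sketch below. The data and starting guesses are invented for illustration, and the spread is assumed shared and known; a full Gaussian mixture model would also update the spreads and the group proportions in the M-step:

```python
from statistics import NormalDist

# A mixed sample of heights (cm); the true group labels are unknown.
heights = [150.2, 153.1, 156.4, 158.9, 161.0,
           174.5, 177.8, 180.3, 183.6, 186.1]

mu_a, mu_b, sigma = 150.0, 190.0, 7.0   # rough starting guesses, assumed spread

for _ in range(50):
    # E-step: probability that each height belongs to Group A.
    resp = []
    for h in heights:
        pa = NormalDist(mu_a, sigma).pdf(h)
        pb = NormalDist(mu_b, sigma).pdf(h)
        resp.append(pa / (pa + pb))

    # M-step: responsibility-weighted averages become the new means.
    new_a = sum(h * r for h, r in zip(heights, resp)) / sum(resp)
    new_b = sum(h * (1 - r) for h, r in zip(heights, resp)) / sum(1 - r for r in resp)

    # Stop when the parameters barely change between iterations.
    if abs(new_a - mu_a) < 1e-6 and abs(new_b - mu_b) < 1e-6:
        break
    mu_a, mu_b = new_a, new_b

print(round(mu_a, 1), round(mu_b, 1))
```

On this toy sample the means settle near the centers of the two clusters after a handful of iterations, even though the loop never hard-assigns anyone to a group.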
