From the course: Applied Machine Learning: Supervised Learning

Understanding overfitting and underfitting

- [Instructor] When you're dealing with supervised learning, a key thing to realize is that models can overfit and underfit. I want to make sure you understand the basic concept here. We talked about this a little bit when we looked at our decision tree: I said that the decision tree can grow as deep as you let it, and if it gets deep enough, it basically memorizes the training data. Perhaps counterintuitively, that doesn't perform well in the real world, because real-world data will not exactly match the training data. So there are two things we want to balance here: one is called underfitting and the other is called overfitting. Statisticians, again, have their own jargon for these. They say that an underfit model has high bias and an overfit model has high variance. But I like to look at it this way: an underfit model is too simple. It's not able to separate the signal from the noise. And an overfit model is too complicated…
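One way to see this trade-off concretely is to vary a decision tree's depth and compare training accuracy against held-out test accuracy. The following is a minimal sketch (not from the course) using scikit-learn; the dataset, the `flip_y` noise level, and the depths tried are illustrative assumptions. A very shallow tree tends to score poorly on both sets (underfitting), while an unrestricted tree tends to score near-perfectly on the training set but noticeably worse on the test set (overfitting).

```python
# Hypothetical sketch: decision-tree depth vs. under-/overfitting.
# Dataset and parameter choices are illustrative, not from the course.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic classification data; flip_y=0.1 mislabels ~10% of points,
# acting as the "noise" an overfit model ends up memorizing.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for depth in [1, 3, 10, None]:  # None lets the tree grow as deep as it wants
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    print(f"max_depth={depth}: "
          f"train={tree.score(X_train, y_train):.2f}, "
          f"test={tree.score(X_test, y_test):.2f}")
```

Typically the gap between the train and test scores widens as the tree deepens: the depth-`None` tree fits the training data almost exactly, including the flipped labels, which is the memorization behavior described above.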