Data Science - Sense explained through data and science!
Why is Data Science "sense explained through data and science"?
In the Data Science world, "model fit" is a phenomenon of constant concern. Is it something new? No, absolutely not!
If you look around, this phenomenon has been with us all the time and everywhere.
Here's an analogy from everyday life:
Students/Children/Kids vs. Model
Overfitting Model:
These children learn their subjects by rote (each and every word, down to the last full stop). They are trained on the training data (textbooks) again and again. They over-learn.
So in the usual school tests, where the test data comes from the training data (textbooks) itself, these kids may score very well. But give them test data from outside the training data (questions not found in the textbooks), and not all of them will perform well.
This is called overfitting. Here the model is forced to learn one specific data set over and over, so it memorises that data instead of learning patterns that carry over to new data.
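To make the rote-learning kid concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the tiny addition "textbook", the `RoteLearner` class and the `accuracy` helper are hypothetical names, not part of any library. The learner memorises question/answer pairs and nothing else.

```python
# Hypothetical "textbook": addition questions the student has already seen.
textbook = {(2, 3): 5, (4, 1): 5, (7, 2): 9, (5, 5): 10}
# Questions on the same subject that are NOT in the textbook.
unseen_exam = {(3, 6): 9, (8, 1): 9, (6, 4): 10}


class RoteLearner:
    """Memorises every training example verbatim and learns no rule."""

    def fit(self, examples):
        self.memory = dict(examples)
        return self

    def predict(self, question):
        # Returns None for any question that was never memorised.
        return self.memory.get(question)


def accuracy(model, exam):
    """Fraction of exam questions answered correctly."""
    correct = sum(model.predict(q) == a for q, a in exam.items())
    return correct / len(exam)


student = RoteLearner().fit(textbook)
train_score = accuracy(student, textbook)
test_score = accuracy(student, unseen_exam)
print(f"textbook test: {train_score:.0%}, unseen test: {test_score:.0%}")
```

The rote learner gets every textbook question right and every unseen question wrong, which is the gap between training and test performance that overfitting describes.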
Generalised Model:
These are the kids who mostly excel in entrance tests such as IITJEE, CAT, etc.
Why? Because they generalise. They are not trained on just a few specific textbooks. These kids are trained too, only not repetitively on the same data, but on a variety of data sets across different subjects.
Here the model is allowed to learn from multiple data sets.
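As a rough illustration of a kid who learns the rule rather than the pages, the sketch below fits a small least-squares model on varied addition examples and then tests it on questions it has never seen. The data, the hidden rule (answer = a + b) and the helper names `fit_rule` and `accuracy` are assumptions made up for this example.

```python
# Hypothetical training data drawn from a variety of sources.
# Each entry is ((a, b), answer); the hidden rule is answer = a + b.
varied_examples = [((2, 3), 5), ((4, 1), 5), ((7, 2), 9),
                   ((5, 5), 10), ((1, 8), 9), ((9, 0), 9)]
unseen_exam = [((3, 6), 9), ((8, 1), 9), ((6, 4), 10), ((12, 7), 19)]


def fit_rule(examples):
    """Least-squares fit of answer = w1 * a + w2 * b via the normal equations."""
    saa = sum(a * a for (a, b), _ in examples)
    sbb = sum(b * b for (a, b), _ in examples)
    sab = sum(a * b for (a, b), _ in examples)
    say = sum(a * y for (a, b), y in examples)
    sby = sum(b * y for (a, b), y in examples)
    det = saa * sbb - sab * sab  # non-zero when the examples are varied enough
    w1 = (say * sbb - sby * sab) / det
    w2 = (sby * saa - say * sab) / det
    return w1, w2


def accuracy(weights, exam):
    """Fraction of questions where the rounded prediction matches the answer."""
    w1, w2 = weights
    correct = sum(round(w1 * a + w2 * b) == y for (a, b), y in exam)
    return correct / len(exam)


weights = fit_rule(varied_examples)
train_score = accuracy(weights, varied_examples)
test_score = accuracy(weights, unseen_exam)
print(f"learned weights: {weights}, unseen test: {test_score:.0%}")
```

Because this learner recovers the rule behind the examples, it answers the unseen questions as well as the ones it trained on. Note that the fit only works when the examples differ enough from each other (the `det` term), which loosely mirrors the point about variety in the training material.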
Underfitting Model:
Poorly performing kids! Why?
1. The model/student has the ability to perform, but a lack of resources or input variables (poor teaching quality, economic status, etc.) limits the kid's performance.
Here the model can perform, but needs more features/input variables.
2. The model/student doesn't have the ability to perform in this area: there is a mismatch between the kid's interests and the subject or area of study. Maybe he or she wants to excel in arts, sports or some other field.
The model can't perform in the subject area chosen. This points to a poor choice of business area or problem definition!
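The first cause above (a capable model starved of input variables) can be sketched in a few lines. The numbers here are invented: the sketch assumes that an exam score depends on both hours studied and teaching quality, then compares a model that sees only the hours with one that sees both. The function names are hypothetical helpers, not library calls.

```python
# Hypothetical data: (hours_studied, teaching_quality) per student.
students = [(1, 1), (2, 5), (3, 2), (4, 4), (5, 1), (6, 3)]
# Assumed ground truth for this sketch only: score = 5 * hours + 10 * quality.
scores = [5 * h + 10 * q for h, q in students]


def fit_hours_only(rows, ys):
    """Least-squares line score = slope * hours + intercept (quality ignored)."""
    xs = [h for h, _ in rows]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda h, q: slope * h + intercept


def fit_both(rows, ys):
    """Least-squares fit of score = w1 * hours + w2 * quality (normal equations)."""
    saa = sum(h * h for h, _ in rows)
    sbb = sum(q * q for _, q in rows)
    sab = sum(h * q for h, q in rows)
    say = sum(h * y for (h, _), y in zip(rows, ys))
    sby = sum(q * y for (_, q), y in zip(rows, ys))
    det = saa * sbb - sab * sab
    w1 = (say * sbb - sby * sab) / det
    w2 = (sby * saa - say * sab) / det
    return lambda h, q: w1 * h + w2 * q


def mean_abs_error(model, rows, ys):
    """Average absolute gap between predicted and actual scores."""
    return sum(abs(model(h, q) - y) for (h, q), y in zip(rows, ys)) / len(ys)


error_hours_only = mean_abs_error(fit_hours_only(students, scores), students, scores)
error_both = mean_abs_error(fit_both(students, scores), students, scores)
print(f"hours only: {error_hours_only:.2f}, hours + quality: {error_both:.2f}")
```

The hours-only model stays well off the mark even on the data it was trained on, which is the signature of underfitting, while the error essentially vanishes once the missing input variable is supplied.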