From the course: Develop ML Models with Python and T-SQL
Common pitfalls
- [Presenter] Here are some common pitfalls you may encounter and how you can avoid them. First, data cleansing. This is the process of making sure the data is ready to be ingested and used for machine learning. You want to ensure that there are no inconsistent data formats. The data may arrive in different formats, but all data types and formats should be correct before processing. You also want to ensure there are no missing values. Handling missing data is crucial. Decide on a strategy for the missing fields: either assign default values or use calculated values. You also want to remove any outliers, meaning values at the extreme ends of the data spectrum. Another thing to consider is integrity. Ensure that database relationships are preserved, that foreign keys are used, that datasets are normalized, and that there are no orphan records. In addition, be sure that feature selection is done properly. Work with a knowledge…
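The cleansing and integrity checks described above can be sketched in plain Python. This is a minimal illustration, not the course's own code: the sample rows, the column names (`order_id`, `customer_id`, `amount`), and the helper functions are all hypothetical. The median fill is one possible "calculated value" strategy, and the 1.5 × IQR rule is one common, assumed way to decide what counts as an outlier.

```python
from statistics import median, quantiles

# Hypothetical sample data; table and column names are assumptions.
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": "25.50"},
    {"order_id": 11, "customer_id": 2, "amount": None},     # missing value
    {"order_id": 12, "customer_id": 1, "amount": "30"},
    {"order_id": 13, "customer_id": 9, "amount": "27.00"},  # no such customer
    {"order_id": 14, "customer_id": 2, "amount": "9000"},   # extreme value
    {"order_id": 15, "customer_id": 1, "amount": 28.0},
    {"order_id": 16, "customer_id": 2, "amount": "26"},
]


def normalize_amounts(rows):
    """Convert every non-missing amount to float so the column has one type."""
    return [
        {**r, "amount": None if r["amount"] is None else float(r["amount"])}
        for r in rows
    ]


def drop_orphans(rows, parents):
    """Keep only rows whose customer_id exists in the parent table."""
    valid_ids = {p["customer_id"] for p in parents}
    return [r for r in rows if r["customer_id"] in valid_ids]


def fill_missing(rows):
    """Replace missing amounts with the median of the known amounts."""
    fill = median(r["amount"] for r in rows if r["amount"] is not None)
    return [{**r, "amount": fill if r["amount"] is None else r["amount"]} for r in rows]


def remove_outliers(rows, k=1.5):
    """Drop rows whose amount lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles([r["amount"] for r in rows], n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in rows if low <= r["amount"] <= high]


cleaned = remove_outliers(fill_missing(drop_orphans(normalize_amounts(orders), customers)))

for row in cleaned:
    print(row)
```

The order of the steps is a design choice: orphan rows are dropped before the median is calculated so that records with no valid parent do not influence the fill value, and missing values are filled before the outlier check so every row has a number to compare. In a database, the orphan check would normally be enforced by a foreign key constraint rather than in application code.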