From the course: Deep Learning with Python and Keras: Build a Model for Sentiment Analysis

Preprocessing text for sentiment analysis

- [Instructor] When you're working with text data and using it to train machine learning models, there is a bunch of pre-processing and cleaning that you have to do before you can actually use that text data for model building and training. And here in this movie, we'll briefly discuss some of those pre-processing techniques. Text pre-processing usually includes tokenization, lemmatization, and stop word removal. Let's talk about tokenization first. Machine learning models don't work with the entire chunk of text that you feed in. Tokenization is the process of breaking down text into smaller units called tokens. Tokens can be words, numbers, or punctuation marks. Tokenization helps in structuring the text for further processing and analysis, and is a fundamental step to understand and interpret the text by analyzing its individual components. Lemmatization involves reducing words to their base or root form.…
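
To make the tokenization step concrete, here is a minimal sketch in plain Python. It is not the course's own code: the `tokenize` helper, the regular expression, and the sample sentence are assumptions chosen for illustration. It splits text into the three kinds of tokens mentioned in the transcript: words, numbers, and punctuation marks.

```python
import re

# Hypothetical pattern (an assumption, not from the course):
# a run of word characters (letters, digits, underscore) is one token,
# and any single character that is neither a word character nor
# whitespace (i.e. a punctuation mark) is its own token.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Break a string into word, number, and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


# Sample review text, invented for this example.
tokens = tokenize("The movie was great, 10 out of 10!")
print(tokens)
# ['The', 'movie', 'was', 'great', ',', '10', 'out', 'of', '10', '!']
```

A regex-based splitter like this is only a stand-in. In practice you would more likely rely on the tokenizer provided by your NLP or deep learning library, which also handles cases such as contractions and special characters more carefully.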
