Chatbot with Python-Flask

What is a Chatbot?

ChatterBot is a Python library that makes it easy to generate automated responses to a user’s input. ChatterBot uses a selection of machine learning algorithms to produce different types of responses. This makes it easy for developers to create chat bots and automate conversations with users. 

I built this chatbot with Python and Flask, and used Docker for deployment.

Step – 1: Collecting Data

I collected about 1,000 question-and-answer pairs, applied some text preprocessing, and loaded them into a dataframe.

Step – 2: Model Building

Pre-requisites

Hands-on knowledge of scikit-learn and NLTK is assumed. If you are new to NLP, you can still read the article first and come back to the resources afterwards.

NLTK has been called “a wonderful tool for teaching, and working in, computational linguistics using Python,” and “an amazing library to play with natural language.”

Downloading and Installing NLTK:

1.    To install: pip install nltk

2.    To test: start Python and run import nltk

After importing NLTK, run nltk.download(). This opens the NLTK downloader, where you can choose which corpora and models to download. You can also download all packages at once.

The NLTK data package includes a pre-trained Punkt tokenizer for English.


Text Preprocessing:

·       Converting the entire text into uppercase or lowercase

·       Tokenization: Tokenization is just the term used to describe the process of converting the normal text strings into a list of tokens

·       Removing noise, i.e. everything that isn’t a standard number or letter.

·       Removing stop words.

·       Stemming: Stemming is the process of reducing inflected (or sometimes derived) words to their stem, base or root form — generally a written word form.

·       Lemmatization: A slight variant of stemming is lemmatization. The major difference between the two is that stemming can often create non-existent words, whereas lemmas are actual words.
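The first four steps in the list above can be sketched with the standard library alone. This is only an illustration under stated assumptions: the article itself uses NLTK, the function name preprocess and the tiny STOP_WORDS set are made up for this example, and stemming and lemmatization are left out because they need NLTK and its data packages.

```python
import re

# Tiny illustrative stop-word list (an assumption for this sketch);
# NLTK's stopwords corpus is much larger.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and",
              "in", "on", "for", "what", "how"}


def preprocess(text):
    """Lowercase the text, strip noise, tokenize, and drop stop words."""
    text = text.lower()                       # normalize case
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # keep only letters, digits, whitespace
    tokens = text.split()                     # simple whitespace tokenization
    return [t for t in tokens if t not in STOP_WORDS]


print(preprocess("What is the price of a Flask course?"))
# ['price', 'flask', 'course']
```

Here noise is removed before tokenizing, which lets a plain split() stand in for a real tokenizer; with NLTK's tokenizer the two steps would more likely run in the order listed above.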

TF-IDF Approach:

TF-IDF is short for Term Frequency-Inverse Document Frequency, where:

Term Frequency: a score of how often a word appears in the current document.

TF = (Number of times term t appears in a document)/(Number of terms in the document)


Inverse Document Frequency: a score of how rare the word is across documents.

IDF = 1 + log(N/n), where N is the number of documents and n is the number of documents in which term t appears.

The TF-IDF weight, the product TF × IDF, is often used in information retrieval and text mining. It is a statistical measure of how important a word is to a document in a collection or corpus.
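The TF and IDF formulas above can be written out in plain Python as a rough sketch. It is not the code used in the article: the function names tf, idf, and tfidf_vector and the two-document toy corpus are assumptions, and log is taken to be the natural logarithm.

```python
import math


def tf(term, doc_tokens):
    """TF = (times the term appears in the document) / (terms in the document)."""
    if not doc_tokens:
        return 0.0
    return doc_tokens.count(term) / len(doc_tokens)


def idf(term, corpus):
    """IDF = 1 + log(N / n), with N documents and n documents containing the term."""
    n = sum(1 for doc in corpus if term in doc)
    if n == 0:
        return 0.0  # term never seen in the corpus; avoid dividing by zero
    return 1 + math.log(len(corpus) / n)


def tfidf_vector(doc_tokens, corpus, vocabulary):
    """One TF-IDF weight per vocabulary term, in vocabulary order."""
    return [tf(term, doc_tokens) * idf(term, corpus) for term in vocabulary]


# Toy corpus of already-tokenized documents (illustrative only).
corpus = [
    ["flask", "is", "a", "web", "framework"],
    ["docker", "is", "a", "container", "tool"],
]
vocabulary = sorted({term for doc in corpus for term in doc})

print(vocabulary)
print(tfidf_vector(corpus[0], corpus, vocabulary))
```

In this corpus "is" appears in both documents, so its IDF is 1 + log(2/2) = 1, while "flask" appears in only one and gets 1 + log(2). In practice scikit-learn's TfidfVectorizer does this work; its default settings use a smoothed IDF and normalize each vector, so its numbers will differ from this sketch.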

Cosine Similarity:

TF-IDF is a transformation applied to texts to obtain real-valued vectors in a vector space. We can then compute the cosine similarity of any pair of vectors by taking their dot product and dividing it by the product of their norms, which yields the cosine of the angle between the vectors:

cos(θ) = (d1 · d2) / (||d1|| × ||d2||)

Cosine similarity is a measure of similarity between two non-zero vectors. Using this formula, we can find the similarity between any two documents d1 and d2.
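The dot-product-over-norms calculation described above is short enough to sketch directly. The function names cosine_similarity and best_match and the toy vectors are assumptions for illustration; best_match shows one plausible way a chatbot could pick the stored question closest to the user's query.

```python
import math


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # cosine is undefined for zero vectors; treat as no similarity
    return dot / (norm_u * norm_v)


def best_match(query_vec, question_vecs):
    """Index of the stored vector most similar to the query vector."""
    scores = [cosine_similarity(query_vec, q) for q in question_vecs]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy vectors standing in for TF-IDF vectors (illustrative only).
questions = [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
query = [1.0, 1.0, 0.0]

print(best_match(query, questions))  # 1
```

Because TF-IDF weights are never negative, the similarity of two TF-IDF vectors falls between 0 (no shared terms) and 1 (same direction).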

Step – 3: Flask

Flask is a micro-framework for Python. It lets you build websites and web apps quickly and easily, and it is lightweight.

For now, I have kept all the model-building code in the Flask file and connected it to the front end. This is not a good idea: ideally we would maintain a database and retrieve the data by ID. For the moment, though, it works fine this way.

The index.html file is my landing page.

Following the Flask project structure, keep all .html files in the templates folder and .css files and images (if any) in the static folder. For more information, see the Flask documentation.

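Create a view function that handles the submitted form and sends the response back to the front end. The check request.method == 'POST' identifies a form submission; note that Flask reports the method in uppercase.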

In the template, I reference the variable passed from Flask. Once the user enters text, the form submits to the bot method in the Flask file. In the bot method I call the chatanswers function, which applies the TF-IDF model to the preprocessed text, and finally pass response_text to the HTML file.
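The flow just described might look roughly like the sketch below. The names bot, chatanswers, and response_text come from the article; everything else is an assumption made to keep the example self-contained: the "/" route, the user_text form field, the inline INDEX_HTML string (standing in for templates/index.html, which would normally be rendered with render_template), and the stub body of chatanswers, which only echoes the input instead of running the TF-IDF model.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Inline stand-in for templates/index.html so the sketch runs on its own.
INDEX_HTML = """
<form method="post">
  <input name="user_text" placeholder="Ask something">
  <button type="submit">Send</button>
</form>
{% if response_text %}<p>{{ response_text }}</p>{% endif %}
"""


def chatanswers(user_text):
    """Placeholder for the TF-IDF lookup; here it only echoes the question."""
    return f"You asked: {user_text}"


@app.route("/", methods=["GET", "POST"])
def bot():
    response_text = None
    if request.method == "POST":  # Flask reports the method in uppercase
        user_text = request.form.get("user_text", "")
        response_text = chatanswers(user_text)
    # response_text becomes a template variable, shown only when it is set
    return render_template_string(INDEX_HTML, response_text=response_text)
```

A GET request shows the empty form, and a POST renders the same page with the reply filled in. Swapping the stub for the real TF-IDF lookup would not require changing the route.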

Step – 4: Deployment

You can follow the Docker documentation to deploy your project.
