Word prediction/Spell correction using python library-Spello

Vinoth Saravanan

Published Feb 11, 2023

We have all searched for something on Google without knowing the exact spelling of a word. We type something close to it, and Google responds with a suggestion along the lines of “Is this what you were trying to search for?”. Medical terms are a good example: most of us do not use them in everyday life unless we work in healthcare or a healthcare-related domain.

When I was developing a web application for the healthcare domain, I needed to get a disease name from the user. When a user types a disease name with minor spelling mistakes, the application has to correct those mistakes or predict the intended word from the input. To achieve this, I tried the Spello library in Python.

Prerequisites:

1. We need to install Python and Pandas. In my example, I’m using Python==3.8.10 & Pandas==1.3.4

2. Install Jupyter notebook — pip install notebook==6.4.6

3. Install Spello — pip install spello==1.3.0
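
To confirm that the environment matches the versions listed above, a quick check along these lines may help. It is only a sketch: the `installed_version` helper is a name made up for this example, and it relies on the standard-library `importlib.metadata` module, which is available from Python 3.8.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Package names as they are passed to pip in the prerequisites above.
for name in ("pandas", "notebook", "spello"):
    print(name, "->", installed_version(name) or "not installed")
```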

What is Spello?

Spello is a spell-checking library in Python. It is designed to provide an easy-to-use and flexible interface for users to check the spelling of words in their text.

For official documentation: https://pypi.org/project/spello/

One of the key features of Spello is its ability to use multiple dictionaries, which allows users to customize the spell-checking process to fit their specific needs.

For example, a user could use a standard English dictionary for general text, and then switch to a specialized dictionary for technical terms.

Spello also supports fuzzy matching, which can help to identify misspelled words that are similar to words in the dictionary. This feature can be especially useful for catching typos or correcting mistakes in the text.
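
To get a feel for fuzzy matching against a custom word list, here is a small sketch that uses Python's standard-library `difflib` rather than Spello itself. The word list and the `closest_term` helper are made up for illustration, and `difflib` scores similarity in its own way, so treat this only as an analogy for the idea.

```python
import difflib

# Hypothetical domain dictionary; in practice this would hold your own terms.
medical_terms = ["pneumonia", "asthma", "diabetes"]


def closest_term(word, vocabulary, cutoff=0.6):
    """Return the most similar vocabulary entry, or None if nothing is close enough."""
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(closest_term("nemonia", medical_terms))  # pneumonia
print(closest_term("xyz", medical_terms))      # None
```

Raising `cutoff` makes the match stricter, while lowering it accepts more distant guesses.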

How does Spello handle this?

It is built with a combination of two models, Phoneme and SymSpell.
The Phoneme model uses the Soundex algorithm in the background and relies on phonetic similarity to suggest correct spellings for similar-sounding words.
The SymSpell model uses the concept of edit distance to suggest correct spellings.
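
The two ideas named above can be sketched in plain Python. This is not Spello's internal code: `soundex` is a textbook American Soundex and `edit_distance` is a plain Levenshtein distance, both written here only for illustration. SymSpell itself is built around a faster delete-based lookup, but the notion of distance is the same.

```python
def soundex(word):
    """Textbook American Soundex: first letter plus three digits."""
    codes = {}
    groups = (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
              ("l", "4"), ("mn", "5"), ("r", "6"))
    for letters, digit in groups:
        for ch in letters:
            codes[ch] = digit

    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""

    result = word[0].upper()
    prev = codes.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (result + "000")[:4]


def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


print(soundex("diabetis"), soundex("diabetes"))   # D132 D132
print(soundex("nemonia"), soundex("pneumonia"))   # N550 P555
print(edit_distance("nemonia", "pneumonia"))      # 2
```

In this sketch the phonetic code catches "diabetis" but misses "nemonia", because Soundex keeps the first letter, while the edit distance of 2 still marks "nemonia" as close to "pneumonia". That is one way to see why combining the two signals is attractive.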

Spello gets you the best of both, taking into consideration the context of the word as well.

Currently, this module is available in English (en) and Hindi (hi).

Let’s dive into the real stuff

Open the command prompt and move to the folder you would like to work in.
Install all the prerequisites.
Open Jupyter notebook by typing “jupyter notebook”.

Image: Opening Jupyter notebook

After that, the web browser opens as shown below.

To open a new notebook, click the New button in the top right corner and select Python 3 (ipykernel).

An empty notebook opens as shown below.

A Jupyter notebook works much like our normal command prompt: we can install packages and run our code for instant results. Here I’m installing Spello from within the Jupyter notebook.

Import the required packages and run the block of code below to train our model.

import re

import pandas as pd
from nltk.tokenize import TreebankWordTokenizer
from spello.model import SpellCorrectionModel

# Define the model
sp = SpellCorrectionModel(language='en')

# Read the keywords from the CSV file
df = pd.read_csv('disease_name.csv')

# Create a list of keywords
disease_name_list = df['disease_name'].tolist()

# The tokenizer is instantiated here but not used in the steps below
tokenizer = TreebankWordTokenizer()
# Remove unwanted characters from each keyword and lowercase it
list_words = [re.sub(r'^\W|\s', " ", w).lower().strip() for w in disease_name_list if len(w) > 2]
# Train our model
sp.train(list_words)

In the above code, after importing the required packages, we define the model. Then we read a CSV file into a dataframe; its contents are what we train our model on.

In the next step, we create a list of keywords from that dataframe and remove unwanted spaces and characters.

Finally, we train our model with our customized keywords.
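
The cleaning step described above can be tried on its own, without Spello or Pandas. The sketch below reuses the article's regular expression. The `clean_keyword` name and the sample strings are made up for illustration.

```python
import re


def clean_keyword(raw):
    """Replace a leading non-word character and any whitespace with a space,
    then lowercase and trim the result."""
    return re.sub(r'^\W|\s', " ", raw).lower().strip()


# Hypothetical raw values, as they might come out of a messy CSV column.
samples = ["  Pneumonia\n", "-Type 2\tDiabetes", "(Asthma"]
cleaned = [clean_keyword(w) for w in samples if len(w) > 2]
print(cleaned)  # ['pneumonia', 'type 2 diabetes', 'asthma']
```

Note that the pattern only touches a non-word character at the very start of the string, so punctuation elsewhere in a keyword is left as it is.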

The CSV file will contain data like below.
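
As a rough idea of the layout, the sketch below builds a tiny stand-in for disease_name.csv in memory. Only the `disease_name` column name comes from the code above; the rows are hypothetical. It uses the standard-library `csv` module to stay dependency-free, and `pd.read_csv` on a file with the same layout should give a dataframe with that one column.

```python
import csv
import io

# Hypothetical contents of disease_name.csv: a header row, then one name per row.
sample_csv = """disease_name
Pneumonia
Asthma
Type 2 Diabetes
"""

reader = csv.DictReader(io.StringIO(sample_csv))
disease_name_list = [row["disease_name"] for row in reader]
print(disease_name_list)  # ['Pneumonia', 'Asthma', 'Type 2 Diabetes']
```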

If everything goes well, you will see a screen like this.

Spell-checking

keyword = input('Enter any keyword.. ')
corrected_keyword = sp.spell_correct(keyword)
print('Corrected keyword is : ', corrected_keyword)

In the above code, we get a keyword as input from the user. Then we pass the input to the spell_correct(input) method to get the corrected keyword.
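
If only the corrected string is needed, a small helper can pull it out of the result. This sketch assumes that `spell_correct` returns a dictionary shaped like the example in Spello's README, with `original_text`, `spell_corrected_text` and `correction_dict` keys. Please check the output of your installed version. The `corrected_text` helper and the sample dictionary are made up for illustration.

```python
def corrected_text(result):
    """Pull the corrected string out of a spell_correct-style result dict,
    falling back to the original text if the key is missing."""
    return result.get("spell_corrected_text", result.get("original_text", ""))


# Hand-written stand-in for a real sp.spell_correct(...) result (assumed shape).
example_result = {
    "original_text": "nemonia",
    "spell_corrected_text": "pneumonia",
    "correction_dict": {"nemonia": "pneumonia"},
}
print(corrected_text(example_result))  # pneumonia
```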

In the above image, we can see that the user entered “nemonia” and our model predicted the correct keyword as “Pneumonia”.

Conclusion:

Most spell correction and word prediction libraries ship with their own dictionary, and common dictionary words work fine with them. As discussed in the first paragraph, when we need spell correction and word prediction in a particular domain or field, we should have the option to supply our own dictionary of keywords. The Spello library gives us that option.
