Text Mining
The Problem!
When conducting a research study on a specific problem, data must be collected in order to analyze it. For instance, in the construction field, occupational accidents have become a major problem. According to recent research, the most severe causes of accidents are the worker factor, the technological factor, natural factors, and surrounding activities. When conducting research on such a case, the necessary data must often be collected through interviews and questionnaires. With these data collection methods there is a wide possibility of biased answers, skipped questions, misinterpretation, or accessibility issues. In addition, the collected data are transformed into reports saved in datasheets, and extracting the data from them manually is time-consuming.
The Solution.
The text mining approach can collect data from reports made available by organizations; it attempts to extract the sources of accidents and analyze the hazards using Pareto analysis. The resulting data are more accurate and reliable, and the method also saves a great deal of time.
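To make the idea concrete, here is a minimal sketch of a Pareto analysis in Python. The cause labels come from the study above, but the report counts are hypothetical, invented purely for illustration:

```python
from collections import Counter

# Hypothetical counts of accident reports attributed to each cause.
cause_counts = Counter({
    "worker factor": 120,
    "technological factor": 45,
    "natural factors": 20,
    "surrounding activities": 15,
})

total = sum(cause_counts.values())
cumulative = 0.0
marked = False
print(f"{'Cause':<25}{'Share':>8}{'Cumulative':>12}")
for cause, count in cause_counts.most_common():
    share = count / total
    cumulative += share
    print(f"{cause:<25}{share:>7.0%}{cumulative:>11.0%}")
    # Pareto rule of thumb: a 'vital few' causes cover ~80% of accidents.
    if not marked and cumulative >= 0.8:
        print("-- causes above this line account for ~80% of reports --")
        marked = True
```

The typical Pareto reading is that a "vital few" causes account for the bulk of accidents, so safety efforts can be prioritized on those causes first.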
What is Text Mining?
Text mining is an artificial intelligence technology in which free text in documents is converted to machine-understandable structured data using Natural Language Processing (NLP). NLP is a technique for interaction between computers and human languages. The process provides high-quality information from the text after conversion.
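As a rough illustration of that conversion, the sketch below uses spaCy, one common NLP library (the article does not name a specific tool), to turn a free-text sentence into structured token records:

```python
import spacy

# Assumes the small English model is installed:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

report = "The worker slipped on the wet scaffold and injured his arm."
doc = nlp(report)

# Each free-text word becomes a structured record the machine can query.
for token in doc:
    print(f"{token.text:<10}{token.lemma_:<10}{token.pos_}")
```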
The Process.
Text mining relies on three main classifiers: Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF). These three classifiers analyze the converted data according to given attributes and conditions.
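Here is a minimal sketch of how these three classifiers could be trained on accident reports, assuming scikit-learn and a tiny hand-made set of labelled texts (both are illustrative choices, not something prescribed by the article):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical labelled accident reports: text plus its cause category.
texts = [
    "worker fell from ladder without harness",
    "crane motor overheated and failed",
    "heavy rain flooded the excavation site",
    "debris from the neighbouring site struck a worker",
]
labels = ["worker factor", "technological factor",
          "natural factors", "surrounding activities"]

# One pipeline per classifier; TF-IDF turns the text into numeric attributes.
classifiers = {
    "SVM": SVC(kernel="linear"),
    "Decision tree": DecisionTreeClassifier(),
    "Random forest": RandomForestClassifier(n_estimators=100),
}
for name, clf in classifiers.items():
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(texts, labels)
    print(name, "->", model.predict(["ladder collapsed under worker"])[0])
```

A TF-IDF step is included because SVM, DT, and RF all expect numeric attributes rather than raw text.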
After the classification is done, the data are processed according to a set of procedures. In the first step, the collected data are separated into words, and a token is created for each word. The next step is stop-word removal, where the most common words in a sentence or paragraph are filtered out so that the unique words are kept. The third step is stemming and lemmatization, where inflected words are reduced to their base forms as the sentence is broken down. In the last step, all the documents are combined into a corpus and presented as a collection of texts.
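These four steps map directly onto standard NLP tooling. Below is a minimal sketch using NLTK (an assumed library choice; the report sentence is invented for illustration):

```python
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.tokenize import word_tokenize

# One-time downloads of the tokenizer, stop-word list, and WordNet data.
nltk.download("punkt")
nltk.download("stopwords")
nltk.download("wordnet")

report = "The workers were lifting heavy beams when the scaffolding collapsed."

# Step 1: tokenization - split the text into individual word tokens.
tokens = [t.lower() for t in word_tokenize(report) if t.isalpha()]

# Step 2: stop-word removal - drop common words, keep the informative ones.
stop_words = set(stopwords.words("english"))
tokens = [t for t in tokens if t not in stop_words]

# Step 3: stemming and lemmatization - reduce words to their base forms.
stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()
stems = [stemmer.stem(t) for t in tokens]
lemmas = [lemmatizer.lemmatize(t, pos="v") for t in tokens]
print(stems)   # e.g. 'lifting' -> 'lift', 'collapsed' -> 'collaps'
print(lemmas)  # e.g. 'lifting' -> 'lift', 'collapsed' -> 'collapse'

# Step 4: the cleaned documents together form the corpus.
corpus = [" ".join(lemmas)]
```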
In addition to this machine categorization, manual categorization can also be done. After these steps, an N-gram table is formed for each cause, in which the activity behind the cause is described in one word, two words, and three words. Finally, a validation process is carried out to find the average weighted F1 score, which evaluates the classification performance for each cause.
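A brief sketch of these last two ideas, assuming scikit-learn: CountVectorizer with ngram_range=(1, 3) builds the one-, two-, and three-word N-gram table, and f1_score with average="weighted" computes the weighted F1 used for validation. The documents and labels below are hypothetical:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import f1_score

# N-gram table: count every 1-, 2-, and 3-word phrase in the cleaned reports.
docs = ["worker fell ladder", "worker slipped wet floor"]
vectorizer = CountVectorizer(ngram_range=(1, 3))
counts = vectorizer.fit_transform(docs)
print(vectorizer.get_feature_names_out())

# Validation: weighted F1 averages the per-class F1 scores, weighted by how
# often each cause appears, so frequent causes count proportionally more.
y_true = ["worker factor", "worker factor", "natural factors"]
y_pred = ["worker factor", "natural factors", "natural factors"]
print(f1_score(y_true, y_pred, average="weighted"))
```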