The power of AI is in small data!
We've all been schooled into believing that AI and ML solutions naturally lend themselves to solving "Big Data" problems.
The allure of capturing as much data as possible is strong. And, now that more businesses are experimenting with machine learning and AI, it’s growing stronger. When you aren’t sure what you may eventually need, might as well capture everything, right?
The exponential growth of data is undisputed, driven by the internet of things and connected devices. The reality is that data is already big and getting bigger, but do we need to worry about all this big data upfront, however exciting it is? Is there anything wrong with thinking small first?
Can humans get more time in the day by letting machines solve problems based on small but reliable data?
A case for small data
Big data can be unwieldy and expensive to maintain, clean and understand. Most small and medium enterprises would struggle to pull together the technology and people required to process big data to the point where it becomes valuable.
It's also hard to see through the human biases, intended or unintended, embedded in big data, and those biases can be dangerous when we build the machines of the future.
Another factor is that data changes at a rapid pace, so it can quickly become irrelevant.
Just like you need to learn to walk before you can run, you can’t really do big data right until you master the art of harnessing small data first.
So then, what is small data?
Small data is data collected experimentally or intentionally, at a human scale, where the focus is on causation and understanding rather than prediction.
Small data is much more manageable, and devoid of the high costs (not to mention compliance and regulatory risks) of big data, which can require a massive amount of work to manage, maintain and keep clean. Small data, even if it comes in unstructured form, can also be labeled somewhat easily.
Why might AI and ML models be more powerful with small data?
With big data we are not sure which model to use, and we are uncertain about the biases and errors in the data. If we could better incorporate these into our modelling we would achieve more realistic results, but this is difficult to do. Small data models, on the other hand, are necessarily simple and reflect at least some of the uncertainty. We know the dangers of model misspecification; even if the results do not calibrate the uncertainty perfectly, at least the user of the conclusions will understand that they should be cautious and allow for the possibility of being wrong.
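As a minimal sketch of what this looks like in practice (the dataset, variable names and numbers below are hypothetical, invented for illustration), a simple model fitted to a dozen observations can report its effect estimate together with a confidence interval, so the reader sees the uncertainty rather than a bare point prediction:

```python
import numpy as np
from scipy import stats

# Hypothetical "human scale" dataset: 12 observations of
# ad spend (in $k) versus weekly sign-ups for a small business.
ad_spend = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5])
signups  = np.array([14, 19, 21, 26, 28, 34, 33, 40, 41, 47, 48, 55])

# A deliberately simple model: an ordinary least-squares line.
fit = stats.linregress(ad_spend, signups)

# Report the effect *with* its uncertainty, not just a point estimate.
n = len(ad_spend)
t_crit = stats.t.ppf(0.975, df=n - 2)      # 95% two-sided critical value
ci_low = fit.slope - t_crit * fit.stderr
ci_high = fit.slope + t_crit * fit.stderr

print(f"Estimated sign-ups per extra $1k of ad spend: {fit.slope:.1f}")
print(f"95% confidence interval: ({ci_low:.1f}, {ci_high:.1f})")
```

The point is not the particular numbers but that the output carries its own caveat: a narrow or wide interval tells the decision maker how much to trust the estimate.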
In contrast, models for big data may be fine for point prediction and classification, but we struggle to provide realistic assessments of uncertainty. There is also the problem that big data invites a massive number of hypotheses, with less protection against false positive results.
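To see why, here is a small, hedged illustration using purely simulated data (no real dataset is assumed): if we test a thousand candidate features against an outcome that is pure noise, a conventional 5% significance threshold will still flag dozens of them as "discoveries".

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulate a "big data" fishing expedition: 1,000 candidate features,
# none of which has any real relationship to the outcome.
n_samples, n_features = 500, 1000
X = rng.normal(size=(n_samples, n_features))
y = rng.normal(size=n_samples)            # outcome is pure noise

# Test every feature against the outcome and count "significant" hits.
p_values = []
for j in range(n_features):
    r, p = stats.pearsonr(X[:, j], y)
    p_values.append(p)

false_positives = np.sum(np.array(p_values) < 0.05)
print(f"'Significant' features at p < 0.05: {false_positives} of {n_features}")
# Roughly 50 spurious discoveries are expected purely by chance.
```

The more hypotheses the data lets us test, the more safeguards (corrections, held-out data, pre-registered questions) we need just to avoid fooling ourselves.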
Conclusion
The artificial intelligence we build for the future is only as good as the data we build it with. Humans make more rational and less biased decisions when they have small but relevant data, data with the potential to reveal relationships and insights, those tiny clues that uncover big trends. So why shouldn't we build artificial intelligence with the same relevance, to help us make better decisions?
I'm not suggesting that we should avoid big data or that it has no place; however, more is not always better.
Think small!