Data Science Landscape

Ling Zhang

Published Jan 5, 2016

Business Applications vs. Sciences

Every data science project should be driven by a business problem: data science serves an organization by answering its business questions and informing strategy in the decision-making process. Business problems can be classified as forecasting, classification or prediction, segmentation, association, and summarization. Related applications include survival analysis of any type, customer retention, scoring, rare event identification or fraud detection, customer targeting by segment, recommendation systems, process optimization, topic identification, relationship mining, and sentiment analysis.

Data science algorithms, or machine learning methods, fall into two fundamental categories: supervised and unsupervised. The most common supervised learning methods include regression (logistic regression, multiple linear regression, ridge regression, etc.), decision trees, neural networks, and support vector machines. Unsupervised learning includes clustering and principal component analysis. These methods can be combined to solve more complicated problems. Examples are ensemble methods, such as random forests built on many trees or boosting built on a sequence of weak learners, and semi-supervised learning, which combines supervised and unsupervised methods. More recently, deep learning has become popular; it embeds learning algorithms and workflows within a single learning process to deliver optimized solutions.

The chart below maps business problems to types of learning methods; it is not a mapping from a specific business application to a specific scientific method. The right method should be chosen according to the specific business problem and the final performance metric.
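To make the idea of combining simple models more concrete, here is a minimal sketch in plain Python. It trains several one-level decision trees ("stumps") on bootstrap resamples of a toy dataset and lets them vote. This is bagging, a simplified relative of a random forest (a real random forest also samples features at random and grows deeper trees). The data, the function names, and the default of 25 models are all assumptions made for illustration.

```python
import random
from collections import Counter

# Toy labeled data (hypothetical): (feature value, class label 0 or 1).
DATA = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0),
        (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]


def fit_stump(points):
    """Fit a one-level decision tree: predict 1 if value > threshold, else 0.

    Returns the candidate threshold with the fewest training errors.
    """
    values = sorted({v for v, _ in points})
    # Candidates are midpoints between neighbouring distinct values; fall back
    # to the single value itself if the sample has only one distinct value.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or values

    def errors(threshold):
        return sum((v > threshold) != bool(label) for v, label in points)

    return min(candidates, key=errors)


def fit_bagged_stumps(points, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    return [
        fit_stump([rng.choice(points) for _ in points])
        for _ in range(n_models)
    ]


def predict(thresholds, value):
    """Majority vote over all stumps (use an odd n_models to avoid ties)."""
    votes = Counter(int(value > t) for t in thresholds)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print("single stump threshold:", fit_stump(DATA))
    ensemble = fit_bagged_stumps(DATA)
    for x in (0.5, 3.0, 7.0, 9.5):
        print(x, "->", predict(ensemble, x))
```

Each stump sees a slightly different sample, so the thresholds differ from model to model; the vote averages out those differences, which is the main reason ensembles tend to be more stable than any single tree.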

Predictive Analytics

Predictive analytics is a subarea of data science that focuses on prediction. It usually progresses from low-level to high-level analysis. Consider a patient who goes to a doctor's office. First, the doctor tries to understand what happened, and the patient describes the symptoms of the illness. Then the doctor examines what happened to the patient and may also say what will happen next or which symptoms may follow. Finally, the doctor provides a prescription. Predictive analytics follows the same sequence of steps.

In business, we start with historical data and establish what happened: for example, which transactions were fraudulent, what patterns those transactions share, and why and when the fraud occurred (causal analysis plus time-aware analysis). The result is that we can tell the story behind those transactions. The analysis does not stop there, however. The greatest value comes from preventing fraudulent transactions in the future and deciding what actions to take to stop them. That requires developing predictive or forecasting models and embedding them in the transaction process in real time. Whenever the model assigns a transaction a fraud score above a certain threshold (defined by business criteria), an alert is generated and the transaction is stopped immediately.

The whole process starts with raw data and moves on to identify valuable information, gain deeper knowledge, draw business insight, and finally optimize strategic decisions. It goes from hindsight, which only looks back at what happened, to insight, which provides clues about what causes problems, to foresight, which tells us what will happen and what to do. The chart below shows the big picture of predictive analytics.
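The real-time step described above, where a transaction is stopped once its fraud score exceeds a business-defined threshold, might look roughly like the sketch below. Everything in it is hypothetical: the `Transaction` fields, the `toy_fraud_score` function (a hand-written stand-in for a trained model), its weights, and the 0.7 threshold are assumptions chosen only to show the control flow.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Transaction:
    transaction_id: str
    amount: float
    country_mismatch: bool  # hypothetical feature, for illustration only


@dataclass(frozen=True)
class Decision:
    transaction_id: str
    score: float
    blocked: bool


def toy_fraud_score(tx: Transaction) -> float:
    """Stand-in for a trained model; returns a score between 0 and 1.

    The weights are made up. A real score would come from a model fitted
    on historical transactions.
    """
    score = 0.6 * min(tx.amount / 10_000, 1.0)
    if tx.country_mismatch:
        score += 0.4
    return score


def screen(
    tx: Transaction,
    score_fn: Callable[[Transaction], float],
    threshold: float = 0.7,  # assumed value; set by business criteria
) -> Decision:
    """Score one transaction and block it if the score exceeds the threshold."""
    score = score_fn(tx)
    blocked = score > threshold
    if blocked:
        print(f"ALERT: transaction {tx.transaction_id} stopped (score {score:.2f})")
    return Decision(tx.transaction_id, score, blocked)


if __name__ == "__main__":
    for tx in (
        Transaction("t1", 9_000.0, True),
        Transaction("t2", 120.0, False),
    ):
        print(screen(tx, toy_fraud_score))
```

Passing the scoring function in as a parameter keeps the business rule (the threshold and the blocking action) separate from the model, so either can be changed without touching the other.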

Analytics Tools

This section summarizes a report on the top analytics tools used in 2014 and 2015 and how they compare. The report comes from the KDnuggets newsletter. R ranked #1 in 2015, followed by RapidMiner, SQL, and Python.

Big data tools – In 2015, Hadoop was #1, followed by Spark, Hive, and SQL.

Programming languages – Python was #1, followed by Java, C++, etc.

Analytics by industry – The top industries using analytics in 2014 were CRM, banking, health care or HR, and fraud detection.
