The Missing Piece in Machine Learning: Quantifying Narratives in MVP Prediction

Christophe Brown

Published Jan 28, 2025

Machine learning thrives on structured data—numbers, categories, and trends we can measure. But what about the intangible factors? The narratives, perceptions, and stories that shape outcomes in ways data alone can’t capture?

As part of my project to predict the NBA MVP in real time, I found myself wrestling with this question. The stats were clear: metrics like Win Shares, Player Efficiency Rating (PER), and Box Plus/Minus (BPM) correlate strongly with MVP winners. Yet my models were still missing the mark on controversial, hotly debated MVP races.

Why? Because the MVP race isn’t just about the numbers. It’s about the story.

Why Narrative Matters

Consider Derrick Rose’s MVP season in 2011. Statistically, LeBron James had a stronger case. But Rose’s narrative—a breakout season, his role in turning the Chicago Bulls into contenders, and his likability as a player—captured voters’ imaginations. These factors don’t show up on a stat sheet.

That’s when I realized my model was missing a crucial element: sentiment. If I could quantify the narrative behind each MVP race, I could give my system the missing piece it needed to predict outcomes more accurately.

Integrating Sentiment into Machine Learning

To address this, I turned to GPT-4o—a large language model with the context and capacity to evaluate decades of NBA history. I iterated through 15 potential criteria for assessing MVP candidates, eventually refining them down to 5 that encapsulated key aspects of the narrative, such as:

Historic Significance: Did the player achieve something unprecedented?
Team Influence: How much did the player elevate their team’s performance?
Adversity Overcome: Did they play through challenges like injuries or tough circumstances?

Using GPT-4o, I generated sentiment scores for each MVP candidate from the past 40 years, creating new features to integrate into my model.

For each of the 5 criteria, I asked GPT-4o to rate every candidate as impartially as possible, instructing it to disregard who actually won the award. The criteria are viewable in my project GitHub repo:

MVP Sentiment Criteria on GitHub
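
To make the scoring step concrete, here is a minimal sketch of how such a rating request could be built and turned into model features. It is an illustration under assumptions, not the code from my repo: the 1 to 10 scale, the JSON reply format, the `sentiment_` feature prefix, and the `ask_llm` callable (a stand-in for whatever client sends the prompt to GPT-4o) are all hypothetical, and only the three criteria named above are included.

```python
import json
from typing import Callable, Dict

# Only the three criteria named in this article (the full set has five).
CRITERIA = {
    "historic_significance": "Did the player achieve something unprecedented?",
    "team_influence": "How much did the player elevate their team's performance?",
    "adversity_overcome": "Did they play through challenges like injuries or tough circumstances?",
}


def build_prompt(player: str, season: str) -> str:
    """Build a rating prompt that tells the model to ignore the real award result."""
    criteria_lines = [f"- {name}: {question}" for name, question in CRITERIA.items()]
    return (
        f"Rate {player}'s {season} NBA season on each criterion below "
        "from 1 (weak) to 10 (strong).\n"  # the 1-10 scale is an assumption
        "Ignore who actually won the MVP award that year.\n"
        "Reply with a JSON object mapping each criterion name to an integer.\n"
        + "\n".join(criteria_lines)
    )


def parse_scores(reply: str) -> Dict[str, int]:
    """Validate the model's JSON reply and return one feature per criterion."""
    raw = json.loads(reply)
    features = {}
    for name in CRITERIA:
        value = int(raw[name])
        if not 1 <= value <= 10:
            raise ValueError(f"{name} score out of range: {value}")
        features[f"sentiment_{name}"] = value
    return features


def sentiment_features(
    player: str, season: str, ask_llm: Callable[[str], str]
) -> Dict[str, int]:
    """ask_llm is a hypothetical callable that sends a prompt and returns the reply text."""
    return parse_scores(ask_llm(build_prompt(player, season)))
```

Keeping the model call behind a plain callable means the prompt and parsing logic can be checked with a canned reply before spending any API calls, and the resulting dictionary can be merged into the stat-based feature row for the same player and season.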

The Results

The impact was immediate. Without sentiment analysis, my models performed well, but struggled with contentious MVP seasons—Rose in 2011, Karl Malone over Michael Jordan in 1997, and others.

After incorporating sentiment scores, the model’s performance jumped. In testing, it achieved perfect accuracy in predicting past MVP winners. More importantly, it provided insights into why certain players were chosen, aligning with voter tendencies even in controversial years.
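
For readers wondering what "accuracy" means here, one natural reading is per-season accuracy: the model scores every candidate, the top-scored candidate in each season is the predicted winner, and a season counts as correct when that pick matches the actual MVP. The sketch below assumes that definition; the row layout and function names are hypothetical, and the seasons, players, and scores are placeholders rather than real results.

```python
from typing import Dict, Iterable, Tuple

# (season, player, predicted_score, won_mvp): a hypothetical row layout
Row = Tuple[str, str, float, bool]


def predicted_winners(rows: Iterable[Row]) -> Dict[str, str]:
    """Pick the highest-scored candidate in each season."""
    best: Dict[str, Tuple[str, float]] = {}
    for season, player, score, _ in rows:
        if season not in best or score > best[season][1]:
            best[season] = (player, score)
    return {season: player for season, (player, _) in best.items()}


def season_accuracy(rows: Iterable[Row]) -> float:
    """Fraction of seasons where the top-scored candidate is the actual winner."""
    rows = list(rows)
    picks = predicted_winners(rows)
    actual = {season: player for season, player, _, won in rows if won}
    hits = sum(picks[season] == winner for season, winner in actual.items())
    return hits / len(actual)


# Placeholder data, not real model output.
toy_rows = [
    ("Season 1", "Player A", 0.8, True),
    ("Season 1", "Player B", 0.6, False),
    ("Season 2", "Player C", 0.4, True),
    ("Season 2", "Player D", 0.7, False),
]
print(season_accuracy(toy_rows))
```

Scoring by season rather than by individual candidate row matters because most candidates never win, so row-level accuracy would look high even for a model that misses every contested race.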

Figure: 2023-2024 MVP Voting. The yellow line along the top indicates the winning player.

What This Means for Machine Learning

This experiment wasn’t just about predicting MVP winners. It was about bridging the gap between quantitative rigor and qualitative intuition.

Narrative-driven sentiment analysis has applications far beyond sports. Think of how customer reviews shape product recommendations, or how employee evaluations can be influenced by subjective feedback. By quantifying what’s typically unquantifiable, we can make machine learning models more robust, adaptable, and insightful.

Takeaways

Numbers Alone Aren’t Enough: Qualitative factors often hold the key to real-world decisions.
Iterate and Refine: The process of selecting and testing sentiment criteria was as important as the model itself.
AI Can Enhance Human Judgment: GPT-4o enabled my system to process decades of narrative-driven data more consistently and with less bias than a human could.

What’s Next?

This phase was pivotal in creating a real-time MVP prediction system. Next, I’ll be focusing on productionizing the model with containerization, cloud hosting, and monitoring to make the predictions dynamic and scalable.

What are your thoughts? Have you used sentiment analysis or narrative-driven features in your machine learning projects? I’d love to hear how you’ve tackled similar challenges!
