Decoding Data Bias in Machine Learning: Key Takeaways from CSCI S-184, Harvard Summer School Class in the Trenches of AI Ethics

In machine learning and data science, ensuring fairness and avoiding bias is paramount. As we navigate the complexities of this field, the Harvard Summer School class on Data Science and AI Ethics, Regulations, and Laws has sparked valuable discussions on these topics. This post draws on one of those insightful dialogues to shed light on the nuanced aspects of data bias in machine learning.

What is Bias in Machine Learning?

In machine learning, bias is a model's systematic inclination toward certain outcomes or predictions, often stemming from the data used to train it. It's important to note that not all biases are harmful or unwanted; a bias may simply reflect the non-uniform distributions found in real-world datasets. The key is recognizing and understanding these biases and ensuring they do not introduce unwarranted prejudice or lead to incorrect conclusions.

Understanding the Role of Bias

Take, for instance, a case study discussed in the CSCI S-184 class. We examined a study aiming to identify the drivers of diabetes in a region where the average Body Mass Index (BMI) is higher than the national average. Here, a dataset with a higher proportion of individuals with a BMI above 25 would be representative of the population being studied, and therefore appropriate.

However, if the study aims to understand diabetes drivers more generally across populations with varying BMIs, then a dataset mostly comprised of individuals with high BMI could introduce bias. In this scenario, the study's results may only apply to populations with a higher average BMI, potentially overlooking factors more relevant in populations with lower BMIs.
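To make the contrast concrete, here is a minimal sketch of that sampling effect. The numbers and distribution below are hypothetical illustrations, not figures from the study discussed in class: we draw a synthetic population of BMI values, then compare a sample recruited only from high-BMI individuals against a uniformly random one.

```python
import random

random.seed(0)

# Hypothetical population: BMI values drawn from a bell-shaped distribution.
population = [random.gauss(24, 4) for _ in range(10_000)]

# A skewed sample that only recruits high-BMI individuals,
# e.g. from a region whose average BMI exceeds the national average.
skewed_sample = [b for b in population if b > 25][:1000]

# A representative sample drawn uniformly at random from the same population.
representative_sample = random.sample(population, 1000)

def share_over_25(bmis):
    """Fraction of individuals with BMI above 25."""
    return sum(b > 25 for b in bmis) / len(bmis)

print(f"population:     {share_over_25(population):.2f}")
print(f"skewed sample:  {share_over_25(skewed_sample):.2f}")  # 1.00 by construction
print(f"representative: {share_over_25(representative_sample):.2f}")
```

Any conclusion drawn from the skewed sample only describes the high-BMI subgroup; the representative sample tracks the population's actual mix, which is exactly the property the class discussion calls for.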

Going Beyond Sex, Gender, and Race

Our class discussions have underlined that fairness in machine learning isn't only about demographic factors like sex, gender, and race. Any variable that can lead to unfair representation or treatment can be a source of bias. This includes age, BMI, income, education level, and more.

The goal should be to strive for a representative sample that captures the diversity and variation of the population you are studying. By doing so, we can ensure the model's predictions are accurate and fair across different groups of people.
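One lightweight way to act on this advice is to compare group proportions in your sample against the population and flag over- or under-representation before training. The sketch below uses hypothetical age-band records purely for illustration; the same check applies to BMI bands, income brackets, education levels, and so on.

```python
from collections import Counter

def group_shares(records, key):
    """Proportion of records falling in each group defined by `key`."""
    counts = Counter(key(r) for r in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def representation_gaps(sample, population, key):
    """Sample share minus population share per group
    (positive = over-represented in the sample)."""
    pop_shares = group_shares(population, key)
    sample_shares = group_shares(sample, key)
    return {g: sample_shares.get(g, 0.0) - share for g, share in pop_shares.items()}

# Hypothetical records: population vs a sample that under-recruits younger people.
population = ([{"age_band": "18-39"}] * 500
              + [{"age_band": "40-64"}] * 350
              + [{"age_band": "65+"}] * 150)
sample = ([{"age_band": "18-39"}] * 20
          + [{"age_band": "40-64"}] * 50
          + [{"age_band": "65+"}] * 30)

gaps = representation_gaps(sample, population, key=lambda r: r["age_band"])
for group, gap in sorted(gaps.items()):
    print(f"{group}: {gap:+.2f}")
```

A large gap for any group is a cue to re-sample, re-weight, or at least scope the study's claims to the groups the data actually covers.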

Understanding and accounting for bias in machine learning is more than a theoretical exercise; it directly shapes the accuracy and fairness of our models. By scrutinizing the datasets we train on, we can build models that are both accurate and fair.

These rich discussions in the Harvard Summer School class underscore how important it is to weigh the design and composition of your dataset when conducting a study. Attending to these factors moves us one step closer to fairness in machine learning.

Missed out on this summer's exploration into AI ethics? Don't worry; the quest for unraveling the nuances of data bias in machine learning isn't over yet. CSCI S-184 is coming back in Spring 2024. Grab your spot and join us on this exciting journey to decode the ethics, regulations, and laws in Data Science and AI.
