The Process of Building an Explainable Fraud Detection System Using ML and Streamlit
Following my previous project, 'Building a Fraud Detection System Using XGBoost', I wanted to go further and build a dashboard on Streamlit. To simulate tackling the risk of fraudulent refund abuse, I built an end-to-end machine learning dashboard for fraudulent user detection, using a synthetic Kaggle dataset that mimics this behaviour.
This project included data preprocessing, feature engineering, model training, explainability through SHAP, and deployment with Streamlit. It is important to reflect critically on the process, especially the dataset's limitations, and to highlight the project's genuine strengths.
Shortcomings and Challenges
1. Synthetic Nature of the Dataset
The dataset was artificially generated rather than drawn from real-world customer behaviour, which introduced several limitations:
Predictability: The data patterns were simpler and cleaner than what real-world data would present.
Perfect Accuracy: The model achieved an accuracy of 1.0, which is not realistic in production environments where user behaviour is noisy and evolving.
Lack of Outliers: Real fraud often involves rare, outlier behavior, which synthetic datasets struggle to replicate.
2. Data Quality Issues
Before modelling, significant inconsistencies were identified:
Return Dates Earlier than Order Dates: Logically invalid records, which required removal.
Loss of Balance: Removing invalid entries slightly skewed the balanced nature of the original dataset, although not dramatically.
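The date-validity check above can be sketched with pandas. The column names (`Order_Date`, `Return_Date`) are hypothetical stand-ins for whatever the real dataset uses:

```python
import pandas as pd

# Hypothetical schema; the actual dataset's column names may differ.
df = pd.DataFrame({
    "Order_Date": pd.to_datetime(["2024-01-05", "2024-01-10", "2024-01-20"]),
    "Return_Date": pd.to_datetime(["2024-01-12", "2024-01-08", "2024-01-25"]),
})

# Drop logically invalid rows where the return precedes the order.
valid = df[df["Return_Date"] >= df["Order_Date"]].reset_index(drop=True)
print(len(valid))  # 2 — the one impossible record is removed
```

After a removal like this, it is worth re-checking the class balance, since invalid records may not be evenly distributed across fraud and non-fraud labels.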
3. Simplified Feature Relationships
Although feature engineering was done, the relationships between the features and fraud were more linear than would be expected in reality.
4. Streamlit Deployment Constraints
While Streamlit enabled rapid deployment, it also posed some challenges.
Strengths and Achievements
Despite these limitations, the project has several important strengths worth celebrating:
1. End-to-End Pipeline
Built a full system from raw data ingestion ➔ cleaning ➔ feature generation ➔ model prediction ➔ visualisation.
Automated the fraud detection pipeline.
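The pipeline stages above can be sketched with scikit-learn. This is a minimal illustration, with a gradient-boosted classifier standing in for the project's XGBoost model and randomly generated features standing in for the engineered ones:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in for the engineered feature matrix and fraud labels.
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Cleaning/feature steps feed a single Pipeline so the whole flow is automated.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", GradientBoostingClassifier(random_state=0)),
])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
pipe.fit(X_tr, y_tr)
scores = pipe.predict_proba(X_te)[:, 1]  # fraud probabilities for the dashboard
```

Keeping every step inside one `Pipeline` object means the same transformations are applied at training time and at prediction time, which is what makes the end-to-end flow reproducible.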
2. Feature Engineering from Domain Intuition
Created meaningful features like Days_to_Return_Corrected, Suspicious_Score, and High_Returner_Flag based on logical business behaviour.
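A minimal sketch of how such features could be derived is below. The thresholds and the `Suspicious_Score` formula are illustrative assumptions, not the project's exact definitions:

```python
import pandas as pd

# Hypothetical schema; the real dataset's columns may differ.
df = pd.DataFrame({
    "User_ID": [1, 1, 2, 2, 2],
    "Order_Date": pd.to_datetime(
        ["2024-01-01", "2024-02-01", "2024-01-03", "2024-01-05", "2024-01-07"]),
    "Return_Date": pd.to_datetime(
        ["2024-01-03", "2024-02-20", "2024-01-04", "2024-01-06", "2024-01-08"]),
})

# Days between order and return, clipped at zero to guard against bad records.
df["Days_to_Return_Corrected"] = (
    (df["Return_Date"] - df["Order_Date"]).dt.days.clip(lower=0)
)

# Flag users whose return count exceeds a chosen threshold (here: > 2 returns).
returns_per_user = df.groupby("User_ID")["Return_Date"].transform("count")
df["High_Returner_Flag"] = (returns_per_user > 2).astype(int)

# Illustrative composite: fast returns by frequent returners look riskier.
df["Suspicious_Score"] = df["High_Returner_Flag"] * (
    1 / (1 + df["Days_to_Return_Corrected"])
)
```

The value of features like these is that they encode business intuition (rapid, repeated returns are suspicious) in a form the model can learn from directly.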
3. Model Explainability Integrated
Integrated SHAP so that individual fraud predictions come with per-feature explanations rather than being black-box scores, feeding the user-level explainability view in the dashboard.
4. Professional Dashboard Experience
Designed a clean, tabbed Streamlit app allowing easy exploration of KPIs, fraud scores, and detailed user explainability.
5. Problem Solving and Adaptability
Going forward, applying similar methodologies to real, messy data would introduce new challenges such as handling concept drift, building feedback loops, threshold tuning, and minimising false positives. Embracing these complexities will be key to building robust, production-ready fraud detection systems.
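As one concrete example of those challenges, threshold tuning against a false-positive budget can be sketched with scikit-learn's precision-recall curve. The scores and the precision target here are made up for illustration:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Hypothetical held-out labels and model scores; in production these
# would come from a validation set scored by the live model.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0, 1, 0])
scores = np.array([0.1, 0.2, 0.15, 0.4, 0.8, 0.9, 0.7, 0.3, 0.6, 0.35])

precision, recall, thresholds = precision_recall_curve(y_true, scores)

# Pick the lowest threshold that keeps precision at or above a target,
# trading some recall for fewer false positives (wrongly flagged users).
target_precision = 0.9
ok = precision[:-1] >= target_precision
chosen = thresholds[ok].min() if ok.any() else thresholds.max()
```

On real data this threshold would need periodic re-tuning as behaviour drifts, which is exactly where the feedback loops mentioned above come in.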