Building an AI-Agent Based Python Security Scanner

The Problem 

Manual security scanning of Python dependencies is time-consuming and error-prone. Developers often skip vulnerability checks during development, leading to security debt that compounds over time. Existing tools either lack intelligence or require complex setup and maintenance. 

I needed a solution that could automatically scan Python packages, identify vulnerabilities, and provide actionable recommendations without requiring extensive security expertise. Ideally, even casual "vibe coders" should be able to scan every package in their project with minimal setup.

The Solution: An AI Agent-Based Scanner 

An AI agent-based security scanner that uses intelligent agents to analyze Python dependencies and identify vulnerabilities. It combines multiple data sources with AI reasoning to provide comprehensive security analysis. 

Architecture Overview 

[Figure: Architecture overview of the Python vulnerability scanner]

Technical Implementation 

Core Components: 

  1. Dependency Parser: Handles both requirements.txt and environment.yml files 
  2. AI Agent: Uses Hugging Face models via smolagents framework 
  3. Multi-Tool Integration: Custom tools for PyPI, GitHub, and vulnerability scanning 
  4. Structured Output: Consistent reporting across different output formats 
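
The dependency parser is mostly line-oriented string handling. As a rough sketch (the function and regex names here are illustrative, not the project's actual API), turning a requirements.txt into a name-to-version map might look like this; environment.yml is handled similarly by reading the entries under its pip: key:

```python
import re
from typing import Dict, Optional

# Matches "name" or "name==version"; other specifiers fall back to None.
REQ_LINE = re.compile(r"^\s*([A-Za-z0-9_.-]+)\s*(?:==\s*([A-Za-z0-9_.-]+))?")

def parse_requirements(text: str) -> Dict[str, Optional[str]]:
    """Map package name -> pinned version (None if unpinned)."""
    deps = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line or line.startswith("-"):  # skip options like -r, -e
            continue
        m = REQ_LINE.match(line)
        if m:
            deps[m.group(1).lower()] = m.group(2)
    return deps

print(parse_requirements("requests==2.32.4\npandas\n# comment\n"))
# {'requests': '2.32.4', 'pandas': None}
```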

Key Technologies: 

  • Python 3.11+ 
  • Hugging Face smolagents framework 
  • PyPI API 
  • GitHub API 
  • OSV (Open-Source Vulnerabilities) database 
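
Of these, OSV is the piece that answers "is this version vulnerable?". Its public /v1/query endpoint takes a package name, ecosystem, and version; a minimal client looks like the sketch below (the endpoint is OSV's real API, but the function names are mine, not the project's):

```python
import json
import urllib.request
from typing import Any, Dict

OSV_URL = "https://api.osv.dev/v1/query"  # OSV's public query endpoint

def build_osv_query(name: str, version: str) -> Dict[str, Any]:
    """Payload for OSV's /v1/query endpoint (PyPI ecosystem)."""
    return {"package": {"name": name, "ecosystem": "PyPI"}, "version": version}

def query_osv(name: str, version: str) -> Dict[str, Any]:
    """POST the query; the response contains a 'vulns' list when issues exist."""
    data = json.dumps(build_osv_query(name, version)).encode()
    req = urllib.request.Request(
        OSV_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```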

Implementation Challenges 

Challenge 1: Inconsistent AI Output 

The AI agent produced different output structures across runs: 

python 

# Sometimes: 
{'vulnerable_packages': {'pkg': [...]}} 
 
# Other times: 
{'executive_summary': {'vulnerable_packages': {'pkg': [...]}}}        

For anyone interested in learning more about AI agents, I recommend the Hugging Face Agents Course.

Solution: Use Agent Tools

I implemented a robust FinalAnswerTool with structured validation. 

python 

from typing import Any, Dict

from smolagents import Tool

class FinalAnswerTool(Tool):
    def forward(self, answer: Any) -> Dict[str, Any]:
        # Always return the same shape, whatever the agent produced
        standardized = {
            "vulnerable_packages": {},
            "upgrade_recommendations": {},
            "overall_risk_assessment": "No assessment provided",
        }
        if not isinstance(answer, dict):
            standardized["overall_risk_assessment"] = str(answer)
            return standardized

        # Handle multiple possible structures
        if "vulnerable_packages" in answer:
            standardized["vulnerable_packages"] = answer["vulnerable_packages"]
        elif isinstance(answer.get("executive_summary"), dict):
            summary = answer["executive_summary"]
            standardized["vulnerable_packages"] = summary.get("vulnerable_packages", {})

        # Preserve any other top-level fields the agent did supply
        for key in ("upgrade_recommendations", "overall_risk_assessment"):
            if key in answer:
                standardized[key] = answer[key]

        return standardized

Challenge 2: Multi-Source Data Integration 

Different APIs return vulnerability data in incompatible formats. 

Solution: Use a Parser.

Created unified data models and source-specific parsers that normalize data into a consistent structure.
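
As a sketch of that normalization (the field names and record shapes below are simplified assumptions, not the project's actual models), each source gets its own parser that emits one shared dataclass:

```python
from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class Vulnerability:
    vuln_id: str
    severity: str
    summary: str
    fixed_version: str

def parse_osv(record: Dict[str, Any]) -> Vulnerability:
    """Normalize one OSV record (simplified shape)."""
    fixed = ""
    for rng in record.get("affected", [{}])[0].get("ranges", []):
        for event in rng.get("events", []):
            fixed = event.get("fixed", fixed)  # last 'fixed' event wins
    return Vulnerability(
        vuln_id=record.get("id", ""),
        severity=record.get("database_specific", {}).get("severity", "UNKNOWN"),
        summary=record.get("summary", ""),
        fixed_version=fixed,
    )

def parse_github_advisory(adv: Dict[str, Any]) -> Vulnerability:
    """Normalize one GitHub security advisory (simplified shape)."""
    return Vulnerability(
        vuln_id=adv.get("ghsa_id", ""),
        severity=adv.get("severity", "unknown").upper(),
        summary=adv.get("summary", ""),
        fixed_version=adv.get("first_patched_version", ""),
    )
```

Downstream code (formatters, the AI agent's tools) only ever sees Vulnerability objects, so adding a new source means adding one parser, nothing else.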

Usage Example 

bash 

# Install dependencies 
pip install motionstream 
 
# Scan requirements file 
motionstream scan requirements.txt 
 
# Generate JSON report 
motionstream scan requirements.txt --output json 
 
# Generate HTML report 
motionstream scan environment.yml --output html        

Sample Output 

🔒 MotionStream Security Report 
 
📦 Scanned 3 packages: 
   ✓ requests 2.32.4 
   ✓ pandas None 
   ❌ browser-use 0.1.44 - HIGH vulnerabilities found 
 
🔍 Security Issues Found: 
 
⚠ HIGH: browser-use 0.1.44 has file access vulnerabilities 
  Impact: package 
  Fix: pip install browser-use>=0.2.6 
 
📊 Summary: 1 vulnerabilities found (0 Critical, 1 High) 
🎯 Recommendation: Update vulnerable packages immediately         
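
The summary line in the report above is straightforward to reproduce. A minimal formatter that counts findings by severity might look like this (the function name and findings shape are assumptions for illustration):

```python
from typing import Dict, List

SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]

def summarize(findings: List[Dict[str, str]]) -> str:
    """Build the report's summary line from a list of findings."""
    counts = {sev: 0 for sev in SEVERITY_ORDER}
    for f in findings:
        sev = f.get("severity", "LOW").upper()
        if sev in counts:
            counts[sev] += 1
    total = sum(counts.values())
    return (f"{total} vulnerabilities found "
            f"({counts['CRITICAL']} Critical, {counts['HIGH']} High)")

print(summarize([{"severity": "HIGH"}]))
# 1 vulnerabilities found (0 Critical, 1 High)
```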

Results and Impact 

  • Accuracy: Successfully identified all known vulnerabilities in test cases.
  • Performance: Average scan time of 15-30 seconds for typical Python projects.
  • Usability: Color-coded terminal output with clear remediation steps.
  • Flexibility: Multiple output formats for different use cases.

Lessons Learned 

  1. AI Output Consistency: Always validate and structure AI agent outputs 
  2. Error Handling: Robust fallback mechanisms are essential when dealing with external APIs 
  3. User Experience: Clear, actionable output is more valuable than comprehensive technical details 
  4. Modular Design: Separating tools, parsers, and formatters enables easier testing and maintenance 
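
On the error-handling point, the fallback pattern I mean is retry-with-backoff that degrades gracefully instead of crashing the scan. A sketch (not the project's actual code):

```python
import time
import urllib.error
import urllib.request
from typing import Optional

def fetch_with_retry(url: str, attempts: int = 3,
                     backoff: float = 1.0) -> Optional[bytes]:
    """Retry transient failures; return None so callers can skip the source."""
    for i in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                return resp.read()
        except (urllib.error.URLError, TimeoutError):
            if i == attempts - 1:
                return None  # graceful degradation after the last attempt
            time.sleep(backoff * (2 ** i))  # exponential backoff: 1s, 2s, 4s...
    return None
```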

Future Enhancements 

Immediate Next Steps: 

  • CI/CD pipeline integration (GitHub Actions, GitLab CI)
  • Support for additional package managers (npm, Maven, Composer)
  • Vulnerability trend analysis and risk scoring
  • Code-based API and configuration scanning

Long-term Vision: 

  • Machine learning models for false positive reduction 
  • Integration with security information and event management (SIEM) systems 
  • Automated pull request generation for vulnerability fixes 

Technical Specifications 

System Requirements: 

  • Python 3.11 or higher 
  • Hugging Face Token (free tier sufficient) 
  • Internet connection for API access 

Performance Metrics: 

  • Memory usage: ~200MB baseline 
  • Network requests: 2-5 per package 
  • Processing time: 5-10 seconds per package 

Conclusion 

This project demonstrates how AI agents can effectively automate security scanning while maintaining accuracy and usability. The key success factors were handling AI output inconsistency, building robust error handling, and focusing on developer experience. 

The project proves that combining multiple data sources with AI reasoning can produce more intelligent security analysis than traditional rule-based approaches. Source code and documentation available on GitHub: https://github.com/callezenwaka/motionstream. 

By Callis Ezenwaka, PMP®, CISSP