Design a Developer-First Vulnerability Platform for enterprises

Recently I was exploring GitHub Advanced Security and other applications such as Snyk to understand how they scan for CVE issues associated with my code and flag them in the CI/CD pipeline.

Most security tools are great at finding bugs but terrible at fixing them. In many organizations, security teams throw massive lists of CVEs at developers without any context. This leads to alert fatigue, where "Critical" often means nothing because the library isn't even used in production.

To fix this, we need a platform that is context-aware, event-driven, and outcome-oriented.

High-Level Architecture

The platform is designed as a distributed, event-driven pipeline. It treats every code push or new CVE discovery as a data event that must be enriched before it reaches a human.

  1. Push/PR event arrives -> persist scan_job in Postgres.
  2. Publish job to queue.
  3. Worker consumes job, builds dependency graph, emits findings.
  4. Enricher + scorer + ranking run as independent consumers.
  5. Dedup/suppression updates canonical finding state.
  6. Alert consumer creates/updates Jira/Slack.
  7. Dashboard reads aggregated state (MTTR/SLA/aging/trends).
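
The steps above can be sketched end-to-end in a few lines. This is a minimal in-memory sketch, with hypothetical names (a scan_jobs dict standing in for the Postgres table, job_queue for the broker); a real deployment would use a durable queue and database.

```python
import json
import queue
import uuid
from datetime import datetime, timezone

# Hypothetical in-memory stand-ins for Postgres and the message queue.
scan_jobs = {}          # job_id -> persisted scan_job row
job_queue = queue.Queue()

def handle_push_event(repo: str, commit: str) -> str:
    """Steps 1-2: persist a scan_job, then publish it to the queue."""
    job_id = str(uuid.uuid4())
    scan_jobs[job_id] = {
        "repo": repo,
        "commit": commit,
        "status": "queued",
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    job_queue.put(json.dumps({"event": "scan.requested", "job_id": job_id}))
    return job_id

def scan_worker():
    """Step 3: consume one job and emit raw findings downstream."""
    msg = json.loads(job_queue.get())
    job = scan_jobs[msg["job_id"]]
    job["status"] = "scanning"
    # ...build the dependency graph and match it against the vuln KB...
    findings = [{"event": "finding.detected", "package": "log4j-core@2.14.1"}]
    job["status"] = "done"
    return findings
```

Downstream consumers (enricher, scorer, dedup, alerting) would subscribe to the emitted events in the same way, each advancing the finding independently.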

How CVEs are captured

Simple matching flow

  • Build the dependency list from the lockfile/graph as coordinates:
      • ecosystem (npm, maven, pypi, etc.)
      • package name
      • resolved version
  • Normalize names (aliases, case, namespace differences).
  • Query the vuln KB (NVD/GHSA/OSV/internal) for each package.
  • Check version range logic:
      • vulnerable if the version is inside an affected range
      • safe if the version is in a fixed/patched range
  • If matched, create a finding: package@version + cve_id + repo/path + evidence

Example

If KB says:

  • package: log4j-core
  • affected: >=2.0, <2.17.1
  • fixed: 2.17.1

And your graph has log4j-core@2.14.1, it matches the CVE and becomes a finding.
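
The range check for this example can be sketched with a naive version comparison. This is illustrative only: real matchers must handle ecosystem-specific version schemes (Maven qualifiers, npm prereleases, epoch notation), and the function names here are assumptions.

```python
def parse(version: str) -> tuple:
    """Naive dotted-integer parse; real matchers use ecosystem-aware parsers."""
    return tuple(int(p) for p in version.split("."))

def is_vulnerable(version: str, introduced: str, fixed: str) -> bool:
    """Vulnerable if introduced <= version < fixed (an OSV-style half-open range)."""
    v = parse(version)
    return parse(introduced) <= v < parse(fixed)

# KB entry: affected >=2.0, <2.17.1; fixed: 2.17.1
assert is_vulnerable("2.14.1", "2.0", "2.17.1")      # in range -> finding created
assert not is_vulnerable("2.17.1", "2.0", "2.17.1")  # fixed version is safe
```

Note the half-open range: the fixed version itself is excluded from the affected set, which is why 2.17.1 does not match.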

Why is dedup required?

Why duplicates happen

  • The same CVE can come from multiple sources (e.g., NVD, GHSA, scanner databases).
  • The same repository may be scanned multiple times due to pushes, retries, or scheduled scans.
  • A single vulnerable transitive dependency can appear through multiple dependency paths.
  • Different services may report the same underlying issue in a single application.

What deduplication does

  • Merges repeated detections into a single canonical finding
  • Updates last_seen_at and increments an occurrence count instead of creating new records
  • Preserves full history of sightings while keeping the active issue list clean and actionable
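
A minimal sketch of that upsert, assuming a composite dedup key of repo + package@version + CVE. The key shape and field names are assumptions for illustration, not a prescribed schema.

```python
from datetime import datetime, timezone

findings = {}  # dedup_key -> canonical finding

def dedup_key(repo: str, package: str, version: str, cve_id: str) -> str:
    """Canonical identity: the same repo + package@version + CVE is one finding."""
    return f"{repo}|{package}@{version}|{cve_id}"

def upsert_finding(repo, package, version, cve_id, source):
    key = dedup_key(repo, package, version, cve_id)
    now = datetime.now(timezone.utc).isoformat()
    if key in findings:
        # Repeat sighting: update metadata instead of creating a new record.
        f = findings[key]
        f["last_seen_at"] = now
        f["occurrences"] += 1
        f["sources"].add(source)
    else:
        findings[key] = {
            "repo": repo, "package": package, "version": version,
            "cve_id": cve_id, "first_seen_at": now, "last_seen_at": now,
            "occurrences": 1, "sources": {source}, "state": "open",
        }
    return findings[key]

upsert_finding("org/app", "log4j-core", "2.14.1", "CVE-2021-44228", "NVD")
f = upsert_finding("org/app", "log4j-core", "2.14.1", "CVE-2021-44228", "GHSA")
print(f["occurrences"])  # 2
```

Two reports from different sources collapse into one canonical finding with occurrences incremented, so the active issue list stays clean while the sighting history survives.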


Flow diagram:

[Diagram: vuln scanner flow]

Event flow states:

  • Webhook emits job event (scan.requested)
  • Scan worker parses deps and emits raw findings (finding.detected)
  • Enricher adds context (finding.enriched)
  • Scoring computes risk/confidence (finding.scored)
  • Policy decides SLA + action (finding.prioritized)
  • Fan-out consumers act:
      • alert service (finding.alerted)
      • ticket service (finding.ticket.created)
      • remediation bot (finding.remediation.pr.opened)
  • After fix/rescan: finding.remediated, then finding.closed
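
One way to keep these stages honest is a small state machine over the event names above: each consumer rejects out-of-order events, which keeps replays and retries idempotent. A sketch, where the transition table is an assumption drawn from the list above:

```python
# Allowed transitions for the finding lifecycle (table is illustrative).
TRANSITIONS = {
    "finding.detected": {"finding.enriched"},
    "finding.enriched": {"finding.scored"},
    "finding.scored": {"finding.prioritized"},
    "finding.prioritized": {"finding.alerted", "finding.ticket.created",
                            "finding.remediation.pr.opened"},
    "finding.alerted": {"finding.remediated"},
    "finding.ticket.created": {"finding.remediated"},
    "finding.remediation.pr.opened": {"finding.remediated"},
    "finding.remediated": {"finding.closed"},
}

def advance(current: str, event: str) -> str:
    """Apply an event only if it is legal from the current state."""
    if event not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {event}")
    return event

state = "finding.detected"
for e in ["finding.enriched", "finding.scored", "finding.prioritized",
          "finding.ticket.created", "finding.remediated", "finding.closed"]:
    state = advance(state, e)
print(state)  # finding.closed
```

A replayed or duplicated event (say, a second finding.scored after prioritization) raises instead of corrupting the finding's state.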

Note:

To keep the article concise, I have not outlined the database schema and the design choices I made. This is intended to provide a bird’s-eye view of the problem and the possible solution paths that we typically explore and use in our day-to-day development work.

References:

  • Wiz provides good tooling and applications around this problem
  • GitHub Advanced Security is another good tool, integrated directly into GitHub


