From COBOL to Python with Claude Code + GSD: what actually works

AI will not magically modernize your COBOL estate. But used correctly, it can dramatically compress discovery, improve consistency, and accelerate validated migration to Python.

There is a version of this story that keeps circulating in engineering circles: point an AI at a mainframe codebase, describe the target state, and walk away while it modernizes decades of business logic into clean Python.

That is not what happens.

But what does happen can be genuinely valuable — if the work is structured correctly.

The real opportunity is not one-shot code conversion. It is using AI to reduce the time it takes to understand legacy systems well enough to migrate them safely.

The problem nobody talks about

COBOL survives because it is trusted, not because it is pleasant to work with.

These systems often encode 30 to 40 years of business decisions, regulatory edge cases, file contracts, numeric precision rules, and workflows that “just work.” In many cases, the original authors are long gone, documentation is incomplete, and the code itself has become the only source of truth.

That is why the hard part of migration has never been syntax translation.

The hard part is reconstructing intent.

That untouched EVALUATE block may be handling a regulatory exception from 1989. A COMP-3 packed decimal field may be carrying business-critical numeric behavior that cannot be approximated or guessed. A fixed-width input record may look simple until one overlooked position offset causes downstream reconciliation issues.
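To make the COMP-3 point concrete, here is a minimal sketch of decoding an IBM packed-decimal field into an exact Python Decimal. The field layout and scale are illustrative assumptions; real copybooks define them per field.

```python
from decimal import Decimal

def unpack_comp3(data: bytes, scale: int = 0) -> Decimal:
    """Decode an IBM COMP-3 (packed decimal) field into a Decimal.

    Each byte holds two BCD digits; the low nibble of the final
    byte is the sign (0xD = negative, 0xC or 0xF = positive).
    `scale` is the number of implied decimal places (the V in PIC).
    """
    digits = []
    for byte in data[:-1]:
        digits.append(str(byte >> 4))
        digits.append(str(byte & 0x0F))
    last = data[-1]
    digits.append(str(last >> 4))
    sign = "-" if (last & 0x0F) == 0xD else ""
    return Decimal(sign + "".join(digits)).scaleb(-scale)

# PIC S9(5)V99 COMP-3: the value -12345.67 packs into 4 bytes
assert unpack_comp3(bytes([0x12, 0x34, 0x56, 0x7D]), scale=2) == Decimal("-12345.67")
```

The point is that this behavior cannot be approximated: one wrong nibble or a float conversion changes the number.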

Legacy modernization fails when teams underestimate this reality. The challenge is not converting one language into another. It is preserving behavior that has been embedded in production systems for decades.

What Claude Code actually changes

Used well, Claude Code is not a magic COBOL-to-Python compiler. It is a very strong intent extraction and modernization assistant.

Instead of pasting fragments into a chat window, teams can work directly against real files in a repository: COBOL programs, copybooks, control files, and JCL. That changes the workflow significantly.

Claude Code can help teams:

  • read COBOL programs and trace copybook dependencies
  • map PIC fields into modern Python types
  • summarize business rules in plain English
  • flag undocumented assumptions and suspicious logic
  • generate first-pass Python equivalents
  • scaffold tests around edge cases and branches
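The PIC-to-Python-type mapping mentioned above can be sketched as a small rule table. The helper below is hypothetical and handles only simple clauses; real copybooks have OCCURS, REDEFINES, and usage clauses that need more care.

```python
from decimal import Decimal

def pic_to_python(pic: str):
    """Map a simple COBOL PIC clause to a target Python type.

    Illustrative only: covers alphanumeric (X), implied-decimal
    numeric (V), and plain numeric (9) clauses.
    """
    pic = pic.upper()
    if "X" in pic:
        return str        # alphanumeric -> text
    if "V" in pic:
        return Decimal    # implied decimal point -> exact decimal, never float
    if "9" in pic:
        return int        # integer numeric
    raise ValueError(f"unsupported PIC clause: {pic}")

assert pic_to_python("PIC 9(7)V99") is Decimal
assert pic_to_python("PIC X(30)") is str
```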

That alone is a major shift.

What used to require weeks of manual discovery can often be compressed into days. Not because AI eliminates complexity, but because it accelerates one of the slowest parts of the project: getting from “we do not fully understand this system” to “we have a structured picture of what this program actually does.”

That compression is where the value begins.

The problem at scale: context rot

A single COBOL program is manageable.

A portfolio of 50, 80, or 100 batch jobs is a different problem entirely.

This is where many AI-assisted migration demos stop being representative of reality. Quality often degrades as work scales. Instructions drift. Naming conventions become inconsistent. Later outputs inherit noise and assumptions from earlier tasks. Program number 40 receives a worse experience than program number 1.

That is the context problem.

For large-scale migrations, context management becomes just as important as code generation.

Why GSD matters

This is where GSD becomes interesting. GSD (“Get S*** Done”) is a spec-driven workflow layer on top of Claude Code that breaks large engineering efforts into bounded, verifiable tasks executed with fresh context.

The real value of GSD is not just orchestration. It is discipline.

By structuring migration work into fresh, bounded, spec-driven units, GSD helps preserve consistency across a large portfolio. Each program can be treated as an atomic migration task with its own scope, inputs, outputs, validation criteria, and completion definition.

That matters because large modernization efforts do not fail only on correctness. They also fail on inconsistency.

If every migration unit is handled with the same structure, the same standards, and the same validation pattern, quality becomes more repeatable. That makes the overall approach far more viable at scale.

What a real migration workflow looks like

In practice, the most effective pattern looks less like wholesale auto-conversion and more like structured, validation-first modernization.

A practical workflow looks something like this:

1. Map the codebase

Start by inventorying the estate:

  • COBOL programs
  • copybooks
  • file dependencies
  • job relationships
  • JCL flows
  • inputs and outputs


Before writing Python, understand what exists.
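A first-pass inventory can be as simple as counting artifact types in the source tree. The extension mapping below is an assumption; mainframe exports name files differently from shop to shop.

```python
from collections import Counter
from pathlib import Path

# Hypothetical extension conventions for an exported source tree.
EXTENSIONS = {".cbl": "COBOL program", ".cpy": "copybook", ".jcl": "JCL job"}

def inventory(root: str) -> Counter:
    """Count migration-relevant artifacts under a source tree."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        kind = EXTENSIONS.get(path.suffix.lower())
        if kind:
            counts[kind] += 1
    return counts
```

Even a crude count like this frames the scope conversation: 12 programs is a project, 120 is a program of work.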

2. Break work into atomic units

Treat each program or business function as its own migration task. Define:

  • what will be converted
  • what files are involved
  • what business rules must be preserved
  • what test strategy will prove equivalence


3. Convert with explicit rules

This is where discipline matters. For example:

  • PIC 9(7)V99 should become Decimal, never float
  • packed decimal handling must be explicit
  • fixed-width record parsing must preserve exact positional behavior
  • branch logic should be covered with generated tests
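The rules above translate directly into parsing code. Here is a minimal sketch for one fixed-width record; the column layout is a made-up example, but the pattern — exact slices, Decimal for implied-decimal fields, text for zero-padded keys — is the discipline the rules describe.

```python
from decimal import Decimal

# Hypothetical layout for one fixed-width input record:
#   cols 1-7   : account id    (PIC 9(7), zero-padded key)
#   cols 8-16  : amount        (PIC 9(7)V99, implied decimal, no point in the file)
#   cols 17-36 : customer name (PIC X(20), space-padded)
def parse_record(line: str) -> dict:
    return {
        "account_id": line[0:7],                   # keep as text: leading zeros matter
        "amount": Decimal(line[7:16]).scaleb(-2),  # PIC 9(7)V99 -> Decimal, never float
        "name": line[16:36].rstrip(),
    }

rec = parse_record("0001234000012345JOHN DOE            ")
assert rec["amount"] == Decimal("123.45")
```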


4. Validate against a golden dataset


This is the most important step in the entire process.

Not “the AI says the code looks correct.”

Not “the Python version seems cleaner.”

Not “the logic appears equivalent.”

Just this:

Does the Python version produce the same result as the COBOL version on the same inputs?

That is the standard that matters.

Run both versions against identical data. Compare outputs. Diff files. Reconcile mismatches. Repeat until behavior matches.
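The compare-and-diff loop can be sketched in a few lines. File paths and the line-oriented output format are assumptions; byte-level comparison is the safer default when records carry trailing spaces.

```python
from pathlib import Path

def diff_outputs(cobol_out: str, python_out: str) -> list[tuple[int, str, str]]:
    """Return (line_number, cobol_line, python_line) for every mismatch.

    A line-count mismatch is reported as a final entry with line 0.
    """
    cobol_lines = Path(cobol_out).read_text().splitlines()
    python_lines = Path(python_out).read_text().splitlines()
    mismatches = [
        (i, c, p)
        for i, (c, p) in enumerate(zip(cobol_lines, python_lines), start=1)
        if c != p
    ]
    if len(cobol_lines) != len(python_lines):
        mismatches.append((0, f"{len(cobol_lines)} lines", f"{len(python_lines)} lines"))
    return mismatches
```

An empty result on the full golden dataset is the completion criterion; anything else is a reconciliation task.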

5. Commit independently and traceably

Each migrated unit should be tracked independently. That makes regression analysis easier, supports rollback if needed, and allows teams to scale the effort without losing control.

Where this works especially well

AI-assisted COBOL modernization works best in environments where behavior is bounded, inputs and outputs are clear, and validation can be made objective.

The strongest candidates are:

  • batch ETL jobs
  • report generators
  • fixed-width file transformations
  • deterministic business-rule processing
  • data preparation layers for downstream platforms


These are good targets because the logic is usually self-contained and validation is binary. The migration can be treated as a behavior-preservation exercise rather than a broad architectural rewrite.

Where you still need real engineering discipline

There are also areas where teams need to be especially careful.

Financial precision

COBOL numeric behavior is unforgiving. Implied decimals, signed fields, and packed formats do not translate safely into casual Python code. Any migration that uses floating-point arithmetic where decimal precision is required is introducing risk.
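The risk is easy to demonstrate. Binary floats cannot represent most decimal fractions exactly, and the drift surfaces precisely where batch jobs accumulate money:

```python
from decimal import Decimal

# The classic float rounding surprise:
assert 0.1 + 0.2 != 0.3

# The same drift compounds in running totals. Decimal stays exact,
# matching COBOL's fixed-point PIC 9(n)V99 arithmetic:
total = Decimal("0.00")
for _ in range(100):
    total += Decimal("0.10")
assert total == Decimal("10.00")
```

With float, a reconciliation against the mainframe output fails by fractions of a cent, and fractions of a cent are exactly what auditors look for.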

Hidden assumptions

Legacy systems often contain magic values, fallback paths, and data-handling exceptions that no one remembers. If AI flags them, treat those flags as investigation points — not as noise.

JCL and operational context

Migrating the application logic is not the same as migrating the system. Scheduling, restarts, dataset dependencies, control flows, and operational procedures often live outside the COBOL source. JCL can be analyzed and documented, but replacing that operational layer is usually a separate workstream.

Performance at production volume

A Python implementation may be logically correct and still fail operationally if it cannot handle production-scale volume. High-throughput batch environments may require chunking, multiprocessing, or distributed execution to meet runtime expectations.
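Chunking is the simplest of those options. This sketch streams a large input file in bounded batches instead of loading it whole; the chunk size is an assumption to tune per job, and heavier workloads would swap the loop body for multiprocessing or a distributed runner.

```python
from itertools import islice
from typing import Iterator

def iter_chunks(path: str, chunk_size: int = 100_000) -> Iterator[list[str]]:
    """Yield the file's lines in lists of at most `chunk_size`,
    keeping memory use constant regardless of file size."""
    with open(path, "r") as fh:
        while True:
            chunk = list(islice(fh, chunk_size))
            if not chunk:
                break
            yield chunk
```

Each chunk can then be parsed, processed, and written out before the next is read, which keeps a multi-gigabyte batch file from ever living in memory at once.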

What this actually gives engineering teams

Claude Code with GSD does not eliminate migration risk.

What it does is remove one of the most expensive phases of modernization: the long period where the team is still trying to understand what the system really does.

It shortens the gap between:

  • “nobody fully understands this program”
  • “we have a working Python equivalent with tests and output validation”


That is a meaningful improvement.

And when done well, it does not just help with the first program. It helps maintain quality and consistency across the entire portfolio.

Final take

So, does it make sense to use Claude Code and GSD to migrate COBOL to Python wholesale?

Yes — but only if “wholesale” is defined correctly.

Not as one-shot automation.

Not as blind code translation.

Not as a promise that AI will modernize a mainframe estate by itself.

It makes sense as a system-wide, AI-assisted, validation-first modernization strategy.

That is the real shift.

AI is not replacing migration strategy. It is compressing the time required to understand legacy systems well enough to migrate them safely, consistently, and with far better momentum than most teams have had before.

In most COBOL modernization efforts, that understanding phase is where the real cost lives.

Reducing that cost — while keeping validation non-negotiable — is where this approach starts to become practical.

Bottom line: AI does not eliminate migration risk. But it can dramatically reduce discovery time, improve consistency, and accelerate the path from legacy uncertainty to validated Python equivalents.

If you are working on modernization, I would be interested in what you are seeing in practice — especially around packed decimals, JCL dependencies, context management, and validation at scale.
