Coding an AVIF decoder without writing code

There's a lot of talk about "vibe coding" lately, and I recently had an experience that may be an interesting story showing how LLMs could actually work for technically challenging problems.

Image formats and decoders have long been an interest of mine. I witnessed the hype and eventual demise of the wavelet transform-based JPEG 2000 and have been following the struggling adoption of the new JPEG XL. I've seen Apple’s HEIC format gain popularity, but generally, this is no longer a headline-grabbing field. That's why I was surprised to see people already using AVIF in the wild.

I have a personal GitHub project (https://github.com/jdeng/goheif) that handles HEIC files from my iPhone, and users had asked for AVIF support. It wasn't simple back then, but now maybe I could just vibe it out. Still, I didn't want to rely on an external service or pull in a massive library for a single feature. The dav1d library is the obvious choice for the AV1 codec; it's written in plain C, making it easy to integrate. I had also seen a Go parser library based on WASM, which essentially compiles dav1d to WebAssembly bytecode and runs it in a VM. But in the age of LLMs, I believe readable code should be the universal language, and I didn't want to carry a whole VM in my small tool.

The alternative, working directly with the C code, means falling into the rabbit hole of building, shipping, and linking the library. I am no fan of the build systems and processes for C/C++ software. Go's approach is so much nicer, as it essentially has no build system. No Makefiles, no CMake, no depot_tools, Bazel, or Ninja. And yet, dav1d uses something I hadn't seen before: Meson, which generates Ninja build files. I certainly didn't have the time or patience to learn and set up another build tool.

My project is written in Go and already embeds the libde265 source code via cgo to handle HEIC (which uses the same BMFF container format as AVIF). It wasn't a simple task to set up, but I thought with this new "vibe coding" approach, I could tackle the dav1d integration with an LLM. I suspected I could take a similar approach, which boiled down to a few tasks:

  1. Create a config.h for major OS/architecture combinations, a file usually generated by the build system. Define the CFLAGS/LDFLAGS for cgo so the source files can be compiled together.
  2. Ensure cgo can compile the dav1d C code with a simple go build.
  3. Write a Go wrapper for the dav1d C decoder functions.
  4. Update my existing BMFF parser to support AV1-specific blocks.

I copied the dav1d source code into my working directory and started writing my first prompt. I asked the LLM to generate the necessary configuration files by referencing the Meson build script, which I had only glanced at.

Prompt 1: “Update config.h, vcs_version.h and dav1d.go based on @meson.build to use proper defines and CFLAGS, LDFLAGS.”

The task was completed quickly. The output wasn't perfect—it included some redundant definitions and unneeded architecture support since I hadn't specified otherwise—but after deleting the extra code, I had a great starting point.

Next, I created a file to include all the C source files and typed go build. It failed, as expected. I had to briefly inspect the Meson build file to understand the required source list and, with a few small tweaks, the dav1d source code was compiling with a single command.
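The single-command build works because cgo compiles any C files sitting in the package directory alongside the Go code. A hedged sketch of what such a bridge file could look like; the paths, defines, and flags here are assumptions for illustration, not the project's actual configuration:

```go
// dav1d.go - sketch only; flags and layout are assumptions.
package dav1d

/*
#cgo CFLAGS: -I${SRCDIR}/dav1d/include -I${SRCDIR}/dav1d/src -DNDEBUG
#cgo LDFLAGS: -lm
#include "dav1d/dav1d.h"
*/
import "C"
```

With the dav1d sources copied into the package, `go build` picks them up and compiles everything in one step, no Meson or Ninja involved.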

The next prompts were pure bliss. I used Cursor's agent mode, which automatically handled compiler errors.

Prompt 2: “Use @libde265.go as an example, implement a decoder for AVIF in @dav1d.go using libdav1d.” “Write a unit test to decode a test avif image to jpeg.”

I asked the agent to write a unit test, but it didn't work. Taking a closer look, I saw the agent was trying to create an AVIF file from scratch within the test, and it was failing to parse the actual test image I provided. It then went a little crazy, trying all sorts of nonsensical fixes, and I had to stop it.

I realized my mistake: I had fed the test a full image file, while my decoder could only handle a raw coded bitstream. The next logical step was to ask the agent to update the BMFF parser to process the file container. It did a pretty good job but still failed to parse the file. I tried a few other sample images with no luck. The agent eventually gave up after offering some "you are absolutely right" platitudes and some unimportant improvements it mistook for fixes. To its credit, it did use hexdump and head to examine the file's data structure—a manual process that would have been incredibly time-consuming for me.

Prompt 3: “What is the reason for the error: Failed to decode AVIF image: get_picture error: -11” “A few files are tested with the same result. Could it be the implementation of Decoder is buggy?”

I was about to give up. But then I asked one slightly more targeted question.

Prompt 4: “So in @goheif.go if it.Info is nil, ErrUnknownItem is returned. Is this correct?” “Can you check the av01 item parsing code?”

This time the response was BRILLIANT. The LLM figured it out all by itself. It was a minor but critical logic error in the parsing code. Throughout this debugging process, I just sat there watching and didn't change a single line of code.

[Image: LLM identifying the bug]

Here is the related code change (for the bug fix only):

[Image: Related code change]

And just like that, I had a working AVIF decoder with no build system or external dependencies. You just import the library, type go build, and you can now handle AVIF files. All of this was done in under an hour.

This is not a typical software engineering task, so it's unlikely the AI had seen many examples like it during training. And I didn't spend time crafting prompts, yet it figured out a complex problem with just a few sentences. This is the state of LLMs/AI agents today.

The key takeaway? If your problem is well-defined, and you can provide good context and break the work into reasonably sized tasks, AI agents can accomplish amazing things.

And this is still only the beginning.
