eBay Tests with AI: DSL vs LLMs for Efficient Testing

At eBay, we regularly create tests for our listings platform. These tests need to create test listings with specific traits and attributes. Right now, the tools we have for this are complex and somewhat difficult to work with.

A colleague recently suggested: “What if we used AI to just say, ‘Create a fixed-price listing in the US for $50,’ and The Right Thing Happens?”

It sounds magical, but something about it didn’t feel quite right. I think it's easy to fall into “when you have a shiny new hammer, everything looks like a nail.” Tokens aren’t free, and LLMs can’t match the speed of plain code.

In this case, building an expressive DSL using a tool like JGiven [1] is a better approach. The tests will be faster and reliably repeatable.

    given()
      .a_fixed_price_listing()
      .with_price(50)
      .on_site("US");

We could use AI to build and refine this DSL, but the tests themselves should just run regular code.

My general heuristic would be:

- Use compiled code for repeatable, deterministic tasks.
- Use LLMs for dynamic, fuzzy, or exploratory tasks, like writing code or building an agent to make a restaurant reservation.

Use tokens when you have to, not just because you can.

[1] https://jgiven.org/
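To make the DSL idea concrete, here is a minimal sketch of how a fluent chain like the one in the post could be backed by plain Java. It deliberately does not use JGiven's stage classes; `ListingSpec`, `ListingDsl`, and `build()` are hypothetical names invented for illustration, and only the chained method names come from the post. A real implementation would presumably call the listing platform instead of returning a value object.

```java
import java.math.BigDecimal;
import java.util.Objects;

/** Immutable description of a test listing. Hypothetical type, not an eBay or JGiven API. */
final class ListingSpec {
    final String format;
    final BigDecimal price;
    final String site;

    ListingSpec(String format, BigDecimal price, String site) {
        // Fail fast if the test forgot to specify one of the traits.
        this.format = Objects.requireNonNull(format, "format");
        this.price = Objects.requireNonNull(price, "price");
        this.site = Objects.requireNonNull(site, "site");
    }

    @Override
    public String toString() {
        return format + " listing on " + site + " for " + price;
    }
}

/** Fluent builder whose method names mirror the snippet in the post. */
final class ListingDsl {
    private String format;
    private BigDecimal price;
    private String site;

    private ListingDsl() {}

    /** Entry point, so a test reads as given().a_fixed_price_listing()... */
    static ListingDsl given() {
        return new ListingDsl();
    }

    ListingDsl a_fixed_price_listing() {
        this.format = "FIXED_PRICE";
        return this;
    }

    ListingDsl with_price(long amount) {
        if (amount < 0) {
            throw new IllegalArgumentException("price must be non-negative");
        }
        this.price = BigDecimal.valueOf(amount);
        return this;
    }

    ListingDsl on_site(String site) {
        this.site = site;
        return this;
    }

    /** Hypothetical terminal step: produces the spec a test would then act on. */
    ListingSpec build() {
        return new ListingSpec(format, price, site);
    }
}
```

Calling `ListingDsl.given().a_fixed_price_listing().with_price(50).on_site("US").build()` yields the same spec on every run, with no tokens spent and no model latency, which is the repeatability argument in a nutshell.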

Yours is very practical advice tied to what's worked. If you've not seen it, StrongDM is developing new models for weirdo stuff that is brand new. Their "software factory" concept uses hold-out tests in plain English that aren't available to the implementing model. They then use those to test the software to ensure it works correctly.

David Van Couvering, I agree with you on not treating everything as a nail just because we have a shiny new hammer. I am curious how to decide when to use it. Is it by experimenting, analyzing the cost, and so on, and then making a call? For example, if we are trying to build an intelligent workflow engine to personalize user interactions and we code it by hand, then every change or rule needs to be coded in. I would like to know if you have any framework or mental models we should develop for evaluating use cases and deciding when to bring in LLMs.

Yes, I spend a lot of my time these days having AI build out repeatable, deterministic things.
