eBay Tests with AI: DSL vs LLMs for Efficient Testing

2mo

At eBay, we regularly create tests for our listings platform. These tests need to create test listings with specific traits and attributes. Right now, the tools we have for this are complex and somewhat difficult to work with.

A colleague recently suggested: “What if we used AI to just say, ‘Create a fixed-price listing in the US for $50,’ and The Right Thing Happens?”

It sounds magical, but something about it didn’t feel quite right. I think it's easy to fall into “when you have a shiny new hammer, everything looks like a nail.” Tokens aren’t free, and LLMs can’t match the speed of plain code.

In this case, building an expressive DSL using a tool like JGiven [1] is a better approach. The tests will be faster and reliably repeatable.

    given()
        .a_fixed_price_listing()
        .with_price(50)
        .on_site("US");

We could use AI to build and refine this DSL, but the tests themselves should just run regular code.

My general heuristic would be:

- Use compiled code for repeatable, deterministic tasks.
- Use LLMs for dynamic, fuzzy, or exploratory tasks, like writing code or building an agent to make a restaurant reservation.

Use tokens when you have to, not just because you can.

[1] https://jgiven.org/
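For illustration, the fluent call chain in the post could be backed by ordinary Java along the following lines. This is a minimal sketch, not eBay's tooling and not JGiven's API: the class names `ListingDsl`, `ListingBuilder`, `ListingSpec`, and `Format`, and the terminal `build()` step, are all hypothetical. Only the method names `given`, `a_fixed_price_listing`, `with_price`, and `on_site` come from the post. It assumes Java 16 or later for records.

```java
import java.math.BigDecimal;
import java.util.Objects;

public class ListingDsl {

    /** Hypothetical listing formats; a real platform would have more. */
    enum Format { FIXED_PRICE, AUCTION }

    /** Immutable description of the test listing to create (hypothetical type). */
    record ListingSpec(Format format, BigDecimal price, String site) {}

    /** Fluent builder; method names mirror the snippet in the post. */
    static final class ListingBuilder {
        private Format format;
        private BigDecimal price;
        private String site;

        ListingBuilder a_fixed_price_listing() {
            this.format = Format.FIXED_PRICE;
            return this;
        }

        ListingBuilder with_price(long amount) {
            if (amount <= 0) {
                throw new IllegalArgumentException("price must be positive");
            }
            this.price = BigDecimal.valueOf(amount);
            return this;
        }

        ListingBuilder on_site(String site) {
            this.site = Objects.requireNonNull(site, "site");
            return this;
        }

        /** Hypothetical terminal step: validates the spec and returns it. */
        ListingSpec build() {
            if (format == null || price == null || site == null) {
                throw new IllegalStateException("format, price and site are all required");
            }
            return new ListingSpec(format, price, site);
        }
    }

    /** Entry point of the DSL, named after the given() call in the post. */
    static ListingBuilder given() {
        return new ListingBuilder();
    }

    public static void main(String[] args) {
        ListingSpec spec = given()
                .a_fixed_price_listing()
                .with_price(50)
                .on_site("US")
                .build();
        System.out.println(spec);
    }
}
```

Every step here is plain, deterministic code: the same chain always yields the same spec, and no tokens are spent at test time. An LLM could still help write or extend the builder itself.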

Behavior-Driven Development in Plain Java jgiven.org

14 Comments

Justin Abrahms 2mo

Yours is very practical advice tied to what's worked. If you've not seen it, StrongDM is developing new models for weirdo stuff that is brand new. Their "software factory" concept uses hold-out tests in plain English that aren't available to the implementing model. They then use those to test the software to ensure it works correctly.

Arun Ganuga 2mo

David Van Couvering, I agree with you about not seeing everything as a nail when holding a shiny new hammer. I am curious how to figure out when to use it. Is it by experimenting, analyzing the cost, and so on, and then making a call? For example, if we are trying to build an intelligent workflow engine to personalize user interactions and we code it, then every change or rule needs to be coded in. I would like to know if you have any frameworks or mental models for evaluating use cases and deciding when to bring in LLMs.

Willy John VanSickle III 2mo

yes, I spend a lot of my time these days having AI build out repeatable deterministic things


More Relevant Posts

Joseph Edmonds
2mo
Here's an article where I propose that the classic TDD bug fix approach is missing a critical step. Tests are great: they prove specific features are working and specific bugs are not present. But static analysis is even better. Static analysis can prevent not just specific bugs but whole categories of bugs by blocking buggy patterns. In this age of AI-driven development, this strategy is both more important and easier to implement with really high levels of precision. An LLM can bang out a custom ESLint or PHPStan rule in seconds. The time, token, and pain savings are hard to calculate, but they are potentially huge. Have a read and let me know what you think: https://lnkd.in/dh2qAzsn

Defence Before Fix: Preventing Bug Classes with Static Analysis ltscommerce.dev
Dmytro Huz
1mo
🚀 I just published an open source tool to connect requirements, code, and tests, and to check that tests really cover the acceptance criteria.

AI makes it easy to generate code and tests fast, but much harder to know whether acceptance criteria are actually protected. That is why I built ac-trace, a small open-source tool that maps criteria to code and tests, then mutates the mapped code to see whether the tests really catch the breakage.

Sounds interesting?
➡️ Jump directly into the repo: https://lnkd.in/dK8FqK8M
OR
➡️ Check it out in more detail here: https://lnkd.in/dEQZEfAX

I built ac-trace to question the trust we place in passing tests dmytrohuz.com
LAKSHYA BENGANI
1mo
Most developers have experienced this at some point: a large code migration that touches hundreds of files.

You start with good intentions… “Shouldn’t take too long.” A few hours later you’re still:
• replacing deprecated APIs
• updating the same patterns everywhere
• running tests to verify you didn’t break something

It’s repetitive work and surprisingly easy to mess up.

Recently I ran into exactly this situation. Instead of doing the migration manually (or writing fragile scripts), I wondered: could an AI agent do this instead? So I built a Migration CLI Agent.

The idea was simple: give the agent the migration rules, let it scan the repository, apply the code transformations, and validate the changes before moving forward. This ended up becoming a really interesting problem in AI-assisted developer tooling.

I have written a blog post walking through:
• how the migration agent works
• the architecture behind the CLI
• how I added guardrails to keep the changes reliable

If you’re exploring AI agents for developer workflows or working with large codebases, you might find it interesting.

Read the full post here 👇 https://lnkd.in/gfuzE6u2

Curious to hear: how are people approaching large code migrations today?

#AI #DeveloperTools #SoftwareEngineering #LLM

Stop Migrating Code Manually— Build an AI Agent to Do It For You medium.com

Sandeep Mehta
1mo
Just found Serena, and my AI agents are coding faster than ever.

Most coding agents read entire files, grep around blindly, and do string replacements like it's 2005. Serena gives your LLM actual IDE superpowers. Symbol-level code retrieval. Relational understanding. Find references, insert after symbol, navigate like a senior dev would.

The result? Fewer tokens. Faster execution. Better code.

It's an MCP server that plugs into Claude Code, Codex, Cursor, VSCode, Gemini CLI, basically everything. Open source. Free. And it actually works on large codebases where other tools fall apart.

If you're running AI coding agents and not using something like this, you're burning tokens and getting worse results.

🔗 https://lnkd.in/eMe6-vaT

#AIAgents #CodingAgents #Serena #OpenSource #DeveloperTools #AI #MCP #AgenticAI #BuildInPublic #SoftwareEngineering

GitHub - oraios/serena: A powerful MCP toolkit for coding, providing semantic retrieval and editing capabilities - the IDE for your agent github.com
Mayank G.
2mo
Aider - AI pair programmer that edits your actual files, not just generates code

What it does:
→ Reads your entire codebase for context
→ Edits files directly (no copy-paste)
→ Commits changes to git automatically
→ Works with Claude, GPT-4, DeepSeek, local models
→ Understands project structure and dependencies

Setup: uv pip install aider-chat
Cost: $0 for the tool (pay for API calls)

The difference from ChatGPT:
ChatGPT: Generate code → Copy → Paste → Fix import errors → Debug
Aider: "Add user authentication" → Done, committed to git

Real workflow from yesterday:
Task: Add rate limiting to 5 API endpoints
Traditional: 2 hours writing code, testing, fixing bugs
With Aider: 15 minutes directing, reviewing the changes

Aider reads the codebase, makes the changes, shows you the diff, commits when you approve.

What makes it different:
Maintains context across entire session
Understands relationships between files
Suggests files to include for better results
Runs tests automatically after changes
Rolls back if tests fail

The workflow:
You: "Add error handling to the payment processing"
Aider: Shows proposed changes across 3 files
You: "Approve" or "No, handle timeouts differently"
Aider: Revises, commits to git

This is what "AI coding assistant" should have been from the start. Not generating code snippets. Actually editing your project.

🔗 https://lnkd.in/gBkPQAqi

#AI #Coding #DevTools

GitHub - Aider-AI/aider: aider is AI pair programming in your terminal github.com
Nicolas Louis
1mo
Currently I am focusing on optimizing token consumption. Stored knowledge about the software is very helpful for capturing the structure of an application, so the AI does not have to rediscover the code at every prompt (even with a memory mechanism). Another approach starts from the idea that the code no longer matters and, as a consequence, neither does the language. So my question is: what is the best language for an AI-driven approach? This GitHub page gives some clues: https://lnkd.in/eW-2VnG4

GitHub - mame/ai-coding-lang-bench: Which programming language is best for AI coding agents? Benchmarking 13 languages with Claude Code. github.com

Alex Fuentes
1mo
7 LLM skills to build powerful AI agents for your enterprise

🚀 structured prompt engineering for consistent outputs
🚀 context engineering to deliver relevant data
🚀 fine-tuning only where it adds measurable value
🚀 retrieval-augmented generation (rag) for grounded responses
🚀 agent design with safety guardrails
🚀 production-ready deployment for scale
🚀 continuous observability and optimization

https://lnkd.in/g2UHYeyf

#llm #enterpriseai #aiengineering #promptengineering #rag #deployment #observability

DZone: Programming & DevOps news, tutorials & tools dzone.com
Goekhan Bakir
1mo
This is an interesting discussion of a new kind of third-party dependency in the age of AI code generation. Will libraries just become formally verifiable specifications, and what would be the implications?

A Software Library with No Code dbreunig.com
Joshua Smith
1mo Edited
Post 3: Prompt to Production: The Framework Argument is Dying

Picking a framework used to be a religious argument. AI just made it a use case discussion.

I've deployed several C++ containers over the past few months. Nothing fancy -- FFT and feature extraction algorithms. Less than a year ago I would have had ChatGPT write a C# version and copy-pasted it into an existing microservice. Now I can have a C++/Rust/Python container spun up and deployed into my environment as fast as it used to take me to copy C# code somewhere, only to refactor it somewhere else ten minutes later.

I'm not converting everything to assembly. I'm probably not switching to Java, either. My C++ containers were tactical, not web-scale production. The framework conversation is far from over -- but the ideology behind it is dying.

Because here's the question I keep coming back to. If a new language came out -- let's call it JoshScript, so everyone knows this is hypothetical -- and JoshScript read as cleanly as Java or JavaScript or whatever you grew up on, but was fully designed around AI programming assistants, followed guardrails natively, and was the clear choice for both web-scale and one-off POCs... why wouldn't we switch? If it gave us 10x productivity and handled the pain-in-the-ass things automatically -- timezones, localization, insert-your-nightmare-here -- what exactly are we holding onto?

And yes -- every abstraction layer eventually leaks. When it does, you still need to know what's underneath. JoshScript wouldn't change that. But arguing over which religion is right while the world ships 10x faster around you -- that's already over. And you have to pay the inflation tariff for ignoring it.

So what would your JoshScript include? What would it have to solve to stop keeping you up at night?

-- Josh
Abdelrahman Mohamed
1mo
Cursor AI is now available inside JetBrains IDEs including Android Studio. No more switching between editors. You get Cursor's powerful AI agents, model choice (Claude, GPT, Gemini), and codebase indexing all without leaving Android Studio. Just install the Cursor ACP from the JetBrains AI chat, sign in with your Cursor account, and you're good to go. 🔗 https://lnkd.in/d7NWxiwj #AndroidDev #CursorAI #JetBrains #AndroidStudio #AI

Cursor is now available in JetBrains IDEs · Cursor cursor.com

