Prompting LLM to Code

Coding with LLMs, vibe coding, prompt engineering, context engineering: these are the emerging development paradigms. Not new, but evolving and maturing.

I spent the last 7 months tinkering with a simple app idea across various LLMs. I wanted to check, exhaustively, how good or bad the code generated by LLMs really is. The quality, I presumed, should be what a developer with 3 to 5 years of experience would produce.

TL;DR: Vibe coding or prompt-engineered coding works well for a simple, unidimensional task with no remote/API calls and light non-functional requirements. For everything else, as of now, it is still better to code by hand, unless you are using context engineering with an agentic approach.

If you are still with me, let’s get into the details. It’s going to be a long post. 😊

My app idea was simple: a Role-Based Access Control (RBAC) enabled web application with two roles, admin and user, and 5 users, of which 1 is an admin and the rest are normal users. The app needed:

  • A login screen with a company logo (image to be provided by me).
  • For the admin: a screen showing all users, a landing page with a user-specific message showing login time and the number of concurrent logins, and a reporting page that calls an API to pull data from a local PostgreSQL DB.
  • For normal users: a custom message with their name and the last time they logged in, plus an API-backed reporting page that generates a report from the same local PostgreSQL DB.

So, in all: a web app, 5 APIs (2 for RBAC, 2 for data, 1 for concurrent-login details), local DB insert scripts, and an environment setup script.
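
To make the target concrete, here is a minimal sketch of the RBAC shape I was asking the LLMs to produce. It is illustrative only: the in-memory user table, the header-based identity, and the endpoint names are my stand-ins, and real authentication plus the PostgreSQL calls are left out.

```python
# Minimal RBAC sketch in FastAPI (illustrative; not generated output).
from fastapi import Depends, FastAPI, Header, HTTPException

app = FastAPI()

# Stand-in for the 5-user table: 1 admin + 4 normal users.
USERS = {"alice": "admin", "bob": "user", "carol": "user",
         "dan": "user", "eve": "user"}

def require_role(role: str):
    """Dependency factory: reject callers whose role does not match."""
    def checker(x_user: str = Header(...)):
        if USERS.get(x_user) != role:
            raise HTTPException(status_code=403, detail="Forbidden")
        return x_user
    return checker

@app.get("/admin/users")
def list_users(user: str = Depends(require_role("admin"))):
    # Admin-only: view all users on one screen.
    return USERS

@app.get("/report")
def report(user: str = Depends(require_role("user"))):
    # Normal users: reporting endpoint that would query the local PostgreSQL DB.
    return {"user": user, "report": "..."}
```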

I tried 4 LLMs: OpenAI o3, Claude, Qwen, and Gemini 2.5.

None of these LLMs could produce even a decent web app with API calls in the first few iterations, despite repeated prompt optimizations.

I worried that my prompts were wrong. That perception “prompted” me to take a Udemy course on Prompt Engineering to learn more about it. 😊 And it helped to a great extent. The results you see below are from after the course.

But it still was not generating output of the quality promised by many blogs and YouTubers. That led me to agentic AI approaches. Aided by agents, the quality of the output improved vastly, though at a significantly increased cost of operation.

Let’s examine the outcome of the iterations along the following aspects, with the tools used for each:

  • NFR – Non-Functional Requirements
  • Security – OWASP Top 10 requirements
  • Code Quality – SonarQube with standard rules
  • Code Complexity – SonarQube
  • SAST – SonarQube
  • DAST – OWASP ZAP (see the sketch after this list)
  • UX – How intuitively the pages are connected and arranged
  • UI – How good the project structure, reusability, and asset segregation are
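
To give a sense of the DAST pass: each generated app was scanned with ZAP. The sketch below is a hedged reconstruction rather than my exact harness; it assumes a local ZAP daemon on 127.0.0.1:8080 and the zapv2 Python client, and the target URL and API key are placeholders.

```python
# Hedged DAST sketch: spider + active scan via a local ZAP daemon.
# Assumes ZAP is already running on 127.0.0.1:8080; values are placeholders.
import time
from zapv2 import ZAPv2

target = "http://127.0.0.1:8000"  # the generated app under test
zap = ZAPv2(apikey="changeme",
            proxies={"http": "http://127.0.0.1:8080",
                     "https": "http://127.0.0.1:8080"})

# Crawl the app first so the active scanner has URLs to attack.
scan_id = zap.spider.scan(target)
while int(zap.spider.status(scan_id)) < 100:
    time.sleep(2)

# Active scan: ZAP probes the discovered URLs for vulnerabilities.
scan_id = zap.ascan.scan(target)
while int(zap.ascan.status(scan_id)) < 100:
    time.sleep(5)

# Treat any high-risk alert as a DAST failure.
highs = [a for a in zap.core.alerts(baseurl=target) if a["risk"] == "High"]
print(f"{len(highs)} high-risk findings")
```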

General observations and findings for the entire process:

  • This is purely a personal endeavor. What started as a learning exercise in prompt engineering moved to agentic approaches and later to context engineering.
  • I used the standard code-quality indexes/setup locally. Each code-generation and quality-check process was isolated.
  • Kubernetes / Docker were not used.
  • I am personally a fan of Gitea and have used it here.
  • APIs are written in Python + FastAPI. Sorry Java folks!
  • Qwen’s code referenced some Chinese university libraries, which had to be changed manually.
  • In pure developer experience for setting up the environment and access, OpenAI leads by miles. Google’s approach is strangely inadequate. Claude is somewhere in between.
  • Response times are almost identical for OpenAI and Gemini. Claude needs improvement.

Adoption of the AGENTIC approach

  • The agentic approach is well supported by Gemini and OpenAI. With Claude it was a bit odd; maybe my program was not set up properly, so I am not detailing it further.
  • The agentic approach is costly: at least 2–3 times more than a pure LLM-based approach.
  • To contain the cost, I used a local Ollama-hosted reasoning model to decompose the tasks before handing them over to an online LLM (see the sketch after this list).
  • Maybe, just maybe, if there were a way to call an LLM API at a lower cost just for the reasoning step, it would help enterprise operations.
  • I used DAG-based and dynamic-decomposition patterns for all these operations.
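
Here is a rough sketch of that decomposition pattern. It is a hedged reconstruction: the model names and prompts are placeholders, the JSON parsing is optimistic, and error handling is omitted.

```python
# Cost-saving pattern: reason locally (free), generate remotely (paid).
# Model names and prompts are placeholders; JSON parsing is optimistic.
import json
import ollama                # local Ollama client
from openai import OpenAI    # online LLM client

def decompose_locally(requirement: str) -> list[str]:
    """Ask a local reasoning model to split a requirement into subtasks."""
    resp = ollama.chat(
        model="deepseek-r1:8b",  # any local reasoning model
        messages=[{"role": "user",
                   "content": "Decompose this requirement into a JSON array "
                              f"of ordered coding subtasks:\n{requirement}"}],
    )
    return json.loads(resp["message"]["content"])

def implement_remotely(subtask: str) -> str:
    """Send only the focused subtask to the paid online LLM."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="o3",    # placeholder; any online model
        messages=[{"role": "user",
                   "content": f"Write the Python code for: {subtask}"}],
    )
    return resp.choices[0].message.content

requirement = "RBAC web app: 2 roles, 5 users, reporting APIs over PostgreSQL"
for task in decompose_locally(requirement):
    print(implement_remotely(task))
```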

Adoption of CONTEXT engineering

  • It’s new, so it will take some time to mature. But I find it far more exciting and better aligned with enterprise needs than prompting.
  • I found it significantly more capable than prompting, and coupled with the agentic approach, the possibilities are truly endless.
  • I will detail the context engineering approach in a separate post.

First: OpenAI

[Results image: OpenAI]

Second: Claude

[Results image: Claude]

Third: Qwen via Huggingface

[Results image: Qwen]

Fourth: Gemini 2.5 with Vertex AI

[Results image: Gemini 2.5]

As you can see, the code quality and the adherence to NFRs are poor; unsatisfactory for most of the iterations.

So I switched to the agentic approach. As mentioned earlier, it improved the quality by leaps and bounds.

First: OpenAI o3 + AutoGen

[Results image: OpenAI o3 + AutoGen]
Quite clearly, I could achieve similar product quality in half the iterations compared with prompt engineering.
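
For context, the AutoGen runs were shaped roughly like the sketch below: an assistant agent that writes code and a user-proxy agent that executes it locally and feeds results back. This assumes pyautogen's classic API; the model name, prompts, and work directory are placeholders.

```python
# Rough shape of the o3 + AutoGen loop (pyautogen classic API; placeholders).
import os
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"config_list": [{"model": "o3",
                               "api_key": os.environ["OPENAI_API_KEY"]}]}

# The assistant writes code; the proxy runs it and reports errors back.
coder = AssistantAgent(
    name="coder",
    system_message="You write FastAPI endpoints that satisfy the given spec.",
    llm_config=llm_config,
)
user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "generated_app", "use_docker": False},
)

user_proxy.initiate_chat(
    coder,
    message="Build the admin-only /admin/users endpoint with RBAC.",
)
```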

Second: Claude + LangGraph / Claude Code CLI

[Results image: Claude + LangGraph / Claude Code CLI]
Here, the first iteration itself produced great quality. Most surprisingly, the app passed both SAST and DAST.
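
The LangGraph side was essentially a small plan-then-code graph. The sketch below shows the shape only: the node bodies are stubbed where the real runs called Claude, and the state fields and names are placeholders.

```python
# Shape of the plan -> code LangGraph pipeline (node bodies stubbed;
# the real runs called Claude inside each node).
from typing import TypedDict
from langgraph.graph import StateGraph, END

class State(TypedDict):
    spec: str
    plan: str
    code: str

def plan_node(state: State) -> dict:
    # Would call the LLM to turn the spec into an ordered task plan.
    return {"plan": f"steps for: {state['spec']}"}

def code_node(state: State) -> dict:
    # Would call the LLM to implement the plan.
    return {"code": f"code implementing: {state['plan']}"}

graph = StateGraph(State)
graph.add_node("plan", plan_node)
graph.add_node("code", code_node)
graph.set_entry_point("plan")
graph.add_edge("plan", "code")
graph.add_edge("code", END)

app = graph.compile()
print(app.invoke({"spec": "RBAC web app with admin and user roles"}))
```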

Third: Qwen via Huggingface + CrewAI

[Results image: Qwen + CrewAI]
The DAST failure is consistent.

Fourth: Gemini 2.5 with Vertex AI + A2A + LangGraph

[Results image: Gemini 2.5 + A2A + LangGraph]
Output quality seems the best among the LLMs I checked. It also passed both SAST and DAST.

These results do not show the effect of context engineering (CE). CE reduced the iterations further, and combined with the agentic approach, the results were, well, scarily good.

I will end the article with the hope that you, the readers, will add more to this. My approach came from a pure learning angle and can therefore have errors, gaps, and room for improvement.

Any questions, clarifications, or comments will be highly appreciated.

