AI writes fast. It also lies fast. Yesterday it confidently “fixed” our caching bug by wrapping a fetch in try/catch and returning null on error. The code looked clean in the diff. Production would have turned it into a silent failure factory. The real issue was SSR caching inconsistency: we were keying the cache by pathname only. AI didn’t notice the query string and locale header were part of the response shape. So /products?sort=price cached over /products?sort=popular. And en-US HTML got served to fr-FR users during a traffic spike. AI suggested “just add a TTL”. I overrode it and changed the key to include the search params plus a normalized Accept-Language. Then I added a cache bypass for authenticated requests because we saw personalized fragments in the markup. The best part: AI helped me write the tests. The dangerous part: it wrote tests that asserted the implementation, not the behavior. I rewrote them to assert cache separation across two requests with different headers. AI is a great pair. But it doesn’t carry your invariants in its head. You do. Ship the diff, not the vibe. #JavaScript #SSR #Caching #FrontendArchitecture #AIEngineering
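Roughly the shape of the fix, sketched with illustrative names rather than our actual code:

```typescript
// Sketch: cache key includes pathname, sorted query params, and a normalized
// Accept-Language; authenticated requests skip the shared cache entirely.
// buildCacheKey / shouldBypassCache are illustrative names, not our real API.

function normalizeLocale(acceptLanguage: string | null): string {
  // "fr-FR,fr;q=0.9,en;q=0.8" -> "fr-fr"; a missing header falls into a default bucket
  const primary = acceptLanguage?.split(",")[0]?.trim().toLowerCase();
  return primary || "default";
}

function buildCacheKey(req: Request): string {
  const url = new URL(req.url);
  // Sort params so ?a=1&b=2 and ?b=2&a=1 hit the same entry
  const params = [...url.searchParams.entries()]
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([k, v]) => `${k}=${v}`)
    .join("&");
  const locale = normalizeLocale(req.headers.get("accept-language"));
  return `${url.pathname}?${params}|${locale}`;
}

function shouldBypassCache(req: Request): boolean {
  // Personalized markup must never land in the shared cache
  return req.headers.has("authorization") ||
    (req.headers.get("cookie") ?? "").includes("session=");
}
```

The rewritten tests then assert behavior, not implementation: two requests that differ only in query string or Accept-Language must never share a cache entry.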
More Relevant Posts
AI makes you faster at writing code and slower at noticing what it does to your system. That tradeoff shows up in production, not in the prompt window. Last week I asked an AI to “add optimistic updates” to a React Query mutation. It did the easy part: update the cache in onMutate, then invalidate. Looked clean. The miss was subtle: no rollback on error and no cancellation of in-flight queries. So a slow refetch raced the optimistic write and we shipped a UI that flickered back to stale data. AI suggested this shape: setQueryData in onMutate, a toast in onError, invalidateQueries in onSettled. My review changed it to: cancelQueries in onMutate, snapshot the previous data, roll back in onError, then invalidate. And we added an AbortController to the fetch so retries didn’t stack on flaky mobile networks. Outcome: fewer “it changed then changed back” tickets and a measurable drop in duplicate requests. The AI got us 70 percent there. The other 30 percent was knowing where time and state fight. The real skill isn’t prompting. It’s recognizing the parts of async behavior an LLM can’t feel. #ReactQuery #JavaScript #FrontendEngineering #AsyncProgramming #AICodeReview
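The reviewed shape, sketched against the React Query v5 API with placeholder names (todos, updateTodo) rather than our actual code:

```typescript
import { useMutation, useQueryClient } from "@tanstack/react-query";

// Illustrative optimistic update with cancellation, snapshot, and rollback.
function useUpdateTodo() {
  const queryClient = useQueryClient();

  return useMutation({
    // AbortController wiring for the underlying fetch is omitted from this sketch
    mutationFn: (todo: { id: string; title: string }) =>
      fetch(`/api/todos/${todo.id}`, { method: "PATCH", body: JSON.stringify(todo) })
        .then((r) => r.json()),

    onMutate: async (todo) => {
      // 1. Cancel in-flight refetches so they can't race the optimistic write
      await queryClient.cancelQueries({ queryKey: ["todos"] });
      // 2. Snapshot the previous value for rollback
      const previous = queryClient.getQueryData<Array<{ id: string; title: string }>>(["todos"]);
      // 3. Optimistically update the cache
      queryClient.setQueryData<Array<{ id: string; title: string }>>(["todos"], (old = []) =>
        old.map((t) => (t.id === todo.id ? { ...t, ...todo } : t))
      );
      return { previous };
    },

    onError: (_err, _todo, context) => {
      // 4. Roll back to the snapshot instead of leaving the UI lying
      if (context?.previous) queryClient.setQueryData(["todos"], context.previous);
    },

    onSettled: () => {
      // 5. Always reconcile with the server
      queryClient.invalidateQueries({ queryKey: ["todos"] });
    },
  });
}
```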
Ever wondered what happens when top-tier AI models go head-to-head? 🥊🤖 I recently built AI Battle Arena, a full-stack application where different major Large Language Models debate, solve algorithms, and compete for the highest score, judged entirely by another AI! Here is how the architecture handles the heavy lifting under the hood: 1️⃣ The Prompt: A user submits a coding problem or logic puzzle via a sleek, dark-mode React UI. 2️⃣ The Contenders (Parallel Execution): Using LangGraph, the backend concurrently routes the problem to both Mistral (Medium) and Cohere (Command-R). They battle it out to generate the best possible answer. 3️⃣ The Judge: An impartial evaluator powered by Google Gemini Flash automatically kicks in. Using LangChain and Zod for strictly typed structured outputs, it reads both solutions, scores them out of 10, and explains its detailed reasoning. 🛠 The Tech Stack: ⚡️ Frontend: React, Vite, TailwindCSS (for that premium dark aesthetic) ⚙️ Backend: Node.js, Express, LangChain, LangGraph, Zod 🧠 AI Models: Google Gemini, Mistral AI, Cohere Building this was a fantastic deep dive into agentic AI workflows. Using LangGraph’s state machines to orchestrate complex, multi-agent collaborations changes the game entirely. Ensuring that the “Judge” model returned strictly formatted JSON via Zod schema parsing was critical to getting the UI to respond fluidly. GitHub: https://lnkd.in/gwpyvF2C #AI #WebDevelopment #ReactJS #NodeJS #LangChain #LangGraph #MachineLearning #GenerativeAI #SoftwareEngineering #TechBuilds
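For the judge step, the structured-output piece looks roughly like this; a sketch assuming LangChain's withStructuredOutput with a Zod schema (the model id and field names are illustrative, not necessarily what the repo uses):

```typescript
import { z } from "zod";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

// Illustrative verdict schema: score each contender out of 10 and explain why.
const verdictSchema = z.object({
  mistralScore: z.number().min(0).max(10),
  cohereScore: z.number().min(0).max(10),
  winner: z.enum(["mistral", "cohere", "tie"]),
  reasoning: z.string(),
});

const judge = new ChatGoogleGenerativeAI({ model: "gemini-1.5-flash", temperature: 0 });

// withStructuredOutput returns an object matching the Zod schema instead of
// free-form text, so the UI never has to parse chatty prose.
const structuredJudge = judge.withStructuredOutput(verdictSchema);

export async function judgeSolutions(problem: string, mistralAnswer: string, cohereAnswer: string) {
  return structuredJudge.invoke(
    `Problem:\n${problem}\n\nSolution A (Mistral):\n${mistralAnswer}\n\nSolution B (Cohere):\n${cohereAnswer}\n\nScore both solutions out of 10 and pick a winner.`
  );
}
```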
Prompt engineering is not just about chatting. It’s about engineering data. 🧠🤖 When integrating LLMs into a production backend, the biggest challenge isn't the "AI magic"—it's the predictability. In my project "CodeLumina" (an AI-powered security auditor), I couldn't afford a chatty response like "Sure, I can help you with that code..." I needed a raw, valid, and deterministic JSON object that my Node.js backend could parse and store instantly. Here is how I optimized my workflow to get 100% structured JSON from Llama-3.3 and GPT-4o: 🔹 1. Strict System Prompting: Instead of a vague instruction, I defined a strict schema within the system prompt. I told the AI exactly what keys to include (e.g., vulnerabilities, severity, line_numbers) and warned it to return ONLY JSON without any conversational preamble. 🔹 2. Leveraging JSON Mode: I utilized the "json_object" response format provided by Groq SDK and OpenAI. This forces the model to guarantee that the output string is a valid JSON, preventing 3 AM parsing crashes. 🔹 3. Few-Shot Prompting: By providing 1-2 examples of "Input Code" vs "Expected JSON Output" in the prompt, I significantly reduced hallucinations. The model now understands the "contextual shape" of the data I need. 🔹 4. Low Temperature for Determinism: Setting the temperature to 0.1 or 0.2 was the game-changer. It makes the model less "creative" and more focused on following the structural rules I set. The Result? CodeLumina now processes complex code snippets and returns structured audit reports with sub-second latency, directly feeding into my research dashboard to calculate F1-Score and Precision metrics. In the world of AI Adoption, a well-engineered prompt is just as critical as a clean database schema. How are you handling LLM reliability in your apps? Are you using JSON mode or manual regex parsing? Let's talk in the comments! 👇 #AI #PromptEngineering #GenerativeAI #BackendDevelopment #TypeScript #NodeJS #CodeLumina #LLM #Groq #Llama3 #CleanCode #FullStackDeveloper
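A stripped-down version of that call shape, sketched with the OpenAI SDK (the Groq SDK exposes the same chat.completions interface); the model name, schema, and few-shot example are placeholders, not the CodeLumina internals:

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // or the Groq SDK client, which mirrors this API

// 1. Strict system prompt: exact keys, JSON only, no conversational preamble.
const SYSTEM_PROMPT = `You are a security auditor. Return ONLY valid JSON with this exact shape:
{"vulnerabilities": [{"description": string, "severity": "low"|"medium"|"high", "line_numbers": number[]}]}
No prose, no markdown, no preamble.`;

export async function auditCode(code: string) {
  const completion = await client.chat.completions.create({
    model: "gpt-4o",
    temperature: 0.1,                            // 4. low temperature for deterministic structure
    response_format: { type: "json_object" },    // 2. JSON mode: the output string is guaranteed to parse
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      // 3. One few-shot example pinning down the "contextual shape" of the output
      { role: "user", content: "Audit this code:\neval(userInput)" },
      {
        role: "assistant",
        content:
          '{"vulnerabilities":[{"description":"eval on untrusted input","severity":"high","line_numbers":[1]}]}',
      },
      { role: "user", content: `Audit this code:\n${code}` },
    ],
  });

  // JSON mode guarantees syntactic validity; the key names are still enforced by the prompt.
  return JSON.parse(completion.choices[0].message.content ?? "{}");
}
```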
Agent workflows re-run the same tool calls. LLM apps burn tokens on near-identical prompts. Both have the same fix: cache on Valkey. Shipped this week: @betterdb/agent-cache - several releases, now at v0.4.0. Multi-tier exact-match cache for AI agents. Adapters for LangChain, LangGraph, Vercel AI SDK, OpenAI, Anthropic, LlamaIndex. Multi-modal caching, pluggable binary normalizer, cost tracking. Runs on vanilla Valkey, no modules required. @betterdb/semantic-cache v0.2.0 - major overhaul. Five new adapters (OpenAI chat + responses, Anthropic, LlamaIndex, LangGraph memory store), embedding helpers for OpenAI / Voyage / Cohere / Ollama / Bedrock, multi-modal prompts with binary refs, batch lookup, stale-model eviction on upgrades, rerank hook. betterdb-semantic-cache (Python) - port of the Node package. Same API surface, same providers, same cost tracking. All MIT. #Valkey #Redis #AI #LLM #OpenSource
Every AI code reviewer on the market has the exact same fatal flaw: Amnesia. They analyze a pull request in complete isolation. They have no awareness of your team's conventions, no institutional knowledge, and no memory of the catastrophic async generator leak that took down production three weeks ago. So, I built Omni-SRE to fix it.
Omni-SRE is a context-aware code review agent that actually remembers your codebase's history. Before it reads a single line of your diff, it queries Vectorize Hindsight—a persistent vector memory layer—to recall past incidents and vulnerabilities. Those memories are injected directly into the Groq LLM context window alongside the PR diff. The difference is night and day:
❌ Without Memory: The AI blindly approves a PR missing a critical socket teardown block.
✅ With Hindsight Memory: The agent flags the PR as CRITICAL, explicitly citing Incident 2025-08-14-A and enforcing our team's safety conventions.
What I learned architecting this:
- Stateless AI is a local maximum. The next generation of developer tools will carry persistent, team-scoped memory.
- Context management is a product feature. We had to build a custom Token Budget Manager to dynamically balance memory recall depth against diff size so we didn't blow out the context window (see the sketch after this post).
- SSE streaming is non-negotiable. If your agent thinks for 8 seconds and dumps the result, users assume it froze. Streaming intermediate reasoning steps changes the entire perception of latency.
The architecture is fully open-source. You can check out the code and the Python/React stack here: https://lnkd.in/gc4dbuBy If you're building AI-native dev tools—or thinking about how vector memory changes the agent paradigm—I'd genuinely like to hear your take below. 👇 I wrote a full technical breakdown of how we built this. Link in the comments. #AIAgents #AI #Hindsight #AgentMemory #AIMemory #LLM
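The token budgeting idea, in a rough hypothetical sketch (the real Omni-SRE implementation is Python and certainly differs): give the diff priority, then spend whatever headroom remains on the most relevant memories.

```typescript
// Hypothetical token budget split between the PR diff and recalled memories.
// Token counts use a naive chars/4 heuristic; a real implementation would use
// the model's tokenizer.

interface Memory { id: string; text: string; relevance: number }

const approxTokens = (s: string) => Math.ceil(s.length / 4);

function packContext(diff: string, memories: Memory[], contextWindow = 8192, reservedForOutput = 1024) {
  const budget = contextWindow - reservedForOutput;

  // The diff gets priority, capped at 70% of the budget (truncated if larger).
  const diffCap = Math.floor(budget * 0.7);
  const trimmedDiff = approxTokens(diff) > diffCap ? diff.slice(0, diffCap * 4) : diff;
  let remaining = budget - approxTokens(trimmedDiff);

  // Spend whatever is left on the most relevant incidents first.
  const recalled: Memory[] = [];
  for (const m of [...memories].sort((a, b) => b.relevance - a.relevance)) {
    const cost = approxTokens(m.text);
    if (cost > remaining) continue;
    recalled.push(m);
    remaining -= cost;
  }
  return { trimmedDiff, recalled, unusedTokens: remaining };
}
```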
A year ago I built my best app to solve PII redaction the "right" way: server-side, API-dependent, cloud-hosted. It worked. But it bugged me. Every document you scan has to leave your machine. That's the part nobody talks about. Today, motivated by the new model release on Hugging Face, I came back to the problem and rebuilt it from scratch; this time entirely in the browser: no backend, no API key, no data leaving your premises. Ever. Here's what it does: - Detects 8 PII types: names, emails, phones, addresses, account numbers, URLs, dates, secrets - Runs a 1.5B-parameter model client-side via Transformers.js + ONNX Runtime WASM - Gives you confidence scores on every detected entity - Redact or highlight mode, copy-clean output - First load downloads the quantized weights (~200MB) and caches them. Every run after that is instant. - Local model with open weights, meaning you can easily finetune it with Transformers without a limit. Now I can truly call it PII redaction the "right way": Zero infra. Zero cost at scale. Zero trust required. The model is openai/privacy-filter (Apache 2.0), sparse MoE architecture, 128K context window. Transformers.js makes it almost trivially easy to drop models like this into a webpage. If you're building anything where sensitive documents touch your pipeline, this pattern is worth knowing about. #PrivacyByDesign #MachineLearning #NLP #WebAI #TransformersJS #DataPrivacy #OpenSource
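The browser-side pattern is genuinely small. A sketch with a generic NER model as a stand-in (the actual 1.5B model may run through a different pipeline task, so treat the model id and output shape as illustrative):

```typescript
import { pipeline } from "@huggingface/transformers";

// Illustrative client-side entity detection with Transformers.js.
// "Xenova/bert-base-NER" is a generic stand-in, not the model from the post.
// The first call downloads and caches the quantized weights; later runs are instant.
const detector = await pipeline("token-classification", "Xenova/bert-base-NER");

export async function findEntities(text: string) {
  const entities = await detector(text);
  // Each detected entity carries a label and a confidence score to surface in the UI.
  return (entities as Array<{ word: string; entity: string; score: number }>)
    .filter((e) => e.score > 0.5);
}
```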
Built a lightweight AI research agent from scratch with reasoning + memory + transparency. The stack:
- Groq (llama-3.3-70b)
- DuckDuckGo search
- ChromaDB for persistent memory
- Streamlit UI
The agent uses a ReAct-style loop: it decides whether to search the web or answer from memory, logs every decision to a JSON audit trail, and remembers past conversations across sessions. Entirely open source. I hit a few roadblocks that taught me a lot about agentic behavior:
- The agent refused to search: it was answering everything from its training data. Turns out my system prompt was too vague. Once I made the search rules explicit, behavior improved instantly. Specificity in prompts matters more than I expected.
- The silent failure: after integrating ChromaDB, Groq started rejecting requests silently - no errors, just no output. The fix was sanitizing memory before sending it to the LLM: stripping empty strings and truncating entries that were too long (rough sketch below).
- The infinite search loop: I watched the agent search 5 times in a row for the same thing without ever summarizing an answer, hitting my iteration cap. Fixed by appending "Do not search again" to the tool results, which effectively forced the LLM to move from "research mode" to "summary mode."
Now onto the next challenge: using AI agents to rethink and improve how QA workflows operate. Check out the repo below if you're interested in lightweight agent architectures! #llm #ai #python #agentai #groq #chromadb
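The sanitization step is small but load-bearing. The repo itself is Python; here is the same idea sketched in TypeScript with made-up names, just to show the shape:

```typescript
// Illustrative memory sanitizer (the actual project is Python; names are made up).
// Empty strings and overlong entries were what made the LLM calls fail silently.

const MAX_ENTRY_CHARS = 2000;

function sanitizeMemory(entries: string[]): string[] {
  return entries
    .map((e) => e.trim())
    .filter((e) => e.length > 0)                        // drop empty strings
    .map((e) => (e.length > MAX_ENTRY_CHARS
      ? e.slice(0, MAX_ENTRY_CHARS) + " [truncated]"    // cap overlong entries
      : e));
}
```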
TLDR: DDD manages complexity when working with LLM agents. I have been working on a static type-checking plugin for Clojure (https://lnkd.in/gGGxihms), to balance out the flexibility that the language affords with a way of doing sanity checks on the annotations we already use in our codebase (currently for plumatic/schema, and eventually to incorporate other systems such as metosin/malli). AI gave new life to the project, since much of the work was first replacing thousands of lines with an alternative analysis engine, then filling in test cases as I found them. Mostly, this flow worked well. But type checking is a complex beast, and the app grew increasingly bug-ridden and inflexible as the boundaries blurred between the internal typing representations (based on the Blame for All algorithm at https://lnkd.in/gexUzdmS) and the Schema annotations used to provide the initial input. The LLMs increasingly produced bad edits, chewing through more and more tokens. They are next-token prediction engines; bad surface language hurts their ability to work. What saved the app was adhering to strict domains within it. The language of "Schema" would be completely separate from the language of "Types". This was rigorously enforced: the word "schema" could only refer to plumatic/schema references and code working on them, and "type" to the internal representations. Creating this clean boundary, enforcing the language used, and creating small, defined conversion points in one direction (Schema -> Type) resolved this issue. This was easy to put into an AGENTS.md, and the LLM confusion between domains dropped considerably. Work became much more efficient and mechanical. Today, refining this approach and making the domain boundaries tighter solved another class of bugs: Schema was interpreted differently when defining the initial dictionary of references than when the library converted them to Types. By focusing again on clean descriptions of the domains and defining consistent boundaries, I was able to guide the LLM to clean this up in a principled fashion. #AI #SoftwareEngineering #Clojure #DDD
I’ve been building something meaningful over the past few weeks — AI Guru v2. A full-stack AI system that answers questions in the spirit of the Bhagavad Gītā — grounded in actual verses, not hallucinated responses. 1. How it works (simple idea, complex execution): Your question → Semantic search over ~700 verses (embeddings + FAISS) → Optional emotion detection (transformer model) → LLM that is strictly constrained to retrieved context 2. Tech Stack: • Backend: FastAPI • Frontend: React + Vite • Retrieval: Sentence Transformers + FAISS • LLM: OpenAI-compatible APIs (via OpenRouter) 3. What I learned building this: • Why retrieval quality > model size in RAG systems • Importance of using the same embedding model for indexing & querying • Handling CORS, environment separation (VITE_ vs server keys) • Designing systems with clear boundaries (no claims, disclaimers, grounded outputs) 4. This is a learning-first project focused on: • Honest AI (no fake wisdom generation) • Transparent retrieval scores • Clean pipeline: CSV → embeddings → FAISS → RAG → UI I’ve attached a short walkthrough of the full system. Would genuinely love feedback from people working on: • RAG systems • LLM applications • Tech + philosophy / cultural interfaces #MachineLearning #RAG #FastAPI #React #LLM #OpenSource #LearningInPublic #AIProjects