AI writes fast. It also lies fast. Yesterday it confidently “fixed” our caching bug by wrapping a fetch in try/catch and returning null on error. The code looked clean in the diff. Production would have turned it into a silent failure factory. The real issue was SSR caching inconsistency: we were keying the cache by pathname only. AI didn’t notice the query string and locale header were part of the response shape. So /products?sort=price cached over /products?sort=popular. And en-US HTML got served to fr-FR users during a traffic spike. AI suggested “just add a TTL”. I overrode it and changed the key to include the search params plus a normalized Accept-Language. Then I added a cache bypass for authenticated requests because we saw personalized fragments in the markup. The best part: AI helped me write the tests. The dangerous part: it wrote tests that asserted the implementation, not the behavior. I rewrote them to assert cache separation across two requests with different headers. AI is a great pair. But it doesn’t carry your invariants in its head. You do. Ship the diff, not the vibe. #JavaScript #SSR #Caching #FrontendArchitecture #AIEngineering
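Roughly the shape of the fix, sketched with illustrative names rather than our actual code:

```typescript
// Sketch: cache key includes pathname, sorted query params, and a normalized
// Accept-Language; authenticated requests skip the shared cache entirely.
// buildCacheKey / shouldBypassCache are illustrative names, not our real API.

function normalizeLocale(acceptLanguage: string | null): string {
  // "fr-FR,fr;q=0.9,en;q=0.8" -> "fr-fr"; a missing header falls into a default bucket
  const primary = acceptLanguage?.split(",")[0]?.trim().toLowerCase();
  return primary || "default";
}

function buildCacheKey(req: Request): string {
  const url = new URL(req.url);
  // Sort params so ?a=1&b=2 and ?b=2&a=1 hit the same entry
  const params = [...url.searchParams.entries()]
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([k, v]) => `${k}=${v}`)
    .join("&");
  const locale = normalizeLocale(req.headers.get("accept-language"));
  return `${url.pathname}?${params}|${locale}`;
}

function shouldBypassCache(req: Request): boolean {
  // Personalized markup must never land in the shared cache
  return req.headers.has("authorization") ||
    (req.headers.get("cookie") ?? "").includes("session=");
}
```

The rewritten tests then assert behavior, not implementation: two requests that differ only in query string or Accept-Language must never share a cache entry.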
More Relevant Posts
AI makes you faster at writing code and slower at noticing what it does to your system. That tradeoff shows up in production, not in the prompt window. Last week I asked an AI to “add optimistic updates” to a React Query mutation. It did the easy part: update the cache in onMutate, then invalidate. Looked clean. The miss was subtle: no rollback on error and no cancellation of in-flight queries. So a slow refetch raced the optimistic write and we shipped a UI that flickered back to stale data. AI suggested this shape: setQueryData in onMutate, a toast in onError, invalidateQueries in onSettled. My review changed it to: cancelQueries in onMutate, snapshot the previous data, roll back in onError, then invalidate. And we added an AbortController to the fetch so retries didn’t stack on flaky mobile networks. Outcome: fewer “it changed then changed back” tickets and a measurable drop in duplicate requests. The AI got us 70 percent there. The other 30 percent was knowing where time and state fight. The real skill isn’t prompting. It’s recognizing the parts of async behavior an LLM can’t feel. #ReactQuery #JavaScript #FrontendEngineering #AsyncProgramming #AICodeReview
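The reviewed shape, sketched against the React Query v5 API with placeholder names (todos, updateTodo) rather than our actual code:

```typescript
import { useMutation, useQueryClient } from "@tanstack/react-query";

// Illustrative optimistic update with cancellation, snapshot, and rollback.
function useUpdateTodo() {
  const queryClient = useQueryClient();

  return useMutation({
    // AbortController wiring for the underlying fetch is omitted from this sketch
    mutationFn: (todo: { id: string; title: string }) =>
      fetch(`/api/todos/${todo.id}`, { method: "PATCH", body: JSON.stringify(todo) })
        .then((r) => r.json()),

    onMutate: async (todo) => {
      // 1. Cancel in-flight refetches so they can't race the optimistic write
      await queryClient.cancelQueries({ queryKey: ["todos"] });
      // 2. Snapshot the previous value for rollback
      const previous = queryClient.getQueryData<Array<{ id: string; title: string }>>(["todos"]);
      // 3. Optimistically update the cache
      queryClient.setQueryData<Array<{ id: string; title: string }>>(["todos"], (old = []) =>
        old.map((t) => (t.id === todo.id ? { ...t, ...todo } : t))
      );
      return { previous };
    },

    onError: (_err, _todo, context) => {
      // 4. Roll back to the snapshot instead of leaving the UI lying
      if (context?.previous) queryClient.setQueryData(["todos"], context.previous);
    },

    onSettled: () => {
      // 5. Always reconcile with the server
      queryClient.invalidateQueries({ queryKey: ["todos"] });
    },
  });
}
```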
Ever wondered what happens when top-tier AI models go head-to-head? 🥊🤖 I recently built AI Battle Arena, a full-stack application where different major Large Language Models debate, solve algorithms, and compete for the highest score, judged entirely by another AI! Here is how the architecture handles the heavy lifting under the hood: 1️⃣ The Prompt: A user submits a coding problem or logic puzzle via a sleek, dark-mode React UI. 2️⃣ The Contenders (Parallel Execution): Using LangGraph, the backend concurrently routes the problem to both Mistral (Medium) and Cohere (Command-R). They battle it out to generate the best possible answer. 3️⃣ The Judge: An impartial evaluator powered by Google Gemini Flash automatically kicks in. Using LangChain and Zod for strictly typed structured outputs, it reads both solutions, scores them out of 10, and explains its detailed reasoning. 🛠 The Tech Stack: ⚡️ Frontend: React, Vite, TailwindCSS (for that premium dark aesthetic) ⚙️ Backend: Node.js, Express, LangChain, LangGraph, Zod 🧠 AI Models: Google Gemini, Mistral AI, Cohere Building this was a fantastic deep dive into agentic AI workflows. Using LangGraph’s state machines to orchestrate complex, multi-agent collaborations changes the game entirely. Ensuring that the “Judge” model returned strictly formatted JSON via Zod schema parsing was critical to getting the UI to respond fluidly. GitHub: https://lnkd.in/gwpyvF2C #AI #WebDevelopment #ReactJS #NodeJS #LangChain #LangGraph #MachineLearning #GenerativeAI #SoftwareEngineering #TechBuilds
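For the judge step, the structured-output piece looks roughly like this; a sketch assuming LangChain's withStructuredOutput with a Zod schema (the model id and field names are illustrative, not necessarily what the repo uses):

```typescript
import { z } from "zod";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

// Illustrative verdict schema: score each contender out of 10 and explain why.
const verdictSchema = z.object({
  mistralScore: z.number().min(0).max(10),
  cohereScore: z.number().min(0).max(10),
  winner: z.enum(["mistral", "cohere", "tie"]),
  reasoning: z.string(),
});

const judge = new ChatGoogleGenerativeAI({ model: "gemini-1.5-flash", temperature: 0 });

// withStructuredOutput returns an object matching the Zod schema instead of
// free-form text, so the UI never has to parse chatty prose.
const structuredJudge = judge.withStructuredOutput(verdictSchema);

export async function judgeSolutions(problem: string, mistralAnswer: string, cohereAnswer: string) {
  return structuredJudge.invoke(
    `Problem:\n${problem}\n\nSolution A (Mistral):\n${mistralAnswer}\n\nSolution B (Cohere):\n${cohereAnswer}\n\nScore both solutions out of 10 and pick a winner.`
  );
}
```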
Prompt engineering is not just about chatting. It’s about engineering data. 🧠🤖 When integrating LLMs into a production backend, the biggest challenge isn't the "AI magic"—it's the predictability. In my project "CodeLumina" (an AI-powered security auditor), I couldn't afford a chatty response like "Sure, I can help you with that code..." I needed a raw, valid, and deterministic JSON object that my Node.js backend could parse and store instantly. Here is how I optimized my workflow to get 100% structured JSON from Llama-3.3 and GPT-4o: 🔹 1. Strict System Prompting: Instead of a vague instruction, I defined a strict schema within the system prompt. I told the AI exactly what keys to include (e.g., vulnerabilities, severity, line_numbers) and warned it to return ONLY JSON without any conversational preamble. 🔹 2. Leveraging JSON Mode: I utilized the "json_object" response format provided by Groq SDK and OpenAI. This forces the model to guarantee that the output string is a valid JSON, preventing 3 AM parsing crashes. 🔹 3. Few-Shot Prompting: By providing 1-2 examples of "Input Code" vs "Expected JSON Output" in the prompt, I significantly reduced hallucinations. The model now understands the "contextual shape" of the data I need. 🔹 4. Low Temperature for Determinism: Setting the temperature to 0.1 or 0.2 was the game-changer. It makes the model less "creative" and more focused on following the structural rules I set. The Result? CodeLumina now processes complex code snippets and returns structured audit reports with sub-second latency, directly feeding into my research dashboard to calculate F1-Score and Precision metrics. In the world of AI Adoption, a well-engineered prompt is just as critical as a clean database schema. How are you handling LLM reliability in your apps? Are you using JSON mode or manual regex parsing? Let's talk in the comments! 👇 #AI #PromptEngineering #GenerativeAI #BackendDevelopment #TypeScript #NodeJS #CodeLumina #LLM #Groq #Llama3 #CleanCode #FullStackDeveloper
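A stripped-down version of that call shape, sketched with the OpenAI SDK (the Groq SDK exposes the same chat.completions interface); the model name, schema, and few-shot example are placeholders, not the CodeLumina internals:

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // or the Groq SDK client, which mirrors this API

// 1. Strict system prompt: exact keys, JSON only, no conversational preamble.
const SYSTEM_PROMPT = `You are a security auditor. Return ONLY valid JSON with this exact shape:
{"vulnerabilities": [{"description": string, "severity": "low"|"medium"|"high", "line_numbers": number[]}]}
No prose, no markdown, no preamble.`;

export async function auditCode(code: string) {
  const completion = await client.chat.completions.create({
    model: "gpt-4o",
    temperature: 0.1,                            // 4. low temperature for deterministic structure
    response_format: { type: "json_object" },    // 2. JSON mode: the output string is guaranteed to parse
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      // 3. One few-shot example pinning down the "contextual shape" of the output
      { role: "user", content: "Audit this code:\neval(userInput)" },
      {
        role: "assistant",
        content:
          '{"vulnerabilities":[{"description":"eval on untrusted input","severity":"high","line_numbers":[1]}]}',
      },
      { role: "user", content: `Audit this code:\n${code}` },
    ],
  });

  // JSON mode guarantees syntactic validity; the key names are still enforced by the prompt.
  return JSON.parse(completion.choices[0].message.content ?? "{}");
}
```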
Agent workflows re-run the same tool calls. LLM apps burn tokens on near-identical prompts. Both have the same fix: cache on Valkey. Shipped this week: @betterdb/agent-cache - several releases, now at v0.4.0. Multi-tier exact-match cache for AI agents. Adapters for LangChain, LangGraph, Vercel AI SDK, OpenAI, Anthropic, LlamaIndex. Multi-modal caching, pluggable binary normalizer, cost tracking. Runs on vanilla Valkey, no modules required. @betterdb/semantic-cache v0.2.0 - major overhaul. Five new adapters (OpenAI chat + responses, Anthropic, LlamaIndex, LangGraph memory store), embedding helpers for OpenAI / Voyage / Cohere / Ollama / Bedrock, multi-modal prompts with binary refs, batch lookup, stale-model eviction on upgrades, rerank hook. betterdb-semantic-cache (Python) - port of the Node package. Same API surface, same providers, same cost tracking. All MIT. #Valkey #Redis #AI #LLM #OpenSource
Every AI code reviewer on the market has the exact same fatal flaw: Amnesia. They analyze a pull request in complete isolation. They have no awareness of your team's conventions, no institutional knowledge, and no memory of the catastrophic async generator leak that took down production three weeks ago. So, I built Omni-SRE to fix it.
Omni-SRE is a context-aware code review agent that actually remembers your codebase's history. Before it reads a single line of your diff, it queries Vectorize Hindsight—a persistent vector memory layer—to recall past incidents and vulnerabilities. Those memories are injected directly into the Groq LLM context window alongside the PR diff. The difference is night and day:
❌ Without Memory: The AI blindly approves a PR missing a critical socket teardown block.
✅ With Hindsight Memory: The agent flags the PR as CRITICAL, explicitly citing Incident 2025-08-14-A and enforcing our team's safety conventions.
What I learned architecting this:
- Stateless AI is a local maximum. The next generation of developer tools will carry persistent, team-scoped memory.
- Context management is a product feature. We had to build a custom Token Budget Manager to dynamically balance memory recall depth against diff size so we didn't blow out the context window (see the sketch after this post).
- SSE streaming is non-negotiable. If your agent thinks for 8 seconds and dumps the result, users assume it froze. Streaming intermediate reasoning steps changes the entire perception of latency.
The architecture is fully open-source. You can check out the code and the Python/React stack here: https://lnkd.in/gc4dbuBy If you're building AI-native dev tools—or thinking about how vector memory changes the agent paradigm—I'd genuinely like to hear your take below. 👇 I wrote a full technical breakdown of how we built this. Link in the comments. #AIAgents #AI #Hindsight #AgentMemory #AIMemory #LLM
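The token budgeting idea, in a rough hypothetical sketch (the real Omni-SRE implementation is Python and certainly differs): give the diff priority, then spend whatever headroom remains on the most relevant memories.

```typescript
// Hypothetical token budget split between the PR diff and recalled memories.
// Token counts use a naive chars/4 heuristic; a real implementation would use
// the model's tokenizer.

interface Memory { id: string; text: string; relevance: number }

const approxTokens = (s: string) => Math.ceil(s.length / 4);

function packContext(diff: string, memories: Memory[], contextWindow = 8192, reservedForOutput = 1024) {
  const budget = contextWindow - reservedForOutput;

  // The diff gets priority, capped at 70% of the budget (truncated if larger).
  const diffCap = Math.floor(budget * 0.7);
  const trimmedDiff = approxTokens(diff) > diffCap ? diff.slice(0, diffCap * 4) : diff;
  let remaining = budget - approxTokens(trimmedDiff);

  // Spend whatever is left on the most relevant incidents first.
  const recalled: Memory[] = [];
  for (const m of [...memories].sort((a, b) => b.relevance - a.relevance)) {
    const cost = approxTokens(m.text);
    if (cost > remaining) continue;
    recalled.push(m);
    remaining -= cost;
  }
  return { trimmedDiff, recalled, unusedTokens: remaining };
}
```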
A year ago I built my best app to solve PII redaction the "right" way: server-side, API-dependent, cloud-hosted. It worked. But it bugged me. Every document you scan has to leave your machine. That's the part nobody talks about. Today, motivated by the new model release on Hugging Face, I came back to the problem and rebuilt it from scratch; this time entirely in the browser: no backend, no API key, no data leaving your premises. Ever. Here's what it does: - Detects 8 PII types: names, emails, phones, addresses, account numbers, URLs, dates, secrets - Runs a 1.5B-parameter model client-side via Transformers.js + ONNX Runtime WASM - Gives you confidence scores on every detected entity - Redact or highlight mode, copy-clean output - First load downloads the quantized weights (~200MB) and caches them. Every run after that is instant. - Local model with open weights, meaning you can easily finetune it with Transformers without a limit. Now I can truly call it PII redaction the "right way": Zero infra. Zero cost at scale. Zero trust required. The model is openai/privacy-filter (Apache 2.0), sparse MoE architecture, 128K context window. Transformers.js makes it almost trivially easy to drop models like this into a webpage. If you're building anything where sensitive documents touch your pipeline, this pattern is worth knowing about. #PrivacyByDesign #MachineLearning #NLP #WebAI #TransformersJS #DataPrivacy #OpenSource
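The browser-side pattern is genuinely small. A sketch with a generic NER model as a stand-in (the actual 1.5B model may run through a different pipeline task, so treat the model id and output shape as illustrative):

```typescript
import { pipeline } from "@huggingface/transformers";

// Illustrative client-side entity detection with Transformers.js.
// "Xenova/bert-base-NER" is a generic stand-in, not the model from the post.
// The first call downloads and caches the quantized weights; later runs are instant.
const detector = await pipeline("token-classification", "Xenova/bert-base-NER");

export async function findEntities(text: string) {
  const entities = await detector(text);
  // Each detected entity carries a label and a confidence score to surface in the UI.
  return (entities as Array<{ word: string; entity: string; score: number }>)
    .filter((e) => e.score > 0.5);
}
```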
Built a lightweight AI research agent from scratch with reasoning + memory + transparency. The stack:
- Groq (llama-3.3-70b)
- DuckDuckGo search
- ChromaDB for persistent memory
- Streamlit UI
The agent uses a ReAct-style loop: it decides whether to search the web or answer from memory, logs every decision to a JSON audit trail, and remembers past conversations across sessions. Entirely open source. I hit a few roadblocks that taught me a lot about agentic behavior:
- The agent refused to search: it was answering everything from its training data. Turns out my system prompt was too vague. Once I made the search rules explicit, behavior improved instantly. Specificity in prompts matters more than I expected.
- The silent failure: after integrating ChromaDB, Groq started rejecting requests silently - no errors, just no output. The fix was sanitizing memory before sending it to the LLM: stripping empty strings and truncating entries that were too long (rough sketch below).
- The infinite search loop: I watched the agent search 5 times in a row for the same thing without ever summarizing an answer, hitting my iteration cap. Fixed by appending "Do not search again" to the tool results, which effectively forced the LLM to move from "research mode" to "summary mode."
Now onto the next challenge: using AI agents to rethink and improve how QA workflows operate. Check out the repo below if you're interested in lightweight agent architectures! #llm #ai #python #agentai #groq #chromadb
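The sanitization step is small but load-bearing. The repo itself is Python; here is the same idea sketched in TypeScript with made-up names, just to show the shape:

```typescript
// Illustrative memory sanitizer (the actual project is Python; names are made up).
// Empty strings and overlong entries were what made the LLM calls fail silently.

const MAX_ENTRY_CHARS = 2000;

function sanitizeMemory(entries: string[]): string[] {
  return entries
    .map((e) => e.trim())
    .filter((e) => e.length > 0)                        // drop empty strings
    .map((e) => (e.length > MAX_ENTRY_CHARS
      ? e.slice(0, MAX_ENTRY_CHARS) + " [truncated]"    // cap overlong entries
      : e));
}
```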
TLDR: DDD manages complexity when working with LLM agents. I have been working on a static type-checking plugin for Clojure (https://lnkd.in/gGGxihms), to balance out the flexibility that the language affords with a way of doing sanity checks on the annotations we already use in our codebase (currently for plumatic/schema, and eventually to incorporate other systems such as metosin/malli). AI gave new life to the project, since much of the work was first replacing thousands of lines with an alternative analysis engine, then filling in test cases as I found them. Mostly, this flow worked well. But type checking is a complex beast, and the app grew increasingly bug-ridden and inflexible as the boundaries blurred between the internal typing representations (based on the Blame for All algorithm at https://lnkd.in/gexUzdmS) and the Schema annotations used to provide the initial input. The LLMs increasingly produced bad edits, chewing through more and more tokens. They are next-token prediction engines; bad surface language hurts their ability to work. What saved the app was adhering to strict domains within it. The language of "Schema" would be completely separate from the language of "Types". This was rigorously enforced: the word "schema" could only refer to plumatic/schema references and code working on them, and "type" to the internal representations. Creating this clean boundary, enforcing the language used, and creating small, defined conversion points in one direction (Schema -> Type) resolved this issue. This was easy to put into an AGENTS.md, and the LLM confusion between domains dropped considerably. Work became much more efficient and mechanical. Today, refining this approach and making the domain boundaries tighter solved another class of bugs: Schema was interpreted differently when defining the initial dictionary of references than when the library converted them to Types. By focusing again on clean descriptions of the domains and defining consistent boundaries, I was able to guide the LLM to clean this up in a principled fashion. #AI #SoftwareEngineering #Clojure #DDD
I’ve been building something meaningful over the past few weeks — AI Guru v2. A full-stack AI system that answers questions in the spirit of the Bhagavad Gītā — grounded in actual verses, not hallucinated responses. 1. How it works (simple idea, complex execution): Your question → Semantic search over ~700 verses (embeddings + FAISS) → Optional emotion detection (transformer model) → LLM that is strictly constrained to retrieved context 2. Tech Stack: • Backend: FastAPI • Frontend: React + Vite • Retrieval: Sentence Transformers + FAISS • LLM: OpenAI-compatible APIs (via OpenRouter) 3. What I learned building this: • Why retrieval quality > model size in RAG systems • Importance of using the same embedding model for indexing & querying • Handling CORS, environment separation (VITE_ vs server keys) • Designing systems with clear boundaries (no claims, disclaimers, grounded outputs) 4. This is a learning-first project focused on: • Honest AI (no fake wisdom generation) • Transparent retrieval scores • Clean pipeline: CSV → embeddings → FAISS → RAG → UI I’ve attached a short walkthrough of the full system. Would genuinely love feedback from people working on: • RAG systems • LLM applications • Tech + philosophy / cultural interfaces #MachineLearning #RAG #FastAPI #React #LLM #OpenSource #LearningInPublic #AIProjects