🤖 𝐀𝐈 𝐚𝐠𝐞𝐧𝐭𝐬 𝐝𝐨𝐧'𝐭 𝐡𝐚𝐥𝐥𝐮𝐜𝐢𝐧𝐚𝐭𝐞 𝐢𝐧 𝐩𝐚𝐫𝐚𝐠𝐫𝐚𝐩𝐡𝐬 𝐚𝐧𝐲𝐦𝐨𝐫𝐞. 👻 𝐓𝐡𝐞𝐲 𝐡𝐚𝐥𝐥𝐮𝐜𝐢𝐧𝐚𝐭𝐞 𝐢𝐧 𝐉𝐒𝐎𝐍.

Last week, I spent two hours debugging why my agent kept failing a tool call.

GPT-5.2 output v1: "score": "42"
GPT-5.2 output v2: "score": 42

A switched key or type can throw off your whole workflow, and that's just the tame example. If you work in the AI world, you're probably all too familiar with the real challenges: comparing two LLM function calls side by side, checking what changed in your RAG retrieval payload, debugging why your API v2 returns an extra nested field, or reviewing agent memory before and after running a tool. We tend to handle these tasks the same old way: opening two VS Code tabs, squinting, scrolling, and often missing that one crucial detail.

That's why I created the tool I really needed: 𝐉𝐒𝐎𝐍 𝐃𝐢𝐟𝐟 𝐕𝐢𝐞𝐰𝐞𝐫. It's open source, instant, and requires no login. Just paste JSON A and JSON B, hit compare, and it shows you exactly what's new in green, what's been removed in red, what's changed in yellow, and any type changes in orange (that last one has saved me multiple times!). It's exactly like the screenshot: side by side, clear, distraction-free.

I built this because JSON is the most reliable way we have to express the contracts between models, tools, and APIs in agentic systems, yet we've lacked good tools to see what actually changed. It's free, open source, and works right in your browser. If you're building agents, LLM apps, or working with APIs every day, give it a try; you'll wonder how you ever managed without it. Think of how many hours you lost last month to a missing comma or a string that should have been a number.

Find the link in the first comment, or comment "𝐃𝐈𝐅𝐅" and I'll DM it to you.

#buildinpublic #opensource #aiagents #llm #developers #javascript #python #claudecode #anthropic
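For the curious, the core classification such a viewer performs can be sketched in a few lines of Python (an illustrative helper, not the tool's actual source): walk both objects and bucket each key as added, removed, changed, or type-changed.

```python
def diff_json(a, b, path=""):
    """Classify each key as added, removed, changed, or type_changed."""
    changes = []
    for key in a.keys() | b.keys():
        p = f"{path}.{key}" if path else key
        if key not in a:
            changes.append(("added", p))
        elif key not in b:
            changes.append(("removed", p))
        elif isinstance(a[key], dict) and isinstance(b[key], dict):
            changes.extend(diff_json(a[key], b[key], p))  # recurse into nesting
        elif type(a[key]) is not type(b[key]):
            changes.append(("type_changed", p))  # e.g. "42" (str) vs 42 (int)
        elif a[key] != b[key]:
            changes.append(("changed", p))
    return changes

v1 = {"score": "42", "model": "gpt"}
v2 = {"score": 42, "model": "gpt", "tokens": 120}
print(sorted(diff_json(v1, v2)))  # → [('added', 'tokens'), ('type_changed', 'score')]
```

The `type_changed` bucket is exactly the "42" vs 42 case from the tool-call bug above.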
Snehal Dutta’s Post
I built an 𝗔𝗜 𝗮𝗴𝗲𝗻𝘁 that reads your code and decides, 𝗼𝗻 𝗶𝘁𝘀 𝗼𝘄𝗻, what kind of help it needs.

𝗖𝗼𝗱𝗲𝗦𝗲𝗻𝘀𝗲 - 𝗔𝗻 𝗔𝗜-𝗽𝗼𝘄𝗲𝗿𝗲𝗱 𝗰𝗼𝗱𝗲 𝗮𝘀𝘀𝗶𝘀𝘁𝗮𝗻𝘁 𝗮𝗴𝗲𝗻𝘁 that accepts a code snippet and a natural-language instruction, reasons about the intent, and autonomously selects and runs only the relevant analysis tools.

The six tools it can call:
• Bug Detector
• Security Vulnerability Checker
• Code Quality Analyzer
• Code Explainer
• Unit Test Generator
• Refactor Suggester

Each tool runs on a focused 𝘀𝘆𝘀𝘁𝗲𝗺 𝗽𝗿𝗼𝗺𝗽𝘁 engineered to return strict structured JSON, making the output predictable, parseable, and renderable as distinct UI components.

𝗛𝗼𝘄 𝗱𝗼𝗲𝘀 𝗶𝘁 𝘄𝗼𝗿𝗸? A 𝗟𝗮𝗻𝗴𝗖𝗵𝗮𝗶𝗻 agent powered by 𝗚𝗼𝗼𝗴𝗹𝗲 𝗚𝗲𝗺𝗶𝗻𝗶 reads both the code and your instruction, reasons about your intent, and selects only the relevant tools from the pool of six. It does not run all six blindly. It picks what makes sense, executes those tools, observes the results, and decides whether anything else is needed before giving you a final answer. That decision-making loop is what makes it an actual agent.

𝗧𝗲𝗰𝗵 𝘀𝘁𝗮𝗰𝗸: Python, FastAPI, LangChain, Google Gemini API, React.js, Tailwind CSS - deployed on Render and Vercel.

Live link and GitHub repo in the comments.

#AI #LangChain #GenerativeAI #AgenticAI #Python #FastAPI #React #FullStackDevelopment #LLM #MachineLearning #OpenToWork
Cursor just wrote prototype pollution into a developer's Node.js project. He had no idea.

The function looked fine: a recursive merge utility. Clean code that passes all the unit tests.

❌ Bad:

for (const key of Object.keys(source)) {
  if (typeof source[key] === 'object') {
    merge(target[key], source[key])
  } else {
    target[key] = source[key]
  }
}

✅ Fix:

const merged = structuredClone({ ...defaults, ...userInput })

This is the same pattern that opened CVE-2019-10744 in lodash. AI models trained on pre-2019 StackOverflow answers reproduce it because nobody told them it became dangerous.

An attacker sends { "__proto__": { "admin": true } } to any endpoint using this merge. After that, every plain object in your Node process inherits admin: true. Auth checks fail silently. No error. No crash. Just poisoned state.

structuredClone() (Node 17+) kills this entirely: no prototype chain to pollute.

Full breakdown with the safe alternatives and a quick grep command to audit your own repo: https://lnkd.in/dDjyz-Rf

#security #webdev #ai #devsecops
Built an Advanced Personal Assistant from scratch. Here's what it actually does.

Started with a blank Next.js project and a FastAPI skeleton. The result is Ava, an AI assistant that reasons, remembers, and acts across sessions.

The stack: Next.js 16 · FastAPI · SQLite · Groq API · Python

Groq handles inference at blazing speed. Everything else (memory, plugins, sessions, file operations) runs on your own machine.

● Agentic tool calling: The LLM doesn't just respond; it decides. Every message goes through an orchestration loop that determines whether to answer directly or invoke a tool. Weather, time zones, calculations, web search, GitHub stats, crypto prices: all fire as live tool calls with transparent execution blocks in the UI.

● Multi-model fallback cascade: If the primary model hits rate limits, the system silently falls back through a chain of models without breaking the conversation. The user never sees an error.

● Code execution: Ava writes Python, runs it in a sandboxed subprocess, reads the output, fixes errors, and iterates, all in a single turn. The full execution trace is visible inline.

● Persistent memory: After every conversation, a background extraction pass pulls facts, preferences, and events into a structured vault. Location, tech stack, habits: all remembered across sessions without any manual tagging.

● Voice and vision: Push-to-talk via MediaRecorder piped to Groq Whisper for transcription. Image upload routes to a vision model for analysis, OCR, and structured extraction.

● Dynamic plugin system: Install and uninstall tools at runtime. Register a custom skill by uploading a markdown file; the parser extracts the schema and makes it callable immediately, no backend changes required.

● Session archive: Every conversation is stored and browsable. Restore any past session back into the live chat with one click.

The hardest parts were never the features themselves. They were the details: preventing tool-call JSON from truncating mid-generation, stripping internal reasoning tokens before they reach the UI, making a free tier feel unlimited through intelligent model routing. The gap between a working demo and a reliable product is where most AI projects fall apart. This one doesn't.

Happy to go deep on any part of the architecture in the comments.

#llm #nextjs #fastapi #python #ai #groq #softwaredevelopment #webdevelopment
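A fallback cascade like the one described is simple to sketch. The following is a minimal illustration with hypothetical model names and a stubbed API call, not Ava's actual code:

```python
# Minimal multi-model fallback cascade (model names and the rate-limit
# behavior below are fabricated for illustration).

class RateLimited(Exception):
    pass

MODEL_CHAIN = ["primary-70b", "mid-8b", "tiny-1b"]  # hypothetical model IDs

def call_model(model, prompt):
    # Stub for a provider SDK call; pretend the primary model is rate-limited.
    if model == "primary-70b":
        raise RateLimited(model)
    return f"[{model}] answer to: {prompt}"

def chat(prompt):
    """Try each model in order; the caller never sees a rate-limit error."""
    last_err = None
    for model in MODEL_CHAIN:
        try:
            return call_model(model, prompt)
        except RateLimited as err:
            last_err = err  # fall through to the next model silently
    raise RuntimeError("all models exhausted") from last_err

print(chat("hello"))  # → [mid-8b] answer to: hello
```

The design point is that the retry logic lives below the conversation layer, so the UI stream is never interrupted by a provider error.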
𝐂𝐥𝐚𝐮𝐝𝐞 𝐂𝐨𝐝𝐞 has native OpenTelemetry support (check links). 𝐂𝐮𝐫𝐬𝐨𝐫 does not, but it has something else: a hooks system. And the community has already built the bridge.

The Cursor hooks system fires pre/post events for every agent action: tool calls, shell commands, MCP server calls, file edits, session lifecycle. It was designed for governance tooling. It turns out it's also exactly what you need to emit OTLP spans.

I evaluated two implementations:

→ 𝐜𝐮𝐫𝐬𝐨𝐫-𝐨𝐭𝐞𝐥-𝐡𝐨𝐨𝐤 (by 𝐋𝐚𝐧𝐠𝐆𝐮𝐚𝐫𝐝): Python, single setup script, any OTLP backend, built-in data masking, GenAI semantic conventions
→ 𝐜𝐮𝐫𝐬𝐨𝐫-𝐥𝐚𝐧𝐠𝐟𝐮𝐬𝐞 (by 𝐧𝐚𝐨𝐮𝐟𝐚𝐥𝐞𝐥𝐡): JS/npm, Langfuse-specific, more setup steps, no privacy controls

𝐜𝐮𝐫𝐬𝐨𝐫-𝐨𝐭𝐞𝐥-𝐡𝐨𝐨𝐤 is the one I'd recommend, and not just for the simpler install. It uses the same GenAI semantic conventions that Claude Code's native OpenTelemetry support uses. That matters when you're eventually pulling data from both tools into the same backend.

𝗪𝗛𝗔𝗧 𝗬𝗢𝗨'𝗟𝗟 𝗦𝗘𝗘 𝗢𝗡𝗖𝗘 𝗜𝗧'𝗦 𝗥𝗨𝗡𝗡𝗜𝗡𝗚

1. 🔧 𝗖𝘂𝗿𝘀𝗼𝗿 𝗿𝘂𝗹𝗲𝘀 𝗲𝗳𝗳𝗲𝗰𝘁𝗶𝘃𝗲𝗻𝗲𝘀𝘀: which .cursorrules produce clean single-pass completions, and which trigger expensive multi-turn correction loops
2. 🔌 𝗠𝗖𝗣 𝗰𝗮𝗹𝗹 𝗽𝗮𝘁𝘁𝗲𝗿𝗻𝘀: which servers get invoked, how often, and whether any add latency without value
3. 💰 𝗖𝗼𝘀𝘁 𝘃𝗶𝘀𝗶𝗯𝗶𝗹𝗶𝘁𝘆: token burn by session, by tool, by model
4. 🔄 𝗜𝗺𝗽𝗿𝗼𝘃𝗲𝗺𝗲𝗻𝘁 𝘀𝗶𝗴𝗻𝗮𝗹: Cursor rules are your tuning lever; telemetry shows you where to pull it

The bigger picture: once traces from Claude Code and Cursor land in the same Langfuse instance, you can start building evaluation loops, comparing tools, prompts, and agent behaviors on real data. That's the next step. More on that soon. Start collecting traces now, privately, even locally, so you have the data to evaluate once the tooling matures.

Have you tried 𝐜𝐮𝐫𝐬𝐨𝐫-𝐨𝐭𝐞𝐥-𝐡𝐨𝐨𝐤 or 𝐜𝐮𝐫𝐬𝐨𝐫-𝐥𝐚𝐧𝐠𝐟𝐮𝐬𝐞? Curious what you're using, and how you're making use of it!

#Cursor #OpenTelemetry #Langfuse #AIObservability #DevTools #CodingAssistant

(Links in first comment)
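To make the hook-to-span idea concrete, here is a stdlib-only sketch of translating a hook event into an OTLP-style span dict. The event fields are assumptions, not Cursor's documented hook schema, and a real bridge such as cursor-otel-hook uses the OpenTelemetry SDK rather than hand-built dicts:

```python
# Sketch: hook event in, OTLP-style span out. Field names on the input
# event are hypothetical; attribute keys follow GenAI semantic-convention style.
import json
import time
import uuid

def event_to_span(event: dict) -> dict:
    start = time.time_ns()
    return {
        "traceId": uuid.uuid4().hex,
        "name": f"cursor.{event.get('kind', 'unknown')}",
        "startTimeUnixNano": start,
        "endTimeUnixNano": start + event.get("duration_ms", 0) * 1_000_000,
        "attributes": {
            "gen_ai.operation.name": event.get("kind"),
            "gen_ai.usage.input_tokens": event.get("input_tokens", 0),
        },
    }

# A hook would receive something like this on stdin per agent action.
raw = '{"kind": "tool_call", "duration_ms": 120, "input_tokens": 842}'
span = event_to_span(json.loads(raw))
print(span["name"], span["attributes"]["gen_ai.usage.input_tokens"])
# → cursor.tool_call 842
```

The point of sticking to the GenAI attribute naming is the one from the post: spans from Cursor and from Claude Code's native telemetry then query the same way in one backend.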
Your AI prompts are terrible because you're treating Claude like Stack Overflow.

Stop writing: "Please write a Python function that takes a list and returns..."

Start writing: "I'm building a user auth system. Here's my current code: [paste]. Help me add password reset functionality."

Context beats politeness every time. Give Claude your actual code, your actual problem, your actual constraints. "Please" and "thank you" don't make the output better. Specificity does.

The best developers aren't polite prompters. They're context providers.
Anthropic accidentally leaked Claude Code's entire 500,000+ line TypeScript source code through an npm package that included a development source map file. This ironic leak from the "safety-first" company revealed that their AI coding assistant is essentially sophisticated prompt engineering rather than revolutionary technology, along with unreleased features and anti-competitive tactics.

A few points:

1. The leak happened through a 57 MB source map file accidentally included in npm package version 2.1.88, possibly due to a bug in Bun.js
2. Claude Code is built with 11 processing steps but relies heavily on hard-coded prompts and guardrails rather than advanced AI architecture
3. Anthropic implemented "anti-distillation poison pills": fake tools designed to sabotage competitors trying to copy Claude's outputs
4. "Undercover mode" instructs Claude to hide its AI identity in outputs, potentially to make AI-generated code appear human-written
5. The system uses basic regex patterns to detect user frustration through keywords like "damn" and "balls" 😂
6. Leaked code revealed unreleased features including "Buddy" (an AI companion), "Chyus" (a background agent), and references to future models
7. The open-source community quickly created "Claw Code", a Python rewrite of the leaked TypeScript code
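For a sense of what keyword-based "frustration detection" (point 5) looks like in general, here is a purely illustrative Python version. The keyword list and structure are invented for illustration; this is not the leaked TypeScript.

```python
# Illustrative keyword-based sentiment check (fabricated word list).
import re

FRUSTRATION = re.compile(r"\b(damn|ugh|wtf|broken again)\b", re.IGNORECASE)

def is_frustrated(message: str) -> bool:
    """True if the message contains any of the trigger keywords."""
    return bool(FRUSTRATION.search(message))

print(is_frustrated("Damn, the build is broken again"))  # → True
print(is_frustrated("Looks good, ship it"))              # → False
```

Which is the point of the post: there is no model inference here at all, just a regex, and that is why the reveal reads as "prompt engineering rather than revolutionary technology."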
🚀 Just shipped: RAG Chat, a document Q&A chatbot that actually knows when to say "I don't know."

Most LLM-powered apps hallucinate freely. I built one that can't: it only answers from documents you upload, and explicitly refuses if the answer isn't there.

Here's what's under the hood:

🧠 Agentic RAG pipeline
QueryRewriter → CacheAgent → RetrieverAgent → AnswerAgent, each a modular, swappable agent. Adding a new step is literally one Python file.

⚡ Semantic Q&A cache
Repeated or similar questions skip the LLM entirely (cosine similarity threshold: 0.92). Real-world chatbots get asked the same things constantly; this makes it fast and cheap.

🗂️ 3-layer memory architecture
• STM: Redis, last 20 messages, 24h TTL
• MTM: rolling summaries in DashVector
• LTM: document chunks + facts, ANN + SQL fusion retrieval

📄 4 chunking strategies
Recursive, fixed-size, sentence-based, paragraph-based: pick the one that fits your document type.

🛡️ Anti-hallucination by design
7 hard rules in the system prompt. If the context doesn't have the answer, the assistant says so.

Stack: FastAPI · Next.js 14 · DashVector · Qwen (qwen-plus) · Redis · PostgreSQL · SQLAlchemy 2.0

👇 Full source on GitHub (link below); contributions welcome!

#RAG #LLM #AIEngineering #GenerativeAI #Python #NextJS #OpenSource
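The semantic cache idea can be sketched in a few lines. This is illustrative only: the real system uses embedding models, DashVector, and Redis, while here toy vectors and pure-Python cosine similarity stand in for them.

```python
# Semantic Q&A cache sketch: skip the LLM when a sufficiently similar
# question was already answered. Vectors below are toy stand-ins for
# real embeddings.
import math

THRESHOLD = 0.92  # the similarity cutoff quoted in the post

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

cache = []  # list of (question_embedding, answer) pairs

def ask(embedding, answer_fn):
    """Return a cached answer on a near-duplicate question; else call the LLM."""
    for vec, answer in cache:
        if cosine(embedding, vec) >= THRESHOLD:
            return answer, True          # cache hit: LLM skipped entirely
    answer = answer_fn()                 # cache miss: pay for the LLM call
    cache.append((embedding, answer))
    return answer, False

a1, hit1 = ask([1.0, 0.0, 0.1], lambda: "Paris")           # miss, stores answer
a2, hit2 = ask([0.99, 0.02, 0.1], lambda: "never called")  # near-duplicate: hit
print(a1, hit1, a2, hit2)  # → Paris False Paris True
```

In production the interesting tuning knob is the threshold: too low and paraphrases get wrong cached answers, too high and the cache never fires.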
🚀 Firecrawl Just Changed the Game for Open-Source Web Agents

Firecrawl has dropped a MAJOR open-source release: a fully transparent framework that lets you build AI agents capable of searching, scraping, and interacting with web pages using any model you choose. No walls. No vendor lock-in. Just pure, flexible power.

Whether you're prototyping or scaling, you can now spin up your own web agent instantly and blast off right away. To all the web-agent developers out there who love to tinker, build, and own your stack: the repo is live. Go fork it, play with it, and make it yours.

The era of open web intelligence is here. Don't just watch. Contribute.

🔧 Source: https://lnkd.in/ee9Vbqiw

#OpenSource #AI #WebScraping #Firecrawl #AIAgents #MachineLearning #DeveloperTools #LLM #BuildInPublic #Python #SoftwareDevelopment
I'm writing this mid-Claude session, half captain's log and half cool tidbit for the power users.

This evening I'm working on a new tool built in Python and Rust, and I'm using Claude Code to power me through it. Before beginning phase 1 of the build, my Orchestrator noticed a gap in my notes and asked whether I wanted it to do research or whether I had material to provide. Given quite a few past experiences... lol, performing a deep web search right before a lengthy Orchestrator duel would be detrimental to the depth and quality of my session. So I immediately took the questions it had for me into a fresh deep-research interface and spun up the ol' web crawlers 😏.

What did I get? Exactly what my orchestration agent needed, in a nice .md file, plus an expert-level review that would have taken me hours to complete... and to be honest, I probably would have skipped doing it altogether given the time commitment. The output I was expecting? Absolutely crispy clean and usable 😎

Next time you're up against a difficult question from your AI assistant, pause and think. Is this something a tool is better placed to answer for me? Will doing this task in the same session create unnecessary context? Pausing before acting can be the difference between a strong output and hours of lost, oh-so-frustrated time. (And tokens... we love our tokens...)

Happy creating :)
The link is here 👇 https://snehaldutta.github.io/json-diff.io/