A ~550-word AGENTS.md reduced agent runtime by 28.64% and token usage by 16.58% on SWE-bench Verified. The trick wasn't more context: it was less ambiguity.

I tested these ideas while refactoring agent docs for a production Python/FastMCP monorepo at NOS. What stuck with me:

→ AGENTS.md works when it's executable onboarding. Setup + test commands beat prose (Lulla et al.).
→ AGENTS.md is becoming the interoperable default. 4,860 context files across GitHub; `.cursorrules` is basically legacy (Galster et al.).
→ Short beats comprehensive. Most files are <500 words; medians cluster around ~335–535 words (Chatlatanagulchai et al.).
→ Testing instructions are the highest-signal section. They show up in ~75% of high-quality files.
→ Auto-generated context can backfire. LLM-generated files dropped success by ~3% on average while raising cost >20% (Gloaguen et al.).
→ File localization is where agents fail first. If they edit the wrong file, everything downstream collapses (ContextBench).

What I did with this: one canonical AGENTS.md (~550 words, every snippet verified), CLAUDE.md + Copilot instructions as thin pointers, deleted `.cursorrules`, and 4 path-scoped instruction files that auto-inject context per folder.

Takeaway: context engineering is mostly negative space. Remove contradictions, name the right files, and make "run tests" unmissable.

Sources:
https://lnkd.in/eM-HnnGs
https://lnkd.in/eN7pUsfY
https://lnkd.in/eHAarmSC
https://lnkd.in/e9Fx6UC7
https://lnkd.in/eJM2EHkh
https://lnkd.in/eTqgZZqK
https://lnkd.in/egk_dX8U

#ContextEngineering #AICoding #CodingAgents #SoftwareEngineering #MCP #LLMs #DeveloperTools
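For readers who haven't seen one: here is a minimal sketch of what "executable onboarding" can look like. This is illustrative only, not the actual NOS file; every command and path below is a placeholder to swap for your repo's real ones.

```markdown
# AGENTS.md

## Setup
Run `pip install -e ".[dev]"` (placeholder: substitute your project's real install command).

## Run tests
Run `pytest -q` before every commit. If tests fail, stop and fix them first.

## Key files
- `src/server.py`: FastMCP entry point (hypothetical path, for illustration)
- `tests/`: mirrors the `src/` layout

## Conventions
- Python 3.11+, type hints required
- Never edit generated files under `generated/`
```

The point is that every line is either a command an agent can run or a constraint it can check, not prose it has to interpret.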
Diogo Santos’ Post
PSA for devs using HuggingFace Inference API for embeddings

Spent some time debugging a cryptic error today and wanted to share the fix. If you're using router.huggingface.co with sentence-transformers/all-MiniLM-L6-v2, you might hit this:

https://lnkd.in/dAcz8rmk() missing 1 required positional argument: 'sentences'

Here's what's happening: the HF inference router determines the pipeline from the model's tags. all-MiniLM-L6-v2 gets routed to the SentenceSimilarity pipeline instead of feature-extraction, so the request payload format doesn't match what the pipeline expects.

The fix? Use a model HuggingFace explicitly tags as feature-extraction. I switched to BAAI/bge-base-en-v1.5 and the error went away immediately.

A few more things worth knowing:
→ https://lnkd.in/dxdEg7gX is now deprecated and returns 410 Gone. You can't fall back to the old endpoint; use router.huggingface.co going forward.
→ all-MiniLM-L6-v2 outputs 384-dimensional vectors. If your pgvector column is defined as vector(768), you have a silent dimension mismatch you may not have caught yet. BAAI/bge-base-en-v1.5 outputs 768 dims, so switching also fixed that for us.

TL;DR:
1. Check that your model is tagged for the right pipeline on HuggingFace
2. Make sure your embedding dimensions match your vector column
3. https://lnkd.in/dxdEg7gX is gone; use router.huggingface.co

Hope this saves someone a few hours.

#softwaredevelopment #machinelearning #embeddings #huggingface #postgresql #pgvector #python #developers
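One cheap way to turn the silent dimension mismatch into a loud one: validate every vector before it touches the database. A minimal sketch, assuming a plain list of floats from whatever client you use (the function name and constant are mine, not any library's):

```python
# Sketch: fail fast on embedding-dimension mismatches before they reach
# a pgvector column. EXPECTED_DIM must match the column definition,
# e.g. vector(768) for BAAI/bge-base-en-v1.5.
EXPECTED_DIM = 768

def check_dims(embedding: list[float], expected: int = EXPECTED_DIM) -> list[float]:
    """Raise instead of silently inserting a wrong-sized vector."""
    if len(embedding) != expected:
        raise ValueError(
            f"embedding has {len(embedding)} dims, column expects {expected}"
        )
    return embedding
```

Call it on every vector right before the INSERT; a ValueError in CI beats a corrupted similarity index in production.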
🚀 FlameIQ v1.0.2 Released

FlameIQ is an open-source performance regression detection engine designed for CI environments.

Key capabilities
• Compares benchmark results against a stored baseline on every CI run
• Enforces per-metric thresholds with direction-aware regression logic
• Optional Mann–Whitney U statistical significance testing
• Generates self-contained HTML performance reports
• Outputs machine-readable JSON results for CI pipelines

Installation
pip install flameiq-core

Resources
Documentation: https://lnkd.in/d6e2D7mq
PyPI: https://lnkd.in/d-2KcKFd
Source Code: https://lnkd.in/d2VDWRQa

Contributions and feedback are welcome as the project continues to evolve.

#opensource #performanceengineering #python #devtools #cicd
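"Direction-aware" matters because some metrics degrade by going up (latency) and others by going down (throughput). A minimal sketch of that logic in the spirit of the feature list above, not FlameIQ's actual API:

```python
# Direction-aware threshold check: a regression is a degradation past
# `threshold` (a ratio, e.g. 0.10 for 10%) in the metric's bad direction.
def is_regression(baseline: float, current: float,
                  threshold: float, higher_is_better: bool = False) -> bool:
    if higher_is_better:
        # e.g. throughput: dropping below baseline * (1 - threshold) is bad
        return current < baseline * (1 - threshold)
    # e.g. latency: rising above baseline * (1 + threshold) is bad
    return current > baseline * (1 + threshold)
```

So a latency move from 100 ms to 112 ms trips a 10% threshold, while a throughput dip from 1000 to 950 req/s does not.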
This article shows how to add dynamic voice alerts to MetaTrader 5 EAs using a local text-to-speech pipeline, avoiding the limitations of pre-recorded WAV notifications. An MQL5 Expert Advisor formats market events (signals, prices, time, ATR, margin/trade status) into text and sends it via WebRequest HTTP POST to a localhost Python service. The Python side exposes a simple /speak endpoint, queues incoming messages, and speaks them sequentially to prevent overlaps and thread-related crashes. A moving-average crossover EA demonstrates the pattern, including a TestMode that speaks on each new bar and practical error handling for common WebRequest failures. This design improves accessibility, reduces alert maintenance, and stays fully local for low latency and privacy. #MQL5 #MT5 #EA #TTS https://lnkd.in/dRx_7Qze
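The "queue incoming messages and speak them sequentially" part is the heart of the design. A minimal sketch of that pattern, with the actual TTS call injected as a function (the article's real service and endpoint details are not reproduced here):

```python
import queue
import threading

def start_speaker(say):
    """Start a worker that speaks queued messages one at a time.

    `say` is the real TTS call (e.g. pyttsx3's say + runAndWait).
    Serializing everything through one worker thread is what prevents
    overlapping audio and engine thread-safety crashes.
    """
    q: queue.Queue = queue.Queue()

    def worker():
        while True:
            msg = q.get()
            if msg is None:  # sentinel: shut the worker down cleanly
                break
            say(msg)
            q.task_done()

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return q, t
```

The HTTP handler for /speak then only has to `q.put(text)` and return immediately; the worker drains the queue at speech speed.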
I wrote a CLI tool to help your team catch "slop" code before it creates a mountain of tech debt. Runs on any Python project from 3.10 to 3.13. Drop it into your CI/CD pipeline as a last-pass linter for AI-specific slop patterns: vague comments, catch-all util functions, and more.

Check it out!
https://lnkd.in/e5xNUwH2
https://lnkd.in/eT6psfZS
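To make "vague comments" concrete, here is a toy sketch of what one such check might look like. These patterns are illustrative only and are not the tool's actual rules:

```python
import re

# Toy check: flag comments that say nothing, like "# do stuff"
# or "# Handle logic". A real linter would have many more rules.
VAGUE_COMMENT = re.compile(
    r"#\s*(do|handle|process|manage)\s+(stuff|things|logic|data)\b",
    re.IGNORECASE,
)

def find_vague_comments(source: str) -> list[int]:
    """Return 1-based line numbers containing vague comments."""
    return [
        i for i, line in enumerate(source.splitlines(), start=1)
        if VAGUE_COMMENT.search(line)
    ]
```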
🚀 Debugging Journey: Largest BST in a Binary Tree (with Solution)

Today I worked on a classic DSA problem, finding the largest BST inside a Binary Tree, and it really tested my debugging + recursion skills.

💡 Key Learnings:
🔹 Use correct boundary values → float('inf'), float('-inf')
🔹 Always track multiple things in recursion (size, min, max, BST status)
🔹 Correct condition is: max(left) < root < min(right)
🔹 Small syntax mistakes can break the whole logic

✅ Approach: for every node, return:
• Size of the BST
• Minimum value
• Maximum value
• Whether it's a BST
If the subtree is a BST → combine left + right
Else → take the max of the left/right subtrees

💻 Python Solution:

class Solution:
    def largestBst(self, root):
        def helper(node):
            if not node:
                return 0, float('inf'), float('-inf'), True
            N1, min1, max1, isBST1 = helper(node.left)
            N2, min2, max2, isBST2 = helper(node.right)
            if isBST1 and isBST2 and max1 < node.data < min2:
                return (N1 + N2 + 1, min(min1, node.data),
                        max(max2, node.data), True)
            return max(N1, N2), 0, 0, False

        ans, _, _, _ = helper(root)
        return ans

📌 Key Takeaway: "Debugging isn't just fixing code; it's understanding your logic deeply."

This problem helped me improve:
✔️ Tree Traversal
✔️ Recursive Thinking
✔️ Debugging Mindset

#DSA #Python #BinaryTree #CodingJourney #Debugging #LearnInPublic #TechSkills
Secrets Hunter v0.6.0 is here!

Most secret scanners ask one question: "Does this suspicious string match a known pattern?"

Secrets Hunter asks a better one: "What is the context this suspicious string was found in?"

A high-entropy string alone means little. But a high-entropy string next to API_KEY= is a different story entirely.

Secrets Hunter combines Shannon entropy with assignment context analysis and a bigram language model to catch secrets that no pattern library would ever cover, including custom tokens, internal credentials, and formats unique to your stack.

Here's what makes it different:
→ Assignment context boosting: severity is determined not just by the string itself, but by what it's assigned to
→ Semantic false positive filtering: strings that look like human language get filtered out before they reach you
→ Zero dependencies: pure Python 3.11+, no binaries, no system packages, no network calls at scan time
→ TOML overlay config: stack team, CI, and local configs without duplicating anything
→ SARIF + JSON export: plugs directly into GitHub Code Scanning and any CI pipeline

Built from real AppSec experience. Lightweight by design. Fully auditable.

👉 pip install secrets-hunter
Visit the Secrets Hunter repository: https://lnkd.in/dYNysfgW

#AppSec #DevSecOps #SecretScanning #Python #OpenSource #SecurityTools
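The entropy + context idea fits in a few lines. A minimal sketch of the principle (the function names, keywords, and boost value are mine, not the tool's internals):

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits per character: random tokens score high, English words lower."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def score(value: str, assigned_to: str = "") -> float:
    """Entropy plus an assignment-context boost (illustrative numbers)."""
    suspicious = ("KEY", "TOKEN", "SECRET", "PASSWORD")
    boost = 2.0 if any(k in assigned_to.upper() for k in suspicious) else 0.0
    return shannon_entropy(value) + boost
```

The same string scores higher when it sits on the right-hand side of `API_KEY=` than when it is assigned to `username`, which is exactly the "context beats pattern" point.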
📣 SynapseKit v0.6.9 is live.

Two graph features in this release that I think matter more than they look.

approval_node(): gates your graph on a human decision. The workflow hits a node, pauses, waits for a human to approve or reject, then continues. No polling, no hacks. One function call.

dynamic_route_node(): routes to completely different subgraphs at runtime based on whatever logic you write. Sync or async. Your graph decides where it goes next while it's running.

Together these two make human-in-the-loop workflows actually practical to build. Not a demo. Production.

Also shipped:
💬 SlackTool: send messages via webhook or bot token
📋 JiraTool: search, create, and comment on issues via REST
🔍 BraveSearchTool: web search via the Brave API

All three are stdlib only. Zero new dependencies.

Where we stand: 32 tools · 15 providers · 18 retrieval strategies · 795 tests · 2 dependencies.

⚡ pip install synapsekit
🔗 https://lnkd.in/d2fGSPkX

#Python #LLM #RAG #OpenSource #AI #MachineLearning #Agents #SynapseKit
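For readers new to these patterns, here is a generic sketch of what an approval gate and a dynamic router do conceptually. This is NOT SynapseKit's API, just the underlying shape of the two ideas:

```python
from typing import Callable

def approval_gate(payload, wait_for_decision: Callable[[], bool]):
    """Block until a human decision arrives; raise on rejection."""
    if not wait_for_decision():
        raise RuntimeError("rejected by reviewer")
    return payload

def dynamic_route(payload, chooser: Callable, routes: dict):
    """Pick a subgraph by key at runtime and run it on the payload."""
    return routes[chooser(payload)](payload)
```

The value of having these as first-class nodes in a graph library is that pausing, persistence, and resumption are handled for you instead of being hand-rolled around calls like these.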
I thought contributing to Streamlit meant fixing small UI bugs. I didn't expect to reverse-engineer its entire architecture.

While working on a component, I kept asking: "Where does this button actually go?" I became so curious that I kept learning more and more about it 👇

User Python Code
↓
Elements API (button.py)
↓
DeltaGenerator
↓
Protobuf
↓
Runtime / Session Manager
↓
Tornado WebSocket
↓
React (TSX)
↓
Browser UI

Unlike Django templates or Spring Boot REST APIs, Streamlit flows like this:

🧠 Python → Protobuf → WebSocket → React → Rerun Model

Every click reruns the script. State lives in the session manager. UI updates happen through delta patches over a persistent WebSocket.

That's when it clicked: Streamlit isn't just client–server. It's a reactive execution architecture, closer to React + Jupyter + event-driven systems.

The lesson? If a framework feels "simple," it's usually hiding something sophisticated. Don't stop at using it; dissect it. Understanding the layers transforms you from a user into a builder.

How do you usually break down a complex system to understand it completely?

#CodingJourney #DeveloperCommunity #Streamlit #OpenSourceContributing
Claude can write a compiler, but it can't spell "anthropic" backwards.

I tested it. Pure LLM: "cipohtrpna" => wrong. Give it a sandbox? It writes `echo "anthropic" | rev`, runs it, gets "ciporhtna" => correct.

10 tests where LLMs fail: counting, reversal, prime factorization, set ops. No sandbox: 8/10. With sandbox: 10/10. I didn't tell it to write code; it just recognized what it's bad at and reached for Python!

Anthropic just shipped Programmatic Tool Calling (Claude writes code that calls your tools). Fewer round-trips, fewer tokens. It runs in their container, not yours. Testing it, something occurred to me: does that still work without tools? Yes, and it patches its own blind spots with code!

Have you tried PTC already?

#AICoding #ClaudeCode #AIAgents #LLM
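The sandboxed fix really is a one-liner: compute the reversal instead of predicting it token by token. The same check in Python rather than shell:

```python
# Deterministic string reversal: what the model reaches for in a sandbox
# instead of guessing character order from its tokenized representation.
s = "anthropic"
print(s[::-1])  # ciporhtna
```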
Everyone learns stacks. But very few understand where they actually matter.

Take a simple problem: checking if brackets are balanced.

Most people think it's about counting. It's not. It's about order.

Here's what really happens behind the scenes:
→ You scan the expression left to right
→ Every opening bracket goes onto a stack
→ Every closing bracket tries to match the last opening one

If it matches → remove it
If it doesn't → the entire structure breaks

That's the moment you realize: stacks aren't just data structures. They are decision systems. They enforce rules like: Last In → First Out.

And that's exactly how:
• Code editors validate syntax
• Compilers detect errors
• Browsers manage navigation history

A simple example:
[(a+b)] → Valid ✔
[(a+b] → Invalid ❌

Same characters. Different structure. That's the difference between working code and broken logic.

The lesson? In programming, and in systems, structure beats quantity. Always.

#DataStructures #Python #ProblemSolving #CodingJourney #AIThinking
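The scan-push-match steps above fit in a few lines of Python. A minimal sketch:

```python
def is_balanced(expr: str) -> bool:
    """Stack-based bracket matching: order matters, not just counts."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in expr:
        if ch in "([{":
            stack.append(ch)          # opening bracket: remember it
        elif ch in pairs:
            # closing bracket: must match the most recent opening one
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack                  # leftovers mean unclosed brackets
```

Note that a pure counter would accept "([)]" (equal counts of each kind) while the stack correctly rejects it, which is exactly the counting-versus-order point.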