AI coding agents are incredible, but they all suffer from the same fatal flaw: attention dilution.

If you've ever unleashed an agent on a massive, years-old legacy codebase, you know what happens. It runs plain `cat` or `grep`, swallows thousands of lines of boilerplate just to change one function, burns through tokens, and eventually hallucinates. Vector RAG isn't the answer for code either: embeddings lose the code's structure, and the agent goes bananas.

To solve this, I've been building OptiVault: an open-source static context compiler and Model Context Protocol (MCP) server. Instead of dumping raw code into the context window, OptiVault intercepts it.

A zero-fat (keto?) AI developer workflow:
- AST-driven semantic routing: powered by Tree-sitter, OptiVault extracts deterministic function signatures and dependency skeletons, so LLMs can drill down hierarchically without the bloat.
- Obsidian dual-compatibility: the shadow context acts as both an AI index AND a human-readable Obsidian knowledge graph.
- The autopilot loop: OptiVault generates a CLAUDE.md that teaches agents to use the MCP tools, sync their own context after edits, and strictly follow best agent-coding practices.

GitHub: https://lnkd.in/eiZpsG6g

#AI #SoftwareEngineering #Claude #MCP #OpenSource #Token #Obsidian
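To make the skeleton idea concrete: OptiVault's extraction is built on Tree-sitter across languages, but the same "signatures in, bodies out" trick can be sketched for a single language with Python's built-in `ast` module. This is my toy illustration, not OptiVault's code; `skeleton` and the sample source are invented names.

```python
import ast

def skeleton(source: str) -> list[str]:
    """Return function signatures with bodies stripped: the kind of
    low-token skeleton an agent can drill into on demand."""
    sigs = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            ret = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            sigs.append(f"def {node.name}({args}){ret}")
    return sigs

sample = '''
def add(a: int, b: int) -> int:
    return a + b

def greet(name):
    print(f"hi {name}")
'''
print(skeleton(sample))  # ['def add(a, b) -> int', 'def greet(name)']
```

An agent that sees only these signatures can then request the one body it actually needs, instead of the whole file.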
Ali Danesh Moghaddam’s Post
More Relevant Posts
Trying to maximize #OptiVault: Claude still puts its token budget under stress on longer tasks, so it's a cat-and-mouse fight. But I couldn't sleep much with everything going on, so I put OptiVault on a diet with Zero-Fat Navigation (query_graph, concept merging, a strict protocol, and a real benchmark).

After installing it with npm and linking it to Claude, you can test it on your own project:

cd ~/some-other-repo
optivault init .   # indexes the repo, creates _optivault/ + CLAUDE.md
claude mcp add optivault optivault -- mcp \
  --vault "$(pwd)/_optivault" \
  --source "$(pwd)"

Then open #ClaudeCode in that repo: it will pick up the new CLAUDE.md protocol and all 6 MCP tools, including the new query_graph.
If you're building AI agents and haven't seen this yet: it's every prompt Claude Code uses, rewritten and open-sourced. The verification agent pattern alone is worth the click.
Claude Code's source was accidentally published to npm. So I studied every prompt in the codebase using Claude. Here's what I found, and I'm open-sourcing all of it.

Claude Code uses 26 distinct prompts to function:
> 1 system prompt (identity, safety, code style, tool routing)
> 11 tool prompts (shell, file ops, search, web, planning)
> 5 agent prompts (explorer, architect, verifier, docs, general)
> 4 memory prompts (summarization, session notes, extraction)
> 1 coordinator prompt (multi-agent orchestration)
> 4 utility prompts (titles, recaps, suggestions)

The patterns that stood out:
1. Anti-over-engineering rules: "don't add features beyond what was asked"
2. Tiered risk assessment: freely edit files, but confirm before force-pushing
3. Adversarial verification: a dedicated agent whose job is to TRY TO BREAK the implementation
4. Memory compression: 9-section summarization that preserves every user message
5. Never delegate understanding: "write prompts that prove you understood"

I have rewritten every prompt from scratch for legal compliance: same behavioral intent, without copying text verbatim. The repo includes:
> Every prompt, ready to copy into your own agent
> 9 pattern analyses with commentary
> 3 Claude skills you can drop in today
> An MIT license, so you can fork and reuse it as-is

If you're building AI coding agents, this will save you months of prompt engineering.

Link: https://lnkd.in/gNizmf6T

#PromptEngineering #ClaudeCode #AI #AIAgents #LLM #OpenSource
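Pattern 2 (tiered risk assessment) is easy to prototype. A minimal sketch under my own assumptions: the tier tables below are illustrative prefixes I made up, not Claude Code's actual lists, which are longer and more nuanced.

```python
# Hypothetical command tiers, for illustration only.
FREE = ("git status", "git diff", "ls", "cat", "grep")
CONFIRM = ("git push --force", "git reset --hard", "rm -rf")

def risk_tier(command: str) -> str:
    """Route a shell command: run freely, ask the user first, or surface for review."""
    if command.startswith(CONFIRM):
        return "confirm"  # destructive: require explicit user approval
    if command.startswith(FREE):
        return "free"     # read-only: run without asking
    return "review"       # unknown: show the command before executing

print(risk_tier("git status"))                    # free
print(risk_tier("git push --force origin main"))  # confirm
```

The point of the pattern is the default: anything not explicitly safe falls through to a human-visible tier.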
This is one of those tools that makes you rethink how AI coding should actually work.

Just discovered rtk, an open-source CLI proxy designed for AI coding agents. The idea is simple but powerful:
👉 Most of the tokens we burn in LLM workflows are NOT in prompts.
👉 They're in noisy command outputs (logs, git, docker, tests, etc.).

rtk sits between your terminal and your AI agent and compresses that noise into structured, minimal output, without losing meaning.

The result?
• Up to 60–90% token reduction (per the GitHub README)
• Faster responses
• Longer agent sessions
• Lower costs
• Cleaner reasoning loops

Think about this for a second: in agent-based systems (Cursor, Claude Code, Codex…), every command execution feeds back into the model. If that feedback is noisy, you waste context. If it's structured, you unlock scale.

rtk is basically introducing a new layer in the AI stack: an "output optimization layer." And this is where things get interesting. As agents become more autonomous, token efficiency stops being an optimization and becomes architecture.

This is the kind of tooling that will define the next wave of AI-native development. If you're building with AI agents, it's definitely worth a look: https://lnkd.in/dD_qrwic
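To see why the proxy idea is so cheap to adopt: a toy version of this layer is just a subprocess wrapper that drops known-noise lines before the agent ever sees them. The `NOISE_PREFIXES` list and `run_filtered` below are my illustration, not rtk's implementation (rtk ships much richer per-command filters).

```python
import subprocess
import sys

# Hypothetical noise patterns; real tools match per-command, not globally.
NOISE_PREFIXES = ("npm WARN", "Compiling ", "Downloading ")

def run_filtered(cmd: list[str]) -> tuple[str, int]:
    """Run a command and strip known-noisy lines before the output
    reaches the agent's context window."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    kept = [line for line in result.stdout.splitlines()
            if not line.startswith(NOISE_PREFIXES)]
    return "\n".join(kept), result.returncode

out, code = run_filtered(
    [sys.executable, "-c", "print('npm WARN old lockfile'); print('built ok')"])
print(out)  # built ok
```

Everything else (grouping, truncation, hooks) layers on top of this same interception point.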
If you’re using Claude Code, this is the one repository you actually need to bookmark.

The everything-claude-code project (by affaan-m) officially crossed 100k stars this month, evolving from a hackathon winner into the definitive "performance engine" for AI agents. 🚀

Why it’s the gold standard:
• 28+ specialized agents: pre-configured roles for TDD, security audits, and research
• 120+ battle-tested skills: modular powers like automated refactoring and complex dependency mapping
• AgentShield security: built-in hooks to prevent secret leaks and "hallucinated" git commands
• Memory persistence: finally, a way to keep context across sessions without hitting the token ceiling

Whether you're on Claude Code, Cursor, or Codex, this repo provides the structural "brains" to make them behave like senior engineers rather than just chatbots.

🔗 Check it out: https://lnkd.in/gbRVG2sm

#ClaudeCode #AI #OpenSource #SoftwareEngineering #AgenticWorkflows
AI agents are amazing until they forget your repository's architecture 10 messages into the chat.

I got tired of paying for massive API token contexts just to remind Cursor and Claude how my database migrations work every single session. Semantic search (RAG) wasn't cutting it: when my code changed, the AI would hallucinate outdated context.

So I built Memographix. It’s an open-source, local memory layer for AI agents that uses the Model Context Protocol (MCP).

How it works:
- Task capsules: the AI writes a compressed summary of how it solved a specific problem in your repo.
- Staleness tracking: if you (or another dev) modify the underlying files, Memographix flags the memory as stale so the AI doesn't use outdated facts.
- Portability: works seamlessly across Claude Desktop, Cursor, Copilot, Aider, and Windsurf.

I even pitted it against a Kubernetes benchmark to prove the deterministic quality score.

Stop dumping 20 files into your prompt just to get a feature shipped.

🐙 GitHub (code & benchmarks): https://lnkd.in/dQqG2SCy
📦 PyPI: pipx install memographix

Give it a spin and let me know if it saves you as many API tokens as it saved me.
A lot of token spend with large language models (LLMs) is not your prompt. It is the terminal noise you paste into the context window: `git diff`, test logs, `grep` output. If you use an LLM while debugging from the terminal, this adds up fast.

One pragmatic idea: treat command output (stdout) like an API surface.

RTK (https://lnkd.in/dwt3Sycp) positions itself as a local CLI proxy. It runs your command and filters or compresses stdout before it reaches the model context. The RTK README claims 60–90% token reduction.

How it gets there (per command type):
- Smart filtering (remove noise)
- Grouping (aggregate similar items)
- Truncation (keep relevant context, cut redundancy)
- Deduplication (collapse repeats with counts)

Adoption detail I care about: it can auto-rewrite commands so `git status` becomes `rtk git status` via a PreToolUse hook (a command rewriter). The model never sees the rewrite.

Safety net: on failure, it can `tee` the full unfiltered output to a saved log and print the path.

Privacy note: telemetry is enabled by default (anonymous aggregate metrics, once per day). Opt out with `RTK_TELEMETRY_DISABLED=1` or by setting it in `~/.config/rtk/config.toml`.

What would stop you from using stdout filtering: correctness, debugging, or privacy?

#llm #ai #developer #token #rtk #proxy #opensource
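The last mechanism, deduplication with counts, fits in a few lines. A sketch of the general idea (`dedupe` is my name and format; RTK's actual output differs):

```python
from collections import Counter

def dedupe(lines: list[str]) -> list[str]:
    """Collapse repeated lines into 'Nx line', keeping first-seen order."""
    counts = Counter(lines)
    out, seen = [], set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            out.append(f"{counts[line]}x {line}" if counts[line] > 1 else line)
    return out

log = ["connection refused"] * 4 + ["retrying in 5s"]
print(dedupe(log))  # ['4x connection refused', 'retrying in 5s']
```

Crucially, the count is preserved: the model still learns that the error happened four times, just not four times over in tokens.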
Building with LLMs involves 10% prompt engineering and 90% tackling complex engineering challenges that tutorials often overlook.

I recently completed a Git History Analyzer using the Google SDK and LLMs. While the workflow may seem straightforward (feed commit data to the AI, get a summary back), the reality involves navigating token management, data noise, and the intricacies of the SDK.

Here’s a breakdown of the technical hurdles I faced while creating a simple agent using the Google ADK:
- The context window constraint: strategies for processing thousands of commits without exceeding LLM memory limits.
- Google SDK realities: navigating integration specifics that the documentation may not fully address.
- Refining data for AI: transforming raw, noisy `git diff` output into structured data the model can effectively reason about.

If you're advancing beyond basic wrappers and developing production-ready AI tools, this deep dive is for you.

Read the full technical breakdown here: https://lnkd.in/gJ6avvPu

To fellow engineers working with LLMs: what significant "hidden complexity" has taken you by surprise? Let’s discuss in the comments.

#artificialintelligence #AI #AgenticAI #ArtificialIntelligence #GoogleAdk #AgentEngine #VertexAI
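On the context-window constraint: a common workaround is to pack commits greedily into budget-sized chunks and summarize each chunk separately, map-reduce style. A sketch under a crude tokens ≈ chars/4 assumption; the function and names are mine, not from the linked write-up.

```python
def chunk_commits(commits, budget_tokens, estimate=lambda s: len(s) // 4):
    """Greedily pack commit messages into chunks that fit a token budget.
    A single oversized commit still gets its own chunk rather than being dropped."""
    chunks, current, used = [], [], 0
    for msg in commits:
        cost = estimate(msg)
        if current and used + cost > budget_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(msg)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

Each chunk goes to the model on its own; the per-chunk summaries are then summarized once more to produce the final report.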
🚀 Built an AI-powered code review system (OpenEnv)

I developed a production-style environment where AI agents perform automated code reviews, similar to how developers review pull requests in real teams.

🔍 What it does:
• Detects bugs, security vulnerabilities, and performance issues
• Assigns severity levels (low → critical)
• Suggests fixes with explanations
• Uses a reward-based system to evaluate AI performance

⚙️ Tech stack:
• Python, FastAPI
• Pydantic
• OpenEnv framework
• Hugging Face Inference APIs
• Docker

🧠 Key improvements:
• Replaced keyword matching with F1-score evaluation (precision + recall)
• Added a hallucination penalty to reduce false positives
• Designed a multi-step feedback loop for iterative improvement
• Built REST API endpoints (/reset, /step, /state)
• Structured evaluation logs ([START] / [STEP] / [END])

🎯 Why this matters: code review is critical in software engineering. This system simulates how AI can assist in CI/CD pipelines, developer tools, and automated quality checks.

🔗 Live demo (Hugging Face): https://lnkd.in/g_BQVz8m
💻 GitHub repository: https://lnkd.in/ggWHWdmv

💡 Moving toward building intelligent developer systems powered by AI. Would love feedback from developers and AI engineers!

#AI #MachineLearning #SoftwareEngineering #BackendDevelopment #Python #FastAPI #HuggingFace #OpenSource
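The F1 swap is worth spelling out: keyword matching rewards verbosity, while F1 over the set of reported issues penalizes both missed bugs (low recall) and invented ones (low precision), which is also where a hallucination penalty naturally bites. A sketch with made-up issue labels; this is the standard metric, not the project's exact scoring code.

```python
def review_f1(predicted: set[str], gold: set[str]) -> float:
    """F1 of the reviewer's reported issues against annotated ground truth."""
    tp = len(predicted & gold)  # issues correctly reported
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)  # penalizes hallucinated issues
    recall = tp / len(gold)          # penalizes missed issues
    return 2 * precision * recall / (precision + recall)

# One true positive, one hallucinated issue, one missed issue -> P = R = 0.5
print(review_f1({"sql-injection", "n-plus-one"},
                {"sql-injection", "race-condition"}))  # 0.5
```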
Two weeks ago I built an AI memory tool. Then I measured what it actually did. About 3,000 lines of code had to die.

The bet was the standard one in AI memory: take long conversations, slice them into small "atoms," embed each atom, search top-K. MemPalace, mem0, and most LangChain memory backends share that shape. I copied it. Then I ran the numbers on real conversational data:

— RRF scores stuck in the 0.014–0.017 band. The "small chunks → sharp embeddings" hypothesis didn't hold.
— LLM synthesis fed BETTER from whole sessions than from fragments.
— My beautiful 600-node graph? Nobody walked it. Not me. Not the AI.
— 663 fragments carrying ~860 entity instances. Half were just folder names.

I deleted the mining pipeline and pivoted to "narrative-first": whole sessions as memory units, sparse but meaningful wikilinks between them, and an "identity layer" that distills who you are over time. The AI doesn't "remember conversations." It KNOWS the user. That was v1.0.

Then v1.0 hit a second wall: SessionStart was the wrong trigger. Mid-stream X-close left orphans. Race conditions multiplied. v1.1 inverts it: refine when the session ENDS, in a detached worker that survives Claude Code's own termination. SessionStart only catches what slipped through. Ships today.

Hard rule: zero Anthropic API calls. Mnemos uses your existing Claude Code subscription quota only. No surprise bills, ever.

🐕 5-minute walkthrough: https://lnkd.in/dteAhbsi
📜 The full pivot story: https://lnkd.in/dj2Tr_Zp
🚀 Repo: https://lnkd.in/dpHsAPBi

#AI #ClaudeCode #OpenSource #DeveloperTools #BuildingInPublic #LLM
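A note on that RRF band for readers who haven't met reciprocal rank fusion: a document's fused score is the sum of 1/(k + rank) over every ranking it appears in, and with the common default k = 60, a hit in just one list at ranks 1–10 scores roughly 1/61 ≈ 0.016 down to 1/70 ≈ 0.014. If Mnemos used that default, scores pinned to 0.014–0.017 would suggest fragments were almost never corroborated by a second retriever. A sketch (my code, not Mnemos's):

```python
def rrf(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: -item[1])

# 'b' appears in both lists, so it fuses above the single-list hits,
# which stay near the 1/61 ≈ 0.016 floor the post describes.
fused = rrf([["a", "b"], ["b", "c"]])
print(fused[0][0])  # b
```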