98M Tokens a Week at 60-80% Cost Reduction with Advisor-Driven Development

I used 98 million tokens last week. Not a typo. And no, I'm not burning money. I engineered the cost floor down low enough to actually operate at that scale.

Here's the pattern: most Claude Code sessions are 80% mechanical work. Reading files. Writing boilerplate. Simple edits. Reformatting. You're paying Opus rates for tasks Haiku could handle without breaking a sweat.

So I built a dispatch system that stops doing that. Opus stays in the orchestrator seat. It plans, makes architectural decisions, and reviews output. Everything mechanical gets routed to the cheapest capable model for the job: Haiku, Sonnet, Gemini, Kimi, or local Ollama models like Gemma 4 running free on-device.

The result: 60-80% cost reduction on multi-step tasks. Which means I can run at 98M tokens a week and still have it make sense as a business decision.

The pattern is called Advisor-Driven Development. Two dispatch paths:

1. Agent tool: Claude subagents with full file access on cheaper models
2. ask.py: a CLI that routes to any provider for text generation, code, even video analysis

Opus tokens go to thinking. Everything else goes where it's cheapest to send it.

I've open sourced the whole thing under Pushing Squares. Repo below includes: ask.py, a ready-to-paste CLAUDE.md snippet, and a slash command for structured multi-task execution.

Pay for intelligence. Not for mechanical work.

// ARI

https://lnkd.in/e_nzMQ2i

GitHub - PUSHINGSQUARES/advisor-driven-dev: Multi-model orchestration for Claude Code — Opus plans, cheap models execute. 10 models, 5 providers, local Ollama. Cut API costs 60-80%. github.com
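
The routing idea in the post can be sketched in a few lines. This is a minimal illustration, not the code from the repo: the task kinds, the tier names, the model labels, and the relative cost weights are all hypothetical, and the saving estimate assumes every task uses the same number of tokens.

```python
from dataclasses import dataclass

# Hypothetical tiers and relative cost weights, for illustration only.
# They are not taken from the repo or from any provider's price list.
MODEL_TIERS = {
    "local": {"model": "ollama/gemma", "relative_cost": 0.0},
    "cheap": {"model": "haiku", "relative_cost": 1.0},
    "mid": {"model": "sonnet", "relative_cost": 3.0},
    "top": {"model": "opus", "relative_cost": 15.0},
}

# Hypothetical mapping from task kind to the cheapest tier assumed capable of it.
TASK_TIER = {
    "reformat": "local",
    "read_file": "cheap",
    "boilerplate": "cheap",
    "simple_edit": "cheap",
    "refactor": "mid",
    "plan": "top",
    "architecture": "top",
    "review": "top",
}


@dataclass
class Task:
    kind: str
    prompt: str


def tier_for(task: Task) -> str:
    """Pick a tier for the task; unknown kinds fall back to the top tier."""
    return TASK_TIER.get(task.kind, "top")


def route(task: Task) -> str:
    """Return the model label the task would be dispatched to."""
    return MODEL_TIERS[tier_for(task)]["model"]


def estimated_saving(tasks: list[Task]) -> float:
    """Fraction saved versus sending everything to the top tier.

    Assumes equal token counts per task, which real sessions will not have.
    """
    if not tasks:
        return 0.0
    top_cost = MODEL_TIERS["top"]["relative_cost"] * len(tasks)
    routed_cost = sum(MODEL_TIERS[tier_for(t)]["relative_cost"] for t in tasks)
    return 1.0 - routed_cost / top_cost
```

With these made-up weights, a session of four mechanical tasks and one planning task comes out roughly 75% cheaper than sending all five to the top tier. The fallback to the top tier for unknown task kinds is a deliberate choice in this sketch: misrouting hard work to a weak model is assumed to cost more than overpaying occasionally.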


More Relevant Posts

Ivan Gorban
6d
I kept having the same annoying experience with Claude Code: every new session felt like working with a smart engineer with amnesia. The model could reason about the code, but it kept spending too much time rediscovering the same repository structure, dependencies, conventions, and previous decisions.

After a while, this started to feel less like a prompting problem and more like a workflow-state problem. At some point I stopped asking: “how do I write a better prompt?” The better question was: what should survive between stateless agent sessions?

That question became Quoin, a small tool I’ve been building around Claude Code. The core idea is to move from a prompt-centric workflow to an artifact-centric workflow. Instead of relying on one huge CLAUDE.md or a long conversation history, Quoin keeps lightweight workflow state in files: architecture notes, plans, critic responses, reviews, memory, cost snapshots, and a documentation cache.

The most useful part so far has been the codebase knowledge cache. In my own workflow, it reduced input tokens by around 47%, mostly by cutting repeated orientation. This is obviously not a universal benchmark (just one workflow and one setup), but it was enough to make the direction feel worth exploring.

What I find more interesting is not the token reduction itself, but what it suggests about agent workflows. A lot of coding-agent workflows today are still very human-language-first. We ask models to write long plans, long summaries, long reviews, and long explanations. They often look impressive, but many are too verbose to be operationally useful. They become another object the model has to reread, compress, reinterpret, and eventually forget.

I’m increasingly convinced that some workflow artifacts should be more machine-first: structured state, constraints, file maps, dependency notes, checklists, decisions, and failure modes. Human-readable views are still important, but probably only where human judgment is actually needed.

I wrote the longer build log here: https://lnkd.in/dXE4eYcW
And the repo is here: https://lnkd.in/dEP_Vim8

Would be very interested in feedback from people using Claude Code, Codex, Cursor, or similar tools on larger or multi-repo projects.

The question I’m trying to answer is not “can the model code?” It can. The harder question is: how do we make the next session smarter than the previous one?
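
One way to picture a codebase knowledge cache of this kind is a JSON file keyed by content hash, so that only new or changed files are summarized again. This is a rough sketch and not Quoin's implementation: the function names, the cache layout, the `*.py` filter, and the `summarize` callback (which would be a model call in practice) are all assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Content hash used to detect whether a file changed since the last run."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def refresh_cache(root: Path, cache_file: Path, summarize) -> dict:
    """Rebuild the cache, calling summarize(path) only for new or changed files.

    Entries for files that no longer exist are dropped.
    """
    old = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    fresh = {}
    for path in sorted(root.rglob("*.py")):
        key = path.relative_to(root).as_posix()
        digest = file_digest(path)
        entry = old.get(key)
        if entry is None or entry["sha256"] != digest:
            # Cache miss: this is the only place the expensive call happens.
            entry = {"sha256": digest, "summary": summarize(path)}
        fresh[key] = entry
    cache_file.write_text(json.dumps(fresh, indent=2, sort_keys=True))
    return fresh
```

A new session would then load the cache file instead of rereading every source file, which is the "repeated orientation" the post describes cutting.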
Swati Ahuja
3w
Most people still don't know how to use Claude Code properly. Here's the system I use to produce MVPs with clean architecture, proper docs, and zero scope drift.

First thing I do: create a CLAUDE.md file in the repo root. This is your constitution. Tech stack, folder structure, naming conventions, and what the MVP does, as concisely as possible. Claude Code reads this before every task. It stops the AI from "improving" your plan with ideas you never asked for.

Then I create a skills folder. This is the part nobody talks about. Skills are reusable instruction files that teach Claude Code how YOU build. One skill for your component patterns. One for your API conventions. One for your testing approach. Drop them in, and every task follows your standards automatically. No repeating yourself across prompts. No drift between sessions.

Break your MVP into phases. Not vague phases. Exact deliverables.
> Phase 1: auth + user model.
> Phase 2: core CRUD.
> Phase 3: integration.
Feed one phase at a time. The moment you dump everything at once, Claude starts making architectural decisions on your behalf and suddenly you have a microservice where you wanted a monolith. Properly test each phase before moving on to the next one.

For clean code: write constraints in your skills. "All services return Result types." "No business logic in controllers." "Every function under 30 lines." Claude Code follows constraints better than most junior engineers when you actually write them down.

For design: reference specific component libraries in a UI skill. "Use shadcn/ui with the default theme." "Follow this layout from the wireframe section." Vague aesthetics get you vague output. Specific references get you production UI.

For documentation: I no longer have to do this manually; https://lnkd.in/gxjrWD8T handles most of it for me.

The real trick is saying no. Claude Code will suggest "improvements" mid-task. Better error handling. Extra validation. A caching layer. All reasonable. All scope creep. Finish the phase first. Optimize later.

All of the MVPs I shipped took under a week. The code was reviewable. The docs were current. The architecture matched the original plan. Your AI coding tool is only as disciplined as your instructions.
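
A written constraint such as "Every function under 30 lines" is also cheap to verify mechanically after each phase. The sketch below is one possible check for Python sources using the standard `ast` module; the function name and the idea of running it as a post-phase gate are assumptions for illustration, not part of the workflow described in the post.

```python
import ast


def long_functions(source: str, limit: int = 30) -> list[tuple[str, int]]:
    """Return (name, line_count) for each function that is not under `limit` lines.

    Line count spans from the `def` line to the last line of the body.
    """
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length >= limit:
                offenders.append((node.name, length))
    return offenders
```

An empty result means the constraint held for that file; anything else is a concrete list to hand back to the agent before starting the next phase.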
Nidhi Bansal
4w
📋 **Before installing any third-party memory tool for Claude, read this**

Anthropic's official memory guide for Claude Code covers everything built in, for free, with zero setup beyond a text file.

***Here's what's already available out of the box***

📄 CLAUDE.md: a plain text file you write once. Claude reads it at the start of every session: your stack, conventions, rules, architecture decisions. Commit it to git and the whole team shares it automatically.

🤖 Auto memory: Claude writes its own notes. Build commands, debugging patterns, your preferences. No prompting needed, it just learns as you work.

📁 File hierarchy: project, user, and org-wide scopes. One file for personal preferences, one for the team, one for the whole company. Each level is layered cleanly on top of the next.

🗂️ Rules per file type: scope instructions to TypeScript files, API handlers, or any path pattern. The right context loads only when Claude needs it.

Most teams reach for plugins before reading this page. That's the wrong order.

***Start here first***

📰 https://lnkd.in/gWSBEiPc

#AI #ClaudeCode #Anthropic #DeveloperTools #EngineeringTeams

How Claude remembers your project - Claude Code Docs code.claude.com
Artur Wojnar
2w Edited
Yesterday, I probably went a bit too far. My statement was very direct and could have been interpreted as dismissing the value of learning #softwarearchitecture patterns.

I wrote: "My whole programmer's life, I was looking for how to structure my codebase, you know the deal: what are the responsibilities of app/domain services, entities, etc. When I grew up, I finally discovered the truth: It doesn't matter."

What I meant:
- I was referring to people who focus too much on implementation details. It happens all the time; I used to do the same.
- The root problem is often a lack of clarity about WHAT the real goal is and WHY we are doing something in the first place.
- As a result, the rules of a given pattern can start to constrain us. As the saying goes, the tail shouldn’t wag the dog 😇
- Many authors tend to SELL certain approaches as essential for success. But often, that’s just marketing.
- Hexagonal Architecture is not a complete solution by itself. We still need to do the hard work of designing responsibilities.
- Similarly, architecture is not the same as a folder structure.

What I really wanted to emphasize is this:
- First understand WHAT you are trying to achieve and WHY, and only then decide HOW to implement it.
- In practice, most of the time, we are managing risk and coupling. And those are exactly the core concerns addressed by approaches like Hexagonal, Onion, or Clean Architecture. They say that the most beneficial approach is to protect the business logic and separate it from the rest.

That said, I don’t think it really matters where we place Saga orchestration or an API composition call, whether in the application layer or within a domain service.

Cheers!
Poornachandra Kongara
4w
Claude Code becomes powerful when paired with structured systems. Not just prompts, but workflows, memory, and rules working together. That’s what turns it from a helper into a reliable coding system.

Here’s how to make it work 👇

- Getting Started: Initialize Claude in your project so it understands your codebase and sets up a working context.
- Understanding CLAUDE.md: Use it as persistent memory to store architecture, decisions, and rules across sessions.
- Memory File Hierarchy: Structure memory across global and project levels so context stays organized and relevant.
- Best Practices: Be clear in instructions, define constraints, and keep memory concise to improve consistency.
- Project File Structure: Maintain clean folders and configs so Claude can navigate and operate effectively.
- Adding Skills (The Superpower): Create reusable skills that automate common tasks and reduce repetitive instructions.
- Skill Ideas: Use skills for testing, reviews, deployments, and design patterns to speed up development.
- Setting Up Hooks: Automate workflows by triggering actions before or after tasks for better control.
- Permissions & Safety: Define boundaries so Claude operates safely within your system.
- The 4-Layer Architecture: Combine memory, skills, hooks, and agents to build a scalable and structured setup.
- Daily Workflow Pattern: Start with planning, execute step-by-step, and refine outputs continuously.
- Quick Reference: Use shortcuts and commands to manage context and switch modes efficiently.

The way you structure your workflow defines the quality of your output. Are you using Claude as a tool or designing a system around it?
Andrei Kondratev
1w
GO 1.23 PUSHES CALLBACK-BASED ITERATORS, AND IT'S A CONTROVERSIAL CHOICE

The Go standard library has had callback-style iteration for a while: sync.Map, fs.WalkDir, and friends. Go 1.23 essentially formalizes this pattern via iter.Seq.

But there's a real tension here. Go is an intentionally imperative, explicit language. Developers expect to own the control flow, not hand it off to a callback and hope everything behaves.

That's why the explicit iterator object pattern feels far more natural in Go:

.Iter() → for it.Next() → .Key() / .Value()

This isn't a new idea; it's already proven itself in the stdlib:
- bufio.Scanner
- sql.Rows

Both are simple, predictable, and free of hidden behavior.

Compare that to the callback approach, where:
- Control flow becomes non-linear
- defer behaves unexpectedly
- Early returns turn into a puzzle
- Panics can get swallowed silently

Many teams deliberately choose the iterator object pattern as their internal standard for exactly these reasons. One example: Solod, a Go subset that compiles to C, uses iterator objects by design. And honestly, it feels much closer to the spirit of Go.

The real question isn't which approach is "newer"; it's which one keeps your code honest: tight execution control, or flexible composition through callbacks?
Rupeshit Patekar
1w
I’ve been using Claude Code daily for months. And here’s the uncomfortable truth… Most developers are using maybe 20% of it. They install it, generate some code, and stop there. What they miss is the other 80%: the workflows and commands that actually turn it into a serious productivity engine. Real stuff you’ll use every day.

A few that genuinely change how you work:

1. Plan before you build
Shift + Tab (Plan Mode)
Let Claude analyze your codebase and propose architecture first. You catch design flaws early instead of debugging later.

2. Control your context or it will control you
/compact → compress context
/clear → reset completely
Mixing contexts is the fastest way to get bad output.

3. Treat your codebase as the prompt
“Look at src/auth/login.ts and follow the same pattern.”
Way more effective than describing what you want.

4. Build in small, testable steps
Don’t say “build feature X.” Break it down: schema → API → validation → UI. Test at every step.

5. Force visibility into changes
“Show me a diff of all files and explain each change.”
Otherwise you’re trusting blindly.

6. Debug properly
Paste full errors. Not summaries.
“Diagnose root cause step by step before suggesting a fix.”
This alone saves hours.

7. Lock in standards once
/memory
Define rules like “always use strict mode” or “always run tests.” Applies automatically in every session.

8. Know when to reset
If things go sideways: “Stop. Start fresh from the original version.”
Sometimes restarting is faster than fixing.

Simple setup that changes everything:
/init → generate project context
/memory → define rules
Shift + Tab → plan architecture
Then build incrementally. Takes 5 minutes. Saves days.

The real shift isn’t AI writing code. It’s this: You’re no longer just coding. You’re directing a system. And that system is only as good as how clearly you think.

Same developer. 10x leverage. If you’re using Claude Code like autocomplete, you’re missing the point.

Kuldeep Vyas
2w
🚀 Day 27 – Dependency Injection: The Backbone of Scalable Design

Dependency Injection (DI) is not just a framework feature; it’s a core design principle that enables loosely coupled, testable, and maintainable systems.

At its core: 👉 Don’t create dependencies. Inject them.

🔹 1. Promotes Loose Coupling
Instead of hard-wiring a concrete dependency where the service is built:
OrderService service = new OrderService(new PaymentService());
Use DI and let the caller or container supply it:
OrderService service = new OrderService(paymentService);
➡ Reduces tight coupling between components
➡ Makes systems easier to evolve

🔹 2. Improves Testability
DI allows easy mocking:
OrderService service = new OrderService(mockPaymentService);
➡ Enables unit testing without real dependencies
➡ Faster, isolated, reliable tests

🔹 3. Supports Multiple Implementations
PaymentService → CardPayment / UpiPayment / WalletPayment
➡ Switch implementations without changing business logic
➡ Perfect for extensible systems

🔹 4. Enables Clean Architecture
DI aligns perfectly with:
✔ SOLID principles (especially Dependency Inversion)
✔ Layered & Hexagonal Architecture
➡ Business logic stays independent of frameworks

🔹 5. Constructor Injection > Field Injection
✔ Makes dependencies explicit
✔ Ensures immutability
✔ Easier to test
➡ Preferred approach in production systems

🔹 6. Lifecycle & Scope Management
Frameworks like Spring manage:
✔ Singleton
✔ Prototype
✔ Request / Session scopes
➡ Optimizes memory and performance automatically

🔹 7. Avoid Over-Injection (Anti-Pattern)
Too many dependencies = design smell
➡ Indicates violation of Single Responsibility Principle
➡ Refactor into smaller, focused services

🔹 8. Framework vs Pure DI
Framework DI (Spring) → fast development
Manual DI → better control, lightweight
➡ Choose based on system complexity

🔥 Architect’s Takeaway
Dependency Injection is not about frameworks; it’s about design discipline. It helps you build:
✔ Flexible systems
✔ Testable code
✔ Replaceable components
✔ Scalable architectures

💬 Do you prefer constructor injection or field injection, and why?

#100DaysOfJavaArchitecture #DependencyInjection #Java #SpringBoot #CleanArchitecture #SystemDesign #TechLeadership
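
Points 1 to 3 and 5 can be seen together in one small example. It is written in Python rather than the post's Java, purely as an illustration of constructor injection without a framework; the `pay` and `checkout` methods and the `FakePayment` test double are hypothetical names that do not appear in the post.

```python
from typing import Protocol


class PaymentService(Protocol):
    """The abstraction OrderService depends on (hypothetical method name)."""

    def pay(self, amount: float) -> str: ...


class CardPayment:
    def pay(self, amount: float) -> str:
        return f"card:{amount:.2f}"


class UpiPayment:
    def pay(self, amount: float) -> str:
        return f"upi:{amount:.2f}"


class OrderService:
    # Constructor injection: the dependency is explicit and set exactly once.
    def __init__(self, payment: PaymentService) -> None:
        self._payment = payment

    def checkout(self, amount: float) -> str:
        return self._payment.pay(amount)


class FakePayment:
    """Test double that records charges instead of calling a real provider."""

    def __init__(self) -> None:
        self.charged: list[float] = []

    def pay(self, amount: float) -> str:
        self.charged.append(amount)
        return "fake"
```

Swapping `CardPayment()` for `UpiPayment()` or `FakePayment()` changes nothing inside `OrderService`, which is the loose coupling and testability the list describes.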
Subham Kundu
1w
How to inject context in Claude Code hooks at runtime

Claude Code hooks are a hot topic in AI-assisted development, and for a good reason. Most developers rely on prompts to guide Claude's behavior. The problem is that prompts are suggestions: Claude can forget or skip them mid-session. Hooks guarantee behavior at the infrastructure level.

In simple terms, a hook is a shell script that fires automatically at a specific lifecycle event. In this post, I want to focus on one powerful pattern: injecting context for a specific command.

Suppose your team follows strict API design guidelines: consistent response shapes, zod validation, proper status codes. Without hooks, you paste these rules manually every session.

With a PreToolUse hook, you write a script that detects when Claude is about to edit any file under `src/api/`, reads your guidelines from a markdown file, and injects them as `additionalContext` before the edit executes. Claude receives your rules at precisely the right moment, every single time.

The setup involves three pieces: a guidelines markdown file, a bash script that checks the file path via `jq`, and a registration entry in `.claude/settings.json`. Please refer to the video for the complete walkthrough.

The core principle: prompts suggest, hooks guarantee.

P.S. What rules would you want Claude to follow without exception?
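
The decision logic of such a hook is small enough to sketch. The post uses a bash script with `jq`; this version is in Python and rests on several assumptions that should be checked against the Claude Code hooks documentation: that the event arrives as JSON on stdin with `tool_name` and `tool_input.file_path` fields, that the edit tools are named `Edit`, `Write`, and `MultiEdit`, and that the reply shape with `hookSpecificOutput` and `additionalContext` is accepted for PreToolUse. The guidelines path is hypothetical.

```python
import json
import sys
from pathlib import Path

# Hypothetical location of the team's guidelines file.
GUIDELINES = Path("docs/api-guidelines.md")
# Assumed tool names for file edits; verify against the hooks documentation.
EDIT_TOOLS = {"Edit", "Write", "MultiEdit"}


def build_response(event: dict, guidelines_text: str):
    """Return the reply to print, or None when the event is not an API-file edit."""
    if event.get("tool_name") not in EDIT_TOOLS:
        return None
    file_path = event.get("tool_input", {}).get("file_path", "")
    # Normalize separators so absolute and relative paths match the same way.
    normalized = "/" + file_path.replace("\\", "/").lstrip("/")
    if "/src/api/" not in normalized:
        return None
    # Assumed reply shape, following the field name given in the post.
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "additionalContext": guidelines_text,
        }
    }


def main() -> None:
    """Entry point for the hook script: read the event, print a reply if needed."""
    event = json.load(sys.stdin)
    text = GUIDELINES.read_text() if GUIDELINES.exists() else ""
    response = build_response(event, text)
    if response is not None:
        print(json.dumps(response))
```

To use it as a hook script, `main()` would be called at the bottom of the file and the script registered under the PreToolUse event in `.claude/settings.json`. Keeping the path check in a pure function makes the rule testable without running Claude Code at all.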

Son-U Paik
2w
This talk is worth your time.

Mario Zechner’s "Building pi in a World of Slop" looks like a talk about a minimal coding agent framework. It is an argument for restraint, judgment and human agency in software engineering. An engineer arguing from the code and a lawyer arguing from liability land in the same place. That convergence is the point. He also swears more than I do, so he says it better.

Zechner’s mechanism is specific. A human cannot produce twenty thousand lines of code in a few hours. An agent can. The human bottleneck was never a defect. It was the control system. Remove it and small errors stop feeling small. They compound. By the time you feel the pain, the codebase has filled with what Zechner calls enterprise-grade complexity. Architecture, API, anything that defines the shape of the system, he writes by hand.

The deeper harm is not volume. It is agency. You delegate the work. You lose the ability to reason about your own system. You cannot recover what you never understood. Informed Intent names the discipline. This is what its absence looks like.

Then the externality. Agents generate at scale. The cost lands elsewhere. Open source maintainers now sort through a flood of agent-written issues and pull requests from people who cannot answer questions about what they submitted. The party producing the output is not the party bearing the consequence. For anyone doing supply-chain or third-party AI risk, this is the pattern to watch.

pi itself is the argument. A minimal harness you can read, modify and own is not a tool choice. It is restraint made visible.

Slow AI names the same discipline from the governance side. Use agents where the task is scoped and the output is verifiable. Keep critical work under human authorship. Keep human review inside the control loop, not downstream of it. Scoped delegation is not a limit on the technology. It is the only way the technology stays accountable.

If you build with coding agents inside an organization that confuses velocity with control, this is the clearest talk you will watch this quarter. Watch it.

Video: https://lnkd.in/gjxiPWmU
Summary: https://lnkd.in/gMTnvvSw

Final Liability rests with the Human.