The hardest bug I ever fixed: clean Ctrl+C in an async AI pipeline.

When a user presses Ctrl+C during a streaming response with active tool calls, you need to — in order, without race conditions:

1. Cancel the HTTP stream gracefully
2. Abort any in-flight tool executions
3. Clean up temporary state (partial files, temp directories)
4. Preserve conversation history up to the interruption point
5. Return to a clean prompt — ready for the next input

Each step can fail. And each failure mode is different. What if Ctrl+C fires between two tool calls? What if the stream buffer hasn't flushed? What if cleanup itself gets interrupted by a second Ctrl+C? What if an async tool call returns after cancellation and tries to write to a closed context?

Python's signal handling + asyncio cancellation made it possible. But every edge case took hours to find — because you can only reproduce them by hitting Ctrl+C at exactly the right millisecond.

The lesson I keep coming back to: the undo path is always harder than the happy path. And in developer tools, the undo path is what determines whether people trust your software.

Stack: Python + Claude API
GitHub: https://lnkd.in/ghn_8iKA
Full case study: https://lnkd.in/gtg49D-S

#Python #Claude #CLI #AsyncPython #Architecture #BuildInPublic #SoftwareEngineering
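The five steps can be sketched with asyncio. This is a minimal toy, assuming Unix (`loop.add_signal_handler` raises `NotImplementedError` on Windows); the function names are illustrative, not from the linked repo:

```python
import asyncio
import signal

async def stream_response(history):
    for i in range(50):                  # stand-in for the streaming HTTP call
        await asyncio.sleep(0.01)
        history.append(f"chunk-{i}")     # step 4: history is preserved as we go

async def cleanup(history):
    # step 3: shield cleanup so a second Ctrl+C can't interrupt it halfway
    await asyncio.shield(asyncio.sleep(0))   # e.g. delete temp files here
    history.append("<interrupted>")          # mark the interruption point

async def run(install_sigint=True):
    history = []
    loop = asyncio.get_running_loop()
    task = asyncio.create_task(stream_response(history))
    if install_sigint:
        # steps 1-2: SIGINT cancels the task instead of killing the process
        loop.add_signal_handler(signal.SIGINT, task.cancel)
    try:
        await task
    except asyncio.CancelledError:
        await cleanup(history)
    finally:
        if install_sigint:
            loop.remove_signal_handler(signal.SIGINT)
    return history                           # step 5: back to a clean prompt
```

The hard parts the post describes (a second Ctrl+C, late tool results) show up exactly at the `shield` call and the `CancelledError` boundary.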
Most LLM agents forget everything between sessions. The ones that don't usually bolt on a vector database, which works, but retrieves by semantic similarity and tends to surface adjacent noise alongside the things that actually matter.

I've been building MME (Modular Memory Engine) at Rail Tech to take a different angle. MME stores memories as a weighted tag graph: every saved fact is automatically tagged, and retrieval propagates through the graph by keyword overlap and learned edge weights. It's designed to sit alongside your vector DB, not replace it. Keep your embeddings; add a layer that knows what actually got used.

This week we shipped the official Python SDK:

👉 pip install railtech-mme

It's a thin, typed client over the MME REST API: sync and async, full Pydantic models, and a LangChain extra (MMESaveTool, MMEInjectTool) that drops into any LangChain or LangGraph agent. It works on a cold account from day one — no embedding warm-up, no fine-tuning.

If you're building LLM agents in Python and you're tired of either watching your agent forget everything or wrestling with a vector DB, take a look:

→ pip install railtech-mme
→ https://lnkd.in/enWXdmfD
→ mme.railtech.io

Would love feedback from anyone shipping agents in production.

#LLM #AI #Python #LangChain #AgentMemory
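For intuition, here is a toy sketch of the weighted-tag-graph retrieval idea described above. This is not MME's implementation or API; the class name, weights, and one-hop propagation rule are invented for illustration:

```python
from collections import defaultdict

class TagGraph:
    def __init__(self):
        self.memories = {}                  # id -> (text, tags)
        self.edges = defaultdict(float)     # (tag_a, tag_b) -> learned weight

    def save(self, mem_id, text, tags):
        self.memories[mem_id] = (text, set(tags))
        for a in tags:                      # co-occurrence strengthens edges
            for b in tags:
                if a != b:
                    self.edges[(a, b)] += 1.0

    def retrieve(self, query_tags, top_k=3):
        # score = direct tag overlap + one hop of weighted propagation
        query_tags = set(query_tags)
        scored = []
        for mem_id, (text, tags) in self.memories.items():
            score = len(query_tags & tags)
            for q in query_tags - tags:     # propagate through learned edges
                score += max((self.edges[(q, t)] for t in tags), default=0.0) * 0.1
            scored.append((score, mem_id, text))
        scored.sort(reverse=True)
        return [(m, t) for s, m, t in scored[:top_k] if s > 0]
```

The interesting property: a memory tagged only `infra` can still be retrieved for a `deploy` query if those tags co-occurred before, while semantically "similar" but unrelated memories never surface.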
🚀 Visualizing code like a system, not just files

When working with large codebases, the hardest part isn’t writing code; it’s understanding how everything connects. So I built an AI Codebase Knowledge Graph.

💡 What it does
• Transforms an entire codebase into an interactive knowledge graph
• Maps relationships between functions, classes, and modules
• Helps visualize dependencies and architecture in a clear, intuitive way
• Enables impact analysis by tracing how changes affect other parts of the system

⚙️ How it works
I built an AST-based parsing pipeline to extract structural elements from Python code, then used graph modeling to represent relationships across the codebase. Using NetworkX, the system constructs a graph of dependencies, and a Streamlit interface lets you explore it interactively.

📊 Why it matters
Instead of digging through files manually, you can see your system as a connected structure. This makes it much easier to:
• Understand complex architectures
• Debug faster
• Analyze the impact of changes before making them

🧠 The idea
I wanted to move beyond code as text and treat it as a system of relationships.

🔗 GitHub: https://lnkd.in/gqBkRDn9

#AI #GraphAnalytics #DeveloperTools #Python #CodeVisualization #SoftwareEngineering
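The AST extraction pass can be sketched in a few lines. A plain dict stands in here for the NetworkX DiGraph to keep the sketch dependency-free, and it only resolves direct calls between module-level functions (a real pipeline would also handle classes, methods, and imports):

```python
import ast
from collections import defaultdict

def call_graph(source: str) -> dict:
    """Map each top-level function to the set of names it calls."""
    tree = ast.parse(source)
    graph = defaultdict(set)                 # caller -> {callees}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            graph.setdefault(node.name, set())
            for inner in ast.walk(node):
                # only simple `name(...)` calls; attribute calls need more work
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    graph[node.name].add(inner.func.id)
    return dict(graph)
```

Reversing the edges gives the impact-analysis view: who breaks if this function changes.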
𝗽𝗶𝗽 𝗶𝗻𝘀𝘁𝗮𝗹𝗹 𝗮𝗴𝗲𝗻𝘁-𝗴𝗼𝗱𝗺𝗼𝗱𝗲

That's all it takes to give your LLM agents full file system and shell capabilities. I just open-sourced 𝗮𝗴𝗲𝗻𝘁-𝗴𝗼𝗱𝗺𝗼𝗱𝗲 — a Python MCP package built for engineers shipping real AI products at scale.

🛠️ Capabilities:
• Read, write & edit files — safely, UTF-8 aware, no accidental overwrites
• Execute commands — sandboxed subprocess, no shell injection risk
• Explore the filesystem — recursive listing, glob patterns, depth control
• Everything scoped to your workspace root — no path escapes, ever
• No Docker. No Deno. No containers. Just Python.

⚙️ Works natively over MCP (stdio) or fully in-process. Plug in an LLM. Your keys, your model, your infra.

🔗 https://lnkd.in/dPaHd6XX

What gaps have YOU hit building agents? Drop a comment 👇

#AIEngineering #LLMAgents #MCP #ModelContextProtocol #OpenSource #Python #GenerativeAI #AgentDevelopment #MachineLearning
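The "no path escapes, ever" guard can be sketched with pathlib. The function name and error type below are illustrative, not agent-godmode's actual API:

```python
from pathlib import Path

def resolve_in_workspace(root, user_path):
    """Resolve user_path inside root; raise ValueError if it escapes."""
    root = Path(root).resolve()
    candidate = (root / user_path).resolve()
    # relative_to raises ValueError when candidate is outside root,
    # which catches ../ traversal, absolute paths, and resolved symlinks
    candidate.relative_to(root)
    return candidate
```

Every file tool then goes through this one choke point before touching disk, so the scoping rule cannot be forgotten in an individual tool.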
Here’s the easiest way to fix context window limits: stop putting documents in the prompt… put them in a REPL instead.

Instead of building complex RAG pipelines and other hacks to work around context limits, load the document as a variable in a persistent Python environment. This is the core idea behind Recursive Language Models (RLMs) as an orchestration technique.

The model never sees the full document. It only gets metadata:
• Size
• Structure
• Available functions
• How to access it

Then the model writes code to explore it. Each step runs inside a persistent REPL, and variables survive across iterations, so the model builds results progressively:
• Filtered subsets
• Intermediate buffers
• Partial summaries
• Structured outputs

When deeper reasoning is needed, it spawns a sub-call: llm_query(prompt, chunk). Only that chunk goes to a worker model, and the result returns to the REPL. The main context stays clean: only small execution results get appended to history, which keeps the context window a constant size.

Here's the takeaway:
Traditional RAG → you engineer the context
REPL loop → the model engineers its own context

This is context engineering on autopilot. Load the large state into memory. Let the model inspect it. Keep the prompt minimal. Cleaner context. Lower cost. Better reasoning over large data.

I broke down the full RLM mechanism in a recent newsletter. Check it out here: https://lnkd.in/dj5PWtSW
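The loop is easy to sketch. Here llm_query is a stub and the code strings are hard-coded; in a real RLM setup the orchestrating model authors those strings and llm_query actually calls a worker model:

```python
def llm_query(prompt, chunk):
    # stand-in for a sub-call to a worker model on one chunk
    return f"summary({len(chunk)} chars)"

class REPLSession:
    def __init__(self, document):
        # the full document lives in the REPL, never in the prompt;
        # the model would only be told metadata such as len(document)
        self.ns = {"document": document, "llm_query": llm_query}

    def run(self, code):
        exec(code, self.ns)          # variables persist across iterations
        return self.ns.get("_result")
```

One iteration chunks the document inside the REPL; the next fans chunks out to llm_query. Only the short `_result` strings would ever be appended to the main model's history.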
𝐈𝐟 𝐘𝐨𝐮 𝐃𝐨𝐧’𝐭 𝐔𝐧𝐝𝐞𝐫𝐬𝐭𝐚𝐧𝐝 𝐓𝐡𝐢𝐬 𝐏𝐫𝐨𝐛𝐥𝐞𝐦, 𝐘𝐨𝐮 𝐃𝐨𝐧’𝐭 𝐔𝐧𝐝𝐞𝐫𝐬𝐭𝐚𝐧𝐝 𝐒𝐭𝐚𝐜𝐤𝐬

Today I tackled a fundamental problem that looks simple at first — but really tests your understanding of logic and data structures.

💡 𝐓𝐡𝐞 𝐂𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐞:
Given a string of brackets () { } [ ], determine whether it is valid.

🧠 𝐌𝐲 𝐀𝐩𝐩𝐫𝐨𝐚𝐜𝐡:
Instead of checking everything at the end, I used a stack (𝐋𝐈𝐅𝐎 𝐩𝐫𝐢𝐧𝐜𝐢𝐩𝐥𝐞) to validate each step in real time.
• Push opening brackets
• On a closing bracket → match with the last opened one
• If a mismatch occurs → invalid
• If everything matches & the stack is empty → valid

🔥 𝐊𝐞𝐲 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠:
This problem taught me how powerful simple data structures can be when used correctly.

🐍 𝐏𝐲𝐭𝐡𝐨𝐧 𝐒𝐨𝐥𝐮𝐭𝐢𝐨𝐧 👇

📌 Consistency in solving such problems is helping me build strong problem-solving skills.

#Python #DSA #FullStack #AI #Logic #LeetCode #AIDriven
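Since the solution itself was shared as an attachment, here is a sketch of the approach described (push openers, match closers against the top of the stack):

```python
def is_valid(s: str) -> bool:
    pairs = {")": "(", "]": "[", "}": "{"}   # each closer's required opener
    stack = []
    for ch in s:
        if ch in "([{":
            stack.append(ch)                 # push opening brackets
        elif ch in pairs:
            # closer must match the most recently opened bracket (LIFO)
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack                         # leftover openers mean invalid
```

Each character is handled once, so this runs in O(n) time and O(n) space.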
Today's topic is a tool combo breakdown: three combinations that can revolutionize your workflow and save you time. Whether it's integrating Claude Code with Obsidian for a seamless knowledge management system or harnessing n8n combined with the Claude API to automate complex tasks, these tools offer specific benefits.

Let's dive into one of our options: using Python along with the Claude API. This combo allows developers to leverage AI capabilities directly within their existing workflows. Here's how you can set it up:

1. **Setup**: First, ensure you have Python installed on your machine, along with Anthropic's official SDK (`pip install anthropic`) and an API key in your environment.

2. **Write Your Script**: Start with a simple Python script that sends a prompt to the Claude API (the model name below is illustrative; check Anthropic's docs for current model IDs). For example:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def get_response(prompt: str) -> str:
    message = client.messages.create(
        model="claude-sonnet-4-5",  # illustrative; pick a current model
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}],
    )
    return message.content[0].text

result = get_response("Summarize today's meeting notes in three bullet points.")
print(result)
```

3. **Integrate with Obsidian**: Next, you can wire this script into Obsidian using n8n to automate tasks. This setup can save significant time and effort, reducing manual processing and allowing for more efficient workflows.

Would you be interested in exploring further AI integration opportunities like this one? Let us know your thoughts or challenges in the comments below.

#ClaudeCode #AIAutomation #AITools #BuildWithAI #loopfeedai
📣 SynapseKit v1.4.7 + v1.4.8 just dropped. Back to back. Huge thanks to Dhruv Garg and Abhay Krishna, who drove most of this sprint. 🙌

Two themes in these releases: getting data in, and making workflows resilient.

Getting data in: 5 new loaders
The gap between "I have a RAG pipeline" and "I can actually feed it my company's data" is a loader problem. These close it:
📨 SlackLoader — pull channel messages directly into your pipeline
📝 NotionLoader — ingest pages and databases from Notion
📖 WikipediaLoader — single article or multiple, pipe-separated
📄 ArXivLoader — search arXiv, download PDFs, extract text automatically
📧 EmailLoader — any IMAP mailbox, stdlib only, zero extra dependencies

SynapseKit now has 24 loaders. Your data is probably already covered.

Better retrieval — ColBERT
ColBERTRetriever brings late-interaction ColBERT via RAGatouille. Instead of comparing a single query vector against a single document vector, ColBERT scores every query token against every document token (MaxSim). On long documents the recall improvement is significant: single-vector approaches lose detail in the compression. Token-level scoring doesn't.

Resilient graph workflows
Subgraph error handling now ships with three strategies: retry with backoff, fallback to an alternative graph, and skip-and-continue. Production workflows break. The question is whether they break gracefully.

Where SynapseKit stands today: 27 providers · 9 vector backends · 42 tools · 24 loaders · 2 hard dependencies

⚡ pip install synapsekit==1.4.8
📖 https://lnkd.in/dvr6Nyhx
🔗 https://lnkd.in/d2fGSPkX

#Python #LLM #RAG #AI #OpenSource #MachineLearning #Agents #SynapseKit
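The MaxSim idea in miniature (pure Python with toy 2-d vectors; not RAGatouille's or SynapseKit's API): each query token keeps its best-matching document token, so a multi-topic document isn't averaged away into one pooled vector:

```python
def maxsim(query_vecs, doc_vecs):
    # late interaction: for each query token, take the best dot-product
    # against ANY document token, then sum over query tokens
    return sum(
        max(sum(q * d for q, d in zip(qv, dv)) for dv in doc_vecs)
        for qv in query_vecs
    )
```

With a two-topic query, the token-level representation scores higher than the mean-pooled single vector of the same document, which is exactly the "compression loses detail" point above.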
Paying per token for internal automation? There's a better way.

I built a proxy that wraps Claude Code CLI as a standard Anthropic API — same format, same SDKs, zero per-token billing.

The trick: Claude Code CLI runs under a flat-rate Claude Max subscription. We exposed it as POST /v1/messages on our internal server. Now every machine on the network calls it like the real API.

One line change in Python:

base_url = "https://your-server:port"  # that's it

Bonus: Claude CLI can read and fix actual files on the server — something the real API cannot do.

Stack: Flask + Podman + config.json. Fully self-hosted.

#ClaudeCode #DevOps #Anthropic #AI #CloudNative #PlatformEngineering
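The core translation step might look like this sketch. It assumes Claude Code's non-interactive print mode (`claude -p`), and the message-flattening scheme is an invented simplification of what a real proxy behind POST /v1/messages would do:

```python
def messages_to_prompt(messages):
    # flatten Anthropic-style messages into one prompt string for the CLI
    return "\n\n".join(f"{m['role']}: {m['content']}" for m in messages)

def build_cli_invocation(body):
    # the Flask handler would pass this argv to subprocess.run(..., capture_output=True)
    # and wrap stdout back into an Anthropic-shaped JSON response
    return ["claude", "-p", messages_to_prompt(body["messages"])]
```

Building the argv as a list (rather than one shell string) avoids shell-injection issues when user content reaches the CLI.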
The new CodeAct feature in Agent Framework tackles a real performance bottleneck for AI agents: orchestration overhead from chaining multiple tool calls. By letting agents write and run their whole plan as a single Python script in a Hyperlight sandbox, CodeAct cuts latency by half and cuts token usage by over 60% for workloads with lots of small chained steps.

- CodeAct is available in the agent-framework-hyperlight (alpha) package, using Hyperlight micro-VMs for safe, isolated execution.
- Instead of model → tool → model loops, agents submit one block of code, with tool calls bridged out securely to your runtime.
- Approval modes are flexible: you can decide whether the code block or individual tool calls require human sign-off.

This is a practical upgrade for any scenario where agents do data wrangling, chained lookups, or report generation. The wiring stays simple, and the gains are measurable.
Stop treating the random seed like a magic number. If your model "works" only at seed=42, you haven't built robustness — you've built coincidence.

Here's a practical checklist to stop bluffing reproducibility and actually ship:
- Log the full RNG state (Python, NumPy, PyTorch, CUDA) — not just the seed.
- Save experiment configs and dependency hashes (library versions, commit SHAs).
- Run seed sweeps: test 5–10 different seeds and report variance, not just best accuracy.
- Automate environment capture: Dockerfile + pip/conda lockfiles + OS info.
- Add smoke tests that fail on >X% variance before merging.

Tools that make this simple:
- Weights & Biases (wandb) — artifact + config + run comparison.
- DVC (iterative/dvc) — data + model versioning with reproducible pipelines.
- Hydra (facebookresearch/hydra) — config management and multi-run sweeping.
- MLflow — experiment tracking + model registry.

Quick pattern I use:
1) seed_everything at process start
2) capture RNG states to a wandb artifact
3) run a 7-seed sweep via Hydra
4) record mean ± std in the PR

FlazeTech taught me to treat reproducibility as a feature, not a checkbox. If your model’s performance collapses across seeds, that’s a product risk — not a research footnote.

Are you logging RNG state and variance, or still "hoping" 42 saves you?

#MachineLearning #Reproducibility #MLOps #AIEngineering #DevTools #Hydra #DVC
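The seed_everything plus state-capture steps in stdlib-only form. The NumPy/PyTorch/CUDA lines are left as comments since those libraries may not be installed; a full version would capture and restore their states too:

```python
import os
import random

def seed_everything(seed: int):
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    # np.random.seed(seed)                 # NumPy
    # torch.manual_seed(seed)              # PyTorch (CPU)
    # torch.cuda.manual_seed_all(seed)     # PyTorch (CUDA)

def capture_rng_state():
    # log this dict alongside the run config, not just the seed
    return {"python": random.getstate()}

def restore_rng_state(state):
    random.setstate(state["python"])
```

Capturing the state mid-run is what the seed alone cannot give you: it lets you resume or replay from an arbitrary point, not just from process start.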