Name: Agent Code Execution in Isolated Containers | Chris Weaver posted on the topic | LinkedIn
Uploaded: 2026-02-26T14:45:02.437Z
Duration: 32 s
Channel: Chris Weaver

Chris Weaver

2mo

I let agents execute arbitrary Python code on my computer. (No, I haven’t lost my mind.)

Most developers would call that a security nightmare. I call it the future of local AI and agentic engineering. Isolated containers spin up on demand, the agent runs whatever Python it wants inside, and nothing leaks out.

We've been using it internally for a while: it powers code execution across Onyx agents doing data analysis, file processing, and tool generation. We pulled it out into a standalone repo with no Onyx dependency because this felt like something the whole agent ecosystem is missing. If you're building agents that need code execution, you shouldn't have to spend three weeks on sandboxing infrastructure before you can get to the interesting part.

Day 4 of launch week. Look out for tomorrow, as we’re taking this sandbox a whole lot further 👀
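To make the idea of on-demand isolated containers concrete, here is a minimal sketch that runs a Python snippet in a throwaway Docker container. It is not the Onyx python-sandbox implementation; the function names, the `python:3.12-slim` image, and the resource limits are assumptions chosen for illustration, and it needs a local Docker install to actually execute code.

```python
import subprocess


def build_sandbox_command(image: str = "python:3.12-slim",
                          memory: str = "256m",
                          cpus: str = "1") -> list[str]:
    """Build a `docker run` command for a throwaway, locked-down container.

    Hypothetical helper: the image and limits are illustrative defaults.
    """
    return [
        "docker", "run",
        "--rm",                # delete the container when it exits
        "-i",                  # keep stdin open so the program can be piped in
        "--network", "none",   # no network access from inside the container
        "--read-only",         # root filesystem is mounted read-only
        "--tmpfs", "/tmp",     # scratch space that vanishes with the container
        "--cap-drop", "ALL",   # drop all Linux capabilities
        "--pids-limit", "64",  # cap the number of processes
        "--memory", memory,
        "--cpus", cpus,
        image,
        "python", "-",         # `python -` reads the program from stdin
    ]


def run_untrusted(code: str, timeout_s: float = 10.0) -> subprocess.CompletedProcess:
    """Pipe `code` into a fresh container and capture stdout/stderr."""
    return subprocess.run(
        build_sandbox_command(),
        input=code,
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
```

Calling `run_untrusted("print(sum(range(10)))")` would return a result whose `stdout` holds the program's output. Note that the `timeout` here stops the Docker client process; a production sandbox may need separate cleanup of a container that is still running.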

5 Comments

Chris Weaver 2mo

You can check it out here: https://github.com/onyx-dot-app/python-sandbox, contributions are welcome!

3 Reactions

Arulnidhi Karunanidhi 2mo

Congratulations Chris Weaver 🥂 This is interesting. I have a question for you: a lot of finance teams still rely on Excel drill-down workflows. In accounts receivable, double-clicking a summary cell often opens a separate sheet with the underlying PO and payment rows. Curious whether Onyx agents can automate that kind of extraction?

Youssef Ben Mahmoud 2mo

Security first is the right call for agent execution. A reusable isolated sandbox plus auditable runs removes a major blocker for teams that want autonomous workflows in production.

Abdul Rehman Azam 1mo

Controlled execution is the real unlock here. Without proper sandboxing, agents can’t safely scale.

Arindam Bose 2mo

This is great, Chris Weaver! Quick question: are there any restrictions on the file size for analysis?


More Relevant Posts

Muyukani Kizito
2mo Edited
Today I am open-sourcing Turn v1.0: a compiled systems programming language and custom Rust VM built specifically for agentic compute.

Anyone writing AI agent software knows that it takes hundreds of lines of TypeScript, Python and Pydantic just to parse JSON and manage agent concurrency. LLMs are stochastic, yet we're using deterministic languages to write the harness. It works, but it is painful and brittle. So I decided to fix it at the compiler level.

Turn treats LLMs as native computational units, introducing three core primitives:

1️⃣ Cognitive Type Safety: Define a struct and call infer. The Turn VM natively guarantees the LLM returns exactly the memory shape you asked for. No more manual JSON parsing.
2️⃣ Probabilistic Routing: LLMs hallucinate. Turn makes uncertainty a first-class citizen with a native confidence operator, allowing you to build programmatic fail-safes.
3️⃣ Actor-Model Concurrency: Multi-agent orchestration in Python is a race-condition nightmare. Turn uses isolated VM execution trees (spawn_link) and deterministic mailboxes (receive).

It’s fast, provider-agnostic (OpenAI, Anthropic, Gemini, Grok), and completely open-source.

Playground & Docs: https://turn-lang.dev
Source Code: https://lnkd.in/dj-fqr2C

#Rust #ArtificialIntelligence #OpenSource #ProgrammingLanguages
Davontae Jackson
1mo
Over the past few months I’ve been working on Pulse, an open‑source observability platform for AI apps. It brings everything your agents do into one place: Pulse captures every prompt, completion, token count, latency and cost, and organizes them into session timelines and traces. This makes it easy to debug, replay runs and understand how workflows behave.

What Pulse includes:
• End‑to‑end trace capture and session timelines.
• A trace‑ingestion server, interactive dashboard, CLI for agents, and TypeScript/Python SDKs that install quickly and can run locally or be self‑hosted.

Try it and help shape what’s next: https://usepulse.dev

#Observability #AIAgents #LLM #DeveloperTools

Moss (YC F25)

2,322 followers
1mo
🚀 We promised a frictionless developer experience. The unified Moss repo is here to deliver on it: a single, structured collection of drop-in samples to skip the boilerplate and start building Voice AI pipelines with Moss.

✓ Core SDKs: Python & TS flows for sub-10ms querying, custom embeddings, and metadata filtering
⚡ Real-Time Voice: Pipecat & LiveKit pipelines with sub-10ms audio retrieval

Clone it, swap in your code, and go.

1 Comment
Yorrick Jansen
2mo
Claude can write a compiler, but it can't spell "anthropic" backwards.

I tested it. Pure LLM: "cipohtrpna" => wrong. Give it a sandbox? It writes `echo "anthropic" | rev`, runs it, and gets "ciporhtna" => correct.

10 tests where LLMs fail: counting, reversal, prime factorization, set ops. No sandbox: 8/10. With sandbox: 10/10. I didn't tell it to write code; it just recognized what it's bad at and reached for Python!

Anthropic just shipped Programmatic Tool Calling (Claude writes code that calls your tools). Fewer round-trips, fewer tokens. It runs in their container, not yours. Testing it, something occurred to me: does that still work without tools? Yes, and it patches its own blind spots with code!

Have you tried PTC already?

#AICoding #ClaudeCode #AIAgents #LLM
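The four task types named in the post are the kind of thing a few lines of Python settle exactly. The sketch below is illustrative only: apart from reversing "anthropic", the inputs are assumptions, not the post's actual test cases.

```python
def reverse(text: str) -> str:
    """Reverse a string character by character."""
    return text[::-1]


def count_char(text: str, ch: str) -> int:
    """Count occurrences of a single character."""
    return text.count(ch)


def prime_factors(n: int) -> list[int]:
    """Prime factorization by trial division, smallest factor first."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:  # whatever remains is itself prime
        factors.append(n)
    return factors


print(reverse("anthropic"))                # ciporhtna
print(count_char("strawberry", "r"))       # 3
print(prime_factors(360))                  # [2, 2, 2, 3, 3, 5]
print(sorted({1, 2, 3} & {2, 3, 4}))       # [2, 3]
```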
1 Comment
Dean Didion
1mo
Most people don't realise how far you can push n8n when you pair it with Python. 🐍

I've just finished building something I'm pretty excited about. A Python sidecar service running alongside our n8n automation platform. It sounds more complicated than it is, but the impact is massive.

Here's what was bugging me: n8n is brilliant for connecting apps, moving data and building workflows. But the moment you need serious data processing, JavaScript hits a wall pretty fast. And I kept running into that wall.

So I built a FastAPI service that lives in the same Docker network as n8n. When a workflow needs heavy lifting, it fires off a request to Python, gets the result back, and carries on. From n8n's side it's just another HTTP node call. Simple.

The difference in what's now possible is honestly kind of wild. Where JavaScript would struggle, Python just gets on with it. Pandas for processing thousands of rows in milliseconds. NumPy for calculations that would crash a spreadsheet. Access to pretty much every ML and AI library out there. Data cleaning, enrichment, financial modelling, NLP: all available inside a workflow with a single HTTP node.

The whole thing runs self-hosted on Hetzner, locked down behind API key auth and internal Docker networking, so nothing is exposed that shouldn't be.

The thing I keep coming back to is this. You don't need to rip out your low-code tools when they hit their limits. You just need to know where to extend them. n8n handles the orchestration. Python handles the intelligence.

If you're building automation pipelines and finding yourself fighting against the tool rather than with it, I'd genuinely recommend looking at this pattern. Happy to share more if anyone's curious 👇

#n8n #Python #Automation #FastAPI #Docker #LowCode #WorkflowAutomation
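The sidecar pattern described above can be sketched with nothing but the standard library. This is a stand-in for the FastAPI service the post mentions, not the author's code: the `X-API-Key` header name, the `API_KEY` value, the JSON payload shape, and the `summarize` function are all assumptions, and `statistics` stands in for Pandas or NumPy.

```python
import json
import statistics
from http.server import BaseHTTPRequestHandler, HTTPServer

API_KEY = "change-me"  # hypothetical; load from the environment in practice


def summarize(values: list[float]) -> dict:
    """The 'heavy lifting' step; a real sidecar might call pandas or NumPy here."""
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
    }


class SidecarHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Reject callers that do not present the shared API key.
        if self.headers.get("X-API-Key") != API_KEY:
            self.send_error(401, "invalid API key")
            return
        # Read the JSON body, e.g. {"values": [1, 2, 3]}.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(summarize(payload["values"])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Listen on all interfaces; keep the port on an internal Docker network."""
    HTTPServer(("0.0.0.0", port), SidecarHandler).serve_forever()
```

From the workflow's side this is an ordinary HTTP request node that POSTs JSON with the key header and reads the JSON reply.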

4 Comments
Diogo Santos
2mo
A ~550-word AGENTS.md reduced agent runtime by 28.64% and token usage by 16.58% on SWE-bench Verified. The trick wasn’t more context; it was less ambiguity.

I tested these ideas while refactoring agent docs for a production Python/FastMCP monorepo at NOS. What stuck with me:

• AGENTS.md works when it’s executable onboarding. Setup + test commands beat prose (Lulla et al.).
• AGENTS.md is becoming the interoperable default. 4,860 context files across GitHub; `.cursorrules` is basically legacy (Galster et al.).
• Short beats comprehensive. Most files are <500 words; medians cluster around ~335–535 words (Chatlatanagulchai et al.).
• Testing instructions are the highest-signal section. They show up in ~75% of high-quality files.
• Auto-generated context can backfire. LLM-generated files dropped success by ~3% on average while raising cost >20% (Gloaguen et al.).
• File localization is where agents fail first. If they edit the wrong file, everything downstream collapses (ContextBench).

What I did with this: one canonical AGENTS.md (~550 words, every snippet verified), CLAUDE.md + Copilot instructions as thin pointers, deleted `.cursorrules`, and 4 path-scoped instruction files that auto-inject context per folder.

Takeaway: context engineering is mostly negative space. Remove contradictions, name the right files, and make “run tests” unmissable.

Sources:
https://lnkd.in/eM-HnnGs
https://lnkd.in/eN7pUsfY
https://lnkd.in/eHAarmSC
https://lnkd.in/e9Fx6UC7
https://lnkd.in/eJM2EHkh
https://lnkd.in/eTqgZZqK
https://lnkd.in/egk_dX8U

#ContextEngineering #AICoding #CodingAgents #SoftwareEngineering #MCP #LLMs #DeveloperTools
1 Comment
TheNextGenTechInsider.com

651 followers
1mo
AntroCode Launches Zero-Dependency Single-File DeepSeek UI for Developers

📌 A 12-year-old developer just dropped a revolutionary tool: AntroCode, a zero-dependency, single-file DeepSeek UI that runs in your browser with one command. No servers, no installs - just `python AntroCode_1.py` and instant access to AI chat, CoT reasoning, and token tracking. Already trending on Hacker News, it’s redefining lightweight AI workflows for devs who hate setup.

🔗 Read more: https://lnkd.in/dNzkDQV8

#Antrocode #Deepseek #Python #Singlefile #Zerodependency

1 Comment
Kunal Kumar
1mo
🚀 Deploying my algorithm to a #Docker stack so it can run fully automated, 24/7.

Tech stack:
- Python for core logic
- Flask API to receive and process signals
- MongoDB for logging and state
- Docker for reliable, always‑on deployment
- Upstox / Dhan broker webhooks for execution

Once deployed, I’ll be able to use the strategy directly via broker webhooks, without manually sitting on the charts. More details and architecture breakdown coming soon.
Muhammad Abdullah
1mo
Writing a Python script is easy.

When you start connecting different systems together, you quickly learn one brutal truth: APIs will lie to you. They crash. They time out. They change their data without warning.

While automating workflows at my current startup, I realized that if your system only works when everything goes perfectly, you haven't built a pipeline; you’ve built a ticking time bomb.

Using tools like n8n alongside Python isn't just about moving data from Point A to Point B. It’s about planning for what happens when Point B is suddenly offline. If you aren't building automatic retries and backup plans into your code, you are just hoping for the best.

Data engineers and backend devs: What is your go-to strategy for handling silent API failures or surprise rate limits in production? 👇

#n8n #API #DataPipelines #Python #DataEngineering #Automation
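One common way to build in the automatic retries the post asks about is exponential backoff with jitter. The sketch below is one possible shape, not a prescribed answer: the function name, the default attempt count and delays, and the choice of `ConnectionError` and `TimeoutError` as retryable errors are all assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    retry_on: tuple[type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying on transient errors with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            delay = base_delay * (2 ** attempt)            # 0.5s, 1s, 2s, ...
            sleep(delay + random.uniform(0, delay / 2))    # jitter spreads out retries
    raise RuntimeError("attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting. Retries only cover failures that raise; a "silent" failure, such as a 200 response with a changed payload shape, still needs explicit validation of the response.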
1 Comment
Aman Parmar
2mo
I asked Claude Code to stop wasting my tokens. It told me to stop using it.

I was converting Confluence pages to markdown by passing them through Claude Code. Page after page. Every conversion cost tokens - for something that's just a format change.

So I asked Claude Code: you're spending too many tokens on this. What's the token-optimized way? Its answer: use Pandoc. This doesn't need me.

I built a Python script. It calls the Confluence REST API, downloads the page, and runs Pandoc for the conversion. Zero tokens.

But then 3 things broke:
1. Images were not getting downloaded in the .md file
2. Table formatting was not proper
3. Status labels and @mentions were not converting to markdown

Multiple iterations later - the script now handles all of it. Pages convert to markdown without missing anything. Proper formatting. No token cost.

The thing that stuck with me: I wasn't wasting tokens on hard problems. I was wasting them on mechanical tasks that a Python script handles better.

If you're using Claude Code, Antigravity, Cursor, or any AI tool - try this before your next conversion. Ask yourself: does this need intelligence, or is it just a format change? If it's just format - write a script. Save the tokens for the work that actually needs thinking.

Feel free to connect if you want to know more. Happy to help.

#ClaudeCode #Python #Confluence #BuildInPublic #AITools
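The core of a script like the one described (fetch the page, hand the HTML to Pandoc) might look roughly like this. It is a sketch, not the author's script: the helper names are hypothetical, the REST path and the `expand=body.export_view` parameter are assumptions about the Confluence content endpoint that should be checked against your instance, and it does not address the image, table, or @mention issues listed above.

```python
import subprocess
from pathlib import Path


def confluence_page_url(base_url: str, page_id: str) -> str:
    """Build the URL for a page's rendered HTML (assumed v1 content endpoint)."""
    return f"{base_url.rstrip('/')}/rest/api/content/{page_id}?expand=body.export_view"


def pandoc_command(html_path: Path, md_path: Path) -> list[str]:
    """Build a Pandoc invocation that converts HTML to GitHub-flavored Markdown."""
    return [
        "pandoc", str(html_path),
        "--from", "html",
        "--to", "gfm",
        "--output", str(md_path),
    ]


def convert(html_path: Path, md_path: Path) -> None:
    """Run Pandoc (must be installed and on PATH); raises if the conversion fails."""
    subprocess.run(pandoc_command(html_path, md_path), check=True)
```

Downloading the page itself (an authenticated GET on the URL above, then writing the returned HTML to `html_path`) is left out because the authentication details depend on the Confluence deployment.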
2 Comments
