Agentic Code Execution: Moving Beyond Traditional Function-Based Tools
Context
When building AI agents today, the core design challenge is crafting an efficient workflow around one key element: tools.
Tools allow a model to interact with external systems, perform tasks, and retrieve information. Recently the ecosystem has matured rapidly, and we now have two main paradigms for enabling tool usage:
Both are powerful, but each inherits a foundational limitation: tools are static.
They must be defined in advance with fixed arguments, fixed internal logic, and fixed return types.
This rigidity becomes problematic as agents become more capable, multi-step, and autonomous.
The Problem
1. Tools are inherently static
A Python function used as a tool might look like:
def resize_image(path: str, width: int, height: int) -> str:
    ...
This function has:
If the model wants to:
…it simply cannot, unless the developer modifies the function. This means tool design is a bottleneck for agent intelligence.
2. Too many tools = too many tokens
Every tool definition is injected into the model’s context window.
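To get a rough feel for this cost, the sketch below serializes a set of tool definitions to JSON and measures their size. The schema layout, the `describe_tool` helper, and the four-characters-per-token ratio are illustrative assumptions, not any provider's actual format or tokenizer.

```python
import inspect
import json


def resize_image(path: str, width: int, height: int) -> str:
    """Resize an image and return the output path."""
    return path


def describe_tool(func) -> dict:
    # Hypothetical schema layout: name, docstring, and parameter type names.
    params = {
        name: getattr(p.annotation, "__name__", str(p.annotation))
        for name, p in inspect.signature(func).parameters.items()
    }
    return {"name": func.__name__, "description": func.__doc__, "parameters": params}


def estimate_tokens(tools: list) -> int:
    # Rough heuristic (assumption): about four characters per token.
    payload = json.dumps([describe_tool(t) for t in tools])
    return len(payload) // 4


one_tool = estimate_tokens([resize_image])
fifty_tools = estimate_tokens([resize_image] * 50)
print(one_tool, fifty_tools)
```

The estimate grows roughly linearly with the number of definitions, and that overhead is paid on every request that carries them.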
If you create dozens of tools, you automatically:
AI engineers often try to compress functionality using feature flags:
def file_manager(path: str, delete: bool = False, read: bool = False, write: bool = False):
    ...
But this quickly becomes unwieldy:
3. Intermediate results consume tokens
Each tool call returns data that the model must read and interpret.
For multi-step reasoning, intermediate data may flood the context and waste tokens.
Example:
You pay for all of it.
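As a rough illustration of the difference, the sketch below compares what reaches the context when a tool returns every record versus when generated code filters the records first and prints only a summary. The `fetch_records` function and its data are hypothetical stand-ins.

```python
import json


def fetch_records() -> list:
    # Hypothetical data source standing in for a real tool's output.
    return [{"id": i, "status": "error" if i % 100 == 0 else "ok"} for i in range(1000)]


# Static tool call: the full result is serialized back into the model's context.
full_payload = json.dumps(fetch_records())

# Code execution: filtering happens inside the executed code,
# so only the short summary is returned to the model.
errors = [r["id"] for r in fetch_records() if r["status"] == "error"]
summary_payload = json.dumps({"error_count": len(errors), "first_ids": errors[:3]})

print(len(full_payload), len(summary_payload))
```

The summary is a small fraction of the full payload, and the model never has to read the rows it did not need.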
The Solution — Exec: Dynamic Code Tools
Instead of exposing static functions, expose a single tool that simply executes arbitrary code generated by the model.
Example API
def execute_python(code: str) -> str:
    exec(code, globals())  # exec() runs the Python source contained in the string
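The function above declares a `str` return type but returns nothing, so the model would never see what the code printed. One possible variant, sketched here under the assumption that the tool result should be whatever the code prints, captures standard output with `contextlib.redirect_stdout`. The name `execute_python_captured` and the fresh namespace are illustrative choices, and this is still not a sandbox.

```python
import contextlib
import io


def execute_python_captured(code: str) -> str:
    buffer = io.StringIO()
    # Assumption: use a fresh namespace so the generated code does not
    # read or overwrite the host module's globals.
    namespace = {"__name__": "__agent__"}
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)
    return buffer.getvalue()


output = execute_python_captured("print(sum(range(5)))")
print(output)
```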
Model usage
{
  "tool": "execute_python",
  "code": "
import os
files = [f for f in os.listdir('.') if f.endswith('.log')]
files = files[:50]
print('\\n'.join(files))
  "
}
Now your tool collapses dozens of predefined functions into one universal interface.
Advantages
But there’s a big issue
Security.
Executing model-generated Python inside your main interpreter is dangerous:
Unless executed inside a hardened sandbox, exec is unsafe.
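A harmless way to see the risk is to let executed code touch the host's own state. In this sketch, `host_globals` stands in for the dictionary that `globals()` would return, and the `API_KEY` entry and the snippet in `generated_code` are hypothetical. The point is only that code run through `exec` can read and rebind anything in the namespace it is given.

```python
# Stand-in for the host module's globals() dictionary (hypothetical contents).
host_globals = {"API_KEY": "secret-value"}

# Stand-in for model-generated code: it reads a host value, then overwrites it.
generated_code = """
leaked = API_KEY
API_KEY = "overwritten"
"""

exec(generated_code, host_globals)

leaked = host_globals["leaked"]        # the generated code could read the secret
current_key = host_globals["API_KEY"]  # and the host's value has been rebound
print(leaked, current_key)
```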
The Solution — Subprocess Sandboxing
A safer approach is to execute model-generated code in a separate subprocess, which keeps it out of the main interpreter's memory and lets you impose limits such as a timeout.
Example implementation
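import os
import subprocess
import sys
import tempfile

def run_python_subprocess(code: str) -> str:
    # Write the model-generated code to a temporary script file.
    with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as f:
        f.write(code.encode())
        path = f.name
    try:
        # Run the script in a separate interpreter process with a time limit.
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            text=True,
            timeout=5,
            cwd="/tmp",
        )
    except subprocess.TimeoutExpired:
        return "Error: execution timed out after 5 seconds"
    finally:
        os.remove(path)  # delete=False means the file must be cleaned up here
    return result.stdout or result.stderr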
Why subprocess is safer
Example usage
{
  "tool": "run_python_subprocess",
  "code": "
import json, glob, os
images = glob.glob('*.png')
result = [{'file': f, 'size': os.path.getsize(f)} for f in images]
print(json.dumps(result))
  "
}
This allows:
…all with one flexible tool.
Alongside subprocess sandboxing (which isn't 100% safe as implemented above), you can use other techniques such as running Python in isolated mode (python -I), preventing dangerous imports by overriding builtins.__import__, adding timeout protection, containerization, and more.
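Two of these techniques can be combined in a short sketch: the child interpreter is started with `-I`, and a small prelude replaces `builtins.__import__` with a wrapper that rejects a blocklist of modules. The blocklist contents, the `run_isolated` name, and the prelude are illustrative assumptions. An import blocklist like this is easy to bypass, so it should be treated as one layer rather than a complete sandbox.

```python
import subprocess
import sys

# Illustrative blocklist (assumption); a real policy would need far more care.
PRELUDE = '''
import builtins
_BLOCKED = {"socket", "subprocess", "ctypes"}
_real_import = builtins.__import__

def _guarded_import(name, *args, **kwargs):
    if name.split(".")[0] in _BLOCKED:
        raise ImportError(f"import of {name!r} is blocked")
    return _real_import(name, *args, **kwargs)

builtins.__import__ = _guarded_import
'''


def run_isolated(code: str, timeout: float = 5.0) -> str:
    try:
        result = subprocess.run(
            # -I: isolated mode, which ignores PYTHON* environment variables
            # and the user site-packages directory.
            [sys.executable, "-I", "-c", PRELUDE + "\n" + code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "error: execution timed out"
    return result.stdout or result.stderr


allowed = run_isolated("import json; print(json.dumps({'ok': True}))")
blocked = run_isolated("import socket")
print(allowed)
print(blocked)
```

The first call prints the JSON output, while the second returns the child's error output containing the `ImportError` raised by the guard.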
Do not use this code in production—it's written only to showcase an approach you can use to overcome some limitations of traditional tool calling.
Practical Use Cases
1. Data Transformation Tools
Agents write custom logic to generate CSVs, JSON, XML, XLSX, etc.
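For instance, an agent asked to turn JSON records into CSV might generate something like the following. The input records and column names are made up for illustration.

```python
import csv
import io
import json

# Hypothetical input that the agent was asked to convert.
raw = '[{"name": "a.png", "size": 1200}, {"name": "b.png", "size": 845}]'
records = json.loads(raw)

# Write the records as CSV into an in-memory buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "size"], lineterminator="\n")
writer.writeheader()
writer.writerows(records)

csv_text = buffer.getvalue()
print(csv_text)
```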
2. CodeAssist / CodeMode
Agents write and execute code as part of their reasoning loop.
3. Autonomous Engineering Agents
Agents can:
without predefined static functions.
Conclusions
Traditional function-based tools limit the flexibility and intelligence of agents.
As workflows grow more complex, static tool definitions become a bottleneck and increase token usage.
Agentic code execution—where the model writes and executes its own code—provides a solution:
exec enables this paradigm, while subprocess isolation makes it more practical and safer, though not fully secure on its own.
Modern agent architectures are shifting from “functions with arguments” to dynamic code execution tools, enabling the next generation of powerful agentic systems.