No LangChain, no Python, no fancy framework. Just C, curl, and a local LLaMA model running with llama.cpp on my machine.

Why? I wanted to really feel what “agentic” AI looks like at the lowest level:

1) I spin up llama.cpp locally with a quantized Meta LLaMA model, exposing an OpenAI‑style /v1/chat/completions endpoint on localhost.
2) My C program runs a simple terminal loop: it reads user input and sends it as JSON over HTTP using libcurl.
3) The response is parsed straight out of the raw JSON and printed back to the console – no SDKs, no helpers.
4) Every turn (User: / Bot:) is appended to a memory.txt log, so the agent has a persistent, readable conversation history I can inspect right inside my editor.
5) On top of that, I keep a sliding window of recent messages in memory and send it with every request, so the model can actually “remember” context within a session instead of just answering one‑off prompts.

It’s a tiny project, but it was a good reminder of what’s actually happening under all the layers of modern tooling: you’re sending structured text to a model, getting structured text back, and deciding how to manage state and side effects around it.

Code: https://lnkd.in/ggdYXaZH

