Most LLM agents forget everything between sessions. The ones that don't usually bolt on a vector database, which works, but retrieves by semantic similarity and tends to surface adjacent noise alongside the things that actually matter.

I've been building MME (Modular Memory Engine) at Rail Tech to take a different angle. MME stores memories as a weighted tag graph: every saved fact is automatically tagged, and retrieval propagates through the graph by keyword overlap and learned edge weights.

It's designed to sit alongside your vector DB, not replace it. Keep your embeddings; add a layer that knows what actually got used.

This week we shipped the official Python SDK:

👉 pip install railtech-mme

It's a thin, typed client over the MME REST API: sync and async, full Pydantic models, and a LangChain extra (MMESaveTool, MMEInjectTool) that drops into any LangChain or LangGraph agent. It works on a cold account from day one: no embedding warm-up, no fine-tuning.

If you're building LLM agents in Python and you're tired of watching your agent forget everything, or of wrestling with a vector DB, take a look:

→ pip install railtech-mme
→ https://lnkd.in/enWXdmfD
→ mme.railtech.io

Would love feedback from anyone shipping agents in production.

#LLM #AI #Python #LangChain #AgentMemory
Introducing Rail Tech MME for Persistent LLM Agents
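The tag-graph retrieval idea above can be sketched in plain Python. This is an illustrative toy, not MME's actual algorithm: the tagging, the co-occurrence edge weights, and the one-hop propagation scoring are all my assumptions.

```python
from collections import defaultdict

class TagGraphMemory:
    """Toy weighted tag graph: facts link to tags, tags link to each other."""

    def __init__(self):
        self.facts = {}                       # fact_id -> (text, tags)
        self.tag_to_facts = defaultdict(set)  # tag -> set of fact_ids
        self.edges = defaultdict(float)       # (tag_a, tag_b) -> weight

    def save(self, fact_id, text, tags):
        self.facts[fact_id] = (text, tags)
        for t in tags:
            self.tag_to_facts[t].add(fact_id)
        # co-occurring tags strengthen each other's edge
        for a in tags:
            for b in tags:
                if a != b:
                    self.edges[(a, b)] += 1.0

    def retrieve(self, query_tags, top_k=3):
        # activate the query tags, then propagate one hop over weighted edges
        activation = {t: 1.0 for t in query_tags}
        for t in query_tags:
            for (a, b), w in self.edges.items():
                if a == t:
                    activation[b] = activation.get(b, 0.0) + 0.5 * w
        scores = defaultdict(float)
        for tag, act in activation.items():
            for fid in self.tag_to_facts[tag]:
                scores[fid] += act
        ranked = sorted(scores, key=scores.get, reverse=True)[:top_k]
        return [self.facts[fid][0] for fid in ranked]

mem = TagGraphMemory()
mem.save("f1", "User prefers dark mode", ["preferences", "ui"])
mem.save("f2", "User's deploy target is AWS", ["infra", "aws"])
mem.save("f3", "UI bug in settings page", ["ui", "bugs"])
print(mem.retrieve(["ui"]))
```

Note how a query for `ui` also pulls in facts reached via the learned `preferences` and `bugs` edges, while the unrelated `infra` fact never activates — that is the "adjacent noise" filter a pure similarity search lacks.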
More Relevant Posts
The hardest bug I ever fixed: clean Ctrl+C in an async AI pipeline.

When a user presses Ctrl+C during a streaming response with active tool calls, you need to, in order and without race conditions:

1. Cancel the HTTP stream gracefully
2. Abort any in-flight tool executions
3. Clean up temporary state (partial files, temp directories)
4. Preserve conversation history up to the interruption point
5. Return to a clean prompt, ready for the next input

Each step can fail. And each failure mode is different. What if Ctrl+C fires between two tool calls? What if the stream buffer hasn't flushed? What if cleanup itself gets interrupted by a second Ctrl+C? What if an async tool call returns after cancellation and tries to write to a closed context?

Python's signal handling + asyncio cancellation made it possible. But every edge case took hours to find, because you can only reproduce them by hitting Ctrl+C at exactly the right millisecond.

The lesson I keep coming back to: the undo path is always harder than the happy path. And in developer tools, the undo path is what determines whether people trust your software.

Stack: Python + Claude API
GitHub: https://lnkd.in/ghn_8iKA
Full case study: https://lnkd.in/gtg49D-S

#Python #Claude #CLI #AsyncPython #Architecture #BuildInPublic #SoftwareEngineering
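The cancellation choreography can be sketched with stock asyncio primitives. The stream and cleanup functions here are stand-ins, not the author's code; the real points are the `finally` block (which also runs on cancellation) and `asyncio.shield`, which keeps a second Ctrl+C from interrupting cleanup halfway through.

```python
import asyncio

history = []      # conversation history, preserved across interruption
cleanup_log = []  # records that temp cleanup actually ran

async def cleanup_temp_state():
    # stand-in for deleting partial files / temp directories
    await asyncio.sleep(0)
    cleanup_log.append("temp state removed")

async def stream_response(got_first_chunk):
    try:
        for chunk in ["Hello", ", ", "world"]:
            history.append(chunk)      # preserve partial output as it arrives
            got_first_chunk.set()
            await asyncio.sleep(3600)  # stand-in for waiting on the network
    finally:
        # runs on CancelledError too; shield so a second cancellation
        # cannot abort cleanup midway
        await asyncio.shield(cleanup_temp_state())

async def main():
    got_first_chunk = asyncio.Event()
    task = asyncio.create_task(stream_response(got_first_chunk))
    await got_first_chunk.wait()
    task.cancel()                      # what a SIGINT handler would trigger
    try:
        await task
    except asyncio.CancelledError:
        pass                           # back to a clean prompt
    return history, cleanup_log

asyncio.run(main())
```

In a real CLI you would wire `task.cancel()` to SIGINT via `loop.add_signal_handler(signal.SIGINT, ...)` instead of calling it directly.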
Excited to share a project I recently built while exploring RAG (Retrieval-Augmented Generation) with LangChain.

I developed a YouTube Video Assistant that can understand and answer questions from video content.

Here's how it works:
🔹 Extracts video transcripts using the YouTube Transcript API
🔹 Splits large text into manageable chunks using RecursiveCharacterTextSplitter
🔹 Generates embeddings with HuggingFace Embeddings
🔹 Stores and retrieves context efficiently using a FAISS vector store
🔹 Performs retrieval and augments the user query with relevant context
🔹 Sends the final prompt to the LLM for accurate responses

To streamline the workflow, I designed a clean pipeline that connects each stage, from retrieval to generation, making the system modular and efficient. The backend is powered by a Flask server.

#AI #MachineLearning #LangChain #RAG #LLM #Python #Flask #FAISS #HuggingFace
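The chunk → embed → retrieve loop in those bullets can be shown without any ML dependencies. This stdlib-only sketch substitutes a bag-of-words counter for the HuggingFace embeddings and a sorted cosine scan for FAISS, purely to make the data flow visible:

```python
import math
from collections import Counter

def split_into_chunks(text, chunk_size=40, overlap=10):
    # crude character splitter, same idea as RecursiveCharacterTextSplitter
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def embed(text):
    # toy bag-of-words "embedding"; a real pipeline uses a model
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(chunks, query, top_k=1):
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)
    return ranked[:top_k]

transcript = ("the speaker explains gradient descent and learning rates "
              "then moves on to regularization and dropout")
chunks = split_into_chunks(transcript)
context = retrieve(chunks, "what is said about learning rates?")
prompt = f"Answer using only this context:\n{context[0]}\nQuestion: ..."
```

Swapping the toy `embed`/`retrieve` for HuggingFace embeddings plus FAISS gives the post's actual stack; the surrounding shape stays the same.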
Stop treating the random seed like a magic number. If your model "works" only at seed=42, you haven't built robustness — you've built coincidence.

Here's a practical checklist to stop bluffing reproducibility and actually ship:
- Log the full RNG state (Python, NumPy, PyTorch, CUDA) — not just the seed.
- Save experiment configs and dependency hashes (library versions, commit SHAs).
- Run multi-seed sweeps: test 5–10 different seeds and report variance, not just best accuracy.
- Automate environment capture: Dockerfile + pip/conda lockfiles + OS info.
- Add smoke tests that fail on >X% variance before merging.

Tools that make this simple:
- Weights & Biases (wandb) — artifact + config + run comparison.
- DVC (iterative/dvc on GitHub) — data + model versioning with reproducible pipelines.
- Hydra (facebookresearch/hydra) — flexible config management and multi-run sweeping.
- MLflow — experiment tracking + model registry.

Quick pattern I use:
1) seed_everything at process start
2) capture RNG states to a wandb artifact
3) run a 7-seed sweep via Hydra
4) record mean ± std in the PR

FlazeTech taught me to treat reproducibility as a feature, not a checkbox. If your model's performance collapses across seeds, that's a product risk, not a research footnote.

Are you logging RNG state and variance, or still "hoping" 42 saves you?

#MachineLearning #Reproducibility #MLOps #AIEngineering #DevTools #Hydra #DVC
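The seed_everything + sweep pattern can be sketched like this. NumPy and PyTorch are treated as optional so the sketch runs anywhere; the `seed_sweep` helper and the toy training function are my own illustrations, not any framework's API:

```python
import os
import random

def seed_everything(seed):
    """Seed every RNG we can find; NumPy/PyTorch are optional extras."""
    random.seed(seed)
    # note: PYTHONHASHSEED only affects *subprocesses*; for the current
    # process it must be set before the interpreter starts
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np  # may not be installed
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # may not be installed
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass

def capture_rng_state():
    # log the full state, not just the seed, so a run can be resumed exactly
    return {"python": random.getstate()}

def seed_sweep(train_fn, seeds):
    # report mean and variance across seeds instead of one lucky number
    scores = []
    for s in seeds:
        seed_everything(s)
        scores.append(train_fn())
    mean = sum(scores) / len(scores)
    var = sum((x - mean) ** 2 for x in scores) / len(scores)
    return mean, var

# toy "training run" whose score depends entirely on the RNG
mean, var = seed_sweep(lambda: random.random(), seeds=range(7))
```

In a real setup `train_fn` is your training loop, the seeds come from a Hydra multi-run config, and `capture_rng_state()` output is attached to the wandb artifact.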
𝗽𝗶𝗽 𝗶𝗻𝘀𝘁𝗮𝗹𝗹 𝗮𝗴𝗲𝗻𝘁-𝗴𝗼𝗱𝗺𝗼𝗱𝗲

That's all it takes to give your LLM agents full file system and shell capabilities.

I just open-sourced 𝗮𝗴𝗲𝗻𝘁-𝗴𝗼𝗱𝗺𝗼𝗱𝗲, a Python MCP package built for engineers shipping real AI products at scale.

🛠️ Capabilities:
• Read, write & edit files — safely, UTF-8 aware, no accidental overwrites
• Execute commands — sandboxed subprocess, no shell injection risk
• Explore the filesystem — recursive listing, glob patterns, depth control
• Everything scoped to your workspace root — no path escapes, ever
• No Docker. No Deno. No containers. Just Python.

⚙️ Works natively over MCP (stdio) or fully in-process. Plug in an LLM. Your keys, your model, your infra.

🔗 https://lnkd.in/dPaHd6XX

What gaps have YOU hit building agents? Drop a comment 👇

#AIEngineering #LLMAgents #MCP #ModelContextProtocol #OpenSource #Python #GenerativeAI #AgentDevelopment #MachineLearning
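The "no path escapes, ever" guarantee is the part most hand-rolled agent tools get wrong. A minimal sketch of the standard guard (this is the general pathlib technique, not agent-godmode's actual implementation): resolve the requested path against the workspace root and refuse anything that lands outside it.

```python
from pathlib import Path

def resolve_in_workspace(root: Path, requested: str) -> Path:
    """Resolve `requested` inside `root`; reject traversal and symlink escapes."""
    root = root.resolve()
    target = (root / requested).resolve()
    if not target.is_relative_to(root):  # Path.is_relative_to needs Python 3.9+
        raise PermissionError(f"path escapes workspace: {requested}")
    return target

root = Path("/tmp/workspace")
print(resolve_in_workspace(root, "notes/todo.txt"))    # allowed
try:
    resolve_in_workspace(root, "../../etc/passwd")     # blocked
except PermissionError as e:
    print("blocked:", e)
```

Calling `.resolve()` on both sides is what defeats `..` components and symlinks; a naive string-prefix check on the unresolved path would not.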
spent the last few days going deep into LangChain. not just "what is it" but actually building the pieces from scratch.

wrote everything up into a technical blog — 8 components, working code for each one, a RAG pipeline at the end. and because i don't have a paid OpenAI key (lol), i built the whole thing using Groq's free tier + HuggingFace embeddings. completely free stack, everything runs.

some things that actually clicked for me:
— prompt templates aren't just convenience. they're the difference between a prompt you can test vs one you'll forget existed
— agents feel magical until you realize they're just a loop: think → call tool → look at result → repeat
— RAG is simpler than it sounds: load your docs, embed them, retrieve the relevant bits, pass as context. that's it.

if you're getting into LangChain and don't want to burn through API credits figuring it out, the notebook might help.

📝 blog: https://lnkd.in/gPsMf_Ys
💻 notebook (free stack, no OpenAI key needed): https://lnkd.in/gnnJhCxy

#LangChain #GenerativeAI #LLM #RAG #Python #PromptEngineering #AgenticAI #MachineLearning
Innomatics Research Labs #OpenSource
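the "agents are just a loop" point deserves a concrete sketch. everything here is a stand-in — the hard-coded fake LLM and the one-tool registry are illustrations, not LangChain's agent executor — but the control flow is exactly the think → call tool → look at result → repeat loop:

```python
def fake_llm(history):
    # stand-in for a real model: decides the next action from what it has seen
    if not any("TOOL RESULT" in m for m in history):
        return {"action": "tool", "tool": "add", "args": (2, 3)}
    return {"action": "final", "answer": "2 + 3 = 5"}

TOOLS = {"add": lambda a, b: a + b}

def agent(question, max_steps=5):
    history = [question]
    for _ in range(max_steps):                        # think
        step = fake_llm(history)
        if step["action"] == "final":
            return step["answer"]
        result = TOOLS[step["tool"]](*step["args"])   # call tool
        history.append(f"TOOL RESULT: {result}")      # look at result, repeat
    raise RuntimeError("step budget exhausted")

print(agent("what is 2 + 3?"))
```

the `max_steps` cap matters: real agents can loop forever when the model keeps asking for tools, so every framework has some version of it.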
📣 3 lines to add tracing. Very different things to look at afterward.

We benchmarked local agent observability across three frameworks.

LangChain wins on setup: 3 lines. But no step latency locally, and no structured query object.
SynapseKit surfaces token counts, cost, and per-step timing out of the box.
LlamaIndex gives you the deepest event tree.

Setup ease and observability depth are not the same metric.

Full breakdown → engineersofai.com

#Python #AI #LLM #MLEngineering #EngineersOfAI
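For reference, "per-step timing" is cheap to hand-roll when a framework doesn't surface it. This is a generic context-manager tracer of my own, not any of the three frameworks' APIs:

```python
import time
from contextlib import contextmanager

TRACE = []  # one row per traced step

@contextmanager
def traced_step(name):
    start = time.perf_counter()
    try:
        yield
    finally:
        # record latency even if the step raises
        TRACE.append({"step": name, "seconds": time.perf_counter() - start})

with traced_step("retrieve"):
    time.sleep(0.01)   # stand-in for a retrieval call
with traced_step("generate"):
    time.sleep(0.01)   # stand-in for an LLM call

for row in TRACE:
    print(row["step"], f"{row['seconds'] * 1000:.1f} ms")
```

Token counts and cost need hooks into the model client, which is exactly where built-in observability earns its keep.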
Been trying out a bunch of agent tooling lately: LangChain / LangGraph, PydanticAI, Google ADK, Strands Agents, etc. The ReAct-style reasoning and tool use is great, but in practice I often found myself wanting more control over how things run. On the flip side, modeling everything as a task graph sometimes felt too heavy for what I needed.

So I started a small OSS / personal experiment → https://graflow.ai

Graflow is a Python framework where orchestration stays explicit in code, while agent frameworks handle reasoning and tool use. Still early, but aiming for something simple, readable, and easy to control.

If you want to try it quickly, there's a hands-on on Colab: https://lnkd.in/gTaFyD3A

Would really appreciate any feedback 🙏

GitHub: https://lnkd.in/g_BiNmez
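"Orchestration explicit in code" versus "everything as a task graph" is easiest to see side by side. This sketch is NOT Graflow's API — the agent stubs and pipeline are invented for illustration — it just shows the code-first style the post is contrasting with declarative graphs:

```python
def research_agent(topic):
    # stand-in for a ReAct-style agent doing reasoning + tool use
    return f"notes on {topic}"

def writer_agent(notes):
    # stand-in for a second agent consuming the first one's output
    return f"draft based on: {notes}"

def pipeline(topic, review=False):
    # the orchestration is ordinary Python: sequencing, branches,
    # early returns — no graph DSL to learn or debug through
    notes = research_agent(topic)
    draft = writer_agent(notes)
    if review:
        draft += " [reviewed]"
    return draft

print(pipeline("rail safety", review=True))
```

The trade-off: a graph framework buys you retries, parallelism, and visualization for free; plain code buys you step-through debuggability and zero abstraction tax.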
🚨 Every $0 AI stack tutorial shows you LangGraph, Ollama, and ChromaDB. But almost none of them explain how to actually expose your agent to the outside world.

I hit this wall immediately while building my async RAG pipeline, which processes and extracts data from PubMed articles. My orchestrator is Python. My UI is Chainlit. But something needs to sit cleanly between them.

That's when I realized: FastAPI + Pydantic is the missing layer nobody draws in the architecture diagrams.

🧩 Here is why they are becoming non-negotiable for my stack:

🛡️ Pydantic: strictly validates inputs before they ever reach the LLM, so malformed requests can't turn into garbage answers downstream.
⚡ FastAPI: turns the Python agent into a robust, async REST API in about 20 lines of code.

I am currently integrating both to bridge the gap between my backend processing and the user interface.

To the AI and Data Engineers out there who have built this before: what is the #1 trap I should avoid when setting up FastAPI for LLM agents? 👇

#FastAPI #Pydantic #DataEngineering #Bioinformatics #BuildInPublic
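The validation layer looks roughly like this. The field names, constraints, and the LLM stub are my assumptions, not the author's schema — the point is that a bad request raises before any model call happens:

```python
from pydantic import BaseModel, Field, ValidationError

class QueryRequest(BaseModel):
    # constraints are enforced at parse time, long before the LLM runs
    question: str = Field(min_length=3, max_length=2000)
    max_articles: int = Field(default=5, ge=1, le=50)

def fake_llm(question, max_articles):
    # stand-in for the real model call
    return f"searched {max_articles} articles for: {question}"

def answer(raw: dict) -> str:
    req = QueryRequest(**raw)  # rejects malformed input with ValidationError
    return fake_llm(req.question, req.max_articles)

print(answer({"question": "What does CRISPR target?", "max_articles": 3}))
try:
    answer({"question": ""})   # too short: never reaches the LLM
except ValidationError as e:
    print("rejected with", len(e.errors()), "error(s)")
```

In FastAPI the same model becomes the request body — declare `req: QueryRequest` as the parameter of an `@app.post` route and you get this validation, plus an async endpoint and OpenAPI docs, for free.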
Just built my first RAG (Retrieval-Augmented Generation) application! 🚀

The idea is simple: upload any PDF and chat with it using AI. Ask questions, get answers, all from your own document.

🛠️ Tech Stack:
→ FastAPI (backend)
→ Mistral AI (LLM + embeddings)
→ ChromaDB (vector store)
→ LangChain (RAG pipeline)
→ HTML/CSS/JS (frontend)

⚙️ How it works:
1. Upload your PDF
2. It gets split into chunks and converted to embeddings
3. The embeddings are stored in the ChromaDB vector store
4. You ask a question → relevant chunks are retrieved → Mistral AI generates the answer

Built this from scratch while learning GenAI, and still building. 🔥 Agents & deployment coming very soon!

🔗 GitHub: https://lnkd.in/gNhZ2pJN

#GenAI #RAG #LangChain #FastAPI #MachineLearning #Python #BuildInPublic
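Steps 2–4 above boil down to a store that maps embeddings to chunks and ranks by similarity. A toy in-memory version (the 3-d vectors are made up; ChromaDB plus Mistral embeddings replace both in the real app):

```python
class ToyVectorStore:
    """Minimal stand-in for ChromaDB: add (embedding, chunk), query by dot product."""

    def __init__(self):
        self.items = []  # list of (embedding, chunk_text)

    def add(self, embedding, chunk):
        self.items.append((embedding, chunk))

    def query(self, embedding, top_k=2):
        def dot(a, b):
            return sum(x * y for x, y in zip(a, b))
        ranked = sorted(self.items, key=lambda it: dot(it[0], embedding),
                        reverse=True)
        return [chunk for _, chunk in ranked[:top_k]]

store = ToyVectorStore()
store.add([1.0, 0.0, 0.0], "chunk about invoices")
store.add([0.0, 1.0, 0.0], "chunk about refund policy")
store.add([0.9, 0.1, 0.0], "chunk about billing dates")

context = store.query([1.0, 0.0, 0.0])  # the question's embedding
prompt = "Answer from this context:\n" + "\n".join(context)
```

Step 4 is then just sending `prompt` to the LLM: the retrieved chunks become the grounding context that keeps the answer tied to the uploaded PDF.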
Day 7/60

Continuing Chapter I
Continuing Topic VI - Discovering Types
2. Continuing Type Conversions

We can use bool() to convert a variable into a Boolean. If the variable has content, it will become True. If it's empty or zero, it'll become False.

Convert these variables into booleans, and check the output.

🧩 Code

member = "Sam"
middle_name = ""
foot_size = 8.5
siblings = 0

boolean_member = bool(member)
boolean_middle_name = bool(middle_name)
boolean_foot_size = bool(foot_size)
boolean_siblings = bool(siblings)

print(boolean_member)
print(boolean_middle_name)
print(boolean_foot_size)
print(boolean_siblings)

🖥️ Output

True
False
True
False

🧠 Challenge of the day: What value will be printed by the code below?

🧩 Code

average_score = 4.7
print(int(average_score))

🖥️ Output

4 or 5?

#python #programming #ai #bigtech