I tried to fork Redis and embed ONNX Runtime in C. After a week of segfaults, I asked myself: why am I torturing myself with C when I've been writing Java for 12+ years? So I built my own Redis from scratch.

Here's the backstory: we were building a multi-agent system for production incident investigation. Six AI agents running in parallel — logs, metrics, PagerDuty, Confluence, Slack, history — each digging through its own data source.

The problem? They had no shared memory. LogsInvestigator found OOMKilled — that's the root cause. But MetricsInvestigator didn't know; it was building its own theory about CPU spikes. The Synthesizer got 3 competing hypotheses, 2 of which were garbage.

I looked at what's available:
→ Mem0: REST API + external embeddings + Docker + custom SDK
→ Zep: REST API + Postgres + external embeddings + custom SDK
→ Redis Stack: RESP protocol, but embeddings still external

Every solution required network calls for embeddings. Three hops to save one fact. For working memory, that's unacceptable.

So I built Agentis Memory:
✦ Speaks the Redis protocol (RESP) — any Redis client works out of the box
✦ Embeddings computed locally via ONNX Runtime (all-MiniLM-L6-v2)
✦ Single binary, ~150MB, zero dependencies
✦ No API keys, no REST, no custom SDKs

Two commands that change everything:
• MEMSAVE key "text" → chunks, embeds, and indexes automatically
• MEMQUERY namespace "query" 5 → semantic search in milliseconds

The fun part about performance: the first version on the plain JVM was 2x slower than Redis 😅 After GraalVM native-image + SIMD via the Java Vector API, it came out 1.36x FASTER than Redis: 168K ops/sec vs Redis's 123K. At pipeline depth 100 → 3.19M ops/sec.

And yes — I benchmarked it against Redis 7.4, Dragonfly, and Lux. Honest numbers, no cherry-picking.

Now Claude Desktop, Claude Code, Codex, and Junie all share memory through Agentis Memory. Agents understand context faster, make fewer mistakes, and don't duplicate work.

The project is open source (Apache-2.0).
Detailed write-up with benchmarks and architecture: https://lnkd.in/diREt3cs GitHub: https://lnkd.in/dAEgsuJn
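Because Agentis Memory speaks plain RESP, no special client library is needed: any Redis client, or even a few lines of socket code, can issue MEMSAVE and MEMQUERY. As a minimal sketch, here is what a MEMSAVE call looks like on the wire, assuming standard RESP array-of-bulk-strings framing (the command names come from the post; the key and text are invented for illustration):

```python
def encode_resp_command(*parts: str) -> bytes:
    """Frame a command as a RESP array of bulk strings,
    the same wire format redis-cli uses."""
    out = [f"*{len(parts)}\r\n".encode()]
    for p in parts:
        data = p.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)

# MEMSAVE key "text" -> the server chunks, embeds, and indexes it
frame = encode_resp_command("MEMSAVE", "incident:42",
                            "Pod killed: OOMKilled at 14:03")
# A raw socket (or any Redis client) can send this frame as-is:
# sock.sendall(frame)
```

The same encoder frames MEMQUERY, since RESP treats every command identically: an array count, then length-prefixed arguments.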
Just shipped a distributed systems project. And honestly, I learned more debugging this than in any project I've ever done 😅

I built a search recommendation engine from scratch — the kind of system that tracks what you click on and uses that data to re-rank future search results in real time.

The stack:
→ FastAPI for the APIs
→ Kafka as the message queue (click events flow through here)
→ PostgreSQL to store click scores
→ Redis to cache hot results
→ Docker Compose to wire everything together

The whole flow looks like this: user clicks a result → event goes to Kafka → stream processor reads it → scores are updated in Postgres → the next search returns re-ranked results, with Redis serving the hot ones in <1ms.

Sounds clean, right? It was NOT clean getting there at all 😭

The issues I ran into (and how I fixed them):

1. Kafka advertising the wrong address. My Python service kept getting "connection refused" even though Kafka was running. Turned out Kafka was advertising itself as localhost:9092, which inside Docker means the container's own localhost, not an address other containers can reach. Fixed it by setting KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092 so containers find each other by service name.

2. "depends_on" doesn't mean "ready". Docker's depends_on only waits for a container to start, not for the service inside to be ready. Kafka takes 20-30 seconds to fully boot, so my stream processor was connecting before Kafka was ready and crashing. I fixed it with proper Docker healthchecks + retry logic in Python that actively probes Kafka every 5 seconds instead of blindly sleeping.

3. WSL2 dropping me into Docker's internal VM. When I typed wsl, it dropped me into Docker Desktop's internal Linux VM instead of a real Ubuntu distro. Everything was broken — no sudo, no apt, wrong environment. Had to install Ubuntu properly via wsl --install -d Ubuntu and work from there.

4. Running Python outside Docker. I spent a while confused about why my local Python couldn't reach Kafka. The fix was containerising the Python services too, so everything runs on the same Docker network. Learned the hard way that localhost means something completely different inside vs outside a container.

The most satisfying moment was watching the metrics endpoint show:
cache hit rate: 57%
kafka consumer lag: 0
cache HIT latency: 0.6ms vs cache MISS: 2.3ms

That ~4x gap between a cache hit and a miss that falls through to PostgreSQL is real and measurable — not just theory anymore.

Would apply this to an actual application next...

Link to repo: https://lnkd.in/eDt65PvS

#softwareengineering #distributedsystems #kafka #redis #python #docker #buildinpublic #devops

Alright keep scrollingggg :)
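The fix for issue 2 generalizes: instead of blindly sleeping, actively probe the dependency until it answers. A minimal retry helper in that spirit (the probe function, retry counts, and delay are illustrative; in the real service the probe would attempt an actual Kafka connection):

```python
import time

def wait_for_service(probe, retries: int = 10, delay: float = 5.0) -> bool:
    """Call probe() until it succeeds or retries are exhausted.
    probe() should return True once the service is ready."""
    for attempt in range(1, retries + 1):
        try:
            if probe():
                return True
        except Exception:
            pass  # treat connection errors the same as "not ready"
        if attempt < retries:
            time.sleep(delay)
    return False

# Example with a fake probe that becomes ready on the third attempt:
state = {"calls": 0}
def fake_kafka_probe() -> bool:
    state["calls"] += 1
    return state["calls"] >= 3

ready = wait_for_service(fake_kafka_probe, retries=5, delay=0)
```

Paired with a Docker healthcheck on the Kafka container, this makes startup ordering deterministic instead of timing-dependent.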
Multi-cloud was never a tooling problem—it was a semantics problem, and solving that is what actually unlocks true portability. If MultiCloudJ delivers on its promise, it doesn’t just reduce lock-in—it fundamentally shifts control back to developers and architecture, where it belongs.
𝙏𝙝𝙚 𝙈𝙞𝙨𝙨𝙞𝙣𝙜 𝙎𝙩𝙖𝙣𝙙𝙖𝙧𝙙 𝙛𝙤𝙧 𝙈𝙪𝙡𝙩𝙞-𝘾𝙡𝙤𝙪𝙙 𝙅𝙖𝙫𝙖 𝘿𝙚𝙫𝙚𝙡𝙤𝙥𝙢𝙚𝙣𝙩 𝙅𝙪𝙨𝙩 𝘼𝙧𝙧𝙞𝙫𝙚𝙙

For years, Java teams building for multi-cloud have faced the same painful reality: every cloud provider has its own SDK, its own semantics, and its own way of doing things. Vendor lock-in wasn't a theoretical risk — it was baked into the codebase from day one.

Sandeep Pal, an engineering leader with over a decade at Snapchat, Microsoft, and Intel, decided to fix that. He created 𝗠𝘂𝗹𝘁𝗶𝗖𝗹𝗼𝘂𝗱𝗝 — an open-source Java SDK that gives developers a single, unified interface across AWS, GCP, and Alibaba Cloud. No more ad-hoc wrappers. No more massive refactoring when a business requirement shifts.

𝗜𝗻 𝘁𝗵𝗶𝘀 𝗔𝗹𝗹𝗧𝗲𝗰𝗵 𝗠𝗮𝗴𝗮𝘇𝗶𝗻𝗲 𝗶𝗻𝘁𝗲𝗿𝘃𝗶𝗲𝘄, 𝗦𝗮𝗻𝗱𝗲𝗲𝗽 𝗯𝗿𝗲𝗮𝗸𝘀 𝗱𝗼𝘄𝗻:

🔹 𝗪𝗵𝘆 𝘁𝗵𝗲 𝗽𝗿𝗼𝗯𝗹𝗲𝗺 𝘄𝗮𝘀 𝘀𝗼 𝘀𝘁𝘂𝗯𝗯𝗼𝗿𝗻: Semantic differences between providers — like DynamoDB's strict type marshalling versus Firestore's rigid structure — meant no viable open-source abstraction existed until now.

🔹 𝗛𝗼𝘄 𝗠𝘂𝗹𝘁𝗶𝗖𝗹𝗼𝘂𝗱𝗝 𝗮𝘃𝗼𝗶𝗱𝘀 𝗽𝗮𝘀𝘁 𝗽𝗶𝘁𝗳𝗮𝗹𝗹𝘀: Instead of reinventing the wheel with raw API bindings (the approach that ultimately sank Apache jclouds), it leverages official cloud SDKs for the heavy lifting, building unified semantics on top of that stability.

🔹 𝗛𝗮𝗻𝗱𝗹𝗶𝗻𝗴 𝘁𝗵𝗲 𝗵𝗮𝗿𝗱 𝘀𝘁𝘂𝗳𝗳: Normalizing delete responses (idempotent vs. 404), pagination (LastEvaluatedKey vs. cursors), and NoSQL data types so developers get a consistent interface without writing provider-specific conditionals.

🔹 𝗧𝗲𝘀𝘁𝗶𝗻𝗴 𝗮𝗰𝗿𝗼𝘀𝘀 𝗰𝗹𝗼𝘂𝗱𝘀: Building trust through a rigorous WireMock-based conformance suite that replays real HTTP transactions, achieving 80%+ test coverage and convincing hundreds of teams to adopt.

🔹 𝗧𝗵𝗲 𝟴𝟬/𝟮𝟬 𝗿𝘂𝗹𝗲 𝗳𝗼𝗿 𝗰𝗹𝗼𝘂𝗱 𝘀𝘁𝗿𝗮𝘁𝗲𝗴𝘆: Use a unified standard for commodity services (object storage, queues, compute) to preserve portability and negotiation leverage. Go deep only on the 20% that's a true differentiator (like Gemini or SageMaker).

Sandeep's work is backed by deep open-source experience — he's contributed to Apache HBase, Phoenix, Spark, and GoCloud.dev. At Snapchat, his Spark optimizations alone saved tens of millions in infrastructure costs by rethinking data layout and eliminating retry cycles.

For any engineering team navigating #multicloud complexity, this conversation is essential.

👉 Read the full interview to see how MultiCloudJ is finally giving Java developers the standard they've been missing.

𝗟𝗶𝗻𝗸: https://lnkd.in/etevsPYr

#Java #MultiCloud #OpenSource #CloudComputing #SoftwareEngineering #DevOps #TechMagazine
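Normalizing delete responses is a good concrete example of what such an abstraction has to do: one provider raises a 404 for a missing key, another treats delete as idempotent. A language-agnostic sketch of the idea (written in Python for brevity; MultiCloudJ itself is Java, and the client and exception names here are invented for illustration, not taken from its API):

```python
class ProviderNotFoundError(Exception):
    """Stand-in for a provider SDK's 404 / not-found error."""

def unified_delete(client, key: str) -> None:
    """Delete with unified idempotent semantics: deleting a missing
    key is a no-op, regardless of what the provider does."""
    try:
        client.delete(key)
    except ProviderNotFoundError:
        pass  # normalize "404 on missing key" into idempotent success

# A fake provider that raises on missing keys (the strict style):
class FakeClient:
    def __init__(self):
        self.store = {"a": 1}
    def delete(self, key):
        if key not in self.store:
            raise ProviderNotFoundError(key)
        del self.store[key]

c = FakeClient()
unified_delete(c, "a")  # real delete
unified_delete(c, "a")  # provider raises 404, wrapper absorbs it
```

Caller code written against the wrapper never needs a provider-specific conditional, which is the whole point of the unified semantics layer.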
I recently dedicated a couple of days to building a change-data-capture pipeline from scratch on the AWS free tier. Here's a breakdown of the process.

Pipeline overview: CoinMarketCap API → Python → RDS Postgres → Debezium → Kafka → S3 (JSON)

1. A Python script calls CoinMarketCap's free-tier API and upserts the top 10 cryptocurrencies into Postgres.
2. RDS Postgres serves as the source of truth, with every INSERT/UPDATE recorded in the write-ahead log.
3. Debezium connects to the WAL via a logical replication slot, converting each row change into a CDC event and publishing it to Kafka.
4. A single-broker Kafka in KRaft mode (no Zookeeper) buffers the events.
5. The Confluent S3 Sink consumes the topic and writes the events out as JSON, one file per minute.

This entire setup runs on a single t3.micro instance with 1 GB RAM and 1 GB swap, one IAM role, and one bucket — no managed Kafka, no paid-tier services.

Key learnings:
- On RDS, the master user isn't a superuser and can't create a role WITH REPLICATION. Instead, grant the built-in rds_replication role. The documentation covers this, but the error message may lead you astray.
- Debezium's default decimal.handling.mode is precise, which emits NUMERIC columns as base64-encoded bytes in your JSON. Change it to string to avoid prices appearing as "YmFzZTY0."
- The S3 sink task reports RUNNING before attempting a PutObject. If your IAM policy lacks s3:PutObject on arn:aws:s3:::bucket/* (note the /*), the sink appears healthy until the first file rotation, when it fails. Verify PutObject permissions before trusting the task state.
- Home WiFi's public IP can rotate unexpectedly. If your EC2 security group is scoped to "my IP" and your ISP hands you a new one overnight, you're locked out until you update the SG.

What's next:
Phase 2 — add schema validation and move the infrastructure to Terraform.
Phase 3 — land the S3 data in an open table format so the bucket becomes directly queryable.

Demo video is attached — please watch and let me know your feedback. GitHub repo link is in the comments.
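The decimal.handling.mode gotcha is worth unpacking: in precise mode, Debezium serializes a NUMERIC as the big-endian, two's-complement bytes of its unscaled value, which the JSON converter then base64-encodes, with the scale carried in the schema. If you do stay on precise mode, the values can be recovered like this (a sketch; where the scale lives in your payload depends on converter settings):

```python
import base64
from decimal import Decimal

def decode_debezium_decimal(b64_value: str, scale: int) -> Decimal:
    """Decode a precise-mode Debezium decimal: base64 -> big-endian
    signed unscaled integer -> shift right by the schema's scale."""
    raw = base64.b64decode(b64_value)
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return Decimal(unscaled).scaleb(-scale)

# A NUMERIC(10,2) price of 42.50 has unscaled value 4250:
encoded = base64.b64encode((4250).to_bytes(2, "big")).decode()
price = decode_debezium_decimal(encoded, scale=2)
```

Switching the connector to decimal.handling.mode=string sidesteps all of this at the cost of exact-type fidelity, which is usually the right trade for a JSON landing zone.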
Written by Luke Thompson, MongoDB Champion, and published on Friends of OpenJDK (Foojay.io): learn how to build a Java faceted full-text search API! In the tutorial, he walks through an interesting dataset that showcases how you can effectively pair machine learning/AI-generated data with more traditional search to produce fast, cheap, repeatable, and intuitive search engines. Dive in here 👉 https://lnkd.in/gm6a2Y77 #mongodb #java #nosql #database #atlas
New blog post alert 🚨 "Serverless applications on AWS with Lambda using Java 25, API Gateway and DynamoDB – Part 6: Using GraalVM Native Image". In this article, we introduce another approach to improving the performance of the Lambda function: GraalVM Native Image. If you like my content, please follow me on GitHub (github.com/Vadym79) and give my repositories, like this one, a star: https://lnkd.in/epud2eRf Amazon Web Services (AWS) Oracle #Java #Serverless #AWS GraalVM https://lnkd.in/e-ZAAvaf
🚀 Built a local POC system that can process 100,000+ orders per second. Here's what I learned.

In financial services and high-frequency trading, peak write rates aren't gradual — they're a vertical cliff. Thousands of transactions hit simultaneously, and every millisecond of latency has consequences.

I just published a 5-part engineering deep-dive on building a horizontally sharded, fault-tolerant order pipeline using:
⚡ Redis Streams (4-shard architecture) — 600,000 ops/sec ceiling
🐘 PostgreSQL — batched writes, fully decoupled from the HTTP layer
🐍 Python (FastAPI + asyncio) — sub-millisecond producer latency
☕ Spring Boot & Quarkus — polyglot consumer implementations

𝗞𝗲𝘆 𝗶𝗻𝘀𝗶𝗴𝗵𝘁𝘀 𝗳𝗿𝗼𝗺 𝘁𝗵𝗲 𝗯𝘂𝗶𝗹𝗱:
✅ Why Redis Streams beats Kafka for low-latency booking pipelines (no operational overhead, sub-ms write latency, built-in consumer groups + PEL for at-least-once delivery)
✅ Why Python's built-in hash() is dangerous for shard routing at scale — and how SHA-1 solves it
✅ How horizontal sharding makes scaling additive: bump NUM_SHARDS, get linear throughput — zero code changes
✅ Circuit breaker patterns and graceful degradation under shard failure
✅ Batch insert tuning that turns 1,000 individual DB writes into a single efficient operation

This isn't just a side project — it's a distillation of 22 years in financial services (Lehman Brothers, Morgan Stanley, JPMorgan Chase) compressed into working, testable code.

📖 Full engineering write-up on Medium: https://lnkd.in/dc4SZu-Z
💻 Full source code on GitHub (Python + Spring Boot + Quarkus): https://lnkd.in/dwPTn9Qh

🙏 Special credit to Claude (Anthropic) — my AI pair programmer throughout this build. Claude helped architect the sharding logic, debug race conditions, and sharpen the engineering narrative. Human expertise + AI collaboration = faster, better systems.

💡 As Einstein said: "Everything should be made as simple as possible, but not simpler." That principle guided every design decision here — strip away what you don't need, keep exactly what you do.

#SystemDesign #HighThroughput #RedisStreams #PostgreSQL #SpringBoot #Python #Quarkus #FinTech #DistributedSystems #SoftwareEngineering #GenAI #ClaudeAI #BackendEngineering
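The hash() point deserves emphasis: Python randomizes string hashing per process (PYTHONHASHSEED), so hash(key) % NUM_SHARDS can route the same order key to a different shard after every restart, silently splitting one order's events across streams. A stable digest fixes that. A minimal sketch of the routing idea described above (the key and stream naming are illustrative):

```python
import hashlib

NUM_SHARDS = 4  # matches the 4-shard Redis Streams layout in the post

def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Stable shard routing: SHA-1 yields the same shard for the
    same key in every process, on every host, after every restart."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Route an order to its stream; producers and consumers agree forever:
stream = f"orders:shard:{shard_for('order-1001')}"
```

Scaling then really is additive in the steady state: raising num_shards changes only which bucket each key lands in, not the routing code itself (though existing keys do remap, which matters if in-flight entries must stay on their original shard).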
Built a full-stack async document processing system in a day. Here's what's under the hood 👇

DocFlow: an async document workflow engine where uploaded files go through a real multi-stage background processing pipeline — not a fake synchronous trick.

The architecture:
→ Next.js 14 + TypeScript frontend with a full dashboard: upload, live progress tracking, review/edit extracted fields, finalize & export
→ FastAPI (Python) backend with a clean 3-layer architecture: API routes → service layer → worker layer. Zero business logic in route handlers
→ Celery workers handle all processing in the background, completely outside the request-response cycle
→ Redis does double duty: Celery message broker AND Pub/Sub channel for streaming 7 live progress events per job (job_queued → job_started → parsing_started → parsing_completed → extraction_started → extraction_completed → job_completed)
→ PostgreSQL + SQLAlchemy + Alembic for persistent job/document state with proper migrations
→ Server-Sent Events (SSE) to stream Redis Pub/Sub events to the frontend in real time
→ JWT auth, a file storage abstraction layer (local storage today, S3-ready interface), idempotent retry handling, cancellation support, and export as JSON/CSV
→ Docker Compose: one command spins up all 5 services (API, worker, frontend, PostgreSQL, Redis)
→ Pytest test suite covering the API, service layer, and workers

For deployment I went through Docker on Render, Railway, and Vercel — the peer dependency conflicts between TanStack/react-query v5 and Radix UI packages in the Node build made it a nightmare. The app runs perfectly locally via Docker Compose; the production cloud deploy was the only thing that didn't land in time.

Both DocFlow and yesterday's project were built as 1-day take-home assessments given by Predusk Technology Private Limited. Shipping production-grade systems under real time pressure is a different kind of engineering test.

GitHub: https://lnkd.in/daAkhhKG

#fullstack #python #fastapi #nextjs #celery #redis #docker #asyncprogramming #typescript #postgresql #buildinpublic #webdevelopment
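The SSE leg of a pipeline like that is mostly string framing: each Redis Pub/Sub message gets wrapped in text/event-stream format before being written to the open response. A minimal sketch (the event name follows the seven stages listed above; the helper and payload fields are illustrative, not DocFlow's actual code):

```python
import json

def format_sse(event: str, data: dict) -> str:
    """Frame one Server-Sent Event: an 'event:' line, a 'data:' line
    carrying the JSON payload, and a blank line ending the event."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

# What the frontend's EventSource would receive for one progress update:
msg = format_sse("parsing_completed", {"job_id": "j-1", "progress": 40})
```

In FastAPI this string is typically yielded from an async generator wrapped in a streaming response, with the generator subscribed to the job's Redis Pub/Sub channel.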
🚀 Backend Learning | Caching Patterns for High-Performance Systems

While working on backend systems, I recently explored the different caching strategies used to improve performance and scalability.

🔹 The problem:
• Frequent database hits increasing latency
• High load under traffic
• Need for faster response times

🔹 What I learned:
• Cache-Aside (Lazy Loading): load data into the cache on demand
• Write-Through: write to the cache and the DB simultaneously
• Write-Back (Write-Behind): write to the cache first; the DB is updated later

🔹 Key insights:
• Cache-Aside → simple and widely used
• Write-Through → strong consistency
• Write-Back → high performance, but complex

🔹 Outcome:
• Reduced database load
• Faster API responses
• Better system performance

Caching is not just about storing data — it's about choosing the right strategy. 🚀

#Java #SpringBoot #Redis #SystemDesign #BackendDevelopment #Caching #LearningInPublic
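Of the three strategies, cache-aside is the one most teams reach for first, so it is worth seeing in miniature. A minimal sketch with a dict standing in for Redis and a counting stub standing in for the database (all names are illustrative):

```python
cache: dict = {}          # stands in for Redis
db_calls = {"count": 0}   # instrumented fake database

def query_db(key: str) -> str:
    """Pretend DB lookup; the counter lets us observe load reduction."""
    db_calls["count"] += 1
    return f"row-for-{key}"

def get_user(key: str) -> str:
    """Cache-aside: try the cache, fall back to the DB on a miss,
    then populate the cache so the next reader gets a hit."""
    if key in cache:
        return cache[key]      # hit: no DB round-trip
    value = query_db(key)      # miss: load from the DB on demand
    cache[key] = value         # lazily populate the cache
    return value

a = get_user("42")   # miss, hits the DB once
b = get_user("42")   # hit, served from the cache
```

The same skeleton gains an expiry (TTL) and explicit invalidation on writes in real systems; without those, cache-aside serves stale data after updates, which is exactly the trade-off write-through exists to avoid.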
MongoDB Atlas offers a powerful document model, enabling you to store data as JSON-like objects that closely resemble your application code. Read more 👉 https://lttr.ai/Ap3jo #Java #NoSQL #MongoDB
𝗦𝗽𝗿𝗶𝗻𝗴 𝗶𝘀 𝗲𝘃𝗼𝗹𝘃𝗶𝗻𝗴 𝗮𝗰𝗿𝗼𝘀𝘀 𝗮𝗹𝗹 𝗸𝗲𝘆 𝗮𝗿𝗲𝗮𝘀 𝗼𝗳 𝗯𝗮𝗰𝗸𝗲𝗻𝗱 𝗱𝗲𝘃𝗲𝗹𝗼𝗽𝗺𝗲𝗻𝘁.

This week I read an update from InfoQ about the latest Spring ecosystem releases — and what stood out is how many areas are evolving at the same time. Here are some highlights:

🔹 Spring Boot → AMQP 1.0 support + MongoDB batch integration
🔹 Spring Data → improved Redis features and bulk operations in MongoDB
🔹 Spring Security → new authorization features + a critical vulnerability fix
🔹 Spring Integration → better support for cloud events and messaging
🔹 Spring for Apache Kafka → improved acknowledgment handling and error strategies
🔹 Spring AMQP → stronger messaging support with AMQP 1.0
🔹 Spring AI → more flexible configuration for AI integrations
🔹 Spring Vault → simpler management of secrets and certificates

👉 Key takeaway: Java is not evolving in isolation. It's advancing across security, data, messaging, integration, and AI — all at once.

From a backend perspective, this reinforces how important it is to understand not just frameworks but the full landscape of modern systems: event-driven architectures, secure applications, and data flows.

💬 Curious — which of these areas is having the biggest impact on your projects?

#Java #SpringBoot #Spring #BackendDevelopment #Microservices #Kafka #Security #Data #Cloud #DevOps #SoftwareArchitecture

https://lnkd.in/ervTw5yN