Java teams shouldn't need Python to build AI.

So for the last 18 months, Java teams have been duct-taping LLM SDKs into their stack. I decided to build what Java teams actually need.

Meet Nautilus — a Java-native AI Gateway, now open source on GitHub. One API across Claude, OpenAI, Llama, Mistral, and more. Built with Spring Boot 3 and Java 21.

No Python sidecars. No reverse-proxy hacks. No "let's switch stacks for this one service."

What's inside:
→ Smart routing across providers (priority, round-robin, random, cost-aware)
→ Automatic fallback on rate limits, timeouts, and 5xx errors
→ Semantic caching (pgvector + Redis hot tier) — same prompt, no second bill
→ Per-key rate limiting + cost tracking in tokens and USD
→ Cross-cutting concerns done right: log redaction, audit trails, validation
→ Observability you already use — Micrometer, Prometheus, OpenTelemetry

We're one week away from v0.1. The core foundation is ready: provider SPI, routing, fallback, and OpenAI-compatible API.

Next up:
→ Native Claude + OpenAI adapters
→ Streaming support
→ Spring Boot starter

If you're building on the JVM and waiting for first-class AI infrastructure in Java — this is for you.

⭐ Stars and early feedback will help us shape what ships first. Repo link in the first comment ↓

#Java #SpringBoot #AI #LLM #OpenSource #DeveloperTools #Github
Java Native AI Gateway for Claude, OpenAI, Llama, and More
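For a sense of what "smart routing" means in practice, here is a minimal plain-Java sketch of round-robin and cost-aware provider selection. The `Provider` record, prices, and method names are assumptions for illustration, not Nautilus's actual SPI:

```java
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: a tiny provider router. Names and prices are made up
// for this sketch; the real gateway's SPI will differ.
public class RouterSketch {
    record Provider(String name, int priority, double costPer1kTokens) {}

    static final List<Provider> PROVIDERS = List.of(
            new Provider("claude", 1, 0.015),
            new Provider("openai", 2, 0.010),
            new Provider("mistral", 3, 0.002));

    private static final AtomicInteger counter = new AtomicInteger();

    // Round-robin: rotate through providers on every request.
    static Provider roundRobin() {
        return PROVIDERS.get(Math.floorMod(counter.getAndIncrement(), PROVIDERS.size()));
    }

    // Cost-aware: always pick the cheapest provider for the next call.
    static Provider cheapest() {
        return PROVIDERS.stream()
                .min(Comparator.comparingDouble(Provider::costPer1kTokens))
                .orElseThrow();
    }

    public static void main(String[] args) {
        System.out.println(roundRobin().name()); // first call lands on "claude"
        System.out.println(roundRobin().name()); // then "openai"
        System.out.println(cheapest().name());   // "mistral" is cheapest here
    }
}
```

A priority strategy falls out the same way: sort by the priority field and take the first healthy provider; fallback is just retrying with the next one in that order.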
---
I've been building RAG pipelines in Java for 12 months. Here's everything I wish I had on Day 1 — condensed into a free 10-page PDF guide. 📄

What's inside:
✅ RAG architecture explained (ingestion + query phases)
✅ Spring Boot + LangChain4j full project setup
✅ Document chunking strategy (the #1 thing most guides skip)
✅ pgvector integration with an HNSW index
✅ LLM augmentation with Claude — grounded, citation-aware answers
✅ Production patterns: Hybrid Search, Re-ranking, Metadata Filtering
✅ Complete REST API — curl examples included

---

The hardest part of RAG isn't the LLM call. It's the retrieval layer.

Most developers spend 90% of their time on prompt engineering and 10% on retrieval quality — and then wonder why their RAG chatbot hallucinates.

The truth: garbage in, garbage out. Poor chunking → irrelevant chunks → the LLM makes stuff up.

What actually moves the needle:
→ Recursive chunking (paragraph → sentence → word fallback)
→ 10–15% overlap between chunks to avoid cutting mid-thought
→ A minScore threshold (≥ 0.70) to discard low-relevance matches
→ Metadata filtering for multi-tenant safety
→ Hybrid Search: vector + BM25 + RRF fusion

Once you fix retrieval, LLM answer quality jumps dramatically.

---

Stack used in this guide:
→ Java 21 + Spring Boot 3.2
→ LangChain4j 0.31
→ Claude Sonnet (Anthropic) — 200K context window
→ pgvector on PostgreSQL — zero extra infra
→ OpenAI text-embedding-3-small

---

12 years of Java backend engineering + 12 months of GenAI production work. This guide is the bridge between the two.

Grab the PDF above. Build something real.

---

♻️ Repost if this helps your team.
🔔 Follow @chandantechie for weekly Java + AI deep-dives.

#Java #SpringBoot #RAG #GenAI #LLM #LangChain4j #AIEngineering #SoftwareEngineering #BackendDevelopment #VectorDatabase
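The chunking advice above (split on natural boundaries, keep 10–15% overlap) can be sketched in plain Java. This is illustrative only: a real recursive splitter also tries sentence and word boundaries, and in the guide's stack LangChain4j's `DocumentSplitters.recursive(maxSize, overlap)` does this for you.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative chunker: keep whole paragraphs when they fit; otherwise fall
// back to a sliding character window with overlap, so the next chunk carries
// the tail of the previous one and no thought is cut off without context.
public class RecursiveChunker {
    static List<String> chunk(String text, int maxLen, int overlap) {
        List<String> chunks = new ArrayList<>();
        for (String para : text.split("\n\n")) {
            if (para.length() <= maxLen) {
                chunks.add(para);
                continue;
            }
            int step = maxLen - overlap; // e.g. 500-char chunks, 50-char overlap
            for (int i = 0; i < para.length(); i += step) {
                chunks.add(para.substring(i, Math.min(i + maxLen, para.length())));
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> c = chunk("a".repeat(100), 50, 5);
        System.out.println(c.size());          // 3 chunks
        System.out.println(c.get(0).length()); // 50
        System.out.println(c.get(2).length()); // 10 (the tail)
    }
}
```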
---
The multi-agent landscape in early 2025 looked like this: LangChain. CrewAI. AutoGen. OpenAI Swarm. All Python. All single-user. All missing what enterprises actually need to ship agents to production.

Meanwhile, Java runs ~90% of enterprise backends. Spring Boot is the de facto standard. But if you wanted to orchestrate AI agents in the JVM ecosystem, your options were essentially "write it yourself."

We built SwarmAI to close that gap. Today we're open-sourcing it.

→ Built on Spring Boot 3.4 + Spring AI 1.0.4 GA
→ 8 orchestration patterns (Sequential, Parallel, Hierarchical, Iterative, Self-Improving, Swarm, Distributed, Composite)
→ Sealed Build → Compile → Execute lifecycle — catches errors before tokens get spent
→ Multi-tenancy, governance gates, budget enforcement, audit trails — architectural, not bolted on
→ RL policy engines (LinUCB contextual bandits, DQN with experience replay) for skill generation and stopping decisions, 4–12x better than Monte Carlo baselines
→ 38 built-in tools, MCP adapter, RAFT consensus for multi-node coordination
→ 1,400+ tests passing. Apache 2.0 for core.

Full write-up with code, architecture, and benchmark methodology: https://lnkd.in/eWmmY2qq

We'd love hard questions, issues, and PRs.

#Java #SpringBoot #AIAgents #MultiAgent #OpenSource
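To give a flavor of the simplest of those patterns, Sequential orchestration reduces to "each agent consumes the previous agent's output." A toy sketch in plain Java follows; the names are hypothetical and SwarmAI's real API is far richer (real agents would call an LLM at each step):

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical sketch of the Sequential pattern: each agent transforms a
// shared context string and hands it to the next agent in the pipeline.
public class SequentialSketch {
    static String runSequential(List<UnaryOperator<String>> agents, String input) {
        String context = input;
        for (UnaryOperator<String> agent : agents) {
            context = agent.apply(context); // each step sees the previous output
        }
        return context;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> pipeline = List.of(
                s -> s + " -> researched",
                s -> s + " -> drafted",
                s -> s + " -> reviewed");
        System.out.println(runSequential(pipeline, "task"));
        // task -> researched -> drafted -> reviewed
    }
}
```

Parallel and Hierarchical variants change only the wiring: fan agents out and merge their results, or let a supervisor agent decide which sub-agents to invoke.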
---
The End of the Language Wars: Why Java and Python Are Better Together 🚀

Scalability or agility? If you work with large-scale systems, you've likely faced this dilemma. While Java is the "safe harbor" for infrastructure, Python is the engine for rapid innovation. Instead of picking a side, here is how I view the evolution of modern architecture:

1. The Backend as a Fortress (Java)
Java remains unbeatable for managing complex business rules, concurrency, and security. It's where we ensure that authentication (JWT) is foolproof and that data persistence — in databases like PostgreSQL — is performant and scalable.

2. Intelligence as a Competitive Edge (Python)
For data processing, automation, and AI models, the Python ecosystem is simply more productive. Integrating a quantitative trading bot or a sentiment-analysis engine into a Java environment isn't a "workaround" — it's a strategic advantage.

3. Interoperability Is Key
Whether using isolated microservices, gRPC for low latency, or the polyglot capabilities of GraalVM, integration today is seamless. Data flows from Java, gets processed by the "magic" of Python libraries, and returns to a modern frontend (like Next.js) transparently.

The result? A robust system that doesn't sacrifice the speed of delivering new features.

The question is: How polyglot is your stack today? Do you prefer the safety of a single ecosystem or the versatility of a hybrid architecture?

#Java #Python #SoftwareArchitecture #WebDev #Backend #Scalability
---
kreuzberg-txtai is live 🎉

Our TxtAI (by NeuML) integration for Kreuzberg gives TxtAI developers a drop-in replacement for Textractor, swapping Apache Tika and its Java runtime for Kreuzberg's Rust-based extraction engine. You don't need to change your current pipeline interface to use it.

Add it to your TxtAI workflow, and you'll get:

* Support for many formats — PDF, DOCX, PPTX, HTML, images, and more — all in one package.
* No need for Java. Tika requires a Java Virtual Machine (JVM), which adds extra runtime overhead in every environment and container. Kreuzberg uses compiled Rust binaries instead, so all you need is a pip install.
* Consistent metadata for every document, including title, MIME type, page count, and source path. The format stays the same no matter the file type, so you don't have to search through different output fields.
* Flexible OCR options: choose from Tesseract, EasyOCR, or PaddleOCR, depending on your language and accuracy needs.
* Your pick of output format: plain text, Markdown, HTML, or Djot.

It's designed to be minimal. This thin connector fits into your existing TxtAI pipeline without adding complexity.

This is the newest part of Kreuzberg's ecosystem of AI framework integrations, joining SurrealDB, LangChain, LlamaIndex, CrewAI, Haystack AI Framework, and Spring AI. It's MIT licensed, and you'll need Python 3.10+.

GitHub: https://lnkd.in/dvdZVy63

Let us know what you think and connect with the team on our Discord server: https://lnkd.in/djjurDdE
---
I wrote 5,000 words arguing you should stop re-deriving knowledge you already have. Then I had to go build the thing to prove I wasn't just talking.

Earlier this month, Andrej Karpathy dropped a 40-line Markdown gist sketching the idea of an LLM-powered "knowledge compiler" — a system that turns raw sources into a cross-linked, queryable wiki that compounds every time you feed it something new. (April 4, 2026. The internet is still catching up.)

Beautiful idea. I closed the tab and wrote an article about it, but the itch was persistent. Having built knowledge bases before, I wanted a working version of Karpathy's knowledge compiler. So I built one.

Here's what I learned:

→ The part that surprised me most wasn't the ingestion. It was the synthesis pages — pages the LLM wrote that I never asked for. Two sources arguing the same question from different angles? A synthesis page appears. The insight buried in the contrast surfaces automatically. That's the compound interest. Not a metaphor. An artifact.

→ RAG cannot do this. RAG re-derives at query time and discards the derivation. Compilation deposits it into the wiki. Once. Permanently. Every future question builds on it.

→ The most underrated use case: legacy code. Point the compiler at a COBOL system whose architects retired years ago. It files entity pages, concept pages, cross-links, and architectural synthesis — the document the original team would have written if they'd had unlimited patience. The graph you didn't have.

The repo is on GitHub. The wiki it produced — built from the newsletter's own raw material — is live and walkable right now. The argument was the newsletter. The artifact is the proof.

Newsletter: https://lnkd.in/eh4tPHmp
GitHub repo: https://lnkd.in/exUaPecT
Wiki: https://lnkd.in/eKkpbwBz
Medium blog: https://lnkd.in/ejSttU-9

#KnowledgeCompilation #LLMWiki #AgenticAI #Dogfooding #OpenSource #LegacyModernization #KnowledgeManagement #SignalOverNoise
---
🚀 Exciting update: solving the latest Spring Boot bugs with scenario-based AI contexts!

If you are building AI-driven applications in Java, managing runtime contexts effectively is crucial. I've just pushed a significant update to my open-source project: Scenario-Based Runtime Context for AI.

In the latest update, I've added detailed descriptions and concrete evidence demonstrating how this framework resolves some tricky bugs encountered in the latest Spring Boot releases. 🐛🔨 By taking a scenario-based approach, we can handle dynamic AI runtime contexts much more smoothly, without clashing with Spring Boot's latest lifecycle or dependency-injection changes.

Curious to see how it works? Check out the updated repository and the new use cases here:
🔗 https://lnkd.in/gUnnVG5y

I'd love to hear your thoughts! If you find it helpful, a ⭐️ on GitHub is always appreciated. Feedback and PRs are welcome!

#SpringBoot #Java #AI #OpenSource #GitHub #SoftwareEngineering #DeveloperTools
---
Your LLM endpoint returns a beautiful paragraph. Your frontend dev pings you:

> "What field is the price in?"
> "It's… in the sentence." 😬

That's the moment you reach for structured output.

Just published Post 6 of my Spring AI RAG series — Structured Output: Turning LLM Prose into Typed Java Records.

The whole API is one method call: swap .content() for .entity(FaqEntry.class) and you get a typed Java record instead of a String. Same ChatClient, same prompt, same RAG. One line different.

And it's not magic: Spring AI generates a JSON schema from your record, appends it to your prompt, and parses the response with Jackson.

A few things I had to learn the hard way:
→ Use String for almost everything (enums and dates will bite you)
→ Keep records flat and short — long schemas = more LLM mistakes
→ Field names are part of the prompt: releaseDate beats dt
→ Always wrap with retry + validation — the LLM will eventually return invalid JSON

Use it when the next consumer is code, not a human.

Read the full post: https://lnkd.in/dxwJRJhU

#SpringAI #RAG #Java #SpringBoot #LLM #AIEngineering #StructuredOutput
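The schema-generation step can be approximated with plain JDK reflection: a record's components become the field hints the model is asked to fill. A toy sketch (Spring AI's real converter does far more, handling nested types and strict JSON Schema, and the FaqEntry fields here are illustrative):

```java
import java.lang.reflect.RecordComponent;
import java.util.StringJoiner;

// Toy version of the "record -> schema appended to the prompt" step.
public class SchemaSketch {
    record FaqEntry(String question, String answer, String releaseDate) {}

    static String schemaFor(Class<? extends Record> type) {
        StringJoiner props = new StringJoiner(", ", "{", "}");
        for (RecordComponent c : type.getRecordComponents()) {
            // Field names double as prompt hints: "releaseDate" beats "dt".
            props.add("\"" + c.getName() + "\": \"" + c.getType().getSimpleName() + "\"");
        }
        return props.toString();
    }

    public static void main(String[] args) {
        System.out.println(schemaFor(FaqEntry.class));
        // {"question": "String", "answer": "String", "releaseDate": "String"}
    }
}
```

This also makes the "flat and short" advice concrete: every component you add grows the schema text in the prompt, and more schema means more ways for the model to get the JSON wrong.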
---
Everyone's talking about AI. Most Java devs don't know Spring now ships with it.

Spring AI + LangChain4j let you build LLM-powered features using the same patterns you already know. Here's a basic RAG (Retrieval-Augmented Generation) flow in Spring Boot:

1. Load your documents → chunk them → store embeddings in PGVector or Redis
2. On user query → vectorize the query → retrieve the top-k relevant chunks
3. Inject the chunks into the LLM prompt → get a grounded, factual response

Real use case: an internal documentation chatbot that answers questions using your own codebase docs. Built one in ~2 days.

What you need: the Spring AI dependency, an OpenAI or AWS Bedrock key, and a vector DB.

AI in Java is no longer experimental. It's production-ready.

Are you building anything AI-powered in your Java stack? Drop it below.

#SpringAI #Java #LLM #RAG #BackendDevelopment #AWS
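Step 2 of that flow, "retrieve the top-k relevant chunks," is just nearest-neighbor search over embeddings. A self-contained toy version with two-dimensional vectors shows the idea; a real app delegates this to the vector store (PGVector, Redis) via Spring AI's retrieval APIs:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Toy retrieval: rank stored chunks by cosine similarity to the query vector.
// Real embeddings have hundreds of dimensions; the vector DB does this search.
public class TopKSketch {
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    static List<String> topK(Map<String, double[]> store, double[] query, int k) {
        return store.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> -cosine(e.getValue(), query)))
                .limit(k)
                .map(Map.Entry::getKey)
                .toList();
    }

    public static void main(String[] args) {
        Map<String, double[]> store = Map.of(
                "deploy-docs", new double[]{0.9, 0.1},
                "billing-docs", new double[]{0.1, 0.9});
        System.out.println(topK(store, new double[]{1.0, 0.0}, 1)); // [deploy-docs]
    }
}
```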
---
💡 Hot take: Java isn't legacy — it's leading the AI backend race.

While everyone talks Python for AI, the JVM quietly shipped 👇

☕ Java 26 with the Vector API, Structured Concurrency & AOT caching — built for AI workloads
🌱 Spring AI hitting production maturity
🔗 LangChain4j + Google ADK for Java 1.0 now GA
🤖 Keycloak adding MCP (Model Context Protocol) support

The pattern is clear: enterprises aren't rewriting 20 years of Java systems in Python. They're bringing AI to the JVM.

If you're a Java dev, you're not behind the curve — you're exactly where the next wave is landing.

Which side are you on — "Python for AI" or "JVM all the way"? 👇

#Java #SpringAI #JVM #AI #BackendEngineering
---
Built something interesting this week — an AI Code Reviewer 👨‍💻

It takes code as input and gives structured feedback like:
• Bugs and edge cases
• Security concerns
• Performance issues
• Suggestions for improvement

I wanted to go beyond just "generate output" and actually make it useful, so I added:
– A clean editor interface
– Review history (stored using JPA)
– Structured analysis instead of plain text

Tech used: Java, Spring Boot, MySQL, Gemini API

It was fun figuring out how to structure the AI response properly and make the UI feel like an actual tool instead of a demo.

🔗 GitHub: https://lnkd.in/dce_6kPG

#Java #SpringBoot #BackendDevelopment #AI #Projects #LearningInPublic

Here's a quick demo 👇
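For the "structured analysis instead of plain text" part, the usual trick is to have the model fill a typed shape rather than write prose, so the UI can render per-category sections. A hypothetical sketch of what such a shape might look like (field names are illustrative, not this project's actual model):

```java
import java.util.List;

// Hypothetical response shape for structured review feedback. In practice the
// model's JSON would be parsed into a record like this before rendering.
public class ReviewSketch {
    record ReviewFeedback(List<String> bugs, List<String> security,
                          List<String> performance, List<String> suggestions) {
        int totalFindings() {
            return bugs.size() + security.size() + performance.size() + suggestions.size();
        }
    }

    public static void main(String[] args) {
        ReviewFeedback feedback = new ReviewFeedback(
                List.of("off-by-one in loop bound"),
                List.of("SQL built by string concatenation"),
                List.of(),
                List.of("extract a method for readability"));
        System.out.println(feedback.totalFindings()); // 3
    }
}
```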
---
Repo: https://github.com/kloudOcean/nautilus