Inference4j 0.9.0 Released: Summarization, Translation, Text2SQL Support

Summarization, translation, grammar correction, text-to-SQL — all running locally on the JVM with no Python, no microservices, no GPU required.

    try (var summarizer = BartSummarizer.distilBartCnn().build()) {
        String summary = summarizer.summarize(longArticle);
        System.out.println(summary);
    }

What's new in 0.9.0:

- Seq2seq inference engine with KV cache — the foundation for all encoder-decoder models
- 5 new task wrappers: FlanT5, BART summarizer, MarianMT translator, CoEdit grammar corrector, T5 text-to-SQL
- Unigram (SentencePiece) tokenizer with Viterbi decoding
- ByteBuffer pooling for lower GC pressure
- Download progress tracking

The text-to-SQL wrapper lets developers write a question in natural language and translate it into a real SQL query against their database. I think this one may prove especially popular with Java devs.

That brings inference4j to 19 task wrappers supporting 30+ models across vision, audio, NLP, and multimodal — all through type-safe, builder-pattern APIs that feel like writing normal Java.

The goal hasn't changed: make on-device AI inference a first-class citizen in the Java ecosystem. No ONNX tensors, no JNI juggling — just pick a model, call a domain method.

What's coming next:

- 0.10.0 — Named Entity Recognition + more embedding models
- 0.11.0 — TikToken tokenizer, unlocking Llama 3.2 and other modern LLMs
- 0.12.0 — Text-to-speech pipeline via Piper — local voice synthesis on the JVM

Docs: https://lnkd.in/e63MXwQN
GitHub: https://lnkd.in/eEUESRkq

#Java #AI #MachineLearning #ONNX #OpenSource
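For a feel of how the text-to-SQL wrapper might look, here is a minimal sketch in the same try-with-resources builder style as the summarizer snippet above. The class and method names (T5TextToSql, toSql) and the schema-string parameter are my assumptions for illustration, not the library's confirmed API — check the docs for the real entry points.

    // Hypothetical sketch: names below are assumed, not the confirmed inference4j API.
    try (var text2sql = T5TextToSql.defaults().build()) {
        // A typical text-to-SQL call pairs a natural-language question
        // with the table schema the model should target.
        String sql = text2sql.toSql(
            "How many customers signed up last month?",
            "CREATE TABLE customers (id INT, signup_date DATE)");
        System.out.println(sql);
    }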

