"OpenAI's gpt-oss Java port for CPU inference"

🚀 Launching GPT-OSS Java: Pure Java LLM Inference in ~1000 Lines

Excited to share my latest open-source project: a complete Java port of OpenAI's gpt-oss inference engine running on CPU, now available at https://lnkd.in/gzCXk-pH!

🎯 Key features:
• 📚 Educational - clean, readable code for understanding LLM transformer internals
• 🏗️ Complete gpt-oss architecture - full implementation of the MoE transformer with GQA, sliding-window attention, RoPE, and SwiGLU
• 💻 CPU inference - no GPU required; designed for consumer-grade commodity hardware on local machines or cloud compute instances
• 🧠 Memory efficient - runs gpt-oss-20b models on CPU with just 16 GB of RAM
• ⚡ Performance optimized - supports a KV cache and exploits the modern JDK GC/JIT, parallel processing, the SIMD Vector API, and fused operations
• 🔢 MXFP4 dequantization - handles the original MXFP4-quantized MoE weights

📊 Performance highlights:
• ~11 tokens/sec on an Apple M3 Pro (12 CPUs, 36 GB)
• ~10 tokens/sec on an AWS EC2 m5.4xlarge (8 physical cores, 16 vCPUs, 64 GB)

Inspired by llama.cpp and llama2.c, this project demonstrates that Java can achieve impressive performance for LLM inference when properly optimized.

Check it out: https://lnkd.in/gzCXk-pH
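To give a flavor of the "SIMD Vector API" bullet: the heart of CPU inference is dense dot products, and the JDK's incubating Vector API lets Java express them as explicit SIMD. Below is a minimal sketch (class and method names are illustrative, not the project's actual code); it requires running with `--add-modules jdk.incubator.vector`.

```java
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

// Illustrative SIMD dot product, the core kernel of matmul-heavy LLM inference.
public class VectorDot {
    // Widest vector shape the current CPU supports (e.g. 8 floats on AVX2).
    static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    static float dot(float[] a, float[] b) {
        FloatVector acc = FloatVector.zero(SPECIES);
        int i = 0;
        int upper = SPECIES.loopBound(a.length);
        // Main SIMD loop: fused multiply-add across full vector lanes.
        for (; i < upper; i += SPECIES.length()) {
            FloatVector va = FloatVector.fromArray(SPECIES, a, i);
            FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
            acc = va.fma(vb, acc); // acc += a[i..] * b[i..], lane-wise
        }
        float sum = acc.reduceLanes(VectorOperators.ADD);
        // Scalar tail for lengths not divisible by the lane count.
        for (; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }
}
```

The same pattern (vector accumulator, `loopBound`, scalar tail) generalizes to the fused matmul/attention kernels the post alludes to.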
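On the MXFP4 bullet: per the OCP Microscaling (MX) spec, MXFP4 stores weights in blocks of 32 FP4 (E2M1) values sharing one E8M0 scale byte, so dequantization is a table lookup times a power-of-two scale. Here is a hypothetical sketch of one block's dequantization (names and the low-nibble-first packing order are assumptions, not the project's actual API):

```java
// Illustrative MXFP4 block dequantization (E2M1 values + shared E8M0 scale).
public class Mxfp4 {
    // The 8 non-negative E2M1 magnitudes; bit 3 of each 4-bit code is the sign.
    private static final float[] FP4_LUT = {
        0.0f, 0.5f, 1.0f, 1.5f, 2.0f, 3.0f, 4.0f, 6.0f
    };

    /**
     * Dequantizes one 32-element block.
     * @param packed    16 bytes, two 4-bit codes per byte (low nibble first; assumed order)
     * @param scaleE8M0 shared scale byte; scale = 2^(scaleE8M0 - 127)
     *                  (the spec's NaN encoding at e=255 is omitted in this sketch)
     * @param out       destination array; 32 floats written starting at offset
     */
    static void dequantBlock(byte[] packed, int scaleE8M0, float[] out, int offset) {
        float scale = Math.scalb(1.0f, scaleE8M0 - 127);
        for (int i = 0; i < 16; i++) {
            int b = packed[i] & 0xFF;
            out[offset + 2 * i]     = decode(b & 0x0F) * scale;
            out[offset + 2 * i + 1] = decode((b >> 4) & 0x0F) * scale;
        }
    }

    private static float decode(int code) {
        float mag = FP4_LUT[code & 0x7];
        return (code & 0x8) != 0 ? -mag : mag;
    }
}
```

Because each block needs only one `scalb` and a lookup per element, dequantization stays cheap enough to run on the fly during CPU matmuls.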

Nice! Just curious, why Java?

