100k Requests with Java Virtual Threads

The Concurrency Conundrum: Why 100k Requests Break Traditional Java

For decades, Java has relied on the operating system's (OS) native threads, often called platform threads, to handle concurrent operations. When your web server receives a request, it typically dedicates one platform thread to process that request. This "thread-per-request" model is straightforward to program; you write blocking code as if it were synchronous, and the OS handles the context switching.

The Hidden Cost of Platform Threads

While simple, this model has a significant drawback: platform threads are expensive. Each platform thread reserves a substantial amount of memory (often 1–2 MB for its stack alone, depending on the platform and the -Xss setting), and switching between them involves a context switch at the OS level, which costs CPU time on every switch.

Imagine trying to handle 100,000 concurrent requests with this model. You'd need 100,000 platform threads. This would quickly lead to:

  • Memory Exhaustion: At 1–2 MB per stack, 100,000 threads would reserve on the order of 100–200 GB for thread stacks alone, which is simply unfeasible for most servers.
  • CPU Overload: The OS would spend an inordinate amount of time context switching between threads, leaving little CPU power for actual business logic.
  • Degraded Performance: Even if your system didn't crash, throughput would plummet as overhead dominates.
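The memory figure above is simple multiplication, and it can be sketched in a few lines. This is a back-of-the-envelope estimate only: the class and method names are made up for illustration, and the 1 MiB and 2 MiB stack sizes are assumptions taken from the range quoted earlier. Stack space is reserved rather than fully committed up front, so real resident memory is usually lower.

```java
public class StackMemoryEstimate {

    /** Total stack reservation in bytes for a thread count and a per-thread stack size. */
    static long stackBytes(long threads, long stackSizeBytes) {
        return Math.multiplyExact(threads, stackSizeBytes);
    }

    public static void main(String[] args) {
        long oneMiB = 1024L * 1024L;
        long threads = 100_000;
        // Assumed stack sizes: 1 MiB and 2 MiB per platform thread
        for (long stackMiB : new long[] {1, 2}) {
            double gib = stackBytes(threads, stackMiB * oneMiB) / (1024.0 * 1024.0 * 1024.0);
            System.out.printf("%d threads x %d MiB stack = %.1f GiB%n", threads, stackMiB, gib);
        }
    }
}
```

Under these assumptions the reservation comes out at just under 100 GiB for 1 MiB stacks and just under 200 GiB for 2 MiB stacks.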

To mitigate this, developers often resort to complex asynchronous programming models (like reactive programming with Project Reactor or RxJava) or event-driven architectures. While powerful, these approaches introduce significant complexity, making code harder to read, debug, and maintain. You swap the simplicity of blocking code for the challenges of non-blocking callbacks and reactive streams.

Enter Java Virtual Threads

This is where Java Virtual Threads, introduced as a preview feature in Java 19 (Project Loom) and made permanent in Java 21, revolutionize concurrency. They offer a new paradigm that combines the programming simplicity of the "thread-per-request" model with the scalability of asynchronous designs. Essentially, they allow you to write simple, blocking code that can still handle an enormous number of concurrent operations without the overhead of traditional platform threads.

Demystifying Java Virtual Threads: A Game Changer

At their core, Java Virtual Threads are lightweight, user-mode threads managed entirely by the Java Virtual Machine (JVM), not the operating system. Think of them as "fibers" or "green threads" — they are cheap to create, cheap to block, and cheap to discard.

How Do They Work Their Magic?

The JVM maps a large number of virtual threads onto a small pool of underlying platform threads, known as carrier threads. When a virtual thread executes a blocking operation (like waiting for I/O, a network call, or a database query), the JVM unmounts it from its current carrier thread. The carrier thread then becomes free to mount and execute another virtual thread. Once the blocking operation completes, the unmounted virtual thread is queued to be remounted on an available carrier thread and resume its execution.

This clever multiplexing means you can have millions of virtual threads running concurrently, all sharing a handful of platform carrier threads. The OS sees only the carrier threads, completely unaware of the virtual threads being managed by the JVM.
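One way to watch mounting in action is to print how a virtual thread describes itself before and after a blocking call. The sketch below is illustrative only: the class name is made up, and the exact toString() format (which on current JDKs includes the carrier thread's name) is an implementation detail that may change. The carrier shown after the sleep may or may not differ from the one shown before it.

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CarrierThreadDemo {

    /** Runs one virtual thread and records its description before and after blocking. */
    static List<String> observe() throws InterruptedException {
        List<String> snapshots = new CopyOnWriteArrayList<>();
        Thread vt = Thread.ofVirtual().start(() -> {
            snapshots.add(Thread.currentThread().toString());
            try {
                // Blocking call: the JVM can unmount the virtual thread from its carrier here
                Thread.sleep(Duration.ofMillis(50));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // After waking up, the virtual thread has been mounted again, possibly on another carrier
            snapshots.add(Thread.currentThread().toString());
        });
        vt.join();
        return snapshots;
    }

    public static void main(String[] args) throws InterruptedException {
        observe().forEach(System.out::println);
    }
}
```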

Key Benefits of Virtual Threads:

  • Massive Scalability: Create hundreds of thousands, even millions, of threads without exhausting memory or CPU.
  • Simplified Programming Model: Go back to the familiar, easy-to-reason-about "thread-per-request" blocking style. No more callback hell or complex reactive chains unless you genuinely need them for other reasons.
  • Reduced Development Effort: Less complex code means fewer bugs and faster development cycles.
  • Improved Resource Utilization: Efficiently use CPU cores by avoiding costly OS context switches for most concurrent tasks.

Practical Usage: Creating Virtual Threads

Creating a virtual thread is surprisingly similar to creating a platform thread, but with a crucial factory method.

1. Using Thread.ofVirtual().start()

This is the most direct way to create and start a single virtual thread.

import java.time.Duration;

public class VirtualThreadExample {

    public static void main(String[] args) throws InterruptedException {
        long startTime = System.currentTimeMillis();

        // Create and start a virtual thread
        Thread virtualThread = Thread.ofVirtual().start(() -> {
            System.out.println("Hello from a Virtual Thread! Thread ID: " + Thread.currentThread().threadId());
            try {
                Thread.sleep(Duration.ofSeconds(2)); // Simulate some blocking work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            System.out.println("Virtual Thread finished its work.");
        });

        virtualThread.join(); // Wait for the virtual thread to complete

        long endTime = System.currentTimeMillis();
        System.out.println("Main thread finished. Total time: " + (endTime - startTime) + "ms");
    }
}
        

Notice how Thread.currentThread().threadId() still works, giving you a unique ID for each virtual thread, just like platform threads.
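The same builder offers a few variations that are worth knowing. The sketch below, with a made-up class name and a hypothetical "worker-" name prefix, shows Thread.startVirtualThread() as a one-line shortcut and Thread.ofVirtual().name(...).factory() for producing named virtual threads through a ThreadFactory.

```java
import java.util.concurrent.ThreadFactory;

public class VirtualThreadBuilders {

    /** Creates an unstarted, named virtual thread through a ThreadFactory. */
    static Thread namedWorker(Runnable task) {
        // Threads from this factory are named worker-0, worker-1, ...
        ThreadFactory factory = Thread.ofVirtual().name("worker-", 0).factory();
        return factory.newThread(task);
    }

    public static void main(String[] args) throws InterruptedException {
        // Shortcut: create and start a virtual thread in one call
        Thread quick = Thread.startVirtualThread(
                () -> System.out.println("started with startVirtualThread"));
        quick.join();

        Thread named = namedWorker(
                () -> System.out.println("running as " + Thread.currentThread().getName()));
        named.start();
        named.join();

        // Virtual threads are always daemon threads
        System.out.println("isVirtual=" + named.isVirtual() + ", isDaemon=" + named.isDaemon());
    }
}
```

Naming virtual threads is optional, but it tends to make logs and thread dumps easier to read.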

2. Using Executors.newVirtualThreadPerTaskExecutor()

When you want an ExecutorService interface rather than individual threads, use this factory method. It is not a pool in the usual sense: nothing is reused or capped, and each task submitted to the executor runs on its own new virtual thread.

import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.IntStream;

public class VirtualThreadExecutorExample {

    public static void main(String[] args) throws InterruptedException {
        long startTime = System.currentTimeMillis();
        int numberOfTasks = 10_000; // Let's simulate 10,000 concurrent tasks

        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            IntStream.range(0, numberOfTasks).forEach(i -> {
                executor.submit(() -> {
                    // System.out.println("Task " + i + " running on Virtual Thread: " + Thread.currentThread().threadId());
                    try {
                        Thread.sleep(Duration.ofMillis(100)); // Simulate some blocking work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    // System.out.println("Task " + i + " finished.");
                });
            });
        } // executor.close() will wait for all tasks to complete

        long endTime = System.currentTimeMillis();
        System.out.println("All " + numberOfTasks + " tasks finished. Total time: " + (endTime - startTime) + "ms");
    }
}
        

If you run this example, the 10,000 tasks complete in little more than 100 ms plus scheduling overhead, because all of them sleep at the same time. That is where virtual threads shine: I/O-bound or otherwise blocking work. A fixed pool of platform threads would take far longer, since only as many tasks as there are pool threads can block at once; with 200 threads, for example, the same workload needs 50 rounds of 100 ms, about 5 seconds.
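To make the comparison with a fixed pool concrete, here is a small timing sketch. The class name, the pool size of 200, and the task count of 2,000 are arbitrary choices for illustration, and the measured numbers will vary by machine; treat it as a rough demonstration rather than a benchmark.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolComparison {

    /** Submits blocking tasks to the executor, waits for all of them, and returns elapsed milliseconds. */
    static long runMillis(ExecutorService executor, int tasks, Duration blockTime) {
        long start = System.nanoTime();
        try (executor) { // close() waits for submitted tasks to finish
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(blockTime); // simulated blocking work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int tasks = 2_000;
        Duration block = Duration.ofMillis(100);

        // 200 platform threads: at most 200 tasks can sleep at once, so about 10 rounds
        long fixed = runMillis(Executors.newFixedThreadPool(200), tasks, block);
        // One virtual thread per task: all tasks sleep at the same time
        long virtual = runMillis(Executors.newVirtualThreadPerTaskExecutor(), tasks, block);

        System.out.println("fixed pool (200 threads): " + fixed + " ms");
        System.out.println("virtual threads:          " + virtual + " ms");
    }
}
```

The fixed pool should land near one second (10 rounds of 100 ms), while the virtual-thread run should stay close to the 100 ms of a single task.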

Building a High-Throughput Service with Spring Boot and Virtual Threads

Integrating virtual threads into a Spring Boot application is remarkably straightforward with Spring Boot 3.2+ running on Java 21+. Spring Boot has built-in support for virtual threads, so switching the embedded web server (for example Tomcat or Jetty) to use them is a one-line configuration change.

Enabling Virtual Threads in Spring Boot

To enable virtual threads for the web server and for Spring Boot's auto-configured task executor, add this property to application.properties (or the equivalent nested keys in application.yml):

# application.properties
spring.threads.virtual.enabled=true
        

That's it! With this single line, Spring Boot configures its embedded servlet container (e.g., Tomcat) to use virtual threads for handling incoming requests. This means every request will be processed by a virtual thread, allowing you to scale to thousands of concurrent connections with minimal resource overhead.
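To see the idea without any framework, the sketch below uses the JDK's built-in com.sun.net.httpserver.HttpServer with a virtual-thread-per-task executor. This is not how Spring Boot or Tomcat implement the feature; it is a standalone illustration of "one virtual thread per request", and the class name, the /status path, and the loopback address are assumptions chosen for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;

public class VirtualThreadHttpSketch {

    /** Starts a server on a free port; every request handler runs on a new virtual thread. */
    static HttpServer startServer() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.setExecutor(Executors.newVirtualThreadPerTaskExecutor());
        server.createContext("/status", exchange -> {
            byte[] body = ("virtual=" + Thread.currentThread().isVirtual())
                    .getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (var out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    /** Calls the /status endpoint once and returns the response body. */
    static String fetchStatus(HttpServer server) throws Exception {
        URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/status");
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(HttpRequest.newBuilder(uri).build(), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startServer();
        try {
            System.out.println(fetchStatus(server));
        } finally {
            server.stop(0);
        }
    }
}
```

The handler code is ordinary blocking Java; only the executor passed to the server decides whether it runs on platform or virtual threads.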

Practical Example: A High-Concurrency Spring Boot Service

Let's create a simple Spring Boot application that simulates a service doing some "heavy" (blocking) work. We'll use a Thread.sleep() to mimic an I/O operation like a database call or an external API request.

1. Project Setup (Maven pom.xml)

Ensure you're using Java 21+ and Spring Boot 3.2+.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.2.4</version> <!-- Or newer -->
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.example</groupId>
    <artifactId>virtualthreads-demo</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>virtualthreads-demo</name>
    <description>Demo project for Spring Boot Virtual Threads</description>
    <properties>
        <java.version>21</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <!-- For testing, if needed -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>        

2. application.properties

spring.threads.virtual.enabled=true
server.port=8080

Optional: to compare against platform threads, set spring.threads.virtual.enabled=false and state Tomcat's worker pool limit explicitly:

server.tomcat.threads.max=200

200 is already Tomcat's default in Spring Boot; writing it out just makes the limit visible. With platform threads, at most 200 requests are processed at once and the rest wait. With virtual threads enabled, Tomcat runs each request on a new virtual thread instead of drawing from this worker pool, so the setting no longer caps how many requests are processed concurrently.

3. Spring Boot Application Class

package com.example.virtualthreadsdemo;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class VirtualthreadsDemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(VirtualthreadsDemoApplication.class, args);
    }

}        

4. A High-Throughput Controller

This controller exposes an endpoint that simulates a blocking operation.

package com.example.virtualthreadsdemo;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import java.time.Duration;

@RestController
public class HeavyWorkController {

    // Default simulated I/O time in milliseconds; a String because @RequestParam defaults are strings
    private static final String DEFAULT_SLEEP_MILLIS = "100";

    @GetMapping("/heavy-work")
    public String doHeavyWork(@RequestParam(defaultValue = DEFAULT_SLEEP_MILLIS) int duration) throws InterruptedException {
        String threadInfo = "Processing on Thread: " + Thread.currentThread().getName() + " (Virtual: " + Thread.currentThread().isVirtual() + ")";
        System.out.println(threadInfo + " - Starting work for " + duration + "ms.");

        // Simulate a blocking I/O operation or database call
        Thread.sleep(Duration.ofMillis(duration));

        System.out.println(threadInfo + " - Finished work.");
        return "Work done in " + duration + "ms on " + Thread.currentThread().getName() + " (Virtual: " + Thread.currentThread().isVirtual() + ")";
    }

    @GetMapping("/status")
    public String getStatus() {
        return "Service is running on thread: " + Thread.currentThread().getName() + " (Virtual: " + Thread.currentThread().isVirtual() + ")";
    }
}
        

Testing the Service

  1. Run the application.
  2. Use a load testing tool (e.g., Apache JMeter, k6, hey, or even simple curl commands in a loop) to send a large number of concurrent requests to http://localhost:8080/heavy-work.

  • Example with hey (install via go install github.com/rakyll/hey@latest):

hey -n 10000 -c 1000 "http://localhost:8080/heavy-work?duration=50"

This command sends 10,000 requests with 1,000 concurrent connections, each simulating a 50ms blocking operation.

  • Without virtual threads (spring.threads.virtual.enabled=false), at most server.tomcat.threads.max requests are processed at a time. With a limit of 200 and 50 ms per request, 10,000 requests need at least 50 rounds, or about 2.5 seconds, while the excess connections wait their turn. With longer blocking times or heavier load, that wait grows until clients start timing out.
  • With virtual threads enabled, all 1,000 concurrent connections are served at once, so the same run needs only about 10 rounds of 50 ms: roughly 0.5 seconds plus network latency and overhead. The server logs will show that each request is handled by its own virtual thread.
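The expected duration of such a run can be estimated with a little arithmetic. The sketch below is a simplified lower bound under stated assumptions: the client keeps at most -c requests in flight, the server processes at most "workers" requests at once, and every request blocks for the same time. The class and method names are made up for illustration, and real runs add network and scheduling overhead on top.

```java
public class LoadTestEstimate {

    /**
     * Lower bound on wall-clock time: requests are served in rounds whose size is limited
     * by the smaller of client concurrency and server-side worker capacity.
     */
    static long lowerBoundMillis(int requests, int clientConcurrency, int serverWorkers, long blockMillis) {
        int inFlight = Math.min(clientConcurrency, serverWorkers);
        long rounds = (requests + inFlight - 1) / inFlight; // ceiling division
        return rounds * blockMillis;
    }

    public static void main(String[] args) {
        // hey -n 10000 -c 1000, with 50 ms of blocking per request
        System.out.println("platform threads, max 200: "
                + lowerBoundMillis(10_000, 1_000, 200, 50) + " ms");   // 50 rounds -> 2500 ms
        System.out.println("virtual threads:           "
                + lowerBoundMillis(10_000, 1_000, Integer.MAX_VALUE, 50) + " ms"); // 10 rounds -> 500 ms
    }
}
```

With virtual threads the bottleneck moves to the client's concurrency setting, so raising -c is what shortens the run further.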

Common Pitfalls and Advanced Tips

While virtual threads are powerful, they aren't a silver bullet for all concurrency problems.

  • Pinning: On JDK 21 through 23, a virtual thread is "pinned" to its carrier thread when it blocks inside a synchronized block or method, or while a native method is on its stack. A pinned virtual thread cannot be unmounted, so its carrier stays occupied and other virtual threads cannot use it. JDK 24 (JEP 491) removes this limitation for synchronized; native frames still pin.
  • Solution: On affected JDKs, avoid blocking inside synchronized blocks, or keep such blocks very short. Where a lock must be held across a blocking call, prefer ReentrantLock or StampedLock, which let the virtual thread unmount while it waits.
  • ThreadLocals: Every virtual thread has its own copy of each ThreadLocal value it touches. Patterns that cache expensive objects per thread, which work well with a few hundred pooled threads, can multiply memory use when there are hundreds of thousands of short-lived virtual threads.
  • Solution: Be mindful of ThreadLocal usage. Consider ScopedValue (a preview API in Java 21, finalized in Java 25) as a more efficient and safer alternative for passing implicit data.
  • CPU-Bound Tasks: Virtual threads excel at I/O-bound or blocking operations. For purely CPU-bound tasks (e.g., heavy computations), the optimal number of threads is typically close to the number of CPU cores. Using many virtual threads for CPU-bound tasks won't make them faster, because the work is limited by cores rather than by waiting, and it adds scheduling overhead.
  • Solution: Identify CPU-bound vs. I/O-bound tasks. Use a traditional ForkJoinPool or a fixed pool (Executors.newFixedThreadPool) sized to the core count for CPU-bound work.
  • Monitoring: While virtual threads are lightweight, you still need to monitor your application's health. Tools like VisualVM, JFR (Java Flight Recorder), and Spring Boot Actuator can help observe virtual thread activity, carrier thread utilization, and identify potential pinning issues.
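For the pinning tip in the list above, here is a minimal sketch of guarding a critical section with ReentrantLock instead of synchronized. The class name, the counter, and the 1 ms sleep standing in for blocking I/O are all illustrative assumptions, not code from a real service.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

public class LockInsteadOfSynchronized {

    private final ReentrantLock lock = new ReentrantLock();
    private int counter;

    /** Blocks while holding the lock; waiting virtual threads can be unmounted from their carriers. */
    void incrementSlowly() throws InterruptedException {
        lock.lock();
        try {
            Thread.sleep(Duration.ofMillis(1)); // stands in for blocking I/O inside the critical section
            counter++;
        } finally {
            lock.unlock();
        }
    }

    int counter() {
        lock.lock();
        try {
            return counter;
        } finally {
            lock.unlock();
        }
    }

    /** Runs the given number of tasks, one virtual thread each, and returns the final counter. */
    static int run(int tasks) {
        LockInsteadOfSynchronized shared = new LockInsteadOfSynchronized();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    shared.incrementSlowly();
                    return null; // Callable form, so the checked exception is allowed
                });
            }
        } // close() waits for all tasks
        return shared.counter();
    }

    public static void main(String[] args) {
        System.out.println("counter = " + run(200)); // prints "counter = 200"
    }
}
```

To check whether an application is affected on JDK 21 through 23, the jdk.VirtualThreadPinned JFR event and the -Djdk.tracePinnedThreads=full system property report where pinning occurs.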

Virtual threads allow us to write simple, imperative, blocking code and still achieve phenomenal scalability. This is a monumental shift for Java developers, allowing us to focus on business logic rather than complex concurrency primitives.

Conclusion

Java Virtual Threads are a paradigm shift for building high-performance, scalable applications. By decoupling the programming model from the underlying OS threads, they bring back the simplicity of the thread-per-request model while delivering unprecedented concurrency, allowing your Java applications to effortlessly handle hundreds of thousands of requests. While not a cure-all, understanding their benefits and pitfalls empowers you to write more efficient, maintainable, and robust services.

More articles by 🧿 Saral Saxena
