Project Loom meets Quarkus
Project Loom's virtual threads were introduced in Java 19 as a preview feature.
The premise of Loom is:
“(..) to drastically reduce the effort of writing, maintaining, and observing high-throughput concurrent applications that make the best use of available hardware”
Ron Pressler, May 2020
We will need to wait some time until Loom leaves preview and reaches a full release. However, since it is already available, the Quarkus team decided to give it a try in version 2.10.0.Final. Thanks to their preliminary work on virtual threads, we can experience Loom using more complex examples.
This blog post introduces Project Loom and Quarkus and gives instructions on running applications on Quarkus with Postgres utilizing virtual threads. It also includes results from load tests performed on said application.
Why Project Loom
The standard way of handling parallel and concurrent computation in Java is to use threads.
Java's way of threading boils down to utilizing operating system threads. Each thread created in the VM has its own system thread executing the Thread.run() method. This also means that the OS is responsible for scheduling and running threads.
No matter what, each thread occupies memory and craves CPU time. In a multitasking OS on a multicore CPU, all threads compete for access to the hardware resources. This leads to many problems that must be solved to use multitasking effectively: thread locking, thread scheduling, and synchronization, to name a few.
Java threads have been available since the beginning of Java. Throughout the years, they evolved and adapted to new hardware possibilities. Starting as green threads, they quickly became platform threads by default, and were later extended by the Concurrency API introduced in Java 1.5. Since then Java threading, with its Future, ExecutorService, ForkJoinPool, concurrent maps, and many more, has matured.
However, the IT world demands more and more computing power as time goes by. The quality of many software systems is measured by their throughput, which requires using the available resources as efficiently as possible, in the shortest possible time.
For servers that must handle an increased workload, this is a delicate balance between the number of threads competing for resources and staying responsive in a timely manner.
A thread pool with too few threads can be overloaded by a long queue of requests. A pool with too many idle threads holds on to OS thread resources that other processes could use.
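The first half of that trade-off can be observed with a minimal sketch. The class and method names below are hypothetical and not part of the sample application; the sketch assumes tasks that simply sleep to simulate blocking work.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: a small fixed pool of platform threads with an unbounded queue.
class SmallPoolQueueDemo {

    /**
     * Submits `tasks` sleeping tasks to a pool with `threads` workers and
     * returns how many tasks are waiting in the queue right after submission.
     */
    static int queuedAfterSubmit(int threads, int tasks, long sleepMillis) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        Thread.sleep(sleepMillis); // simulated blocking work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Each worker holds one task; everything else has to wait in the queue.
            return pool.getQueue().size();
        } finally {
            pool.shutdown(); // already submitted tasks still run to completion
        }
    }

    public static void main(String[] args) {
        System.out.println("Queued tasks: " + queuedAfterSubmit(2, 6, 300));
    }
}
```

With two workers and six slow tasks, four tasks sit in the queue while the workers are busy. Under real load the queue, and therefore the response time, grows with every request that arrives faster than the pool can drain it.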
Brian Goetz, in his book "Java Concurrency in Practice", recommends a formula for calculating the number of threads in the pool:
Number of threads = Number of CPUs * Target CPU utilization * (1 + Wait time / Compute time)
However, the formula does not solve the problem. It only replaces guessing with measuring.
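As a worked example, the formula can be turned into a few lines of Java. This is only a sketch: the class name, the method name, and the workload numbers (10 cores, 90 ms of waiting per 10 ms of computing) are assumptions chosen for illustration, not measurements from the tests in this post.

```java
// Hypothetical helper that applies the pool-sizing formula quoted above.
class PoolSizing {

    /**
     * threads = cpus * targetUtilization * (1 + waitTime / computeTime)
     * waitTime and computeTime must use the same unit; only their ratio matters.
     */
    static int recommendedThreads(int cpus, double targetUtilization,
                                  double waitTime, double computeTime) {
        if (computeTime <= 0) {
            throw new IllegalArgumentException("computeTime must be positive");
        }
        double threads = cpus * targetUtilization * (1 + waitTime / computeTime);
        return Math.max(1, (int) Math.ceil(threads));
    }

    public static void main(String[] args) {
        // Assumed I/O-heavy workload: 90 ms waiting for every 10 ms of computing.
        System.out.println(recommendedThreads(10, 1.0, 90, 10)); // prints 100
        // Assumed CPU-bound workload: no waiting at all.
        System.out.println(recommendedThreads(10, 1.0, 0, 10));  // prints 10
    }
}
```

The more time a request spends waiting compared to computing, the more threads the formula asks for, which is exactly where the cost of platform threads starts to hurt.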
Creating new threads in the JVM is relatively expensive. The process includes creating a Java stack and a system stack, and it involves system calls and callbacks. It is far more expensive than creating 'standard' objects, especially when a server handles many short-lived requests and creates a new thread for each of them. This hardly matters in a development environment, but it plays a crucial role in high-throughput production systems.
Why care about Loom
Project Loom aims to mitigate or solve said problems by introducing virtual threads.
The idea is to create a Java thread that is not tied to a platform thread. A virtual thread does not occupy an OS thread or CPU time until it actually has work to run, so while it waits the platform thread can be used by other tasks. However, a long-running, CPU-intensive virtual thread keeps its platform thread occupied for as long as it computes and becomes a bottleneck. This is because virtual threads are not faster than standard threads. They optimize the utilization of platform threads.
Virtual threads address high-throughput demands that standard threads cannot meet because the number of platform threads is limited. They are not designed to speed up computation or to achieve lower latency.
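The difference is easiest to see with tasks that mostly wait. The following sketch is not part of the sample application; the class and method names are hypothetical. It assumes Java 21 or newer, where the virtual thread API is final; on Java 19 the same calls are available only with --enable-preview.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo. Assumes Java 21+, or Java 19/20 started with --enable-preview.
class VirtualThreadsDemo {

    /** Runs `tasks` blocking tasks, one virtual thread each, and returns how many completed. */
    static int runBlockingTasks(int tasks, Duration blockTime) {
        AtomicInteger completed = new AtomicInteger();
        // close() at the end of the try block waits for all submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    // While sleeping, the virtual thread does not hold a platform thread.
                    Thread.sleep(blockTime);
                    return completed.incrementAndGet();
                });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runBlockingTasks(1_000, Duration.ofMillis(200));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " tasks finished in " + elapsedMs + " ms");
    }
}
```

A thousand tasks that each wait 200 ms should finish in a small multiple of 200 ms, because the waiting overlaps. Replacing Thread.sleep with a CPU-bound loop would remove that benefit, since each running task then needs a platform thread for its whole duration.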
Java platform threads are scheduled by the operating system, whereas virtual threads are scheduled by the JDK. The JDK scheduler assigns many virtual threads to a smaller number of platform threads (an M:N mapping). This is illustrated in the following picture.
The next sections describe how to experience virtual threads in action by implementing a Quarkus application and running load tests against it with different setups.
The full source code of this application may be found on GitHub.
Quarkus and virtual threads
Quarkus is a Java stack developed with cloud deployment in mind. It promises fast startup and a low memory footprint. It is Kubernetes-friendly and allows applications to be run on OpenJDK HotSpot and GraalVM. Quarkus supports both imperative and reactive programming; the latter is implemented natively using Netty and Mutiny.
Prerequisites
This sample application is based on tools that need to be installed before going further:
- Java OpenJDK HotSpot 19
- Docker
- Quarkus
- siege - regression test and benchmark utility
Init sample application
Start with Quarkus app initialization
quarkus create app loom-quarkus
Dependencies
Additional libraries must be added to the Maven dependencies.
quarkus-agroal, quarkus-jdbc-postgresql, and quarkus-reactive-pg-client are data source libraries: the first two handle blocking (JDBC) connections, the last one non-blocking connections.
quarkus-resteasy-reactive-jackson allows creating REST services with JSON responses.
<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-agroal</artifactId>
</dependency>
<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-reactive-pg-client</artifactId>
</dependency>
<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-jdbc-postgresql</artifactId>
</dependency>
<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-resteasy-reactive-jackson</artifactId>
</dependency>
Enable virtual threads
Virtual threads that come with Java 19 are disabled by default and must be activated explicitly. This can be done by passing the --enable-preview parameter to the compiler in the Maven configuration file. Also, set the release version to 19.
<plugin>
    <artifactId>maven-compiler-plugin</artifactId>
    <version>${compiler-plugin.version}</version>
    <configuration>
        <compilerArgs>
            <arg>--enable-preview</arg>
        </compilerArgs>
        <release>19</release>
    </configuration>
</plugin>
Data source configuration
Quarkus comes with Dev Services that ease the development process. One of the options is to provision and configure a PostgreSQL database automatically. This way Quarkus takes care of the data source configuration, both blocking and non-blocking. This is sufficient for this sample application.
Enable database Dev Services in application.properties:
quarkus.datasource.devservices.enabled=true
REST Service
This service shows how to use virtual threads. By default, Quarkus runs services with a reactive approach, so the service has to be configured in a fine-grained way using the @Blocking, @NonBlocking, and @RunOnVirtualThread annotations.
@Path("/sandbox")
@Produces(MediaType.APPLICATION_JSON)
@NoCache()
@Blocking
public class LoomSandboxResource {

    BlogpostsRepository repository;

    @Inject
    public LoomSandboxResource(BlogpostsRepository repository) {
        this.repository = repository;
    }

    // Runs on a platform worker thread because of the class-level @Blocking
    @GET
    @Path("/blocking/jdbc")
    public List<Blogpost> blogPostsBlockingJDBC() {
        return repository.findAllJdbc();
    }

    // Same blocking JDBC call, but executed on a virtual thread
    @GET
    @RunOnVirtualThread
    @Path("/loom/jdbc")
    public List<Blogpost> blogPostsLoomJdbc() {
        return repository.findAllJdbc();
    }
}
Data repository
For the purposes of this application, a single repository is used. However, this repository utilizes both the blocking AgroalDataSource and the non-blocking PgPool data sources.
@ApplicationScoped
public class BlogpostsRepository {

    final PgPool client; // This is the non-blocking Postgres driver
    final AgroalDataSource agroalDataSource; // This is the standard blocking Postgres driver
    private final String SELECT_ALL = "SELECT * FROM blogposts";

    @Inject
    public BlogpostsRepository(final PgPool client, final AgroalDataSource agroalDataSource) {
        this.client = client;
        this.agroalDataSource = agroalDataSource;
    }

    public List<Blogpost> findAllJdbc() {
        var quotesList = new ArrayList<Blogpost>();
        try (Connection connection = agroalDataSource.getConnection();
             PreparedStatement preparedStatement = connection.prepareStatement(SELECT_ALL);
             ResultSet resultSet = preparedStatement.executeQuery();) {
            while (resultSet.next()) {
                quotesList.add(BlogpostMapper.toBlogpost(resultSet));
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
        return quotesList;
    }

    public Uni<List<Blogpost>> findAllReactiveUni() {
        return client.query(SELECT_ALL)
                .execute()
                .onItem().transform(BlogpostMapper::toBlogpost);
    }

    public Multi<Blogpost> findAllReactiveMulti() {
        return client.query(SELECT_ALL)
                .execute()
                .onItem().transformToMulti(set -> Multi.createFrom().iterable(set))
                .onItem().transform(BlogpostMapper::toBlogpost);
    }
}
Running the application
The simplest way to run this application is to use the Quarkus CLI:
quarkus dev
It should print output similar to the following:
At this point, the application should be up and connected to a dockerized PostgreSQL database.
Importing sample data
Let's import sample data so that we can run some tests and see that the application connects to the database.
The data can be downloaded from GitHub. The file should be copied to the main/resources/ directory.
Next, let's create an InitDb class responsible for reading the data from main/resources/sample-data.json and uploading it into the database.
This class reinitializes the database schema every time the application starts, to ensure that every test works on exactly the same data set.
@ApplicationScoped
public class InitDb {

    boolean schemaCreate;
    final PgPool client;

    @Inject
    public InitDb(@ConfigProperty(name = "blogposts.schema.create", defaultValue = "true") boolean schemaCreate,
                  PgPool client) {
        this.schemaCreate = schemaCreate;
        this.client = client;
    }

    void config(@Observes StartupEvent ev) {
        if (schemaCreate) {
            run();
        }
    }

    private void run() {
        List<Tuple> batch = prepareData();
        client.query("DROP TABLE IF EXISTS blogposts").execute()
                .flatMap(r -> client.query("CREATE TABLE blogposts (id SERIAL PRIMARY KEY, author VARCHAR(256) NOT NULL, content TEXT NOT NULL , tags VARCHAR(256))").execute())
                .await().indefinitely();
        client.preparedQuery("INSERT INTO blogposts (author, content, tags) VALUES ($1, $2, $3)")
                .executeBatch(batch)
                .await().indefinitely();
    }

    private List<Tuple> prepareData() {
        try {
            InputStream in = getClass().getResourceAsStream("/sample-data.json");
            String result = new BufferedReader(new InputStreamReader(in))
                    .lines().collect(Collectors.joining("\n"));
            ObjectMapper mapper = new ObjectMapper();
            TypeReference<List<Blogpost>> documentMapType =
                    new TypeReference<>() {
                    };
            var document = mapper.readValue(result, documentMapType);
            return document.stream().map(i -> Tuple.of(i.author(), i.content(), i.tags())).toList();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
After a restart, the application should be populated with sample data. We can verify this by sending a request to one of its URLs: http://localhost:8080/sandbox/blocking/jdbc .
Testing
The aim of the tests was to verify how the application utilizes CPU, memory, and threads under load, and whether there is any difference between using platform threads and virtual threads.
The test environment:
- Number of cores/threads in CPU - 10/10
- 64 GB RAM
- Postgresql running in Docker
- Quarkus app running in dev mode on Java OpenJDK HotSpot 19
Siege was used as the test framework. It is a simple command line tool, with a clear summary that is enough for the purposes of these tests.
Each test sends 1000 requests with a 1s delay before each request.
Every response contains around 1.2MiB of JSON data.
Thread pool limit 10
In this scenario, the thread pool limit and the Vert.x worker pool size were both set to 10 threads.
Also, the minimum and maximum size of the database connection pool were adjusted to minimize data loss.
quarkus.vertx.worker-pool-size=10
quarkus.thread-pool.max-threads=10
quarkus.datasource.jdbc.min-size = 60
quarkus.datasource.jdbc.max-size = 90
Siege test results
Metric | virtual threads | platform threads | Unit |
---|---|---|---|
Elapsed time | 12,38 | 10,94 | sec |
Average response time | 7,56 | 4,11 | sec |
Transaction rate | 80,78 | 91,41 | trans/sec |
Throughput | 94,47 | 106,91 | MB/sec |
Memory and CPU consumption
Thread pool limit 100
In this scenario, the thread pool limit and the Vert.x worker pool size were both set to 100 threads.
Also, the minimum and maximum size of the database connection pool were adjusted to minimize data loss.
quarkus.vertx.worker-pool-size=100
quarkus.thread-pool.max-threads=100
quarkus.datasource.jdbc.min-size = 60
quarkus.datasource.jdbc.max-size = 90
Siege test results
Metric | virtual threads | platform threads | Unit |
---|---|---|---|
Elapsed time | 12,02 | 10,50 | sec |
Average response time | 6,89 | 4,07 | sec |
Transaction rate | 83,19 | 95,24 | trans/sec |
Throughput | 97,30 | 108,27 | MB/sec |
Memory and CPU consumption
Thread pool limit 300
In this scenario, the thread pool limit and the Vert.x worker pool size were both set to 300 threads.
Siege test results
Metric | virtual threads | platform threads | Unit |
---|---|---|---|
Elapsed time | 12,07 | 10,23 | sec |
Average response time | 6,99 | 3,92 | sec |
Transaction rate | 82,85 | 97,75 | trans/sec |
Throughput | 96,90 | 114,33 | MB/sec |
Memory and CPU consumption
Thread pool limit 10 and long running task
This test observes how virtual threads handle long-running tasks compared to platform threads. The original BlogpostsRepository needs to be altered with a Thread.sleep() call in order to simulate a long-running task. This way each request is blocked for 2 seconds before returning the value.
public List<Blogpost> findAllJdbc() {
    var quotesList = new ArrayList<Blogpost>();
    try (Connection connection = agroalDataSource.getConnection();
         PreparedStatement preparedStatement = connection.prepareStatement(SELECT_ALL);
         ResultSet resultSet = preparedStatement.executeQuery();) {
        while (resultSet.next()) {
            quotesList.add(BlogpostMapper.toBlogpost(resultSet));
        }
    } catch (SQLException e) {
        e.printStackTrace();
    }
    try {
        Thread.sleep(2000);
    } catch (InterruptedException e) {
        throw new RuntimeException(e);
    }
    return quotesList;
}
Siege test results
Metric | virtual threads | platform threads | Unit |
---|---|---|---|
Elapsed time | 13,34 | 213,81 | sec |
Average response time | 8,15 | 106,56 | sec |
Transaction rate | 74,96 | 4,68 | trans/sec |
Throughput | 87,68 | 5,47 | MB/sec |
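The platform thread numbers are close to what a back-of-envelope estimate predicts. The sketch below uses hypothetical names and a simplification: it counts only the 2 second sleep per request, ignores the time spent in the query itself, and assumes the pool is kept busy for the whole test.

```java
// Hypothetical estimate for a bounded pool where every request blocks a thread.
class BoundedPoolEstimate {

    /** Lower bound on the elapsed time: requests are served in "waves" of poolThreads. */
    static double minElapsedSeconds(int requests, int poolThreads, double blockSeconds) {
        int waves = (requests + poolThreads - 1) / poolThreads; // ceiling division
        return waves * blockSeconds;
    }

    /** Upper bound on the transaction rate while every pool thread is blocked. */
    static double maxTransactionsPerSecond(int poolThreads, double blockSeconds) {
        return poolThreads / blockSeconds;
    }

    public static void main(String[] args) {
        // 1000 requests, 10 pool threads, 2 s of blocking per request
        System.out.println(minElapsedSeconds(1000, 10, 2.0));   // prints 200.0
        System.out.println(maxTransactionsPerSecond(10, 2.0));  // prints 5.0
    }
}
```

The measured 213,81 seconds and 4,68 transactions per second for platform threads sit just on the slow side of these bounds, which suggests that the ten blocked pool threads, not the database, limit the throughput in this test.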
Memory and CPU consumption
Summary
At first glance, the test results suggest that virtual threads do not differ much from platform threads in terms of performance. However, the first two tests show that platform threads perform better as the thread pool grows, whereas virtual threads stay at more or less the same level. Moreover, the test with the thread pool size set to 300 shows an even bigger advantage for platform threads. So why bother with virtual threads at all, if they are less performant and additionally consume twice as much heap memory as platform threads?
The last test shows the true power of virtual threads. If we take a closer look at the results, we will find that with the thread pool set to 10, the platform threads are having a hard time serving all the long-running requests. It took them more than 200 seconds to complete. In the meantime, the virtual threads turn out to be lightning fast, with a finish time of 13 seconds.
This is possible because a virtual thread that blocks does not hold on to its platform thread when there is no need to. While it waits, the platform thread can handle other requests, which leads to high throughput.
This is very promising, but we need to keep in mind that Loom is very young, and we have yet to learn all the pitfalls.
I think that programmers won't use Loom directly on a daily basis. The real impact will come when the vast number of existing libraries and frameworks add support for Loom.
On the other hand, it may also spark new projects that support Loom from the ground up, and we may see some shift in library usage, at least in the world of microservices where high throughput is crucial.