Go-like selects using jox channels in Java
The jox project implements fast & scalable channels in Java, designed to be run with Java 21+ and virtual threads. The inspiration for the project comes from the Kotlin ecosystem.
jox is relatively young; in the previous announcement, we covered the motivation behind a new concurrent data structure implementation and some initial performance tests.
We're now taking the next step by adding Go-like selects. Let's see how this feature works, and again, look at a couple more performance tests!
What is a channel?
At a fundamental level, a channel is like a queue: you can send & receive elements. Channels come in three primary flavors. Rendezvous channels don't have a buffer, and every send operation blocks until another thread invokes receive. Hence, two threads must "meet" to exchange the value. The second flavor is buffered channels, where sends don't block until the buffer is full. Finally, you can also have unlimited channels, where sends never block.
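The three blocking behaviors can be observed with nothing but the JDK's own blocking queues, used here purely as stand-ins for the three channel flavors (this is a minimal sketch, not jox code; the class and method names are made up for illustration). A non-blocking offer returns false exactly where a blocking send would have to wait:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

class QueueFlavors {
    // Counts how many of `attempts` non-blocking offers succeed
    // when no other thread is receiving from the queue.
    static int acceptedOffers(BlockingQueue<Integer> queue, int attempts) {
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            // offer() returns false where a blocking put() would have to wait
            if (queue.offer(i)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // rendezvous-like: no buffer, so an offer without a waiting receiver fails
        System.out.println(acceptedOffers(new SynchronousQueue<>(), 5));    // prints: 0
        // buffered: the first 3 offers fit into the buffer, the rest would block
        System.out.println(acceptedOffers(new ArrayBlockingQueue<>(3), 5)); // prints: 3
        // unlimited: every offer is accepted
        System.out.println(acceptedOffers(new LinkedBlockingQueue<>(), 5)); // prints: 5
    }
}
```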
But that's available in Java's std lib?
This prompts the question: how is a channel different from a LinkedBlockingQueue, ArrayBlockingQueue or SynchronousQueue, available in Java's standard library?
The first difference is that channels can be closed. A channel can be marked as "done": no new values can be sent, but the ones that are already buffered or being sent are still delivered. Alternatively, a channel might transition to an "error" state, which discards all buffered elements and interrupts any pending send/receive operations.
The second difference, which is the focus of this release, is the availability of the select method. A select invocation takes a number of clauses, from which exactly one is guaranteed to be completed. For example:
import com.softwaremill.jox.Channel;
import static com.softwaremill.jox.Select.select;

class Demo {
    public static void main(String[] args) throws InterruptedException {
        // creates a buffered channel (buffer of size 3)
        var ch1 = new Channel<Integer>(3);
        var ch2 = new Channel<Integer>(3);
        var ch3 = new Channel<Integer>(3);

        // send a value to two channels; this doesn't block,
        // as the channel is buffered
        ch2.send(29);
        ch3.send(32);

        var received = select(
            ch1.receiveClause(),
            ch2.receiveClause(),
            ch3.receiveClause()
        );

        // prints: Received: 29
        System.out.println("Received: " + received);
        // ch3 still holds a value that can be received
    }
}
(Here, we've got a single thread running all the operations, which isn't that interesting in general but demonstrates the principles well. In real life, the send & receive operations typically run on different (virtual) threads.)
select either immediately completes one of the clauses if data is available (as in the example above) or blocks until some clause can be completed. What's crucial is that exactly one clause will be completed, and no more! Hence, if two clauses become completable at the same time, only one element will be received from one of the channels.
selects are biased towards clauses that appear first. In the example above, both ch2 and ch3 can be immediately selected, but it's the value from ch2 that is received, as the clause comes first in the list.
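The blocking case can be sketched with a second (virtual) thread that sends after a short delay. This is a minimal sketch that only uses the calls shown in the example above (the Channel constructor, send, receiveClause and select); the class and method names and the 100 ms delay are made up for illustration:

```java
import com.softwaremill.jox.Channel;
import static com.softwaremill.jox.Select.select;

class BlockingSelectDemo {
    static Object receiveFromEither() throws InterruptedException {
        var ch1 = new Channel<Integer>(3);
        var ch2 = new Channel<Integer>(3);

        // a separate virtual thread sends to ch2 after a short delay
        Thread.startVirtualThread(() -> {
            try {
                Thread.sleep(100);
                ch2.send(7);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // both channels are (almost certainly) still empty here, so select
        // blocks until the sender thread makes the ch2 clause completable
        var received = select(
            ch1.receiveClause(),
            ch2.receiveClause()
        );
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        // prints: Received: 7
        System.out.println("Received: " + receiveFromEither());
    }
}
```

Nothing is ever sent to ch1, so the only clause that can complete is the one for ch2, regardless of how the two threads are scheduled.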
More features of selects
Apart from receives, you can also select exactly one clause to complete from a number of sends (or mix and match send and receive clauses). Again, a simple example:
import com.softwaremill.jox.Channel;
import static com.softwaremill.jox.Select.select;

class Demo {
    public static void main(String[] args) throws InterruptedException {
        var ch1 = new Channel<Integer>(1);
        var ch2 = new Channel<Integer>(1);

        ch1.send(12); // buffer is now full

        var sent = select(
            ch1.sendClause(13, () -> "first"),
            ch2.sendClause(25, () -> "second")
        );

        // prints: Sent: second
        System.out.println("Sent: " + sent);
    }
}
We're trying to send a value either to ch1 or ch2. However, the buffer of ch1 is full, so the second clause will be selected.
Here, we're also providing an optional callback so that the select can return a value that can be used to differentiate the selected clause (if that's important). Similarly, callbacks that transform values received by receiveClause can be specified.
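A receive with a transforming callback might look as follows. This is a sketch under an assumption: that receiveClause has an overload accepting a function from the received value to the result, mirroring the sendClause callback shown above. The exact signature should be checked against the jox version in use; the class and method names are made up for illustration.

```java
import com.softwaremill.jox.Channel;
import static com.softwaremill.jox.Select.select;

class CallbackSelectDemo {
    static Object receiveTagged() throws InterruptedException {
        var ch1 = new Channel<Integer>(1);
        var ch2 = new Channel<Integer>(1);

        ch2.send(25); // only ch2 holds a value

        // assumption: receiveClause accepts a callback that transforms the received value
        var received = select(
            ch1.receiveClause(v -> "first: " + v),
            ch2.receiveClause(v -> "second: " + v)
        );
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        // with the semantics described above, this should print: Received: second: 25
        System.out.println("Received: " + receiveTagged());
    }
}
```

Tagging the result this way lets the caller tell which channel the value came from, even when both channels carry the same element type.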
If no clause can be completed immediately and the select would block, a default clause can be provided; it prevents the blocking, and its value is returned instead:
import com.softwaremill.jox.Channel;
import static com.softwaremill.jox.Select.defaultClause;
import static com.softwaremill.jox.Select.select;

class Demo {
    public static void main(String[] args) throws InterruptedException {
        var ch1 = new Channel<Integer>(3);
        var ch2 = new Channel<Integer>(3);

        var received = select(
            ch1.receiveClause(),
            ch2.receiveClause(),
            defaultClause(52)
        );

        // prints: Received: 52
        System.out.println("Received: " + received);
    }
}
selects enable a lot of interesting communication patterns between virtual threads. Go might be an initial inspiration, but once the groundwork for a feature-full implementation of channels in Java is done, we hope to research how channels can be best used in real-world JVM applications. For sure, we'll report back!
Disclaimer on benchmarks
Ron Pressler, the lead of Project Loom, noted in a comment to the article announcing jox that micro-benchmarks should be ignored unless you're an expert in the implementation of the JVM/virtual threads. And I think neither Ron nor I would qualify as such ;).
The more complex benchmarks that we've added to jox confirm what Ron says: in the presence of a higher number of concurrently running virtual threads, some of the observed differences disappear, some previously slower implementations become (comparatively) faster, and some implementations require different tuning parameters to achieve higher performance.
That's why we hope to work more on adding non-micro benchmarks (in addition to the chain one, discussed below), which are closer to real-world scenarios.
However, the micro-benchmarks that use only a single channel or queue are still helpful for spotting regressions (if some change makes a micro-benchmark 2x as slow as before, it's worth investigating). They are also helpful for comparing against other implementations (including Kotlin's and Java's built-in ones) to get a baseline for the performance that you might expect.
And as with all other matters—if you have more benchmark ideas, just open an issue!
Performance: selects
With the disclaimer out of the way, let's compare the performance of direct send-receive invocations with the performance of similar operations wrapped in a select. That is, we'll look at the overhead that adding a select imposes. We'll compare this with the results for Kotlin's channels and Java's built-in queues.
In the test, we're running two threads/coroutines, where one thread is sending values, and the other calls either channel.receive() or select(channel.receiveClause()). Each bar represents the average amount of time it takes to complete the transmission of a single element, that is, to perform a single send-receive pair:
Using select adds a 2x overhead: not very dramatic, but it certainly could use some improvements.
What happens when we select from two channels? Again, each bar represents a single-element transmission. Comparing the Java and Kotlin Channel implementations:
Again, jox receives a performance penalty (as does Kotlin), and there might be some opportunities to close the gap. To put the results in another perspective, 600 ns/op means that we can run about 1.6 million such operations in a single second, which I think is quite impressive both for Java and Kotlin!
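Converting a per-operation latency into a throughput figure is a one-line calculation: there are 10^9 nanoseconds in a second, so the number of operations per second is 10^9 divided by the ns/op value. A minimal sketch (the class and method names are made up for illustration):

```java
class Throughput {
    // Converts an average latency in nanoseconds per operation
    // into the number of operations that fit into one second.
    static long opsPerSecond(double nanosPerOp) {
        return Math.round(1_000_000_000.0 / nanosPerOp);
    }

    public static void main(String[] args) {
        // 600 ns/op: prints 1666667, i.e. roughly 1.67 million operations per second
        System.out.println(opsPerSecond(600));
    }
}
```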
Performance: chains
To bring the benchmarks to a more real-world setting, let's compare the test results where we send values through a chain of channels. There are n channels and n+1 threads. The first thread sends values, threads 2..n receive a value from the previous channel and send it to the next, and thread n+1 receives values.
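The shape of such a chain can be sketched using only the JDK, with ArrayBlockingQueue standing in for the channels and virtual threads for the workers. This is an illustration of the topology described above, not the actual jox benchmark code; the class, interface and method names are made up, and there's no timing or warm-up logic:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class ChainSketch {
    interface Step { void run() throws InterruptedException; }

    // Starts a virtual thread running the given (interruptible) step.
    static Thread spawn(Step step) {
        return Thread.startVirtualThread(() -> {
            try {
                step.run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    // Sends 0..count-1 through a chain of n queues (n >= 1) using n+1 threads
    // and returns the sum of the values that arrive at the end of the chain.
    static long run(int n, int count) throws InterruptedException {
        List<BlockingQueue<Integer>> queues = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            queues.add(new ArrayBlockingQueue<>(100));
        }
        List<Thread> threads = new ArrayList<>();

        // thread 1: sends values into the first queue
        var first = queues.get(0);
        threads.add(spawn(() -> {
            for (int v = 0; v < count; v++) first.put(v);
        }));

        // threads 2..n: receive from the previous queue, send to the next
        for (int i = 1; i < n; i++) {
            var from = queues.get(i - 1);
            var to = queues.get(i);
            threads.add(spawn(() -> {
                for (int k = 0; k < count; k++) to.put(from.take());
            }));
        }

        // thread n+1: receives values from the last queue
        var last = queues.get(n - 1);
        var sum = new AtomicLong();
        threads.add(spawn(() -> {
            for (int k = 0; k < count; k++) sum.addAndGet(last.take());
        }));

        for (Thread t : threads) t.join();
        return sum.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 0 + 1 + ... + 999 = 499500, so this prints: 499500
        System.out.println(run(10, 1000));
    }
}
```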
First, let's look at rendezvous channels (buffer of size 0), with a chain of 100, 1k, and 10k threads, contrasted with a similar chain constructed in Kotlin and Java using SynchronousQueue.
Again, a single operation means a send-receive pair:
And for buffered channels, contrasted with Kotlin's channels and Java's ArrayBlockingQueues (buffer of size 100):
As in the previous benchmarks, Kotlin has a lead over the Java implementation. My guess remains that this is due to differences in how coroutines / virtual threads are being scheduled to be run on platform threads, plus the (still) smaller overhead of switching between coroutines compared to switching between virtual threads.
Again, putting things into perspective: an (amortized) cost of 6 ns per send-receive pair is about 166 million such operations per second!
In the same Reddit discussion, Ron states that the Java implementation is, in fact, more performant than Kotlin's, and the observed differences are due to our tests not reflecting real-world usage. Which might mean that we indeed need better tests!
But what's even more interesting is that Java's built-in collections (SynchronousQueue and ArrayBlockingQueue) not only catch up but also (slightly) surpass our Channel implementation in performance. They spread their wings under higher contention!
Further work
A couple of useful features are still missing before leveraging jox in some real-world applications, e.g., as the underlying channels implementation in ox. Some things that we'd like to implement:
- only completing select with a Done when all channels are Done
- separating Sink & Source interfaces
- supporting channel views to implement map and filter
Moreover, a couple of optimizations have been suggested by the community, which we definitely have to try out. Who knows, maybe we can further improve the benchmarks?
If you have further suggestions or feedback, both our community forum and GitHub await. The project is available under the Apache2 license on Maven Central!
<dependency>
    <groupId>com.softwaremill.jox</groupId>
    <artifactId>core</artifactId>
    <version>0.0.4</version>
</dependency>