Fast number parsing in Scala
Every now and then, we’ve got to parse a String into an Int or Double. With Scala, that’s easy: given a value: String, just call value.toInt. Couldn’t be simpler!
However, what happens if the value is not an integer? Well, a java.lang.NumberFormatException will be thrown. In fact, the .toInt method simply calls Java’s Integer.parseInt, which does exactly the same.
Being Scala programmers, we don’t use exceptions often, if at all, and definitely not for control flow. That’s why we usually catch the exception that might occur during parsing and convert the result to an Either or Option. Hence, typical number parsing looks as follows:
Try(value.toInt).toOption
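For completeness, here is a minimal, self-contained sketch of that pattern in both its Option and Either flavours. The object name, the field names and the sample input are made up for illustration; only Try, toOption and toEither come from the standard library.

```scala
import scala.util.Try

object TryParsingSketch {
  // Hypothetical input, chosen so that parsing fails
  val value: String = "abc"

  // Option flavour: the NumberFormatException is caught and discarded
  val asOption: Option[Int] = Try(value.toInt).toOption

  // Either flavour: the caught exception is kept on the Left side
  val asEither: Either[Throwable, Int] = Try(value.toInt).toEither
}
```

Either way, the exception is still created and thrown inside .toInt before Try gets a chance to catch it.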
What can go wrong? Well, if the number parsing is on your code’s hot path (that is, it is called frequently) and the value is often not a number (that is, parsing fails), you’ll see really poor performance, to the degree that this might be the decisive factor in the latency of your application.
The problem is that every time parsing fails, an exception is thrown, which means capturing a stack trace, and that is a slow operation. As the name suggests, exceptions should be used for exceptional cases, not the regular case.
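To see that the stack trace really is part of every ordinary exception, here is a rough sketch that compares a regular NumberFormatException with one that mixes in scala.util.control.NoStackTrace. The object and field names are made up for illustration, and the sketch assumes default JVM and Scala settings (no flags that alter stack trace handling).

```scala
import scala.util.control.NoStackTrace

object StackTraceSketch {
  // A regular exception records the current call stack when it is constructed
  val regular = new NumberFormatException("not a number")

  // Mixing in NoStackTrace skips that step by default, so the trace stays empty
  val stackless = new NumberFormatException("not a number") with NoStackTrace

  // Number of recorded frames: non-zero for the regular one, zero for the stackless one
  val regularFrames: Int = regular.getStackTrace.length
  val stacklessFrames: Int = stackless.getStackTrace.length
}
```

This does not measure anything; it only shows where the extra work comes from. The exception thrown by .toInt is of the regular kind.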
What’s the alternative? It turns out that Scala 2.13 introduced a new .toIntOption method, which is a completely different implementation from .toInt. No exceptions are involved and the whole process is a lot faster. How much faster? Let’s check!
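Before measuring, a quick sketch of how .toIntOption (and its sibling .toDoubleOption) behaves on a few inputs. The object and field names are only illustrative, and it needs Scala 2.13 or newer.

```scala
object ToIntOptionSketch {
  val ok: Option[Int]          = "42".toIntOption         // Some(42)
  val negative: Option[Int]    = "-7".toIntOption         // Some(-7)
  val notANumber: Option[Int]  = "abc".toIntOption        // None
  val empty: Option[Int]       = "".toIntOption           // None
  val tooLarge: Option[Int]    = "2147483648".toIntOption // None: exceeds Int.MaxValue
  val asDouble: Option[Double] = "1.5".toDoubleOption     // Some(1.5)
}
```

Invalid input simply yields None, with no exception to catch.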
To run the tests, I’ve used the sbt-jmh plugin, running on my 2018 i9 MBP, using both Java 11 and Java 17 from Azul Zulu. We are testing three ways of parsing a number, each falling back to 0 if parsing fails (and it does fail in the test):
- using the Try() wrapper
- using a traditional try-catch, in case it was the Try that is adding overhead, and not the exception-throwing
- using .toIntOption
Here’s the code:
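import org.openjdk.jmh.annotations.Benchmark
import scala.util.Try

class IntParsing {
  // Parsing wrapped in scala.util.Try
  @Benchmark
  def usingTry: Int = Try("abc".toInt).getOrElse(0)

  // Plain try-catch, to rule out overhead coming from Try itself
  @Benchmark
  def usingCatch: Int =
    try "abc".toInt
    catch {
      case _: Exception => 0
    }

  // Exception-free parsing (Scala 2.13+)
  @Benchmark
  def usingOption: Int = "abc".toIntOption.getOrElse(0)
}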
The results are more than surprising! The .toIntOption approach is more than 500x faster than the other two (about 567x faster than try-catch and about 592x faster than Try):
[info] Benchmark               Mode  Cnt          Score         Error  Units
[info] IntParsing.usingTry     thrpt   25     712466.231 ±    3966.962  ops/s
[info] IntParsing.usingCatch   thrpt   25     743692.743 ±   41825.919  ops/s
[info] IntParsing.usingOption  thrpt   25  421541301.290 ± 3037535.459  ops/s
The difference is so huge that I’m sure I’ve made some methodological mistake in the test setup; please correct me :) (I did check a variant where the parsed value is not a constant string, but a constant computed at run-time into a val, with the same results). But both Internet wisdom and real-life verification confirm that, even if not as staggering, the difference in performance between Try(value.toInt) and value.toIntOption is really huge.
As a fun exercise, let’s see what happens when we turn off stack traces entirely using -XX:-StackTraceInThrowable, as suggested in a scala-contributors thread on this subject:
[info] Benchmark               Mode  Cnt          Score         Error  Units
[info] IntParsing.usingTry     thrpt   25    5213565.928 ±  254051.145  ops/s
[info] IntParsing.usingCatch   thrpt   25    5591556.684 ±  241780.699  ops/s
[info] IntParsing.usingOption  thrpt   25  456207304.570 ± 8674732.617  ops/s
There’s still a difference, but a much smaller one: “only” about an 81x improvement over try-catch (and about 87x over Try).
Summing up: don’t use the exception-throwing .toInt variant to parse a number unless you’re just experimenting in the REPL. Scala 2.13 brings you a safer (no exceptions involved!) and much faster .toIntOption (or .toDoubleOption) variant.
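If your code currently relies on Try(value.toInt), the switch can look roughly like the sketch below. Both helpers and the error message are hypothetical, made up for this example; only toIntOption, getOrElse and toRight come from the standard library.

```scala
object SafeParsing {
  // Hypothetical helper: parse, or fall back to a default, with no exception involved
  def parseIntOr(value: String, default: Int): Int =
    value.toIntOption.getOrElse(default)

  // Hypothetical helper: keep an error message on the Left when parsing fails
  def parseIntEither(value: String): Either[String, Int] =
    value.toIntOption.toRight(s"Not an integer: $value")
}
```

For example, SafeParsing.parseIntOr("abc", 0) gives 0, and SafeParsing.parseIntEither("12") gives Right(12).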