Fast number parsing in Scala


Every now and then, we’ve got to parse a String into an Int or Double. With Scala, that’s easy — given a value: String, just call value.toInt. Couldn't be simpler!

However, what happens if the value is not an integer? Well, a java.lang.NumberFormatException will be thrown. In fact, the .toInt method simply delegates to Java’s Integer.parseInt, which throws that very exception on invalid input.

Being Scala programmers, we don’t use exceptions often, if at all, and definitely not for control flow. That’s why we usually catch the exception that might occur during parsing and convert the result to an Either or Option. Hence, typical number-parsing code looks like this:

Try(value.toInt).toOption
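To make the pattern concrete, here is a minimal sketch of both conversions mentioned above. The object name, the helper names and the error-message format are made up for illustration; only Try, toOption and toEither come from the standard library.

```scala
import scala.util.Try

object ExceptionBasedParsing {
  // Hypothetical helper: swallow the NumberFormatException and return None
  def parseOption(value: String): Option[Int] =
    Try(value.toInt).toOption

  // Hypothetical helper: keep a description of the failure on the Left side
  def parseEither(value: String): Either[String, Int] =
    Try(value.toInt).toEither.left.map(e => s"not an integer: ${e.getMessage}")
}
```

Both helpers still throw and catch an exception internally whenever the input is not a valid integer; they only hide that fact from the caller.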

What can go wrong? Well, if the number-parsing is on your code’s hot path (that is, called frequently) and the value is often not a number (that is, parsing fails), you’ll see really poor performance, to the degree that it might become the decisive factor in the latency of your application.

The problem is that every time parsing fails, an exception is thrown, and this involves capturing the stack trace, which is a slow operation. As the name suggests, exceptions should be used for exceptional cases, not for the regular case.
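A small sketch can make the hidden cost visible. Assuming default JVM settings, every failed parse produces an exception that carries a captured stack trace; the object and method names below are hypothetical.

```scala
import scala.util.Try

object StackTraceDemo {
  // Hypothetical helper: number of stack frames recorded by a failed parse,
  // or None when parsing succeeds and no exception is created at all
  def failedParseTraceDepth(value: String): Option[Int] =
    Try(value.toInt).failed.toOption.map(_.getStackTrace.length)
}
```

With default settings the depth for an input like "abc" should be greater than zero, since the trace covers the whole call chain down to Integer.parseInt. That work is repeated on every single failure.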

What’s the alternative? Turns out Scala 2.13 introduced a new .toIntOption method, which returns an Option[Int] and is implemented completely differently from .toInt. No exceptions are involved and the whole process is a lot faster. How much faster? Let’s check!
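Before measuring, here is a short sketch of how the Option-returning parsers behave on a few inputs. The object name and the parseOrElse helper are made up for illustration; .toIntOption and .toDoubleOption are the standard library methods available since Scala 2.13.

```scala
object OptionParsingExamples {
  val valid: Option[Int] = "42".toIntOption                 // Some(42)
  val invalid: Option[Int] = "abc".toIntOption              // None, no exception thrown
  val overflow: Option[Int] = "2147483648".toIntOption      // None: one past Int.MaxValue
  val fractional: Option[Double] = "1.5".toDoubleOption     // Some(1.5)

  // Hypothetical helper: parse, or fall back to a caller-supplied default
  def parseOrElse(value: String, default: Int): Int =
    value.toIntOption.getOrElse(default)
}
```

For example, OptionParsingExamples.parseOrElse("abc", 0) yields 0, which is the same fallback behaviour the benchmark exercises.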

To run the tests, I’ve used the sbt-jmh plugin, running on my 2018 i9 MBP, using both Java 11 and Java 17 from Azul Zulu. We are testing three ways to parse a number, each falling back to 0 if parsing fails (which it always does in the test):

  1. using the Try() wrapper
  2. using a traditional try-catch, in case it’s the Try wrapper that adds the overhead rather than the exception-throwing
  3. using .toIntOption

Here’s the code:

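import org.openjdk.jmh.annotations.Benchmark

import scala.util.Try

class IntParsing {
  // Variant 1: wrap the throwing call in Try and fall back to 0
  @Benchmark
  def usingTry: Int = Try("abc".toInt).getOrElse(0)

  // Variant 2: plain try-catch, to rule out overhead from Try itself
  @Benchmark
  def usingCatch: Int =
    try "abc".toInt
    catch {
      case _: Exception => 0
    }

  // Variant 3: the exception-free parser added in Scala 2.13
  @Benchmark
  def usingOption: Int = "abc".toIntOption.getOrElse(0)
}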

The results are more than surprising! The .toIntOption approach is more than 500x faster than the other two (roughly 570x to 590x, going by the throughput numbers):

[info] IntParsing.usingTry     thrpt   25     712466.231 ±    3966.962  ops/s
[info] IntParsing.usingCatch   thrpt   25     743692.743 ±   41825.919  ops/s
[info] IntParsing.usingOption  thrpt   25  421541301.290 ± 3037535.459  ops/s

The difference is so huge that I’m sure I’ve made some methodological mistake in the test setup; please correct me :) (I did check a variant where the parsed value is not a string literal but a value computed at run time and stored in a val, with the same results.) Still, both Internet wisdom and real-life verification confirm that, even if not quite as staggering, the performance difference between Try(value.toInt) and value.toIntOption is really huge.

As a fun exercise, let’s see what happens when we turn off stack traces entirely using -XX:-StackTraceInThrowable, as suggested in a scala-contributors thread on this subject:

[info] IntParsing.usingTry     thrpt   25    5213565.928 ±  254051.145  ops/s
[info] IntParsing.usingCatch   thrpt   25    5591556.684 ±  241780.699  ops/s
[info] IntParsing.usingOption  thrpt   25  456207304.570 ± 8674732.617  ops/s

There’s still a difference, but a much smaller one: “only” an 81x to 88x improvement, depending on which exception-based variant you compare against.
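For anyone who wants to recheck the ratios, here is a minimal sketch that divides the reported throughput figures (ops/s, copied from the two result listings). The object and value names are made up for illustration, and error margins are ignored.

```scala
object Speedups {
  // How many times more operations per second the faster variant achieves
  def ratio(fastOpsPerSec: Double, slowOpsPerSec: Double): Double =
    fastOpsPerSec / slowOpsPerSec

  // Default JVM settings: stack traces are captured
  val optionVsTry: Double   = ratio(421541301.290, 712466.231)  // about 592
  val optionVsCatch: Double = ratio(421541301.290, 743692.743)  // about 567

  // With -XX:-StackTraceInThrowable: no stack traces
  val optionVsTryNoTrace: Double   = ratio(456207304.570, 5213565.928) // about 87.5
  val optionVsCatchNoTrace: Double = ratio(456207304.570, 5591556.684) // about 81.6

  // Gain for the Try variant alone from disabling stack traces
  val tryNoTraceVsTry: Double = ratio(5213565.928, 712466.231) // about 7.3
}
```

The last figure suggests that capturing the stack trace accounts for most, but not all, of the cost of the exception-based variants in this benchmark.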

Summing up: don’t use the exception-throwing .toInt variant to parse a number unless you’re just experimenting in the REPL. Scala 2.13 brings you a safer (no exceptions involved!) and much faster .toIntOption (or .toDoubleOption) variant.
