Principles of developing applications in Scala
Scala, a statically-typed functional programming language, has been on the market for almost 20 years. During that time, a couple of mainstream approaches have evolved when it comes to writing Scala code. These include using effect systems, techniques of asynchronous code execution, and leveraging various libraries and frameworks.
While these approaches are quite diverse, there's a set of principles that are shared among all of them. In other words, when writing applications in Scala, whatever your favorite stack, there are certain things you often do, and certain things you never do.
These principles emerge from three pillars: the language itself, the standard library, and the community. Let's look at a couple of such practices in detail, in some cases contrasting the Scala approach with how applications are written in the Java ecosystem.
Immutable data structures
One of the most fundamental and impactful properties of code written using Scala is that by default, data structures are immutable. There are a number of benefits of such an approach (and some downsides, as well):
- data can't be modified "somewhere else" by "somebody else"
- since state external to a function cannot change, code is easier to understand; this is also called local reasoning
- resiliency to data corruption as concurrent data modification is not possible
- greater thread-safety and simpler concurrent programming in general
Other programming platforms share the preference for immutable data structures when it comes to concurrent programming. The fact that it's the default in Scala might be one of the reasons why concurrent programming is one of Scala's niches.
Scala supports immutable data structures on many levels. First, we've got constructs right in the language, such as case classes, class fields which are vals (not vars) by default, enums (in Scala 3), and ADTs (sealed families of classes). Another important aspect is the ease of creating modified data structures using the auto-generated copy method.
But maybe even more importantly, the standard library is built around immutable data structures: if you need to use a Map, you'll get an immutable version by default.
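The language-level and library-level support described above can be sketched in a few lines of Scala 3. The User and Status types are hypothetical, made up purely for illustration:

```scala
// Hypothetical domain model, used only for illustration.
enum Status:
  case Active, Suspended

// Constructor parameters of a case class are immutable vals by default.
final case class User(name: String, email: String, status: Status)

val alice = User("Alice", "alice@example.com", Status.Active)

// copy creates a modified instance; alice itself is left untouched.
val suspended = alice.copy(status = Status.Suspended)

// Map without any import is scala.collection.immutable.Map.
val ages = Map("Alice" -> 30)

// Adding an entry returns a new map; the original still has one entry.
val moreAges = ages + ("Bob" -> 25)
```

Every "modification" here yields a new value, so any code still holding alice or ages keeps seeing exactly what it saw before.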
Finally, most Scala libraries out there consume and produce immutable data structures. It's simply the way of handling data in Scala. Of course, you can still use mutable data structures if needed. However, that's the exception rather than the rule.
Nothing stops you from using immutable data structures on other platforms, including node.js, Java or .NET. However, in these cases, the language support is limited. More importantly, the standard libraries and widely used community libraries are based on mutable data structures. Immutability, even if possible, might prove highly impractical.
For example, almost all Java libraries use mutable collection interfaces from the standard library. If you're using a Java library, almost certainly, it's going to be using standard, mutable collections. That's something that is going to be almost impossible to change: Java will remain a mutability-first language.
Expressions and values
A feature of Scala that might not stand out at first but quickly becomes indispensable is that almost everything in Scala is an expression: if, match, try, and code blocks all evaluate to a value. Once you get used to it, writing in a language where statements and expressions are separated will feel cumbersome and unnecessarily limiting.
This is a testament to Scala's flexibility and versatile syntax. In terms of the three pillars that we have defined in the beginning, expression-orientation is a purely language-level trait.
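As a small illustration of expression-orientation, here is a sketch in Scala 3 syntax. The function names are made up for the example:

```scala
// if yields a value, so no temporary mutable variable is needed.
def sign(n: Int): String =
  if n < 0 then "negative" else if n == 0 then "zero" else "positive"

// match is an expression as well.
def describe(xs: List[Int]): String = xs match
  case Nil      => "empty"
  case x :: Nil => s"one element: $x"
  case _        => s"${xs.size} elements"

// try evaluates to either the parsed number or the fallback.
def parseOrZero(s: String): Int =
  try s.toInt
  catch case _: NumberFormatException => 0

// A block evaluates to its last expression.
val total: Int =
  val a = parseOrZero("40")
  val b = parseOrZero("oops")
  a + b + 2
```

In a statement-oriented language, each of these would typically need a variable declared up front and assigned in every branch.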
A related practice in Scala is representing various concepts as values, starting with functions: the syntax to define a function value is very lightweight, so it just feels natural to do in Scala. This is reinforced by the standard library, which uses higher-order functions (that is, functions that take other functions as parameters or return functions), and by the whole ecosystem, which follows the same approach.
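To make this concrete, here's a minimal sketch of functions as values. The names timesTwo, increment and twice are hypothetical; map, filter and andThen come from the standard library:

```scala
// Function values: lightweight syntax, assignable like any other value.
val timesTwo: Int => Int = x => x * 2
val increment: Int => Int = _ + 1

// Function values compose into new function values.
val timesTwoThenIncrement: Int => Int = timesTwo.andThen(increment)

// A higher-order function: it takes a function and returns a function.
def twice[A](f: A => A): A => A = a => f(f(a))

// Standard-library collections take functions as parameters.
val result: List[Int] =
  List(1, 2, 3).map(timesTwoThenIncrement).filter(_ > 3)
```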
But it doesn't stop at functions. One of Scala's main strengths is its flexibility in defining abstractions. This often translates into the ability to represent various concepts as values. Maybe somewhat surprisingly, adding such a level of indirection opens interesting new possibilities.
As a prime example, let's take functional effect systems, such as ZIO or cats-effect. There, the entire computation is represented as a value. That way, we get lazy and controlled evaluation of effectful code. This, in turn, combined with a custom runtime, enables declarative concurrency, implementing light-weight threads with principled interruptions or fearless refactoring.
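The idea of "a computation as a value" can be shown with a toy sketch. To be clear, Task below is a hypothetical, deliberately naive type written for this example; it is not how ZIO or cats-effect are implemented, and it has none of their concurrency, interruption, or error-handling features. A mutable counter is used only to make the laziness observable:

```scala
// A toy description of a computation (hypothetical, for illustration only).
final case class Task[A](run: () => A):
  def map[B](f: A => B): Task[B] = Task(() => f(run()))
  def flatMap[B](f: A => Task[B]): Task[B] = Task(() => f(run()).run())

object Task:
  // The argument is passed by name, so it is not evaluated here.
  def delay[A](a: => A): Task[A] = Task(() => a)

// Mutable on purpose: it lets us observe when side effects happen.
var counter = 0

// Building the value runs nothing; counter stays at 0 until run() is called.
val program: Task[Int] =
  for
    _ <- Task.delay { counter += 1 }
    n <- Task.delay(counter * 10)
  yield n
```

Since program is only a description, it can be passed around, combined with other descriptions, and run any number of times; each call to program.run() executes the side effects again.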
Another example from my own back-yard is tapir and sttp client. Both deal with the HTTP domain; in both, you first create a value representing an endpoint or an HTTP request. This value is, of course, immutable, which allows incremental refinement, as well as separating concerns: the networking layer from the domain layer.
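The following sketch shows the general shape of this approach with a made-up Request type. It is not the sttp client or tapir API; the type, its methods, and the URL are all assumptions for the sake of illustration:

```scala
// Hypothetical request description; NOT the sttp client or tapir API.
final case class Request(
    method: String,
    url: String,
    headers: Map[String, String] = Map.empty,
    body: Option[String] = None
):
  def header(name: String, value: String): Request =
    copy(headers = headers + (name -> value))
  def withBody(content: String): Request = copy(body = Some(content))

// A shared base value, refined incrementally without being modified.
val base = Request("GET", "https://example.com/api/users")
  .header("Accept", "application/json")

val create = base
  .copy(method = "POST")
  .header("Content-Type", "application/json")
  .withBody("""{"name":"Alice"}""")

// The networking layer is a separate function that interprets the value.
// Here it only renders the request line; a real backend would send it.
def render(r: Request): String = s"${r.method} ${r.url}"
```

The domain code only builds and refines values such as base and create; what it means to "send" one is decided elsewhere.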
Macros, not reflection
The third principle stems directly from Scala's focus on correctness via statically checked types. Again, we will mostly contrast the "Scala approach" with what Java offers.
Scala is very flexible when defining abstractions, but this will only get you so far. At some point, you'll face either writing large chunks of boilerplate code or somehow getting the code generated for you. Examples include generating JSON encoders/decoders, wiring the object graph using dependency injection, or mapping classes to database tables.
The question is: how do we define this code-generation process? What language are we using, and how can users define what should be generated?
In Java, such tasks are usually accomplished using annotations, which are then read using reflection at run-time, using a framework such as Spring. This framework then acts as an interpreter, generating bytecode or instantiating predefined classes. It has been my long-standing opinion that annotations are abused in Java and, in fact, create a parallel, limited, and unsafe programming language, with underspecified, ad-hoc interpreters.
What's the alternative, then? In Scala, the answer is generating code at compile time. In its most common form, type-directed term derivation is known as implicits (in Scala 2) or givens (in Scala 3). Just as the IDE can often infer types for a piece of code, the Scala compiler can infer code based on a type. The "developer experience" around this feature has been improved in Scala 3; for a good overview, take a look at this presentation by Magda Stożek.
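Here's a minimal sketch of type-directed derivation using Scala 3 givens. The Show type class and the display function are hypothetical, defined only for this example:

```scala
// A hypothetical type class describing how to render a value as text.
trait Show[A]:
  def show(a: A): String

object Show:
  given intShow: Show[Int] with
    def show(a: Int): String = a.toString

  given stringShow: Show[String] with
    def show(a: String): String = s"'$a'"

  // An instance for List[A] is derived from the instance for A.
  given listShow[A](using element: Show[A]): Show[List[A]] with
    def show(as: List[A]): String =
      as.map(element.show).mkString("[", ", ", "]")

// The compiler assembles the required instance from the argument's type.
def display[A](a: A)(using s: Show[A]): String = s.show(a)

val nested: String = display(List(List(1, 2), List(3)))
```

For the last line, the compiler constructs a Show[List[List[Int]]] by combining listShow twice with intShow. If no instance can be built, for example for a List[Double], the code doesn't compile, rather than failing at run-time.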
But Scala doesn't stop there: the direct replacements for annotation-based code generation are inlines and macros. They allow generating Scala code at compile time, in a type-safe way, while using the full Scala language to define the process and seamlessly integrating with other Scala features.
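As a taste of compile-time code generation, the sketch below uses Scala 3's inline together with scala.compiletime and scala.deriving.Mirror to obtain the field names of a case class. It is not a full quote-based macro, and the names labels, fieldNames and User are assumptions made for the example:

```scala
import scala.compiletime.{constValue, erasedValue}
import scala.deriving.Mirror

// Walks a tuple of literal string types at compile time, turning each
// type-level label into a run-time String.
inline def labels[T <: Tuple]: List[String] =
  inline erasedValue[T] match
    case _: EmptyTuple     => Nil
    case _: (head *: tail) => constValue[head].toString :: labels[tail]

// The compiler supplies the Mirror for any case class; no reflection is used.
inline def fieldNames[A](using m: Mirror.ProductOf[A]): List[String] =
  labels[m.MirroredElemLabels]

// Hypothetical case class used to try it out.
final case class User(name: String, age: Int)

val userFields: List[String] = fieldNames[User]
```

From the caller's perspective, fieldNames[User] is an ordinary method call. Calling it for a type that is not a case class is rejected by the compiler, because no Mirror.ProductOf instance is available.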
Annotations can still be used to guide the macro and provide additional meta-data—but that's only one of the options; inspecting the types and terms passed to the macro is another. There are numerous examples of macro usages in Scala, for example, the jsoniter-scala library or macwire for compile-time dependency injection.
Macros are often cited as being hard to understand. And while writing a macro might indeed be far from trivial, the same holds, I'd argue to an even greater degree, for writing annotation processors or bytecode generators.
Using macros is a different story; it's usually no different from calling a method. The fact that error messages are not always perfect is a problem, but that's a matter of maturing the ecosystem, not a fundamental issue of the entire approach.
While reflection-based code generation, guided by annotations, is still predominant in the Java ecosystem, there are signs that it's also moving toward compile-time generation. Hibernate's metamodel is one example (it uses annotation processors); Kotlin's serialization plugin (https://github.com/Kotlin/kotlinx.serialization) is another. However, the low expressiveness of annotations as a sub-language remains a problem.
Explicit when necessary, implicit when not
The last principle doesn't stem directly from the design of Scala or the standard library but rather from the "culture" of functional programming in general and Scala's community in particular.
We often see the temptation to introduce various forms of "magic" in almost every platform. The intentions are good: we want automation and things to happen automatically. However, problems start when, as users of this "magic", we cannot pinpoint and understand what will happen, when, and why. This is at odds with the predictability and explorability of our codebases, both of which are significant factors in delivering maintainable software.
A prime example of such "magic" is classpath scanning and "auto-discovery", which conditionally enables some components, depending on their mere presence on the classpath. This is almost entirely absent in Scala code.
Would you like your operating system behavior to be modified based only on the fact that you downloaded something from the Internet? Yet, that's what classpath-scanning & auto-instrumentation magic is doing. In Scala, we prefer explicitly enabling features, even if it's three more lines of code.
Another example is that Scala applications always explicitly define main methods. For some reason, Java code often does not have such a clearly defined entry point, instead relying on our good old friends: annotations and reflection. Maybe it's Java that's unique here, or maybe it's Scala that is bringing some sanity to the situation; I'm not sure. But for certain, there's no reason to fear the main.
Where is this coming from? When writing Scala code, or generally using the "functional" style, we work with functions that might accept other functions as parameters and return values that we pass further to other functions. This, after all, is what functional programming is about: treating functions as first-class values.
Consequently, we can directly follow code paths—using the simple yet very effective method of "go-to definition" in our IDEs.
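An explicit entry point with hand-wired dependencies might look like the sketch below (Scala 3 syntax). All the component names are hypothetical; the point is only that every dependency is created and passed in plain sight:

```scala
// Hypothetical components, wired by hand with plain constructors.
final case class Config(greeting: String)

class Greeter(config: Config):
  def greet(name: String): String = s"${config.greeting}, $name!"

class Application(greeter: Greeter):
  def run(names: List[String]): List[String] = names.map(greeter.greet)

// Every dependency is created and passed explicitly; nothing is
// discovered by scanning the classpath.
def wire(config: Config): Application = Application(Greeter(config))

// The entry point: what gets enabled, and in which order, is visible here.
@main def start(names: String*): Unit =
  wire(Config("Hello")).run(names.toList).foreach(println)
```

Starting from start, "go-to definition" leads through wire to Application and Greeter without any detours through reflection or configuration files.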
None of the traits described above are universally unique to Scala. It's their combination that provides such a good mix. It's not only about language design; defaults matter, as is shown by immutable data structures and immutable collections in the standard library.
Having a regular, surprisingly small grammar in an expression-oriented language opens up new possibilities—for example, to succinctly represent concepts as values.
Extending the compilation process using type-safe code generation is more efficient, secure, and predictable compared to run-time annotation processing.
Finally, being explicit as to what happens when and why in our code increases the explorability and understandability of our codebases, even when it's necessary to navigate library-provided infrastructure code.
This all rests on three pillars: the Scala language, the standard library, and the community-provided libraries and frameworks. What would you add to the list of principles when developing code using functional programming and Scala?
Reviewed by Michał Matłoka.