This post is the beginning of a series about functional software
architecture in Kotlin. Its material originally comes from a collaboration
between Active Group and Blume2000, where
we advised them on the development of their web shop. I wrote this
post together with Benedikt
Stemmildt, who was CTO at Blume2000
at the time.
This post is about data validation. We want to ensure that objects in our
program are "valid," meaning they satisfy arbitrary
consistency criteria without which our software won't function.
My colleague Marco Schneider already presented abstractions for this
in Haskell in an
earlier post.
The same ideas can also be transferred to Kotlin, with some
compromises. In this post, we'll tackle the topic anew, specifically
how functional techniques can help from an object-oriented
perspective. So it's not necessary to read the
Haskell post. (We still warmly recommend the post, particularly because it
describes the concept of Applicative, which is impractical in
Kotlin.) However, basic Kotlin knowledge is assumed.
Premise
Martin Fowler nicely describes the validation problem in his blog post
Replacing Throwing Exceptions with Notification in
Validations. He
quotes the following Java code fragment from a class whose attributes
date and numberOfSeats must have certain properties for the class
methods to work:
public void check() {
    if (date == null) throw new IllegalArgumentException("date is missing");
    LocalDate parsedDate;
    try {
        parsedDate = LocalDate.parse(date);
    }
    catch (DateTimeParseException e) {
        throw new IllegalArgumentException("Invalid format for date", e);
    }
    if (parsedDate.isBefore(LocalDate.now())) throw new IllegalArgumentException("date cannot be before today");
    if (numberOfSeats == null) throw new IllegalArgumentException("number of seats cannot be null");
    if (numberOfSeats < 1) throw new IllegalArgumentException("number of seats must be positive");
}
This code is quick to write and easy to read, but has at least two problems:
- It throws an exception when a problem occurs. Stylistically,
  however, it makes sense to use exceptions primarily for situations
  where something unexpected and unavoidable happens in the software's
  environment ("file not found"). Here, though, we're dealing with
  situations that are unpleasant but expected.
- More importantly, the method checks for multiple possible problems
  but exits at the first one, so it can only report one problem.
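Both problems go away if check collects its findings instead of throwing, which is the direction Fowler's post takes. Here is a minimal Kotlin sketch of that idea; the function checkBooking, its parameters, and the injectable today are our own assumptions for illustration, only the messages come from the fragment above:

```kotlin
import java.time.LocalDate
import java.time.format.DateTimeParseException

// Hypothetical stand-in for the class in Fowler's example: instead of throwing
// at the first problem, every failed check appends its message to a list.
fun checkBooking(
    date: String?,
    numberOfSeats: Int?,
    today: LocalDate = LocalDate.now() // injectable so the check is testable
): List<String> {
    val problems = mutableListOf<String>()
    if (date == null) {
        problems += "date is missing"
    } else {
        try {
            if (LocalDate.parse(date).isBefore(today)) problems += "date cannot be before today"
        } catch (e: DateTimeParseException) {
            problems += "Invalid format for date"
        }
    }
    if (numberOfSeats == null) {
        problems += "number of seats cannot be null"
    } else if (numberOfSeats < 1) {
        problems += "number of seats must be positive"
    }
    return problems
}
```

An empty list means that no problem was found; otherwise the caller sees all problems at once.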
In the Java ecosystem, frameworks have therefore been established that
work without exceptions and collect and report all identified
problems, such as the Hibernate
Validator. Here's a code example
for validation with such a validator, from the Kotlin code of the web
shop at Blume2000:
data class Position(
    @field:Min(1, message = "COUNT_ZERO") var count: Int,
    @field:Valid var price: Price,
    @field:Valid var product: Product
)
The things with @ are annotations that the Kotlin compiler transfers
into the object code for the Position class, where they are read by
the validator at runtime. For example, the @field:Min annotation
means that the value of the count field must be at least 1. The
@field:Valid annotation refers to the validation annotations that
are in the Price or Product class, respectively.
The validator only becomes active when explicitly called on a
Position object:
val position: Position = ...
val factory = Validation.buildDefaultValidatorFactory()
val validator = factory.getValidator()
val violations: Set<ConstraintViolation<Position>> = validator.validate(position)
So the basic procedure is as follows:
- first create an object and
- then check whether it's valid
Functional programmers' hackles rise at this approach. Why create an
invalid object at all? We're reminded of the scene in
Alien where
the crew lets the alien on board the spaceship in order to examine it afterwards. (It
doesn't end well.)
The "validator approach" also creates some very concrete problems:
- The validator annotations effectively constitute a small programming
  language that you first have to learn.
- The annotations couple the code to the validator framework.
- Valid and invalid objects have the same type, so the type system
  can't help distinguish them.
Validation in Functional Programming
Accordingly, in functional programming we pursue a different approach
to validation. The prominent functional programmer Yaron
Minsky (CTO at Jane
Street) coined the following saying:
"Make illegal states unrepresentable."
Minsky doesn't speak directly about validation but about using the
type system: In strongly typed languages, we should design our types
so that only consistent, "legal" data can be created at all.
Enforcing this with the type system alone isn't always possible, for
example with numbers that must come from a specific range. Often we
need, for instance, a type for natural (i.e., non-negative integer)
numbers, but many languages only have int as a type for integers.
To ensure that it's a non-negative number, we therefore have
to do something at runtime.
But even at runtime we can follow Minsky's credo and prevent "invalid"
objects from being created in the first place. To implement this for
the Position type from above, instead of the "official" constructor
function Position:invoke, we write a special, validating constructor
function (or "factory method" in OO speak) something like this:
class Position {
    companion object {
        fun of(count: Int, price: Price, product: Product) =
            if (count >= 1)
                ...
            else
                ...
    }
}
If we don't want to throw an exception in the else case (as in Martin
Fowler's cautionary example), we need to accommodate the validation
result in the return type of of. For this purpose, we use the Kotlin
FP library Arrow, which contains all sorts of
useful functional abstractions, particularly the type
Validated. (Unfortunately,
Arrow's documentation is somewhat elliptical at the time of this
article's writing.) Validated is defined as follows:
sealed class Validated<out E, out A>
data class Valid<out A>(val value: A) : Validated<Nothing, A>()
data class Invalid<out E>(val value: E) : Validated<E, Nothing>()
Thus a Validated value can either encapsulate a valid value of type
A with the Valid class, or with Invalid, a description of a
validation error of type E. With this we can complete of as
follows:
fun of(count: Int, price: Price, product: Product)
    : Validated<List<ValidationErrorDescription>, Position> =
    if (count >= 1)
        Valid(Position(count, price, product))
    else
        Invalid(listOf(MinViolation(count, 1)))
The type ValidationErrorDescription isn't listed here; it contains
classes with descriptions of possible validation errors. We use a
whole list of them here because multiple validation errors can occur
when constructing a single complex object.
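To see the pieces fit together, here is a self-contained sketch that compiles without Arrow. The Validated stand-in mirrors the definition above; the fields of MinViolation, the placeholder Price and Product classes, and the private constructor are assumptions on our part, not code from the web shop:

```kotlin
// Hand-rolled stand-in with the same shape as the Validated type shown above,
// so that the sketch does not depend on Arrow.
sealed class Validated<out E, out A>
data class Valid<out A>(val value: A) : Validated<Nothing, A>()
data class Invalid<out E>(val value: E) : Validated<E, Nothing>()

// Assumed error descriptions; the real ones are not listed in this post.
sealed class ValidationErrorDescription
data class MinViolation(val actual: Int, val minimum: Int) : ValidationErrorDescription()

// Price and Product are reduced to placeholders here.
data class Price(val cents: Long)
data class Product(val name: String)

// Assumption: the constructor is private, so `of` is the only way to get a Position.
class Position private constructor(val count: Int, val price: Price, val product: Product) {
    companion object {
        fun of(count: Int, price: Price, product: Product)
            : Validated<List<ValidationErrorDescription>, Position> =
            if (count >= 1)
                Valid(Position(count, price, product))
            else
                Invalid(listOf(MinViolation(count, 1)))
    }
}
```

With the constructor private, code outside the class cannot get hold of a Position that skipped the check, which is what the type system can then rely on.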
Pragmatics
In practice, however, you would make two more changes:
First, Arrow defines two extension functions, valid and invalid, that are
available on every type and simply call the constructors:
fun <A> A.valid(): Validated<Nothing, A> = Valid(this)
fun <E> E.invalid(): Validated<E, Nothing> = Invalid(this)
With these, the above code would idiomatically look like this:
fun of(count: Int, price: Price, product: Product)
    : Validated<List<ValidationErrorDescription>, Position> =
    if (count >= 1)
        Position(count, price, product).valid()
    else
        listOf(MinViolation(count, 1)).invalid()
(I personally find this confusing.)
Furthermore, most uses of Validated contain a list in the error
case, more precisely a non-empty list. Arrow provides a convenience
type alias for this:
public typealias ValidatedNel<E, A> = Validated<NonEmptyList<E>, A>
(The type NonEmptyList is also included with Arrow.)
Unfortunately, the extension functions valid and invalid don't
work unchanged for ValidatedNel; there are extra functions
validNel and invalidNel for that. With these, the constructor function of
finally looks like this:
fun of(count: Int, price: Price, product: Product)
    : ValidatedNel<ValidationErrorDescription, Position> =
    if (count >= 1)
        Position(count, price, product).validNel()
    else
        MinViolation(count, 1).invalidNel()
So much for constructing Validated. To process a Validated object,
a when is sufficient:
when (Position.of(count, price, product)) {
    is Valid -> ...
    is Invalid -> ...
}
(Validated also has a fold method, but it's not necessarily
conducive to readability.)
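For comparison, here is a small sketch of both styles side by side. It uses a hand-rolled stand-in for Validated so that it runs without Arrow; the fold shown has the same shape as Arrow's (one function for the invalid case, one for the valid case), and describeWhen and describeFold are hypothetical names:

```kotlin
// Simplified stand-in for Arrow's Validated, including a fold method.
sealed class Validated<out E, out A> {
    data class Valid<out A>(val value: A) : Validated<Nothing, A>()
    data class Invalid<out E>(val value: E) : Validated<E, Nothing>()

    // One function per case; the invalid case comes first.
    fun <R> fold(ifInvalid: (E) -> R, ifValid: (A) -> R): R = when (this) {
        is Valid -> ifValid(value)
        is Invalid -> ifInvalid(value)
    }
}

// Processing with when: each case is named explicitly.
fun describeWhen(v: Validated<List<String>, Int>): String = when (v) {
    is Validated.Valid -> "valid: ${v.value}"
    is Validated.Invalid -> "invalid: ${v.value.joinToString()}"
}

// The same logic with fold: shorter, but the reader has to remember
// which lambda belongs to which case.
fun describeFold(v: Validated<List<String>, Int>): String =
    v.fold({ errors -> "invalid: ${errors.joinToString()}" }, { value -> "valid: $value" })
```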
Anyone who looks deeper into Arrow will see that, following the Haskell
package
validation, it's
possible to use any semigroup for type E. However, this is too
cumbersome for practical use in Kotlin since the semigroup isn't
inferred automatically and has to be passed explicitly. Besides, ValidatedNel is usually sufficient
anyway.
To write tests where you already know the objects are valid, we
typically use a convenience function like this:
class ValidationException(message: String, val error: Any) : IllegalArgumentException(message)
@Throws(ValidationException::class)
fun <E, A> Validated<E, A>.get(): A =
this.valueOr { throw ValidationException("Validated expected to be valid", it as Any) }
The valueOr method is built into Arrow and either delivers the valid
value in a Validated or calls the passed function, which in this
case throws an exception.
With this we can simply write code such as this in tests:
val position: Position = Position.of(...).get()
Composition
Perhaps you've wondered why of doesn't take ValidatedNel<...,
Price> and ValidatedNel<..., Product> as arguments. After all,
Price and Product probably also need to be validated. However, this
isn't necessary since we follow the credo of not creating invalid
Price and Product objects in the first place: Price and Product
are thus implicitly already validated.
Nevertheless, we need to figure out how to combine the validations for Price and Product with that of Position. (In Haskell this happens with the help of the Applicative abstraction, which would be too cumbersome to use in Kotlin.) So let's imagine there are also factory methods for Price and Product:
class Price {
    companion object {
        fun of(...): ValidatedNel<ValidationErrorDescription, Price> = ...
    }
}
class Product {
    companion object {
        fun of(...): ValidatedNel<ValidationErrorDescription, Product> = ...
    }
}
To validate a Price and a Product object in parallel such that any errors are combined, Arrow provides a series of extension functions called zip. Each of these accepts a certain number of Validated values and applies a function to the results, if possible. For example, here's the three-argument zip:
public inline fun <E, A, B, C, Z> ValidatedNel<E, A>.zip(
    b: ValidatedNel<E, B>,
    c: ValidatedNel<E, C>,
    f: (A, B, C) -> Z
): ValidatedNel<E, Z>
This takes some getting used to because the first argument comes before the .zip and all others in the parentheses after it. We could call the variant for two Validated values like this:
Price.of(...).zip(Product.of(...)) { price, product ->
    Position(count, price, product)
}
However, this only works if we call the constructor of Position
directly. But we naturally want to use Position.of, which itself
returns a Validated. Unfortunately, Arrow doesn't offer a really
practical abstraction here. It's easiest with our own helper function
that uses zip to construct a pair from the two validated
intermediate results:
fun <Z, A, B> validate(
    a: ValidatedNel<Z, A>,
    b: ValidatedNel<Z, B>
): ValidatedNel<Z, Pair<A, B>> = a.zip(b) { validA, validB ->
    Pair(validA, validB)
}
This works correspondingly for tuples of higher arity.
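As a sketch of what higher arity can look like, here is a three-argument validate built from two chained zip calls. To keep it runnable without Arrow, the snippet brings its own simplified Validated and a two-value zip, with a plain List standing in for NonEmptyList; with Arrow itself, its own zip should be able to take the place of the stand-in:

```kotlin
// Simplified stand-ins so the sketch runs without Arrow.
sealed class Validated<out E, out A> {
    data class Valid<out A>(val value: A) : Validated<Nothing, A>()
    data class Invalid<out E>(val value: E) : Validated<E, Nothing>()
}
// Assumption: a plain List plays the role of Arrow's NonEmptyList.
typealias ValidatedNel<E, A> = Validated<List<E>, A>

// Two-value zip: applies f if both are valid, otherwise collects all errors.
fun <E, A, B, Z> ValidatedNel<E, A>.zip(b: ValidatedNel<E, B>, f: (A, B) -> Z): ValidatedNel<E, Z> =
    when (this) {
        is Validated.Valid -> when (b) {
            is Validated.Valid -> Validated.Valid(f(this.value, b.value))
            is Validated.Invalid -> b
        }
        is Validated.Invalid -> when (b) {
            is Validated.Valid -> this
            is Validated.Invalid -> Validated.Invalid(this.value + b.value)
        }
    }

// Three-argument variant of the helper: zip the first two into a pair,
// then zip that pair with the third value into a triple.
fun <E, A, B, C> validate(
    a: ValidatedNel<E, A>,
    b: ValidatedNel<E, B>,
    c: ValidatedNel<E, C>
): ValidatedNel<E, Triple<A, B, C>> =
    a.zip(b) { validA, validB -> Pair(validA, validB) }
        .zip(c) { (validA, validB), validC -> Triple(validA, validB, validC) }
```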
We can then process the "validated pair" from validate with the
extension function andThen, which is included with Arrow:
public fun <E, A, B> Validated<E, A>.andThen(f: (A) -> Validated<E, B>): Validated<E, B> =
    when (this) {
        is Validated.Valid -> f(value)
        is Validated.Invalid -> this
    }
Here's how it works:
validate(Price.of(...), Product.of(...))
    .andThen { (price, product) -> Position.of(count, price, product) }
(The validate function has the additional advantage of being
symmetric, unlike zip.)
Note that andThen, just as in Haskell, has the same signature as a
monadic bind or flatMap, but Validated deliberately doesn't form a monad
with it: andThen stops at the first Invalid and doesn't accumulate errors, unlike zip.
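The difference is easy to see in a small experiment. The sketch below again uses a simplified stand-in for Validated (with List instead of NonEmptyList) and a made-up check called positive; validate is written out directly rather than through zip:

```kotlin
// Simplified stand-ins so the sketch runs without Arrow.
sealed class Validated<out E, out A> {
    data class Valid<out A>(val value: A) : Validated<Nothing, A>()
    data class Invalid<out E>(val value: E) : Validated<E, Nothing>()
}
typealias ValidatedNel<E, A> = Validated<List<E>, A>

// Errors of a validated value; empty if it is valid.
fun <E, A> ValidatedNel<E, A>.errors(): List<E> = when (this) {
    is Validated.Valid -> emptyList()
    is Validated.Invalid -> value
}

// Accumulating combination of two independent validations.
fun <E, A, B> validate(a: ValidatedNel<E, A>, b: ValidatedNel<E, B>): ValidatedNel<E, Pair<A, B>> =
    if (a is Validated.Valid && b is Validated.Valid) Validated.Valid(Pair(a.value, b.value))
    else Validated.Invalid(a.errors() + b.errors())

// Same shape as the andThen shown above: stops at the first Invalid.
fun <E, A, B> Validated<E, A>.andThen(f: (A) -> Validated<E, B>): Validated<E, B> = when (this) {
    is Validated.Valid -> f(value)
    is Validated.Invalid -> this
}

// Hypothetical check used only for this demonstration.
fun positive(name: String, n: Int): ValidatedNel<String, Int> =
    if (n >= 1) Validated.Valid(n) else Validated.Invalid(listOf("$name must be positive"))

// Independent checks combined with validate: both errors are reported.
val accumulated = validate(positive("count", 0), positive("price", -5))

// Checks chained with andThen: the second check never runs.
val shortCircuited = positive("count", 0).andThen { positive("price", -5) }
```

Here accumulated holds both messages, while shortCircuited holds only the first, because the lambda passed to andThen is never called.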
Conclusion
Functional validation only requires a simple data type and a few
methods on it and works without a DSL or annotations. Since "valid"
and "invalid" are expressed by different objects, you could also call
it "object-oriented validation." It's a good idea in any case.
In Kotlin, Arrow brings the right abstractions; only documentation and
convenience are still somewhat lacking.