While working on the paper "Evolution of Functional UI Paradigms",
published at the FUNARCH workshop of ICFP 2025, I had a thought about
pure functions that I would like to briefly explain here. As an
argument in favor of pure functions, functional
programmers often cite better testability: pure functions require no
complicated test setup or mocks, are deterministic, parallelizable,
etc. As a general tendency, this is certainly correct. Pure functions are often
better suited for testing than their impure counterparts. However,
this connection is not a necessary implication in either direction:
there are pure functions that are hard to test, and there are impure
functions that are easy to test. Thus purity itself cannot be the
substance that makes a piece of code properly testable. But
then what is this substance?
Before I attempt an answer, I want to briefly explain why the link
between purity and testability is not a necessary one.
There are impure functions that are perfectly testable
An impure function is one whose behavior cannot be fully determined
only from a description of the relation between input and output
values. This can happen, for example, when the inputs and outputs are
not pure (mathematical) values but mutable objects—i.e., memory
locations to which different values can be assigned.
public class Counter {
    private int value;

    public Counter() {
        this.value = 0;
    }

    public int getValue() {
        return value;
    }

    public void inc() {
        this.value = this.value + 1;
    }
}
...
Counter counter = new Counter();
counter.inc();
assert counter.getValue() == 1;
Here, the variable counter refers to a Counter, which is a mutable
store for int values. The method inc, which implicitly takes such
a store (this) as an argument, is definitely not pure. Yet it is
simple to test. We expect that inc increments the counter, and we can
verify exactly that with three simple lines of test code. From a
software architecture perspective, mutable state has other issues, but at
least in this simple example, testability is unproblematic.
There are pure functions that are poorly testable
On the other hand, pure functions are not necessarily easy to test. At
Active Group we write a lot of UI code with the
reacl-c ClojureScript
library. That code is composed almost exclusively of pure functions,
yet there are significant parts of our GUI code that are not tested
automatically. That’s not because we’re lazy (writing tests makes
programming easier!), but because UI code is often inherently hard to
test. Consider this example in ClojureScript:
(defn view [temp]
  (dom/div
   {:style (when (too-hot? temp)
             {:background "red"})}
   (temperature->string temp)))
(assert-equal (view 22) (dom/div "22"))
(assert-equal (view 183)
              (dom/div {:style {:background "red"}} "183"))
The view function maps "temperature" values to a UI
representation. The temperature value should appear as a number in the
UI. In addition, "too high" temperatures (whatever that means) should
be emphasized more strongly. Here we chose to implement the concept of
"emphasis" by applying a red background.
The function view is pure – no changes or side effects occur – and
we can write simple tests for it, as shown above. However, those tests
are poor, because they check an aspect of the implementation: they
require a red background and allow no alternative implementation of
the same idea of emphasis. If later we decide that a nicer
implementation would be a red border or a small red exclamation mark,
we would have to change both the implementation and the tests. Tests that must
be constantly adapted to the implementation are almost worthless.
Good tests check whether an implementation satisfies its specification
– and that is very hard in this example. In natural language, we
might express the specification for view like this: Show the temperature as
text and emphasize too-high temperatures. But how can this
specification be formalized so that it can be verified by automated
tests?
The substance that guarantees testability
I cannot answer this question here. On the contrary: I assert that
some aspects a computer program handles are simply not formalizable –
and therefore not automatically testable. With this article I only want to
show that pure functions are no guarantee of good
testability. Instead, I claim that this relationship is mediated by
something else. The temperature display example sheds light on what
that something else is: precise and simple specifications.
First, it should be clear that every piece of code is written
toward a goal that is not contained in the code itself. In other
words: code is never an end in itself. A careful team doesn’t
merely negotiate those goals in endless meetings, but writes them
down. The results are specifications. These specifications are more
valuable when they are simple (yet sufficient) and precise.
Pure functions start in pole position regarding the simplicity of
their specifications. The absence of mutations and side effects means
that a pure function can be characterized solely by the relationship
between its inputs and outputs. With impure functions there is more
room for nonsense. As programmers, we must restrain that nonsense by
discipline. The Counter example above shows that this is indeed
possible. That counter has a very simple and precise specification,
and so it is also easy for us to write a proper test.
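To make "simple and precise specification" concrete, here is one way the counter's specification could be written down as an executable property: a fresh counter reads 0, and after n calls to inc it reads n. This is only a sketch. The Counter class is repeated in compact form so the block is self-contained, and CounterSpec and holdsFor are hypothetical names introduced for illustration.

```java
// Compact copy of the Counter from the example above.
class Counter {
    private int value = 0;

    int getValue() {
        return value;
    }

    void inc() {
        value = value + 1;
    }
}

// Hypothetical helper, not part of the original example.
class CounterSpec {
    // Specification: a fresh counter reads 0, and after n calls to inc() it reads n.
    static boolean holdsFor(int n) {
        Counter counter = new Counter();
        if (counter.getValue() != 0) {
            return false;
        }
        for (int i = 0; i < n; i++) {
            counter.inc();
        }
        return counter.getValue() == n;
    }

    public static void main(String[] args) {
        // Check the property for a range of inputs instead of a single case.
        for (int n = 0; n <= 100; n++) {
            if (!holdsFor(n)) {
                throw new AssertionError("specification violated for n = " + n);
            }
        }
        System.out.println("Counter satisfies its specification for n = 0..100");
    }
}
```

The point of the sketch is that the whole specification fits into one sentence and one loop, even though inc is impure.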
On the other hand, the temperature example also shows that testability
can fail even with pure functions, namely when the specification is
imprecise. Likewise, one could imagine examples with precise but
complex specifications. Again, testability would suffer regardless of (im-)purity.
In our FUNARCH paper
we suggest one way to deal with those imprecise
specifications. Instead of giving up and writing no UI tests at all,
we propose addressing the problem with classic software architecture
craftsmanship by separating responsibilities. Concretely, we suggest
the functional Model-View-ViewModel (MVVM) pattern: aspects of UI code
that can be precisely specified go into a separate model (consisting
of immutable data and pure functions), and so become testable. The
genuinely difficult-to-specify aspects of the user interface are
pushed into a minimal and – thanks to the reacl-c programming model
– loosely coupled area of the application, thereby minimizing the
effort of manual (and thus costly) testing.
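The separation can be sketched for the temperature example as follows. This is an illustration in Java (16 or later, for records), not the reacl-c code from the paper. Every name here (TemperatureViewModel, TemperatureModel, TemperatureView) and the threshold of 100 are assumptions made for the sketch, since the article leaves "too hot" undefined.

```java
// View model: immutable data describing WHAT the UI should convey, not HOW.
record TemperatureViewModel(String text, boolean emphasized) {}

class TemperatureModel {
    // Hypothetical threshold, chosen only for this sketch.
    static final int TOO_HOT_THRESHOLD = 100;

    static boolean tooHot(int temp) {
        return temp > TOO_HOT_THRESHOLD;
    }

    // Pure function covering the precisely specifiable part:
    // show the temperature as text and flag too-high temperatures.
    static TemperatureViewModel toViewModel(int temp) {
        return new TemperatureViewModel(Integer.toString(temp), tooHot(temp));
    }
}

class TemperatureView {
    // Hard-to-specify part: what "emphasis" looks like. Kept as small as possible.
    static String render(TemperatureViewModel vm) {
        String style = vm.emphasized() ? " style=\"background: red\"" : "";
        return "<div" + style + ">" + vm.text() + "</div>";
    }
}

class TemperatureDemo {
    public static void main(String[] args) {
        TemperatureViewModel normal = TemperatureModel.toViewModel(22);
        TemperatureViewModel hot = TemperatureModel.toViewModel(183);

        // These checks mention only text and emphasis, never a color.
        if (!normal.equals(new TemperatureViewModel("22", false))) {
            throw new AssertionError("22 should not be emphasized");
        }
        if (!hot.equals(new TemperatureViewModel("183", true))) {
            throw new AssertionError("183 should be emphasized");
        }
        System.out.println(TemperatureView.render(hot));
    }
}
```

With this split, replacing the red background with a red border touches only render, and the automated checks on toViewModel stay unchanged. Whether render looks right remains a question for manual review.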