How I Misunderstood Dialyzer
Published on Oct 12, 2020 by dix.
I published a post a few weeks ago describing problems I had using Dialyzer with OTP behaviours like GenServer.1 I presented some helpers I hoped to use in a project at work that would make these behaviours slightly easier to work with. When I went to use these helpers in our production code, I realized that they would be no help to me all. This demonstrated to me that I had fundamentally misunderstood Dialyzer and success typing more broadly. To understand things better, I dove into the Dialyzer literature.
The following two snippets of Elixir code quite easily demonstrate my misunderstanding of Dialyzer.
@spec foo(atom()) :: atom() def foo(:a), do: 4 def foo(:b), do: :b @spec bar(atom()) :: atom() def bar(:a), do: 4 def bar(:b), do: 5
If we run Dialyzer against these two functions, only bar/1
is identified as
having no local return. But to our human eyes, it is obvious that there is
something wrong with foo/1
, but foo/1
has a valid success typing. Dialyzer
will only reject programs that it knows cannot execute correctly, but will
accept a program that could execute correctly given some input. In the case of
foo/1
, if the function is only ever called with the argument :b
, it will
conform to the spec. In the case of bar/1
, there is no program input which
will ever conform to the specified type.
By delving deeper into the formal specification of the Dialyzer algorithm, we
can see this more concretely. To begin with, it is important to understand that
for the purposes of success typing, a multiple clause function is equivalent to
a case statement. This means that to understand foo/1
and bar/1
we should
consider them as:
@spec foo(atom()) :: atom() def foo(arg) do case arg do :a -> 4, :b -> :b end end @spec bar(atom()) :: atom() def bar(arg) do case arg do :a -> 4 :b -> 5 end end
If we consider the following general case expression:
case E of P_1 when G_1 -> B_1; . . . P_N when G_N -> B_N; end
The Dialyzer algorithm creates the following set of constraints2:
\begin{equation} C_E \land (\mathop{\lor}_{i}^{n} \tau_E \subseteq \tau_{P_i} \land C_{G_i} \land C_{B_i} \land \tau_{out} \subseteq \tau_{B_i}) \end{equation}
There is a great deal of notation here that it is important to understand.
\(C_x\) is the constraints in the environment on the variable named x
.
\(\tau_x\) is the type of the variable named x
. \(\land\) is logical
conjunction, that is each side of the expression must be true. \(\lor\) is
logical disjunction, that is either side of the expression can be true.
The series of logical disjunctions \(\mathop{\lor}_i^n\) was the source of my
misunderstanding. In effect, this means that if there is a solution to any of
the clauses of a case statement, there is a success typing for the entire case
statement. My misunderstanding had been that there would be a conjunction for
each clause. Namely that there is only a success typing for the entire case
statement, if there is a success typing for every clause.
Conclusions
I wrote my original post because I found some missing pieces in Dialyzer that I hoped to temporarily hack around. I also hoped would be resolved by more robust work done by the implementors of Gleam and the Erlang Team at WhatsApp. While the hacky solution I proposed didn’t in fact work, I am glad that it gave me a reason to explore the Dialyzer literature extensively. I feel like I understand how it all works pretty well now. That said, now that I understand what it is actually capable of doing, I’m more eager than ever for harder static typing on the BEAM.
Caveat
Based on my previous post, I hope that it is clear to readers that I am not an expert in this area. I have been reading a lot and trying to gain some practical experience in this area, but it is entirely possible that I misunderstood something I’ve seen or read. If you read this and see something you think is wrong, please let me know.
References
If you find this topic interesting at all, please read the papers written by the original implementors of Dialyzer. They are all quite short and relatively easy to read and understand. Here are a couple of the ones I read to prepare this post.
- Practical Type Inference Based on Success Typings
- Gradual Typing of Erlang Programs: A Wrangler Experience
- TypEr: A Type Annotator of Erlang Code
Footnotes:
: This is actually the constraint derivation described in the
TypEr
paper as I found it easier to read. I believe it is
equivalent to the algorithm used by Dialyzer