Writing Typeclass Derivations with Magnolia
Magnolia may be used to write methods which materialize typeclass instances for arbitrary case classes and sealed traits, by combining existing typeclasses Magnolia finds in scope. However, for any given typeclass, Magnolia needs a definition for how it should combine the typeclasses it finds for each of the parameters in a case class, and how it chooses between the typeclasses it finds for each subtype of a sealed trait.
This tutorial explains how to write that definition.
Build
First of all, include the latest version of Magnolia in your build. If you are using sbt, include the following:
libraryDependencies += "com.propensive" %% "magnolia" % "0.16.0"
libraryDependencies += "org.scala-lang" % "scala-reflect" % scalaVersion.value % Provided
Note that the dependency on scala-reflect for your current version of Scala is required.
The Derivation Object
Magnolia’s design expects each typeclass derivation to be defined in its own object, bundling the methods which will be used to build the new typeclasses alongside the method which gets bound to the magnolia macro.
A basic derivation object will follow this structure,
import magnolia._
import scala.language.experimental.macros
object MyDerivation {
  type Typeclass[T] = ???
  def combine[T](caseClass: CaseClass[Typeclass, T]): Typeclass[T] = ???
  def dispatch[T](sealedTrait: SealedTrait[Typeclass, T]): Typeclass[T] = ???
  implicit def gen[T]: Typeclass[T] = macro Magnolia.gen[T]
}
We will need to provide implementations for the combine
and dispatch
methods, and the Typeclass
type constructor, though these implementations
will depend on the nature of the typeclass interface we are deriving for.
If you are also the author of the typeclass, the typeclass’s companion object
makes a reasonable choice for the derivation object. An implicit method defined
on the companion object, such as gen
above, will automatically be in scope
during any implicit search for an instance of the typeclass. If you are not
the author of the typeclass, then you will not have access to the companion
object, and users of your derivation will need to import the implicit gen
method into scope to use Magnolia to generate typeclass instances, or call it
directly every time.
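The scoping difference described above can be seen without Magnolia at all. The following is a minimal sketch in plain Scala, in which hand-written instances stand in for the macro-bound gen method. The names ImplicitScopeDemo, ExternalDerivation and display are hypothetical and chosen only for illustration.

```scala
object ImplicitScopeDemo {
  // A typeclass we "own": instances in its companion are always in implicit scope.
  trait Show[T] { def show(value: T): String }
  object Show {
    implicit val showInt: Show[Int] = new Show[Int] {
      def show(value: Int): String = "Int(" + value + ")"
    }
  }

  // Instances defined elsewhere (standing in for a separate derivation object)
  // are only found after an explicit import.
  object ExternalDerivation {
    implicit val showBoolean: Show[Boolean] = new Show[Boolean] {
      def show(value: Boolean): String = "Boolean(" + value + ")"
    }
  }

  def display[T](value: T)(implicit s: Show[T]): String = s.show(value)

  // No import needed: the instance is found via the companion object of Show.
  val fromCompanion: String = display(42)

  // Import needed: without it, display(true) would not compile.
  val fromImport: String = {
    import ExternalDerivation._
    display(true)
  }
}
```

A derivation object for a typeclass you do not own behaves like ExternalDerivation here: its implicit members only take part in implicit search once they have been imported.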
You cannot derive for more than one typeclass in the same object.
The Type Constructor
Firstly, Magnolia needs to know how to construct the typeclass type. Most
typeclasses have just a single type parameter, which will be the generic type
Magnolia will derive the typeclass for, and in these cases, such as for Show
,
we can simply write,
type Typeclass[T] = Show[T]
though some typeclasses may take more than one parameter. If, for example, we
had an Encoder[F, T] typeclass, parameterized on both the type it is
encoding (T) and the format it is encoding to (F), we would need to
provide a derivation for each format, fixing the format parameter in the
Typeclass definition, like so,
type Typeclass[T] = Encoder[Json, T]
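As a rough illustration of fixing the format parameter, the sketch below defines a two-parameter Encoder and one alias per format. The Encoder trait, the Json and Xml wrapper types, and the two derivation objects are all assumptions made up for this example; only the shape of the Typeclass alias is the point.

```scala
object EncoderAliasDemo {
  // Hypothetical two-parameter typeclass: F is the target format, T the encoded type.
  trait Encoder[F, T] { def encode(value: T): F }

  // Hypothetical format types, each just wrapping a String.
  final case class Json(text: String)
  final case class Xml(text: String)

  // One derivation object per format, each fixing F in its Typeclass alias.
  object JsonDerivation {
    type Typeclass[T] = Encoder[Json, T]
    val intEncoder: Typeclass[Int] = new Encoder[Json, Int] {
      def encode(value: Int): Json = Json(value.toString)
    }
  }

  object XmlDerivation {
    type Typeclass[T] = Encoder[Xml, T]
    val intEncoder: Typeclass[Int] = new Encoder[Xml, Int] {
      def encode(value: Int): Xml = Xml("<int>" + value + "</int>")
    }
  }
}
```

Within each object, Typeclass takes a single type parameter, which is the shape the combine and dispatch signatures expect.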
The combine Method
The combine
method is typically the most difficult part of the derivation
object definition, and its implementation will depend heavily on whether the
typeclass is covariant or contravariant. This distinction will sometimes
apply to the variance annotation on the generic parameter to the typeclass. But
even in the common cases where typeclasses are invariant in their generic
parameter, that generic parameter will typically appear in either covariant or
contravariant positions in the typeclass interface.
For example, Show
is a contravariant typeclass, because its abstract method
takes a value of the generic type, and returns a fixed type (String
),
trait Show[T] { def show(value: T): String }
whereas Default
, which returns an instance of the generic type, is a
covariant typeclass (which incidentally takes no parameters),
trait Default[T] { def default: T }
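To make the distinction concrete, here is a small hand-written sketch (no Magnolia involved) of what each kind of instance has to do for a case class. The Point class and the instance names are hypothetical.

```scala
object VarianceDemo {
  trait Show[T] { def show(value: T): String } // T is consumed
  trait Default[T] { def default: T }          // T is produced

  val showInt: Show[Int] = new Show[Int] {
    def show(value: Int): String = value.toString
  }
  val defaultInt: Default[Int] = new Default[Int] {
    def default: Int = 0
  }

  final case class Point(x: Int, y: Int)

  // Contravariant style: take an existing Point apart and show each field.
  val showPoint: Show[Point] = new Show[Point] {
    def show(value: Point): String =
      "Point(" + showInt.show(value.x) + "," + showInt.show(value.y) + ")"
  }

  // Covariant style: there is no input Point, so one must be constructed.
  val defaultPoint: Default[Point] = new Default[Point] {
    def default: Point = Point(defaultInt.default, defaultInt.default)
  }
}
```

The first instance reads fields out of a value it is given, while the second has to build a value from the results of its parameters' typeclasses. The two styles of combine follow the same split.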
Let’s start with the contravariant case, using Show
as an example.
We need to implement a method which returns a new instance of Show[T]
, having
been provided with a single value: a CaseClass
instance. For a Show
typeclass, we need to implement,
def combine[T](caseClass: CaseClass[Show, T]): Show[T] = ???
The CaseClass
value provides everything we know about the particular case
class we need to derive a new Show
for. But as we are programming to a
generic interface, we actually know very little concretely. Looking at the type
of CaseClass[Show, T]
, only the type constructor, Show
is universally
quantified.
CaseClass
provides a method,
def parameters: Seq[Param[Show, T]]
which gives us access to a sequence of objects corresponding to each of the
parameters in the case class we are deriving for. The Param
type has several useful methods:
label, the name of the parameter, as a String,
typeclass, a reference to an instance of the typeclass the compiler has found (or derived) for that parameter type,
default, an Option which may contain the default value used when the case class constructor is called without specifying that value,
dereference, a method which takes an instance of the case class, and returns the value of the parameter within it.
An interesting point to be aware of is that the Param
instance has methods
which take or return values with a type corresponding to the parameter type.
But any code we write will not know anything about that type, and in a sequence
of Param
values (of unknown length) these types will be abstract, and likely
different. Yet we must find a way to make use of them to implement the new
typeclass!
Each Param
instance has a type member, called PType
, which is the abstract
type corresponding to that parameter in the case class. Thankfully, given any
particular value of Param[Show, T]
, say p
, the compiler knows that the
typeclass
method will return an instance of Show[p.PType]
, which means that
it has a method, show
, which takes a value of type p.PType
(and returns,
invariably, a String
). We can apply an instance of T
, say t
, to
p.dereference
to get a p.PType
value back, so by combining these two,
p.typeclass.show(p.dereference(t))
now gives us an instance of String
for
an instance of the case class, T
. So we have eliminated the
existentially-quantified p.PType
type.
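The type-member trick can be tried out in isolation. The sketch below uses MiniParam, a simplified stand-in written for this example and not Magnolia's real Param, to show that the expression typechecks even though each element of the sequence has a different PType.

```scala
object PTypeDemo {
  trait Show[T] { def show(value: T): String }

  // Simplified stand-in for Magnolia's Param[Show, T]; not the real API.
  trait MiniParam[T] {
    type PType // the parameter's own type, abstract to callers
    def label: String
    def typeclass: Show[PType]
    def dereference(value: T): PType
  }

  final case class Person(name: String, age: Int)

  val showString: Show[String] = new Show[String] { def show(value: String): String = value }
  val showInt: Show[Int] = new Show[Int] { def show(value: Int): String = value.toString }

  // Two parameters with different PTypes, stored in one sequence.
  val personParams: Seq[MiniParam[Person]] = Seq(
    new MiniParam[Person] {
      type PType = String
      def label: String = "name"
      def typeclass: Show[String] = showString
      def dereference(value: Person): String = value.name
    },
    new MiniParam[Person] {
      type PType = Int
      def label: String = "age"
      def typeclass: Show[Int] = showInt
      def dereference(value: Person): Int = value.age
    }
  )

  // p.PType is unknown here, but both calls agree on it, so this typechecks.
  def showParam[T](p: MiniParam[T], t: T): String =
    p.typeclass.show(p.dereference(t))
}
```

Inside showParam nothing is known about p.PType, yet the compiler accepts the call because the result of p.dereference and the argument of p.typeclass.show are both tied to the same p.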
We can, of course, apply this same function to each element of the sequence of
Param
s, and given an instance of the case class type, T
, we can produce a
Seq[String]
. Each Param
also has a label
, so we could choose to prefix
each parameter with its label. That code might be,
val paramStrings = caseClass.parameters.map { p =>
  p.label+"="+p.typeclass.show(p.dereference(t))
}
A reasonable Show
instance might join all of these named parameters,
separated by commas, inside parentheses, and prefixed with the name of the case
class, which we can obtain from the typeName
member of our CaseClass
instance, like so,
caseClass.typeName+paramStrings.mkString("(", ",", ")")
and we now have a single String
.
But you may have noticed that I have not yet explained where the instance of
the case class, t
, comes from. Remember, we are not deriving a String
; we
are deriving a typeclass which converts a T
instance into a String
, and the
t
is simply the parameter to the show
method in the new typeclass we are
constructing. Putting this all together, here is a full implementation of
combine
for the Show
typeclass.
def combine[T](caseClass: CaseClass[Show, T]): Show[T] = new Show[T] {
  def show(t: T): String = {
    val paramStrings = caseClass.parameters.map { p =>
      p.label+"="+p.typeclass.show(p.dereference(t))
    }
    caseClass.typeName+paramStrings.mkString("(", ",", ")")
  }
}
Most contravariant typeclass derivations will take a similar form. But this
doesn’t work for covariant typeclasses, such as a Decoder
, where we must
implement a method which takes a String
and constructs a new instance of a
case class from that String
. In these cases, we must do some similar
type-gymnastics, but mostly with different methods from the Magnolia API.
Our implementation of combine
will look similar to that of Show
, except
that we now need to find a way to implement the abstract decode
method of a
new Decoder[T]
typeclass.
def combine[T](caseClass: CaseClass[Decoder, T]): Decoder[T] = new Decoder[T] {
  def decode(s: String): T = ???
}
Given that we have no T
-typed inputs, we have only one way to produce a T
,
which is to use the construct
method on our CaseClass[Decoder, T]
instance.
construct
takes a lambda of type Param[Decoder, T] => Return
. This lambda
needs to operate on each parameter in the case class in turn, producing a value
of the appropriate type for each parameter, and then construct
returns an
instance of the case class type, T
, composed of those parameters.
The return type of the parameter lambda is Return
, and is, of course,
dependent on the Param
value. Unfortunately, Scala’s type system cannot
represent a Function
type where the return type is dependent on the input
type, and still maintain reasonable type inference, so Magnolia compromises on
the typesafety of the return value for the lambda. It is very important to
ensure that the lambda returns a value of the right type, but it should
nonetheless be obvious during the first usage of a derived typeclass instance
if the implementation is incorrect, as a ClassCastException
will be almost
guaranteed to occur. A future version of Scala may support better typing of
function types.
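The failure mode is easy to reproduce in miniature. The sketch below is not Magnolia's implementation; it is a hypothetical construct specialised to a single case class, written only to show how an untyped lambda result compiles but fails on first use.

```scala
object ConstructDemo {
  final case class Point(x: Int, y: Int)

  // Simplified stand-in for CaseClass#construct, specialised to Point.
  // The lambda's result is untyped (Any), so each value is cast to the field's type.
  def construct(makeParam: String => Any): Point =
    Point(makeParam("x").asInstanceOf[Int], makeParam("y").asInstanceOf[Int])

  // Correct: every label maps to an Int.
  def good: Point = construct { label => if (label == "x") 1 else 2 }

  // Incorrect: compiles, but fails at runtime when the String is cast to Int.
  def bad: Point = construct { _ => "not an Int" }
}
```

Calling ConstructDemo.bad throws a ClassCastException as soon as it is evaluated, which is why a quick test of each derived instance is worthwhile.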
For now, though, we need to construct a new case class instance, by specifying
how each parameter should be constructed. In the case of Decoder
, we start
with a String
, so we will assume that we have elsewhere implemented a method, say
parse
, which will read the contents of the string (in whatever format we are
decoding) and convert its contents into a Map
of keys and values, where the
keys are, by convention, the parameter names from the case class.
For each parameter, p
, we can then look up its String
value in the map, and
use the typeclass corresponding to the parameter to decode it to the
appropriate type, like so,
def combine[T](caseClass: CaseClass[Decoder, T]): Decoder[T] =
  new Decoder[T] {
    def decode(str: String): T = {
      val valueMap = parse(str)
      caseClass.construct { p => p.typeclass.decode(valueMap(p.label)) }
    }
  }
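The parse helper is assumed rather than defined in this tutorial. One possible sketch, for a deliberately simple and hypothetical flat format of comma-separated key=value pairs, might look like this:

```scala
object ParseDemo {
  // Hypothetical flat format: "key=value" pairs separated by commas.
  // Values containing ',' or '=' (including nested case classes) are not supported.
  def parse(str: String): Map[String, String] =
    str.split(',').iterator
      .map(_.trim)
      .filter(_.nonEmpty)
      .map { pair =>
        val idx = pair.indexOf('=')
        require(idx > 0, "expected key=value but found: " + pair)
        pair.substring(0, idx) -> pair.substring(idx + 1)
      }
      .toMap
}
```

With a Map like this, valueMap(p.label) throws a NoSuchElementException when the input has no entry for a parameter, so a real decoder would probably want to report missing keys more gracefully, or fall back to p.default.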
Typically, the combine method for a covariant typeclass will,
interpret the input,
use the parameter label to select some part of the input,
apply this input to the typeclass associated with that parameter.
Note that the construct
API doesn’t expose the sequence of parameters it
operates on. Given that for any given case class, the user has no control over
the length of the sequence of parameters, and no way to distinguish between
different parameters, for simplicity and safety, Magnolia takes care of mapping
the function over all of the parameters.
A full implementation of a Decoder derivation is given in the Magnolia examples.
As well as case classes, Magnolia will derive for value classes (exactly one
parameter) and case objects (zero parameters). Derivations for these types will
work without any special modifications to the above code. But if there is a
need to distinguish them from ordinary case classes, the CaseClass
type has
methods isValueClass
and isObject
which will be true for each of those
cases, respectively.
The dispatch Method
If you only care to derive for case classes, and don’t need to handle sealed
traits, dispatch
may be left unimplemented. However, in most cases, it can be
implemented more easily than the combine
method, so it’s worth including for
completeness.
Its signature looks like this,
def dispatch[T](sealedTrait: SealedTrait[Typeclass, T]): Typeclass[T]
The purpose of the dispatch
method, regardless of the variance of the
typeclass, is to choose the correct, or “best”, subtype to use to handle the
input. Magnolia’s API presents a sequence of Subtype[Typeclass, T]
instances to choose from.
In the case of contravariant typeclasses, where we have an instance of the
sealed trait type as input, this choice is very easy: the instance of the
sealed trait will have exactly one matching Subtype
instance, which will be
the only possibility. The SealedTrait
API provides a convenience method, also
called dispatch
, for dealing with exactly these cases. It takes an instance
of the sealed trait type to choose which Subtype
to use, and a lambda from
that Subtype
instance to a return value. Because the trait is marked
sealed
, we know that exactly one of the Subtype
instances will match.
The API for Subtype
is very similar to that of Param
. It provides access to
the type name of the subtype (label
), and the corresponding typeclass
instance (typeclass
), but in place of a dereference
method, Subtype
has a
partial function called cast
which will cast a value with the type of the
sealed trait to the type of that subtype.
An instance of Scala’s PartialFunction may or may not be defined for the
inputs it accepts, and normally we would want to check before passing one a
value it can’t handle. But for the SealedTrait#dispatch method, given that we
have already selected a Subtype instance on the basis of an input value, it
is safe to call subtype.cast on that same value and know that it will return
a value with that subtype’s type. This is important because the typeclass
value for that Subtype will need an instance of this same type, and not the
sealed trait’s type.
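For readers less familiar with PartialFunction, the sketch below shows the behaviour being relied on, using hand-written partial functions that behave roughly like cast would for each subtype. The Shape hierarchy and the value names are hypothetical.

```scala
object CastDemo {
  sealed trait Shape
  final case class Circle(radius: Double) extends Shape
  final case class Square(side: Double) extends Shape

  // Hand-written stand-ins for the cast function of each subtype.
  val castToCircle: PartialFunction[Shape, Circle] = { case c: Circle => c }
  val castToSquare: PartialFunction[Shape, Square] = { case s: Square => s }

  val shape: Shape = Circle(1.0)

  // Exactly one of the subtypes is defined for any given value of a sealed trait.
  val circleMatches: Boolean = castToCircle.isDefinedAt(shape)
  val squareMatches: Boolean = castToSquare.isDefinedAt(shape)

  // Safe, because isDefinedAt was true for this same value.
  val circle: Circle = castToCircle(shape)
}
```

Applying castToSquare to the same value would throw a MatchError, which is the situation the dispatch convenience method avoids by selecting the matching subtype first.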
We can therefore implement dispatch
for a Show
typeclass like so,
def dispatch[T](sealedTrait: SealedTrait[Show, T]): Show[T] = new Show[T] {
  def show(t: T): String = sealedTrait.dispatch(t) { subtype =>
    subtype.typeclass.show(subtype.cast(t))
  }
}
Most implementations for contravariant type classes won’t deviate far from this pattern.
For covariant typeclasses, however, we generally have a freer choice of which subtype to use, as that choice is not dictated by our input type, and different typeclasses may make that choice in different ways. Some possibilities are,
just use the first subtype (e.g. the Default typeclass),
match the subtypes’ labels against some part of the input (e.g. Decoder),
try each of the subtypes in turn until finding one which works.
For our covariant Decoder
typeclass, our input is a String
, so we will
assume the existence of a getTypeName
method which can extract the name of a
type from the string, to be used to match against the subtypes names. Having
found a matching Subtype
instance, we will then use its typeclass
value to
decode the input String
. The string will be exactly the same input as was
passed to the sealed trait typeclass, so there’s no need to process or cast it;
we are simply delegating processing it to a different typeclass.
Here is what the implementation of dispatch
looks like for a Decoder
,
def dispatch[T](sealedTrait: SealedTrait[Decoder, T]): Decoder[T] =
  new Decoder[T] {
    def decode(str: String): T = {
      val name = getTypeName(str)
      // .get throws if no subtype's label matches the name
      val subtype = sealedTrait.subtypes.find(_.label == name).get
      subtype.typeclass.decode(str)
    }
  }
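The getTypeName helper is assumed above. As a rough sketch, and assuming a hypothetical input format in which the type name is everything before the first opening parenthesis, it could be written as follows. The select method is also an invention for this example: it models the lookup over a plain list of names and reports an unknown name with a clearer error than a bare .get would.

```scala
object TypeNameDemo {
  // Hypothetical format: the type name is everything before the first '('.
  def getTypeName(str: String): String = str.takeWhile(_ != '(').trim

  // Stand-in for choosing a subtype by name; in a real derivation the names
  // would come from the Subtype instances rather than from a Seq[String].
  def select(names: Seq[String], str: String): String = {
    val name = getTypeName(str)
    names.find(_ == name).getOrElse(
      throw new IllegalArgumentException("no subtype named '" + name + "'")
    )
  }
}
```

For example, TypeNameDemo.select(Seq("Circle", "Square"), "Square(side=2.0)") picks "Square", while an input naming an unknown type fails with a message that includes the offending name.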
As you can see, even this implementation is quite simple.
Summary
With a definition for Typeclass[T]
, and implementations of combine
and
dispatch
, we have everything required for Magnolia to provide derivation.
Several examples of typeclasses with their derivation objects exist in the
examples
directory
of the Magnolia source.
If these members are put into the same object, say Show or Decoder in the
examples above, alongside a gen[T] method, bound to macro Magnolia.gen[T],
it should be possible to call Show.gen[SomeType] or Decoder.gen[SomeType]
and have Magnolia derive an appropriate typeclass for SomeType.
If the definition of SomeType
refers to types for which no typeclass can be
found or derived, Magnolia should report which typeclass could not be found,
including a “stack trace” if it is deeply nested in the ADT structure.
Note that Magnolia will only produce debugging output for explicit calls to the
gen
method; not for invocations through implicit search. The reason for this
is that implicit search may invoke the Magnolia macro, which may fail to derive
a suitable implicit, but implicit search may subsequently continue and find a
matching implicit elsewhere. Magnolia has no way of knowing whether implicit
search will ultimately fail, and any failure output it produces may turn out to
be a false-positive.
Getting more help
Magnolia is still being actively developed, and has so far only had exposure to a limited number of typeclass derivations, so everyone is still exploring its capabilities. If you get stuck implementing a derivation for a particular typeclass, ask on Gitter, or send a tweet to @propensive.