{-# LANGUAGE RankNTypes #-}
For some time, there's been a strange creature, Pointed
, that was folks have been proposing. Originating in Edward Kmett's category extras, it looks like this:
class Pointed f where
point :: a -> f a
The notion behind it was that one could decompose, e.g., Applicative
into an instance of the Pointed
typeclass and an instance of the Apply
typeclass (giving apply :: f (a -> b) -> f a -> f b
) and an instance of Pointed, such that the two interact properly.
The basic problem is that point is too general and lawless. The only "law" one can give is that if f
is both Functor
and Pointed
, then fmap f . point === point . f
-- but, by parametricity, this comes for free! It provides no guidance on what point
does or how. Rather, it follows for all functions a -> f a
as a consequence of the functor laws.
Another way to think of this in universal algebra we think relationally, in terms of operations. We define first a semigroup with an associative operation (*)
, and then we can 'complete' it by providing it with an identity. The identity object is defined uniquely up to isomorphism by the properties it possess -- we don't define operations by what objects provide them identities. The way to see "point" as really yielding a unit object is to define
class Inhabited f where
unit :: f ()
Now, for f a functor we have unit = point ()
and point x = fmap (const x) unit
.
Dan Doel has provided some truly weird examples of legitimate pointeds.
It so happens that there is another typeclass, already in base, which also has no laws, although there's a good deal of "intuition" behind it. This typeclass is Foldable.
We typically think of foldable as such:
class Foldable f where
foldr :: (a -> b -> b) -> b -> f a -> b
However, its equally powerful (and I find more convenient) to think of Foldable simply giving a function toList :: f a -> [a]
.
Just like Pointed
, Foldable
has no laws other than, for f
a Functor
, the free theorem fmap f . toList === toList . fmap f
. This in fact flows directly from the fact that for f
a Functor
, toList
has the type of a natural transformation between endofunctors on Hask: type NatTrans f g = (Functor f, Functor g) => forall a. f a -> g a
.
Every instance of Traversable is also an instance of Folded. Traversable can be given as
class Traversable f where
traverse :: Applicative f => (a -> f b) -> t a -> f (t b)
Traversable has two laws:
- (Identity) : traverse Identity === Identity
- (Composition) : traverse (Compose . fmap g . f) === Compose . fmap (traverse g) . traverse f
Incidentally, the first law ensures that getId . traverse (Id . f) === fmap f
The second in turn specializes to give the composition law for functors. This leads me to suspect that we can define (in a categorical sense) Traversable as a special type of functor and capture the fashion in which it is "genuinely" a subclass of Functor -- i.e. a functor preserving certain special morphisms intrinsic to Hask, just as Applicative is a functor preserving the monoidalness (and closedness) of Hask.
Given traverse
, we can define toList = getConst . traverse (const . (:[]))
So for any given f
that is both Foldable
and Traversable
, we can give the law that its definition of toList
must coincide with that given by Traversable
. And this gives at least some sort of law on Foldable. Intuitively, the composition law gives us that "every a in f a that is traversed must be traversed at most once" (linearity). Hence a lawful foldable gives us no duplication or results in our toList
.
However, there are many things that are Foldable
but not Traversable
! Set
is not traversable, because it is not a Functor on all of Hask, only that subcategory subject to the Ord constraint. As another example, the unwrapped reader Functor/Monad/Applicative ((->) r)
is not traversable. Hence data Store r a = Store (r -> a) r
has an obvious toList
but is also not traversable.
Are the Foldable
instances for such things doomed to be lawless? Obviously, since Foldable only has one method, we can't define laws by the interaction of methods. Similarly, since the one method of Foldable
eliminates the f
, we can't define a law by the composition of its methods.
What is necessary is a further subclass of Foldable
, such that it harbors a useful relationship to Foldable
without simply specializing it. Fortunately, such a subclass exists, and gives us the 'point' of Pointed
along the way! Foldable
can be thought of as yielding a natural transformation to list. And since list is the free monoid generated by a type, this gives in fact, by composition, a natural transformation to any monoid generated by a Haskell type (such as Set
, Sum
, First
, etc.) A slight generalization of this is what is captured by the fact that foldMap :: Monoid m -> (a -> m) -> t a -> m
, foldr
, and toList
are equivalent in power (nb: this is a lie if you give foldMap an instance of Monoid
that is unlawful). Thus the fact that list is such a prevalent data structure in Haskell is really a first-order reflection of the fact that fold is such a fundamental operation in functional programming in general.
In any case, what I suggest is that we provide the following typeclass, dual to Foldable
:
class Buildable f where
fromList :: [a] -> f a
fromList xs = build (\cons unit -> foldr cons unit xs)
build :: (forall b. (a -> b -> b) -> b -> b) -> f a
build g = fromList . g (:) []
singleton :: a -> f a
singleton = fromList . (:[])
The singleton
function is now our desired "point" operation. Foldable
and Buildable
, each individually lawless, together can be given a strong set of laws, though not strong enough to always determine their implementations up to unique isomorphism.
The key is that foldr and build on typical lists are subject to a very strong set of laws, which we typically refer to as short cut fusion:
foldr c n (build g) === g c n
We can understand short cut fusion as follows:
type FoldList a = forall b. (a -> b -> b) -> b -> b
toFoldList :: [a] -> FoldList a
toFoldList xs = \cons unit -> foldr cons unit xs
fromFoldList :: FoldList a -> [a]
fromFoldList fl = fl (:) []
Note that fromFoldList
is a synonym for build, and by equational reasoning, foldr c n === ($ c) . ($ n) . toFoldList
. This we can rewrite the fusion rule as:
($ c) . ($ n) . toFoldList . fromFoldList === ($ c) . ($ n)
Which derives directly from toFoldList . fromFoldList === id
.
This works because lists are basically isomorphic to the catamorphism over them (ignoring some hairiness with seq
) -- this is at the heart of church encoding and much else. For Foldable/Buildable we aren't necessarily dealing with isomorphisms, but ideally we can formulate rules along the same lines, if not quite as strong. What should such rules be? Let's look at some examples. For Set
, we have
fromList . toList === id
Similarly for Last
, First
, Sequence
, etc.
We can view this property as given by the fact that [a] is the free monoid generated by a type. Since all these other type constructors also give monoids generated by a type, there is an adjunction, with fromList universally the right (forgetful) adjoint.
However, this law is too strong for all cases. Consider:
data Tree a = Nil | Leaf a | Branch (Tree a) (Tree a)
Now give it an "append" as such:
append = Branch
This "append" is not associative. It contains more information than a list, not less. But our forgetful functor -- fromList -- doesn't have to generate every value in our target category. In fact, "opening up" isomorphisms into a broader class of almost-equivalences is in one sense the point of adjunctions.
What really matters is that we get
toList . fromList . toList === toList
fromList . toList . fromList === fromList
where there is a transformation from toList . fromList
to id
that is natural (i.e. anything we do on the a to . fro
of a list via some structure can be mapped to something we do on lists directly), and a transformation from id
to fromList . toList
that is natural (i.e. anything that we do on a structure directly can be mapped to something we do on that structure as munged through a list).
In any case these laws capture a rather intuitive notion -- any information that we "throw away" will only be "discarded" once.
However, such laws don't determine Foldable or Buildable uniquely. A simple example is the following:
newtype Goofy a = Goofy [a]
instance Foldable Goofy where
toList (Goofy g) = go
where go (_:x:xs) = x : toList xs
go _ = []
instance Buildable Goofy where
fromList = Goofy . go
where go (x:xs) = x:x:go xs
go _ = []
Here, we arbitrarily double every element going "into" Goofy, and strip half of Goofy going back out. Our laws are obeyed, but not in the expected way. Other examples should be easy to generate.
Now, anything can be made Foldable, since in the worst case we can just send all objects to the empty list. However, not anything can be made Buildable. Here's a simple example:
instance Foldable (,) a where
toList (_,x) = [x]
instance Buildable (,) a where
fromList = ???
(We can of course fix this by adding a monoid constraint to a
in the Buildable instance)
In any case, our two laws have nice consequences as far as interaction between themselves and with other typeclasses as well. We get properties such as the following:
Foldable law:
fold f === fold f . toList
Buildable extension:
fold f === fold f . fromList . toList
Foldable law:
fmap toList . traverse f === traverse f . toList
Buildable extension:
fmap toList . traverse f === fmap toList . traverse f . fromList . toList
Monoid/Buildable law:
(Buildable f, Monoid f a) => fromList xs `mappend` fromList ys === fromList (xs ++ ys)
I'm sure there are some other nice properties that I haven't arrived at either.
Also note that Buildable
is not only a nice construction to work with, but also can allow a few efficiencies that we don't currently have. In particular, if I have a Sequence a
and wish to turn it to a Set a
, I have two "obvious" avenues. A) I map Set.singleton
over it and then call fold
. B) I simply call Set.fromList . Sequence.toList
.
The "right" thing is in fact to foldr
a set-builder function such as Set.insert
, beginning with the empty set. This is exactly what the build for Set does!
So we can write a general function:
fromFoldable :: (Foldable f, Buildable g) => f a -> g a
fromFoldable x = build (\cons nil -> foldr cons nil x)
and assuming our Foldable
and Buildable
instances are defined well, this will be quite efficient, avoiding the creation of unnecessary intermediate structures!
So what's the punchline here?
- Foldable is handy, but lawless.
- Pointed is considered handy, but lawless.
- Buildable generalized Pointed strictly -- everything Buildable is Pointed, and everything Pointed is Buildable (to get the latter, just always build from a one element list).
- Buildable is even more handy than pointed (it lets us generalize a range of, though not all, existing fromList functions).
- Buildable and Foldable together form a nice dual pair that has very nice laws.
- These nice laws get even nicer in conjunction with other lawful typeclasses (and I suspect there's more I haven't worked out yet here too).
- Perhaps we should consider creating a nice Buildable package for potential inclusion in the platform.
Extra Credit
Given that we've established a family of adjunctions, the obvious question to ask is what monads we get out of them, and what we can say about such monads in general. I have some suspicions, but haven't bothered to work it through.
One more tricky bit
There is one class of things that are Pointed
that Buildable
does not generalize -- those things such that they require at least one a
-- e.g. Identity
, NonEmptyList
, Pair
, or infinite streams. There are a few solutions here -- none great.
The most principled thing is to define Buildable1
as a dual to Foldable1
(in the semigroups package).
class Foldable1 f where
toNEL :: f a -> NEL a
class Buildable1 f where
fromNEL :: NEL a -> f a
singleton :: a -> f a
Due to contravariance, while fewer things are Foldable1
than Foldable
, more things are Buildable1
than Buildable
. For those which are both Foldable1
and Buildable1
we can provide the same adunction laws regarding toNEL . fromNEL . toNEL === toNEL
, etc. This group of things includes Identity
and NonEmptyList
, but excludes many other common types like Set
, First
, and even List
.
The most practical solution, despite its ickiness, seems to be to provide both fromList
and fromNonEmptyList
in Buildable
, noting that the former is partial in some instances. On the one hand, partial functions are terrible. On the other, this mirrors most closely how we've already structured many other functions in Haskell, and it allows those who so choose to only work in the total subset.
Future Work
For types like Set
we would need some sort of constraint kinds -- I have yet to work out the story for that, though it doesn't look terrible, I suspect.