The Single Unit

The monad in philosophy represents the single Supreme Being. All is derived from the monad and all succumbs to the monad. Maybe such connotations foreshadow what is to come. We’ve already traversed a great distance with functors and applicatives, but to reach the summit, there’s only one final thing left to encounter.

Programming land

The minimal definition for the monad is given below.

class Applicative m => Monad m where
  (>>=)  :: m a ->  (a -> m b)  -> m b
  return ::   a                 -> m a

A monad is an applicative with the additional function >>=.
(It turns out that return is just the applicative pure.)

The magic of monads is the function >>=, called “bind”. Abstractly, if we have some value A of type a and two functions f :: a -> m b and g :: b -> m c, then >>= allows us to “compose” them in a manner resembling function composition: return A >>= f >>= g.
The final output is something of type m c. The type constructor m can be seen as some sort of “embellishment” or “effect” added to our existing types.
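
To make this concrete, here is a minimal sketch in the Maybe monad, with a hypothetical safeRecip standing in for both f and g:

-- A hypothetical helper that fails on zero.
safeRecip :: Double -> Maybe Double
safeRecip 0 = Nothing
safeRecip x = Just (1 / x)

-- Chaining with bind: taking the reciprocal twice round-trips the value.
chained :: Maybe Double
chained = return 4 >>= safeRecip >>= safeRecip  -- Just 4.0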

Functions analogous to the monad operations exist in Scala. Below are the equivalents for the Option type.

def pure[A](a: A): Option[A] = Option.apply(a)  // return ("return" is a reserved word in Scala)
def bind[A, B](m: Option[A])(f: A => Option[B]): Option[B] = m.flatMap(f)  // >>=

Haskell do-notation

The do-notation is syntactic sugar available in Haskell to sequence monadic values and computation. Understanding this notation will give us insight into why monads are so coveted by practical programmers. We will tackle do-notation through some simple examples.

Before we continue with the do-notation, let’s cover the simple >> function for monads.

class Monad m where
  (>>=)  :: m a -> (  a -> m b) -> m b
  (>>)   :: m a ->  m b         -> m b
  return ::   a                 -> m a

The >> function is defined as k >> f = k >>= \_ -> f
i.e. >> sequences two monadic values, discarding the result of the first and returning the second.

Consider the following example that sequences two IO monad actions.

putStrLn :: String -> IO ()
putStrLn "Hello" >> putStrLn "World"

It turns out that this is identical to the following.

do putStrLn "Hello"
   putStrLn "World

At a glance it seems like we are running these IO commands like we would in an imperative programming language, but we are actually sequencing these monadic values with >>.

Now consider the following more interesting example, which is relatively representative of how do blocks are used in everyday code.

do x <- Just 2
   y <- Just 3
   return (x+y)

The final output is, unsurprisingly, Just 5. It appears that we can “extract” the integer value out of each Maybe with the <- arrow.

Under the hood, the code is actually translated to

Just 2 >>= (\x -> Just 3 >>= (\y -> return (x+y)))

The usage of <- does not explicitly pull the value out of the Maybe, but rather does so implicitly using bind. Moreover, the sequencing of monadic values in a do block is accomplished by nesting function calls within each other with the aid of bind. From looking at the translation, it’s clear that the syntactic sugar provided by do is much more terse and comprehensible.

The do-notation allows us to sequence monadic values and computations using >> and >>= as if we were writing imperative code. >> acts like a “semicolon” in imperative code, while >>= acts like a “semicolon” with variable assignment. The underlying logic in a do block is ultimately determined by the monad’s >>, >>=, and return.
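
For instance, in this small sketch, a single Nothing short-circuits the entire block, because the Maybe monad’s >>= propagates Nothing:

noResult :: Maybe Int
noResult = do
  x <- Just 2
  y <- Nothing
  return (x + y)
-- noResult evaluates to Nothing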

The for comprehension in Scala behaves similarly to the do-notation from Haskell.

for {
  x <- m1
  y <- m2
} yield combine(x, y)

Here, both m1 and m2 must be values of the same monadic type constructor.
i.e. m1 and m2 can be a List[Int] and a List[String] respectively.
The above logic can be translated to the following.

m1.flatMap(x =>
  m2.map(y =>
    combine(x, y)
  )
)

In case you have not realized, map in Scala is fmap in Haskell and flatMap is >>=.
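
For comparison, here is the analogous Haskell do block specialized to lists, with pairing standing in for the assumed combine:

pairs :: [(Int, String)]
pairs = do
  x <- [1, 2]
  y <- ["a", "b"]
  return (x, y)
-- [(1,"a"),(1,"b"),(2,"a"),(2,"b")]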

Alternative Monad Definitions

Instead of defining the >>= function, a monad can be defined alternatively as:

class Monad m where
  fmap   :: (a -> b) -> m a -> m b
  join   :: m (m a)          -> m a
  return :: a                -> m a

A monad can thus be seen as a functor for which we can collapse nested functorial values with join.
i.e. For the Maybe type: join (Just (Just 3)) = Just 3
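
As a sketch, a join for Maybe could be written as follows (named joinMaybe here to avoid clashing with Control.Monad.join):

joinMaybe :: Maybe (Maybe a) -> Maybe a
joinMaybe (Just m) = m
joinMaybe Nothing  = Nothing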

With this definition we can still recover bind as m >>= f = join (fmap f m), so this definition is as expressive as the initial one. To see that it is not any more expressive, observe that fmap f m = m >>= (return . f) and join m = m >>= id

The “magic” of the monad lies in the functions >>= and join. What these functions mean and how they are defined depend on the particular monad.

Besides join, there is a lesser-known alternative definition built around >=>, the Kleisli composition operator.

class Monad m where
  (>=>) :: (a -> m b) -> (b -> m c) -> (a -> m c)
  return :: a -> m a

Notice that the type signature of >=> looks awfully like that of normal function composition (.), the only difference being that the composed functions return monadic values.

We can define >>= and >=> in terms of each other with

f >=> g  =  \x -> f x >>= g
m >>= f  =  (id >=> f) m
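
Reusing the hypothetical safeRecip from earlier, Kleisli composition reads just like ordinary composition:

import Control.Monad ((>=>))

recipTwice :: Double -> Maybe Double
recipTwice = safeRecip >=> safeRecip
-- recipTwice 4 == Just 4.0, while recipTwice 0 == Nothing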

Monad Laws

The functions return and >>= must follow the monad laws below.

return a >>= k                  =  k a             -- left identity
m        >>= return             =  m               -- right identity
m        >>= (\x -> k x >>= h)  =  (m >>= k) >>= h -- associativity

Left and right identity correspond with our intuition that return simply “lifts” a value to its corresponding monadic form without doing any computation. Associativity corresponds with our intuition that >>= behaves like function composition.
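
As a quick sanity check with Maybe and the hypothetical safeRecip from earlier:

leftId, rightId :: Bool
leftId  = (return 4 >>= safeRecip) == safeRecip 4         -- True
rightId = (Just 4.0 >>= return) == Just (4.0 :: Double)   -- True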

If we use >=> as the base definition of a monad, then the following monad laws must hold.
These laws appear much more “natural” than the ones above.

return >=> f = f                  -- left unit
f >=> return = f                  -- right unit
(f >=> g) >=> h = f >=> (g >=> h) -- associativity

Relationship to other Categorical Constructs

As mentioned before, a monad is a more specialized applicative, so we expect to use monads with existing functor and applicative methods.

Using the methods that we have discussed we can define liftM, which is the monadic version of the functor’s fmap.

liftM :: Monad m => (a -> b) -> m a -> m b
liftM f m  =  m >>= (return . f)
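
For example, liftM (+1) (Just 2) evaluates to Just 3, matching fmap (+1) (Just 2).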

What is more interesting is how monads compare with applicatives. We already discussed how the monadic version of pure is return. For monads, the <*> operator is called ap and is defined as follows.

mfab `ap` ma  =  do
  fab <- mfab
  a   <- ma
  return (fab a)
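
For example, Just (+1) `ap` Just 2 evaluates to Just 3, mirroring Just (+1) <*> Just 2.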

How exactly does the monad differ from the applicative?
Consider the example:

miffy :: Monad m => m Bool -> m a -> m a -> m a
miffy mb mt me = do 
    b <- mb
    if b then mt else me

iffy :: Applicative f => f Bool -> f a -> f a -> f a
iffy fb ft fe = cond <$> fb <*> ft <*> fe where 
    cond b t e = if b then t else e

Both miffy and iffy codify the ternary operation with monads and applicatives respectively. However, the two functions behave differently when invoked.

*Main> miffy (Just True) (Just 1) Nothing
Just 1
*Main> iffy (Just True) (Just 1) Nothing 
Nothing

The Maybe type constructor is a monad and thus also an applicative in Haskell. In miffy, the function chooses and performs one of the two computations: mt or me. In iffy, both computations for ft and fe are computed, regardless of the value of fb.

In the definition of Maybe as a functor or applicative, applying a lifted function over Nothing will always return Nothing, i.e. fmap f Nothing = Nothing for any function f.

Both applicatives and monads allow us to sequence lifted values. Unlike the applicative, the Maybe monad gives us the ability to “choose” which computation to run based on earlier monadic values.

Why Monads?

Consider the following table, which showcases the different use cases that monads can satisfy.

Monad    Programming Usage
Maybe    Exceptions/Nullability
Error    Exceptions
IO       Input/Output
[]       Nondeterminism
Reader   Read-only State
Writer   Logging
State    Global State
Cont     Continuations

When we talk about monads, applicatives, or functors, you may hear the term “effect” thrown around, yet there doesn’t seem to be a concrete definition for what an “effect” is. An effect is an embellishment to existing computation that is context dependent. If we are working with the Maybe functor, the effect is the ability to encapsulate exceptions or nullability. If we are working with the [] monad, the effect is the ability to simulate nondeterminism. In light of these observations, it’s understandable why the definition is so vague: effects are just the semantics of type constructors. In other words, effects are embellishments of existing types to which programmers have attached some meaning. This “embellishment” can be seen in the type signature of bind: the function of type a -> m b is a function of type a -> b with something extra, denoted by m.

Monads do not need to come into play when it comes to effectful computation. We as programmers can use the effect of the Maybe functor without any of the machinery of monads.

Monads and applicatives do come into play when we want to sequence effectful values. If we have some existing monadic values x1, x2, x3, we can sequence them together with the applicative syntax f <$> x1 <*> x2 <*> x3. However, this applicative style of sequencing only lets the underlying values feed into f; it does not let them dictate the effects. As shown in the iffy example, if any value evaluates to Nothing, the end result will always be Nothing.

Both monads and applicatives allow us to sequence effectful values in some way. As foreshadowed in the comparison between miffy and iffy, the advantage the monad has over the applicative is the ability for upstream results to affect downstream effects and computation. Monads can be seen as applicatives with the ability to dictate control flow using effects.
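
A minimal sketch contrasting the two styles with Maybe:

-- Applicative style: the shape of the computation is fixed up front.
apSum :: Maybe Int
apSum = (+) <$> Just 1 <*> Just 2  -- Just 3

-- Monadic style: a later step can inspect an earlier result.
guarded :: Maybe Int
guarded = Just 1 >>= \x -> if x > 0 then Just (x + 1) else Nothing  -- Just 2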

Mathland

There are different ways to formulate the definition of a monad in category theory. We will go over two common ones.

A monad in category theory is an endofunctor \(T\) with two associated natural transformations: \(\eta : I \to T\) and \(\mu : T^2 \to T\).
Here, \(I\) is the identity functor while \(T^2\) is the endofunctor applied twice.

These natural transformations must abide by certain laws such that the following diagrams commute. Composition of natural transformations here is horizontal composition.

\[\require{AMScd} \begin{CD} T^3 @>\mu\circ I_T>> T^2\\ @V I_T\circ\mu VV @V \mu VV \\ T^2 @>\mu>> T\\ \end{CD}\] \[\require{AMScd} \begin{CD} T @> \eta\circ I_T>> T^2 @<I_T\circ\eta<< T\\ & @V{\mu}VV & \\ && T \\ \end{CD}\]

The first diagram expresses that \(\mu\) is associative, while the second expresses that \(\eta\) acts as a left and right unit for \(\mu\).

Compared to our programming definition, \(\eta\) is return and \(\mu\) is join.
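
In Haskell terms, the diagrams amount to the following well-known equations, written here as a sketch with join and return as in Control.Monad:

-- Associativity: collapsing the outer two layers first or the
-- inner two layers first yields the same result.
--   join . join        ==  join . fmap join  :: m (m (m a)) -> m a

-- Unit: wrapping with return on the outside or the inside,
-- then joining, is the identity.
--   join . return      ==  id  :: m a -> m a
--   join . fmap return ==  id  :: m a -> m a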

Monads as Monoids

The natural transformations associated with a monad are at times called multiplication (\(\mu\)) and unit (\(\eta\)), which suggests that the monad is in some way associated with the monoid. Let’s uncover this connection.

Suppose we are working with a monoidal category, which we covered briefly in the previous blog post. We can define a monoid in this monoidal category as some object \(m\) with two associated morphisms: \(\eta : () \to m\) and \(\mu : m \otimes m \to m\), where \(()\) is the unit in the monoidal category.
This definition of a monoid is different from the ones mentioned in a previous post, but the underlying idea is the same.
The following diagrams must commute since a monoid is associative:

\[\require{AMScd} \begin{CD} (m\otimes m)\otimes m && && m\otimes(m\otimes m)\\ @V{\mu\otimes id} VV & & @V{id\otimes \mu}VV \\ m\otimes m @>\mu>> m @<\mu<< m\otimes m\\ \end{CD}\]

Being a monoidal category, \((m\otimes m)\otimes m\) is isomorphic to \(m\otimes (m \otimes m)\); the idea is that we want to view both as just \(m^3\).

\[\require{AMScd} \begin{CD} () \otimes m @>{\eta\otimes id}>> m\otimes m @<{id\otimes \eta }<< m\otimes ()\\ & @V{\mu}VV & \\ && m \\ \end{CD}\]

The similarity between these diagrams and the commutative diagrams from the previous section should act as foreshadowing.

Now consider the category of endofunctors, which has endofunctors as objects and natural transformations between them as morphisms. Notice that if we have two endofunctors \(F\) and \(G\) from this category, we can compose them with \(F\circ G\) to produce another endofunctor. This is only possible because the source and target categories of both \(F\) and \(G\) are the same.
i.e. The choice of only looking at endofunctors is intentional.

Based on what we know already, composing functors in this manner is associative. This category of endofunctors is thus a monoidal category, with functor composition acting as \(\otimes\) and the identity functor acting as the unit.

Now that we have a monoidal category, what does a monoid in this category look like?

The monoid must be associated with some object \(T\), which is some endofunctor. The monoid will also have two associated morphisms, which in this monoidal category are natural transformations: \(\eta: I\to T\) and \(\mu: T\otimes T \to T\).

The monoid laws then become the monad laws exactly.

\[\require{AMScd} \begin{CD} T^3 @>\mu\circ I_T>> T^2\\ @V I_T\circ\mu VV @V \mu VV \\ T^2 @>\mu>> T\\ \end{CD}\] \[\require{AMScd} \begin{CD} T @> \eta\circ I_T>> T^2 @<I_T\circ\eta<< T\\ & @V{\mu}VV & \\ && T \\ \end{CD}\]

Looking back, a monad is just a functor that can be applied repeatedly to an object in a way resembling monoid multiplication.

This explains the famous statement:

All told, a monad in X is just a monoid in the category of endofunctors of X

Etymology

The term “monad” did not originate within the computer science community. Instead, it was adapted for use in programming from category theory. A monoid in the monoidal category of endofunctors used to be called a “triple”, a term unwieldy and ambiguous enough that when “monad” was suggested, it was readily adopted. “Monad” was actually coined for its relation to the monoid, so unlike what my introduction implied, the category-theoretic monad unfortunately does not have much to do with the Greek root “monos” or philosophy.

Parting Words

We started off with understanding the functor and the applicative, and now we have finally made it. The mysticism surrounding the monad has lifted, relieving us from the mental shackles of category theory. Hopefully, after this deep dive, we have all gained a deeper appreciation for the monad whenever we encounter it in the wild.
