# An initial solution to the monad problem, and then some more

This is the second of two talks about monads, based on the very good notes by Andrea Schalk and continuing the one I gave on the 30th of May. Recall that we are trying to solve the following problem:

given a monad $T = (T, \eta, \mu)$, find an adjunction $(F, G, \eta, \varepsilon)$ such that $T = GF$ and $\mu = G \varepsilon_F$

If the adjunction $(F, G, \eta, \varepsilon)$ solves the problem above, we say that it generates the monad $T$.

The first solution to this problem was given by the Swiss mathematician Heinrich Kleisli, and is based on an alternative way of defining monads, as it is the case with adjunctions. Let us suppose $T = GF$ with $F \dashv G$. If $f : A \to TB = G(FB)$, then $f^\sharp : FA \to FB$, so that $Gf^\sharp : TA \to TB$: and we know from the definition of monad that $Gf^\sharp \circ \eta_A = f$. We can thus define an operator $(\cdot)^\ast$ that takes $f \in \mathcal{C}(A,TB)$ into $f^\ast = Gf^\sharp \in \mathcal{C}(TA,TB)$ so that $f^\ast \circ \eta_A = f$ whatever $f$ is. The simplest example is $f = \eta_A$ itself, which yields $(\eta_A)^\ast \circ \eta_A = \eta_A$, so that $(\eta_A)^\sharp = \mathrm{id}_{FA}$ by uniqueness in the definition of adjunction quadruple, and $(\eta_A)^\ast = \mathrm{id}_{TA}$. Moreover, if $f : A \to TB$ and $g : B \to TC$, then $g^\ast \circ f = g^\ast \circ f^\ast \circ \eta_A$, which implies $(g^\ast \circ f)^\ast = g^\ast \circ f^\ast$ by uniqueness.

Definition 4. A Kleisli triple on a category $\mathcal{C}$ is a triple $(T, \eta, (\cdot)^\ast)$ where:

1. $T : |\mathcal{C}| \to |\mathcal{C}|$ is a function,
2. $\eta_A \in \mathcal{C}(A, TA)$ for every $A \in |\mathcal{C}|$, and
3. $f^\ast \in \mathcal{C}(TA,TB)$ for every $f \in \mathcal{C}(A,TB)$

such that the following equations are satisfied:

1. $f^\ast \circ \eta_A = f$ for every $f : A \to TB$;
2. $(\eta_A)^\ast = \mathrm{id}_{TA}$ for every $A$;
3. $(g^\ast \circ f)^\ast = g^\ast \circ f^\ast$ for every $f : A \to TB$, $g : B \to TC$.

If $(F,G, \eta, (\cdot)^\sharp)$ is an adjunction quadruple then $(GF, \eta, G(\cdot)^\sharp)$ is a Kleisli triple.

Theorem 1. Let $\mathcal{C}$ be a category.

1. If $(T, \eta, \mu)$ is a monad on $\mathcal{C}$, and if $f^\ast = \mu_B \circ Tf$ for every $f : A \to TB$, then $(T, \eta, (\cdot)^\ast)$ is a Kleisli triple on $\mathcal{C}$.
2. If $(T, \eta, (\cdot)^\ast)$ is a Kleisli triple on $\mathcal{C}$, and if $Tf = (\eta_B \circ f)^\ast$ for every $f : A \to B$ and $\mu_A = (\mathrm{id}_{TA})^\ast$ for every $A$, then $(T, \eta, \mu)$ is a monad on $\mathcal{C}$.
3. The two operations from the previous points are each other’s converse.

Proof: Point 1 follows from naturality of $\eta$ and $\mu$ and the three monad laws:

• $f^\ast \circ \eta_A = \mu_B \circ Tf \circ \eta_A = \mu_B \circ \eta_{TB} \circ f = \mathrm{id}_{TB} \circ f = f$
• $(\eta_A)^\ast = \mu_{TA} \circ \eta_A = \mathrm{id}_{TA}$
• $(g^\ast \circ f)^\ast = \mu_C \circ T\mu_C \circ T^2g \circ Tf = \mu_C \circ \mu_{TC} \circ T^2g \circ Tf = \mu_C \circ Tg \circ \mu_B \circ Tf = g^\ast \circ f^\ast$

For point 2, functoriality of $T$, naturality of $\eta$ and $\mu$, and monad laws follow from Kleisli laws:

• $T(g \circ f) = (\eta_C \circ g \circ f)^\ast = ((\eta_C \circ g)^\ast \circ \eta_B \circ f)^\ast = (\eta_C \circ g)^\ast \circ (\eta_B \circ f)^\ast = Tg \circ Tf$
• $T\mathrm{id}_A = (\eta_A)^\ast = \mathrm{id}_{TA}$
• $Tf \circ \eta_A = (\eta_B \circ f)^\ast \circ \eta_A = \eta_B \circ f$
• $\mu_B \circ T^2f = (\mathrm{id}_{TB}^\ast \circ \eta_{TB} \circ (\eta_B \circ f)^\ast)^\ast = (\eta_B \circ Tf)^{\ast\ast} = ((\eta_B \circ f)^\ast \circ \mathrm{id}_{TA})^\ast = Tf \circ \mu_A$
• $\mu_A \circ \eta_{TA} = (\mathrm{id}_{TA})^\ast \circ \eta_{TA} = \mathrm{id}_{TA} = (\mathrm{id}_{TA} \circ \eta_A)^\ast = ((\mathrm{id}_{TA})^\ast \circ \eta_{TA} \circ \eta_A)^\ast = \mu_A \circ T\eta_A$
• $\mu_A \circ \mu_{TA} = (\mathrm{id}_{TA}^\ast \circ \mathrm{id}_{T^2A})^\ast = (\mathrm{id}_A \circ \mathrm{id}_{TA}^\ast)^\ast = (\mathrm{id}_{TA}^\ast \circ \eta_{TA} \circ \mathrm{id}_{TA}^\ast)^\ast = \mu_A \circ T\mu_A$

Point 3 is straightforward. $\Box$

Considering again the free monoid example, the corresponding Kleisli triple has

$f^\ast(s) = [ x \; \mathtt{for} \; x \; \mathtt{in} \; f(a) \; \mathtt{for} \; a \; \mathtt{in} \; s ]$

Theorem 1 says that we can restate our problem as follows:

given a Kleisli triple $(T, \eta, (\cdot)^\ast)$, find an adjunction quadruple $(F, G, \eta, (\cdot)^\sharp)$ such that $T = GF$ and $(\cdot)^\ast = G((\cdot)^\sharp)$

If $T = GF$ with $F \dashv G$, then for every $A,B \in |\mathcal{C}|$ there is an isomorphism $\mathcal{D}(FA,FB) \cong \mathcal{C}(A,TB)$: this observation is at the base of Kleisli’s construction.

Definition 5. Let $T = (T, \eta, (\cdot)^\ast)$ be a Kleisli triple on a category $\mathcal{C}$. The Kleisli category of $T$ is the category $\mathcal{C}_T$ defined as follows:

• $|\mathcal{C}_T| = |\mathcal{C}|$;
• $\mathcal{C}_T(A,B) = \mathcal{C}(A,TB)$;
• $\mathrm{id}_{A}^{\mathcal{C}_T} = \eta_A$, that is, the identity of $A$ in $\mathcal{C}_T$ is $\eta_A$;
• $g \bullet f = g^\ast \circ f$, that is, the composition of $f$ and $g$ in $\mathcal{C}_T$ is the composition of $f$ and $g^\ast$ in $\mathcal{C}$.

Theorem 2. $\mathcal{C}_T$ is a category.

Proof: If $f \in \mathcal{C}_T(A,B)$, then $f \bullet \mathrm{id}_{A}^{\mathcal{C}_T} = f^\ast \circ \eta_A = f$ and $\mathrm{id}_{B}^{\mathcal{C}_T} \bullet f = (\eta_B)^\ast \circ f = \mathrm{id}_{TB} \circ f = f$ by the Kleisli laws. If $f \in \mathcal{C}_T(A,B)$$g \in \mathcal{C}_T(B,C)$, and $f \in \mathcal{C}_T(C,D)$, then $(h \bullet g) \bullet f = (h^\ast \circ g)^\ast \circ f = h^\ast \circ g^\ast \circ f = h \bullet (g \bullet f).$ $\Box$

Our plan is to construct an adjunction quadruple $(F_T, G_T, \eta, (\cdot)^\sharp)$, with $F_T : \mathcal{C} \to \mathcal{C}_T$ and $G_T : \mathcal{C}_T \to \mathcal{C}$, such that $T = G_T F_T$ and $G_T f^\sharp = f^\ast$ for every $f \in \mathcal{C}_T(A,B)$. We do this as follows:

• $F_T A = A$ for every $A \in |\mathcal{C}|$;
• $G_T A = TA$ for every $A \in |\mathcal{C}_T|$;
• $G_T f = f^\ast$ for every $f \in \mathcal{C}_T(A,B)$;
• $f^\sharp = f$ for every $f \in \mathcal{C}(A, G_T B)$.

Let us quickly check that $G_T$ is indeed a functor. If $A \in |\mathcal{C}_T|$ then $G_T \mathrm{id}_{A}^{\mathcal{C}_T} = (\eta_A)^\ast = \mathrm{id}_{TA} = \mathrm{id}_{G_T A}$. If $f \in \mathcal{C}_T(A,B)$ and $g \in \mathcal{C}_T(B,C)$, then $G_T(g \bullet f) = (g^\ast \circ f)^\ast = g^\ast \circ f^\ast = G_Tg \circ G_Tf$. We are only left to determine, for every $f \in \mathcal{C}(A, G_T B)$, a unique $f^\sharp \in \mathcal{C}_T(F_T A, B)$ such that $G_T f^\sharp \circ \eta_A = f$: but the entire construction leads to the choice $f^\sharp = f$! Indeed, $\mathcal{C}_T(F_T A, B) = \mathcal{C}_T(A,B) = \mathcal{C}(A,TB) = \mathcal{C}(A, G_T B)$, and $G_Tf \circ \eta_A = f^\ast \circ \eta_A = f$ by the Kleisli laws. Observe that the functor $G_T$ is the one that does all the work, while the function $F_T$ is little more than a placeholder.

By our identification of adjunctions with adjunction quadruples (see the previous talk) we also get $F_T f = (\eta_B \circ f)^\sharp = \eta_B \circ f$ for every $f \in \mathcal{C}(A,B)$, and $(\varepsilon_T)_S = (\mathrm{id}_{G_T S}^{\mathcal{C}})^\sharp = \mathrm{id}_{TS} \in \mathcal{C}(G_TF_TS, TS) = \mathcal{C}_T(F_TG_TS, S)$ for every $S \in |\mathcal{C}_T| = |\mathcal{C}|$.

Kleisli’s solution is not the only one, but just one among many: and, in a sense that will be clear later, the “simplest” one. Another solution was constructed by Eilenberg and Moore, and is based on a completely different approach: instead of keeping the objects and specializing the morphisms, one expands the objects and redefines the morphisms.

Definition 6. Let $\mathcal{C}$ be a category and let $T = (T, \eta, \mu)$ be a monad on $\mathcal{C}$.

1. A $T$-algebra on $\mathcal{C}$ is a pair $a = (A,a)$ where $A$ is an object in $\mathcal{C}$ and $a : TA \to A$ is such that $a \circ \eta_A = \mathrm{id}_A$ and $a \circ \mu_A = a \circ Ta$.
2. A morphism of $T$-algebras from a $T$-algebra $a = (A,a)$ to a $T$-algebra $b = (B,b)$ is an arrow $f \in \mathcal{C}(A,B)$ such that $b \circ Tf = f \circ a$.
3. The Eilenberg-Moore category of $T$-algebras on $C$ is the category $\mathcal{C}^T$ which has $T$-algebras as objects, morphisms of $T$-algebras as morphisms, and where identities and composition are defined as in $\mathcal{C}$.

If $T=M$ is the free monoid construction, then an $M$-algebra is a function $a : A^\ast \to A$ such that

• $a[x] = x$ for every $x \in A$, and
• $a[u^1_1 \cdots u^1_{n_1} \cdots u^m_1 \cdots u^m_{n_m}] = a[a[u^1_1 \cdots u^1_{n_1}] \cdots a[u^m_1 \cdots u^m_{n_m}]]$ for every $u^1_1, \ldots, u^1_{n_1}, \ldots, u^m_1, \ldots, u^m_{n_m} \in A$.

As $T$ is a monad, for every object $A$ of $\mathcal{C}$ there is a free $T$-algebra $\mu_A = (TA, \mu_A)$, and every arrow $f \in \mathcal{C}(A,B)$ induces a morphism of free $T$-algebras $Tf \in \mathcal{C}^T(\mu_A, \mu_B)$. Moreover, any $a \in \mathcal{C}(TA,A)$ is, by definition, also a morphism from $(TA, \mu_A)$ to $(TA, a)$ in $\mathcal{C}^T$.

This time, our plan is to construct an adjunction $(F^T, G^T, \eta, \mu)$ such that $F^T : \mathcal{C} \to \mathcal{C}^T$, $G^T : \mathcal{C}^T \to \mathcal{C}$, $T = G^T F^T$, and $\mu_A = G^T \varepsilon_{F^T A}$ for every $A \in |\mathcal{C}|$. We do this as follows:

• $F^T A = \mu_A = (TA, \mu_A)$;
• $F^T f = Tf$;
• $G^T a = A$ if $a : TA \to A$;
• $G^T f = f$;
• $\varepsilon^T_a = a$ for every $a = (A,a) \in |\mathcal{C}^T|$.

Then clearly $G^T F^T = T$, while naturality of $\varepsilon$ follows from the properties of free $T$-algebras with respect to $T$-algebra morphisms. In addition, if $S = a = (A,a) \in |\mathcal{C}^T|$ then $G^T \varepsilon^T_S \circ \eta_{G^T S} = a \circ \eta_A = \mathrm{id}_{G^T S}$, and if $A \in \mathrm{C}$ then $\varepsilon^T_{F^T A} \circ F^T \eta_A = \mu_A \circ T\eta_A \mathrm{id}_{TA}$. We thus have a full-featured adjunction: this time, $F^T$ is doing all the work, and $G^T$ is just a forgetful functor.

Theorem 3. Let $T = (T, \eta, \mu)$ be a monad on a $\mathcal{C}$. Identify the monad $T$ with the corresponding Kleisli triple $T = (T, \eta, (\cdot)^\ast)$. The Kleisli category $\mathcal{C}_T$ is equivalent to the full subcategory of $\mathcal{C}^T$ generated by the free $T$-algebras.

Proof: Define a functor $J : \mathcal{C}_T \to \mathcal{C}^T$ by setting $JA = \mu_A = (TA, \mu_A)$ for every $A \in |\mathcal{C}_T| = |\mathcal{C}|$, and $Jf = \mu_B \circ Tf = f^\ast \in \mathcal{C}^T(JA, JB)$ for every $f \in \mathcal{C}_T(A,B) = \mathcal{C}(A, TB)$. Then $J$ is a faithful functor, because if $f,g \in \mathcal{C}_T(A,B)$, then $f = \mu_B \circ \eta_{TB} \circ f = \mu_B \circ Tf \circ \eta_A$ and similarly $g = \mu_B \circ \eta_{TB} \circ g = \mu_B \circ Tg \circ \eta_A$, so that $f = g$ if $Jf = Jg$. But $J$ is also full, because if $f : (TA, \mu_A) \to (TB, \mu_B)$ is a morphism of free $T$-algebras, then from the laws of monads follows that $f = J(f \circ \eta_A)$. $\Box$

But things get even more interesting than this! Let $T = (T, \eta, \mu)$ be a monad: let us consider all the adjunctions $(F, G, \eta, \varepsilon)$ that generate $T$. What can be a morphism of such adjunctions? First, if $(F, G, \eta, \varepsilon)$ is a solution with $F : \mathcal{C} \to \mathcal{D}$, and $(F', G', \eta, \varepsilon')$ is a solution with $F' : \mathcal{C} \to \mathcal{D}'$, we may consider a functor $L : \mathcal{D} \to \mathcal{D'}$ as a morphism from $(F, G, \eta, \varepsilon)$ to $(F', G', \eta, \varepsilon')$. Next, we want that the equalities $GF = T = G'F'$ are not affected by mid-way application of $L$: this translates into the two conditions $L \circ F = F'$ and $G' \circ L = G$. Finally, as the previous point yields $LFG = F'G'L$, we want that $L$ does not interfere with the counits: that is, $L\varepsilon = \varepsilon'_L$.

Definition 7. Let $T = (T, \eta, \mu)$ be a monad on a category $\mathcal{C}$ and let $(F, G, \eta, \varepsilon)$, $(F', G', \eta, \varepsilon')$ be two adjunctions that generate $T$, with $F : \mathcal{C} \to \mathcal{D}$ and $F' : \mathcal{C} \to \mathcal{D}'$, respectively. A $T$-preserving functor from $(F, G, \eta, \varepsilon)$ to $(F', G', \eta, \varepsilon')$ is a functor $L : \mathcal{D} \to \mathcal{D}'$ such that $L \circ F = F'$, $G' \circ L = G$, and $L\varepsilon = \varepsilon'_L$.

It is straightforward to see that $T$-generating adjunctions together with $T$-preserving functors form a category $\mathrm{Adj}(T)$: composition is provided by the usual composition of functors, while the identity of $(F, G, \eta, \varepsilon)$ in $\mathrm{Adj}(T)$ is the identity functor of $\mathcal{D}$ if $F : \mathcal{C} \to \mathcal{D}$.

To confirm that our intuition is correct, let us verify that $J$ satisfies the three given equations:

• $J F_T A = JA = (T_A, \mu_A) = F^T A$;
• $J F_T f = \mu_B \circ T(\eta_B \circ f) = \mu_B \circ T\eta_B \circ Tf = \mathrm{id}_{TB} \circ Tf = F^Tf$;
• $G^T J A = G^T (TA, \mu_A) = TA = G_T A$;
• $G^T J f = G^T (\mu_B \circ Tf) = \mu_B \circ Tf = f^\ast = G_T f$;
• $J(\varepsilon_T)_S = J\mathrm{id}_{TS}= \mu_S \circ T\mathrm{id}_{TS} = \mu_S = \varepsilon^T_{(TS, \mu_S)} = \varepsilon^T_{JS}$.

Theorem 4. Let $T = (T, \eta, \mu)$ be a monad. Then the Kleisli adjunction is the initial object of $\mathrm{Adj}(T)$, the Eilenberg-Moore adjunction is the final object. In particular, $J$ is the only arrow in $\mathrm{Adj}(T)$ from the former to the latter.

Proof: If $(F, G, \eta, \varepsilon)$ is the Kleisli adjunction, then the only choice for $L$ is $LA = F'A$ and $Lf = \varepsilon'_{F'B} \circ F' f$. If $(F', G', \eta, \varepsilon')$ is the Eilenberg-Moore adjunction, then the only choice for $L$ is $LS = (GS, G \varepsilon_S)$ and $Lf = Gf$. $\Box$