An initial solution to the monad problem, and then some more

This is the second of two talks about monads, based on the very good notes by Andrea Schalk and continuing the one I gave on the 30th of May. Recall that we are trying to solve the following problem:

given a monad T = (T, \eta, \mu), find an adjunction (F, G, \eta, \varepsilon) such that T = GF and \mu = G \varepsilon_F

If the adjunction (F, G, \eta, \varepsilon) solves the problem above, we say that it generates the monad T.

The first solution to this problem was given by the Swiss mathematician Heinrich Kleisli, and is based on an alternative way of defining monads, as it is the case with adjunctions. Let us suppose T = GF with F \dashv G. If f : A \to TB = G(FB), then f^\sharp : FA \to FB, so that Gf^\sharp : TA \to TB: and we know from the definition of monad that Gf^\sharp \circ \eta_A = f. We can thus define an operator (\cdot)^\ast that takes f \in \mathcal{C}(A,TB) into f^\ast = Gf^\sharp \in \mathcal{C}(TA,TB) so that f^\ast \circ \eta_A = f whatever f is. The simplest example is f = \eta_A itself, which yields (\eta_A)^\ast \circ \eta_A = \eta_A, so that (\eta_A)^\sharp = \mathrm{id}_{FA} by uniqueness in the definition of adjunction quadruple, and (\eta_A)^\ast = \mathrm{id}_{TA}. Moreover, if f : A \to TB and g : B \to TC, then g^\ast \circ f = g^\ast \circ f^\ast \circ \eta_A, which implies (g^\ast \circ f)^\ast = g^\ast \circ f^\ast by uniqueness.

Definition 4. A Kleisli triple on a category \mathcal{C} is a triple (T, \eta, (\cdot)^\ast) where:

  1. T : |\mathcal{C}| \to |\mathcal{C}| is a function,
  2. \eta_A \in \mathcal{C}(A, TA) for every A \in |\mathcal{C}|, and
  3. f^\ast \in \mathcal{C}(TA,TB) for every f \in \mathcal{C}(A,TB)

such that the following equations are satisfied:

  1. f^\ast \circ \eta_A = f for every f : A \to TB;
  2. (\eta_A)^\ast = \mathrm{id}_{TA} for every A;
  3. (g^\ast \circ f)^\ast = g^\ast \circ f^\ast for every f : A \to TB, g : B \to TC.

If (F,G, \eta, (\cdot)^\sharp) is an adjunction quadruple then (GF, \eta, G(\cdot)^\sharp) is a Kleisli triple.

Theorem 1. Let \mathcal{C} be a category.

  1. If (T, \eta, \mu) is a monad on \mathcal{C}, and if f^\ast = \mu_B \circ Tf for every f : A \to TB, then (T, \eta, (\cdot)^\ast) is a Kleisli triple on \mathcal{C}.
  2. If (T, \eta, (\cdot)^\ast) is a Kleisli triple on \mathcal{C}, and if Tf = (\eta_B \circ f)^\ast for every f : A \to B and \mu_A = (\mathrm{id}_{TA})^\ast for every A, then (T, \eta, \mu) is a monad on \mathcal{C}.
  3. The two operations from the previous points are each other’s converse.

Proof: Point 1 follows from naturality of \eta and \mu and the three monad laws:

  • f^\ast \circ \eta_A = \mu_B \circ Tf \circ \eta_A = \mu_B \circ \eta_{TB} \circ f = \mathrm{id}_{TB} \circ f = f
  • (\eta_A)^\ast = \mu_{TA} \circ \eta_A = \mathrm{id}_{TA}
  • (g^\ast \circ f)^\ast = \mu_C \circ T\mu_C \circ T^2g \circ Tf = \mu_C \circ \mu_{TC} \circ T^2g \circ Tf = \mu_C \circ Tg \circ \mu_B \circ Tf = g^\ast \circ f^\ast

For point 2, functoriality of T, naturality of \eta and \mu, and monad laws follow from Kleisli laws:

  • T(g \circ f) = (\eta_C \circ g \circ f)^\ast = ((\eta_C \circ g)^\ast \circ \eta_B \circ f)^\ast = (\eta_C \circ g)^\ast \circ (\eta_B \circ f)^\ast = Tg \circ Tf
  • T\mathrm{id}_A = (\eta_A)^\ast = \mathrm{id}_{TA}
  • Tf \circ \eta_A = (\eta_B \circ f)^\ast \circ \eta_A = \eta_B \circ f
  • \mu_B \circ T^2f = (\mathrm{id}_{TB}^\ast \circ \eta_{TB} \circ (\eta_B \circ f)^\ast)^\ast = (\eta_B \circ Tf)^{\ast\ast} = ((\eta_B \circ f)^\ast \circ \mathrm{id}_{TA})^\ast = Tf \circ \mu_A
  • \mu_A \circ \eta_{TA} = (\mathrm{id}_{TA})^\ast \circ \eta_{TA} = \mathrm{id}_{TA} = (\mathrm{id}_{TA} \circ \eta_A)^\ast = ((\mathrm{id}_{TA})^\ast \circ \eta_{TA} \circ \eta_A)^\ast = \mu_A \circ T\eta_A
  • \mu_A \circ \mu_{TA} = (\mathrm{id}_{TA}^\ast \circ \mathrm{id}_{T^2A})^\ast = (\mathrm{id}_A \circ \mathrm{id}_{TA}^\ast)^\ast = (\mathrm{id}_{TA}^\ast \circ \eta_{TA} \circ \mathrm{id}_{TA}^\ast)^\ast = \mu_A \circ T\mu_A

Point 3 is straightforward. \Box

Considering again the free monoid example, the corresponding Kleisli triple has

f^\ast(s) = [ x \; \mathtt{for} \; x \; \mathtt{in} \; f(a) \; \mathtt{for} \; a \; \mathtt{in} \; s ]

Theorem 1 says that we can restate our problem as follows:

given a Kleisli triple (T, \eta, (\cdot)^\ast), find an adjunction quadruple (F, G, \eta, (\cdot)^\sharp) such that T = GF and (\cdot)^\ast = G((\cdot)^\sharp)

If T = GF with F \dashv G, then for every A,B \in |\mathcal{C}| there is an isomorphism \mathcal{D}(FA,FB) \cong \mathcal{C}(A,TB): this observation is at the base of Kleisli’s construction.

Definition 5. Let T = (T, \eta, (\cdot)^\ast) be a Kleisli triple on a category \mathcal{C}. The Kleisli category of T is the category \mathcal{C}_T defined as follows:

  • |\mathcal{C}_T| = |\mathcal{C}|;
  • \mathcal{C}_T(A,B) = \mathcal{C}(A,TB);
  • \mathrm{id}_{A}^{\mathcal{C}_T} = \eta_A, that is, the identity of A in \mathcal{C}_T is \eta_A;
  • g \bullet f = g^\ast \circ f, that is, the composition of f and g in \mathcal{C}_T is the composition of f and g^\ast in \mathcal{C}.

Theorem 2. \mathcal{C}_T is a category.

Proof: If f \in \mathcal{C}_T(A,B), then f \bullet \mathrm{id}_{A}^{\mathcal{C}_T} = f^\ast \circ \eta_A = f and \mathrm{id}_{B}^{\mathcal{C}_T} \bullet f = (\eta_B)^\ast \circ f = \mathrm{id}_{TB} \circ f = f by the Kleisli laws. If f \in \mathcal{C}_T(A,B)g \in \mathcal{C}_T(B,C), and f \in \mathcal{C}_T(C,D), then (h \bullet g) \bullet f = (h^\ast \circ g)^\ast \circ f = h^\ast \circ g^\ast \circ f = h \bullet (g \bullet f). \Box

Our plan is to construct an adjunction quadruple (F_T, G_T, \eta, (\cdot)^\sharp), with F_T : \mathcal{C} \to \mathcal{C}_T and G_T : \mathcal{C}_T \to \mathcal{C}, such that T = G_T F_T and G_T f^\sharp = f^\ast for every f \in \mathcal{C}_T(A,B). We do this as follows:

  • F_T A = A for every A \in |\mathcal{C}|;
  • G_T A = TA for every A \in |\mathcal{C}_T|;
  • G_T f = f^\ast for every f \in \mathcal{C}_T(A,B);
  • f^\sharp = f for every f \in \mathcal{C}(A, G_T B).

Let us quickly check that G_T is indeed a functor. If A \in |\mathcal{C}_T| then G_T \mathrm{id}_{A}^{\mathcal{C}_T} = (\eta_A)^\ast = \mathrm{id}_{TA} = \mathrm{id}_{G_T A}. If f \in \mathcal{C}_T(A,B) and g \in \mathcal{C}_T(B,C), then G_T(g \bullet f) = (g^\ast \circ f)^\ast = g^\ast \circ f^\ast = G_Tg \circ G_Tf. We are only left to determine, for every f \in \mathcal{C}(A, G_T B), a unique f^\sharp \in \mathcal{C}_T(F_T A, B) such that G_T f^\sharp \circ \eta_A = f: but the entire construction leads to the choice f^\sharp = f! Indeed, \mathcal{C}_T(F_T A, B) = \mathcal{C}_T(A,B) = \mathcal{C}(A,TB) = \mathcal{C}(A, G_T B), and G_Tf \circ \eta_A = f^\ast \circ \eta_A = f by the Kleisli laws. Observe that the functor G_T is the one that does all the work, while the function F_T is little more than a placeholder.

By our identification of adjunctions with adjunction quadruples (see the previous talk) we also get F_T f = (\eta_B \circ f)^\sharp = \eta_B \circ f for every f \in \mathcal{C}(A,B), and (\varepsilon_T)_S = (\mathrm{id}_{G_T S}^{\mathcal{C}})^\sharp = \mathrm{id}_{TS} \in \mathcal{C}(G_TF_TS, TS) = \mathcal{C}_T(F_TG_TS, S) for every S \in |\mathcal{C}_T| = |\mathcal{C}|.

Kleisli’s solution is not the only one, but just one among many: and, in a sense that will be clear later, the “simplest” one. Another solution was constructed by Eilenberg and Moore, and is based on a completely different approach: instead of keeping the objects and specializing the morphisms, one expands the objects and redefines the morphisms.

Definition 6. Let \mathcal{C} be a category and let T = (T, \eta, \mu) be a monad on \mathcal{C}.

  1. A T-algebra on \mathcal{C} is a pair a = (A,a) where A is an object in \mathcal{C} and a : TA \to A is such that a \circ \eta_A = \mathrm{id}_A and a \circ \mu_A = a \circ Ta.
  2. A morphism of T-algebras from a T-algebra a = (A,a) to a T-algebra b = (B,b) is an arrow f \in \mathcal{C}(A,B) such that b \circ Tf = f \circ a.
  3. The Eilenberg-Moore category of T-algebras on C is the category \mathcal{C}^T which has T-algebras as objects, morphisms of T-algebras as morphisms, and where identities and composition are defined as in \mathcal{C}.

If T=M is the free monoid construction, then an M-algebra is a function a : A^\ast \to A such that

  • a[x] = x for every x \in A, and
  • a[u^1_1 \cdots u^1_{n_1} \cdots u^m_1 \cdots u^m_{n_m}] = a[a[u^1_1 \cdots u^1_{n_1}] \cdots a[u^m_1 \cdots u^m_{n_m}]] for every u^1_1, \ldots, u^1_{n_1}, \ldots, u^m_1, \ldots, u^m_{n_m} \in A.

As T is a monad, for every object A of \mathcal{C} there is a free T-algebra \mu_A = (TA, \mu_A), and every arrow f \in \mathcal{C}(A,B) induces a morphism of free T-algebras Tf \in \mathcal{C}^T(\mu_A, \mu_B). Moreover, any a \in \mathcal{C}(TA,A) is, by definition, also a morphism from (TA, \mu_A) to (TA, a) in \mathcal{C}^T.

This time, our plan is to construct an adjunction (F^T, G^T, \eta, \mu) such that F^T : \mathcal{C} \to \mathcal{C}^T, G^T : \mathcal{C}^T \to \mathcal{C}, T = G^T F^T, and \mu_A = G^T \varepsilon_{F^T A} for every A \in |\mathcal{C}|. We do this as follows:

  • F^T A = \mu_A = (TA, \mu_A);
  • F^T f = Tf;
  • G^T a = A if a : TA \to A;
  • G^T f = f;
  • \varepsilon^T_a = a for every a = (A,a) \in |\mathcal{C}^T|.

Then clearly G^T F^T = T, while naturality of \varepsilon follows from the properties of free T-algebras with respect to T-algebra morphisms. In addition, if S = a = (A,a) \in |\mathcal{C}^T| then G^T \varepsilon^T_S \circ \eta_{G^T S} = a \circ \eta_A = \mathrm{id}_{G^T S}, and if A \in \mathrm{C} then \varepsilon^T_{F^T A} \circ F^T \eta_A = \mu_A \circ T\eta_A \mathrm{id}_{TA}. We thus have a full-featured adjunction: this time, F^T is doing all the work, and G^T is just a forgetful functor.

Theorem 3. Let T = (T, \eta, \mu) be a monad on a \mathcal{C}. Identify the monad T with the corresponding Kleisli triple T = (T, \eta, (\cdot)^\ast). The Kleisli category \mathcal{C}_T is equivalent to the full subcategory of \mathcal{C}^T generated by the free T-algebras.

Proof: Define a functor J : \mathcal{C}_T \to \mathcal{C}^T by setting JA = \mu_A = (TA, \mu_A) for every A \in |\mathcal{C}_T| = |\mathcal{C}|, and Jf = \mu_B \circ Tf = f^\ast \in \mathcal{C}^T(JA, JB) for every f \in \mathcal{C}_T(A,B) = \mathcal{C}(A, TB). Then J is a faithful functor, because if f,g \in \mathcal{C}_T(A,B), then f = \mu_B \circ \eta_{TB} \circ f = \mu_B \circ Tf \circ \eta_A and similarly g = \mu_B \circ \eta_{TB} \circ g = \mu_B \circ Tg \circ \eta_A, so that f = g if Jf = Jg. But J is also full, because if f : (TA, \mu_A) \to (TB, \mu_B) is a morphism of free T-algebras, then from the laws of monads follows that f = J(f \circ \eta_A). \Box

But things get even more interesting than this! Let T = (T, \eta, \mu) be a monad: let us consider all the adjunctions (F, G, \eta, \varepsilon) that generate T. What can be a morphism of such adjunctions? First, if (F, G, \eta, \varepsilon) is a solution with F : \mathcal{C} \to \mathcal{D}, and (F', G', \eta, \varepsilon') is a solution with F' : \mathcal{C} \to \mathcal{D}', we may consider a functor L : \mathcal{D} \to \mathcal{D'} as a morphism from (F, G, \eta, \varepsilon) to (F', G', \eta, \varepsilon'). Next, we want that the equalities GF = T = G'F' are not affected by mid-way application of L: this translates into the two conditions L \circ F = F' and G' \circ L = G. Finally, as the previous point yields LFG = F'G'L, we want that L does not interfere with the counits: that is, L\varepsilon = \varepsilon'_L.

Definition 7. Let T = (T, \eta, \mu) be a monad on a category \mathcal{C} and let (F, G, \eta, \varepsilon), (F', G', \eta, \varepsilon') be two adjunctions that generate T, with F : \mathcal{C} \to \mathcal{D} and F' : \mathcal{C} \to \mathcal{D}', respectively. A T-preserving functor from (F, G, \eta, \varepsilon) to (F', G', \eta, \varepsilon') is a functor L : \mathcal{D} \to \mathcal{D}' such that L \circ F = F', G' \circ L = G, and L\varepsilon = \varepsilon'_L.

It is straightforward to see that T-generating adjunctions together with T-preserving functors form a category \mathrm{Adj}(T): composition is provided by the usual composition of functors, while the identity of (F, G, \eta, \varepsilon) in \mathrm{Adj}(T) is the identity functor of \mathcal{D} if F : \mathcal{C} \to \mathcal{D}.

To confirm that our intuition is correct, let us verify that J satisfies the three given equations:

  • J F_T A = JA = (T_A, \mu_A) = F^T A;
  • J F_T f = \mu_B \circ T(\eta_B \circ f) = \mu_B \circ T\eta_B \circ Tf = \mathrm{id}_{TB} \circ Tf = F^Tf;
  • G^T J A = G^T (TA, \mu_A) = TA = G_T A;
  • G^T J f = G^T (\mu_B \circ Tf) = \mu_B \circ Tf = f^\ast = G_T f;
  • J(\varepsilon_T)_S = J\mathrm{id}_{TS}= \mu_S \circ T\mathrm{id}_{TS} = \mu_S = \varepsilon^T_{(TS, \mu_S)} = \varepsilon^T_{JS}.

Theorem 4. Let T = (T, \eta, \mu) be a monad. Then the Kleisli adjunction is the initial object of \mathrm{Adj}(T), the Eilenberg-Moore adjunction is the final object. In particular, J is the only arrow in \mathrm{Adj}(T) from the former to the latter.

Proof: If (F, G, \eta, \varepsilon) is the Kleisli adjunction, then the only choice for L is LA = F'A and Lf = \varepsilon'_{F'B} \circ F' f. If (F', G', \eta, \varepsilon') is the Eilenberg-Moore adjunction, then the only choice for L is LS = (GS, G \varepsilon_S) and Lf = Gf. \Box


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s