Kompreni: Understanding OCaml with Abstract Algebra Examples

2023-05-11

Why and wherefore

I’ve wanted to level up my OCaml understanding for a while now, but between doing a fair amount of work in Rust at my current job and already having a bit of Haskell, I wasn’t sure I could justify the “distraction.” But between some excellent resources available online, some interesting reading (more on that in a moment), and the newest 5.0.0 release, I decided it was time to dig in!

Oleg Kiselyov does something cool, news at 11

The first reading topic I’ve stumbled across recently come from an extremely prolific CS researcher, Oleg Kiselyov who has written a lot of amazing papers and code. One such project is the tagless-final style of embedding domain-specific languages in a typed functional language like Haskell or OCaml. I can’t do it justice, it’s truly amazing stuff, you should go read it yourself. But! Having seen some of the work before in Haskell, it was intriguing to look through this work and compare and contrast Haskell and OCaml versions.

Modules modules modules

The second topic is modules, as in “module systems” and “modular programming”.1 They’re really interesting; indeed, some say modules matter most. OCaml (like SML and other related languages) allow for some really interesting flexibility around designing and implementing interfaces. I won’t be able to explain as well as the above link or this more verbose and less esoteric explanation.

Algebraic building blocks

Which leads me to the point of my little project named kompreni. While it is certainly nowhere near as complete as, say, bastet,2 I wanted to use describing abstract algebra ideas as a way to explore expressing things in OCaml.

We’ll start with a semigroup; a semigroup consists of a set \( S \) and an associative binary operation \( \cdot \) on that set; that is, for any \(x\), \(y\), and \(z\) in \(S\), it must be the case that \( x \cdot (y \cdot z) = (x \cdot y) \cdot z \).

In kompreni, I’ve written a Signature for semigroups:

module type Signature = sig
  type t
  val ( +& ) : t -> t -> t
end

along with an accompanying Laws functor, a function which takes any module implementing the above Signature and creates a new module:

module Laws (S : Signature) = struct
  open S
  let associative x y z = x +& (y +& z) = x +& y +& z
end

One thing I miss compared to Haskell is being able to specify the precedence and associativity of user-defined operators like +&. Since it begins with the plus symbol, OCaml treats it like a plus; ah well. That means that this expression above, when formatted with dune fmt, looks like it’s missing some parentheses; OCaml says + is left-associative, so it says +& is too.

With that signature and functor together, I’ve next turned to qcheck for property-based testing in OCaml. For instance, in kompreni’s test suite I’ve got the following. First we define a base signature for testing:

module type Testable = sig
  type t
  val gen : t QCheck2.Gen.t
end

Then we combine it with the Signature above to actually test the associative property:

module SemigroupLaws (X : Testable) (S : Semigroup.Signature with type t = X.t) =
struct
  include Semigroup.Laws (S)
  let tests =
    [
      make_test
        QCheck2.Gen.(triple X.gen X.gen X.gen)
        "associative" (uncurry3 associative);
    ]
end

With other scaffolding (like uncurry3) defined elsewhere, this takes two modules—one implementing the testable interface and another that is a Semigroup with the same internal type t—and defines a list of qcheck tests.

kompreni also contains monoids, semigroups with an identity element we’ll call zero:

module type Signature = sig
  include Semigroup.Signature
  val zero : t
end

Note that include statement, which says “bring in the body of the Semigroup signature, too.” We also have more Laws for monoids to fulfill, in addition to associative:

module Laws (M : Signature) = struct
  open M
  include Semigroup.Laws (M)
  let left_id x = zero +& x = x
  let right_id x = x +& zero = x
end

which we can also test:

module MonoidLaws (X : Testable) (M : Monoid.Signature with type t = X.t) =
struct
  include Monoid.Laws (M)
  let tests =
    let module SL = SemigroupLaws (X) (M) in
    List.map (uncurry2 (make_test X.gen))
      [ ("left id", left_id); ("right id", right_id) ]
    @ SL.tests
end

Semirings in paradise

This post is already getting too long, so I’ll finish with my favorite example. I think this setup shows the power of OCaml modules, but also shows a place where either I don’t know how to use them well enough (very plausible!) or they fall just a tad short of what I’d like.

Consider semirings, which consist of a set of elements \( S \), two operations \( + \) and \( \cdot \), and two special elements \(0\) and \(1\) such that:

My absolute favorite semiring is the Tropical semiring, also known as the min-plus semiring, which consists of:

Here’s kompreni’s implementation3:

module MinPlus = struct
  type t = Finite of Q.t | Infinite
  let ( +& ) a b =
    match (a, b) with
    | Finite x, Finite y -> Finite (Q.max x y)
    | Finite _, _ -> a
    | _, Finite _ -> b
    | _, _ -> Infinite
  let zero = Infinite
  let ( *& ) a b =
    match (a, b) with
    | Finite x, Finite y -> Finite (Q.add x y)
    | _, _ -> Infinite
  let one = Finite Q.zero
end

And here are the laws that a Semiring must abide by in kompreni:

module Laws (S : Signature) = struct
  open S
  include Commutative_monoid.Laws (S)
  (* (S, 1, *&) is also a Monoid *)
  let times_associative x y z = x *& (y *& z) = x *& y *& z
  let times_left_one x = one *& x = x
  let times_right_one x = x *& one = x
  (* Zero annihilates *)
  let zero_annihilates_left x = zero *& x = zero
  let zero_annihilates_right x = x *& zero = zero
  (* Distributivity *)
  let left_distributive x y z = x *& (y +& z) = (x *& y) +& (x *& z)
  let right_distributive x y z = (x +& y) *& z = (x *& z) +& (y *& z)
end

I really like being able to include previous law definitions in new ones; for instance, having defined the Commutative_monoid.Laws functor, the line include Commutative_monoid.Laws (S) ensures that I include all those laws into my tests for almost free—ensuring that I test \((S, +, 0)\) is indeed a commutative monoid. However, I don’t see a way to cleverly check \((S, ⋅, 1)\) is also a monoid without writing out the times_ rules explicitly. Maybe I need more OCaml module & functor goodness? bastet forgoes wrapping the monoid definitions and laws into the semiring one, for what it’s worth.

Fin.

While I miss the usability of cargo from Rust and some fun things from Haskell, OCaml is fun and eye-opening! Please feel free to head to GitHub to check out the code.

  1. Not to be confused with a module over a ring or one of the many other definitions of “module.” ↩︎

  2. Nor will it likely ever be. ↩︎

  3. Rather than floats, this uses rational numbers from zarith for finite values. Using \(\mathbb{Q}\) is not the same as using \(\mathbb{R}\), but this still forms a semiring! Also it’s really, really hard to represent arbitrary real numbers in software. For instance, the associativity tests fail if you use OCaml floats, because floating point numbers aren’t real↩︎