Dynamic records and enumerators: Open-sourcing OCaml libraries

Graham Steel
August 14, 2015

Updated March 2018

We have open-sourced many more libraries since this first post, including our popular OCaml Linter. Take a look at our GitHub page.

Original post:

The vast majority of the Cryptosense code base is built using OCaml. We're excited to announce that we're releasing a couple of the OCaml libraries we developed as free software: records and enumerators. Here we'll describe what the two libraries do and what we use them for.

Records

Most languages have a notion of record types: a collection of fields which each have a name and a type. The classic example is a point type with three integer fields x, y and z. In OCaml this is written:

type point = { mutable x : int; mutable y : int; mutable z : int }

A common problem, however, is that fields are often not first-class values. For example, one can write a function that takes a point p and a value v and sets p.x to v. One can do the same for y:

let set_x p v = p.x <- v
let set_y p v = p.y <- v

But since fields are not values, it's not possible to write a single function that can set either x or y. Several solutions exist, the most popular being van Laarhoven lenses, which abstract away the concepts of "getting" and "setting" things. Lenses can get a bit hairy, so we're using a simpler solution.
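To make the idea of a first-class field concrete, here is a minimal sketch in plain OCaml that pairs a getter with a setter. The field type and the names x_field, y_field and set_field are hypothetical, chosen for illustration; this is neither a van Laarhoven lens nor the way the records library is implemented.

```ocaml
type point = { mutable x : int; mutable y : int; mutable z : int }

(* Hypothetical first-class "field": a getter paired with a setter. *)
type ('r, 'a) field = { get : 'r -> 'a; set : 'r -> 'a -> unit }

let x_field = { get = (fun p -> p.x); set = (fun p v -> p.x <- v) }
let y_field = { get = (fun p -> p.y); set = (fun p v -> p.y <- v) }

(* A single function that sets whichever field it is given at runtime. *)
let set_field f p v = f.set p v

let () =
  let p = { x = 0; y = 0; z = 0 } in
  set_field x_field p 3;
  set_field y_field p 4;
  Printf.printf "%d %d %d\n" p.x p.y p.z  (* prints "3 4 0" *)
```

Because x_field and y_field are ordinary values, they can be stored in lists or passed to other functions, which is not possible with the bare field names.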

Our records library turns record types and fields into runtime values. This example, taken from our README, shows how we handle this.

First, we create a layout, that is, a value representing a record type.

type point
let point : point Record.layout = Record.declare "point"
let x = Record.field point "x" Type.int
let y = Record.field point "y" Type.int
let z = Record.field point "z" Type.int
let () = Record.seal point

Then, we can allocate and use a record value.

let _ =
  let p = Record.make point in
  Record.set p x 3;
  Record.set p y 4;
  Record.set p z 5;
  Record.format Format.std_formatter p

The last line outputs p in JSON format:

{"x":3,"y":4,"z":5}

We have two use cases for this in our code base. First, it simplifies the way we deal with JSON data: it essentially builds the of_json and to_json functions automatically. The ppx_deriving_yojson library can also be used to this end, but sometimes more control is needed (in particular when we don't control the JSON format).

Second, it makes it possible to create lists of heterogeneous records. For example, our app tracer parses a stream of JSON values identified by a command name (such as "encrypt" or "sign"), each associated with a record holding the command's parameters. This is all bound together with GADTs, but that's another story!
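As a rough illustration of why runtime field descriptions help with JSON, the following sketch serializes a record from a list of named getters. It uses only the standard library; the int_field type, point_fields and to_json are assumptions made for this example and are not the records API. Field names are not escaped, so it only suits simple names.

```ocaml
type point = { x : int; y : int; z : int }

(* Hypothetical runtime description of an integer field: a name and a getter. *)
type 'r int_field = { name : string; get : 'r -> int }

let point_fields = [
  { name = "x"; get = (fun p -> p.x) };
  { name = "y"; get = (fun p -> p.y) };
  { name = "z"; get = (fun p -> p.z) };
]

(* Generic serializer: works for any record described by a field list. *)
let to_json fields r =
  let member f = Printf.sprintf "\"%s\":%d" f.name (f.get r) in
  "{" ^ String.concat "," (List.map member fields) ^ "}"

let () =
  print_endline (to_json point_fields { x = 3; y = 4; z = 5 })
  (* prints {"x":3,"y":4,"z":5} *)
```

The serializer is written once and reused for every record type that comes with a field list, which is the kind of boilerplate a runtime layout can remove.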

Enumerators

When working with large data structures, you may need a way of controlling how your program allocates memory. Typically, inefficiencies can result from pieces of code like the following:

type point = { x : int; y : int; z : int }

let x_line = Array.init 1000 (fun i -> { x = i; y = 0; z = 0 })
let y_line = Array.map (rotate_z 90) x_line
let angle = Array.append x_line y_line
let cross = Array.append angle (Array.map (rotate_z 180) angle)
let () = Array.iter print_point cross

This constructs a horizontal cross from a straight line and successive spatial transformations, where a figure is represented by an array of point records. Using this code can be problematic because intermediate arrays are allocated at each step. With a large number of points, it may take a significant amount of time or even fill up your computer's memory.

To deal with these issues, we have built the enumerators library. An enumerator represents a finite sequence of elements, like an array does, but you can apply transformations to enumerators and combine them together in a lazy way, that is, by computing elements only when needed and without storing intermediate values. Here is an improved version of the plotting of the cross:

let x_line_array = Array.init 1000 (fun i -> { x = i; y = 0; z = 0 })
let x_line = Enumerator.of_array x_line_array
let y_line = Enumerator.map (rotate_z 90) x_line
let angle = Enumerator.append x_line y_line
let cross = Enumerator.append angle (Enumerator.map (rotate_z 180) angle)
let () = Enumerator.iter print_point cross

To make it even more efficient, you could store the points using the records library mentioned previously.

In addition to the functions used in this post, the library provides you with various less common combinators. At Cryptosense, we use functions such as `product` and `subset` to generate test cases for our Analyzer and Compliance Tester.
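The lazy evaluation described above can be tried without installing anything by using the Seq module from OCaml's standard library (Seq.append needs OCaml 4.11 or later). This is a different API from the enumerators library, shown only because it follows the same principle; the rotate_z helper below is a hypothetical implementation that handles multiples of 90 degrees.

```ocaml
type point = { x : int; y : int; z : int }

(* Hypothetical helper: rotate a point about the z axis.
   Only multiples of 90 degrees are supported in this sketch. *)
let rotate_z degrees p =
  match ((degrees mod 360) + 360) mod 360 with
  | 0 -> p
  | 90 -> { p with x = -p.y; y = p.x }
  | 180 -> { p with x = -p.x; y = -p.y }
  | 270 -> { p with x = p.y; y = -p.x }
  | _ -> invalid_arg "rotate_z: only multiples of 90 are supported"

(* Only this array is materialized; everything below is a lazy view of it. *)
let x_line_array = Array.init 1000 (fun i -> { x = i; y = 0; z = 0 })
let x_line = Array.to_seq x_line_array
let y_line = Seq.map (rotate_z 90) x_line
let angle = Seq.append x_line y_line
let cross = Seq.append angle (Seq.map (rotate_z 180) angle)

(* Points are computed one at a time while the sequence is consumed. *)
let count = Seq.fold_left (fun n _ -> n + 1) 0 cross
let () = Printf.printf "%d points\n" count  (* prints "4000 points" *)
```

No intermediate array is built for y_line, angle or cross. The trade-off is that a lazy sequence is recomputed every time it is traversed, so a result that is consumed many times may still be worth storing in an array.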

Get in touch

We'd love to hear about it if you use one of these libraries in an OCaml project. Also, we're recruiting.