Fiddling with the OCaml Type System

by jasonyeo

Not too long ago, in OCaml 4.00, the OCaml developers exposed the compiler internals and packaged them as a findlib library, compiler-libs. The library gives people working on editor tools access to the type system, the AST structures, the typed tree structures and plenty of other cool stuff. One thing you can do with it is get the types of a piece of code. Yep, it gives you some form of reflection capability. How cool is that?! One use case is letting programmers use camlp4 to figure out which printer to use when printing values, which gives you a form of generic printing.

Some of the modules required to get the types of the code are Typedtree, Types, Typemod, Compmisc, Lexing, Parse and Printtyp. I will go through each of these later.

First, we need an initial environment (of type Env.t) to work with. The type checker needs it when inferring the types of the code. We can get this environment from Compmisc.

let env =
  Compmisc.init_path false;
  Compmisc.initial_env ()

Now, let’s get the AST of the code whose types we want to infer. We need to run lexical analysis on it and parse the tokens. The good news is that compiler-libs provides all of this as functions. The tokens and AST are just a function call away. Say we want to get the type of this add function:

let add x y = x + y

We first pass the string representation of the code to Lexing.from_string, which builds a lexer buffer, and then pipe that (with the |> operator) to the parser.

let ast = "let add x y = x + y" |> Lexing.from_string |> Parse.implementation
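To check what the parser produced, one option is to print the AST back as source text. The snippet below is a minimal sketch, assuming the program is linked against compiler-libs and that Pprintast.structure is available in your compiler version; the name source is only for illustration.

```ocaml
(* Sketch only: needs to be linked against compiler-libs. *)
let source = "let add x y = x + y"

(* Parsetree.structure is a list of top-level items. *)
let ast : Parsetree.structure =
  source |> Lexing.from_string |> Parse.implementation

let () =
  (* One top-level let-binding is expected for this input. *)
  Format.printf "%d top-level item(s)@." (List.length ast);
  (* Pprintast turns the untyped AST back into source text. *)
  Format.printf "%a@." Pprintast.structure ast
```

At this stage the tree carries no type information yet; that only appears after type checking.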

Note that the (|>) operator was introduced in OCaml 4.01.0, so you will need at least that version of the compiler to use it. On older versions you can define it yourself (though it may not be as efficient):

let (|>) x f = f x
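As a quick sanity check, the sketch below shows that piping is just function application written left to right. The names nested and piped are made up for this example, and the local definition of (|>) simply shadows the built-in one on 4.01 and later.

```ocaml
(* Local definition, as in the post; shadows the built-in on 4.01+. *)
let ( |> ) x f = f x

let add x y = x + y

(* The same computation, written nested and then piped. *)
let nested = add 1 (String.length (String.trim "  abc  "))
let piped = "  abc  " |> String.trim |> String.length |> add 1

let () = Printf.printf "nested = %d, piped = %d\n" nested piped
```

Both forms evaluate to the same value; the piped version just reads in the order the steps happen.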

Now that we have the AST, we can type-check it. We use the type_structure function from the Typemod module to do this. Simply pass the initial environment and the AST to the function.

let tstr, _tsig, _newe = Typemod.type_structure env ast Location.none

The return value is a triple: the typed structure (the AST annotated with the inferred types), the type signature, and the new environment after type inference. We will work with the typed structure and ignore the rest.
After some destructuring and unwrapping, you get the types:

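let () =
  (* Field names are qualified because Typedtree is not opened. *)
  let { Typedtree.str_items; _ } = tstr in
  (* These patterns are partial: we assume a single let-binding. *)
  let { Typedtree.str_desc; _ } :: _ = str_items in
  let Typedtree.Tstr_value (_, lst) = str_desc in
  let rec helper = function
    | [] -> ()
    | ({ Typedtree.pat_type; _ }, _) :: rest ->
      Printtyp.raw_type_expr Format.std_formatter pat_type;
      helper rest in
  helper lst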

Finally, to build the code, compile it with the compiler-libs package and link it against the ocamlcommon library.

ocamlbuild -use-ocamlfind -pkg compiler-libs -lib ocamlcommon gettyp.native

That’s it. The code in this post is found in this gist. Happy hacking!
