Lisp compiler in F#: Introduction

May 30, 2009 at 10:34 PM | categories: Compiler | View Comments

Browse the full source of the compiler on GitHub

I started learning functional programming in F# and Haskell around 6 months ago, and one of the programs I've been writing in order to learn F# is a small Lisp compiler, with syntax similar to Scheme:

(define (fact n) (if (= n 0) 1 (* n (fact (- n 1)))))
(Console.WriteLine "6! = {0}" (fact 6))
(Console.WriteLine "What is your name?")
(Console.WriteLine "Hello, {0}" (Console.ReadLine))

I'm planning on posting a series of articles explaining how the compiler demonstrates some useful F# features. By way of an introduction, I'd like to talk about two of my favourite features from functional languages: discriminated unions and pattern matching.

At their simplest, discriminated unions are equivalent to .NET enums, since they allow a strongly-types variable to hold one of a fixed set of values:

(*
 * Rough C# equivalent:
 * public enum ListOp { Add, Subtract, Multiply, Divide, Equal }
 *)

type ListOp = Add
            | Subtract 
            | Multiply 
            | Divide 
            | Equal

But each of these possible values can be tagged with a data structure, which means that discriminated unions are useful where an object-orientated language would need a hierarchy of classes:

(*
 * Rough C# equivalent:
 * public abstract class LispVal { }
 *
 * public sealed class ArgRef : LispVal
 * {
 *     public ArgRef(int index)
 *     {
 *         Index = index;
 *     }
 *
 *     public int Index { get; private set; }
 * }
 *
 * public sealed class Atom : LispVal
 * {
 *     public Atom(string text)
 *     {
 *         Text = text;
 *     }
 *
 *     public string Text { get; private set; }
 * }
 *
 * etc., for the other cases
 *)
type LispVal = 
             | ArgRef of int                                
             | Atom of string
             | Bool of bool
             | IfPrimitive of LispVal * LispVal * LispVal
             | LambdaDef of string list * LispVal
             | LambdaRef of MethodInfo * bool * Type list
             | List of LispVal list
             | ListPrimitive of ListOp * LispVal list
             | Number of int
             | QuotePrimitive of LispVal
             | String of string
             | VariableDef of string * LispVal
             | VariableRef of LocalBuilder

When faced with the task of writing code that understands the C# class hierarchy above, I'd implement the Visitor pattern: I'd define an interface called ILispValVisitor, and every time I needed to process a LispVal somehow, I'd declare a nested class that implemented the right ILispValVisitor methods. Fairly straightforward, right? -- a boilerplate class implementation in every piece of code that needs to look inside a LispVal.

Functional languages have a much more elegant alternative, in the form of pattern matching operators. There's no real equivalent to pattern matching in C#, although the concept could be similar to if, switch or the visitor pattern depending on the situation:

(*
 * Rough C# equivalent: an implementation of the visitor pattern
 *)
let rec typeOf (env : Map) = function
    | ArgRef _ -> typeof
    | Atom a -> a |> ident env |> typeOf env
    | Bool _ -> typeof
    | IfPrimitive (_, thenValue, elseValue) ->
        match typeOf env thenValue with
        | t when t = typeOf env elseValue -> t
        | _ -> raise <| Compiler("expected 'then' and 'else' branches to have same type")

    | LambdaDef (_, body) -> typeOf env body
    | LambdaRef (methodBuilder, _, _) -> methodBuilder.ReturnType
    | List (Atom a :: args) -> a |> lambdaIdent args env |> typeOf env
    | List (fn :: _) -> raise <| Compiler(sprintf "can't invoke %A" fn)
    | List [ ] -> raise <| Compiler("can't compile empty list")
    | ListPrimitive _ -> typeof
    | Number _ -> typeof
    | String _ -> typeof
    | QuotePrimitive _ -> typeof
    | VariableDef _ -> typeof
    | VariableRef local -> local.LocalType

The Lisp compiler I wrote relies on a discriminated union, LispVal for storing expression trees, and pattern matching for most of its processing. I'll post more in-depth articles covering the source code in detail, including:

  • Parsing with fslex and fsyacc
  • Code generation using System.Reflection.Emit
  • Calling the compiler from C#
blog comments powered by Disqus