Lisp compiler in F#: Introduction
May 30, 2009 at 10:34 PM | categories: Compiler | View CommentsBrowse the full source of the compiler on GitHub
I started learning functional programming in F# and Haskell around 6 months ago, and one of the programs I've been writing in order to learn F# is a small Lisp compiler, with syntax similar to Scheme:
(define (fact n) (if (= n 0) 1 (* n (fact (- n 1))))) (Console.WriteLine "6! = {0}" (fact 6)) (Console.WriteLine "What is your name?") (Console.WriteLine "Hello, {0}" (Console.ReadLine))
I'm planning on posting a series of articles explaining how the compiler demonstrates some useful F# features. By way of an introduction, I'd like to talk about two of my favourite features from functional languages: discriminated unions and pattern matching.
At their simplest, discriminated unions are equivalent to .NET enums, since they allow a strongly-types variable to hold one of a fixed set of values:
(* * Rough C# equivalent: * public enum ListOp { Add, Subtract, Multiply, Divide, Equal } *) type ListOp = Add | Subtract | Multiply | Divide | Equal
But each of these possible values can be tagged with a data structure, which means that discriminated unions are useful where an object-orientated language would need a hierarchy of classes:
(* * Rough C# equivalent: * public abstract class LispVal { } * * public sealed class ArgRef : LispVal * { * public ArgRef(int index) * { * Index = index; * } * * public int Index { get; private set; } * } * * public sealed class Atom : LispVal * { * public Atom(string text) * { * Text = text; * } * * public string Text { get; private set; } * } * * etc., for the other cases *) type LispVal = | ArgRef of int | Atom of string | Bool of bool | IfPrimitive of LispVal * LispVal * LispVal | LambdaDef of string list * LispVal | LambdaRef of MethodInfo * bool * Type list | List of LispVal list | ListPrimitive of ListOp * LispVal list | Number of int | QuotePrimitive of LispVal | String of string | VariableDef of string * LispVal | VariableRef of LocalBuilder
When faced with the task of writing code that understands the C# class hierarchy above, I'd implement the Visitor pattern: I'd define an interface called ILispValVisitor
, and every time I needed to process a LispVal
somehow, I'd declare a nested class that implemented the right ILispValVisitor
methods. Fairly straightforward, right? -- a boilerplate class implementation in every piece of code that needs to look inside a LispVal
.
Functional languages have a much more elegant alternative, in the form of pattern matching operators. There's no real equivalent to pattern matching in C#, although the concept could be similar to if
, switch
or the visitor pattern depending on the situation:
(* * Rough C# equivalent: an implementation of the visitor pattern *) let rec typeOf (env : Map) = function | ArgRef _ -> typeof | Atom a -> a |> ident env |> typeOf env | Bool _ -> typeof | IfPrimitive (_, thenValue, elseValue) -> match typeOf env thenValue with | t when t = typeOf env elseValue -> t | _ -> raise <| Compiler("expected 'then' and 'else' branches to have same type") | LambdaDef (_, body) -> typeOf env body | LambdaRef (methodBuilder, _, _) -> methodBuilder.ReturnType | List (Atom a :: args) -> a |> lambdaIdent args env |> typeOf env | List (fn :: _) -> raise <| Compiler(sprintf "can't invoke %A" fn) | List [ ] -> raise <| Compiler("can't compile empty list") | ListPrimitive _ -> typeof | Number _ -> typeof | String _ -> typeof | QuotePrimitive _ -> typeof | VariableDef _ -> typeof | VariableRef local -> local.LocalType
The Lisp compiler I wrote relies on a discriminated union, LispVal
for storing expression trees, and pattern matching for most of its processing. I'll post more in-depth articles covering the source code in detail, including:
- Parsing with fslex and fsyacc
- Code generation using
System.Reflection.Emit
- Calling the compiler from C#