Inline IL assembly

April 23, 2009 at 09:05 PM | categories: Uncategorized

Writing about C# and IL made me wonder how hard it would be to give inline assembly capabilities to, say, C#. (F# already has inline IL; I'm not sure about other .NET languages.)

It turns out to be surprisingly easy to write a post-compilation tool that injects IL into a .NET assembly using Mono Cecil. In fact, the example on the Cecil FAQ does something similar: it injects WriteLine calls at the start of each method.

I put together a proof of concept that lets you write code like this:

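using System;

public static class Program
{
    private static int Calculate()
    {
        IL.Push(10, 1, 1);  // stack: 10, 1, 1
        IL.Emit("add");     // stack: 10, 2
        IL.Emit("mul");     // stack: 20
        IL.Emit("ret");     // returns 20
        return 0;           // never reached after post-processing; keeps the C# compiler happy
    }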
 
    public static void Main()
    {
        Console.WriteLine("(1 + 1) * 10 = {0}", Calculate());
    }
}

The methods on the IL class are placeholders that get replaced by a separate post-processor:

  • IL.Push: Removed by the post-processor, leaving the arguments behind on the virtual machine stack
  • IL.Emit: Replaced with the appropriate opcode by the post-processor
  • IL.Pop: Removed by the post-processor, allowing a value on the VM stack to be consumed by regular C# code

The IL class needs to be defined as part of your application. Although the code inside this class isn't important -- it'll never be executed -- the way the methods are declared is important:

  • IL.Push needs separate generic overloads taking 1, 2, 3, etc. parameters. If we defined one method taking params object[], the compiler would construct an array and pass that instead of leaving the individual values on the stack. If we defined overloads taking object instead of generic types, the compiler would box value types, so the stack would hold object references rather than the raw values.
  • IL.Emit is straightforward, taking a single string parameter.
  • IL.Pop returns a generic value. It's up to you to specify this generic type according to what's on the stack.
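
As a rough illustration of the declarations described above, here is a minimal sketch of what such an IL class might look like. The parameter names, the choice to throw from the method bodies, and the Example class are my own assumptions; all that matters to the post-processor is the class name, the method names and the generic signatures.

```csharp
using System;

// Placeholder methods: every call is meant to be rewritten by the post-processor.
// Throwing from the bodies is an assumption of this sketch; it makes it obvious
// when an assembly is run without having been post-processed.
public static class IL
{
    // Generic overloads, so the compiler neither builds an array nor boxes value types.
    public static void Push<T1>(T1 a) { throw Unprocessed(); }
    public static void Push<T1, T2>(T1 a, T2 b) { throw Unprocessed(); }
    public static void Push<T1, T2, T3>(T1 a, T2 b, T3 c) { throw Unprocessed(); }

    // The opcode mnemonic, for example "add" or "ldc.i4.0".
    public static void Emit(string opcode) { throw Unprocessed(); }

    // The caller picks T to match whatever is on top of the VM stack.
    public static T Pop<T>() { throw Unprocessed(); }

    private static Exception Unprocessed()
    {
        return new InvalidOperationException("This assembly has not been post-processed.");
    }
}

// Hypothetical usage: hands the result of the inline IL back to regular C# code.
public static class Example
{
    public static int Sum()
    {
        IL.Push(2, 3);
        IL.Emit("add");
        return IL.Pop<int>();
    }
}
```

Once the post-processor has run, Example.Sum should contain nothing but the two constant loads, the add, and the compiler's own return sequence.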

One limitation of this demo is that IL.Emit takes only a string naming the opcode. There's no way to supply an operand such as a MethodInfo or a Label, so instructions that need one (calls, branches and so on) can't be emitted. I haven't thought of a decent way of specifying this optional parameter in a call to IL.Emit: maybe another string, suitably encoded?
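
To make the "another string, suitably encoded" idea a bit more concrete, here is one possible sketch for method operands only. The "Type::Method(ArgType,ArgType)" format and the OperandEncoding class are entirely hypothetical, and nothing here is part of the proof of concept. It resolves the string with plain reflection; a real post-processor would still have to turn the result into something Cecil can attach to a call instruction.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical helper: decodes an operand string such as
// "System.Math::Abs(System.Int32)" into a MethodInfo.
public static class OperandEncoding
{
    public static MethodInfo ParseMethod(string encoded)
    {
        int sep = encoded.IndexOf("::", StringComparison.Ordinal);
        if (sep < 0)
            throw new FormatException("Expected Type::Method(Args): " + encoded);

        int open = encoded.IndexOf('(', sep + 2);
        if (open < 0 || !encoded.EndsWith(")", StringComparison.Ordinal))
            throw new FormatException("Expected Type::Method(Args): " + encoded);

        string typeName = encoded.Substring(0, sep);
        string methodName = encoded.Substring(sep + 2, open - sep - 2);
        string argList = encoded.Substring(open + 1, encoded.Length - open - 2);

        // Type.GetType only finds types in the calling assembly and the core
        // library unless the name is assembly-qualified.
        Type type = Type.GetType(typeName, true);
        Type[] argTypes = argList.Length == 0
            ? Type.EmptyTypes
            : argList.Split(',').Select(name => Type.GetType(name.Trim(), true)).ToArray();

        MethodInfo method = type.GetMethod(methodName, argTypes);
        if (method == null)
            throw new MissingMethodException(typeName, methodName);

        return method;
    }
}
```

A call might then look like IL.Emit("call", "System.Math::Abs(System.Int32)"), with the post-processor matching two ldstr instructions ahead of the call instead of one. Branch targets would need a different encoding, which this sketch doesn't attempt.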

Here's the code for the post-processor, conveniently provided as an NUnit test fixture. It loads an assembly using Mono Cecil, loops through each method, recognises the IL emitted by the C# compiler for each of the three IL methods, replaces those calls with chunks of real IL, and saves a new assembly. A word of advice: run peverify on the output, because it's easy to generate unverifiable IL this way.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using Mono.Cecil;
using Mono.Cecil.Cil;
using NUnit.Framework;
 
[TestFixture]
public class InlineIL
{
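    // Makes at most one replacement per call and returns true if it changed anything.
    // Returning straight away avoids modifying the instruction list while it is
    // still being enumerated; the caller just calls again until nothing is left.
    private static bool ReplaceInstructions(MethodDefinition method)
    {
        // Abstract and extern methods have no body to rewrite.
        if (method.Body == null)
            return false;

        CilWorker worker = method.Body.CilWorker;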
        foreach (Instruction instruction in method.Body.Instructions)
        {
            if (instruction.OpCode == OpCodes.Ldstr)
            {
                Instruction next = instruction.Next;
                if (next == null)
                    continue;

                if (next.OpCode != OpCodes.Call)
                    continue;

                MethodReference operand = (MethodReference) next.Operand;
                if (operand.DeclaringType.Name != "IL")
                    continue;

                switch (operand.Name)
                {
                    case "Emit":
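                        // Mnemonics use dots ("ldc.i4.0") where the OpCodes field names
                        // use underscores (Ldc_I4_0); IgnoreCase handles the capitalisation.
                        string asm = ((string) instruction.Operand).Replace('.', '_');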
                        FieldInfo field = typeof(OpCodes).GetField(
                            asm,
                            BindingFlags.Public | BindingFlags.Static | BindingFlags.IgnoreCase);

                        if (field == null)
                            throw new InvalidOperationException("Unrecognised opcode " + asm + ".");

                        OpCode opCode = (OpCode) field.GetValue(null);
                        // Swap the ldstr for the real instruction and drop the call to IL.Emit.
                        Instruction replacement = worker.Create(opCode);
                        worker.Replace(instruction, replacement);
                        worker.Remove(next);
                        return true;
                }
            }
            else if (instruction.OpCode == OpCodes.Call)
            {
                MethodReference operand = (MethodReference) instruction.Operand;
                if (operand.DeclaringType.Name != "IL")
                    continue;

                switch (operand.Name)
                {
                    case "Push":
                    case "Pop":
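                        // The values are already on the VM stack (pushed by the compiler
                        // for Push, left by earlier IL for Pop), so just drop the call.
                        worker.Remove(instruction);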
                        return true;
                }
            }
        }

        return false;
    }

    [Test]
    public void ExpandInlineAsm()
    {
        AssemblyDefinition assembly = AssemblyFactory.GetAssembly("Temp.exe");
        IEnumerable<MethodDefinition> methods =
            from module in assembly.Modules.Cast<ModuleDefinition>()
            from type in module.Types.Cast<TypeDefinition>()
            from method in type.Methods.Cast<MethodDefinition>()
            select method;

        foreach (var method in methods)
        {
            // Each call makes at most one replacement, so repeat until none remain.
            while (ReplaceInstructions(method))
            {
            }
        }

        AssemblyFactory.SaveAssembly(assembly, "Temp.new.exe");
    }
}
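
The opcode lookup in the Emit case can be tried on its own, without Cecil. The sketch below uses System.Reflection.Emit.OpCodes as a stand-in, on the assumption that its field names follow the same convention as Mono.Cecil.Cil.OpCodes for the common opcodes. The OpCodeLookup class name is hypothetical.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Stand-alone version of the mnemonic-to-opcode lookup, using
// System.Reflection.Emit.OpCodes rather than Cecil's OpCodes.
public static class OpCodeLookup
{
    public static OpCode Parse(string mnemonic)
    {
        // "ldc.i4.0" becomes "ldc_i4_0", which matches the field Ldc_I4_0
        // once case is ignored.
        string fieldName = mnemonic.Replace('.', '_');
        FieldInfo field = typeof(OpCodes).GetField(
            fieldName,
            BindingFlags.Public | BindingFlags.Static | BindingFlags.IgnoreCase);

        if (field == null)
            throw new InvalidOperationException("Unrecognised opcode " + mnemonic + ".");

        return (OpCode) field.GetValue(null);
    }
}
```

For example, OpCodeLookup.Parse("add") returns OpCodes.Add. Prefix opcodes whose mnemonic ends in a dot, such as volatile., won't resolve this way, because the trailing dot turns into an underscore that the field name doesn't have.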