Movable Type on EC2

April 18, 2010 at 08:00 PM | categories: Uncategorized

I'm a big fan of virtual servers and I've always run this web site from one. Until recently it lived on a VMware instance on my home PC, but my recent experience with Amazon EC2, plus a couple of large traffic spikes, prompted me to move it.

In the end the process turned out to be pretty easy:

  1. Back up to Amazon S3 using duplicity:
      • MySQL dump: mysqldump --all-databases
      • /etc/apache2
      • /var/www
  2. Start an EC2 small instance running the Ubuntu 9.04 AMI (ami-ccf615a5)
  3. Restore from Amazon S3:
      • apt-get -y install apache2 duplicity libapache2-mod-perl2 libdbd-mysql-perl libdbi-perl mysql-server perlmagick python-boto
      • Restore the MySQL dump, /etc/apache2 and /var/www using duplicity
      • Run the MySQL dump script against the local MySQL instance
  4. Start Apache. Check whether the static HTML pages and Movable Type's admin interface work.
  5. Assign an Amazon Elastic IP address to the EC2 instance. This gives me a static IP address that I can refer to from DNS.
  6. Remap the DNS records (an A record and a CNAME record) via my ISP's web site
  7. Done!

I'm happy with the changes so far:

  • Performance has been fine: although publishing the site now takes 30 seconds instead of 15, I'm getting much better response times and bandwidth
  • I'm paying to run an EC2 instance full time whereas before I was just paying for home power bills
  • I'm not going to get shot by my ISP next time one of my posts appears on Reddit

The fact that I was taking daily backups made the move risk-free. It took a couple of attempts to get a working site on the EC2 server, but I was able to start a fresh instance and restore from backup each time. I also know that, if the site does fall over in future, restoring from backup will take a few minutes and I'll lose one day of data at most.


Cherry Blossom in Jubilee Place

April 12, 2010 at 09:07 AM | categories: Uncategorized

Jubilee Place Cherry Blossom - 2
Originally uploaded by Tim Robinson

Summer has almost arrived in London -- I took this at the weekend in one of the parks close to where I work.


A render farm in Haskell

March 07, 2010 at 07:40 PM | categories: Uncategorized

Browse the full source of the ray tracer on GitHub

A few weeks back I came across the smallpt ray tracer by Kevin Beason. smallpt is a global illumination renderer written in 99 lines of C++. Since it uses a Monte Carlo technique to randomly cast rays and average the results, the longer you can leave it running, the better the image will look. Kevin's code uses OpenMP to split calculations across more than one CPU core.

Sample smallpt image

Since ray tracing is the sort of mathsy task that Haskell excels at, I wanted to see what a Haskell port of Kevin's code looked like. The port itself isn't the subject of this article; what I want to write about instead are the techniques you can use to distribute Haskell code across multiple cores. These techniques range from parMap (which is almost transparent to the program, but provides little control over where the code runs) to brute-force distribution of work across several machines.

parMap
One of the things I love about Haskell is that its pure functional nature makes it straightforward to parallelise code within a single machine. You can normally take any code implemented in terms of map and parallelise it with parMap, from the Control.Parallel.Strategies module. parMap will use lightweight threads provided by the runtime to apply a function to each element in a list in parallel.

The parMap function accepts an argument that controls the evaluation strategy. Most of the examples I've seen use parMap rwhnf ("weak head normal form"), which only evaluates the outside of a data structure; the insides of your data are evaluated lazily, and this doesn't necessarily happen on one of the parallel lightweight threads as intended.

To evaluate anything more complicated than a list of simple values, you'll probably need to use parMap rdeepseq, which recurses inside your data on the lightweight thread so that it's fully evaluated by the time parMap finishes. rdeepseq is defined in terms of the Control.DeepSeq module, and if you've defined your own types, you'll need to give them NFData instances.

A final note on parMap: don't forget to compile with the -threaded flag, then run your program with +RTS -Nx, where x is the number of CPU cores you want to distribute across.
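Here's a minimal, self-contained sketch of the pattern (my own example, not code from the ray tracer; it assumes the parallel package's Control.Parallel.Strategies):

```haskell
import Control.Parallel.Strategies (parMap, rdeepseq)

-- Apply an expensive function to each list element in parallel; rdeepseq
-- forces each result fully on the lightweight thread that computes it.
squares :: [Int] -> [Int]
squares = parMap rdeepseq (\x -> x * x)

main :: IO ()
main = print (sum (squares [1 .. 100000]))
```

Compile with ghc -threaded and run with +RTS -N2 to spread the work over two cores; the code itself doesn't change.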

forkIO and MVars
Haskell supports good old-fashioned manual concurrency with functions in the Control.Concurrent module. This works broadly the same as in other languages:

  • Use forkIO to start an IO computation in parallel with the current thread
  • Threads can communicate through objects called MVars: create a new one using newEmptyMVar or newMVar
  • MVars can be empty, or they can hold at most one value:
      • Read a value from an MVar using takeMVar. This will block if the MVar is currently empty.
      • Write a value to an MVar using putMVar. This will block if the MVar is currently full. Calling putMVar wakes up one of the threads that's blocked inside takeMVar.

forkIO starts a computation in parallel and lets you essentially forget about it; MVars are an effective way of encapsulating mutable state shared between threads. What I like about this is that there's no error-prone manual locking like you'd find in, say, Java or .NET. Although forkIO and MVars give you full control over thread scheduling, they're not drop-in replacements for sequential calculations in the way that parMap is.
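A minimal sketch of the handshake (again my own example, nothing from the ray tracer):

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

main :: IO ()
main = do
    result <- newEmptyMVar                                  -- starts empty, so takeMVar will block
    _ <- forkIO (putMVar result (sum [1 .. 1000 :: Int]))   -- worker thread fills the MVar
    total <- takeMVar result                                -- blocks until the worker calls putMVar
    print total                                             -- prints 500500
```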

Brute force

The previous two techniques rely on the multi-threading capabilities of the Haskell runtime. They know nothing of what's happening outside of a single Haskell process, which limits them to the number of cores present in the machine. To go beyond that, separate instances of your program will need to communicate with each other over a network.

In my Haskell port of smallpt, I used this technique to run as many instances of the program as needed: one process per core on the local machine, and one process on each of a range of instances running on Amazon EC2. The communication protocol is as simple as possible: plain text over pipes. I used ssh to communicate with the EC2 instances, so from the outside there's no difference between a local instance and one running on the cloud.

The algorithm goes as follows:

  1. Establish the work you need to do. In this case, a work item is one line in the final rendered image.
  2. Split the work into tasks you can give to the worker processes. For simplicity, I'm splitting the work equally to give one task to each worker.
  3. Launch the workers and provide them with the task inputs. Since workers accept plain text, I take a Work object and hPrint it to a pipe.
  4. When each worker finishes, collect its output. Outputs will come back in an unpredictable order.
  5. When all the workers are done, sort all of the outputs into the same order as the inputs

The coordinator is the program you launch from the command line, and it deals with things like command-line parsing, executing the distribution algorithm above, and writing a .png format image at the end. The worker processes are themselves instances of the smallpt executable, launched with the -r flag. In effect what each smallpt -r does is:

-- Render one line of the scene and return it as a list of pixels
line :: (Floating a, Ord a, Random a) => Context a -> Int -> [Vec a]

runWorker = interact (show . map line . read)

One potential flaw in this is that, because you're communicating over plain text, the Haskell type system won't trap any errors: there isn't necessarily any link between the types in the coordinator and the types in the worker. The interface to the distribution code consists of a Coordinator a b record, where a is the type for inputs to the workers, and b is the type that the workers produce as output:

data (Show a, Read b) => Coordinator a b = 
    Coordinator { submitWork :: [String] -> [a] -> IO [b],
                  runWorker :: IO () }

coordinator :: (Read a, Show a, Read b, Show b) => (a -> b) -> Coordinator a b
coordinator worker = 
    Coordinator { submitWork = submitWork',
                  runWorker = interact (show . map worker . read) }

Most of the multi-threading code lives inside the submitWork' function which implements the algorithm above:

  1. Given a list of worker command lines ([String]) and the work itself ([a]), produce a list of tuples of workers and tasks
  2. Using forkIO, fork one thread per worker. Inside each of these threads:
  3. Start the worker process itself (via createProcess and shell)
  4. Write an [a] to its standard input
  5. Read a [b] from its standard output
  6. Put the [b] into an mvar shared between all worker threads
  7. Back in the coordinator thread, call takeMVar once for each worker. This produces a list of results as a [[b]].
  8. Once all the workers have finished, collapse the results into a [b], making sure the output list comes back in the same order as the original inputs. If any of the workers encountered an error, use ioError to fail the whole operation.
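Putting those steps together, a simplified submitWork' might look like the following. This is a sketch rather than the actual code from the repository: the chunk helper is my own, the error handling from step 8 is omitted, and the work is split one task per worker as described above.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import System.IO (hClose, hGetContents, hPrint)
import System.Process

submitWork' :: (Show a, Read b) => [String] -> [a] -> IO [b]
submitWork' commands work = do
    -- one task per worker, and one MVar per task to collect its result
    let tasks = zip commands (chunk (length commands) work)
    vars <- mapM (const newEmptyMVar) tasks
    mapM_ (\((cmd, input), var) -> forkIO $ do
              (Just toWorker, Just fromWorker, _, _) <-
                  createProcess (shell cmd) { std_in = CreatePipe, std_out = CreatePipe }
              hPrint toWorker input          -- write the [a] as plain text
              hClose toWorker
              output <- hGetContents fromWorker
              putMVar var $! read output)    -- parse the [b] and publish it
          (zip tasks vars)
    -- takeMVar in task order, so the results line up with the original inputs
    fmap concat (mapM takeMVar vars)

-- split the work into roughly equal pieces, one per worker
chunk :: Int -> [a] -> [[a]]
chunk n xs = go (max 1 ((length xs + n - 1) `div` n)) xs
  where go _ [] = []
        go size ys = take size ys : go size (drop size ys)
```

Because each worker has its own MVar, collecting results in task order recovers the original input order even though the workers finish unpredictably.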

This approach works remarkably well for this simple ray tracer demo. But it has some serious flaws that you'd want to avoid in a production system:

  • Any failed worker process invalidates the whole set of results. Even in the absence of software bugs, machines can fail any time (consider how often disks fail when you've got 4,000 of them). The coordinator should recognise transient errors and retry those tasks; it might choose to take workers out of rotation if they fail repeatedly.
  • All machines are assumed to be equally powerful. This is rarely the case: for instance, workers running on my Mac here finish far sooner than ones on my small Amazon instances.
  • All work is assumed to take the same length of time to complete. Even in this simple ray tracer, plain diffuse surfaces are much easier to calculate than reflective (shiny) ones or refractive (transparent) ones.
  • Tasks are allocated up front. If a worker finishes early -- maybe because it's running on a Mac Pro, or because it generated a line of empty space -- it can't have any more work allocated to it.
  • Workers can't spawn other workers. In my code the worker function isn't declared in the IO monad, so it can't interact with anything. Even if it could, it would need to know about all of the other workers so that it could pick an idle core to do work on.

Next time I'll talk about some of the code in the ray tracer itself, as well as some approaches you'd use in practice to correct these flaws in the demo.


An improved spelling corrector in Haskell

November 01, 2009 at 11:09 AM | categories: Uncategorized

My previous post came up on the Haskell Reddit yesterday, and I got some great suggestions for improvements to the program, which I've checked into GitHub.

  • Map.fromListWith replaces List.foldl' and Map.insertWith'. fromListWith appears to be specifically designed for aggregating values that share a common key (in .NET terminology it's the GroupBy function).

  • I'll commonly use a construct like readFile "big.txt" >>= return . train . lowerWords, to bind a monad value readFile "big.txt" to a function (train . lowerWords), then produce a new monad value with return. A shorter way to write this is fmap (train . lowerWords) (readFile "big.txt"), or, with an import from Control.Applicative, (train . lowerWords <$> readFile "big.txt").

  • You can replace the lambda syntax (\w -> Map.lookup w nwords) with an operator section, (`Map.lookup` nwords). You might see this more commonly as (+1) in place of (\x -> x + 1); it's the same thing.
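The first and third suggestions are easy to demonstrate in isolation (wordCounts and countOf here are my own stand-ins for the real train and lookup code):

```haskell
import qualified Data.Map as Map

-- fromListWith aggregates values that share a key, replacing the
-- foldl'/insertWith' combination
wordCounts :: [String] -> Map.Map String Int
wordCounts ws = Map.fromListWith (+) [ (w, 1) | w <- ws ]

-- an operator section in place of a lambda
countOf :: Map.Map String Int -> String -> Maybe Int
countOf counts = (`Map.lookup` counts)
```

For example, countOf (wordCounts ["the", "cat", "the"]) "the" gives Just 2.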

I should have realised this at the time, but it didn't occur to me that you can use full pattern-matching syntax in the generators of a Haskell list comprehension. If the match fails for some element, that element is filtered out. We can use this to improve on the Python version of the edits1 function: whereas the Python version combines its b[0] subscripts with an if b check, the Haskell version can use pattern matching to do both at once. We can also use the inits and tails functions to build substrings instead of brute-force applications of take and drop.

# Python
def edits1(word):
   s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
   deletes    = [a + b[1:] for a, b in s if b]
   transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b)>1]
   replaces   = [a + c + b[1:] for a, b in s for c in alphabet if b]
   inserts    = [a + c + b     for a, b in s for c in alphabet]
   return set(deletes + transposes + replaces + inserts)

> -- Haskell
> edits1 word = let s = zip (inits word) (tails word)
>                   deletes    = [ a ++ y     | (a, _:y  ) <- s ]
>                   transposes = [ a ++ y:x:z | (a, x:y:z) <- s ]
>                   replaces   = [ a ++ c:y   | (a, _:y  ) <- s, c <- alphabet ]
>                   inserts    = [ a ++ c:x   | (a, x    ) <- s, c <- alphabet ]
>                in Set.fromList $ concat [ deletes, transposes, replaces, inserts ]

The edits1 function is the densest one in both the Python and Haskell versions. It's the one where the intent of the code was least clear to me when I originally saw it, probably because of the extra clutter in the list comprehensions. In the latest Haskell revision it's more obvious what's going on.

I've consistently used the names x, y and z for the first three parts of the word's suffix. Because they consistently stand for the same thing, any inconsistencies jump out at you when you read the code:

  • a ++ y stands out because x is missing; it's been deleted. (x isn't even mentioned in the pattern match.)
  • a ++ y:x:z stands out because x and y are the wrong way round; we're transposing them
  • a ++ c:y stands out because we've got c instead of x; x has been replaced. (Again, x has dropped out of the pattern match.)
  • a ++ c:x has c in front of x; it's been inserted

After I'd written the first Haskell version I was skeptical. The Python version uses some typical Python idioms, like if b to check for an empty string, the same syntax for indexing lists and sets, and an overloaded or operator to find the first non-empty set, and Haskell doesn't have the same shortcuts. However, the latest set of revisions makes the core algorithm in the Haskell version more readable than the Python equivalent.


A spelling corrector in Haskell

October 31, 2009 at 03:07 PM | categories: Uncategorized

Update: Thanks to the commenters on this blog and on Reddit, I've got a much improved and more readable Haskell port of the spelling corrector.

On Wednesday I attended the StackOverflow DevDay in London, which was a day of excellent talks by engaging speakers. Michael Sparks gave a talk on Python, where he built up Peter Norvig's spelling corrector line by line as we watched. I was impressed by how easy it was to understand the source code, and the style struck me as being particularly functional. So, I was compelled to translate the Python source into Haskell.

This post is also a Literate Haskell program, meaning that you should be able to copy the text from the page, paste it into a .lhs file, and compile it. Snippets starting with > are Haskell; in between is Peter's original program, and my commentary on the differences.

 # Python region looks like this:
import re, collections

> -- Haskell region looks like this:
> module Main where
> import Char
> import qualified Data.List as List
> import qualified Data.Map as Map
> import qualified Data.Set as Set
> import Data.Ord
> import IO
> import List

Every language needs some imports. The Python program only has two; we're using a scattering of functions outside of the prelude (Haskell's standard library), so we need to import a bit more:

  • Char gives us toLower and isAlpha
  • Data.List gives us foldl'
  • Data.Map gives us the Map type
  • Data.Set gives us the Set type
  • Data.Ord gives us the comparing function
  • IO gives us isEOF
  • List gives us maximumBy

def words(text): return re.findall('[a-z]+', text.lower())

> lowerWords = filter (not . null) . map (map toLower . filter isAlpha) . words

The Haskell prelude already has a function called words, which splits a string by spaces. filter isAlpha and filter (not . null) approximate the Python regular expression:

  • filter isAlpha drops all characters outside of a-z
  • filter (not . null) excludes any empty strings (such as sequences of numbers or punctuation in the original text)
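As a quick check of the behaviour, here's the same definition as a standalone snippet with an illustrative input of my own:

```haskell
import Data.Char (isAlpha, toLower)

lowerWords :: String -> [String]
lowerWords = filter (not . null) . map (map toLower . filter isAlpha) . words
```

lowerWords "The cat, 42 mice!" gives ["the","cat","mice"]: the digits and the punctuation both disappear.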

def train(features):
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

> train = List.foldl' (\dict word -> Map.insertWith' (+) word (1::Int) dict) Map.empty 

Haskell doesn't need an explicit loop here: we use foldl' to iterate over the list of words and add each one to a map. The Map.insertWith' function either inserts a value (if missing), or extracts the existing value, applies it to a function, and inserts the result back in the map.

NWORDS = train(words(file('big.txt').read()))

> readNWORDS = readFile "big.txt" >>= return . train . lowerWords

A big difference in the Haskell version is that file I/O is encapsulated in the IO monad. So whereas Python's NWORDS variable is an actual dictionary, readNWORDS is an I/O action that, when executed, reads and parses the file and yields a dictionary.

alphabet = 'abcdefghijklmnopqrstuvwxyz'

> alphabet = [ 'a' .. 'z' ]

I put a cheeky shortcut in the Haskell version. (It makes no difference to the line count.)

def edits1(word):
   s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
   deletes    = [a + b[1:] for a, b in s if b]
   transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b)>1]
   replaces   = [a + c + b[1:] for a, b in s for c in alphabet if b]
   inserts    = [a + c + b     for a, b in s for c in alphabet]
   return set(deletes + transposes + replaces + inserts)

> edits1 word =
>     let s = [ (take i word, drop i word) | i <- [ 0 .. length word ] ]
>         deletes    = [ a ++ tail b | (a, b) <- s, not $ null b ]
>         transposes = [ a ++ b!!1 : b!!0 : drop 2 b | (a, b) <- s, not $ null b, not $ null $ tail b ]
>         replaces   = [ a ++ c : tail b | (a, b) <- s, c <- alphabet, not $ null b ]
>         inserts    = [ a ++ c : b | (a, b) <- s, c <- alphabet ]
>     in Set.fromList (deletes ++ transposes ++ replaces ++ inserts)

The Haskell and Python versions of this function are fairly close. The main differences:

  • Haskell uses the let ... in keywords to declare values
  • List comprehension syntax is very similar, if you replace Python's for and in with Haskell's | and <-. (Recurring theme: Haskell prefers symbols to keywords.)
  • [ start .. end ] is Haskell's built-in range syntax. It's lazy, and generates elements only on demand, which means it's fine to construct an infinite list like this: [ 1 .. ]
  • Python has neat string slicing syntax:
      • [:i] is replaced with take i, to take the first i characters
      • [i:] is replaced with drop i, to take all but the first i characters
  • Subscripts can be replaced with the !! operator, which does the same thing, but with O(N) complexity. Remember that Haskell's strings are lists of characters, not arrays (although see Data.ByteString).
  • Python's if keyword maps to different things depending on the context. Here it's used to ask 'is the string empty?', and we replace it with the not and null functions.
  • ++ and : stand in for Python's + operator, depending on whether we're concatenating two lists (a ++ b) or pushing a single element onto the start of a list (head : tail)

I'm not totally convinced about the performance of the Haskell version, since take and drop are O(N) and not O(1). However, N is small here, being the length of a word. If it's a problem we could use ByteString instead for O(1) complexity at the price of having to copy strings.

def known_edits2(word):
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)

> known_edits2 knownWords = Set.unions . Set.elems . Set.map (Set.intersection knownWords . edits1) . edits1

Here I replaced the Python list comprehensions with Haskell's built-in set functions. We can't iterate directly over a set in Haskell, so if we used a list comprehension here, we'd have to use elems to produce a list from the set and fromList to turn the result into a set again. Here I feel that the Python version demonstrates the intent more clearly, which is to produce a set of edits from edits1, run those through edits1 again, and keep only the edits that turn out to be real words.

Read from right to left, what the Haskell version does is:

  • Produces a set of edits to a word using edits1
  • For each edit in the set, produce another set of edits using edits1, and keep only those edits-of-edits that can be found in knownWords. We now have a set of sets.
  • Turn a set of sets into a list of sets, using elems
  • Collapse that list of sets into a single set of words using unions

def known(words): return set(w for w in words if w in NWORDS)

> -- no Haskell equivalent

The known function actually disappears in the Haskell version. Because it takes a set of known words instead of a map of word frequencies, it turns into a direct call to intersection lower down.

def correct(word):
    candidates = known([word]) or known(edits1(word)) or known_edits2(word) or [word]
    return max(candidates, key=NWORDS.get)

> correct nwords word = 
>     let knownWords = Map.keysSet nwords
>         candidates = Set.elems
>                    $ head
>                    $ filter (not . Set.null)
>                    $ [ Set.intersection knownWords $ Set.singleton word,
>                        Set.intersection knownWords $ edits1 word,
>                        known_edits2 knownWords word,
>                        Set.singleton word ]
>       in maximumBy (comparing (\w -> w `Map.lookup` nwords)) candidates

Python has a magic short-circuiting or operator, which we have to fake by putting together a list of sets and finding the first non-empty one. Because Haskell is lazy this does in fact short-circuit: for instance, we never make a call to known_edits2 if we can already find word in knownWords.
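You can see the same short-circuiting with nothing but the Prelude (a toy example of mine, not code from the spelling corrector): head stops at the first list that passes the predicate, so the undefined element is never evaluated.

```haskell
firstNonEmpty :: [Int]
firstNonEmpty = head (filter (not . null) [ [], [1, 2], undefined ])

main :: IO ()
main = print firstNonEmpty   -- prints [1,2]; the undefined element is never touched
```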

I'm not a fan of maximumBy in Haskell, which makes you compare two items yourself; I prefer the Python version, which is similar to .NET's OrderBy function. Here, though, the comparing function in Data.Ord makes the code a little less verbose.

Finally, here's a little command-line demo that corrects words the user types in. It's only at this point that the Haskell version touches the text file that the Python program encapsulates in the NWORDS variable; the Haskell version passes the dictionary through each function that needs it. I could have done a direct translation of the Python, but this would have meant writing most of the program as imperative IO monad code, which wasn't really the point of the Haskell translation.

> prompt nwords = do
>     eof <- isEOF
>     if eof
>       then return ()
>       else do
>           getLine >>= putStrLn . correct nwords
>           prompt nwords
> main = readNWORDS >>= prompt