Blog Archive

Monday, July 23, 2007

ICFP '07 Post-Mortem: IV

My friend Creighton and I entered the ICFP programming competition this year, we didn't make it very far, but then again we didn't spend as much time on it as last year. That's the way it goes.

I'm posting this in the hopes that someone can point out our bug. Please, help me!

This is our main module to convert DNA to RNA. If you're familiar with this years contest (and Haskell, I suppose) it should be reasonably comprehensible. I hope. If not, please let me know.
====================
Execute.lhs



> {-# OPTIONS -fglasgow-exts #-}

> module Execute where

> import DNA
> import Control.Monad.State
> import Data.List
> import System.IO
> import System.Environment
> import Data.Maybe
> import Maybe


I don't know that I've ever even tested our "main" method, maybe Creighton has. Almost all of our testing was done by loading the Testing module into GHCI.


> main :: IO ()
> main = do
> prefile:dnafile:_ <- getArgs
> prefix <- (openFile prefile ReadMode) >>= hGetContents
> dna <- (openFile dnafile ReadMode) >>= hGetContents
> let (World rna _ ) = execState execute (World [] (prefix++dna))
> print rna
> return ()


Here are our datatypes for the template items and pattern items. Everywhere the spec uses the Template datatype, that translates to [Template] in our implementation (and similarly for Pattern). Otherwise, these are pretty much straight from the spec.


> data Template = TBase Char | NL Int Int | N Int
> deriving Show

> data Pattern = Base Char | Skip Int | Search DNA | GOpen | GClose
> deriving Show


Most of the interesting stuff happens in the state monad, and this is our state!
DNA is of type String, and RNA is of type [DNA].


> data World = World RNA DNA
> deriving (Show, Eq)


These next few function operate on our state, they let us read DNA and write RNA. They'll get used all over the place.


> peekDNA :: (MonadState World m, Integral a) => a -> m [Char]
> peekDNA n = do World _ dna <- get
> return $ genericTake n dna

> popDNA :: (MonadState World m, Integral a) => a -> m [Char]
> popDNA n = do World a dna <- get
> put $ World a (genericDrop n dna)
> return $ genericTake n dna

> pushRNA :: MonadState World m => [Char] -> m ()
> pushRNA rna = do World rnas a <- get
> put $ World (rna:rnas) a


This is the Big Deal DNA to RNA function. The idea is that you give this to execState along with a World containing no RNA and the input DNA, and you'll get back a World containing the output RNA. The "main" function above has an example of this.


> execute :: MonadState World m => m ()
> execute = runMaybeT execute' >> return ()


The function execute' is run under the MaybeT monad transformer so that we can short-circuit the process from any of our functions by calling fail, which acts like finish () in the spec.

The function execute' looks almost exactly like in the spec. The reversals are in hopes that lots of pre-pending plus one big reverse is quicker that a lot of small post-pendings.


> execute' :: MonadState World m => m ()
> execute' = do p <- liftM reverse pattern
> t <- liftM reverse template
> matchreplace p t >> execute'


> pattern :: MonadState World m => m [Pattern]
> pattern = pattern' 0 []


Again, if you're familiar with the function pattern in the contest spec, this shouldn't look at all unfamiliar. Tail recursion is used to keep track of how match pattern is built up so far, and the current level.


> pattern' :: MonadState World m => Int -> [Pattern] -> m [Pattern]
> pattern' x ps = do
> base <- popDNA 1
> case base of
> "C" -> pattern' x ((Base 'I'):ps)
> "F" -> pattern' x ((Base 'C'):ps)
> "P" -> pattern' x ((Base 'F'):ps)
> otherwise -> do
> base' <- popDNA 1
> case (base,base') of
> ("I","C") -> pattern' x ((Base 'P'):ps)
> ("I","P") -> do
> n <- nat
> pattern' x ((Skip n):ps)
> ("I","F") -> do
> popDNA 1
> s <- consts
> pattern' x ((Search s):ps)
> otherwise -> do
> base'' <- popDNA 1
> case (base,base',base'') of
> ("I","I","P") -> pattern' (x+1) ((GOpen):ps)
> ("I","I","C") -> if x==0 then return ps else pattern' (x-1) ((GClose):ps)
> ("I","I","F") -> if x==0 then return ps else pattern' (x-1) ((GClose):ps)
> ("I","I","I") -> (popDNA 7 >>= pushRNA) >> pattern' x ps
> otherwise -> fail ""

> template :: MonadState World m => m [Template]
> template = template' []


Again, this shouldn't look unfamiliar to anyone who's spent time on the contest. And again, tail recurion is used to build up the template. In retrospect I probably didn't need to pass the current list in as a param, but hey, it's done. I guess I've bought into the "tail recursion as a replacement for state" meme. I'll need to break that habit - it isn't very Haskell-ish.


> template' :: MonadState World m => [Template] -> m [Template]
> template' ts = do base <- popDNA 1
> case base of
> "C" -> template' $ (TBase 'I') : ts
> "F" -> template' $ (TBase 'C') : ts
> "P" -> template' $ (TBase 'F') : ts
> "I" -> do base' <- popDNA 1
> case base' of
> "C" -> template' $ (TBase 'P'):ts
> "F" -> do l <- nat
> n <- nat
> template' ((NL n l):ts)
> "P" -> do l <- nat
> n <- nat
> template' ((NL n l):ts)
> "I" -> do base'' <- popDNA 1
> case base'' of
> "C" -> return ts
> "F" -> return ts
> "P" -> do n <- nat
> template' $ (N n):ts
> "I" -> do pushRNA =<< popDNA 7
> template' ts
> _ -> fail ""
> _ -> fail ""
> _ -> fail ""


This is where our error is. Are there any Haskell people who did this years ICFP contest who can tell us where we went wrong? Because I'm mystified, myself, and I really wish I had a working DNA -> RNA converter. In one of the examples, the "popDNA i" in the base case never seems to get called!


> matchreplace p t= matchreplace' p t 0 [] []
> matchreplace' [] t i e c = popDNA i >> (modify $ replace t e)
> matchreplace' (p:ps) t i e c = do
> (World _ dna) <- get
> case p of
> Base b -> if dna !!! i == (Just b)
> then matchreplace' ps t (i+1) e c
> else return ()
> Skip n -> if length dna > (i + n) then return ()
> else matchreplace' ps t (i+n) e c
> Search s -> case searchPost s dna i of
> Just n -> matchreplace' ps t n e c
> Nothing -> return ()
> GOpen -> matchreplace' ps t i e (i:c)
> GClose -> matchreplace' ps t i (e++[(subseq (head c) i dna)]) (subseq' 1 c)

> searchPost :: DNA -> DNA -> Int -> Maybe Int
> searchPost s dna i = let search n [] = Nothing
> search n (s':next) = if s' == s then Just n else search (n+1) next
> in liftM ((i+length s)+) $ search 0 ((map $ take (length s)) . tails $ genericDrop i dna)


> replace t e = replace' t e []

> replace' [] e r = (\(World rna dna) -> World rna (r++dna))
> replace' (t:ts) e r = case t of
> TBase b -> replace' ts e (r++[b])
> NL n l -> replace' ts e (r++(protect l $ fromMaybe [] (e!!!n)))
> N n -> replace' ts e (r++(asnat $ length $ fromMaybe [] (e!!!n)))


The rest of this are the utility functions used by the above. They're pretty simple, and should be familiar to you if you've done this year's ICFP competion


> asnat :: Integral n => n -> DNA
> asnat n | n==0 = "P"
> | n >= 0 && n `mod` 2==0 = 'I':(asnat $ n `div` 2)
> | n >= 0 = 'C' : (asnat $ n `div` 2)

> quote :: DNA -> DNA
> quote ('I':ds) = 'C':(quote ds)
> quote ('C':ds) = 'F':(quote ds)
> quote ('F':ds) = 'P':(quote ds)
> quote ('P':ds) = 'I':'C':(quote ds)
> quote _ = []

> protect :: Integral n => n -> DNA -> DNA
> protect 0 dna = dna
> protect n dna | n > 0 = protect (n-1) (quote dna)
> | otherwise = error "Protect should never receive a negative argument"


> nat :: (MonadState World m, Integral i) => m i
> nat = do dna <- popDNA 1
> case dna of
> "P" -> return 0
> "I" -> liftM (* 2) nat
> "F" -> liftM (* 2) nat
> "C" -> liftM (\x -> 2*x+1) nat
> _ -> fail ""

> consts :: MonadState World m => m DNA
> consts = do n <- popDNA 1
> case n of
> "C" -> liftM ('I':) consts
> "F" -> liftM ('C':) consts
> "P" -> liftM ('F':) consts
> "I" -> do n' <- peekDNA 1
> case n' of
> "C" -> popDNA 2 >> liftM ('P':) consts
> _ -> return ""
> _ -> return ""


And that's it for this file!

No comments:

Listening:

Watching:

  • House
  • Ride Back