Wednesday, October 7, 2009

It has been a while

Well, I ran out of steam (and time) for my foray into Haskell way back in March. I plan on starting a new project soon, one that is a quite a bit more ambitious. I have always learned better `by doing', so I am going to try to implement a Haskell package to compute the Grobner basis of an ideal in a polynomial ring using the multithreaded constructs that are built into Haskell.

Of course, this means that I will have to write a polynomial 'object', and then try to get a multithreaded version of Buchberger's algorithm programmed properly. I think I am going to be looking at Anton Leykin's paper and C++ implementation of a distributed version of Buchberger's algorithm. Wish me luck!

Thursday, March 5, 2009

SyntaxHighlighter

I found a nice little JavaScript package that will highlight code according to 'brushes' for that language called SyntaxHighlighter. I wasn't sure that I could get it to work on Blogger, but I found instructions how to do this on this webpage.

The download package does not include Haskell support, but I did find a basic one by Cristiano Paris here. I changed a few things from how he had them for this blog, and I think I may add some of the basic list operation functions to the coloring scheme at some point (although the emacs haskell-mode does not highlight these).

Wednesday, March 4, 2009

Typleclasses in Haskell

This post will try to demystify the part of the definition of the functions in the previous example before the "=>". For example, we had the definition

maxBy :: (Num t, Ord t) => Int -> [t] -> t

The function type tells us that we input an Int and a list of type ts and output a type t value.

Part of the beauty of Haskell is that one does not have to specify the type of input that we are dealing with when using maxBy.
However, in the function maxBy, we needed to take the maximum of a set of values, as well as take products of a set of values, so not just any type will do.

So, the type t input should allow one to tell when one value is larger than another, and should also make it possible to multiply values. This is where typeclasses come in. In the definition of maxBy, we tell the compiler to insist that any type t input must have typeclass Ord (to order the elements), and typeclass Num (to multiply the elements).

I won't go over this here, but one can easily create new data types that are instances of these types, provided one implements the operations that are built into Haskell for these types.

Tuesday, March 3, 2009

Haskell article

Below is a link to an article by John Goerzen (one of the authors of Real World Haskell) on how Haskell is different than the languages that most programmers are used to. If I hadn't programmed in it at all beforehand, some of these differences may have scared me away (no for loops!), but I am loving every minute of it.

Article

Monday, March 2, 2009

A More Extended Example

In this example I will try to discuss several of the features that I find very nice in Haskell. The problem we will consider is Problem 11 from Project Euler. I produced a solution that was long, mostly due to the fact that I am not very well versed in thinking in Haskell. However, one of the cool things about Project Euler is that one can look at other people's solutions and below I will present a solution due to lomeo:

import List

grid = The 20x20 grid as a list of lists.

myMaximum :: (Ord t) => [t] -> t
myMaximum [] = 0
myMaximum xs = maximum xs

takeBy :: Int -> [a] -> [[a]]
takeBy n = filter ((n==) . length) . map (take n) . tails

maxBy :: (Num t, Ord t) => Int -> [t] -> t
maxBy n = myMaximum . map product . takeBy n

maxInRows :: (Num t, Ord t) => Int -> [[t]] -> t
maxInRows n = maximum . map (maxBy n)

maxInCols :: (Num t, Ord t) => Int -> [[t]] -> t
maxInCols n = maxInRows n . transpose

maxInLRupper :: (Num t, Ord t) => Int -> [[t]] -> t
maxInLRupper n = maxInRows n . transpose . zipWith drop [0..]

maxInLRlower :: (Num t, Ord t) => Int -> [[t]] -> t
maxInLRlower n = maxInRows n . transpose . zipWith drop [0..] . transpose

maxInRLupper :: (Num t, Ord t) => Int -> [[t]] -> t
maxInRLupper n = maxInLRupper n . map reverse

maxInRLlower :: (Num t, Ord t) => Int -> [[t]] -> t
maxInRLlower n = maxInLRlower n . map reverse

findMaxProduct :: (Num t, Ord t) => [[t]] -> Int -> t
findMaxProduct xs n = maximum $ map (flip ($ n) xs) [maxInRows, maxInCols,
maxInLRupper, maxInLRlower, maxInRLupper, maxInRLlower]

main = findMaxProduct grid 4

We will take a look at the code line by line (note that the blank lines in between functions are important in Haskell, as they indicate that the definition of the function is over). The only change I made to the code was to add the function types before the function definitions, since I think it is good practice to do so (especially if trying to understand someone else's code). Also, note that it is a little awkward to read this code top-down, since it starts with the helper functions that give the solution in the end. When writing in Haskell, I usually find myself writing the code in the opposite order.

The code import List brings in Data.List which handles some of the more advanced list management tools (see the description here).

The function myMaximum just returns the maximum of a nonempty list, or zero if the list is empty. Note that when Haskell calls myMaximum on a list, it calls the first version of the function that matches the pattern in the order they were listed in the source file.

The next line is a doozy, and there is a lot going on here, so bear with me:

takeBy :: Int -> [a] -> [[a]]
takeBy n = filter ((n==) . length) . map (take n) . tails

Looking at the function declaration, one would be tempted to say that takeBy takes two arguments, an Int and a list of as. However, note that we have only specified one input variable, effectively making this function return a function with input [a] and output [[a]].

So what does this function do? The '.' denotes function composition in Haskell, so this function applies tails, then map (take n), and then filter ((n==) . length) to a list, and returns a list of lists.

The function tails returns all final segments of the argument, longest first. For example, tails [1,2,3] would return [[1,2,3],[2,3],[3],[]].

The function map is one of the mainstays of Haskell. It is of type (a -> b) -> [a] -> [b], which means it takes a function turning type a objects to type b objects, a list of a objects, and produces a list of b objects; it just applies the function given to each of the elements of the given list to produce a new list. The map allows one to perform some iterative tasks in a simple way. So, in our example, we are applying (take n) to each element of the list returned from tails, which in effect, takes the first n elements of each of the final segments of the list input. For example, one has

map (take 4) . tails "abcdefg" = ["abcd","bcde","cdef","defg","efg","fg","g",""]

In this problem, we are trying to say something about the maximal product of all length 4 'words' in the 20x20 grid. We clearly have some that are too short in this example, so we need to take off the ones that are not length 4. That is where the next function comes in. The function filter is of type (a -> Bool) -> [a] -> [a] and simply selects those elements in the list that evaluate to True when passed to the first argument. The function we are concerned with is ((n==).length), which is of type [a] -> Bool and first takes the length (of type [a] -> Int) and then checks if that number is equal to n. So, takeBy selects the initial segments of length n of the final segments of the list that is passed to it.

Now that we have slogged through that line, the rest will be easier. The next function is

maxBy :: (Ord t) => Int -> [t] -> t
maxBy n = myMaximum . map product . takeBy n

This function applies (takeby n) to a list, takes the product over all those lists with map product, and then finds the maximum of all those values with myMaximum. The next two functions are pretty simple:

maxInRows :: (Num t, Ord t) => Int -> [[t]] -> t
maxInRows n = maximum . map (maxBy n)

maxInCols :: (Num t, Ord t) => Int -> [[t]] -> t
maxInCols n = maxInRows n . transpose

The function maxInRows maps the (maxBy n) function over each row, and then takes the maximum of all those values in the resulting list. The function maxInCols uses the useful function transpose makes the rows in the grid columns and the columns rows. A nice thing about this function is that it works for non-square grids, and we will use this to our advantage.

The next two functions include an ingenious use of the function zipWith

maxInLRupper :: (Num t, Ord t) => Int -> [[t]] -> t
maxInLRupper n = maxInRows n . transpose . zipWith drop [0..]

maxInLRlower :: (Num t, Ord t) => Int -> [[t]] -> t
maxInLRlower n = maxInRows n . transpose . zipWith drop [0..] . transpose

maxInRLupper :: (Num t, Ord t) => Int -> [[t]] -> t
maxInRLupper n = maxInLRupper n . map reverse

maxInRLlower :: (Num t, Ord t) => Int -> [[t]] -> t
maxInRLlower n = maxInLRlower n . map reverse

The functions (maxInRows n) and transpose are easy enough to understand. The function zipWith is of type (a -> b -> c) -> [a] -> [b] -> [c]; it applies the function passed to it to an element each from the lists [a] and [b], producing a list [c]. If one list is longer than the other, zipWith automatically stops once one of the lists is exhausted. For example, one has the following:

zipWith (+) [1..10] [1..] = [2,4,6,8,10,12,14,16,18,20]

So, the function maxInLRupper n applies zipWith drop [0..] to an input grid, which just takes the 'upper left hand corner' of the grid by dropping the first few elements. In other words, one has

zipWith drop [0..] [[1,2,3],[4,5,6],[7,8,9]] = [[1,2,3],[5,6],[9]]

In maxInLRupper, we transpose the grid to get the left to right downward diagonals, and then compute maxInRows on that resulting list. To find those in the lower left hand corner, we apply an extra transpose before performing the computation. To get the downward to the left diagonals, we map reverse on the grid to reverse each row, and then use the previous functions to find the maximums on the diagonals.

findMaxProduct :: (Num t, Ord t) => [[t]] -> Int -> t
findMaxProduct xs n = maximum $ map (flip ($ n) xs) [maxInRows, maxInCols,
maxInLRupper, maxInLRlower, maxInRLupper, maxInRLlower]


I am still working on wrapping my head around the last function. What the command says is easy enough: map the function (flip ($ n) xs) to the elements of the list (which is a list of functions), and then take the maximum of the resulting list. However, we have to understand what the command flip ($ 4) xs maxInRows does.
The function flip takes a function of type a -> b -> c and returns a function that behaves the same, but is of type b -> a -> c. So, below is how this works:

flip ($ 4) xs maxInRows = ($ 4) maxInRows xs
= maxInRows 4 xs

The first equality is by the definition of flip, and the last equality is by definition of the function ($ 4). The operator ($) is shorthand for function evaluation and is useful to reduce the proliferation of parenthesis. In this case, ($ a) applied to a function f of type a -> b evaluates f at 4.

Phew! I am still trying to wrap my head around this last bit, but I know that I learned a lot working through this code.

By the way, after I posted this I realized I completely omitted the discussion on the stuff before the => in the function definitions. That will be the next post.

Tuesday, February 24, 2009

First example

I thought I would start with an exercise in RWH to get my feet wet, since I was having difficulty just getting a program to do anything. So, I worked on exercise 1 in chapter 3, which was to write a function that does the same as the length command. So, I wrote something to the effect of:

len count [] = count
len count (x:xs) = len count + 1 xs

A few notes on what is above. Function calling in Haskell is done by pattern matching, and the above code says what len should do if given inputs that match these patterns. The symbol [] is an empty list, and (x:xs) is shorthand for a nonempty list with x as its first element, and the remainder of the list as xs.

So I fire up Haskell at the my command prompt with ghci, then :load listLength.hs and try running this code on the list [1..1000000], and I got an Exception: stack overflow! How is this possible with something so simple? Well, it turns out that Haskell is lazy in that it does not evaluate an expression until it absolutely must. In this case, Haskell does not evaluate the expression count + 1, so it recursively calls this function until we get to the base case, the empty list. Since Haskell's operator stack can't handle this many recursions, it dies.

How can we get around this and still have this simple intuitive code? Haskell has an operator that tells the interpreter to evaluate an expression even if it is not necessary for program flow, which is the '$!' operator. So, by changing the above code to

len count [] = count
len count (x:xs) = len $! count + 1 xs

makes this do what length is supposed to, albeit much slower since it is recursive.

Monday, February 23, 2009

First Post

Hello everyone!

Having been in academics for the past seven years, it has been a long time since I have learned a new programming language. Poking around the programming subreddit, I found that many there liked the Haskell language. It sounded quirky (and it uses monads!) so I figured I would try it out.

This blog is all about me going from not knowing a thing about Haskell to hopefully being able to understand some of the more advanced ideas of the language. I will be using the book Real World Haskell by Bryan O'Sullivan, Don Stewart and John Goerzen. Feel free to comment about the efficiency of the code I post; I would love the advice!

For the first few code samples, I will be working out problems from Project Euler, which is a good supply of mildly tricky to downright hard programming problems. It should be fun!