Foray Into Haskell
A couple of months ago, I suddenly had this desire to learn Haskell. I was working on my compilers project (obviously) and randomly thought it would be cool to know and use a purely functional language for whatever reason. Plus, I’ve been reading about Haskell a bit on blogs and stuff and one of my friends uses it pretty regularly so I figured I’d give it a shot.
Why learn Haskell? I’m not sure if I’d ever really need to use it, but of course I may have a job where I program in a purely functional language and then it would certainly help. But more importantly, I wanted to challenging myself to thinking in a new way, and of course it’s good to have another tool in our tool belt. For more general reasons to learn Haskell, check this out.
Anyway, about 2 months ago, I went to the Haskell website, downloaded GHC and started reading this as a tutorial. I knew some of the basic stuff, like that Haskell is purely functional and whatnot but that’s a good document to read to learn about some of the design decisions and of course it serves as a tutorial. I didn’t get very far in the tutorial due to school and other responsibilities
That was 2 months ago. Last night, I decided to re-start my foray into Haskell and went through several chapters of the tutorial. I’m not done with it, but I hate learning by reading and decided that I’d learn by doing. So I set out to implement some “simple” algorithms in Haskell and see how/why they are different from the traditional implementations. Now of course my code isn’t the best written Haskell (because I’ve been writing for less than a day), and I’m pretty sure there are better ways to implement these algorithms, but it was definitely a good learning experience.
Note: I didn’t want to use some of the more advanced Haskell features like monads largely because I don’t really understand them yet, and also because I wanted to keep my algorithms really pure and functional. I don’t yet buy that monads are within the rules of haskell’s purely functional-ness, but I’m sure I’ll understand that later on.
One of the algorithms I especially struggled with was Depth First Graph Search. At first, I didn’t expect it to be very challenging because DFS is naturally recursive and that’s pretty much the common way to write algorithms in Haskell anyway. So I started out with some data type definitions and some functions to manipulate the data.
[code lang="c"]
---- a Node is an index and an adjacency list ----
data Node = Node Integer [Integer]
makeNode :: Integer -> [(Integer, Integer)] -> Node
makeNode ind eList = Node ind
((map (\x -> snd x) (filter (\x -> (fst x)==ind) eList))
++ (map (\x -> fst x) (filter (\x -> (snd x)==ind) eList)))
getAdjList (Node _ b) = b
getId (Node a _) = a
instance Show Node where
show (Node a b) = "Node " ++ show a ++ " " ++ show b
---- a Graph is a list of nodes ----
data Graph = Graph [Node]
instance Show Graph where
show (Graph x) = foldl (++) "" (map (\y -> show y ++ ['\n']) x)
makeGraph :: Integer -> [(Integer, Integer)] -> Graph
makeGraph numNodes edges = Graph (map (\x -> makeNode x edges) (enumerateList numNodes))
[/code]
So hopefully that is relatively straightforward. My basic graph structure is a list of Nodes, and each Node is a just a tuple containing an index and a list of neighboring node indices. And how concise is that makeNode function… I love it! There’s also a function to make a graph out of a list of edges, and it basically just calls makeNode a bunch of times.
Ok then my first approach was to try and implement a standard DFS. It looked something like this:
[code lang="c"]
---- just the type signature of getNode. Implementation is too long and boring ----
getNode :: Graph -> Integer -> Maybe Node
---- run depthFirstSearch starting from ind, return the list of reachable nodes ----
depthFirstSearch :: Graph -> Integer -> [Node]
depthFirstSearch g ind = foldr (++) [fromJust (getNode g ind)]
(map (\x -> depthFirstSearch g (getId x))
(map (\x -> fromJust (getNode g x)) (getAdjList (fromJust (getNode g ind)))))
[/code]
So yes this is unnecessarily complicated, (you don’t need the last map because depthFirstSearch takes integers anyway. It’s like applying a function, then applying the inverse right away.) However, the whole routine is wrong and I introduced the complication as I thought about how to fix it. It’s wrong because in a nested call to depthFirstSearch you have no idea what nodes you’ve visited. And using this approach, at best you can know the nodes that are on the path from the current node to the root, but you cannot know about the other nodes that you’ve already visited.
For example, supposed I have edges = (1, 2), (1, 3), (2, 3) and I call depthFirstSearch (makeGraph 3 edges) 1. What happens? Well, we “visit” node one, and then call depthFirstSearch (makeGraph 3 edges) 2. This obviously calls depthFirstSearch (makeGraph 3 edges) 1 and so on ad infinitum.
Ok so we can quickly fix this infinite recursion problem. By removing the current node from the graph before we recurse. But this still doesn’t fix all of our problems. Now while visiting node 2, we recurse to node 3, then we back up to node 1 and visit node 3 again. At the top level, we have know way of knowing that we already visited node 3 earlier.
So if we’re doing tree search, the above algorithm (with the “remove current node” adjustment) should work. Actually there are some minor adjustments: Here’s my final Depth First Search for Tree Search:
[code lang="c"]
---- run depthFirstSearch starting from ind, return the list of reachable nodes ----
depthFirstSearch :: Graph -> Integer -> [Node]
depthFirstSearch g ind = foldl (++) [fromJust (getNode g ind)]
(map (\x -> depthFirstSearch (removeNode g ind) (getId x))
(map
(\x -> fromJust (getNode g x))
(filter
(\x -> isJust (getNode (removeNode g ind) x))
(getAdjList (fromJust (getNode g ind))))))
[/code]
So a good detour but not really what I was after. Figuring out how to get DFS graph-search to work wasn’t trivial for me. After a couple of hours here’s what I came up with:
[code lang="c"]
dfsIterateOverNeighbors :: Graph -> [Integer] -> [Node]
dfsIterateOverNeighbors g toVisit =
if length toVisit == 0
then []
else
let firstNode = fromJust (getNode g (head toVisit))
newG = removeNode g (getId firstNode)
neighbors = (filter (\x -> isJust (getNode newG x)) (getAdjList firstNode))
in [firstNode] ++
(dfsIterateOverNeighbors
newG
(neighbors ++ (filter (\x -> (not (contains x neighbors))) (tail toVisit))))
depthFirstSearch g ind =
dfsIterateOverNeighbors g [getId (fromJust (getNode g ind))]
[/code]
So, let’s look quickly at how this works. Basically, depthFirstSearch immediately calls dfsIterateOverNeighbors (which is probably not well named but whatever) and this function does some magic. Actually, all it really does is maintain a stack of nodes to visit, and visit them in stack order. Each time you visit a node, you also remove it from the graph for subsequent “iterations” to know that you’ve visited the node. Finally, I’m careful about what nodes get put on the stack, not allowing nodes that I’ve visited already and cleaning up the backside of the stack with nodes that I figured out I’ll visit sooner than I had previously thought.
So, in thinking about this. I realize this is basically an iterative DFS algorithm. A lot of “state” is maintained as function arguments, and the whole thing works tail-recursively. You do a bunch of work, then tell the next guy to do some stuff, which seems a lot like imperative programming. But it is pretty interesting that to make a “simple” recursive algorithm work in a functional setting, I essentially made it more imperative.
So I was relatively happy with this solution and decided to check my answers. I looked at what other people have done online and found a couple of interesting things. Most notable was this paper that uses mutable Arrays to implement Depth First Search in linear time. However, immutable arrays are “cheating” by my standard so these guys don’t have a solution to my problem.
I also ran across this thread that is loosely related. Actually I only really found the first two posts related but essentially it seems that it is quite challenging to implement DFS in linear time without using some mutable state. In looking at my algorithm, I’m pretty sure it’s quadratic (O(|V| calls to dfsIterateOverNeighbors and in each call you have to walk across toVisit which is also O(|V|)). However, I could not find a “better” way to implement Depth First Search under my somewhat arbitrary restrictions.
In conclusion, my first experience with Haskell was kind of cool, I had to think really hard about something that is usually pretty trivial but I did come up with a decent solution and learned a bunch of language features along the way. I think I will keep trying to implement various algorithms in Haskell whenever I get free time and maybe I’ll write about it too.
June 24th, 2009 at 4:07 pm
You can automatically derive the Show class for your type.
data Node = Node Integer [Integer]
deriving Show
You can also derive Eq (equality comparison) and Read (simple parsing), among others.
Also, ‘(\x -> fst)’ can be written more concisely as ‘fst’.
June 24th, 2009 at 4:10 pm
You could also write makeNode more concisely with list comprehensions, which are supposed to look like set-builder notation:
makeNode ind eList = Node ind ([ y | (x,y) <- eList, x==ind] ++ [ x | (x,y) <- eList, y==ind])
June 24th, 2009 at 5:58 pm
Hi! Without really looking at what your code is doing or testing the following, I thought I would post a version of your last code block that is a little more idiomatic and better (and sometimes just different) in some other ways as well:
dfsIterateOverNeighbors :: Graph -> [Integer] -> [Node]
dfsIterateOverNeighbors g [] = []
dfsIterateOverNeighbors g (v:vs) =
let firstNode = fromJust (getNode g v)
newG = removeNode g (getId firstNode)
neighbors = filter (isJust . getNode newG) (getAdjList firstNode)
newToVisit = neighbors ++ filter (not . flip contains neighbors) vs
in firstNode : dfsIterateOverNeighbors newG newToVisit
depthFirstSearch g =
dfsIterateOverNeighbors g . return . getId . fromJust . getNode g
Hope that type-checks!
June 24th, 2009 at 5:59 pm
… sorry about the formatting. I tried to stick it in a tag
June 25th, 2009 at 8:54 am
Thanks for the comments guys. Good to know people are interested…
@InternetGuy: As I mentioned, I’m still really knew to Haskell so thanks for the “deriving” tip. I should caught that unnecessary lambda though. Didn’t know about list comprehensions so thanks for that too!
@jberryman these are all techniques i’ve read about but didn’t know it is the norm in terms of programming practices. Thanks!
June 25th, 2009 at 1:38 pm
Why do you consider immutable arrays “cheating” ? They’re immutable, it’s not like they’re impure in any way. You can do lot of cool stuff with Haskell arrays : especially keeping in mind that they’re lazy by default so you can use them in their own initialization, it’s surprising how much can be done thanks to that.
I must say though that graphs and algorithms on them have always been something of a problem in pure functional languages, there are some interesting approach around (look at FGL http://hackage.haskell.org/cgi-bin/hackage-scripts/package/fgl for instance).
June 26th, 2009 at 1:12 pm
@Jedai: yes you’re right. I mentioned that I just started learning Haskell and in as much as I’ve learned, I haven’t gotten to Array’s yet. I’m sure they are very cool and highly useful as a language construct.
Also thanks for the FGL link. I will check that out.
June 28th, 2009 at 2:01 pm
[...] and as I like to learn by doing, I’ve been implementing common algorithms in the language. Last week I tackled depth first search and found it to be surprisingly challenging and consequently really [...]