Posts Tagged ‘haskell cs programming dfs’

Foray Into Haskell

Wednesday, June 24th, 2009

A couple of months ago, I suddenly had this desire to learn Haskell. I was working on my compilers project (obviously) and randomly thought it would be cool to know and use a purely functional language for whatever reason. Plus, I’ve been reading about Haskell a bit on blogs and stuff and one of my friends uses it pretty regularly so I figured I’d give it a shot.

Why learn Haskell? I’m not sure if I’d ever really need to use it, but of course I may have a job where I program in a purely functional language and then it would certainly help. But more importantly, I wanted to challenging myself to thinking in a new way, and of course it’s good to have another tool in our tool belt. For more general reasons to learn Haskell, check this out.

Anyway, about 2 months ago, I went to the Haskell website, downloaded GHC and started reading this as a tutorial. I knew some of the basic stuff, like that Haskell is purely functional and whatnot but that’s a good document to read to learn about some of the design decisions and of course it serves as a tutorial. I didn’t get very far in the tutorial due to school and other responsibilities

That was 2 months ago. Last night, I decided to re-start my foray into Haskell and went through several chapters of the tutorial. I’m not done with it, but I hate learning by reading and decided that I’d learn by doing. So I set out to implement some “simple” algorithms in Haskell and see how/why they are different from the traditional implementations. Now of course my code isn’t the best written Haskell (because I’ve been writing for less than a day), and I’m pretty sure there are better ways to implement these algorithms, but it was definitely a good learning experience.

Note: I didn’t want to use some of the more advanced Haskell features like monads largely because I don’t really understand them yet, and also because I wanted to keep my algorithms really pure and functional. I don’t yet buy that monads are within the rules of haskell’s purely functional-ness, but I’m sure I’ll understand that later on.

One of the algorithms I especially struggled with was Depth First Graph Search. At first, I didn’t expect it to be very challenging because DFS is naturally recursive and that’s pretty much the common way to write algorithms in Haskell anyway. So I started out with some data type definitions and some functions to manipulate the data.

[code lang="c"]
---- a Node is an index and an adjacency list ----
data Node = Node Integer [Integer]

makeNode :: Integer -> [(Integer, Integer)] -> Node
makeNode ind eList = Node ind
((map (\x -> snd x) (filter (\x -> (fst x)==ind) eList))
++ (map (\x -> fst x) (filter (\x -> (snd x)==ind) eList)))

getAdjList (Node _ b) = b

getId (Node a _) = a

instance Show Node where
show (Node a b) = "Node " ++ show a ++ " " ++ show b

---- a Graph is a list of nodes ----
data Graph = Graph [Node]

instance Show Graph where
show (Graph x) = foldl (++) "" (map (\y -> show y ++ ['\n']) x)

makeGraph :: Integer -> [(Integer, Integer)] -> Graph
makeGraph numNodes edges = Graph (map (\x -> makeNode x edges) (enumerateList numNodes))
[/code]

So hopefully that is relatively straightforward. My basic graph structure is a list of Nodes, and each Node is a just a tuple containing an index and a list of neighboring node indices. And how concise is that makeNode function… I love it! There’s also a function to make a graph out of a list of edges, and it basically just calls makeNode a bunch of times.

Ok then my first approach was to try and implement a standard DFS. It looked something like this:
[code lang="c"]
---- just the type signature of getNode. Implementation is too long and boring ----
getNode :: Graph -> Integer -> Maybe Node

---- run depthFirstSearch starting from ind, return the list of reachable nodes ----
depthFirstSearch :: Graph -> Integer -> [Node]
depthFirstSearch g ind = foldr (++) [fromJust (getNode g ind)]
(map (\x -> depthFirstSearch g (getId x))
(map (\x -> fromJust (getNode g x)) (getAdjList (fromJust (getNode g ind)))))

[/code]

So yes this is unnecessarily complicated, (you don’t need the last map because depthFirstSearch takes integers anyway. It’s like applying a function, then applying the inverse right away.) However, the whole routine is wrong and I introduced the complication as I thought about how to fix it. It’s wrong because in a nested call to depthFirstSearch you have no idea what nodes you’ve visited. And using this approach, at best you can know the nodes that are on the path from the current node to the root, but you cannot know about the other nodes that you’ve already visited.

For example, supposed I have edges = (1, 2), (1, 3), (2, 3) and I call depthFirstSearch (makeGraph 3 edges) 1. What happens? Well, we “visit” node one, and then call depthFirstSearch (makeGraph 3 edges) 2. This obviously calls depthFirstSearch (makeGraph 3 edges) 1 and so on ad infinitum.

Ok so we can quickly fix this infinite recursion problem. By removing the current node from the graph before we recurse. But this still doesn’t fix all of our problems. Now while visiting node 2, we recurse to node 3, then we back up to node 1 and visit node 3 again. At the top level, we have know way of knowing that we already visited node 3 earlier.

So if we’re doing tree search, the above algorithm (with the “remove current node” adjustment) should work. Actually there are some minor adjustments: Here’s my final Depth First Search for Tree Search:

[code lang="c"]
---- run depthFirstSearch starting from ind, return the list of reachable nodes ----
depthFirstSearch :: Graph -> Integer -> [Node]
depthFirstSearch g ind = foldl (++) [fromJust (getNode g ind)]
(map (\x -> depthFirstSearch (removeNode g ind) (getId x))
(map
(\x -> fromJust (getNode g x))
(filter
(\x -> isJust (getNode (removeNode g ind) x))
(getAdjList (fromJust (getNode g ind))))))
[/code]

So a good detour but not really what I was after. Figuring out how to get DFS graph-search to work wasn’t trivial for me. After a couple of hours here’s what I came up with:
[code lang="c"]
dfsIterateOverNeighbors :: Graph -> [Integer] -> [Node]
dfsIterateOverNeighbors g toVisit =
if length toVisit == 0
then []
else
let firstNode = fromJust (getNode g (head toVisit))
newG = removeNode g (getId firstNode)
neighbors = (filter (\x -> isJust (getNode newG x)) (getAdjList firstNode))
in [firstNode] ++
(dfsIterateOverNeighbors
newG
(neighbors ++ (filter (\x -> (not (contains x neighbors))) (tail toVisit))))

depthFirstSearch g ind =
dfsIterateOverNeighbors g [getId (fromJust (getNode g ind))]
[/code]

So, let’s look quickly at how this works. Basically, depthFirstSearch immediately calls dfsIterateOverNeighbors (which is probably not well named but whatever) and this function does some magic. Actually, all it really does is maintain a stack of nodes to visit, and visit them in stack order. Each time you visit a node, you also remove it from the graph for subsequent “iterations” to know that you’ve visited the node. Finally, I’m careful about what nodes get put on the stack, not allowing nodes that I’ve visited already and cleaning up the backside of the stack with nodes that I figured out I’ll visit sooner than I had previously thought.

So, in thinking about this. I realize this is basically an iterative DFS algorithm. A lot of “state” is maintained as function arguments, and the whole thing works tail-recursively. You do a bunch of work, then tell the next guy to do some stuff, which seems a lot like imperative programming. But it is pretty interesting that to make a “simple” recursive algorithm work in a functional setting, I essentially made it more imperative.

So I was relatively happy with this solution and decided to check my answers. I looked at what other people have done online and found a couple of interesting things. Most notable was this paper that uses mutable Arrays to implement Depth First Search in linear time. However, immutable arrays are “cheating” by my standard so these guys don’t have a solution to my problem.

I also ran across this thread that is loosely related. Actually I only really found the first two posts related but essentially it seems that it is quite challenging to implement DFS in linear time without using some mutable state. In looking at my algorithm, I’m pretty sure it’s quadratic (O(|V| calls to dfsIterateOverNeighbors and in each call you have to walk across toVisit which is also O(|V|)). However, I could not find a “better” way to implement Depth First Search under my somewhat arbitrary restrictions.

In conclusion, my first experience with Haskell was kind of cool, I had to think really hard about something that is usually pretty trivial but I did come up with a decent solution and learned a bunch of language features along the way. I think I will keep trying to implement various algorithms in Haskell whenever I get free time and maybe I’ll write about it too.