willow

Trees too large to load
<--- four mins reading time
Today I wrote a small crate willow that can be used to lazily traverse trees. It's a bit of a niche library, but I thought I'd share it anyway.
The idea is that you can build a tree of nodes, and then traverse it in a lazy fashion. This is useful if you have a tree that is very deep, but you only need to use a small part of it.
You write code as though the entire tree exists, and willow will quitely add and cache nodes as you traverse the tree.
Willow clocks in at about 90 SLOC: pretty small indeed!
Willow Tree
Chess
Chess has an extremely large branching factor. Sometimes, it's useful to abstract away move generation and focus on how we traverse a position. This is where lazy trees come in.
This is also applicable to other games with large branching factors, such as Go.




File Systems
File systems are also a good example of a tree that could be very deep but only as small part of it is used. For example, if you're looking for a file in a directory, you don't need to traverse the entire directory tree.

impl Node for Board {
  fn children(&self) -> Vec<Self> {
    self.legal_moves()
  }

  fn leaf(&self) -> bool {
    self.is_over()
  }
}

impl Node for Dir {
  fn children(&self) -> Vec<Self> {
    self.subdirs()
  }

  fn leaf(&self) -> bool {
    self.files().is_empty()
  }
}
Traversal
Let's define a function, walk which simply walks depth-first through a tree.
It's that simple! From a few rules, we can derive a lot of functionality.



Unrendered
Sometimes, we only want to iterate over parts of the tree which have already been rendered. This might be because creating new branches is expensive, or because we want to avoid traversing deeper than we have already.
Traversing rendered nodes is as simple as changing children to rendered_children.

fn walk(tree: &mut Tree<Node>, id: Id) {
  for child in &mut tree.children(id) {
    println!("{:?}", tree.get(*child).unwrap());
    walk(tree, *child);
  }
}


tree.children(id)
tree.rendered_children(id)
Transposition
In some trees there will be some nodes which naturally have multiple parents:
  • Chess: the same position can be reached from different positions
  • Symbolic links: a directory can be the target of a symlink from multiple places, and be the child of a parent directory simultaneously
We can solve this using a transposition table: define a hash function for each node, and use it to check if a node has already been added to the tree. If it has, we can redirect to the first instance of the node.
This, technically, is no longer a tree, but a directed acyclic graph. However, it's still useful to think of it as a tree.
Performance
At first, it may seem like this could be extremely slow. In practice, this is not a problem. The performance impact of a hash table on small trees is negligible, and the performance impact of a hash table on large trees compensates for the cost of maintaining two identical sub-trees below nodes with the same hash.
// check if child is already in the tree
if let Some(c) = self.hash_table.get(&child) {
    // if so, append the existing node
    // instead of creating a new one
    id.append(*c, &mut self.arena);
    continue;
}

self.add_child(id, child);
Writing willow introduced me to making and publishing rust crates. It's unlikely to be useful to anyone else, but I'm glad I wrote it. It was a fun exercise.
If I've made a mistake in this post, please let me know! I'm quite new to rust and I'm happy to learn.
  • Willow on Github
  • Willow on crates.io
  • Email me
Thanks for reading.
<--- Home