Finding common subexpressions: algorithm

Now that we know about the uniq storage option, we can discuss the algorithm that we will use to find the common subexpressions. We will use the uniq storage option, so all common subexpressions appear as shared nodes in the tree (as we have seen earlier). This means that it is sufficient to do a tree-walk, and mark the nodes that we visit more than once. We chose this algorithm because it is easy to express in Kimwitu.

We will do one depth-first left-to-right tree-walk in the expression tree, and
use a boolean attribute visited to see if we have visited a node before (initially unset), and
an attribute name to label shared (common) nodes (initially 0) (this implements a mapping from 'common' expressions to names)
We will also keep a reverse mapping from names to expressions.
During the tree-walk,
- We keep a list of all names (initially empty)
- when we visit a node the first time (visited is unset), we set visited
- when we visit a node a second time (visited is set, name is 0)
  - we generate a new name,
  - assign the new name to the name attribute of the current node,
  - make a mapping from the name back to the current node, and
  - insert the new name in the list of all names

After the tree walk we have a list of (names of) all common subexpressions of the expression (we have a list of names, and a mapping from names to expressions).

The following picture shows the result of applying the algorithm on our 'running example'. It shows the attributed tree, together with the mappings list.

next overview up

Last updated at 03 February '98 by kimwitu@cs.utwente.nl