Computer Science 235

AVL Trees


Note: Projects are to be completed by each student individually (not by groups of students).

In this project you will implement the Set abstract data type using an AVL Tree.

The program is given a list of commands as input. The program runs each command on an AVL Tree Set and reports the result of each command to the output.

Example Input

clear
add bob
add joe
add jim
print
find joe
remove bob
remove joe
remove jim
print
find joe

Example Output

clear
add bob
add joe
add jim
print
level 0: jim(2)
level 1: bob(1) joe(1)
find joe true
remove bob
remove joe
remove jim
print
find joe false

Testing

Here are some ideas for tests.

  1. Add that results in a rotate.
  2. Add that results in a double rotate.
  3. Add a duplicate item.
  4. Remove a node with two children.
  5. Remove that results in a rotate.
  6. Remove that results in a double rotate.
  7. Remove an item that is not in the tree.

Commands

The following commands can be given in the input file. Each command is given on one line in the input file. Each command has an output that is written to the output file.
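Since each command occupies one line and its parameters are separated by whitespace, reading a command can be sketched as below. This is only an illustration; the function name parseCommand and its parameters are assumptions, not part of the assignment.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: splits one input line into a command word and an
// optional item. Returns false when the line contains no command at all.
bool parseCommand(const std::string& line, std::string& command, std::string& item) {
    std::istringstream words(line);
    command.clear();
    item.clear();
    if (!(words >> command)) {
        return false;          // blank line: nothing to run
    }
    words >> item;             // stays empty for "clear" and "print"
    return true;
}
```

A caller would typically read the input with std::getline, pass each line to this helper, and then dispatch on the command word.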

Clear

The clear command has the form:

clear

The clear command removes all the items from the set. The set is empty after this operation completes.

The output for the clear command has the form: (The output is the same as the input.)

clear

Add

The add command has the form:

add item

The add command adds 'item' to the set if the item is not already in the set. If the set already contains the item, the add command leaves the set unchanged. However, the command is still output and processing continues with the next command. The 'item' parameter is a string of non-whitespace characters; it cannot contain whitespace (space, tab, newline) because whitespace is used to separate parameters.

The output for the add command has the form: (The output is the same as the input.)

add item

Remove

The remove command has the form:

remove item

The remove command removes 'item' from the set if the item is present in the set. If the set does not contain the item, the remove command leaves the set unchanged. However, the command is still output and processing continues with the next command. The 'item' parameter is a string of non-whitespace characters.

The output for the remove command has the form: (The output is the same as the input.)

remove item

Find

The find command has the form:

find item

The find command searches for 'item' in the set. The 'item' parameter is a string of non-whitespace characters. The result of the find command is the string 'true' if the item is found and the string 'false' if the item is not found.

If 'item' is found in the set the output for the find command has the form:

find item true

If 'item' is not found in the set the output for the find command has the form:

find item false

Print

The print command has the form:

print

The print command outputs the items stored in the set. The items are output using a level-order traversal of the AVL tree. The item at the root of the tree is output first, followed by the children of the root, then the grandchildren of the root, etc. The height of each item in the tree is output immediately following the item. (Each height is surrounded by parentheses.) The height of a leaf node is 1. The items on a given level in the tree are output on the same line of the output (unless there are too many items as described later).

The output for the print command has the form:

print
level 0: item1(height1)
level 1: item2(height2) item3(height3)
...

Each line of output gives items stored at the same level in the AVL tree. Each line of output has the form:

level x: item1(height1) item2(height2) ...

where 'x' is the level of the items. The root node is at level 0. The children of the root are at level 1, etc. The items stored on level 'x' in the tree are output following the 'level x:' prefix. Each item is separated from the previous output by a single space character. The height of each item in the tree is output immediately following the item. Each height is surrounded by parentheses. The height of a leaf node is 1. Note that level 0 has only one node (the root), level 1 has at most two nodes, level 2 has at most four nodes, etc.

Level 4 and above may have more than eight items. Do not output more than eight items on one line. If a level has more than eight items, output multiple lines for that level, with the first eight items on the first line, the next eight items on the next line, etc. Repeat the 'level x:' prefix on each line of output that is needed for a given level. For example, the output for 10 items on level 4 could look like this:

level 4: item1(1) item2(1) item3(2) item4(2) item5(1) item6(2) item7(1) item8(2)
level 4: item9(2) item10(2)
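The line-wrapping rule can be sketched as follows. This is a minimal illustration, assuming the items and heights of one level have already been collected in order; the function name printLevel and the plain arrays are assumptions made only to keep the example self-contained. In the project itself the items would come from your level-order traversal.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

const int kMaxPerLine = 8;  // at most eight items per output line

// Hypothetical helper: writes one tree level, starting a new
// "level x:" line after every eight items.
void printLevel(std::ostream& out, int level,
                const std::string items[], const int heights[], int count) {
    assert(count >= 0);
    for (int i = 0; i < count; ++i) {
        if (i % kMaxPerLine == 0) {
            if (i > 0) {
                out << '\n';                 // finish the previous line
            }
            out << "level " << level << ':'; // repeat the prefix
        }
        out << ' ' << items[i] << '(' << heights[i] << ')';
    }
    if (count > 0) {
        out << '\n';
    }
}
```

Called with the ten items from the example above, this sketch writes two lines, each beginning with 'level 4:'.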

The Nodes

Each node in the tree contains:

  1. An item.
  2. A pointer to the left child.
  3. A pointer to the right child.
  4. The height of the node.
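One possible layout for such a node is sketched below. The names Node, heightOf, and updateHeight are assumptions chosen for illustration. Treating an empty subtree as height 0 is also an assumption, but it is consistent with the rule that a leaf has height 1.

```cpp
#include <cassert>
#include <string>

// Hypothetical node layout holding the four fields listed above.
template <typename T>
struct Node {
    T item;        // the stored item
    Node* left;    // left child, or nullptr
    Node* right;   // right child, or nullptr
    int height;    // 1 for a leaf

    explicit Node(const T& value)
        : item(value), left(nullptr), right(nullptr), height(1) {}
};

// Height of a possibly empty subtree (an empty subtree counts as 0).
template <typename T>
int heightOf(const Node<T>* node) {
    return node == nullptr ? 0 : node->height;
}

// Recomputes a node's height from the stored heights of its children.
template <typename T>
void updateHeight(Node<T>* node) {
    assert(node != nullptr);
    int l = heightOf(node->left);
    int r = heightOf(node->right);
    node->height = 1 + (l > r ? l : r);
}
```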

The Tree

The tree must be a binary search tree and the tree must use AVL balancing. At a minimum, a tree should contain:

  1. A pointer to the root node of the tree.
  2. A count of how many items are stored in the tree.

Adding

The implementation must use a binary search tree and the tree must use AVL balancing. The 'add' operation adds a new node to the AVL tree. The new node contains the item that is to be added to the set. The 'add' operation must run in O(log n) time, where 'n' is the number of items in the set.

Removing

The implementation must use a binary search tree and the tree must use AVL balancing. The 'remove' operation removes the node from the AVL tree that contains the item that is to be removed from the set. The 'remove' operation must run in O(log n) time, where 'n' is the number of items in the set.

The 'remove' operation on an AVL tree has a case where two different choices can be made and both choices produce a correct AVL tree. The case occurs when removing a node that has two non-null children. The item to be removed can be replaced with either the largest item in the left subtree or the smallest item in the right subtree. For this project the 'remove' operation is required to use the second choice: remove must use the smallest item in the right subtree.
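The smallest item in a subtree is found by following left pointers until there are none left. A minimal sketch, assuming a simple node struct (the names Node and minNode are illustrative, not required):

```cpp
#include <cassert>

// Hypothetical node layout, used only for this sketch.
template <typename T>
struct Node {
    T item;
    Node* left;
    Node* right;
    int height;
};

// Returns the node holding the smallest item in the subtree rooted at `node`.
template <typename T>
Node<T>* minNode(Node<T>* node) {
    assert(node != nullptr);
    while (node->left != nullptr) {
        node = node->left;   // smaller items are always to the left
    }
    return node;
}
```

For a node with two children, calling this on the node's right child gives the replacement item. One common approach is to copy that item into the node being removed and then remove the item from the right subtree, so that heights along that path are updated in the usual way.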

When a node becomes unbalanced after a remove operation, it is possible that the grandchildren of the unbalanced node will have equal height. Remove must use a single rotate to balance the node in this case. You cannot use a double rotate in this case.
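The single-rotate rule can be sketched as follows. This is one possible formulation, not the required one; the names Node, rotateLeft, rotateRight, and rebalance are assumptions. The key point is that the double rotate is chosen only when the inner grandchild is strictly taller, so equal-height grandchildren fall through to a single rotate.

```cpp
#include <cassert>

// Hypothetical node layout, used only for this sketch.
template <typename T>
struct Node { T item; Node* left; Node* right; int height; };

template <typename T>
int heightOf(const Node<T>* n) { return n ? n->height : 0; }

template <typename T>
void updateHeight(Node<T>* n) {
    int l = heightOf(n->left), r = heightOf(n->right);
    n->height = 1 + (l > r ? l : r);
}

// Single rotate right: the left child becomes the new subtree root.
template <typename T>
Node<T>* rotateRight(Node<T>* n) {
    assert(n && n->left);
    Node<T>* pivot = n->left;
    n->left = pivot->right;
    pivot->right = n;
    updateHeight(n);       // n is now the lower node, so update it first
    updateHeight(pivot);
    return pivot;
}

// Single rotate left: the right child becomes the new subtree root.
template <typename T>
Node<T>* rotateLeft(Node<T>* n) {
    assert(n && n->right);
    Node<T>* pivot = n->right;
    n->right = pivot->left;
    pivot->left = n;
    updateHeight(n);
    updateHeight(pivot);
    return pivot;
}

// Restores the AVL property at `n` and returns the new subtree root.
template <typename T>
Node<T>* rebalance(Node<T>* n) {
    if (n == nullptr) return n;
    updateHeight(n);
    int balance = heightOf(n->left) - heightOf(n->right);
    if (balance > 1) {                       // left side too tall
        // Double rotate only if the inner grandchild is strictly taller.
        if (heightOf(n->left->right) > heightOf(n->left->left)) {
            n->left = rotateLeft(n->left);
        }
        return rotateRight(n);
    }
    if (balance < -1) {                      // right side too tall
        if (heightOf(n->right->left) > heightOf(n->right->right)) {
            n->right = rotateRight(n->right);
        }
        return rotateLeft(n);
    }
    return n;                                // already balanced
}
```

A double rotate in the equal-height case can also yield a valid AVL tree, but with a different shape, so the output of the print command would differ.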

Finding

The 'find' operation searches the AVL tree for the requested item. The 'find' operation must run in O(log n) time, where 'n' is the number of items in the set.
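Because the tree is a binary search tree, a search only needs to follow one path down from the root, which is what gives the O(log n) bound when the tree is balanced. A minimal sketch, assuming a simple node struct and an item type that supports operator< (the names Node and contains are illustrative):

```cpp
#include <string>

// Hypothetical node layout, used only for this sketch.
template <typename T>
struct Node { T item; Node* left; Node* right; int height; };

// Returns true if `target` is stored in the subtree rooted at `node`.
template <typename T>
bool contains(const Node<T>* node, const T& target) {
    while (node != nullptr) {
        if (target < node->item) {
            node = node->left;       // target can only be in the left subtree
        } else if (node->item < target) {
            node = node->right;      // target can only be in the right subtree
        } else {
            return true;             // neither smaller nor larger: found
        }
    }
    return false;                    // fell off the tree: not present
}
```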

Using the Standard Library

You must implement the AVL Tree Set using pointers and nodes. You are not allowed to use data types such as vector, list, set, map, etc. from the standard library.

Implementation Requirements

  1. The implementation must use a binary search tree and the tree must use AVL balancing.
  2. When removing a node that has two children, replace the item with the smallest item in the right subtree.
  3. When rotating a node that has equal-height grandchildren, use a single rotate.
  4. Use a level-order traversal to output the contents of the tree.
  5. Use your own code from the List project to provide a queue for the level-order traversal.
  6. The 'add', 'remove', and 'find' operations must run in O(log n) time.
  7. You are not allowed to use data types such as vector, list, set, map, etc. from the standard library.
  8. Operations that remove nodes from the tree such as 'clear' and 'remove' must 'delete' the nodes that are removed.
  9. The program must pass a memory leak test such as valgrind.
  10. The implementation of the set must use a C++ template so that the set is able to store objects of any valid data type.
  11. Write your own code for the tree. Do not use code from the book, the internet, or class notes.

Command Line

The program is run with the names of the 'Command' and 'Output' files given on the command line. For example, the program might be run like this:

lab6 command.txt output.txt

When the program is run this way, it runs the commands given in the 'command.txt' file and writes the output from the commands to the 'output.txt' file. Note that the names given on the command line may not always be 'command.txt' and 'output.txt'.
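Opening the two files named on the command line might be sketched like this. The helper name openFiles and the error messages are assumptions; the assignment does not say how a missing argument or an unreadable file should be handled, so the error handling here is only one reasonable choice.

```cpp
#include <fstream>
#include <iostream>

// Hypothetical helper: opens the command file (argv[1]) for reading and
// the output file (argv[2]) for writing. Returns false on any problem.
bool openFiles(int argc, char* argv[], std::ifstream& in, std::ofstream& out) {
    if (argc != 3) {
        std::cerr << "usage: " << (argc > 0 ? argv[0] : "lab6")
                  << " <command file> <output file>\n";
        return false;
    }
    in.open(argv[1]);
    if (!in) {
        std::cerr << "cannot open " << argv[1] << '\n';
        return false;
    }
    out.open(argv[2]);
    if (!out) {
        std::cerr << "cannot open " << argv[2] << '\n';
        return false;
    }
    return true;
}
```

The main function would pass its own argc and argv to this helper and, on success, read commands from the input stream and write results to the output stream.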

Slow Code

If your code is running slowly, here are some things to check.

Is your tree really balanced? An unbalanced tree will start to exhibit O(n) behavior.

Are you re-calculating information that might not be changing? Store the height of a node inside each node object and only update the heights that might have changed. Only a small fraction of the nodes in a tree will need to have their heights updated during an insertion or deletion and this fraction gets smaller as the tree gets larger.

Are you using too much recursion? If you check the heights and the balancing of your tree by calling a recursive method on the root of the tree every time you insert or delete, you are doing too much work. Any time you insert or delete, the only nodes that need to have their height updated or that might become unbalanced are those along the path from the root node to the inserted or deleted node.

The 'add' and 'remove' operations should be recursive; if they are, this path is "stored" in the recursive calls. Do your height checking, balance checking, and balancing as you come out of these recursive calls. That is, put the code that does this work after the recursive call in each recursive method.
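The idea of doing the work "on the way out" of the recursion can be sketched as follows. This is a deliberately simplified illustration: it updates heights after the recursive call but omits the balance check and rotations, which would go at the marked spot. The names Node, add, and destroy are assumptions.

```cpp
#include <string>

// Hypothetical node layout, used only for this sketch.
template <typename T>
struct Node {
    T item; Node* left; Node* right; int height;
    explicit Node(const T& v) : item(v), left(nullptr), right(nullptr), height(1) {}
};

template <typename T>
int heightOf(const Node<T>* n) { return n ? n->height : 0; }

// Inserts `item` below `node` and returns the root of the updated subtree.
// Sets `added` to true only when a new node is created.
template <typename T>
Node<T>* add(Node<T>* node, const T& item, bool& added) {
    if (node == nullptr) {
        added = true;
        return new Node<T>(item);
    }
    if (item < node->item) {
        node->left = add(node->left, item, added);
    } else if (node->item < item) {
        node->right = add(node->right, item, added);
    } else {
        return node;                 // duplicate: nothing changes
    }
    // Code after the recursive call runs once per node on the path,
    // from the new node back up to the root.
    int l = heightOf(node->left), r = heightOf(node->right);
    node->height = 1 + (l > r ? l : r);
    // A full AVL implementation would check the balance and rotate here.
    return node;
}

// Deletes every node in the subtree so that no memory is leaked.
template <typename T>
void destroy(Node<T>* node) {
    if (node == nullptr) return;
    destroy(node->left);
    destroy(node->right);
    delete node;
}
```

Only the nodes on the path from the root to the new node are visited, so the height bookkeeping costs no more than the search itself.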