AVL Trees
Note: Projects are to be completed by each student individually (not by groups of students).
In this project you will implement the Set abstract data type using an AVL Tree.
The program is given a list of commands as input. The program runs each command on an AVL Tree Set and reports the result of each command to the output.
Example Input
clear
add bob
add joe
add jim
print
find joe
remove bob
remove joe
remove jim
print
find joe
Example Output
clear
add bob
add joe
add jim
print
level 0: jim(2)
level 1: bob(1) joe(1)
find joe true
remove bob
remove joe
remove jim
print
find joe false
Testing
Here are some ideas for tests.
- Add that results in a rotate.
- Add that results in a double rotate.
- Add a duplicate item.
- Remove a node with two children.
- Remove that results in a rotate.
- Remove that results in a double rotate.
- Remove an item that is not in the tree.
Commands
The following commands can be given in the input file. Each command is given on one line in the input file. Each command has an output that is written to the output file.
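As a rough illustration of the input format, each command line is a command word optionally followed by one item, separated by whitespace. The sketch below splits a line that way. The function name parseCommand and its parameters are hypothetical and are not part of the project requirements.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: splits one input line into a command word and an
// optional item. Returns false when the line contains no command at all.
bool parseCommand(const std::string& line, std::string& command, std::string& item) {
    std::istringstream words(line);
    item.clear();
    if (!(words >> command)) {
        return false;  // blank line
    }
    words >> item;     // stays empty for commands such as clear and print
    return true;
}
```

Because extraction with >> stops at whitespace, an item read this way can never contain a space, tab, or newline.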
Clear
The clear command has the form:
clear
The clear command removes all the items from the set. The set is empty after this operation completes.
The output for the clear command is the same as the input and has the form:
clear
Add
The add command has the form:
add item
The add command adds 'item' to the set if the item is not already in the set. If the set already contains the item, the add command leaves the set unchanged; the command is still written to the output and processing continues with the next command. The 'item' parameter is a string of non-whitespace characters. It cannot contain a space, tab, or newline because these characters are used to separate parameters.
The output for the add command is the same as the input and has the form:
add item
Remove
The remove command has the form:
remove item
The remove command removes 'item' from the set if the item is present in the set. If the set does not contain the item, the remove command leaves the set unchanged; the command is still written to the output and processing continues with the next command. The 'item' parameter is a string of non-whitespace characters.
The output for the remove command is the same as the input and has the form:
remove item
Find
The find command has the form:
find item
The find command searches for 'item' in the set. The 'item' parameter is a string of non-whitespace characters. The result of the find command is the string 'true' if the item is found and the string 'false' if the item is not found.
If 'item' is found in the set the output for the find command has the form:
find item true
If 'item' is not found in the set the output for the find command has the form:
find item false
Print
The print command has the form:
print
The print command outputs the items stored in the set. The items are output using a level-order traversal of the AVL tree. The item at the root of the tree is output first, followed by the children of the root, then the grandchildren of the root, etc. The height of each item in the tree is output immediately following the item. (Each height is surrounded by parentheses.) The height of a leaf node is 1. The items on a given level in the tree are output on the same line of the output (unless there are too many items as described later).
The output for the print command has the form:
print
level 0: item1(height1)
level 1: item2(height2) item3(height3)
...
Each line of output gives items stored at the same level in the AVL tree. Each line of output has the form:
level x: item1(height1) item2(height2) ...
where 'x' is the level of the items. The root node is at level 0. The children of the root are at level 1, etc. The items stored on level 'x' in the tree are output following the 'level x:' prefix. Each item is separated from the previous output by a single space character. The height of each item in the tree is output immediately following the item. Each height is surrounded by parentheses. The height of a leaf node is 1. Note that level 0 has only one node (the root), level 1 has at most two nodes, level 2 has at most four nodes, etc.
Level 4 and above may have more than eight items. Do not output more than eight items on one line. If a level has more than eight items, output multiple lines for that level, with the first eight items on the first line, the next eight items on the next line, etc. Re-output the 'level x:' prefix for each line of output that is needed for a given level. For example, the output for 10 items on level 4 could look like this:
level 4: item1(1) item2(1) item3(2) item4(2) item5(1) item6(2) item7(1) item8(2)
level 4: item9(2) item10(2)
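The print rules above can be sketched as a level-order traversal that counts how many nodes sit on the current level and starts a new 'level x:' line every eight items. This is a minimal sketch under assumptions: Node, NodeQueue, printLevels, and printToString are hypothetical names, and the small linked queue stands in for the queue from your own List project.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

template <typename T>
struct Node { T item; Node* left; Node* right; int height; };

// Minimal hand-written FIFO queue of node pointers (no standard containers).
template <typename T>
class NodeQueue {
    struct Link { const Node<T>* node; Link* next; };
    Link* head = nullptr;
    Link* tail = nullptr;
    int count = 0;
public:
    NodeQueue() = default;
    NodeQueue(const NodeQueue&) = delete;
    NodeQueue& operator=(const NodeQueue&) = delete;
    ~NodeQueue() { while (head) pop(); }
    void push(const Node<T>* node) {
        Link* link = new Link{node, nullptr};
        if (tail) tail->next = link; else head = link;
        tail = link;
        ++count;
    }
    const Node<T>* pop() {
        assert(head != nullptr);  // caller must not pop an empty queue
        Link* link = head;
        const Node<T>* node = link->node;
        head = link->next;
        if (!head) tail = nullptr;
        delete link;
        --count;
        return node;
    }
    int size() const { return count; }
};

// Writes "print", then one or more "level x:" lines per level,
// with at most eight items on each line.
template <typename T>
void printLevels(const Node<T>* root, std::ostream& out) {
    out << "print\n";
    if (!root) return;
    NodeQueue<T> queue;
    queue.push(root);
    for (int level = 0; queue.size() > 0; ++level) {
        int onThisLevel = queue.size();  // everything queued now is on this level
        for (int i = 0; i < onThisLevel; ++i) {
            const Node<T>* node = queue.pop();
            if (i % 8 == 0) {            // start a line, or wrap after eight items
                if (i > 0) out << '\n';
                out << "level " << level << ':';
            }
            out << ' ' << node->item << '(' << node->height << ')';
            if (node->left) queue.push(node->left);
            if (node->right) queue.push(node->right);
        }
        out << '\n';
    }
}

// Convenience for testing: captures the same output in a string.
template <typename T>
std::string printToString(const Node<T>* root) {
    std::ostringstream out;
    printLevels(root, out);
    return out.str();
}
```

Taking the queue size at the start of each level is one simple way to know where a level ends without storing a level number in every queue entry.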
The Nodes
Each node in the tree contains:
- An item.
- A pointer to the left child.
- A pointer to the right child.
- The height of the node.
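One possible layout for such a node is sketched below. The names Node, item, left, right, height, and heightOf are assumptions chosen for illustration; the project does not prescribe them.

```cpp
// Hypothetical node layout for a templated AVL tree.
template <typename T>
struct Node {
    T item;
    Node* left;
    Node* right;
    int height;  // a leaf has height 1

    explicit Node(const T& value)
        : item(value), left(nullptr), right(nullptr), height(1) {}
};

// Height of a possibly empty subtree: an empty subtree counts as height 0,
// which keeps the "leaf has height 1" rule consistent.
template <typename T>
int heightOf(const Node<T>* node) {
    return node ? node->height : 0;
}
```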
The Tree
The tree must be a binary search tree and the tree must use AVL balancing. At a minimum, a tree should contain:
- A pointer to the root node of the tree.
- A count of how many items are stored in the tree.
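A skeleton of a tree class holding these two members might look like the sketch below. The class name AvlSet and its member names are hypothetical, and add, remove, find, and print are intentionally left out. The sketch only shows how clear and the destructor can delete every node.

```cpp
template <typename T>
struct Node {
    T item;
    Node* left;
    Node* right;
    int height;
    explicit Node(const T& value)
        : item(value), left(nullptr), right(nullptr), height(1) {}
};

// Hypothetical set skeleton: owns the nodes reachable from 'root'.
template <typename T>
class AvlSet {
public:
    AvlSet() : root(nullptr), count(0) {}
    ~AvlSet() { clear(); }

    // Copying is disabled so two sets can never own (and delete) the same nodes.
    AvlSet(const AvlSet&) = delete;
    AvlSet& operator=(const AvlSet&) = delete;

    int size() const { return count; }
    bool empty() const { return root == nullptr; }

    // Removes all items and deletes every node.
    void clear() {
        destroy(root);
        root = nullptr;
        count = 0;
    }

private:
    // Post-order deletion: children are deleted before their parent.
    static void destroy(Node<T>* node) {
        if (!node) return;
        destroy(node->left);
        destroy(node->right);
        delete node;
    }

    Node<T>* root;
    int count;
};
```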
Adding
The implementation must use a binary search tree and the tree must use AVL balancing. The 'add' operation adds a new node to the AVL tree. The new node contains the item that is to be added to the set. The 'add' operation must run in O(log n) time, where 'n' is the number of items in the set.
Removing
The implementation must use a binary search tree and the tree must use AVL balancing. The 'remove' operation removes the node from the AVL tree that contains the item that is to be removed from the set. The 'remove' operation must run in O(log n) time, where 'n' is the number of items in the set.
The 'remove' operation on an AVL tree has a case where two different choices can be made and both choices produce a correct AVL tree. The case occurs when removing a node that has two non-null children. The item to be removed can be replaced with either the largest item in the left subtree or the smallest item in the right subtree. For this project the 'remove' operation is required to use the second choice: remove must use the smallest item in the right subtree.
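The replacement rule can be illustrated with the sketch below. It is deliberately incomplete: it performs only the binary search tree part of the removal and does not update heights or rebalance, so on its own it is not an AVL remove. The names Node, smallestItem, and removeItem are hypothetical.

```cpp
template <typename T>
struct Node { T item; Node* left; Node* right; int height; };

// Smallest item in a non-empty subtree: follow left pointers to the end.
template <typename T>
const T& smallestItem(const Node<T>* node) {
    while (node->left) node = node->left;
    return node->item;
}

// Removes 'item' from the subtree and returns the subtree's new root.
// Height updates and AVL rebalancing are left out of this sketch; they
// would run after each recursive call, before 'return node'.
template <typename T>
Node<T>* removeItem(Node<T>* node, const T& item) {
    if (!node) return nullptr;  // item not present: tree unchanged
    if (item < node->item) {
        node->left = removeItem(node->left, item);
    } else if (node->item < item) {
        node->right = removeItem(node->right, item);
    } else if (node->left && node->right) {
        // Two children: copy in the smallest item of the right subtree,
        // then remove that item from the right subtree.
        node->item = smallestItem(node->right);
        node->right = removeItem(node->right, node->item);
    } else {
        // Zero or one child: splice the node out and free it.
        Node<T>* child = node->left ? node->left : node->right;
        delete node;
        return child;
    }
    return node;
}
```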
When a node becomes unbalanced after a remove operation it is possible that the grandchildren of the unbalanced node will have equal height. Remove must use a single rotate to balance the node in this case. You cannot use a double rotate in this case.
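One way to express this rule is to perform the extra inner rotation only when the inner grandchild is strictly taller than the outer one, so that equal heights fall through to a single rotate. The sketch below assumes hypothetical helpers heightOf, updateHeight, rotateLeft, rotateRight, and rebalance; none of these names come from the project.

```cpp
#include <cassert>

template <typename T>
struct Node { T item; Node* left; Node* right; int height; };

template <typename T>
int heightOf(const Node<T>* n) { return n ? n->height : 0; }

template <typename T>
void updateHeight(Node<T>* n) {
    int l = heightOf(n->left), r = heightOf(n->right);
    n->height = 1 + (l > r ? l : r);
}

// Single right rotation: the left child becomes the subtree root.
template <typename T>
Node<T>* rotateRight(Node<T>* n) {
    assert(n && n->left);
    Node<T>* pivot = n->left;
    n->left = pivot->right;
    pivot->right = n;
    updateHeight(n);      // n is now below pivot, so fix it first
    updateHeight(pivot);
    return pivot;
}

// Single left rotation: the right child becomes the subtree root.
template <typename T>
Node<T>* rotateLeft(Node<T>* n) {
    assert(n && n->right);
    Node<T>* pivot = n->right;
    n->right = pivot->left;
    pivot->left = n;
    updateHeight(n);
    updateHeight(pivot);
    return pivot;
}

// Restores the AVL property at 'n' and returns the new subtree root.
template <typename T>
Node<T>* rebalance(Node<T>* n) {
    updateHeight(n);
    int diff = heightOf(n->left) - heightOf(n->right);
    if (diff > 1) {   // left side too tall
        // Double rotate only if the inner grandchild is strictly taller.
        if (heightOf(n->left->right) > heightOf(n->left->left))
            n->left = rotateLeft(n->left);
        return rotateRight(n);   // equal grandchildren: single rotate
    }
    if (diff < -1) {  // right side too tall
        if (heightOf(n->right->left) > heightOf(n->right->right))
            n->right = rotateRight(n->right);
        return rotateLeft(n);    // equal grandchildren: single rotate
    }
    return n;
}
```

For example, if a removal leaves node 10 with no left child and a right child 20 whose children 15 and 25 have equal height, this sketch rotates once so that 20 becomes the subtree root, rather than double rotating 15 to the top.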
Finding
The 'find' operation searches the AVL tree for the requested item. The 'find' operation must run in O(log n) time, where 'n' is the number of items in the set.
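A search of this kind follows a single path down from the root, so on a balanced tree it visits at most one node per level. A minimal iterative sketch, with the hypothetical names Node and contains, is:

```cpp
template <typename T>
struct Node { T item; Node* left; Node* right; int height; };

// Returns true if 'item' is stored in the subtree rooted at 'node'.
// Only operator< is required of T.
template <typename T>
bool contains(const Node<T>* node, const T& item) {
    while (node) {
        if (item < node->item) {
            node = node->left;    // item can only be in the left subtree
        } else if (node->item < item) {
            node = node->right;   // item can only be in the right subtree
        } else {
            return true;          // neither is less: found it
        }
    }
    return false;                 // fell off the tree: not present
}
```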
Using the Standard Library
You must implement the AVL Tree Set using pointers and nodes. You are not allowed to use data types such as vector, list, set, map, etc., from the standard library.
Implementation Requirements
- The implementation must use a binary search tree and the tree must use AVL balancing.
- When removing a node that has two children, replace the item with the smallest item in the right subtree.
- When rotating a node that has equal height grandchildren, use a single rotate.
- Use a level-order traversal to output the contents of the tree.
- Use your own code from the List project to provide a queue for the level-order traversal.
- The 'add', 'remove', and 'find' operations must run in O(log n) time.
- You are not allowed to use data types such as vector, list, set, map, etc., from the standard library.
- Operations that remove nodes from the tree such as 'clear' and 'remove' must 'delete' the nodes that are removed.
- The program must pass a memory leak test such as valgrind.
- The implementation of the set must use a C++ template so that the set is able to store objects of any valid data type.
- Write your own code for the tree. Do not use code from the book, the internet, or class notes.
Command Line
The program is run with the names of the 'Command' and 'Output' files given on the command line. For example, the program might be run like this:
lab6 command.txt output.txt
When the program is run this way the program runs the commands given in the 'command.txt' file and writes the output from the commands to the 'output.txt' file. Note that the names given on the command line may not always be 'command.txt' and 'output.txt'.
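Handling the two file names might look like the sketch below. The function name runProgram is hypothetical, and the loop body is only a placeholder that copies each command line to the output. A real solution would run the command on the set at that point and write the command's actual output.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical entry point: expects the command file name in argv[1] and
// the output file name in argv[2]. Returns 0 on success, 1 on error.
int runProgram(int argc, char* argv[]) {
    if (argc != 3) {
        std::cerr << "usage: " << (argc > 0 ? argv[0] : "lab6")
                  << " <command file> <output file>\n";
        return 1;
    }
    std::ifstream commands(argv[1]);
    if (!commands) {
        std::cerr << "cannot open " << argv[1] << '\n';
        return 1;
    }
    std::ofstream output(argv[2]);
    if (!output) {
        std::cerr << "cannot open " << argv[2] << '\n';
        return 1;
    }
    std::string line;
    while (std::getline(commands, line)) {
        // Placeholder: a real solution would execute the command here.
        output << line << '\n';
    }
    return 0;
}
```

With this shape, main can simply return runProgram(argc, argv). The file names are never hard-coded, which matters because they may not be 'command.txt' and 'output.txt'.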
Slow Code
If your code is running slowly, here are some things to check.
Is your tree really balanced? An unbalanced tree will start to exhibit O(n) behavior.
Are you re-calculating information that might not be changing? Store the height of a node inside each node object and only update the heights that might have changed. Only a small fraction of the nodes in a tree will need to have their heights updated during an insertion or deletion and this fraction gets smaller as the tree gets larger.
Are you using too much recursion? If you check the heights and the balancing of your tree by calling a recursive method on the root of the tree every time you insert or delete, you are doing too much work. Any time you insert or delete, the only nodes that need to have their height updated or that might become unbalanced are those along the path from the root node to the inserted or deleted node.
The 'add' and 'remove' operations should be recursive; if they are, this path is "stored" in the recursive calls. Do your height checking, balance checking, and balancing as you come out of these recursive calls: put the code that does this work after the recursive call in your recursive methods.
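The idea of doing the work on the way out of the recursion can be sketched as follows. This is not a complete AVL add: rotations are left out, and the 'visited' counter is instrumentation added only to show that heights are recomputed for nodes on the search path and nowhere else. The names Node, heightOf, add, and destroy are hypothetical.

```cpp
template <typename T>
struct Node {
    T item;
    Node* left;
    Node* right;
    int height;
    explicit Node(const T& value)
        : item(value), left(nullptr), right(nullptr), height(1) {}
};

template <typename T>
int heightOf(const Node<T>* n) { return n ? n->height : 0; }

// Inserts 'item' and returns the subtree's new root. 'visited' counts the
// nodes whose height was recomputed (instrumentation for this sketch only).
template <typename T>
Node<T>* add(Node<T>* node, const T& item, int& visited) {
    if (!node) return new Node<T>(item);  // new leaf, height 1
    if (item < node->item) {
        node->left = add(node->left, item, visited);
    } else if (node->item < item) {
        node->right = add(node->right, item, visited);
    } else {
        return node;                      // duplicate: no new node is created
    }
    // Everything below runs as the recursion unwinds, so it touches only
    // the nodes on the path from the insertion point back up to the root.
    int l = heightOf(node->left), r = heightOf(node->right);
    node->height = 1 + (l > r ? l : r);
    ++visited;
    // A balance check and any rotation would go here, before returning.
    return node;
}

// Frees every node in the subtree (children before parent).
template <typename T>
void destroy(Node<T>* node) {
    if (!node) return;
    destroy(node->left);
    destroy(node->right);
    delete node;
}
```

In a four-item tree built by adding 50, 30, 70, and 20 in that order, the last add recomputes the heights of only 30 and 50. Node 70 is never visited.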