Computer Science 236

Datalog Lab Session


Introduction


In this lab session you will write some of the code needed for joining two relations. The join operation is needed for evaluating rules in Project 4. You will need to write more code after the lab session to fully complete the join operation and Project 4. (You will need your Scheme, Tuple, and Relation classes from Project 3 to complete this session.)


Part 1: joinable


  1. Write a function named 'joinable' in your Relation class.
    An important part of joining two relations is testing two tuples to see if they are joinable.

      static bool joinable(const Scheme& leftScheme, const Scheme& rightScheme,
    		       const Tuple& leftTuple, const Tuple& rightTuple) {
        // add code to test whether the tuples are joinable
        return true;
      }
    

    The 'joinable' function is given the scheme and a tuple from the left relation of the join and the scheme and a tuple from the right relation of the join. The 'joinable' function returns 'true' if the two tuples should be combined to form a tuple in the result of the join.

    Suppose 'joinable' is given the following schemes and tuples. Should 'joinable' return 'true' for this case?

    left scheme: A B
    right scheme: B C
    
    left tuple: 1 2
    right tuple: 3 4
    

    Should 'joinable' return 'true' for this case?

    left scheme: A B
    right scheme: B C
    
    left tuple: 4 2
    right tuple: 2 1
    

    (Note that 'joinable' is 'static' so you can call it from the 'main' function without needing to create a Relation object. This allows for easier testing.)

  2. Loop over the left scheme and tuple.
    Add a loop to the 'joinable' function to print the names and values in the left scheme and tuple.

        for (unsigned leftIndex = 0; leftIndex < leftScheme.size(); leftIndex++) {
          const string& leftName = leftScheme.at(leftIndex);
          const string& leftValue = leftTuple.at(leftIndex);
          cout << "left name: " << leftName << " value: " << leftValue << endl;
        }
    
  3. Test the 'joinable' function.
    Write a 'main' function that creates some schemes and tuples and calls the 'joinable' function.

    int main() {
    
      Scheme scheme1( { "A", "B" } );
      Scheme scheme2( { "B", "C" } );
    
      Tuple tuple1( {"'1'", "'2'"} );
      Tuple tuple2( {"'3'", "'4'"} );
    
      Relation::joinable(scheme1, scheme2, tuple1, tuple2);
    
    }
    

    Compile and test. The output should look something like this:

    left name: A value: '1'
    left name: B value: '2'
    
  4. Loop over the right scheme and tuple.
    Add a second loop to the 'joinable' function to print the names and values in the right scheme and tuple. (The code for this loop should be similar to the code for the left scheme and tuple.)

    Compile and test. The output should look something like this:

    left name: A value: '1'
    left name: B value: '2'
    right name: B value: '3'
    right name: C value: '4'
    

    Should 'joinable' return 'true' or 'false' for this case? How will 'joinable' decide what it should return? How can you structure the loops in 'joinable' so it can compare the correct values and decide what it should return?

  5. Nest the loops properly.
    Move the second loop (for the right scheme and tuple) so it is nested inside the first loop (for the left scheme and tuple).

    Compile and test. The output should look something like this:

    left name: A value: '1'
    right name: B value: '3'
    right name: C value: '4'
    left name: B value: '2'
    right name: B value: '3'
    right name: C value: '4'
    

    Which lines in this output can tell 'joinable' that the correct answer for this case is 'false'?

  6. Test for the case when the tuples are not joinable.
    Add an 'if' statement inside the second loop that returns false when it sees a case where the tuples are not joinable.

  7. Add more cases and tests to the 'main' function.

      Scheme scheme3( { "X", "Y" } );
      Scheme scheme4( { "X", "Y", "Z" } );
    
      Tuple tuple3( {"'1'", "'4'"} );
      Tuple tuple4( {"'1'", "'2'", "'4'"} );
    
      cout << Relation::joinable(scheme1, scheme2, tuple1, tuple2) << endl;
      cout << Relation::joinable(scheme2, scheme3, tuple1, tuple2) << endl;
      cout << Relation::joinable(scheme3, scheme4, tuple1, tuple4) << endl;
      cout << Relation::joinable(scheme3, scheme4, tuple3, tuple4) << endl;
    

    Compile and test. Check that 'joinable' returns the correct value for each test. (Note that 'true' prints as '1' and 'false' prints as '0'.)

  8. Take a screenshot showing the output from testing your 'joinable' function.


Part 2: join


  1. Write a function named 'join' in your Relation class.

      Relation join(const Relation& right) {
        const Relation& left = *this;
        Relation result;
        // add code to complete the join operation
        return result;
      }
    

    The 'join' function is called on the 'left' relation of the join. The 'right' relation of the join is passed as a parameter. The 'join' function returns a new relation that contains the result of the join.

  2. Loop over the tuples in the left relation.
    Add a loop to the 'join' function that prints the tuples in the left relation. Your loop should run the following line of code for each tuple in the left relation. (This line of code is used during the lab session for testing and debugging and should be removed later.)

          cout << "left tuple: " << leftTuple.toString(left.scheme) << endl;
    
  3. Test the 'join' function.
    Write a 'main' function that creates a relation and calls the 'join' function.

    int main() {
    
      Relation studentRelation("students", Scheme( {"ID", "Name", "Major"} ));
    
      vector<string> studentValues[] = {
        {"'42'", "'Ann'", "'CS'"},
        {"'64'", "'Ned'", "'EE'"},
      };
    
      for (auto& value : studentValues)
        studentRelation.addTuple(Tuple(value));
    
      studentRelation.join(studentRelation);
    
    }
    

    Compile and test. The output should look something like this:

    left tuple: ID='42', Name='Ann', Major='CS'
    left tuple: ID='64', Name='Ned', Major='EE'
    
  4. Loop over the tuples in the right relation.
    Add a second loop to the 'join' function that prints the tuples in the right relation.

    Compile and test. The output should look something like this:

    left tuple: ID='42', Name='Ann', Major='CS'
    left tuple: ID='64', Name='Ned', Major='EE'
    right tuple: ID='42', Name='Ann', Major='CS'
    right tuple: ID='64', Name='Ned', Major='EE'
    

    (Not too great because both the left and right relations are the same relation.)

  5. Add a second relation to the 'main' function.

      Relation courseRelation("courses", Scheme( {"ID", "Course"} ));
    
      vector<string> courseValues[] = {
        {"'42'", "'CS 100'"},
        {"'32'", "'CS 232'"},
      };
    
      for (auto& value : courseValues)
        courseRelation.addTuple(Tuple(value));
    
      studentRelation.join(courseRelation);
    

    Compile and test. The output should look something like this:

    left tuple: ID='42', Name='Ann', Major='CS'
    left tuple: ID='64', Name='Ned', Major='EE'
    right tuple: ID='32', Course='CS 232'
    right tuple: ID='42', Course='CS 100'
    

    How can you structure the loops in the 'join' function so it can compare every tuple in the left relation with every tuple in the right relation?

  6. Nest the loops properly.
    Move the second loop (for the right relation) so it is nested inside the first loop (for the left relation).

    Compile and test. The output should look something like this:

    left tuple: ID='42', Name='Ann', Major='CS'
    right tuple: ID='32', Course='CS 232'
    right tuple: ID='42', Course='CS 100'
    left tuple: ID='64', Name='Ned', Major='EE'
    right tuple: ID='32', Course='CS 232'
    right tuple: ID='42', Course='CS 100'
    
  7. Take a screenshot showing the output from testing your 'join' function.

  8. Submit your screenshots and a zip file containing the code you wrote during this session to Learning Suite.


Conclusion


You should now have about half of the code needed for your join operation. You should write two more functions:

  1. joinSchemes: combines the schemes for the left and right relations into single scheme for the result relation.

  2. joinTuples: combines tuples from the left and right relations into single tuple for the result relation.

Your 'join' function can call these functions as well as the 'joinable' function to produce the relation that results from the join.

(The steps described in the conclusion do not need to be completed as part of the lab session, but they do need to be done as part of the project.)