Computer Science 236

Datalog Parser Lab Session


During this lab session you will write some of the code needed for your Datalog Parser. You will need to write the rest after the lab session to fully complete the Parser. You will need your Token class from Project 1 to complete this session. No code is provided for this lab: you will create the Parser.h file mentioned in the steps below from scratch.


Part 1: Parser class and support functions


  1. Make a 'Parser' class (Parser.h)
    The Parser is given a vector of Tokens that are typically provided by the Scanner.
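    (Note that you need '#include' lines for 'vector' and 'Token.h'. The code below writes 'vector' without the 'std::' prefix, so it also assumes 'using namespace std;' or an equivalent using-declaration.)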

    class Parser {
     private:
      vector<Token> tokens;
     public:
      Parser(const vector<Token>& tokens) : tokens(tokens) { }
    };
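
    The code in this lab relies on only a small part of your Token class: a constructor taking a type, a string, and an integer, plus a 'getType' function. If you want to try the snippets before wiring in your Project 1 code, a minimal stand-in might look like the sketch below. Everything in it is an assumption made for illustration: the enum values are chosen only to match the sample output in this handout, the third constructor argument is assumed to be a line number, and 'getValue' and 'getLine' are hypothetical accessors.

```cpp
// Token.h stand-in: just enough of a Token to compile the lab snippets.
// This is an assumption for illustration; use your Project 1 Token instead.
#include <string>
#include <utility>

// Values picked only so printed types match the sample output in this
// handout. Your own TokenType enum will have more entries and other values.
enum TokenType { COMMA = 0, LEFT_PAREN = 3, RIGHT_PAREN = 4, ID = 13 };

class Token {
 private:
  TokenType type;
  std::string value;
  int line;  // assumed to be the line number where the token was found
 public:
  Token(TokenType type, std::string value, int line)
      : type(type), value(std::move(value)), line(line) { }
  TokenType getType() const { return type; }
  const std::string& getValue() const { return value; }  // hypothetical accessor
  int getLine() const { return line; }                    // hypothetical accessor
};
```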
    

  2. Add some support functions to the 'Parser' class (Parser.h)
    The support functions will make the parsing routines simpler and easier to write. The 'tokenType' function returns the type of the current Token. The 'advanceToken' function moves to the next Token. The 'throwError' function is called when the Parser finds an error. (Note that the 'throwError' function shown here is not complete. It uses 'cout', so Parser.h also needs '#include <iostream>'.) You may want to add other support functions in addition to these.

      TokenType tokenType() const {
        return tokens.at(0).getType();
      }
      void advanceToken() {
        tokens.erase(tokens.begin());
      }
      void throwError() {
        cout << "error" << endl;
      }
    

  3. Test the support functions (main.cpp)
    (Note that you will need several '#include' statements, including 'iostream', 'vector', 'Token.h', and 'Parser.h'.)

    int main() {
    
      vector<Token> tokens = {
        Token(ID,"Ned",2),
        Token(LEFT_PAREN,"(",2),
        Token(RIGHT_PAREN,")",2),
      };
    
      Parser p = Parser(tokens);
      cout << p.tokenType() << endl;
      p.advanceToken();
      cout << p.tokenType() << endl;
      p.advanceToken();
      cout << p.tokenType() << endl;
      p.throwError();
        
    }
    

    Compile and test.
    The output should look something like this:
    (The values printed by 'tokenType' will vary depending on the order of the token types in your TokenType enum.)

    13
    3
    4
    error
    
  4. Add a 'match' function to the 'Parser' class (Parser.h)
    The 'match' function is another important support function that will make the parsing routines simpler and easier to write. The 'match' function is called when parsing a terminal symbol. (This 'match' function has a debug print that is helpful in testing the parsing functions below. Remove it later.)

      void match(TokenType t) {
        cout << "match: " << t << endl;
        // add code for matching token type t
      }
    
    The following pseudo-code describes how the 'match' function should work.
        if the current token type matches t
          advance to the next token
        else
          report a syntax error
    

  5. Test the 'match' function (main.cpp)

    int main() {
    
      vector<Token> tokens = {
        Token(ID,"Ned",2),
        Token(LEFT_PAREN,"(",2),
        Token(RIGHT_PAREN,")",2),
      };
    
      Parser p = Parser(tokens);
      p.match(ID);
      p.match(LEFT_PAREN);
      p.match(ID);         // intentional error
      p.match(RIGHT_PAREN);
    
    }
    

    Compile and test.
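    The output should look something like this:
    (The intentional error is reported without advancing to the next token, so the final 'match(RIGHT_PAREN)' still succeeds.)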

    match: 13
    match: 3
    match: 13
    error
    match: 4
    
  6. Take a screenshot showing your terminal and the resulting output. (You can also take a screenshot of an IDE showing similar results.)


Part 2: Parsing function for 'idList'


  1. Using the grammar rule for 'idList' from the Project 2 description, write a parsing function for 'idList' in the 'Parser' class (Parser.h)

    Grammar Rule:

    idList -> COMMA ID idList | lambda
    
    Parsing Function:
      void idList() {
        if (tokenType() == COMMA) {
          match(COMMA);
          match(ID);
          idList();
        } else {
          // lambda
        }
      }
    

  2. Test 'idList' with valid input (main.cpp)

    int main() {
    
      vector<Token> tokens = {
        Token(COMMA,",",2),
        Token(ID,"Ted",2),
        Token(COMMA,",",2),
        Token(ID,"Zed",2),
        Token(RIGHT_PAREN,")",2),
      };
    
      Parser p = Parser(tokens);
      p.idList();
    
    }
    

    Compile and test.
    The output should look something like this:
    (No errors should be reported.)

    match: 0
    match: 13
    match: 0
    match: 13
    
  3. Test 'idList' with bad input (main.cpp)

    int main() {
    
      vector<Token> tokens = {
        Token(COMMA,",",2),
        //Token(ID,"Ted",2),
        Token(COMMA,",",2),
        Token(ID,"Zed",2),
        Token(RIGHT_PAREN,")",2),
      };
    
      Parser p = Parser(tokens);
      p.idList();
    
    }
    

    Compile and test.
    The output should look something like this:
    (An error should be reported when parsing the second COMMA.)

    match: 0
    match: 13
    error
    match: 0
    match: 13
    

Part 3: Parsing function for 'scheme'


  1. Using the grammar rule for 'scheme' from the Project 2 description, write a parsing function for 'scheme' in the 'Parser' class (Parser.h)

    Grammar Rule:

    scheme -> ID LEFT_PAREN ID idList RIGHT_PAREN
    
    Parsing Function:
      void scheme() {
        // add code for parsing a 'scheme'
      }
    

  2. Test 'scheme' with valid input (main.cpp)

    int main() {
    
      vector<Token> tokens = {
        Token(ID,"Ned",2),
        Token(LEFT_PAREN,"(",2),
        Token(ID,"Ted",2),
        Token(COMMA,",",2),
        Token(ID,"Zed",2),
        Token(RIGHT_PAREN,")",2),
      };
    
      Parser p = Parser(tokens);
      p.scheme();
    
    }
    

    Compile and test.
    The output should look something like this:
    (No errors should be reported.)

    match: 13
    match: 3
    match: 13
    match: 0
    match: 13
    match: 4
    
  3. Test 'scheme' with bad input (main.cpp)

    int main() {
    
      vector<Token> tokens = {
        Token(ID,"Ned",2),
        //Token(LEFT_PAREN,"(",2),
        Token(ID,"Ted",2),
        Token(COMMA,",",2),
        Token(ID,"Zed",2),
        Token(RIGHT_PAREN,")",2),
      };
    
      Parser p = Parser(tokens);
      p.scheme();
    
    }
    

    Compile and test.
    The output should look something like this:
    (An error should be reported when parsing the second ID.)

    match: 13
    match: 3
    error
    match: 13
    match: 0
    match: 13
    match: 4
    
  4. Take a screenshot showing your terminal and the resulting output. (You can also take a screenshot of an IDE showing similar results.)

  5. Submit your screenshots and a zip file containing the code you wrote during this session to Learning Suite.


Part 4: Complete the Datalog Parser project


These steps are to be done as part of Project 2; they are not required as part of the lab session.

  1. Write parsing functions for the remaining grammar rules.

  2. Fix error handling in the parser: throw an exception in the 'throwError' function, then catch it at the top of the parser and report the error.

  3. Write classes for Parameter, Predicate, Rule, and Datalog Program.

  4. Add code to the parser to create Parameter, Predicate, and Rule objects while parsing, and construct a Datalog Program object that contains lists of Schemes, Facts, Rules, and Queries.