Wednesday, March 13, 2013

Scala Parser Combinators - Part 2

Scala Parser Combinators - Part 3 >

In Part 1 of Scala Parser Combinators, I introduced Scala parsers and gave an example of a trivial DSL that parsed arithmetic operations. I also demonstrated how to evaluate the language and print the results of the parsing operation. I'm going to expand on this example by showing how to transform the flat result string into a useful set of structures.

Rather than simply printing the results of the parsing operation, let's look at how we would evaluate the arithmetic operation to compute the result. For example, rather than parsing "9*8+21/7" and printing "(9~Some((*~(8~Some((+~(21~Some((/~(7~None))))))))))", we would like the parser to return "75".

Before we begin, though, it's important to understand what something like "(9~Some((*~(8~Some((+~(21~Some((/~(7~None))))))))))" really means in the world of Scala. Let's look at a subset of this string, "(9~Some((*~(8~None))))". This is the result of parsing "9*8". The first part that looks interesting is the "9~Some(...)". In the previous tutorial, we defined the following rule in our parser:

def expr: Parser[Any] = number ~ opt(operator ~ expr )

It's clear that "number" is evaluating to "9" and "~" is being printed out verbatim (which you should recall is used in Scala parsers to join parts of the grammar). However, what's with "Some(...)"? Well, whenever Scala parses an opt(x) statement, it will evaluate it as either Some(...) or None, both of which are subclasses of Option. That makes since... the opt(x) statement evaluates to an Option.

Instead of having our parser return a bunch of ~ and options, let's look at transforming the parser results into something more useful. Again, looking at our current parser rule:

def expr: Parser[Any] = number ~ opt(operator ~ expr )

We need to modify this parser definition to have it return an Int instead of Any. This is simple:

def expr: Parser[Int] = ...

The next step is to compute the result of the arithmetic operation. Our grammar rule allows for either a single number or a number followed by an arithmetic operator and another number. If we're dealing with a single number, we need to tell the parser to convert the result to an Int. To do this, we make the following modification to our parser rule:

def expr: Parser[Int] = (number ^^ { _.toInt }) ...

The ^^ just tells the parser to execute the code that follows it, contained in {...}. All we're doing is converting it to an Int.

Next, we need to tell the parser what to do when it encounters a number, or when it encounters a number followed by an operator and another number. For this, we need to define the integer operation for each situation (single integer value, addition of two values, subtraction of two values, division of two values, and multiplication of two values).

Again, we start by adding ^^ to our rule to include the Scala code that we want to execute in these situations. The result is:

def expr: Parser[Int] = (number ^^ { _.toInt }) ~ opt(operator ~ expr ) ^^ {
    case a ~ None => a
    case a ~ Some("*" ~ b) => a * b
    case a ~ Some("/" ~ b) => a / b
    case a ~ Some("+" ~ b) => a + b
    case a ~ Some("-" ~ b) => a - b

There are five cases we're handling. The first is the situation where we have just a single integer (a ~ None). When we have an Int with None after it, we simply evaluate the integer value as-is. The second situation is when we have an integer being multiplied by another integer (a ~ Some("*" ~ b)). In this case, we simply perform a * b. We then proceed to define the rules for division, addition, and subtraction.

The key take-aways from this tutorial are:
  • You define the type that your parser rule is returning within the brackets of the Parser[ ] definition. In this example, it's an Int.
  • You can add custom Scala code to operate on the parser results with ^^ { ... }

This works great for this simple arithmetic DSL. However, for anything more involved than this simple example, we most likely need to build a parse tree (or better yet, an AST) that we can process. I cover an introduction to parse trees in the next tutorial on Scala Parser Combinators.

No comments:

Post a Comment