Attribute flow

Attribute flow is the pattern in which information moves from node to node in the parse tree.

Synthesized and Inherited Attributes

Synthesized attributes have their values only calculated in productions where the associated symbol appears on the left hand side. All dependencies point from the child to parent in the parse tree, i.e., it is bottom-up.

An inherited attribute is one that is not synthesized. Information may flow to a node from the parent or siblings.

In compilers, symbol table information is often passed from symbol to symbol using inherited attributes. Look at example in Scott (4.3?) where he uses an inherited attribute grammar in an LL(1) grammar for expressions to obtain left associativity.

AGs are declarative- define set of valid trees, but don’t specify how to build or decorate them.

AGs can be well-defined; rules determine a unique set of values for every possible parse tree. Also can be non-circular; never yield a parse tree with cycles in the attribute flow graph. They can also be both circular and well-defined.

Translation scheme

An algorithm that decorates a tree in an order that respects the AG attribute flow. Most obvious scheme is to make repeated passes over the tree, computing any attribute whose arguments are defined. Terminates when values no longer change. Doesn’t use any knowledge of structure of tree or grammar.

If tree is non-cyclic, use a dynamic scheme that tailors the evaluation order to the structure of the given tree. Construct a dependency graph and then perform a topological sort on dependency graph, followed by evaluating semantic rules in order of sort.

In compilers

Fastest translation schemes are static on a restricted set of AGs. Anything interleaved with parsing in a recursive descent parser must be specifiable with an L-attributed grammar.

A common approach is to interleave construction of an abstract syntax tree (AST) with parsing (no explicit parse tree).

S-attributed AG

All attributes are synthesized, such that their values are calculated only in productions in which their symbol appears on the left hand side. Can be computed by single bottom-up traversal of the syntax tree. This can be interleaved with parsing. The arguments to semantic functions in an S-attributed AG are always attributes of symbols on the right-hand side of the current production, and the return value is always placed into an attribute of the left-hand side of the production.

Consider the following LL(1) expression grammer for left-associative subtraction:

expr -> const expr_trail
expr_tail -> - const expr_tail | $\epsilon$

Now, we can add inherited attributes to establish left-to-right attribute flow (non S-attributed)

expr -> const expr_trail
    expr_tail.st    := const.val
    expr.val        := expr_tail.val
expr_tail_1 -> - const expr_tail_2
    expr_tail_2.st  := expr_tail_1.st - const.val
    expr_tail_1.val := expr_tail_2.val
expr_tail -> $\epsilon$
    expr_tail.val   := expr_tail.st

These are inherited, because certain productions have symbols being assigned while that symbol does not appear on the LHS.

L-attributed AG

If we say that attribute A.s depends on B.t (bc B.t is passed to a semantic function that returns a value for A.s) then we can define L-attributed grammars with the following:

  • Each synthesized attribute of a left-hand-side symbol depends only on that symbol’s own inherited attributes or on attributes (synthesized or inherited) of the production’s RHS symbols
  • Each inherited attribute of a right-hand-side symbol depends only on inherited attributes of the left-hand-side symbol or on attributes (synthesized or inherited) of symbols to its left in the right-hand-side

This all means that the attributes for this grammar can be computed with a single left-to-right depth first traversal of the parse tree. Can be computed on-the-fly during an LL(1) parse since this is the same order that nodes are generated during an LL parse.

Every S-attributed grammar is an L-attributed grammar.

Action routine

A semantic function that is executed at a particular point in a parse. These are used for ad hoc translation schemes interleaved with parsing.

Example:

expr -> const expr_trail
expr_tail -> {op = t.kind;} - const expr_tail { print(op); } | $\epsilon$

Tree Grammars

So far, attribute grammars have been used solely for decorating parse trees. AGs can also be used to decorate syntax trees. A CFG is meant to define (generate) a language composed of strings of tokens, where each string is the yield of a parse tree. A tree grammar is meant to define (or generate) the tree itself. Semantic rules attached to the productions of a tree grammar can be used to define the atrribute flow of a syntax tree in exactly the same way that semantic rules attached to the productions of a CFG are used to define the attribute flow of a parse tree.

Constructing an Abstract Syntax Tree

ASTs are only built for syntactically valid programs according to CFG that have already been parsed.

The concrete syntax is the syntax that is parsed. Abstract syntax is much simpler. ASTs only handle syntactically valid expressions.

Recall: The order in which a CFG is specified implies precedence for operators (e.g., for addition, subtraction). AST no longer deals with precedence or associativity.

OO Implementation

  • All nodes in AST are sublcasses of a common abstract base class
  • In a compiler, we could use the visitor pattern to traverse the tree for type checking and code generation
    • Define a class for each non-terminal in the abstract syntax
    • If there are multiple productions for a non-terminal, make the non-terminal class abstract, and introduce a subclass for each production

Visitor pattern

Separate code for what to do when traversing a data structure from the code for the data structure itself. The code for traversal encapsulated in a Visitor class. Action taken at a node can depend on the type of node and the type of the Visitor (double dispatch).

Useful in compilers because we traverse the AST multiple times with different purposes (type checking, code generation).