Lexical Analysis
----------------

Comments are enclosed in /* and */ and may span several lines.
Comments do not nest. Blanks, tabs, returns, newlines, and comments
are skipped during lexical analysis.

Identifiers are enclosed in { and }, must begin with a letter, and
consist of letters and digits. The underscore counts as a letter.

Program fragments are enclosed in %{ and %}. They do not nest.
Furthermore, they are immediately copied to the generated scanner
during lexical analysis of the specification (so that there is no
danger of overflowing the generator's matching buffer).

For the tokens and attributes returned by the scanner of the scanner
generator see the following table:

    Lexeme                                      Token   Attribute

    blank                                       -       -
    tab                                         -       -
    return                                      -       -
    newline                                     -       -
    /* comment */                               -       -
    EOF                                         E_O_F   -
    %%                                          SDEL    -
    %{ program text %}                          PROG    -
    :                                           COLON   -
    { letter, followed by letters or digits }   IDENT   pointer to name
    |                                           BAR     -
    *                                           STAR    -
    +                                           PLUS    -
    ?                                           QMARK   -
    (                                           LPAR    -
    )                                           RPAR    -
    [                                           LBRA    -
    ]                                           RBRA    -
    -                                           DASH    -
    ^                                           CARET   -
    .                                           DOT     -
    \n \t \b \r \f \a                           CHAR    ASCII code
    \ octal digits                              CHAR    ASCII code
    \ other character                           CHAR    ASCII code
    any other character                         CHAR    ASCII code


Syntax of the Specification Language (LR Parsing)
-------------------------------------------------

    specification   : definitions SDEL rules E_O_F
                    | definitions SDEL rules SDEL utilities E_O_F
                    ;

    utilities       : /* epsilon */
                    | utilities PROG
                    ;

    definitions     : /* epsilon */
                    | definitions definition
                    ;

    definition      : COLON IDENT regexpr
                    | PROG
                    ;

    rules           : /* epsilon */
                    | rules rule
                    ;

    rule            : regexpr PROG
                    ;

    regexpr         : regterm
                    | regexpr BAR regterm
                    ;

    regterm         : regfactor
                    | regterm regfactor
                    ;

    regfactor       : regprimary
                    | regfactor STAR
                    | regfactor PLUS
                    | regfactor QMARK
                    ;

    regprimary      : LPAR regexpr RPAR
                    | IDENT
                    | CHAR
                    | LBRA charclass RBRA
                    | LBRA CARET charclass RBRA
                    | DOT
                    ;

    charclass       : classcomponent
                    | charclass classcomponent
                    ;

    classcomponent  : CHAR
                    | CHAR DASH CHAR
                    ;
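

Illustrative Tokenizer Sketch
-----------------------------

The lexeme table in the Lexical Analysis section can be read as a small
hand-written tokenizer. The Python sketch below is only an illustration
of that table, not the generator's actual scanner. The names tokenize,
OPERATORS, ESCAPES, IDENT_RE, and OCTAL_RE are hypothetical. Several
points the table leaves open are filled in by assumption: an octal
escape is taken to have one to three digits, a { that does not start a
well-formed identifier is treated as an ordinary character, an
unterminated comment or program fragment is reported as an error, and
the IDENT attribute is the name string itself rather than a pointer.
Program fragments are simply skipped here; the text above says the real
scanner copies them to the generated scanner.

```python
import re

# Single-character operator lexemes from the table.
OPERATORS = {
    ":": "COLON", "|": "BAR", "*": "STAR", "+": "PLUS", "?": "QMARK",
    "(": "LPAR", ")": "RPAR", "[": "LBRA", "]": "RBRA",
    "-": "DASH", "^": "CARET", ".": "DOT",
}
# Named escapes and their ASCII codes.
ESCAPES = {"n": 10, "t": 9, "b": 8, "r": 13, "f": 12, "a": 7}
# Identifier: letter (underscore counts), then letters or digits, in braces.
IDENT_RE = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")
# Assumption: an octal escape has at most three digits.
OCTAL_RE = re.compile(r"[0-7]{1,3}")


def tokenize(text):
    """Return a list of (token, attribute) pairs for a specification."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        c = text[i]
        ident = IDENT_RE.match(text, i) if c == "{" else None
        if c in " \t\r\n":
            i += 1                                  # whitespace is skipped
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)            # comments do not nest
            if end < 0:
                raise ValueError("unterminated comment")
            i = end + 2
        elif text.startswith("%%", i):
            tokens.append(("SDEL", None))
            i += 2
        elif text.startswith("%{", i):
            end = text.find("%}", i + 2)            # fragments do not nest
            if end < 0:
                raise ValueError("unterminated program fragment")
            tokens.append(("PROG", None))
            i = end + 2
        elif ident:
            tokens.append(("IDENT", ident.group(1)))
            i = ident.end()
        elif c in OPERATORS:
            tokens.append((OPERATORS[c], None))
            i += 1
        elif c == "\\" and i + 1 < n:
            nxt = text[i + 1]
            octal = OCTAL_RE.match(text, i + 1)
            if nxt in ESCAPES:
                code, i = ESCAPES[nxt], i + 2
            elif octal:
                code, i = int(octal.group(), 8), octal.end()
            else:
                code, i = ord(nxt), i + 2           # backslash + other char
            tokens.append(("CHAR", code))
        else:
            tokens.append(("CHAR", ord(c)))         # any other character
            i += 1
    tokens.append(("E_O_F", None))
    return tokens
```

Calling tokenize on a definition such as ": {digit} [0-9]" yields COLON,
IDENT, LBRA, CHAR, DASH, CHAR, RBRA, and E_O_F, which is the token
sequence the definition and classcomponent productions expect.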