|
|
lex follows two rules to resolve ambiguities that may arise from the lex specification. These rules are:
%{
#include "y.tab.h"
%}
%%
START { return(STARTTOK); }
BREAK { return(BREAKTOK); }
END { return(ENDTOK); }
[a-zA-Z][a-zA-Z0-9]* { return(yytext); }
The string "START" could be matched by both the first or
the fourth rule: because START is a reserved word, you want only
the action associated with the first rule to be executed.
By placing the rule for START
and the other reserved words before the rule for identifiers,
you ensure that reserved words are recognized as such.
The second kind of ambiguity could arise, for example, if the input text was coded in a programming language that had operators that were similar. Part of a lex specification for the C language might look like this:
"+" {return(PLUS); }
"++" {return(INC); }
The lexical analyzer
should recognize the increment operator
"++", not the addition operator "+", when it
reads the following statement:
i++;