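lex follows two rules to resolve ambiguities that may arise from the lex specification. When more than one rule matches the same string, lex chooses the rule that appears first in the specification. When rules match strings of different lengths, lex chooses the rule that matches the longest string.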
In the following example, the rules for reserved words are placed before the rule for identifiers:

    %{
    #include "y.tab.h"
    %}
    %%
    START                   { return(STARTTOK); }
    BREAK                   { return(BREAKTOK); }
    END                     { return(ENDTOK); }
    [a-zA-Z][a-zA-Z0-9]*    { return(IDENTIFIER); /* IDENTIFIER is an assumed token name from y.tab.h */ }

The string "START" could be matched by either the first or the fourth rule. Because START is a reserved word, you want only the action associated with the first rule to be executed. By placing the rule for START and the other reserved words before the rule for identifiers, you ensure that reserved words are recognized as such.
The second kind of ambiguity could arise, for example, if the input text is written in a programming language in which one operator begins with the same characters as another. Part of a lex specification for the C language might look like this:

    "+"     { return(PLUS); }
    "++"    { return(INC); }

The lexical analyzer should recognize the increment operator "++", not the addition operator "+", when it reads the following statement:

    i++;
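
The way the two disambiguation rules interact can be illustrated with a small C sketch. It is not how lex is implemented (lex generates a table-driven automaton); it only mimics the selection logic by trying every rule at the current position, keeping the longest match, and letting the earlier rule win a tie. The function name next_token, the rules table, and the token codes are hypothetical and chosen for this illustration only.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes; a real project would take them from y.tab.h. */
enum { NONE = 0, STARTTOK, IDENT, PLUS, INC };

/* A rule reports how many characters it matches at s (0 means no match). */
typedef size_t (*matcher)(const char *s);

static size_t literal(const char *s, const char *lit) {
    size_t n = strlen(lit);
    return strncmp(s, lit, n) == 0 ? n : 0;
}

static size_t m_start(const char *s) { return literal(s, "START"); }
static size_t m_plus(const char *s)  { return literal(s, "+"); }
static size_t m_inc(const char *s)   { return literal(s, "++"); }

/* Equivalent of the pattern [a-zA-Z][a-zA-Z0-9]* */
static size_t m_ident(const char *s) {
    size_t n = 0;
    if (!isalpha((unsigned char)s[0]))
        return 0;
    while (isalnum((unsigned char)s[n]))
        n++;
    return n;
}

/* Rules in specification order: an earlier entry wins a tie. */
static const struct { matcher match; int token; } rules[] = {
    { m_start, STARTTOK },
    { m_ident, IDENT },
    { m_plus,  PLUS },
    { m_inc,   INC },
};

/* Returns the token found at s and stores the match length in *len. */
int next_token(const char *s, size_t *len) {
    int best = NONE;
    *len = 0;
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i].match(s);
        if (n > *len) {   /* strictly longer only, so the first rule keeps a tie */
            *len = n;
            best = rules[i].token;
        }
    }
    return best;
}
```

With this table, next_token("++;", &len) selects INC with a length of 2 rather than PLUS, because "++" is the longer match even though the "+" rule comes first. next_token("START", &len) selects STARTTOK because both the keyword rule and the identifier rule match five characters and the keyword rule is listed first, while next_token("STARTED", &len) selects IDENT because the identifier rule matches seven characters.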