 

(bison.info.gz) Generalized LR Parsing

Info Catalog (bison.info.gz) Mystery Conflicts (bison.info.gz) Algorithm (bison.info.gz) Stack Overflow
 
 Generalized LR (GLR) Parsing
 ============================
 
    Bison produces _deterministic_ parsers that choose uniquely when to
 reduce and which reduction to apply based on a summary of the preceding
 input and on one extra token of lookahead.  As a result, normal Bison
 handles a proper subset of the family of context-free languages.
 Ambiguous grammars, since they have strings with more than one possible
 sequence of reductions, cannot have deterministic parsers in this sense.
 The same is true of languages that require more than one symbol of
 lookahead, since the parser lacks the information necessary to make a
 decision at the point it must be made in a shift-reduce parser.
 Finally, as previously mentioned (see Mystery Conflicts), there are
 languages where Bison's particular choice of how to summarize the input
 seen so far loses necessary information.
 
    When you use the `%glr-parser' declaration in your grammar file,
 Bison generates a parser that uses a different algorithm, called
 Generalized LR (or GLR).  A Bison GLR parser uses the same basic
 algorithm for parsing as an ordinary Bison parser, but behaves
 differently when it meets a reduce-reduce conflict or a shift-reduce
 conflict that has not been resolved by precedence rules (see
 Precedence).  When a GLR parser encounters such a situation,
 it effectively _splits_ into several parsers, one for each possible
 shift or reduction.  These parsers then proceed as usual, consuming
 tokens in lock-step.  Some of the stacks may encounter other conflicts
 and split further, with the result that instead of a sequence of states,
 a Bison GLR parsing stack is, in effect, a tree of states.
 
    In effect, each stack represents a guess as to what the proper parse
 is.  Additional input may indicate that a guess was wrong, in which case
 the appropriate stack silently disappears.  Otherwise, the semantic
 actions generated in each stack are saved, rather than being executed
 immediately.  When a stack disappears, its saved semantic actions never
 get executed.  When a reduction causes two stacks to become equivalent,
 their sets of semantic actions are both saved with the state that
 results from the reduction.  We say that two stacks are equivalent when
 they both represent the same sequence of states, and each pair of
 corresponding states represents a grammar symbol that produces the same
 segment of the input token stream.
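
    The definition of equivalent stacks can be illustrated with a small
 C sketch.  It is only a model of the wording above, not code taken from
 a Bison-generated parser: the `struct entry' type, its fields and the
 `stacks_equivalent' function are hypothetical names, and representing
 "the same segment of the input" as a pair of token positions is an
 assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one stack entry: an automaton state plus the
   half-open range [start, end) of input tokens covered by the grammar
   symbol that the state represents.  These names are illustrative and
   do not come from Bison's generated code. */
struct entry {
    int state;
    size_t start;
    size_t end;
};

/* Return true when two stacks are equivalent in the sense described
   above: same length, same sequence of states, and corresponding
   entries covering the same segment of the input token stream. */
static bool stacks_equivalent(const struct entry *a, size_t a_len,
                              const struct entry *b, size_t b_len)
{
    if (a_len != b_len)
        return false;
    for (size_t i = 0; i < a_len; i++) {
        if (a[i].state != b[i].state
            || a[i].start != b[i].start
            || a[i].end != b[i].end)
            return false;
    }
    return true;
}
```

    Under this model, two stacks that reach the same states but divide
 the input differently between their symbols are not equivalent, so
 their saved actions would not be combined.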
 
    Whenever the parser makes a transition from having multiple states
 to having one, it reverts to the normal LALR(1) parsing algorithm,
 after resolving and executing the saved-up actions.  At this
 transition, some of the states on the stack will have semantic values
 that are sets (actually multisets) of possible actions.  The parser
 tries to pick one of the actions by first finding one whose rule has
 the highest dynamic precedence, as set by the `%dprec' declaration.
 Otherwise, if the alternative actions are not ordered by precedence,
 but the same merging function is declared for both rules by the
 `%merge' declaration, Bison resolves and evaluates both and then calls
 the merge function on the result.  Otherwise, it reports an ambiguity.
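
    The order of these checks can be sketched in C.  The sketch follows
 the description above rather than the code Bison generates: `struct
 alternative', `enum resolution' and `resolve' are hypothetical names.
 Using 0 to mean "no `%dprec' declared", treating a pair in which either
 rule lacks `%dprec' as unordered, and comparing merge functions by name
 are all assumptions of the example.

```c
#include <stddef.h>
#include <string.h>

enum resolution { PICK_FIRST, PICK_SECOND, MERGE_BOTH, AMBIGUOUS };

/* Hypothetical description of one alternative action.  `dprec' holds
   the value given by %dprec, with 0 standing for "not declared" (an
   assumption of this sketch).  `merger' is the name given to %merge,
   or NULL when the rule has no %merge declaration. */
struct alternative {
    int dprec;
    const char *merger;
};

/* Decide what to do with two alternative actions for the same state. */
static enum resolution resolve(const struct alternative *a,
                               const struct alternative *b)
{
    /* Ordered by dynamic precedence: the higher value wins. */
    if (a->dprec != 0 && b->dprec != 0 && a->dprec != b->dprec)
        return a->dprec > b->dprec ? PICK_FIRST : PICK_SECOND;

    /* Not ordered, but both rules declare the same merge function:
       evaluate both and hand the two results to that function. */
    if (a->merger != NULL && b->merger != NULL
        && strcmp(a->merger, b->merger) == 0)
        return MERGE_BOTH;

    /* Neither mechanism applies. */
    return AMBIGUOUS;
}
```

    For instance, two rules tagged `%dprec 1' and `%dprec 2' would
 resolve in favor of the second, while two rules that both carry the
 same `%merge' name and no `%dprec' would be merged.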
 
    It is possible to use a data structure for the GLR parsing tree that
 permits the processing of any LALR(1) grammar in linear time (in the
 size of the input), any unambiguous (not necessarily LALR(1)) grammar in
 quadratic worst-case time, and any general (possibly ambiguous)
 context-free grammar in cubic worst-case time.  However, Bison currently
 uses a simpler data structure that requires time proportional to the
 length of the input times the maximum number of stacks required for any
 prefix of the input.  Thus, highly ambiguous or non-deterministic
 grammars can require exponential time and space to process.  Such badly
 behaved examples, however, are not generally of practical interest.
 Usually, non-determinism in a grammar is local--the parser is "in
 doubt" only for a few tokens at a time.  Therefore, the current data
 structure should generally be adequate.  On LALR(1) portions of a
 grammar, in particular, it is only slightly slower than with the default
 Bison parser.
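
    The bound quoted above (input length times the maximum number of
 stacks) can be made concrete with a short C sketch.  The array of
 per-token stack counts is a hypothetical input invented for the
 example, since a Bison parser does not expose such a trace, and
 `glr_cost_bound' is an illustrative name.

```c
#include <stddef.h>

/* live[i] is the number of parser stacks alive while token i is
   consumed (a hypothetical trace).  Returns n * max(live), the bound
   stated in the text: the length of the input times the maximum number
   of stacks required for any prefix of the input. */
static unsigned long glr_cost_bound(const unsigned *live, size_t n)
{
    unsigned max = 0;
    for (size_t i = 0; i < n; i++)
        if (live[i] > max)
            max = live[i];
    return (unsigned long)n * max;
}
```

    With local non-determinism, such as a six-token input on which the
 parser briefly holds two stacks, the bound is 6 * 2 = 12 units of
 work, close to the 6 of a fully deterministic parse.  If instead every
 token doubled the number of stacks, the maximum would grow
 exponentially with the length of the input.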
 