(bison.info.gz) Generalized LR Parsing
Generalized LR (GLR) Parsing
============================
Bison produces _deterministic_ parsers that choose uniquely when to
reduce and which reduction to apply based on a summary of the preceding
input and on one extra token of lookahead. As a result, normal Bison
handles a proper subset of the family of context-free languages.
Ambiguous grammars, since they have strings with more than one possible
sequence of reductions, cannot have deterministic parsers in this sense.
The same is true of languages that require more than one symbol of
lookahead, since the parser lacks the information necessary to make a
decision at the point it must be made in a shift-reduce parser.
Finally, as previously mentioned (see Mystery Conflicts), there are
languages where Bison's particular choice of how to summarize the input
seen so far loses necessary information.
When you use the `%glr-parser' declaration in your grammar file,
Bison generates a parser that uses a different algorithm, called
Generalized LR (or GLR). A Bison GLR parser uses the same basic
algorithm for parsing as an ordinary Bison parser, but behaves
differently in cases where there is a shift-reduce conflict that has not
been resolved by precedence rules (see Precedence) or a
reduce-reduce conflict. When a GLR parser encounters such a situation,
it effectively _splits_ into several parsers, one for each possible
shift or reduction. These parsers then proceed as usual, consuming
tokens in lock-step. Some of the stacks may encounter other conflicts
and split further, with the result that instead of a sequence of states,
a Bison GLR parsing stack is in effect a tree of states.
In effect, each stack represents a guess as to what the proper parse
is. Additional input may indicate that a guess was wrong, in which case
the appropriate stack silently disappears. Otherwise, the semantic
actions generated in each stack are saved, rather than being executed
immediately. When a stack disappears, its saved semantic actions never
get executed. When a reduction causes two stacks to become equivalent,
their sets of semantic actions are both saved with the state that
results from the reduction. We say that two stacks are equivalent when
they both represent the same sequence of states, and each pair of
corresponding states represents a grammar symbol that produces the same
segment of the input token stream.
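
The bookkeeping described above can be pictured with a toy model in C.
This is only a sketch under loud assumptions: `struct guess',
`feed_token' and `run_guesses' are hypothetical names, not part of
Bison, and each guess is reduced to the token sequence it expects
rather than a real stack of parser states. It shows only that guesses
advance in lock-step, that actions are saved instead of executed, and
that a contradicted guess drops whatever it had saved.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 16

/* Hypothetical stand-in for one GLR stack.  A real stack holds parser
   states; this toy holds the token kinds the guess expects to see.  */
struct guess
{
  const char *expected;        /* one character per token kind */
  int alive;                   /* cleared once the input contradicts it */
  int deferred[MAX_DEFERRED];  /* saved action ids, not yet executed */
  size_t ndeferred;
};

/* Feed token number POS to every live guess.  A guess that accepts the
   token saves one more action; a contradicted guess dies and drops
   everything it had saved.  Returns the number of guesses still alive.
   Tokens must be fed in order, starting at position 0.  */
size_t
feed_token (struct guess *g, size_t n, size_t pos, char tok)
{
  size_t live = 0;
  assert (tok != '\0');
  for (size_t i = 0; i < n; i++)
    {
      if (!g[i].alive)
        continue;
      if (g[i].expected[pos] != tok)
        {
          g[i].alive = 0;
          g[i].ndeferred = 0;   /* saved actions are never executed */
          continue;
        }
      assert (g[i].ndeferred < MAX_DEFERRED);
      g[i].deferred[g[i].ndeferred++] = (int) pos;  /* defer, do not run */
      live++;
    }
  return live;
}

/* Run all guesses in lock-step over INPUT, a NUL-terminated string of
   token kinds.  Assumes every guess starts alive.  */
size_t
run_guesses (struct guess *g, size_t n, const char *input)
{
  size_t live = n;
  for (size_t pos = 0; input[pos] != '\0'; pos++)
    live = feed_token (g, n, pos, input[pos]);
  return live;
}
```

With two guesses expecting `t(i)=i;' and `t(i)+i;', the input
`t(i)+i;' keeps both alive for the first four tokens. The fifth token
contradicts the first guess, which disappears together with its saved
actions, while the second guess finishes with one saved action per
token.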
Whenever the parser makes a transition from having multiple states
to having one, it reverts to the normal LALR(1) parsing algorithm,
after resolving and executing the saved-up actions. At this
transition, some of the states on the stack will have semantic values
that are sets (actually multisets) of possible actions. The parser
tries to pick one of the actions by first finding one whose rule has
the highest dynamic precedence, as set by the `%dprec' declaration.
Otherwise, if the alternative actions are not ordered by precedence,
but the same merging function is declared for both rules by the
`%merge' declaration, Bison resolves and evaluates both and then calls
the merge function on the result. Otherwise, it reports an ambiguity.
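
The order of these checks can be sketched in C for the case of two
competing alternatives. This is an illustration of the rules as stated
above, not Bison's generated code: `struct alternative',
`resolve_pair' and `merge_sum' are hypothetical names, the semantic
value is assumed to be a plain `int', and the sketch assumes that an
alternative with no `%dprec' (modeled as 0) is not ordered against any
other.

```c
#include <assert.h>
#include <stddef.h>

typedef int value;                        /* hypothetical semantic value */
typedef value (*merge_fn) (value, value);

struct alternative
{
  int dprec;       /* dynamic precedence from %dprec; 0 if none declared */
  merge_fn merge;  /* function named by %merge; NULL if none declared */
  value val;       /* value the alternative's saved action would yield */
};

enum outcome { RESOLVED, AMBIGUOUS };

/* Hypothetical merge function: combines both results by adding them.  */
value
merge_sum (value x0, value x1)
{
  return x0 + x1;
}

/* Choose between two alternatives that cover the same input.
   1. If both declare a dynamic precedence and they differ, the higher
      one wins.
   2. Otherwise, if both name the same merge function, evaluate both
      and merge the results.
   3. Otherwise the parse is ambiguous.  */
enum outcome
resolve_pair (const struct alternative *a, const struct alternative *b,
              value *out)
{
  assert (a != NULL && b != NULL && out != NULL);
  if (a->dprec != 0 && b->dprec != 0 && a->dprec != b->dprec)
    {
      *out = (a->dprec > b->dprec ? a : b)->val;
      return RESOLVED;
    }
  if (a->merge != NULL && a->merge == b->merge)
    {
      *out = a->merge (a->val, b->val);
      return RESOLVED;
    }
  return AMBIGUOUS;
}
```

In this sketch, alternatives with precedences 1 and 2 resolve to the
second one, two alternatives that both name `merge_sum' resolve to the
sum of their values, and two alternatives with neither declaration are
reported as ambiguous.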
It is possible to use a data structure for the GLR parsing tree that
permits the processing of any LALR(1) grammar in linear time (in the
size of the input), any unambiguous (not necessarily LALR(1)) grammar in
quadratic worst-case time, and any general (possibly ambiguous)
context-free grammar in cubic worst-case time. However, Bison currently
uses a simpler data structure that requires time proportional to the
length of the input times the maximum number of stacks required for any
prefix of the input. Thus, highly ambiguous or non-deterministic
grammars can require exponential time and space to process. Such badly
behaved examples, however, are not generally of practical interest.
Usually, non-determinism in a grammar is local--the parser is "in
doubt" only for a few tokens at a time. Therefore, the current data
structure should generally be adequate. On LALR(1) portions of a
grammar, in particular, it is only slightly slower than with the default
Bison parser.
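
The cost estimate above is simple arithmetic, sketched here in C. The
function names are hypothetical and the per-token stack counts are
supplied by the caller as an assumption; nothing here measures a real
parser. `cost_bound' multiplies the input length by the peak number of
live stacks, and `stacks_after_splits' shows how that peak grows if
every one of K tokens caused a two-way split and no stack ever died or
merged.

```c
#include <assert.h>
#include <stddef.h>

/* Bound on parsing work: input length times the largest number of
   stacks alive after any prefix.  LIVE_STACKS[i] is the assumed number
   of stacks alive after token i.  Overflow is not checked.  */
unsigned long
cost_bound (const unsigned *live_stacks, size_t n_tokens)
{
  unsigned peak = 0;
  assert (live_stacks != NULL || n_tokens == 0);
  for (size_t i = 0; i < n_tokens; i++)
    if (live_stacks[i] > peak)
      peak = live_stacks[i];
  return (unsigned long) n_tokens * peak;
}

/* Worst case for this model: each of K tokens doubles the number of
   stacks, giving 2^K stacks.  */
unsigned long
stacks_after_splits (unsigned k)
{
  assert (k < 32);
  return 1UL << k;
}
```

For a six-token input where the parser is "in doubt" for only two
tokens (stack counts 1, 1, 2, 2, 1, 1), the bound is 6 times 2, or 12
steps, against 6 for a fully deterministic parse. Ten consecutive
unresolved splits, by contrast, would already mean 1024 stacks.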