date:20050306

Re: x + (y) + z

2005-03-06 Thread Frank Heckenbach

(Please reply in the list.)

Derek M Jones wrote:

> Frank,
> 
> >> Since both parses will recognise the input one has to be
> >> selected.  Picking the common case means that there is
> >> less work which has to be undone later in the semantics phase.
> >
> >Undoing and reversing parsing (as you also mentioned in another
> >mail) may be an option, but IMHO not a very good one, if you can't
> >use the parser the way it's supposed to.
> 
> These things are sometimes necessary.  My lalr(1) grammar of C
> changes the tree built for parameters from infix to postfix (or maybe
> the other way around; I don't have the source to hand).
> 
> >But since you don't state why you don't do the usual thing (separate
> >token for type names), I can't say anything more ...
> 
> For this grammar I don't build a symbol table.

Then how do you distinguish between cast and other parentheses in
the semantics phase as you say? Or evaluate semantics at all? You
may not have a classic form of symbol table, but surely some way of
knowing what a given identifier means, don't you?

> Because I don't have a symbol table to look things up in.
> Perhaps I should have pointed this out (it also answers
> Frank Heckenbach's question).  When parsing the visible
> source (ie not doing any preprocessing; well apart from
> ignoring the directives) a statement/declaration at a time
> the content of a symbol table are likely to be very incomplete.

So you don't keep symbols between statements? Then what is your
program supposed to output at all? Some token sequences are
ambiguous, if you don't know e.g. if some identifier is a type or
not. So it seems the best you could output then is "there are two
alternatives: ... and ...". This would look like the `%merge'
example in the manual, so maybe you can use `%merge'.

Otherwise, if you want to output "This means probably this: ...",
and not mention the alternatives, it seems like you're trying to
write a heuristic parser (whatever this means), and I'm not sure if
bison or any automatic parser generator is really suitable for this.

> Good alternative suggestion.  But this still requires tree rewriting
> after the expression has been parsed.  My %gooa option proposal
> avoids this grammar violence.

If I understand your previous mails correctly, it doesn't. It may
make the cases where you need to rewrite less common, but it's not
clear that this makes your code any simpler (as you write it once,
no matter how often it will be executed anyway).

Frank

-- 
Frank Heckenbach, [EMAIL PROTECTED]
http://fjf.gnu.de/
GnuPG and PGP keys: http://fjf.gnu.de/plan (7977168E)

___
Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison

Re: x + (y) + z

2005-03-06 Thread Hans Aberg

At 14:43 + 2005/03/05, Derek M Jones wrote:
>>The normal way to resolve this would be to let the lexer check the lookup
>>table to see what y is: a type or a number identifier, and then return that
>>type. WHy does this not work for you.
>
>Because I don't have a symbol table to look things up in.
>Perhaps I should have pointed this out (it also answers
>Frank Heckenbach's question).  When parsing the visible
>source (ie not doing any preprocessing; well apart from
>ignoring the directives) a statement/declaration at a time
>the content of a symbol table are likely to be very incomplete.

You are not processing the C language, but some other language, with the
context dependencies of the C language removed. Perhaps check the newsgroup
comp.comilers for better input on that problem.

>>The commands %left and %right handles left and right associativity.
>
>Good alternative suggestion.  But this still requires tree rewriting
>after the expression has been parsed. My %gooa option proposal
>avoids this grammar violence.

Featuritis should be posted in bug-bison. You can only expect to get such
features into Bison if there is someone willing to volunteer doing the work.

Alternatively, you might introduce a new token-name both for type-names and
number-names. I.e., instead of
%token type-name number-name
you write
%token token-type-name
%%
type-name: token-type-name;
number-name: token-type-name;

You then get a correct GLR parse, and can try to sort out the ambiguity
later.

  Hans Aberg

___
Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison

Re: x + (y) + z

2005-03-06 Thread Derek M Jones

Hans,

>You are not processing the C language, but some other language, with the
>context dependencies of the C language removed.

Correct.  I only claim to be processing C syntax on a
single statement/declaration basis (and then only at the visible
source level)..

> Perhaps check the newsgroup
>comp.comilers for better input on that problem.

I am a regular reader of that news group.

>>Good alternative suggestion.  But this still requires tree rewriting
>>after the expression has been parsed. My %gooa option proposal
>>avoids this grammar violence.
>
>Featuritis should be posted in bug-bison. You can only expect to get such
>features into Bison if there is someone willing to volunteer doing the work.

I have been looking at the source.  You never know...

>Alternatively, you might introduce a new token-name both for type-names and
>number-names. I.e., instead of
>%token type-name number-name
>you write
>%token token-type-name
>%%
>type-name: token-type-name;
>number-name: token-type-name;
>
>You then get a correct GLR parse, and can try to sort out the ambiguity
>later.

I don't understand what you are getting at here.


derek

--
Derek M Jones tel: +44 (0) 1252 520 667
Knowledge Software Ltd mailto:[EMAIL PROTECTED]
Applications Standards Conformance Testing   http://www.knosof.co.uk




___
Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison

Re: x + (y) + z

2005-03-06 Thread Derek M Jones

Frank,

>(Please reply in the list.)

Ok (your last reply did not look at if it came via
the list, so I responded to you only).

>> For this grammar I don't build a symbol table.
>
>Then how do you distinguish between cast and other parentheses in
>the semantics phase as you say? Or evaluate semantics at all? You
>may not have a classic form of symbol table, but surely some way of
>knowing what a given identifier means, don't you?

Perhaps not.  My interest is in measuring the characteristics of
source code.  If the number of ambiguous parses is small enough
then I can ignore their impact on the measurements.

>So you don't keep symbols between statements?

Not yet.  I have been thinking about what this might buy me.

>it seems like you're trying to
>write a heuristic parser (whatever this means), and I'm not sure if
>bison or any automatic parser generator is really suitable for this.

The final grammar is likely to have several uses.  Heuristic parsing
is one interesting avenue.  Automatic parser generator or hand written;
it is a tough problem.

>> Good alternative suggestion.  But this still requires tree rewriting
>> after the expression has been parsed.  My %gooa option proposal
>> avoids this grammar violence.
>
>If I understand your previous mails correctly, it doesn't.

Changing the associativity of an operator will potentially result in a different
parse tree being generated.

> It may
>make the cases where you need to rewrite less common, but it's not
>clear that this makes your code any simpler (as you write it once,
>no matter how often it will be executed anyway).

At the moment I do not have a solution to the ambiguity present in the
token sequence:

static int (foo); static

ie a declaration and the first token from a second declaration.  %gooa
would get me out of this hole without a big grammar rewrite.


derek

--
Derek M Jones tel: +44 (0) 1252 520 667
Knowledge Software Ltd mailto:[EMAIL PROTECTED]
Applications Standards Conformance Testing   http://www.knosof.co.uk




___
Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison

Re: x + (y) + z

2005-03-06 Thread Hans Aberg

At 17:03 + 2005/03/06, Derek M Jones wrote:
>>You are not processing the C language, but some other language, with the
>>context dependencies of the C language removed.
>
>Correct.  I only claim to be processing C syntax on a
>single statement/declaration basis (and then only at the visible
>source level)..

>>Alternatively, you might introduce a new token-name both for type-names and
>>number-names. I.e., instead of
>>%token type-name number-name
>>you write
>>%token token-type-name
>>%%
>>type-name: token-type-name;
>>number-name: token-type-name;
>>
>>You then get a correct GLR parse, and can try to sort out the ambiguity
>>later.
>
>I don't understand what you are getting at here.

You ignore the context information distinguishing between type-names and
number-names. So set these names equal to the same token, and let the GLR
parser handle it. It then produces all correct parses, including the
possible type-name/number-name choices. You then get the correct parse
trees, and need only decide how to select one over the other. Clearly, this
choice cannot be done, in general, unless you somehow supply the context
information missing.

  Hans Aberg

___
Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison

Re: x + (y) + z

2005-03-06 Thread Derek M Jones

Hans,

>You ignore the context information distinguishing between type-names and
>number-names. So set these names equal to the same token, and let the GLR
>parser handle it. It then produces all correct parses, including the
>possible type-name/number-name choices.

This is what I am already doing (I have no choice since there is
no symbol table).

> You then get the correct parse
>trees, and need only decide how to select one over the other. Clearly, this
>choice cannot be done, in general, unless you somehow supply the context
>information missing.

Unfortunately glr parsing is not a universal solution.  It requires that
at the end of processing there be a unique parse tree (the multiple
parse trees that may exist while processing the input tokens are required
to eventually resolve to a single parse).

The %dprec option allows the grammar writer to select which parse to use
in some circumstances (i.e., both parses must involve the same sequence
of tokens), but not all.

I am current trying to figure out why

(x) + (y) + z;

(which can be parsed in four different ways) is generating an ambiguity
(it should be handled by my existing uses of %dprec and grammar
rewrites)

derek

--
Derek M Jones tel: +44 (0) 1252 520 667
Knowledge Software Ltd mailto:[EMAIL PROTECTED]
Applications Standards Conformance Testing   http://www.knosof.co.uk




___
Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison

Re: x + (y) + z

2005-03-06 Thread Hans Aberg

At 18:12 + 2005/03/06, Derek M Jones wrote:
>> You then get the correct parse
>>trees, and need only decide how to select one over the other. Clearly, this
>>choice cannot be done, in general, unless you somehow supply the context
>>information missing.
>
>Unfortunately glr parsing is not a universal solution.  It requires that
>at the end of processing there be a unique parse tree (the multiple
>parse trees that may exist while processing the input tokens are required
>to eventually resolve to a single parse).

Paul Hilfinger is thinking about actions that during a split are executed
immediately, in addition to those delayed until the merge. This will enable
the kinds of things that you are asking for, I think, as one then can build
the parse trees, and make a selection based on that.

Otherwise, you seem to simply ask for nondeterministic parsing, where one
keeps track of all the parsing possibilities. Such parsers are not difficult
to write, see for example the Parsing Techniques book:
http://www.cs.vu.nl/~dick/PTAPG.html
Such parsers are especially easily written in a functional language, like
Haskell . If you check the Haskell site, or list, you
might find one for you. The problem is though that these are very slow, with
stress on very: In once replaced one with a Flex-Bison combination (on the
Mini-Prolog that comes with Hugs), and a very slow program became lightning
fast. But with your problem at hand, you may have no choice.

  Hans Aberg

___
Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison

Re: x + (y) + z

2005-03-06 Thread Derek M Jones

Hans,

>Paul Hilfinger is thinking about actions that during a split are executed
>immediately, in addition to those delayed until the merge. This will enable
>the kinds of things that you are asking for, I think, as one then can build
>the parse trees, and make a selection based on that.

I think the only way forward is for me to write some code, in yyparse,
to select between the various parses when an ambiguity is detected.

>Otherwise, you seem to simply ask for nondeterministic parsing, where one
>keeps track of all the parsing possibilities. Such parsers are not difficult
>to write, see for example the Parsing Techniques book:
>http://www.cs.vu.nl/~dick/PTAPG.html

I bought the book when it originally came out.

No quality generalized LR parsers available for decades and
then two come along at almost the same time
 http://www.cs.berkeley.edu/~smcpeak/elkhound/


derek

--
Derek M Jones tel: +44 (0) 1252 520 667
Knowledge Software Ltd mailto:[EMAIL PROTECTED]
Applications Standards Conformance Testing   http://www.knosof.co.uk




___
Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison

$B%P%$%V$G@dD:!*G($l$9$.%^%s%^%s(B$B!y(B

2005-03-06 Thread $B$*$M$@$j%(%C%A"v(B


¬¬¬¬¬¬¬¬¬  [EMAIL PROTECTED]@â  ¬¬¬¬¬¬¬¬



ªÂ¤¢ ÌqÌã´íèªªªªªªªªªªªªªªªªª

[EMAIL PROTECTED]@[EMAIL PROTECTED]@fB[vÅtFeBbVÈéæ·è¾³ñô

cccccccccccccccccccc

[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL 
PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@«««

[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL 
PROTECTED]@ªªªªªªªªªªªªªªªªªªªªªª



[EMAIL PROTECTED]@@  http://urafeti.candyhos.com/?fhtvsgezcq



\\\\\\\\\\\\\\\\\

[EMAIL PROTECTED]@\\\\\\  

\\\\\\\\\\\\\\\\\

[EMAIL PROTECTED]

[EMAIL PROTECTED]

[EMAIL PROTECTED]

 

 

[EMAIL PROTECTED]@nBèx

[EMAIL PROTECTED]

[EMAIL PROTECTED]

 

[EMAIL PROTECTED]@http://urafeti.candyhos.com/?fhtvsgezcq

 

\\\\\\\\\\\\\\\\\

[EMAIL PROTECTED]@\\\\\\  

\\\\\\\\\\\\\\\\\

[EMAIL PROTECTED]

[EMAIL PROTECTED]

[EMAIL PROTECTED]

 

[EMAIL PROTECTED]@http://urafeti.candyhos.com/?fhtvsgezcq

 

*E**E**E**E**E**E**E**E*E*



[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL 
PROTECTED]@ú« «Ì«~««

[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL 
PROTECTED]@ª®ª®ª®ª®ª®

[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL 
PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED] µ«Ü«¹«ñ«©«H«

[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL 
PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]

@

[EMAIL PROTECTED]@[EMAIL PROTECTED]@¼ÉàbèÌåR[téæªR·è¾æñÁô

ªªªªªªªªªªªªªªªªªªªªªªªªªªªªª



¬¬¬¬¬¬  [EMAIL PROTECTED]@â  ¬¬¬¬¬¬



Åß¢½¸çÆvíêé[Mª¦Ä¨èÜ·B²o^ÉgÉo¦ª

[EMAIL PROTECTED]

[EMAIL PROTECTED]@[}KWÌÞïEðÍ±¿ç«



[EMAIL PROTECTED]@http://pure.inmova.cc/huku/

[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL 
PROTECTED]@[EMAIL PROTECTED]@

ÇÌ[}KWæèzM³êéîñÌpÉÖµÄÍA

[EMAIL PROTECTED]@ ²wÇÒÂlÌÓCÉ¨¢Ä²pº³¢B

[EMAIL PROTECTED]@ ²ÐîæÌTCgÉ¨¯é¢©Èéguâ¹QÉÎµÄà

[EMAIL PROTECTED]@ ÇÅÍêØÌÓCð¢©ËÜ·B

[EMAIL PROTECTED]@ AfÚîñÉÖµÄÌ¢©Èé¨â¹ÉÎµÄà

[EMAIL PROTECTED]@ ¨¦µ©ËÜ·ÌÅ\ß²¹³º³¢B

[EMAIL PROTECTED]@[EMAIL PROTECTED]@

fÚ³ê½LÌêÜ½ÍSðÂÈ]Ú·é±ÆðÖ~vµÜ·B






___
Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison

Re: x + (y) + z

Re: x + (y) + z

Re: x + (y) + z

Re: x + (y) + z

Re: x + (y) + z

Re: x + (y) + z

Re: x + (y) + z

Re: x + (y) + z

$B%P%$%V$G@dD:!*G($l$9$.%^%s%^%s(B$B!y(B

9 matches

Site Navigation

Mail list logo

Footer information