On Tue, Dec 18, 2012 at 4:34 PM, Richard Henderson <r...@redhat.com> wrote:
> On 12/14/2012 04:20 AM, Richard Biener wrote:
>> Exposing known rounding modes as new operation codes may sound like
>> a good idea (well, I went a similar way with trying to make operations with
>> undefined overflow explicit ... but the fallout was quite large even though
>> there is only one kind of undefined overflow and not many operation codes
>> that are affected ... so the work stalled - see the no-undefined-overflow
>> branch).
>> But don't under-estimate the fallout - both in wrong-code and
>> missed-optimizations.
>
> Yes, there will be problems adding new operation codes, but if you separate
> out the subcode somewhere, how can you be sure that the existing optimizations
> are looking at it and honoring it?  It seems to me that's just as much a
> source of wrong-code as new operation codes.

Indeed.  Which is why I settled with new operation codes for
no-undefined-overflow.

>> Not sure if we want to start allocating sub-spaces of codes to a group
>> to allow flag-like composition (say, PLUS_EXPR gets 0x10 and the lower
>> nibble specifies the rounding mode).  It looks more appealing for the
>> rounding mode case (more cases) than for the binary (un-)defined overflow
>> case.
>
> The largest problem here is that we're constrained on space:
>
>   ENUM_BITFIELD(rtx_code) code: 16;
>
>   unsigned int subcode          : 16;
>
> we can't afford to allocate an entire nibble to rounding.
>
> We could allocate the codes in some sort of pattern that would make it
> easy to extract the rounding mode algorithmically.  Something like
>
>   (code - BASE) % 5
>
> since there are 4 directed rounding modes plus "unknown" or "dynamic".
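
For illustration, a layout along the lines of the (code - BASE) % 5
suggestion could look roughly like the sketch below.  Everything named here
(FP_CODE_BASE, the fp_op and fp_round enums, the fp_code_* helpers) is
hypothetical and not an existing GCC identifier; the only thing taken from
the thread is the idea of allocating the codes so that the rounding mode
falls out of a modulo.

```c
#include <assert.h>

/* Hypothetical rounding tags: "dynamic"/unknown plus four known modes.  */
enum fp_round { RND_DYNAMIC, RND_NEAREST, RND_ZERO, RND_UP, RND_DOWN,
                RND_COUNT };

/* Hypothetical operation families that would get rounding variants.  */
enum fp_op { OP_PLUS, OP_MINUS, OP_MULT, OP_RDIV, OP_COUNT };

/* Assumed first code of the block reserved for rounding-aware codes.  */
#define FP_CODE_BASE 0x100

/* Build the code for OP with rounding mode RND.  */
static inline int
fp_code_make (enum fp_op op, enum fp_round rnd)
{
  assert (op < OP_COUNT && rnd < RND_COUNT);
  return FP_CODE_BASE + (int) op * RND_COUNT + (int) rnd;
}

/* Extract the rounding mode: (code - BASE) % 5.  */
static inline enum fp_round
fp_code_round (int code)
{
  return (enum fp_round) ((code - FP_CODE_BASE) % RND_COUNT);
}

/* Extract the operation family: (code - BASE) / 5.  */
static inline enum fp_op
fp_code_op (int code)
{
  return (enum fp_op) ((code - FP_CODE_BASE) / RND_COUNT);
}

/* Map any variant back to the RND_DYNAMIC variant of the same family.  */
static inline int
fp_code_strip_round (int code)
{
  return code - (int) fp_code_round (code);
}
```

With this layout five consecutive codes are spent per operation family
instead of a whole nibble, at the price of a division or modulo wherever a
pass needs the family or the mode.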

Or stick with what I've done on no-undefined-overflow:

#define PLUS_EXPR_P(code) ((code) == PLUS_EXPR || (code) == PLUS_NV_EXPR)
...

/* Returns an equivalent non-NV tree code for CODE.  */
static inline enum tree_code
strip_nv (enum tree_code code)
{
  switch (code)
    {
    case NEGATENV_EXPR:
      return NEGATE_EXPR;
...

For FP rounding you'd probably have something similar to what we have for
the qualifiers, plus functions to attach/remove rounding modes from codes.
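
For readers without the branch at hand, here is a self-contained toy version
of the predicate/strip pattern.  The toy_code enum is a stand-in and not
GCC's tree_code, and toy_add_nv is a hypothetical inverse of strip_nv that
does not appear in the excerpt above; an attach/remove pair for rounding
modes would presumably have the same shape.

```c
#include <assert.h>

/* Stand-in code enum; TOY_MULT deliberately has no NV variant here.  */
enum toy_code { TOY_PLUS, TOY_PLUS_NV, TOY_NEGATE, TOY_NEGATE_NV, TOY_MULT };

/* True for either flavour of addition.  */
#define TOY_PLUS_P(code) ((code) == TOY_PLUS || (code) == TOY_PLUS_NV)

/* Return the equivalent non-NV code for CODE.  */
static inline enum toy_code
toy_strip_nv (enum toy_code code)
{
  switch (code)
    {
    case TOY_PLUS_NV:
      return TOY_PLUS;
    case TOY_NEGATE_NV:
      return TOY_NEGATE;
    default:
      return code;
    }
}

/* Return the NV variant of CODE, or CODE itself if it has none.  */
static inline enum toy_code
toy_add_nv (enum toy_code code)
{
  enum toy_code result = code;
  switch (code)
    {
    case TOY_PLUS:
      result = TOY_PLUS_NV;
      break;
    case TOY_NEGATE:
      result = TOY_NEGATE_NV;
      break;
    default:
      break;
    }
  /* Attaching the flag must not change the underlying operation.  */
  assert (toy_strip_nv (result) == toy_strip_nv (code));
  return result;
}
```

A pass that does not care about the flag tests TOY_PLUS_P (code) or switches
on toy_strip_nv (code); a pass that does care compares the raw code.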

>> You'd want to expose the rounding mode libc functions as builtins to be
>> able to detect them.  That's good anyway and can be done independently
>> (they currently act as memory optimization barriers, which avoids most of
>> the issues with -frounding-math support).
>
> Yep.
>
>> Insertion of rounding mode changes has to be done after 2nd scheduling
>> (and you probably want to have even 1st scheduling optimize the schedule
>> for rounding mode changes ...).  Machine-reorg is one natural place to do
>> it (or where we currently insert vzeroupper).
>
> Flogging the 387 fpcr or the sse mxcsr is just complicated enough to require
> a free register, and thus it probably has to be done before register
> allocation.
> E.g. during the optimize-mode-switching pass where we currently handle 387
> rounding modes coming from other builtins and casts.

Or initially just not do anything here - you have to treat all external calls
conservatively anyway (and asms, unless we have a portable documented
way of saying "this asm affects the fp status" ...).  So initially just make
sure that the statements that are supposed to be barriers for FP ops
behave that way (note that such conservative treatment produces
"undefined rounding mode" ops which are not combinable with even
their own kind).
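
The "not combinable with even their own kind" rule can be written down as a
small predicate.  This is only a sketch of the rule as stated above; the
toy_round enum and toy_combinable_p are made-up names, not anything in GCC.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical rounding tags; TOY_RND_UNKNOWN is what conservative
   treatment of calls and asms would produce.  */
enum toy_round { TOY_RND_UNKNOWN, TOY_RND_NEAREST, TOY_RND_ZERO,
                 TOY_RND_UP, TOY_RND_DOWN, TOY_RND_COUNT };

/* Two FP ops with rounding modes A and B may be treated as equivalent
   (CSEd, combined) only when both modes are known and equal.  Two
   "unknown" ops may run under different dynamic modes, so they never
   match, not even each other.  */
static bool
toy_combinable_p (enum toy_round a, enum toy_round b)
{
  assert (a < TOY_RND_COUNT && b < TOY_RND_COUNT);
  if (a == TOY_RND_UNKNOWN || b == TOY_RND_UNKNOWN)
    return false;
  return a == b;
}
```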

Which comes back to the fact that you somehow need to model dependency
on (possibly) rounding mode changing statements.  You can always
abuse virtual operands for that (either the existing single one or by
re-introducing the possibility of having multiple virtual operands per stmt).
Of course most passes do not care about virtual operands on things that
do not look like memory accesses.  And this is just the GIMPLE side,
you also need to handle fold (maybe a non-issue - but look at compound
stmt foldings) and RTL.
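
To make the dependency idea concrete, here is a toy model of it, loosely in
the spirit of a virtual operand: every statement that may change the
rounding mode starts a new "version" of the rounding state, and two FP ops
can only be considered equivalent if they see the same version.  All names
are hypothetical and this is not how GIMPLE virtual operands are actually
implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* TOY_BARRIER stands for anything that may change the rounding mode
   (external calls, asms, rounding-mode builtins).  */
enum toy_stmt_kind { TOY_FP_OP, TOY_BARRIER, TOY_OTHER };

/* Version of the rounding state seen by statement IDX: the number of
   barriers at or before it (each barrier acts like a new definition).  */
static int
toy_round_state_version (const enum toy_stmt_kind *stmts, int n, int idx)
{
  assert (idx >= 0 && idx < n);
  int version = 0;
  for (int i = 0; i <= idx; i++)
    if (stmts[i] == TOY_BARRIER)
      version++;
  return version;
}

/* True if statements I and J are both FP ops and no possible rounding
   mode change separates them.  */
static bool
toy_same_round_state_p (const enum toy_stmt_kind *stmts, int n, int i, int j)
{
  int vi = toy_round_state_version (stmts, n, i);
  int vj = toy_round_state_version (stmts, n, j);
  return stmts[i] == TOY_FP_OP && stmts[j] == TOY_FP_OP && vi == vj;
}

/* Example sequence: two FP ops, then a barrier, then another FP op.  */
static const enum toy_stmt_kind toy_example[] =
  { TOY_FP_OP, TOY_OTHER, TOY_FP_OP, TOY_BARRIER, TOY_FP_OP };
```

In toy_example, statements 0 and 2 share a version and could be combined,
while statement 4 sits behind the barrier and must be kept apart from both.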

Richard.

>
> r~
