False ‘noreturn’ function does return warnings

2007-02-06 Thread Ralf Baechle
In an OS kernel functions that do not return are commonly used and
practically always their code is beyond gcc's ability to recognize
noreturn functions.  A typical example would for example be the BUG()
function in Linux which is implemented as something like:

static inline void __attribute__((noreturn)) BUG(void)
{
__asm__ __volatile__("trap");
}

So the code doesn't return but gcc isn't able to figure that out and the
caring programmer trying to help gcc to do a better job by adding the
noreturn is punished with

  warning: ‘noreturn’ function does return

There are other such situations in plain C.  A common solution is to add
something like like while(1) - but that does generate code.  Quite a bit
for frequently called or very small functions.  This frequently makes the
use of noreturn functions unattractive.  So two suggested solutions:

1) Burry the warning for good.  The programmer has explicitly said
   noreturn so if the function does return it's his problem.

2) Introduce a __builtin_unreached() builtin function which tells gcc that
   it not being reached, ever and does not generate any code.  This could
   be used like:

static inline void __attribute__((noreturn)) BUG(void)
{
__asm__ __volatile__("trap");
__builtin_unreached();
}

It would even conveniently be usable in macros or conditionals.

  Ralf



Re: [c++] switch ( enum ) vs. default statment.

2007-02-06 Thread Ralf Baechle
On Tue, Feb 06, 2007 at 04:44:50PM -0200, Alexandre Oliva wrote:

> > * You can add a return 0 or an exit(1) at the end of the function or
> > in a default label. Since in your case the code is unreachable, the
> > optimiser may remove it or it will never be executed.
> 
> But this would generate additional code for no useful purpose.
> See Ralf Baechle's posting with Subject: False ‘noreturn’ function
> does return warnings.
> 
> /me thinks it would make sense for us to add a __builtin_unreachable()
> that would indicate to GCC that no further action is needed from that
> point on.  A command-line flag could then tell whether GCC should
> generate abort() for such cases, or take a more general "undefined
> behavior" approach without generating additional code to that end.
> 
> Meanwhile, there's __builtin_trap() already, and Ralf might use that
> even to remove the asm volatile, and Paweł could use it in a default:
> label.  It's still worse than a __builtin_assume(e == X || e == Y),
> but it's probably much simpler to implement.  But then,
> __builtin_unreachable() might very well be implemented as
> __builtin_assume(0).

The Linux/MIPS BUG() macro uses a break 512 instruction while
__builtin_trap() generates a plain break that is break 0 instruction.
Aside of that __builtin_trap() would be fine ...

  Ralf


Re: RFC: A new MIPS64 ABI

2011-05-09 Thread Ralf Baechle
On Mon, Feb 21, 2011 at 07:45:41PM +, Richard Sandiford wrote:

> David Daney  writes:
> > Background:
> >
> > Current MIPS 32-bit ABIs (both o32 and n32) are restricted to 2GB of
> > user virtual memory space.  This is due the way MIPS32 memory space is
> > segmented.  Only the range from 0..2^31-1 is available.  Pointer
> > values are always sign extended.
> >
> > Because there are not already enough MIPS ABIs, I present the ...
> >
> > Proposal: A new ABI to support 4GB of address space with 32-bit
> > pointers.
> 
> FWIW, I'd be happy to see this go into GCC.

So am I for the kernel primarily because it's not really a new ABI but
an enhancement of the existing N32 ABI.

  Ralf


Re: [PATCH] MIPS: Make BUG() __noreturn.

2008-11-21 Thread Ralf Baechle
On Fri, Nov 21, 2008 at 07:46:43PM +0100, Geert Uytterhoeven wrote:

> > up with a couple of options:
> > 
> > 1) Enhance the _builtin_trap() function so that we can specify the
> >   break code that is emitted.  This would allow us to do something
> >   like:
> > 
> > static inline void __attribute__((noreturn)) BUG()
> > {
> > __builtin_trap(0x200);
> > }

I had suggested this one before ...

> > 2) Create a new builtin '__builtin_noreturn()' that expands to nothing
> >   but has no CFG edges leaving it, which would allow:
> > 
> > static inline void __attribute__((noreturn)) BUG()
> > {
> > __asm__ __volatile__("break %0" : : "i" (0x200));
> > __builtin_noreturn();
> > }
> 
> Now I remember, yes, __builtin_trap() is how we fixed it on m68k:

I like this interface.

> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e8006b060f3982a969c5170aa869628d54dd30d8
> 
> Of course, if you need a different trap code than the default, you're in
> trouble.

MIPS ISA newer than MIPS I also have conditional break codes allowing
something like this:

#define BUG_ON(condition)   \
do {\
__asm__ __volatile__("tne $0, %0, %1"   \
 : : "r" (condition), "i" (BRK_BUG));   \
} while (0)

that is test of condition and the trap as a single instruction.  Note there
are break and trap instructions on MIPS and they are basically doing the
same job ...

  Ralf


Re: Resend: [PATCH] [MIPS] Fix asm constraints for 'ins' instructions.

2008-06-13 Thread Ralf Baechle
On Thu, Jun 12, 2008 at 01:10:10PM -0700, David Daney wrote:

>>> Among the versions of GCC that can build the current kernel, will any
>>> fail on this code because the "i" constraint cannot be matched when
>>> expanded to RTL?
>>
>> Someone will point this out if I don't, so for avoidance of doubt:
>> this needs to be always_inline.  It also isn't guaranteed to work
>> with "bit" being a separate statement.  I'm not truly sure it's
>> guaranteed to work even with:
>>
>> __asm__ __volatile__ ("  foo %0, %1" : "=m" (*p) : "i" (nr & 5));
>>
>> but I think we'd try hard to make sure it does.
>>
>> I think Maciej said that 3.2 was the minimum current version.
>> Even with those two issues sorted out, I don't think you can
>> rely on this sort of thing with compilers that used RTL inlining.
>> (always_inline does go back to 3.2, in case you're wondering.)
>>
>
> Well I withdraw the patch.  With the current kernel code we seem to always 
> get good code generation.  In the event that the compiler tries to put the 
> shift amount (nr) in a register, the assembler will complain.  I don't think 
> it is possible to generate bad object code, so best to leave it alone.
>
> FYI, the reason that I stumbled on this several weeks ago is that 
> if(__builtin_constant_p(nr)) in the trunk compiler was generating code for 
> the asm even though nr was not constant.

How about I simply put your patch into the -queue tree, everybody gives it
a nice beating and then we'll how well it'll hold up in the real world?

  Ralf


Re: MIPS atomic memory operations (A.K.A PR 33479).

2007-09-19 Thread Ralf Baechle
On Tue, Sep 18, 2007 at 05:12:48PM -0700, David Daney wrote:

> There seems to be a small problem with the MIPS atomic memory operations 
> patch I recently committed 
> (http://gcc.gnu.org/ml/gcc-patches/2007-08/msg01290.html), in that on a 
> dual CPU machine it does not quite work.
> 
> You can look at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33479#c3 for 
> more information.
> 
> Here is the code in question (from mips.h):
> 
> #define MIPS_COMPARE_AND_SWAP(SUFFIX, OP) \
>   "%(%<%[sync\n"  \
>   "1:\tll" SUFFIX "\t%0,%1\n" \
>   "\tbne\t%0,%2,2f\n" \
>   "\t" OP "\t%@,%3\n" \
>   "\tsc" SUFFIX "\t%@,%1\n"   \
>   "\tbeq\t%@,%.,1b\n" \

Please make this loop closure branch a branch-likely.  This is necessary
as a errata workaround for some processors.

(I know silicon people hate me for keeping branch likely instruction alive
this way but it's my job to make sure Linux and software are working without
unpleasant surprises.)

>   "\tnop\n"   \
>   "2:%]%>%)"
> 
> 
> 
> I guess my basic question is:  Should MIPS_COMPARE_AND_SWAP have a 
> 'sync' after the 'sc'?  I would have thought that 'sc' made the write 
> visible to all CPUs, but on the SB1 it appears not to be the case.

The MIPS architecture specification specifies no memory model, so for
portable code you need to make a worst case assumption which is weak
ordering.

Only on R4000 and R4400 SC and SCD did imply a SYNC operation.

> If we do need to add another 'sync' should it go in the delay slot of 
> the branch?  I would say yes because we would expect the branch to 
> rarely taken.

Not when using a branch likely.

Btw, I recently wrote an article about memory consistency which is at
http://www.linux-mips.org/wiki/Memory_consistency.  It gives a bit of
an overview of things in general and on MIPS specifically.  I request
people with detailed knowledge of MIPS cores not specifically covered
in that article to contribute.

  Ralf


Re: MIPS atomic memory operations (A.K.A PR 33479).

2007-09-19 Thread Ralf Baechle
On Wed, Sep 19, 2007 at 07:12:33PM +0100, Thiemo Seufer wrote:

> >>  Another option is to depend on the setting of -mbranch-likely.  By 
> >> default it is on only for the processors which implement it and do not 
> >> discourage it, i.e. these of the MIPS II, MIPS III and MIPS IV ISAs.

All MIPS implementations that have branch likely also support it with
good performance.  So the deprecation is atm really something that has
happened on paper.

The approach for LL/SC loops (where it's used for correctness) and the rest
of the code where we care about code size and performance is not necessarily
the same.

> > This seems to be the most sensible option.
> >
> > I will try to work up the GCC patch tonight.
> 
> This means generic MIPS code (MIPS I) wil have broken atomic
> intrinsics when run on modern MIPS machines.

Oh and if it takes adding new emulations for SYNC (some pseudo MIPS II
implementations lack SYNC afair) or branch likely to the kernel I will
certainly support that.

  Ralf


Re: Cannot unwind through MIPS signal frames with ICACHE_REFILLS_WORKAROUND_WAR

2007-11-13 Thread Ralf Baechle
On Tue, Nov 13, 2007 at 11:48:53AM +, Andrew Haley wrote:

> David Daney writes:
>  > With the current kernel (2.6.23.1) in my R5000 based O2 it seems 
>  > impossible for GCC's exception unwinding machinery to unwind through 
>  > signal frames.  The cause of the problems is the 
>  > ICACHE_REFILLS_WORKAROUND_WAR which puts the sigcontext at an almost 
>  > impossible to determine offset from the signal return trampoline.  The 
>  > unwinder depends on being able to find the sigcontext given a known 
>  > location of the trampoline.
>  > 
>  > It seems there are a couple of possible solutions:
>  > 
>  > 1) The comments in war.h indicate the problem only exists in R7000
>  > and E9000 processors.  We could turn off the workaround if the
>  > kernel is configured for R5000.  That would help me, but not those
>  > with the effected systems.
>  > 
>  > 2) In the non-workaround case, the siginfo immediately follows the
>  > trampoline and the first member is the signal number.  For the
>  > workaround case the first word following the trampoline is zero.
>  > We could replace this with the offset to the sigcontext which is
>  > always a small negative value.  The unwinder could then distinguish
>  > the two cases (signal numbers are positive and the offset
>  > negative).  If we did this, the change would have to be coordinated
>  > with GCC's unwinder (in libgcc_s.so.1).
>  > 
>  > Thoughts?
> 
> The best solution is to put the unwinder info in the kernel.  Does
> MIPS use a vDSO ?

No though we should.

Another reason is to get rid of the classic trampoline the kernel installs
on the stack.  On some multiprocessor systems it requires a cacheflush
operation to be performed on all processors which is expensive.  Having
the trampoline in a vDSO would solve that.

I need to look into it, not sure what it would take.

  Ralf


Re: Cannot unwind through MIPS signal frames with ICACHE_REFILLS_WORKAROUND_WAR

2007-11-13 Thread Ralf Baechle
On Tue, Nov 13, 2007 at 02:14:58PM +0100, Franck Bui-Huu wrote:

> > > David Daney writes:
> > >  > With the current kernel (2.6.23.1) in my R5000 based O2 it seems
> > >  > impossible for GCC's exception unwinding machinery to unwind through
> > >  > signal frames.  The cause of the problems is the
> > >  > ICACHE_REFILLS_WORKAROUND_WAR which puts the sigcontext at an almost
> > >  > impossible to determine offset from the signal return trampoline.  The
> > >  > unwinder depends on being able to find the sigcontext given a known
> > >  > location of the trampoline.
> > >  >
> > >  > It seems there are a couple of possible solutions:
> > >  >
> > >  > 1) The comments in war.h indicate the problem only exists in R7000
> > >  > and E9000 processors.  We could turn off the workaround if the
> > >  > kernel is configured for R5000.  That would help me, but not those
> > >  > with the effected systems.
> > >  >
> > >  > 2) In the non-workaround case, the siginfo immediately follows the
> > >  > trampoline and the first member is the signal number.  For the
> > >  > workaround case the first word following the trampoline is zero.
> > >  > We could replace this with the offset to the sigcontext which is
> > >  > always a small negative value.  The unwinder could then distinguish
> > >  > the two cases (signal numbers are positive and the offset
> > >  > negative).  If we did this, the change would have to be coordinated
> > >  > with GCC's unwinder (in libgcc_s.so.1).
> > >  >
> > >  > Thoughts?
> > >
> > > The best solution is to put the unwinder info in the kernel.  Does
> > > MIPS use a vDSO ?
> >
> > No though we should.
> >
> > Another reason is to get rid of the classic trampoline the kernel installs
> > on the stack.  On some multiprocessor systems it requires a cacheflush
> > operation to be performed on all processors which is expensive.  Having
> > the trampoline in a vDSO would solve that.
> >
> 
> And the stack wouldn't need to have exec permission anymore.

Oh?

extern void frob(void (*)(void));

int foo(void)
{
int x;

void bar(void)
{
x++;
}

frob(&bar);
print("x is %d\n", x);
}

Compile and enjoy.

  Ralf


Re: Cannot unwind through MIPS signal frames with ICACHE_REFILLS_WORKAROUND_WAR

2007-11-13 Thread Ralf Baechle
On Tue, Nov 13, 2007 at 03:22:33PM +0100, Franck Bui-Huu wrote:

> > > And the stack wouldn't need to have exec permission anymore.
> >
> > Oh?
> >
> > extern void frob(void (*)(void));
> >
> > int foo(void)
> > {
> > int x;
> >
> > void bar(void)
> > {
> > x++;
> > }
> >
> > frob(&bar);
> > print("x is %d\n", x);
> > }
> >
> > Compile and enjoy.
> >
> 
> Sorry Ralf, I missed your point.

This piece of code compiles to something that copies a trampoline to the
stack.  The address of that trampoline is what is then passed as argument
to frob().

Old versions of glibc were probable the most notorious users of trampolines.
Objective C also generates them.  Since a cacheflush that is a syscall is
required performance is less than great.

Which means the libc() cacheflush() function is another candidate for a
vDSO, it can be optimized by using SYNCI on some configurations.

  Ralf


Re: Cannot unwind through MIPS signal frames with ICACHE_REFILLS_WORKAROUND_WAR

2007-11-13 Thread Ralf Baechle
On Tue, Nov 13, 2007 at 03:37:39PM +0100, Kevin D. Kissell wrote:

> True, though it should perhaps be noted that currently it's only on 4KSc/Sd
> systems (which I know you work on) where it's even possible for the stack
> *not* to have exec permissions, since the classical MIPS MMU gives
> execute permission to any page that is readable.

Disabling PROT_EXEC on a mapping tells the kernel it doesn't need to take
care of I-cache coherency.  So it's somewhat beneficial even in absence of
a protection bit in the actual TLB hardware.

Some of these performance optimizations are impossible because the kernel
can't have definate knowledge that certain addresses have never entered the
I-cache.

  Ralf