Re: GCC support for PowerPC VLE

2013-03-21 Thread Will
James Lemke  codesourcery.com> writes:

> I have completed the binutils submission for VLE.
> I am working on the gcc submission.  The test results are looking good
> now.  Patches will be posted very soon.

Do you have any update on the work on VLE support?

Thanks for any feedback you can provide!

Re: [RFC][AArch64] function prologue analyzer in linux kernel

2016-01-08 Thread Will Deacon
On Fri, Jan 08, 2016 at 02:36:32PM +0900, AKASHI Takahiro wrote:
> On 01/07/2016 11:56 PM, Richard Earnshaw (lists) wrote:
> >On 07/01/16 14:22, Will Deacon wrote:
> >>On Thu, Dec 24, 2015 at 04:57:54PM +0900, AKASHI Takahiro wrote:
> >>>So I'd like to introduce a function prologue analyzer to determine
> >>>a size allocated by a function's prologue and deduce it from "Depth".
> >>>My implementation of this analyzer has been submitted to
> >>>linux-arm-kernel mailing list[1].
> >>>I borrowed some ideas from gdb's analyzer[2], especially a loop of
> >>>instruction decoding as well as stop of decoding at exiting a basic block,
> >>>but implemented my own simplified one because gdb version seems to do
> >>>a bit more than what we expect here.
> >>>Anyhow, since it is somewhat heuristic (and may not be maintainable for
> >>>a long term), could you review it from a broader viewpoint of toolchain,
> >>>please?
> >>>
> >>My main issue with this is that we cannot rely on the frame layout
> >>generated by the compiler and there's little point in asking for
> >>commitment here. Therefore, the heuristics will need updating as and
> >>when we identify new frames that we can't handle. That's pretty fragile
> >>and puts us on the back foot when faced with newer compilers. This might
> >>be sustainable if we don't expect to encounter much variation, but even
> >>that would require some sort of "buy-in" from the various toolchain
> >>communities.
> >>
> >>GCC already has an option (-fstack-usage) to determine the stack usage
> >>on a per-function basis and produce a report at build time. Why can't
> >>we use that to provide the information we need, rather than attempt to
> >>compute it at runtime based on your analyser?
> >>
> >>If -fstack-usage is not sufficient, understanding why might allow us to
> >>propose a better option.
> >
> >Can you not use the DWARF frame unwind data?  That's always sufficient
> >to recover the CFA (canonical frame address - the value in SP when
> >executing the first instruction in a function).  It seems to me it's
> >unlikely you're going to need something that's an exceedingly high
> >performance operation.
> 
> Thank you for your comment.
> Yeah, but we need some utility routines to handle unwind data (.debug_frame).
> In fact, some guy has already attempted to merge (part of) libunwind into
> the kernel[1], but it was rejected by the kernel community (including Linus
> if I correctly remember). It seems that they thought the code was still buggy.

The ARC guys seem to have sneaked something in for their architecture:

  http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arc/kernel/unwind.c

so it might not be impossible if we don't require all the bells and
whistles of libunwind.

> That is one of the reasons that I wanted to implement my own analyzer.

I still don't understand why you can't use fstack-usage. Can you please
tell me why that doesn't work? Am I missing something?

Will


Re: [RFC][AArch64] function prologue analyzer in linux kernel

2016-01-12 Thread Will Deacon
On Tue, Jan 12, 2016 at 03:11:29PM +0900, AKASHI Takahiro wrote:
> Will,
> 
> On 01/09/2016 12:53 AM, Will Deacon wrote:
> >On Fri, Jan 08, 2016 at 02:36:32PM +0900, AKASHI Takahiro wrote:
> >>On 01/07/2016 11:56 PM, Richard Earnshaw (lists) wrote:
> >>>On 07/01/16 14:22, Will Deacon wrote:
> >>>>On Thu, Dec 24, 2015 at 04:57:54PM +0900, AKASHI Takahiro wrote:
> >>>>>So I'd like to introduce a function prologue analyzer to determine
> >>>>>a size allocated by a function's prologue and deduce it from "Depth".
> >>>>>My implementation of this analyzer has been submitted to
> >>>>>linux-arm-kernel mailing list[1].
> >>>>>I borrowed some ideas from gdb's analyzer[2], especially a loop of
> >>>>>instruction decoding as well as stop of decoding at exiting a basic block,
> >>>>>but implemented my own simplified one because gdb version seems to do
> >>>>>a bit more than what we expect here.
> >>>>>Anyhow, since it is somewhat heuristic (and may not be maintainable for
> >>>>>a long term), could you review it from a broader viewpoint of toolchain,
> >>>>>please?
> >>>>>
> >>>>My main issue with this is that we cannot rely on the frame layout
> >>>>generated by the compiler and there's little point in asking for
> >>>>commitment here. Therefore, the heuristics will need updating as and
> >>>>when we identify new frames that we can't handle. That's pretty fragile
> >>>>and puts us on the back foot when faced with newer compilers. This might
> >>>>be sustainable if we don't expect to encounter much variation, but even
> >>>>that would require some sort of "buy-in" from the various toolchain
> >>>>communities.
> >>>>
> >>>>GCC already has an option (-fstack-usage) to determine the stack usage
> >>>>on a per-function basis and produce a report at build time. Why can't
> >>>>we use that to provide the information we need, rather than attempt to
> >>>>compute it at runtime based on your analyser?
> >>>>
> >>>>If -fstack-usage is not sufficient, understanding why might allow us to
> >>>>propose a better option.
> >>>
> >>>Can you not use the DWARF frame unwind data?  That's always sufficient
> >>>to recover the CFA (canonical frame address - the value in SP when
> >>>executing the first instruction in a function).  It seems to me it's
> >>>unlikely you're going to need something that's an exceedingly high
> >>>performance operation.
> >>
> >>Thank you for your comment.
> >>Yeah, but we need some utility routines to handle unwind data (.debug_frame).
> >>In fact, some guy has already attempted to merge (part of) libunwind into
> >>the kernel[1], but it was rejected by the kernel community (including Linus
> >>if I correctly remember). It seems that they thought the code was still buggy.
> >
> >The ARC guys seem to have sneaked something in for their architecture:
> >
> >   http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arc/kernel/unwind.c
> >
> >so it might not be impossible if we don't require all the bells and
> >whistles of libunwind.
> 
> Thanks. I didn't notice this code.
> 
> >>That is one of the reasons that I wanted to implement my own analyzer.
> >
> >I still don't understand why you can't use fstack-usage. Can you please
> >tell me why that doesn't work? Am I missing something?
> 
> I don't know how gcc calculates the usage here, but I guess it would be more
> robust than my analyzer.
> 
> The issues, that come up to my mind, are
> - -fstack-usage generates a separate output file, *.su and so we have to
>   manage them to be incorporated in the kernel binary.

That doesn't sound too bad to me. How much data are we talking about here?

>   This implies that (common) kernel makefiles might have to be a bit changed.
> - worse still, what about the kernel module case? We will have no way to let the kernel
>   know the stack usage without adding an extra step at loading.

We can easily add a new __init section to modules, which is a table
representing the module functions and their stack sizes (like we do
for other things like alternatives). We'd just then need to slurp this
information at load time and throw it into an rbtree or something.

Will


Re: [RFC][AArch64] function prologue analyzer in linux kernel

2016-01-15 Thread Will Deacon
On Wed, Jan 13, 2016 at 05:13:29PM +0900, AKASHI Takahiro wrote:
> On 01/13/2016 03:04 AM, Will Deacon wrote:
> >On Tue, Jan 12, 2016 at 03:11:29PM +0900, AKASHI Takahiro wrote:
> >>On 01/09/2016 12:53 AM, Will Deacon wrote:
> >>>I still don't understand why you can't use fstack-usage. Can you please
> >>>tell me why that doesn't work? Am I missing something?
> >>
> >>I don't know how gcc calculates the usage here, but I guess it would be more
> >>robust than my analyzer.
> >>
> >>The issues, that come up to my mind, are
> >>- -fstack-usage generates a separate output file, *.su and so we have to
> >>   manage them to be incorporated in the kernel binary.
> >
> >That doesn't sound too bad to me. How much data are we talking about here?
> >
> >>   This implies that (common) kernel makefiles might have to be a bit changed.
> >>- worse still, what about the kernel module case? We will have no way to let the kernel
> >>   know the stack usage without adding an extra step at loading.
> >
> >We can easily add a new __init section to modules, which is a table
> >representing the module functions and their stack sizes (like we do
> >for other things like alternatives). We'd just then need to slurp this
> >information at load time and throw it into an rbtree or something.
> 
> I found another issue.
> Let's think about 'dynamic storage' case like:
> $ cat stack.c
> extern long fooX(long a);
> extern long fooY(long b[]);
> 
> long foo1(long a) {
>   if (a > 1) {
>   long b[a];  <== Here
> 
>   return a + fooY(b);
>   } else {
>   return a + fooX(a);
>   }
> }
> 
> Then, -fstack-usage returns 48 for foo1():
> $ aarch64-linux-gnu-gcc -fno-omit-frame-pointer -fstack-usage main.c stack.c \
>   -pg -O2 -fasynchronous-unwind-tables
> $ cat stack.su
> stack.c:4:6:foo1  48  dynamic
> 
> This indicates that foo1() may use 48 bytes or more depending on a condition.
> But in my case (ftrace-based stack tracer), I always expect 32 whether we're
> backtracing from fooY() or from fooX() because my stack tracer estimates:
>(stack pointer) = (callee's frame pointer) + (callee's stack usage)
> (in my previous e-mail, '-(minus)' was wrong.)
> 
> where (callee's stack usage) is, as I described in my previous e-mail, a size of
> memory which is initially allocated on a stack in a function prologue, and should not
> contain a size of dynamically allocated area.

According to who? What's the use in reporting only the prologue size?

Will


History of GCC

2016-10-25 Thread Will Hawkins
Hello everyone!

My name is Will Hawkins and I am a longtime user of gcc and admirer of
the project. I hope that this is the proper forum for the question I
am going to ask. If it isn't, please accept my apology and ignore me.

I am a real geek and I love the history behind open source projects.
I've found several good resources about the history of "famous" open
source projects and organizations (including, but definitely not
limited to, the very interesting Free as in Freedom 2.0).

Unfortunately there does not appear to be a good history of the
awesome and fundamental GCC project. I know that there is a page on
the wiki (https://gcc.gnu.org/wiki/History) but that is really the
best that I can find.

Am I missing something? Are there good anecdotes about the history of
the development of GCC that you think I might find interesting? Any
pointers would be really great!

Thanks for taking the time to read my questions. Thanks in advance for
any information that you have to offer. I really appreciate everyone's
effort to make such a great compiler suite. It's only with such a
great compiler that all our other open source projects are able to
succeed!

Thank you!
Will


Re: History of GCC

2016-10-26 Thread Will Hawkins
On Wed, Oct 26, 2016 at 9:07 AM, Ian Lance Taylor  wrote:
> On Tue, Oct 25, 2016 at 10:53 PM, Will Hawkins  wrote:
>>
>> My name is Will Hawkins and I am a longtime user of gcc and admirer of
>> the project. I hope that this is the proper forum for the question I
>> am going to ask. If it isn't, please accept my apology and ignore me.
>>
>> I am a real geek and I love the history behind open source projects.
>> I've found several good resources about the history of "famous" open
>> source projects and organizations (including, but definitely not
>> limited to, the very interesting Free as in Freedom 2.0).
>>
>> Unfortunately there does not appear to be a good history of the
>> awesome and fundamental GCC project. I know that there is a page on
>> the wiki (https://gcc.gnu.org/wiki/History) but that is really the
>> best that I can find.
>>
>> Am I missing something? Are there good anecdotes about the history of
>> the development of GCC that you think I might find interesting? Any
>> pointers would be really great!
>>
>> Thanks for taking the time to read my questions. Thanks in advance for
>> any information that you have to offer. I really appreciate everyone's
>> effort to make such a great compiler suite. It's only with such a
>> great compiler that all our other open source projects are able to
>> succeed!
>
> There is some history and links at
> https://en.wikipedia.org/wiki/GNU_Compiler_Collection .
>
> In my opinion, the history of GCC is not really one of drama or even
> anecdotes, except for the EGCS split.  There are plenty of people who
> work on GCC out of personal interest, but for decades now the majority
> of work on GCC has been by people paid to work on it.  I expect that
> the result is less interesting as history and more interesting as
> software.
>
> Ian

Ian,

Thank you for your response! I don't think that there has to be
controversy to be interesting. Obviously that split/reunification was
important, but I think that there might even be some value in
documenting the minutiae of the project's growth. In other words, what
was the process for incorporating each new version of the C++
standard? Who started a frontend for language X in GCC, and why? Things
like that.

Thanks again for your response!

Will


Re: History of GCC

2016-10-26 Thread Will Hawkins
On Wed, Oct 26, 2016 at 11:55 AM, Jeff Law  wrote:
> On 10/26/2016 07:07 AM, Ian Lance Taylor wrote:
>>
>> On Tue, Oct 25, 2016 at 10:53 PM, Will Hawkins  wrote:
>>>
>>>
>>> My name is Will Hawkins and I am a longtime user of gcc and admirer of
>>> the project. I hope that this is the proper forum for the question I
>>> am going to ask. If it isn't, please accept my apology and ignore me.
>>>
>>> I am a real geek and I love the history behind open source projects.
>>> I've found several good resources about the history of "famous" open
>>> source projects and organizations (including, but definitely not
>>> limited to, the very interesting Free as in Freedom 2.0).
>>>
>>> Unfortunately there does not appear to be a good history of the
>>> awesome and fundamental GCC project. I know that there is a page on
>>> the wiki (https://gcc.gnu.org/wiki/History) but that is really the
>>> best that I can find.
>>>
>>> Am I missing something? Are there good anecdotes about the history of
>>> the development of GCC that you think I might find interesting? Any
>>> pointers would be really great!
>>>
>>> Thanks for taking the time to read my questions. Thanks in advance for
>>> any information that you have to offer. I really appreciate everyone's
>>> effort to make such a great compiler suite. It's only with such a
>>> great compiler that all our other open source projects are able to
>>> succeed!
>>
>>
>> There is some history and links at
>> https://en.wikipedia.org/wiki/GNU_Compiler_Collection .
>>
>> In my opinion, the history of GCC is not really one of drama or even
>> anecdotes, except for the EGCS split.  There are plenty of people who
>> work on GCC out of personal interest, but for decades now the majority
>> of work on GCC has been by people paid to work on it.  I expect that
>> the result is less interesting as history and more interesting as
>> software.
>
> Agreed.  Speaking for myself, I got interested in GCC to solve a problem,
> then another, then another...  Hacking GCC made for an interesting hobby for
> a few years.  I never imagined it would turn into a career, but 20 years
> later, here I am.  I wouldn't be surprised if others have followed a similar
> path.
>
> jeff

Thank you Ian, Joel and Jeff for your responses!

I really appreciate it. As I said in my first message, I am a real
geek and I certainly did not mean to imply that I thought many people
would find the history interesting. That said, I think that there
might be some people who do find it so.

As an example of the type of software history that I find interesting,
consider these:

https://www.youtube.com/watch?v=TMjgShRuYbg
or
https://www.youtube.com/watch?v=2kEJoWfobpA
or
https://www.usenix.org/system/files/login/articles/03_lu_010-017_final.pdf
or as a final example
https://www.youtube.com/watch?v=69edOm889V4

To answer the question you are probably asking, "No, I have no idea
why I enjoy this type of stuff as much as I do!"

Can any of you recall a turning point where development went from
being driven by hobbyists to being driven by career developers? As a
result of that shift, has there been a change in the project's
priorities? Have there been conflicts between the employer's interests
and those of the project (in terms of project goals, licensing issues,
code quality, etc)?

In any event, I really appreciate your answers. If you have any
information that you think I might find interesting, please feel free
to pass it along. Thanks again!

Will


Re: History of GCC

2016-10-26 Thread Will Hawkins
On Wed, Oct 26, 2016 at 1:06 PM, Jakub Jelinek  wrote:
> On Wed, Oct 26, 2016 at 06:57:31PM +0200, Marek Polacek wrote:
>> I think you can learn a lot if you follow the Changes pages, so e.g.
>> <https://gcc.gnu.org/gcc-6/changes.html>, and go back down the history until
>> you reach the ancient <https://gcc.gnu.org/gcc-3.1/changes.html>.
>
> Even older releases, while they don't have changes.html, have changes/news
> etc. written in the pages referenced from
> https://gcc.gnu.org/releases.html#timeline
> Also see https://gcc.gnu.org/develop.html#timeline
> For questions like who has added feature XYZ, the best source is just the
> source repository's history or ChangeLog files.
>
> Jakub

Jakub, Marek,

Great suggestions! In fact, I had just thought of the same thing!

Thanks for your response!
Will


Re: History of GCC

2016-10-26 Thread Will Hawkins
On Wed, Oct 26, 2016 at 1:15 PM, Ian Lance Taylor  wrote:
> On Wed, Oct 26, 2016 at 9:31 AM, Will Hawkins  wrote:
>>
>> Thank you for your response! I don't think that there has to be
>> controversy to be interesting. Obviously that split/reunification was
>> important, but I think that there might even be some value in
>> documenting the minutiae of the project's growth. In other words, what
>> was the process for incorporating each new version of the C++
>> standard? Who started a frontend for language X in GCC, and why? Things
>> like that.
>
> It is easier to answer specific questions.
>
> There have always been GCC developers that have tracked the evolution
> of C++.  The first C++ standard was of course in 1998, at which point
> the language was over 10 years old, so there were a lot of C++
> language changes before then.  GCC has generally acquired new language
> features as they were being adopted into the standard, usually
> controlled by options like the current -std=c++1z.  This of course
> means that the new features have shifted as the standard has shifted,
> but as far as I know that hasn't happened too often.
>
> GCC started as a C compiler.  The C++ frontend was started by Michael
> Tiemann around 1987 or so.  It started as a patch and was later
> incorporated into the mainline.
>
> The Objective C frontend was started at NeXT.  They originally
> intended to keep it proprietary, but when they understood that the GPL
> made that impossible they contributed it back.  I forget when the
> Objective C++ frontend came in.
>
> Cygnus Support developed the Chill and, later, Java frontends.  The
> Chill frontend was removed later, and in fact the Java frontend was
> removed just recently.
>
> As I recall Fortran was a hobbyist project that eventually made it in.
> There were two competing forks, I think.  I don't remember too much
> about that off the top of my head.
>
> The Ada frontend was developed at AdaCore.
>
> The Go frontend was written by me, mostly because I like Go and I've
> been working on GCC for a long time.  I work at Google, and Go was
> developed at Google, but there wouldn't be a GCC Go frontend if I
> hadn't decided to write one.
>
> There is a Modula frontend that is always close to getting in.  I
> think there is a Pascal frontend out there too, somewhere.  And a D
> frontend.
>
> Ian

Wow, thanks Ian! This is awesome stuff! As I read through it, I may
have some additional questions. If I do, would you mind if I emailed
you directly? Thanks again for taking the time to write all this down!
Fascinating!

Will


Re: History of GCC

2016-10-26 Thread Will Hawkins
On Wed, Oct 26, 2016 at 1:28 PM, Richard Kenner
 wrote:
>> The Ada frontend was developed at AdaCore.
>
> The Ada frontend was developed at NYU, as an Air Force-funded project
> to show that Ada95 (then called Ada9X) was implementable.  AdaCore was
> later formed once that was complete to provide commercial support for
> the Ada compiler.  The members of that NYU project were the initial
> team at AdaCore.

Such great information, Richard! Thanks so much!
Will


Re: History of GCC

2016-10-26 Thread Will Hawkins
On Wed, Oct 26, 2016 at 2:23 PM, Eric Gallager  wrote:
> On 10/26/16, Ian Lance Taylor  wrote:
>> On Wed, Oct 26, 2016 at 9:31 AM, Will Hawkins  wrote:
>>>
>>> Thank you for your response! I don't think that there has to be
>>> controversy to be interesting. Obviously that split/reunification was
>>> important, but I think that there might even be some value in
>>> documenting the minutiae of the project's growth. In other words, what
>>> was the process for incorporating each new version of the C++
>>> standard? Who started a frontend for language X in GCC, and why? Things
>>> like that.
>>
>> It is easier to answer specific questions.
>>
>> There have always been GCC developers that have tracked the evolution
>> of C++.  The first C++ standard was of course in 1998, at which point
>> the language was over 10 years old, so there were a lot of C++
>> language changes before then.  GCC has generally acquired new language
>> features as they were being adopted into the standard, usually
>> controlled by options like the current -std=c++1z.  This of course
>> means that the new features have shifted as the standard has shifted,
>> but as far as I know that hasn't happened too often.
>>
>> GCC started as a C compiler.  The C++ frontend was started by Michael
>> Tiemann around 1987 or so.  It started as a patch and was later
>> incorporated into the mainline.
>>
>> The Objective C frontend was started at NeXT.  They originally
>> intended to keep it proprietary, but when they understood that the GPL
>> made that impossible they contributed it back.  I forget when the
>> Objective C++ frontend came in.
>
>
> The Objective C++ frontend was contributed by Apple. The earliest
> proposal I can find for adding it was in 2001 for GCC 3.x:
> https://gcc.gnu.org/ml/gcc/2001-11/msg00609.html
> However, it didn't actually make it in to the FSF version until 4.1:
> https://gcc.gnu.org/ml/gcc-patches/2005-05/msg01781.html
> https://gcc.gnu.org/ml/gcc-patches/2005-12/msg01812.html
> Personally, I think one of the interesting stories of GCC history is
> how Apple used to be really involved in GCC development until 2007, at
> which point the GPL3 and iPhone came out, and Apple abandoned GCC for
> llvm/clang. If you read through the mailing list archives on
> gcc.gnu.org, you can find all sorts of emails from people with "at
> apple dot com" email addresses in the early 2000s, until they just
> sort of stopped later that decade. Even llvm/clang was originally just
> another branch of gcc, and Chris Lattner was even going to contribute
> it and keep it part of gcc, but then he never got around to getting
> his copyright assignment paperwork filed, and then Apple turned it
> into a separate project:
> https://gcc.gnu.org/ml/gcc/2005-11/msg00888.html
> https://gcc.gnu.org/ml/gcc/2006-03/msg00706.html
>
>
>>
>> Cygnus Support developed the Chill and, later, Java frontends.  The
>> Chill frontend was removed later, and in fact the Java frontend was
>> removed just recently.
>>
>> As I recall Fortran was a hobbyist project that eventually made it in.
>> There were two competing forks, I think.  I don't remember too much
>> about that off the top of my head.
>>
>> The Ada frontend was developed at AdaCore.
>>
>> The Go frontend was written by me, mostly because I like Go and I've
>> been working on GCC for a long time.  I work at Google, and Go was
>> developed at Google, but there wouldn't be a GCC Go frontend if I
>> hadn't decided to write one.
>>
>> There is a Modula frontend that is always close to getting in.  I
>> think there is a Pascal frontend out there too, somewhere.  And a D
>> frontend.
>>
>> Ian
>>

I want to thank each individual for his/her reply, but I don't want to
SPAM the list. So, I will do it in one email!

Thanks! This is so much more information than I expected to get and
it's just amazing. Thanks again!

Will


Re: Compilers and RCU readers: Once more unto the breach!

2015-05-20 Thread Will Deacon
Hi Paul,

On Wed, May 20, 2015 at 03:41:48AM +0100, Paul E. McKenney wrote:
> On Tue, May 19, 2015 at 07:10:12PM -0700, Linus Torvalds wrote:
> > On Tue, May 19, 2015 at 6:57 PM, Linus Torvalds
> >  wrote:
> > So I think you're better off just saying that operations designed to
> > drop significant bits break the dependency chain, and give things like
> > "& 1" and "(char *)ptr-(uintptr_t)ptr" as examples of such.
> > 
> > Making that just an extension of your existing "& 0" language would
> > seem to be natural.
> 
> Works for me!  I added the following bullet to the list of things
> that break dependencies:
> 
>   If a pointer is part of a dependency chain, and if the values
>   added to or subtracted from that pointer cancel the pointer
>   value so as to allow the compiler to precisely determine the
>   resulting value, then the resulting value will not be part of
>   any dependency chain.  For example, if p is part of a dependency
>   chain, then ((char *)p-(uintptr_t)p)+65536 will not be.
> 
> Seem reasonable?

Whilst I understand what you're saying (the ARM architecture makes these
sorts of distinctions when calling out dependency-based ordering), it
feels like we're dangerously close to defining the difference between a
true and a false dependency. If we want to do this in the context of the
C language specification, you run into issues because you need to evaluate
the program in order to determine data values in order to determine the
nature of the dependency.

You tackle this above by saying "to allow the compiler to precisely
determine the resulting value", but I can't see how that can be cleanly
fitted into something like the C language specification. Even if it can,
then we'd need to reword the "?:" treatment that you currently have:

  "If a pointer is part of a dependency chain, and that pointer appears
   in the entry of a ?: expression selected by the condition, then the
   chain extends to the result."

which I think requires the state of the condition to be known statically
if we only want to extend the chain from the selected expression. In the
general case, wouldn't a compiler have to assume that the chain is
extended from both?

Additionally, what about the following code?

  char *x = y ? z : z;

Does that extend a dependency chain from z to x? If so, I can imagine a
CPU breaking that in practice.

> > Humans will understand, and compiler writers won't care. They will
> > either depend on hardware semantics anyway (and argue that your
> > language is tight enough that they don't need to do anything special)
> > or they will turn the consume into an acquire (on platforms that have
> > too weak hardware).
> 
> Agreed.  Plus Core Working Group will hammer out the exact wording,
> should this approach meet their approval.

For the avoidance of doubt, I'm completely behind any attempts to tackle
this problem, but I anticipate an uphill struggle getting this text into
the C standard. Is your intention to change the carries-a-dependency
relation to encompass this change?

Cheers,

Will


Re: Compilers and RCU readers: Once more unto the breach!

2015-05-20 Thread Will Deacon
On Wed, May 20, 2015 at 01:15:22PM +0100, Paul E. McKenney wrote:
> On Wed, May 20, 2015 at 12:47:45PM +0100, Will Deacon wrote:
> > On Wed, May 20, 2015 at 03:41:48AM +0100, Paul E. McKenney wrote:
> > >   If a pointer is part of a dependency chain, and if the values
> > >   added to or subtracted from that pointer cancel the pointer
> > >   value so as to allow the compiler to precisely determine the
> > >   resulting value, then the resulting value will not be part of
> > >   any dependency chain.  For example, if p is part of a dependency
> > >   chain, then ((char *)p-(uintptr_t)p)+65536 will not be.
> > > 
> > > Seem reasonable?
> > 
> > Whilst I understand what you're saying (the ARM architecture makes these
> > sorts of distinctions when calling out dependency-based ordering), it
> > feels like we're dangerously close to defining the difference between a
> > true and a false dependency. If we want to do this in the context of the
> > C language specification, you run into issues because you need to evaluate
> > the program in order to determine data values in order to determine the
> > nature of the dependency.
> 
> Indeed, something like this does -not- carry a dependency from the
> memory_order_consume load to q:
> 
>   char *p, q;
> 
>   p = atomic_load_explicit(&gp, memory_order_consume);
>   q = gq + (intptr_t)p - (intptr_t)p;
> 
> If this was compiled with -O0, ARM and Power might well carry a
> dependency, but given any optimization, the assembly language would have
> no hint of any such dependency.  So I am not seeing any particular danger.

The above is a welcome relaxation over C11, since ARM doesn't even give
you ordering based off false data dependencies. My concern is more to do
with how this can be specified precisely without prohibiting honest compiler
and hardware optimisations.

Out of interest, how do you tackle examples (4) and (5) at the link below
(assuming the reads are promoted to consume loads)?

  http://www.cl.cam.ac.uk/~pes20/cpp/notes42.html

my understanding is that you permit both outcomes (I appreciate you're
not directly tackling out-of-thin-air, but treatment of dependencies
is heavily related).

> > You tackle this above by saying "to allow the compiler to precisely
> > determine the resulting value", but I can't see how that can be cleanly
> > fitted into something like the C language specification.
> 
> I am sure that there will be significant rework from where this document
> is to language appropriate from the standard.  Which is why I am glad
> that Jens is taking an interest in this, as he is particularly good at
> producing standards language.

Ok. I'm curious to see how that comes along.

> >  Even if it can,
> > then we'd need to reword the "?:" treatment that you currently have:
> > 
> >   "If a pointer is part of a dependency chain, and that pointer appears
> >in the entry of a ?: expression selected by the condition, then the
> >chain extends to the result."
> > 
> > which I think requires the state of the condition to be known statically
> > if we only want to extend the chain from the selected expression. In the
> > general case, wouldn't a compiler have to assume that the chain is
> > extended from both?
> 
> In practice, yes, if the compiler cannot determine which expression is
> selected, it must arrange for the dependency to be carried from either,
> depending on the run-time value of the condition.  But you would have
> to work pretty hard to create code that did not carry the dependencies
> as required, no?

I'm not sure... you'd require the compiler to perform static analysis of
loops to determine the state of the machine when they exit (if they exit!)
in order to show whether or not a dependency is carried to subsequent
operations. If it can't prove otherwise, it would have to assume that a
dependency *is* carried, and it's not clear to me how it would use this
information to restrict any subsequent dependency removing optimisations.

I guess that's one for the GCC folks.

> > Additionally, what about the following code?
> > 
> >   char *x = y ? z : z;
> > 
> > Does that extend a dependency chain from z to x? If so, I can imagine a
> > CPU breaking that in practice.
> 
> I am not seeing this.  I would expect the compiler to optimize to
> something like this:
> 
>   char *x = z;
> 
> How does this avoid carrying the dependency?  Or are you saying that
> ARM loses the dependency via a store to memory and a later reload?
> That would be a bi

Re: Compilers and RCU readers: Once more unto the breach!

2015-05-21 Thread Will Deacon
On Wed, May 20, 2015 at 07:16:06PM +0100, Paul E. McKenney wrote:
> On Wed, May 20, 2015 at 04:46:17PM +0100, Will Deacon wrote:
> > On Wed, May 20, 2015 at 01:15:22PM +0100, Paul E. McKenney wrote:
> > > Indeed, something like this does -not- carry a dependency from the
> > > memory_order_consume load to q:
> > > 
> > >   char *p, *q;
> > > 
> > >   p = atomic_load_explicit(&gp, memory_order_consume);
> > >   q = gq + (intptr_t)p - (intptr_t)p;
> > > 
> > > If this was compiled with -O0, ARM and Power might well carry a
> > > dependency, but given any optimization, the assembly language would have
> > > no hint of any such dependency.  So I am not seeing any particular danger.
> > 
> > The above is a welcome relaxation over C11, since ARM doesn't even give
> > you ordering based off false data dependencies. My concern is more to do
> > with how this can be specified precisely without prohibiting honest compiler
> > and hardware optimisations.
> 
> That last is the challenge.  I believe that I am pretty close, but I am
> sure that additional adjustment will be required.  Especially given that
> we also need the memory model to be amenable to formal analysis.

Well, there's still the whole thin-air problem which unfortunately doesn't
go away with your proposal... (I was hoping that differentiating between
true and false dependencies would solve that, but your set of rules isn't
broad enough and I don't blame you at all for that!).

> > Out of interest, how do you tackle examples (4) and (5) of (assuming the
> > reads are promoted to consume loads)?:
> > 
> >   http://www.cl.cam.ac.uk/~pes20/cpp/notes42.html
> > 
> > my understanding is that you permit both outcomes (I appreciate you're
> > not directly tackling out-of-thin-air, but treatment of dependencies
> > is heavily related).

Thanks for taking the time to walk these two examples through.

> Let's see...  #4 is as follows, given promotion to memory_order_consume
> and (I am guessing) memory_order_relaxed:
> 
>   r1 = atomic_load_explicit(&x, memory_order_consume);
>   if (r1 == 42)
> atomic_store_explicit(&y, r1, memory_order_relaxed);
>   --
>   r2 = atomic_load_explicit(&y, memory_order_consume);
>   if (r2 == 42)
> atomic_store_explicit(&x, 42, memory_order_relaxed);
>   else
> atomic_store_explicit(&x, 42, memory_order_relaxed);
> 
> The second thread does not have a proper control dependency, even with
> the memory_order_consume load because both branches assign the same
> value to "x".  This means that the compiler is within its rights to
> optimize this into the following:
> 
>   r1 = atomic_load_explicit(&x, memory_order_consume);
>   if (r1 == 42)
> atomic_store_explicit(&y, r1, memory_order_relaxed);
>   --
>   r2 = atomic_load_explicit(&y, memory_order_consume);
>   atomic_store_explicit(&x, 42, memory_order_relaxed);
> 
> There is no dependency between the second thread's pair of statements,
> so both the compiler and the CPU are within their rights to optimize
> further as follows:
> 
>   r1 = atomic_load_explicit(&x, memory_order_consume);
>   if (r1 == 42)
> atomic_store_explicit(&y, r1, memory_order_relaxed);
>   --
>   atomic_store_explicit(&x, 42, memory_order_relaxed);
>   r2 = atomic_load_explicit(&y, memory_order_consume);
> 
> If the compiler makes this final optimization, even mythical SC hardware
> is within its rights to end up with (r1 == 42 && r2 == 42).  Which is
> fine, as far as I am concerned.  Or at least something that can be
> lived with.

Agreed.

> On to #5:
> 
>   r1 = atomic_load_explicit(&x, memory_order_consume);
>   if (r1 == 42)
> atomic_store_explicit(&y, r1, memory_order_relaxed);
>   
>   r2 = atomic_load_explicit(&y, memory_order_consume);
>   if (r2 == 42)
> atomic_store_explicit(&x, 42, memory_order_relaxed);
> 
> The first thread's accesses are dependency ordered.  The second thread's
> ordering is in a corner case that memory-barriers.txt does not cover.
> You are supposed to start control dependencies with READ_ONCE_CTRL(), not
> a memory_order_consume load (AKA rcu_dereference and friends).  However,
> Alpha would have a full barrier as part of the memory_orde

Re: Compilers and RCU readers: Once more unto the breach!

2015-05-22 Thread Will Deacon
Hi Paul,

On Thu, May 21, 2015 at 09:02:12PM +0100, Paul E. McKenney wrote:
> On Thu, May 21, 2015 at 08:24:22PM +0100, Will Deacon wrote:
> > On Wed, May 20, 2015 at 07:16:06PM +0100, Paul E. McKenney wrote:
> > > On to #5:
> > > 
> > >   r1 = atomic_load_explicit(&x, memory_order_consume);
> > >   if (r1 == 42)
> > > atomic_store_explicit(&y, r1, memory_order_relaxed);
> > >   
> > >   r2 = atomic_load_explicit(&y, memory_order_consume);
> > >   if (r2 == 42)
> > > atomic_store_explicit(&x, 42, memory_order_relaxed);
> > > 
> > > The first thread's accesses are dependency ordered.  The second thread's
> > > ordering is in a corner case that memory-barriers.txt does not cover.
> > > You are supposed to start control dependencies with READ_ONCE_CTRL(), not
> > > a memory_order_consume load (AKA rcu_dereference and friends).  However,
> > > Alpha would have a full barrier as part of the memory_order_consume load,
> > > and the rest of the processors would (one way or another) respect the
> > > control dependency.  And the compiler would have some fun trying to
> > > break it.
> > 
> > But this is interesting because the first thread is ordered whilst the
> > second is not, so doesn't that effectively forbid the compiler from
> > constant-folding values if it can't prove that there is no dependency
> > chain?
> 
> You lost me on this one.  Are you suggesting that the compiler
> speculate the second thread's atomic store?  That would be very
> bad regardless of dependency chains.
> 
> So what constant-folding optimization are you thinking of here?
> If the above example is not amenable to such an optimization, could
> you please give me an example where constant folding would apply
> in a way that is sensitive to dependency chains?

Unless I'm missing something, I can't see what would prevent a compiler
from looking at the code in thread1 and transforming it into the code in
thread2 (i.e. constant folding r1 with 42 given that the taken branch
must mean that r1 == 42). However, such an optimisation breaks the
dependency chain, which means that a compiler needs to walk backwards
to see if there is a dependency chain extending to r1.

> > > So the current Linux memory model would allow (r1 == 42 && r2 == 42),
> > > but I don't know of any hardware/compiler combination that would
> > > allow it.  And no, I am -not- going to update memory-barriers.txt for
> > > this litmus test, its theoretical interest notwithstanding!  ;-)

Of course, I'm not asking for that at all! I'm just trying to see how
your proposal holds up with the example.

Will


Pretty print of C++11 scoped enums - request help towards a proper fix

2018-09-19 Thread will wray
Re: "Pretty print of enumerator never prints the id,
 always falls back to C-style cast output"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87364

The bug report gives a one-line 'fix' to enable output of enum id
but, for C++11 scoped enums, it fails to qualify as enum type::id.

The code is located in c-pretty-print.c
It has not been updated to deal with C++11 scoped enumerations.

'Separation of responsibilities' between c and cxx-pretty-print
seems fairly lax - it's convenient to push some c++ printing to c
(there are a few comments like /* This C++ bit is handled here...*/)

I have not quite managed to make a fix confined to c-pretty-print.c

I have a fix which duplicates the code in pp_c_enumeration_constant
to pp_cxx_enumeration_constant in cxx-pretty print, with modification

  if (value != NULL_TREE)
    {
      if (ENUM_IS_SCOPED (type))
        pp_cxx_nested_name_specifier (pp, type);
      pp->id_expression (TREE_PURPOSE (value));
    }

This works in my testing so far, but
 - It duplicates code from c to cxx (not DRY, 'Don't Repeat Yourself')
 - I didn't find a single function to print full nested, scoped id
   so had to check if ENUM_IS_SCOPED to output nested specifiers.

I'm learning by hacking but would like guidance on a proper fix
from anyone more familiar with gcc pretty print and/or grammar -
the guideline comment, at the top of the file, states:

/* The pretty-printer code is primarily designed to closely follow
   (GNU) C and C++ grammars... */

I'd appreciate any recommendations towards a proper fix,
or pointers for how to write unit tests for the fix.

Thanks, Will


Re: Pretty print of C++11 scoped enums - request help towards a proper fix

2018-09-24 Thread will wray
Thanks Nathan,


In fact, after testing with enums nested in namespaces or structs,
or function local, I realised nested specifiers should be printed
for both scoped and unscoped enums, but for unscoped enums one
level of nested specifier (the enum type) needs to be stripped.

So I inverted the IS_SCOPED test and used get_containing_scope:

  if (value != NULL_TREE)
    {
      if (!ENUM_IS_SCOPED (type))
        type = get_containing_scope (type);
      pp_cxx_nested_name_specifier (pp, type);
      pp->id_expression (TREE_PURPOSE (value));
    }


I submitted this fix as a patch to the bug report, with tests.

With this fix GCC now has similar output to both Clang and MSVC
for enumerated values. For non-enumerated values GCC continues
to print a C-style cast while Clang & MSVC print plain digits.
Yay! GCC is winning! (gives type info for non-enumerated values).

A downside of nested specifiers is that output gets verbose.

Richard Smith suggests to use less verbose output for known types
compared to auto deduced types. Discussion starts here
http://lists.llvm.org/pipermail/cfe-dev/2018-September/059229.html

For enum args, I guess that this would mean distinguishing whether
the corresponding template parameter was auto or a given enum type
and only printing a simple id with no nested specs for given type.
I don't know yet if that info is available at the leaf level here.

Similarly, type info should be added to deduced Integral values.
I may start to investigate how to do this in GCC pretty print.
I submitted the related request to MSVC devs:
https://developercommunity.visualstudio.com/content/problem/339663/improve-pretty-print-of-integral-non-type-template.html

> given the code base ...

GCC pretty-print code was committed by GDR mid 2002,
K&R style C, updated to C90 'prototype' in mid 2003,
untouched since then, not for C++11 or C++17 enum updates.

I found this corner of the code base fairly easy to hack,
thanks perhaps to GDRs attempts to follow the grammar.


On Mon, Sep 24, 2018 at 3:53 PM Nathan Sidwell  wrote:

> On 9/19/18 7:41 AM, will wray wrote:
> > Re: "Pretty print of enumerator never prints the id,
> >   always falls back to C-style cast output"
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87364
> >
>
> > I have a fix which duplicates the code in pp_c_enumeration_constant
> > to pp_cxx_enumeration_constant in cxx-pretty print, with modification
> >
> >if (value != NULL_TREE)
> >{
> >  if (ENUM_IS_SCOPED (type))
> >pp_cxx_nested_name_specifier (pp, type);
> >  pp->id_expression (TREE_PURPOSE (value));
> >}
> >
> > This works in my testing so far, but
> >   - It duplicates code from c to cxx (not DRY, 'Don't Repeat Yourself')
> >   - I didn't find a single function to print full nested, scoped id
> > so had to check if ENUM_IS_SCOPED to output nested specifiers.
>
> This seems a fine approach, given the code base.
>
> nathan
>
> --
> Nathan Sidwell
>


Re: Pretty print of C++11 scoped enums - request help towards a proper fix

2018-09-25 Thread will wray
BTW The bug is still UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87364

It is easy to CONFIRM:

Follow this Compiler Explorer link https://godbolt.org/z/P4ejiy
or paste this code into a file and compile with g++:

template <auto v> struct wauto;
enum e { a };
wauto<a> v;  // error


Note that GCC reports  error: ... 'wauto<(e)0> v'
 => It should report   error: ... 'wauto<a> v'

This is a bug; the intent of the code is to print the enumerator id
(clang prints the enumerator id and so do recent MSVC previews).
There is also test code linked to the bug covering more cases.

I'd appreciate if someone would confirm the bug.

Thanks, Will

On Mon, Sep 24, 2018 at 5:23 PM will wray  wrote:

> Thanks Nathan,
>
>
> In fact, after testing with enums nested in namespaces or structs,
> or function local, I realised nested specifiers should be printed
> for both scoped and unscoped enums, but for unscoped enums one
> level of nested specifier (the enum type) needs to be stripped.
>
> So I inverted the IS_SCOPED test and used get_containing_scope:
>
>   if (value != NULL_TREE)
>     {
>       if (!ENUM_IS_SCOPED (type))
>         type = get_containing_scope (type);
>       pp_cxx_nested_name_specifier (pp, type);
>       pp->id_expression (TREE_PURPOSE (value));
>     }
>
>
> I submitted this fix as a patch to the bug report, with tests.
>
> With this fix GCC now has similar output to both Clang and MSVC
> for enumerated values. For non-enumerated values GCC continues
> to print a C-style cast while Clang & MSVC print plain digits.
> Yay! GCC is winning! (gives type info for non-enumerated values).
>
> A downside of nested specifiers is that output gets verbose.
>
> Richard Smith suggests to use less verbose output for known types
> compared to auto deduced types. Discussion starts here
> http://lists.llvm.org/pipermail/cfe-dev/2018-September/059229.html
>
> For enum args, I guess that this would mean distinguishing whether
> the corresponding template parameter was auto or a given enum type
> and only printing a simple id with no nested specs for given type.
> I don't know yet if that info is available at the leaf level here.
>
> Similarly, type info should be added to deduced Integral values.
> I may start to investigate how to do this in GCC pretty print.
> I submitted the related request to MSVC devs:
>
> https://developercommunity.visualstudio.com/content/problem/339663/improve-pretty-print-of-integral-non-type-template.html
>
> > given the code base ...
>
> GCC pretty-print code was committed by GDR mid 2002,
> K&R style C, updated to C90 'prototype' in mid 2003,
> untouched since then, not for C++11 or C++17 enum updates.
>
> I found this corner of the code base fairly easy to hack,
> thanks perhaps to GDRs attempts to follow the grammar.
>
>
> On Mon, Sep 24, 2018 at 3:53 PM Nathan Sidwell  wrote:
>
>> On 9/19/18 7:41 AM, will wray wrote:
>> > Re: "Pretty print of enumerator never prints the id,
>> >   always falls back to C-style cast output"
>> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87364
>> >
>>
>> > I have a fix which duplicates the code in pp_c_enumeration_constant
>> > to pp_cxx_enumeration_constant in cxx-pretty print, with modification
>> >
>> >if (value != NULL_TREE)
>> >{
>> >  if (ENUM_IS_SCOPED (type))
>> >pp_cxx_nested_name_specifier (pp, type);
>> >  pp->id_expression (TREE_PURPOSE (value));
>> >}
>> >
>> > This works in my testing so far, but
>> >   - It duplicates code from c to cxx (not DRY, 'Don't Repeat Yourself')
>> >   - I didn't find a single function to print full nested, scoped id
>> > so had to check if ENUM_IS_SCOPED to output nested specifiers.
>>
>> This seems a fine approach, given the code base.
>>
>> nathan
>>
>> --
>> Nathan Sidwell
>>
>


Re: Pretty print of C++11 scoped enums - request help towards a proper fix

2018-10-08 Thread will wray
Patch submitted:

https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00452.html
[C++ PATCH] Fix pretty-print of enumerator ids (PR c++/87364)

My first GCC patch attempt, so more eyes would be good.

Cheers, Will

On Tue, Sep 25, 2018 at 4:25 PM will wray  wrote:

> BTW The bug is still UNCONFIRMED
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87364
>
> It is easy to CONFIRM:
>
> Follow this Compiler Explorer link https://godbolt.org/z/P4ejiy
> or paste this code into a file and compile with g++:
>
> template <auto v> struct wauto;
> enum e { a };
> wauto<a> v;  // error
>
>
> Note that GCC reports  error: ... 'wauto<(e)0> v'
>  => It should report   error: ... 'wauto<a> v'
>
> This is a bug; the intent of the code is to print the enumerator id
> (clang prints the enumerator id and so do recent MSVC previews).
> There is also test code linked to the bug covering more cases.
>
> I'd appreciate if someone would confirm the bug.
>
> Thanks, Will
>
> On Mon, Sep 24, 2018 at 5:23 PM will wray  wrote:
>
>> Thanks Nathan,
>>
>>
>> In fact, after testing with enums nested in namespaces or structs,
>> or function local, I realised nested specifiers should be printed
>> for both scoped and unscoped enums, but for unscoped enums one
>> level of nested specifier (the enum type) needs to be stripped.
>>
>> So I inverted the IS_SCOPED test and used get_containing_scope:
>>
>>   if (value != NULL_TREE)
>>     {
>>       if (!ENUM_IS_SCOPED (type))
>>         type = get_containing_scope (type);
>>       pp_cxx_nested_name_specifier (pp, type);
>>       pp->id_expression (TREE_PURPOSE (value));
>>     }
>>
>>
>> I submitted this fix as a patch to the bug report, with tests.
>>
>> With this fix GCC now has similar output to both Clang and MSVC
>> for enumerated values. For non-enumerated values GCC continues
>> to print a C-style cast while Clang & MSVC print plain digits.
>> Yay! GCC is winning! (gives type info for non-enumerated values).
>>
>> A downside of nested specifiers is that output gets verbose.
>>
>> Richard Smith suggests to use less verbose output for known types
>> compared to auto deduced types. Discussion starts here
>> http://lists.llvm.org/pipermail/cfe-dev/2018-September/059229.html
>>
>> For enum args, I guess that this would mean distinguishing whether
>> the corresponding template parameter was auto or a given enum type
>> and only printing a simple id with no nested specs for given type.
>> I don't know yet if that info is available at the leaf level here.
>>
>> Similarly, type info should be added to deduced Integral values.
>> I may start to investigate how to do this in GCC pretty print.
>> I submitted the related request to MSVC devs:
>>
>> https://developercommunity.visualstudio.com/content/problem/339663/improve-pretty-print-of-integral-non-type-template.html
>>
>> > given the code base ...
>>
>> GCC pretty-print code was committed by GDR mid 2002,
>> K&R style C, updated to C90 'prototype' in mid 2003,
>> untouched since then, not for C++11 or C++17 enum updates.
>>
>> I found this corner of the code base fairly easy to hack,
>> thanks perhaps to GDRs attempts to follow the grammar.
>>
>>
>> On Mon, Sep 24, 2018 at 3:53 PM Nathan Sidwell  wrote:
>>
>>> On 9/19/18 7:41 AM, will wray wrote:
>>> > Re: "Pretty print of enumerator never prints the id,
>>> >   always falls back to C-style cast output"
>>> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87364
>>> >
>>>
>>> > I have a fix which duplicates the code in pp_c_enumeration_constant
>>> > to pp_cxx_enumeration_constant in cxx-pretty print, with modification
>>> >
>>> >if (value != NULL_TREE)
>>> >{
>>> >  if (ENUM_IS_SCOPED (type))
>>> >pp_cxx_nested_name_specifier (pp, type);
>>> >  pp->id_expression (TREE_PURPOSE (value));
>>> >}
>>> >
>>> > This works in my testing so far, but
>>> >   - It duplicates code from c to cxx (not DRY, 'Don't Repeat Yourself')
>>> >   - I didn't find a single function to print full nested, scoped id
>>> > so had to check if ENUM_IS_SCOPED to output nested specifiers.
>>>
>>> This seems a fine approach, given the code base.
>>>
>>> nathan
>>>
>>> --
>>> Nathan Sidwell
>>>
>>


Basic Block Statistics

2017-05-16 Thread Will Hawkins
Hello everyone!

I apologize if this is not the right venue to ask this question and/or
this is a waste of your time.

I was just wondering if there are statistics that gcc can emit that
includes either a) the average number of instructions per basic block
and/or b) the average size (in bytes) per basic block in a compilation
unit.

If nothing like this exists, I am more than happy to code something up
if people besides me think that it might be interesting.

I promise that I googled for information before asking, but I can't
guarantee that I didn't miss anything. Again, I apologize if I just
needed to RTFM better.

Thanks in advance for any responses!
Will


Re: Basic Block Statistics

2017-05-16 Thread Will Hawkins
On Tue, May 16, 2017 at 2:33 PM, Jeff Law  wrote:
> On 05/16/2017 12:24 PM, Will Hawkins wrote:
>> Hello everyone!
>>
>> I apologize if this is not the right venue to ask this question and/or
>> this is a waste of your time.
>>
>> I was just wondering if there are statistics that gcc can emit that
>> includes either a) the average number of instructions per basic block
>> and/or b) the average size (in bytes) per basic block in a compilation
>> unit.
>>
>> If nothing like this exists, I am more than happy to code something up
>> if people besides me think that it might be interesting.
>>
>> I promise that I googled for information before asking, but I can't
>> guarantee that I didn't miss anything. Again, I apologize if I just
>> needed to RTFM better.
> I don't think we have anything which inherently will give you this
> information.
>
> It'd be a useful thing to have though.  Implementation may be made more
> difficult by insns that generate > 1 instruction.
>
> Jeff

Thank you, Mr. Law. I think that this is something I'd really like to
work on. As I start to take a peek into how hard/easy this is to
implement, I may circle back and ask some additional technical
questions.

Thanks for your quick response!
Will


Re: Basic Block Statistics

2017-05-16 Thread Will Hawkins
On Tue, May 16, 2017 at 2:45 PM, David Malcolm  wrote:
> On Tue, 2017-05-16 at 14:24 -0400, Will Hawkins wrote:
>> Hello everyone!
>>
>> I apologize if this is not the right venue to ask this question
>> and/or
>> this is a waste of your time.
>>
>> I was just wondering if there are statistics that gcc can emit that
>> includes either a) the average number of instructions per basic block
>> and/or b) the average size (in bytes) per basic block in a
>> compilation
>> unit.
>>
>> If nothing like this exists, I am more than happy to code something
>> up
>> if people besides me think that it might be interesting.
>>
>> I promise that I googled for information before asking, but I can't
>> guarantee that I didn't miss anything. Again, I apologize if I just
>> needed to RTFM better.
>>
>> Thanks in advance for any responses!
>> Will
>
> I don't think anything like this currently exists, but it's probably
> doable via a plugin, e.g. by hooking up a new RTL pass somewhere
> towards the end of the pass pipeline.
>
> That said, IIRC basic blocks aren't used in the final passes;
> presumably they're not meaningful after the "free_cfg" pass.
>
> Hope this is helpful

Very helpful, thank you Mr. Malcolm!

Will

> Dave


Re: Basic Block Statistics

2017-05-17 Thread Will Hawkins
As I started looking into this, it seems like PLUGIN_FINISH is where
my plugin will go. Everything is great so far. However, when plugins
at that event are invoked, they get no data. That means I will have to
look into global structures for information regarding the compilation.
Are there pointers to the documentation that describe the relevant
global data structures that are accessible at this point?

I am looking through the source code and documentation and can't find
what I am looking for. I am happy to continue working, but thought I'd
ask just in case I was missing something silly.

Thanks again for all your help getting me started on this!
Will

On Tue, May 16, 2017 at 2:54 PM, Jeff Law  wrote:
> On 05/16/2017 12:37 PM, Will Hawkins wrote:
>> On Tue, May 16, 2017 at 2:33 PM, Jeff Law  wrote:
>>> On 05/16/2017 12:24 PM, Will Hawkins wrote:
>>>> Hello everyone!
>>>>
>>>> I apologize if this is not the right venue to ask this question and/or
>>>> this is a waste of your time.
>>>>
>>>> I was just wondering if there are statistics that gcc can emit that
>>>> includes either a) the average number of instructions per basic block
>>>> and/or b) the average size (in bytes) per basic block in a compilation
>>>> unit.
>>>>
>>>> If nothing like this exists, I am more than happy to code something up
>>>> if people besides me think that it might be interesting.
>>>>
>>>> I promise that I googled for information before asking, but I can't
>>>> guarantee that I didn't miss anything. Again, I apologize if I just
>>>> needed to RTFM better.
>>> I don't think we have anything which inherently will give you this
>>> information.
>>>
>>> It'd be a useful thing to have though.  Implementation may be made more
>>> difficult by insns that generate > 1 instruction.
>>>
>>> Jeff
>>
>> Thank you, Mr. Law. I think that this is something I'd really like to
>> work on. As I start to take a peek into how hard/easy this is to
>> implement, I may circle back and ask some additional technical
>> questions.
> Sure.  On-list is best.
> Jeff


Re: Basic Block Statistics

2017-05-17 Thread Will Hawkins
On Wed, May 17, 2017 at 1:02 PM, Jeff Law  wrote:
> On 05/17/2017 10:36 AM, Will Hawkins wrote:
>> As I started looking into this, it seems like PLUGIN_FINISH is where
>> my plugin will go. Everything is great so far. However, when plugins
>> at that event are invoked, they get no data. That means I will have to
>> look into global structures for information regarding the compilation.
>> Are there pointers to the documentation that describe the relevant
>> global data structures that are accessible at this point?
>>
>> I am looking through the source code and documentation and can't find
>> what I am looking for. I am happy to continue working, but thought I'd
>> ask just in case I was missing something silly.
>>
>> Thanks again for all your help getting me started on this!
> FOR_EACH_BB (bb) is what you're looking for.  That will iterate over the
> basic blocks.

Thank you so much for your response!

I just found this as soon as you sent it. Sorry for wasting your time!


>
> Assuming you're running late, you'll then want to walk each insn within
> the bb.  So something like this
>
> basic_block bb;
> FOR_EACH_BB (bb)
>   {
> rtx_insn *insn;
> FOR_BB_INSNS (bb, insn)
>   {
> /* Do something with INSN.  */
>   }
>   }
>
>
> Note that if you're running too late the CFG may have been released, in
> which case this code wouldn't do anything.

I will just have to experiment to see exactly when the right time to
invoke this plugin to get the best data.

Thanks again!
Will

>
> jeff


Re: Basic Block Statistics

2017-05-17 Thread Will Hawkins
On Wed, May 17, 2017 at 1:04 PM, Will Hawkins  wrote:
> On Wed, May 17, 2017 at 1:02 PM, Jeff Law  wrote:
>> On 05/17/2017 10:36 AM, Will Hawkins wrote:
>>> As I started looking into this, it seems like PLUGIN_FINISH is where
>>> my plugin will go. Everything is great so far. However, when plugins
>>> at that event are invoked, they get no data. That means I will have to
>>> look into global structures for information regarding the compilation.
>>> Are there pointers to the documentation that describe the relevant
>>> global data structures that are accessible at this point?
>>>
>>> I am looking through the source code and documentation and can't find
>>> what I am looking for. I am happy to continue working, but thought I'd
>>> ask just in case I was missing something silly.
>>>
>>> Thanks again for all your help getting me started on this!
>> FOR_EACH_BB (bb) is what you're looking for.  That will iterate over the
>> basic blocks.
>
> Thank you so much for your response!
>
> I just found this as soon as you sent it. Sorry for wasting your time!
>
>
>>
>> Assuming you're running late, you'll then want to walk each insn within
>> the bb.  So something like this
>>
>> basic_block bb;
>> FOR_EACH_BB (bb)
>>   {
>> rtx_insn *insn;
>> FOR_BB_INSNS (bb, insn)
>>   {
>> /* Do something with INSN.  */
>>   }
>>   }
>>
>>
>> Note that if you're running too late the CFG may have been released, in
>> which case this code wouldn't do anything.

This macro seems to require that there be a valid cfun. This seems to
imply that the macro will work only where the plugin callback is
invoked before/after a pass that does some optimization for a
particular function. In particular, at PLUGIN_FINISH, cfun is NULL.
This makes perfect sense.

Since PLUGIN_FINISH is the place where diagnostics are supposed to be
printed, I was wondering if there was an equivalent iterator for all
translation units (from which I could derive functions, from which I
could derive basic blocks) that just "FINISH"ed compiling?

The other way to approach the problem, I suppose, is to just
accumulate those stats at the end of each pass execution phase and
then simply print them when PLUGIN_FINISH is invoked.

I'm sorry to make this so difficult. I am just wondering about the way
that the community expects the plugins to be written in the most
modular fashion.

Thanks again for walking me through all this!
Will

>
> I will just have to experiment to see exactly when the right time to
> invoke this plugin to get the best data.
>
> Thanks again!
> Will
>
>>
>> jeff


Re: Basic Block Statistics

2017-05-17 Thread Will Hawkins
On Wed, May 17, 2017 at 2:41 PM, Will Hawkins  wrote:
> On Wed, May 17, 2017 at 1:04 PM, Will Hawkins  wrote:
>> On Wed, May 17, 2017 at 1:02 PM, Jeff Law  wrote:
>>> On 05/17/2017 10:36 AM, Will Hawkins wrote:
>>>> As I started looking into this, it seems like PLUGIN_FINISH is where
>>>> my plugin will go. Everything is great so far. However, when plugins
>>>> at that event are invoked, they get no data. That means I will have to
>>>> look into global structures for information regarding the compilation.
>>>> Are there pointers to the documentation that describe the relevant
>>>> global data structures that are accessible at this point?
>>>>
>>>> I am looking through the source code and documentation and can't find
>>>> what I am looking for. I am happy to continue working, but thought I'd
>>>> ask just in case I was missing something silly.
>>>>
>>>> Thanks again for all your help getting me started on this!
>>> FOR_EACH_BB (bb) is what you're looking for.  That will iterate over the
>>> basic blocks.
>>
>> Thank you so much for your response!
>>
>> I just found this as soon as you sent it. Sorry for wasting your time!
>>
>>
>>>
>>> Assuming you're running late, you'll then want to walk each insn within
>>> the bb.  So something like this
>>>
>>> basic_block bb;
>>> FOR_EACH_BB (bb)
>>>   {
>>> rtx_insn *insn;
>>> FOR_BB_INSNS (bb, insn)
>>>   {
>>> /* Do something with INSN.  */
>>>   }
>>>   }
>>>
>>>
>>> Note that if you're running too late the CFG may have been released, in
>>> which case this code wouldn't do anything.
>
> This macro seems to require that there be a valid cfun. This seems to
> imply that the macro will work only where the plugin callback is
> invoked before/after a pass that does some optimization for a
> particular function. In particular, at PLUGIN_FINISH, cfun is NULL.
> This makes perfect sense.
>
> Since PLUGIN_FINISH is the place where diagnostics are supposed to be
> printed, I was wondering if there was an equivalent iterator for all
> translation units (from which I could derive functions, from which I
> could derive basic blocks) that just "FINISH"ed compiling?


Answering my own question for historical purposes and anyone else who
might need this:

  FOR_EACH_VEC_ELT(*all_translation_units, i, t)

is exactly what I was looking for!

Sorry for the earlier spam and thank you for your patience!
Will

>
> The other way to approach the problem, I suppose, is to just
> accumulate those stats at the end of each pass execution phase and
> then simply print them when PLUGIN_FINISH is invoked.
>
> I'm sorry to make this so difficult. I am just wondering about the way
> that the community expects the plugins to be written in the most
> modular fashion.
>
> Thanks again for walking me through all this!
> Will
>
>>
>> I will just have to experiment to see exactly when the right time to
>> invoke this plugin to get the best data.
>>
>> Thanks again!
>> Will
>>
>>>
>>> jeff


Re: Basic Block Statistics

2017-05-17 Thread Will Hawkins
On Wed, May 17, 2017 at 2:59 PM, Will Hawkins  wrote:
> On Wed, May 17, 2017 at 2:41 PM, Will Hawkins  wrote:
>> On Wed, May 17, 2017 at 1:04 PM, Will Hawkins  wrote:
>>> On Wed, May 17, 2017 at 1:02 PM, Jeff Law  wrote:
>>>> On 05/17/2017 10:36 AM, Will Hawkins wrote:
>>>>> As I started looking into this, it seems like PLUGIN_FINISH is where
>>>>> my plugin will go. Everything is great so far. However, when plugins
>>>>> at that event are invoked, they get no data. That means I will have to
>>>>> look into global structures for information regarding the compilation.
>>>>> Are there pointers to the documentation that describe the relevant
>>>>> global data structures that are accessible at this point?
>>>>>
>>>>> I am looking through the source code and documentation and can't find
>>>>> what I am looking for. I am happy to continue working, but thought I'd
>>>>> ask just in case I was missing something silly.
>>>>>
>>>>> Thanks again for all your help getting me started on this!
>>>> FOR_EACH_BB (bb) is what you're looking for.  That will iterate over the
>>>> basic blocks.
>>>
>>> Thank you so much for your response!
>>>
>>> I just found this as soon as you sent it. Sorry for wasting your time!
>>>
>>>
>>>>
>>>> Assuming you're running late, you'll then want to walk each insn within
>>>> the bb.  So something like this
>>>>
>>>> basic_block bb;
>>>> FOR_EACH_BB (bb)
>>>>   {
>>>> rtx_insn *insn;
>>>> FOR_BB_INSNS (bb, insn)
>>>>   {
>>>> /* Do something with INSN.  */
>>>>   }
>>>>   }
>>>>
>>>>
>>>> Note that if you're running too late the CFG may have been released, in
>>>> which case this code wouldn't do anything.
>>
>> This macro seems to require that there be a valid cfun. This seems to
>> imply that the macro will work only where the plugin callback is
>> invoked before/after a pass that does some optimization for a
>> particular function. In particular, at PLUGIN_FINISH, cfun is NULL.
>> This makes perfect sense.
>>
>> Since PLUGIN_FINISH is the place where diagnostics are supposed to be
>> printed, I was wondering if there was an equivalent iterator for all
>> translation units (from which I could derive functions, from which I
>> could derive basic blocks) that just "FINISH"ed compiling?
>
>
> Answering my own question for historical purposes and anyone else who
> might need this:
>
>   FOR_EACH_VEC_ELT(*all_translation_units, i, t)
>
> is exactly what I was looking for!
>
> Sorry for the earlier spam and thank you for your patience!
> Will


Well, I thought that this was what I wanted, but it turns out perhaps
I was wrong. So, I am turning back for some help. Again, I apologize
for the incessant emails.

I would have thought that a translation unit tree node's chain would
point to all the nested tree nodes. This does not seem to be the case,
however. Am I missing something? Or is this the intended behavior?

Again, thank you for your patience!
Will

>
>>
>> The other way to approach the problem, I suppose, is to just
>> accumulate those stats at the end of each pass execution phase and
>> then simply print them when PLUGIN_FINISH is invoked.
>>
>> I'm sorry to make this so difficult. I am just wondering about the way
>> that the community expects the plugins to be written in the most
>> modular fashion.
>>
>> Thanks again for walking me through all this!
>> Will
>>
>>>
>>> I will just have to experiment to see exactly when the right time to
>>> invoke this plugin to get the best data.
>>>
>>> Thanks again!
>>> Will
>>>
>>>>
>>>> jeff


Re: Basic Block Statistics

2017-05-20 Thread Will Hawkins
On Fri, May 19, 2017 at 4:40 PM, Jeff Law  wrote:
> On 05/17/2017 08:22 PM, Will Hawkins wrote:
>> On Wed, May 17, 2017 at 2:59 PM, Will Hawkins  wrote:
>>> On Wed, May 17, 2017 at 2:41 PM, Will Hawkins  wrote:
>>>> On Wed, May 17, 2017 at 1:04 PM, Will Hawkins  wrote:
>>>>> On Wed, May 17, 2017 at 1:02 PM, Jeff Law  wrote:
>>>>>> On 05/17/2017 10:36 AM, Will Hawkins wrote:
>>>>>>> As I started looking into this, it seems like PLUGIN_FINISH is where
>>>>>>> my plugin will go. Everything is great so far. However, when plugins
>>>>>>> at that event are invoked, they get no data. That means I will have to
>>>>>>> look into global structures for information regarding the compilation.
>>>>>>> Are there pointers to the documentation that describe the relevant
>>>>>>> global data structures that are accessible at this point?
>>>>>>>
>>>>>>> I am looking through the source code and documentation and can't find
>>>>>>> what I am looking for. I am happy to continue working, but thought I'd
>>>>>>> ask just in case I was missing something silly.
>>>>>>>
>>>>>>> Thanks again for all your help getting me started on this!
>>>>>> FOR_EACH_BB (bb) is what you're looking for.  That will iterate over the
>>>>>> basic blocks.
>>>>>
>>>>> Thank you so much for your response!
>>>>>
>>>>> I just found this as soon as you sent it. Sorry for wasting your time!
>>>>>
>>>>>
>>>>>>
>>>>>> Assuming you're running late, you'll then want to walk each insn within
>>>>>> the bb.  So something like this
>>>>>>
>>>>>> basic_block bb;
>>>>>> FOR_EACH_BB (bb)
>>>>>>   {
>>>>>> rtx_insn *insn;
>>>>>> FOR_BB_INSNS (bb, insn)
>>>>>>   {
>>>>>> /* Do something with INSN.  */
>>>>>>   }
>>>>>>   }
>>>>>>
>>>>>>
>>>>>> Note that if you're running too late the CFG may have been released, in
>>>>>> which case this code wouldn't do anything.
>>>>
>>>> This macro seems to require that there be a valid cfun. This seems to
>>>> imply that the macro will work only where the plugin callback is
>>>> invoked before/after a pass that does some optimization for a
>>>> particular function. In particular, at PLUGIN_FINISH, cfun is NULL.
>>>> This makes perfect sense.
>>>>
>>>> Since PLUGIN_FINISH is the place where diagnostics are supposed to be
>>>> printed, I was wondering if there was an equivalent iterator for all
>>>> translation units (from which I could derive functions, from which I
>>>> could derive basic blocks) that just "FINISH"ed compiling?
>>>
>>>
>>> Answering my own question for historical purposes and anyone else who
>>> might need this:
>>>
>>>   FOR_EACH_VEC_ELT(*all_translation_units, i, t)
>>>
>>> is exactly what I was looking for!
>>>
>>> Sorry for the earlier spam and thank you for your patience!
>>> Will
>>
>>
>> Well, I thought that this was what I wanted, but it turns out perhaps
>> I was wrong. So, I am turning back for some help. Again, I apologize
>> for the incessant emails.
>>
>> I would have thought that a translation unit tree node's chain would
>> point to all the nested tree nodes. This does not seem to be the case,
>> however. Am I missing something? Or is this the intended behavior?
> I think there's a fundamental misunderstanding.

You are right, Mr. Law. I'm really sorry for the confusion. I got
things straightened out in my head and now I am making great progress.
>
> We don't hold the RTL IR for all the functions in a translation unit in
> memory at the same time.  You have to look at the RTL IR for each as it's
> generated.

Thank you, as ever, for your continued input. I am going to continue
to work and I will keep everyone on the list posted and let you know
when it is complete.

Thanks again and have a great rest of the weekend!

Will
>
> jeff


Re: Basic Block Statistics

2017-05-30 Thread Will Hawkins
I just wanted to send a quick follow up.

Thanks to the incredible support on this list from Mr. Law and support
in IRC from segher, djgpp and dmalcolm, I was able to put together a
serviceable little plugin that does some very basic statistic
generation on basic blocks.

Here is a link to the source with information about how to build/run:
https://github.com/whh8b/bb_stats

If you are interested in more information, just send me an email.

Thanks again for everyone's help!
Will

On Sat, May 20, 2017 at 11:29 PM, Will Hawkins  wrote:
> On Fri, May 19, 2017 at 4:40 PM, Jeff Law  wrote:
>> On 05/17/2017 08:22 PM, Will Hawkins wrote:
>>> On Wed, May 17, 2017 at 2:59 PM, Will Hawkins  wrote:
>>>> On Wed, May 17, 2017 at 2:41 PM, Will Hawkins  wrote:
>>>>> On Wed, May 17, 2017 at 1:04 PM, Will Hawkins  wrote:
>>>>>> On Wed, May 17, 2017 at 1:02 PM, Jeff Law  wrote:
>>>>>>> On 05/17/2017 10:36 AM, Will Hawkins wrote:
>>>>>>>> As I started looking into this, it seems like PLUGIN_FINISH is where
>>>>>>>> my plugin will go. Everything is great so far. However, when plugins
>>>>>>>> at that event are invoked, they get no data. That means I will have to
>>>>>>>> look into global structures for information regarding the compilation.
>>>>>>>> Are there pointers to the documentation that describe the relevant
>>>>>>>> global data structures that are accessible at this point?
>>>>>>>>
>>>>>>>> I am looking through the source code and documentation and can't find
>>>>>>>> what I am looking for. I am happy to continue working, but thought I'd
>>>>>>>> ask just in case I was missing something silly.
>>>>>>>>
>>>>>>>> Thanks again for all your help getting me started on this!
>>>>>>> FOR_EACH_BB (bb) is what you're looking for.  That will iterate over the
>>>>>>> basic blocks.
>>>>>>
>>>>>> Thank you so much for your response!
>>>>>>
>>>>>> I just found this as soon as you sent it. Sorry for wasting your time!
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Assuming you're running late, you'll then want to walk each insn within
>>>>>>> the bb.  So something like this
>>>>>>>
>>>>>>> basic_block bb;
>>>>>>> FOR_EACH_BB (bb)
>>>>>>>   {
>>>>>>> rtx_insn *insn;
>>>>>>> FOR_BB_INSNS (bb, insn)
>>>>>>>   {
>>>>>>> /* Do something with INSN.  */
>>>>>>>   }
>>>>>>>   }
>>>>>>>
>>>>>>>
>>>>>>> Note that if you're running too late the CFG may have been released, in
>>>>>>> which case this code wouldn't do anything.
>>>>>
>>>>> This macro seems to require that there be a valid cfun. This seems to
>>>>> imply that the macro will work only where the plugin callback is
>>>>> invoked before/after a pass that does some optimization for a
>>>>> particular function. In particular, at PLUGIN_FINISH, cfun is NULL.
>>>>> This makes perfect sense.
>>>>>
>>>>> Since PLUGIN_FINISH is the place where diagnostics are supposed to be
>>>>> printed, I was wondering if there was an equivalent iterator for all
>>>>> translation units (from which I could derive functions, from which I
>>>>> could derive basic blocks) that just "FINISH"ed compiling?
>>>>
>>>>
>>>> Answering my own question for historical purposes and anyone else who
>>>> might need this:
>>>>
>>>>   FOR_EACH_VEC_ELT(*all_translation_units, i, t)
>>>>
>>>> is exactly what I was looking for!
>>>>
>>>> Sorry for the earlier spam and thank you for your patience!
>>>> Will
>>>
>>>
>>> Well, I thought that this was what I wanted, but it turns out perhaps
>>> I was wrong. So, I am turning back for some help. Again, I apologize
>>> for the incessant emails.
>>>
>>> I would have thought that a translation unit tree node's chain would
>>> point to all the nested tree nodes. This does not seem to be the case,
>>> however. Am I missing something? Or is this the intended behavior?
>> I think there's a fundamental misunderstanding.
>
> You are right, Mr. Law. I'm really sorry for the confusion. I got
> things straightened out in my head and now I am making great progress.
>>
>> We don't hold the RTL IR for all the functions in a translation unit in
>> memory at the same time.  You have to look at the RTL IR for each as its
>> generated.
>
> Thank you, as ever, for your continued input. I am going to continue
> to work and I will keep everyone on the list posted and let you know
> when it is complete.
>
> Thanks again and have a great rest of the weekend!
>
> Will
>>
>> jeff


Re: GCC and Meltdown and Spectre vulnerabilities

2018-01-04 Thread Will Hawkins
On Thu, Jan 4, 2018 at 10:10 PM, Eric Gallager  wrote:
> Is there anything GCC could be doing at the compiler level to mitigate
> the recently-announced Meltdown and Spectre vulnerabilities? From
> reading about them, it seems like they involve speculative execution
> and indirect branch prediction, and those are the domain of things the
> compiler deals with, right? (For reference, Meltdown is CVE-2017-5754,
> and Spectre is CVE-2017-5753 and CVE-2017-5715)
>
> Just wondering,
> Eric

Check out

https://support.google.com/faqs/answer/7625886

and especially

http://git.infradead.org/users/dwmw2/gcc-retpoline.git/shortlog/refs/heads/gcc-7_2_0-retpoline-20171219

I'd love to hear what other people have heard!

Will


Re: About Bug 52485

2018-05-09 Thread Will Hawkins
Thanks to your brand new Bugzilla account, you may now comment! :-)


You will receive instructions on how to reset your default
password and access your account. Please let me know if you have any
questions or trouble gaining access.

I'd be happy to help in any way that I can!

Thanks for contributing to GCC!
Will


On Wed, May 9, 2018 at 4:08 AM, SHIH YEN-TE  wrote:
> Want to comment on "Bug 52485 - [c++11] add an option to disable c++11 
> user-defined literals"
>
>
> It's a pity GCC doesn't support this, which forces me to give up introducing
> a newer C++ standard into my project. I know it is ridiculous, but we must
> accept that the real world is somehow ridiculous as well; nothing is perfect.
>
>


Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-06 Thread Will Deacon
On Thu, Feb 06, 2014 at 06:55:01PM +, Ramana Radhakrishnan wrote:
> On 02/06/14 18:25, David Howells wrote:
> >
> > Is it worth considering a move towards using C11 atomics and barriers and
> > compiler intrinsics inside the kernel?  The compiler _ought_ to be able to
> > do these.
> 
> 
> It sounds interesting to me, if we can make it work properly and 
> reliably. + gcc@gcc.gnu.org for others in the GCC community to chip in.

Given my (albeit limited) experience playing with the C11 spec and GCC, I
really think this is a bad idea for the kernel. It seems that nobody really
agrees on exactly how the C11 atomics map to real architectural
instructions on anything but the trivial architectures. For example, should
the following code fire the assert?


extern atomic<int> foo, bar, baz;

void thread1(void)
{
foo.store(42, memory_order_relaxed);
bar.fetch_add(1, memory_order_seq_cst);
baz.store(42, memory_order_relaxed);
}

void thread2(void)
{
while (baz.load(memory_order_seq_cst) != 42) {
/* do nothing */
}

assert(foo.load(memory_order_seq_cst) == 42);
}


To answer that question, you need to go and look at the definitions of
synchronises-with, happens-before, dependency_ordered_before and a whole
pile of vaguely written waffle to realise that you don't know. Certainly,
the code that arm64 GCC currently spits out would allow the assertion to fire
on some microarchitectures.

There are also so many ways to blow your head off it's untrue. For example,
cmpxchg takes a separate memory model parameter for failure and success, but
then there are restrictions on the sets you can use for each. It's not hard
to find well-known memory-ordering experts shouting "Just use
memory_order_seq_cst for everything, it's too hard otherwise". Then there's
the fun of load-consume vs load-acquire (arm64 GCC completely ignores consume
atm and optimises all of the data dependencies away) as well as the definition
of "data races", which seem to be used as an excuse to miscompile a program
at the earliest opportunity.

Trying to introduce system concepts (writes to devices, interrupts,
non-coherent agents) into this mess is going to be an uphill battle IMHO. I'd
just rather stick to the semantics we have and the asm volatile barriers.

That's not to say I don't think there's room for improvement in what we have
in the kernel. Certainly, I'd welcome allowing more relaxed operations on
architectures that support them, but it needs to be something that at least
the different architecture maintainers can understand how to implement
efficiently behind an uncomplicated interface. I don't think that interface is
C11.

Just my thoughts on the matter...

Will


Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-07 Thread Will Deacon
Hello Torvald,

It looks like Paul clarified most of the points I was trying to make
(thanks Paul!), so I won't go back over them here.

On Thu, Feb 06, 2014 at 09:09:25PM +, Torvald Riegel wrote:
> Are you familiar with the formalization of the C11/C++11 model by Batty
> et al.?
> http://www.cl.cam.ac.uk/~mjb220/popl085ap-sewell.pdf
> http://www.cl.cam.ac.uk/~mjb220/n3132.pdf
> 
> They also have a nice tool that can run condensed examples and show you
> all allowed (and forbidden) executions (it runs in the browser, so is
> slow for larger examples), including nice annotated graphs for those:
> http://svr-pes20-cppmem.cl.cam.ac.uk/cppmem/

Thanks for the link, that's incredibly helpful. I've used ppcmem and armmem
in the past, but I didn't realise they have a version for C++11 too.
Actually, the armmem backend doesn't implement our atomic instructions or
the acquire/release accessors, so it's not been as useful as it could be.
I should probably try to learn OCaml...

> IMHO, one thing worth considering is that for C/C++, the C11/C++11 is
> the only memory model that has widespread support.  So, even though it's
> a fairly weak memory model (unless you go for the "only seq-cst"
> beginners advice) and thus comes with a higher complexity, this model is
> what likely most people will be familiar with over time.  Deviating from
> the "standard" model can have valid reasons, but it also has a cost in
> that new contributors are more likely to be familiar with the "standard"
> model.

Indeed, I wasn't trying to write-off the C11 memory model as something we
can never use in the kernel. I just don't think the current situation is
anywhere close to usable for a project such as Linux. If a greater
understanding of the memory model does eventually manifest amongst C/C++
developers (by which I mean, the beginners advice is really treated as
such and there is a widespread intuition about ordering guarantees, as
opposed to the need to use formal tools), then surely the tools and libraries
will stabilise and provide uniform semantics across the 25+ architectures
that Linux currently supports. If *that* happens, this discussion is certainly
worth having again.

Will


Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-07 Thread Will Deacon
On Fri, Feb 07, 2014 at 05:06:54PM +, Peter Zijlstra wrote:
> On Fri, Feb 07, 2014 at 04:55:48PM +0000, Will Deacon wrote:
> > Hi Paul,
> > 
> > On Fri, Feb 07, 2014 at 04:50:28PM +, Paul E. McKenney wrote:
> > > On Fri, Feb 07, 2014 at 08:44:05AM +0100, Peter Zijlstra wrote:
> > > > On Thu, Feb 06, 2014 at 08:20:51PM -0800, Paul E. McKenney wrote:
> > > > > Hopefully some discussion of out-of-thin-air values as well.
> > > > 
> > > > Yes, absolutely shoot store speculation in the head already. Then drive
> > > > a wooden stake through its heart.
> > > > 
> > > > C11/C++11 should not be allowed to claim itself a memory model until 
> > > > that
> > > > is sorted.
> > > 
> > > There actually is a proposal being put forward, but it might not make ARM
> > > and Power people happy because it involves adding a compare, a branch,
> > > and an ISB/isync after every relaxed load...  Me, I agree with you,
> > > much preferring the no-store-speculation approach.
> > 
> > Can you elaborate a bit on this please? We don't permit speculative stores
> > in the ARM architecture, so it seems counter-intuitive that GCC needs to
> > emit any additional instructions to prevent that from happening.
> > 
> > Stores can, of course, be observed out-of-order but that's a lot more
> > reasonable :)
> 
> This is more about the compiler speculating on stores; imagine:
> 
>   if (x)
>   y = 1;
>   else
>   y = 2;
> 
> The compiler is allowed to change that into:
> 
>   y = 2;
>   if (x)
>   y = 1;
> 
> Which is of course a big problem when you want to rely on the ordering.

Understood, but that doesn't explain why Paul wants to add ISB/isync
instructions which affect the *CPU* rather than the compiler!

Will


Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-07 Thread Will Deacon
Hi Paul,

On Fri, Feb 07, 2014 at 04:50:28PM +, Paul E. McKenney wrote:
> On Fri, Feb 07, 2014 at 08:44:05AM +0100, Peter Zijlstra wrote:
> > On Thu, Feb 06, 2014 at 08:20:51PM -0800, Paul E. McKenney wrote:
> > > Hopefully some discussion of out-of-thin-air values as well.
> > 
> > Yes, absolutely shoot store speculation in the head already. Then drive
> > a wooden stake through its heart.
> > 
> > C11/C++11 should not be allowed to claim itself a memory model until that
> > is sorted.
> 
> There actually is a proposal being put forward, but it might not make ARM
> and Power people happy because it involves adding a compare, a branch,
> and an ISB/isync after every relaxed load...  Me, I agree with you,
> much preferring the no-store-speculation approach.

Can you elaborate a bit on this please? We don't permit speculative stores
in the ARM architecture, so it seems counter-intuitive that GCC needs to
emit any additional instructions to prevent that from happening.

Stores can, of course, be observed out-of-order but that's a lot more
reasonable :)

Will


Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-10 Thread Will Deacon
On Mon, Feb 10, 2014 at 11:48:13AM +, Peter Zijlstra wrote:
> On Fri, Feb 07, 2014 at 10:02:16AM -0800, Paul E. McKenney wrote:
> > As near as I can tell, compiler writers hate the idea of prohibiting
> > speculative-store optimizations because it requires them to introduce
> > both control and data dependency tracking into their compilers.  Many of
> > them seem to hate dependency tracking with a purple passion.  At least,
> > such a hatred would go a long way towards explaining the incomplete
> > and high-overhead implementations of memory_order_consume, the long
> > and successful use of idioms based on the memory_order_consume pattern
> > notwithstanding [*].  ;-)
> 
> Just tell them that because the hardware provides control dependencies
> we actually use and rely on them.

s/control/address/ ?

Will


Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-10 Thread Will Deacon
On Mon, Feb 10, 2014 at 03:04:43PM +, Paul E. McKenney wrote:
> On Mon, Feb 10, 2014 at 11:49:29AM +0000, Will Deacon wrote:
> > On Mon, Feb 10, 2014 at 11:48:13AM +, Peter Zijlstra wrote:
> > > On Fri, Feb 07, 2014 at 10:02:16AM -0800, Paul E. McKenney wrote:
> > > > As near as I can tell, compiler writers hate the idea of prohibiting
> > > > speculative-store optimizations because it requires them to introduce
> > > > both control and data dependency tracking into their compilers.  Many of
> > > > them seem to hate dependency tracking with a purple passion.  At least,
> > > > such a hatred would go a long way towards explaining the incomplete
> > > > and high-overhead implementations of memory_order_consume, the long
> > > > and successful use of idioms based on the memory_order_consume pattern
> > > > notwithstanding [*].  ;-)
> > > 
> > > Just tell them that because the hardware provides control dependencies
> > > we actually use and rely on them.
> > 
> > s/control/address/ ?
> 
> Both are important, but as Peter's reply noted, it was control
> dependencies under discussion.  Data dependencies (which include the
> ARM/PowerPC notion of address dependencies) are called out by the standard
> already, but control dependencies are not.  I am not all that satisfied
> by current implementations of data dependencies, admittedly.  Should
> be an interesting discussion.  ;-)

Ok, but since you can't use control dependencies to order LOAD -> LOAD, it's
a pretty big ask of the compiler to make use of them for things like
consume, where a data dependency will suffice for any combination of
accesses.

Will


Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-17 Thread Will Deacon
On Mon, Feb 17, 2014 at 06:59:31PM +, Joseph S. Myers wrote:
> On Sat, 15 Feb 2014, Torvald Riegel wrote:
> 
> > glibc is a counterexample that comes to mind, although it's a smaller
> > code base.  (It's currently not using C11 atomics, but transitioning
> > there makes sense, and some thing I want to get to eventually.)
> 
> glibc is using C11 atomics (GCC builtins rather than _Atomic /
> <stdatomic.h>, but using __atomic_* with explicitly specified memory model
> rather than the older __sync_*) on AArch64, plus in certain cases on ARM 
> and MIPS.

Hmm, actually that results in a change in behaviour for the __sync_*
primitives on AArch64. The documentation for those states that:

  `In most cases, these built-in functions are considered a full barrier. That
  is, no memory operand is moved across the operation, either forward or
  backward. Further, instructions are issued as necessary to prevent the
  processor from speculating loads across the operation and from queuing stores
  after the operation.'

which is stronger than simply mapping them to memory_order_seq_cst, which
seems to be what the AArch64 compiler is doing (so you get acquire + release
instead of a full fence).

Will


Re: [PATCH 5/5] gcc-plugins/stackleak: Don't instrument vgettimeofday.c in arm64 VDSO

2020-06-04 Thread Will Deacon via Gcc
On Thu, Jun 04, 2020 at 04:49:57PM +0300, Alexander Popov wrote:
> Don't try instrumenting functions in arch/arm64/kernel/vdso/vgettimeofday.c.
> Otherwise that can cause issues if the cleanup pass of stackleak gcc plugin
> is disabled.
> 
> Signed-off-by: Alexander Popov 
> ---
>  arch/arm64/kernel/vdso/Makefile | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
> index 3862cad2410c..9b84cafbd2da 100644
> --- a/arch/arm64/kernel/vdso/Makefile
> +++ b/arch/arm64/kernel/vdso/Makefile
> @@ -32,7 +32,8 @@ UBSAN_SANITIZE  := n
>  OBJECT_FILES_NON_STANDARD:= y
>  KCOV_INSTRUMENT  := n
>  
> -CFLAGS_vgettimeofday.o = -O2 -mcmodel=tiny -fasynchronous-unwind-tables
> +CFLAGS_vgettimeofday.o = -O2 -mcmodel=tiny -fasynchronous-unwind-tables \
> + $(DISABLE_STACKLEAK_PLUGIN)

I can pick this one up via arm64, thanks. Are there any other plugins we
should be wary of? It looks like x86 filters out $(GCC_PLUGINS_CFLAGS)
when building the vDSO.

Will


Re: [PATCH 5/5] gcc-plugins/stackleak: Don't instrument vgettimeofday.c in arm64 VDSO

2020-06-10 Thread Will Deacon via Gcc
On Tue, Jun 09, 2020 at 12:09:27PM -0700, Kees Cook wrote:
> On Thu, Jun 04, 2020 at 02:58:06PM +0100, Will Deacon wrote:
> > On Thu, Jun 04, 2020 at 04:49:57PM +0300, Alexander Popov wrote:
> > > Don't try instrumenting functions in arch/arm64/kernel/vdso/vgettimeofday.c.
> > > Otherwise that can cause issues if the cleanup pass of stackleak gcc plugin
> > > is disabled.
> > > 
> > > Signed-off-by: Alexander Popov 
> > > ---
> > >  arch/arm64/kernel/vdso/Makefile | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
> > > index 3862cad2410c..9b84cafbd2da 100644
> > > --- a/arch/arm64/kernel/vdso/Makefile
> > > +++ b/arch/arm64/kernel/vdso/Makefile
> > > @@ -32,7 +32,8 @@ UBSAN_SANITIZE  := n
> > >  OBJECT_FILES_NON_STANDARD:= y
> > >  KCOV_INSTRUMENT  := n
> > >  
> > > -CFLAGS_vgettimeofday.o = -O2 -mcmodel=tiny -fasynchronous-unwind-tables
> > > +CFLAGS_vgettimeofday.o = -O2 -mcmodel=tiny -fasynchronous-unwind-tables \
> > > + $(DISABLE_STACKLEAK_PLUGIN)
> > 
> > I can pick this one up via arm64, thanks. Are there any other plugins we
> > should be wary of? It looks like x86 filters out $(GCC_PLUGINS_CFLAGS)
> > when building the vDSO.
> 
> I didn't realize/remember that arm64 retained the kernel build flags for
> vDSO builds. (I'm used to x86 throwing all its flags away for its vDSO.)
> 
> How does 32-bit ARM do its vDSO?
> 
> My quick run-through on plugins:
> 
> arm_ssp_per_task_plugin.c
>   32-bit ARM only (but likely needs disabling for 32-bit ARM vDSO?)

On arm64, the 32-bit toolchain is picked up via CC_COMPAT -- does that still
get the plugins?

> cyc_complexity_plugin.c
>   compile-time reporting only
> 
> latent_entropy_plugin.c
>   this shouldn't get triggered for the vDSO (no __latent_entropy
>   nor __init attributes in vDSO), but perhaps explicitly disabling
>   it would be a sensible thing to do, just for robustness?
> 
> randomize_layout_plugin.c
>   this shouldn't get triggered (again, lacking attributes), but
>   should likely be disabled too.
> 
> sancov_plugin.c
>   This should be tracking the KCOV directly (see
>   scripts/Makefile.kcov), which is already disabled here.
> 
> structleak_plugin.c
>   This should be fine in the vDSO, but there's no security
>   boundary here, so it wouldn't be important to KEEP it enabled.

Thanks for going through these. In general though, it seems like an
opt-in strategy would make more sense, as it doesn't make an awful lot
of sense to me for the plugins to be used to build the vDSO.

So I would prefer that this patch filters out $(GCC_PLUGINS_CFLAGS).

Will


Re: [PATCH v2 3/5] arm64: vdso: Don't use gcc plugins for building vgettimeofday.c

2020-06-24 Thread Will Deacon via Gcc
On Wed, Jun 24, 2020 at 03:33:28PM +0300, Alexander Popov wrote:
> Don't use gcc plugins for building arch/arm64/kernel/vdso/vgettimeofday.c
> to avoid unneeded instrumentation.
> 
> Signed-off-by: Alexander Popov 
> ---
>  arch/arm64/kernel/vdso/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
> index 556d424c6f52..0f1ad63b3326 100644
> --- a/arch/arm64/kernel/vdso/Makefile
> +++ b/arch/arm64/kernel/vdso/Makefile
> @@ -29,7 +29,7 @@ ldflags-y := -shared -nostdlib -soname=linux-vdso.so.1 
> --hash-style=sysv \
>  ccflags-y := -fno-common -fno-builtin -fno-stack-protector -ffixed-x18
>  ccflags-y += -DDISABLE_BRANCH_PROFILING
>  
> -CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS)
> +CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS) $(GCC_PLUGINS_CFLAGS)
>  KBUILD_CFLAGS+= $(DISABLE_LTO)
>  KASAN_SANITIZE   := n
>  UBSAN_SANITIZE   := n
> -- 
> 2.25.4

I'll pick this one up as a fix for 5.8, please let me know if that's a
problem.

Will


Re: [PATCH v2 0/5] Improvements of the stackleak gcc plugin

2020-06-24 Thread Will Deacon via Gcc
On Wed, 24 Jun 2020 15:33:25 +0300, Alexander Popov wrote:
> This is the v2 of the patch series with various improvements of the
> stackleak gcc plugin.
> 
> The first three patches disable unneeded gcc plugin instrumentation for
> some files.
> 
> The fourth patch is the main improvement. It eliminates an unwanted
> side-effect of kernel code instrumentation performed by stackleak gcc
> plugin. This patch is a deep reengineering of the idea described on
> grsecurity blog:
>   https://grsecurity.net/resolving_an_unfortunate_stackleak_interaction
> 
> [...]

Applied to arm64 (for-next/fixes), thanks!

[1/1] arm64: vdso: Don't use gcc plugins for building vgettimeofday.c
  https://git.kernel.org/arm64/c/e56404e8e475

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


Re: Re: typeof and operands in named address spaces

2020-11-17 Thread Will Deacon via Gcc
On Tue, Nov 17, 2020 at 11:31:57AM -0800, Linus Torvalds wrote:
> On Tue, Nov 17, 2020 at 11:25 AM Jakub Jelinek  wrote:
> >
> > It would need to be typeof( (typeof(type)) (type) ) to not be that
> > constrained on what kind of expressions it accepts as arguments.
> 
> Yup.
> 
> > Anyway, it won't work with array types at least,
> >   int a[10];
> >   typeof ((typeof (a)) (a)) b;
> > is an error (in both gcc and clang), while typeof (a) b; will work
> > (but not drop the qualifiers).  Don't know if the kernel cares or not.
> 
> Well, the kernel already doesn't allow that, because our existing
> horror only handles simple integer scalar types.
> 
> So that macro is a clear improvement - if it actually works (local
> testing says it does, but who knows about random compiler versions
> etc)

I'll give it a go now, although if it works I honestly won't know whether
to laugh or cry.

Will


Re: Re: typeof and operands in named address spaces

2020-11-17 Thread Will Deacon via Gcc
On Tue, Nov 17, 2020 at 09:10:53PM +, Will Deacon wrote:
> On Tue, Nov 17, 2020 at 11:31:57AM -0800, Linus Torvalds wrote:
> > On Tue, Nov 17, 2020 at 11:25 AM Jakub Jelinek  wrote:
> > >
> > > It would need to be typeof( (typeof(type)) (type) ) to not be that
> > > constrained on what kind of expressions it accepts as arguments.
> > 
> > Yup.
> > 
> > > Anyway, it won't work with array types at least,
> > >   int a[10];
> > >   typeof ((typeof (a)) (a)) b;
> > > is an error (in both gcc and clang), while typeof (a) b; will work
> > > (but not drop the qualifiers).  Don't know if the kernel cares or not.
> > 
> > Well, the kernel already doesn't allow that, because our existing
> > horror only handles simple integer scalar types.
> > 
> > So that macro is a clear improvement - if it actually works (local
> > testing says it does, but who knows about random compiler versions
> > etc)
> 
> I'll give it a go now, although if it works I honestly won't know whether
> to laugh or cry.

GCC 9 and Clang 11 both seem to generate decent code for aarch64 defconfig
with:

#define __unqual_scalar_typeof(x)  typeof( (typeof(x)) (x))

replacing the current monstrosity. allnoconfig and allmodconfig build fine
too.

However, GCC 4.9.0 goes mad and starts spilling to the stack when dealing
with a pointer to volatile, as though we were just using typeof(). I tried
GCC 5.4.0 and that looks ok, so I think if anybody cares about the potential
performance regression with 4.9 then perhaps they should consider upgrading
their toolchain.

In other words, let's do it.

Will


Kick-starting P1997 implementation, array copy semantics

2021-08-10 Thread will wray via Gcc
P1997 "Relaxing Restrictions on Array" (https://wg21.link/p1997)
proposes copy semantics for C arrays: initialization and assignment
of arrays from arrays, and arrays as a function return type.
For C++, a new placeholder deduction syntax is proposed.

The paper was seen for the first time on Friday by SG22,
The Joint C and C++ Liaison Study Group. It was well received.

The next step is implementation experience...
I'm looking to kick-start a gcc branch for this work.

My gcc dev-fu is padawan level (two small patches ~ two years ago)
so I may need a kick-start myself.

If anyone has an interest or experience in array mechanics, C or C++,
I could do with a mentor / lifeline / phone-a-friend.
I've joined the gcc developer IRC as doodle.

At some point, ABI specifications for array return types will be needed.

Thanks, Will


Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect

2022-04-04 Thread Will Deacon via Gcc
On Sun, Apr 03, 2022 at 09:47:47AM +0200, Ard Biesheuvel wrote:
> On Sun, 3 Apr 2022 at 09:47, Ard Biesheuvel  wrote:
> > On Sun, 3 Apr 2022 at 09:38, Andrew Pinski  wrote:
> > > It might not be the most restricted fix but it is a fix.
> > > The best fix is to tell that you are writing to that location of memory.
> > > volatile asm does not do what you think it does.
> > > You didn't read further down about memory clobbers:
> > > https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Clobbers-and-Scratch-Registers
> > > Specifically this part:
> > > The "memory" clobber tells the compiler that the assembly code
> > > performs memory reads or writes to items other than those listed in
> > > the input and output operands
> > >
> >
> > So should we be using "m"(*addr) instead of "r"(addr) here?
> >
> > (along with the appropriately sized casts)
> 
> I mean "=m" not "m"

That can generate writeback addressing modes, which I think breaks
MMIO virtualisation. We usually end up using "Q" instead but the codegen
tends to be worse iirc.

In any case, for this specific problem I think we either need a fixed
compiler or some kbuild magic to avoid using it / disable the new behaviour.

We rely on 'asm volatile' not being elided in other places too.

Will


Re: htsearch broken?

2005-12-09 Thread Will L (sent by Nabble.com)


Hans-Peter Nilsson-2 wrote: 
> 
> If you mean "latest" instead of "earliest", it's because the
> search engine has stopped indexing, permanently.  No ETA; I'm
> not sure it'll be fixed at all.
> 

Try searching Nabble; the gcc user list is archived here: 
http://www.nabble.com/gcc---General-f1157.html

Posts from the list are indexed up to the minute, so new posts are almost 
immediately searchable.

Nabble also has a combined gcc archive covering all the lists from gcc - 
this includes gcc-fortran, gcc-help, gcc-java... Instead of searching each 
list individually, you can search all the lists in one place:

http://www.nabble.com/gcc-f1154.html

Regards,

Will L
Nabble.com
--
Sent from the gcc - General forum at Nabble.com:
http://www.nabble.com/htsearch-broken--t704225.html#a1879186



Re: htsearch broken?

2005-12-10 Thread Will L (sent by Nabble.com)


Jonathan Wakely wrote: 
> 
> Please note this is NOT, I repeat NOT, a GCC users list - this is a GCC
> developers list.  There have been several mails sent via nabble.com
> to this list that should have been sent to gcc-help instead.
> 
> jon
> 

Jon, 

Sorry for the confusion. I just corrected the info on Nabble, and I quoted your 
remarks in the description. The url is now: 
http://www.nabble.com/gcc---Dev-f1157.html

Nabble will develop a feature that allows users like you to correct this 
type of mistake yourself, somewhat like a wiki. But for now, thanks 
again for pointing this out.

Regards,

Will L
Nabble.com
--
Sent from the gcc - Dev forum at Nabble.com:
http://www.nabble.com/htsearch-broken--t704225.html#a1884523



Re: GCC mailing list archive search omits results after May 2005

2005-12-15 Thread Will L (sent by Nabble.com)

I have been following this thread of discussion. I am a little puzzled. Google 
and Gmail are both free but they are not "free" software according to the FSF 
definition. But does it matter? We still use them for work. Gmane is completely 
free and non-commercial, but its free-ness is still somehow questioned. How 
free is free? Don't get me wrong, I know what real free is and I appreciate it, 
but still I want to be practical.

I am a member of the Nabble project (similiar to Gmane), so I have a selfish 
interest in discussing this with you guys. Below is my view of the pros and 
cons of different alternatives:

1. Google - Not free, has ads
But the real problem is that Google does not index all the posts, and we don't 
know what its criteria for indexing are. One thing is for sure: recent posts 
take days or weeks to get crawled and put into the index, so searching with 
Google is hit-or-miss. Some people in this thread of discussion have noticed 
this.

2. Gmail - Not free, has ads
But the real problem is that it is not open to the public. Yes, I can search my 
gmail, but what about the newcomers to this list? Where do they search?

3. Gmane - free, no ads
I don't see a problem with using Gmane to search this list.

4. Nabble - Not free, no ads now but will have ads eventually
Gmane is the pioneer; Nabble tries to do better. The main improvement is 
cross-searching and browsing of multiple lists. You can search or browse all 
GCC lists here: http://www.nabble.com/gcc-f1154.html You can also drill down to 
the child node for this list: http://www.nabble.com/gcc---Dev-f1157.html Nabble 
allows many more parameters for fine-tuning a search. Try a search, then click 
the 'Search Tips' link.

The problem with Nabble is that it only started archiving GCC lists half a year 
ago, so the data is not complete. But if the mbox file is still available, we 
can probably do a custom import.

I am not involved in any of the gcc work, but I hope this helps your cause.

Regards,

Will L
Nabble.com
--
Sent from the gcc - Dev forum at Nabble.com:
http://www.nabble.com/GCC-mailing-list-archive-search-omits-results-after-May-2005-t738227.html#a1963126