Re: Source Code for Profile Guided Code Positioning

2016-01-15 Thread Yury Gribov

On 01/15/2016 06:53 PM, vivek pandya wrote:

Hello GCC Developers,

Are the 'Profile Guided Code Positioning' algorithms described in the
Pettis and Hansen paper (http://dl.acm.org/citation.cfm?id=93550)
implemented in GCC?
If so, kindly point me to the relevant files in the GCC source tree.


There's some stuff on the Google branch: 
https://gcc.gnu.org/ml/gcc-patches/2011-09/msg01440.html


-Y



Re: Source Code for Profile Guided Code Positioning

2016-01-15 Thread Yury Gribov

On 01/15/2016 08:44 PM, vivek pandya wrote:

Thanks Yury for the link
(https://gcc.gnu.org/ml/gcc-patches/2011-09/msg01440.html).
It implements procedure reordering as a linker plugin.
I have some questions:
1) Can you point me to some documentation on how to write a linker
plugin? I have not seen docs for the structs with the 'ld_' prefix
(i.e. those defined in plugin-api.h).
2) There is one more algorithm in the PH paper, for basic block ordering
by execution frequency count. Is there an implementation available for it?


Quite frankly - I don't know (I've only learned about the Google 
implementation recently).


I've added Sriram, who may be able to comment.

-Y


Re: Option handling (support) of -fsanitize=use-after-scope

2016-05-11 Thread Yury Gribov

On 05/11/2016 04:18 PM, Martin Liška wrote:

Hello.

I've been working on use-after-scope sanitizer enablement in the GCC
compiler ([1]), and as I've read in the following review request ([2]),
the LLVM compiler has started to use the following option:
-mllvm -asan-use-after-scope=1

My initial attempt was to introduce a new value for the -fsanitize option
(which would make the LLVM and GCC options compatible). Following the
current behavior of LLVM, I would have to add a new --param, which would
lead to a divergence. Is the suggested approach acceptable to the LLVM
community?

I would also suggest the following default behavior:
- If -fsanitize=address or -fsanitize=kernel-address is enabled, the
use-after-scope sanitization should be enabled
- Similarly, providing -fuse-after-scope should enable address
sanitization (either user-space or kernel-space)

Thank you for feedback,
Martin

[1] https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00468.html
[2] http://reviews.llvm.org/D19347


Cc-ed Google folks.



Improving Asan code on ARM targets

2014-04-28 Thread Yury Gribov

Hi all,

I've recently noticed that GCC generates suboptimal code for Asan on ARM 
targets. E.g. for a 4-byte memory access check


(shadow_val != 0) & (last_byte >= shadow_val)

we get the following sequence:

	mov	r2, r0, lsr #3
	and	r3, r0, #7
	add	r3, r3, #3
	add	r2, r2, #536870912
	ldrb	r2, [r2]	@ zero_extendqisi2
	sxtb	r2, r2
	cmp	r3, r2
	movlt	r3, #0
	movge	r3, #1
	cmp	r2, #0
	moveq	r3, #0
	cmp	r3, #0
	bne	.L5
	ldr	r0, [r0]

Obviously, shorter code is possible:

	mov	r3, r0, lsr #3
	and	r1, r0, #7
	add	r1, r1, #4
	add	r3, r3, #536870912
	ldrb	r3, [r3]	@ zero_extendqisi2
	sxtb	r3, r3
	cmp	r3, #0
	cmpne	r1, r3
	bgt	.L5
	ldr	r0, [r0]

A 30% improvement looked quite important given that Asan usually 
increases code size by 1.5-2x, so I decided to investigate. It 
turned out that the ARM backend already has full support for dominated 
comparisons (the cmp-cmpne-bgt sequence above) and can generate efficient 
code if we provide it with a slightly more explicit gimple sequence:


(shadow_val != 0) & (last_byte + 1 > shadow_val)

Ideally the backend should be able to perform this transform itself, but 
I'm not sure this is possible: it needs to know that last_byte + 1 cannot 
overflow, and this info is not available in RTL (because we don't have a 
VRP pass there).


I have attached a simple patch which changes the Asan pass to generate 
the ARM-friendly code. I've only bootstrapped/regtested on x64, but I can 
perform additional tests on ARM if the patch makes sense. As far as I can 
tell it does not worsen sanitized code on other platforms (x86/x64) 
while significantly improving ARM (15% less code for bzip).


The patch is certainly not ideal:
* it makes target-specific changes in machine-independent code
* it does not help with 1-byte accesses (the forwprop pass thinks it's 
always beneficial to convert x + 1 > y to x >= y, so it reverts my change)
* it only improves Asan code, whereas it would be great if the ARM backend 
could improve generic RTL code
but it achieves a significant improvement on ARM without hurting other 
platforms.


So my questions are:
* is this kind of target-specific tweaking acceptable in middle-end?
* if not - what would be a better option?

-Y
2014-04-29  Yury Gribov  

	* asan.c (build_check_stmt): Change generated code to improve
	code generated for ARM.

diff --git a/gcc/asan.c b/gcc/asan.c
index d7c282e..f00705a 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1543,18 +1543,17 @@ build_check_stmt (location_t location, tree base, gimple_stmt_iterator *iter,
 {
   /* Slow path for 1, 2 and 4 byte accesses.
      Test (shadow != 0)
-	  & ((base_addr & 7) + (size_in_bytes - 1)) >= shadow).  */
+	  & ((base_addr & 7) + size_in_bytes) > shadow).  */
   gimple_seq seq = NULL;
   gimple shadow_test = build_assign (NE_EXPR, shadow, 0);
   gimple_seq_add_stmt (&seq, shadow_test);
   gimple_seq_add_stmt (&seq, build_assign (BIT_AND_EXPR, base_addr, 7));
   gimple_seq_add_stmt (&seq, build_type_cast (shadow_type,
                                               gimple_seq_last (seq)));
-  if (size_in_bytes > 1)
-    gimple_seq_add_stmt (&seq,
-                         build_assign (PLUS_EXPR, gimple_seq_last (seq),
-                                       size_in_bytes - 1));
-  gimple_seq_add_stmt (&seq, build_assign (GE_EXPR, gimple_seq_last (seq),
+  gimple_seq_add_stmt (&seq,
+                       build_assign (PLUS_EXPR, gimple_seq_last (seq),
+                                     size_in_bytes));
+  gimple_seq_add_stmt (&seq, build_assign (GT_EXPR, gimple_seq_last (seq),
                                            shadow));
   gimple_seq_add_stmt (&seq, build_assign (BIT_AND_EXPR, shadow_test,
                                            gimple_seq_last (seq)));


Re: Improving Asan code on ARM targets

2014-04-29 Thread Yury Gribov

Andrew wrote:
> Does the patch series located at:
> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01407.html
> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01405.html
> fix this code generation issue?  I suspect it does and improves more
> than just the above code.

No, they don't help as is.

-Y


Re: Improving Asan code on ARM targets

2014-04-29 Thread Yury Gribov

Andrew wrote:

> I think it would be good to figure out how to improve this code gen
> with the above patches rather than changing asan.
> I suspect it might be easy to expand them to handle this case too.


True, let me take a closer look and get back to you. When is this 
expected to land in trunk, btw?


-Y


Re: Improving Asan code on ARM targets

2014-05-06 Thread Yury Gribov

Andrew Pinski wrote:
> Yury Gribov wrote:
>> Andrew Pinski wrote:
>>> Yury Gribov wrote:
>>>> I've recently noticed that GCC generates suboptimal code
>>>> for Asan on ARM targets. E.g. for a 4-byte memory access check
>>>
>>> Does the patch series located at:
>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01407.html
>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01405.html
>>> fix this code generation issue?  I suspect it does and improves more
>>> than just the above code.
>>
>> No, they don't help as is.
>
> I think it would be good to figure out how to improve this code gen
> with the above patches rather than changing asan.
> I suspect it might be easy to expand them to handle this case too.

I was indeed able to reuse Zhenqiang's work. After updating 
select_ccmp_cmp_order hook to also return suggestions on how to change 
comparisons to allow better code generation (so it sounds more like 
select_ccmp_cmp_layout now) I was able to use this information in 
expand_ccmp_expr to generate optimal code.


The patch is still a draft (only supports Asan's case) and I think I'll 
wait until Zhenqiang's conditional compare patches get into trunk before 
going deeper (not sure when this is going to happen though...).


-Y


Re: Cross-testing libsanitizer

2014-06-02 Thread Yury Gribov

Christophe,

> Indeed, when testing on my laptop, execution tests fail because
> libsanitizer wants to allocate 8GB of memory (I am using qemu as
> the execution engine).

Is this 8G of RAM? If so, I'd be curious to know which part of 
libsanitizer needs so much memory.


-Y


Re: Cross-testing libsanitizer

2014-06-03 Thread Yury Gribov

Is this 8G of RAM? If so, I'd be curious to know which part of
libsanitizer needs so much memory.


Here is what I have in gcc.log:
==12356==ERROR: AddressSanitizer failed to allocate 0x21000
(8589938688) bytes at address ff000 (errno: 12)
==12356==ReserveShadowMemoryRange failed while trying to map
0x21000 bytes. Perhaps you're using ulimit -v


Interesting. AFAIK Asan maps shadow memory with the NORESERVE flag so it 
should not consume any RAM at all...


-Y


Re: Prototype of a --report-bug option

2014-07-30 Thread Yury Gribov

On 07/30/2014 11:56 AM, Richard Biener wrote:
On Tue, Jul 29, 2014 at 8:35 PM, David Malcolm  wrote:

>At Cauldron on the Sunday morning there was a Release Management BoF
>session, replacing the specRTL talk (does anyone know what happened to
>the latter?)
>
>One of the topics was bug triage, and how many bug reports lacked basic
>metadata on e.g. host/build/target, reproducer etc.
>
Heh...  I was hoping this would be a patch to the driver directly
(thus not a python script).  Note that I don't care too much about
the reproducing/-save-temps and backtrace for the driver option.
Of course, in the case of an ICE, producing a proper bug URL with the
backtrace info included would be even better.


All, we've been trying to upstream a patch for something like this for 
the last month.

It doesn't bring you to Bugzilla but at least generates a repro
with host/target information and a call stack. Could someone take a look?
We could certainly enhance it to generate user-friendly links like those 
in David's script.


https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01649.html

-Y



Re: ASAN test failures make compare_tests useless

2014-08-17 Thread Yury Gribov

On 08/16/2014 04:37 AM, Manuel López-Ibáñez wrote:

On the compile farm, ASAN tests seem to fail a lot like:

FAIL: c-c++-common/asan/global-overflow-1.c   -O0  output pattern
test, is ==31166==ERROR: AddressSanitizer failed to allocate
0xdfff0001000 (15392894357504) bytes at address 2008fff7000 (errno:
12)
==31166==ReserveShadowMemoryRange failed while trying to map
0xdfff0001000 bytes. Perhaps you're using ulimit -v
, should match READ of size 1 at 0x[0-9a-f]+ thread T0.*(

The problem is that those addresses and sizes are very random, so when
I compare the test results of a pristine trunk with a patched one, I
get:

New tests that FAIL:

unix//-m64: c-c++-common/asan/global-overflow-1.c   -O0  output
pattern test, is ==12875==ERROR: AddressSanitizer failed to allocate
0xdfff0001000 (15392894357504) bytes at address 2008fff7000 (errno:
12)
unix//-m64: c-c++-common/asan/global-overflow-1.c   -O0  output
pattern test, is ==18428==ERROR: AddressSanitizer failed to allocate
0xdfff0001000 (15392894357504) bytes at address 2008fff7000 (errno:
12)
[... hundreds of ASAN tests that failed...]

Old tests that failed, that have disappeared: (Eeek!)

unix//-m64: c-c++-common/asan/global-overflow-1.c   -O0  output
pattern test, is ==30142==ERROR: AddressSanitizer failed to allocate
0xdfff0001000 (15392894357504) bytes at address 2008fff7000 (errno:
12)
unix//-m64: c-c++-common/asan/global-overflow-1.c   -O0  output
pattern test, is ==31166==ERROR: AddressSanitizer failed to allocate
0xdfff0001000 (15392894357504) bytes at address 2008fff7000 (errno:
12)
[... the same hundreds of tests that already failed before...]

The above makes it very difficult to identify failures caused by my patch.

Can we remove the "==" part of the error? This way compare_tests
will ignore the failures.

Alternatively, I could patch compare_tests to sed out that part before
comparing. Would that be acceptable?

Cheers,

Manuel.



Added Sanitizer folks. Frankly it'd be cool if dumping PIDs and 
addresses could be turned off.




Re: ASAN test failures make compare_tests useless

2014-08-17 Thread Yury Gribov

On 08/18/2014 09:42 AM, Yury Gribov wrote:

On 08/16/2014 04:37 AM, Manuel López-Ibáñez wrote:

On the compile farm, ASAN tests seem to fail a lot like:

FAIL: c-c++-common/asan/global-overflow-1.c   -O0  output pattern
test, is ==31166==ERROR: AddressSanitizer failed to allocate
0xdfff0001000 (15392894357504) bytes at address 2008fff7000 (errno:
12)
==31166==ReserveShadowMemoryRange failed while trying to map
0xdfff0001000 bytes. Perhaps you're using ulimit -v
, should match READ of size 1 at 0x[0-9a-f]+ thread T0.*(

The problem is that those addresses and sizes are very random, so when
I compare the test results of a pristine trunk with a patched one, I
get:

New tests that FAIL:

unix//-m64: c-c++-common/asan/global-overflow-1.c   -O0  output
pattern test, is ==12875==ERROR: AddressSanitizer failed to allocate
0xdfff0001000 (15392894357504) bytes at address 2008fff7000 (errno:
12)
unix//-m64: c-c++-common/asan/global-overflow-1.c   -O0  output
pattern test, is ==18428==ERROR: AddressSanitizer failed to allocate
0xdfff0001000 (15392894357504) bytes at address 2008fff7000 (errno:
12)
[... hundreds of ASAN tests that failed...]

Old tests that failed, that have disappeared: (Eeek!)

unix//-m64: c-c++-common/asan/global-overflow-1.c   -O0  output
pattern test, is ==30142==ERROR: AddressSanitizer failed to allocate
0xdfff0001000 (15392894357504) bytes at address 2008fff7000 (errno:
12)
unix//-m64: c-c++-common/asan/global-overflow-1.c   -O0  output
pattern test, is ==31166==ERROR: AddressSanitizer failed to allocate
0xdfff0001000 (15392894357504) bytes at address 2008fff7000 (errno:
12)
[... the same hundreds of tests that already failed before...]

The above makes it very difficult to identify failures caused by my patch.

Can we remove the "==" part of the error? This way compare_tests
will ignore the failures.

Alternatively, I could patch compare_tests to sed out that part before
comparing. Would that be acceptable?

Cheers,

Manuel.



Added Sanitizer folks. Frankly it'd be cool if dumping PIDs and
addresses could be turned off.



Ok, this time actually added them.



Re: ASAN test failures make compare_tests useless

2014-08-18 Thread Yury Gribov

On 08/18/2014 06:36 PM, Alexander Potapenko wrote:

Added Sanitizer folks. Frankly it'd be cool if dumping PIDs and addresses
could be turned off.


Could you please name a reason for that?


Reproducibility?

-Y


Re: non-reproducible g++.dg/ubsan/align-2.C -Os execution failure

2014-09-05 Thread Yury Gribov

On 09/04/2014 11:12 AM, Tom de Vries wrote:
> I ran into this non-reproducible failure while testing a non-bootstrap
> build on x86_64:
> ...
> PASS: g++.dg/ubsan/align-2.C   -Os  (test for excess errors)

Added UBSan folks.

Can this be related to http://llvm.org/bugs/show_bug.cgi?id=20721 ? It 
has been causing sporadic align-4 errors.


-Y


Re: [PATCH] RE: gcc parallel make check

2014-09-09 Thread Yury Gribov

On 09/09/2014 10:51 AM, VandeVondele Joost wrote:
> Attached is an extended version of the patch;
> it brings a 100% improvement in make -j32 -k check-gcc

First of all, many thanks for working on this.

+#   ls -1 | ../../../contrib/generate_tcl_patterns.sh 300 "dg.exp=gfortran.dg/"


How does this work with subdirectories? Can we replace ls with find?

-check_p_numbers=1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
+check_p_numbers=1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 \
+   21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

$(shell seq 1 40) ?

+  if (_assert_exit) exit 1

Haven't you already exited above?

> A second part of the patch is a new file 'contrib/generate_tcl_patterns.sh'
> which generates the needed regexp

Can we provide a Makefile target to automatically update Makefile.in?

-Y



Re: [PATCH] RE: gcc parallel make check

2014-09-09 Thread Yury Gribov

On 09/09/2014 06:14 PM, VandeVondele Joost wrote:

I certainly don't want to claim that the patch I have now is perfect,
it is rather an incremental improvement on the current setup.


I'd second this. Writing patterns manually seems rather inefficient and 
error-prone (not undoable, of course, but unnecessarily complicated). 
And with the current (crippled) version Joost already got a 100% test 
time improvement.

-Y



Re: [PATCH] RE: gcc parallel make check

2014-09-09 Thread Yury Gribov

On 09/09/2014 06:33 PM, Jakub Jelinek wrote:

On Tue, Sep 09, 2014 at 06:27:10PM +0400, Yury Gribov wrote:

On 09/09/2014 06:14 PM, VandeVondele Joost wrote:

I certainly don't want to claim that the patch I have now is perfect,
it is rather an incremental improvement on the current setup.


I'd second this. Writing patterns manually seems rather inefficient and
error-prone (not undoable, of course, but unnecessarily complicated).
And with the current (crippled) version Joost already got a 100% test
time improvement.


But if there are jobs that take just 1s to complete, then clearly it doesn't
make sense to split them off as a separate job.  I think we don't need a 100%
even split, but at least a roughly even one is highly desirable.


You mean enhancing the script to split across arbitrarily long prefixes?
That would be great.

-Y


Backporting KAsan patches to 4.9 branch

2014-09-18 Thread Yury Gribov

Hi all,

Kernel Asan patches are currently being discussed in LKML. One of the 
points raised during review was that KAsan requires GCC 5.0 which is 
presumably unstable (e.g. compilation of kernel modules has been broken 
for two months due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61848).


Would it make sense to backport Kasan-related patches to 4.9 branch to 
make this feature more accessible to kernel developers? Quick analysis 
showed that at the very least this would require

* r211091 (BUILT_IN_ASAN_REPORT_LOAD_N and friends)
* r211092 (instrument unaligned accesses)
* r211713 and r211699 (New asan-instrumentation-with-call-threshold 
parameter)

* r213367 (initial support for -fsanitize=kernel-address)
and also maybe ~10 bugfix patches.

Is it ok to backport these to 4.9? Note that I would discard patches for 
other sanitizers (UBsan, Tsan).


-Y


Re: Backporting KAsan patches to 4.9 branch

2014-09-18 Thread Yury Gribov

On 09/18/2014 01:57 PM, Jakub Jelinek wrote:

On Thu, Sep 18, 2014 at 01:46:21PM +0400, Yury Gribov wrote:

Kernel Asan patches are currently being discussed in LKML. One of the points
raised during review was that KAsan requires GCC 5.0 which is presumably
unstable (e.g. compilation of kernel modules has been broken for two months
due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61848).

Would it make sense to backport Kasan-related patches to 4.9 branch to make
this feature more accessible to kernel developers? Quick analysis showed
that at the very least this would require
* r211091 (BUILT_IN_ASAN_REPORT_LOAD_N and friends)
* r211092 (instrument unaligned accesses)
* r211713 and r211699 (New asan-instrumentation-with-call-threshold
parameter)
* r213367 (initial support for -fsanitize=kernel-address)
and also maybe ~10 bugfix patches.

Is it ok to backport these to 4.9? Note that I would discard patches for
other sanitizers (UBsan, Tsan).


I'd say so, if it doesn't need any library changes (especially not any ABI
visible ones, guess bugfixes could be acceptable).


Cool! I'll go for it then.


What asan related patches are still pending review (sorry for missing some)?


Np, AFAIK there are just two:
* add -fasan-shadow-offset 
(https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01170.html)
* enable -fsanitize-recover for KAsan by default 
(https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01169.html)



Do we have any known regressions in 5 from 4.9?


Not that I know of.

-Y


[RFC] Add asm constraint modifier to mark strict memory accesses

2014-09-18 Thread Yury Gribov

Hi all,

The current semantics of memory constraints in GCC inline asm (i.e. "m", 
"v", etc.) are somewhat loose in that they tell GCC that the asm code _may_ 
access the given number of bytes but is not guaranteed to do so. This is 
(ab)used by e.g. glibc (and also some pieces of the kernel):

__STRING_INLINE void *
__rawmemchr (const void *__s, int __c)
{
...
  __asm__ __volatile__
("cld\n\t"
 "repne; scasb\n\t"
...
   "m" ( *(struct { char __x[0xfff]; } *)__s)

Imprecise size specification prevents code analysis tools from 
understanding the semantics of inline asm (without parsing the inline asm 
instructions, which e.g. Asan in Clang tries to do). In particular, we 
can't automatically instrument inline asm in the kernel with KAsan because 
we cannot determine the exact access size (see e.g. the discussion in 
https://gcc.gnu.org/ml/gcc-patches/2014-05/msg02530.html).


Would it make sense to add another constraint modifier (like "=", "&", 
etc.) that would tell the compiler/tool that the memory access in the asm 
is _guaranteed_ to have the specified size?


-Y


Re: [RFC] Add asm constraint modifier to mark strict memory accesses

2014-09-18 Thread Yury Gribov

On 09/18/2014 03:09 PM, Yury Gribov wrote:

Hi all,

The current semantics of memory constraints in GCC inline asm (i.e. "m",
"v", etc.) are somewhat loose in that they tell GCC that the asm code _may_
access the given number of bytes but is not guaranteed to do so. This is
(ab)used by e.g. glibc (and also some pieces of the kernel):
__STRING_INLINE void *
__rawmemchr (const void *__s, int __c)
{
...
   __asm__ __volatile__
 ("cld\n\t"
  "repne; scasb\n\t"
...
"m" ( *(struct { char __x[0xfff]; } *)__s)

Imprecise size specification prevents code analysis tools from
understanding the semantics of inline asm (without parsing the inline asm
instructions, which e.g. Asan in Clang tries to do). In particular, we
can't automatically instrument inline asm in the kernel with KAsan because
we cannot determine the exact access size (see e.g. the discussion in
https://gcc.gnu.org/ml/gcc-patches/2014-05/msg02530.html).

Would it make sense to add another constraint modifier (like "=", "&",
etc.) that would tell the compiler/tool that the memory access in the asm
is _guaranteed_ to have the specified size?

-Y



Added kernel folks.


Re: [RFC] Add asm constraint modifier to mark strict memory accesses

2014-09-18 Thread Yury Gribov

On 09/18/2014 03:16 PM, Jakub Jelinek wrote:

On Thu, Sep 18, 2014 at 03:09:34PM +0400, Yury Gribov wrote:

The current semantics of memory constraints in GCC inline asm (i.e. "m", "v",
etc.) are somewhat loose in that they tell GCC that the asm code _may_ access
the given number of bytes but is not guaranteed to do so. This is (ab)used by
e.g. glibc (and also some pieces of the kernel):
__STRING_INLINE void *
__rawmemchr (const void *__s, int __c)
{
...
   __asm__ __volatile__
 ("cld\n\t"
  "repne; scasb\n\t"
...
"m" ( *(struct { char __x[0xfff]; } *)__s)

Imprecise size specification prevents code analysis tools from understanding
the semantics of inline asm (without parsing the inline asm instructions, which
e.g. Asan in Clang tries to do). In particular, we can't automatically
instrument inline asm in the kernel with KAsan because we cannot determine the
exact access size (see e.g. the discussion in
https://gcc.gnu.org/ml/gcc-patches/2014-05/msg02530.html).

Would it make sense to add another constraint modifier (like "=", "&", etc.)
that would tell the compiler/tool that the memory access in the asm is
_guaranteed_ to have the specified size?


CCing Richard/Jeff on this for thoughts.

Would that modifier mean that the inline asm is unconditionally reading
resp. writing that memory? "m"/"=m" right now is always about might read or
might write, not must.


Yes, that's what I had in mind. Many inline asms (at least in the kernel) 
do read the memory region unconditionally.



In any case, as no GCC versions support that, you'd need to heavily macroize
it in the kernel, not sure the kernel people would like that very much.


They said they could think about it.

-Y



Re: [RFC] Add asm constraint modifier to mark strict memory accesses

2014-09-18 Thread Yury Gribov

On 09/18/2014 05:36 PM, Jeff Law wrote:

On 09/18/14 05:19, Yury Gribov wrote:


Would that modifier mean that the inline asm is unconditionally reading
resp. writing that memory? "m"/"=m" right now is always about might
read or might write, not must.


Yes, that's what I had in mind. Many inline asms (at least in the kernel)
do read the memory region unconditionally.

That's precisely what I'd expect such a modifier to mean.  Right now
memory modifiers are strictly "may", but I can see a use case for "must".

I think the question is whether the kernel or glibc folks will use that
new capability and, if so, whether we get a significant improvement in the
amount of checking we can do.  So I think both those groups need to be
looped into this conversation.


Right. Should I cross-post, or better send separate emails and then report 
the feedback on the GCC list?



 From an implementation standpoint, are you thinking a different
modifier (my first choice)?


So we have constraints ("m", "v", "<", etc.) and modifiers which can be 
attached to arbitrary constraints ("+", "=", "&", etc.). I thought about 
adding a new modifier so that it could be attached to an arbitrary memory 
constraint as needed.



That wouldn't allow us to say something
like the first X bytes of this memory region are written and the
remaining Y bytes may be written, but I suspect that's not a use case
we're likely to care about.


Yeah, I don't think anyone needs this.

-Y


Re: [RFC] Add asm constraint modifier to mark strict memory accesses

2014-09-18 Thread Yury Gribov

On 09/18/2014 09:33 PM, Dmitry Vyukov wrote:

What is the number of cases it will fix for kasan?


Re-added kernel people again.

AFAIR, silly instrumentation that assumed all memory accesses in inline 
asm are must-accesses (instead of may-accesses) resulted in only one 
false positive. We haven't performed extensive testing though.



It won't fix the memchr function because the size is indeed not known
statically. So it's a bad example.


Sure, we will _not_ be able to instrument memchr. But being able to 
identify "safe" inline asms would allow us to instrument those (and my 
gut feeling is that they are the vast majority).



My impression was that kernel has relatively small amount of assembly,


Well,
$ grep -r '"[=+]\?[moVv<>]" *(' ~/src/linux-stable/ | wc -l
1133

And also
$ grep -r '"[=+]\?[moVv<>]" *(' ~/src/ffmpeg-2.2.2/ | wc -l
211

> And the rest is just not interesting enough.

That may be the case. But how do we know without trying?

-Y


Re: [RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work

2014-09-25 Thread Yury Gribov

On 09/24/2014 12:31 PM, Richard Biener wrote:

On Wed, Sep 24, 2014 at 9:43 AM, David Wohlferd  wrote:

Hans-Peter Nilsson: I should have listened to you back when you raised
concerns about this.  My apologies for ever doubting you.

In summary:

- The "trick" in the docs for using an arbitrarily sized struct to force
register flushes for inline asm does not work.
- Placing the inline asm in a separate routine can sometimes mask the
problem with the trick not working.
- The sample that has been in the docs forever performs an unhelpful,
unexpected, and probably unwanted stack allocation + memcpy.

Details:

Here is the text from the docs:

---
One trick to avoid [using the "memory" clobber] is available if the size of
the memory being accessed is known at compile time. For example, if
accessing ten bytes of a string, use a memory input like:

 "m"( ({ struct { char x[10]; } *p = (void *)ptr ; *p; }) )


Well - this can't work because you are essentially using a _value_
here (looking at the GIMPLE, I'm not sure if a statement expression
evaluates to an lvalue).

It should work if you simply do this without a stmt expression:

   "m" (*(struct { char x[10]; } *)ptr)

because that's clearly an lvalue (and the GIMPLE correctly says so):

   :
   c.a = 1;
   c.b = 2;
   __asm__ __volatile__("rep; stosb" : "=D" Dest_4, "=c" Count_5 : "0"
&c, "a" 0, "m" MEM[(struct foo *)&c], "1" 8);
   printf ("%u %u\n", 1, 2);

note that we still constant propagated 1 and 2 for the reason that
the asm didn't get any VDEF.  That's because you do not have any
memory output!  So while it keeps 'c' live it doesn't consider it
modified by the asm.  You'd still need to clobber the memory,
but "m" clobbers are not supported, only "memory".

Thus fixed asm:


   __asm__ __volatile__ ("rep; stosb"
: "=D" (Dest), "+c" (Count)
: "0" (&c), "a" (0),
"m" (*( struct foo { char x[8]; } *)&c)
: "memory"
   );

where I'm not 100% sure if the "m" input is now pointless (that is,
if a "memory" clobber also constitutes a use of all memory).


Or maybe even
  __asm__ __volatile__ ("rep; stosb"
   : "=D" (Dest), "+c" (Count), "+m" (*(struct foo { char x[8]; } *)&c)
   : "0" (&c), "a" (0)
  );
to avoid the big-hammer memory clobber?

-Y


Re: Testing Leak Sanitizer?

2014-09-30 Thread Yury Gribov

On 09/30/2014 07:15 PM, Christophe Lyon wrote:

Hello,

After I've recently enabled Address Sanitizer for AArch64 in GCC, I'd
like to enable Leak Sanitizer.

I'd like to know what are the requirements wrt testing it? IIUC there
are no lsan tests in the GCC testsuite so far.

Should I just test a few sample programs to check if basic functionality is OK?

The patch seems to be a 1-line patch, I just want to check the
acceptance criteria.


AFAIK the compiler-rt testsuite supports running under a non-Clang 
compiler. Don't ask me how to set up the beast though.




Re: msan and gcc ?

2014-10-02 Thread Yury Gribov

On 10/01/2014 10:39 PM, Kostya Serebryany wrote:

On Wed, Oct 1, 2014 at 11:38 AM, Toon Moene  wrote:

On 10/01/2014 08:00 PM, Kostya Serebryany wrote:


-gcc folks.

Why not use clang then?
It offers many more nice features.



What's the Fortran front-end called for clang (or do you really think we are
going to write Weather Forecasting codes in C :-) )


Oh, crap. :)


Well, there's always f2c ;)



Re: msan and gcc ?

2014-10-02 Thread Yury Gribov

On 10/02/2014 11:35 AM, Jakub Jelinek wrote:

On Thu, Oct 02, 2014 at 11:30:50AM +0400, Yury Gribov wrote:

On 10/01/2014 10:39 PM, Kostya Serebryany wrote:

On Wed, Oct 1, 2014 at 11:38 AM, Toon Moene  wrote:

On 10/01/2014 08:00 PM, Kostya Serebryany wrote:


-gcc folks.

Why not use clang then?
It offers many more nice features.



What's the Fortran front-end called for clang (or do you really think we are
going to write Weather Forecasting codes in C :-) )


Oh, crap. :)


Well, there's always f2c ;)


You mean for performance critical code?  Fortran has different aliasing
rules than C, so it is hard to express those in C...


No-no, I only meant debugging.



Re: bug report - libsanitizer compilation fail

2014-10-07 Thread Yury Gribov

On 10/06/2014 03:09 PM, Daniel Doron wrote:

Hi,

I am sending this bug report here because I can't register an account
in bugzilla...

gcc version: gcc-linaro-4.9-2014.09 (I checked also the main repo git,
the code is the same)
kernel: 2.6.37

"home/daniel/Downloads/.build/src/gcc-custom/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc:675:43:
error: 'EVIOCGPROP' was not declared in this scope"

This happens when compiling with kernel 2.6.37 headers.

#if EV_VERSION > (0x01)
   unsigned IOCTL_EVIOCGKEYCODE_V2 = EVIOCGKEYCODE_V2;
   unsigned IOCTL_EVIOCGPROP = EVIOCGPROP(0);
   unsigned IOCTL_EVIOCSKEYCODE_V2 = EVIOCSKEYCODE_V2;
#else
   unsigned IOCTL_EVIOCGKEYCODE_V2 = IOCTL_NOT_PRESENT;
   unsigned IOCTL_EVIOCGPROP = IOCTL_NOT_PRESENT;
   unsigned IOCTL_EVIOCSKEYCODE_V2 = IOCTL_NOT_PRESENT;
#endif


although in kernel 2.6.37 the EV_VERSION is indeed > (0x01) the
EVIOCGPROP define is missing and only appears in 2.6.38 onwards.


You'll probably want to report this to the upstream project (which is 
compiler-rt).


-Y



Re: Backporting KAsan patches to 4.9 branch

2014-10-14 Thread Yury Gribov

On 09/18/2014 01:57 PM, Jakub Jelinek wrote:
> On Thu, Sep 18, 2014 at 01:46:21PM +0400, Yury Gribov wrote:
>> Kernel Asan patches are currently being discussed in LKML. One of the
>> points raised during review was that KAsan requires GCC 5.0 which is
>> presumably unstable (e.g. compilation of kernel modules has been broken
>> for two months due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61848).
>>
>> Would it make sense to backport Kasan-related patches to 4.9 branch to
>> make this feature more accessible to kernel developers? Quick analysis
>> showed that at the very least this would require
>> ...
>> Is it ok to backport these to 4.9? Note that I would discard patches for
>> other sanitizers (UBsan, Tsan).
>
> I'd say so, if it doesn't need any library changes (especially not any
> ABI visible ones, guess bugfixes could be acceptable).

Finally got time to look into this. I've successfully backported 22 
patches to 4.9:

* bugfixes (12 patches)
* install Asan headers (1 patch)
* libsanitizer merge (1 patch) - this is questionable, see below for 
discussion

* BUILT_IN_ASAN_REPORT_{LOAD,STORE}_N (2 patches)
* instrumentation with calls (1 patch)
* optimize strlen instrumentation (1 patch)
* move inlining to sanopt pass (2 patches)
* Kasan (2 patches)

One problem is that for BUILT_IN_ASAN_REPORT_{LOAD,STORE}_N patch I need 
libsanitizer APIs (__asan_loadN, __asan_storeN) which were introduced in 
a giant libsanitizer merge in 5.0. In current patchset I backport the 
whole merge patch (and a bunch of cherry-picks which followed it) but it 
changes libsanitizer ABI (new version of __asan_init_vXXX, etc.) which 
is probably undesirable. Another option would be to backport just the 
necessary minimum (__asan_loadN, __asan_storeN). How should I proceed?


Another question: should I update patch CL dates for backported patches? 
If not, should I insert them into CLs in chronological order or just 
stack them on top of the previous contents?


-Y


Re: Backporting KAsan patches to 4.9 branch

2014-10-14 Thread Yury Gribov

On 10/14/2014 03:19 PM, Dmitry Vyukov wrote:

On Tue, Oct 14, 2014 at 3:07 PM, Yury Gribov  wrote:

On 09/18/2014 01:57 PM, Jakub Jelinek wrote:

On Thu, Sep 18, 2014 at 01:46:21PM +0400, Yury Gribov wrote:

Kernel Asan patches are currently being discussed in LKML. One of the
points>> raised during review was that KAsan requires GCC 5.0 which is
presumably
unstable (e.g. compilation of kernel modules has been broken for two
months
due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61848).

Would it make sense to backport Kasan-related patches to 4.9 branch to
make
this feature more accessible to kernel developers? Quick analysis showed
that at the very least this would require
...
Is it ok to backport these to 4.9? Note that I would discard patches for
other sanitizers (UBsan, Tsan).


I'd say so, if it doesn't need any library changes
(especially not any ABI
visible ones, guess bugfixes could be acceptable).


Finally got time to look into this. I've successfully backported 22 patches
to 4.9:
* bugfixes (12 patches)
* install Asan headers (1 patch)
* libsanitizer merge (1 patch) - this is questionable, see below for
discussion
* BUILT_IN_ASAN_REPORT_{LOAD,STORE}_N (2 patches)
* instrumentation with calls (1 patch)
* optimize strlen instrumentation (1 patch)
* move inlining to sanopt pass (2 patches)
* Kasan (2 patches)

One problem is that for BUILT_IN_ASAN_REPORT_{LOAD,STORE}_N patch I need
libsanitizer APIs (__asan_loadN, __asan_storeN) which were introduced in a
giant libsanitizer merge in 5.0. In current patchset I backport the whole
merge patch (and a bunch of cherry-picks which followed it) but it changes
libsanitizer ABI (new version of __asan_init_vXXX, etc.) which is probably
undesirable. Another option would be to backport just the necessary minimum
(__asan_loadN, __asan_storeN). How should I proceed?


Backporting only __asan_loadN/__asan_storeN looks like the safest option to me.


This would break forward compatibility of 4.9's libsanitizer which seems 
to be unacceptable.


-Y



Re: [RFC] Adjusted VRP

2014-10-30 Thread Yury Gribov

On 10/30/2014 01:27 PM, Richard Biener wrote:

Well, VRP is not path-insensitive - it is the value-ranges we are able
to retain after removing the ASSERT_EXPRs VRP inserts.

Why can't you do the ASAN optimizations in the VRP transform phase?


I think this is not Asan-specific: Marat's point was that allowing 
basic-block-precise ranges would generally allow the middle-end to produce 
better code.


-Y


Re: [RFC] Adjusted VRP

2014-10-30 Thread Yury Gribov

On 10/30/2014 04:19 PM, Marat Zakirov wrote:

On 10/30/2014 02:32 PM, Jakub Jelinek wrote:

On Thu, Oct 30, 2014 at 02:16:04PM +0300, Yury Gribov wrote:

On 10/30/2014 01:27 PM, Richard Biener wrote:

Well, VRP is not path-insensitive - it is the value-ranges we are able
to retain after removing the ASSERT_EXPRs VRP inserts.

Why can't you do the ASAN optimizations in the VRP transform phase?

I think this is not Asan-specific: Marat's point was that allowing
basic-block-precise ranges would generally allow middle-end to produce
better code.

The reason for get_range_info in the current form is that it is cheap,
and
unless we want to make some SSA_NAMEs non-propagatable [*], IMHO it
should
stay that way.  Now that we have ASAN_ internal calls, if you want to
optimize away ASAN_CHECK calls where VRP suggests that e.g. array
index will be within the right bounds and you'd optimize away
ASAN_CHECK to
a VAR_DECL access if the index was constant (say minimum or maximum of
the
range), you can do so in VRP and it is the right thing to do it there.

[*] - that is something I've been talking about for
__builtin_unreachable ()
etc., whether it would be worth it if range_info of certain SSA_NAME that
would VRP want to remove again was significantly better than range
info of
the base SSA_NAME, to keep that SSA_NAME around and e.g. block
forwprop etc.
from propagating the SSA_NAME copy, unless something other than
SSA_NAME has
been propagated into it.  Richard was against that though.


We didn't find reasonable performance gains from using VRP in asan. But even
if we had, we couldn't use it, because it is not safe for asan: it makes
some optimistic conclusions that are invalid for asan.

Adjusted VRP memory upper bound is #{trees that are compared} x nblocks
which could be reduced by some threshold.


Do you have some concrete numbers at hand?

-Y



Re: [RFC] UBSan unsafely uses VRP

2014-11-12 Thread Yury Gribov

On 11/11/2014 05:15 PM, Jakub Jelinek wrote:

There are also some unsafe code in functions
ubsan_expand_si_overflow_addsub_check, ubsan_expand_si_overflow_mul_check
which uses get_range_info to reduce checks number. As seen before vrp usage
for sanitizers may decrease quality of error detection.


Using VRP is completely intentional there, we don't want to generate too
slow code if you decide you want to optimize your code (for -O0 VRP isn't
performed of course).


On the other hand, detection quality is probably more important than 
performance, regardless of optimization level. When I use a checker, I 
don't want it to miss bugs due to overly aggressive optimization.


I wish we had some test to check that sanitizer optimizations are indeed 
conservative.


-Y



Re: ubsan, asan testing is broken due to coloring

2014-11-12 Thread Yury Gribov

[CC-ing sanitizer team.]

On 11/12/2014 08:02 AM, Andrew Pinski wrote:

With some configurations (looks like out of tree testing more than in
tree testing), all of ubsan and asan tests fail due to the
libsanitizer using coloring and that confuses the dejagnu pattern
matching.


Right, we fix new errors like this every now and then but they keep 
popping up.



I don't have time to look fully into how to fix this issue
and I don't care much about coloring anyway, so I disabled it in the source
for my own use so the tests now pass.


First, we could run with ASAN_OPTIONS=color=0.  I think we once tracked 
this error to QEMU incorrectly returning 1 for ASan's isatty() call but 
never bothered to fix it because fixing the tests was so much easier.


-Y


Re: [RFC] UBSan unsafely uses VRP

2014-11-12 Thread Yury Gribov

On 11/12/2014 11:45 AM, Marek Polacek wrote:

On Wed, Nov 12, 2014 at 11:42:39AM +0300, Yury Gribov wrote:

On 11/11/2014 05:15 PM, Jakub Jelinek wrote:

There are also some unsafe code in functions
ubsan_expand_si_overflow_addsub_check, ubsan_expand_si_overflow_mul_check
which uses get_range_info to reduce checks number. As seen before vrp usage
for sanitizers may decrease quality of error detection.


Using VRP is completely intentional there, we don't want to generate too
slow code if you decide you want to optimize your code (for -O0 VRP isn't
performed of course).


On the other hand detection quality is probably more important than
performance regardless of optimization level. When I use a checker, I don't
want it to miss bugs due to overly aggressive optimization.


Yes, but as said above, VRP is only run with >-O2 and -Os.


Hm, I must be missing something.  99% of users will only run their code 
under -O2 because it'll be too slow otherwise.  Why should we penalize 
them for this by lowering analysis quality?  Isn't error detection the 
main goal of sanitizers (performance being the secondary at best)?



I wish we had some test to check that sanitizer optimizations are indeed
conservative.


I think most of the tests we have are tested with various optimization
levels.


Existing tests are really a joke when we consider interblock 
optimization.  Most don't even contain any non-trivial control flow.


-Y



Re: [RFC] UBSan unsafely uses VRP

2014-11-12 Thread Yury Gribov

On 11/12/2014 04:26 PM, Jakub Jelinek wrote:

But, if -O0 isn't too slow for them, having unnecessary bloat even at -O2
is bad the same.  But not using VRP at all, you are giving up all the cases
where you know something won't overflow because you e.g. sign extend
or zero extend from some smaller type, sum op such values, and something
with constant, or you can use a cheaper code to multiply etc.


Sure, I was not suggesting anything like that - just pointing out that 
we must be careful about potential loss of precision and do all we can 
to avoid it.  Faster code should not justify lower quality (as it did 
in the 1960s).



Turning off -faggressive-loop-optimizations is certainly the right thing for
-fsanitize=undefined (any undefined I'd say), so are perhaps selected other
optimizations.


Totally agree.

-Y




Re: limiting call clobbered registers for library functions

2015-01-29 Thread Yury Gribov

On 01/29/2015 08:32 PM, Richard Henderson wrote:

On 01/29/2015 02:08 AM, Paul Shortis wrote:

I've ported GCC to a small 16 bit CPU that has single bit shifts. So I've
handled variable / multi-bit shifts using a mix of inline shifts and calls to
assembler support functions.

The calls to the asm library functions clobber only one (by const) or two
(variable) registers but of course calling these functions causes all of the
standard call clobbered registers to be considered clobbered, thus wasting lots
of candidate registers for use in expressions surrounding these shifts and
causing unnecessary register saves in the surrounding function 
prologue/epilogue.

I've scrutinized and cloned the actions of other ports that do the same,
however I'm unable to convince the various passes that only r1 and r2 can be
clobbered by these library calls.

Is anyone able to point me in the proper direction for a solution to this
problem ?


You wind up writing a pattern that contains a call,
but isn't represented in rtl as a call.


Could it be useful to provide a pragma for specifying function register 
usage? This would allow e.g. a library writer to write a hand-optimized 
assembly version and then inform the compiler of its binary interface.


Currently a surrogate of this can be achieved by putting inline asm code 
in static inline functions in public library headers, but this has its 
own disadvantages (e.g. code bloat).


-Y



Re: limiting call clobbered registers for library functions

2015-02-02 Thread Yury Gribov

On 01/30/2015 11:16 AM, Matthew Fortune wrote:

Yury Gribov  writes:

On 01/29/2015 08:32 PM, Richard Henderson wrote:

On 01/29/2015 02:08 AM, Paul Shortis wrote:

I've ported GCC to a small 16 bit CPU that has single bit shifts. So
I've handled variable / multi-bit shifts using a mix of inline shifts
and calls to assembler support functions.

The calls to the asm library functions clobber only one (by const) or
two
(variable) registers but of course calling these functions causes all
of the standard call clobbered registers to be considered clobbered,
thus wasting lots of candidate registers for use in expressions
surrounding these shifts and causing unnecessary register saves in

the surrounding function prologue/epilogue.


I've scrutinized and cloned the actions of other ports that do the
same, however I'm unable to convince the various passes that only r1
and r2 can be clobbered by these library calls.

Is anyone able to point me in the proper direction for a solution to
this problem ?


You wind up writing a pattern that contains a call, but isn't
represented in rtl as a call.


Could it be useful to provide a pragma for specifying function register
usage? This would allow e.g. a library writer to write a hand-optimized
assembly version and then inform the compiler of its binary interface.

Currently a surrogate of this can be achieved by putting inline asm code
in static inline functions in public library headers, but this has its
own disadvantages (e.g. code bloat).


This sounds like a good idea in principle. I seem to recall seeing something
similar to this in other compiler frameworks that allow a number of special
calling conventions to be defined and enable functions to be attributed to use
one of them. I.e. not quite so general as specifying an arbitrary clobber list
but some sensible pre-defined alternative conventions.


FYI a colleague from the kernel team mentioned that they already achieve 
this by wrapping the actual call with inline asm, e.g.


static inline int foo(int x) {
  asm(
".global foo_core\n"
// foo_core accepts single parameter in %rax,
// returns result in %rax and
// clobbers %rbx
"call foo_core\n"
: "+a"(x)
:
: "rbx"
  );
  return x;
}

We still can't mark inline asm with things like __attribute__((pure)), 
etc. though so it's not an ideal solution.


-Y



Re: gcc wiki project

2015-03-24 Thread Yury Gribov

On 03/24/2015 03:20 PM, Jonathan Wakely wrote:

On Mon, Mar 23, 2015 at 06:14:30PM -0500, David Kunsman wrote:

Hello, I was just reading through the current projects wiki page and I
noticed how out of date pretty much all of them are.  So I was
planning on doing "spring cleaning" by going down the list tracking
down what has been and what needs to be down and updating all the
wikis.  Do you think this is something that is worthwhile to work on?


Yes, I think that would be very useful.

On 24 March 2015 at 12:16, Martin Jambor wrote:

Yes, I think that even just moving hopelessly outdated stuff to some
"Archive" section,


I don't see any need to move pages (that would break old links).


So why not fix links as well?

-Y



Re: gcc addresssanitizer in MIPS

2013-10-28 Thread Yury Gribov

> Does someone use addresssanitizer in other platform (i386/x64/arm/ppc)
> suffer this problem?

Hi Jean,

Yes, we do see this error on ARM. Full description and suggested patch 
are available at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58543

I'm curious whether the suggested patch is going to work for Andrew.

-Y


Re: gcc addresssanitizer in MIPS

2013-10-28 Thread Yury Gribov

> Yes, we do see this error on ARM.

Here is another instance of the same bug: 
http://permalink.gmane.org/gmane.comp.debugging.address-sanitizer/531


> Full description and suggested patch are available at 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58543

> I'm curious whether suggested patch is going to work for Andrew.

-Y





Re: gcc addresssanitizer in MIPS

2013-10-29 Thread Yury Gribov

> Hi Yury, try to use the patch for asan.c to see if it solve your problem.

I tried but unfortunately it did not work for me. Could you try the 
patch suggested in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58543 
(I've attached it) when you have time? This was verified against gcc 
testsuite on x64 and ARM.


> My test was to use attached time.cpp for asan test.

BTW I can't reproduce your error on ARM (using gcc trunk):
$ /home/ygribov/install/gcc-master-arm-full/bin/arm-none-linux-gnueabi-gcc -fsanitize=address -O2 time.cpp
$ qemu-arm -R 0 -L /home/ygribov/install/gcc-master-arm-full/arm-none-linux-gnueabi/sys-root -E LD_LIBRARY_PATH=/lib:/usr/lib:/home/ygribov/install/gcc-master-arm-full/arm-none-linux-gnueabi/lib a.out


-Y
diff --git a/gcc/asan.c b/gcc/asan.c
index 32f1837..acb00ea 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -895,7 +895,7 @@ asan_clear_shadow (rtx shadow_mem, HOST_WIDE_INT len)
 
   gcc_assert ((len & 3) == 0);
   top_label = gen_label_rtx ();
-  addr = force_reg (Pmode, XEXP (shadow_mem, 0));
+  addr = copy_to_reg (force_reg (Pmode, XEXP (shadow_mem, 0)));
   shadow_mem = adjust_automodify_address (shadow_mem, SImode, addr, 0);
   end = force_reg (Pmode, plus_constant (Pmode, addr, len));
   emit_label (top_label);


Re: gcc addresssanitizer in MIPS

2013-10-29 Thread Yury Gribov

> "copy_to_mode_reg (Pmode, XEXP (shadow_mem, 0))" would be more direct.
> But it looks good to me with that change FWIW.

Thanks, Richard. Note that Jakub has proposed an optimized patch on 
gcc-patches ML (in Re: [PATCH] Invalid unpoisoning of stack redzones on 
ARM).


-Y


Re: Report on the bounded pointers work

2013-11-05 Thread Yury Gribov

> If you're referring to mudflap (Frank Eigler's work),
> ...
> It never reached a point where interoperability across objects with 
and without mudflap instrumentation worked


Jeff,

Could you add more details? E.g. I don't see how mudflap 
interoperability is different from AddressSanitizer, which seems to be 
the state of the art.


-Y


Asm volatile causing performance regressions on ARM

2014-02-27 Thread Yury Gribov

Hi all,

We recently ran into a performance/code size regression on ARM 
targets after transitioning from GCC 4.7 to GCC 4.8 (the regression is 
also present in 4.9).


The following code snippet uses Linux-style compiler barriers to protect 
memory writes:


  #define barrier() __asm__ __volatile__ ("": : :"memory")
  #define write(v,a) { barrier(); *(volatile unsigned *)(a) = (v); }

  #define v1 0x0010
  #define v2 0xaabbccdd

  void test(unsigned base) {
write(v1, base + 0x100);
write(v2, base + 0x200);
write(v1, base + 0x300);
write(v2, base + 0x400);
  }

Code generated by GCC 4.7 under -Os (all good):

   mov r2, #7340032
   str r2, [r0, #3604]
   ldr r3, .L2
   str r3, [r0, #3612]
   str r2, [r0, #3632]
   str r3, [r0, #3640]

(note that the compiler decided to load v2 from the constant pool).

Now code generated by GCC 4.8/4.9 under -Os is much larger because v1 
and v2 are reloaded before every store:


   mov r3, #7340032
   str r3, [r0, #3604]
   ldr r3, .L2
   str r3, [r0, #3612]
   mov r3, #7340032
   str r3, [r0, #3632]
   ldr r3, .L2
   str r3, [r0, #3640]

v1 and v2 are constant literals and can't really be changed by the user, 
so I would expect the compiler to combine the loads.

After some investigation, we discovered that this behavior is caused by a 
big hammer in gcc/cse.c:

   /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything.  */
   if (NONJUMP_INSN_P (insn)
       && volatile_insn_p (PATTERN (insn)))
     flush_hash_table ();

This code (introduced in 
http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=193802) aborts CSE 
after seeing a volatile inline asm.


Is this compiler behavior reasonable? AFAIK GCC documentation only says 
that __volatile__ prevents the compiler from removing the asm but it does 
not mention that it suppresses optimization of all surrounding expressions.


If this behavior is not intended, what would be the best way to fix 
performance? I could teach GCC to not remove constant RTXs in 
flush_hash_table() but this is probably very naive and won't cover some 
corner-cases.


-Y


Re: Asm volatile causing performance regressions on ARM

2014-02-27 Thread Yury Gribov

Richard Biener wrote:

If this behavior is not intended, what would be the best way to fix
performance? I could teach GCC to not remove constant RTXs in
flush_hash_table() but this is probably very naive and won't cover some
corner-cases.


That could be a good starting point though.


Though with modifying "machine state" you can modify constants as well, no?


Valid point, but this would mean relying on the compiler to always load 
all constants from memory (instead of, say, generating them via 
movhi/movlo), which looks extremely fragile.


What is the general attitude towards volatile asm? Are people interested 
in making it more defined/performant or should we just leave this can of 
worms as is? I can try to improve generated code but my patches will be 
doomed if there is no consensus on what volatile asm actually means...


-Y


Re: linux says it is a bug

2014-03-04 Thread Yury Gribov

Richard wrote:
> volatile __asm__("":::"memory")
>
> is a memory barrier and a barrier for other volatile instructions.

AFAIK asm without output arguments is implicitly marked as volatile. So 
it may not be needed in barrier() at all.


-Y


Re: linux says it is a bug

2014-03-04 Thread Yury Gribov
>> Asms without outputs are automatically volatile.  So there ought be zero
>> change with and without the explicit use of the __volatile__ keyword.
>
> That’s what the documentation says but it wasn’t actually true
> as of a couple of releases ago, as I recall.

Looks like 2005:

$ git annotate gcc/c/c-typeck.c
...
89552023 (bonzini 2005-10-05 12:17:16 + 9073)  /* asm statements without outputs, including simple ones, are treated
89552023 (bonzini 2005-10-05 12:16:16 + 9074)     as volatile.  */
89552023 (bonzini 2005-10-05 12:17:16 + 9075)  ASM_INPUT_P (args) = simple;
89552023 (bonzini 2005-10-05 12:17:16 + 9076)  ASM_VOLATILE_P (args) = (noutputs == 0);


-Y


Re: linux says it is a bug

2014-03-04 Thread Yury Gribov

What are volatile instructions? Can you give us an example?

Check volatile_insn_p. AFAIK there are two classes of volatile instructions:
* volatile asm
* unspec volatiles (target-specific instructions for e.g. protecting 
function prologues)


-Y


Re: gcc-4.9: How to generate Makefile.in from a modified Makefile.am?

2014-03-26 Thread Yury Gribov

You must use autoconf 2.65, exactly.


Perhaps we could update 
http://gcc.gnu.org/wiki/Regenerating_GCC_Configuration ?


-Y


Re: gcc-4.9: How to generate Makefile.in from a modified Makefile.am?

2014-03-26 Thread Yury Gribov

>> You must use autoconf 2.65, exactly.
> configure.ac:27: error: Please use exactly Autoconf 2.64 instead of 2.69.

Hm...

-Y