Backporting KAsan patches to 4.9 branch
Hi all,

Kernel Asan patches are currently being discussed in LKML. One of the points raised during review was that KAsan requires GCC 5.0 which is presumably unstable (e.g. compilation of kernel modules has been broken for two months due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61848).

Would it make sense to backport Kasan-related patches to 4.9 branch to make this feature more accessible to kernel developers? Quick analysis showed that at the very least this would require
* r211091 (BUILT_IN_ASAN_REPORT_LOAD_N and friends)
* r211092 (instrument unaligned accesses)
* r211713 and r211699 (New asan-instrumentation-with-call-threshold parameter)
* r213367 (initial support for -fsanitize=kernel-address)
and also maybe ~10 bugfix patches.

Is it ok to backport these to 4.9? Note that I would discard patches for other sanitizers (UBsan, Tsan).

-Y
Re: Backporting KAsan patches to 4.9 branch
On Thu, Sep 18, 2014 at 01:46:21PM +0400, Yury Gribov wrote: > Kernel Asan patches are currently being discussed in LKML. One of the points > raised during review was that KAsan requires GCC 5.0 which is presumably > unstable (e.g. compilation of kernel modules has been broken for two months > due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61848). > > Would it make sense to backport Kasan-related patches to 4.9 branch to make > this feature more accessible to kernel developers? Quick analysis showed > that at the very least this would require > * r211091 (BUILT_IN_ASAN_REPORT_LOAD_N and friends) > * r211092 (instrument unaligned accesses) > * r211713 and r211699 (New asan-instrumentation-with-call-threshold > parameter) > * r213367 (initial support for -fsanitize=kernel-address) > and also maybe ~10 bugfix patches. > > Is it ok to backport these to 4.9? Note that I would discard patches for > other sanitizers (UBsan, Tsan). I'd say so, if it doesn't need any library changes (especially not any ABI visible ones, guess bugfixes could be acceptable). What asan related patches are still pending review (sorry for missing some)? Do we have any known regressions in 5 from 4.9? Those would need to be resolved first. Jakub
Re: Backporting KAsan patches to 4.9 branch
On 09/18/2014 01:57 PM, Jakub Jelinek wrote: On Thu, Sep 18, 2014 at 01:46:21PM +0400, Yury Gribov wrote: Kernel Asan patches are currently being discussed in LKML. One of the points raised during review was that KAsan requires GCC 5.0 which is presumably unstable (e.g. compilation of kernel modules has been broken for two months due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61848). Would it make sense to backport Kasan-related patches to 4.9 branch to make this feature more accessible to kernel developers? Quick analysis showed that at the very least this would require * r211091 (BUILT_IN_ASAN_REPORT_LOAD_N and friends) * r211092 (instrument unaligned accesses) * r211713 and r211699 (New asan-instrumentation-with-call-threshold parameter) * r213367 (initial support for -fsanitize=kernel-address) and also maybe ~10 bugfix patches. Is it ok to backport these to 4.9? Note that I would discard patches for other sanitizers (UBsan, Tsan). I'd say so, if it doesn't need any library changes (especially not any ABI visible ones, guess bugfixes could be acceptable). Cool! I'll go for it then. What asan related patches are still pending review (sorry for missing some)? Np, AFAIK there are just two: * add -fasan-shadow-offset (https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01170.html) * enable -fsanitize-recover for KAsan by default (https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01169.html) Do we have any known regressions in 5 from 4.9? Not that I know of. -Y
[RFC] Add asm constraint modifier to mark strict memory accesses
Hi all,

The current semantics of memory constraints in GCC inline asm (i.e. "m", "v", etc.) are somewhat loose in that they tell GCC that asm code _may_ access a given number of bytes but is not guaranteed to do so. This is (ab)used by e.g. glibc (and also some pieces of kernel):

  __STRING_INLINE void *
  __rawmemchr (const void *__s, int __c)
  {
    ...
    __asm__ __volatile__
      ("cld\n\t"
       "repne; scasb\n\t"
       ...
       "m" ( *(struct { char __x[0xfff]; } *)__s)

Imprecise size specification prevents code analysis tools from understanding semantics of inline asm (without parsing inline asm instructions which e.g. Asan in Clang tries to do). In particular we can't automatically instrument inline asm in kernel with Kasan because we cannot determine the exact access size (see e.g. discussion in https://gcc.gnu.org/ml/gcc-patches/2014-05/msg02530.html).

Would it make sense to add another constraint modifier (like "=", "&", etc.) that would tell compiler/tool that memory access in asm is _guaranteed_ to have the specified size?

-Y
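To make the imprecision concrete, here is a minimal sketch in C (GCC or Clang extensions assumed). The names `may_access_window`, `advertised_size` and `bytes_really_read` are hypothetical and exist only for this illustration. The asm template is deliberately empty so the sketch builds on any target, and a plain C loop stands in for the "repne; scasb" scan. It shows that the operand advertises 0xfff bytes no matter how few bytes the scan actually reads.

```c
#include <stddef.h>

/* Operand type in the style of the glibc snippet: it advertises 0xfff bytes.
   The type name is made up for this sketch. */
struct may_access_window { char x[0xfff]; };

/* Hypothetical helper: the size a compiler or analysis tool sees
   when it looks only at the "m" operand. */
static size_t advertised_size(void)
{
    return sizeof(struct may_access_window);
}

/* Hypothetical stand-in for the "repne; scasb" loop, written in C.
   Returns how many bytes are really read before c is found.
   Like rawmemchr, it assumes c is present in the buffer. */
static size_t bytes_really_read(const char *s, int c)
{
    size_t n = 0;

    /* Empty asm template: under today's semantics the operand only says
       that up to 0xfff bytes starting at s MAY be read. */
    __asm__ __volatile__("" : : "m"(*(const struct may_access_window *)s));

    while ((unsigned char)s[n] != (unsigned char)c)
        n++;
    return n + 1; /* count the matching byte as well */
}
```

For the string "hello" and the character 'l', the scan reads 3 bytes while the operand still advertises 4095. A tool that trusts the operand size would check far more memory than the asm touches, which is the gap the proposed "must access" modifier is meant to close.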
Re: [RFC] Add asm constraint modifier to mark strict memory accesses
On Thu, Sep 18, 2014 at 03:09:34PM +0400, Yury Gribov wrote: > Current semantics of memory constraints in GCC inline asm (i.e. "m", "v", > etc.) is somewhat loosy in that it tells GCC that asm code _may_ access > given amount of bytes but is not guaranteed to do so. This is (ab)used by > e.g. glibc (and also some pieces of kernel): > __STRING_INLINE void * > __rawmemchr (const void *__s, int __c) > { > ... > __asm__ __volatile__ > ("cld\n\t" > "repne; scasb\n\t" > ... >"m" ( *(struct { char __x[0xfff]; } *)__s) > > Imprecise size specification prevents code analysis tools from understanding > semantics of inline asm (without parsing inline asm instructions which e.g. > Asan in Clang tries to do). In particular we can't automatically instrument > inline asm in kernel with Kasan because we can not determine exact access > size (see e.g. discussion in > https://gcc.gnu.org/ml/gcc-patches/2014-05/msg02530.html). > > Would it make sense to add another constraint modifier (like "=", "&", etc.) > that would tell compiler/tool that memory access in asm is _guaranteed_ to > have the specified size? CCing Richard/Jeff on this for thoughts. Would that modifier mean that the inline asm is unconditionally reading resp. writing that memory? "m"/"=m" right now is always about might read or might write, not must. In any case, as no GCC versions support that, you'd need to heavily macroize it in the kernel, not sure the kernel people would like that very much. Jakub
Re: [RFC] Add asm constraint modifier to mark strict memory accesses
On 09/18/2014 03:09 PM, Yury Gribov wrote: Hi all, Current semantics of memory constraints in GCC inline asm (i.e. "m", "v", etc.) is somewhat loosy in that it tells GCC that asm code _may_ access given amount of bytes but is not guaranteed to do so. This is (ab)used by e.g. glibc (and also some pieces of kernel): __STRING_INLINE void * __rawmemchr (const void *__s, int __c) { ... __asm__ __volatile__ ("cld\n\t" "repne; scasb\n\t" ... "m" ( *(struct { char __x[0xfff]; } *)__s) Imprecise size specification prevents code analysis tools from understanding semantics of inline asm (without parsing inline asm instructions which e.g. Asan in Clang tries to do). In particular we can't automatically instrument inline asm in kernel with Kasan because we can not determine exact access size (see e.g. discussion in https://gcc.gnu.org/ml/gcc-patches/2014-05/msg02530.html). Would it make sense to add another constraint modifier (like "=", "&", etc.) that would tell compiler/tool that memory access in asm is _guaranteed_ to have the specified size? -Y Added kernel folks.
Re: [RFC] Add asm constraint modifier to mark strict memory accesses
On 09/18/2014 03:16 PM, Jakub Jelinek wrote: On Thu, Sep 18, 2014 at 03:09:34PM +0400, Yury Gribov wrote: Current semantics of memory constraints in GCC inline asm (i.e. "m", "v", etc.) is somewhat loosy in that it tells GCC that asm code _may_ access given amount of bytes but is not guaranteed to do so. This is (ab)used by e.g. glibc (and also some pieces of kernel): __STRING_INLINE void * __rawmemchr (const void *__s, int __c) { ... __asm__ __volatile__ ("cld\n\t" "repne; scasb\n\t" ... "m" ( *(struct { char __x[0xfff]; } *)__s) Imprecise size specification prevents code analysis tools from understanding semantics of inline asm (without parsing inline asm instructions which e.g. Asan in Clang tries to do). In particular we can't automatically instrument inline asm in kernel with Kasan because we can not determine exact access size (see e.g. discussion in https://gcc.gnu.org/ml/gcc-patches/2014-05/msg02530.html). Would it make sense to add another constraint modifier (like "=", "&", etc.) that would tell compiler/tool that memory access in asm is _guaranteed_ to have the specified size? CCing Richard/Jeff on this for thoughts. Would that modifier mean that the inline asm is unconditionally reading resp. writing that memory? "m"/"=m" right now is always about might read or might write, not must. Yes, that's what I had in mind. Many inline asms (at least in kernel) do read memory region unconditionally. In any case, as no GCC versions support that, you'd need to heavily macroize it in the kernel, not sure the kernel people would like that very much. They said they could think about it. -Y
Re: [RFC] Add asm constraint modifier to mark strict memory accesses
On 09/18/14 05:19, Yury Gribov wrote:
>> Would that modifier mean that the inline asm is unconditionally reading resp. writing that memory? "m"/"=m" right now is always about might read or might write, not must.
>
> Yes, that's what I had in mind. Many inline asms (at least in kernel) do read memory region unconditionally.

That's precisely what I'd expect such a modifier to mean. Right now memory modifiers are strictly "may" but I can see a use case for "must".

I think the question is will the kernel or glibc folks use that new capability and if so, do we get a significant improvement in the amount of checking we can do. So I think both those groups need to be looped into this conversation.

From an implementation standpoint, are you thinking a different modifier (my first choice)? That wouldn't allow us to say something like the first X bytes of this memory region are written and the remaining Y bytes may be written, but I suspect that's not a use case we're likely to care about.

jeff
Re: [RFC] Add asm constraint modifier to mark strict memory accesses
On September 18, 2014 3:36:24 PM CEST, Jeff Law wrote: >On 09/18/14 05:19, Yury Gribov wrote: >>> >>> Would that modifier mean that the inline asm is unconditionally >reading >>> resp. writing that memory? "m"/"=m" right now is always about might >>> read or might write, not must. >> >> Yes, that's what I had in mind. Many inline asms (at least in kernel) >do >> read memory region unconditionally. >That's precisely what I'd expect such a modifier to mean. Right now >memory modifiers are strictly "may" but I can see a use case for >"must". > >I think the question is will the kernel or glibc folks use that new >capability and if so, do we get a significant improvement in the amount > >of checking we can do.So I think both those groups need to be >looped >into this conversation. > > From an implementation standpoint, are you thinking a different >modifier (my first choice)? That wouldn't allow us to say something >like the first X bytes of this memory region are written and the >remaining Y bytes may be written, but I suspect that's not a use case >we're likely to care about. It would also enable us to do more DSE as the asm stmt is then known to kill a specific part of memory. Maybe we even want to constrain the effective type of the memory accesses so we can do TBAA against inline asms? Richard. >jeff
Re: [RFC] Add asm constraint modifier to mark strict memory accesses
On 09/18/2014 05:36 PM, Jeff Law wrote:
> On 09/18/14 05:19, Yury Gribov wrote:
>>> Would that modifier mean that the inline asm is unconditionally reading resp. writing that memory? "m"/"=m" right now is always about might read or might write, not must.
>>
>> Yes, that's what I had in mind. Many inline asms (at least in kernel) do read memory region unconditionally.
>
> That's precisely what I'd expect such a modifier to mean. Right now memory modifiers are strictly "may" but I can see a use case for "must".
>
> I think the question is will the kernel or glibc folks use that new capability and if so, do we get a significant improvement in the amount of checking we can do. So I think both those groups need to be looped into this conversation.

Right. Should I x-post or better send separate emails and then report feedback on GCC list?

> From an implementation standpoint, are you thinking a different modifier (my first choice)?

So we have constraints ("m", "v", "<", etc.) and modifiers which can be attached to arbitrary constraints ("+", "=", "&", etc.). I thought about adding a new modifier so that it could be added to an arbitrary memory constraint as needed.

> That wouldn't allow us to say something like the first X bytes of this memory region are written and the remaining Y bytes may be written, but I suspect that's not a use case we're likely to care about.

Yeah, I don't think anyone needs this.

-Y
Re: [RFC] Add asm constraint modifier to mark strict memory accesses
On Thu, Sep 18, 2014 at 4:09 AM, Yury Gribov wrote: > Hi all, > > Current semantics of memory constraints in GCC inline asm (i.e. "m", "v", > etc.) is somewhat loosy in that it tells GCC that asm code _may_ access > given amount of bytes but is not guaranteed to do so. This is (ab)used by > e.g. glibc (and also some pieces of kernel): > __STRING_INLINE void * > __rawmemchr (const void *__s, int __c) > { > ... > __asm__ __volatile__ > ("cld\n\t" > "repne; scasb\n\t" > ... >"m" ( *(struct { char __x[0xfff]; } *)__s) > > Imprecise size specification prevents code analysis tools from understanding > semantics of inline asm (without parsing inline asm instructions which e.g. > Asan in Clang tries to do). In particular we can't automatically instrument > inline asm in kernel with Kasan because we can not determine exact access > size (see e.g. discussion in > https://gcc.gnu.org/ml/gcc-patches/2014-05/msg02530.html). > > Would it make sense to add another constraint modifier (like "=", "&", etc.) > that would tell compiler/tool that memory access in asm is _guaranteed_ to > have the specified size? Hi, What is the number of cases it will fix for kasan? It won't fix the memchr function because the size is indeed not known statically. So it's a bad example. My impression was that kernel has relatively small amount of assembly, out of which: 1. memchr/strcpy need special handling anyway. 2. putuser/getuser must not be instrumented. 3. atomic operations need special handling for ktsan, so kasan can just reuse the same manual instrumentation. And the rest is just not interesting enough. Am I missing something?
Re: [RFC] Add asm constraint modifier to mark strict memory accesses
On 09/18/14 08:38, Yury Gribov wrote: On 09/18/2014 05:36 PM, Jeff Law wrote: On 09/18/14 05:19, Yury Gribov wrote: Would that modifier mean that the inline asm is unconditionally reading resp. writing that memory? "m"/"=m" right now is always about might read or might write, not must. Yes, that's what I had in mind. Many inline asms (at least in kernel) do read memory region unconditionally. That's precisely what I'd expect such a modifier to mean. Right now memory modifiers are strictly "may" but I can see a use case for "must". I think the question is will the kernel or glibc folks use that new capability and if so, do we get a significant improvement in the amount of checking we can do.So I think both those groups need to be looped into this conversation. Right. Should I x-post or better send separate emails and then report feedback on GCC list? I think cross posting is fine. Most of us don't necessarily watch the kernel or glibc lists -- and in this case I think those cross list discussions could be extremely valuable. Jeff
Re: [RFC] Add asm constraint modifier to mark strict memory accesses
On 09/18/14 08:32, Richard Biener wrote:
> On September 18, 2014 3:36:24 PM CEST, Jeff Law wrote:
>> On 09/18/14 05:19, Yury Gribov wrote:
>>>> Would that modifier mean that the inline asm is unconditionally reading resp. writing that memory? "m"/"=m" right now is always about might read or might write, not must.
>>>
>>> Yes, that's what I had in mind. Many inline asms (at least in kernel) do read memory region unconditionally.
>>
>> That's precisely what I'd expect such a modifier to mean. Right now memory modifiers are strictly "may" but I can see a use case for "must".
>>
>> I think the question is will the kernel or glibc folks use that new capability and if so, do we get a significant improvement in the amount of checking we can do. So I think both those groups need to be looped into this conversation.
>>
>> From an implementation standpoint, are you thinking a different modifier (my first choice)? That wouldn't allow us to say something like the first X bytes of this memory region are written and the remaining Y bytes may be written, but I suspect that's not a use case we're likely to care about.
>
> It would also enable us to do more DSE as the asm stmt is then known to kill a specific part of memory. Maybe we even want to constrain the effective type of the memory accesses so we can do TBAA against inline asms?

Yea, but I suspect there aren't that many opportunities to do DSE that are enabled by seeing the must-write in an ASM. Then again, one might argue that even if they aren't common, if they do occur, they're important as (in theory) folks shouldn't be using ASMs if the code isn't hot.

jeff
LTO testsuite - single test execution
Hello.

I would like to introduce a new test case for an issue (PR63270). I was looking through the *.exp files and I expected that another test, located in ./gcc/testsuite/g++.dg/lto/pr63166_0.ii, could be executed with:

make check -k RUNTESTFLAGS="lto.exp=pr63166*"

but without success. Another interesting issue is running 'make check-lto', where I was given:

make: *** No rule to make target `check-lto'. Stop.

Can you please help me with LTO test integration?

Thanks,
Martin
Re: Fwd: Building gcc-4.9 on OpenBSD
On Wed, Sep 17, 2014 at 01:26:48PM -0400, Ian Grant wrote:
> The reason I'm doing this is that I want to understand why the total
> size of the binaries grew from around 10MB (gcc v 4.5) to over 70MB in
> 4.9
>
> I can compile the first stage OK, and the binaries are quite modest:
>
> -rwxr-xr-x 1 ian ian 17.2M Sep 6 03:47 prev-gcc/cc1
> -rwxr-xr-x 1 ian ian 1.2M Sep 6 04:24 prev-gcc/cpp
> -rwxr-xr-x 1 ian ian 1.2M Sep 6 04:24 prev-gcc/xgcc

Gcc 4.9 binaries on OpenBSD/amd64 are reasonable:

-r-xr-xr-x 1 root bin 11.6M Sep 9 03:02 cc1
-r-xr-xr-x 1 root bin 15.4M Sep 9 03:02 gnat1
-r-xr-xr-x 1 root bin 749K Sep 9 03:02 ecpp

There is indeed a problem with huge binaries on OpenBSD/arm, which I've not yet figured out, but i386/amd64/sparc64 are fine.

Are you trying to build gcc from the vanilla sources? If so, you're in for a treat...

>
> The 2nd stage doesn't compile however, because the Intel library
> doesn't support OpenBSD. The host/target is i386-unknown-openbsd5.4:
>
> ../.././libcilkrts/runtime/os-unix.c:69:5: error: #error "Unsupported OS"
> # error "Unsupported OS"
> ^
> ../.././libcilkrts/runtime/os-unix.c: In function '__cilkrts_hardware_cpu_count':
> ../.././libcilkrts/runtime/os-unix.c:386:2: error: #error "Unknown architecture"
> #error "Unknown architecture"
> ^
> Makefile:691: recipe for target 'os-unix.lo' failed
>
> My questions are, is this what I should expect in terms of file sizes?:
>
> ian3@jaguar:~/build/guile-2.0.11$ ls -l ~/usr/bin/gcc ~/usr/libexec/gcc/i686-pc-linux-gnu/4.9.0/cc1
> -rwxr-xr-x 3 ian3 ian3 2538426 2014-08-03 01:18 /home/ian3/usr/bin/gcc
> -rwxr-xr-x 1 ian3 ian3 66149541 2014-08-03 01:18 /home/ian3/usr/libexec/gcc/i686-pc-linux-gnu/4.9.0/cc1
> ian3@jaguar:~/build/guile-2.0.11$
>
> And is there any way to disable the Intel library? The fact that the
> first stage bootstrap works without it indicates that it might be
> possible.
>
> Thanks
> Ian
Re: Fwd: Building gcc-4.9 on OpenBSD
On Thu, Sep 18, 2014 at 5:37 PM, Ian Grant wrote: > > On Thu, Sep 18, 2014 at 5:22 PM, Tobias Ulmer wrote: >> >> On Wed, Sep 17, 2014 at 01:26:48PM -0400, Ian Grant wrote: >> > The reason I'm doing this is that I want to understand why the total >> > size of the binaries grew from around 10MB (gcc v 4.5) to over 70MB in >> > 4.9 >> >> There is indeed a problem with huge binaries on OpenBSD/arm, which I've >> not yet figured out, but i386/amd64/sparc64 are fine. > > > I don't have huge binaries (c.f. "Buster Gonad and his Infeasibly Large > Testicles") on Open BSD. I have them on i686-pc-linux-gnu. > > -rwxr-xr-x 1 ian3 ian3 64M 2014-08-03 01:18 cc1 > -rwxr-xr-x 1 ian3 ian3 65M 2014-08-03 01:18 cc1obj > -rwxr-xr-x 1 ian3 ian3 68M 2014-08-03 01:18 cc1plus > -rwxr-xr-x 1 ian3 ian3 1.8M 2014-08-03 01:18 collect2 > -rwxr-xr-x 1 ian3 ian3 65M 2014-08-03 01:18 f951 > >> Are you trying to build gcc from the vanilla sources? If so, you're in >> for a treat... > > > I didn't know there was chocolate source! Where is it?! And why is it a > secret? > > Ian >
gcc-4.8-20140918 is now available
Snapshot gcc-4.8-20140918 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20140918/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.8 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch revision 215364 You'll find: gcc-4.8-20140918.tar.bz2 Complete GCC MD5=c7ec8bf43b10eb40b650e1c6f7fa733b SHA1=8d6fe878bcd315918aadceb45a12aa200a6f99e4 Diffs from 4.8-20140911 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.8 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: Fwd: Building gcc-4.9 on OpenBSD
On Thu, Sep 18, 2014 at 5:22 PM, Tobias Ulmer wrote:
> On Wed, Sep 17, 2014 at 01:26:48PM -0400, Ian Grant wrote:
>> I can compile the first stage OK, and the binaries are quite modest:
>>
>> -rwxr-xr-x 1 ian ian 17.2M Sep 6 03:47 prev-gcc/cc1
>> -rwxr-xr-x 1 ian ian 1.2M Sep 6 04:24 prev-gcc/cpp
>> -rwxr-xr-x 1 ian ian 1.2M Sep 6 04:24 prev-gcc/xgcc
>
> Gcc 4.9 binaries on OpenBSD/amd64 are reasonable:
>
> -r-xr-xr-x 1 root bin 11.6M Sep 9 03:02 cc1
> -r-xr-xr-x 1 root bin 15.4M Sep 9 03:02 gnat1
> -r-xr-xr-x 1 root bin 749K Sep 9 03:02 ecpp

I think we need to be able to explain this. It's an increase of over 60%; I wouldn't expect that to be due to the relative inefficiency of Intel instruction encoding over AMD. And it is not due to the inclusion of libcilkrts (it's much easier to say "Intel library", how many other libraries are there in GCC that were written by Intel?) because that is not in the stage1 bootstrap.

Ian
Re: Fwd: Building gcc-4.9 on OpenBSD
On 18 September 2014 23:46, Ian Grant wrote: > On Thu, Sep 18, 2014 at 5:22 PM, Tobias Ulmer wrote: >> On Wed, Sep 17, 2014 at 01:26:48PM -0400, Ian Grant wrote: >>> I can compile the first stage OK, and the binaries are quite modest: >>> >>> -rwxr-xr-x 1 ian ian 17.2M Sep 6 03:47 prev-gcc/cc1 >>> -rwxr-xr-x 1 ian ian 1.2M Sep 6 04:24 prev-gcc/cpp >>> -rwxr-xr-x 1 ian ian 1.2M Sep 6 04:24 prev-gcc/xgcc >> >> Gcc 4.9 binaries on OpenBSD/amd64 are resonable: >> >> -r-xr-xr-x 1 root bin11.6M Sep 9 03:02 cc1 >> -r-xr-xr-x 1 root bin15.4M Sep 9 03:02 gnat1 >> -r-xr-xr-x 1 root bin 749K Sep 9 03:02 ecpp > > I think we need to be able to explain this. It's an increase of over > 60%, I wouldn't expect that to be due to the relative ineffiiciency of > Intel instruction encoding over AMD. And it is not due to the > inclusion of libsylkrts (it's much easier to say "Intel library", how > many other libraries are there in GCC that were written by Intel?) liboffload might get added soon. > because that is not in the stage1 bootstrap. Are you looking at stripped binaries or unstripped? Have you compared the binaries using size(1) instead of ls(1)?
Re: Fwd: Building gcc-4.9 on OpenBSD
On Thu, Sep 18, 2014 at 6:54 PM, Jonathan Wakely wrote: > On 18 September 2014 23:46, Ian Grant wrote: >> On Thu, Sep 18, 2014 at 5:22 PM, Tobias Ulmer wrote: >>> On Wed, Sep 17, 2014 at 01:26:48PM -0400, Ian Grant wrote: I can compile the first stage OK, and the binaries are quite modest: -rwxr-xr-x 1 ian ian 17.2M Sep 6 03:47 prev-gcc/cc1 -rwxr-xr-x 1 ian ian 1.2M Sep 6 04:24 prev-gcc/cpp -rwxr-xr-x 1 ian ian 1.2M Sep 6 04:24 prev-gcc/xgcc >>> >>> Gcc 4.9 binaries on OpenBSD/amd64 are resonable: >>> >>> -r-xr-xr-x 1 root bin11.6M Sep 9 03:02 cc1 >>> -r-xr-xr-x 1 root bin15.4M Sep 9 03:02 gnat1 >>> -r-xr-xr-x 1 root bin 749K Sep 9 03:02 ecpp >> >> I think we need to be able to explain this. It's an increase of over >> 60%, I wouldn't expect that to be due to the relative ineffiiciency of >> Intel instruction encoding over AMD. And it is not due to the >> inclusion of libsylkrts (it's much easier to say "Intel library", how >> many other libraries are there in GCC that were written by Intel?) > > liboffload might get added soon. I don't know what that is. I'll look it up later maybe. >> because that is not in the stage1 bootstrap. > > Are you looking at stripped binaries or unstripped? I don't know. How should I find out, read the Makefile? :-) Doesn't the stage-1 get stripped? I'm not a GCC developer, I'm a 'user.' > Have you compared the binaries using size(1) instead of ls(1)? Yes, they're a lot smaller. Are you suggesting the filesystem size is just holes in the file? I would want to know what data is in there. Think of this as a security audit. Ian
Re: Fwd: Building gcc-4.9 on OpenBSD
On Thu, Sep 18, 2014 at 6:54 PM, Jonathan Wakely wrote:
> On 18 September 2014 23:46, Ian Grant wrote:
>> On Thu, Sep 18, 2014 at 5:22 PM, Tobias Ulmer wrote:
> Have you compared the binaries using size(1) instead of ls(1)?

Actually, when I look at the output of size I realise I don't know what it means:

ian3@jaguar:~/usr/libexec/gcc$ size i686-pc-linux-gnu/4.9.0/{cc1,f951}
   text    data     bss     dec     hex filename
14965183   23708  744944 15733835  f0144b i686-pc-linux-gnu/4.9.0/cc1
15882830   29264  750832 16662926  fe418e i686-pc-linux-gnu/4.9.0/f951

The phrase "dangerous GNU crap" comes to mind :-)

Ian
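For readers puzzled by the same columns: in the Berkeley-style output of size(1), dec is the sum of text, data and bss, and hex is that same total in hexadecimal. The figures above fit that reading (for cc1: text 14965183, data 23708, bss 744944). Sections that are not loaded at run time, such as debug info, are not counted, which is one reason size can report far less than ls. A small C sketch of the arithmetic follows; the struct and function names are made up for this note.

```c
#include <stddef.h>
#include <stdio.h>

/* One row of Berkeley-style size(1) output (hypothetical type). */
struct size_row {
    unsigned long text; /* code and read-only data */
    unsigned long data; /* initialised writable data */
    unsigned long bss;  /* zero-initialised data, takes no space in the file */
};

/* The "dec" column: the sum of the three section sizes. */
static unsigned long size_total(struct size_row r)
{
    return r.text + r.data + r.bss;
}

/* The "hex" column: the same total, printed in hexadecimal. */
static void size_hex(struct size_row r, char *out, size_t out_len)
{
    snprintf(out, out_len, "%lx", size_total(r));
}
```

With the cc1 row from the listing, size_total gives 15733835 and size_hex gives f0144b, matching the last two numeric columns.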
Re: Fwd: Building gcc-4.9 on OpenBSD
On 19 September 2014 00:07, Ian Grant wrote:
>
> Actually, when I look at the output of size I realise I don't know
> what it means:
>
> ian3@jaguar:~/usr/libexec/gcc$ size i686-pc-linux-gnu/4.9.0/{cc1,f951}
>    text    data     bss     dec     hex filename
> 14965183   23708  744944 15733835  f0144b i686-pc-linux-gnu/4.9.0/cc1
> 15882830   29264  750832 16662926  fe418e i686-pc-linux-gnu/4.9.0/f951
>
> The phrase "dangerous GNU crap" comes to mind :-)

If you say so. The size command is older than GNU, or BSD for that matter. Your OS probably has a man page for it.
Re: Fwd: Building gcc-4.9 on OpenBSD
On Thu, Sep 18, 2014 at 8:32 PM, Jonathan Wakely wrote:
>> ian3@jaguar:~/usr/libexec/gcc$ size i686-pc-linux-gnu/4.9.0/{cc1,f951}
>>    text    data     bss     dec     hex filename
>> 14965183   23708  744944 15733835  f0144b i686-pc-linux-gnu/4.9.0/cc1
>> 15882830   29264  750832 16662926  fe418e i686-pc-linux-gnu/4.9.0/f951
>>
>> The phrase "dangerous GNU crap" comes to mind :-)
> If you say so. The size command is older than GNU, or BSD for that matter.

It's OK. A GNU can't be blamed for crapping, it's natural :-).

> Your OS probably has a man page for it.

It does. "Copyright (c) 1991-2013 Free Software Foundation, Inc." But the man page doesn't tell me what the column headings actually mean. And even if it did, why should I have to look up the manual to find out what the headings mean? What's the point of headings? Since they aren't even aligned, it might just as well leave them out altogether: they're just a waste of screen space.

But I don't want to argue about GNU crap, it's a natural and understandable phenomenon, and not particularly interesting. I want to know what's in the GCC binaries. So let's focus on that.

What was the reason you suggested I look at the output of the size command? What does that tell me about what is the cause of the holes in the file, or the extra padding, or whatever it is you think is the explanation for this phenomenon?

Ian
Re: Fwd: Building gcc-4.9 on OpenBSD
In case it isn't obvious, what I am interested in is how easily we can know the problem of infeasibly large binaries isn't an instance of this one: http://livelogic.blogspot.com/2014/08/beware-insiduous-penetrator-my-son.html Ian
RE: Fwd: Building gcc-4.9 on OpenBSD
(delurking)

Ian Grant writes:
> In case it isn't obvious, what I am interested in is how easily we can know
> the problem of infeasibly large binaries isn't an instance of this one:
>
> http://livelogic.blogspot.com/2014/08/beware-insiduous-penetrator-my-son.html

Ah, this is commonly called the Thompson hack, since Ken Thompson actually produced a successful demo:

http://www.win.tue.nl/~aeb/linux/hh/thompson/trust.html

The only way that the Thompson hack can survive a three-stage bootstrap is if the compiler used for the stage 1 build has the bad code. The comparison between stages 2 and 3 requires an exact match, and any imperfection in the object code injection would reveal itself.

So, you can build GCC with LLVM or Intel's compiler or Microsoft's or IBM's or Sun's, doing cross-compilation where necessary.

The basic idea is:

1: Build gcc with 3-stage bootstrap, starting with a compiler that you suspect might be infected. Call the result A.
2: Do it again, starting with a different compiler that you think is independent of the compiler you used in step 1. Call it B.
3: Compare A to B. If they differ, you've found something that should be investigated. If they don't, then either A and B are both clean, or A and B both have the identical inserted object code. Maybe they have a common ancestor?

Note that if you build gcc with a cross-compiler the object code will be different. You have to use the cross-compiler to build one more time to "normalize": GCC 4.9.0 built with GCC 4.9.0 on operating system X should always be the same.

As far as I know no one has been paranoid enough to put in the time to do the experiment on a large scale, and it's harder because you can't build a modern GCC (or LLVM for that matter) with an ancient compiler. But you can create a chain: grab an ancient gcc version off a 15-year-old CD, and build newer versions with it until you get up to the present. The result should be byte-for-byte identical with what you get when building the current compiler with a recent version. If it is, then either the infection is 15 years old or does not exist.

Try it again by building cross-compilers from a Microsoft system. Don't trust Apple: they used to use GCC, so maybe all their LLVM binaries caught the bug.

BTW, if "size" is reporting much smaller size than the executable file itself and that motivates this concern, most of the difference is likely to be debug info, which is bigger since gcc switched to C++. Might want to try "strip".
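Step 3 above (compare A to B) amounts to a byte-for-byte file comparison, applied to each installed binary in turn. Here is a minimal sketch in C, assuming both builds are available as ordinary files. `files_identical` is a hypothetical helper written for this note and is not part of GCC's build system.

```c
#include <stdio.h>
#include <string.h>

/* Compare two files byte for byte.
   Returns 1 if they are identical, 0 if they differ,
   and -1 if either file cannot be opened or read. */
static int files_identical(const char *path_a, const char *path_b)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    int result = 1;

    if (!fa || !fb) {
        result = -1;
        goto out;
    }

    for (;;) {
        unsigned char buf_a[4096], buf_b[4096];
        size_t na = fread(buf_a, 1, sizeof buf_a, fa);
        size_t nb = fread(buf_b, 1, sizeof buf_b, fb);

        /* Different chunk lengths or different bytes: not identical. */
        if (na != nb || memcmp(buf_a, buf_b, na) != 0) {
            result = 0;
            break;
        }
        /* Both files ended at the same offset with no difference seen. */
        if (na == 0)
            break;
    }

    /* A short read caused by an I/O error is reported, not treated as a difference. */
    if (ferror(fa) || ferror(fb))
        result = -1;

out:
    if (fa)
        fclose(fa);
    if (fb)
        fclose(fb);
    return result;
}
```

A call such as files_identical("A/cc1", "B/cc1") (the paths are placeholders) returns 1 only when the two builds produced the same bytes. It says nothing about why two files differ; embedded build paths or timestamps can also produce differences, so a mismatch is a prompt to investigate rather than proof of tampering.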
Re: Fwd: Building gcc-4.9 on OpenBSD
On Thu, Sep 18, 2014 at 9:37 PM, Joe Buck wrote: > (delurking) > Ah, this is commonly called the Thompson hack, since Ken Thompson > actually produced a successful demo: How do you know Thompson's attempt was the first instance? The document I refer to in the blog is the "Unknown Air Force Report" Thompson refers to. It was written by Roger Schell (cc'ed) > http://www.win.tue.nl/~aeb/linux/hh/thompson/trust.html > The only way that the Thompson hack can survive a three-stage > bootstrap is if the compiler used for the stage 1 build has the bad > code. This is the overwhelmingly likely (probability 1) case. How else would the stage-2 and three compilers get the bad code? > The comparison between stages 2 and 3 require exact match, > and any imperfection in the object code injection would reveal itself. How? In the output of a utility, or a system device driver, on a system booted from a boot loader and using standard libraries such libc, all compiled by the same bug in the compilers which compiled the stage 1, 2 and 3 C compilers? > So, you can build GCC with LLVM or Intel's compiler or Microsoft's or IBM's > or Sun's, doing cross-compilation where necessary. Do these compilers all support cross-OS compilation to any OS? It sounds a bit hard to me. I just can't imagine MS, say, going to a great deal of trouble to make sure that their compiler targets Linux and OpenBSD. GCC needs quite a lot of library and OS support, doesn't it? People will have to help me a bit with this, I've not yet managed to cross-compile anything. This thread started because I was just trying to build gcc from Vanilla gcc-4.9 sources on OpenBSD, and it doesn't work. See the earlier messages. I was next going to try to build gcc-4.9 on OpenBSD, cross-targetting Linux on the same physical machine (i.e. same CPU) but I don't imagine this will be at all easy, given I can't even build the vanilla sources. People say there is chocolate source, but no-one has told me where it is yet! 
> The basic idea is:
>
> 1: build gcc with 3-stage bootstrap, starting with a compiler that you
> suspect might be infected. call the result A.
> 2: do it again, starting with a different compiler that you think is
> independent of the compiler you used in step 1. call it B.
> 3: compare A to B. If they differ, you've found something that should
> be investigated. If they don't, then either A and B are both clean, or A
> and B both have the identical inserted object code. Maybe they have
> a common ancestor?
>
> Note that if you build gcc with a cross-compiler the object code will be
> different.
> You have to use the cross-compiler to build one more time to "normalize":
> GCC 4.9.0 built with GCC 4.9.0 on operating system X should always be
> the same.

Yes, but the problem is when the object code bug is not in the compiler binaries: it's something injected into the compiler binaries from the infected ld.so, or glibc, or the IDE disk device driver, and it infects the source to those programs.

> As far as I know no one has been paranoid enough to put in the time to do
> the experiment on a large scale, and it's harder because you can't build
> a modern GCC (or LLVM for that matter) with an ancient compiler. But
> you can create a chain: grab an ancient gcc version off a 15-year-old CD,

When did you last try grabbing an ancient gcc off a 15-year-old CD and getting it to run on a modern OS? Was it easy?

> and build newer versions with it until you get up to the present.

And the rest of the chain, are they easier still?

> The result should be byte-for-byte identical with what you get when
> building the current compiler with a recent version.

And what does that tell me, really?

> If it is, then either the infection is 15 years old or does not exist.

How do you figure that?

> Try it again by building cross-compilers from a Microsoft system.
> Don't trust Apple, they used to use GCC so maybe all their LLVM
> binaries caught the bug.

Interesting idea.
> BTW, if "size" is reporting much smaller size than the executable
> file itself and that motivates this concern, most of the difference
> is likely to be debug info, which is bigger since gcc switched to
> C++. Might want to try "strip".

Great. As I said, the exercise we are here engaged in is to convince as many people as possible that GCC does NOT suffer from this problem on any OS, either OS, Windows, OpenBSD, FreeBSD, Solaris, or Linux on any arch., including IBM System z. So can someone tell me the quickest way to build a new set of binaries, stripped, or just how to tell whether the stage-1 binaries are in fact stripped or not?

And can anyone tell me what are the 'non-vanilla' sources?

Ian
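The comparison in step 3 of the quoted procedure (compare result A to result B) can be sketched as a byte-for-byte check over two install trees. This is only a minimal illustration, assuming the two bootstrap results have been installed into separate directories; the function names `tree_digests` and `compare_trees` and the directory layout are hypothetical and not part of GCC or of anything proposed in the thread.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map every regular file under root (by relative path) to its SHA-256 digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(a_root, b_root):
    """Return (only_in_a, only_in_b, differing) as sorted lists of relative paths.

    a_root and b_root are hypothetical install prefixes for builds A and B.
    """
    a = tree_digests(a_root)
    b = tree_digests(b_root)
    only_in_a = sorted(set(a) - set(b))
    only_in_b = sorted(set(b) - set(a))
    differing = sorted(name for name in set(a) & set(b) if a[name] != b[name])
    return only_in_a, only_in_b, differing
```

Three empty lists mean the two trees are byte-for-byte identical. As the thread itself points out, that shows only that A and B agree, not that both are clean; a non-empty `differing` list just flags files to investigate.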
RE: Fwd: Building gcc-4.9 on OpenBSD
> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of
> Ian Grant
>
> And can anyone tell me what are the 'non-vanilla' sources?

"Vanilla source" refers to unmodified source (in the case of gcc, the source as distributed on gcc.gnu.org). This is in contrast to modified source, for instance from a distribution, which will usually add some patches.

Best regards,

Thomas
Re: [RFC] Add asm constraint modifier to mark strict memory accesses
On 09/18/2014 09:33 PM, Dmitry Vyukov wrote:
> What is the number of cases it will fix for kasan?

Re-added kernel people again.

AFAIR, silly instrumentation that assumed all memory accesses in inline asm are must-accesses (instead of may-accesses) resulted in only one false positive. We haven't performed extensive testing, though.

> It won't fix the memchr function because the size is indeed not known
> statically. So it's a bad example.

Sure, we will _not_ be able to instrument memchr. But being able to identify "safe" inline asms would allow us to instrument those (and my gut feeling is that they are a vast majority).

> My impression was that kernel has relatively small amount of assembly,

Well,

  $ grep -r '"[=+]\?[moVv<>]" *(' ~/src/linux-stable/ | wc -l
  1133

And also

  $ grep -r '"[=+]\?[moVv<>]" *(' ~/src/ffmpeg-2.2.2/ | wc -l
  211

> And the rest is just not interesting enough.

Now that may be the case. But how do we know without trying?

-Y
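For readers without the same shell setup, the grep count above can be approximated with a short script. This is a rough sketch, assuming the basic-regex pattern translates to the Python expression below; the name `count_matching_lines` is hypothetical. Like `grep ... | wc -l`, it counts matching lines rather than individual matches, though its handling of binary files differs from grep's, so the totals may not agree exactly.

```python
import re
from pathlib import Path

# Python translation of the grep pattern '"[=+]\?[moVv<>]" *(':
# a quoted memory-style constraint (optionally prefixed with = or +)
# followed by optional spaces and an opening parenthesis.
MEM_CONSTRAINT = re.compile(r'"[=+]?[moVv<>]" *\(')


def count_matching_lines(root):
    """Count lines under root that contain an inline-asm memory constraint."""
    total = 0
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            with path.open(encoding="utf-8", errors="replace") as handle:
                total += sum(1 for line in handle if MEM_CONSTRAINT.search(line))
        except OSError:
            # Unreadable files are skipped in this sketch.
            continue
    return total
```

Calling `count_matching_lines` on a kernel or ffmpeg checkout gives a figure comparable to the ones quoted in the message. It is still only an upper-level estimate of how many inline asm statements have memory operands, since the pattern is purely textual.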