On Tue, Nov 9, 2021 at 11:02 AM Siddhesh Poyarekar <siddh...@gotplt.org> wrote: > > > This patchset implements the __builtin_dynamic_object_size builtin for > gcc. The primary motivation to have this builtin in gcc is to enable > _FORTIFY_SOURCE=3 support with gcc, thus allowing greater fortification > in use cases where the potential performance tradeoff is acceptable. > > Semantics: > ---------- > > __builtin_dynamic_object_size has the same signature as > __builtin_object_size; it accepts a pointer and type ranging from 0 to 3 > and it returns an object size estimate for the pointer based on an > analysis of which objects the pointer could point to. The actual > properties of the object size estimate are different: > > - In the best case __builtin_dynamic_object_size evaluates to an > expression that represents a precise size of the object being pointed > to. > > - In case a precise object size expression cannot be evaluated, > __builtin_dynamic_object_size attempts to evaluate an estimate size > expression based on the object size type. > > - In what situations the builtin returns an estimate vs a precise > expression is an implementation detail and may change in future. > Users must always assume, as in the case of __builtin_object_size, that > the returned value is the maximum or minimum based on the object size > type they have provided. > > - In the worst case of failure, __builtin_dynamic_object_size returns a > constant (size_t)-1 or (size_t)0. > > Implementation: > --------------- > > - The __builtin_dynamic_object_size support is implemented in > tree-object-size. In most cases the first pass (early_objsz) is able > to evaluate object size expressions. The second phase mainly ends up > simplifying the __builtin_dynamic_object_size to > __builtin_object_size. > > - The patchset begins with structural modification of the > tree-object-size pass, followed by enhancement to return size > expressions. I have split the implementation into one feature per > patch (calls, function parameters, PHI, etc.) to hopefully ease > review. > > Performance: > ------------ > > Expressions generated by this pass in theory could be arbitrarily > complex. I have not made an attempt to limit nesting of objects since > it seemed too early to do that. In practice based on the few > applications I built, most of the complexity of the expressions got > folded away. Even so, the performance overhead is likely to be > non-zero. If we find performance degradation to be significant, we > could later add nesting limits to bail out if a size expression gets too > complex. > > I have also not implemented simplification of __*_chk to their normal > variants if we can determine at compile time that it is safe, which > still depends on the object size to be constant. I hope to do this as a > minor performance improvement in stage 3. > > Build time performance doesn't seem to be affected much based on an > unscientific check to time > `make check-gcc RUNTESTFLAGS="dg.exp=builtin*"`. It only increases by > about a couple of seconds when the dynamic tests are added and remains > more or less in the same ballpark otherwise. > > Testing: > -------- > > I have added tests for dynamic object sizes as well as wrappers for all > __builtin_object_size tests to provide wide coverage. With that in > place I have run full bootstrap builds on Fedora rawhide by backporting the > patches to the gcc11 branch and x86_64 and i686 have no failures in any > of the builtin-*object-size* tests and no new failures. > > I have also built bash, cmake, zfs-fuse and systemtap with > _FORTIFY_SOURCE=3 (with a hacked up glibc to make sure it works) and saw > no issues in any of those builds. I did some rudimentary analysis of > the generated binaries to see if there was any difference in coverage > and found that there was. In terms of pure numbers, there were far more > _chk variants of calls than the regular ones due to _FORTIFY_SOURCE > (from about 4% to 70% in bash), but that could well be due to the _chk > variants not being folded into regular variants when safe. However, on > manual inspection of some of these sites, it was clear that coverage was > increasing significantly where __builtin_object_size was previously > bailing out. > > Specifically for bash, the coverage went from 30.81% to 86.87% (it was > 84.5% with the v1 patch). I actually hope to reduce this a bit with > folding improvements for __builtin___memcpy_chk, etc. > > A bootstrap build is in progress on x86_64. > > Limitations/Future work: > ------------------------ > > - The most immediate work is to fold _chk variants of builtins into > regular ones when it can be proven at compile time that the object > size will alwasy be less than the length of the write. I am working > on it right now. > > - I need to enable _FORTIFY_SOURCE=3 for gcc in glibc; currently it is > llvm-only. I've started working on these patches too on the side. > > - Instead of bailing out on non-constant sizes with > __builtin_object_size, it should be possible to use ranger to > get an upper and lower bound on the size expression and use that to > implement __builtin_object_size.
When I was implementing improvements into phiopt I ran into case where objsz would fail now because we get: tmp = PHI <CST0, CST1> ptr = ptr + tmp where before the pointer plus was inside each branch instead. So my question is there any progress on implementing objsz with ranger or has that work been put off? I filed https://gcc.gnu.org/PR116556 for this. Do you have any start of patches for this so it maybe it could be taken to finish? If not then I am going to try to implement it since it blocks my work on phiopt. Thanks, Andrew Pinski > > - More work could to be done to reduce the performance impact of the > computation. One way could be to add a heuristic where the pass keeps > track of nesting in the expression and either bail out or compute an > estimate if nesting crosses a threshold. I'll take this up once we > have more data on the nature of the bottlenecks. > > > Siddhesh Poyarekar (10): > tree-object-size: Replace magic numbers with enums > tree-object-size: Abstract object_sizes array > tree-object-size: Use tree instead of HOST_WIDE_INT > tree-object-size: Single pass dependency loop resolution > __builtin_dynamic_object_size: Recognize builtin > tree-object-size: Support dynamic sizes in conditions > tree-object-size: Handle function parameters > tree-object-size: Handle GIMPLE_CALL > tree-object-size: Dynamic sizes for ADDR_EXPR > tree-object-size: Handle dynamic offsets > > gcc/builtins.c | 22 +- > gcc/builtins.def | 1 + > gcc/doc/extend.texi | 13 + > gcc/gimple-fold.c | 9 +- > .../g++.dg/ext/builtin-dynamic-object-size1.C | 5 + > .../g++.dg/ext/builtin-dynamic-object-size2.C | 5 + > .../gcc.dg/builtin-dynamic-alloc-size.c | 7 + > .../gcc.dg/builtin-dynamic-object-size-0.c | 463 +++++ > .../gcc.dg/builtin-dynamic-object-size-1.c | 7 + > .../gcc.dg/builtin-dynamic-object-size-10.c | 9 + > .../gcc.dg/builtin-dynamic-object-size-11.c | 7 + > .../gcc.dg/builtin-dynamic-object-size-12.c | 5 + > .../gcc.dg/builtin-dynamic-object-size-13.c | 5 + > .../gcc.dg/builtin-dynamic-object-size-14.c | 5 + > .../gcc.dg/builtin-dynamic-object-size-15.c | 5 + > .../gcc.dg/builtin-dynamic-object-size-16.c | 7 + > .../gcc.dg/builtin-dynamic-object-size-17.c | 8 + > .../gcc.dg/builtin-dynamic-object-size-18.c | 8 + > .../gcc.dg/builtin-dynamic-object-size-19.c | 104 + > .../gcc.dg/builtin-dynamic-object-size-2.c | 7 + > .../gcc.dg/builtin-dynamic-object-size-3.c | 7 + > .../gcc.dg/builtin-dynamic-object-size-4.c | 7 + > .../gcc.dg/builtin-dynamic-object-size-5.c | 8 + > .../gcc.dg/builtin-dynamic-object-size-6.c | 5 + > .../gcc.dg/builtin-dynamic-object-size-7.c | 5 + > .../gcc.dg/builtin-dynamic-object-size-8.c | 5 + > .../gcc.dg/builtin-dynamic-object-size-9.c | 5 + > gcc/testsuite/gcc.dg/builtin-object-size-1.c | 160 +- > gcc/testsuite/gcc.dg/builtin-object-size-16.c | 2 + > gcc/testsuite/gcc.dg/builtin-object-size-17.c | 2 + > gcc/testsuite/gcc.dg/builtin-object-size-2.c | 134 ++ > gcc/testsuite/gcc.dg/builtin-object-size-3.c | 151 ++ > gcc/testsuite/gcc.dg/builtin-object-size-4.c | 99 + > gcc/testsuite/gcc.dg/builtin-object-size-5.c | 12 + > gcc/tree-object-size.c | 1766 +++++++++++------ > gcc/tree-object-size.h | 3 +- > gcc/ubsan.c | 46 +- > 37 files changed, 2499 insertions(+), 620 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/ext/builtin-dynamic-object-size1.C > create mode 100644 gcc/testsuite/g++.dg/ext/builtin-dynamic-object-size2.C > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-alloc-size.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-1.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-10.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-11.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-12.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-13.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-14.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-15.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-16.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-17.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-18.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-19.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-2.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-3.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-4.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-5.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-6.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-7.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-8.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-9.c > > -- > 2.31.1 >