On Wed, May 9, 2012 at 11:48 AM, Dehao Chen <de...@google.com> wrote:
> On Wed, May 9, 2012 at 5:22 PM, Richard Guenther
> <richard.guent...@gmail.com> wrote:
>> On Wed, May 9, 2012 at 10:38 AM, Dehao Chen <de...@google.com> wrote:
>>> On Wed, May 9, 2012 at 4:12 PM, Richard Guenther
>>> <richard.guent...@gmail.com> wrote:
>>>> On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li <davi...@google.com>
>>>> wrote:
>>>>> To be clear, this flag is for malloc implementations (such as tcmalloc)
>>>>> with side effects unknown to the compiler. Using -fno-builtin-xxx is
>>>>> too conservative for that purpose.
>>>>
>>>> I don't think that flies. Btw, the patch also guards alloca - alloca is
>>>> purely GCC internal.
>>>>
>>>> What are the "unknown side-effects" that are also important to preserve
>>>> for free(malloc(4))?
>>>
>>> A malloc implementation may record some info in a global structure, and
>>> a program may use this free(malloc()) pair to simulate real runs
>>> to get some data, such as the peak memory requirement.
>>
>> So why not use an alternate interface into this special allocator for this
>> purpose?
>
> There can be the following scenario:
>
> We want to add a module to an existing app. Before implementing the
> module, we want to collect some statistics on real runs. In this
> scenario, we need:
>
> * No change to the legacy code
> * Optimized build for the simulation run
> * Provide accurate statistical info
>
> We want to collect data for both the new module and the legacy code
> without changing the latter, thus we cannot use a new malloc/free
> interface.

So put in an optimization barrier then.  Like

  free (({ void *x = malloc (4); __asm ("" : "+m" (x)); x; }));

Btw, why can't you simply build the new module (which doesn't do anything
real anyway, just fake stuff) without optimization?

> Thanks,
> Dehao
>
>>
>>> Dehao
>>>
>>>>
>>>> Richard.
>>>>
>>>>> David
>>>>>
>>>>> On Tue, May 8, 2012 at 7:43 AM, Dehao Chen <de...@google.com> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> This patch adds a flag to guard the optimization that optimizes the
>>>>>> following code away:
>>>>>>
>>>>>> free (malloc (4));
>>>>>>
>>>>>> In some cases, we'd like this type of malloc/free pair to remain in
>>>>>> the optimized code.
>>>>>>
>>>>>> Tested with bootstrap, and no regression in the gcc testsuite.
>>>>>>
>>>>>> Is it ok for mainline?
>>>>>>
>>>>>> Thanks,
>>>>>> Dehao
>>>>>>
>>>>>> gcc/ChangeLog
>>>>>> 2012-05-08  Dehao Chen  <de...@google.com>
>>>>>>
>>>>>>     * common.opt (feliminate-malloc): New.
>>>>>>     * doc/invoke.texi: Document it.
>>>>>>     * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it.
>>>>>>
>>>>>> gcc/testsuite/ChangeLog
>>>>>> 2012-05-08  Dehao Chen  <de...@google.com>
>>>>>>
>>>>>>     * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working
>>>>>>     as expected.
>>>>>>
>>>>>> Index: gcc/doc/invoke.texi
>>>>>> ===================================================================
>>>>>> --- gcc/doc/invoke.texi (revision 187277)
>>>>>> +++ gcc/doc/invoke.texi (working copy)
>>>>>> @@ -360,7 +360,8 @@
>>>>>>  -fcx-limited-range @gol
>>>>>>  -fdata-sections -fdce -fdelayed-branch @gol
>>>>>>  -fdelete-null-pointer-checks -fdevirtualize -fdse @gol
>>>>>> --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol
>>>>>> +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations @gol
>>>>>> +-ffat-lto-objects @gol
>>>>>>  -ffast-math -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
>>>>>>  -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
>>>>>>  -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
>>>>>> @@ -6238,6 +6239,7 @@
>>>>>>  -fdefer-pop @gol
>>>>>>  -fdelayed-branch @gol
>>>>>>  -fdse @gol
>>>>>> +-feliminate-malloc @gol
>>>>>>  -fguess-branch-probability @gol
>>>>>>  -fif-conversion2 @gol
>>>>>>  -fif-conversion @gol
>>>>>> @@ -6762,6 +6764,11 @@
>>>>>>  Perform dead store elimination (DSE) on RTL@.
>>>>>>  Enabled by default at @option{-O} and higher.
>>>>>>
>>>>>> +@item -feliminate-malloc
>>>>>> +@opindex feliminate-malloc
>>>>>> +Eliminate unnecessary malloc/free pairs.
>>>>>> +Enabled by default at @option{-O} and higher.
>>>>>> +
>>>>>>  @item -fif-conversion
>>>>>>  @opindex fif-conversion
>>>>>>  Attempt to transform conditional jumps into branch-less equivalents. This
>>>>>> Index: gcc/testsuite/gcc.dg/free-malloc.c
>>>>>> ===================================================================
>>>>>> --- gcc/testsuite/gcc.dg/free-malloc.c (revision 0)
>>>>>> +++ gcc/testsuite/gcc.dg/free-malloc.c (revision 0)
>>>>>> @@ -0,0 +1,12 @@
>>>>>> +/* { dg-do compile } */
>>>>>> +/* { dg-options "-O2 -fno-eliminate-malloc" } */
>>>>>> +/* { dg-final { scan-assembler-times "malloc" 2} } */
>>>>>> +/* { dg-final { scan-assembler-times "free" 2} } */
>>>>>> +
>>>>>> +extern void * malloc (unsigned long);
>>>>>> +extern void free (void *);
>>>>>> +
>>>>>> +void test ()
>>>>>> +{
>>>>>> +  free (malloc (10));
>>>>>> +}
>>>>>> Index: gcc/common.opt
>>>>>> ===================================================================
>>>>>> --- gcc/common.opt (revision 187277)
>>>>>> +++ gcc/common.opt (working copy)
>>>>>> @@ -1474,6 +1474,10 @@
>>>>>>  Common Var(flag_dce) Init(1) Optimization
>>>>>>  Use the RTL dead code elimination pass
>>>>>>
>>>>>> +feliminate-malloc
>>>>>> +Common Var(flag_eliminate_malloc) Init(1) Optimization
>>>>>> +Eliminate unnecessary malloc/free pairs
>>>>>> +
>>>>>>  fdse
>>>>>>  Common Var(flag_dse) Init(1) Optimization
>>>>>>  Use the RTL dead store elimination pass
>>>>>> Index: gcc/tree-ssa-dce.c
>>>>>> ===================================================================
>>>>>> --- gcc/tree-ssa-dce.c (revision 187277)
>>>>>> +++ gcc/tree-ssa-dce.c (working copy)
>>>>>> @@ -309,6 +309,8 @@
>>>>>>            case BUILT_IN_CALLOC:
>>>>>>            case BUILT_IN_ALLOCA:
>>>>>>            case BUILT_IN_ALLOCA_WITH_ALIGN:
>>>>>> +            if (!flag_eliminate_malloc)
>>>>>> +              mark_stmt_necessary (stmt, true);
>>>>>>              return;
>>>>>>
>>>>>>            default:;
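
For reference, the optimization-barrier idea from the reply at the top of this message can be spelled out as a small compilable function. This is only a sketch, assuming GCC or a compatible compiler, since it relies on GNU inline asm. The name probe_alloc is hypothetical and is not part of the patch or of any allocator's interface. The empty asm takes the pointer as a read-write memory operand, so the compiler has to assume the malloc result is used and possibly changed, which should keep it from treating the malloc/free pair as dead.

```c
#include <stdlib.h>

/* Hypothetical helper (not from the patch): allocate SIZE bytes and free
   them again, with a barrier in between so that an optimizing build is
   not expected to delete the pair.  Returns 1 if malloc succeeded.  */
static int
probe_alloc (size_t size)
{
  void *p = malloc (size);

  /* Empty asm with a "+m" operand: the compiler must assume P is read
     and written in memory here, so the value passed to free below no
     longer visibly comes straight from malloc.  */
  __asm__ volatile ("" : "+m" (p));

  int ok = (p != NULL);
  free (p);
  return ok;
}
```

Only the call sites that need the malloc/free pair preserved have to change, and the rest of the build keeps its normal optimization level. Whether the calls really survive can be checked by looking at the assembly produced at -O2.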