A customer expressed interest in mudflap, so I tried to see if I could
use it compile something large.  I used gcc itself for this test.  I've
attached my notes, and the patches I wrote as part of this.  I will
probably be able to generate a couple of bug reports and/or patches out
of this.
-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

On an x86 FreeBSD 4.7 system, I built and installed mainline gcc, and then
tried to use this compiler to configure with a CFLAGS of "-O2 -g -fmudflap" and
then bootstrap.  What follows is a list of issues I ran into while trying to
get this to work.

Configure dies with the complaint that the C compiler doesn't work.  The
problem here is that -fmudflap does not imply -lmudflap.  This is different
from how other profiling options like -pg work.  Unfortunately, there are
so many different places that link in so many different ways that there is no
way to make this work unless gcc supplies the -lmudflap itself.  I modified
MFLIB_SPEC in gcc.c to include -lmudflap.  I think the decision to force the
user to specify -lmudflap should be revisited.  If the user needs to specify
-lmudflap manually, then there are multiple ways to disable default linker
options.  But if the user needs gcc to do this automatically, there is no way
to work around this if gcc does not.

The fixincludes build failed with link errors for undefined mudflap
functions.  The fixincludes Makefile.in does not use CFLAGS when linking.  I
added $(CFLAGS) to the 3 rules that contain a link command.

I now get a configure failure in the intl directory.  configure is unable
to run programs compiled by $CC.  I gets an error from the dynamic linker
complaining that libgcc.so can't be found.  The problem here is that the
toplevel Makefile support for LD_LIBRARY_PATH is confused.  See the
SET_GCC_LIB_PATH support.  We set LD_LIBRARY_PATH to point to the gcc we are
building, to ensure that we use the new libgcc.so library.  Unfortunately, we
do this before gcc has been built, and use it for host tools built by $CC.  It
should only be used for target libraries built by $CC_FOR_TARGET.  Since I
have a $CC that requires my own LD_LIBRARY_PATH setting, this breaks anything
built by $CC.  There are also redundancies in the code.  We are setting
LD_LIBRARY_PATH to the same value multiple times.  I hacked HOST_EXPORTS to
include both the $CC and $CC_FOR_TARGET library directories.

I had modify the gcc Makefile to include -fmudflap in BOOT_CFLAGS.  Hmm,
maybe putting -fmudflap in CFLAGS was the wrong approach.  Maybe I should
have put it in CC.  Oh well, I don't feel like restarting.

gengtype generates many fopen/fread/fwrite/fclose violations.  The problem
here is that mudflap isn't registering the memory allocated by fopen.  I didn't
check the FreeBSD libraries to see why.  I just hacked mf-hooks2.c in
libmudflap to define MF_REGISTER_fopen to __MF_TYPE_STATIC.  It is nice that
mudflap has a define for this, though it would be even nicer if there was
configure support to enable it conditionally for targets that need it.

A real bug.  In gengtype.c, we have
        note_flds->opt->info = note_insn_name[c];
where c ranges from 0 to NOTE_INSN_MAX, but the array note_insn_name has only
NOTE_INSN_MAX elements.  We must make the array one element larger.

Looking at this mudflap error report, I noticed that the PC values printed
are all inside the libmudflap library, which makes them fairly useless.  It
should be the PC in the program where the error occured.  Looking at the
mf-runtime.c file, I see that mfu_check uses builtin_return_address to get
the address of the caller.  However, it is always called from mf_check, and
thus the address reported is always an address inside the libmudflap library.
Similarly for mfu_register and mf_register.  It looks like someone added an
additional level of function calls for thread support, without realizing that
this broke the PC error message reporting.

gengtype now dies with a recursive call in malloc.  The problem here is
that we call the mudflap malloc, which calls the real malloc, which calls the
mudflap mmap which then calls malloc to register the new pages.  And now we
have a recursive malloc call.  This does not appear to happen on linux.  The
real malloc apparently calls the real mmap instead of the mudflap mmap.
Perhaps due to a difference in shared library construction.  I haven't checked.
Even if the system malloc wasn't complaining here, we would also have the
problem that we are registering the same bit of memory twice, once in the
malloc call and once in the mmap call.  I hacked around this by adding a static
variable in_malloc, which is set to true when calling the real malloc/realloc
routine.  Then, when in mmap, I don't register the pages if in_malloc is true.
Also, there is the same issue for free and munmap when unregistering the
pages.

Now the build gets all of the way to genattrtab, which dies with an out
of memory error after generating 4753 mudflap violations.  There is 16MB of
make output, and a total of 32008 mudflap violations from gengtype to
genattrtab.

The first mudflap violation is for genmddeps, and occurs in in
vconcat_length from libiberty/concat.c.  Mudflap is complaining about the
stack reads generated by the va_arg calls.  The code looks OK.  This seems
to be a mudflap/stdarg integration bug.  I suspect it is confused by the
va_list args parameter being passed to vconcat_length from the concat function.
This problem is responsible for 2160 of the mudflap violations.

The next one comes from genflags.  We get complaints about the string reads
in hash_c_test.  These strings are static strings embedded in the static
insn_conditions array in the insn-conditions.c file.  Apparently, these
static strings never got registered as objects.  hash_c_test generates 2710
warnings.  cmp_c_test generates 17036 warnings.  maybe_eval_c_test generates
11182 warnings.

That accounts for all of the warnings I got so far.

That brings us back to genattrtab, which dies after 4753 mudflap violations
with an out of memory error.
    out of memory allocating 12 bytes after a total of 536322520 bytes
It seems I have a 512MB data size limit, which isn't enough in this case.
An uninstrumented one needs about 67MB memory total.  Unfortunately, I can't
tell how an instrumented one needs, as the 512MB limit is a hard limit that I
can't raise.  I did manage to run genattrtab using MUDFLAP_OPTIONS=-mode-nop.
It only needed about 200MB then.

Continuing the build, this time I got as far as the crtbegin.o before dying.
cc1 generated 90060 mudflap violations before it died.  There is 104MB of make
output containing 209742 mudflap violations.

Many of these are ones I've already looked at.  libiberty concat generates
9376 warnings.  hash_c_test generates 3252 warnings.  cmp_c_test generates
16588 warnings.  maybe_eval_c_test generates 10385 warnings.

There are 16 complaints about "strcasecmp 1st arg".  This error does not
give a filename and line number like the others, as this one came from a
mudflap hook.  The PC value is useless as I mentioned before.  Also, there is
no indication of which binary generated this violation.  Since this came from
a ./xgcc run, I can't tell by looking at it whether it can from xgcc or from
cc1 or even from the assembler or linker.  This seems like a flaw that needs
to be fixed.  There is only one strcasecmp call in the gcc sources, and it
is in DOS specific code.  I had to run xgcc under the debugger and insert a
break point in __mf_violation to find it.  It comes from gcc_init_libintl in
intl.c.  The data comes from a string returned by nl_langinfo.  Looks like we
need another hook to register this string.

There are 8 complaints about "strlen region".  As above, I can't tell exactly
where they are coming from without using a debugger.  It is a complaint about
the xstrdup (version_string) call in gcc.c.  The static string version_string
did not get registered apparently.

There are 12 complaints about "memcpy source".  6 of them are from the
xstrdup (version_string) as above.  (I don't know why this isn't 8.)  The
other 6 occur at mudflap violation 88084 or later, so I will skip this for
now.  It will take too long to reproduce in a debugger.

There are 159 complaints about find_opt, opts.c line 181.  This seems to be
complaints about reading strings from the cl_options array, which means
another case of static strings inside a static array not being registered as
objects.

There are 283 complaints about handle_option in opts.c, at various line
numbers.  There are all complaints about accesses to the flag and flag_var
fields on the cl_options array.  The flag case is especially curious, as this
is an integer field which should be trivial for mudflap to handle.  Maybe there
is confusion about bit-field accesses, as all of the uses involve checking to
see if one or more bits are set.  I didn't investigate further.

There are 116 complaints about add_standard_paths in c-incpath.c at various
line numbers.  It is complaining about accesses to various fields in the
cpp_include_defaults array, which makes this more of the same.

There are 10 complaints about "strncmp 2nd arg".  I debugged one, and found
it cam from add_standard_paths.  Another field reference in the
cpp_include_defaults array.

There are 4776 complaints about init_attributes.  This is another array of
structures field reference problem.

There are 170539 complaints about decl_attributes in attribs.c at various
line numbers.  These are all references to fields inside attribute_tables.

There are 30 complaints about define__GNUC__ in c-cppbuiltin.  These are
all complaints about references to the static string version_string.

And that is all of the complaints; none of them appear to be valid.

So I got about 240k mudflap violations, and only found one real bug.  Not a
very good ratio, but it looks like there are only a few problems here.

There is still the issue of why cc1 died when building crtbegin.o.  The compile
died with parse errors in /usr/include/stdlib.h.  This is because fixincludes
failed to run properly, but didn't terminate make with an error.  Make doesn't
exit because the fixinc.sh script calls fixincl within backquotes, without
bothering to check for an error return.  In this case, fixincl fails with
a seg fault, and returns the string "Segmentation fault" which then confuses
the script, but doesn't cause it to exit with an error.  As for why fixincl
failed, it turns out that it ran of memory.  If I run it with MUDFLAP_OPTIONS
set to -mode-nop it works, using 300MB of memory.  An uninstrumented one seems
to use about 1MB of memory, but it ran so quickly it was hard to check with
top.

After working around the fixinclude problem, I made it to the stage2 build.
It failed while linking genmodes, because -fmudflap was missing from CFLAGS.
It is required here because I compiled libiberty with -fmudflap.  I got
7.3GB of make output, including 14.7M mudflap violations.  I didn't bother
to look at them, as they are probably mostly the same problems I already
noticed above.
diff -pr orig-gcc/fixincludes/Makefile.in gcc/fixincludes/Makefile.in
*** orig-gcc/fixincludes/Makefile.in	Fri Jan  7 19:16:09 2005
--- gcc/fixincludes/Makefile.in	Tue Feb 22 11:01:24 2005
*************** oneprocess : full-stamp
*** 102,116 ****
  twoprocess : test-stamp $(AF)
  
  full-stamp : $(ALLOBJ) $(LIBIBERTY)
! 	$(CC) $(LDFLAGS) -o $(FI) $(ALLOBJ) $(LIBIBERTY)
  	$(STAMP) $@
  
  test-stamp : $(TESTOBJ) $(LIBIBERTY)
! 	$(CC) $(LDFLAGS) -o $(FI) $(TESTOBJ) $(LIBIBERTY)
  	$(STAMP) $@
  
  $(AF): $(FIXOBJ) $(LIBIBERTY)
! 	$(CC) $(LDFLAGS) -o $@ $(FIXOBJ) $(LIBIBERTY)
  
  $(ALLOBJ)   : $(HDR)
  fixincl.o   : fixincl.c  $(srcdir)/fixincl.x
--- 102,116 ----
  twoprocess : test-stamp $(AF)
  
  full-stamp : $(ALLOBJ) $(LIBIBERTY)
! 	$(CC) $(CFLAGS) $(LDFLAGS) -o $(FI) $(ALLOBJ) $(LIBIBERTY)
  	$(STAMP) $@
  
  test-stamp : $(TESTOBJ) $(LIBIBERTY)
! 	$(CC) $(CFLAGS) $(LDFLAGS) -o $(FI) $(TESTOBJ) $(LIBIBERTY)
  	$(STAMP) $@
  
  $(AF): $(FIXOBJ) $(LIBIBERTY)
! 	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(FIXOBJ) $(LIBIBERTY)
  
  $(ALLOBJ)   : $(HDR)
  fixincl.o   : fixincl.c  $(srcdir)/fixincl.x
diff -pr orig-gcc/gcc/gcc.c gcc/gcc/gcc.c
*** orig-gcc/gcc/gcc.c	Sat Feb 19 14:56:21 2005
--- gcc/gcc/gcc.c	Tue Feb 22 10:51:05 2005
*************** proper position among the other output f
*** 612,618 ****
  }} %{fmudflap|fmudflapth: --wrap=main}"
  #endif
  #ifndef MFLIB_SPEC
! #define MFLIB_SPEC "%{fmudflap|fmudflapth: -export-dynamic}" 
  #endif
  
  /* config.h can define LIBGCC_SPEC to override how and when libgcc.a is
--- 612,618 ----
  }} %{fmudflap|fmudflapth: --wrap=main}"
  #endif
  #ifndef MFLIB_SPEC
! #define MFLIB_SPEC "%{fmudflap|fmudflapth: -export-dynamic -lmudflap}"
  #endif
  
  /* config.h can define LIBGCC_SPEC to override how and when libgcc.a is
diff -pr orig-gcc/gcc/gengtype.c gcc/gcc/gengtype.c
*** orig-gcc/gcc/gengtype.c	Mon Jan 24 14:38:27 2005
--- gcc/gcc/gengtype.c	Tue Feb 22 13:13:59 2005
*************** enum insn_note {
*** 367,373 ****
    NOTE_INSN_MAX
  };
  
! static const char *const note_insn_name[NOTE_INSN_MAX] = {
  #define DEF_INSN_NOTE(NAME) #NAME,
  #include "insn-notes.def"
  #undef DEF_INSN_NOTE
--- 367,375 ----
    NOTE_INSN_MAX
  };
  
! /* We must allocate one more entry here, as we use NOTE_INSN_MAX as the
!    default field for line number notes.  */
! static const char *const note_insn_name[NOTE_INSN_MAX+1] = {
  #define DEF_INSN_NOTE(NAME) #NAME,
  #include "insn-notes.def"
  #undef DEF_INSN_NOTE
diff -pr orig-gcc/libmudflap/mf-hooks1.c gcc/libmudflap/mf-hooks1.c
*** orig-gcc/libmudflap/mf-hooks1.c	Tue Jul  6 13:07:39 2004
--- gcc/libmudflap/mf-hooks1.c	Tue Feb 22 13:12:19 2005
*************** __mf_0fn_malloc (size_t c)
*** 85,90 ****
--- 85,92 ----
  #endif
  
  
+ static int in_malloc = 0;
+ 
  #undef malloc
  WRAPPER(void *, malloc, size_t c)
  {
*************** WRAPPER(void *, malloc, size_t c)
*** 96,102 ****
--- 98,106 ----
    size_with_crumple_zones = 
      CLAMPADD(c,CLAMPADD(__mf_opts.crumple_zone,
  			__mf_opts.crumple_zone));
+   in_malloc = 1;
    result = (char *) CALL_REAL (malloc, size_with_crumple_zones);
+   in_malloc = 0;
    
    if (LIKELY(result))
      {
*************** WRAPPER(void *, calloc, size_t c, size_t
*** 147,153 ****
--- 151,159 ----
      CLAMPADD((c * n), /* XXX: CLAMPMUL */
  	     CLAMPADD(__mf_opts.crumple_zone,
  		      __mf_opts.crumple_zone));  
+   in_malloc = 1;
    result = (char *) CALL_REAL (malloc, size_with_crumple_zones);
+   in_malloc = 0;
    
    if (LIKELY(result))
      memset (result, 0, size_with_crumple_zones);
*************** WRAPPER(void *, realloc, void *buf, size
*** 189,195 ****
--- 195,203 ----
    size_with_crumple_zones = 
      CLAMPADD(c, CLAMPADD(__mf_opts.crumple_zone,
  			 __mf_opts.crumple_zone));
+   in_malloc = 1;
    result = (char *) CALL_REAL (realloc, base, size_with_crumple_zones);
+   in_malloc = 0;
  
    /* Ensure heap wiping doesn't occur during this peculiar
       unregister/reregister pair.  */
*************** WRAPPER(void, free, void *buf)
*** 274,280 ****
--- 282,290 ----
  			     (void *) freeme,
  			     __mf_opts.crumple_zone);
  	    }
+ 	  in_malloc = 1;
  	  CALL_REAL (free, freeme);
+ 	  in_malloc = 0;
  	}
      } 
    else 
*************** WRAPPER(void, free, void *buf)
*** 289,295 ****
--- 299,307 ----
  			 (void *) buf, 
  			 __mf_opts.crumple_zone);
  	}
+       in_malloc = 1;
        CALL_REAL (free, base);
+       in_malloc = 0;
      }
  }
  
*************** WRAPPER(void *, mmap, 
*** 323,329 ****
  		 (uintptr_t) result);
    */
  
!   if (result != (void *)-1)
      {
        /* Register each page as a heap object.  Why not register it all
  	 as a single segment?  That's so that a later munmap() call
--- 335,341 ----
  		 (uintptr_t) result);
    */
  
!   if (result != (void *)-1 && !in_malloc)
      {
        /* Register each page as a heap object.  Why not register it all
  	 as a single segment?  That's so that a later munmap() call
*************** WRAPPER(int , munmap, void *start, size_
*** 372,378 ****
  		 (uintptr_t) result);
    */
  
!   if (result == 0)
      {
        /* Unregister each page as a heap object.  */
        size_t ps = getpagesize ();
--- 384,390 ----
  		 (uintptr_t) result);
    */
  
!   if (result == 0 && !in_malloc)
      {
        /* Unregister each page as a heap object.  */
        size_t ps = getpagesize ();
*************** __mf_wrap_alloca_indirect (size_t c)
*** 422,429 ****
--- 434,443 ----
      {
        struct alloca_tracking *next = alloca_history->next;
        __mf_unregister (alloca_history->ptr, 0, __MF_TYPE_HEAP);
+       in_malloc = 1;
        CALL_REAL (free, alloca_history->ptr);
        CALL_REAL (free, alloca_history);
+       in_malloc = 0;
        alloca_history = next;
      }
  
*************** __mf_wrap_alloca_indirect (size_t c)
*** 431,444 ****
--- 445,464 ----
    result = NULL;
    if (LIKELY (c > 0)) /* alloca(0) causes no allocation.  */
      {
+       in_malloc = 1;
        track = (struct alloca_tracking *) CALL_REAL (malloc, 
  						    sizeof (struct alloca_tracking));
+       in_malloc = 0;
        if (LIKELY (track != NULL))
  	{
+ 	  in_malloc = 1;
  	  result = CALL_REAL (malloc, c);
+ 	  in_malloc = 0;
  	  if (UNLIKELY (result == NULL))
  	    {
+ 	      in_malloc = 1;
  	      CALL_REAL (free, track);
+ 	      in_malloc = 0;
  	      /* Too bad.  XXX: What about errno?  */
  	    }
  	  else
diff -pr orig-gcc/libmudflap/mf-hooks2.c gcc/libmudflap/mf-hooks2.c
*** orig-gcc/libmudflap/mf-hooks2.c	Mon Oct 18 17:50:37 2004
--- gcc/libmudflap/mf-hooks2.c	Tue Feb 22 12:33:46 2005
*************** WRAPPER2(struct tm*, gmtime, const time_
*** 544,550 ****
  /* The following indicate if the result of the corresponding function
   * should be explicitly un/registered by the wrapper
  */
! #undef  MF_REGISTER_fopen
  #define MF_RESULT_SIZE_fopen		(sizeof (FILE))
  #undef  MF_REGISTER_opendir
  #define MF_RESULT_SIZE_opendir		0	/* (sizeof (DIR)) */
--- 544,550 ----
  /* The following indicate if the result of the corresponding function
   * should be explicitly un/registered by the wrapper
  */
! #define MF_REGISTER_fopen		__MF_TYPE_STATIC
  #define MF_RESULT_SIZE_fopen		(sizeof (FILE))
  #undef  MF_REGISTER_opendir
  #define MF_RESULT_SIZE_opendir		0	/* (sizeof (DIR)) */

Reply via email to