hints on debugging memory corruption...

2011-02-04 Thread Basile Starynkevitch
Hello All

(Sorry for such a basic question; very probably there are some GDB tricks
that I am not aware of).

In my MELT branch I now have some corrupted memory (maybe because I am
inserting a pass at the wrong place in the pass tree). At some point, I call
bb_debug, and it crashes because the field bb_next contains 0x101 (which is
not a valid address).

So I need to understand who is writing the 0x101 in that field.

How do you folks debug such issues?

An obvious strategy is to use the hardware watchpoint feature of GDB.
However, one cannot nicely put a watchpoint on an address which is not
mmap-ed yet.

But I don't know how to ask gdb to notify me when a given address
becomes valid in the address space. Putting a breakpoint on mmap is really
no fun.

Any hints are welcome!

Cheers

-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


Re: hints on debugging memory corruption...

2011-02-04 Thread Richard Henderson
On 02/04/2011 07:42 AM, Basile Starynkevitch wrote:
> An obvious strategy is to use the hardware watchpoint feature of GDB.
> However, one cannot nicely put a watchpoint on an address which is not
> mmap-ed yet.

I typically find the location at which the object containing the address
is allocated.  E.g. in alloc_block on the return statement.  Make this
bp conditional on the object you're looking for.


r~


Re: hints on debugging memory corruption...

2011-02-04 Thread Ian Lance Taylor
Basile Starynkevitch  writes:

> In my MELT branch I now have some corrupted memory (maybe because I am
> inserting a pass at the wrong place in the pass tree). At some point, I call
> bb_debug, and it crashes because the field bb_next contains 0x101 (which is
> not a valid address).
>
> So I need to understand who is writing the 0x101 in that field.
>
> How do you folks debug such issues?
>
> An obvious strategy is to use the hardware watchpoint feature of GDB.
> However, one cannot nicely put a watchpoint on an address which is not
> mmap-ed yet.
>
> But I don't know how to ask gdb to notify me when a given address
> becomes valid in the address space. Putting a breakpoint on mmap is really
> no fun.

I usually put a breakpoint on the allocation routine and make it
conditional on returning the address I am interested in.  Once that
address is returned, I can put a hardware watchpoint on the field value
being changed.  This approach is only moderately successful in practice.
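
A minimal sketch of that idea, using a toy bump allocator rather than
GCC's real allocation routines.  Every name here (toy_alloc, alloc_hit,
suspect_addr, demo_alloc_watch) is hypothetical.  The allocator calls an
otherwise empty hook when the block it hands out covers the address of
interest, so a plain breakpoint on the hook can stand in for a
conditional breakpoint on the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* All names here are hypothetical; this is a toy bump allocator,
   not GCC's alloc_block or ggc-page.c.  */

static unsigned char arena[4096];
static size_t arena_used;

/* Address to look for, e.g. copied from an earlier crashing run.  */
static uintptr_t suspect_addr;
static unsigned long alloc_hits;

/* Hook that only records and reports the hit: "break alloc_hit" in
   GDB stops exactly when the interesting block is handed out.  */
void
alloc_hit (void *block, size_t size)
{
  alloc_hits++;
  fprintf (stderr, "block %p (%lu bytes) covers the suspect address\n",
           block, (unsigned long) size);
}

/* Return SIZE bytes from the arena (rounded up to a multiple of 16),
   or NULL when the arena is exhausted.  */
void *
toy_alloc (size_t size)
{
  assert (size > 0);
  size = (size + 15) & ~(size_t) 15;
  if (size > sizeof arena - arena_used)
    return NULL;

  unsigned char *block = arena + arena_used;
  arena_used += size;

  /* Report the block if the suspect address lies inside it.  */
  uintptr_t start = (uintptr_t) block;
  if (suspect_addr >= start && suspect_addr - start < size)
    alloc_hit (block, size);
  return block;
}

/* Allocate three 48-byte blocks while looking for arena + 100 and
   return how many of them covered that address.  */
unsigned long
demo_alloc_watch (void)
{
  arena_used = 0;
  alloc_hits = 0;
  suspect_addr = (uintptr_t) arena + 100;
  for (int i = 0; i < 3; i++)
    toy_alloc (48);
  return alloc_hits;
}
```

With a hook like this, "break alloc_hit" in GDB stops when the block is
handed out, and a watch on the field can be set from there.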

Ian


Re: hints on debugging memory corruption...

2011-02-04 Thread Joern Rennecke

Quoting Basile Starynkevitch:

> Hello All
>
> (Sorry for such a basic question; very probably there are some GDB tricks
> that I am not aware of).
>
> In my MELT branch I now have some corrupted memory (maybe because I am
> inserting a pass at the wrong place in the pass tree). At some point, I call
> bb_debug, and it crashes because the field bb_next contains 0x101 (which is
> not a valid address).
>
> So I need to understand who is writing the 0x101 in that field.
>
> How do you folks debug such issues?
>
> An obvious strategy is to use the hardware watchpoint feature of GDB.
> However, one cannot nicely put a watchpoint on an address which is not
> mmap-ed yet.


One way is to do a bit of stepping, or to insert breakpoints in interesting
places, and see when the address gets mapped.

If this is too tedious, and the memory corruption is in the heap and will
still manifest itself after a bit of perturbation, you can use a breakpoint
at main (or, if the corruption happens earlier, in a constructor or init)
and call free on the result of malloc for a large memory area.
For some OSes you might also need to malloc several medium-sized blocks
and free them to actually get stuff mapped.
Run to see where the corruption strikes now, then do a fresh start with the
same allocation pattern and set your watchpoint.

For other allocators, you can hack the code a bit to make the first allocated
chunk huge enough to contain the corrupted area.
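
The malloc/free step could be wrapped in a small helper along these
lines.  This is only a sketch: premap_heap is a hypothetical name, not
something GCC provides, and whether the pages stay mapped after free
depends entirely on the allocator and OS in use.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of GCC.  Allocate COUNT blocks of
   BLOCK_SIZE bytes, touch them so the pages are really mapped, then
   free them all.  Returns the number of blocks obtained.  */
size_t
premap_heap (size_t count, size_t block_size)
{
  void **blocks = calloc (count, sizeof *blocks);
  size_t got = 0;

  if (blocks == NULL)
    return 0;
  for (size_t i = 0; i < count; i++)
    {
      blocks[i] = malloc (block_size);
      if (blocks[i] == NULL)
        break;
      /* Writing to the block forces the pages to be backed.  */
      memset (blocks[i], 0, block_size);
      got++;
    }
  for (size_t i = 0; i < got; i++)
    free (blocks[i]);
  free (blocks);
  return got;
}
```

If such a helper is linked into the program, it can be invoked from a
breakpoint at main with GDB's call command, e.g.
"call premap_heap (64, 65536)"; the counts are arbitrary and would need
tuning until the corrupted address falls inside the mapped area.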


Re: hints on debugging memory corruption...

2011-02-04 Thread Laurynas Biveinis
2011/2/4 Basile Starynkevitch :
> An obvious strategy is to use the hardware watchpoint feature of GDB.
> However, one cannot nicely put a watchpoint on an address which is not
> mmap-ed yet.

Actually, you can do this with a recent enough GDB (7.1 AFAIK). It
will keep the watchpoint disabled until the page is mapped in, then
enable it automatically (it might have false hits if the page is
cleared etc.). However, I still like the method described by Ian and rth
better.

-- 
Laurynas


[google] Merged google/integration into google/main

2011-02-04 Thread Diego Novillo
This merge brings all the integration patches.  I've also created
a ChangeLog.google-main to separate the changes from both
branches.

Tested on x86_64.


Diego.


Re: hints on debugging memory corruption...

2011-02-04 Thread Tom Tromey
> "Basile" == Basile Starynkevitch  writes:

Basile> So I need to understand who is writing the 0x101 in that field.

valgrind can sometimes catch this, assuming that the write is an invalid
one.

Basile> An obvious strategy is to use the hardware watchpoint feature of GDB.
Basile> However, one cannot nicely put a watchpoint on an address which is not
Basile> mmap-ed yet.

I think a new-enough gdb should handle this ok.

rth> I typically find the location at which the object containing the address
rth> is allocated.  E.g. in alloc_block on the return statement.  Make this
rth> bp conditional on the object you're looking for.

I do this, too.

One thing to watch out for is that the memory can be recycled.  I've
been very confused whenever I've forgotten this.  I have a hack for the
GC (appended -- ancient enough that it probably won't apply) that makes
it easy to notice when an object you are interested in is collected.
IIRC I apply this before the first run, call ggc_watch_object for the
thing I am interested in, and then see in what GC cycle the real one is
allocated.
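
To illustrate the recycling problem in isolation, here is a toy
fixed-size pool with a LIFO free list.  It is unrelated to ggc-page.c
and all of its names are hypothetical.  It logs when a watched slot is
freed and when the same address is handed out again, which is the
situation that makes a watchpoint on a bare address misleading.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy fixed-size pool with a LIFO free list; all names are
   hypothetical and unrelated to ggc-page.c.  */
#define POOL_SLOTS 8

union slot
{
  union slot *next_free;   /* Valid only while the slot is free.  */
  long payload[4];
};

static union slot pool[POOL_SLOTS];
static union slot *free_list;

/* The slot being tracked, and how many times it has been handed out
   again since the watch was set.  */
static void *watched;
static int watched_generation;

/* Put every slot on the free list, lowest address first.  */
void
pool_init (void)
{
  free_list = NULL;
  for (int i = POOL_SLOTS - 1; i >= 0; i--)
    {
      pool[i].next_free = free_list;
      free_list = &pool[i];
    }
}

void *
pool_alloc (void)
{
  union slot *s = free_list;
  if (s == NULL)
    return NULL;
  free_list = s->next_free;
  if (s == watched)
    fprintf (stderr, "watched slot %p handed out again, generation %d\n",
             (void *) s, ++watched_generation);
  return s;
}

void
pool_free (void *p)
{
  union slot *s = p;
  assert (s >= pool && s < pool + POOL_SLOTS);
  if (p == watched)
    fprintf (stderr, "watched slot %p freed\n", p);
  s->next_free = free_list;
  free_list = s;
}

/* Allocate a slot, watch it, free it, and allocate again.  Returns
   the generation count, or -1 if the address was not reused.  */
int
demo_recycling (void)
{
  pool_init ();
  watched = NULL;
  watched_generation = 0;
  watched = pool_alloc ();      /* First owner; the watch starts here.  */
  pool_free (watched);
  void *second = pool_alloc (); /* LIFO list: same address comes back.  */
  return second == watched ? watched_generation : -1;
}
```

In this toy the second allocation returns the very address that was
just freed, so a write seen there belongs to a different object than
the one originally watched.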

Tom

Index: ggc-page.c
===================================================================
--- ggc-page.c  (revision 127650)
+++ ggc-page.c  (working copy)
@@ -430,6 +430,13 @@
   } *free_object_list;
 #endif
 
+  /* Watched objects.  */
+  struct watched_object
+  {
+    void *object;
+    struct watched_object *next;
+  } *watched_object_list;
+
 #ifdef GATHER_STATISTICS
   struct
   {
@@ -481,7 +488,7 @@
 /* Initial guess as to how many page table entries we might need.  */
 #define INITIAL_PTE_COUNT 128
 
-static int ggc_allocated_p (const void *);
+int ggc_allocated_p (const void *);
 static page_entry *lookup_page_table_entry (const void *);
 static void set_page_table_entry (void *, page_entry *);
 #ifdef USING_MMAP
@@ -549,7 +556,7 @@
 
 /* Returns nonzero if P was allocated in GC'able memory.  */
 
-static inline int
+int
 ggc_allocated_p (const void *p)
 {
   page_entry ***base;
@@ -1264,9 +1271,36 @@
              (unsigned long) size, (unsigned long) object_size, result,
              (void *) entry);
 
+  {
+    struct watched_object *w;
+    for (w = G.watched_object_list; w; w = w->next)
+      {
+        if (result == w->object)
+          {
+            fprintf (stderr, "re-returning watched object %p\n", w->object);
+            break;
+          }
+      }
+  }
+
   return result;
 }
 
+int
+ggc_check_watch (void *p, char *what)
+{
+  struct watched_object *w;
+  for (w = G.watched_object_list; w; w = w->next)
+    {
+      if (p == w->object)
+        {
+          fprintf (stderr, "got it: %s\n", what);
+          return 1;
+        }
+    }
+  return 0;
+}
+
 /* If P is not marked, marks it and return false.  Otherwise return true.
    P must have been allocated by the GC allocator; it mustn't point to
    static objects, stack variables, or memory allocated with malloc.  */
@@ -1293,6 +1327,19 @@
   if (entry->in_use_p[word] & mask)
     return 1;
 
+  {
+    struct watched_object *w;
+    for (w = G.watched_object_list; w; w = w->next)
+      {
+        if (p == w->object)
+          {
+            fprintf (stderr, "marking object %p; was %d\n", p,
+                     (int) (entry->in_use_p[word] & mask));
+            break;
+          }
+      }
+  }
+
   /* Otherwise set it, and decrement the free object count.  */
   entry->in_use_p[word] |= mask;
   entry->num_free_objects -= 1;
@@ -1337,6 +1384,15 @@
   return OBJECT_SIZE (pe->order);
 }
 
+void
+ggc_watch_object (void *p)
+{
+  struct watched_object *w = XNEW (struct watched_object);
+  w->object = p;
+  w->next = G.watched_object_list;
+  G.watched_object_list = w;
+}
+
 /* Release the memory for object P.  */
 
 void
@@ -1345,11 +1401,21 @@
   page_entry *pe = lookup_page_table_entry (p);
   size_t order = pe->order;
   size_t size = OBJECT_SIZE (order);
+  struct watched_object *w;
 
 #ifdef GATHER_STATISTICS
   ggc_free_overhead (p);
 #endif
 
+  for (w = G.watched_object_list; w; w = w->next)
+    {
+      if (w->object == p)
+        {
+          fprintf (stderr, "freeing watched object %p\n", p);
+          break;
+        }
+    }
+
   if (GGC_DEBUG_LEVEL >= 3)
     fprintf (G.debug_file,
              "Freeing object, actual size=%lu, at %p on %p\n",
@@ -1868,6 +1934,10 @@
 #define validate_free_objects()
 #endif
 
+int ggc_nc = 0;
+
+
+
 /* Top level mark-and-sweep routine.  */
 
 void
@@ -1903,6 +1973,21 @@
 
   clear_marks ();
   ggc_mark_roots ();
+
+  if (G.watched_object_list)
+    {
+      struct watched_object *w;
+      fprintf (stderr, "== starting collection %d\n", ggc_nc);
+      ++ggc_nc;
+      for (w = G.watched_object_list; w; w = w->next)
+        {
+          if (!ggc_marked_p (w->object))
+            {
+              fprintf (stderr, "object %p is free\n", w->object);
+            }
+        }
+    }
+
 #ifdef GATHER_STATISTICS
   ggc_prune_overhead_list ();
 #endif


Re: hints on debugging memory corruption...

2011-02-04 Thread Joern Rennecke

Quoting Tom Tromey:

>> "Basile" == Basile Starynkevitch writes:
>
> Basile> So I need to understand who is writing the 0x101 in that field.
>
> One thing to watch out for is that the memory can be recycled.  I've
> been very confused whenever I've forgotten this.  I have a hack for the
> GC (appended -- ancient enough that it probably won't apply) that makes
> it easy to notice when an object you are interested in is collected.
> IIRC I apply this before the first run, call ggc_watch_object for the
> thing I am interested in, and then see in what GC cycle the real one is
> allocated.


If what you are looking for survives such a change, postponing garbage
collection so it won't happen till the crash can make things simpler.


[google] Separating ChangeLogs from different branches

2011-02-04 Thread Diego Novillo
While doing the integration->main merge today, I realized that I
misnamed the ChangeLog.google files.  If both branches name them
the same, merges will be ugly and we won't be able to tell them
apart easily.

I renamed all the existing ChangeLog.google to
ChangeLog.google-integration.  We'll have a ChangeLog.google-main in the
other branch.


Diego.