[mdb-discuss] trace memory corruption

Steven Xie Thu, 29 Jun 2006 09:35:39 -0400

Hi James,

James Coleman wrote:


> Xie,Zhong wrote:
>
>> I am wondering whether it is possible to track down memory corruption 
>> like the example below using umem.
>>
>> char * p1= new char[8];
>> ...
>> delete p1;
>> char * p2 =new char[8];
>> ....
>> strcpy(p2, "56789");
>> strcpy(p1, "01234");  //Bug  causes memory corruption
>>
>> p1 and p2 point to memory of the same size, so it's possible that p1 and p2 
>> actually point to the same address. In this case, can umem still 
>> track down the memory corruption? It doesn't work for me. 
>> gcc version 3.4.3 (csl-sol210-3_4-branch+sol_rpath)
>>  
>
>
> Hello,
>
> In my limited experience, mdb and libumem are able to 
> detect memory corruption.
> I was curious about how this is handled, so I poked at it a bit myself, 
> and here is
> what I have learned. I hope it is of some use to you.
>
> Two things:
>
> 1. Your example above might not corrupt memory (if it is simplified).

I wouldn't say it's not memory corruption, even if it doesn't dump core.
The behavior just depends on your OS, compiler, and application.
If p1 and p2 have different sizes, you will most likely get a core dump, 
because p1 and p2 will usually be at different addresses.
If p1 and p2 have the same size, things get interesting: p1 and 
p2 will point to the same address most of the time.
In this case, you can still use p1 to access that piece of memory. 
However, that memory no longer belongs to p1.
This is a typical memory corruption bug in an application, and it is very hard 
to find without some sort of memory debugging utility.

The key to the bug is that p1 must not be used after delete, even though p1 still 
points to an accessible address.
That's why I am wondering whether umem can track down this kind of bug.
It doesn't work for me.







> 2. The memory corruption will not be detected immediately.
>
>
> 1.
>
> If I take the simplest possible case of your example.
> And add a printf to check one thing.
>
> #include <string.h>
> #include <stdio.h>
>
> int main()
> {
>  char * p1= new char[8];
>  delete p1;
>  char * p2 =new char[8];
>
>  if (p2 == p1) printf("Oops! p1 == p2\n");
>
>  strcpy(p2, "56789");
>  // Bug causes memory corruption (if p1 points to invalid area!)
>  strcpy(p1, "01234");
> }
>
> We see that p2 is p1 so by accident in this case there is no corruption!
> :) oops!
>
> Moving the delete we do get corruption (and we do not see the Oops):
>
> int main()
> {
>  char * p1= new char[8];
>  // no corruption if delete here, delete p1;
>  char * p2 =new char[8];
>  delete p1;
>
>  if (p2 == p1) printf("Oops! p1 == p2\n");
>
>  strcpy(p2, "56789");
>  strcpy(p1, "01234");  //Bug  causes memory corruption
>
>  return 0;
> }
>
> Now running this looks normal. No coredump.
> Running with libumem:
> UMEM_DEBUG=default UMEM_LOGGING=transaction LD_PRELOAD=libumem.so.1  
> ~/c/testcorrupt
> nothing.
> Hmmmmm. (see 2. below for why this is expected)
>
>
> Now put in a sleep before the return to allow us time to attach to the 
> process using mdb:
> #include <unistd.h>
>  sleep(70);
> Now using `mdb -p pid` and
>
> > ::umem_verify
>
> umem_alloc_16                      3d6c8 1 corrupt buffer
>
>
> > 3d6c8::umem_verify
> Summary for cache 'umem_alloc_16'
>   buffer 49fe0 (free) seems corrupted, at 0
>
>
> > 49fe0/10X
> 0x49fe0:        deadbeef        deadbeef        30313233        3400beef
>                 feedface        feedface        54780           f4ebb36e
>                 deadbeef        deadbeef
>
> By examining what is in the corrupt buffer you might be able to tell 
> where it came from.
>
> > ::umalog
>
> T-0.000000000  addr=49fe0  umem_alloc_16
>          libumem.so.1`umem_cache_free+0x4c
>          libumem.so.1`process_free+0x68
>          libumem.so.1`free+0x38
>          libstdc++.so.6.0.3`_ZdlPv+0x10
>          main+0x28
>          _start+0x5c
>
> T-0.000031250  addr=49fc0  umem_alloc_16
>          libumem.so.1`umem_cache_alloc+0x13c
>          libumem.so.1`umem_alloc+0x44
>          libumem.so.1`malloc+0x2c
>          libstdc++.so.6.0.3`_Znwj+0x1c
>          libstdc++.so.6.0.3`_Znaj+4
>          main+0x18
>          _start+0x5c
>
> T-0.000053750  addr=49fe0  umem_alloc_16
>          libumem.so.1`umem_cache_alloc+0x13c
>          libumem.so.1`umem_alloc+0x44
>          libumem.so.1`malloc+0x2c
>          libstdc++.so.6.0.3`_Znwj+0x1c
>          libstdc++.so.6.0.3`_Znaj+4
>          main+8
>          _start+0x5c
> >
>
> By grepping the umalog you can find where the buffer that was 
> corrupted was allocated or freed.
>
>
> 2.
>
> Memory corruption is detected when a buffer with corrupted redzones is 
> freed.
> You can also attach to the process and run ::umem_verify and friends.
> When freed memory is used by another malloc maybe the corruption is 
> detected?
> Not sure. Didn't
>
> Of course you cannot validate all memory and look for corruption after 
> every malloc/free.
> This would make things very slow.
>
> This article:
> http://access1.sun.com/techarticles/libumem.html
> Describes how to use gcore to make the process (running under libumem) dump 
> core and then
> run ::umem_verify and friends. Alternatively, attach to the process while it is still 
> running,
> after putting a big sleep at the end.
>
>
>
> A final note.
>
> Suppose I had other "stuff" instead of the sleep after the corruption,
> for example a new (which would reuse the corrupt buffer).
>
> "stuff":
>  char * p3= new char[8];
>  if (p3 == p1) printf("Yes. p1 == p3\n");
>
>  for(int i=0;i<100;i++);
>
>  return 0;
>
> Oops, silly me! I left whatever I was going to do with the for loop 
> undone.
> But regardless of that, umem dumped core (it looked like it happened in 
> that new):
>
> [jcoleman@slaine] ~/c/$ UMEM_DEBUG=default UMEM_LOGGING=transaction 
> LD_PRELOAD=libumem.so.1 ~/c/testcorrupt
> Abort (core dumped)
>
> [jcoleman@slaine] ~/c/$ mdb core
> Loading modules: [ libumem.so.1 libc.so.1 ld.so.1 ]
> > $c
> libc.so.1`_kill+8(1, 64, 65640000, 7efefeff, 81010100, ff00)
> libumem.so.1`umem_err_recoverable+0x74(ff360cac, fffffff7, ffffffff, 
> 3b288b8f, 4bfb0, 3d720)
> libumem.so.1`umem_error+0x49c(1, ff377010, 0, 49fe0, 3d780, 10)
> libumem.so.1`umem_cache_alloc_debug+0xf0(3d6c8, 49fe0, 0, ff356d8c, 0, 0)
> libumem.so.1`umem_cache_alloc+0x208(49fe0, 0, 0, 0, 0, 0)
> libumem.so.1`umem_alloc+0x44(10, 0, 0, 0, 0, 0)
> libumem.so.1`malloc+0x2c(8, ffbff0d8, 0, 0, ffbff188, ff1bc000)
> libstdc++.so.6.0.3`_Znwj+0x1c(8, ffbff188, 0, 0, 0, ff19ff1c)
> libstdc++.so.6.0.3`_Znaj+4(8, 10950, 49fe8, 34000000, 3400, 49fc8)
> main+0x8c(1, ffbff2ac, ffbff2b4, 20bb0, 0, 0)
> _start+0x5c(0, 0, 0, 0, 0, 0)
>
> > ::umem_status
> Status:         ready and active
> Concurrency:    1
> Logs:           transaction=64k (inactive)
> Message buffer:
> umem allocator: buffer modified after being freed
> modification occurred at offset 0x8 (0xdeadbeefdeadbeef replaced by 
> 0x303132333400beef)
> buffer=49fe0  bufctl=54780  cache: umem_alloc_16
> previous transaction on buffer 49fe0:
> thread=1  time=T-6.992512911  slab=4bfb0  cache: umem_alloc_16
> libumem.so.1'umem_cache_free+0x4c
> libumem.so.1'?? (0xff353868)
> libumem.so.1'free+0x38
> libstdc++.so.6.0.3'_ZdlPv+0x10
> testcorrupt'main+0x28
> testcorrupt'_start+0x5c
> umem: heap corruption detected
> stack trace:
> libumem.so.1'?? (0xff3554c8)
> libumem.so.1'?? (0xff356508)
> libumem.so.1'umem_cache_alloc+0x208
> libumem.so.1'umem_alloc+0x44
> libumem.so.1'malloc+0x2c
> libstdc++.so.6.0.3'_Znwj+0x1c
> libstdc++.so.6.0.3'_Znaj+0x4
> testcorrupt'main+0x8c
> testcorrupt'_start+0x5c
>
> Hooray.
>
> Isn't that nice :)
>
> James.
>
>
> As a side-note I am using gnu g++ as a compiler. With -g3 for debug info.
>
> .
>
