[mdb-discuss] trace memory corruption

James Coleman Thu, 29 Jun 2006 15:38:44 +0100

>> 1. Your example above might not corrupt memory (if it is simplified).
> 
> 
> I wouldn't say it's not memory corruption, even if it doesn't core dump.


I meant in the simplified case where p2 was assigned the same memory address
as p1 there really was no memory corruption.
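
Whether a freed address is handed straight back is up to the allocator, so
it is worth checking rather than assuming. A minimal sketch (the function
name is made up for illustration; the old address is saved as an integer so
the dangling pointer itself is never used after the delete):

```cpp
#include <cstdint>

// Returns true if the allocator handed the just-freed address back for the
// next request of the same size. Nothing guarantees either outcome.
bool freed_address_is_reused()
{
    char *p1 = new char[8];
    // Save the address as an integer before the delete, so that the
    // dangling pointer value is never touched afterwards.
    const std::uintptr_t old_addr = reinterpret_cast<std::uintptr_t>(p1);
    delete[] p1;

    char *p2 = new char[8];
    const bool reused = (reinterpret_cast<std::uintptr_t>(p2) == old_addr);
    delete[] p2;
    return reused;
}
```

If this returns true on a given system, a write through the stale p1 lands
in the live p2 buffer: still a bug, but one that no heap checker can see as
damage to a free buffer.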

 > The behavior just depends on your OS, compiler and application.
 > If p1 and p2 have different sizes, most likely you will get a core dump,
 > because p1 and p2 mostly go to different addresses.
 > If p1 and p2 have the same size, then things get interesting: p1 and
 > p2 will point to the same address most of the time.
 > In this case, you can still use p1 to access that piece of memory.
 > However, that piece of memory doesn't belong to p1.
 > It's the typical memory corruption bug in an application, and very hard
 > to find without some sort of memory detection utility.
 >

The next two examples of your case that I looked at (still simplified, but
this time with real memory corruption) showed, I think, how the corruption
can be detected by libumem.
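
In the quoted mdb dump further down, the words at offset 8 of the corrupt
buffer are 30313233 3400beef. A small sketch of turning such words back into
text, assuming the most significant byte of each word comes first in memory
(big-endian; that is an assumption about the machine the dump came from),
with hypothetical helper names:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Turn 32-bit words from a dump into the bytes they hold, taking the most
// significant byte of each word first (big-endian order is an assumption).
std::string words_to_bytes(const std::vector<std::uint32_t> &words)
{
    std::string bytes;
    for (std::uint32_t w : words) {
        for (int shift = 24; shift >= 0; shift -= 8)
            bytes.push_back(static_cast<char>((w >> shift) & 0xff));
    }
    return bytes;
}

// Read the bytes the way a C string would be read: stop at the first NUL.
std::string leading_c_string(const std::string &bytes)
{
    return bytes.substr(0, bytes.find('\0'));
}
```

Fed the two words above, leading_c_string(words_to_bytes(...)) gives
"01234", which is the string the stale strcpy(p1, ...) wrote; the trailing
be ef bytes are what is left of the original deadbeef fill.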

Have you tried attaching to your process and running ::umem_verify after the
memory corruption has occurred, as I described?

Or run with the libumem firewall as suggested by Ivan to get immediate 
detection of corruption.

> The key of the bug is, p1 can't be used after delete, even though p1 still
> points to an accessible address.
> That's why I am wondering whether umem can possibly track down this kind
> of bug.
> It doesn't work for me.

Some safe coding practices may help.
I like to always set deleted pointers to NULL.
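
A minimal sketch of that habit, wrapped in a helper so the two steps cannot
be separated (delete_array_and_null is a hypothetical name, not from any
library):

```cpp
#include <cstddef>

// Hypothetical helper: free an array allocated with new[] and null the
// pointer in the same step, so the stale address cannot be reused by name.
template <typename T>
void delete_array_and_null(T *&p)
{
    delete[] p;   // deleting a null pointer is a no-op, so a repeat call is harmless
    p = NULL;
}
```

After delete_array_and_null(p1), a later strcpy(p1, ...) goes through a null
pointer and will normally fault on the spot instead of quietly scribbling on
someone else's buffer. It only protects this one pointer variable; any other
copy of the address is still dangling.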

Unless a tool analyses what happens at compile time, it cannot
really detect unsafe coding practices such as that p1/p2 example.

I think Purify might have that functionality, but you have to make a Purify
build of your application before you can get results.
Even then, the effort spent filtering the results for the useful warnings
and running regular Purify tests is better spent in other ways, IMHO.
And of course you have to pay for it! :-P

The great thing with libumem is it is so easy to run on any process.

Did you run libumem on your process and not detect corruption?
Did you attach to your process while running?
I'm not sure what your problem is, as I see that libumem
does detect heap corruption and buffer overruns.
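
The "buffer modified after being freed" report in the quoted output below
comes down to a fill pattern being disturbed. A toy sketch of that idea
only: the 0xdeadbeef value is the one visible in the dump, but the buffer
size, names and logic here are made up for illustration and are not
libumem's code.

```cpp
#include <cstddef>
#include <cstdint>

// Toy model: one 16-byte buffer, viewed as four 32-bit words.
const std::uint32_t FREE_PATTERN = 0xdeadbeefu;  // value seen in the mdb dump
const std::size_t   BUF_WORDS    = 4;            // illustrative size only

// "Free": overwrite the whole buffer with the pattern.
void mark_free(std::uint32_t *buf)
{
    for (std::size_t i = 0; i < BUF_WORDS; ++i)
        buf[i] = FREE_PATTERN;
}

// "Verify": return the byte offset of the first word that no longer holds
// the pattern, or -1 if the buffer is still intact.
long first_modified_offset(const std::uint32_t *buf)
{
    for (std::size_t i = 0; i < BUF_WORDS; ++i)
        if (buf[i] != FREE_PATTERN)
            return static_cast<long>(i * sizeof(std::uint32_t));
    return -1;
}
```

A check like this can only run when something looks at the buffer again,
which fits what was observed: nothing is reported until the buffer is
verified by hand or handed out by a later allocation.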

James.

> 
> 
>> 2. The memory corruption will not be detected immediately.
>>
>>
>> 1.
>>
>> If I take the simplest possible case of your example.
>> And add a printf to check one thing.
>>
>> #include <string.h>
>> #include <stdio.h>
>>
>> int main()
>> {
>>  char * p1= new char[8];
>>  delete p1;
>>  char * p2 =new char[8];
>>
>>  if (p2 == p1) printf("Oops! p1 == p2\n");
>>
>>  strcpy(p2, "56789");
>>  strcpy(p1, "01234");  // Bug: causes memory corruption (if p1 points to an invalid area!)
>> }
>>
>> We see that p2 is p1 so by accident in this case there is no corruption!
>> :) oops!
>>
>> Moving the delete we do get corruption (and we do not see the Oops):
>>
>> int main()
>> {
>>  char * p1= new char[8];
>>  // no corruption if delete here, delete p1;
>>  char * p2 =new char[8];
>>  delete p1;
>>
>>  if (p2 == p1) printf("Oops! p1 == p2\n");
>>
>>  strcpy(p2, "56789");
>>  strcpy(p1, "01234");  //Bug  causes memory corruption
>>
>>  return 0;
>> }
>>
>> Now running this looks normal. No coredump.
>> Running with libumem:
>> UMEM_DEBUG=default UMEM_LOGGING=transaction LD_PRELOAD=libumem.so.1 ~/c/testcorrupt
>> nothing.
>> Hmmmmm. (see 2. below for why this is expected)
>>
>>
>> Now put in a sleep before the return to allow us time to attach to the 
>> process using mdb:
>> #include <unistd.h>
>>  sleep(70);
>> Now using `mdb -p pid` and
>>
>> > ::umem_verify
>>
>> umem_alloc_16                      3d6c8 1 corrupt buffer
>>
>>
>> > 3d6c8::umem_verify
>> Summary for cache 'umem_alloc_16'
>>   buffer 49fe0 (free) seems corrupted, at 0
>>
>>
>> > 49fe0/10X
>> 0x49fe0:        deadbeef        deadbeef        30313233        3400beef        feedface        feedface        54780           f4ebb36e
>>                 deadbeef        deadbeef
>>
>> By examining what is in the corrupt buffer you might be able to tell 
>> where it came from.
>>
>> > ::umalog
>>
>> T-0.000000000  addr=49fe0  umem_alloc_16
>>          libumem.so.1`umem_cache_free+0x4c
>>          libumem.so.1`process_free+0x68
>>          libumem.so.1`free+0x38
>>          libstdc++.so.6.0.3`_ZdlPv+0x10
>>          main+0x28
>>          _start+0x5c
>>
>> T-0.000031250  addr=49fc0  umem_alloc_16
>>          libumem.so.1`umem_cache_alloc+0x13c
>>          libumem.so.1`umem_alloc+0x44
>>          libumem.so.1`malloc+0x2c
>>          libstdc++.so.6.0.3`_Znwj+0x1c
>>          libstdc++.so.6.0.3`_Znaj+4
>>          main+0x18
>>          _start+0x5c
>>
>> T-0.000053750  addr=49fe0  umem_alloc_16
>>          libumem.so.1`umem_cache_alloc+0x13c
>>          libumem.so.1`umem_alloc+0x44
>>          libumem.so.1`malloc+0x2c
>>          libstdc++.so.6.0.3`_Znwj+0x1c
>>          libstdc++.so.6.0.3`_Znaj+4
>>          main+8
>>          _start+0x5c
>> >
>>
>> By grepping the umalog you can find where the corrupted buffer was
>> allocated or freed.
>>
>>
>> 2.
>>
>> Memory corruption is detected when a buffer with corrupted redzones is 
>> freed.
>> You can also attach to the process and run ::umem_verify and friends.
>> When freed memory is used by another malloc maybe the corruption is 
>> detected?
>> Not sure. Didn't
>>
>> Of course you cannot validate all memory and look for corruption after 
>> every malloc/free.
>> This would make things very slow.
>>
>> This article:
>> http://access1.sun.com/techarticles/libumem.html
>> describes how to use gcore to make the process (under libumem) dump
>> core and then run ::umem_verify and friends. Or attach to the process
>> while it is still running, after putting in a big sleep at the end.
>>
>>
>>
>> A final note.
>>
>> Suppose I had other "stuff" instead of the sleep after the corruption:
>> a new (which would use the corrupt buffer).
>>
>> "stuff":
>>  char * p3= new char[8];
>>  if (p3 == p1) printf("Yes. p1 == p3\n");
>>
>>  for(int i=0;i<100;i++);
>>
>>  return 0;
>>
>> Oops, silly me! I left whatever I was going to do with the for loop
>> undone.
>> But regardless of that, umem dumped core (it looked like it was on
>> that new):
>>
>> [jcoleman@slaine] ~/c/$ UMEM_DEBUG=default UMEM_LOGGING=transaction LD_PRELOAD=libumem.so.1 ~/c/testcorrupt
>> Abort (core dumped)
>>
>> [jcoleman@slaine] ~/c/$ mdb core
>> Loading modules: [ libumem.so.1 libc.so.1 ld.so.1 ]
>> > $c
>> libc.so.1`_kill+8(1, 64, 65640000, 7efefeff, 81010100, ff00)
>> libumem.so.1`umem_err_recoverable+0x74(ff360cac, fffffff7, ffffffff, 3b288b8f, 4bfb0, 3d720)
>> libumem.so.1`umem_error+0x49c(1, ff377010, 0, 49fe0, 3d780, 10)
>> libumem.so.1`umem_cache_alloc_debug+0xf0(3d6c8, 49fe0, 0, ff356d8c, 0, 0)
>> libumem.so.1`umem_cache_alloc+0x208(49fe0, 0, 0, 0, 0, 0)
>> libumem.so.1`umem_alloc+0x44(10, 0, 0, 0, 0, 0)
>> libumem.so.1`malloc+0x2c(8, ffbff0d8, 0, 0, ffbff188, ff1bc000)
>> libstdc++.so.6.0.3`_Znwj+0x1c(8, ffbff188, 0, 0, 0, ff19ff1c)
>> libstdc++.so.6.0.3`_Znaj+4(8, 10950, 49fe8, 34000000, 3400, 49fc8)
>> main+0x8c(1, ffbff2ac, ffbff2b4, 20bb0, 0, 0)
>> _start+0x5c(0, 0, 0, 0, 0, 0)
>>
>> > ::umem_status
>> Status:         ready and active
>> Concurrency:    1
>> Logs:           transaction=64k (inactive)
>> Message buffer:
>> umem allocator: buffer modified after being freed
>> modification occurred at offset 0x8 (0xdeadbeefdeadbeef replaced by 0x303132333400beef)
>> buffer=49fe0  bufctl=54780  cache: umem_alloc_16
>> previous transaction on buffer 49fe0:
>> thread=1  time=T-6.992512911  slab=4bfb0  cache: umem_alloc_16
>> libumem.so.1'umem_cache_free+0x4c
>> libumem.so.1'?? (0xff353868)
>> libumem.so.1'free+0x38
>> libstdc++.so.6.0.3'_ZdlPv+0x10
>> testcorrupt'main+0x28
>> testcorrupt'_start+0x5c
>> umem: heap corruption detected
>> stack trace:
>> libumem.so.1'?? (0xff3554c8)
>> libumem.so.1'?? (0xff356508)
>> libumem.so.1'umem_cache_alloc+0x208
>> libumem.so.1'umem_alloc+0x44
>> libumem.so.1'malloc+0x2c
>> libstdc++.so.6.0.3'_Znwj+0x1c
>> libstdc++.so.6.0.3'_Znaj+0x4
>> testcorrupt'main+0x8c
>> testcorrupt'_start+0x5c
>>
>> Hooray.
>>
>> Isn't that nice :)
>>
>> James.
>>
>>
>> As a side-note I am using gnu g++ as a compiler. With -g3 for debug info.
>>
>> .
>>
