Frank Hofmann wrote:
> [ ... ]
>
>> The key of the bug is that p1 can't be used after delete, even though
>> p1 still points to an accessible address.
>> That's why I am wondering whether umem can possibly track down this
>> kind of bug. It doesn't work for me.
>
> No memory allocator that re-uses previously-freed virtual addresses
> can detect/track this.
>
> What umem/kmem debugging support does, though, is to give you the
> buffer history. I.e. you _see_ who allocated/freed this piece of
> memory. Which gives you a handle how to narrow down your search for
> possible culprits.
>
> I have an actual kernel crashdump where exactly this situation has
> occurred and kmem's buffer allocation/free history was the key to
> finding the culprit/fixing the bug. If you wish to try, I'll make it
> available, and comment on it as needed.

That would be great! I'd like to give it a shot. Could you give me some
hints on kmem? Thanks.

> Whether that works for you depends on the frequency of alloc/free for
> this buffer. The history goes back only so and so far.
>
> Do we have the userland equivalent of "::kgrep" ? I'm too rarely
> looking into application dumps ...
>
> FrankH.
>
>>> 2. The memory corruption will not be detected immediately.
>>>
>>>
>>> 1.
>>>
>>> If I take the simplest possible case of your example.
>>> And add a printf to check one thing.
>>>
>>> #include <string.h>
>>> #include <stdio.h>
>>>
>>> int main()
>>> {
>>>     char * p1= new char[8];
>>>     delete p1;
>>>     char * p2 =new char[8];
>>>
>>>     if (p2 == p1) printf("Oops! p1 == p2\n");
>>>
>>>     strcpy(p2, "56789");
>>>     strcpy(p1, "01234"); //Bug causes memory corruption (if p1 points to invalid area!)
>>> }
>>>
>>> We see that p2 is p1 so by accident in this case there is no
>>> corruption!
>>> :) oops!
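Why p2 can come back equal to p1 in the first test above is easier to see with a toy allocator. The following is only a minimal sketch, assuming a fixed-size pool with a last-in-first-out free list. `TinyPool` and its members are hypothetical names made up for illustration, and this is not how libumem is implemented internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size pool with a LIFO free list.
// Illustration only: NOT libumem's actual data structure.
class TinyPool {
public:
    static constexpr std::size_t kSlotSize = 16;

    explicit TinyPool(std::size_t slots) : storage_(slots * kSlotSize) {
        // Push slots in reverse so that slot 0 is handed out first.
        for (std::size_t i = slots; i > 0; --i) {
            free_.push_back(&storage_[(i - 1) * kSlotSize]);
        }
    }

    // Hand out the most recently released slot, or nullptr if none is left.
    char* alloc() {
        if (free_.empty()) {
            return nullptr;
        }
        char* p = free_.back();
        free_.pop_back();
        return p;
    }

    // Put a slot back on the free list. The caller's pointer value is
    // unchanged, so a stale copy still "points to an accessible address".
    void release(char* p) {
        assert(p != nullptr);
        free_.push_back(p);
    }

private:
    std::vector<char> storage_;  // backing memory for all slots
    std::vector<char*> free_;    // free list, used as a stack
};
```

With such a pool, alloc/release/alloc returns the same address twice (the "Oops! p1 == p2" case), whereas alloc/alloc/release leaves the released slot on the free list until the next allocation picks it up.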
>>>
>>> Moving the delete we do get corruption (and we do not see the Oops):
>>>
>>> int main()
>>> {
>>>     char * p1= new char[8];
>>>     // no corruption if delete here, delete p1;
>>>     char * p2 =new char[8];
>>>     delete p1;
>>>
>>>     if (p2 == p1) printf("Oops! p1 == p2\n");
>>>
>>>     strcpy(p2, "56789");
>>>     strcpy(p1, "01234"); //Bug causes memory corruption
>>>
>>>     return 0;
>>> }
>>>
>>> Now running this looks normal. No coredump.
>>> Running with libumem:
>>> UMEM_DEBUG=default UMEM_LOGGING=transaction LD_PRELOAD=libumem.so.1 ~/c/testcorrupt
>>> nothing.
>>> Hmmmmm. (see 2. below for why this is expected)
>>>
>>>
>>> Now put in a sleep before the return to allow us time to attach to
>>> the process using mdb:
>>> #include <unistd.h>
>>> sleep(70);
>>> Now using `mdb -p pid` and
>>>
>>> > ::umem_verify
>>>
>>> umem_alloc_16 3d6c8 1 corrupt buffer
>>>
>>>
>>> > 3d6c8::umem_verify
>>> Summary for cache 'umem_alloc_16'
>>> buffer 49fe0 (free) seems corrupted, at 0
>>>
>>>
>>> > 49fe0/10X
>>> 0x49fe0: deadbeef deadbeef 30313233 3400beef
>>>          feedface feedface 54780 f4ebb36e
>>>          deadbeef deadbeef
>>>
>>> By examining what is in the corrupt buffer you might be able to tell
>>> where it came from.
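Reading the 49fe0/10X dump by hand amounts to looking for the first word that is no longer 0xdeadbeef. As a rough sketch of that idea only (the helper names below are hypothetical, and this is not the code behind ::umem_verify), one can fill a buffer with the free pattern, simulate the stale strcpy, and report the offset of the first modified word:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Free pattern as seen in the dump above.
constexpr std::uint32_t kFreePattern = 0xdeadbeefu;

// Fill a buffer with the free pattern, as a debugging allocator might on free.
void fill_free_pattern(std::vector<std::uint32_t>& buf) {
    for (auto& word : buf) {
        word = kFreePattern;
    }
}

// Simulate a write through a stale pointer: copy a C string (including its
// terminating NUL) into the buffer at the given byte offset.
void stale_write(std::vector<std::uint32_t>& buf, std::size_t byte_offset,
                 const char* text) {
    const std::size_t len = std::strlen(text) + 1;
    assert(byte_offset + len <= buf.size() * sizeof(std::uint32_t));
    std::memcpy(reinterpret_cast<unsigned char*>(buf.data()) + byte_offset,
                text, len);
}

// Return the byte offset of the first word that no longer holds the free
// pattern, or -1 if the buffer is untouched.
long first_modified_offset(const std::vector<std::uint32_t>& buf) {
    for (std::size_t i = 0; i < buf.size(); ++i) {
        if (buf[i] != kFreePattern) {
            return static_cast<long>(i * sizeof(std::uint32_t));
        }
    }
    return -1;
}
```

In the dump, the bytes 30 31 32 33 34 00 are the string "01234" plus its terminator, which is how the contents point back at the offending strcpy.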
>>>
>>> > ::umalog
>>>
>>> T-0.000000000 addr=49fe0 umem_alloc_16
>>>     libumem.so.1`umem_cache_free+0x4c
>>>     libumem.so.1`process_free+0x68
>>>     libumem.so.1`free+0x38
>>>     libstdc++.so.6.0.3`_ZdlPv+0x10
>>>     main+0x28
>>>     _start+0x5c
>>>
>>> T-0.000031250 addr=49fc0 umem_alloc_16
>>>     libumem.so.1`umem_cache_alloc+0x13c
>>>     libumem.so.1`umem_alloc+0x44
>>>     libumem.so.1`malloc+0x2c
>>>     libstdc++.so.6.0.3`_Znwj+0x1c
>>>     libstdc++.so.6.0.3`_Znaj+4
>>>     main+0x18
>>>     _start+0x5c
>>>
>>> T-0.000053750 addr=49fe0 umem_alloc_16
>>>     libumem.so.1`umem_cache_alloc+0x13c
>>>     libumem.so.1`umem_alloc+0x44
>>>     libumem.so.1`malloc+0x2c
>>>     libstdc++.so.6.0.3`_Znwj+0x1c
>>>     libstdc++.so.6.0.3`_Znaj+4
>>>     main+8
>>>     _start+0x5c
>>> >
>>>
>>> By grepping the umalog you can find where the buffer that was
>>> corrupted was malloc'ed or freed from.
>>>
>>>
>>> 2.
>>>
>>> Memory corruption is detected when a buffer with corrupted redzones
>>> is freed.
>>> You can also attach to the process and run ::umem_verify and friends.
>>> When freed memory is used by another malloc maybe the corruption is
>>> detected?
>>> Not sure. Didn't
>>>
>>> Of course you cannot validate all memory and look for corruption
>>> after every malloc/free.
>>> This would make things very slow.
>>>
>>> This article:
>>> http://access1.sun.com/techarticles/libumem.html
>>> describes how to use gcore to make the process (under libumem) dump
>>> core and then run ::umem_verify and friends. Or attach to the process
>>> while it is still running, after putting in a big sleep at the end.
>>>
>>>
>>> A final note.
>>>
>>> Suppose I had other "stuff" instead of the sleep after the corruption:
>>> a new (which would use the corrupt buffer).
>>>
>>> "stuff":
>>>     char * p3= new char[8];
>>>     if (p3 == p1) printf("Yes. p1 == p3\n");
>>>
>>>     for(int i=0;i<100;i++);
>>>
>>>     return 0;
>>>
>>> oops silly me! I left whatever I was going to do with the for loop
>>> undone.
>>> But regardless of that, umem dumped core (it looked like it was on
>>> that new):
>>>
>>> [jcoleman at slaine] ~/c/$ UMEM_DEBUG=default UMEM_LOGGING=transaction LD_PRELOAD=libumem.so.1 ~/c/testcorrupt
>>> Abort (core dumped)
>>>
>>> [jcoleman at slaine] ~/c/$ mdb core
>>> Loading modules: [ libumem.so.1 libc.so.1 ld.so.1 ]
>>> > $c
>>> libc.so.1`_kill+8(1, 64, 65640000, 7efefeff, 81010100, ff00)
>>> libumem.so.1`umem_err_recoverable+0x74(ff360cac, fffffff7, ffffffff, 3b288b8f, 4bfb0, 3d720)
>>> libumem.so.1`umem_error+0x49c(1, ff377010, 0, 49fe0, 3d780, 10)
>>> libumem.so.1`umem_cache_alloc_debug+0xf0(3d6c8, 49fe0, 0, ff356d8c, 0, 0)
>>> libumem.so.1`umem_cache_alloc+0x208(49fe0, 0, 0, 0, 0, 0)
>>> libumem.so.1`umem_alloc+0x44(10, 0, 0, 0, 0, 0)
>>> libumem.so.1`malloc+0x2c(8, ffbff0d8, 0, 0, ffbff188, ff1bc000)
>>> libstdc++.so.6.0.3`_Znwj+0x1c(8, ffbff188, 0, 0, 0, ff19ff1c)
>>> libstdc++.so.6.0.3`_Znaj+4(8, 10950, 49fe8, 34000000, 3400, 49fc8)
>>> main+0x8c(1, ffbff2ac, ffbff2b4, 20bb0, 0, 0)
>>> _start+0x5c(0, 0, 0, 0, 0, 0)
>>>
>>> > ::umem_status
>>> Status: ready and active
>>> Concurrency: 1
>>> Logs: transaction=64k (inactive)
>>> Message buffer:
>>> umem allocator: buffer modified after being freed
>>> modification occurred at offset 0x8 (0xdeadbeefdeadbeef replaced by 0x303132333400beef)
>>> buffer=49fe0 bufctl=54780 cache: umem_alloc_16
>>> previous transaction on buffer 49fe0:
>>> thread=1 time=T-6.992512911 slab=4bfb0 cache: umem_alloc_16
>>> libumem.so.1'umem_cache_free+0x4c
>>> libumem.so.1'?? (0xff353868)
>>> libumem.so.1'free+0x38
>>> libstdc++.so.6.0.3'_ZdlPv+0x10
>>> testcorrupt'main+0x28
>>> testcorrupt'_start+0x5c
>>> umem: heap corruption detected
>>> stack trace:
>>> libumem.so.1'?? (0xff3554c8)
>>> libumem.so.1'?? (0xff356508)
>>> libumem.so.1'umem_cache_alloc+0x208
>>> libumem.so.1'umem_alloc+0x44
>>> libumem.so.1'malloc+0x2c
>>> libstdc++.so.6.0.3'_Znwj+0x1c
>>> libstdc++.so.6.0.3'_Znaj+0x4
>>> testcorrupt'main+0x8c
>>> testcorrupt'_start+0x5c
>>>
>>> Hooray.
>>>
>>> Isn't that nice :)
>>>
>>> James.
>>>
>>>
>>> As a side-note I am using gnu g++ as a compiler. With -g3 for debug
>>> info.
>>>
>> _______________________________________________
>> mdb-discuss mailing list
>> mdb-discuss at opensolaris.org
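The "grep the umalog for the buffer address" step can be pictured as a filter over a list of transaction records. The sketch below is only an illustration under that assumption: `Transaction` and `history_for` are hypothetical names, and the real umem transaction log records more than this (thread, timestamp, full stack).

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified log record for illustration only.
struct Transaction {
    std::uintptr_t addr;  // buffer address, e.g. 0x49fe0
    bool is_free;         // true for a free, false for an allocation
    std::string caller;   // innermost application frame, e.g. "main+0x28"
};

// Return every log entry that touches the given buffer address, preserving
// the order of the log (newest first, as ::umalog prints it).
std::vector<Transaction> history_for(const std::vector<Transaction>& log,
                                     std::uintptr_t addr) {
    std::vector<Transaction> matches;
    for (const auto& entry : log) {
        if (entry.addr == addr) {
            matches.push_back(entry);
        }
    }
    return matches;
}
```

Applied to the three ::umalog entries quoted above, filtering on 49fe0 leaves the free from main+0x28 followed by the allocation from main+8, which is the alloc/free pair to inspect in the source.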