Does anyone have any thoughts on my finding below?
On 4/27/12, Mahmood Naderan <mahmood...@gmail.com> wrote:
> OK, I think I have found the bug. I used "continue" and "ctrl+c" multiple
> times to see whether it was stuck in a particular function. The backtrace
> shows:
>
> #0  0x00000000004dfee7 in __gnu_cxx::hashtable<std::pair<unsigned long
>     const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
>     long>, std::_Select1st<std::pair<unsigned long const,
>     X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
>     std::allocator<X86ISA::TlbEntry> >::_M_bkt_num_key (this=0x28b8970,
>     __key=@0x21d9a7c8, __n=50331653)
>     at /usr/include/c++/4.4/backward/hashtable.h:590
> #1  0x00000000004dfff9 in __gnu_cxx::hashtable<std::pair<unsigned long
>     const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
>     long>, std::_Select1st<std::pair<unsigned long const,
>     X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
>     std::allocator<X86ISA::TlbEntry> >::_M_bkt_num (this=0x28b8970,
>     __obj=..., __n=50331653)
>     at /usr/include/c++/4.4/backward/hashtable.h:594
> #2  0x00000000004df9c8 in __gnu_cxx::hashtable<std::pair<unsigned long
>     const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
>     long>, std::_Select1st<std::pair<unsigned long const,
>     X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
>     std::allocator<X86ISA::TlbEntry> >::resize (this=0x28b8970,
>     __num_elements_hint=25165844)
>     at /usr/include/c++/4.4/backward/hashtable.h:1001
> #3  0x00000000004df100 in __gnu_cxx::hashtable<std::pair<unsigned long
>     const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
>     long>, std::_Select1st<std::pair<unsigned long const,
>     X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
>     std::allocator<X86ISA::TlbEntry> >::find_or_insert (this=0x28b8970,
>     __obj=...) at /usr/include/c++/4.4/backward/hashtable.h:789
> #4  0x00000000004deaca in __gnu_cxx::hash_map<unsigned long,
>     X86ISA::TlbEntry, __gnu_cxx::hash<unsigned long>,
>     std::equal_to<unsigned long>, std::allocator<X86ISA::TlbEntry>
>     >::operator[] (this=0x28b8970, __key=@0x7fffffffba80)
>     at /usr/include/c++/4.4/ext/hash_map:217
> #5  0x00000000004daa68 in PageTable::map (this=0x28b8970,
>     vaddr=47015569313792, paddr=103079288832, size=5548434767986339840,
>     clobber=false) at build/X86/mem/page_table.cc:82
> #6  0x000000000074b9c8 in Process::allocateMem (this=0x30be640,
>     vaddr=46912496128000, size=5548434871059525632, clobber=false)
>     at build/X86/sim/process.cc:332
> #7  0x00000000007aba21 in mmapFunc<X86Linux64> (desc=0x2052fb8, num=9,
>     p=0x30be640, tc=0x3331210) at build/X86/sim/syscall_emul.hh:1069
> #8  0x000000000073ca11 in SyscallDesc::doSyscall (this=0x2052fb8,
>     callnum=9, process=0x30be640, tc=0x3331210)
>     at build/X86/sim/syscall_emul.cc:69
> #9  0x00000000007516a0 in LiveProcess::syscall (this=0x30be640,
>     callnum=9, tc=0x3331210) at build/X86/sim/process.cc:590
> #10 0x0000000000c10ce3 in SimpleThread::syscall (this=0x33305d0,
>     callnum=9) at build/X86/cpu/simple_thread.hh:384
>
> As you can see, there is a problem with the mmapFunc<X86Linux64> syscall,
> which allocates memory through Process::allocateMem.
> That is my understanding....
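The size arguments in frames #5 and #6 look like the smoking gun:
5548434767986339840 bytes is roughly 4.8 EiB, so with 4 KiB pages
PageTable::map is being asked to create on the order of 10^15 page-table
entries, and frame #2 shows the underlying hash table rehashing itself as
those insertions pile up. A minimal standalone illustration of that failure
shape (this is not the gem5 source; std::unordered_map stands in for the
deprecated __gnu_cxx::hash_map, and the constants are copied from frame #5):

    // WARNING: this program deliberately exhausts memory if actually run.
    #include <cstdint>
    #include <unordered_map>

    int main()
    {
        const uint64_t pageBytes = 4096;               // assumed 4 KiB pages
        uint64_t vaddr = 47015569313792ULL;            // vaddr from frame #5
        int64_t size   = 5548434767986339840LL;        // size from frame #5 (~4.8 EiB)
        std::unordered_map<uint64_t, uint64_t> pTable; // stand-in for the page table
        // ~1.35e15 iterations; each one inserts a fresh entry, so the table
        // keeps resizing (frame #2) until the host kills the process.
        for (; size > 0; size -= pageBytes, vaddr += pageBytes)
            pTable[vaddr] = 0;                         // stand-in for TlbEntry
        return 0;
    }

So the real question is why mmapFunc hands such an enormous length down to
allocateMem in the first place.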
> On 4/27/12, Mahmood Naderan <mahmood...@gmail.com> wrote:
>> Is this useful?
>>
>> 339051500: system.cpu + A0 T0 : 0x83d48d.4 : CALL_NEAR_I : wrip , t7, t1 : IntAlu :
>> 339052000: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>> 339052000: system.cpu + A0 T0 : 0x852f90 : mov r10, rcx
>> 339052000: system.cpu + A0 T0 : 0x852f90.0 : MOV_R_R : mov r10, r10, rcx : IntAlu : D=0x0000000000000022
>> 339052500: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>> 339052500: system.cpu + A0 T0 : 0x852f93 : mov eax, 0x9
>> 339052500: system.cpu + A0 T0 : 0x852f93.0 : MOV_R_I : limm eax, 0x9 : IntAlu : D=0x0000000000000009
>> 339053000: system.cpu.icache: ReadReq (ifetch) 452f98 hit
>> ^C
>> Program received signal SIGINT, Interrupt.
>> 0x00000000004e0f90 in std::__fill_n_a<__gnu_cxx::_Hashtable_node<
>>     std::pair<unsigned long const, X86ISA::TlbEntry> >**, unsigned long,
>>     __gnu_cxx::_Hashtable_node<std::pair<unsigned long const,
>>     X86ISA::TlbEntry> >*> (__first=0x7fff70017000, __n=4065295,
>>     __value=@0x7fffffffb8d0)
>>     at /usr/include/c++/4.4/bits/stl_algobase.h:758
>> 758             *__first = __tmp;
>> (gdb) ^CQuit
>> (gdb)
>>
>> On 4/27/12, Steve Reinhardt <ste...@gmail.com> wrote:
>>> Perhaps you could fire off the run under gdb, and use the --debug-break
>>> flag to drop into gdb at the tick where it seems to stop running. If the
>>> simulation stops and memory blows up, it's almost like you're stuck in
>>> some subtle infinite loop with a memory allocation in it. (You might
>>> have to continue just a little past there and hit ctrl-c before it dies,
>>> to catch it in the middle of this loop.)
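For reference, the workflow Steve describes would look roughly like this (a
sketch, not a verified recipe: the tick is the last one printed in my trace,
and the remaining arguments are taken from my command line further down;
adjust both as needed):

    gdb --args ../build/X86/m5.debug --debug-break=339069500 \
        ../configs/example/se.py -c tonto_base.amd64-m64-gcc44-nn \
        --cpu-type=detailed --caches --l2cache
    (gdb) run
    ... gem5 stops at tick 339069500 ...
    (gdb) continue      # let it run into the allocation loop
    ^C                  # interrupt while memory is climbing
    (gdb) backtrace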
>>> On Fri, Apr 27, 2012 at 11:29 AM, Mahmood Naderan <mahmood...@gmail.com> wrote:
>>>> I searched for something similar (stopping the simulation when it
>>>> reaches a specific memory usage, to prevent it from being killed) but
>>>> didn't find such a thing. Do you know of one?
>>>>
>>>> I also attached gdb. It doesn't show anything useful because in the end
>>>> it gets killed.
>>>>
>>>> On 4/27/12, Gabe Black <gbl...@eecs.umich.edu> wrote:
>>>> > Valgrind should tell you where the leaked memory was allocated. You
>>>> > may have to give it a command line option for that, or stop it before
>>>> > it gets itself killed.
>>>> >
>>>> > Gabe
>>>> >
>>>> > On 04/27/12 11:10, Steve Reinhardt wrote:
>>>> >> Can you attach gdb when it does this, see where it's at, and maybe
>>>> >> step through the code a bit to see what it's doing?
>>>> >>
>>>> >> On Fri, Apr 27, 2012 at 10:54 AM, Mahmood Naderan
>>>> >> <mahmood...@gmail.com> wrote:
>>>> >>
>>>> >> That was a guess. As I said, I turned on the debugger to see when it
>>>> >> starts eating the memory. As you can see, the last message it prints
>>>> >> is:
>>>> >> 339069000: system.cpu + A0 T0 : 0x852f93.0 : MOV_R_I : limm eax, 0x9 : IntAlu : D=0x0000000000000009
>>>> >> 339069500: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>> >> 339069500: system.cpu.icache: ReadReq (ifetch) 452f98 hit
>>>> >>
>>>> >> Then no message is printed, and I see with the top command that the
>>>> >> memory usage goes up and up until it consumes all memory.
>>>> >>
>>>> >> On 4/27/12, Nilay Vaish <ni...@cs.wisc.edu> wrote:
>>>> >> > How do you know the instruction at which the memory starts
>>>> >> > leaking? What should we conclude from the instruction trace in
>>>> >> > your mail? I am unable to arrive at any conclusion from the
>>>> >> > valgrind report that you had attached. Apart from the info on
>>>> >> > uninitialized values, I did not find any useful output produced
>>>> >> > by valgrind.
>>>> >> >
>>>> >> > --
>>>> >> > Nilay
>>>> >> >
>>>> >> > On Fri, 27 Apr 2012, Mahmood Naderan wrote:
>>>> >> >
>>>> >> >> tonto with the test input uses about 4 GB and runs for about 2
>>>> >> >> seconds on a real machine.
>>>> >> >>
>>>> >> >> I also used the test input with gem5. However, again after tick
>>>> >> >> 300000000, all 30 GB of memory is used and then gem5 is killed.
>>>> >> >> The same behaviour occurs with the ref input...
>>>> >> >>
>>>> >> >> I ran the following command:
>>>> >> >> valgrind --tool=memcheck --leak-check=full --track-origins=yes
>>>> >> >> --suppressions=../util/valgrind-suppressions ../build/X86/m5.debug
>>>> >> >> --debug-flags=Cache,ExecAll,Bus,CacheRepl,Context
>>>> >> >> --trace-start=339050000 ../configs/example/se.py -c
>>>> >> >> tonto_base.amd64-m64-gcc44-nn --cpu-type=detailed -F 5000000
>>>> >> >> --maxtick 10000000 --caches --l2cache --prog-interval=100000
>>>> >> >>
>>>> >> >> I also attach the report again. At the instruction where the
>>>> >> >> memory leak begins, you can see:
>>>> >> >> ...
>>>> >> >> 339066000: system.cpu + A0 T0 : 0x83d48d : call 0x15afe
>>>> >> >> 339066000: system.cpu + A0 T0 : 0x83d48d.0 : CALL_NEAR_I : limm t1, 0x15afe : IntAlu : D=0x0000000000015afe
>>>> >> >> 339066500: system.cpu + A0 T0 : 0x83d48d.1 : CALL_NEAR_I : rdip t7, %ctrl153, : IntAlu : D=0x000000000083d492
>>>> >> >> 339067000: system.cpu.dcache: set 9a: moving blk 5aa680 to MRU
>>>> >> >> 339067000: system.cpu.dcache: WriteReq 5aa6b8 hit
>>>> >> >> 339067000: system.cpu + A0 T0 : 0x83d48d.2 : CALL_NEAR_I : st t7, SS:[rsp + 0xfffffffffffffff8] : MemWrite : D=0x000000000083d492 A=0x7fffffffe6b8
>>>> >> >> 339067500: system.cpu + A0 T0 : 0x83d48d.3 : CALL_NEAR_I : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffe6b8
>>>> >> >> 339068000: system.cpu + A0 T0 : 0x83d48d.4 : CALL_NEAR_I : wrip , t7, t1 : IntAlu :
>>>> >> >> 339068500: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>> >> >> 339068500: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>>>> >> >> 339068500: system.cpu + A0 T0 : 0x852f90 : mov r10, rcx
>>>> >> >> 339068500: system.cpu + A0 T0 : 0x852f90.0 : MOV_R_R : mov r10, r10, rcx : IntAlu : D=0x0000000000000022
>>>> >> >> 339069000: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>> >> >> 339069000: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>>>> >> >> 339069000: system.cpu + A0 T0 : 0x852f93 : mov eax, 0x9
>>>> >> >> 339069000: system.cpu + A0 T0 : 0x852f93.0 : MOV_R_I : limm eax, 0x9 : IntAlu : D=0x0000000000000009
>>>> >> >> 339069500: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>> >> >> 339069500: system.cpu.icache: ReadReq (ifetch) 452f98 hit
>>>> >> >>
>>>> >> >> What is your opinion then?
>>>> >> >> Regards,
>>>> >> >>
>>>> >> >> On 4/27/12, Steve Reinhardt <ste...@gmail.com> wrote:
>>>> >> >>> Also, if you do run valgrind, use the util/valgrind-suppressions
>>>> >> >>> file to suppress spurious reports. Read the valgrind docs to see
>>>> >> >>> how this works.
>>>> >> >>>
>>>> >> >>> Steve
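Also worth noting: the mov eax, 0x9 right before everything stops loads
syscall number 9, which is mmap on x86-64 Linux, so the trace and the
backtrace point at the same call. To catch the bad call at the source
instead of waiting for the page table to blow up, one could drop a temporary
guard into the syscall handler, along these lines (a hypothetical debugging
aid, not the existing gem5 code; the exact getSyscallArg usage and the 1 TiB
threshold are assumptions):

    // Hypothetical guard near the top of mmapFunc in syscall_emul.hh:
    // reject implausibly large lengths instead of silently trying to map
    // exabytes worth of pages one hash_map entry at a time.
    uint64_t length = p->getSyscallArg(tc, 1);  // mmap's second argument
    if (length > (1ULL << 40)) {                // > 1 TiB: almost certainly garbage
        warn("mmap: suspicious length %#llx, returning EINVAL\n", length);
        return -EINVAL;
    }

With that in place, a breakpoint on the warn would show where the corrupted
length comes from (a bad register read, a guest-side bug, or a mis-decoded
argument).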
>>>> >> >>> >>>> >> >>> Steve >>>> >> >>> >>>> >> > _______________________________________________ >>>> >> > gem5-users mailing list >>>> >> > gem5-users@gem5.org <mailto:gem5-users@gem5.org> >>>> >> > http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users >>>> >> > >>>> >> >>>> >> >>>> >> -- >>>> >> // Naderan *Mahmood; >>>> >> _______________________________________________ >>>> >> gem5-users mailing list >>>> >> gem5-users@gem5.org <mailto:gem5-users@gem5.org> >>>> >> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users >>>> >> >>>> >> >>>> >> >>>> >> _______________________________________________ >>>> >> gem5-users mailing list >>>> >> gem5-users@gem5.org >>>> >> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users >>>> > >>>> > >>>> >>>> >>>> -- >>>> // Naderan *Mahmood; >>>> _______________________________________________ >>>> gem5-users mailing list >>>> gem5-users@gem5.org >>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users >>>> >>> >> >> >> -- >> // Naderan *Mahmood; >> > > > -- > // Naderan *Mahmood; > -- // Naderan *Mahmood; _______________________________________________ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users