SyscallVerbose shows this:

300000000: Event_36: system.cpu progress event, total committed:535958, progress insts committed: 89710, IPC: 0.8971
336777500: system.cpu: syscall time called w/arguments 15,0,0,140737488349664
336777500: system.cpu: syscall time returns 1000000000
336795000: system.cpu: syscall time called w/arguments 15,0,0,0
336795000: system.cpu: syscall time returns 1000000000
337615500: system.cpu: syscall open called w/arguments 2,438,0,9618232
337615500: global: opening file /etc/localtime
337615500: system.cpu: syscall open returns 5
337748000: system.cpu: syscall fstat called w/arguments 2,140737488349184,140737488349184,5
337748000: system.cpu: syscall fstat returns 0
337857500: system.cpu: syscall fstat called w/arguments 2,140737488348816,140737488348816,5
337857500: system.cpu: syscall fstat returns 0
337883500: system.cpu: syscall mmap called w/arguments 34,3,8192,0
337883500: system.cpu: syscall mmap returns 46912496119808
338041000: system.cpu: syscall read called w/arguments 34,8192,46912496119808,5
338041000: system.cpu: syscall read returns 1622
338303500: system.cpu: syscall lseek called w/arguments 100663296,1,18446744073709550604,5
338303500: system.cpu: syscall lseek returns 610
338483500: system.cpu: syscall read called w/arguments 100663296,8192,46912496119808,5
338483500: system.cpu: syscall read returns 1012
339055500: system.cpu: syscall mmap called w/arguments 34,3,5548434871059525632,0
*** Exited on signal SIGKILL ***
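Note that the earlier mmap in this trace asks for 8192 bytes, while the final one asks for 5548434871059525632 bytes (roughly five exabytes) immediately before the kill. Below is a minimal, self-contained sketch (hypothetical values and names, not gem5 code) of the kind of 32-bit/64-bit width mix-up Steve suggests further down, where reading a full 64-bit register whose upper bits hold stale data turns a small length into an absurd one:

    #include <cstdint>
    #include <cstdio>

    int main()
    {
        // Hypothetical register contents: only the low 32 bits carry the
        // length the application intended; the upper bits are stale junk.
        std::uint64_t raw_reg = 0x4d00abcd00002000ULL;

        std::uint32_t intended = static_cast<std::uint32_t>(raw_reg); // 8192
        std::uint64_t misread  = raw_reg; // what a 64-bit read would see

        std::printf("intended length: %u bytes\n", intended);
        std::printf("misread length:  %llu bytes\n",
                    static_cast<unsigned long long>(misread));
        return 0;
    }

Whether something like this is actually happening here is exactly what breaking in mmapFunc<X86Linux64> (as suggested below) would confirm.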
On 4/28/12, Steve Reinhardt <ste...@gmail.com> wrote:
> You could try running with the SyscallVerbose debug flag and/or setting a
> breakpoint in mmapFunc<X86Linux64> to see if/why the application seems to
> be requesting such a ridiculously large amount of memory. It could be
> something like trying to read a 64-bit value where only a 32-bit value is
> being provided.
>
> Make sure you use gem5.debug for this if you're not already.
>
> Steve
>
> On Sat, Apr 28, 2012 at 9:10 AM, Nilay Vaish <ni...@cs.wisc.edu> wrote:
>
>> Check why the size of the memory being asked for is so high. It might be
>> that there is some problem with the way 'length' is found in mmapFunc(),
>> or it could be that the application did request a huge chunk of memory to
>> be mmapped.
>>
>> --
>> Nilay
>>
>> On Sat, 28 Apr 2012, Mahmood Naderan wrote:
>>
>>> Isn't there any idea about my finding?
>>>
>>> On 4/27/12, Mahmood Naderan <mahmood...@gmail.com> wrote:
>>>
>>>> OK, I think I found the bug. I used "continue" and "ctrl+c" multiple
>>>> times to see if it was stuck at a particular function. The backtrace
>>>> shows:
>>>>
>>>> #0  0x00000000004dfee7 in __gnu_cxx::hashtable<std::pair<unsigned long
>>>>     const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
>>>>     long>, std::_Select1st<std::pair<unsigned long const,
>>>>     X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
>>>>     std::allocator<X86ISA::TlbEntry> >::_M_bkt_num_key (this=0x28b8970,
>>>>     __key=@0x21d9a7c8, __n=50331653)
>>>>     at /usr/include/c++/4.4/backward/hashtable.h:590
>>>> #1  0x00000000004dfff9 in __gnu_cxx::hashtable<std::pair<unsigned long
>>>>     const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
>>>>     long>, std::_Select1st<std::pair<unsigned long const,
>>>>     X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
>>>>     std::allocator<X86ISA::TlbEntry> >::_M_bkt_num (this=0x28b8970,
>>>>     __obj=..., __n=50331653)
>>>>     at /usr/include/c++/4.4/backward/hashtable.h:594
>>>> #2  0x00000000004df9c8 in __gnu_cxx::hashtable<std::pair<unsigned long
>>>>     const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
>>>>     long>, std::_Select1st<std::pair<unsigned long const,
>>>>     X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
>>>>     std::allocator<X86ISA::TlbEntry> >::resize (this=0x28b8970,
>>>>     __num_elements_hint=25165844)
>>>>     at /usr/include/c++/4.4/backward/hashtable.h:1001
>>>> #3  0x00000000004df100 in __gnu_cxx::hashtable<std::pair<unsigned long
>>>>     const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
>>>>     long>, std::_Select1st<std::pair<unsigned long const,
>>>>     X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
>>>>     std::allocator<X86ISA::TlbEntry> >::find_or_insert (this=0x28b8970,
>>>>     __obj=...)
>>>>     at /usr/include/c++/4.4/backward/hashtable.h:789
>>>> #4  0x00000000004deaca in __gnu_cxx::hash_map<unsigned long,
>>>>     X86ISA::TlbEntry, __gnu_cxx::hash<unsigned long>,
>>>>     std::equal_to<unsigned long>, std::allocator<X86ISA::TlbEntry>
>>>>     >::operator[] (this=0x28b8970, __key=@0x7fffffffba80)
>>>>     at /usr/include/c++/4.4/ext/hash_map:217
>>>> #5  0x00000000004daa68 in PageTable::map (this=0x28b8970,
>>>>     vaddr=47015569313792, paddr=103079288832,
>>>>     size=5548434767986339840, clobber=false)
>>>>     at build/X86/mem/page_table.cc:82
>>>> #6  0x000000000074b9c8 in Process::allocateMem (this=0x30be640,
>>>>     vaddr=46912496128000, size=5548434871059525632, clobber=false)
>>>>     at build/X86/sim/process.cc:332
>>>> #7  0x00000000007aba21 in mmapFunc<X86Linux64> (desc=0x2052fb8, num=9,
>>>>     p=0x30be640, tc=0x3331210) at build/X86/sim/syscall_emul.hh:1069
>>>> #8  0x000000000073ca11 in SyscallDesc::doSyscall (this=0x2052fb8,
>>>>     callnum=9, process=0x30be640, tc=0x3331210)
>>>>     at build/X86/sim/syscall_emul.cc:69
>>>> #9  0x00000000007516a0 in LiveProcess::syscall (this=0x30be640,
>>>>     callnum=9, tc=0x3331210) at build/X86/sim/process.cc:590
>>>> #10 0x0000000000c10ce3 in SimpleThread::syscall (this=0x33305d0,
>>>>     callnum=9) at build/X86/cpu/simple_thread.hh:384
>>>>
>>>> As you can see, there is a problem with the mmapFunc<X86Linux64>
>>>> syscall, which allocates memory through Process::allocateMem.
>>>> That is my understanding....
>>>>
>>>> On 4/27/12, Mahmood Naderan <mahmood...@gmail.com> wrote:
>>>>
>>>>> Is this useful?
>>>>>
>>>>> 339051500: system.cpu + A0 T0 : 0x83d48d.4 : CALL_NEAR_I : wrip ,
>>>>> t7, t1 : IntAlu :
>>>>> 339052000: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>>>>> 339052000: system.cpu + A0 T0 : 0x852f90 : mov r10, rcx
>>>>> 339052000: system.cpu + A0 T0 : 0x852f90.0 : MOV_R_R : mov r10,
>>>>> r10, rcx : IntAlu : D=0x0000000000000022
>>>>> 339052500: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>>>>> 339052500: system.cpu + A0 T0 : 0x852f93 : mov eax, 0x9
>>>>> 339052500: system.cpu + A0 T0 : 0x852f93.0 : MOV_R_I : limm eax,
>>>>> 0x9 : IntAlu : D=0x0000000000000009
>>>>> 339053000: system.cpu.icache: ReadReq (ifetch) 452f98 hit
>>>>> ^C
>>>>> Program received signal SIGINT, Interrupt.
>>>>> 0x00000000004e0f90 in
>>>>> std::__fill_n_a<__gnu_cxx::_Hashtable_node<std::pair<unsigned long
>>>>> const, X86ISA::TlbEntry> >**, unsigned long,
>>>>> __gnu_cxx::_Hashtable_node<std::pair<unsigned long const,
>>>>> X86ISA::TlbEntry> >*> (__first=0x7fff70017000, __n=4065295,
>>>>> __value=@0x7fffffffb8d0)
>>>>> at /usr/include/c++/4.4/bits/stl_algobase.h:758
>>>>> 758             *__first = __tmp;
>>>>> (gdb) ^CQuit
>>>>> (gdb)
>>>>>
>>>>> On 4/27/12, Steve Reinhardt <ste...@gmail.com> wrote:
>>>>>
>>>>>> Perhaps you could fire off the run under gdb, and use the --debug-break
>>>>>> flag to drop in to gdb at the tick where it seems to stop running. If
>>>>>> the simulation stops and memory blows up, it's almost like you're stuck
>>>>>> in some subtle infinite loop with a memory allocation in it. (You might
>>>>>> have to continue just a little past there and hit ctrl-c before it dies
>>>>>> to catch it in the middle of this loop.)
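Read together with the backtrace above, this looks less like a leak than like a legitimate mapping of an absurdly sized request: frame #5 shows PageTable::map inserting entries into a hash map page by page, so a multi-exabyte length means on the order of 10^15 insertions, and the host runs out of memory long before the loop finishes. A simplified, hypothetical sketch of that pattern (not the actual page_table.cc code):

    #include <cstdint>
    #include <cstdio>
    #include <unordered_map>

    // Stand-in for the pattern the backtrace shows: one entry per page of
    // the requested region is inserted into a hash map.
    struct TlbEntryStub { std::uint64_t paddr; };

    void map_region(std::unordered_map<std::uint64_t, TlbEntryStub> &page_table,
                    std::uint64_t vaddr, std::uint64_t paddr, std::uint64_t size)
    {
        const std::uint64_t page_bytes = 4096;
        std::uint64_t npages = (size + page_bytes - 1) / page_bytes;
        // With size = 5548434871059525632 (~5 exabytes), npages is on the
        // order of 10^15, so the map grows until the host kills the process.
        for (std::uint64_t i = 0; i < npages; ++i) {
            page_table[vaddr] = TlbEntryStub{paddr};
            vaddr += page_bytes;
            paddr += page_bytes;
        }
    }

    int main()
    {
        std::unordered_map<std::uint64_t, TlbEntryStub> page_table;
        map_region(page_table, 0x400000, 0x0, 8192);   // a sane request: 2 pages
        std::printf("entries mapped: %zu\n", page_table.size());
        return 0;
    }

That would also explain why valgrind reports nothing useful: every allocation stays reachable through the page table, so there is no leak to find, only a length that needs to be questioned before it reaches allocateMem.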
>>>>>>
>>>>>> On Fri, Apr 27, 2012 at 11:29 AM, Mahmood Naderan
>>>>>> <mahmood...@gmail.com> wrote:
>>>>>>
>>>>>>> I searched for something similar (stopping the simulation when it
>>>>>>> reaches a specific memory usage, to prevent it being killed) but
>>>>>>> didn't find such a thing. Do you know of one?
>>>>>>>
>>>>>>> I also attached gdb. It doesn't show anything useful, because in the
>>>>>>> end the process just gets killed.
>>>>>>>
>>>>>>> On 4/27/12, Gabe Black <gbl...@eecs.umich.edu> wrote:
>>>>>>>
>>>>>>>> Valgrind should tell you where the leaked memory was allocated. You
>>>>>>>> may have to give it a command line option for that, or stop it
>>>>>>>> before it gets itself killed.
>>>>>>>>
>>>>>>>> Gabe
>>>>>>>>
>>>>>>>> On 04/27/12 11:10, Steve Reinhardt wrote:
>>>>>>>>
>>>>>>>>> Can you attach gdb when it does this, see where it's at, and maybe
>>>>>>>>> step through the code a bit to see what it's doing?
>>>>>>>>>
>>>>>>>>> On Fri, Apr 27, 2012 at 10:54 AM, Mahmood Naderan
>>>>>>>>> <mahmood...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> That was a guess. As I said, I turned on the debugger to see when
>>>>>>>>>> it starts eating the memory. As you can see, the last message it
>>>>>>>>>> prints is:
>>>>>>>>>>
>>>>>>>>>> 339069000: system.cpu + A0 T0 : 0x852f93.0 : MOV_R_I : limm eax,
>>>>>>>>>> 0x9 : IntAlu : D=0x0000000000000009
>>>>>>>>>> 339069500: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>>>>>>>> 339069500: system.cpu.icache: ReadReq (ifetch) 452f98 hit
>>>>>>>>>>
>>>>>>>>>> Then no further message is printed, and I see with the top command
>>>>>>>>>> that the memory usage goes up and up until it consumes all memory.
>>>>>>>>>>
>>>>>>>>>> On 4/27/12, Nilay Vaish <ni...@cs.wisc.edu> wrote:
>>>>>>>>>>
>>>>>>>>>>> How do you know the instruction at which the memory starts
>>>>>>>>>>> leaking? What should we conclude from the instruction trace in
>>>>>>>>>>> your mail? I am unable to arrive at any conclusion from the
>>>>>>>>>>> valgrind report that you had attached. Apart from the info on
>>>>>>>>>>> uninitialized values, I did not find any useful output produced
>>>>>>>>>>> by valgrind.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Nilay
>>>>>>>>>>>
>>>>>>>>>>> On Fri, 27 Apr 2012, Mahmood Naderan wrote:
>>>>>>>>>>>
>>>>>>>>>>>> tonto with the test input uses about 4 GB and runs for about 2
>>>>>>>>>>>> seconds on a real machine.
>>>>>>>>>>>>
>>>>>>>>>>>> I also used the test input with gem5. However, again after tick
>>>>>>>>>>>> 300000000 all of the 30 GB of memory is used and then gem5 is
>>>>>>>>>>>> killed. The same behaviour with the ref input...
>>>>>>>>>>>>
>>>>>>>>>>>> I ran the following command:
>>>>>>>>>>>>
>>>>>>>>>>>> valgrind --tool=memcheck --leak-check=full --track-origins=yes
>>>>>>>>>>>> --suppressions=../util/valgrind-suppressions ../build/X86/m5.debug
>>>>>>>>>>>> --debug-flags=Cache,ExecAll,Bus,CacheRepl,Context
>>>>>>>>>>>> --trace-start=339050000 ../configs/example/se.py -c
>>>>>>>>>>>> tonto_base.amd64-m64-gcc44-nn --cpu-type=detailed -F 5000000
>>>>>>>>>>>> --maxtick 10000000 --caches --l2cache --prog-interval=100000
>>>>>>>>>>>>
>>>>>>>>>>>> I also attach the report again. At the instruction where the
>>>>>>>>>>>> memory leak begins, you can see:
>>>>>>>>>>>> ...
>>>>>>>>>>>> 339066000: system.cpu + A0 T0 : 0x83d48d : call 0x15afe
>>>>>>>>>>>> 339066000: system.cpu + A0 T0 : 0x83d48d.0 : CALL_NEAR_I : limm
>>>>>>>>>>>> t1, 0x15afe : IntAlu : D=0x0000000000015afe
>>>>>>>>>>>> 339066500: system.cpu + A0 T0 : 0x83d48d.1 : CALL_NEAR_I : rdip
>>>>>>>>>>>> t7, %ctrl153, : IntAlu : D=0x000000000083d492
>>>>>>>>>>>> 339067000: system.cpu.dcache: set 9a: moving blk 5aa680 to MRU
>>>>>>>>>>>> 339067000: system.cpu.dcache: WriteReq 5aa6b8 hit
>>>>>>>>>>>> 339067000: system.cpu + A0 T0 : 0x83d48d.2 : CALL_NEAR_I : st t7,
>>>>>>>>>>>> SS:[rsp + 0xfffffffffffffff8] : MemWrite : D=0x000000000083d492
>>>>>>>>>>>> A=0x7fffffffe6b8
>>>>>>>>>>>> 339067500: system.cpu + A0 T0 : 0x83d48d.3 : CALL_NEAR_I : subi
>>>>>>>>>>>> rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffe6b8
>>>>>>>>>>>> 339068000: system.cpu + A0 T0 : 0x83d48d.4 : CALL_NEAR_I : wrip ,
>>>>>>>>>>>> t7, t1 : IntAlu :
>>>>>>>>>>>> 339068500: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>>>>>>>>>> 339068500: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>>>>>>>>>>>> 339068500: system.cpu + A0 T0 : 0x852f90 : mov r10, rcx
>>>>>>>>>>>> 339068500: system.cpu + A0 T0 : 0x852f90.0 : MOV_R_R : mov r10,
>>>>>>>>>>>> r10, rcx : IntAlu : D=0x0000000000000022
>>>>>>>>>>>> 339069000: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>>>>>>>>>> 339069000: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>>>>>>>>>>>> 339069000: system.cpu + A0 T0 : 0x852f93 : mov eax, 0x9
>>>>>>>>>>>> 339069000: system.cpu + A0 T0 : 0x852f93.0 : MOV_R_I : limm eax,
>>>>>>>>>>>> 0x9 : IntAlu : D=0x0000000000000009
>>>>>>>>>>>> 339069500: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>>>>>>>>>> 339069500: system.cpu.icache: ReadReq (ifetch) 452f98 hit
>>>>>>>>>>>>
>>>>>>>>>>>> What is your opinion then?
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> On 4/27/12, Steve Reinhardt <ste...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Also, if you do run valgrind, use the util/valgrind-suppressions
>>>>>>>>>>>>> file to suppress spurious reports. Read the valgrind docs to see
>>>>>>>>>>>>> how this works.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Steve

--
// Naderan *Mahmood;
_______________________________________________
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users