SyscallVerbose shows this:

300000000: Event_36: system.cpu progress event, total committed:535958, progress insts committed: 89710, IPC: 0.8971
336777500: system.cpu: syscall time called w/arguments 15,0,0,140737488349664
336777500: system.cpu: syscall time returns 1000000000
336795000: system.cpu: syscall time called w/arguments 15,0,0,0
336795000: system.cpu: syscall time returns 1000000000
337615500: system.cpu: syscall open called w/arguments 2,438,0,9618232
337615500: global: opening file /etc/localtime
337615500: system.cpu: syscall open returns 5
337748000: system.cpu: syscall fstat called w/arguments 2,140737488349184,140737488349184,5
337748000: system.cpu: syscall fstat returns 0
337857500: system.cpu: syscall fstat called w/arguments 2,140737488348816,140737488348816,5
337857500: system.cpu: syscall fstat returns 0
337883500: system.cpu: syscall mmap called w/arguments 34,3,8192,0
337883500: system.cpu: syscall mmap returns 46912496119808
338041000: system.cpu: syscall read called w/arguments 34,8192,46912496119808,5
338041000: system.cpu: syscall read returns 1622
338303500: system.cpu: syscall lseek called w/arguments 100663296,1,18446744073709550604,5
338303500: system.cpu: syscall lseek returns 610
338483500: system.cpu: syscall read called w/arguments 100663296,8192,46912496119808,5
338483500: system.cpu: syscall read returns 1012
339055500: system.cpu: syscall mmap called w/arguments 34,3,5548434871059525632,0
*** Exited on signal SIGKILL ***
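Note that the earlier mmap in this trace asks for 8192 bytes, while the final one asks for 5548434871059525632 bytes (roughly five exabytes) immediately before the kill. Below is a minimal, self-contained sketch (hypothetical values and names, not gem5 code) of the kind of 32-bit/64-bit width mix-up Steve suggests further down, where reading a full 64-bit register whose upper bits hold stale data turns a small length into an absurd one:

    #include <cstdint>
    #include <cstdio>

    int main()
    {
        // Hypothetical register contents: only the low 32 bits carry the
        // length the application intended; the upper bits are stale junk.
        std::uint64_t raw_reg = 0x4d00abcd00002000ULL;

        std::uint32_t intended = static_cast<std::uint32_t>(raw_reg); // 8192
        std::uint64_t misread  = raw_reg; // what a 64-bit read would see

        std::printf("intended length: %u bytes\n", intended);
        std::printf("misread length:  %llu bytes\n",
                    static_cast<unsigned long long>(misread));
        return 0;
    }

Whether something like this is actually happening here is exactly what breaking in mmapFunc<X86Linux64> (as suggested below) would confirm.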
On 4/28/12, Steve Reinhardt <ste...@gmail.com> wrote:
> You could try running with the SyscallVerbose debug flag and/or setting a
> breakpoint in mmapFunc<X86Linux64> to see if/why the application seems to
> be requesting such a ridiculously large amount of memory. It could be
> something like trying to read a 64-bit value where only a 32-bit value is
> being provided.
>
> Make sure you use gem5.debug for this if you're not already.
>
> Steve
>
> On Sat, Apr 28, 2012 at 9:10 AM, Nilay Vaish <ni...@cs.wisc.edu> wrote:
>
>> Check why the size of the memory being asked for is so high. It might be
>> that there is some problem with the way 'length' is found in mmapFunc(),
>> or it could be that the application did request a huge chunk of memory to
>> be mmapped.
>>
>> --
>> Nilay
>>
>> On Sat, 28 Apr 2012, Mahmood Naderan wrote:
>>
>>> Isn't there any idea about my finding?
>>>
>>> On 4/27/12, Mahmood Naderan <mahmood...@gmail.com> wrote:
>>>
>>>> OK, I think I found the bug. I used "continue" and "ctrl+c" multiple
>>>> times to see if it was stuck at a particular function. The backtrace
>>>> shows:
>>>>
>>>> #0  0x00000000004dfee7 in __gnu_cxx::hashtable<std::pair<unsigned long
>>>>     const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
>>>>     long>, std::_Select1st<std::pair<unsigned long const,
>>>>     X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
>>>>     std::allocator<X86ISA::TlbEntry> >::_M_bkt_num_key (this=0x28b8970,
>>>>     __key=@0x21d9a7c8, __n=50331653)
>>>>     at /usr/include/c++/4.4/backward/hashtable.h:590
>>>> #1  0x00000000004dfff9 in __gnu_cxx::hashtable<std::pair<unsigned long
>>>>     const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
>>>>     long>, std::_Select1st<std::pair<unsigned long const,
>>>>     X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
>>>>     std::allocator<X86ISA::TlbEntry> >::_M_bkt_num (this=0x28b8970,
>>>>     __obj=..., __n=50331653)
>>>>     at /usr/include/c++/4.4/backward/hashtable.h:594
>>>> #2  0x00000000004df9c8 in __gnu_cxx::hashtable<std::pair<unsigned long
>>>>     const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
>>>>     long>, std::_Select1st<std::pair<unsigned long const,
>>>>     X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
>>>>     std::allocator<X86ISA::TlbEntry> >::resize (this=0x28b8970,
>>>>     __num_elements_hint=25165844)
>>>>     at /usr/include/c++/4.4/backward/hashtable.h:1001
>>>> #3  0x00000000004df100 in __gnu_cxx::hashtable<std::pair<unsigned long
>>>>     const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
>>>>     long>, std::_Select1st<std::pair<unsigned long const,
>>>>     X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
>>>>     std::allocator<X86ISA::TlbEntry> >::find_or_insert (this=0x28b8970,
>>>>     __obj=...)
>>>>     at /usr/include/c++/4.4/backward/hashtable.h:789
>>>> #4  0x00000000004deaca in __gnu_cxx::hash_map<unsigned long,
>>>>     X86ISA::TlbEntry, __gnu_cxx::hash<unsigned long>,
>>>>     std::equal_to<unsigned long>, std::allocator<X86ISA::TlbEntry>
>>>>     >::operator[] (this=0x28b8970, __key=@0x7fffffffba80)
>>>>     at /usr/include/c++/4.4/ext/hash_map:217
>>>> #5  0x00000000004daa68 in PageTable::map (this=0x28b8970,
>>>>     vaddr=47015569313792, paddr=103079288832,
>>>>     size=5548434767986339840, clobber=false)
>>>>     at build/X86/mem/page_table.cc:82
>>>> #6  0x000000000074b9c8 in Process::allocateMem (this=0x30be640,
>>>>     vaddr=46912496128000, size=5548434871059525632, clobber=false)
>>>>     at build/X86/sim/process.cc:332
>>>> #7  0x00000000007aba21 in mmapFunc<X86Linux64> (desc=0x2052fb8, num=9,
>>>>     p=0x30be640, tc=0x3331210) at build/X86/sim/syscall_emul.hh:1069
>>>> #8  0x000000000073ca11 in SyscallDesc::doSyscall (this=0x2052fb8,
>>>>     callnum=9, process=0x30be640, tc=0x3331210)
>>>>     at build/X86/sim/syscall_emul.cc:69
>>>> #9  0x00000000007516a0 in LiveProcess::syscall (this=0x30be640,
>>>>     callnum=9, tc=0x3331210) at build/X86/sim/process.cc:590
>>>> #10 0x0000000000c10ce3 in SimpleThread::syscall (this=0x33305d0,
>>>>     callnum=9) at build/X86/cpu/simple_thread.hh:384
>>>>
>>>> As you can see, there is a problem with the mmapFunc<X86Linux64>
>>>> syscall, which allocates memory through Process::allocateMem.
>>>> That is my understanding....
>>>>
>>>> On 4/27/12, Mahmood Naderan <mahmood...@gmail.com> wrote:
>>>>
>>>>> Is this useful?
>>>>>
>>>>> 339051500: system.cpu + A0 T0 : 0x83d48d.4 : CALL_NEAR_I : wrip ,
>>>>> t7, t1 : IntAlu :
>>>>> 339052000: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>>>>> 339052000: system.cpu + A0 T0 : 0x852f90 : mov r10, rcx
>>>>> 339052000: system.cpu + A0 T0 : 0x852f90.0 : MOV_R_R : mov r10,
>>>>> r10, rcx : IntAlu : D=0x0000000000000022
>>>>> 339052500: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>>>>> 339052500: system.cpu + A0 T0 : 0x852f93 : mov eax, 0x9
>>>>> 339052500: system.cpu + A0 T0 : 0x852f93.0 : MOV_R_I : limm eax,
>>>>> 0x9 : IntAlu : D=0x0000000000000009
>>>>> 339053000: system.cpu.icache: ReadReq (ifetch) 452f98 hit
>>>>> ^C
>>>>> Program received signal SIGINT, Interrupt.
>>>>> 0x00000000004e0f90 in
>>>>> std::__fill_n_a<__gnu_cxx::_Hashtable_node<std::pair<unsigned long
>>>>> const, X86ISA::TlbEntry> >**, unsigned long,
>>>>> __gnu_cxx::_Hashtable_node<std::pair<unsigned long const,
>>>>> X86ISA::TlbEntry> >*> (__first=0x7fff70017000, __n=4065295,
>>>>> __value=@0x7fffffffb8d0)
>>>>> at /usr/include/c++/4.4/bits/stl_algobase.h:758
>>>>> 758             *__first = __tmp;
>>>>> (gdb) ^CQuit
>>>>> (gdb)
>>>>>
>>>>> On 4/27/12, Steve Reinhardt <ste...@gmail.com> wrote:
>>>>>
>>>>>> Perhaps you could fire off the run under gdb, and use the --debug-break
>>>>>> flag to drop in to gdb at the tick where it seems to stop running. If
>>>>>> the simulation stops and memory blows up, it's almost like you're stuck
>>>>>> in some subtle infinite loop with a memory allocation in it. (You might
>>>>>> have to continue just a little past there and hit ctrl-c before it dies
>>>>>> to catch it in the middle of this loop.)
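Read together with the backtrace above, this looks less like a leak than like a legitimate mapping of an absurdly sized request: frame #5 shows PageTable::map inserting entries into a hash map page by page, so a multi-exabyte length means on the order of 10^15 insertions, and the host runs out of memory long before the loop finishes. A simplified, hypothetical sketch of that pattern (not the actual page_table.cc code):

    #include <cstdint>
    #include <cstdio>
    #include <unordered_map>

    // Stand-in for the pattern the backtrace shows: one entry per page of
    // the requested region is inserted into a hash map.
    struct TlbEntryStub { std::uint64_t paddr; };

    void map_region(std::unordered_map<std::uint64_t, TlbEntryStub> &page_table,
                    std::uint64_t vaddr, std::uint64_t paddr, std::uint64_t size)
    {
        const std::uint64_t page_bytes = 4096;
        std::uint64_t npages = (size + page_bytes - 1) / page_bytes;
        // With size = 5548434871059525632 (~5 exabytes), npages is on the
        // order of 10^15, so the map grows until the host kills the process.
        for (std::uint64_t i = 0; i < npages; ++i) {
            page_table[vaddr] = TlbEntryStub{paddr};
            vaddr += page_bytes;
            paddr += page_bytes;
        }
    }

    int main()
    {
        std::unordered_map<std::uint64_t, TlbEntryStub> page_table;
        map_region(page_table, 0x400000, 0x0, 8192);   // a sane request: 2 pages
        std::printf("entries mapped: %zu\n", page_table.size());
        return 0;
    }

That would also explain why valgrind reports nothing useful: every allocation stays reachable through the page table, so there is no leak to find, only a length that needs to be questioned before it reaches allocateMem.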
>>>>>>
>>>>>> On Fri, Apr 27, 2012 at 11:29 AM, Mahmood Naderan
>>>>>> <mahmood...@gmail.com> wrote:
>>>>>>
>>>>>>> I searched for something similar (stopping the simulation when it
>>>>>>> reaches a specific memory usage, to prevent it being killed) but
>>>>>>> didn't find such a thing. Do you know of one?
>>>>>>>
>>>>>>> I also attached gdb. It doesn't show anything useful, because in the
>>>>>>> end the process just gets killed.
>>>>>>>
>>>>>>> On 4/27/12, Gabe Black <gbl...@eecs.umich.edu> wrote:
>>>>>>>
>>>>>>>> Valgrind should tell you where the leaked memory was allocated. You
>>>>>>>> may have to give it a command line option for that, or stop it
>>>>>>>> before it gets itself killed.
>>>>>>>>
>>>>>>>> Gabe
>>>>>>>>
>>>>>>>> On 04/27/12 11:10, Steve Reinhardt wrote:
>>>>>>>>
>>>>>>>>> Can you attach gdb when it does this, see where it's at, and maybe
>>>>>>>>> step through the code a bit to see what it's doing?
>>>>>>>>>
>>>>>>>>> On Fri, Apr 27, 2012 at 10:54 AM, Mahmood Naderan
>>>>>>>>> <mahmood...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> That was a guess. As I said, I turned on the debugger to see when
>>>>>>>>>> it starts eating the memory. As you can see, the last message it
>>>>>>>>>> prints is:
>>>>>>>>>>
>>>>>>>>>> 339069000: system.cpu + A0 T0 : 0x852f93.0 : MOV_R_I : limm eax,
>>>>>>>>>> 0x9 : IntAlu : D=0x0000000000000009
>>>>>>>>>> 339069500: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>>>>>>>> 339069500: system.cpu.icache: ReadReq (ifetch) 452f98 hit
>>>>>>>>>>
>>>>>>>>>> Then no further message is printed, and I see with the top command
>>>>>>>>>> that the memory usage goes up and up until it consumes all memory.
>>>>>>>>>>
>>>>>>>>>> On 4/27/12, Nilay Vaish <ni...@cs.wisc.edu> wrote:
>>>>>>>>>>
>>>>>>>>>>> How do you know the instruction at which the memory starts
>>>>>>>>>>> leaking? What should we conclude from the instruction trace in
>>>>>>>>>>> your mail? I am unable to arrive at any conclusion from the
>>>>>>>>>>> valgrind report that you had attached. Apart from the info on
>>>>>>>>>>> uninitialized values, I did not find any useful output produced
>>>>>>>>>>> by valgrind.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Nilay
>>>>>>>>>>>
>>>>>>>>>>> On Fri, 27 Apr 2012, Mahmood Naderan wrote:
>>>>>>>>>>>
>>>>>>>>>>>> tonto with the test input uses about 4 GB and runs for about 2
>>>>>>>>>>>> seconds on a real machine.
>>>>>>>>>>>>
>>>>>>>>>>>> I also used the test input with gem5. However, again after tick
>>>>>>>>>>>> 300000000 all of the 30 GB of memory is used and then gem5 is
>>>>>>>>>>>> killed. The same behaviour with the ref input...
>>>>>>>>>>>>
>>>>>>>>>>>> I ran the following command:
>>>>>>>>>>>>
>>>>>>>>>>>> valgrind --tool=memcheck --leak-check=full --track-origins=yes
>>>>>>>>>>>> --suppressions=../util/valgrind-suppressions ../build/X86/m5.debug
>>>>>>>>>>>> --debug-flags=Cache,ExecAll,Bus,CacheRepl,Context
>>>>>>>>>>>> --trace-start=339050000 ../configs/example/se.py -c
>>>>>>>>>>>> tonto_base.amd64-m64-gcc44-nn --cpu-type=detailed -F 5000000
>>>>>>>>>>>> --maxtick 10000000 --caches --l2cache --prog-interval=100000
>>>>>>>>>>>>
>>>>>>>>>>>> I also attach the report again. At the instruction where the
>>>>>>>>>>>> memory leak begins, you can see:
>>>>>>>>>>>> ...
>>>>>>>>>>>> 339066000: system.cpu + A0 T0 : 0x83d48d : call 0x15afe
>>>>>>>>>>>> 339066000: system.cpu + A0 T0 : 0x83d48d.0 : CALL_NEAR_I : limm
>>>>>>>>>>>> t1, 0x15afe : IntAlu : D=0x0000000000015afe
>>>>>>>>>>>> 339066500: system.cpu + A0 T0 : 0x83d48d.1 : CALL_NEAR_I : rdip
>>>>>>>>>>>> t7, %ctrl153, : IntAlu : D=0x000000000083d492
>>>>>>>>>>>> 339067000: system.cpu.dcache: set 9a: moving blk 5aa680 to MRU
>>>>>>>>>>>> 339067000: system.cpu.dcache: WriteReq 5aa6b8 hit
>>>>>>>>>>>> 339067000: system.cpu + A0 T0 : 0x83d48d.2 : CALL_NEAR_I : st t7,
>>>>>>>>>>>> SS:[rsp + 0xfffffffffffffff8] : MemWrite : D=0x000000000083d492
>>>>>>>>>>>> A=0x7fffffffe6b8
>>>>>>>>>>>> 339067500: system.cpu + A0 T0 : 0x83d48d.3 : CALL_NEAR_I : subi
>>>>>>>>>>>> rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffe6b8
>>>>>>>>>>>> 339068000: system.cpu + A0 T0 : 0x83d48d.4 : CALL_NEAR_I : wrip ,
>>>>>>>>>>>> t7, t1 : IntAlu :
>>>>>>>>>>>> 339068500: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>>>>>>>>>> 339068500: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>>>>>>>>>>>> 339068500: system.cpu + A0 T0 : 0x852f90 : mov r10, rcx
>>>>>>>>>>>> 339068500: system.cpu + A0 T0 : 0x852f90.0 : MOV_R_R : mov r10,
>>>>>>>>>>>> r10, rcx : IntAlu : D=0x0000000000000022
>>>>>>>>>>>> 339069000: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>>>>>>>>>> 339069000: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>>>>>>>>>>>> 339069000: system.cpu + A0 T0 : 0x852f93 : mov eax, 0x9
>>>>>>>>>>>> 339069000: system.cpu + A0 T0 : 0x852f93.0 : MOV_R_I : limm eax,
>>>>>>>>>>>> 0x9 : IntAlu : D=0x0000000000000009
>>>>>>>>>>>> 339069500: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>>>>>>>>>> 339069500: system.cpu.icache: ReadReq (ifetch) 452f98 hit
>>>>>>>>>>>>
>>>>>>>>>>>> What is your opinion then?
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> On 4/27/12, Steve Reinhardt <ste...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Also, if you do run valgrind, use the util/valgrind-suppressions
>>>>>>>>>>>>> file to suppress spurious reports. Read the valgrind docs to see
>>>>>>>>>>>>> how this works.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Steve

--
// Naderan *Mahmood;
_______________________________________________
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users