Fulya,

In the multicore system you are hard-coding a single clock value for all of the CPUs. If you instead assigned a different clock value to each CPU, you would end up with a heterogeneous multicore system. I think this change belongs in se_AMD_multicore.py, but I suspect some corresponding clock changes may also be needed in CacheConfig.py, though I am not sure about that part. Sharing your se_AMD_multicore.py would definitely help. A rough sketch of what I mean is below.
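This is only a minimal sketch of the per-CPU clock assignment I have in mind, assuming se_AMD_multicore.py builds system.cpu as a list the way the stock se.py does and that this gem5 version still takes a per-object clock parameter (as your CacheConfig.py snippet suggests). The per_core_clocks list, its values, and the variable np are purely illustrative, not anything from your script.

    # Hypothetical list of per-core clocks; the name and the values
    # are just examples to show the idea of a heterogeneous system.
    per_core_clocks = ['2.1GHz', '1.8GHz', '1.5GHz', '1.2GHz']

    np = options.num_cpus
    for i in xrange(np):
        # Give each core its own clock instead of one global value.
        system.cpu[i].clock = per_core_clocks[i % len(per_core_clocks)]
        # If the script also builds switch_cpus for --fast-forward,
        # they would presumably need the same per-core assignment.

    # In CacheConfig.py the private L2s and the L1-to-L2 buses
    # currently take clock=options.clock; to keep them in step with a
    # heterogeneous setup they could instead follow their own CPU:
    #     system.l2[i].clock = system.cpu[i].clock
    #     system.tol2bus[i].clock = system.cpu[i].clock

Again, whether the caches and buses should track their CPU's clock or stay at a fixed value is exactly the part of CacheConfig.py I am unsure about.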
On Wed, Nov 6, 2013 at 9:24 AM, biswabandan panda <biswa....@gmail.com> wrote:

Could you check your config.dot.pdf inside m5out?

On Wed, Nov 6, 2013 at 9:50 PM, Fulya Kaplan <fkapl...@bu.edu> wrote:

I was also wondering if my CacheConfig.py file looks OK in terms of defining private L2 caches.
Best,
Fulya

On Wed, Nov 6, 2013 at 11:18 AM, Fulya Kaplan <fkapl...@bu.edu> wrote:

Hi Saptarshi,

Command line for Case 1:
/mnt/nokrb/fkaplan3/gem5/gem5-stable-07352f119e48/build/X86/gem5.fast --outdir=/mnt/nokrb/fkaplan3/gem5/gem5-stable-07352f119e48/RUNS/14_single/m5out_cactusADM --remote-gdb-port=0 /mnt/nokrb/fkaplan3/gem5/gem5-stable-07352f119e48/configs/example/se_AMD_multicore.py -n 1 --cpu-type=detailed --caches --l2cache --num-l2caches=1 --l1d_size=64kB --l1i_size=64kB --l1d_assoc=2 --l1i_assoc=2 --l2_size=1MB --l2_assoc=16 --fast-forward=2000000000 --bench="cactusADM" --max_total_inst=100000000 --clock=2.1GHz

Command line for Case 2:
/mnt/nokrb/fkaplan3/gem5/gem5-stable-07352f119e48/build/X86/gem5.fast --outdir=/mnt/nokrb/fkaplan3/gem5/gem5-stable-07352f119e48/RUNS/14_hom/m5out_cactusADM-cactusADM-cactusADM-cactusADM --remote-gdb-port=0 /mnt/nokrb/fkaplan3/gem5/gem5-stable-07352f119e48/configs/example/se_AMD_multicore.py -n 4 --cpu-type=detailed --caches --l2cache --num-l2caches=4 --l1d_size=64kB --l1i_size=64kB --l1d_assoc=2 --l1i_assoc=2 --l2_size=1MB --l2_assoc=16 --fast-forward=2000000000 --bench="cactusADM-cactusADM-cactusADM-cactusADM" --max_total_inst=400000000 --clock=2.1GHz

To clarify, I turned off remote-gdb because I was getting an error about listeners when running on the cluster, and I am not doing anything with gdb. My gem5 is modified so that I count the number of instructions executed after switching (which I implemented in the O3 CPU definition). The max_total_inst option determines the total number of instructions executed across all cores after switching, which is why it is set to 400 million in Case 2. I also checked and verified this by looking at the stats.txt file.

Let me know if you also need to check my se_AMD_multicore.py file. It has been modified to add all the SPEC benchmarks and their binary and input file paths.

Thanks,
Fulya

On Wed, Nov 6, 2013 at 11:03 AM, Saptarshi Mallick <saptarshi.mall...@aggiemail.usu.edu> wrote:

Hello Fulya,
Can you please give the command lines for both of the cases you ran to get these results? I had the same kind of problem; maybe there is a mistake in the command line we are using.

On Tuesday, November 5, 2013, Fulya Kaplan <fkapl...@bu.edu> wrote:

Number of committed instructions (system.switch_cpus.committedInsts_total):
Case 1: 100,000,000
Case 2: cpu0 -> 100,045,354; cpu1 -> 100,310,197; cpu2 -> 99,884,333; cpu3 -> 99,760,117

Number of cycles:
Case 1: 150,570,516
Case 2: 139,230,042

In both cases, the CPUs switch to detailed mode at instruction 2 billion. All of the reported data correspond to the 100 million instructions in detailed mode. From the config.ini files I can see that separate L2 caches are defined for Case 2.
My modified CacheConfig.py, with private L2 caches, looks like:

    def config_cache(options, system):
        if options.cpu_type == "arm_detailed":
            try:
                from O3_ARM_v7a import *
            except:
                print "arm_detailed is unavailable. Did you compile the O3 model?"
                sys.exit(1)

            dcache_class, icache_class, l2_cache_class = \
                O3_ARM_v7a_DCache, O3_ARM_v7a_ICache, O3_ARM_v7aL2
        else:
            dcache_class, icache_class, l2_cache_class = \
                L1Cache, L1Cache, L2Cache

        if options.l2cache:
            # Provide a clock for the L2 and the L1-to-L2 bus here as they
            # are not connected using addTwoLevelCacheHierarchy. Use the
            # same clock as the CPUs, and set the L1-to-L2 bus width to 32
            # bytes (256 bits).
            system.l2 = [l2_cache_class(clock=options.clock,
                                        size=options.l2_size,
                                        assoc=options.l2_assoc,
                                        block_size=options.cacheline_size)
                         for i in xrange(options.num_cpus)]

            system.tol2bus = [CoherentBus(clock=options.clock, width=32)
                              for i in xrange(options.num_cpus)]
            #system.l2.cpu_side = system.tol2bus.master
            #system.l2.mem_side = system.membus.slave

        for i in xrange(options.num_cpus):
            if options.caches:
                icache = icache_class(size=options.l1i_size,
                                      assoc=options.l1i_assoc,
                                      block_size=options.cacheline_size)
                dcache = dcache_class(size=options.l1d_size,
                                      assoc=options.l1d_assoc,
                                      block_size=options.cacheline_size)

                # When connecting the caches, the clock is also inherited
                # from the CPU in question
                if buildEnv['TARGET_ISA'] == 'x86':
                    system.cpu[i].addPrivateSplitL1Caches(icache, dcache,
                                                          PageTableWalkerCache(),
                                                          PageTableWalkerCache())
                else:
                    system.cpu[i].addPrivateSplitL1Caches(icache, dcache)
            system.cpu[i].createInterruptController()
            if options.l2cache:
                system.l2[i].cpu_side = system.tol2bus[i].master
                system.l2[i].mem_side = system.membus.slave
                system.cpu[i].connectAllPorts(system.tol2bus[i], system.membus)
            else:
                system.cpu[i].connectAllPorts(system.membus)

        return system

Best,
Fulya

On Mon, Nov 4, 2013 at 10:35 PM, biswabandan panda <biswa....@gmail.com> wrote:

Hi,
Could you report the number of committedInsts for both cases?

On Tue, Nov 5, 2013 at 7:04 AM, fulya <fkapl...@bu.edu> wrote:

In the single-core case, there is a 1 MB L2 cache. In the 4-core case, each core has its own private 1 MB L2 cache. Since they are not shared, I don't understand the reason for the different cache miss rates.

Best,
Fulya Kaplan

On Nov 4, 2013, at 7:55 PM, "Tao Zhang" <tao.zhang.0...@gmail.com> wrote:

Hi Fulya,

What is the L2 cache size of the 1-core test? Is it equal to the total capacity of the 4-core case? The stats indicate that the 4-core test has a lower L2 cache miss rate, which may be the reason for the IPC improvement.
-Tao

From: gem5-users-boun...@gem5.org [mailto:gem5-users-boun...@gem5.org] On Behalf Of Fulya Kaplan
Sent: Monday, November 04, 2013 10:20 AM
To: gem5 users mailing list
Subject: [gem5-users] Weird IPC statistics for Spec2006 Multiprogram mode

Hi all,

I am running SPEC 2006 on X86 with version gem5-stable-07352f119e48. I am using multiprogram mode with syscall emulation. I am trying to compare the IPC statistics for two cases:

1) Running benchmark A on a single core
2) Running 4 instances of benchmark A on a 4-core system with 1 MB private L2 caches

All parameters are the same for the two runs except the number of cores.

I am expecting some IPC decrease for the 4-core case, as the cores share the same system bus. However, for the CactusADM and Soplex benchmarks, I see higher IPC for Case 2 than for Case 1.

I look at the same phase of execution for both runs: I fast-forward for 2 billion instructions and grab the IPC for each of the cores over the next 100 million instructions in detailed mode.

I'll report some other statistics for CactusADM to give a better idea of what is going on.

Case 1: ipc=0.664141, L2_overall_accesses=573746, L2_miss_rate=0.616
Case 2: cpu0_ipc=0.718562, cpu1_ipc=0.720464, cpu2_ipc=0.717405, cpu3_ipc=0.716513
L2_0_accesses=591607, L2_1_accesses=581846, L2_2_accesses=568095, L2_3_accesses=561180, L2_0_missrate=0.452978, L2_1_missrate=0.454510, L2_2_missrate=0.475646, L2_3_missrate=0.488171

Case 1: Running Time for 100M insts = 0.0716

--
Thank you,
Saptarshi Mallick
Department of Electrical and Computer Engineering
Utah State University
Utah, USA.
_______________________________________________
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users