Fulya,

In the multicore system you are hard-coding a single clock value for all of the CPUs. If you instead assigned a different clock value to each CPU, you would end up with a heterogeneous multicore system. I think this change belongs in se_AMD_multicore.py, but I suspect some corresponding clock changes may also be needed in CacheConfig.py, though I am not sure about that part. Sharing your se_AMD_multicore.py would definitely help. A rough sketch of what I mean is below.
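This is only a minimal sketch of the per-CPU clock assignment I have in mind, assuming se_AMD_multicore.py builds system.cpu as a list the way the stock se.py does and that this gem5 version still takes a per-object clock parameter (as your CacheConfig.py snippet suggests). The per_core_clocks list, its values, and the variable np are purely illustrative, not anything from your script.

    # Hypothetical list of per-core clocks; the name and the values
    # are just examples to show the idea of a heterogeneous system.
    per_core_clocks = ['2.1GHz', '1.8GHz', '1.5GHz', '1.2GHz']

    np = options.num_cpus
    for i in xrange(np):
        # Give each core its own clock instead of one global value.
        system.cpu[i].clock = per_core_clocks[i % len(per_core_clocks)]
        # If the script also builds switch_cpus for --fast-forward,
        # they would presumably need the same per-core assignment.

    # In CacheConfig.py the private L2s and the L1-to-L2 buses
    # currently take clock=options.clock; to keep them in step with a
    # heterogeneous setup they could instead follow their own CPU:
    #     system.l2[i].clock = system.cpu[i].clock
    #     system.tol2bus[i].clock = system.cpu[i].clock

Again, whether the caches and buses should track their CPU's clock or stay at a fixed value is exactly the part of CacheConfig.py I am unsure about.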
On Wed, Nov 6, 2013 at 9:24 AM, biswabandan panda <biswa....@gmail.com> wrote:

Could you check your config.dot.pdf inside m5out?

On Wed, Nov 6, 2013 at 9:50 PM, Fulya Kaplan <fkapl...@bu.edu> wrote:

I was also wondering if my CacheConfig.py file looks OK in terms of defining private L2 caches.
Best,
Fulya

On Wed, Nov 6, 2013 at 11:18 AM, Fulya Kaplan <fkapl...@bu.edu> wrote:

Hi Saptarshi,

Command line for Case 1:
/mnt/nokrb/fkaplan3/gem5/gem5-stable-07352f119e48/build/X86/gem5.fast --outdir=/mnt/nokrb/fkaplan3/gem5/gem5-stable-07352f119e48/RUNS/14_single/m5out_cactusADM --remote-gdb-port=0 /mnt/nokrb/fkaplan3/gem5/gem5-stable-07352f119e48/configs/example/se_AMD_multicore.py -n 1 --cpu-type=detailed --caches --l2cache --num-l2caches=1 --l1d_size=64kB --l1i_size=64kB --l1d_assoc=2 --l1i_assoc=2 --l2_size=1MB --l2_assoc=16 --fast-forward=2000000000 --bench="cactusADM" --max_total_inst=100000000 --clock=2.1GHz

Command line for Case 2:
/mnt/nokrb/fkaplan3/gem5/gem5-stable-07352f119e48/build/X86/gem5.fast --outdir=/mnt/nokrb/fkaplan3/gem5/gem5-stable-07352f119e48/RUNS/14_hom/m5out_cactusADM-cactusADM-cactusADM-cactusADM --remote-gdb-port=0 /mnt/nokrb/fkaplan3/gem5/gem5-stable-07352f119e48/configs/example/se_AMD_multicore.py -n 4 --cpu-type=detailed --caches --l2cache --num-l2caches=4 --l1d_size=64kB --l1i_size=64kB --l1d_assoc=2 --l1i_assoc=2 --l2_size=1MB --l2_assoc=16 --fast-forward=2000000000 --bench="cactusADM-cactusADM-cactusADM-cactusADM" --max_total_inst=400000000 --clock=2.1GHz

To clarify, I turned off remote-gdb because I was getting an error about listeners when running on the cluster, and I am not doing anything with gdb. My gem5 is modified so that I count the number of instructions executed after switching (which I implemented in the O3 CPU definition). The max_total_inst option determines the total number of instructions executed across all cores after switching, which is why it is set to 400 million in Case 2. I also checked and verified this by looking at the stats.txt file.

Let me know if you also need to check my se_AMD_multicore.py file. It has been modified to add all the SPEC benchmarks and their binary and input file paths.

Thanks,
Fulya

On Wed, Nov 6, 2013 at 11:03 AM, Saptarshi Mallick <saptarshi.mall...@aggiemail.usu.edu> wrote:

Hello Fulya,
Can you please give the command lines for both of the cases you ran to get these results? I had the same kind of problem; maybe there is a mistake in the command line we are using.

On Tuesday, November 5, 2013, Fulya Kaplan <fkapl...@bu.edu> wrote:

Number of committed instructions (system.switch_cpus.committedInsts_total):
Case 1: 100,000,000
Case 2: cpu0 -> 100,045,354; cpu1 -> 100,310,197; cpu2 -> 99,884,333; cpu3 -> 99,760,117

Number of cycles:
Case 1: 150,570,516
Case 2: 139,230,042

In both cases, the CPUs switch to detailed mode at instruction 2 billion. All of the reported data correspond to the 100 million instructions in detailed mode. From the config.ini files I can see that separate L2 caches are defined for Case 2.
My modified CacheConfig.py, with private L2 caches, looks like:

    def config_cache(options, system):
        if options.cpu_type == "arm_detailed":
            try:
                from O3_ARM_v7a import *
            except:
                print "arm_detailed is unavailable. Did you compile the O3 model?"
                sys.exit(1)

            dcache_class, icache_class, l2_cache_class = \
                O3_ARM_v7a_DCache, O3_ARM_v7a_ICache, O3_ARM_v7aL2
        else:
            dcache_class, icache_class, l2_cache_class = \
                L1Cache, L1Cache, L2Cache

        if options.l2cache:
            # Provide a clock for the L2 and the L1-to-L2 bus here as they
            # are not connected using addTwoLevelCacheHierarchy. Use the
            # same clock as the CPUs, and set the L1-to-L2 bus width to 32
            # bytes (256 bits).
            system.l2 = [l2_cache_class(clock=options.clock,
                                        size=options.l2_size,
                                        assoc=options.l2_assoc,
                                        block_size=options.cacheline_size)
                         for i in xrange(options.num_cpus)]

            system.tol2bus = [CoherentBus(clock=options.clock, width=32)
                              for i in xrange(options.num_cpus)]
            #system.l2.cpu_side = system.tol2bus.master
            #system.l2.mem_side = system.membus.slave

        for i in xrange(options.num_cpus):
            if options.caches:
                icache = icache_class(size=options.l1i_size,
                                      assoc=options.l1i_assoc,
                                      block_size=options.cacheline_size)
                dcache = dcache_class(size=options.l1d_size,
                                      assoc=options.l1d_assoc,
                                      block_size=options.cacheline_size)

                # When connecting the caches, the clock is also inherited
                # from the CPU in question
                if buildEnv['TARGET_ISA'] == 'x86':
                    system.cpu[i].addPrivateSplitL1Caches(icache, dcache,
                                                          PageTableWalkerCache(),
                                                          PageTableWalkerCache())
                else:
                    system.cpu[i].addPrivateSplitL1Caches(icache, dcache)
            system.cpu[i].createInterruptController()
            if options.l2cache:
                system.l2[i].cpu_side = system.tol2bus[i].master
                system.l2[i].mem_side = system.membus.slave
                system.cpu[i].connectAllPorts(system.tol2bus[i], system.membus)
            else:
                system.cpu[i].connectAllPorts(system.membus)

        return system

Best,
Fulya

On Mon, Nov 4, 2013 at 10:35 PM, biswabandan panda <biswa....@gmail.com> wrote:

Hi,
Could you report the number of committedInsts for both cases?

On Tue, Nov 5, 2013 at 7:04 AM, fulya <fkapl...@bu.edu> wrote:

In the single-core case, there is a 1 MB L2 cache. In the 4-core case, each core has its own private 1 MB L2 cache. Since they are not shared, I don't understand the reason for the different cache miss rates.

Best,
Fulya Kaplan

On Nov 4, 2013, at 7:55 PM, "Tao Zhang" <tao.zhang.0...@gmail.com> wrote:

Hi Fulya,

What is the L2 cache size of the 1-core test? Is it equal to the total capacity of the 4-core case? The stats indicate that the 4-core test has a lower L2 cache miss rate, which may be the reason for the IPC improvement.
-Tao

From: gem5-users-boun...@gem5.org [mailto:gem5-users-boun...@gem5.org] On Behalf Of Fulya Kaplan
Sent: Monday, November 04, 2013 10:20 AM
To: gem5 users mailing list
Subject: [gem5-users] Weird IPC statistics for Spec2006 Multiprogram mode

Hi all,

I am running SPEC 2006 on X86 with version gem5-stable-07352f119e48. I am using multiprogram mode with syscall emulation. I am trying to compare the IPC statistics for two cases:

1) Running benchmark A on a single core
2) Running 4 instances of benchmark A on a 4-core system with 1 MB private L2 caches

All parameters are the same for the two runs except the number of cores.

I am expecting some IPC decrease for the 4-core case, as the cores share the same system bus. However, for the CactusADM and Soplex benchmarks, I see higher IPC for Case 2 than for Case 1.

I look at the same phase of execution for both runs: I fast-forward for 2 billion instructions and grab the IPC for each of the cores over the next 100 million instructions in detailed mode.

I'll report some other statistics for CactusADM to give a better idea of what is going on.

Case 1: ipc=0.664141, L2_overall_accesses=573746, L2_miss_rate=0.616
Case 2: cpu0_ipc=0.718562, cpu1_ipc=0.720464, cpu2_ipc=0.717405, cpu3_ipc=0.716513
L2_0_accesses=591607, L2_1_accesses=581846, L2_2_accesses=568095, L2_3_accesses=561180, L2_0_missrate=0.452978, L2_1_missrate=0.454510, L2_2_missrate=0.475646, L2_3_missrate=0.488171

Case 1: Running Time for 100M insts = 0.0716

--
Thank you,
Saptarshi Mallick
Department of Electrical and Computer Engineering
Utah State University
Utah, USA.
_______________________________________________
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users