Hi Andreas,
Many thanks for your suggestion. The generated figure is just what I
expected, so I can confirm that my cache configuration is correct.
However, I still need to think about these surprising simulation results.
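
One way to put the numbers in perspective is to turn the per-level miss
rates into rates relative to L1D accesses. The sketch below assumes that
each reported miss rate is local, that is, misses divided by the accesses
arriving at that cache, and it ignores the L1I and page-walker traffic
that also reaches L2, so the products are only approximate. The function
names are illustrative and not part of gem5.

```python
# Local miss rates copied from tables (A), (B) and (C) in the original post:
# benchmark -> (L1D, L2, L3).
local_miss_rates = {
    "401": (0.039705, 0.463055, 0.334149),
    "403": (0.154247, 0.900824, 0.291530),
    "410": (0.107384, 0.344414, 0.918030),
    "450": (0.169413, 0.820350, 0.561909),
    "459": (0.000102, 0.989815, 0.970418),
    "462": (0.031115, 0.997665, 0.989723),
    "471": (0.060141, 0.964760, 0.080612),
}


def global_miss_rates(l1d, l2, l3):
    """Approximate (L2 misses, L3 misses) per L1D access.

    Assumption: every L2 access comes from an L1D miss and every L3
    access comes from an L2 miss.
    """
    l2_global = l1d * l2
    l3_global = l2_global * l3
    return l2_global, l3_global


def summarize(rates):
    """Map each benchmark to its approximate global L2 and L3 miss rates."""
    return {bench: global_miss_rates(*r) for bench, r in rates.items()}


if __name__ == "__main__":
    for bench, (g2, g3) in sorted(summarize(local_miss_rates).items()):
        print("%s  L2 global %.6f  L3 global %.6f" % (bench, g2, g3))
```

Under these assumptions, a benchmark such as 459 has a near-100% L2 miss
rate but very few accesses reach L2 at all, because its L1D miss rate is
about 0.01%.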
Hanfeng
On 03/13/2013 09:36 PM, gem5-users-requ...@gem5.org wrote:
Hi Hanfeng,
Without having looked at the numbers, have you had a look at the generated
config.dot.pdf in m5out to ensure that the system architecture ends up
being what you intend?
You need pydot installed for the figure to be generated.
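
A quick way to check for the module is sketched below. It assumes the
module is importable under the name "pydot" and that the script is run
with the same Python interpreter that gem5 uses; has_module is a
hypothetical helper, not a gem5 function.

```python
def has_module(name):
    """Return True if the module `name` can be imported here."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # "pydot" is the assumed module name for the dependency mentioned above.
    print("pydot available: %s" % has_module("pydot"))
```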
Andreas
On 13/03/2013 08:19, "hanfeng QIN"<hanfeng.h...@gmail.com> wrote:
>Hi all,
>
>I am modeling a cache hierarchy with private L2 caches and a shared L3
>in gem5. In my experiments (classic memory model, SE mode), the main
>simulation parameters are as follows.
>
>1 core
>private L1 dcache: 32KB/8-way; icache: 32KB/4-way
>private L2 cache: 256KB/8-way
>shared L3 cache: 1MB/16-way
>
>I select workloads from SPEC CPU2006. The first 500 million instructions
>are fast-forwarded, the next 500 million instructions warm up the caches,
>and a further 500 million instructions are simulated in detail with the
>O3 CPU.
>
>I get the following simulation stats (they are for 'switch_cpus_1',
>the detailed measurement phase; the 1st column is the benchmark number
>and the 2nd column is the corresponding metric)
>
>( A ). L1D$ miss rate:
>
>401 0.039705
>403 0.154247
>410 0.107384
>450 0.169413
>459 0.000102
>462 0.031115
>471 0.060141
>
>( B ). L2$ miss rate:
>
>401 0.463055
>403 0.900824
>410 0.344414
>450 0.820350
>459 0.989815
>462 0.997665
>471 0.964760
>
>( C ). L3$ miss rate:
>
>401 0.334149
>403 0.291530
>410 0.918030
>450 0.561909
>459 0.970418
>462 0.989723
>471 0.080612
>
>I am surprised by these statistics. Several workloads (such as 403.gcc,
>459.GemsFDTD, 462.libquantum and 471.omnetpp) have L2 miss rates above
>90%, and for 459.GemsFDTD and 462.libquantum the L3 miss rate is also
>close to 100%. I am not sure whether this is caused by my cache
>configuration or is intrinsic behavior of the benchmarks. The relevant
>cache configuration files are included below.
>
>-------- config/common/CacheConfig.py --------
>
>    if options.l3cache:
>        system.l3 = l3_cache_class(clock=options.clock,
>                                   size=options.l3_size,
>                                   assoc=options.l3_assoc,
>                                   block_size=options.cacheline_size)
>
>        system.tol3bus = CoherentBus(clock = options.clock, width = 32)
>        system.l3.cpu_side = system.tol3bus.master
>        system.l3.mem_side = system.membus.slave
>    else:
>        if options.l2cache:
>            system.l2 = l2_cache_class(clock=options.clock,
>                                       size=options.l2_size,
>                                       assoc=options.l2_assoc,
>                                       block_size=options.cacheline_size)
>
>            system.tol2bus = CoherentBus(clock = options.clock, width = 32)
>            system.l2.cpu_side = system.tol2bus.master
>            system.l2.mem_side = system.membus.slave
>
>    for i in xrange(options.num_cpus):
>        if options.caches:
>            icache = icache_class(size=options.l1i_size,
>                                  assoc=options.l1i_assoc,
>                                  block_size=options.cacheline_size)
>            dcache = dcache_class(size=options.l1d_size,
>                                  assoc=options.l1d_assoc,
>                                  block_size=options.cacheline_size)
>
>            if options.l3cache:
>                system.cpu[i].l2 = l2_cache_class(
>                    size=options.l2_size,
>                    assoc=options.l2_assoc,
>                    block_size=options.cacheline_size)
>                system.cpu[i].tol2bus = CoherentBus()
>                system.cpu[i].l2.cpu_side = system.cpu[i].tol2bus.master
>                system.cpu[i].l2.mem_side = system.tol3bus.slave
>
>            if buildEnv['TARGET_ISA'] == 'x86':
>                system.cpu[i].addPrivateSplitL1Caches(
>                    icache, dcache,
>                    PageTableWalkerCache(),
>                    PageTableWalkerCache())
>            else:
>                system.cpu[i].addPrivateSplitL1Caches(icache, dcache)
>        system.cpu[i].createInterruptController()
>        if options.l3cache:
>            system.cpu[i].connectAllPorts(system.cpu[i].tol2bus,
>                                          system.membus)
>        else:
>            if options.l2cache:
>                system.cpu[i].connectAllPorts(system.tol2bus,
>                                              system.membus)
>            else:
>                system.cpu[i].connectAllPorts(system.membus)
>
>-------- config/common/Caches.py --------
>
>class L1Cache(BaseCache):
>    assoc = 2
>    hit_latency = 2
>    response_latency = 2
>    block_size = 64
>    mshrs = 4
>    tgts_per_mshr = 20
>    is_top_level = True
>
>class L2Cache(BaseCache):
>    assoc = 8
>    block_size = 64
>    hit_latency = 8
>    response_latency = 8
>    mshrs = 16
>    tgts_per_mshr = 16
>    write_buffers = 8
>
>class L3Cache(BaseCache):
>    assoc = 16
>    block_size = 64
>    hit_latency = 20
>    response_latency = 20
>    mshrs = 512
>    tgts_per_mshr = 20
>    write_buffers = 256
>
>
>
>Is there any mistake that I am missing?
>
>
>
>Thanks in advance,
>
>Hanfeng
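
For reference, the cache geometry implied by the parameters quoted above
can be worked out with a short sketch. It assumes a 64-byte block size
(the block_size value in the Caches.py classes; the value actually passed
through options.cacheline_size is not shown in the post), and parse_size
and num_sets are hypothetical helpers, not gem5 functions.

```python
def parse_size(text):
    """Convert a size string such as '32KB' or '1MB' to bytes."""
    units = {"KB": 1024, "MB": 1024 ** 2}
    return int(text[:-2]) * units[text[-2:]]


def num_sets(size, assoc, block_size=64):
    """Number of sets = capacity / (associativity * block size)."""
    return parse_size(size) // (assoc * block_size)


# (name, size, associativity) as listed in the original post.
caches = [
    ("L1D", "32KB", 8),
    ("L1I", "32KB", 4),
    ("L2", "256KB", 8),
    ("L3", "1MB", 16),
]

if __name__ == "__main__":
    for name, size, assoc in caches:
        print("%-3s %6s %2d-way -> %4d sets"
              % (name, size, assoc, num_sets(size, assoc)))
```

Comparing these set counts with the cache parameters in m5out/config.ini
is one more way to confirm that the command-line options took effect.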