Dear almighty mailing list:

  I am trying to study the performance impact of memory controller count.
After reading various documentation, tutorials, and mailing list threads, I
chose full-system ALPHA_MOESI_CMP_token with the timing CPU, ddr3_1600_x64
memory, and the Pt2Pt network topology. My test workload is the
multi-threaded (OpenMP) STREAM benchmark. Here is the command I used (options
like outdir, disk-image, and script are omitted):

./build/ALPHA_MOESI_CMP_token/gem5.opt ./configs/example/ruby_fs.py -n 4
--cpu-type=timing --mem-size=256MB --mem-type=ddr3_1600_x64 --l1i_size=32kB
--l1d_size=32kB --l2_size=2MB --num-l2caches=4 --topology=Pt2Pt --num-dirs=1

   On a 4-core test platform, I swept both the number of threads and the
number of directories (num-dirs, which is effectively equal to the number of
memory controllers). STREAM's output indicates that while having more
threads increases memory bandwidth, adding more directory controllers
has very little impact on effective memory bandwidth.
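  For reference, the sweep is essentially two nested shell loops around the
command above; the outdir and rcS names here are placeholders (each rcS just
exports OMP_NUM_THREADS and then runs STREAM):

for dirs in 1 2 4; do
  for threads in 1 2 4; do
    ./build/ALPHA_MOESI_CMP_token/gem5.opt \
        --outdir=m5out_d${dirs}_t${threads} \
        ./configs/example/ruby_fs.py -n 4 \
        --cpu-type=timing --mem-size=256MB --mem-type=ddr3_1600_x64 \
        --l1i_size=32kB --l1d_size=32kB --l2_size=2MB --num-l2caches=4 \
        --topology=Pt2Pt --num-dirs=${dirs} \
        --script=stream_${threads}threads.rcS  # placeholder guest script
  done
done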
  I checked stats.txt for the multiple-memory-controller simulations and
found that all the memory controllers were active and had processed roughly
the same number of memory requests. The total number of memory controller
stall cycles in the single-MC case is about 10% higher than the total number
of stall cycles across all memory controllers in the 4-MC case. To me, these
data suggest that the default MC is fast enough to support 4 cores/threads.
Is this a reasonable conclusion? If I want to push the MCs to their limit,
can I just increase the core frequency?
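
  In case it is useful, this is roughly how I compared the controllers in
stats.txt, using the placeholder outdir names from the loop above (the exact
stat names depend on the gem5 version and memory model; readReqs/writeReqs
are the classic DRAM controller names and may differ in other builds):

# per-controller request counts in the 4-MC run
grep -E '\.(readReqs|writeReqs)' m5out_d4_t4/stats.txt
# total stall/wait cycles, 1-MC run vs. 4-MC run
grep -iE 'mem.*(stall|wait)' m5out_d1_t4/stats.txt
grep -iE 'mem.*(stall|wait)' m5out_d4_t4/stats.txt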

Thanks!
-- 
Runjie Zhang
Computer Engineering
University of Virginia