Jason,

>> First, I would check to be sure you're simulating what you expect. Is there
>> a way for you to print out what the simulated system is doing (say, by
>> logging in with m5term and running top)? This would at least eliminate one
>> potential source of the problem.


Is it possible to log in to a gem5 server and drop into the shell while a
script is running? As far as I know, telnetting into a running gem5 instance
only shows the shell prompt when no script is provided. In my case, I'm
restoring from a checkpoint in which an rcS script is still running.


>> Second, it's possible that, with the way you have configured your cache
>> hierarchy, some cores are given priority over others. I've run into this
>> problem before, especially in bandwidth-limited systems.


Would you please explain how you solved this problem? Did you use private L2 
caches? L3?


Thank you very much for your suggestions.

Ali.

________________________________
From: gem5-users <gem5-users-boun...@gem5.org> on behalf of Jason Lowe-Power 
<ja...@lowepower.com>
Sent: Tuesday, February 14, 2017 1:12:05 PM
To: gem5 users mailing list
Subject: Re: [gem5-users] Different number of sim cycles among the CPU cores

Hi Ali,

First, I would check to be sure you're simulating what you expect. Is there a 
way for you to print out what the simulated system is doing (say, by logging in 
with m5term and running top)? This would at least eliminate one potential 
source of the problem.

Second, it's possible that, with the way you have configured your cache 
hierarchy, some cores are given priority over others. I've run into this 
problem before, especially in bandwidth-limited systems.

Finally, you should dig into the stats a little more to determine why CPU 2 is 
active for fewer cycles than CPU 3. This would indicate to me that the 
application just isn't running on CPU 2 for the time you expect it to be.
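
As a starting point for digging into the stats, here is a minimal sketch in Python. It assumes stats.txt lines follow the "name value # description" layout seen in the original message quoted below, and it only looks at numCycles. The helper names parse_stats and cycle_share are hypothetical and are not part of gem5.

```python
import re

# Sample lines in the "name value # description" layout, with values
# copied from the original message quoted below.
SAMPLE = """\
system.switch_cpus0.numCycles      864447   # number of cpu cycles simulated
system.switch_cpus1.numCycles    63273161   # number of cpu cycles simulated
system.switch_cpus2.numCycles     6382626   # number of cpu cycles simulated
system.switch_cpus3.numCycles    26166842   # number of cpu cycles simulated
"""

# Matches "<name> <integer> #"; lines with non-integer values are skipped.
STAT_LINE = re.compile(r"^(\S+)\s+(\d+)\s+#")


def parse_stats(text):
    """Return {stat_name: integer value} for lines that match STAT_LINE."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def cycle_share(stats):
    """Express each CPU's numCycles as a fraction of the largest numCycles."""
    cycles = {k: v for k, v in stats.items() if k.endswith(".numCycles")}
    busiest = max(cycles.values())
    return {k: v / busiest for k, v in cycles.items()}


if __name__ == "__main__":
    for name, share in sorted(cycle_share(parse_stats(SAMPLE)).items()):
        print(f"{name}: {share:.1%} of the busiest core's cycles")
```

A core with a small share was active for only a fraction of the run, which is the kind of imbalance worth tracing back to what the guest was actually scheduling on that core.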

Jason

On Mon, Feb 13, 2017 at 10:05 AM Alsuwaiyan, Ali Saleh <as...@pitt.edu> wrote:
Dear all,

I ran a full-system simulation with a four-core CPU, in which each core is 
pinned (using the taskset command) to run a specific SPEC CPU2006 benchmark 
(leslie3d, leslie3d, mcf, and mcf). The architecture of the simulated system 
can be inferred from the simulation command below:

build/X86/gem5.fast configs/example/fs.py --caches 
--disk-image=linux-x86-large.img --kernel=x86_64-vmlinux-3.2.24.smp 
--l1d_size=128kB --l1i_size=128kB --l2cache --l2_size=8MB --l2_assoc=8 
--cacheline_size=64 -n 4 --mem-size=8GB --cpu-clock=4GHz --sys-clock=4GHz 
--mem-type=DDR3_1600_x64 --cpu-type=detailed -r 2 -I 250000000

Here are the numbers of instructions and cycles simulated for each core:

system.switch_cpus0.commit.committedInsts       276150    # Number of instructions committed
system.switch_cpus1.commit.committedInsts    250000000    # Number of instructions committed
system.switch_cpus2.commit.committedInsts      8546307    # Number of instructions committed
system.switch_cpus3.commit.committedInsts     35930361    # Number of instructions committed

system.switch_cpus0.numCycles                   864447    # number of cpu cycles simulated
system.switch_cpus1.numCycles                 63273161    # number of cpu cycles simulated
system.switch_cpus2.numCycles                  6382626    # number of cpu cycles simulated
system.switch_cpus3.numCycles                 26166842    # number of cpu cycles simulated


I can understand a small variation in the number of cycles/instructions between 
the cores; however, the variation in this case is huge. For example, why would 
the second core run faster than the first, even though they are running the 
same benchmark (leslie3d) with exactly the same input (the reference input)? In 
fact, the first core has an IPC of 0.32, while the second has an IPC of 3.95 
(more than 10x faster).
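
The IPC figures follow directly from dividing committedInsts by numCycles for each core. A minimal Python sketch of that arithmetic, using the values from the stats above (the names CORES, ipc, and ipc_table are only illustrative):

```python
# (committedInsts, numCycles) per core, copied from the stats above.
CORES = {
    "switch_cpus0": (276150, 864447),
    "switch_cpus1": (250000000, 63273161),
    "switch_cpus2": (8546307, 6382626),
    "switch_cpus3": (35930361, 26166842),
}


def ipc(insts, cycles):
    """Instructions per cycle for one core."""
    return insts / cycles


def ipc_table(cores):
    """Return {core_name: IPC} for every core in the mapping."""
    return {name: ipc(insts, cycles) for name, (insts, cycles) in cores.items()}


if __name__ == "__main__":
    table = ipc_table(CORES)
    for name, value in table.items():
        print(f"{name}: IPC = {value:.2f}")
    ratio = table["switch_cpus1"] / table["switch_cpus0"]
    print(f"switch_cpus1 / switch_cpus0 = {ratio:.1f}x")
```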

Also, is there a way to make the simulation fair, in the sense that each core 
is simulated for approximately the same number of cycles (or the same number of 
instructions)?

Thank you,
Ali.
_______________________________________________
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users