Re: [gem5-users] Minor CPU in X86

2015-10-09 Thread Mitch Hayenga
From your command it's pretty clear that CPU 1 isn't getting directed to the proper start address for the program, but CPU 0 is. cpu0.execute: Waking up Fetch (via Execute) by issuing a branch: (0x400190=>0x400198).(0=>1) cpu1.execute: Waking up Fetch (via Execute) by issuing a branch: (0=>0x8).(0

Re: [gem5-users] ARM Minor CPU model

2015-06-16 Thread Mitch Hayenga
The Minor CPU supports full system mode on ARM. It was developed by ARM and is used there to run Linux-based full system simulations. It is primarily tested/used with the classic memory system. On Tue, Jun 16, 2015 at 8:30 AM, Konstadinos PARASYRIS wrote: > > Hello, > > Could someone please in

Re: [gem5-users] using SMT in gem5

2015-04-24 Thread Mitch Hayenga
) out before pushing the SMT ones. > > It seems to me our work complements each other. > > Best regards, > Alex > -- > *From:* gem5-users [gem5-users-boun...@gem5.org] on behalf of Mitch > Hayenga [mitch.hayenga+g...@gmail.com] > *Sent:* Friday, Ap

Re: [gem5-users] using SMT in gem5

2015-04-24 Thread Mitch Hayenga
Hi Alexander, Just saw this thread and thought I'd contribute. Are you focusing additional SMT support on the various CPU models or just x86-ISA/library side of things? I'm wondering how much overlap we have. I've recently been working on extending gem5 SMT support by adding SMT to the Atomic,

Re: [gem5-users] Forwarding data from strd to ldrd

2015-03-12 Thread Mitch Hayenga
Here's how o3 would work in this case. The relevant code is in src/cpu/o3/lsq_unit.hh (in the "LSQUnit::read" function) around line 640. The code in the backend explicitly works on micro-ops, so each load/store micro-op will get its own LSQ entry. If both the ldrd and strd are cracked, then not

Re: [gem5-users] Stride prefetcher across pages

2015-01-27 Thread Mitch Hayenga via gem5-users
Hmm, unsure of what was removed. Prefetching across page boundaries should have always been broken/bad from the perspective of the prefetchers, since they are in the memory system, which deals purely in physical addresses in gem5. Without a TLB/walker interface to the prefetchers, there is no proper way
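
To illustrate the page-boundary point: a prefetcher that only sees physical addresses can at best check whether a candidate address stays in the same physical page as the triggering access. This is a minimal sketch, not gem5 code; the 4 KiB page size and the function names are assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical, for illustration)


def same_page(addr, candidate, page_size=PAGE_SIZE):
    """Return True if both physical addresses fall in the same page."""
    return addr // page_size == candidate // page_size


def stride_candidates(addr, stride, degree, page_size=PAGE_SIZE):
    """Generate up to `degree` stride prefetch addresses.

    Generation stops at the first candidate that would leave the page of
    the triggering access, because the next physical page is not
    necessarily the next virtual page.
    """
    out = []
    for i in range(1, degree + 1):
        cand = addr + i * stride
        if cand < 0 or not same_page(addr, cand, page_size):
            break
        out.append(cand)
    return out
```

With a trigger at 0x1F80 and a 0x40 stride, only 0x1FC0 is kept; the next candidate, 0x2000, lies in a different physical page and is dropped.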

Re: [gem5-users] FS mode with 03CPU withtout caches

2014-11-22 Thread Mitch Hayenga via gem5-users
Have you tried running with the O3CPUAll debug flag? That may shed some more light on what's happening. Steve's suggestion sounds like a possibility. On Sat, Nov 22, 2014 at 12:24 PM, Steve Reinhardt via gem5-users < gem5-users@gem5.org> wrote: > I don't recall the details, but there's some issu

Re: [gem5-users] Number of loads per instruction is huge.

2014-11-14 Thread Mitch Hayenga via gem5-users
Whoops, sorry just read the ruby stats at the end, missed the earlier sim_insts. Sorry for the mis-read, someone with knowledge of the ruby stats is needed I guess. On Fri, Nov 14, 2014 at 8:27 AM, Mitch Hayenga wrote: > Haven't used ruby with gem5... But are instruction fetches enti

Re: [gem5-users] Number of loads per instruction is huge.

2014-11-14 Thread Mitch Hayenga via gem5-users
Haven't used ruby with gem5... But are instruction fetches entire cache lines (or something larger than a single instruction)? Then 1 load per cache line of instructions isn't crazy. On Fri, Nov 14, 2014 at 7:42 AM, Geeta Patil via gem5-users < gem5-users@gem5.org> wrote: > > Hi All, > > I got

Re: [gem5-users] Branch predictor

2014-11-13 Thread Mitch Hayenga via gem5-users
Hi, I suspect one of a few things might be behind what you think you are seeing. 1) "I have observed that call and return instructions are predicted as if these instructions were conditional branches." The ARMv7 ISA actually does have conditional calls and returns (this is a consequence of lett

Re: [gem5-users] confusing cycle number

2014-11-11 Thread Mitch Hayenga via gem5-users
numCycles is only incremented on cycles where the CPU is clocked (see src/cpu/o3/cpu.cc: line 540). Two things can lead to this not correlating with the number of sim_ticks. 1) Quiesce instructions ("wait for interrupt" on ARM) cause the CPU to sleep until an interrupt or some external event occu
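
A quick way to see how far numCycles has drifted from sim_ticks is to compare it against the cycle count that sim_ticks would imply if the CPU were clocked the whole time. This is a rough sketch; the clock period in ticks and all the numbers are made up for the example.

```python
def clocked_fraction(sim_ticks, num_cycles, ticks_per_cycle):
    """Fraction of the simulated time during which the CPU was clocked.

    sim_ticks / ticks_per_cycle is the cycle count the CPU would report
    if it never slept; num_cycles is what the CPU actually counted.
    """
    implied_cycles = sim_ticks / ticks_per_cycle
    return num_cycles / implied_cycles


# Hypothetical values: 1,000,000 ticks at 500 ticks per cycle implies
# 2000 cycles, but the CPU only counted 1500 of them.
fraction = clocked_fraction(sim_ticks=1_000_000, num_cycles=1500,
                            ticks_per_cycle=500)
print(f"CPU was clocked for {fraction:.0%} of the run")
```

A fraction well below 1 suggests the CPU spent part of the run asleep, for example after a quiesce instruction.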

Re: [gem5-users] [Gem5 Minor CPU] About the actual "execution" of an instruction

2014-11-06 Thread Mitch Hayenga via gem5-users
In general there are 3 functions that the CPU calls on instructions in order to execute them. These are all functions within the "StaticInst" class. 1) initiateAcc 2) completeAcc 3) execute The first 2 are used for memory operations while the 3rd is what you care about for your integer example.
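
The split between the three calls can be sketched with a toy model. This is not the real gem5 StaticInst class; the class names, the dict-based register file and memory, and the `run` helper are all assumptions for illustration. It also shows why calling the memory path on a non-memory instruction ends in an "initiateAcc not defined!" style failure.

```python
class StaticInstSketch:
    """Toy stand-in for an instruction; every hook fails unless overridden."""
    is_mem = False

    def execute(self, regs):
        raise NotImplementedError("execute not defined!")

    def initiate_acc(self, regs):
        raise NotImplementedError("initiateAcc not defined!")

    def complete_acc(self, regs, data):
        raise NotImplementedError("completeAcc not defined!")


class Add(StaticInstSketch):
    """Integer op: everything happens in execute()."""
    def __init__(self, dst, a, b):
        self.dst, self.a, self.b = dst, a, b

    def execute(self, regs):
        regs[self.dst] = regs[self.a] + regs[self.b]


class Load(StaticInstSketch):
    """Memory op: address generation and writeback are separate steps."""
    is_mem = True

    def __init__(self, dst, base, offset):
        self.dst, self.base, self.offset = dst, base, offset

    def initiate_acc(self, regs):
        # Compute the effective address; the CPU sends it to memory.
        return regs[self.base] + self.offset

    def complete_acc(self, regs, data):
        # Called once the memory system has returned the data.
        regs[self.dst] = data


def run(inst, regs, mem):
    """Drive one instruction the way a simple CPU model might."""
    if inst.is_mem:
        addr = inst.initiate_acc(regs)
        inst.complete_acc(regs, mem[addr])
    else:
        inst.execute(regs)
```

For example, `run(Add("r3", "r1", "r2"), regs, mem)` only ever touches `execute`, while `run(Load("r4", "r1", 8), regs, mem)` goes through `initiate_acc` and then `complete_acc`.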

Re: [gem5-users] Cannot Compile Gem5 on MacOS Version 10.9.4

2014-09-24 Thread Mitch Hayenga via gem5-users
Hi, this should have been fixed in this changeset. http://repo.gem5.org/gem5?cmd=changeset;node=0edd36ea6130 I don't believe this fix is yet in gem5-stable, but it is in the development branch. On Wed, Sep 24, 2014 at 7:47 PM, Khaled Mahmoud via gem5-users < gem5-users@gem5.org> wrote: > Hi, >

Re: [gem5-users] Prefetchers in LLC assertion fail

2014-09-04 Thread Mitch Hayenga via gem5-users
Hi George, For the tagged prefetcher, I believe this is a bug in the current implementation. I hit this a few weeks ago on ARM. For ARM (assuming X86 is the same), hardware page table walk requests are not assigned a thread ID. When generating prefetches, the tagged prefetcher attempts to tag t

Re: [gem5-users] Switching from Atomic CPU to Detailed CPU after Linux booted up

2014-09-03 Thread Mitch Hayenga via gem5-users
ith switched cpu after > restore? > > > On Tue, Sep 2, 2014 at 3:33 PM, Mitch Hayenga < > mitch.hayenga+g...@gmail.com> wrote: > >> Yes you can. Generally the preferred way to run is to boot/start a >> benchmark with the atomic CPU and then drop a checkpoint.

Re: [gem5-users] Switching from Atomic CPU to Detailed CPU after Linux booted up

2014-09-02 Thread Mitch Hayenga via gem5-users
Yes you can. Generally the preferred way to run is to boot/start a benchmark with the atomic CPU and then drop a checkpoint. You can then restore from the checkpoint with the "detailed" CPU. Simple use case: 1) specify gem5 command with "--checkpoint-at-end" and the atomic CPU 2) Once the benchm

Re: [gem5-users] Big executed instruction difference between X86 atomic adn X86 O3

2014-08-28 Thread Mitch Hayenga via gem5-users
last committed user instruction and > >> first instruction in apic_timer_interrupt function. This confirms that > >> the last user instruction sits in commit until timer interrupt happens. > >> Am I right about this? > >> > >> Next step, I think I need to

Re: [gem5-users] Big executed instruction difference between X86 atomic adn X86 O3

2014-08-27 Thread Mitch Hayenga via gem5-users
x86KvmCPU to boot up, then take checkpoints and run from > checkpoints. > > I will report whether this works or not. > > Thanks. > > -- > Best Regards > Yan Zi > > On 27 Aug 2014, at 15:44, Mitch Hayenga wrote: > > > There are probably three main patches th

Re: [gem5-users] Big executed instruction difference between X86 atomic adn X86 O3

2014-08-27 Thread Mitch Hayenga via gem5-users
There are probably three main patches that could help. The fact you mention the timer interrupt makes me think Andreas is right and these might solve your issue. 1. http://reviews.gem5.org/r/2363/ - o3 is supposed to stop fetching instructions immediately once a quiesce instruction is encountere

Re: [gem5-users] O3 fetch throughput when i-cache hit latency is more than 1 cycle

2014-08-27 Thread Mitch Hayenga via gem5-users
> Thanks for the response Mitch. It seems like a nice way to fake a > pipelined fetch. > > Amin > > > On Tue, Aug 26, 2014 at 10:54 AM, Mitch Hayenga < > mitch.hayenga+g...@gmail.com> wrote: > >> Yep, >> >> I've thought of the need for a fully pip

Re: [gem5-users] O3 fetch throughput when i-cache hit latency is more than 1 cycle

2014-08-26 Thread Mitch Hayenga via gem5-users
Yep, I've thought of the need for a fully pipelined fetch as well. However my current method is to fake longer instruction cache latencies by leaving the delay as 1 cycle, but make up for it by adding additional "fetchToDecode" delay. This makes the front-end latency and branch mispredict penal

Re: [gem5-users] strange decoding of floating point instructions

2014-07-16 Thread Mitch Hayenga via gem5-users
Are you sure it's actually ignoring dependencies? Run with --debug-flags=IntRegs,FloatRegs to verify what vdivs wrote and what vstr read. I'd assume it's just a bug in the gem5 generateDisassembly routine printing out the instruction format. It probably just passed reg 15, meaning floating poin

Re: [gem5-users] Nehalem on gem5

2014-05-20 Thread Mitch Hayenga via gem5-users
Hi, I'm the one who originally wrote that config file. If you don't need anything else from those scripts, I'd just use the mainline run scripts like Andreas said and copy the appropriate config values like O3_ARM_v7a. Since those scripts were written the branch predictor structure was re-organi

Re: [gem5-users] Segfault when changing renameWidth

2014-05-13 Thread Mitch Hayenga via gem5-users
PS: This issue was fixed about two weeks ago by putting in an assert to warn when MaxWidth was <= any width in the machine. So it's in the mainline, just not in gem5-stable. http://repo.gem5.org/gem5?cmd=changeset;node=790a214be1f4 On Tue, May 13, 2014 at 4:36 PM, Mitch Hayenga wrote:

Re: [gem5-users] Segfault when changing renameWidth

2014-05-13 Thread Mitch Hayenga via gem5-users
Gem5 has some hard compile-time limits on how large certain widths can be. In src/cpu/o3/impl.hh there is a line that sets "MaxWidth = 8". Increase this to greater than or equal to 16 (or whatever the maximum width in your machine is). The issue you are hitting is time buffer entries writing int
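
A small pre-flight check along these lines can catch the problem before a run segfaults. This is only a sketch: the value 8 mirrors the "MaxWidth = 8" mentioned above, while the helper name and the example width values are hypothetical.

```python
MAX_WIDTH = 8  # mirrors "MaxWidth = 8" from src/cpu/o3/impl.hh as quoted above


def oversized_widths(widths, max_width=MAX_WIDTH):
    """Return the pipeline widths that exceed the compile-time limit."""
    return {name: w for name, w in widths.items() if w > max_width}


# Hypothetical configuration: renameWidth was raised without touching MaxWidth.
config = {"fetchWidth": 8, "decodeWidth": 8, "renameWidth": 16}
bad = oversized_widths(config)
if bad:
    print(f"Raise MaxWidth to at least {max(bad.values())}: {bad}")
```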

Re: [gem5-users] Writeback buffer kills O3 performance, what is it meant to model?

2014-05-13 Thread Mitch Hayenga via gem5-users
functional unit if you can't get one). >>> >>> I believe Karu Sankaralingham at Wisc also found this and a few other >>> issues, they have a related paper at WDDD this year. >>> >>> We also found a problem where multiple outstanding loads to the same >&g

Re: [gem5-users] Writeback buffer kills O3 performance, what is it meant to model?

2014-05-12 Thread Mitch Hayenga via gem5-users
*"Realistically, to me, it seems like those buffers would be distributed among the function units anyway, not a global resource, so having a global limit doesn't make a lot of sense. Does anyone else out there agree or disagree?"* I believe that's more or less correct. With wbWidth probably mean

Re: [gem5-users] LSQ bottleneck when using X86 TSO

2014-05-05 Thread Mitch Hayenga via gem5-users
Yep, the single-store in flight is a significant limitation of TSO. There are things you can do to alleviate it (which gem5 doesn't do). A cpu could speculatively try to obtain ownership for a cacheline before a store were fully committed. Thus the store could be retired much more quickly to the

Re: [gem5-users] [O3 pipeline viewer]

2014-04-15 Thread Mitch Hayenga
"--trace-start" and "--trace-file" were renamed to "--debug-start" and "--debug-file" Hope that helps. On Tue, Apr 15, 2014 at 2:40 PM, Kuk-Hwan Kim wrote: > > Dear Gem5 community member, > > I wish to display pipeline stages from 200cycles to 1000. So, I would like > to create trace.out by us

Re: [gem5-users] panic: initiateAcc not defined!

2014-04-01 Thread Mitch Hayenga
You get this when you try to execute a non-memory instruction as a memory instruction. You first need to figure out what type of instruction was being executed and then go about figuring what isn't done properly. Since you have a core dump you could view the stack trace most likely to figure out w

Re: [gem5-users] Why system.ruby.dir_cntrl0.memBuffer.memReq is different for same benchmark but different Bank configurations?

2014-03-10 Thread Mitch Hayenga
Praxal, I'm pretty sure the other two answers answer your question. It is completely possible for slight timing changes to change the number of memory accesses and instructions simulated. Because the o3 cpu is speculative, slight timing changes can result in fewer or more speculative memory acce

Re: [gem5-users] Panic: initiateAcc not defined! error with ruby_fs.py on x86

2014-01-07 Thread Mitch Hayenga
So, the first thing you need to do is identify which x86 instruction is causing this (mnemonic and binary encoding). This looks to be an issue in the ISA decoder for gem5 either not properly detecting the instruction you are executing or not fully supporting it. Basically, you are executing what

Re: [gem5-users] Macro Ops splitting and Register 33

2014-01-03 Thread Mitch Hayenga
g at the rename and iew stages for a given instruction. It > turns out that this register is being accessed in rename stage but not in > iew stage. Does this mean that the CPSR is not required for these > instructions? > > Thanks > V Vanchinathan > > > On Thu, Jan 2, 20

Re: [gem5-users] Macro Ops splitting and Register 33

2014-01-02 Thread Mitch Hayenga
R33 is a "zero register". It is used whenever a zero is required. It is also often sourced unnecessarily if an instruction requires fewer source registers. In gem5 the basis for splitting is solely up to whoever wrote the ISA decoder. For ARM it's mostly what you would expect 2 real sources (not

Re: [gem5-users] Patch for a particular revision of gem5

2013-12-05 Thread Mitch Hayenga
-r 5e8970397ab7" but I have the error : unknown >> revision '5e8970397ab7' ! >> >> Cordialement / Best Regards >> >> SENNI Sophiane >> Ph.D. candidate - Microelectronics >> LIRMM - www.lirmm.fr >> >> Le 05/12/2013 14:58, Mitch H

Re: [gem5-users] Patch for a particular revision of gem5

2013-12-05 Thread Mitch Hayenga
All mercurial commands have a built in help that explains their options ("hg help diff"). For this one you want "hg diff -r ". Hope that helps. On Thu, Dec 5, 2013 at 6:44 AM, senni sophiane wrote: > Hi all, > > I want to apply the command "hg diff" for the gem5 revision 5e8970397ab7. > I di

Re: [gem5-users] Does gem5 only support LRU cache replacement policy in non-ruby mode?

2013-11-17 Thread Mitch Hayenga
By default they don't exist, but they should be fairly simple to add. You just have to add your own "tag" class in src/mem/cache/tags. So, I'd just copy the existing LRU files, rename them something different, then edit them to implement what you want (doing search/replace within the

Re: [gem5-users] Cache Prefetch Configuration Problem

2013-11-04 Thread Mitch Hayenga
The prefetchers for the classic memory system are located under src/mem/cache/prefetch. You'll have to add your prefetcher to the listed classes in Prefetcher.py as well as add corresponding source files. Note: The "GHB" prefetcher class is effectively a misnomer and doesn't really work. So don

Re: [gem5-users] Fwd: User defined names for output directory and output files.

2013-10-08 Thread Mitch Hayenga
Just type in the gem5 binary without any arguments. It gives you a list of accepted parameters. --stats-file=FILE Sets the output file for statistics [Default: stats.txt] --outdir=DIR, -d DIR Set the output directory to DIR [Default: m5out] I don't use ruby, so I don't know how to renam

Re: [gem5-users] (no subject)

2013-09-26 Thread Mitch Hayenga
Amin/Tony, there is a very big reason for why gem5 does this. It's about modeling what real processors do. Modern out of orders are very deeply pipelined and instructions take multiple cycles to execute from the time they are scheduled. To enable back-to-back execution of dependent instructions,

Re: [gem5-users] gem5 with simpoint

2013-09-25 Thread Mitch Hayenga
it is more specific >> than "Re: Contents of gem5-users digest..." >> >> >> Today's Topics: >> >>1. gem5 with simpoint (Jagadish Kotra) >>2. Re: gem5 with s

Re: [gem5-users] gem5 with simpoint

2013-09-24 Thread Mitch Hayenga
Hi, I'm the person who wrote the config scripts you are using. I don't have access to them right at this moment but if I remember correctly #1 would properly warm up the caches. It keeps the caches in the system, it just swaps the connection between the atomic or detailed cpu (depending on i

Re: [gem5-users] Valgrind is not working for GEM5

2013-09-19 Thread Mitch Hayenga
Hi, This happens to me whenever I compile with google's tcmalloc. If you have that, try disabling it. I tend to just remove the tcmalloc package and do a fresh rebuild whenever I need to use valgrind to debug memory issues. This is just because by default valgrind doesn't recognize/trap the all

Re: [gem5-users] running spec2006

2013-08-29 Thread Mitch Hayenga
Using the atomic cpu, with fastmem, it takes about 7 days for most of the benchmarks to finish on the "train" input set. This was with the ARM ISA on a bunch of old 2.4 GHz Opterons. I calculated it out once using other papers' presented instruction counts, and running the whole reference input s

Re: [gem5-users] panic: ListenSocket(listen): listen() failed!

2013-08-04 Thread Mitch Hayenga
Hi Ali, There is actually a minor bug/race condition in the gem5 ListenSocket::listen function (src/base/socket.cc). I think Hao might be hitting this, I just haven't had time to upload the patch for it to the mainline. I hit this when launching hundreds of simulations at the same time (on

Re: [gem5-users] cache miss latency variation is larger than CPU time?

2013-05-09 Thread Mitch Hayenga
Other possible sources off the top of my head Prefetchers? Wrong path loads that don't contribute to IPC? Sent from my phone. On May 9, 2013 5:06 AM, "Mitch Hayenga" wrote: > Trying to analyze stats like this is often more trouble than it's worth... > But anyway,

Re: [gem5-users] cache miss latency variation is larger than CPU time?

2013-05-09 Thread Mitch Hayenga
Trying to analyze stats like this is often more trouble than it's worth... But anyway, here is one way this could happen I think. Write misses. As long as the in order does not stall for the tlb translations, it can still get memory level parallelism for its writebacks. And if they missed in t

Re: [gem5-users] simpoint.bb file from gem5 for ARM

2013-05-08 Thread Mitch Hayenga
That number is equal to the # of instructions in the basic block multiplied by the # of times the basic block was executed. Looking at what you attached, I could figure out that basic block 462 was actually a 28 instruction loop. In general if you sum up the second numbers across a line, they sh
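
The relationship is easy to check by hand. The sketch below assumes the usual SimPoint frequency-vector layout of one interval per line, written as "T" followed by ":id:weight" pairs; the sample line and its weights are invented, and only the basic block number 462 and its size of 28 instructions come from the message above.

```python
def parse_bbv_line(line):
    """Parse one interval line, e.g. 'T:462:2800 :463:200', into {id: weight}."""
    if not line.startswith("T"):
        raise ValueError("interval lines are expected to start with 'T'")
    entries = {}
    for token in line[1:].split():
        _, bb_id, weight = token.split(":")
        entries[int(bb_id)] = int(weight)
    return entries


def executions(weight, bb_size):
    """Weight = instructions in the block * times executed, so divide back."""
    return weight // bb_size


# Made-up sample interval for illustration.
interval = parse_bbv_line("T:462:2800 :463:200")
print(executions(interval[462], bb_size=28))  # times block 462 ran
print(sum(interval.values()))                 # instructions in this interval
```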

Re: [gem5-users] Alpha O3 CPU Broken?

2013-04-26 Thread Mitch Hayenga
++ to this likely just being an issue of reading the wrong stat. I've personally diffed every instruction on a small run of libquantum (though on ARM). You can always implement a "poor man's checker" to execute two gem5 cpu models in lock-step, verifying the committed instruction path (assuming s
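
A "poor man's checker" can be as simple as comparing the committed instruction streams of two runs and stopping at the first mismatch. This sketch assumes each trace has already been reduced to a sequence of (pc, mnemonic) tuples; that record format and the sample traces are assumptions, not a gem5 output format.

```python
from itertools import zip_longest


def first_divergence(trace_a, trace_b):
    """Return (index, a, b) for the first differing committed instruction.

    Returns None when both traces are identical. A shorter trace shows up
    as a mismatch against None at the point where it ends.
    """
    for i, (a, b) in enumerate(zip_longest(trace_a, trace_b)):
        if a != b:
            return i, a, b
    return None


# Hypothetical committed-instruction traces from two CPU models.
atomic = [(0x1000, "ldr"), (0x1004, "add"), (0x1008, "str")]
detailed = [(0x1000, "ldr"), (0x1004, "add"), (0x100C, "b")]
print(first_divergence(atomic, detailed))
```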

Re: [gem5-users] SimpleDDR3 failing on an assertion with default parameters

2013-04-23 Thread Mitch Hayenga
> Hi Mitch, > > Thanks for reporting. Is there an easy way to reproduce this? > > Andreas > > From: Mitch Hayenga > Reply-To: gem5 users mailing list > Date: Tuesday, 23 April 2013 01:17 > To: gem5 users mailing list > Subject: [gem5-users] SimpleDDR3 failing on an

[gem5-users] SimpleDDR3 failing on an assertion with default parameters

2013-04-22 Thread Mitch Hayenga
Hi all, I'm running the SimpleDDR3 memory with default parameters and one of my benchmarks is failing on a panic/sanity check. I was wondering if anyone knew of any issues with the default DDR3 parameters or if this sanity check might be overzealous? Here's how I've configured the memory: physmem =

Re: [gem5-users] How to make trace file show branch target address?

2013-04-20 Thread Mitch Hayenga
You could add your own DPRINTF that accesses the fields of the staticInstruction. Check out src/cpu/static_inst.hh. Specifically the functions hasBranchTarget() and the two branchTarget() functions. On Sat, Apr 20, 2013 at 4:24 PM, Meng Wang wrote: > Hi, all > I dumped trace of ARM benchmark

Re: [gem5-users] difference between commit.committedInsts and commitedInsts

2013-04-14 Thread Mitch Hayenga
Quick answer: system.cpuname.commit.committedInsts counts nops and instruction prefetches, system.cpuname.committedInsts doesn't. These stats are incremented in src/cpu/o3/commit_impl.hh:updateComInstStats and src/cpu/o3/cpu.cc:instDone. instDone is called by updateComInstStats after testing for

Re: [gem5-users] a question about simpoint profiling patch

2013-04-05 Thread Mitch Hayenga
Hi Meng, I'm CC'ing the mailing list in case anyone else has interest in running with the simpoint patch. This part of the patch was coded by Ali I think. I originally wrote the profiling bit that generated the bbv file. I use this current patch with my own custom se.py script. I've linked

Re: [gem5-users] discussion on modeling shared L3 cache, hierarchy

2013-03-13 Thread Mitch Hayenga
Last level cache miss rates can be quite high on SPEC. Effectively the higher level caches "filter" all of the easily cacheable accesses, so that the last level cache only sees accesses that tend to miss. Aamer Jaleel, of Intel, has published miss rates for L1/L2/L3 cache configurations on the r

Re: [gem5-users] changeToTiming function

2013-03-11 Thread Mitch Hayenga
The changeToTiming function was removed ~3 weeks ago from the mainline. http://repo.gem5.org/gem5/rev/1cd02decbfd3 CPUs now define a method that is used to determine their memory mode (timing or atomic). Check O3CPU.py for an example. If a mode change is necessary when swapping CPUs, the memor

Re: [gem5-users] LSQ full condition checked at rename - O3

2013-03-04 Thread Mitch Hayenga
It looks like the logic is just organized poorly. Yes, it will unnecessarily stall non-loads if there are no free LSQ entries. It shouldn't take many changes to fix this (basically changing the later while loop to track number of LSQ entries remaining and accounting based upon number of loads sen

Re: [gem5-users] Switching CPUs - Python script trouble

2013-03-03 Thread Mitch Hayenga
Oops, perhaps posted a bit too quickly... It seems my detailed cpu model wasn't properly connected to the system, and this was just a very poor error message. On Sun, Mar 3, 2013 at 2:42 PM, Mitch Hayenga wrote: > Hi all, > > I'm trying to automate switching between an atomic

[gem5-users] Switching CPUs - Python script trouble

2013-03-03 Thread Mitch Hayenga
Hi all, I'm trying to automate switching between an atomic cpu and my own cpu. This is done via my own python configuration script. With the default config script (configs/common/se.py), switching works properly. With my own script, which makes the same calls, it fails because it doesn't find

Re: [gem5-users] IPC

2013-02-28 Thread Mitch Hayenga
By using the "timing" cpu, you are effectively using something that is like an idealized 1-wide, in-order cpu model. So the maximum possible IPC would be one and with cache accesses, etc it should be expected to be much lower than one. For relative comparisons, especially papers not looking expli

Re: [gem5-users] How to dump data structure symbols to trace files ?

2013-02-28 Thread Mitch Hayenga
Not that I know of. For a poor man's version, since it seems you are just trying to generate various traces from a region of interest, you could just use the existing m5ops to checkpoint @ the beginning of the region of interest and exit @ the end. That way you could just run from the checkpo

Re: [gem5-users] data type: Addr vs TheISA::PCState

2013-02-18 Thread Mitch Hayenga
They all inherit from the same sets of classes in src/arch/generic/types.hh. You can use any of those constructors or "set" methods to properly convert. Also look at the corresponding types.hh file in the ISA-specific folder (ex: src/arch/arm/types.hh). TheISA::PCState pc_addr(0x12cf4); //

Re: [gem5-users] Question about PacketQueue::scheduleSend

2013-02-15 Thread Mitch Hayenga
99e).(0=>1). 2 loads sent to the memory system in the same cycle, both hit in the L1 cache, but result in different cycle latencies. On Fri, Feb 15, 2013 at 10:36 AM, Mitch Hayenga < mitch.hayenga+g...@gmail.com> wrote: > This is a nicely timed thread. I just hit a related ticking issue

Re: [gem5-users] Question about PacketQueue::scheduleSend

2013-02-15 Thread Mitch Hayenga
This is a nicely timed thread. I just hit a related ticking issue while performance validating my core model. Here is an example case: ld r1, [sp, #0x16] // L1 cache hit ld r2, [sp, #0x24] // L1 cache hit My core assumes 2 load ports, so both of these loads issue and hit in the same cycle. B

Re: [gem5-users] get same results from different benchmarks

2013-02-01 Thread Mitch Hayenga
Hi, I'm currently traveling and don't run full system that much myself these days. So maybe someone else would be a better choice to help you. That said, I searched the list for your error and it looks like it's the same problem as discussed in this thread? http://comments.gmane.org/gmane.comp.em

Re: [gem5-users] get same results from different benchmarks

2013-01-28 Thread Mitch Hayenga
Hi, It looks to me like you might just be dumping execution information during the OS boot (before the benchmark has even started to run), which should be the same regardless of the benchmark (since it wouldn't have started). Is this the case? Also, be warned dumping execution information for

Re: [gem5-users] documents on O3 cpu implementation?

2013-01-26 Thread Mitch Hayenga
ons ( http://www.cs.ucr.edu/~tianc/). On Sat, Jan 26, 2013 at 8:23 PM, Mitch Hayenga wrote: > "If both answers are t-1, which means the output of any stage only depends > on some other stages' output at previous cycle, then I can understand why > time buffer can get ride of th

Re: [gem5-users] documents on O3 cpu implementation?

2013-01-26 Thread Mitch Hayenga
"If both answers are t-1, which means the output of any stage only depends on some other stages' output at previous cycle, then I can understand why time buffer can get ride of the dependencies. However, if a stage requires a result from another stage at the same cycle, I cannot see how this works.

Re: [gem5-users] documents on O3 cpu implementation?

2013-01-25 Thread Mitch Hayenga
Nilay, Ticking pipestages in reverse (and allowing values to propagate in that order) is a *very* common way to implement processor simulators. I'd almost call it the standard method. Though gem5 gets around this via the timebuffer, other simulators do not use a timebuffer/pipe method. For examp
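
To show why the reverse order matters, here is a toy pipeline (not gem5 code) where each list slot is one stage's latch. Updating back to front lets every stage read its predecessor's value from the previous cycle; updating front to front-to-back in place smears the new input through the whole pipe in a single cycle. The function names and the three-stage depth are arbitrary choices for the example.

```python
def tick_reverse(latches, new_input):
    """Advance one cycle, updating the last stage first.

    Each stage copies from its predecessor before the predecessor is
    overwritten, so a value moves exactly one stage per cycle.
    """
    retired = latches[-1]
    for i in range(len(latches) - 1, 0, -1):
        latches[i] = latches[i - 1]
    latches[0] = new_input
    return retired


def tick_forward_naive(latches, new_input):
    """Same update done front to back in place, for comparison.

    Stage 1 reads stage 0 after stage 0 was already overwritten, and so
    on, so the new input reaches every stage in one cycle.
    """
    retired = latches[-1]
    latches[0] = new_input
    for i in range(1, len(latches)):
        latches[i] = latches[i - 1]
    return retired


pipe = [None, None, None]
for inst in ["i0", "i1", "i2"]:
    tick_reverse(pipe, inst)
print(pipe)  # one instruction per stage, oldest at the end
```

A time buffer reaches the same result by keeping the previous cycle's values in separate storage, which is why the stage order stops mattering there.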

Re: [gem5-users] run SPEC CPU2K6 in SE mode

2012-12-18 Thread Mitch Hayenga
You should run "gem5.opt configs/example/se.py --help". It's clearly documented there how to do this. -i INPUT, --input=INPUT Read stdin from a file. --output=OUTPUT Redirect stdout to a file. --errout=ERROUT Redirect stderr to a file. So for your case

Re: [gem5-users] question regarding LL/SC implementation in gem5

2012-11-20 Thread Mitch Hayenga
The SC (2) failing after (1) should force the program to loop trying to properly execute a LL/SC pair. Assuming (1) and (2) properly execute, the value of the lock will be set to taken. This would force your other thread to continuously loop on its LL until it saw the lock was free. I think your co
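
The LL/SC behavior discussed in this thread can be sketched with a toy cache line. This is a simplification, not the gem5 implementation: it tracks a single reservation per line, and the class and method names are invented for the example.

```python
class CacheLineSketch:
    """Toy cache line with a single load-linked reservation."""

    def __init__(self, value=0):
        self.value = value
        self.locked_ctx = None  # context id holding the reservation, if any

    def load_linked(self, ctx):
        """LL: read the value and record which context made the reservation."""
        self.locked_ctx = ctx
        return self.value

    def store(self, value):
        """Ordinary store: writes the value and clears any reservation."""
        self.value = value
        self.locked_ctx = None

    def store_conditional(self, ctx, value):
        """SC: write only if this context's reservation is still in place."""
        if self.locked_ctx != ctx:
            return False
        self.value = value
        self.locked_ctx = None
        return True


def acquire(line, ctx, max_tries=10):
    """Spin on LL/SC until the lock (0 = free, 1 = taken) is acquired."""
    for _ in range(max_tries):
        if line.load_linked(ctx) == 0 and line.store_conditional(ctx, 1):
            return True
    return False
```

If another context writes the line between the LL and the SC, the SC fails and the loop retries; once the lock value is 1, the other thread keeps seeing "taken" on its LL until the holder releases it.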

Re: [gem5-users] ARM/O3: Load-linked, store-conditional behavior

2012-11-20 Thread Mitch Hayenga
> ** > > Any updates Mitch? > > Thanks, > > Ali > > > > On 11.10.2012 20:44, Mitch Hayenga wrote: > > Hi, > > I have a patch that fixes this in classic and ruby. I was waiting for > another student (Dibakar, he runs a lot more parallel code than I do) to >

Re: [gem5-users] question regarding LL/SC implementation in gem5

2012-11-20 Thread Mitch Hayenga
"It releases the lock using normal store" I think this might be where your confusion is coming from. This is not true, it does a store conditional not a normal store. The store conditional only stores if the context id is still set on the cacheline. This code is in (if ruby) src/mem/ruby/system/

Re: [gem5-users] Perfect branch predictor in ooo

2012-10-23 Thread Mitch Hayenga
> > > On Tue, Oct 23, 2012 at 10:02 AM, Mitch Hayenga < > mitch.hayenga+g...@gmail.com> wrote: > >> Since the gem5 O3 cpu model actually executes instructions @ execute (not >> fetch/decode) a perfect branch predictor is a bit tricky. Assuming you are >> running a s

Re: [gem5-users] Perfect branch predictor in ooo

2012-10-22 Thread Mitch Hayenga
> inst->setPredTaken(false); > return false; > } > > > to > > //if (!inst->isControl()) { > TheISA::advancePC(nextPC, inst->staticInst); > inst->setPredTarg(nextPC); > inst->setPredTaken(false); >

Re: [gem5-users] Perfect branch predictor in ooo

2012-10-22 Thread Mitch Hayenga
Since the gem5 O3 cpu model actually executes instructions @ execute (not fetch/decode) a perfect branch predictor is a bit tricky. Assuming you are running a single-threaded app in SE mode (so you don't have OS/multi-threaded time variance issues), you could simply run the application twice. Sav

Re: [gem5-users] ARM/O3: Load-linked, store-conditional behavior

2012-10-11 Thread Mitch Hayenga
not be implicitly expanded > to cover the whole block as we've done. So you've convinced me that that's > not just the most straightforward fix, but probably the right one. > > If you get it working, please submit the patch. > > Thanks! > > Steve > >

Re: [gem5-users] ARM/O3: Load-linked, store-conditional behavior

2012-09-26 Thread Mitch Hayenga
> Steve > > > On Wed, Sep 26, 2012 at 12:50 PM, Mitch Hayenga < > mitch.hayenga+g...@gmail.com> wrote: > >> Thanks for the reply. >> >> Thinking about this... I don't know too much about the O3 store-set >> predictor, but it would seem that load-l

Re: [gem5-users] ARM/O3: Load-linked, store-conditional behavior

2012-09-26 Thread Mitch Hayenga
want to mark the ops as serializing as that slows > down the cpu quite a bit. > > > > Thanks, > > Ali > > > > On 26.09.2012 13:14, Mitch Hayenga wrote: > > Background: > I have a non-o3, out of order CPU implemented on gem5. Since I don't have > a check

[gem5-users] ARM/O3: Load-linked, store-conditional behavior

2012-09-26 Thread Mitch Hayenga
Background: I have a non-o3, out of order CPU implemented on gem5. Since I don't have a checker implemented yet, I tend to diff committed instructions vs o3. Yesterday's patches caused a few of these diffs to change because of load-linked/store-conditional behavior (better prediction on data ops tha

Re: [gem5-users] Question on PARSEC+GARNET+RUBY

2012-09-12 Thread Mitch Hayenga
f >> 75,899,868 flits and the successful reception of 75,899,865 flits. Am I >> doing something wrong with the simulation? Do I need to set some parameters >> for the power calculations? >> >> Thanks for your time. >> >> Thanks, >> Pavan >> >&

Re: [gem5-users] LLC states in classic coherence

2011-10-17 Thread Mitch Hayenga
I actually did a slight modification of the m5 classic protocol to "fix" this. Basically, I allowed the data to remain dirty in the L2 & forwarded a version that looked clean-exclusive to the L1. The way m5 is structured, the L2 would already snoop upwards to the L1 if it got a request from below

[gem5-users] Gem5 classic coherence

2011-10-04 Thread Mitch Hayenga
understand this somewhat moreso than the previous issue, since forcing traffic on another cache is undesirable, but with the non-inclusive nature of the hierarchy, this original request may have to go all the way out to memory. Just doing some sanity checking that this is how things are supposed to