Hello Gabriel

Just wondered if you have had any further thoughts on how to resolve the CHI assertion error?

I have tried running the failing Parsec/Splash2 benchmarks, and if I switch to a private L2 cache the tests pass. So the issue is definitely related to the shared L2 cache implementation that I have added.

Furthermore, the assertion is firing from the shared L2 cache controller.

Thinking about it further: if the cache line is resident in an upstream cache (in state UC or UD) but not in the shared L2 cache (hence the final state being RU), this would imply that the L2$ is non-inclusive of the upstream L1 caches. Can you please confirm whether this is the case?

Best regards
J.Osmany

_____________________________________________
From: Javed Osmany
Sent: 04 August 2021 08:38
To: gem5 users mailing list <gem5-users@gem5.org>
Cc: Gabriel Busnot <gabriel.bus...@arteris.com>; Javed Osmany 
<javed.osm...@huawei.com>
Subject: RE: [gem5-users] Re: CHI - Cluster CPUs having a shared L2 cache


Hello Gabriel

>> I would then suggest two non-tested options. You can assign the L2 
>> controller to cpu.l2 after registering it as downstream_components of l1i 
>> and l1d. Let's hope it will set the desired name.

I tried this but was unsuccessful, so I have left it for the time being in order to make progress.
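
For reference, the ordering I tried was roughly the following (a minimal, illustrative sketch rather than my exact code):

def attachSharedL2(cpus, l2_cntrl):
    # Register the shared L2 controller as the downstream destination of
    # every L1 controller first...
    for cpu in cpus:
        cpu.l1i.downstream_destinations = [l2_cntrl]
        cpu.l1d.downstream_destinations = [l2_cntrl]
    # ...and only afterwards assign it to an l2 attribute, in the hope that
    # the SimObject picks up the name "system.cpuN.l2" rather than
    # "system.cpuN.l1i.downstream_destinations".
    cpus[0].l2 = l2_cntrl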

I have been trying to run the Parsec benchmarks, and the majority of them are failing due to an assertion firing from CHI-cache-actions.sm. The assertion error is:

panic: Runtime Error at CHI-cache-actions.sm:2611: assert failure.

The system being modelled consists of 8 CPUs partitioned into three clusters, where one cluster has a shared L2 cache and the other two clusters have a private L2 cache per CPU.

The command used to run the benchmark is (also included in the attachment):

./build/ARM/gem5.opt --debug-flags=RubySlicc 
--outdir=m5out_parsec_lu_cb_struct1 configs/example/se_kirin_custom.py --ruby 
--topology=Pt2Pt --cpu-type=DerivO3CPU --num-cpus=8 --num-dirs=1 
--num-l3caches=1 --num-cpu-bigclust=1 --num-cpu-middleclust=1 
--num-cpu-littleclust=2 --num-clusters=3 --cpu-type-bigclust=derivo3 
--cpu-type-middleclust=derivo3 --cpu-type-littleclust=hpi 
--bigclust-l2cache=private --middleclust-l2cache=private 
--littleclust-l2cache=shared --num-bigclust-subclust=1 
--num-middleclust-subclust=2 --num-littleclust-subclust=2 
--num-cpu-bigclust-subclust2=1  --num-cpu-middleclust-subclust2=3 
--num-cpu-littleclust-subclust2=2 --big-cpu-clock=3GHz 
--middle-cpu-clock=2.6GHz --little-cpu-clock=2GHz --verbose=true 
--cmd=tests/parsec/splash2/lu_cb/splash2x.lu_cb.hooks -o " -p7 -n512 -b16"

The code in CHI-cache-actions.sm that the assertion is firing from is:

action(Finalize_UpdateCacheFromTBE, desc="") {
  assert(is_valid(tbe));
  State final := tbe.finalState;
  // JO
  DPRINTF(RubySlicc, "JO: CHI-cache-actions.sm: Final state value is %s\n", final);
  if ((final == State:UD_RSC) || (final == State:SD_RSC) || (final == State:UC_RSC) ||
      (final == State:SC_RSC) || (final == State:UD)     || (final == State:UD_T)   ||
      (final == State:SD)     || (final == State:UC)     || (final == State:SC)     ||
      (final == State:UC_RU)  || (final == State:UD_RU)  || (final == State:UD_RSD) ||
      (final == State:SD_RSD)) {
    assert(tbe.dataBlkValid.isFull());
    assert(tbe.dataValid);
    assert(is_valid(cache_entry));
    cache_entry.DataBlk := tbe.dataBlk;
    DPRINTF(RubySlicc, "Cached data %s pfb %s\n", tbe.dataBlk, cache_entry.HWPrefetched);
  } else {
    // make sure only deallocate the cache entry if data is invalid
    assert(tbe.dataValid == false); <== This is the assertion which is firing.
    if (is_valid(cache_entry)) {
      cache.deallocate(address);
      unset_cache_entry();
    }
  }
}

I added a DPRINTF to check what the final state was before the assertion fires; the final state is RU. This state is not included in the if (...) clause, hence we take the else branch and the assertion fires.

21984560500: system.cpu2.l1i.downstream_destinations: CHI-cache-actions.sm:2598: JO: CHI-cache-actions.sm: Final state value is RU
panic: Runtime Error at CHI-cache-actions.sm:2611: assert failure.

The issue seems to be the final state, RU. RU appears to say that the state of the cache line in the upstream cache is unique, but it does not encode the state of the cache line in the L2 cache itself.

Is this due to missing functionality in the SLICC encoding for CHI, or is the shared L2$ implementation causing the problem?
Any pointers on how to resolve this assertion issue would be much appreciated.

I am attaching (as a WinZip RAR file) the updated CHI.py and CHI_config.py files. Also included are the custom version of se.py used to run the test and a snippet of the trace file, with the DPRINTF enabled, from just before the assertion fires.

Best regards
J.Osmany


 << File: chi_parsec_splash2_lu_cb_shared_l2_assertion_error.rar >>
-----Original Message-----
From: Javed Osmany
Sent: 22 July 2021 11:52
To: 'gem5 users mailing list' <gem5-users@gem5.org>
Cc: Gabriel Busnot <gabriel.bus...@arteris.com>; Javed Osmany 
<javed.osm...@huawei.com>
Subject: RE: [gem5-users] Re: CHI - Cluster CPUs having a shared L2 cache

Hi Gabriel

Many thanks for your insight and input.

I have taken on board your suggestion and simplified the customisation of CHI.py and CHI_config.py by just using the CHI_config.CHI_RNF class and adding another method to it, called addSharedL2Cache.
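
The new method follows the shape of the existing private-L2 setup, roughly along these lines (a simplified, illustrative sketch rather than the exact code, with attribute names following CHI_config.py conventions):

# Simplified sketch of the added CHI_RNF method (illustrative, not the exact code).
# It mirrors the private-L2 setup but builds one L2 controller shared by all
# CPUs in the cluster.
def addSharedL2Cache(self, cache_type, pf_type=None):
    self._ll_cntrls = []
    # One shared L2 cache and controller for the whole cluster.
    l2_cache = cache_type(start_index_bit=self._block_size_bits)
    l2_cntrl = CHI_L2Controller(self._ruby_system, l2_cache, pf_type)
    self._cntrls.append(l2_cntrl)
    self._ll_cntrls.append(l2_cntrl)
    # Every L1 controller in the cluster points at the same downstream L2
    # (the shared controller is mainly reachable through these attributes,
    # which is related to the naming issue we discussed).
    for cpu in self._cpus:
        cpu.l1i.downstream_destinations = [l2_cntrl]
        cpu.l1d.downstream_destinations = [l2_cntrl]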

Also, I have started testing with one cluster containing one CPU, permuting between a shared and a private L2 cache.

>> I would then suggest two non-tested options. You can assign the L2 
>> controller to cpu.l2 after registering it as downstream_components of l1i 
>> and l1d. Let's hope it will set the desired name.

Will try this out.

Best regards
J.Osmany


-----Original Message-----
From: Gabriel Busnot via gem5-users [mailto:gem5-users@gem5.org]
Sent: 22 July 2021 08:40
To: gem5-users@gem5.org
Cc: Gabriel Busnot <gabriel.bus...@arteris.com>
Subject: [gem5-users] Re: CHI - Cluster CPUs having a shared L2 cache

Hi Javed,

Woops, I didn't see the split option in your first post. My bad.

I think the l2 is actually named "system.cpu0.l1i.downstream_destinations", and you will find it under that name in the .ini file. I think this is due to the way gem5 sets SimObject names. When you assign a SimObject to several object attributes (cpu.l2, cpu.l1i and finally cpu.l1d), it gets one of those names according to the complex SimObject and SimObjectVector logic. For some reason it does not end up as a child of cpu0.l1d despite that being the last assignment in the list. I am regularly fighting the SimObject naming logic as well, so that's normal ;)
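
To illustrate, the configuration presumably does something like the following (an illustrative sketch, not your actual code):

def shared_l2_example(cpu, shared_l2_cntrl):
    # The same controller object becomes reachable through several attributes;
    # gem5's SimObject/SimObjectVector logic then picks exactly one full name.
    cpu.l2 = shared_l2_cntrl                              # candidate: system.cpuN.l2
    cpu.l1i.downstream_destinations = [shared_l2_cntrl]   # candidate via the L1I controller
    cpu.l1d.downstream_destinations = [shared_l2_cntrl]   # candidate via the L1D controller
    # In your case it shows up in the generated .ini file as
    # "system.cpu0.l1i.downstream_destinations", not as a child of cpu0.l1d.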

Also check the warnings in the output: some of them will warn you about SimObject reparenting. Sadly, a SimObject's name is determined by the attribute you assign it to, and you are not supposed to change it afterwards.

I would then suggest two non-tested options. You can assign the L2 controller 
to cpu.l2 after registering it as downstream_components of l1i and l1d. Let's 
hope it will set the desired name.
The other "last resort" option is to violate SimObject._name privacy and set it 
manually after the SimObject has been assigned for the last time... I would 
advise against that, though.

Whenever possible, it is actually best to assign a SimObject to its parent attribute at the time of object creation and never assign it again afterwards... not always possible, though. Also, make use of "private" attributes (i.e., attributes whose names start with '_') as much as possible: they bypass the SimObject assignment logic and solve many issues.
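
For example, something along these lines (an untested sketch with illustrative names; the exact import path may differ in your tree):

# Untested sketch: keep a handle on the shared L2 controller in a "private"
# attribute so that storing it does not go through SimObject naming/reparenting.
from ruby.CHI_config import CHI_RNF  # import path may differ in your setup

class CHI_RNF_SharedL2(CHI_RNF):
    def registerSharedL2(self, l2_cntrl):
        # Leading underscore: plain Python attribute, bypasses the SimObject
        # assignment logic, so it neither names nor reparents the controller.
        self._shared_l2 = l2_cntrl
        # The controller still needs exactly one "public" attribute assignment
        # somewhere (e.g. self.l2 = l2_cntrl) to receive its proper name.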

Gabriel