On 3/18/24 21:41, Jeff Law wrote:
>> The first patch is the main change which improves SPEC cactu by 10%.
> Just to confirm.  Yup, 10% reduction in icounts and about a 3.5% 
> improvement in cycles on our target.  Which is great!

Nice.

> This also makes me wonder if cactu is the benchmark that was sensitive 
> to flushing the pending queue in the scheduler.  Jivan's data would tend 
> to indicate that is the case as several routines seem to flush the 
> pending queue often.  In particular:
>
> ML_BSSN_RHS_Body
> ML_BSSN_Advect_Body
> ML_BSSN_constraints_Body
>
> All have a high number of dynamic instructions as well as lots of 
> flushes of the pending queue.
>
> Vineet, you might want to look and see if cranking up the 
> max-pending-list-length parameter helps drive down spilling.  I think 
> its default value is 32 insns.  I've seen it cranked up to 128 and 256 
> insns without significant ill effects on compile time.
>
> My recollection (it's been like 3 years) of the key loop was that it had 
> a few hundred instructions and we'd flush the pending list about 50 
> cycles into the loop as there just wasn't enough issue bandwidth to the 
> FP units to dispatch all the FP instructions as their inputs became 
> ready.  So you'd be looking for flushes in a big loop.

Great insight.

Fired off a cactu run with 128, will keep you posted.
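
For anyone wanting to repeat the experiment, here is a minimal sketch of how
a sweep over the parameter could be set up.  It only builds the option lists
and does not invoke the compiler.  The helper name flags_for and the -O2 base
flag are assumptions for illustration; the actual SPEC config flags are not
given in this thread.  The values 32, 128 and 256 are the ones mentioned
above.

```python
# Hypothetical helper: builds GCC option lists for a sweep over
# --param max-pending-list-length. Nothing here runs the compiler.

# Assumption: placeholder base flags; the real SPEC config is not shown here.
BASE_FLAGS = ["-O2"]

# 32 is the recalled default; 128 and 256 are the larger values mentioned.
SWEEP = [32, 128, 256]


def flags_for(pending_len):
    """Return the compiler options for one max-pending-list-length setting."""
    return BASE_FLAGS + ["--param", f"max-pending-list-length={pending_len}"]


if __name__ == "__main__":
    # Print one option string per setting, ready to paste into a build config.
    for n in SWEEP:
        print(" ".join(flags_for(n)))
```

Comparing spill counts and pending-list flushes for the three hot routines
across these settings should show whether the flushes are what drives the
spilling.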

Thx,
-Vineet
