Re: running Assembler I/O macro code as AMODE 31, RMODE ANY

John Baxter Fri, 22 Jul 2011 12:53:00 -0700

For the z10: IBM J. RES. & DEV. VOL. 53 NO. 1 PAPER 1 2009 ("Design and
microarchitecture of the IBM System z10 microprocessor"), obtainable
through IEEE by subscription. The z9 version may still be available via
alternate internet sources.


John Baxter
ATCO I-Tek



-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On
Behalf Of Bill Fairchild
Sent: Friday, July 22, 2011 1:07 PM
To: [email protected]
Subject: Re: running Assembler I/O macro code as AMODE 31, RMODE ANY

Where do you find such detailed descriptions?  Patent application?
System journal?  NDA material?  You can tell me, but then you'd have to
kill me?

Bill Fairchild
Rocket Software

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On
Behalf Of Edward Jaffe
Sent: Friday, July 22, 2011 1:19 PM
To: [email protected]
Subject: Re: running Assembler I/O macro code as AMODE 31, RMODE ANY

The last time I thoroughly studied/investigated System z branch
prediction logic (BPL) was on the z9.

On that model, the BPL runs early in the pipeline, before instruction
decode, and it essentially runs asynchronously to the rest of the
pipeline. It prefetches instructions and predicts direction and target
based on the path it thinks the rest of the pipeline will be later
executing. It puts those prefetched streams into fairly large
instruction buffers awaiting when they might be needed by the decode
logic.

The BPL itself uses what appears to be a unified Branch Target
Buffer (BTB) and Direction Buffer (they are physically built of separate
arrays but are logically the same). The BTB contains 8K entries. Being
clever, IBM decided not to 'remember' not-taken conditional branches,
thus allowing the BTB to appear much bigger than it really is. This was
considered a reasonable performance trade-off (the rationale for keeping
not-taken branches in the BTB is to more accurately handle branches that
frequently change direction). The z9 BTB uses a strongly-taken,
weakly-taken approach for each branch.
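
A toy model may make the "only remember taken branches" idea concrete.
This is a sketch under assumptions, not the z9 design: the class name
ToyBTB, the direct-mapped indexing, the exact state transitions, and the
rule that a weakly-taken entry is dropped when the branch is not taken
are all made up for illustration. Only the 8K entry count and the
strongly-taken/weakly-taken states come from the description above.

```python
# Toy model only: indexing, state transitions and the eviction rule are
# assumptions for illustration, not the documented z9 behaviour.
STRONG, WEAK = "strongly-taken", "weakly-taken"


class ToyBTB:
    def __init__(self, entries=8192):
        self.entries = entries
        self.table = {}  # index -> (branch_addr, target, state)

    def _index(self, addr):
        # Assumed direct-mapped lookup on the halfword-aligned address.
        return (addr >> 1) % self.entries

    def predict(self, addr):
        """Return the predicted target, or None meaning 'not taken'."""
        slot = self.table.get(self._index(addr))
        if slot and slot[0] == addr:
            return slot[1]
        return None  # no entry: the branch is assumed not taken

    def update(self, addr, taken, target=None):
        """Record the resolved outcome of the branch at addr."""
        idx = self._index(addr)
        slot = self.table.get(idx)
        hit = slot is not None and slot[0] == addr
        if taken:
            # First sighting installs as weakly-taken; a repeat promotes it.
            self.table[idx] = (addr, target, STRONG if hit else WEAK)
        elif hit:
            if slot[2] == STRONG:
                self.table[idx] = (addr, slot[1], WEAK)  # demote
            else:
                del self.table[idx]  # forget it; the slot is free again
        # A not-taken branch with no entry never occupies a slot at all.
```

In this model a branch that is never taken costs no BTB space, which is
the sense in which the buffer "appears bigger than it really is".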

Another clever thing IBM did on the z9 was to implement "just in time"
prefetching down non-predicted branch paths to enhance performance. So
if the hardware predicts a branch will be not taken, it prefetches the
taken path just before the branch direction is resolved in the execution
stage of the pipeline. If the prediction then turns out to be wrong, the
taken path is already in a "recovery instruction buffer", from which it
can be sent into the decoder on the next cycle.
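
The recovery-buffer idea can be sketched the same way. Again this is
only an illustration: fetch_stream and resolve are hypothetical names,
the stream length is arbitrary, and handling a predicted-taken branch
symmetrically is my assumption (the description above only covers the
predicted-not-taken case).

```python
def fetch_stream(addr, length=4):
    """Stand-in for an instruction fetch: the halfword addresses covered."""
    return [addr + 2 * i for i in range(length)]


def resolve(predicted_taken, actually_taken, fallthrough, target):
    """Return (stream handed to decode, where that stream came from)."""
    # Stream fetched well ahead of time along the predicted path.
    predicted = fetch_stream(target if predicted_taken else fallthrough)
    # "Just in time" prefetch of the other path, shortly before the
    # branch direction is resolved.
    recovery_buffer = fetch_stream(fallthrough if predicted_taken else target)
    if predicted_taken == actually_taken:
        return predicted, "predicted path"
    # Misprediction: the alternate path is already buffered, so decode
    # can switch to it without waiting for a fresh fetch.
    return recovery_buffer, "recovery buffer"
```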

Slick! :-) 

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html



