Mike Schwab writes:
> Maybe the z13 will execute multiple bytes in parallel?  One byte being
> looked up per core.  Then a cycle to check the condition codes and how
> many to accept.

Per-core or SMT wouldn't be suitable (as other responses have covered)
but your wish for processing multiple bytes (or half-words or words or
doublewords) in one instruction is fulfilled by the new SIMD
instructions in z13: Single Instruction Multiple Data.

They might or might not be overkill for a simple "search for a single
target byte" (though there are some special cases for zeroes that
might be more generally useful) but they are extremely powerful.

The SIMD/vector instructions are described in the latest version of
the POPs. There are 32 x 128-bit vector registers and you can treat
each as 16 bytes, 8 half-words, 4 words or 2 doublewords. The ones
most likely to be useful for string handling are in Chapter 23:
"Vector String Instructions". So, for example, VFAEB is a flavour
of VECTOR FIND ANY ELEMENT EQUAL which treats one vector register
as a list of 16 individual bytes that you want to look for and then
hunts through the 16 bytes of another vector register for a match
(and you'd iterate the latter through your source string). You can
set a flag bit in the instruction (mnemonic VFAEZB) which also looks
out for a zero byte in there. Other flag bits in the instruction
control how the matches are reported: either as match/nonmatch bits
for each source byte or as a byte index to the first match/nonmatch.
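
To make the idea concrete, here is a rough scalar model in C of what
one "find any element equal" step does, plus the loop that walks a
source string 16 bytes at a time. It is a sketch of the behaviour as
described above, not the exact POPs semantics: the function names
(find_any_eq16, find_any_of), the stop_at_zero parameter and the
"return 16 for no match" convention are all made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VLEN 16  /* bytes in one 128-bit vector register */

/* Scalar model of one "find any element equal" step (hypothetical
 * helper, not a real intrinsic). Returns the index (0..15) of the
 * first byte of src that equals any byte of set, or VLEN if there is
 * none. If stop_at_zero is non-zero, a zero byte in src also ends the
 * search and its index is returned (modelling the "Z" flavour). */
static unsigned find_any_eq16(const uint8_t src[VLEN],
                              const uint8_t set[VLEN],
                              int stop_at_zero)
{
    for (unsigned i = 0; i < VLEN; i++) {
        if (stop_at_zero && src[i] == 0)
            return i;
        for (unsigned j = 0; j < VLEN; j++)
            if (src[i] == set[j])
                return i;
    }
    return VLEN;
}

/* Walk a buffer one 16-byte chunk at a time, as a vector loop would.
 * Returns the offset of the first byte that appears in set, or len
 * if no byte matches. */
static size_t find_any_of(const uint8_t *s, size_t len,
                          const uint8_t set[VLEN])
{
    for (size_t off = 0; off < len; off += VLEN) {
        uint8_t chunk[VLEN] = {0};            /* zero-pad a short tail */
        size_t n = len - off < VLEN ? len - off : VLEN;
        memcpy(chunk, s + off, n);
        unsigned idx = find_any_eq16(chunk, set, 0);
        if (idx < n)                          /* ignore hits in padding */
            return off + idx;
    }
    return len;
}
```

If you have fewer than 16 target bytes, this model assumes you fill
the rest of the set by repeating one of them. The real instruction
does the 16 x 16 comparisons in one go; the nested loops here are only
there to show which comparisons are made.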

If you want something really fancy, VECTOR STRING RANGE COMPARE
(VSTRC) lets you list multiple ranges of values to compare your
source against.  Again, this goes through all the bytes in the source
vector register and compares each against every range in your list.
You can set bits to compare equal/low/high, include the zero byte in
your search and so on.
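
Again as a rough scalar model in C, and heavily simplified: this
sketch assumes each range is a plain inclusive low/high pair, whereas
the real instruction takes per-bound control bits (equal/low/high).
The struct byte_range type and the range_match16 function are
hypothetical names for illustration only.

```c
#include <stdint.h>

#define VLEN 16  /* bytes in one 128-bit vector register */

/* Hypothetical helper type: one inclusive range of byte values. */
struct byte_range {
    uint8_t lo;
    uint8_t hi;
};

/* Scalar model of a range compare over one 16-byte vector.
 * Returns a 16-bit mask in which bit i is set when src[i] falls
 * inside at least one of the given ranges. */
static uint16_t range_match16(const uint8_t src[VLEN],
                              const struct byte_range *ranges,
                              unsigned nranges)
{
    uint16_t mask = 0;
    for (unsigned i = 0; i < VLEN; i++) {
        for (unsigned r = 0; r < nranges; r++) {
            if (src[i] >= ranges[r].lo && src[i] <= ranges[r].hi) {
                mask |= (uint16_t)(1u << i);
                break;              /* one matching range is enough */
            }
        }
    }
    return mask;
}
```

With the three ranges '0'-'9', 'A'-'Z' and 'a'-'z', for instance, the
mask marks every ASCII alphanumeric byte in the chunk, which is the
sort of character-class test that would otherwise need a table lookup
per byte.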

Of course, it's really through the compilers (and Java JIT) and
libraries that most people are going to benefit from this
transparently, but I can easily imagine folks wondering why compiler
implementers should get all the fun and using these instructions in
their own low-level code :-)

(And the instruction descriptions above are based on no more than
familiarity with other architectures' SIMD implementations and some
brief reading of the new POPs so I may have goofed.)

--Malcolm

-- 
Malcolm Beattie
Linux and System z Technical Consultant, zChampion
IBM UK Systems and Technology Group

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
