That's what Dr. Chung-Lung Shum told me. (He was the architect of the Z chips through 2022.)
The context I asked him about was strings of roughly 100 to 32K bytes, in a case where it was not a waste of time to have the target data in cache after the move. I think I benchmarked it, but I don't recall for sure. I do know I used MVCs, in a situation where speed was of the essence (thousands of executions per second). I just found the note. He wrote:

"Since you indicated that for the 'to queue' the data would have been recently touched, and there is no control of alignment, it is best you do loops of MVCs. With the out-of-order pipeline and the internal hardware prefetcher, the hardware should be able to parallelize any cache miss if it is there. But since the data is recently touched, it is most likely in some cache somewhere (not near memory), and the MVCL near-memory mover won't help in that case. The suggestion is also based on the fact that the size of the records moved tends to be small (and <32K, as you mentioned). The near-memory mover will likely not provide any benefit until 40-50K pages."

Charles

On Thu, 31 Oct 2024 17:18:21 -0400, Steve Thompson <ste...@wkyr.net> wrote:

>Uh, isn't that true up to about 1024 bytes (MVCs stacked)? And
>then after the MVCL seems to be faster.
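P.S. For anyone who wants to see what "loops of MVCs" looks like in practice, here is a minimal sketch. The register assignments (R2 = source, R3 = target, R4 = length) and labels are mine, purely illustrative, not from Dr. Shum's note. Since a single MVC moves at most 256 bytes, the loop moves 256-byte chunks and uses EX for the residual:

R2       EQU   2                  source address
R3       EQU   3                  target address
R4       EQU   4                  remaining length in bytes
*
* Move R4 bytes from 0(R2) to 0(R3) in 256-byte MVC chunks.
MOVELOOP DS    0H
         LTR   R4,R4              anything left to move?
         BZ    MOVEDONE           no, all done
         CHI   R4,256             a full 256-byte chunk remaining?
         BL    MOVELAST           no, go handle the short tail
         MVC   0(256,R3),0(R2)    move one 256-byte chunk
         LA    R2,256(,R2)        advance the source address
         LA    R3,256(,R3)        advance the target address
         AHI   R4,-256            reduce the remaining length
         B     MOVELOOP           and move the next chunk
MOVELAST BCTR  R4,0               machine length code is length-1
         EX    R4,MOVETAIL        move the 1-255 byte residual
         B     MOVEDONE
MOVETAIL MVC   0(1,R3),0(R2)      executed via EX, length from R4
MOVEDONE DS    0H

Real code usually unrolls this (several back-to-back MVCs before re-testing the length), which I take to be the "MVCs stacked" in Steve's note. The point of Dr. Shum's advice is that for <32K, recently touched data, a loop like this stays in the out-of-order pipeline and the caches, while MVCL's near-memory mover only starts to pay off at much larger moves.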