On Wed, Mar 31, 2021 at 10:43 AM Jan Hubicka <hubi...@ucw.cz> wrote:
>
> > > Reading through the optimization manual it seems that movsb is fast for
> > > small blocks no matter if the size is hard wired.  In that case you
> > > probably want to check whether max_size or expected_size is known to be
> > > small rather than max_size == min_size and both being small.
> > >
> > > But it depends on what the CPU really does.
> > > Honza
> >
> > For small data sizes, rep movsb is faster only under certain conditions.  We
> > can continue fine tuning rep movsb.
>
> OK, I however wonder why you need the condition maxsize=minsize.
>  - If the CPU is looking for movl $cst, %rcx then we probably want to be
>    sure that it is not moved away from rep movsb by adding a fused pattern
>  - If rep movsb is slower than a loop for very small blocks then you want
>    to set a lower bound on minsize & expected size, but you do not need
>    to require maxsize=minsize
>  - If rep movsb is slower than a sequence of moves for small blocks then
>    one needs to tweak move by pieces
>  - If rep movsb is slower for larger blocks then you want to test
>    maxsize and expected size
> So in none of those scenarios does testing maxsize=minsize alone make too
> much sense to me...  What was the original motivation for special-casing
> a precisely known size?
>
> I am mostly curious because it is not that uncommon to have a small maxsize
> because we are able to track the object size, and using a short sequence
> for those would be nice.
>
> Having a non-trivial minsize may not be that uncommon these days either,
> given that we track value ranges (and under the assumption that the
> memcpy/memset expanders were updated to take these into account).
>
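For reference, the scenarios listed above can be sketched as a small
decision function.  This is only an illustration of the reasoning, not
the actual i386 expander code: the enum, the function name, the
SIZE_UNKNOWN marker and all three thresholds are hypothetical and would
need to be tuned per CPU.

```c
#include <stdbool.h>

/* Hypothetical strategy names; not GCC's real enumeration.  */
enum copy_strategy
{
  COPY_BY_PIECES,   /* short sequence of moves */
  COPY_LOOP,        /* simple copy loop */
  COPY_REP_MOVSB,   /* rep movsb */
  COPY_LIBCALL      /* call memcpy */
};

/* Hypothetical marker for "max or expected size not known".  */
#define SIZE_UNKNOWN (-1L)

/* Made-up thresholds, for illustration only.  */
#define PIECES_MAX     16L
#define REP_MOVSB_MIN  64L
#define REP_MOVSB_MAX  8192L

/* Pick a strategy from the size bounds.  Note that nothing here
   requires min_size == max_size.  */
enum copy_strategy
choose_strategy (long min_size, long max_size, long expected_size)
{
  /* A small known upper bound is enough to use a short move sequence.  */
  if (max_size != SIZE_UNKNOWN && max_size <= PIECES_MAX)
    return COPY_BY_PIECES;

  /* Lower bound: rep movsb only if the block is known or expected
     to be large enough.  */
  bool large_enough
    = min_size >= REP_MOVSB_MIN
      || (expected_size != SIZE_UNKNOWN && expected_size >= REP_MOVSB_MIN);
  if (!large_enough)
    return COPY_LOOP;

  /* Upper bound: test maxsize and expected size.  */
  bool small_enough
    = (max_size != SIZE_UNKNOWN && max_size <= REP_MOVSB_MAX)
      || (expected_size != SIZE_UNKNOWN && expected_size <= REP_MOVSB_MAX);
  return small_enough ? COPY_REP_MOVSB : COPY_LIBCALL;
}
```

With these made-up thresholds, a block with min_size 128 and max_size
1024 would still get rep movsb even though the two bounds differ.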
Hongyu has done some analysis on this.  Hongyu, can you share what
you got?

Thanks.

--
H.J.