https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398
--- Comment #22 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 22 Jan 2019, ktkachov at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398
>
> --- Comment #21 from ktkachov at gcc dot gnu.org ---
> So the actual hot loop in xz_r does:
> typedef unsigned char __uint8_t;
> typedef unsigned int __uint32_t;
> typedef unsigned long long __uint64_t;
>
> int
> foo (const __uint64_t len_limit, const __uint8_t *cur,
>      __uint32_t delta, int len) {
>
>   const __uint8_t *pb = cur - delta;
>
>   while (++len != len_limit) {
>     if (pb[len] != cur[len])
>       break;
>   }
>
>   return len;
> }
>
> The 'pb' pointer is the 'cur' pointer but moved back by 'delta'.
> Presumably that means that all memory between 'pb' and 'delta' and could be
> read in as wide a load as possible?

A C language lawyer would agree with that.  But does it really help?
The loop also accesses [cur + len, cur + len_limit].
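
To make the objection concrete, here is a minimal standalone sketch. It is not taken from GCC or xz. The names match_len, match_len_checked and load_in_bounds are hypothetical, and it reflects only one reading of the reply: knowing that the bytes between 'pb' and 'cur' are readable says nothing about the bytes beyond 'cur', which the loop reads as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same loop as foo() in the quoted comment, using <stdint.h> types.
   The cast only makes the existing int -> unsigned conversion explicit. */
static int
match_len (uint64_t len_limit, const uint8_t *cur, uint32_t delta, int len)
{
  const uint8_t *pb = cur - delta;

  while ((uint64_t) ++len != len_limit)
    if (pb[len] != cur[len])
      break;

  return len;
}

/* Hypothetical checked wrapper: asserts that every byte the scalar loop
   can touch lies inside buf[0 .. buf_size), where cur == buf + cur_off. */
static int
match_len_checked (const uint8_t *buf, size_t buf_size, size_t cur_off,
                   uint32_t delta, int len, uint64_t len_limit)
{
  assert (delta <= cur_off);                /* pb stays inside buf */
  assert (cur_off + len_limit <= buf_size); /* cur[len_limit - 1] is the
                                               highest byte ever read */
  return match_len (len_limit, buf + cur_off, delta, len);
}

/* Hypothetical helper: would a `width`-byte load starting at
   buf + start_off stay inside a buffer of buf_size bytes? */
static int
load_in_bounds (size_t buf_size, size_t start_off, size_t width)
{
  return start_off + width <= buf_size;
}
```

With the 9-byte buffer "abcabcabx", cur = buf + 3, delta = 3, len = -1 and len_limit = 6, the scalar loop stops at the mismatch and returns 5, having read nothing past buf[8]. An 8-byte load starting at pb (offset 0) stays inside the buffer, but the matching 8-byte load starting at cur (offset 3) would cover buf[3..10] and run past the end. That is the part the "same object" argument for 'pb' does not cover.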