On Sat, May 30, 2020 at 11:19 AM Li Qiang <liqian...@huawei.com> wrote:
> On 2020/5/26 10:39, l00374334 wrote:
> > From: liqiang <liqian...@huawei.com>
> >
> > By analyzing the compression and decompression process of gzip, I found
> > that the hot spots of CRC32 and longest_match function are very high.
> >
> > On the aarch64 architecture, we can optimize the efficiency of crc32
> > through the interface provided by the neon instruction set (12x faster
> > in aarch64), and optimize the performance of random access code through
> > prefetch instructions (about 5%~8% improvement). In some compression
> > scenarios, loop expansion can also get a certain performance improvement
> > (about 10%).
> >
> > Modify by Li Qiang.
> > ---
> > configure | 14 ++++++++++++++
> > deflate.c | 30 +++++++++++++++++++++++++++++-
> > util.c    | 45 +++++++++++++++++++++++++++++++++++++++++++++

Thank you for that work and sorry for the delay in responding. However, for now I prefer not to apply it. I'd prefer to see arch-specific optimizations added to libz in the hope (perhaps naive) that someone will find time to make gzip use libz.