On Sat, May 30, 2020 at 11:19 AM Li Qiang <liqian...@huawei.com> wrote:
> On 2020/5/26 10:39, l00374334 wrote:
> > From: liqiang <liqian...@huawei.com>
> >
> > By analyzing the compression and decompression process of gzip, I found
> > that the hot spots of CRC32 and longest_match function are very high.
> >
> > On the aarch64 architecture, we can optimize the efficiency of crc32
> > through the interface provided by the neon instruction set (12x faster
> > in aarch64), and optimize the performance of random access code through
> > prefetch instructions (about 5%~8% improvement). In some compression
> > scenarios, loop expansion can also get a certain performance improvement
> > (about 10%).
> >
> > Modify by Li Qiang.
> >
> > ---
> >  configure | 14 ++++++++++++++
> >  deflate.c | 30 +++++++++++++++++++++++++++++-
> >  util.c    | 45 +++++++++++++++++++++++++++++++++++++++++++++

Thank you for that work and sorry for the delay in responding.
However, for now I prefer not to apply it.
I'd prefer to see arch-specific optimizations added to libz in the
hope (perhaps naive) that someone will find time to make gzip use
libz.


