On 11.02.2019 21:25, Arthur Zakirov wrote:
Hello hackers,
Grigory noticed that one of our utilities has very slow performance when
xlogreader reads zlib archives. We found out that xlogreader sometimes
reads a WAL file block twice.
zlib is slow when an archive is not read in sequential order. I think
reading a block twice at the same position isn't sequential, because
gzread() moves the current position forward and the next gzseek() call
to the same position moves it back.
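To illustrate why the second read is expensive, here is a toy cost model. It is only a sketch and does not call zlib: the ToyStream type and the toy_* functions are made up for illustration. It assumes, as I understand gzseek() to behave in read mode, that a backward seek rewinds to the start of the stream and decompresses forward again up to the target offset.

```c
#include <stddef.h>

/* Toy model of a forward-only decompression stream (NOT the zlib API). */
typedef struct ToyStream
{
    size_t pos;      /* current uncompressed offset */
    size_t decoded;  /* total bytes decompressed so far: the cost metric */
} ToyStream;

/* Seek: a backward seek rewinds to offset 0, then decodes up to target. */
static void
toy_seek(ToyStream *s, size_t target)
{
    if (target < s->pos)
        s->pos = 0;                 /* assumed rewind on backward seek */
    s->decoded += target - s->pos;  /* decode and discard up to target */
    s->pos = target;
}

/* Read: decodes len bytes and advances the position. */
static void
toy_read(ToyStream *s, size_t len)
{
    s->decoded += len;
    s->pos += len;
}

/* Bytes decoded when the block at `off` is read `times` times in a row. */
static size_t
toy_cost(size_t off, size_t blcksz, int times)
{
    ToyStream s = {0, 0};

    for (int i = 0; i < times; i++)
    {
        toy_seek(&s, off);
        toy_read(&s, blcksz);
    }
    return s.decoded;
}
```

In this model, reading the same block twice costs twice as much as reading it once, and the extra cost grows with the block's offset in the file, because the second seek lands behind the position left by the first read.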
It seems that the attached patch solves the issue. I think that when
reqLen == state->readLen the requested block is already in the
xlogreader's buffer.
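A minimal sketch of the check in question, to make the boundary case concrete. This is not the xlogreader code: ToyReaderState, toy_page_cached() and TOY_BLCKSZ are hypothetical names, and the 8192-byte page size is an assumption. The `inclusive` flag switches between the strict comparison (reqLen < readLen) and the comparison that also accepts reqLen == readLen.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_BLCKSZ 8192     /* assumed page size, for illustration only */

/* Hypothetical stand-in for the reader fields used by the check. */
typedef struct ToyReaderState
{
    uint64_t readSegNo;     /* segment of the page held in the buffer */
    uint32_t readOff;       /* page offset within that segment */
    uint32_t readLen;       /* number of valid bytes in the buffer */
} ToyReaderState;

/*
 * Returns true if the buffered page can satisfy the request without
 * rereading.  With inclusive == false the comparison is strict, so a
 * request for exactly readLen bytes is treated as a miss.
 */
static bool
toy_page_cached(const ToyReaderState *st, uint64_t segno,
                uint32_t pageoff, uint32_t reqLen, bool inclusive)
{
    if (segno != st->readSegNo || pageoff != st->readOff)
        return false;
    return inclusive ? reqLen <= st->readLen : reqLen < st->readLen;
}
```

With a full page in the buffer (readLen == TOY_BLCKSZ), a request for the whole page misses under the strict comparison and hits under the inclusive one; shorter requests hit either way.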
What do you think?
I looked at the history of the code changes:
---------------------------------------------------------------
7fcbf6a405f (Alvaro Herrera 2013-01-16 16:12:53 -0300 539)
reqLen < state->readLen)
1bb2558046c (Heikki Linnakangas 2010-01-27 15:27:51 +0000 9349)
targetPageOff == readOff && targetRecOff < readLen)
eaef111396e (Tom Lane 2006-04-03 23:35:05 +0000 3842)
len = XLOG_BLCKSZ - RecPtr->xrecoff % XLOG_BLCKSZ;
4d14fe0048c (Tom Lane 2001-03-13 01:17:06 +0000 3843)
if (total_len > len)
---------------------------------------------------------------
In Tom Lane's original code, the condition (total_len > len) caused the
page to be reread from disk. As I understand it, this is equivalent to
your proposal.
The code line in commit 1bb2558046c looks equivalent to the corresponding
line in commit 7fcbf6a405f but has different semantics: the targetPageOff
value can't be greater than or equal to XLOG_BLCKSZ, but the reqLen value
can be. This may be how the possible mistake was introduced by
commit 7fcbf6a405f.
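The range argument can be checked by brute force. The sketch below is illustrative only: the function names and the 8192-byte TOY_BLCKSZ are assumptions, not PostgreSQL code. It counts how many values fail a strict "< readLen" test when the compared value is an offset in [0, TOY_BLCKSZ) versus a length in [1, TOY_BLCKSZ].

```c
#include <stdint.h>

#define TOY_BLCKSZ 8192     /* assumed page size, for illustration only */

/* Offsets range over [0, TOY_BLCKSZ): count those failing off < readLen. */
static int
toy_misses_for_offsets(uint32_t readLen)
{
    int misses = 0;

    for (uint32_t off = 0; off < TOY_BLCKSZ; off++)
        if (!(off < readLen))
            misses++;
    return misses;
}

/* Lengths range over [1, TOY_BLCKSZ]: count those failing len < readLen. */
static int
toy_misses_for_lengths(uint32_t readLen)
{
    int misses = 0;

    for (uint32_t len = 1; len <= TOY_BLCKSZ; len++)
        if (!(len < readLen))
            misses++;
    return misses;
}
```

For a fully read page (readLen == TOY_BLCKSZ) no offset fails the strict test, while exactly one length does: the full-page request, which is the case that triggers the second read.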
--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company