Re: Some questions about PostgreSQL’s design.

2024-08-22 Thread
For other design choices, such as whether to manage shared_buffers with
an LRU list or with a clock sweep, both methods have their pros and
cons. But for these two issues, there is a clearly better solution.
For example, using direct I/O avoids double-copying data, and the OS’s
page cache LRU list is optimized for general scenarios, while the
database kernel should use its own eviction algorithm. Regarding the
other issue, full-page writes don’t actually reduce the number of page
reads; it’s just a matter of whether those page reads come from data
files or from the redo log, and the amount of data read is essentially
the same. However, they introduce significant write amplification on
the critical write path, which severely impacts performance. As a
result, PostgreSQL has to keep checkpoints as infrequent as possible.
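
As a rough illustration of the write amplification described above, here
is a toy model in Python. It is not PostgreSQL code. It assumes that the
first update to a page after a checkpoint logs a full page image on top
of the ordinary record, that an ordinary record is 100 bytes, and that
checkpoints are triggered by update count; the function name and the
workload are made up for the example.

```python
PAGE_SIZE = 8192      # PostgreSQL's default block size in bytes
RECORD_SIZE = 100     # assumed size of an ordinary WAL record


def wal_bytes(page_updates, updates_per_checkpoint):
    """Return the WAL volume for a sequence of updated page numbers.

    Toy model: the first update to a page after a checkpoint logs the
    whole page image in addition to the ordinary record.
    """
    total = 0
    imaged = set()  # pages that already logged an image since the last checkpoint
    for i, page in enumerate(page_updates):
        if i % updates_per_checkpoint == 0:
            imaged.clear()  # checkpoint: every page needs a fresh image again
        total += RECORD_SIZE
        if page not in imaged:
            total += PAGE_SIZE
            imaged.add(page)
    return total


# 10,000 updates spread round-robin over 100 pages.
updates = [i % 100 for i in range(10_000)]
frequent = wal_bytes(updates, updates_per_checkpoint=100)
rare = wal_bytes(updates, updates_per_checkpoint=10_000)
print(frequent, rare)
```

In this model the frequent-checkpoint run logs a page image on every
update, while the single-checkpoint run logs only 100 images, which is
why the checkpoint interval matters so much for WAL volume.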

I think someone could write a demo to show it.

On Tue, Aug 20, 2024 at 9:46 PM Heikki Linnakangas wrote:
>
> On 20/08/2024 11:46, 陈宗志 wrote:
> > I’ve recently started exploring PostgreSQL implementation. I used to
> > be a MySQL InnoDB developer, and I find the PostgreSQL community feels
> > a bit strange.
> >
> > There are some areas where they’ve done really well, but there are
> > also some obvious issues that haven’t been improved.
> >
> > For example, the B-link tree implementation in PostgreSQL is
> > particularly elegant, and the code is very clean.
> > But there are some clear areas that could be improved but haven’t been
> > addressed, like the double memory problem where the buffer pool and
> > page cache store the same page, using full-page writes to deal with
> > torn page writes instead of something like InnoDB’s double write
> > buffer.
> >
> > It seems like these issues have clear solutions, such as using
> > DirectIO like InnoDB instead of buffered IO, or using a double write
> > buffer instead of relying on the full-page write approach.
> > Can anyone explain why?
>
> There are pros and cons. With direct I/O, you cannot take advantage of
> the kernel page cache anymore, so it becomes important to tune
> shared_buffers more precisely. That's a downside: the system requires
> more tuning. For many applications, squeezing the last ounce of
> performance just isn't that important. There are also scaling issues
> with the Postgres buffer cache, which might need to be addressed first.
>
> With double write buffering, there are also pros and cons. It also
> requires careful tuning. And replaying WAL that contains full-page
> images can be much faster, because you can write new page images
> "blindly" without reading the old pages first. We have WAL prefetching
> now, which alleviates that, but it's no panacea.
>
> In summary, those are good solutions but they're not obviously better in
> all circumstances.
>
> > However, the PostgreSQL community’s mailing list is truly a treasure
> > trove, where you can find really interesting discussions. For
> > instance, this discussion on whether lock coupling is needed for
> > B-link trees, etc.
> > https://www.postgresql.org/message-id/flat/CALJbhHPiudj4usf6JF7wuCB81fB7SbNAeyG616k%2Bm9G0vffrYw%40mail.gmail.com
>
> Yep, there are old threads and patches for double write buffers and
> direct IO too :-).
>
> --
> Heikki Linnakangas
> Neon (https://neon.tech)
>


-- 
---
Blog: http://www.chenzongzhi.info
Twitter: https://twitter.com/baotiao
Git: https://github.com/baotiao




Re: Some questions about PostgreSQL’s design.

2024-08-22 Thread
I disagree with the points made in the article. The article says that
direct I/O ‘prevents the kernel from reordering reads and writes to
optimize performance,’ which might be referring to the file system’s
I/O scheduling and merging. However, this can be handled within the
database itself, where I/O scheduling and merging can be done even
better.

Regarding ‘does not allow free memory to be used as kernel cache,’ I
believe the database itself should manage memory well, and most of the
memory should be managed by the database rather than handed over to
the operating system. Additionally, the database’s use of the page
cache should be restricted.

On Wed, Aug 21, 2024 at 12:55 AM Bruce Momjian wrote:
>
> On Tue, Aug 20, 2024 at 04:46:54PM +0300, Heikki Linnakangas wrote:
> > There are pros and cons. With direct I/O, you cannot take advantage of the
> > kernel page cache anymore, so it becomes important to tune shared_buffers
> > more precisely. That's a downside: the system requires more tuning. For many
> > applications, squeezing the last ounce of performance just isn't that
> > important. There are also scaling issues with the Postgres buffer cache,
> > which might need to be addressed first.
> >
> > With double write buffering, there are also pros and cons. It also requires
> > careful tuning. And replaying WAL that contains full-page images can be much
> > faster, because you can write new page images "blindly" without reading the
> > old pages first. We have WAL prefetching now, which alleviates that, but
> > it's no panacea.
>
> 陈宗志, you might find this blog post helpful:
>
> https://momjian.us/main/blogs/pgblog/2017.html#June_5_2017
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   Only you can decide what is important to you.







Re: AIO v2.0

2024-09-04 Thread
I hope there can be a high-level design document that includes a
description, high-level architecture, and low-level design.
This way, others can also participate in reviewing the code.
For example, which paths were modified in the AIO module? Is it the
path for writing WAL logs, or the path for flushing pages, etc.?

Also, I recommend keeping this patch as small as possible.
For example, the first step could be to introduce libaio only, without
considering io_uring, as that would make it too complex.




Some questions about PostgreSQL’s design.

2024-08-20 Thread
I’ve recently started exploring PostgreSQL implementation. I used to
be a MySQL InnoDB developer, and I find the PostgreSQL community feels
a bit strange.

There are some areas where they’ve done really well, but there are
also some obvious issues that haven’t been improved.

For example, the B-link tree implementation in PostgreSQL is
particularly elegant, and the code is very clean.
But there are some clear areas that could be improved but haven’t been
addressed, like the double memory problem where the buffer pool and
page cache store the same page, using full-page writes to deal with
torn page writes instead of something like InnoDB’s double write
buffer.

It seems like these issues have clear solutions, such as using
DirectIO like InnoDB instead of buffered IO, or using a double write
buffer instead of relying on the full-page write approach.
Can anyone explain why?
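
To make the double write idea concrete, here is a minimal in-memory
sketch in Python. It is neither InnoDB nor PostgreSQL code: the Storage
class, the seal and is_intact helpers, and the tear flag are all
hypothetical names for the example, dictionaries stand in for files, and
a CRC32 prefix stands in for the page checksum. The point it shows is
only the ordering: the page goes to the double write area first, then to
its home location, so a torn home page can be restored from the intact
copy.

```python
import zlib


def seal(payload: bytes) -> bytes:
    """Prefix a payload with its CRC32 so a torn page can be detected."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def is_intact(page: bytes) -> bool:
    """Check that the stored CRC32 matches the payload that follows it."""
    return len(page) >= 4 and zlib.crc32(page[4:]).to_bytes(4, "big") == page[:4]


class Storage:
    def __init__(self):
        self.data_file = {}    # page number -> page at its home location
        self.doublewrite = {}  # page number -> copy in the double write area

    def flush(self, page_no, payload, tear=False):
        page = seal(payload)
        # Step 1: write the full page to the double write area
        # (a real system would fsync here before continuing).
        self.doublewrite[page_no] = page
        # Step 2: write the page to its home location. `tear` simulates
        # a crash after only the first half of the page reached disk.
        if tear:
            old = self.data_file.get(page_no, bytes(len(page)))
            half = len(page) // 2
            self.data_file[page_no] = page[:half] + old[half:]
        else:
            self.data_file[page_no] = page

    def recover(self):
        # Restore every torn home page from its intact double write copy.
        for page_no, copy in self.doublewrite.items():
            home = self.data_file.get(page_no, b"")
            if not is_intact(home) and is_intact(copy):
                self.data_file[page_no] = copy


store = Storage()
store.flush(1, b"version-1 of pg1")
store.flush(1, b"version-2 of pg1", tear=True)  # crash mid-write
torn = not is_intact(store.data_file[1])
store.recover()
recovered = store.data_file[1][4:]
print(torn, recovered)
```

Each page is written twice on the flush path, but the redo log only has
to carry the change itself rather than a full page image.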

However, the PostgreSQL community’s mailing list is truly a treasure
trove, where you can find really interesting discussions. For
instance, this discussion on whether lock coupling is needed for
B-link trees, etc.
https://www.postgresql.org/message-id/flat/CALJbhHPiudj4usf6JF7wuCB81fB7SbNAeyG616k%2Bm9G0vffrYw%40mail.gmail.com

-- 
---
Blog: https://baotiao.github.io/
Twitter: https://twitter.com/baotiao
Git: https://github.com/baotiao