From: Takashi Menjo
> > The other question is whether simply placing WAL on DAX (without any
> > code changes) is safe. If it's not, then all the "speedups" are
> > computed with respect to an unsafe configuration and so are useless. And
> > BTT should be used instead, which would of course produce v
Hi Tomas,
> Hello Takashi-san,
>
> On 3/5/21 9:08 AM, Takashi Menjo wrote:
> > Hi Tomas,
> >
> > Thank you so much for your report. I have read it with great interest.
> >
> > Your conclusion sounds reasonable to me. My patchset, which you call "NTT /
> > segments", performed as well as the "NTT / buffe
Hello Takashi-san,
On 3/5/21 9:08 AM, Takashi Menjo wrote:
> Hi Tomas,
>
> Thank you so much for your report. I have read it with great interest.
>
> Your conclusion sounds reasonable to me. My patchset, which you call "NTT /
> segments", performed as well as the "NTT / buffer" patchset. I have
> been
Hi Tomas,
Thank you so much for your report. I have read it with great interest.
Your conclusion sounds reasonable to me. My patchset, which you call "NTT /
segments", performed as well as the "NTT / buffer" patchset. I have
been worried that calling mmap/munmap for each WAL segment file could
have a l
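For context, a minimal sketch of the per-segment map/unmap cycle that this
worry refers to, assuming libpmem from PMDK; the function names are
illustrative, not the patchset's actual code:

#include <libpmem.h>

/* Hypothetical per-segment lifecycle: map one WAL segment file in
 * full, use it, unmap it.  The concern above is the cost of repeating
 * this pair at every WAL segment switch under a write-heavy load. */
static void *
open_wal_segment(const char *path, size_t *len_out, int *is_pmem_out)
{
    /* len = 0 and flags = 0 map an existing file at its full length */
    return pmem_map_file(path, 0, 0, 0, len_out, is_pmem_out);
}

static void
close_wal_segment(void *addr, size_t len)
{
    pmem_unmap(addr, len);
}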
Hi Sawada,
I am relieved to hear that the performance problem was solved.
I also added a tip about PMEM namespaces and partitioning to the PG wiki [1].
Regards,
[1]
https://wiki.postgresql.org/wiki/Persistent_Memory_for_WAL#Configure_and_verify_DAX_hugepage_faults
--
Takashi Menjo
On Sat, Feb 13, 2021 at 12:18 PM Masahiko Sawada wrote:
>
> On Thu, Jan 28, 2021 at 1:41 AM Tomas Vondra
> wrote:
> >
> > On 1/25/21 3:56 AM, Masahiko Sawada wrote:
> > >>
> > >> ...
> > >>
> > >> On 1/21/21 3:17 AM, Masahiko Sawada wrote:
> > >>> ...
> > >>>
> > >>> While looking at the two meth
Hi,
I ran a performance test in another environment. The steps, setup,
and postgresql.conf of the test are the same as the ones I sent on
Feb 17 [1], except for the following items:
# Setup
- Distro: Red Hat Enterprise Linux release 8.2 (Ootpa)
- C compiler: gcc-8.3.1-5.el8.x86_64
- libc: glibc-2.28-
Thank you for your feedback.
On 19.02.2021 6:25, Tomas Vondra wrote:
On 1/22/21 5:04 PM, Konstantin Knizhnik wrote:
...
I have heard from several DBMS experts that the appearance of huge and
cheap non-volatile memory can revolutionize database system
architecture. If the whole database can fit in
On 1/22/21 5:04 PM, Konstantin Knizhnik wrote:
> ...
>
> I have heard from several DBMS experts that the appearance of huge and
> cheap non-volatile memory can revolutionize database system
> architecture. If the whole database can fit in non-volatile memory, then we
> do not need buffers, WAL, ...
>
Hi Sawada,
Thank you for your performance report.
First, I'd say that the latest v5 non-volatile WAL buffer patchset
looks reasonable in itself. I ran a performance test on v5 and got
better performance than the original (non-patched) one and our
previous work. See the attached figure for results
Hi Takayuki,
Thank you for your helpful comments.
In "Allocates WAL buffers on shared buffers", "shared buffers" should be
> DRAM because shared buffers in Postgres means the buffer cache for database
> data.
>
That's true. Fixed.
> I haven't tracked the whole thread, but could you collect inf
From: Takashi Menjo
> I made a new page at PostgreSQL Wiki to gather and summarize information and
> discussion about PMEM-backed WAL designs and implementations. Some parts of
> the page are TBD. I will continue to maintain the page. Requests are welcome.
>
> Persistent Memory for WAL
> https
Hi,
I made a new page at PostgreSQL Wiki to gather and summarize information
and discussion about PMEM-backed WAL designs and implementations. Some
parts of the page are TBD. I will continue to maintain the page. Requests
are welcome.
Persistent Memory for WAL
https://wiki.postgresql.org/wiki/Per
From: Masahiko Sawada
> I've done some performance benchmarks with the master and NTT v4
> patch. Let me share the results.
>
...
>         master      NTT   master-unlogged
> 32      113209    67107            154298
> 64      144880    54289            178883
> 96      151405    50562            180018
>
> "master-unlogged" is
On Thu, Jan 28, 2021 at 1:41 AM Tomas Vondra
wrote:
>
> On 1/25/21 3:56 AM, Masahiko Sawada wrote:
> >>
> >> ...
> >>
> >> On 1/21/21 3:17 AM, Masahiko Sawada wrote:
> >>> ...
> >>>
> >>> While looking at the two methods: NTT and simple-no-buffer, I realized
> >>> that in XLogFlush(), NTT patch fl
Hi Tomas,
I'll answer your questions. (Not all for now, sorry.)
> Do I understand correctly that the patch removes "regular" WAL buffers
and instead writes the data into the non-volatile PMEM buffer, without
writing that to the WAL segments at all (unless in archiving mode)?
> Firstly, I guess ma
From: Tomas Vondra
> (c) As mentioned before, PMEM behaves differently with concurrent
> access, i.e. it reaches peak throughput with relatively low number of
> threads writing data, and then the throughput drops quite quickly. I'm
> not sure if the same thing applies to pmem_drain() too - if it d
On 1/25/21 3:56 AM, Masahiko Sawada wrote:
>>
>> ...
>>
>> On 1/21/21 3:17 AM, Masahiko Sawada wrote:
>>> ...
>>>
>>> While looking at the two methods: NTT and simple-no-buffer, I realized
>>> that in XLogFlush(), NTT patch flushes (by pmem_flush() and
>>> pmem_drain()) WAL without acquiring WALWri
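For readers skimming the thread, a minimal sketch of the flush/drain pattern
under discussion, assuming libpmem; locking and buffer layout are deliberately
omitted, and the function names are illustrative:

#include <libpmem.h>

/* Persist one WAL range already residing in a PMEM-backed buffer.
 * pmem_flush() issues CPU cache flushes for the range; pmem_drain()
 * is a single ordering barrier that waits for them to complete. */
static void
persist_wal_range(const char *pmem_buf, size_t start, size_t end)
{
    pmem_flush(pmem_buf + start, end - start);
    pmem_drain();
}

/* Because the barrier is separate, several ranges can be flushed
 * first and the drain paid only once. */
static void
persist_wal_ranges(const char *pmem_buf, const size_t (*r)[2], int n)
{
    for (int i = 0; i < n; i++)
        pmem_flush(pmem_buf + r[i][0], r[i][1] - r[i][0]);
    pmem_drain();
}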
Hi,
Now I have caught up with this thread. I see that many of you are
interested in performance profiling.
Let me share my slides from SNIA SDC 2020 [1]. In the slides, I presented
profiles focused on XLogInsert and XLogFlush (mainly the latter) for my
non-volatile WAL buffer patchset. I found that the time f
Dear everyone, Tomas,
First of all, the "v4" patchset for non-volatile WAL buffer attached to the
previous mail is actually v5... Please read "v4" as "v5."
Then, to Tomas:
Thank you for the crash report you gave on Nov 27, 2020, regarding the msync
patchset. I applied the latest msync patchset v3 a
Dear everyone,
I'm sorry for the late reply. I rebased my two patchsets onto the latest
master 411ae64. The patchset prefixed with v4 is for the non-volatile WAL
buffer; the other, prefixed with v3, is for msync.
I will reply to your helpful feedback one by one within days. Please wait
for a moment
On Fri, Jan 22, 2021 at 11:32 AM Tomas Vondra
wrote:
>
>
>
> On 1/21/21 3:17 AM, Masahiko Sawada wrote:
> > On Thu, Jan 7, 2021 at 2:16 AM Tomas Vondra
> > wrote:
> >>
> >> Hi,
> >>
> >> I think I've managed to get the 0002 patch [1] rebased to master and
> >> working (with help from Masahiko Saw
On 22.01.2021 5:32, Tomas Vondra wrote:
On 1/21/21 3:17 AM, Masahiko Sawada wrote:
On Thu, Jan 7, 2021 at 2:16 AM Tomas Vondra
wrote:
Hi,
I think I've managed to get the 0002 patch [1] rebased to master and
working (with help from Masahiko Sawada). It's not clear to me how it
could have
On 1/21/21 3:17 AM, Masahiko Sawada wrote:
On Thu, Jan 7, 2021 at 2:16 AM Tomas Vondra
wrote:
Hi,
I think I've managed to get the 0002 patch [1] rebased to master and
working (with help from Masahiko Sawada). It's not clear to me how it
could have worked as submitted - my theory is that an
On Thu, Jan 7, 2021 at 2:16 AM Tomas Vondra
wrote:
>
> Hi,
>
> I think I've managed to get the 0002 patch [1] rebased to master and
> working (with help from Masahiko Sawada). It's not clear to me how it
> could have worked as submitted - my theory is that an incomplete patch
> was submitted by mi
On 11/27/20 1:02 AM, Tomas Vondra wrote:
>
> Unfortunately, that patch seems to fail for me :-(
>
> The patches seem to be for PG12, so I applied them on REL_12_STABLE (all
> the parts 0001-0005) and then I did this:
>
> LIBS="-lpmem" ./configure --prefix=/home/tomas/pg-12-pmem --enable-debug
>
On 11/26/20 10:19 PM, Tomas Vondra wrote:
>
>
> On 11/26/20 9:59 PM, Heikki Linnakangas wrote:
>> On 26/11/2020 21:27, Tomas Vondra wrote:
>>> Hi,
>>>
>>> Here's the "simple patch" that I'm currently experimenting with. It
>>> essentially replaces open/close/write/fsync with pmem calls
>>> (ma
On 11/26/20 9:59 PM, Heikki Linnakangas wrote:
> On 26/11/2020 21:27, Tomas Vondra wrote:
>> Hi,
>>
>> Here's the "simple patch" that I'm currently experimenting with. It
>> essentially replaces open/close/write/fsync with pmem calls
>> (map/unmap/memcpy/persist variants), and it's by no means c
Hi,
Here's the "simple patch" that I'm currently experimenting with. It
essentially replaces open/close/write/fsync with pmem calls
(map/unmap/memcpy/persist variants), and it's by no means committable.
But it works well enough for experiments / measurements, etc.
The numbers (5-minute pgbench ru
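In spirit, that substitution looks roughly like the following sketch,
assuming libpmem; this is illustrative, not the patch itself:

#include <libpmem.h>
#include <string.h>

/* Conventional WAL write path (roughly): write(fd, data, len) followed
 * by fsync(fd).  With the file mapped via pmem_map_file(), the same
 * step becomes a memcpy into the mapping plus a persist of just the
 * bytes written. */
static int
append_wal(char *mapped, size_t mapped_len, int is_pmem,
           size_t offset, const void *data, size_t len)
{
    if (offset + len > mapped_len)
        return -1;

    if (is_pmem)
        /* copy + cache flush + drain in one call */
        pmem_memcpy_persist(mapped + offset, data, len);
    else
    {
        /* mapping is not genuine PMEM: fall back to msync-based flush */
        memcpy(mapped + offset, data, len);
        pmem_msync(mapped + offset, len);
    }
    return 0;
}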
On 26/11/2020 21:27, Tomas Vondra wrote:
Hi,
Here's the "simple patch" that I'm currently experimenting with. It
essentially replaces open/close/write/fsync with pmem calls
(map/unmap/memcpy/persist variants), and it's by no means committable.
But it works well enough for experiments / measureme
On 11/25/20 2:10 AM, Ashwin Agrawal wrote:
> On Sun, Nov 22, 2020 at 5:23 PM Tomas Vondra
> wrote:
>
>> I'm not entirely sure whether the "pmemdax" (i.e. unpatched instance
>> with WAL on PMEM DAX device) is actually safe, but I included it anyway
>> to see what the difference is.
> > I am curious to
On 11/25/20 1:27 AM, tsunakawa.ta...@fujitsu.com wrote:
> From: Tomas Vondra
>> It's interesting that they only place the tail of the log on PMEM,
>> i.e. the PMEM buffer has limited size, and the rest of the log is
>> not on PMEM. It's a bit as if we inserted a PMEM buffer between our
>> wal buff
On Sun, Nov 22, 2020 at 5:23 PM Tomas Vondra
wrote:
> I'm not entirely sure whether the "pmemdax" (i.e. unpatched instance
> with WAL on PMEM DAX device) is actually safe, but I included it anyway
> to see what the difference is.
I am curious to learn more on this aspect. Kernels have provided supp
From: Tomas Vondra
> It's interesting that they only place the tail of the log on PMEM, i.e.
> the PMEM buffer has limited size, and the rest of the log is not on
> PMEM. It's a bit as if we inserted a PMEM buffer between our wal buffers
> and the WAL segments, and kept the WAL segments on regular
On 11/24/20 7:34 AM, tsunakawa.ta...@fujitsu.com wrote:
> From: Tomas Vondra
>> So I wonder if using PMEM for the WAL buffer is the right way forward.
>> AFAIK the WAL buffer is quite concurrent (multiple clients writing
>> data), which seems to contradict the PMEM vs. DRAM trade-offs.
>>
>> Th
From: Tomas Vondra
> So I wonder if using PMEM for the WAL buffer is the right way forward.
> AFAIK the WAL buffer is quite concurrent (multiple clients writing
> data), which seems to contradict the PMEM vs. DRAM trade-offs.
>
> The design I originally expected would look more like this
>
>
Hi,
On 11/23/20 3:01 AM, Tomas Vondra wrote:
> Hi,
>
> On 10/30/20 6:57 AM, Takashi Menjo wrote:
>> Hi Heikki,
>>
>>> I had a new look at this thread today, trying to figure out where
>>> we are.
>>
>> I'm a bit confused.
>>>
>>> One thing we have established: mmap()ing WAL files performs worse
Hi,
On 10/30/20 6:57 AM, Takashi Menjo wrote:
> Hi Heikki,
>
>> I had a new look at this thread today, trying to figure out where
>> we are.
>
>> I'm a bit confused.
>>
>> One thing we have established: mmap()ing WAL files performs worse
>> than the current method, if pg_wal is not on a persis
Hi,
These patches no longer apply :-( A rebased version would be nice.
I've been interested in what performance improvements this might bring,
so I've been running some extensive benchmarks on a machine with PMEM
hardware. So let me share some interesting results. (I used commit from
early Septem
Hi Gang,
I appreciate your patience. I reproduced the results you reported to me in
my environment.
First of all, the conditions you gave me were a little unstable in my
environment, so I made the values of {max_,min_,nv}wal_size larger and the
pre-warm duration longer to get stable performance
Hi Heikki,
> I had a new look at this thread today, trying to figure out where we are.
> I'm a bit confused.
>
> One thing we have established: mmap()ing WAL files performs worse than
> the current method, if pg_wal is not on
> a persistent memory device. This is because the kernel faults in existing
I had a new look at this thread today, trying to figure out where we
are. I'm a bit confused.
One thing we have established: mmap()ing WAL files performs worse than
the current method, if pg_wal is not on a persistent memory device. This
is because the kernel faults in existing content of each
Best regards,
Takashi
--
Takashi Menjo
NTT Software Innovation Center
> -Original Message-
> From: Deng, Gang
> Sent: Friday, October 9, 2020 3:10 PM
> To: Takashi Menjo
> Cc: pgsql-hack...@postgresql.org; 'Takashi Menjo'
> Subject: RE: [PoC] Non-volatile WAL buffer
>
Best Regards,
Gang
-Original Message-
From: Takashi Menjo
Sent: Tuesday, October 6, 2020 4:49 PM
To: Deng, Gang
Cc: pgsql-hack...@postgresql.org; 'Takashi Menjo'
Subject: RE: [PoC] Non-volatile WAL buffer
Hi Gang,
I have tried to but yet cannot reproduce performance degrade you reporte
v4
--
Takashi Menjo
NTT Software Innovation Center
> -Original Message-
> From: Takashi Menjo
> Sent: Thursday, September 24, 2020 2:38 AM
> To: Deng, Gang
> Cc: pgsql-hack...@postgresql.org; Takashi Menjo
>
> Subject: Re: [PoC] Non-volatile WAL buffer
>
> Throughput (10^3 TPS)               13.0    16.9
> CPU Time % of CopyXlogRecordToWAL    3.0     1.6
> CPU Time % of XLogInsertRecord      23.0    16.4
> CPU Time % of XLogFlush              2.3     5.9
Best Regards,
Gang
From: Takashi Menjo
Sent: Thursday, September 10, 2020 4:01 PM
To: Takashi Menjo
Cc: pgsql-hack...@postgresql.org
Subject: Re: [PoC] Non-volatile WAL buffer
Rebased.
On Wed, Jun 24, 2020 at 4:44 PM Takashi Menjo
> To: 'Robert Haas' ; 'Heikki Linnakangas'
> ; 'Amit Langote'
>
> Subject: RE: [PoC] Non-volatile WAL buffer
>
> Dear hackers,
>
> I rebased my non-volatile WAL buffer's patchset onto master. A new v2
> patchset is attached to this mail.
>
> I als
NTT Software Innovation Center
> -Original Message-
> From: Amit Langote
> Sent: Monday, February 17, 2020 5:21 PM
> To: Takashi Menjo
> Cc: Robert Haas ; Heikki Linnakangas
> ; PostgreSQL-development
>
> Subject: Re: [PoC] Non-volatile WAL buffer
>
> Hello,
>
> On Mo
Hi,
On 2020-02-17 13:12:37 +0900, Takashi Menjo wrote:
> I applied my patchset that mmap()-s WAL segments as WAL buffers to
> refs/tags/REL_12_0, and measured and analyzed its performance with
> pgbench. Roughly speaking, when I used *SSD and ext4* to store WAL,
> it was "obviously worse" than th
Hello,
On Mon, Feb 17, 2020 at 4:16 PM Takashi Menjo
wrote:
> Hello Amit,
>
> > I apologize for not having any opinion on the patches themselves, but let
> > me point out that it's better to base these
> > patches on HEAD (master branch) than REL_12_0, because all new code is
> > committed to t
> To: Takashi Menjo
> Cc: Robert Haas ; Heikki Linnakangas
> ; PostgreSQL-development
>
> Subject: Re: [PoC] Non-volatile WAL buffer
>
> Menjo-san,
>
> On Mon, Feb 17, 2020 at 1:13 PM Takashi Menjo
> wrote:
> > I applied my patchset that mmap()-s WAL segments as WAL buffers
Menjo-san,
On Mon, Feb 17, 2020 at 1:13 PM Takashi Menjo
wrote:
> I applied my patchset that mmap()-s WAL segments as WAL buffers to
> refs/tags/REL_12_0, and measured and analyzed its performance with pgbench.
> Roughly speaking, when I used *SSD and ext4* to store WAL, it was "obviously
> w
Sent: Monday, February 10, 2020 6:30 PM
> To: 'Robert Haas' ; 'Heikki Linnakangas'
>
> Cc: 'pgsql-hack...@postgresql.org'
> Subject: RE: [PoC] Non-volatile WAL buffer
>
> Dear hackers,
>
> I made another WIP patchset to mmap WAL segments as WA
for the result report...
Best regards,
Takashi
--
Takashi Menjo
NTT Software Innovation Center
> -Original Message-
> From: Robert Haas
> Sent: Wednesday, January 29, 2020 6:00 AM
> To: Takashi Menjo
> Cc: Heikki Linnakangas ; pgsql-hack...@postgresql.org
> Subj
Hi,
On 2020-01-27 13:54:38 -0500, Robert Haas wrote:
> On Mon, Jan 27, 2020 at 2:01 AM Takashi Menjo
> wrote:
> > It sounds reasonable, but I'm sorry that I haven't tested such a program
> > yet. I'll try it to compare with my non-volatile WAL buffer. For now, I'm
> > a little worried about the
On Tue, Jan 28, 2020 at 3:28 AM Takashi Menjo
wrote:
> I think our concerns are roughly classified into two:
>
> (1) Performance
> (2) Consistency
>
> And your "different concern" is rather into (2), I think.
Actually, I think it was mostly a performance concern (writes
triggering lots of readi
Hello Robert,
I think our concerns are roughly classified into two:
(1) Performance
(2) Consistency
And your "different concern" is rather into (2), I think.
I'm also worried about it, but I have no good answer for now. I suppose
mmap(flags|=MAP_SHARED) called by multiple backend processes
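The call under discussion looks roughly like this minimal sketch (hypothetical
helper name, standard POSIX mmap):

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Each backend maps the same WAL file with MAP_SHARED, so all
 * processes address one shared copy of the pages.  The consistency
 * question above is about the order in which concurrent stores from
 * different backends become persistent. */
static void *
map_wal_shared(const char *path, size_t len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping survives closing the fd */
    return (p == MAP_FAILED) ? NULL : p;
}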
On Mon, Jan 27, 2020 at 2:01 AM Takashi Menjo
wrote:
> It sounds reasonable, but I'm sorry that I haven't tested such a program
> yet. I'll try it to compare with my non-volatile WAL buffer. For now, I'm
> a little worried about the overhead of mmap()/munmap() for each WAL segment
> file.
I gue
Hello Heikki,
> I have the same comments on this that I had on the previous patch, see:
>
> https://www.postgresql.org/message-id/2aec6e2a-6a32-0c39-e4e2-aad854543aa8%40iki.fi
Thanks. I re-read your messages [1][2]. What you meant, AFAIU, is using
memory-mapped WAL segment files as W
Hello Fabien,
Thank you for your +1 :)
> Is it possible to emulate something without the actual hardware, at least
> for testing purposes?
Yes, you can emulate PMEM using DRAM on Linux via the "memmap=nnG!ssG" kernel
parameter. Please see [1] and [2] for emulation details. If your emulation
does n
On 24/01/2020 10:06, Takashi Menjo wrote:
I propose "non-volatile WAL buffer," a proof-of-concept new feature. It
enables WAL records to be durable without being written out to WAL segment
files, by residing in persistent memory (PMEM) instead of DRAM. It improves database
performance by reducing copies of
Hello,
+1 on the idea.
By quickly looking at the patch, I notice that there are no tests.
Is it possible to emulate something without the actual hardware, at least
for testing purposes?
--
Fabien.