> > I'm drawing a blank on trivial candidate uses for preadv(), without
> > infrastructure from later patches.
>
> Can't immediately think of something either.
This might not be that trivial, but maybe acquire_sample_rows() from analyze.c?
Please note however there's patch
https://www.post
Hi Ray,
> So can we delete the limit of ArchiveRecoveryRequested, and enable launching
> bgwriter on the master node?
Please take a look at https://commitfest.postgresql.org/29/2706/ and the
related email thread.
-J.
Hi Stephen, hackers,
> The analyze is doing more-or-less random i/o since it's skipping through
> the table picking out select blocks, not doing regular sequential i/o.
VS
>> Breakpoint 1, heapam_scan_analyze_next_block (scan=0x10c8098,
>> blockno=19890910, bstrategy=0x1102278) at heapam_handler.
Hi Stephen, hackers,
>> > With all those 'readahead' calls it certainly makes one wonder if the
>> > Linux kernel is reading more than just the block we're looking for
>> > because it thinks we're doing a sequential read and will therefore want
>> > the next few blocks when, in reality, we're goin
e in wildest dreams that
posix_fadvise(POSIX_FADV_WILLNEED) is such a cheap syscall.
-J.
--------
From: Stephen Frost
Sent: Tuesday, November 3, 2020 6:47 PM
To: Jakub Wartak
Cc: pgsql-hackers
Subject: Re: automatic analyze: readahead - add "IO read time" log me
k_io_timings=on doesn't feel like it is
going to make stuff crash, so again I think it is a good idea.
-Jakub Wartak.
o current
commitfest https://commitfest.postgresql.org/36/3461/ with you as the author
and in 'Ready for review' state.
I think it behaves like an almost-finished one, and after reading all
those discussions about this feature, which span 10+ years, and a lot of
failed effort towards wal_level=noWAL, I think it would be nice to
finally start getting some of it into core.
-Jakub Wartak.
v7-0001-In-place-table-persistence-change-with-new-comman.patch
Description: v7-0001-In-place-table-persistence-change-with-new-comman.patch
> Justin wrote:
> On Fri, Dec 17, 2021 at 09:10:30AM +, Jakub Wartak wrote:
> > As the thread didn't get a lot of traction, I've registered it into current
> commitfest
> https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcommitfest.postgresql.
Hi Kyotaro, I'm glad you are still into this
> I didn't register for some reasons.
Right now in v8 there's a typo in ./src/backend/catalog/storage.c:
storage.c: In function 'RelationDropInitFork':
storage.c:385:44: error: expected statement before ')' token
pending->unlink_forknum != INIT_F
Hi Kyotaro,
> At Mon, 20 Dec 2021 17:39:27 +0900 (JST), Kyotaro Horiguchi
> wrote in
> > At Mon, 20 Dec 2021 07:59:29 +, Jakub Wartak
> > wrote in
> > > BTW fast feedback regarding that ALTER patch (there were 4 unlogged
> tables):
> > > # ALTER
Hi Kyotaro,
> I took a bit too long detour but the patch gets to pass make-world for me.
Good news, v10 passes all the tests for me (including TAP recovery ones).
There's a major problem, I think:
drop table t6;
create unlogged table t6 (id bigint, t text);
create sequence s1;
insert into t6 select
Hi Kyotaro,
> At Tue, 21 Dec 2021 13:07:28 +0000, Jakub Wartak
> wrote in
> > So what's suspicious is that 122880 -> 0 file size truncation. I've
> > investigated WAL and it seems to contain TRUNCATE records after logged
> FPI images, so when the crash recover
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested
I've retested v15 of the patch with everything that came to my mi
s with much lower latency
dd if=/dev/zero of=test bs=1M count=1000 oflag=direct ==> 1.9 GB/s or
maybe even more
dd if=/dev/zero of=test bs=8k count=1 oflag=direct => 141 MB/s
-Jakub Wartak.
[1] - https://github.com/macdice/redo-bench/
[2] - https://commitfest.postgresql.org/29/2687/
fer_common
>> ReadBufferWithoutRelcache
>> XLogReadBufferExtended
>> XLogReadBufferForRedoExtended
>
> For these reads, the solution should be WAL prefetching,(..) But... when
> combined with Andres's work-in-progress AIO stuff (..)
Yes, I've heard a thing or two about those :) I hope I'll be able to deliver
some measurements of those two together (AIO+WALprefetch) sooner or later.
-Jakub Wartak.
huge btrees, multiple
INSERTs with plenty of data in VALUES() thrown as one commit, real
primary->hot-standby replication [not closed DB in recovery], sorted not random
UUIDs) - I'm going to try to nail down these differences and maybe I'll manage to
produce more realistic "pgbench reproducer" (this may take some time though).
-Jakub Wartak.
. As I've learned it's apparently much more complex to reproduce what I'm
after and involves a lot of reading about LogStandbySnapshot() / standby
recovery points on my side.
Now, back to smgropen() hash_search_by_values() reproducer...
-Jakub Wartak.
ks from WAL at once (b) then issuing
preadv() to get all the DB blocks into s_b going from the same rel/fd (c)
applying WAL. Sounds like a major refactor just to save syscalls :(
- mmap() - even more unrealistic
- IO_URING - gives a lot of promise here I think, is it even planned to be
shown f
th
PinBuffer()->GetPrivateRefCountEntry() -> dynahash that could be called pretty
often. I have no idea what kind of pgbench stress test could be used to
demonstrate the gain (or lack of it).
-Jakub Wartak.
fer*() / PinBuffer()
(some recent discussions, maybe on NUMA boxes), not just WAL recovery as it
seems relatively easy to improve.
-J.
[1] - https://github.com/macdice/redo-bench
[2] - https://fuhrwerks.com/csrg/info/93c40a660b6cdf74
From: Thomas Munro
Sent: Tues
On 2021-Sep-25, Alvaro Herrera wrote:
>> On 2021-Sep-24, Alvaro Herrera wrote:
>>
>> > Here's the set for all branches, which I think are really final, in
>> > case somebody wants to play and reproduce their respective problem
>> scenarios.
>>
>> I forgot to mention that I'll wait until 14.0 is t
On Fri, Jan 12, 2024 at 7:33 AM Bharath Rupireddy
wrote:
>
> On Wed, Jan 10, 2024 at 11:43 AM Tom Lane wrote:
> >
> > Bharath Rupireddy writes:
> > > On Wed, Jan 10, 2024 at 10:00 AM Tom Lane wrote:
> > >> Maybe. I bet just bumping up the constant by 2X or 4X or so would get
> > >> most of the
Hi Daniel,
On Tue, Jan 30, 2024 at 3:29 PM Daniel Verite wrote:
> PFA a rebased version.
Thanks for the patch! I've tested it using my original reproducer and
it works great now against the original problem description. I've
taken a quick look at the patch, it looks good to me. I've tested
us
Hi, I've tested the attached patch by Justin and it applied almost
cleanly to the master, but there was a tiny typo and make
postgres-A4.pdf didn't want to run:
Note that creating a partition using PARTITION OF
=> (note lack of closing literal) =>
Note that creating a partition using PARTITION OF
Hi Tomas,
> I took a quick look at the remaining part adding copy_file_range to
> pg_combinebackup. The patch no longer applies, so I had to rebase it.
> Most of the issues were trivial, but I had to fix a couple missing
> prototypes - I added them to copy_file.h/c, mostly.
>
> 0001 is the minimal
On Sat, Mar 23, 2024 at 6:57 PM Tomas Vondra
wrote:
> On 3/23/24 14:47, Tomas Vondra wrote:
> > On 3/23/24 13:38, Robert Haas wrote:
> >> On Fri, Mar 22, 2024 at 8:26 PM Thomas Munro
> >> wrote:
[..]
> > Yeah, that's in write_reconstructed_file() and the patch does not touch
> > that at all. I
On Tue, Mar 26, 2024 at 7:03 PM Tomas Vondra
wrote:
[..]
>
> That's really strange.
Hi Tomas, but it looks like it's fixed now :)
> > --manifest-checksums=NONE --copy-file-range without v20240323-2-0002:
> > 27m23.887s
> > --manifest-checksums=NONE --copy-file-range with v20240323-2-0002 and
>
Hi -hackers,
While chasing some other bug I've learned that backtrace_functions
might be misleading with top elog/ereport() address.
Reproducer:
# using Tom's reproducer on master:
wget
https://www.postgresql.org/message-id/attachment/112394/ri-collation-bug-example.sql
echo '' >> ri-collation-
s %drqm d_await dareq-sz f/s f_await aqu-sz %util
nvme0n1 61212.00591.82 0.00 0.000.10 9.90
2.00 0.02 0.00 0.000.0012.000.00 0.00
0.00 0.000.00 0.000.00 0.006.28 85.20
So in short it looks good to me.
-Jakub Wartak.
Hi Andrey,
On Thu, Mar 28, 2024 at 1:09 PM Andrey M. Borodin wrote:
>
>
>
> > On 8 Aug 2023, at 12:31, John Naylor wrote:
> >
> > > > Also the shared counter is the cause of the slowdown, but not the
> > > > reason for the numeric limit.
> > >
> > > Isn't it both? typedef Oid is unsigned int =
On Mon, Apr 1, 2024 at 9:46 PM Tomas Vondra
wrote:
>
> Hi,
>
> I've been running some benchmarks and experimenting with various stuff,
> trying to improve the poor performance on ZFS, and the regression on XFS
> when using copy_file_range. And oh boy, did I find interesting stuff ...
[..]
Congra
a key thing due to
having fast ability to "restore" the clone rather than copying the
data from somewhere else)
- pg_basebackup without that would be unusable without space savings
(e.g. imagine daily backups @ 10+TB DWHs)
> On 4/3/24 15:39, Jakub Wartak wrote:
> > On Mon, Apr 1, 202
On Thu, Apr 4, 2024 at 9:11 PM Tomas Vondra
wrote:
>
> On 4/4/24 19:38, Robert Haas wrote:
> > Hi,
> >
> > Yesterday, Tomas Vondra reported to me off-list that he was seeing
> > what appeared to be data corruption after taking and restoring an
> > increme
45, "\0\0\0\0\240s\325\4\0\0\4\0\370\1\0\2\0 \4
\0\0\0\0\300\237t\0\200\237t\0"..., 8192, 491520) = 8192
fadvise64(45, 401408, 8192, POSIX_FADV_WILLNEED) = 0
fadvise64(45, 335872, 8192, POSIX_FADV_WILLNEED) = 0
pread64(45, "\0\0\0\0\250\233r\4\0\0\4\0\370\1\0\2\0 \4
\0\0\0\0\300\237t\0\2
On Fri, Mar 1, 2024 at 3:58 PM Tomas Vondra
wrote:
[..]
> TBH I don't have a clear idea what to do. It'd be cool to have at least
> some benefits in v17, but I don't know how to do that in a way that
> would be useful in the future.
>
> For example, the v20240124 patch implements this in the execu
series(1, 2000) as Total) select repeat('a', 100) ||
data.Total || repeat('b', 800) as total_pat from data;" | wc -l
2000
postgres@hive:~$
Regards,
-Jakub Wartak.
0001-psql-allow-CTE-queries-to-be-executed-also-using-cur.patch
Description: Binary data
importance and very low priority of this, how about
adding it as a TODO wiki item then and maybe adding just some warning
instead? I've intentionally avoided parsing grammar and regexp so it's
not perfect (not that I do care about this too much either, as web
crawlers already have index
Hi -hackers,
I would like to ask if it wouldn't be a good idea to copy the
https://wiki.postgresql.org/wiki/TOAST#Total_table_size_limit
discussion (out-of-line OID usage per TOAST-ed columns / potential
limitation) to the official "Appendix K. PostgreSQL Limits" with also
little bonus mentioning th
Hi,
>> These 2 discussions show that it's a painful experience to run into
>> this problem, and that the hackers have ideas on how to fix it, but
>> those fixes haven't materialized for years. So I would say that, yes,
>> this info belongs in the hard-limits section, because who knows how
>> long
sewhere.
I've put it wrongly; I meant that pg_largeobject entries also consume OIDs
and as such are subject to the 32TB limit.
>
> +
> + large objects number
>
> "large objects per database"
Fixed.
> + subject to the same limitations as rows per
> table
>
> That implies table size is the only factor. Max OID is also a factor, which
> was your stated reason to include LOs here in the first place.
Exactly..
Regards,
-Jakub Wartak.
v2-0001-doc-Add-some-OID-TOAST-related-limitations-to-the.patch
Description: Binary data
earlier. The patch passed all my very limited tests along with
make check-world. Patch looks good to me on the surface from a
usability point of view. I haven't looked at the code, so the patch
might still need an in-depth review.
Regards,
-Jakub Wartak.
Hi Robert,
On Wed, Oct 4, 2023 at 10:09 PM Robert Haas wrote:
>
> On Tue, Oct 3, 2023 at 2:21 PM Robert Haas wrote:
> > Here's a new patch set, also addressing Jakub's observation that
> > MINIMUM_VERSION_FOR_WAL_SUMMARIES needed updating.
>
> Here's yet another new version.[..]
Okay, so anothe
On Mon, Oct 30, 2023 at 6:46 PM Robert Haas wrote:
>
> On Thu, Sep 28, 2023 at 6:22 AM Jakub Wartak
> wrote:
> > If that is still an area open for discussion: wouldn't it be better to
> > just specify LSN as it would allow resyncing standby across major lag
> >
Hi Robert,
[..spotted the v9 patchset..]
so I've spent some time playing still with patchset v8 (without the
6/6 testing patch related to wal_level=minimal), with the exception of
- patchset v9 - marked otherwise.
1. At compile time there were 2 warnings about shadowing a variable (at
least with gcc
On Mon, Nov 20, 2023 at 4:43 PM Robert Haas wrote:
>
> On Fri, Nov 17, 2023 at 5:01 AM Alvaro Herrera
> wrote:
> > I made a pass over pg_combinebackup for NLS. I propose the attached
> > patch.
>
> This doesn't quite compile for me so I changed a few things and
> incorporated it. Hopefully I di
On Tue, Dec 5, 2023 at 7:11 PM Robert Haas wrote:
[..v13 patchset]
The results with v13 patchset are following:
* - requires checkpoint on primary when doing incremental on standby
when it's too idle, this was explained by Robert in [1], something AKA
too-fast-incremental backup due to testing-
On Thu, Dec 7, 2023 at 4:15 PM Robert Haas wrote:
Hi Robert,
> On Thu, Dec 7, 2023 at 9:42 AM Jakub Wartak
> wrote:
> > Comment: I was wondering if it wouldn't make some sense to teach
> > pg_resetwal to actually delete all WAL summaries after any
> > WAL/
Hi Robert,
On Mon, Dec 11, 2023 at 6:08 PM Robert Haas wrote:
>
> On Fri, Dec 8, 2023 at 5:02 AM Jakub Wartak
> wrote:
> > While we are at it, maybe around the below in PrepareForIncrementalBackup()
> >
> > if (tlep[i] == NULL)
> >
Hi Robert,
On Wed, Dec 13, 2023 at 2:16 PM Robert Haas wrote:
>
>
> > > not even in case of an intervening
> > > timeline switch. So, all of the errors in this function are warning
> > > you that you've done something that you really should not have done.
> > > In this particular case, you've ei
Hi Robert,
On Tue, Dec 19, 2023 at 9:36 PM Robert Haas wrote:
>
> On Fri, Dec 15, 2023 at 5:36 AM Jakub Wartak
> wrote:
> > I've played with initdb/pg_upgrade (17->17) and I don't get DBID
> > mismatch (of course they do differ after i
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: tested, passed
I've tested the patched on 17devel/master and it is my feeling -
low due to (int) * (int), while
MemoryContextAllocHuge() allows taking Size (size_t) as a parameter. I
get similar behaviour with:
size_t val = (int)1048576 * (int)3022;
Attached patch adjusts pgstat_track_activity_query_size to be size_t
instead of int, and fixes the issue.
Regards,
-Jakub Wartak.
0
On Wed, Sep 27, 2023 at 10:08 AM Michael Paquier wrote:
>
> On Wed, Sep 27, 2023 at 08:41:55AM +0200, Jakub Wartak wrote:
> > Attached patch adjusts pgstat_track_activity_query_size to be of
> > size_t from int and fixes the issue.
>
> This cannot be backpatched, and us
On Thu, Sep 28, 2023 at 12:53 AM Michael Paquier wrote:
>
> On Wed, Sep 27, 2023 at 10:29:25AM -0700, Andres Freund wrote:
> > I don't think going for size_t is a viable path for fixing this. I'm pretty
> > sure the initial patch would trigger a type mismatch from guc_tables.c - we
> > don't have
On Wed, Aug 30, 2023 at 4:50 PM Robert Haas wrote:
[..]
I've played a little bit more this second batch of patches on
e8d74ad625f7344f6b715254d3869663c1569a51 @ 31Aug (days before wait
events refactor):
test_across_wallevelminimal.sh
test_many_incrementals_dbcreate.sh
test_many_incrementals.sh
t
On Fri, Sep 29, 2023 at 4:00 AM Michael Paquier wrote:
>
> On Thu, Sep 28, 2023 at 11:01:14AM +0200, Jakub Wartak wrote:
> > v3 attached. I had a problem coming up with a better error message,
> > so suggestions are welcome. The cast still needs to be present as per
> >
(I don't know why
LP_DEAD/hint cleaning was not kicking in - maybe it was, but given
the scale of the problem it was not helping much).
-Jakub Wartak.
[1] -
https://www.postgresql.org/message-id/flat/54446AE2.6080909%40BlueTreble.com#f436bb41cf044b30eeec29472a13631e
[2] -
https://www.po
Hi,
Draft version of the patch attached (it is based on Simon's)
I would be happier if we could make that #define into a GUC (just in
case), although I do understand the effort to reduce the number of
various knobs (as their high count causes their own complexity).
-Jakub Wartak.
On Mon, N
Hi all,
apologies, the patch was rushed too quickly - my bad. I'm attaching a
fixed one as v0004 (as it is the 4th patch floating around here).
-Jakub Wartak
On Mon, Nov 21, 2022 at 9:55 PM Robert Haas wrote:
>
> On Mon, Nov 21, 2022 at 1:17 PM Andres Freund wrote:
> > On No
xRelationId=indexRelationId@entry=0,
parentIndexId=parentIndexId@entry=0
-Jakub Wartak.
On Fri, Nov 25, 2022 at 9:48 AM Tomas Vondra
wrote:
>
>
>
> On 11/18/22 15:43, Tom Lane wrote:
> > David Geier writes:
> >> On a different note: are we frequently running our tests suites wit
Hi David, Alvaro, -hackers
> Hi David,
>
> You're probably aware of this, but just to make it explicit: Jakub Wartak was
> testing performance of recovery, and one of the bottlenecks he found in
> some of his cases was dynahash as used by SMgr. It seems quite possible
Hey David,
> I think you'd have to batch by filenode and transaction in that case. Each
> batch might be pretty small on a typical OLTP workload, so it might not help
> much there, or it might hinder.
True, it is very workload dependent (I was chasing mainly multi-VALUES
INSERTs and INSERT-SELECT)
[..]
INSERT 0 50
Time: 22737.729 ms (00:22.738)
Without this feature (or with synchronous_commit_flush_wal_after=0)
the TCP's SendQ on socket walsender-->walreceiver is growing and as
such any next sendto() by OLTP backends/walwriter ends up being queued
too much causing sta
> On 1/25/23 20:05, Andres Freund wrote:
> > Hi,
> >
> > Such a feature could be useful - but I don't think the current place of
> > throttling has any hope of working reliably:
[..]
> > You're blocking in the middle of an XLOG insertion.
[..]
> Yeah, I agree the sleep would have to happen elsewher
Hi,
v2 is attached.
On Thu, Jan 26, 2023 at 4:49 PM Andres Freund wrote:
> Huh? Why did you remove the GUC?
After reading previous threads, my optimism about ever getting it into a
widely accepted shape degraded significantly (mainly due
to the discussion of wider category of 'WAL I/O
Hi Bharath,
On Fri, Jan 27, 2023 at 12:04 PM Bharath Rupireddy
wrote:
>
> On Fri, Jan 27, 2023 at 2:03 PM Alvaro Herrera
> wrote:
> >
> > On 2023-Jan-27, Bharath Rupireddy wrote:
> >
> > > Looking at the patch, the feature, in its current shape, focuses on
> > > improving replication lag (by th
On Mon, Jan 30, 2023 at 9:16 AM Bharath Rupireddy
wrote:
Hi Bharath, thanks for reviewing.
> I think measuring the number of WAL flushes with and without this
> feature that the postgres generates is great to know this feature
> effects on IOPS. Probably it's even better with variations in
> syn
On Wed, Feb 1, 2023 at 2:14 PM Tomas Vondra
wrote:
> > Maybe we should avoid calling fsyncs for WAL throttling? (by teaching
> > HandleXLogDelayPending()->XLogFlush()->XLogWrite() to NOT to sync when
> > we are flushing just because of WAL throttling ?) Would that still be
> > safe?
>
> It's not c
On Thu, Feb 2, 2023 at 11:03 AM Tomas Vondra
wrote:
> > I agree that some other concurrent backend's
> > COMMIT could fsync it, but I was wondering if that's sensible
> > optimization to perform (so that issue_fsync() would be called for
> > only commit/rollback records). I can imagine a scenario
Hi, asking out of pure technical curiosity about "the rhinoceros" - what kind
of animal is it? Physical box or VM? How could one get dmidecode(1) / dmesg(1)
/ mcelog(1) output from what's out there (e.g. does it run ECC or not?)
-J.
> -Original Message-
> From: Alvaro Herrera
> Sent: Tuesd
> I do agree that the perf report does indicate that the extra time is taken
> due to
> some large amount of memory being allocated. I just can't quite see how that
> would happen in Memoize given that
> estimate_num_groups() clamps the distinct estimate as the number of input
> rows, which is 91
Hi Pavel,
> I don't have debug symbols, so I have no more details now
> Breakpoint 1 at 0x7f557f0c16c0
> (gdb) c
> Continuing.
> Breakpoint 1, 0x7f557f0c16c0 in mmap64 () from /lib64/libc.so.6
> (gdb) bt
> #0 0x7f557f0c16c0 in mmap64 () from /lib64/libc.so.6
> #1 0x7f557f04dd91 in sy
Hi Nathan,
> > NVMe devices have a maximum queue length of 64k:
[..]
> > but our effective_io_concurrency maximum is 1,000:
[..]
> > Should we increase its maximum to 64k? Backpatched? (SATA has a
> > maximum queue length of 256.)
>
> If there are demonstrable improvements with higher values, t
Hi Tomas,
> Hi,
>
> At on of the pgcon unconference sessions a couple days ago, I presented a
> bunch of benchmark results comparing performance with different data/WAL
> block size. Most of the OLTP results showed significant gains (up to 50%) with
> smaller (4k) data pages.
Nice. I just saw th
Hi Tomas,
> Well, there's plenty of charts in the github repositories, including the
> charts I
> think you're asking for:
Thanks.
> I also wonder how is this related to filesystem page size - in all the
> benchmarks I
> did I used the default (4k), but maybe it'd behave if the filesystem page
[..]
>I doubt we could ever
> make the default smaller than it is today, as then nobody would be able to
> insert rows larger than 4 kilobytes into a table anymore.
Add error "values larger than 1/3 of a buffer page cannot be indexed" to that
list...
-J.
Hi Tomas,
> > I have a machine here with 1 x PCIe 3.0 NVMe SSD and also 1 x PCIe 4.0
> > NVMe SSD. I ran a few tests to see how different values of
> > effective_io_concurrency would affect performance. I tried to come up
> > with a query that did little enough CPU processing to ensure that I/O
>
Hi,
> The really
> puzzling thing is why is the filesystem so much slower for smaller pages. I
> mean,
> why would writing 1K be 1/3 of writing 4K?
> Why would a filesystem have such effect?
Ha! I don't care at this point as 1 or 2kB seems too small to handle many real
world scenarios ;)
> > b
> >> The attached patch is a trivial version that waits until we're at
> >> least
> >> 32 pages behind the target, and then prefetches all of them. Maybe give it
> >> a
> try?
> >> (This pretty much disables prefetching for e_i_c below 32, but for an
> >> experimental patch that's enough.)
> >
> >
Hi, got some answers!
TL;DR: for fio it would make sense to use many stress files (instead of 1) and
likewise numjobs ~ VCPUs, to avoid various pitfalls.
> >> The really
> >> puzzling thing is why is the filesystem so much slower for smaller
> >> pages. I mean, why would writing 1K be 1/3 of writing
> > The really
> puzzling thing is why is the filesystem so much slower for smaller
> pages. I mean, why would writing 1K be 1/3 of writing 4K?
> Why would a filesystem have such effect?
> >>>
> >>> Ha! I don't care at this point as 1 or 2kB seems too small to handle
> >>> many
> On 6/9/22 13:23, Jakub Wartak wrote:
> >>>>>>> The really
> >>>>>> puzzling thing is why is the filesystem so much slower for
> >>>>>> smaller pages. I mean, why would writing 1K be 1/3 of writing 4K?
> >>>>
>> > On 21 Jun 2022, at 12:35, Amit Kapila wrote:
>> >
>> > I wonder if the newly introduced "recovery_prefetch" [1] for PG-15 can
>> > help your case?
>>
>> AFAICS recovery_prefetch tries to prefetch main fork, but does not try to
>> prefetch WAL itself before reading it. Kirill is trying to sol
> > Maybe the important question is why would the readahead mechanism be
> disabled in the first place via /sys | blockdev ?
>
> Because database should know better than OS which data needs to be
> prefetched and which should not. Big OS readahead affects index scan
> performance.
OK fair point, h
> On Tue, Jun 21, 2022 at 10:33 PM Jakub Wartak
> wrote:
> > > > Maybe the important question is why would the readahead mechanism
> > > > be
> > > disabled in the first place via /sys | blockdev ?
> > >
> > > Because database should know b
>> > On 21 Jun 2022, at 16:59, Jakub Wartak wrote:
>> Oh, wow, your benchmarks show really impressive improvement.
>>
>> > I think that 1 additional syscall is not going to be cheap just for
>> > non-standard OS configurations
>> Also we can reduce nu
Hey Andrey,
> > On 23 Jun 2022, at 13:50, Jakub Wartak
> wrote:
> >
> > Thoughts?
> The patch leaves the 1st 128KB chunk unprefetched. Is it worth adding an extra
> branch for 120KB after the 1st block when readOff==0?
> Or maybe do
> + posix_fadvis
> On Fri, Jul 30, 2021 at 4:00 PM Andres Freund wrote:
> > I don't agree with that? If (user+system) << wall then it is very
> > likely that recovery is IO bound. If system is a large percentage of
> > wall, then shared buffers is likely too small (or we're replacing the
> > wrong
> > buffers) bec
Hi Álvaro, -hackers,
> I attach the patch with the change you suggested.
I've given the v02 patch a shot on top of REL_12_STABLE (already including
5065aeafb0b7593c04d3bc5bc2a86037f32143fc). Previously (yesterday), without the
v02 patch, I was getting standby corruption always via simulation
On Mon, Jul 10, 2023 at 6:24 PM Andres Freund wrote:
>
> Hi,
>
> On 2023-07-03 11:53:56 +0200, Jakub Wartak wrote:
> > Out of curiosity I've tried and it is reproducible as you have stated : XFS
> > @ 4.18.0-425.10.1.el8_7.x86_64:
> >...
> > Accord
On Tue, Apr 23, 2024 at 2:24 AM Michael Paquier wrote:
>
> On Mon, Apr 22, 2024 at 03:40:01PM +0200, Majid Garoosi wrote:
> > Any news, comments, etc. about this thread?
>
> FWIW, I'd still be in favor of doing a GUC-ification of this part, but
> at this stage I'd need more time to do a proper stu
Hi,
> My understanding of Majid's use-case for tuning MAX_SEND_SIZE is that the
> bottleneck is storage, not network. The reason MAX_SEND_SIZE affects that is
> that it determines the max size passed to WALRead(), which in turn determines
> how much we read from the OS at once. If the storage has
Hi Ashutosh & hackers,
On Mon, Apr 15, 2024 at 9:00 AM Ashutosh Bapat
wrote:
>
> Here's patch with
>
[..]
> Adding to the next commitfest but better to consider this for the next set of
> minor releases.
1. The patch does not pass cfbot -
https://cirrus-ci.com/task/5486258451906560 on master du
Hi Tom and -hackers!
On Thu, Mar 28, 2024 at 7:36 PM Tom Lane wrote:
>
> Jakub Wartak writes:
> > While chasing some other bug I've learned that backtrace_functions
> > might be misleading with top elog/ereport() address.
>
> That was understood from the beginni
Hi Peter!
On Sun, May 12, 2024 at 10:33 PM Peter Eisentraut wrote:
>
> On 07.05.24 09:43, Jakub Wartak wrote:
> > NOTE: in case one will be testing this: one cannot ./configure with
> > --enable-debug as it prevents the compiler optimizations that actually
> > end up
On Tue, May 14, 2024 at 8:19 PM Robert Haas wrote:
>
> I looked at your version and wrote something that is shorter and
> doesn't touch any existing text. Here it is.
Hi Robert, you are a real tactician here - thanks for whatever
references the original problem! :) Maybe just a slight hint nearby
e
that's still good enough? Or, well, maybe try to hack palloc()
a little, but that probably has too big an overhead, right? (just
thinking out loud).
-Jakub Wartak.
Hi Masahiko,
Out of curiosity I've tried and it is reproducible as you have stated : XFS
@ 4.18.0-425.10.1.el8_7.x86_64:
[root@rockyora ~]# time ./test test.1 1
total 20
fallocate 20
filewrite 0
real0m5.868s
user0m0.035s
sys 0m5.716s
[root@rockyora ~]# time ./te
On Tue, Jun 13, 2023 at 10:20 AM John Naylor
wrote:
Hi John,
v3 is attached for review.
> >
> >-
> >+ see note below on TOAST
>
> Maybe:
> "further limited by the number of TOAST-ed values; see note below"
Fixed.
> > I've wrongly put it, I've meant that pg_largeobject also co
On Thu, Aug 22, 2024 at 8:11 AM Peter Eisentraut wrote:
>
> On 15.08.24 08:38, Peter Eisentraut wrote:
> > On 08.08.24 19:42, Robert Haas wrote:
> >>> I'm thinking pg_upgrade could have a mode where it adds the
> >>> checksum during the upgrade as it copies the files (essentially a subset
> >>> of