Re: JIT compiling with LLVM v12.2

2018-03-24 Thread Thomas Munro
Thomas Munro wrote: > typos A dead line. -- Thomas Munro http://www.enterprisedb.com 0003-Remove-dead-code.patch Description: Binary data

Re: new function for tsquery creartion

2018-03-26 Thread Thomas Munro
ns only stopword(s) or doesn't contain lexeme(s), ignored"))); + (errmsg("text-search query contains only stop words or doesn't contain lexemes, ignored"))); But the old test still appears in an example in doc/src/sgml/textsearch.sgml. -- Thomas Munro http://www.enterprisedb.com docs.patch Description: Binary data

Parallel safety of binary_upgrade_create_empty_extension

2018-03-26 Thread Thomas Munro
ark that function PARALLEL UNSAFE. Obviously that'll affect only newly initdb'd clusters after this patch, but that's what people have in a pg_upgrade scenario. This goes back to d89f06f0482 so I think it should probably be back-patched to 9.6 and 1

Re: Parallel safety of binary_upgrade_create_empty_extension

2018-03-26 Thread Thomas Munro
others all just set a global variable so they're technically fine as 'r'. I have no strong preference either way; these functions will only actually be run in parallel in the weird situation of force_parallel_mode = on. -- Thomas Munro http://www.enterprisedb.com

Re: Parallel safety of binary_upgrade_create_empty_extension

2018-03-26 Thread Thomas Munro
of others all lead there creating many possibly false positives (though who knows). If I filter those out I'm left with the ones already mentioned (pg_import_system_collations, binary_upgrade_create_empty_extension) plus two others: 1. unique_key_recheck, not user callable anyway. 2. brin

Re: Parallel safety of binary_upgrade_create_empty_extension

2018-03-26 Thread Thomas Munro
On Tue, Mar 27, 2018 at 3:30 PM, Thomas Munro wrote: > I hacked something up in Python # otool -tvV | \ In case anyone is interested in trying that, it should be "otool -tvV [path to postgres executable compiled with -O0]" (meaning disassemble it). I was removing my home direc

Re: [HACKERS] pg_serial early wraparound

2018-03-26 Thread Thomas Munro
On Tue, Mar 27, 2018 at 5:50 AM, Tom Lane wrote: > Thomas Munro writes: >> Rebased again, now with a commit message. That assertion has since >> been removed (commit ec99dd5a) so the attached test script can once >> again be used to see the contents of pg_serial as the xi

Re: Updating parallel.sgml's treatment of parallel joins

2018-03-28 Thread Thomas Munro
On Fri, Mar 23, 2018 at 6:26 AM, Robert Haas wrote: > On Fri, Feb 23, 2018 at 10:30 PM, Thomas Munro > wrote: >> Here is an attempt at updating parallel.sgml to cover Parallel Hash. >> I will be neither surprised nor offended if Robert would like to put >> it dif

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-03-28 Thread Thomas Munro
et/Articles/724307/ "Current kernels might report a writeback error on an fsync() call, but there are a number of ways in which that can fail to happen." That's... I'm speechless. -- Thomas Munro http://www.enterprisedb.com

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-03-28 Thread Thomas Munro
g even matters, whether it's a single bit or a counter as described in that patch. If write back failed, *the page is still dirty*. So all future calls to fsync() need to try to flush it again, and (presumably) fail again (unless it happens to succeed this time around). --

Re: Parallel safety of binary_upgrade_create_empty_extension

2018-03-29 Thread Thomas Munro
_summarize_new_values * brin_desummarize_range * gin_clean_pending_list * cursor_to_xml * cursor_to_xmlschema Has anyone got anything else? -- Thomas Munro http://www.enterprisedb.com

Re: Typo in shared_record_table_compare() commentary

2018-03-29 Thread Thomas Munro
On Thu, Mar 29, 2018 at 10:16 PM, Arthur Zakirov wrote: > During studying dshash I've found a little typo. There is no > SharedRecordTableKey struct in the code, I think the commentary refers > to SharedRecordTableKey struct. Right, thanks! > The patch is attached. +1 --

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-03-29 Thread Thomas Munro
haven't looked past linux yet, though. I see no reason to think that any other operating system would behave that way without strong evidence... This is openly acknowledged to be "a mess" and "a surprise" in the Filesystem Summit article. I am not really qualified to comment, but from a cursory glance at FreeBSD's vfs_bio.c I think it's doing what you'd hope for... see the code near the comment "Failed write, redirty." -- Thomas Munro http://www.enterprisedb.com

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-03-29 Thread Thomas Munro
reSQL's buffer and the kernel's buffer are clean and might be reused for another block at any time, so your data might be gone from the known universe -- we don't even have the option to rewrite our buffers in general. Recovery is the only option. Thank you to Craig for chasing this

Re: [HACKERS] SERIALIZABLE with parallel query

2018-03-29 Thread Thomas Munro
ingering uncertainty about this patch and we're out of time, so I moved it to PG12 CF1. Thanks Haribabu, Robert, Amit for the reviews and comments so far. -- Thomas Munro http://www.enterprisedb.com

Re: [HACKERS] Planning counters in pg_stat_statements

2018-03-30 Thread Thomas Munro
ystem load from ~8 to ~1. Having this patch in > pg_stat_statements would have been critical to get the full picture of what > was going on earlier. > > Thomas: I'm not as familiar with planner internals as you are, but happy to > try and contribute here. Would it be useful for m

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-01 Thread Thomas Munro
On Fri, Mar 30, 2018 at 10:18 AM, Thomas Munro wrote: > ... on Linux only. Apparently I was too optimistic. I had looked only at FreeBSD, which keeps the page around and dirties it so we can retry, but the other BSDs apparently don't (FreeBSD changed that in 1999). From what I can t

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-02 Thread Thomas Munro
1] https://github.com/apple/darwin-xnu/blob/master/bsd/vfs/vfs_bio.c#L2695 -- Thomas Munro http://www.enterprisedb.com

Re: Optimize Arm64 crc32c implementation in Postgresql

2018-04-03 Thread Thomas Munro
I hope we can figure out a more portable way to detect the instructions, or failing that a way to detect them on FreeBSD in userspace. I'll try to send a patch for PG12 if I get a chance. No opinion on the unaligned memory access question. -- Thomas Munro http://www.enterprisedb.com

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-03 Thread Thomas Munro
On Tue, Apr 3, 2018 at 1:29 PM, Thomas Munro wrote: > Interestingly, there don't seem to be many operating systems that can > report ENOSPC from fsync(), based on a quick scan through some > documentation: > > POSIX, AIX, HP-UX, FreeBSD, OpenBSD, NetBSD: no > Illumos/Sol

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-03 Thread Thomas Munro
rently think it's reasonable behaviour and not a bug. At least there is a plausible workaround for that: namely the nuclear option proposed by Craig. -- Thomas Munro http://www.enterprisedb.com

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-03 Thread Thomas Munro
On Wed, Apr 4, 2018 at 2:14 PM, Bruce Momjian wrote: > On Tue, Apr 3, 2018 at 10:05:19PM -0400, Bruce Momjian wrote: >> On Wed, Apr 4, 2018 at 01:54:50PM +1200, Thomas Munro wrote: >> > I believe there were some problems of that nature (with various >> > twists

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-03 Thread Thomas Munro
On Wed, Apr 4, 2018 at 2:44 PM, Thomas Munro wrote: > On Wed, Apr 4, 2018 at 2:14 PM, Bruce Momjian wrote: >> Uh, are you sure it fixes our use-case? From the email description it >> sounded like it only reported fsync errors for every open file >> descriptor at the time of

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-04 Thread Thomas Munro
On Wed, Apr 4, 2018 at 6:00 PM, Craig Ringer wrote: > On 4 April 2018 at 13:29, Thomas Munro > wrote: >> /* Ensure that we skip any errors that predate opening of the file */ >> f->f_wb_err = filemap_sample_wb_err(f->f_mapping); >> >> [...] > > Ho

Re: Optimize Arm64 crc32c implementation in Postgresql

2018-04-04 Thread Thomas Munro
ould break when called by pg_logical_slot_get_changes()... https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=eelpout&dt=2018-04-04%2009%3A58%3A56 -- Thomas Munro http://www.enterprisedb.com

Re: Optimize Arm64 crc32c implementation in Postgresql

2018-04-04 Thread Thomas Munro
t now and found out that some libraries use a technique they call "CPU probing": just try it and see if you get SIGILL. Is that a bad idea for some reason? Here is a quick hack -- anyone got an ARM system without crc that they could test it on? -- Thomas Munro http://www.enterpris

Re: Optimize Arm64 crc32c implementation in Postgresql

2018-04-04 Thread Thomas Munro
On Wed, Apr 4, 2018 at 11:47 PM, Thomas Munro wrote: > BTW I did some googling just now and found out that some libraries use > a technique they call "CPU probing": just try it and see if you get > SIGILL. Is that a bad idea for some reason? Here is a quick hack -- > an

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-04 Thread Thomas Munro
idea on ZFS because you'll finish up double-buffering (or is that triple-buffering?), flooding your page cache with transient data. Oops. That is off-topic and not relevant for the checkpoint correctness topic of this thread though, since pg_flush_data() is advisory only. -- Thomas Munro http://www.enterprisedb.com

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-04 Thread Thomas Munro
On Thu, Apr 5, 2018 at 9:28 AM, Thomas Munro wrote: > On Thu, Apr 5, 2018 at 2:00 AM, Craig Ringer wrote: >> I've tried xfs, jfs, ext3, ext4, even vfat. All behave the same on EIO. >> Didn't try zfs-on-linux or other platforms yet. > > While contemplating wha

Unstable number of workers in select_parallel test on spurfowl

2018-04-04 Thread Thomas Munro
s); alter table f_star reset (parallel_workers); set enable_parallel_append to off; explain (costs off) select round(avg(aa)), sum(aa) from a_star; I don't see why... -- Thomas Munro http://www.enterprisedb.com

Re: Checkpoint not retrying failed fsync?

2018-04-05 Thread Thomas Munro
efore we * begin to scan their fork. Why is it OK to unlink the bitmapset? We still need its contents, in the case that the fsync fails! -- Thomas Munro http://www.enterprisedb.com

Re: Checkpoint not retrying failed fsync?

2018-04-05 Thread Thomas Munro
On Fri, Apr 6, 2018 at 11:34 AM, Andrew Gierth wrote: >>>>>> "Thomas" == Thomas Munro writes: > > >> As far as I can tell from reading the code, if a checkpoint fails the > >> checkpointer is supposed to keep all the outstanding fsync requests

Re: Checkpoint not retrying failed fsync?

2018-04-05 Thread Thomas Munro
On Fri, Apr 6, 2018 at 11:36 AM, Thomas Munro wrote: > On Fri, Apr 6, 2018 at 11:34 AM, Andrew Gierth > wrote: >> Right. >> >> But I don't think just copying the value is sufficient; if a new bit was >> set while we were processing the old ones, how would we

Re: Checkpoint not retrying failed fsync?

2018-04-05 Thread Thomas Munro
On Fri, Apr 6, 2018 at 12:56 PM, Thomas Munro wrote: > After some testing, here is a better one for review. One problem I thought of about 8 milliseconds after clicking send is that bms_union() may fail to allocate memory and then you're hosed. Here is a new version that uses bms_join()

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-05 Thread Thomas Munro
you could try to confirm our understanding of the Linux 4.13+ policy would be to hack PostgreSQL so that it reopens the file descriptor every time in mdsync(). See attached. -- Thomas Munro http://www.enterprisedb.com force-reopen-when-syncing.patch Description: Binary data
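The experiment described here amounts to calling fsync() through a descriptor that was opened only just before the call. A minimal standalone sketch of that pattern, assuming a hypothetical helper named `fsync_by_path` (this is not the attached patch and does not touch mdsync()):

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the file afresh, fsync it, close it.
 * Returns 0 on success and -1 if any step fails (errno is left as set
 * by the failing call).
 */
static int
fsync_by_path(const char *path)
{
    int fd;
    int rc;

    assert(path != NULL);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    rc = fsync(fd);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Whether a write-back error that happened before the open() is reported through such a descriptor is exactly the question the experiment is meant to answer, so the sketch makes no claim about the outcome.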

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-07 Thread Thomas Munro
way and be reset. Perhaps you could keep inodes pinned by keeping the associated buffers dirty after an error (like FreeBSD), but if you did that you'd have solved the problem already and wouldn't really need the wb_err system at all. Is there some other idea along these

Re: Checkpoint not retrying failed fsync?

2018-04-08 Thread Thomas Munro
e them as we go. New patch attached. -- Thomas Munro http://www.enterprisedb.com 0001-Make-sure-we-don-t-forget-about-fsync-requests-af-v3.patch Description: Binary data

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Thomas Munro
esort to flakey fsync() error reporting. I wonder if anyone can tell us what Windows, AIX and HPUX do here. > [1] > https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-pillai.pdf Very interesting, thanks. -- Thomas Munro http://www.enterprisedb.com

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Thomas Munro
On Tue, Apr 10, 2018 at 10:33 AM, Thomas Munro wrote: > I wonder if anyone can tell us what Windows, AIX and HPUX do here. I created a wiki page to track what we know (or think we know) about fsync() on various operating systems: https://wiki.postgresql.org/wiki/Fsync_Errors If anyone has m

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Thomas Munro
doesn't even touch the pipe, or any other kernel objects apart from your own queue IIUC. -- Thomas Munro http://www.enterprisedb.com

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Thomas Munro
hen the backend is done with it would be better. How would that interlock with concurrent checkpoints? I can see how to make that work if the share-fd-or-fsync-now logic happens in smgrwrite() when called by FlushBuffer() while you hold io_in_progress, but not if you defer it to some random time later. -- Thomas Munro http://www.enterprisedb.com

Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process

2018-04-10 Thread Thomas Munro
On Tue, Sep 20, 2016 at 11:26 AM, Andres Freund wrote: > On 2016-09-20 11:07:03 +1200, Thomas Munro wrote: >> Yeah, I wondered why that was different than the pattern established >> elsewhere when I was hacking on replication code. There are actually >> several

Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process

2018-04-10 Thread Thomas Munro
On Wed, Apr 11, 2018 at 12:03 PM, Andres Freund wrote: > On 2018-04-11 11:57:20 +1200, Thomas Munro wrote: >> Then if pgarch_ArchiverCopyLoop() and HandleStartupProcInterrupts() >> (ie loops without waiting) adopt a prctl(PR_SET_PDEATHSIG)-based >> approach where available a

Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process

2018-04-10 Thread Thomas Munro
On Wed, Apr 11, 2018 at 12:26 PM, Andres Freund wrote: > On 2018-04-11 12:17:14 +1200, Thomas Munro wrote: >> I arrived at this idea via the realisation that the closest thing to >> prctl(PR_SET_PDEATHSIG) on BSD-family systems today is >> please-tell-my-kqueue-if-this

Re: [HACKERS] kqueue

2018-04-10 Thread Thomas Munro
On Wed, Dec 6, 2017 at 12:53 AM, Thomas Munro wrote: > On Thu, Jun 22, 2017 at 7:19 PM, Thomas Munro > wrote: >> I don't plan to resubmit this patch myself, but I was doing some >> spring cleaning and rebasing today and I figured it might be worth >> quietly leaving

Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process

2018-04-11 Thread Thomas Munro
On Wed, Apr 11, 2018 at 12:47 PM, Thomas Munro wrote: > On Wed, Apr 11, 2018 at 12:26 PM, Andres Freund wrote: >> On 2018-04-11 12:17:14 +1200, Thomas Munro wrote: >>> I arrived at this idea via the realisation that the closest thing to >>> prctl(PR_SET_PDEATHSIG) on

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-11 Thread Thomas Munro
On Wed, Apr 11, 2018 at 10:22 PM, Heikki Linnakangas wrote: > On 10/04/18 04:36, Thomas Munro wrote: >> Just an idea, not tested: what about a reusable WaitEventSet with zero >> timeout? Using the kqueue patch, that'd call kevent() which'd return >> immediately

Re: es_query_dsa is broken

2018-04-11 Thread Thomas Munro
On Thu, Apr 12, 2018 at 4:04 AM, Andres Freund wrote: > This is an open item for v11: > > Tidy up es_query_dsa and possibly ParallelWorkerContext? > Original commit: e13029a5ce353574516c64fd1ec9c50201e705fd (principal > author: Thomas Munro; owner: Robert Haas) >

Instability in partition_prune test?

2018-04-12 Thread Thomas Munro
= $2) AND (b < 4)) -> Parallel Seq Scan on ab_a2_b3 (actual rows=0 loops=1) Filter: ((a >= $1) AND (a <= $2) AND (b < 4)) This is a Parallel Append with three processes working on three subplans. It looks like one of the subplans got executed twice? -- Thomas Munro http://www.enterprisedb.com

Re: Instability in partition_prune test?

2018-04-12 Thread Thomas Munro
On Fri, Apr 13, 2018 at 1:21 PM, David Rowley wrote: > On 13 April 2018 at 10:29, Thomas Munro wrote: >> This is a Parallel Append with three processes working on three >> subplans. It looks like one of the subplans got executed twice? > > Hi Thomas, > > Thanks for th

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2018-04-15 Thread Thomas Munro
pread()/pwrite() in PG12. I understand that the use of lseek() to find file sizes is a different problem and unrelated. -- Thomas Munro http://www.enterprisedb.com
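For readers unfamiliar with the calls named in this snippet: pread() and pwrite() take the file offset as an argument, so a block can be read or written in one system call without a preceding lseek() and without moving the descriptor's file position. A small sketch, where `BLOCK_SIZE`, `read_block`, `write_block` and `demo_roundtrip` are illustrative names rather than PostgreSQL code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8            /* tiny block size, for illustration only */

/* Read block number blockno; the file position is left untouched. */
static ssize_t
read_block(int fd, void *buf, off_t blockno)
{
    assert(blockno >= 0);
    return pread(fd, buf, BLOCK_SIZE, blockno * BLOCK_SIZE);
}

/* Write block number blockno; the file position is left untouched. */
static ssize_t
write_block(int fd, const void *buf, off_t blockno)
{
    assert(blockno >= 0);
    return pwrite(fd, buf, BLOCK_SIZE, blockno * BLOCK_SIZE);
}

/* Write two blocks to a temporary file, read the second back. */
static int
demo_roundtrip(void)
{
    char        path[] = "/tmp/pread_demo_XXXXXX";
    char        buf[BLOCK_SIZE];
    int         fd = mkstemp(path);
    int         ok;

    if (fd < 0)
        return -1;

    ok = write_block(fd, "block000", 0) == BLOCK_SIZE &&
        write_block(fd, "block001", 1) == BLOCK_SIZE &&
        read_block(fd, buf, 1) == BLOCK_SIZE &&
        memcmp(buf, "block001", BLOCK_SIZE) == 0 &&
        lseek(fd, 0, SEEK_CUR) == 0;    /* position never moved */

    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```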

Re: [sqlsmith] Unpinning error in parallel worker

2018-04-17 Thread Thomas Munro
vide them if/when the issue > happens again. Thanks, that would be much appreciated, as would any clues about what workload you're running. Do you know what the query plan looks like for the queries that crashed? -- Thomas Munro http://www.enterprisedb.com

Re: [sqlsmith] Unpinning error in parallel worker

2018-04-17 Thread Thomas Munro
On Wed, Apr 18, 2018 at 11:01 AM, Jonathan Rudenberg wrote: > On Tue, Apr 17, 2018, at 18:38, Thomas Munro wrote: >> Thanks, that would be much appreciated, as would any clues about what >> workload you're running. Do you know what the query plan looks like >> fo

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-17 Thread Thomas Munro
of Andres's idea for Linux, and also for patched FreeBSD (for later if/when that lands). Do you think this makes sense Heikki? I am planning to add this to the next CF. -- Thomas Munro http://www.enterprisedb.com 0001-Use-signals-for-postmaster-death-detection-on-Linux.patch Description:

Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process

2018-04-17 Thread Thomas Munro
ellation point' or a quiet PANIC that you can opt out of. It's nice to remove the old boilerplate code without having to add a new boilerplate event that you have to remember every time. Any other opinions? I'm not sure if the exit(1) vs proc_exit(1) distinction is important.

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-18 Thread Thomas Munro
On Wed, Apr 18, 2018 at 5:04 PM, Thomas Munro wrote: > On Wed, Apr 11, 2018 at 10:22 PM, Heikki Linnakangas wrote: >>> On Tue, Apr 10, 2018 at 12:53 PM, Andres Freund >>> wrote: >>>> That person said he'd work on adding an equivalent of linux' >>

Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process

2018-04-18 Thread Thomas Munro
On Wed, Apr 18, 2018 at 6:55 PM, Thomas Munro wrote: > Here's a draft patch that does that. Here's a better one (the previous version could read past the end of the occurred_events array). -- Thomas Munro http://www.enterprisedb.com 0001-Exit-by-default-if-postmaster-

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-19 Thread Thomas Munro
On Thu, Apr 19, 2018 at 6:20 PM, Andres Freund wrote: > On April 18, 2018 8:05:50 PM PDT, Thomas Munro > wrote: >>By the way, these patches only use the death signal to make >>PostmasterIsAlive() fast, for use by busy loops like recovery. The >>postmaster pipe is s

Re: [sqlsmith] Unpinning error in parallel worker

2018-04-19 Thread Thomas Munro
On Wed, Apr 18, 2018 at 11:43 AM, Jonathan Rudenberg wrote: > On Tue, Apr 17, 2018, at 19:31, Thomas Munro wrote: >> On Wed, Apr 18, 2018 at 11:01 AM, Jonathan Rudenberg >> wrote: >> > Yep, I think I know approximately what it looked like, I've attached a >>

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-20 Thread Thomas Munro
Here's a new version, because FreeBSD's new interface changed slightly. -- Thomas Munro http://www.enterprisedb.com 0001-Use-signals-for-postmaster-death-on-Linux-v3.patch Description: Binary data 0002-Use-signals-for-postmaster-death-on-FreeBSD-v3.patch Description: Binary data

Re: [HACKERS] Clock with Adaptive Replacement

2018-04-23 Thread Thomas Munro
wnload?doi=10.1.1.452.9699&rep=rep1&type=pdf [2] https://www.usenix.org/legacy/event/usenix01/full_papers/zhou/zhou.pdf [3] https://www.usenix.org/legacy/event/usenix02/full_papers/wong/wong_html/ [4] http://www.spinics.net/lists/linux-mm/msg13385.html -- Thomas Munro http://www.enterprisedb.com

Re: "could not reattach to shared memory" on buildfarm member dory

2018-04-23 Thread Thomas Munro
> Yeah, that's definitely interesting. I wondered if another thread with the right timing could map something between the VirtualFree() and MapViewOfFileEx() calls, but we don't create the Windows signal handling thread until a bit later. Could there be any other threads activ

Re: [sqlsmith] Unpinning error in parallel worker

2018-04-24 Thread Thomas Munro
'd love to see the stack of the one process that did that and then self-deadlocked. I will have another go at trying to reproduce it here today. -- Thomas Munro http://www.enterprisedb.com

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-24 Thread Thomas Munro
se(DNSServiceRefSockFD(bonjour_sdref)); > #endif > + > +PostmasterDeathInit(); > > Thomas, trying to understand here... Why this place for the signal > initialization? Wouldn't InitPostmasterChild() be a more logical place > as we'd want to have this logic caught

Re: [HACKERS] Clock with Adaptive Replacement

2018-04-25 Thread Thomas Munro
hological searches for buffers (I believe that such workloads exist, from anecdotal reports). -- Thomas Munro http://www.enterprisedb.com

Re: [HACKERS] Clock with Adaptive Replacement

2018-04-25 Thread Thomas Munro
On Thu, Apr 26, 2018 at 10:54 AM, Peter Geoghegan wrote: > On Wed, Apr 25, 2018 at 5:26 AM, Thomas Munro > wrote: >> I think pgbench isn't a very interesting workload though. I don't >> have time right now but it would be fun to dig further and try to >>

Append with naive multiplexing of FDWs

2019-09-03 Thread Thomas Munro
're ready
select * from pt where b like '42'; [1] https://www.postgresql.org/message-id/CAEepm%3D1CuAWfxDk%3D%3DjZ7pgCDCv52fiUnDSpUvmznmVmRKU5zpA%40mail.gmail.com -- Thomas Munro https://enterprisedb.com 0001-Multiplexing-Append-POC.patch Description: Binary data

Re: ERROR: multixact X from before cutoff Y found to be still running

2019-09-04 Thread Thomas Munro
st running multi; the v2 uses the least aggressive of the 'safe' and oldest running multi. At first glance it seems like the second one is better: it only does something different if we're in the dangerous scenario you identified, but otherwise it sticks to the safe limit, which generates less IO. -- Thomas Munro https://enterprisedb.com

Re: ERROR: multixact X from before cutoff Y found to be still running

2019-09-05 Thread Thomas Munro
oesn't seem to be a problem in itself. (I am not sure why GetOldestMultiXactId() needs to consider OldestVisibleMXactId[] at all for this purpose, and not just OldestMemberXactId[], but I suppose it has to do with simultaneously key-share-locked and updated tuples or something, it's too earl

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-09-05 Thread Thomas Munro
ounter can never escape and deadlock somewhere in the consumer part of the plan. Obviously we don't want to have loads of extra OS processes all over the place, but I think you can get the same effect using a form of asynchronous execution where the program counter jumps between nodes and streams based on readiness, and yields control instead of blocking. Similar ideas have been proposed to deal with asynchronous IO. -- Thomas Munro https://enterprisedb.com

Re: ERROR: multixact X from before cutoff Y found to be still running

2019-09-06 Thread Thomas Munro
On Sat, Sep 7, 2019 at 5:25 AM Robert Haas wrote: > (I apologize if any of the above sounds like I'm taking credit for > work actually done by Thomas, who I see is listed as the primary > author of the commit in question. I feel like I invented > MultiXactMemberFreezeThre

Re: Shared Memory: How to use SYSV rather than MMAP ?

2019-09-09 Thread Thomas Munro
On Wed, Sep 4, 2019 at 10:30 AM Alvaro Herrera wrote: > On 2019-Feb-03, Thomas Munro wrote: > > On Sat, Feb 2, 2019 at 12:49 AM Thomas Munro > > wrote: > > > I am planning to commit the 0001 patch shortly, unless there are > > > objections. I attach

Re: Consolidate 'unique array values' logic into a reusable function?

2019-09-09 Thread Thomas Munro
On Fri, Aug 30, 2019 at 3:34 PM Thomas Munro wrote: > Adding to CF. Rebased due to bitrot. Spotted one more place to use this, in src/backend/utils/adt/txid.c. -- Thomas Munro https://enterprisedb.com 0001-Consolidate-code-that-makes-a-sorted-array-unique-v2.patch Description: Binary data
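The reusable operation named in the subject line, removing duplicates from an already sorted array in place, can be sketched as below. The function name `unique_sorted` and its qsort()-style signature are assumptions made for illustration; the attached patch may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Remove adjacent duplicates from a sorted array in place and return
 * the new number of elements.  The comparator has the same contract as
 * the one passed to qsort().
 */
static size_t
unique_sorted(void *array, size_t n, size_t width,
              int (*cmp) (const void *, const void *))
{
    char       *base = array;
    size_t      i;
    size_t      j;

    assert(array != NULL || n == 0);

    if (n <= 1)
        return n;

    /* j indexes the last element kept so far. */
    for (i = 1, j = 0; i < n; i++)
    {
        if (cmp(base + i * width, base + j * width) != 0 && ++j != i)
            memcpy(base + j * width, base + i * width, width);
    }
    return j + 1;
}

/* Example comparator for int arrays. */
static int
compare_int(const void *a, const void *b)
{
    int         x = *(const int *) a;
    int         y = *(const int *) b;

    return (x > y) - (x < y);
}
```

Typical use is qsort() followed by `unique_sorted()` with the same comparator, which is the sort-then-deduplicate pairing the thread proposes to consolidate.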

Re: Should we add xid_current() or a int8->xid cast?

2019-09-09 Thread Thomas Munro
On Sun, Sep 1, 2019 at 5:04 PM Thomas Munro wrote: > Adding to CF. Rebased. An OID clashed so re-roll the dice. Also spotted a typo. -- Thomas Munro https://enterprisedb.com 0001-Add-SQL-type-xid8-to-expose-FullTransactionId-to--v2.patch Description: Binary data 0002-Introduce-x

Parallel Full Hash Join

2019-09-11 Thread Thomas Munro
postgresql.org/message-id/CA%2BTgmoY4LogYcg1y5JPtto_fL-DBUqvxRiZRndDC70iFiVsVFQ%40mail.gmail.com [3] https://www.postgresql.org/message-id/flat/CA%2BhUKGLBRyu0rHrDCMC4%3DRn3252gogyp1SjOgG8SEKKZv%3DFwfQ%40mail.gmail.com -- Thomas Munro https://enterprisedb.com 0001-WIP-Add-support-for-Parallel-Full-Hash-Join.patch Description: Binary data

Standby Replication and Replication Delay

2019-09-14 Thread Thomas Rosenstein
why does this still happen, shouldn't this prevent the removal on the primary and allow replication to continue even if queries are active? Thanks Thomas

Re: Standby Replication and Replication Delay

2019-09-14 Thread Thomas Rosenstein
Hi Tomas, I'm using Postgresql 10.10 on the standbys and 10.5 on the primary. On 14 Sep 2019, at 21:16, Tomas Vondra wrote: On Sat, Sep 14, 2019 at 06:03:34PM +0200, Thomas Rosenstein wrote: Hi, so I got two questions: 1) I have multiple Postgresql Standby servers replicating ove

Re: Standby Replication and Replication Delay

2019-09-14 Thread Thomas Rosenstein
On 14 Sep 2019, at 22:08, Tomas Vondra wrote: On Sat, Sep 14, 2019 at 09:26:26PM +0200, Thomas Rosenstein wrote: Hi Tomas, I'm using Postgresql 10.10 on the standbys and 10.5 on the primary. On 14 Sep 2019, at 21:16, Tomas Vondra wrote: On Sat, Sep 14, 2019 at 06:03:34PM +0200, T

Re: POC: Cleaning up orphaned files using undo logs

2019-09-15 Thread Thomas Munro
the page that DOES depend on the insert pointer might not log the meta-data if it's not the first WAL record to touch it after a checkpoint. Rats. I'll have to think about that some more. -- Thomas Munro https://enterprisedb.com

Re: POC: Cleaning up orphaned files using undo logs

2019-09-16 Thread Thomas Munro
On Tue, Sep 17, 2019 at 3:09 AM Kuntal Ghosh wrote: > On Mon, Sep 16, 2019 at 11:23 AM Thomas Munro wrote: > > Agreed. I added a line to break out of that loop if !block->in_use. > > > I think we should skip the block if !block->in_use. Because, the undo > bu

scorpionfly needs more semaphores

2019-09-17 Thread Thomas Munro
n switch to "unnamed" POSIX semaphores :-) [1] https://www.postgresql.org/message-id/flat/27582.1546928073%40sss.pgh.pa.us [2] https://github.com/openbsd/src/blob/master/lib/librthread/rthread_sem.c#L112 -- Thomas Munro https://enterprisedb.com

Re: [PATCH] src/test/modules/dummy_index -- way to test reloptions from inside of access method

2019-09-18 Thread Thomas Munro
On Thu, Sep 19, 2019 at 7:58 AM Nikolay Shaplov wrote: > В Fri, 2 Aug 2019 11:12:35 +1200 > Thomas Munro пишет: > > > While moving this to the September CF, I noticed this failure: > > > > test reloptions ... FAILED 32 ms > > Do you have a

Usage of the system truststore for SSL certificate validation

2019-09-19 Thread Thomas Berger
. Validating the certificate against this CA requires to either override the PGSSLROOTCERT location via the environment or provide a copy of the file for each user that connects with libpq or libpq-like connectors. We would like to simplify this. -- Thomas Berger PostgreSQL DBA Database

Re: dropdb --force

2019-09-19 Thread Thomas Munro
19:07:22.203 UTC [1516] 050_dropdb.pl LOG: statement: DROP DATABASE (FORCE) foobar2; # doesn't match '(?^:statement: DROP DATABASE (FORCE) foobar2)' -- Thomas Munro https://enterprisedb.com

Re: scorpionfly needs more semaphores

2019-09-22 Thread Thomas Munro
ired. Also, as I speculated in that other thread: based on a quick peek at the implementation, you might get better performance on very large busy PostgreSQL clusters from our cache line padded sem_init() array than you do with your more densely packed SysV semas (I could be totally wrong about that,

Re: Proposal: Make use of C99 designated initialisers for nulls/values arrays

2019-10-01 Thread Thomas Munro
On Wed, Oct 2, 2019 at 5:49 AM Andres Freund wrote: > On 2019-10-01 12:17:08 -0400, Tom Lane wrote: > > Note though that InsertPgAttributeTuple uses memset(), while some of > > these other places use MemSet(). The code I see being generated for > > MemSet() is also the same(!) on clang, but it is
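As a rough illustration of what this thread is about: zeroing the values/nulls arrays with an initialiser at the point of declaration, instead of a separate memset() call, produces the same contents. In the sketch below `NATTS` and the `Datum` typedef are stand-ins chosen for the example, not PostgreSQL's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NATTS 4                 /* hypothetical number of attributes */
typedef uintptr_t Datum;        /* stand-in for PostgreSQL's Datum */

/* Returns true if both ways of zeroing the arrays give the same bytes. */
static bool
initialisers_match_memset(void)
{
    /* Style 1: declare, then clear with memset(). */
    Datum       values_a[NATTS];
    bool        nulls_a[NATTS];

    /* Style 2: clear at declaration with an initialiser. */
    Datum       values_b[NATTS] = {0};
    bool        nulls_b[NATTS] = {0};

    memset(values_a, 0, sizeof(values_a));
    memset(nulls_a, 0, sizeof(nulls_a));

    assert(sizeof(values_a) == sizeof(values_b));

    return memcmp(values_a, values_b, sizeof(values_a)) == 0 &&
        memcmp(nulls_a, nulls_b, sizeof(nulls_a)) == 0;
}
```

Which form generates better code is compiler-dependent, which is the point being debated in the quoted messages.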

Re: Peripatus: Can someone look?

2019-10-01 Thread Thomas Munro
that fixes anything. > > (this is -CURRENT on FreeBSD, so it's always a moving target). Hi Larry, I'm seeing this on my FreeBSD 13 bleeding edge system too (built a couple of days ago) and will see if I can find out what's up with that. The most obvious culprit is the stuff that just landed in the kernel to support Linux-style memfd_create() and thereby changed around some shm_open() related things. Seems to be clearly not a PostgreSQL problem. Thanks, Thomas

Re: Collation versioning

2019-10-03 Thread Thomas Munro
On Thu, Oct 3, 2019 at 7:53 AM Peter Eisentraut wrote: > On 2018-09-05 23:18, Thomas Munro wrote: > > On Wed, Sep 5, 2018 at 12:10 PM Christoph Berg wrote: > >>> So, it's not ideal but perhaps worth considering on the grounds that > >>> it's better than

Re: Collation versioning

2019-10-11 Thread Thomas Munro
On Thu, Oct 10, 2019 at 8:38 AM Peter Eisentraut wrote: > On 2019-10-09 21:19, Peter Eisentraut wrote: > > On 2019-10-03 14:25, Thomas Munro wrote: > >>> The only open question on this patch was whether it's a good version to > >>> use. I think based on

Re: stress test for parallel workers

2019-10-11 Thread Thomas Munro
On Sat, Oct 12, 2019 at 7:56 AM Tom Lane wrote: > This matches up with the intermittent infinite_recurse failures > we've been seeing in the buildfarm. Those are happening across > a range of systems, but they're (almost) all Linux-based ppc64, > suggesting that there's a longstanding arch-specif

Re: stress test for parallel workers

2019-10-11 Thread Thomas Munro
On Sat, Oct 12, 2019 at 9:40 AM Tom Lane wrote: > Andres Freund writes: > > On 2019-10-11 14:56:41 -0400, Tom Lane wrote: > >> ... So it's really hard to explain > >> that as anything except a kernel bug: sometimes, the kernel > >> doesn't give us as much stack as it promised it would. And the >

Re: stress test for parallel workers

2019-10-12 Thread Thomas Munro
On Sun, Oct 13, 2019 at 1:06 PM Tom Lane wrote: > I don't think any further proof is required that this is > a kernel bug. Where would be a good place to file it? linuxppc-...@lists.ozlabs.org might be the right place. https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: "pg_ctl: the PID file ... is empty" at end of make check

2019-10-14 Thread Thomas Munro
On Tue, Oct 15, 2019 at 1:55 PM Tom Lane wrote: > Thomas Munro writes: > > Agreed. Secret non-shareable bug report filed. Fingers crossed. > > Since that conversation, longfin has shown the same symptom > just once more: > > longfin | REL_11_STABLE | 2019-07-28

Re: Collation versioning

2019-10-14 Thread Thomas Munro
On Fri, Oct 11, 2019 at 11:41 PM Thomas Munro wrote: > On Thu, Oct 10, 2019 at 8:38 AM Peter Eisentraut > wrote: > > Actually, I had to revert that because pg_dump and pg_upgrade tests need > > to be updated, but that seems doable. > > [Returning from a couple of weeks mo

Re: Collation versioning

2019-10-15 Thread Thomas Munro
On Tue, Oct 15, 2019 at 5:39 PM Thomas Munro wrote: > Here's a version with a small note added to the documentation. I'm > planning to commit this tomorrow. Done. It's not much, but it's a start. Some things to do: * handle default collation (probably comes with

Re: Collation versioning

2019-10-15 Thread Thomas Munro
On Wed, Oct 16, 2019 at 5:33 PM Thomas Munro wrote: > On Tue, Oct 15, 2019 at 5:39 PM Thomas Munro wrote: > > Here's a version with a small note added to the documentation. I'm > > planning to commit this tomorrow. > > Done. The buildfarm is telling me that I did

Re: ERROR: multixact X from before cutoff Y found to be still running

2019-10-15 Thread Thomas Munro
On Wed, Sep 18, 2019 at 8:11 AM Bossart, Nathan wrote: > Thanks for the detailed background information. FWIW I am now in > favor of the v2 patch. Here's a version with a proposed commit message and a comment. Please let me know if I credited things to the right people! 0001-Fix-bug-that-coul

Re: ERROR: multixact X from before cutoff Y found to be still running

2019-10-16 Thread Thomas Munro
On Thu, Oct 17, 2019 at 6:11 AM Jeremy Schneider wrote: > On 10/16/19 10:09, Bossart, Nathan wrote: > > On 10/15/19, 11:11 PM, "Thomas Munro" wrote: > >> Here's a version with a proposed commit message and a comment. Please > >> let me kno

Re: ICU for global collation

2019-10-16 Thread Thomas Munro
On Wed, Oct 9, 2019 at 12:16 AM Marius Timmer wrote: > like the others before me we (the university of Münster) are happy to > see this feature as well. Thank you for this. > > When I applied the patch two weeks ago I ran into the issue that initdb > did not recognize the new parameters (collation-pro

Re: SegFault on 9.6.14

2019-10-16 Thread Thomas Munro
On Fri, Sep 13, 2019 at 1:35 AM Robert Haas wrote: > On Thu, Sep 12, 2019 at 8:55 AM Amit Kapila wrote: > > Robert, Thomas, do you have any more suggestions related to this. I > > am planning to commit the above-discussed patch (Forbid Limit node to > > shutdown resource
