Hi,
> There are some similarities between this and
> https://www.postgresql.org/message-id/20240207184050.rkvpuudq7huijmkb%40awork3.anarazel.de
> as described in that email.
Thanks for this information.
>
>
> Hm, I think this might make this code a bit more expensive. It's cheaper, both
> in t
Hi,
There are some similarities between this and
https://www.postgresql.org/message-id/20240207184050.rkvpuudq7huijmkb%40awork3.anarazel.de
as described in that email.
On 2024-01-25 15:24:17 +0800, Andy Fan wrote:
> From 4c8fd0ab71299e57fbeb18dec70051bd1d035c7c Mon Sep 17 00:00:00 2001
> From: "
Andres Freund writes:
> On 2024-01-18 14:00:58 -0500, Robert Haas wrote:
>> > The LockBufHdr also used init_local_spin_delay / perform_spin_delay
>> > infrastruce and then it has the same issue like ${subject}, it is pretty
>> > like the code in s_lock; Based on my current knowledge, I think we
Hi,
> Hi,
>
> On 2024-01-22 15:18:35 +0800, Andy Fan wrote:
>> I used sigismember(&BlockSig, SIGQUIT) to detect if a process is doing a
>> quickdie, however this is bad not only because it doesn't work on
>> Windows, but also it has too poor performance even it impacts on
>> USE_ASSERT_CHECKING
Hi,
On 2024-01-22 15:18:35 +0800, Andy Fan wrote:
> I used sigismember(&BlockSig, SIGQUIT) to detect if a process is doing a
> quickdie, however this is bad not only because it doesn't work on
> Windows, but also it has too poor performance even it impacts on
> USE_ASSERT_CHECKING build only. In v
Hi,
On 2024-01-18 14:00:58 -0500, Robert Haas wrote:
> > The LockBufHdr also used init_local_spin_delay / perform_spin_delay
> > infrastruce and then it has the same issue like ${subject}, it is pretty
> > like the code in s_lock; Based on my current knowledge, I think we
> > should add the check
Robert Haas writes:
> On Mon, Jan 22, 2024 at 12:13 PM Andy Fan wrote:
>> > On Mon, Jan 22, 2024 at 11:58 AM Andy Fan wrote:
>> >> I get your point! Acquiring an already held spinlock in quickdie is
>> >> unlikely to happen, but since our existing infrastructure can handle it,
>> >> then ther
On Mon, Jan 22, 2024 at 12:13 PM Andy Fan wrote:
> > On Mon, Jan 22, 2024 at 11:58 AM Andy Fan wrote:
> >> I get your point! Acquiring an already held spinlock in quickdie is
> >> unlikely to happen, but since our existing infrastructure can handle it,
> >> then there is no reason to bypass it.
>
Robert Haas writes:
> On Mon, Jan 22, 2024 at 11:58 AM Andy Fan wrote:
>> I get your point! Acquiring an already held spinlock in quickdie is
>> unlikely to happen, but since our existing infrastructure can handle it,
>> then there is no reason to bypass it.
>
> No, the existing infrastructure
On Mon, Jan 22, 2024 at 11:58 AM Andy Fan wrote:
> I get your point! Acquiring an already held spinlock in quickdie is
> unlikely to happen, but since our existing infrastructure can handle it,
> then there is no reason to bypass it.
No, the existing infrastructure cannot handle that at all.
--
Robert Haas writes:
> On Mon, Jan 22, 2024 at 2:22 AM Andy Fan wrote:
>> I used sigismember(&BlockSig, SIGQUIT) to detect if a process is doing a
>> quickdie, however this is bad not only because it doesn't work on
>> Windows, but also it has too poor performance even it impacts on
>> USE_ASSE
On Mon, Jan 22, 2024 at 2:22 AM Andy Fan wrote:
> I used sigismember(&BlockSig, SIGQUIT) to detect if a process is doing a
> quickdie, however this is bad not only because it doesn't work on
> Windows, but also it has too poor performance even it impacts on
> USE_ASSERT_CHECKING build only. In v8,
Andy Fan writes:
> I found a speical case about checking it in errstart. So commit 3 in v7
> is added.
>
> commit 757c67c1d4895ce6a523bcf5217af8eb2351e2a1 (HEAD -> s_stuck_v3)
> Author: yizhi.fzh
> Date: Mon Jan 22 07:14:29 2024 +0800
>
> Bypass SpinLock checking in SIGQUIT signal hander
Hi,
Andy Fan writes:
> Hi,
>
> v6 attached which addressed all the items Robert suggested except the
> following 2 open items. They are handled differently.
>
>>
>> Here is the summary of the open-items, it would be great that Andres and
>> Matthias have a look at this when they have time.
>>
>
Hi,
v6 attached which addressed all the items Robert suggested except the
following 2 open items. They are handled differently.
>
> Here is the summary of the open-items, it would be great that Andres and
> Matthias have a look at this when they have time.
>
>>> The LockBufHdr also used init_loc
Hi,
Here is the summary of the open-items, it would be great that Andres and
Matthias have a look at this when they have time.
1. Shall we treat the LockBufHdr as a SpinLock usage.
>> The LockBufHdr also used init_local_spin_delay / perform_spin_delay
>> infrastruce and then it has the same is
On Thu, Jan 18, 2024 at 1:30 PM Andy Fan wrote:
> > You added an #include to dynahash.c despite making no other changes to
> > the file.
>
> That's mainly because I put the code into miscadmin.h and spin.h depends
> on miscadmin.h with MACROs.
That's not a good reason. Headers need to include hea
Hi Robert,
Thanks for your attention!
> On Thu, Jan 18, 2024 at 7:56 AM Andy Fan wrote:
>> Do you still have interest with making some progress on this topic?
>
> Some, but it's definitely not my top priority. I wish I could give as
> much attention as everyone would like to everyone's patches
On Thu, Jan 18, 2024 at 7:56 AM Andy Fan wrote:
> Do you still have interest with making some progress on this topic?
Some, but it's definitely not my top priority. I wish I could give as
much attention as everyone would like to everyone's patches, but I
can't even come close.
I think that the s
Hi Matthias / Robert:
Do you still have interest with making some progress on this topic?
> Robert Haas writes:
>
>> On Wed, Jan 10, 2024 at 10:17 PM Andy Fan wrote:
>>> fixed in v2.
>>
>> Timing the spinlock wait seems like a separate patch from the new sanity
>> checks.
>
> Yes, a separate
Robert Haas writes:
> On Wed, Jan 10, 2024 at 10:17 PM Andy Fan wrote:
>> fixed in v2.
>
> Timing the spinlock wait seems like a separate patch from the new sanity
> checks.
Yes, a separate patch would be better, so removed it from v4.
> I suspect that the new sanity checks should only be ar
On Wed, Jan 10, 2024 at 10:17 PM Andy Fan wrote:
> fixed in v2.
Timing the spinlock wait seems like a separate patch from the new sanity checks.
I suspect that the new sanity checks should only be armed in
assert-enabled builds.
I'm doubtful that this method of tracking the wait time will be
ac
Robert Haas writes:
> Thanks for jumping in with a review, Matthias!
FWIW, Matthias is also the first one for this proposal at this
thread, thanks for that as well!
--
Best Regards
Andy Fan
Hi Matthias,
Thanks for the review!
Matthias van de Meent writes:
> On Wed, 10 Jan 2024 at 02:44, Andy Fan wrote:
>> Hi,
>>
>> I want to know if Andres or you have plan
>> to do some code review. I don't expect this would happen very soon, just
>> want to make sure this will not happen that b
Thanks for jumping in with a review, Matthias!
On Wed, Jan 10, 2024 at 8:03 AM Matthias van de Meent
wrote:
> I'm not 100% sure on the policy of this, but theoretically you could
> use LockAquireExtended(dontWait=true) while holding a spin lock, as
> that would not have an unknown duration. Then
On Wed, 10 Jan 2024 at 02:44, Andy Fan wrote:
> Hi,
>
> I want to know if Andres or you have plan
> to do some code review. I don't expect this would happen very soon, just
> want to make sure this will not happen that both of you think the other
> one will do, but actually none of them does it in
Hi,
Robert Haas writes:
> On Mon, Jan 8, 2024 at 9:40 PM Andy Fan wrote:
>> The singler handler I was refering to is 'CHECK_FOR_INTERRUPTS', Based
>> on this, spin_lock and lwlock are acted pretty differently.
>
> CHECK_FOR_INTERRUPTS() is not a signal handler,
hmm, I knew this but I th
On Mon, Jan 8, 2024 at 9:40 PM Andy Fan wrote:
> The singler handler I was refering to is 'CHECK_FOR_INTERRUPTS', Based
> on this, spin_lock and lwlock are acted pretty differently.
CHECK_FOR_INTERRUPTS() is not a signal handler, and it's OK to acquire
and release spin locks or lwlocks there. We
Hi,
Robert Haas writes:
> On Sun, Jan 7, 2024 at 9:52 PM Andy Fan wrote:
>> > I think we should add cassert-only infrastructure tracking whether we
>> > currently hold spinlocks, are in a signal handler and perhaps a few other
>> > states. That'd allow us to add assertions like:
>> ..
>> > - n
On Sun, Jan 7, 2024 at 9:52 PM Andy Fan wrote:
> > I think we should add cassert-only infrastructure tracking whether we
> > currently hold spinlocks, are in a signal handler and perhaps a few other
> > states. That'd allow us to add assertions like:
> ..
> > - no lwlocks or ... while in signal ha
Hi!
>
> I think we should add cassert-only infrastructure tracking whether we
> currently hold spinlocks, are in a signal handler and perhaps a few other
> states. That'd allow us to add assertions like:
..
> - no lwlocks or ... while in signal handlers
I *wish* lwlocks should *not* be held whi
Hi,
> On 2024-01-05 14:19:23 -0500, Robert Haas wrote:
>> On Fri, Jan 5, 2024 at 2:11 PM Andres Freund wrote:
>> > I see it fairly regularly. Including finding several related bugs that
>> > lead to
>> > stuck systems last year (signal handlers are a menace).
>>
>> In that case, I think this
Hi,
On 2024-01-05 14:19:23 -0500, Robert Haas wrote:
> On Fri, Jan 5, 2024 at 2:11 PM Andres Freund wrote:
> > I see it fairly regularly. Including finding several related bugs that lead
> > to
> > stuck systems last year (signal handlers are a menace).
>
> In that case, I think this proposal i
Hi,
On 2024-01-05 10:20:39 +0800, Andy Fan wrote:
> Andres Freund writes:
> > On 2024-01-04 14:59:06 +0800, Andy Fan wrote:
> >> My question is if someone doesn't obey the rule by mistake (everyone
> >> can make mistake), shall we PANIC on a production environment? IMO I
> >> think it can be a WA
On Fri, Jan 5, 2024 at 2:11 PM Andres Freund wrote:
> I see it fairly regularly. Including finding several related bugs that lead to
> stuck systems last year (signal handlers are a menace).
In that case, I think this proposal is dead. I can't personally
testify to this code being a force for goo
Hi,
On 2024-01-05 08:51:53 -0500, Robert Haas wrote:
> On Thu, Jan 4, 2024 at 6:06 PM Andres Freund wrote:
> > I think we should add infrastructure to detect bugs like this during
> > development, but not PANICing when this happens in production seems
> > completely
> > non-viable.
>
> I mean +
On Thu, Jan 4, 2024 at 6:06 PM Andres Freund wrote:
> I think we should add infrastructure to detect bugs like this during
> development, but not PANICing when this happens in production seems completely
> non-viable.
I mean +1 for the infrastructure, but "completely non-viable"? Why?
I've only
Hi,
Andres Freund writes:
>
> On 2024-01-04 14:59:06 +0800, Andy Fan wrote:
>> My question is if someone doesn't obey the rule by mistake (everyone
>> can make mistake), shall we PANIC on a production environment? IMO I
>> think it can be a WARNING on a production environment and be a stuck
>>
Hi,
On 2024-01-04 17:03:18 -0600, Jim Nasby wrote:
> On 1/4/24 10:33 AM, Tom Lane wrote:
>
> Robert Haas writes:
>
> On Thu, Jan 4, 2024 at 10:22 AM Tom Lane wrote:
>
> We should be making an effort to ban coding patterns like
> "return with spinlock still
Hi,
On 2024-01-04 14:59:06 +0800, Andy Fan wrote:
> My question is if someone doesn't obey the rule by mistake (everyone
> can make mistake), shall we PANIC on a production environment? IMO I
> think it can be a WARNING on a production environment and be a stuck
> when 'ifdef USE_ASSERT_CHECKING'.
On 1/4/24 10:33 AM, Tom Lane wrote:
Robert Haas writes:
On Thu, Jan 4, 2024 at 10:22 AM Tom Lane wrote:
We should be making an effort to ban coding patterns like
"return with spinlock still held", because they're just too prone
to errors similar to this one.
I agree th
Hi,
On 2024-01-04 12:04:07 -0500, Robert Haas wrote:
> On Thu, Jan 4, 2024 at 11:33 AM Tom Lane wrote:
> > I believe it's (a). No matter what the reason for a stuck spinlock
> > is, the only reliable method of getting out of the problem is to
> > blow things up and start over. The patch propose
On Thu, Jan 4, 2024 at 11:33 AM Tom Lane wrote:
> I believe it's (a). No matter what the reason for a stuck spinlock
> is, the only reliable method of getting out of the problem is to
> blow things up and start over. The patch proposed at the top of this
> thread would leave the system unable to
Robert Haas writes:
> On Thu, Jan 4, 2024 at 10:22 AM Tom Lane wrote:
>> We should be making an effort to ban coding patterns like
>> "return with spinlock still held", because they're just too prone
>> to errors similar to this one.
> I agree that we don't want to add overhead, and also about h
On Thu, Jan 4, 2024 at 10:22 AM Tom Lane wrote:
> I'm not a fan of adding overhead to such a performance-critical
> thing as spinlocks in order to catch coding errors that are easily
> detectable statically. IMV the correct usage of spinlocks is that
> they should only be held across *short, stra
Robert Haas writes:
> I'm not sure that the approach this patch takes is correct in detail,
> but I kind of agree with you about the overall point. I mean, the idea
> of the PANIC is to avoid having the system just sit there in a state
> from which it will never recover ... but it can also have th
Hi Matthias and Robert,
Matthias van de Meent writes:
> On Thu, 4 Jan 2024 at 08:09, Andy Fan wrote:
>>
>> My question is if someone doesn't obey the rule by mistake (everyone
>> can make mistake), shall we PANIC on a production environment? IMO I
>> think it can be a WARNING on a production e
On Thu, Jan 4, 2024 at 2:09 AM Andy Fan wrote:
> My question is if someone doesn't obey the rule by mistake (everyone
> can make mistake), shall we PANIC on a production environment? IMO I
> think it can be a WARNING on a production environment and be a stuck
> when 'ifdef USE_ASSERT_CHECKING'.
>
On Thu, 4 Jan 2024 at 08:09, Andy Fan wrote:
>
> My question is if someone doesn't obey the rule by mistake (everyone
> can make mistake), shall we PANIC on a production environment? IMO I
> think it can be a WARNING on a production environment and be a stuck
> when 'ifdef USE_ASSERT_CHECKING'.
>
49 matches
Mail list logo