Thank you so much, Amit! I have registered the patch in the commitfest:
https://commitfest.postgresql.org/22/2003/

Please let me know if you have any further suggestions. Thank you!

Best regards,
--
Chengchao Yu
Software Engineer | Microsoft | Azure Database for PostgreSQL
https://azure.microsoft.com/en-us/services/postgresql/


-----Original Message-----
From: Amit Kapila <amit.kapil...@gmail.com> 
Sent: Friday, February 1, 2019 6:58 PM
To: Chengchao Yu <chen...@microsoft.com>
Cc: Thomas Munro <thomas.mu...@enterprisedb.com>; Pg Hackers 
<pgsql-hack...@postgresql.org>; Prabhat Tripathi <pt...@microsoft.com>; Sunil 
Kamath <sunil.kam...@microsoft.com>; Michal Primke <mpri...@microsoft.com>; 
TEJA Mupparti <tejeswar.muppa...@microsoft.com>
Subject: Re: [PATCH] Fix Proposal - Deadlock Issue in Single User Mode When IO 
Failure Occurs

On Sat, Feb 2, 2019 at 4:42 AM Chengchao Yu <chen...@microsoft.com> wrote:
>
> Hi Amit, Thomas,
>
> Thank you very much for your feedback! Apologies, but I only just saw both 
> messages.
>
> > We generally reserve the space in a relation before attempting to write, so 
> > I am not sure how you are able to hit the disk-full situation via mdwrite.  
> > The description of the function indicates the same.
>
> Absolutely agree, this isn’t a PG issue. The issue manifests for us at 
> Microsoft because our own storage layer treats mdextend() as only setting the 
> length of the file. We have a workaround, and no change is needed in 
> Postgres.
>
> > I am not saying that mdwrite can never lead to an error; I am just trying 
> > to understand the issue you actually faced.  I haven't read your proposed 
> > solution yet, so let's first try to establish the problem you are facing.
>
> We see transient I/O errors while reading a block, after which the PG 
> instance deadlocks in single-user mode until we kill it. The stack trace 
> below shows the behavior: mdread() fails while the buffer's LWLock is still 
> held. This happens because in single-user mode there is no callback to 
> release the lock, so when AbortBufferIO() tries to acquire the same lock 
> again, it waits for it indefinitely.
>
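
The failure mode described above can be sketched with a toy model. This is only an illustration under stated assumptions, not PostgreSQL code: ToyLock, toy_abort_io() and toy_failed_read_recovers() are hypothetical names, and the lock returns false at the point where a real blocking lock would wait forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-reentrant lock; a hypothetical stand-in for a buffer's lock. */
typedef struct ToyLock
{
    bool held;
} ToyLock;

/* Returns false where a real blocking lock would wait forever. */
static bool
toy_lock_try_acquire(ToyLock *lock)
{
    if (lock->held)
        return false;
    lock->held = true;
    return true;
}

static void
toy_lock_release(ToyLock *lock)
{
    assert(lock->held);
    lock->held = false;
}

/* Hypothetical abort handler: it needs the lock to reset the I/O state. */
static bool
toy_abort_io(ToyLock *lock)
{
    if (!toy_lock_try_acquire(lock))
        return false;           /* would be a self-deadlock */
    toy_lock_release(lock);
    return true;
}

/*
 * Simulates a failed read. release_on_error models whether the error
 * path releases the held lock before the abort handler runs.
 */
static bool
toy_failed_read_recovers(bool release_on_error)
{
    ToyLock lock = { false };

    if (!toy_lock_try_acquire(&lock))   /* taken before the read starts */
        return false;

    /* ... the read fails here and control moves to error handling ... */

    if (release_on_error)
        toy_lock_release(&lock);

    return toy_abort_io(&lock);
}
```

In this model, toy_failed_read_recovers(true) succeeds because the lock is released before the abort handler runs, while toy_failed_read_recovers(false) reports the point at which a blocking lock would hang.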

I think you can register your patch for next CF [1] so that we don't forget 
about it.

[1] - https://commitfest.postgresql.org/22/

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment: fix-deadlock.patch
Description: fix-deadlock.patch
