Tuple concurrency issue in large objects
Hi all,

I am working on a project that allows multiple users to work on a single large text document. I am using lo_put to apply only the diff to the large object, without replacing it with a new lob. While working on this, I encountered the error "Tuple concurrently updated". The error can be reproduced with two psql clients.

Setup:

mydb=# create table text_docs(id serial primary key, data oid);
CREATE TABLE
mydb=# insert into text_docs(data) select lo_import('./upload.txt');
INSERT 0 1
mydb=# select * from text_docs;
 id |  data
----+---------
  1 | 5810130
(1 row)

Now, if we open two psql clients and execute the following commands:

Client 1:
mydb=# begin;
BEGIN
mydb=# select lo_put(5810130, 10, '\xaa');
UPDATE 1

Client 2:
mydb=# select lo_put(5810130, 10, '\xaa');

Client 1:
mydb=# commit;
COMMIT

Client 2:
mydb=# select lo_put(5810130, 10, '\xaa');
ERROR:  tuple concurrently updated

Is there a workaround to this concurrency issue without creating a new large object?

Regards
Shalini
Re: Tuple concurrency issue in large objects
Hi Rene,

I am using PostgreSQL 11.2. The major version is 11 and the minor version is 2.

On 12/10/2019 11:24 AM, Rene Romero Benavides wrote:

Hi Shalini. The usual diagnostic info is your PostgreSQL server version, major and minor; for example, in 12.1 the major version is 12 and the minor (patch) version is 1.

--
Genius is 1% inspiration and 99% perspiration. Thomas Alva Edison
http://pglearn.blogspot.mx/
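As an aside, the exact server version can be confirmed from any psql session; this is standard PostgreSQL and nothing specific to this setup, and on 11.2 the output would look roughly like:

mydb=# show server_version;
 server_version
----------------
 11.2
(1 row)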
Re: Tuple concurrency issue in large objects
Hi,

Thanks. I will try this approach. Could you also please state the reason why this happens in the case of large objects? Concurrent transactions are handled very well for other data types, but the same is not happening for lobs. Is it because the former are stored in TOAST tables and there is no support for concurrent transactions in the pg_largeobject table?

On 13-Dec-2019 5:25 PM, Daniel Verite wrote:

Shalini wrote:
> > Is there a workaround to this concurrency issue without creating a
> > new large object?

The transaction failing with the "Tuple concurrently updated" error could be resubmitted by the client, as if it were a serialization failure. Or the failure could be prevented by using advisory locks:
https://www.postgresql.org/docs/current/explicit-locking.html

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
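For what it's worth, here is a minimal sketch of the advisory-lock approach Daniel suggests, assuming every writer agrees to use the large object's OID (5810130 in the earlier example) as the lock key and always takes the lock before calling lo_put; the key choice is just an illustration:

-- Every client that wants to modify the large object does this:
begin;
-- Transaction-scoped advisory lock keyed on the large object's OID.
-- A second client blocks here until the first one commits, instead of
-- failing later with "tuple concurrently updated".
select pg_advisory_xact_lock(5810130);
select lo_put(5810130, 10, '\xaa');
commit;  -- the advisory lock is released automatically at commit/rollback

This serializes writers on the same large object while leaving readers and writers of other large objects unaffected.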
Re: Tuple concurrency issue in large objects
Well... it seems I have to rethink my application design. Anyway, thank you all for your insights and suggestions.

On 12/18/2019 10:46 PM, Justin wrote:

I agree completely. I do not think PostgreSQL is a good fit for Shalini based on the conversation so far; tracking concurrency is going to be a killer. But I see the temptation to use a DB for this, as the updates are ACID and less likely to corrupt data for X reason.

On Wed, Dec 18, 2019 at 12:12 PM Tom Lane <t...@sss.pgh.pa.us> wrote:

Justin <z.g...@gmail.com> writes:
> I now see what is causing this specific issue...
> The update and row versioning is happening on a 2kb chunk at a time. That's
> going to make tracking what other clients are doing a difficult task.

Yeah, it's somewhat unfortunate that the chunkiness of the underlying data storage becomes visible to clients if they try to do concurrent updates of the same large object. Ideally you'd only get a concurrency failure if you tried to overwrite the same byte(s) that somebody else did, but as it stands, modifying nearby bytes might be enough --- or not, if there's a chunk boundary between.

On the whole, though, it's not clear to me why concurrent updates of sections of large objects is a good application design. You probably ought to rethink how you're storing your data.

regards, tom lane
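For anyone curious, the chunking Tom describes can be seen directly in the pg_largeobject system catalog (reading it normally requires superuser or explicitly granted privileges); a rough way to look at it for the OID used earlier in this thread:

-- Each row is one chunk of the large object; the chunk size (LOBLKSIZE)
-- is 2048 bytes by default, so lo_put on byte 10 rewrites the whole
-- pageno = 0 row, which is the tuple the two concurrent clients fought over.
select loid, pageno, octet_length(data) as chunk_bytes
from pg_largeobject
where loid = 5810130
order by pageno;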