On 2/6/25 07:40, Robert Leach wrote:

Comments inline.

Let me try to distill all the backstory down to an oversimplified explanation:

I created a scientific data submission validation interface that helps 
researchers compile their data submissions to our database. To do this, I 
decided to add a `--validate` option to the load scripts that essentially 
raises a dry-run exception at the end of a load run, so that the data is all 
rolled back before leaving the atomic transaction block.
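
Roughly, the pattern looks like this (a simplified sketch, assuming a 
Django-style atomic block; the insert_record callable is a stand-in for the 
real per-record load step, not our actual code):

from django.db import transaction

class DryRun(Exception):
    """Raised at the end of a --validate run to force a rollback."""

def load_submission(records, insert_record, validate=False):
    """Load records; with validate=True, roll everything back at the end."""
    try:
        with transaction.atomic():
            for rec in records:
                insert_record(rec)  # caller-supplied per-record load step
            if validate:
                # Unwind the atomic block so everything above is rolled back.
                raise DryRun()
    except DryRun:
        pass  # validation finished; nothing was committed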

Load to where, an existing table or a temporary table?


This validation interface skips the raw data load step (the heftiest, 
longest-running part) and generally finishes in well under a minute.

So what is the validation actually doing?


The interface works fantastically well. It rolls back problematic data in 
smaller transactions and buffers those errors for a final report; the user can 
then fix the problems in their data files and retry validation until 
everything is resolved and the data is ready to load.
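
The error-buffering part is roughly this (again a simplified sketch with 
made-up names; each record gets its own small atomic block, which acts as a 
savepoint when nested inside the outer dry-run transaction):

from django.db import transaction

def validate_records(records, insert_record):
    """Try each record in its own small transaction, collecting failures."""
    errors = []
    for rec in records:
        try:
            with transaction.atomic():  # savepoint-sized transaction
                insert_record(rec)      # caller-supplied per-record load step
        except Exception as exc:
            errors.append((rec, exc))   # buffered for the final report
    return errors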

The problem is that if we have a long-running load going on in the background 
(which can take hours), and during that load a user tries to validate a new 
submission that contains some overlapping common data (e.g. a new tissue type 
or protocol that is currently being loaded), or tries to validate data added 
to an existing submission that has been partially loaded, that validation 
process gets blocked and the validation interface encounters a gateway 
timeout.
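
To illustrate the kind of collision I mean (made-up table and connection 
details, using psycopg directly rather than our load code, and assuming 
tissue.name has a unique constraint):

import psycopg

loader = psycopg.connect("dbname=example")     # stand-in for the long-running load
validator = psycopg.connect("dbname=example")  # stand-in for the --validate run

# The load inserts a new common value (e.g. a tissue type) but its
# transaction stays open for hours before it commits.
loader.execute("INSERT INTO tissue (name) VALUES ('spleen')")

# The validation run tries to insert the same unique value. PostgreSQL has to
# wait to find out whether the first insert commits or rolls back, so this
# statement blocks for as long as the load transaction stays open, and the
# web request in front of it eventually hits the gateway timeout.
validator.execute("INSERT INTO tissue (name) VALUES ('spleen')")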

Where are the background processes loading data to?


I had tried changing the isolation level to both repeatable read and 
serializable, but the hang can still occur (though serializable seems to avoid 
hangs in some cases that repeatable read does not).

My initial interpretation of the isolation level documentation was that 
transactions would not be aware of what is happening in other transactions, 
and that if there was a problem you would only ever get a serialization error 
at the end, when the data is committed. After encountering the hang yesterday, 
though, I realized that the "snapshot" the second transaction gets includes 
locks already established by the first transaction.
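
Put differently, raising the isolation level doesn't change the example above; 
something like this still waits at any isolation level (same made-up names):

import psycopg
from psycopg import IsolationLevel

conn = psycopg.connect("dbname=example")  # the validation run's connection
conn.isolation_level = IsolationLevel.SERIALIZABLE

with conn.transaction():
    # If another open transaction has already inserted 'spleen' into the
    # unique column but not yet committed, this insert waits on that
    # transaction regardless of isolation level: the wait is on the
    # unique-index entry, not on the snapshot.
    conn.execute("INSERT INTO tissue (name) VALUES ('spleen')")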

I had hoped that since my validation interface would never commit, it would 
never have a serialization error or hang.

What options are available that might allow this all to work as I intended? Is 
there some way to tell the validation process transaction to assume that any 
pending changes from another transaction will succeed?

Not that I know of, and that would be a false promise anyway, as I know of no 
process that is perfect.


I have lots of ideas on how to mitigate these occurrences, but I don't want to 
bias any potentially helpful responses. Am I screwed by my early decision to 
use the loading code to validate data submissions?


--
Adrian Klaver
adrian.kla...@aklaver.com


