[ https://issues.apache.org/jira/browse/HIVE-23725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183835#comment-17183835 ]
Zoltan Haindrich commented on HIVE-23725:
-----------------------------------------

* this patch added an arbitrary MAX_EXECUTION 10 ?
* it's enabled by default???? it shouldn't be... it should be pluggable - so that you don't mess up other plugins like this patch has done....
* uses CommandProcessorException instead of tapping into the hooks? I see no benefit in that..
* and changed ALL existing plugins to check for max executions ?

why didn't you guys ping me?

> ValidTxnManager snapshot outdating causing partial reads in merge insert
> ------------------------------------------------------------------------
>
>                 Key: HIVE-23725
>                 URL: https://issues.apache.org/jira/browse/HIVE-23725
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Peter Varga
>            Assignee: Peter Varga
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> When the ValidTxnManager invalidates the snapshot during merge insert and
> starts to read committed transactions that were not committed when the query
> compilation happened, it can cause partial read problems if the committed
> transaction created a new partition in the source or target table.
> The solution should not only fix the snapshot but also recompile the query
> and acquire the locks again.
> You could construct an example like this:
> 1. Open and compile transaction 1 that merge inserts data from a partitioned
> source table that has a few partitions.
> 2. Open, run and commit transaction 2 that inserts data to an old and a new
> partition of the source table.
> 3. Open, run and commit transaction 3 that inserts data to the target table
> of the merge statement, which will retrigger a snapshot generation in
> transaction 1.
> 4. Run transaction 1. The snapshot will be regenerated, and it will read
> partial data from transaction 2, breaking the ACID properties.
> Different setup.
> Switch the transaction order:
> 1. Compile transaction 1 that inserts data to an old and a new partition of
> the source table.
> 2. Compile transaction 2 that inserts data to the target table.
> 3. Compile transaction 3 that merge inserts data from the source table to the
> target table.
> 4. Run and commit transaction 1.
> 5. Run and commit transaction 2.
> 6. Run transaction 3. Since its snapshot contains transactions 1 and 2, the
> isValidTxnListState will be triggered and we do a partial read of
> transaction 1 for the same reasons.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
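
The first scenario in the description can be reproduced with a small toy model. This is only a sketch under simplifying assumptions, not Hive code: the class PartialReadSketch and its methods (insert, commit, snapshot, compile, visibleRows, runScenario) are hypothetical names invented for illustration. A "snapshot" is modeled as the set of committed transaction ids, and "compilation" as fixing the list of partitions to read. The sketch counts how many of transaction 2's rows transaction 1 sees when it keeps its old snapshot, when it regenerates the snapshot but keeps the old plan, and when it also recompiles.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names exist in Hive.
public class PartialReadSketch {
    // Source table: partition name -> ids of the transactions that wrote a row there.
    private final Map<String, List<Long>> source = new LinkedHashMap<>();
    // Ids of all committed transactions.
    private final Set<Long> committed = new HashSet<>();

    void insert(long txnId, String partition) {
        source.computeIfAbsent(partition, p -> new ArrayList<>()).add(txnId);
    }

    void commit(long txnId) {
        committed.add(txnId);
    }

    /** A snapshot is modeled as a copy of the committed set at this moment. */
    Set<Long> snapshot() {
        return new HashSet<>(committed);
    }

    /** "Compilation": plan every partition holding a row visible in the snapshot. */
    List<String> compile(Set<Long> snapshot) {
        List<String> plan = new ArrayList<>();
        for (Map.Entry<String, List<Long>> e : source.entrySet()) {
            if (e.getValue().stream().anyMatch(snapshot::contains)) {
                plan.add(e.getKey());
            }
        }
        return plan;
    }

    /** "Execution": count rows of writerTxn readable through the plan under the snapshot. */
    long visibleRows(long writerTxn, List<String> plan, Set<Long> snapshot) {
        if (!snapshot.contains(writerTxn)) {
            return 0;
        }
        long count = 0;
        for (String partition : plan) {
            for (long txn : source.get(partition)) {
                if (txn == writerTxn) {
                    count++;
                }
            }
        }
        return count;
    }

    /** Returns the rows of transaction 2 seen by transaction 1 in three variants. */
    static long[] runScenario() {
        PartialReadSketch db = new PartialReadSketch();
        db.insert(0L, "p=old"); // pre-existing data in an old partition
        db.commit(0L);

        // Step 1: transaction 1 is opened and compiled.
        Set<Long> compileSnapshot = db.snapshot();
        List<String> compilePlan = db.compile(compileSnapshot);

        // Step 2: transaction 2 writes to an old and a new partition, then commits.
        db.insert(2L, "p=old");
        db.insert(2L, "p=new");
        db.commit(2L);

        // Step 3: transaction 3 commits (its target-table write is not modeled here).
        db.commit(3L);

        // Step 4: transaction 1 runs.
        Set<Long> freshSnapshot = db.snapshot();
        long oldSnapshotOldPlan = db.visibleRows(2L, compilePlan, compileSnapshot);
        long newSnapshotOldPlan = db.visibleRows(2L, compilePlan, freshSnapshot);
        long newSnapshotNewPlan = db.visibleRows(2L, db.compile(freshSnapshot), freshSnapshot);
        return new long[] {oldSnapshotOldPlan, newSnapshotOldPlan, newSnapshotNewPlan};
    }

    public static void main(String[] args) {
        long[] r = runScenario();
        System.out.println("old snapshot, old plan: " + r[0]); // 0 of 2 rows
        System.out.println("new snapshot, old plan: " + r[1]); // 1 of 2 rows (partial read)
        System.out.println("new snapshot, new plan: " + r[2]); // 2 of 2 rows
    }
}
```

In this model, only the middle variant sees part of transaction 2 (one row of two), which corresponds to the partial read described above. Regenerating the snapshot and recompiling together gives an all-or-nothing view again.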