[ https://issues.apache.org/jira/browse/HIVE-20664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Saurabh Seth updated HIVE-20664: -------------------------------- Status: In Progress (was: Patch Available) > Potential ArrayIndexOutOfBoundsException in > VectorizedOrcAcidRowBatchReader.findMinMaxKeys > ------------------------------------------------------------------------------------------ > > Key: HIVE-20664 > URL: https://issues.apache.org/jira/browse/HIVE-20664 > Project: Hive > Issue Type: Bug > Components: Transactions > Reporter: Saurabh Seth > Assignee: Saurabh Seth > Priority: Minor > Attachments: HIVE-20664.patch > > > [~ekoifman], could you please confirm if my understanding is correct and if > so, review the fix? > In the method {{VectorizedOrcAcidRowBatchReader.findMinMaxKeys}}, the code > snippet that identifies the first and last stripe indices in the current > split could result in an ArrayIndexOutOfBoundsException if a complete split > is within the same stripe: > {noformat} > for(int i = 0; i < stripes.size(); i++) { > StripeInformation stripe = stripes.get(i); > long stripeEnd = stripe.getOffset() + stripe.getLength(); > if(firstStripeIndex == -1 && stripe.getOffset() >= splitStart) { > firstStripeIndex = i; > } > if(lastStripeIndex == -1 && splitEnd <= stripeEnd && > stripes.get(firstStripeIndex).getOffset() <= stripe.getOffset() ) { > //the last condition is for when both splitStart and splitEnd are in > // the same stripe > lastStripeIndex = i; > } > } > {noformat} > Consider the example where there are 2 stripes - 0-500 and 500-1000 and > splitStart is 600 and splitEnd is 800. > In the first iteration of the loop, stripe.getOffset() is 0 and stripeEnd is > 500. In this iteration, neither of the if statement conditions will be met > and firstSripeIndex as well as lastStripeIndex remain -1. > In the second iteration of the loop stripe.getOffset() is 500, stripeEnd is > 1000, The first if statement condition will not be met in this case because > stripe's offset (500) is not greater than or equal to the splitStart (600). > However, in the second if statement, splitEnd (800) is <= stripeEnd(1000) and > it will try to compute the last condition > {{stripes.get(firstStripeIndex).getOffset() <= stripe.getOffset()}}. This > will throw an ArrayIndexOutOfBoundsException because firstStripeIndex is > still -1. > I'm not sure if this scenario is possible at all, hence logging this as a low > priority issue. Perhaps block based split generation using BISplitStrategy > could trigger this? -- This message was sent by Atlassian JIRA (v7.6.3#76005)