Re: Review Request: HIVE-2127. Improve stats gathering reliability by retries on failures

Ning Zhang Tue, 26 Apr 2011 22:27:41 -0700


> On 2011-04-27 05:14:00, namit jain wrote:
> > One minor comment:
> > 
> > In both statspublisher and statsaggregator, you are executing multiple 
> > statements in the loop (iteration < numTries).
> > It is possible that the first jdbc statement succeeds in the loop, but 
> > subsequent ones fail.
> > It might be good to store a state to denote which ones have succeeded and 
> > not retry them.


In JDBCStatsPublisher/Aggregator, we retry only when we get a 
SQLRecoverableException. Based on the description of 
SQLRecoverableException (attached below), I think this kind of exception 
requires restarting the whole transaction from scratch.


"SQLRecoverableException: The subclass of SQLException thrown in situations 
where a previously failed operation might be able to succeed if the application 
performs some recovery steps and retries the entire transaction or in the case 
of a distributed transaction, the transaction branch. At a minimum, the 
recovery operation must include closing the current connection and getting a 
new connection."
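
A minimal sketch of what retrying the entire transaction could look like. This 
is not the code from the patch: WholeTransactionRetry, ConnectionFactory, 
Transaction and runWithRetries are hypothetical names used only for 
illustration, and the waiting between attempts is left out. Each attempt opens 
a fresh connection and re-runs every statement; only SQLRecoverableException 
triggers another attempt.

```java
import java.sql.SQLException;
import java.sql.SQLRecoverableException;

// Hypothetical helper, not part of Hive: retries a whole transaction on a
// fresh connection whenever a SQLRecoverableException is thrown.
final class WholeTransactionRetry {

    // Opens a new connection for each attempt (hypothetical interface).
    interface ConnectionFactory<C extends AutoCloseable> {
        C open() throws SQLException;
    }

    // All statements of one transaction, executed against one connection.
    interface Transaction<C> {
        void run(C conn) throws SQLException;
    }

    // Returns the number of failed attempts that preceded the success.
    static <C extends AutoCloseable> int runWithRetries(
            ConnectionFactory<C> factory, Transaction<C> tx, int maxRetries)
            throws SQLException {
        for (int failures = 0; ; failures++) {
            C conn = factory.open();   // never reuse the failed connection
            try {
                tx.run(conn);          // re-run every statement from scratch
                return failures;
            } catch (SQLRecoverableException e) {
                if (failures >= maxRetries) {
                    throw e;           // retries exhausted
                }
                // otherwise fall through and start over on a new connection
            } finally {
                try {
                    conn.close();      // close the connection on every path
                } catch (Exception ignored) {
                    // a failure to close must not mask the real outcome
                }
            }
        }
    }
}
```

Any other SQLException propagates immediately, so no per-statement success 
state needs to be kept in this sketch.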


- Ning


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/664/#review570
-----------------------------------------------------------


On 2011-04-25 23:42:52, Ning Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/664/
> -----------------------------------------------------------
> 
> (Updated 2011-04-25 23:42:52)
> 
> 
> Review request for hive.
> 
> 
> Summary
> -------
> 
> The major changes are:
> 
>  0) Two parameters are introduced: hive.stats.retries.max (default 0), the 
> maximum number of retries on SQLException failures, and 
> hive.stats.retries.wait (default 3 sec), the base time window (explained 
> below) to wait before the next retry. 
> 
>  1) introduced a couple of Utilities functions to execute SQL queries with 
> retries on failures. One Utilities function determines the wait time 
> based on the number of failures and a base wait window (same as the one 
> introduced in HDFS-767 for DFSClient to retry on BlockMissingExceptions). The 
> actual wait time is determined by baseWindow * failures + baseWindow * 
> (failures + 1) * (random number between [0.0,1.0]).
> 
>  2) changed the JDBCStatsAggregator.java to use PreparedStatement to be able 
> to use executeWithRetries(). 
> 
>  3) changed JDBCStatsPublisher.java and JDBCStatsAggregator.java to use 
> retries on SQL connections and SQL executions. 
> 
> 
> Diffs
> -----
> 
>   trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1096632 
>   trunk/conf/hive-default.xml 1096632 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1096632 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsAggregator.java 1096632 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsPublisher.java 1096632 
> 
> Diff: https://reviews.apache.org/r/664/diff
> 
> 
> Testing
> -------
> 
> Running unit tests. 
> 
> 
> Thanks,
> 
> Ning
> 
>
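
The wait-time formula in item 1) of the quoted summary can be sketched as 
follows. This is an illustration, not the Utilities code from the patch: 
RetryBackoff and waitTimeMs are hypothetical names, and the window is assumed 
to be in milliseconds. Note that java.util.Random.nextDouble() returns a value 
in [0.0, 1.0), so the upper bound is exclusive here.

```java
import java.util.Random;

// Hypothetical helper, not part of Hive: computes a randomized wait time
// that grows with the number of failures seen so far.
final class RetryBackoff {

    // wait = baseWindow * failures + baseWindow * (failures + 1) * rand,
    // where rand is drawn uniformly from [0.0, 1.0).
    static long waitTimeMs(long baseWindowMs, int failures, Random random) {
        return (long) (baseWindowMs * failures
                + baseWindowMs * (failures + 1) * random.nextDouble());
    }
}
```

With a base window of 3 sec, the wait after the first failure falls between 
3 and 9 sec, and after the second failure between 6 and 15 sec. The random 
part keeps concurrent tasks from retrying at the same moment.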
