[ 
https://issues.apache.org/jira/browse/HIVE-7615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141866#comment-14141866
 ] 

Thejas M Nair commented on HIVE-7615:
-------------------------------------

Thanks for the new patch and pointing out the issue with just having a single 
isRunning boolean.

I have some more comments/thoughts -
# I think we should avoid throwing exceptions in the normal code path, as Brock 
pointed out (for example, getQueryLog throwing an exception when stmtHandle is 
not yet initialized).
# getQueryLog should throw an exception when the Statement is cancelled or 
closed. I think this is the state that needs to be captured (cancelled/closed 
vs pre-initialization). For the getQueryLog API, it does not matter whether 
the query succeeded or failed.
# It would be useful to have a way to determine that no more logs are going to 
be written. In the current implementation, once the execute call returns, the 
execution is over and all logs have been written. The user can stop making 
calls at that point.
# The current code does not guarantee that the last few lines of logs (which 
indicate success) will be picked up. The logging thread could be sleeping 
while the query completes, and the interrupt might arrive before it is able to 
make another getQueryLog call. This can be confusing to a beeline user.
# HiveQueryResultSet does not lock calls to the client using the transportLock. 
This means that getQueryLog and HiveQueryResultSet.next might end up using the 
client object at the same time, causing problems.
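
To illustrate the last point, here is a minimal sketch of sharing one lock between the statement and its result set so that the two never use the client at the same time. FakeClient, StatementSketch and ResultSetSketch are hypothetical stand-ins, not the actual Hive classes, and the request strings are made up for the example.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the shared, non-thread-safe client object.
class FakeClient {
  private final AtomicInteger inFlight = new AtomicInteger();
  final AtomicBoolean overlapped = new AtomicBoolean(false);

  String call(String request) {
    // Record whether two callers were ever inside the client at once.
    if (inFlight.incrementAndGet() > 1) {
      overlapped.set(true);
    }
    try {
      return "reply:" + request;
    } finally {
      inFlight.decrementAndGet();
    }
  }
}

// Stand-in for the statement: owns the lock and hands it to its result set.
class StatementSketch {
  final ReentrantLock transportLock = new ReentrantLock();
  final FakeClient client = new FakeClient();

  String getQueryLog() {
    transportLock.lock();
    try {
      return client.call("FetchLog");
    } finally {
      transportLock.unlock();
    }
  }

  ResultSetSketch resultSet() {
    return new ResultSetSketch(client, transportLock);
  }
}

// Stand-in for the result set: takes the same lock before touching the client.
class ResultSetSketch {
  private final FakeClient client;
  private final ReentrantLock transportLock;

  ResultSetSketch(FakeClient client, ReentrantLock transportLock) {
    this.client = client;
    this.transportLock = transportLock;
  }

  String next() {
    transportLock.lock();
    try {
      return client.call("FetchResults");
    } finally {
      transportLock.unlock();
    }
  }
}
```

The important detail is that both classes lock the same ReentrantLock instance; a separate lock per class would not prevent the overlap.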

{code}
  /**
   * This method is a public API for usage outside of Hive, although it is not
   * part of the interface java.sql.Statement.
   *
   * @return true if query execution might be producing more logs. It does not
   *         indicate whether the last log lines have been fetched by getQueryLog.
   * @throws ClosedOrCancelledStatement if statement has been cancelled or closed
   */
  boolean hasMoreLogs() throws ClosedOrCancelledStatement;

  /**
   * Get the execution logs of the given SQL statement.
   * This method is a public API for usage outside of Hive, although it is not
   * part of the interface java.sql.Statement.
   *
   * @param incremental if true, fetch logs incrementally; if false, fetch them
   *                    from the beginning
   * @param fetchSize the number of lines to fetch
   * @return a list of log messages. It can be empty if there are no new logs
   *         to be retrieved at that time.
   * @throws ClosedOrCancelledStatement if statement has been cancelled or closed
   * @throws SQLException
   */
  public List<String> getQueryLog(boolean incremental, int fetchSize)
      throws ClosedOrCancelledStatement, SQLException;

{code}

The code for retrieving the logs can simply be -
{code}
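Thread logThread = new Thread(new Runnable() {
  public void run() {
    try {
      while (stmt.hasMoreLogs()) {
        printProgress(stmt.getQueryLog(true, 50));
        Thread.sleep(1000);
      }
    } catch (InterruptedException e) {
      // interrupted by the main thread; stop polling
      return;
    } catch (Exception e) {
      // ClosedOrCancelledStatement or SQLException; stop polling
      return;
    }
  }
});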
{code}

The mainline code using JDBC can be along the lines of -
{code}
stmt.execute();
// get results
// before closing statement, interrupt the thread.
logThread.interrupt();
// get any last lines of log synchronously before closing statement
stmt.close(); // or resultSet.close()
{code}
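
The "get any last lines of log synchronously" step could look roughly like the sketch below. It relies on the point above that all logs have been written once execute returns, so an empty incremental fetch means there is nothing left. QueryLogSource and LogDrain are hypothetical names used only for this illustration; QueryLogSource stands in for the statement's proposed getQueryLog method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the statement's getQueryLog(incremental, fetchSize).
interface QueryLogSource {
  List<String> getQueryLog(boolean incremental, int fetchSize);
}

final class LogDrain {
  private LogDrain() {}

  // Stop the polling thread and wait until it has really exited, so that it
  // cannot race with the final synchronous fetch.
  static void stopLogThread(Thread logThread) throws InterruptedException {
    logThread.interrupt();
    logThread.join();
  }

  // Fetch incrementally until an empty batch comes back. This is only safe
  // after execute has returned, when no new log lines can appear.
  static List<String> drainRemaining(QueryLogSource stmt, int fetchSize) {
    List<String> all = new ArrayList<>();
    List<String> batch;
    do {
      batch = stmt.getQueryLog(true, fetchSize);
      all.addAll(batch);
    } while (!batch.isEmpty());
    return all;
  }
}
```

In the mainline code, stopLogThread would be called after the results are read, followed by drainRemaining and then stmt.close().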

volatile boolean closedStmt = false; // set this to true when HiveStatement.closeClientOperation is called
volatile boolean isLogBeingGenerated = true; // set this to false when the "while (!operationComplete) {" loop is complete, or an exception is thrown in that loop

In getQueryLog, the existing check for stmtHandle == null can be changed to 
throw ClosedOrCancelledStatement only if stmtHandle == null && closedStmt.

hasMoreLogs can return the value of isLogBeingGenerated.
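
Put together, the state handling might look something like the following. This is only a sketch under assumptions: LogStateSketch is a stand-in for HiveStatement, the in-memory serverLog list replaces the real fetch from the server, ClosedOrCancelledStatement is the exception type proposed above (assumed here to extend SQLException), and the hook methods onExecuteStarted, onExecuteFinished and appendServerLog are made up for the illustration.

```java
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Proposed exception type; assumed here to be a subclass of SQLException.
class ClosedOrCancelledStatement extends SQLException {
  ClosedOrCancelledStatement(String message) {
    super(message);
  }
}

// Hypothetical stand-in for HiveStatement, modelling only the log-related state.
class LogStateSketch {
  private volatile boolean closedStmt = false;
  private volatile boolean isLogBeingGenerated = true;
  private volatile Object stmtHandle = null;  // stands in for the operation handle

  private final List<String> serverLog = new ArrayList<>();  // fake server-side log
  private int cursor = 0;

  void onExecuteStarted() { stmtHandle = new Object(); }

  // Call when the "while (!operationComplete)" loop ends or throws.
  void onExecuteFinished() { isLogBeingGenerated = false; }

  void appendServerLog(String line) {
    synchronized (serverLog) { serverLog.add(line); }
  }

  void closeClientOperation() {
    closedStmt = true;
    stmtHandle = null;
  }

  boolean hasMoreLogs() throws ClosedOrCancelledStatement {
    if (closedStmt) {
      throw new ClosedOrCancelledStatement("statement is closed or cancelled");
    }
    return isLogBeingGenerated;
  }

  List<String> getQueryLog(boolean incremental, int fetchSize) throws SQLException {
    if (stmtHandle == null) {
      if (closedStmt) {
        throw new ClosedOrCancelledStatement("statement is closed or cancelled");
      }
      // Not initialized yet: normal code path, so return no lines instead of throwing.
      return new ArrayList<>();
    }
    synchronized (serverLog) {
      int start = incremental ? cursor : 0;
      int end = Math.min(serverLog.size(), start + fetchSize);
      cursor = end;
      return new ArrayList<>(serverLog.subList(start, end));
    }
  }
}
```

With this shape, a caller polling before execute has set up the handle simply gets an empty list, and an exception is reserved for the closed or cancelled case.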

Sorry about the large number of comments. Creating a public API is a big 
commitment, and I am just trying to make sure it is possible to stay 
committed! Thanks for all the work you have done.



> Beeline should have an option for user to see the query progress
> ----------------------------------------------------------------
>
>                 Key: HIVE-7615
>                 URL: https://issues.apache.org/jira/browse/HIVE-7615
>             Project: Hive
>          Issue Type: Improvement
>          Components: CLI
>            Reporter: Dong Chen
>            Assignee: Dong Chen
>         Attachments: HIVE-7615.1.patch, HIVE-7615.2.patch, HIVE-7615.patch, 
> complete_logs, simple_logs
>
>
> When executing a query in Beeline, the user should have an option to see the 
> progress through the outputs.
> Beeline could use the API introduced in HIVE-4629 to get and display the logs 
> to the client.


