[jira] [Work logged] (HIVE-24230) Integrate HPL/SQL into HiveServer2

ASF GitHub Bot (Jira) Thu, 12 Nov 2020 07:14:26 -0800


     [ https://issues.apache.org/jira/browse/HIVE-24230?focusedWorklogId=510841&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-510841 ]


ASF GitHub Bot logged work on HIVE-24230:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Nov/20 15:12
            Start Date: 12/Nov/20 15:12
    Worklog Time Spent: 10m 
      Work Description: zeroflag commented on a change in pull request #1633:
URL: https://github.com/apache/hive/pull/1633#discussion_r522180727



##########
File path: service/src/java/org/apache/hive/service/cli/operation/hplsql/HplSqlQueryExecutor.java
##########
@@ -0,0 +1,149 @@
+/*
+ *
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ *
+ */
+
+package org.apache.hive.service.cli.operation.hplsql;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+
+import org.antlr.v4.runtime.ParserRuleContext;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hive.hplsql.executor.ColumnMeta;
+import org.apache.hive.hplsql.executor.Metadata;
+import org.apache.hive.hplsql.executor.QueryException;
+import org.apache.hive.hplsql.executor.QueryExecutor;
+import org.apache.hive.hplsql.executor.QueryResult;
+import org.apache.hive.hplsql.executor.RowResult;
+import org.apache.hive.service.cli.ColumnDescriptor;
+import org.apache.hive.service.cli.FetchOrientation;
+import org.apache.hive.service.cli.FetchType;
+import org.apache.hive.service.cli.HiveSQLException;
+import org.apache.hive.service.cli.OperationHandle;
+import org.apache.hive.service.cli.RowSet;
+import org.apache.hive.service.cli.TableSchema;
+import org.apache.hive.service.cli.session.HiveSession;
+
+/**
+ * Executing HiveQL from HPL/SQL directly, without JDBC or Thrift.
+ */
+public class HplSqlQueryExecutor implements QueryExecutor {
+  public static final String QUERY_EXECUTOR = "QUERY_EXECUTOR";
+  public static final String HPLSQL = "HPLSQL";
+  private final HiveSession hiveSession;
+  private long fetchSize;
+
+  public HplSqlQueryExecutor(HiveSession hiveSession) {
+    this.fetchSize = hiveSession.getHiveConf().getIntVar(HiveConf.ConfVars.HIVE_SERVER2_THRIFT_RESULTSET_DEFAULT_FETCH_SIZE);
+    this.hiveSession = hiveSession;
+  }
+
+  @Override
+  public QueryResult executeQuery(String sql, ParserRuleContext ctx) {
+    try {
+      Map<String, String> confOverlay = new HashMap<>();
+      confOverlay.put(QUERY_EXECUTOR, HPLSQL);
+      OperationHandle operationHandle = hiveSession.executeStatement(sql, confOverlay);
+      return new QueryResult(new OperationRowResult(operationHandle), () -> metadata(operationHandle), null);
+    } catch (HiveSQLException e) {
+      return new QueryResult(null, () -> new Metadata(Collections.emptyList()), e);
+    }
+  }
+
+  public Metadata metadata(OperationHandle operationHandle) {
+    try {
+      TableSchema meta = hiveSession.getResultSetMetadata(operationHandle);
+      List<ColumnMeta> colMeta = new ArrayList<>();
+      for (int i = 0; i < meta.getSize(); i++) {
+        ColumnDescriptor col = meta.getColumnDescriptorAt(i);
+        colMeta.add(new ColumnMeta(col.getName(), col.getTypeName(), col.getType().toJavaSQLType()));
+      }
+      return new Metadata(colMeta);
+    } catch (HiveSQLException e) {
+      throw new QueryException(e);
+    }
+  }
+
+  private class OperationRowResult implements RowResult {
+    private final OperationHandle handle;
+    private RowSet rows;
+    private Iterator<Object[]> iterator;
+    private Object[] current;
+
+    private OperationRowResult(OperationHandle operationHandle) {
+      this.handle = operationHandle;
+    }
+
+    @Override
+    public boolean next() {
+      if (rows == null) {
+        this.rows = fetch();
+        this.iterator = rows.iterator();
+      }
+      if (iterator.hasNext()) {
+        current = iterator.next();
+        return true;
+      } else {
+        current = null;
+        return false;
+      }
+    }
+
+    private RowSet fetch() {
+      try {
+        return hiveSession.fetchResults(
+                handle, FetchOrientation.FETCH_NEXT, fetchSize, FetchType.QUERY_OUTPUT);

Review comment:
       Right. I think I fixed it by adding an extra condition.
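
For context, the quoted next() fetches a batch only once, when rows is null. The comment does not say what the extra condition is, so the following is not the actual fix. It is only a minimal, self-contained sketch of batched row iteration that refills when the current batch is consumed and stops on an empty batch. BatchedRows and its Supplier-based fetcher are hypothetical names for illustration; they are not part of Hive or HPL/SQL.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a row cursor that pulls rows in batches.
class BatchedRows {
  // Assumed contract: returns the next batch, or an empty list when no rows remain.
  private final Supplier<List<Object[]>> fetcher;
  private Iterator<Object[]> iterator = Collections.emptyIterator();
  private Object[] current;
  private boolean exhausted;

  BatchedRows(Supplier<List<Object[]>> fetcher) {
    this.fetcher = fetcher;
  }

  boolean next() {
    // Refill whenever the current batch is used up, not only on the first call.
    while (!exhausted && !iterator.hasNext()) {
      List<Object[]> batch = fetcher.get();
      if (batch.isEmpty()) {
        exhausted = true;
      } else {
        iterator = batch.iterator();
      }
    }
    if (iterator.hasNext()) {
      current = iterator.next();
      return true;
    }
    current = null;
    return false;
  }

  Object[] current() {
    return current;
  }
}
```

With a fetcher that hands out two batches of sizes 2 and 1, next() returns true three times and false afterwards.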




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 510841)
    Time Spent: 1h 20m  (was: 1h 10m)

> Integrate HPL/SQL into HiveServer2
> ----------------------------------
>
>                 Key: HIVE-24230
>                 URL: https://issues.apache.org/jira/browse/HIVE-24230
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2, hpl/sql
>            Reporter: Attila Magyar
>            Assignee: Attila Magyar
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> HPL/SQL is a standalone command line program that can store and load scripts 
> from text files, or from Hive Metastore (since HIVE-24217). Currently HPL/SQL 
> depends on Hive and not the other way around.
> Changing the dependency order between HPL/SQL and HiveServer would open up 
> some possibilities which are currently not feasible to implement. For example, 
> one might want to use a third-party SQL tool to run selects on stored 
> procedure (or rather function in this case) outputs.
> {code:java}
> SELECT * from myStoredProcedure(1, 2); {code}
> HPL/SQL doesn’t have a JDBC interface and it’s not a daemon so this would not 
> work with the current architecture.
> Another important factor is performance. Declarative SQL commands are sent to 
> Hive via JDBC by HPL/SQL. The integration would make it possible to drop JDBC 
> and use HiveServer's internal API for compilation and execution.
> The third factor is that existing tools like Beeline or Hue cannot be used 
> with HPL/SQL since it has its own, separate CLI.
>  
> To make it easier to implement, we keep things separated internally at 
> first, by introducing a Hive session-level JDBC parameter.
> {code:java}
> jdbc:hive2://localhost:10000/default;hplsqlMode=true {code}
>  
> The hplsqlMode indicates that we are in procedural SQL mode where the user 
> can create and call stored procedures. HPLSQL allows you to write any kind of 
> procedural statement at the top level. This patch doesn't limit this but it 
> might be better to eventually restrict what statements are allowed outside of 
> stored procedures.
>  
> Since HPLSQL and Hive are running in the same process there is no need to use 
> the JDBC driver between them. The patch adds an abstraction with 2 different 
> implementations, one for executing queries on JDBC (for keeping the existing 
> behaviour) and another one for directly calling Hive's compiler. In HPLSQL 
> mode the latter is used.
> Internally, a new operation (HplSqlOperation) and operation type 
> (PROCEDURAL_SQL) were added. The operation works similarly to SQLOperation, 
> but it uses the hplsql interpreter to execute arbitrary scripts. This 
> operation might spawn new SQLOperations.
> For example consider the following statement:
> {code:java}
> FOR i in 1..10 LOOP   
>   SELECT * FROM table 
> END LOOP;{code}
> We send this to beeline while we're in hplsql mode. Hive will create an hplsql 
> interpreter and store it in the session state. A new HplSqlOperation is 
> created to run the script on the interpreter.
> HPLSQL knows how to execute the for loop, but it'll call Hive to run the 
> select expression. The HplSqlOperation is notified when the select reads a 
> row and accumulates the rows into a RowSet (memory consumption needs to be 
> considered here) which can be retrieved via thrift from the client side.
>  
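
The "abstraction with 2 different implementations" described above can be pictured roughly as follows. This is a simplified sketch, not the patch's code: SketchQueryExecutor, JdbcSketchExecutor, DirectSketchExecutor and SketchExecutors are hypothetical names, rows are plain strings, and the in-process session is modelled as a Function so the example runs without Hive on the classpath.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical, simplified stand-in for the executor abstraction.
interface SketchQueryExecutor {
  List<String> executeQuery(String sql);
}

// Stand-in for the JDBC route that keeps the existing standalone behaviour.
class JdbcSketchExecutor implements SketchQueryExecutor {
  @Override
  public List<String> executeQuery(String sql) {
    // A real implementation would use a java.sql.Connection; here we only tag the route.
    return Collections.singletonList("jdbc:" + sql);
  }
}

// Stand-in for the in-process route used in hplsql mode.
class DirectSketchExecutor implements SketchQueryExecutor {
  // Assumption: the session is modelled as a function from SQL text to rows.
  private final Function<String, List<String>> session;

  DirectSketchExecutor(Function<String, List<String>> session) {
    this.session = session;
  }

  @Override
  public List<String> executeQuery(String sql) {
    return session.apply(sql);
  }
}

final class SketchExecutors {
  private SketchExecutors() {
  }

  // Picks the implementation based on the session-level mode flag.
  static SketchQueryExecutor forMode(boolean hplsqlMode,
      Function<String, List<String>> session) {
    return hplsqlMode ? new DirectSketchExecutor(session) : new JdbcSketchExecutor();
  }
}
```

The interpreter would depend only on the interface, so the same script code could run standalone over JDBC or inside HiveServer2 without going through the driver.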



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
