[ https://issues.apache.org/jira/browse/HIVE-14876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15551259#comment-15551259 ]
Vaibhav Gumashta commented on HIVE-14876:
-----------------------------------------

Following are the details of the RPC fetch path from the JDBC driver to HS2, and also a clarification of the confusion over {{hive.server2.thrift.resultset.max.fetch.size}}:

When we create a new connection, we use a default value for the fetch size if it is not specified in the connection string by the end user. In {{HiveConnection}}:
{code}
private int fetchSize = HiveStatement.DEFAULT_FETCH_SIZE;
{code}
If, however, a user specifies the fetch size in the connection string, e.g. {{jdbc:hive2://localhost:10000/default;fetchSize=10000}}, we override the default value with the user-supplied value. In {{HiveConnection}}:
{code}
if (sessConfMap.containsKey(JdbcConnectionParams.FETCH_SIZE)) {
  fetchSize = Integer.parseInt(sessConfMap.get(JdbcConnectionParams.FETCH_SIZE));
}
{code}
When we run {{HiveStatement.execute}}, we set the fetch size on the result set. In {{HiveStatement.execute}}:
{code}
resultSet = new HiveQueryResultSet.Builder(this).setClient(client).setSessionHandle(sessHandle)
    .setStmtHandle(stmtHandle).setMaxRows(maxRows).setFetchSize(fetchSize)
    .setScrollable(isScrollableResultset)
    .build();
{code}
Finally, when we issue a fetch RPC request, we send this value as part of the request. In {{HiveQueryResultSet.next}}:
{code}
TFetchResultsReq fetchReq = new TFetchResultsReq(stmtHandle, orientation, fetchSize);
{code}
On the server side, the fetch request hits {{ThriftCLIService.FetchResults}}:
{code}
RowSet rowSet = cliService.fetchResults(
    new OperationHandle(req.getOperationHandle()),
    FetchOrientation.getFetchOrientation(req.getOrientation()),
    req.getMaxRows(),
    FetchType.getFetchType(req.getFetchType()));
{code}
The request eventually reaches {{SQLOperation.getNextRowSet}}, which receives the fetch size specified in the RPC as its parameter.

Apologies for the confusion regarding {{hive.server2.thrift.resultset.max.fetch.size}}: that config is only used when the ThriftJDBCSerde is used to write result sets in tasks, to decide how many rows to serialize into a blob. I have created a JIRA to resolve the confusion and will have a patch out soon: HIVE-14901. Meanwhile, to increase the default fetch size for the code path that does not use the ThriftJDBCSerde, we should bump the value of {{HiveStatement.DEFAULT_FETCH_SIZE}} on the driver side.

cc [~ziyangz]: you might want to follow the discussion here.

> make the number of rows to fetch from various HS2 clients/servers configurable
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-14876
>                 URL: https://issues.apache.org/jira/browse/HIVE-14876
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-14876.patch
>
>
> Right now, it's hardcoded to a variety of values
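For illustration, here is a minimal client-side sketch of the path described in the comment above, assuming a HiveServer2 instance at {{localhost:10000}} with no authentication; the table name and credentials are placeholders, and the per-statement {{setFetchSize}} call is the standard JDBC hint, which the driver is assumed to honor:
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FetchSizeExample {
  public static void main(String[] args) throws Exception {
    // Register the Hive JDBC driver.
    Class.forName("org.apache.hive.jdbc.HiveDriver");

    // fetchSize=10000 in the URL overrides HiveStatement.DEFAULT_FETCH_SIZE
    // for statements created on this connection.
    String url = "jdbc:hive2://localhost:10000/default;fetchSize=10000";

    try (Connection conn = DriverManager.getConnection(url, "hive", "");
         Statement stmt = conn.createStatement()) {

      // Standard JDBC hint; assumed to adjust the fetch size for this statement only.
      stmt.setFetchSize(5000);

      try (ResultSet rs = stmt.executeQuery("SELECT * FROM some_table")) {
        while (rs.next()) {
          // When the locally buffered rows are exhausted, the driver issues another
          // TFetchResultsReq, carrying the fetch size as the request's maxRows field,
          // so each fetch RPC returns at most fetchSize rows from HS2.
        }
      }
    }
  }
}
{code}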