[ https://issues.apache.org/jira/browse/HIVE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bennie Schut updated HIVE-1815:
-------------------------------

    Attachment: HIVE-1815.1.patch.txt

This is the simplest implementation I could do. I just changed fetchOne to fetchN and return one row from the fetched list on each next() call until the list is empty, then do another fetchN.
We've used this for a week and the performance increase on large result sets is significant.
You could also do the fetchN on a different thread to keep the queue full, but that's a bit more work for just a little more gain.
I've added one small test that calls setFetchSize and getFetchSize; the existing JDBC tests should all work as they did before, since the functionality doesn't change.

> The class HiveResultSet should implement batch fetching.
> --------------------------------------------------------
>
>                 Key: HIVE-1815
>                 URL: https://issues.apache.org/jira/browse/HIVE-1815
>             Project: Hive
>          Issue Type: Improvement
>          Components: JDBC
>    Affects Versions: 0.5.0
>         Environment: Custom Java application using the Hive JDBC driver to
> connect to a Hive server, execute a Hive query and process the results.
>            Reporter: Guy le Mar
>         Attachments: HIVE-1815.1.patch.txt
>
>
> When using the Hive JDBC driver, you can execute a Hive query and obtain a
> HiveResultSet instance that contains the results of the query.
> Unfortunately, HiveResultSet can then only fetch a single row of these
> results from the Hive server at a time. As a consequence, it's extremely slow
> to fetch a result set of anything other than a trivial size.
> It would be nice for the HiveResultSet to be able to fetch N rows from the
> server at a time, so that performance is suitable to support applications
> that provide human interaction.
> (From memory, I think it took me around 20 minutes to fetch 4000 rows.)
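
For readers without the patch at hand, the buffering approach described in the comment can be sketched roughly as follows. This is a minimal illustration and not the code in HIVE-1815.1.patch.txt: RowSource, BatchingCursor and CountingSource are hypothetical names, RowSource.fetchN only stands in for whatever call the driver makes to the server, and rows are plain strings. It also assumes an empty batch signals the end of the results and that the fetch size must be positive.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-in for the server client; not the real Hive interface. */
interface RowSource {
    /** Returns up to maxRows rows; an empty list means no rows are left (assumption). */
    List<String> fetchN(int maxRows);
}

/** Buffers rows on the client so that one round trip serves many next() calls. */
class BatchingCursor {
    private final RowSource source;
    private final Deque<String> buffer = new ArrayDeque<>();
    private int fetchSize;
    private boolean exhausted = false;
    private String current;

    BatchingCursor(RowSource source, int fetchSize) {
        this.source = source;
        setFetchSize(fetchSize);
    }

    void setFetchSize(int rows) {
        // Simplification for this sketch: only positive sizes are accepted.
        if (rows <= 0) {
            throw new IllegalArgumentException("fetch size must be positive");
        }
        this.fetchSize = rows;
    }

    int getFetchSize() {
        return fetchSize;
    }

    /** Advances to the next row, asking the source for more only when the buffer is empty. */
    boolean next() {
        if (buffer.isEmpty() && !exhausted) {
            List<String> batch = source.fetchN(fetchSize);
            if (batch.isEmpty()) {
                exhausted = true;
            } else {
                buffer.addAll(batch);
            }
        }
        if (buffer.isEmpty()) {
            current = null;
            return false;
        }
        current = buffer.poll();
        return true;
    }

    String getRow() {
        return current;
    }
}

/** In-memory source that counts round trips; for demonstration only. */
class CountingSource implements RowSource {
    private final int totalRows;
    private int served = 0;
    int roundTrips = 0;

    CountingSource(int totalRows) {
        this.totalRows = totalRows;
    }

    @Override
    public List<String> fetchN(int maxRows) {
        roundTrips++;
        List<String> batch = new ArrayList<>();
        while (served < totalRows && batch.size() < maxRows) {
            batch.add("row-" + served++);
        }
        return batch;
    }
}
```

With this sketch, reading 4000 rows at a fetch size of 50 costs 81 calls to the source (80 full batches plus one empty batch that marks the end), whereas a fetch size of 1 costs 4001. The background-thread variant mentioned in the comment would replace the synchronous refill in next() with a queue filled by a producer thread; it is left out here to keep the example small.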