[ https://issues.apache.org/jira/browse/HIVE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809117#comment-13809117 ]
Remus Rusanu commented on HIVE-5692:
------------------------------------

The implementation is much more aggressive now:
- shouldFlush tests the in-use vs. max at each batch boundary, not only at the checking limit. The checking limit is only used to decide when to probe/adjust the average variable row size.
- the flush is called in a while loop until shouldFlush returns false, i.e. it flushes as much as necessary to stay within the prescribed bounds. Progress is monitored to prevent an infinite loop.
- the checking limit is configured via HiveConf hive.vectorized.groupby.checkinterval
- the flushing percent is configured via HiveConf hive.vectorized.groupby.flush.percent

> Make VectorGroupByOperator parameters configurable
> --------------------------------------------------
>
>                 Key: HIVE-5692
>                 URL: https://issues.apache.org/jira/browse/HIVE-5692
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Remus Rusanu
>            Assignee: Remus Rusanu
>            Priority: Minor
>         Attachments: HIVE-5692.1.patch, HIVE-5692.2.patch
>
>
> The FLUSH_CHECK_THRESHOLD and PERCENT_ENTRIES_TO_FLUSH should be configurable.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
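
For readers who do not have the patch at hand, the flush behaviour described in the comment above can be illustrated with a minimal sketch. This is not the code from HIVE-5692.2.patch: the class GroupByFlushSketch, its fields and its methods are hypothetical, the bound is counted in entries rather than in bytes, and the checking interval (hive.vectorized.groupby.checkinterval) is left out because it only controls when the average row size is re-estimated.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the operator's aggregation hash table.
class GroupByFlushSketch {
    private final Map<String, Long> aggregates = new LinkedHashMap<>();
    private final int maxEntries;      // assumed bound, counted in entries for simplicity
    private final double flushPercent; // plays the role of hive.vectorized.groupby.flush.percent
    private long flushedEntries = 0;

    GroupByFlushSketch(int maxEntries, double flushPercent) {
        this.maxEntries = maxEntries;
        this.flushPercent = flushPercent;
    }

    // Aggregate one batch of keys, then enforce the bound at the batch boundary.
    void processBatch(String[] keys) {
        for (String key : keys) {
            aggregates.merge(key, 1L, Long::sum);
        }
        // Flush repeatedly until the table is back within bounds.
        while (shouldFlush()) {
            int before = aggregates.size();
            flush();
            if (aggregates.size() >= before) {
                break; // defensive guard: no progress, so stop instead of looping forever
            }
        }
    }

    // In-use vs. max test, evaluated after every batch.
    boolean shouldFlush() {
        return aggregates.size() > maxEntries;
    }

    // Evict roughly flushPercent of the current entries, and at least one.
    private void flush() {
        int toFlush = Math.max(1, (int) (aggregates.size() * flushPercent));
        Iterator<Map.Entry<String, Long>> it = aggregates.entrySet().iterator();
        while (toFlush > 0 && it.hasNext()) {
            it.next(); // a real operator would forward the partial aggregate downstream here
            it.remove();
            toFlush--;
            flushedEntries++;
        }
    }

    int inUse() { return aggregates.size(); }

    long flushed() { return flushedEntries; }

    public static void main(String[] args) {
        GroupByFlushSketch sketch = new GroupByFlushSketch(10, 0.1);
        String[] batch = new String[25];
        for (int i = 0; i < batch.length; i++) {
            batch[i] = "key" + i;
        }
        sketch.processBatch(batch);
        System.out.println("in use: " + sketch.inUse() + ", flushed: " + sketch.flushed());
    }
}
```

In this sketch a single flush pass removes only a fraction of the entries, so the surrounding while loop is what brings the table back under the limit after a large batch.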