[ https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363001#comment-15363001 ]

Sahil Takiar commented on HIVE-7224:
------------------------------------

[~vgumashta] it seems the behavior you are seeing is by design. The explanation of the {{--incremental}} property at https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineCommandOptions suggests that this is expected:

{quote}
Defaults to false. When set to false, the entire result set is fetched and buffered before being displayed, yielding optimal display column sizing. When set to true, result rows are displayed immediately as they are fetched, yielding lower latency and memory usage at the price of extra display column padding. Setting --incremental=true is recommended if you encounter an OutOfMemory on the client side (due to the fetched result set size being large).
{quote}

So there is a tradeoff when using {{--incremental}}: the column padding won't be optimal, but memory usage will be better. This makes sense, since the {{IncrementalRows}} class that controls this logic doesn't buffer any rows. It only looks at one row at a time, so it cannot predict what the optimal column width should be.

I think a better approach would be for the {{IncrementalRows}} class to buffer 1000 rows at a time (as a default; the value can be configurable), so that it can set the column width optimally for each set of 1000 rows. This shouldn't introduce memory issues unless each row is huge, in which case the user can decrease the buffer size to, say, 100 or 10. What do you think?
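
To make the buffering proposal concrete, here is a minimal sketch of the idea in plain Java. It is not the actual {{IncrementalRows}} code: the class {{BufferedRowPrinter}}, its methods, and the {{" | "}} separator are all hypothetical names chosen for illustration, and rows are assumed to be non-null {{String[]}} values of equal length.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/**
 * Hypothetical sketch (not Beeline's IncrementalRows): buffers up to
 * bufferSize rows at a time and sizes the columns for each buffered chunk.
 */
class BufferedRowPrinter {

    /** Returns the widest cell length per column within one chunk. */
    static int[] columnWidths(List<String[]> chunk, int columnCount) {
        int[] widths = new int[columnCount];
        for (String[] row : chunk) {
            for (int i = 0; i < columnCount; i++) {
                widths[i] = Math.max(widths[i], row[i].length());
            }
        }
        return widths;
    }

    /** Right-pads each cell with spaces to its column width. */
    static String formatRow(String[] row, int[] widths) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < row.length; i++) {
            if (i > 0) {
                sb.append(" | ");
            }
            sb.append(row[i]);
            for (int pad = row[i].length(); pad < widths[i]; pad++) {
                sb.append(' ');
            }
        }
        return sb.toString();
    }

    /** Formats rows chunk by chunk; at most bufferSize rows are held at once. */
    static List<String> render(Iterator<String[]> rows, int columnCount, int bufferSize) {
        if (bufferSize < 1) {
            throw new IllegalArgumentException("bufferSize must be at least 1");
        }
        List<String> out = new ArrayList<>();
        List<String[]> chunk = new ArrayList<>(bufferSize);
        while (rows.hasNext()) {
            chunk.add(rows.next());
            // Flush when the buffer is full or the input is exhausted.
            if (chunk.size() == bufferSize || !rows.hasNext()) {
                int[] widths = columnWidths(chunk, columnCount);
                for (String[] row : chunk) {
                    out.add(formatRow(row, widths));
                }
                chunk.clear();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"1", "a"},
            new String[] {"22", "bbb"},
            new String[] {"333333", "c"});
        // A buffer size of 2 is used here only to keep the demo small.
        for (String line : render(rows.iterator(), 2, 2)) {
            System.out.println(line);
        }
    }
}
```

With a buffer size of 2, the first two rows share one set of column widths and the third row is sized on its own, so widths can still shift between chunks. Memory use is bounded by the buffer size times the row size, which is why a smaller buffer helps when rows are very large.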

> Set incremental printing to true by default in Beeline
> ------------------------------------------------------
>
>                 Key: HIVE-7224
>                 URL: https://issues.apache.org/jira/browse/HIVE-7224
>             Project: Hive
>          Issue Type: Bug
>          Components: Beeline, Clients, JDBC
>    Affects Versions: 0.13.0, 1.0.0, 1.2.0, 1.1.0
>            Reporter: Vaibhav Gumashta
>            Assignee: Sahil Takiar
>         Attachments: HIVE-7224.1.patch, HIVE-7224.2.patch, HIVE-7224.2.patch,
>                      HIVE-7224.3.patch
>
>
> See HIVE-7221.
> By default beeline tries to buffer the entire output relation before printing
> it on stdout. This can cause OOM when the output relation is large. However,
> beeline has the option of incremental prints. We should keep that as the
> default.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)