The ORC format is transparent to the CBO. We are currently working on a new cost model that might reflect ORC's performance advantages in optimization decisions.
Thanks
John

From: Mich Talebzadeh <m...@peridale.co.uk>
Reply-To: user@hive.apache.org
Date: Sunday, April 19, 2015 at 12:32 PM
To: user@hive.apache.org
Subject: Orc file and Hive Optimiser

My understanding is that the Optimized Row Columnar (ORC) file format provides a highly efficient way to store Hive data.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC

In a nutshell, columnar storage allows efficient compression of columns, on par with what data warehouse databases like Sybase IQ provide. In short, if a normal Hive table is a "row based implementation of relational model", then ORC is the equivalent "columnar based implementation of relational model".

I find the ORC file format interesting because it performs more efficiently than other Hive file formats (I am testing it).

My only question is whether the Cost Based Optimiser (CBO) of Hive is aware of the ORC storage format and treats the table accordingly.

Finally, this is more of a speculative question. If ORC files provide good functionality, is there any reason to deploy a columnar database such as HBase or Cassandra if Hive can do the job as well?

Thanks,

Mich Talebzadeh
http://talebzadehmich.wordpress.com

Author of the books "A Practitioner's Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7.
Co-author of "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4.
Publications due shortly:
Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache
Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one out shortly
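
The point in the question about columnar storage compressing well can be illustrated with a small sketch. It is a toy under stated assumptions, not ORC's actual encoder: the table contents and the names `rows` and `run_length_encode` are hypothetical, and plain run-length encoding stands in for the more elaborate encodings ORC applies per column. The idea is only that storing a column's values next to each other produces long runs of repeated values, while a row layout interleaves unrelated values.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Hypothetical toy table with columns (order_id, country, status).
rows = [
    (1, "UK", "shipped"),
    (2, "UK", "shipped"),
    (3, "UK", "pending"),
    (4, "US", "pending"),
    (5, "US", "pending"),
    (6, "US", "shipped"),
]

# Row-oriented layout: values from different columns are interleaved.
row_stream = [value for row in rows for value in row]

# Column-oriented layout: each column's values are stored together.
columns = [list(column) for column in zip(*rows)]
column_stream = [value for column in columns for value in column]

row_runs = run_length_encode(row_stream)
column_runs = run_length_encode(column_stream)

print(len(row_runs), "runs in the row layout")
print(len(column_runs), "runs in the column layout")
print(run_length_encode(columns[1]))  # the country column on its own
```

On this toy data the row layout has no adjacent repeats at all, while the country and status columns collapse into a handful of runs. Real tables differ, and how much is gained depends on how the data is ordered and how many distinct values each column holds.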