Thanks, Mich, for your reply. I am curious about one thing: Hive uses a CBO that takes CPU cost into account. Does the Hive optimizer have any advantage over the Spark Catalyst optimizer?

Regards,
Srinivasan Hariharan
+91-9940395830

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: Friday, June 10, 2016 3:27 PM
To: Srinivasan Hariharan02 <srinivasan_...@infosys.com>
Cc: Takeshi Yamamuro <linguin....@gmail.com>; user@spark.apache.org
Subject: Re: Catalyst optimizer cpu/Io cost

In an SMP system such as Oracle or Sybase, the CBO will take into account LIO, PIO and CPU costing, or use some empirical costing. In a distributed system like Spark, with so many nodes, that may not be as easy, or its contribution to the Catalyst decision may vary so much that it is not worthwhile.

HTH

Dr Mich Talebzadeh
LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

On 10 June 2016 at 10:45, Srinivasan Hariharan02 <srinivasan_...@infosys.com> wrote:

Thanks, Takeshi. Is there any reason for not using I/O and CPU cost in the Catalyst optimizer? Some SQL engines that leverage Apache Calcite have a cost-based planner, such as VolcanoPlanner, which takes CPU and I/O cost into account for plan optimization.

Regards,
Srinivasan Hariharan
+91-9940395830

From: Takeshi Yamamuro [mailto:linguin....@gmail.com]
Sent: Friday, June 10, 2016 2:38 PM
To: Srinivasan Hariharan02 <srinivasan_...@infosys.com>
Cc: user@spark.apache.org
Subject: Re: Catalyst optimizer cpu/Io cost

Hi,

There is no way to retrieve that information in Spark. In fact, the current optimizer only considers the byte size of outputs in LogicalPlan. Related code can be found in
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala#L90

If you want to know more about Catalyst, you can check Yin Huai's slides from Spark Summit 2016.
https://spark-summit.org/2016/speakers/yin-huai/
# Note: the slides are not available yet; it seems they will be in a few weeks.

// maropu

On Fri, Jun 10, 2016 at 3:29 PM, Srinivasan Hariharan02 <srinivasan_...@infosys.com> wrote:

Hi,

How can I get the CPU and I/O cost of a Spark SQL query after it has been optimized to the best logical plan? Is there any API to retrieve this information? Could anyone point me to the code in the Catalyst module where the CPU and I/O cost is actually computed?

Regards,
Srinivasan Hariharan
+91-9940395830

--
---
Takeshi Yamamuro
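
The difference discussed in this thread, between an optimizer that looks only at output byte size and a CBO that weighs LIO, PIO and CPU, can be illustrated with a small toy model. This is only a sketch in plain Python: it does not call Spark, Hive or Calcite, and the class `PlanEstimate`, the functions `size_only_cost`, `weighted_cost` and `cheapest`, the weights and all numbers are hypothetical names and values made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanEstimate:
    """Hypothetical estimates for one candidate plan (not a Spark API)."""
    name: str
    output_bytes: int  # estimated size of the plan's output
    logical_io: int    # LIO: estimated logical (buffer) reads
    physical_io: int   # PIO: estimated physical (disk) reads
    cpu_ops: int       # estimated CPU work, in arbitrary units


def size_only_cost(plan: PlanEstimate) -> float:
    """Cost based on output byte size alone, as described for Catalyst in the thread."""
    return float(plan.output_bytes)


def weighted_cost(plan: PlanEstimate,
                  lio_weight: float = 1.0,
                  pio_weight: float = 100.0,
                  cpu_weight: float = 1.0) -> float:
    """SMP-style cost combining LIO, PIO and CPU; the weights are assumptions."""
    return (lio_weight * plan.logical_io
            + pio_weight * plan.physical_io
            + cpu_weight * plan.cpu_ops)


def cheapest(plans, cost_fn):
    """Return the plan with the lowest cost; on a tie the first plan in the list wins."""
    return min(plans, key=cost_fn)


# Two made-up candidates with identical output size but different I/O and CPU work.
plan_a = PlanEstimate("plan_a", output_bytes=1_000_000,
                      logical_io=5_000, physical_io=50, cpu_ops=20_000)
plan_b = PlanEstimate("plan_b", output_bytes=1_000_000,
                      logical_io=8_000, physical_io=200, cpu_ops=60_000)
candidates = [plan_b, plan_a]

# The size-only cost cannot tell the two plans apart, so the choice is arbitrary.
print(cheapest(candidates, size_only_cost).name)
# The weighted cost separates them and prefers the plan with less I/O and CPU work.
print(cheapest(candidates, weighted_cost).name)
```

In this toy setup both plans produce the same number of bytes, so a size-only estimate ranks them equally, while the weighted estimate prefers plan_a. As noted above, choosing meaningful weights is the hard part in a distributed system, which is why the weights here are labeled as assumptions.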