Re: Cached Tables SQL Performance Worse than Uncached

2016-12-15 Thread Mich Talebzadeh
How many tables are involved in the SQL join, and how do you cache them? If you unpersist the DF(s) and run the same SQL query (in the same session), what do you see with explain? HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOAB
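A minimal sketch of the check suggested here, assuming a Spark 2.x-style `SparkSession` and a table or temp view that is already registered. The helper name `explain_cached_vs_uncached` and its parameters are hypothetical, not something from the thread. It prints the extended plan for the same query with the table uncached and then cached, so an extra shuffle/sort/merge layer would show up side by side.

```python
def explain_cached_vs_uncached(spark, view_name, query):
    """Print the extended plan for `query` with `view_name` uncached, then cached.

    `spark` is assumed to be a SparkSession; `view_name` is assumed to be an
    already registered table or temp view that `query` reads from.
    """
    # Start from a known state: drop any existing cache entry for the view.
    if spark.catalog.isCached(view_name):
        spark.catalog.uncacheTable(view_name)

    print("== uncached ==")
    spark.sql(query).explain(True)  # True requests the extended plan output

    # cacheTable is lazy, so run an action to materialize the cache
    # before looking at the plan again.
    spark.catalog.cacheTable(view_name)
    spark.table(view_name).count()

    print("== cached ==")
    spark.sql(query).explain(True)

    # Leave the session as we found it.
    spark.catalog.uncacheTable(view_name)
```

With a real session this might be called as `explain_cached_vs_uncached(spark, "lineitem", "SELECT ...")` after registering the parquet-backed DataFrame under that (hypothetical) view name; the two printed plans can then be posted to the list.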

Re: Cached Tables SQL Performance Worse than Uncached

2016-12-15 Thread Mich Talebzadeh
How many tables are involved in the SQL join, and how do you cache them? If you unpersist the DF and run the same Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Re: Cached Tables SQL Performance Worse than Uncached

2016-12-15 Thread Michael Armbrust
It's hard to comment on performance without seeing query plans. I'd suggest posting the result of an explain. On Thu, Dec 15, 2016 at 2:14 PM, Warren Kim wrote: > Playing with TPC-H and comparing performance between cached (serialized > in-memory tables) and uncached (DF from parquet) results in

Cached Tables SQL Performance Worse than Uncached

2016-12-15 Thread Warren Kim
Playing with TPC-H and comparing performance between cached (serialized in-memory tables) and uncached (DF from parquet) results in various SQL queries taking much longer to run when the tables are cached. I see that some physical plans have an extra layer of shuffle/sort/merge in the cached scenario. I could do