I generally write to Parquet when I need to read the same data repeatedly and perform different operations on it each time. Reading the Parquet copy instead of querying the database again saves database time for me.
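As a rough sketch of this pattern in PySpark: the table is read from the database once and written to Parquet, and later jobs load the Parquet copy instead of going back to the database. The function names, the JDBC source, and the paths below are hypothetical and chosen only for illustration. `spark` is assumed to be an existing `SparkSession`.

```python
def snapshot_to_parquet(spark, jdbc_url, table, parquet_path, properties=None):
    """Read a table from the database once and persist it as Parquet.

    `spark` is assumed to be an existing SparkSession; the JDBC URL, table
    name, and output path are placeholders supplied by the caller.
    """
    # One pass over the database: pull the table through the JDBC reader.
    df = spark.read.jdbc(url=jdbc_url, table=table, properties=properties or {})
    # Persist the result so later jobs do not need to touch the database.
    df.write.mode("overwrite").parquet(parquet_path)


def load_snapshot(spark, parquet_path):
    """Load the Parquet copy for any number of follow-up jobs."""
    return spark.read.parquet(parquet_path)


# Hypothetical usage (requires a running SparkSession named `spark`):
#   snapshot_to_parquet(spark, "jdbc:postgresql://host/db", "orders", "/data/orders")
#   orders = load_snapshot(spark, "/data/orders")
#   orders.groupBy("customer_id").count().show()
```

Each later job calls only `load_snapshot`, so the cost of the database read is paid once rather than on every run.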
Thanks,
Muthu

On Thu, Jul 19, 2018, 18:34 amin mohebbi <aminn_...@yahoo.com.invalid> wrote:

> We have two big tables, each with 5 billion rows, so my question
> is: should we partition/sort the data and convert it to Parquet before
> doing any join?
>
> Best Regards
> .......................................................
> Amin Mohebbi
> PhD candidate in Software Engineering at university of Malaysia
> Tel : +60 18 2040 017
> E-Mail : tp025...@ex.apiit.edu.my
> amin_...@me.com