Re: Re: spark-sql force parallel union

2018-11-21 Thread Alessandro Solimando
e hundreds of partitions to union, creating a temp view for each of them might be slow?

Fwd: Re: spark-sql force parallel union

2018-11-20 Thread onmstester onmstester
tions to union, creating a temp view for each of them might be slow?

Forwarded message
From: kathleen li
Date: Wed, 21 Nov 2018 10:16:21 +0330
Subject: Re: spark-sql force parallel union

you

Re: spark-sql force parallel union

2018-11-20 Thread kathleen li
You might first write the code to construct the query statement with "union all", like below:

scala> val query="select * from dfv1 union all select * from dfv2 union all select * from dfv3"
query: String = select * from dfv1 union all select * from dfv2 union all select * from dfv3

then write loop to
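
With hundreds of views, a statement like the one above would normally be generated rather than typed by hand. The following is only a minimal sketch of one way to do that, and not necessarily the loop the reply goes on to describe. The object name UnionQuery, the helper buildUnionAll and the viewNames list are hypothetical; the view names dfv1, dfv2 and dfv3 are taken from the example above.

```scala
object UnionQuery {
  // Hypothetical helper: emits one "select * from <view>" per view name
  // and joins the pieces with "union all".
  def buildUnionAll(viewNames: Seq[String]): String = {
    require(viewNames.nonEmpty, "at least one view name is needed")
    viewNames.map(name => s"select * from $name").mkString(" union all ")
  }

  def main(args: Array[String]): Unit = {
    // Assumed view names; in practice these would be whatever temp views
    // have already been registered.
    val viewNames = Seq("dfv1", "dfv2", "dfv3")
    val query = buildUnionAll(viewNames)
    println(query)
  }
}
```

Assuming a SparkSession named spark and temp views already registered under those names, the resulting string could then be passed to spark.sql(query). Whether registering that many views is fast enough is the concern raised in the follow-up messages, and this sketch does not address it.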

spark-sql force parallel union

2018-11-20 Thread onmstester onmstester
I'm using Spark SQL to query Cassandra tables. In Cassandra, I've partitioned my data by time bucket and one id, so depending on the query I need to union multiple partitions with Spark SQL and do the aggregations/group-by on the union result, something like this:

for(all cassandra partitions){ DataSet