Hi,
We have recently run into this issue:
https://issues.apache.org/jira/browse/SPARK-9042
My organization's application reads raw data from files, processes/cleanses
it and pushes the results to Hive tables. To keep reads efficient, we have
partitioned our tables. In a Sentry-enabled cluster, ou
adcast function is in org.apache.spark.sql.functions
>
>
>
> On Wed, Nov 4, 2015 at 10:19 AM, Charmee Patel wrote:
>
>> Hi,
>>
>> If I have a Hive table, running ANALYZE TABLE ... COMPUTE STATISTICS ensures
>> that Spark SQL has statistics for that table. When I have a DataFrame, is there
Hi,
If I have a Hive table, running ANALYZE TABLE ... COMPUTE STATISTICS ensures
that Spark SQL has statistics for that table. When I have a DataFrame, is
there a way to force Spark to collect statistics?
I have a large lookup file and I am trying to avoid a broadcast join by
applying a filter beforehand. Thi
A similar issue occurs when interacting with Hive secured by Sentry.
https://issues.apache.org/jira/browse/SPARK-9042
By changing how the HiveContext instance is created, this issue might also be
resolved.
On Thu, Oct 22, 2015 at 11:33 AM Steve Loughran
wrote:
> On 22 Oct 2015, at 08:25, Chester C