Re: How to run large Hive queries in PySpark 1.2.1

2016-05-26 Thread Nikolay Voronchikhin

Hi Jörn, We will be upgrading to MapR 5.1, Hive 1.2, and Spark 1.6.1 at the end of June. In the meantime, can this still be done with these versions? A firewall is not the issue, since our edge nodes and cluster nodes are hosted in the same location and share the same NFS mount. On Thu, May 26, 20

How to run large Hive queries in PySpark 1.2.1

2016-05-26 Thread Nikolay Voronchikhin

Hi PySpark users, We need to run large Hive queries in PySpark 1.2.1. Users run PySpark on an edge node and submit jobs to a cluster that allocates YARN resources to the clients. We are using the MapR Hadoop distribution with Hive 0.13 and Spark 1.2.1. Currently, our