Re: Is there a way to resolve Fair Scheduler Problem

2016-04-25 Thread Khaja Hussain
Hi, if you need other applications in a queue to start, you can use preemption. The link below has the details. Preemption guarantees capacity, in which case your job will start. If there is no load on the given queue, those resources will be used by other queues. Hope this helps. https
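
To make the guarantee described above concrete, here is a toy Python model. It is only a sketch of the idea (idle guaranteed capacity can be borrowed, and is reclaimed when the owning queue has demand). It is not YARN's Fair Scheduler code; the `Queue` class, `plan_preemption` function, queue names and numbers are all hypothetical, and the cluster is assumed to have no free capacity.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    name: str
    guaranteed: int  # capacity the queue is guaranteed (e.g. MB); hypothetical value
    used: int        # capacity the queue currently holds
    demand: int      # total capacity the queue wants, including what it already holds


def plan_preemption(queues):
    """Return {queue_name: delta}: positive = capacity to hand over, negative = capacity to reclaim."""
    # A starved queue is owed capacity up to its guarantee, but never more than it asks for.
    owed = {q.name: max(0, min(q.guaranteed, q.demand) - q.used) for q in queues}
    # A queue holding more than its guarantee has borrowed idle capacity from others.
    surplus = {q.name: max(0, q.used - q.guaranteed) for q in queues}

    plan = {name: amount for name, amount in owed.items() if amount > 0}
    still_needed = sum(owed.values())
    for name, extra in surplus.items():
        take = min(extra, still_needed)
        if take > 0:
            plan[name] = -take
            still_needed -= take
    return plan


# Hypothetical scenario: "etl" borrowed the whole cluster while "adhoc" was idle,
# then "adhoc" submits a job that needs 2048 MB.
queues = [
    Queue("etl", guaranteed=4096, used=8192, demand=8192),
    Queue("adhoc", guaranteed=4096, used=0, demand=2048),
]
print(plan_preemption(queues))
```

In this model the job in `adhoc` can start because 2048 MB is reclaimed from `etl`, which was running above its guarantee.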

Is there a way to resolve Fair Scheduler Problem

2016-04-25 Thread mahender bigdata
Hi Team, Is there a way to resolve the Fair Queue Scheduler problem? Currently I see that if an application requires more resources, it fully consumes the available resources, leaving other submitted applications in the *pending* or *accepted* state. Do I need to modify set yarn.scheduler.maximum-allocation-mb=512

Re: Hive footprint

2016-04-25 Thread Mich Talebzadeh
Hi Naveen, Thank you for your detailed explanation. Please allow me to explain my points, if I may. I think a viable solution for a big data stack will encompass (again, this is my view) Spark with Hive, HDFS and Yarn as a winning combination. Hadoop encompasses HDFS and it is almost impossibl

Re: [VOTE] Bylaws change to allow some commits without review

2016-04-25 Thread Lars Francke
Thanks for the further votes. If I'm not mistaken, three more are needed for a successful vote. @Carl, thanks for your vote. I'd be happy to hear any concerns you might have. @Ashutosh: Sounds like a very sensible idea. I've never actually gotten around to using Travis CI so I hope there'll be

Remove unnecessary joins from view at runtime.

2016-04-25 Thread Grant Overby (groverby)
Suppose I have a view that joins 3 tables together. If I execute a query against this view that is answerable by joining only 2 of these 3 tables, can Hive perform this optimization automatically? Example: For the select and view below, I'd like Hive to avoid the join on iede_xff6. SE

Re: Hive footprint

2016-04-25 Thread Naveen Gangam
Hi Mich, I am a developer at Cloudera and contribute to Apache Hive. Hive and MPP query engine projects like Impala have settled into their respective positions so there is less confusion between these projects. For example, across Cloudera's customer base the majority of customers use Impala to

Using a different FileSystem for Staging

2016-04-25 Thread Blake Martin
Hi Hive Folks, We're writing to S3-backed tables, but hoping to use HDFS for staging and merging. When we set hive.exec.stagingdir to an HDFS location, we get: 16/04/19 22:59:54 [main]: ERROR ql.Driver: FAILED: IllegalArgumentException Wrong FS: hdfs://:8020/tmp/hivestaging_hive_2016-04-19_22-59

java.lang.ArrayIndexOutOfBoundsException in getSplitHosts

2016-04-25 Thread Saumitra Shahapure
Hello, I am using Hive 0.13.1 in EMR and trying to create a Hive table on top of our custom file system (which is a thin wrapper on top of S3), and I am getting an error while accessing the data in the table. Stack trace and command history below. I suspected that CombineFileInputFormat is tryi

Varying vcores/ram for hive queries running Tez engine

2016-04-25 Thread Nitin Kumar
I was trying to benchmark some Hive queries. I am using the Tez execution engine. I varied the values of the following properties: 1. hive.tez.container.size 2. tez.task.resource.memory.mb 3. tez.task.resource.cpu.vcores Changes in the value of property 1 are reflected properly.
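
For a benchmark like the one described above, it can help to generate the per-run settings rather than type them by hand. The Python sketch below renders the three properties from the message as Hive `SET` statements. The helper name `render_hive_settings` and all values are placeholders chosen for illustration, not recommendations or values from the original message.

```python
def render_hive_settings(settings):
    """Render a mapping of property name -> value as Hive `SET` statements."""
    return "\n".join(f"SET {key}={value};" for key, value in settings.items())


# Placeholder values for one benchmark run (assumptions, not tuned settings).
run_settings = {
    "hive.tez.container.size": 4096,
    "tez.task.resource.memory.mb": 4096,
    "tez.task.resource.cpu.vcores": 2,
}

print(render_hive_settings(run_settings))
```

The rendered statements can be placed at the top of the query script for each run, so that every run records exactly which values it was given.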