Hi Prabu,

Thanks for trying it out. Hive on Spark is currently based on Spark 1.2, i.e. Spark's current master branch, so running it against a Spark 1.1.0 build will fail in exactly this way.
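For what it's worth, the NoSuchMethodError below means the Spark classes on Hive's classpath predate the async-action API (foreachAsync returning a JavaFutureAction) that was added in Spark 1.2. A quick reflection check can confirm which methods a loaded class actually exposes. The sketch below is a hypothetical helper (MethodCheck is my name, not anything in Hive or Spark), demonstrated against java.lang.String so it runs anywhere; on a machine with Spark on the classpath you would substitute org.apache.spark.api.java.JavaPairRDD and foreachAsync:

```java
import java.util.Arrays;

public class MethodCheck {
    /** Returns true if the named class is loadable and exposes a public
     *  method with the given name (any signature). */
    static boolean hasMethod(String className, String methodName) {
        try {
            return Arrays.stream(Class.forName(className).getMethods())
                         .anyMatch(m -> m.getName().equals(methodName));
        } catch (ClassNotFoundException e) {
            // Class not on the classpath at all.
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in check that works without Spark installed.
        // With Spark on the classpath, try instead:
        //   hasMethod("org.apache.spark.api.java.JavaPairRDD", "foreachAsync")
        // which is false on Spark 1.1.x and true on 1.2+.
        System.out.println(hasMethod("java.lang.String", "isEmpty"));
    }
}
```

If the check comes back false for foreachAsync, the fix is to put a Spark 1.2 build on Hive's classpath rather than 1.1.0.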
Thanks,
Xuefu

On Mon, Nov 3, 2014 at 4:16 AM, Prabu Soundar Rajan -X (prabsoun - MINDTREE LIMITED at Cisco) <prabs...@cisco.com> wrote:

> Hi Team,
>
> We are trying Hive on Spark in our cluster, and we are seeing the
> exception below whenever a Hive query involves a reducer phase in its
> execution (e.g. GROUP BY, UDAF). Could you please help us understand the
> compatibility of Hive on Spark with UDAF execution and the root cause of
> this exception?
>
> We are using Spark 1.1.0 and built Hive with the hadoop-2 profile
> (mvn clean install -DskipTests -Phadoop-2) from the code downloaded
> from https://github.com/apache/hive/tree/spark.
>
> hive (default)> select count(*) from employee;
> Query ID = phodisvc_20141103032121_978e1f48-6290-4e5d-8a57-955edc98b7cd
> Total jobs = 1
> Launching Job 1 out of 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapred.reduce.tasks=<number>
> java.lang.NoSuchMethodError: org.apache.spark.api.java.JavaPairRDD.foreachAsync(Lorg/apache/spark/api/java/function/VoidFunction;)Lorg/apache/spark/api/java/JavaFutureAction;
>     at org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:189)
>     at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:52)
>     at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:76)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1366)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1178)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1005)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:995)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:246)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:198)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
>     at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
> FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.spark.SparkTask.
> org.apache.spark.api.java.JavaPairRDD.foreachAsync(Lorg/apache/spark/api/java/function/VoidFunction;)Lorg/apache/spark/api/java/JavaFutureAction;
>
> Thanks & Regards,
> Prabu