Hi Sean,

Thanks for the reply.  I'm on CDH 5.0.3; upgrading the whole cluster to
5.1.0 will happen eventually, but not immediately.

I've tried running the CDH spark-1.0 release and also building it from
source.  This, unfortunately, goes into a whole other rathole of
dependencies.  :-(
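
For the record, the source build I attempted was roughly the one you
suggest below (the hadoop.version and hive.version coordinates are my
best guesses for the 5.0.3 parcel, so treat this as a sketch rather
than a known-good recipe):

    # build Spark 1.0 against the CDH 5.0.3 Hadoop and Hive artifacts
    mvn -Pyarn -Phive -Dhadoop.version=2.3.0-cdh5.0.3 \
      -Dhive.version=0.12.0-cdh5.0.3 -DskipTests clean package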

Eric


On Sun, Aug 10, 2014 at 10:16 AM, Sean Owen <so...@cloudera.com> wrote:

> As far as I can tell, the method was removed after 0.12.0 in the fix
> for HIVE-5223 (
> https://github.com/apache/hive/commit/4059a32f34633dcef1550fdef07d9f9e044c722c#diff-948cc2a95809f584eb030e2b57be3993
> ),
> and that fix was back-ported in its entirety to 5.0.0+:
>
> http://archive.cloudera.com/cdh5/cdh/5/hive-0.12.0-cdh5.0.0.releasenotes.html
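>
> You can confirm what the CDH jar actually exposes with javap, e.g.
> (adjust the jar path to wherever hive-exec lives on your cluster):
>
>     javap -classpath hive-exec-0.12.0-cdh5.0.3.jar \
>       org.apache.hadoop.hive.serde2.SerDeUtils | grep -i deserializer
>
> The lookupDeserializer(String) overload should be missing there.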
>
> The fix was evidently important, but it's not clear the build can both
> have the fix and keep this method without forking via a custom patch.
> Even though CDH5 never *didn't* have this version of the code, it still
> creates this sort of surprising problem.
>
> I imagine this won't be the only instance of this kind of problem
> people encounter. Can you rebuild Spark against this particular
> release of Hive?
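>
> Something like this might do it, assuming the Spark build exposes a
> hive.version property (if 1.0 doesn't, the version has to be edited in
> the pom directly, and the Cloudera repository added to it):
>
>     mvn -Pyarn -Phive -Dhadoop.version=2.3.0-cdh5.0.3 \
>       -Dhive.version=0.12.0-cdh5.0.3 -DskipTests clean package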
>
> Because that's what the Spark that was shipped with CDH would have
> done. Are you replacing / not using that?
>
> On Sun, Aug 10, 2014 at 5:36 PM, Eric Friedman
> <eric.d.fried...@gmail.com> wrote:
> > I have a CDH 5.0.3 cluster with Hive tables written in Parquet.
> >
> > The tables have "DeprecatedParquetInputFormat" in their metadata, and
> > when I try to select from one using Spark SQL, it blows up with a stack
> > trace like this:
> >
> > java.lang.RuntimeException: java.lang.ClassNotFoundException:
> > parquet.hive.DeprecatedParquetInputFormat
> >   at org.apache.hadoop.hive.ql.metadata.Table.getInputFormatClass(Table.java:309)
> >
> >
> > Fair enough, DeprecatedParquetInputFormat isn't in the Spark assembly
> > built with Hive.
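> >
> > (That's easy to check against the assembly itself; something like
> >
> >     jar tf spark-assembly-*.jar | grep DeprecatedParquetInputFormat
> >
> > comes back empty.)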
> >
> >
> > If I try to add hive-exec-0.12.0-cdh5.0.3.jar to my SPARK_CLASSPATH in
> > order to get DeprecatedParquetInputFormat, I find that there is an
> > incompatibility in the SerDeUtils class.  Spark's Hive snapshot expects
> > to find:
> >
> >
> > java.lang.NoSuchMethodError:
> > org.apache.hadoop.hive.serde2.SerDeUtils.lookupDeserializer(Ljava/lang/String;)Lorg/apache/hadoop/hive/serde2/Deserializer;
> >   at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:217)
> >
> >
> > But that method isn't in the Hive snapshot provided by CDH 5.0.3.
> >
> >
> > Both Spark and CDH label their Hive versions as 0.12.0.
> >
> >
> > According to the Apache SVN server, CDH is the one that's out of step,
> > as this method is definitely in the 0.12.0 release.  I have raised a
> > ticket with Cloudera about this.
> >
> >
> > Has anyone found a workaround?
> >
> >
> > I did try extracting a subset of the classes from hive-exec.jar, but
> > that quickly turned into a journey down the rabbit hole.
>
