Awesome work, Ismaël. Let me know if I can help anywhere.

Cheers, Fokko

On Mon, Feb 17, 2020 at 4:00 PM Ismaël Mejía <[email protected]> wrote:

> I had not tried to upgrade Spark since I assumed it would fail
> because of the transitive dependencies of Hive.
>
> But I decided to give it a shot today. Luckily the Spark code base is quite
> Avro friendly so codewise it was 'easy'.
>
> Of course it is still failing, but you can refer to it from the other PR.
> And if you can find fixes for any of the pending issues, that would be great.
>
> https://github.com/apache/spark/pull/27609
>
> Regards,
> Ismaël
>
>
> On Fri, Feb 14, 2020 at 5:42 PM Michael Heuer <[email protected]> wrote:
>
> > Hello Ismaël,
> >
> > Might you be able to share a link to your patch for Spark?  I would like
> > to try to apply it on top of
> >
> > https://github.com/apache/spark/pull/26804
> >
> > which attempts to upgrade the Parquet dependency for Spark to 1.11.0.
> >
> > Thank you,
> >
> >    michael
> >
> >
> > > On Feb 14, 2020, at 10:30 AM, Ismaël Mejía <[email protected]> wrote:
> > >
> > > Ah lovely question.
> > >
> > > tl;dr version:
> > > Spark depends on Hive, so Hive should be upgraded first.
> > > Spark depends on two versions of Hive: a Spark fork of Hive 1.x and
> > > upstream Hive 2.x.
> > > Upgrading the first is not even being discussed at the moment. For the
> > > second I added a patch that passes all tests if you run it against
> > > Spark 2.4/master, but Hive uses a forked version of Spark 2.3 to run
> > > its tests (YES, CIRCULAR DEPENDENCY!!!)
> > >
> > > One extra point that is pushing things in the right direction is that
> > > Parquet and Iceberg already moved to Avro 1.9.x, so pressure is growing
> > > for things to move. It is still a mess, but we want to put up a fight.
> > > One thing is sure: it won't be for Spark 3.0.0. Best case is 3.1.x, and
> > > that also depends on the goodwill of the Hive contributors, who have
> > > ignored my emails and patches for some time.
> > >
> > > https://lists.apache.org/thread.html/rc6c672ad4a5e255957d54d80ff83bf48eacece2828a86bc6cedd9c4c%40%3Cdev.hive.apache.org%3E
> > >
> > > For the full details of the saga:
> > > https://issues.apache.org/jira/browse/SPARK-27733
> > > https://issues.apache.org/jira/browse/HIVE-21737
> > >
> > >
> > > On Fri, Feb 14, 2020 at 5:04 PM Michael Heuer <[email protected]> wrote:
> > >
> > >> Hello,
> > >>
> > >> I wonder if any Avro devs might be willing to help push a PR for
> > >> Apache Spark to update the Avro dependency from 1.8.2 to 1.9.2?
> > >>
> > >> I foresee some trouble with binary-incompatible code changes and
> > >> dependency version conflicts, and could use some additional support.
> > >>
> > >> Thank you in advance,
> > >>
> > >>   michael
> >
> >
>
