+1 on removing 2.4 support

On Fri, Apr 14, 2023 at 5:31 PM John Zhuge <jzh...@apache.org> wrote:

> Netflix internal Spark 2.4 is different from OSS. It is closer to OSS 3.0
> or 3.1 because it has DataSourceV2 and catalog support. So we don't rely on
> Iceberg Spark 2.4 code.
>
> On Fri, Apr 14, 2023 at 3:12 PM Russell Spitzer <russell.spit...@gmail.com>
> wrote:
>
>> +1, Spark 2.4 is very out of sync with current developments as noted
>> above. It's almost impossible for us to get any newer features to be
>> compatible with it.
>>
>> On Fri, Apr 14, 2023 at 4:52 PM Anjali Norwood
>> <anorw...@netflix.com.invalid> wrote:
>>
>>> Hi Fokko, Ryan,
>>>
>>> Netflix is still on Spark-2.4.4 with Iceberg-0.9. We are
>>> actively migrating to Spark-3.x and Iceberg 1.1 (or later). I do not
>>> anticipate us using Spark-2.4.4 with newer versions of Iceberg (>0.9).
>>> If the plan is to not support Spark-2.4.4 with Iceberg >= 1.X, that
>>> should be ok.
>>> @John Zhuge <jzh...@netflix.com> can you please chime in?
>>>
>>> thanks,
>>> Anjali
>>>
>>>
>>> On Fri, Apr 14, 2023 at 10:56 AM Ryan Blue <b...@tabular.io> wrote:
>>>
>>>> Overall I'm +1, but could be convinced otherwise.
>>>>
>>>> Spark 2.4 is old and doesn't really function properly because the Spark
>>>> Catalog API was missing at the time. And people can still use older
>>>> versions of Iceberg that support Spark 2.4 if they need it because the
>>>> Iceberg spec guarantees forward compatibility.
>>>>
>>>> That said, I'd love to hear from more people on this. I think it
>>>> would be great to drop support, but I don't know how many people still use
>>>> it. Is upgrading Hadoop a good reason to drop support for an engine? Hadoop
>>>> seems like a minor concern to me unless it is blocking something.
>>>>
>>>> Ryan
>>>>
>>>> On Thu, Apr 13, 2023 at 12:54 PM Jack Ye <yezhao...@gmail.com> wrote:
>>>>
>>>>> +1 for dropping 2.4 support
>>>>>
>>>>> Best,
>>>>> Jack Ye
>>>>>
>>>>> On Thu, Apr 13, 2023 at 10:59 AM Fokko Driesprong <fo...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I'm working on moving to Hadoop 3.x
>>>>>> <https://github.com/apache/iceberg/pull/7114>, and one thing is that
>>>>>> it seems to be incompatible with Spark 2.4. I wanted to ask if people are
>>>>>> still on Spark 2.4 and what we think of dropping the support. The last
>>>>>> release of Spark 2.4.8 was on 2021-05-17 and it also looks like the 2.4
>>>>>> branch on the Spark Github repository is stale, so I don't expect any
>>>>>> further releases.
>>>>>>
>>>>>> Before creating a PR I would like to check on the mail-list if anyone
>>>>>> has any objections. If so, please let us know.
>>>>>>
>>>>>> Thanks,
>>>>>> Fokko Driesprong
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Tabular
>>>>
>>>
>
> --
> John Zhuge
>
--
John Zhuge