Hi Mordechai,
Thanks for your interest! In addition to what Jack mentioned, we also have a
Slack channel, #python, in the apache-iceberg Slack workspace for the Iceberg
Python library.
As the Iceberg Python library is an implementation of the Iceberg spec, it
would be great to get familiar with the spec.
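To give a concrete flavor of that relationship, here is a minimal sketch
(illustrative only, not the library's actual API) that reads a few of the
fields the table spec defines in a table's metadata JSON file:

    # Illustrative sketch, not the library's actual API. The key names
    # ("format-version", "table-uuid", "current-snapshot-id") are field
    # names defined by the Iceberg table spec.
    import json

    def summarize_metadata(path):
        with open(path) as f:
            meta = json.load(f)
        print("format version:", meta["format-version"])
        print("table uuid:", meta["table-uuid"])
        print("current snapshot:", meta.get("current-snapshot-id"))

Reading the spec alongside a real metadata file makes the document model
(schemas, snapshots, manifests) much easier to follow.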
Hi Wing Yew,
I think 2.4 is a different story: we will continue to support Spark 2.4,
but as you can see it will continue to have very limited functionality
compared to Spark 3. I believe we discussed option 3 when we were
doing the Spark 3.0 to 3.1 upgrade. Recently we are seeing the same is
I understand and sympathize with the desire to use new DSv2 features in
Spark 3.2. I agree that Option 1 is the easiest for developers, but I don't
think it considers the interests of users. I do not think that most users
will upgrade to Spark 3.2 as soon as it is released. It is a "minor
version"
Option 1 sounds good to me. Here are my reasons:
1. Both 2 and 3 will slow down development. Considering the limited
resources in the open source community, the upsides of options 2 and 3 are
probably not worth it.
2. Both 2 and 3 assume use cases that may not exist. It's hard to predict
anything,
To sum up what we have so far:
Option 1 (support just the most recent minor Spark 3 version)
The easiest option for us devs; it forces users to upgrade to the most recent
minor Spark version to consume any new Iceberg features.
Option 2 (a separate project under Iceberg)
Can support as many S
Hi Mordechai,
Thank you very much for your interest! We are in the process of refactoring
the Python codebase, so there are a lot of opportunities to contribute.
We have had a few discussions so far; you can join future meetings by
subscribing to this Google group:
https://groups.google.com/g/
I think we should go for option 1. I'm already not a big fan of having runtime
errors for unsupported features based on versions, and I don't think minor
version upgrades are a large issue for users. I'm especially not looking
forward to supporting interfaces that only exist in Spark 3.2 in a mul
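A rough sketch of the runtime version gate pattern in question (a hypothetical
Python helper for illustration; the real modules do this in Java). The point is
that such code compiles against any Spark version and only fails when the
feature is actually used:

    # Hypothetical helper illustrating the runtime version gate pattern.
    from pyspark.sql import SparkSession

    def require_spark_at_least(spark: SparkSession, major: int, minor: int) -> None:
        # spark.version is a string like "3.2.0"; compare major.minor only.
        runtime = tuple(int(p) for p in spark.version.split(".")[:2])
        if runtime < (major, minor):
            raise RuntimeError(
                "This feature requires Spark %d.%d+, but the session is "
                "running %s" % (major, minor, spark.version))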
Hey Imran,
I don’t know why I forgot to mention this option too. It is definitely a
solution to consider. We used this approach to support Spark 2 and Spark 3.
Right now, this would mean having iceberg-spark (common code for all versions),
iceberg-spark2, iceberg-spark-3 (common code for all Spa
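A rough sketch of what that module tree could look like (the first three names
come from the message above; the per-version modules are illustrative):

    iceberg-spark        <- code shared across all Spark versions
    iceberg-spark2       <- Spark 2.4 specific code
    iceberg-spark-3      <- code shared across all Spark 3 versions
    iceberg-spark-3.1    <- Spark 3.1 specific code (illustrative)
    iceberg-spark-3.2    <- Spark 3.2 specific code (illustrative)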
> First of all, is option 2 a viable option? We discussed separating the Python
> module out of the project a few weeks ago, and decided not to do that
> because it's beneficial for code cross-referencing and more intuitive for new
> developers to see everything in the same repository. I would
Hi,
I want to join the effort on the Iceberg Python package.
I have several years of Python/big data/backend/ML experience and will be happy
to contribute code to this project.
Do you have any guidelines or learning materials?
Thanks
Mordechai Ben Zecharia
Big Data Engineer | Data Engineering
T