+1 for removing it to avoid misunderstanding :). It's cleaner/clearer now
with the iceberg-python repo. Thanks Fokko & Ed!

Regards
JB

On Sun, Oct 8, 2023 at 9:07 PM Fokko Driesprong <fo...@apache.org> wrote:
>
> Hey everyone,
>
> It has been a week since PyIceberg migrated to its own repository. Should we 
> move forward by removing the Python codebase from the main repository? 
> Ajantha already raised a pull request to do this (thank you for that 🙌).
>
> Kind regards,
> Fokko
>
> On Mon, Oct 2, 2023 at 4:16 PM Fokko Driesprong <fo...@apache.org> wrote:
>>
>> Hey everyone,
>>
>> Update from my side. I've moved all the issues and my PRs. Not all issues 
>> needed to be migrated since a lot of them were already fixed. I've closed 
>> the remaining PRs that were still open; those were either abandoned, failing 
>> on CI, or had changes pending. Of course, this comes with the kind request 
>> to re-open them against the iceberg-python repository.
>>
>> Ajantha already created a PR (thanks for that!) to remove Python from the 
>> iceberg repo.
>>
>> Kind regards, Fokko
>>
>>
>> On Sat, Sep 30, 2023 at 9:06 PM Fokko Driesprong <fo...@apache.org> wrote:
>>>
>>> Hey everyone,
>>>
>>>> Pucheng: I wonder how we deal with all the issues filed for the Python 
>>>> module but still open in the iceberg repo?
>>>
>>>
>>> That's a good point. I think we should migrate them. I checked and it is 
>>> only 3 pages. Likely a few more if we query on other keywords. I think 
>>> migrating them by hand is feasible. It also gives us a chance to clean them 
>>> up (all the issues on the last page I linked above are not relevant 
>>> anymore, and can be closed).
>>>
>>>> Brian: The one thing we will lose is pull requests, but I assume there are 
>>>> very few.
>>>
>>>
>>> I've checked those as well, and as Brian already mentioned, there are just 
>>> a few. There is never a perfect moment since there are always PRs open that 
>>> will break, but just after the release I think is the best worst moment :) 
>>> The PRs that are open are trivial to move to the new repo as well.
>>>
>>>> Hussein: I checked the discussion thread, and one of the motivations for 
>>>> this separation was to avoid triggering unrelated CI jobs after each 
>>>> change. However, I wonder if it isn't (and will not be) necessary to check 
>>>> the compatibility between the main repository and the client after each 
>>>> change. Otherwise, we will need to trigger the CI across the different 
>>>> repositories using the GHA API, not necessarily to block the PR, but just 
>>>> to give quick feedback and notification that something needs to be changed 
>>>> on the client side.
>>>
>>>
>>> Checking between dev versions is not something we do today, and PyIceberg 
>>> lives in isolation in the main repository. We might want to do some 
>>> integration tests at some point, but I'm not sure if we should start testing 
>>> dev versions against each other. The main issue with triggering the CI is 
>>> keeping the ignore list of a GitHub Action from exploding. An example 
>>> here is where the Python GA file was not properly excluded.
>>>
>>> I would much rather rely on some reference tests that Jean-Baptiste 
>>> mentioned at the Java Iceberg 1.4.0 release, and that we're also working on 
>>> at Tabular (disclaimer: I'm working for Tabular). Python is inspired by 
>>> Java, and we've recently uncovered some issues (thanks Jan Finis!) with 
>>> respect to adhering to the spec, so I think a strict approach to validate 
>>> the implementations would be preferred.
>>>
>>> That said, in PyIceberg we use Spark (which uses the Java library) to run 
>>> integration tests. This is based on the released versions, which works very 
>>> well. Not sure if we should create matrices between 
>>> Python/Go/Rust/Iceberg/Athena/Snowflake/... (you're seeing where this is 
>>> going) :) But these are just my thoughts today and might change in the 
>>> future.
>>>
>>> Thanks everyone, I'll go ahead and merge the PR that includes the history.
>>>
>>> Cheers, Fokko
>>>
>>> PS: The repo might look a bit funky, but that's because I created the 
>>> PR branch before the main branch. I didn't know that the branch that was 
>>> created first would be promoted to the default branch. I'm working with 
>>> Apache Infra to get it fixed.
>>>
>>> On Sat, Sep 30, 2023 at 8:29 PM Daniel Weeks <dwe...@apache.org> wrote:
>>>>
>>>> +1 to relocate with history.
>>>>
>>>> On Sat, Sep 30, 2023, 10:24 AM Brian Olsen <bitsondata...@gmail.com> wrote:
>>>>>
>>>>> This shouldn’t be too hard and can likely be a nightly build that runs 
>>>>> in each client repository.
>>>>>
>>>>> We’re already planning on building the documentation using git submodules 
>>>>> to pull all the documentation under a single build in the central repo. We 
>>>>> can likely go the other direction to run client-core integration tests. I 
>>>>> prefer these go on the client end to avoid too much CI running on the 
>>>>> core repo. We also have to consider that whatever we choose to do with the 
>>>>> Python client will also apply to Go, Rust, and any future client. Happy to 
>>>>> hear alternatives though!
>>>>>
>>>>> WDYT Fokko?
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Sep 30, 2023 at 7:12 AM Hussein Awala <huss...@awala.fr> wrote:
>>>>>>
>>>>>> +1
>>>>>>
>>>>>> I checked the discussion thread, and one of the motivations for this 
>>>>>> separation was to avoid triggering unrelated CI jobs after each change. 
>>>>>> However, I wonder if it isn't (and will not be) necessary to check the 
>>>>>> compatibility between the main repository and the client after each 
>>>>>> change. Otherwise, we will need to trigger the CI across the different 
>>>>>> repositories using the GHA API, not necessarily to block the PR, but 
>>>>>> just to give quick feedback and notification that something needs to be 
>>>>>> changed on the client side.
>>>>>>
>>>>>> On Fri, Sep 29, 2023 at 9:39 PM Brian Olsen <bitsondata...@gmail.com> 
>>>>>> wrote:
>>>>>>>
>>>>>>> +1
>>>>>>>
>>>>>>> Great work Fokko!
>>>>>>>
>>>>>>> Pucheng,
>>>>>>>
>>>>>>> We still want to maintain all of the issues in the Python repository. 
>>>>>>> The one thing we will lose is pull requests, but I assume there are 
>>>>>>> very few.
>>>>>>>
>>>>>>> On Fri, Sep 29, 2023 at 10:34 AM Pucheng Yang 
>>>>>>> <py...@pinterest.com.invalid> wrote:
>>>>>>>>
>>>>>>>> Thanks for doing this. I wonder how we deal with all the issues 
>>>>>>>> filed for the Python module but still open in the iceberg repo?
>>>>>>>>
>>>>>>>> On Fri, Sep 29, 2023 at 7:55 AM Eduard Tudenhoefner 
>>>>>>>> <edu...@tabular.io> wrote:
>>>>>>>>>
>>>>>>>>> +1 on moving to a separate repo and maintaining git history
>>>>>>>>>
>>>>>>>>> On Fri, Sep 29, 2023 at 3:30 PM Jean-Baptiste Onofré 
>>>>>>>>> <j...@nanthrax.net> wrote:
>>>>>>>>>>
>>>>>>>>>> Awesome, it looks even better ;)
>>>>>>>>>>
>>>>>>>>>> Thanks !
>>>>>>>>>> Regards
>>>>>>>>>> JB
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 29, 2023 at 2:31 PM Fokko Driesprong <fo...@apache.org> 
>>>>>>>>>> wrote:
>>>>>>>>>> >
>>>>>>>>>> > Hey Ajantha,
>>>>>>>>>> >
>>>>>>>>>> > That's a great suggestion. I've followed the steps and created a 
>>>>>>>>>> > new PR here: https://github.com/apache/iceberg-python/pull/3
>>>>>>>>>> >
>>>>>>>>>> > The subdirectory-filter command moves a subdirectory to the root 
>>>>>>>>>> > directory. Because of this, I still had to add some files 
>>>>>>>>>> > afterward (.github/*, .gitignore, etc.); these are in a separate 
>>>>>>>>>> > commit. Please take a look.
>>>>>>>>>> >
>>>>>>>>>> > Thanks,
>>>>>>>>>> >
>>>>>>>>>> > Fokko
>>>>>>>>>> >
>>>>>>>>>> > On Fri, Sep 29, 2023 at 1:39 PM Ajantha Bhat 
>>>>>>>>>> > <ajanthab...@gmail.com> wrote:
>>>>>>>>>> >>
>>>>>>>>>> >> I think we are gonna lose the history of commits if we merge the 
>>>>>>>>>> >> above PR.
>>>>>>>>>> >>
>>>>>>>>>> >> There are ways to move the subfolder into a new repo by retaining 
>>>>>>>>>> >> commit history.
>>>>>>>>>> >> For example:
>>>>>>>>>> >> - 
>>>>>>>>>> >> https://medium.com/@ayushya/move-directory-from-one-repository-to-another-preserving-git-history-d210fa049d4b
>>>>>>>>>> >> - https://gist.github.com/trongthanh/2779392
>>>>>>>>>> >>
>>>>>>>>>> >> Please give it a try.
>>>>>>>>>> >>
>>>>>>>>>> >> Thanks,
>>>>>>>>>> >> Ajantha
>>>>>>>>>> >>
>>>>>>>>>> >> On Fri, Sep 29, 2023 at 4:55 PM Fokko Driesprong 
>>>>>>>>>> >> <fo...@apache.org> wrote:
>>>>>>>>>> >>>
>>>>>>>>>> >>> Hey everyone 👋
>>>>>>>>>> >>>
>>>>>>>>>> >>> A while ago we discussed that Rust and Go are going into a 
>>>>>>>>>> >>> separate repository: 
>>>>>>>>>> >>> https://lists.apache.org/thread/4s02lmwf1kyrxxdpj3q9w2fqnxq2llbn
>>>>>>>>>> >>>
>>>>>>>>>> >>> Since we just did the PyIceberg 0.5.0 release, I think it is a 
>>>>>>>>>> >>> good moment to migrate PyIceberg to iceberg-python as well: 
>>>>>>>>>> >>> https://github.com/apache/iceberg-python/pull/2 I went over the 
>>>>>>>>>> >>> PRs that are ready to merge and got them in. If there is 
>>>>>>>>>> >>> anything missing, please let me know.
>>>>>>>>>> >>>
>>>>>>>>>> >>> I would suggest merging the PR and leaving the source code in 
>>>>>>>>>> >>> the main repository for another week or so to make sure that we 
>>>>>>>>>> >>> didn't miss anything.
>>>>>>>>>> >>>
>>>>>>>>>> >>> Since PyIceberg currently also hosts its docs on the GitHub Pages 
>>>>>>>>>> >>> of the Iceberg repository, moving PyIceberg will also free up the 
>>>>>>>>>> >>> GitHub Pages for the migration of the docs back into the main 
>>>>>>>>>> >>> repository.
>>>>>>>>>> >>>
>>>>>>>>>> >>> Let me know if there are any concerns.
>>>>>>>>>> >>>
>>>>>>>>>> >>> Kind regards,
>>>>>>>>>> >>> Fokko Driesprong
