Hi Jarek,

Thanks a lot for the detailed feedback and for sharing the Airflow story; this is
exactly what I was hoping to hear in response from the mailing list!

600+ dependencies is very impressive, so I'd be happy to chat more and
learn from your experience.

On Wed, Aug 24, 2022 at 5:50 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Comment (from a bit of an outsider)
>
> Fantastic document Valentyn.
>
> Very, very insightful and interesting. We feel a lot of the same pain in
> Apache Airflow (actually even more, because we have not 20 but 620+
> dependencies), but we are also a bit more advanced in the way we are
> managing the dependencies. Some of the ideas you had there are already
> tested and tried in Airflow, and some of them are a bit different, but we can
> definitely share "principles", and we are a little higher in the "supply
> chain" (i.e. Apache Beam Python SDK is our dependency).
>
> I left some suggestions and some comments describing in detail what the
> same problems look like in Airflow and how we addressed them (if we did),
> and I am happy to participate in further discussions. I am "the dependency
> guy" in Airflow and happy to share my experiences and help to work out some
> problems - and especially to help solve problems coming from using multiple
> google-client libraries and diamond dependencies (we are just now dealing
> with a similar issue, where we will likely have to do a massive update of
> several of our clients, hopefully with the involvement of the Composer team).
> And I'd love to be involved in a joint discussion with the google client
> team to work out some common expectations that we can rely on when we
> define our future upgrade strategy for google clients.
>
> I will watch it here and be happy to spend quite some time on helping to
> hash it out.
>
> BTW, you can also watch the talk I gave last year at PyWaw about "Managing
> Python dependencies at Scale"
> https://www.youtube.com/watch?v=_SjMdQLP30s&t=2549s where I explain the
> approach we took, the reasoning behind it, etc.
>
> J.
>
>
> On Wed, Aug 24, 2022 at 2:45 AM Valentyn Tymofieiev via dev <
> dev@beam.apache.org> wrote:
>
>> Hi everyone,
>>
>> Recently, several issues [1-3] have highlighted outage risks and
>> developer inconveniences due to dependency management practices in Beam
>> Python.
>>
>> With dependabot and other tooling that we have integrated with Beam, one
>> of the missing pieces seems to be having a clear guideline of how we should
>> be specifying requirements for our dependencies and when and how we should
>> be updating them to have a sustainable process.
>>
>> As a conversation starter, I put together a retrospective
>> <https://docs.google.com/document/d/1gxQF8mciRYgACNpCy1wlR7TBa8zN-Tl6PebW-U8QvBk/edit?resourcekey=0-XcHRyFh4KRPkA0GsdUmU3g#>[4]
>> covering a recent incident and would like to get community opinions on the
>> open questions.
>>
>> In particular, if you have experience managing dependencies for other
>> Python libraries with rich dependency chains, knowledge of available
>> tooling, or first-hand experience dealing with other dependency issues in
>> Beam, your input would be greatly appreciated.
>>
>> Thanks,
>> Valentyn
>>
>> [1] https://github.com/apache/beam/issues/22218
>> [2] https://github.com/apache/beam/pull/22550#issuecomment-1217348455
>> [3] https://github.com/apache/beam/issues/22533
>> [4]
>> https://docs.google.com/document/d/1gxQF8mciRYgACNpCy1wlR7TBa8zN-Tl6PebW-U8QvBk/edit
>>
>
