Hey Everyone,

Based on the feedback, I updated AIP-44
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-44+Airflow+Internal+API
- the "implementation notes" now describe an improved approach.

Ash had a good suggestion (which I really like): instead of inventing
our own decorators and a different way of handling internal and external
communication for the "coarse" functions that require the database, we
could always use RPC, no matter whether we are in DB isolation mode or
"no isolation" mode. Of course, in "no isolation" mode the communication
should have very low overhead (local TCP or local sockets, no
authorization). I looked at existing RPC implementations we could use and
narrowed the potential choice of technologies down to gRPC and Apache
Thrift.
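
To illustrate the "always use RPC" idea, here is a minimal sketch. It is
not the AIP-44 implementation: it uses XML-RPC from the Python standard
library purely as a stand-in for gRPC or Thrift so that it runs without
extra dependencies. The names get_task_state, start_internal_api and
internal_api_client are hypothetical, and the server listens on local TCP
without authorization, as described for the "no isolation" mode.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def get_task_state(dag_id, task_id):
    """Hypothetical "coarse" function; a real one would query the database."""
    return {"dag_id": dag_id, "task_id": task_id, "state": "success"}


def start_internal_api(host="127.0.0.1", port=0):
    """Start the RPC server in a background thread; return (server, url).

    Port 0 lets the operating system pick a free local port.
    """
    server = SimpleXMLRPCServer((host, port), logRequests=False, allow_none=True)
    server.register_function(get_task_state)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    bound_host, bound_port = server.server_address
    return server, f"http://{bound_host}:{bound_port}/"


def internal_api_client(url):
    """Callers always go through RPC; only the URL differs between modes."""
    return ServerProxy(url, allow_none=True)


# "No isolation" mode: the server runs locally, next to the caller.
server, url = start_internal_api()
client = internal_api_client(url)
result = client.get_task_state("example_dag", "example_task")
print(result)
server.shutdown()
server.server_close()
```

In DB isolation mode the calling code would stay the same; only the URL
passed to internal_api_client would point at a remote, authenticated
endpoint instead of the local one.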

This approach has multiple advantages:

* we can leverage existing RPC implementations (Thrift and gRPC are both
mature and have integration with HTTPS, various authentication options and
can also be run using local sockets)
* the code will be much simpler to maintain - we will use existing
serialization mechanisms from those protocols
* no custom code for communication needed - both Thrift and gRPC have all
that is needed for scalable, robust communication

I think this way we will be able to implement a more robust and
maintainable solution much faster.

I also reached out to Apache Beam (they support both gRPC and Thrift and
are in the process of transitioning from Thrift to gRPC as the primary
protocol). I am sure they have done a lot of analysis that can help us
make the final decision.

This approach changes only the implementation details of AIP-44 - all
the rest is the same; the overall approach and the deployment options
remain untouched by this change.

If you have any comments on that, feel free to share them. I will also
discuss it today at the meeting, and if there is general consensus that
the direction is right, I would love to start voting on AIP-44, ideally
tomorrow, so that next week we can start implementing it. I am not sure
if we want to make a final decision about gRPC/Thrift yet (maybe there
are people who have good experience with both and can share it here?).

I think a more detailed POC and benchmarking might be the first step of
the AIP, where we make the final choice based on an attempt to implement
a POC for both - but I am also happy to listen to those who have more
experience with both (and maybe Beam's experience will help with that).

J.




On Tue, Feb 15, 2022 at 1:49 PM Jarek Potiuk <[email protected]> wrote:

> The meeting is tomorrow :) Feel free to join. I will also record it
> and publish minutes!
>
> On Tue, Feb 15, 2022 at 12:31 PM Giorgio Zoppi <[email protected]>
> wrote:
> >
> > Hello Everyone,
> > is there any follow up of this meeting? I would like to participate if
> it's possible.
> > Best Regards,
> > Giorgio
> >
> > On Tue, Feb 1, 2022 at 3:29 PM Jarek Potiuk <[email protected]> wrote:
> >>
> >> Hello Everyone,
> >>
> >> I think it's about time for the next sig-multitenancy meeting:
> >>
> >> I created a doodle poll for next week - please mark your availability
> till Friday the 4th.
> >>
> >>
> https://doodle.com/poll/axvu2gz7zhv8ieye?utm_source=poll&utm_medium=link
> >>
> >> I think the rough agenda will be:
> >>
> >> * AIP-43 Dag Processor Separation [1] - implementation progress -
> Mateusz
> >> * AIP-44 Airflow Internal API [2] - voting progress (hopefully) -  Jarek
> >> * AIP-45 Remove double DAG parsing [3] -  discussion - Ping
> >> * AIP-46 Docker runtime isolation [4] - discussion - Ping
> >> * Also there are some ideas (not yet in AIP form) around optimizing
> DagProcessorLoop that might be good to talk about - also Ping.
> >>
> >> If there are any more proposals - feel free to ping me.
> >> I also encourage everyone to comment on the AIP-45/46 proposals from Ping
> before the meeting.
> >>
> >> [1]
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-43+DAG+Processor+separation
> >> [2]
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-44+Airflow+Internal+API
> >> [3]
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-45+Remove+double+dag+parsing+in+airflow+run
> >> [4]
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-46+Add+support+for+docker+runtime+isolation+for+airflow+tasks+and+dag+parsing
> >>
> >> J.
> >>
> >>
> >
> >
> > --
> > Life is a chess game - Anonymous.
>
