[PROPOSAL] Add streaming support to PartialOperator
EXTERNAL MAIL: If you do not know and trust the sender of this e-mail, do not
click on any link or open any attachments. When in doubt, forward this
e-mail as an attachment to ab...@infrabel.be<mailto:ab...@infrabel.be>.
Hi David,
As it s
xpand functionality.
Kind regards,
David
-----Original Message-----
From: Ash Berlin-Taylor
Sent: Tuesday, 3 December 2024 11:44
To: dev@airflow.apache.org
Subject: Re: [PROPOSAL] Add streaming support to PartialOperator
e better to do an official AIP proposal. I just planted the seed here to
> see how this proposal would be received. I will try to do this as soon as
> possible.
> >
> > Kind regards,
> > David
> >
> > From: Constance Martineau
> > Sent: Wednesday, 16 Octo
To: dev@airflow.apache.org
> Cc: Blain David
> Subject: Re: [PROPOSAL] Add streaming support to PartialOperator
>
> You don't often get email from
> consta...@astronomer.io<mailto:consta...@astronomer.io>. Learn why this is
> important<https://aka.ms/LearnAboutSende
soon as possible.
Kind regards,
David
From: Constance Martineau
Sent: Wednesday, 16 October 2024 23:06
To: dev@airflow.apache.org
Cc: Blain David
Subject: Re: [PROPOSAL] Add streaming support to PartialOperator
Oh. I don't think we want to "vote" on it (but I will let David chime in,
because I was mostly guessing what his expectations and worries are).
On Thu, Oct 17, 2024 at 1:07 AM Vikram Koka wrote:
> Hmm, I can think of a different solution to the problem here as well, but I
> could be misunder
Hmm, I can think of a different solution to the problem here as well, but I
could be misunderstanding the problem.
I understand that producing a full AIP may be frustrating, but I don't feel
confident enough in my understanding that I can vote on this at this time.
I do think a "light AIP" in the
Yeah, I agree with Jens here. I think it makes sense to produce an AIP so
people can understand the proposal better. We can't really give a thumbs
up or down without a proposal. At least it's not like Python, where you
have to implement the whole thing first :)
That was a lot to read through, and to be honest, it's hard for me to tell
whether or not Jarek's proposal solves David's problem. However, if the
debate is whether it's worthwhile or not to provide a first-class way for
DAG authors to use Operators as part of TaskFlow Tasks, it is.
Operators are
Hi all,
Thanks for picking up the discussion. Following the email chain a bit,
I would recommend spinning up an AIP for the implementation. There might be
one or multiple cases where this is a cool feature. Still, it will add
complexity and needs a closer discussion. The best discussion might be
on
Re SLAs: there were actually a lot of people who chimed in and expressed
concerns with the approach, but no one took the step of actually
downvoting it. It's hard to downvote and say no, this does not seem right.
And sometimes these things gain momentum and you don't want to be a stick
in the mud
So I think what David really needs (from you, Daniel, and others) is to know
whether the idea sounds right. If it does, and we agree it is something that
should be clarified in detail and there are no major blockers to moving in
this direction, this can be turned into a detailed proposal with the syntax,
I think we
It's about the same thing. David's proposal is about stream syntax to run the
operators in the task. So those are not two things: this is the "idea"
(run operators in a loop in a task) and the implementation detail (stream
syntax).
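To make the "idea" half concrete, here is a minimal sketch of running operators in a loop inside a single task callable. Everything in it is an assumption for illustration: `EchoOperator` and `run_operators_in_one_task` are hypothetical names, not Airflow classes or functions, and the sketch does not reflect the stream syntax being proposed. A real Airflow operator would also need a proper execution context and template rendering, which are left out here.

```python
class EchoOperator:
    """Hypothetical stand-in for an operator; not an Airflow class."""

    def __init__(self, task_id, value):
        self.task_id = task_id
        self.value = value

    def execute(self, context):
        # A real operator would call an external system here.
        return {"task_id": self.task_id, "value": self.value * 2}


def run_operators_in_one_task(values, context=None):
    """Instantiate and execute one operator per input inside a single callable."""
    context = context or {}
    results = []
    for index, value in enumerate(values):
        operator = EchoOperator(task_id=f"echo_{index}", value=value)
        results.append(operator.execute(context))
    return results


outputs = run_operators_in_one_task([1, 2, 3])
```

The loop runs in one interpreter, so the per-task scheduling overhead is paid once instead of once per input.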
I think at this stage I distilled the idea from the syntax proposal, and
wha
I'm still a bit fuzzy on the proposal. It also seems at times like you two
(David and Jarek) are sorta talking about two different things. David:
"stream" syntax. Jarek: run operator in a task.
I would suggest @David maybe just produce a sort of draft AIP, maybe in
Google Docs or something, and s
s were check marked, we could
> think of a better implementation in Airflow 3.x, if this feature would be
> accepted of course.
>
> -----Original Message-----
> From: Jarek Potiuk
> Sent: Saturday, October 5, 2024 12:35 AM
> To: dev@airflow.apache.org
> Subject: Re: [P
ked, we could
think of a better implementation in Airflow 3.x, if this feature would be
accepted of course.
-----Original Message-----
From: Jarek Potiuk
Sent: Saturday, October 5, 2024 12:35 AM
To: dev@airflow.apache.org
Subject: Re: [PROPOSAL] Add streaming support to PartialOperator
From the earlier discussions with David - this is also (and mainly) about
optimisation. Those operators do very little, and when you add the total
overhead that Airflow adds for scheduling and running every task, then it
turns out that looping such an operator's execute in a single interpreter is
many, m
Well, it looks like we do have concurrency control for mapped tasks after
all.
See max_active_tis_per_dagrun which was added in
https://github.com/apache/airflow/pull/29094.
So this would allow you to map over your 3000 users in a single run, but
process only one at a time (or 5 or 10 at a time).
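The effect described above (many items, but only a few in flight at once) can be sketched in plain Python. This is only an illustration of bounded concurrency, not Airflow code: in Airflow the cap is enforced by the scheduler through max_active_tis_per_dagrun, whereas here a thread pool's `max_workers` stands in for it. The names `process_all` and `process` are hypothetical, and 30 users are used instead of 3000 to keep the run short.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def process_all(items, limit):
    """Process every item, with at most `limit` running at the same time.

    Returns (results, peak), where peak is the highest number of
    simultaneously running workers observed.
    """
    lock = threading.Lock()
    running = 0
    peak = 0

    def process(item):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        try:
            time.sleep(0.001)  # placeholder for the real per-user work
            return f"processed {item}"
        finally:
            with lock:
                running -= 1

    # max_workers plays the role of the concurrency cap in this sketch.
    with ThreadPoolExecutor(max_workers=limit) as pool:
        results = list(pool.map(process, items))
    return results, peak


users = [f"user-{i}" for i in range(30)]
results, peak = process_all(users, limit=5)
```

With `limit=1` the items are handled strictly one at a time, which corresponds to the "process only one at a time" case.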
One thing, it would have to be 3.0 since no new features are going into 2.x
anymore AFAIK.
Do I understand correctly that essentially what you want to be able to do
is limit parallelism in a mapped task? E.g. is it correct that you
essentially want to do task mapping, but with parallelism=1? Would
s we can
already use it, but it would be nice if other Airflow users could also benefit
from this functionality, as I know this topic has been discussed many times.
-----Original Message-----
From: Jarek Potiuk
Sent: Friday, October 4, 2024 5:52 AM
To: dev@airflow.apache.org
Subject: Re: [PROPOSA
> why not just do things sequentially in a loop inside of a task?
Yes I think you nailed it - and I think it's just the abstraction you use
in this case.
When you loop in the task to do a small thing many times with one of the
integrations of Airflow, you could use a Hook for that. But - apparentl
The thing I'm having trouble with is that the problem the user, David, is
trying to solve is basically that Airflow doesn't like super fine-grained
tasks. Like, let's push this to the limit. I run an ecommerce company
that has 10M visitors per day and each time they visit we update the
visitor t
Often hooks do a lot of validation -> Often operators do a lot of
validation
On Thu, Oct 3, 2024 at 7:50 PM Jarek Potiuk wrote:
> I think this is very similar to past discussions that we had about
> allowing operators to be used in task flow as a "first class citizen".
> https://lists.apache.org
I think this is very similar to past discussions that we had about allowing
operators to be used in task flow as a "first class citizen".
https://lists.apache.org/thread/nflt9h6dc5obzztmyqxlpxfs950rtqsq
I re-read the original discussion and ...
In theory, it sounds like you should be able to do the
responses so we can
handle it in one task, but in some situations, you just can't do that as
explained in my previous example.
-----Original Message-----
From: Daniel Standish
Sent: Wednesday, September 18, 2024 6:41 PM
To: dev@airflow.apache.org
Subject: Re: [PROPOSAL] Add streaming support to PartialOperator
Curious why you want to model this as many tasks, e.g. one page == one task.
Another option would be to handle many pages in one task. And I'm curious
what were the factors that led you to split it out more granularly.