The more we discuss this topic, the better I understand the reasoning behind 
these design philosophies, and I appreciate the knowledge I have gained.

As long as those terms and principles are well described and explained without 
confusion, I believe we are moving in the right direction, and that’s what 
matters.

- Howard

> On Feb 6, 2022, at 3:24 PM, Jarek Potiuk <ja...@potiuk.com> wrote:
> 
> 
> IMHO It does not really matter if they are the same or not and which one is 
> the same. This is actually the beauty of the "abstract" and "vague" 
> logical_date. Those are different "concepts" that you use in different cases.
> 
> The logical date **might** be the same as one of the interval_dates. It's 
> just an "abstract" representation of the particular "run_id" - and you should 
> not care, because "logical_date" makes sense for some cases, but 
> "data_interval_start/end" for other cases.
> 
> * If your task is about "data_interval" - by all means use the 
> data_interval_start and end.
> * If your task is not about "interval" - use the "logical_date".
> 
> That is how I see it at least. By using a different concept for each case, 
> users can free up their "mental mapping" - they do not have to map the 
> "logical_date" to either "start" or "end". It does not matter. But if they 
> process a data interval, they have a very clear ("start" <-> "end") range 
> that they can use without even thinking about how "logical_date" maps to it.
> 
> For me - those are completely different cases and they are orthogonal to each 
> other (even if some of those values are the same).
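
To make the two cases above concrete, here is a minimal sketch in plain Python. It does not use Airflow itself: `RunContext`, `rows_in_interval` and `provisioning_label` are hypothetical names made up for illustration, and only the three field names mirror the template variables mentioned in this thread. Treating the interval as half-open (start included, end excluded) is an assumption of the sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RunContext:
    """Hypothetical stand-in for the per-run values discussed in the thread."""
    logical_date: datetime
    data_interval_start: datetime
    data_interval_end: datetime


def rows_in_interval(rows, ctx):
    """Interval-oriented task: only the start/end boundaries are consulted."""
    # Assumption: the interval is half-open, i.e. [start, end).
    return [
        r for r in rows
        if ctx.data_interval_start <= r["ts"] < ctx.data_interval_end
    ]


def provisioning_label(ctx):
    """Non-interval task: the run is identified by its logical date alone."""
    return f"infra-{ctx.logical_date:%Y-%m-%d}"


start = datetime(2022, 2, 5, tzinfo=timezone.utc)
ctx = RunContext(
    logical_date=start,  # happens to equal the interval start in this example
    data_interval_start=start,
    data_interval_end=start + timedelta(days=1),
)
rows = [
    {"id": 1, "ts": datetime(2022, 2, 5, 12, tzinfo=timezone.utc)},
    # Exactly on the end boundary, so excluded by the half-open assumption.
    {"id": 2, "ts": datetime(2022, 2, 6, 0, tzinfo=timezone.utc)},
]

selected = rows_in_interval(rows, ctx)
label = provisioning_label(ctx)
```

Neither function needs to know how `logical_date` maps onto the interval: the first never reads it, and the second never reads the boundaries.
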
> 
> J.
> 
>> On Sun, Feb 6, 2022 at 7:00 PM Howard Yoo <howard...@gmail.com> wrote:
>> I see, thank you for the info.
>> I didn’t know about the existence of the data_interval_start and end dates. 
>> I briefly looked at those definitions, and was wondering… wouldn’t they be 
>> equal to the logical dates? I do see those variables mentioned in 
>> https://airflow.apache.org/docs/apache-airflow/stable/templates-ref.html, 
>> and also see the ds and ts meaning logical dates. In practice, are those 
>> dates and timestamps supposed to be the same?
>> 
>> I also wonder whether the ‘data_’ prefix would be necessary if Airflow were 
>> used to orchestrate far more things in the future (perhaps this may be 
>> another thread), but in general, we should have continuous discussions to 
>> define all those dates more clearly and improve the usage of Airflow.
>> 
>> Howard
>> 
>>>> On Feb 6, 2022, at 11:15 AM, Jarek Potiuk <ja...@potiuk.com> wrote:
>>>> 
>>> 
>>> We already have `data_interval_start` and `data_interval_end` as fields, 
>>> and we need something else that can have a more "abstract" meaning and apply 
>>> to the whole run as a "single thing". Using interval_date would be a bit 
>>> ambiguous.
>>> 
>>> "Did you mean start or end actually when you mentioned interval date?" - is 
>>> the question that I anticipate happening a lot if we mix those.
>>> 
>>> J.
>>> 
>>> 
>>> 
>>>> On Sun, Feb 6, 2022 at 6:04 PM Howard Yoo <howard...@gmail.com> wrote:
>>>> Now I can understand why the data_date may not be a perfect fit to 
>>>> describe the term.
>>>> 
>>>> This is not to argue against logical_date, but what about 
>>>> ‘interval_date’? We have the schedule interval, which defines the duration 
>>>> of the interval (e.g. 1 day), so wouldn’t the interval start and end dates 
>>>> be a better representation of it than the logical date?
>>>> 
>>>> Just want to hear whether that has been brought up already or not.
>>>> 
>>>> Howard
>>>> 
>>>>>> On Feb 6, 2022, at 10:25 AM, Jarek Potiuk <ja...@potiuk.com> wrote:
>>>>>> 
>>>>> 
>>>>> I wholeheartedly agree with TP on that one. I think while some time ago 
>>>>> "data date" could make sense, Airflow's future is much more than just 
>>>>> processing data intervals. 
>>>>> This is the primary use case and this is where Airflow shines, of course, 
>>>>> but there are other good examples of how Airflow is used out there. While 
>>>>> we are not really encouraging them, they are not only legitimate, but also 
>>>>> something that I hope Airflow will treat as first-class citizens soon (and 
>>>>> it kind of already does with custom timetables). 
>>>>> 
>>>>> Just an example here - for me, one of the most eye-opening talks at last 
>>>>> year's Airflow Summit was 
>>>>> https://airflowsummit.org/sessions/2021/provision-as-a-service/ 
>>>>> In this talk, Cloudflare engineers explain how they manage the Cloudflare 
>>>>> infrastructure using Airflow. 
>>>>> 
>>>>> The "data date" has no meaning in this case. But the "logical date" 
>>>>> (which is the vaguest-possible one as TP explained) continues to have 
>>>>> one. This is the "logical date of the infrastructure provisioning". 
>>>>> Thanks to Airflow (as I understand it) Cloudflare is able to re-provision 
>>>>> their services to "yesterday's logical date infrastructure"  today - for 
>>>>> example. 
>>>>> 
>>>>> That would not fly with "data date".
>>>>> 
>>>>> J.
>>>>> 
