Hey Jarek,

Thanks for taking the time! I think we are actually well aligned :) The 
framework we're developing to set up and run these tests in an AWS account has 
all four of the characteristics you mentioned. So the real questions are where 
they run and who owns them.


> Looking at the expectations above - I think it would be better to run such 
> tests by the Amazon team for Amazon, Google team for Google etc.
> I think it will be far more efficient to get it in the hands of those 
> stakeholders who are mostly interested in getting the "green" tests


On our end, we are committed to keeping those tests green, or at least triaging 
failures and opening tickets to work with the community to get them green. We'll 
be doing this either way, whether or not there is a publicly running copy of the 
stack.


> That is a much more scalable solution from the community point of view. We are
> not going to publish it to our users, and it is not really needed to be run on
> our infra. I don't see a particular need for regular community members to even
> know how/what infrastructure is used to run the tests - the test execution is
> pretty standardised, and I think we are really interested in the output rather
> than the infra to run it.

Agreed. So the question now is what the "API" between us and the community looks 
like if we run this infra:

1. For vending the results, it seems agreeable to publish a public dashboard of 
the real-time results as we run the tests daily. This will likely be our first 
goal. We can link to it from somewhere in the Airflow docs/site (perhaps 
somewhere on the ecosystem page?) for any interested folks. Though I agree that 
it will mostly be release managers who are interested, for provider package 
releases.
I'm not sure that notifying on the Apache Airflow Slack will be useful for many 
users. Also, once more cloud providers' system tests are up and running, it 
could become quite spammy. Though I'm interested to hear what others think.

2. For allowing folks to trigger the tests and inspect the logs, this is 
trickier. But I'm not sure it's actually a blocking issue, at least to start. 
All of the AWS system tests and the code to run them in Breeze are published to 
the Airflow code base. So if some system test, say SQS, is failing and someone 
from the community would like to work on it, there's nothing stopping them from 
creating an AWS account, deploying credentials to their machine, and running:

    breeze testing tests --system amazon tests/system/providers/amazon/aws/example_sqs.py

For an individual developer working on a PR, this will be a faster dev/test 
cycle anyway.
You may run into services that don't have sufficient free-tier usage for 
contributors to do this for free, but I think it will cover most cases in the 
short term, until we implement a way for folks to remotely trigger these system 
tests. (A rough sketch of what one of these system tests actually looks like 
follows below.)
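
This is from memory and only an illustration - the real test lives at 
tests/system/providers/amazon/aws/example_sqs.py in the Airflow repo, and the 
helper names/paths (watcher, get_test_run) in the AIP-47 layout may differ 
slightly from what I show here:

    import os
    from datetime import datetime

    from airflow import DAG
    from airflow.providers.amazon.aws.operators.sqs import SqsPublishOperator

    # In the real test the queue is created and torn down by setup/teardown
    # tasks; here it is just read from the environment to keep the sketch short.
    QUEUE_URL = os.environ.get("SQS_TEST_QUEUE_URL", "")

    with DAG(
        dag_id="example_sqs",
        schedule="@once",
        start_date=datetime(2021, 1, 1),
        catchup=False,
        tags=["example"],
    ) as dag:
        publish_to_queue = SqsPublishOperator(
            task_id="publish_to_queue",
            sqs_queue=QUEUE_URL,
            message_content="hello from the SQS system test",
        )

        # AIP-47 tests wire in a "watcher" so a failed teardown still fails the run
        from tests.system.utils.watcher import watcher

        list(dag.tasks) >> watcher()

    # Lets pytest (run via Breeze) discover the DAG and execute it as a test
    from tests.system.utils import get_test_run  # noqa: E402

    test_run = get_test_run(dag)

The watcher/get_test_run boilerplate at the bottom is what lets the same file 
run both as a normal DAG and as a pytest test.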

So how about this for an incremental plan:

1. We at Amazon will continue to host and run the AWS system tests for the time 
being, with us as the "owners" of triaging failing tests: either fixing them 
ourselves or cutting tickets in the Apache Airflow GitHub for other contributors 
to take on.
2. We'll work next on exposing results via some kind of public dashboard so 
that the community can see the real-time health of the AWS provider package.
3. Then we'll follow up with some mechanism to trigger these tests from the 
Airflow community, whether that be from within a PR or by the release manager. 
Though I think this one still needs some more thinking on just how it would work 
and scale.

Cheers,
Niko


________________________________
From: Jarek Potiuk <ja...@potiuk.com>
Sent: Friday, August 19, 2022 2:32:11 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL]Vending AWS System Test Results Back to the Community


Hey Niko,

Very good points to discuss. I think this is something equally needed for 
AWS/Google, but also for other "popular" services we have integrations with (and 
eventually fulfilling the goal of AIP-4) :)

Context:

I am a big fan of thinking of CI systems as largely invisible to the "regular" 
users. The best CI system (an almost impossible goal, but it should always be 
our compass) is a system you are not aware of until you (or someone) introduce a 
change that needs action from someone - when things get broken. In the case of 
System Tests the breakage might come from multiple sources: a code change, a 
library upgrade, the service changing its API, a required permission change, 
etc. Unlike other types of tests, a lot more of those failures are independent 
of the PRs merged - it is actually more likely that some external change will 
impact System Tests.

Also, System Tests cannot really be run with every PR (they take too long), so 
there are somewhat different "usage" and "access" needs for the results. This 
impacts what can trigger such tests. I think it's far more likely that we will 
run the system tests regularly on a schedule (once a day?) and manually - by the 
release manager when we prepare a provider release to verify that the providers 
still work, or when we want to check whether we fixed a problem reported by a 
failing "scheduled" run (from a PR branch).

And I think it's not only "how you access" the results, but also how "failure 
notifications" are delivered and how the tests are triggered - and we should 
answer all of those questions.

Audience of the solution:

This leads to another question - WHO will be interested in seeing the 
notifications and fixing the problems?

I think this is the most important question to answer. I really think regular 
contributors (even those who contribute to - say - the Amazon provider) will not 
be interested and will not regularly monitor failure notifications. Unlike 
regular "test failure" notifications, they should not go to the users who made 
the PR but to those who are interested in keeping the provider green. It will be 
rather difficult in a number of cases to (automatically) engage a contributor's 
attention when such system tests start to fail. But eventually (and I think this 
split is quite obvious) there will be people interested in monitoring the 
overall health of a given provider. They don't have to fix it themselves; they 
can merely investigate, see that this or that likely caused the problem, and 
"pull in" the PR contributors. But there is a watch-out there - those 
contributors might not have, or might not want to use, access to run such system 
tests manually (it might cost money, might need paid accounts, might have some 
risks involved as data is deleted/recreated, etc.). Those contributors should 
see the logs/results and should be able to fix the problem, but then they should 
also be able to trigger system test execution on their PR when they want to 
check whether the problem is fixed.

Characteristics of the solution:

So I think any solution should have the following characteristics:

1) Produce notifications that the executed tests have failed/succeeded - these 
should go to a dedicated, separate per-provider place (for example a Slack 
channel; with the "low" frequency of such messages, a Slack channel seems like 
the best idea). Then people who want to monitor a given provider could simply 
subscribe to that channel. Seeing a regular "All tests passed" and an occasional 
"some tests failed" message there is a great indication of a) whether the tests 
continue to work in general and b) when things fail. We need to have a regular 
schedule and notify about successes as well - to create a kind of "heartbeat" 
that tells the people monitoring the errors that things are not working when the 
heartbeat is missing (a rough sketch of what such a notifier could look like is 
after this list).

2) Seeing logs - the notifications should contain links to logs that can be 
browsed by anyone (read-only). Luckily we have no "secret" information, so it 
could be a publicly available link. Ideally it should be a cloud-based one 
(CloudWatch for AWS). I think access to logs is absolutely crucial for anyone 
trying to investigate and fix a problem. And I think this is the only thing 
needed by anyone outside of the group interested in keeping the provider "green" 
- such individual contributors might only need to see a "green" log to compare 
what has changed in this particular build, but they are not really interested in 
seeing the historical stats.

3) Triggering the tests. This one is tricky. It should be accessible to everyone 
contributing a PR, but it should be controlled somehow. I have no idea yet how 
this can be gated/controlled, but it is something we will need to figure out. 
Who, when, and how can someone trigger such a build for their own PR? There 
might be various ways - a special comment on the PR plus some conditions 
(approvals?) that the PR/user should fulfill to be able to trigger it - but this 
is not something I have a complete proposal for yet.

4) Dashboard - that one is mostly interesting to the release manager and to 
people who are interested in keeping a given provider "green". It's OK to make 
it public, but it does not need to be "beautiful" or anything - it can be a very 
"raw" output.
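
To make 1) a bit more concrete - just an illustration, not a proposal; the 
webhook URL, channel and message format are placeholder assumptions - such a 
per-provider "heartbeat" notifier posting to a Slack incoming webhook after 
every scheduled run could be as small as:

    import json
    import urllib.request

    # Placeholder - in reality the webhook of the per-provider channel
    SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"

    def notify(provider: str, passed: int, failed: int, logs_url: str) -> None:
        # Post on success as well as failure, so a missing message is itself a signal
        status = "All tests passed" if failed == 0 else f"{failed} test(s) FAILED"
        text = f"[{provider}] system tests: {status} ({passed} passed). Logs: {logs_url}"
        payload = json.dumps({"text": text}).encode("utf-8")
        request = urllib.request.Request(
            SLACK_WEBHOOK_URL,
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(request)  # real code would handle errors/retries

The important property is only that it runs on a fixed schedule and always 
posts, whatever the result.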

In this context, answering some of your questions, Niko:

* I do not think we need an automated API. The frequency and nature of the 
"reasons" for failures are such that I do not see why we would consume the 
results programmatically (but this might come in the future as we learn).
* A public dashboard is fine, but public access to logs is far more important 
IMHO.

Who should run the infrastructure?

Looking at the expectations above - I think it would be better to run such 
tests by the Amazon team for Amazon, Google team for Google etc. While it would 
be best if the CloudFormation scripts were published, I think it will be far 
more efficient to get it in the hands of those stakeholders who are mostly 
interested in getting the "green" tests. That is a much more scalable solution 
from the community point of view. We are not going to publish it to our users, 
and it is not really needed to be run on our infra. I don't see a particular 
need for regular community members to even know how/what infrastructure is used 
to run the tests - the test execution is pretty standardised, and I think we are 
really interested in the output rather than the infra to run it.

J.




On Fri, Aug 19, 2022 at 2:53 PM Kamil Breguła 
<dzaku...@gmail.com<mailto:dzaku...@gmail.com>> wrote:
I don't think we have to limit ourselves so that only committers have access to 
the Amazon account managed by the Airflow community. In the past, committers 
were supported by other people whom they trust, e.g. a committer asked for help 
from a co-worker at their company when they needed it.

This means that there are no restrictions on Amazon employees using this account 
and maintaining this environment.

We just have to be careful that non-committers do not have write permission to 
the repository, and that they cannot publish a new version of the application 
that could be seen as officially released by the Apache Foundation.

On Fri, Aug 19, 2022, 01:30 Oliveira, Niko <oniko...@amazon.com.invalid> wrote:

Hey folks,


Those of us on the AWS Airflow team (myself, Dennis F, Vincent B, Seyed H) have 
been working on a few projects over the past few months:


1. Writing example DAGs/docs for all existing Operators in the AWS Airflow 
provider package (done)

2. Writing AWS-specific logic in the Airflow codebase to support AIP-47 (done)

3. Converting all example DAGs to AIP-47-compliant system tests (just over 
halfway done)


All of these ultimately culminate in the goal of us running these system tests 
at a regular cadence within Amazon (where we have access to funded AWS 
accounts). We will run these system tests, triggered by updates to airflow:main, 
at least once a day.

I'd like to open a discussion on how we can vend these results back to the 
community in a way that is most consumable for contributors, release managers 
and users alike.

A quick and easy approach would be to create a publicly viewable CloudWatch 
dashboard with at least the following metrics for each system test over time: 
pass/fail, duration, and execution count.
This would be a human-readable way to consume the current status of the AWS 
Operators (a rough sketch of how we could publish those metrics is below).
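
For illustration only - the namespace, metric and dimension names below are 
placeholders we haven't settled on - each test run could push its result to 
CloudWatch along these lines:

    import boto3

    cloudwatch = boto3.client("cloudwatch")

    def publish_result(test_id: str, passed: bool, duration_seconds: float) -> None:
        # One data point per system-test execution; the dashboard then charts
        # pass/fail, duration and execution count per test over time.
        dimensions = [{"Name": "SystemTest", "Value": test_id}]
        cloudwatch.put_metric_data(
            Namespace="AirflowSystemTests/Amazon",  # placeholder namespace
            MetricData=[
                {"MetricName": "Passed", "Dimensions": dimensions,
                 "Value": 1.0 if passed else 0.0, "Unit": "Count"},
                {"MetricName": "Duration", "Dimensions": dimensions,
                 "Value": duration_seconds, "Unit": "Seconds"},
                {"MetricName": "Executions", "Dimensions": dimensions,
                 "Value": 1.0, "Unit": "Count"},
            ],
        )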


If a more machine-readable format is required/preferred (e.g. for scripts 
related to Airflow release management, perhaps), we could also put together a 
simple API Gateway endpoint that vends the same data as JSON - something along 
the lines of the sketch below.
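
Again just a sketch under assumptions - where the results are stored (DynamoDB 
here), the table name and the response shape are all placeholders - the idea 
would be a small Lambda behind API Gateway:

    import json

    import boto3

    TABLE_NAME = "airflow-system-test-results"  # placeholder

    def lambda_handler(event, context):
        # Read the latest result rows and return them as a JSON document
        table = boto3.resource("dynamodb").Table(TABLE_NAME)
        items = table.scan().get("Items", [])
        body = [
            {
                "test_id": item["test_id"],
                "status": item["status"],
                "duration_seconds": float(item["duration_seconds"]),
                "run_date": item["run_date"],
            }
            for item in items
        ]
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(body),
        }

That would be trivial for release-management scripts to consume.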

Another interesting option would be for us to publish the CloudFormation 
templates (or the codebase used to generate the templates) for configuring the 
system test environment and executing the tests. This could be deployed to an 
AWS account owned and managed by the Airflow community where tests would be run 
periodically. AWS has provided some credits in the past which could be used to 
help fund the account. But this introduces a large component that would need 
ownership and management by folks within the Airflow community who have access 
to such AWS accounts and credits (likely only committers/release managers?). So 
it might not be worth the complexity.


I'd like to hear what folks think!

Cheers,
Niko


