Just as a little follow up - I think I have a hypothesis about what
happened.

One other user created a very similar issue, and from this comment I gather:
https://github.com/apache/airflow/issues/45940#issuecomment-2608307111

* there is some tool out there that is supposed to make "issue creation"
easier - with help of AI
* some test accounts were used to test it (likely there are people who
maintain a bunch of fake GitHub accounts to test new things with AI)
* apparently some "real" people also got their hands on that tool and tried
it
* this tool LIKELY used "airflow" and "iceberg" in some documentation or
default settings as "examples"
* apparently this tool misled people into thinking they were only "testing"
issue creation, when it actually created those issues
* I guess whoever has the tool realised their mistake and either stopped it
or cleared up the confusion
* I have my own suspicions (which I am exploring) - but I asked the user to
provide information about what tooling they were using (the user apologised
and expressed willingness to provide more information, so I hope to learn
more soon).

J.

On Thu, Jan 23, 2025 at 8:57 AM Piotr Findeisen <piotr.findei...@gmail.com>
wrote:

> Hi
>
> Thank you Jarek for taking care of this matter!
>
> > Should we react and block new users from interacting with Airflow repo if
> we see it happening again?
>
> Maintainers' time is not an infinite resource, so "yes!" from me (also for
> Iceberg).
>
> Best
>
>
>
>
> On Wed, 22 Jan 2025 at 15:40, Russell Spitzer <russell.spit...@gmail.com>
> wrote:
>
> > This is pretty disturbing and I hope that any users out there see that
> > using automated tools to submit issues is just adding noise to the
> > project which makes it very hard for real issues to be addressed.
> >
> > On Wed, Jan 22, 2025 at 6:58 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> >>  - Iceberg dev to not flood them :) (in bcc:)
> >>
> >> It looks like the flood has somehow been flood-gated - no similar reports
> >> for the last 4 hours or so.
> >>
> >> I also started to receive confirmation from GitHub that they are looking
> >> at the reports, so likely we do not have to take any action now, but I
> >> think we can turn this into deciding about "future" reactions when
> >> something like this happens, so that we can potentially react quickly.
> >> What do others think? Should we react and block new users from
> >> interacting with the Airflow repo if we see it happening again? Maybe
> >> temporarily - for a day or two initially - after reporting the first few
> >> cases? Does that sound reasonable?
> >>
> >> J.
> >>
> >> On Wed, Jan 22, 2025 at 11:35 AM Pavankumar Gopidesu <
> >> gopidesupa...@gmail.com> wrote:
> >>
> >>> +1 from me.
> >>>
> >>> It looks like it started yesterday. I feel we may get many of these
> >>> tickets when new users start testing those AI agents.
> >>>
> >>> Regards,
> >>> Pavan Kumar
> >>>
> >>> On Wed, Jan 22, 2025, 10:27 Jarek Potiuk <ja...@potiuk.com> wrote:
> >>>
> >>> > We continue getting new issues - and more of them are by "new users" -
> >>> > created just an hour or so ago.
> >>> >
> >>> > Apparently GitHub has a way to temporarily limit interactions with the
> >>> > repo for new users - see this screenshot:
> >>> >
> >>> > https://ibb.co/WWsr7RB
> >>> >
> >>> > And I think I'd be for enabling it - we will need an INFRA ticket for
> >>> > that, because that's not currently configurable via .asf.yaml - and maybe
> >>> > if Iceberg would like to do it as well, we can create a single ticket for
> >>> > that.
> >>> >
> >>> > There is a new framework coming to enable faster implementation and
> >>> > testing of .asf.yaml features (this was discussed at the latest
> >>> > roundtable) - and we can contribute a feature to add it in .asf.yaml
> >>> > soon, but temporarily we might want to ask INFRA to help.
> >>> >
> >>> > WDYT? If I hear a few voices for +1 and no strong opposition I will open
> >>> > a JIRA ticket (and would love to hear what Iceberg friends of ours think
> >>> > as well :)
> >>> >
> >>> >
> >>> > J.
> >>> >
> >>> >
> >>> > On Wed, Jan 22, 2025 at 10:36 AM Jarek Potiuk <ja...@potiuk.com>
> >>> wrote:
> >>> >
> >>> > > Yeah, just closed this one. The pattern where those are coming at the
> >>> > > same time as two unrelated issues to both Iceberg and Airflow is
> >>> > > very... strange
> >>> > >
> >>> > > On Wed, Jan 22, 2025 at 10:35 AM Elad Kalif <elad...@apache.org>
> >>> wrote:
> >>> > >
> >>> > >> Another one who also opened issues in Airflow and Iceberg
> >>> > >> https://github.com/apache/iceberg/issues/12034
> >>> > >> https://github.com/apache/airflow/issues/45920
> >>> > >>
> >>> > >> Same "mistake" with the # Title.
> >>> > >> All of these seem to come with accounts opened months ago, with some
> >>> > >> minor traffic to their own forks so they would appear legit to GitHub
> >>> > >>
> >>> > >> On Wed, Jan 22, 2025 at 11:23 AM Jarek Potiuk <ja...@potiuk.com>
> >>> wrote:
> >>> > >>
> >>> > >> > Yeah. Again - my guess is that those are "Agentic AI" trials, where
> >>> > >> > someone is deploying fake "agent" accounts acting as "people in the
> >>> > >> > repo would". That's a bit terrifying if this is not contained.
> >>> > >> >
> >>> > >> > On Wed, Jan 22, 2025 at 9:52 AM Fokko Driesprong <
> >>> fo...@apache.org>
> >>> > >> wrote:
> >>> > >> >
> >>> > >> > > That's quite a few! I also noticed that they sometimes self-close the
> >>> > >> > > issue (eg here <https://github.com/apache/iceberg/issues/12032>). Closed
> >>> > >> > > after 1 minute, but still flooding my mailbox :D
> >>> > >> > >
> >>> > >> > > So you might have more such issues now than you think.
> >>> > >> > >
> >>> > >> > >
> >>> > >> > > Yes, that's probably the case, still going through my mailbox.
> >>> > >> > >
> >>> > >> > >
> >>> > >> > > On Wed, 22 Jan 2025 at 09:49, Jarek Potiuk <ja...@potiuk.com> wrote:
> >>> > >> > >
> >>> > >> > > > Example case:
> >>> > >> > > >
> >>> > >> > > > * https://github.com/apache/airflow/issues/45904  - airflow
> >>> > >> > > > * https://github.com/apache/iceberg/issues/12034 - iceberg
> >>> > >> > > >
> >>> > >> > > > Both issues are generic and useless and bring 0 value except noise.
> >>> > >> > > >
> >>> > >> > > > Interesting thing is that many of those users, if you look at their
> >>> > >> > > > history - created a similar number of issues in iceberg and airflow at
> >>> > >> > > > about the same time. So you might have more such issues now than you
> >>> > >> > > > think.
> >>> > >> > > >
> >>> > >> > > > J.
> >>> > >> > > >
> >>> > >> > > >
> >>> > >> > > >
> >>> > >> > > >
> >>> > >> > > > On Wed, Jan 22, 2025 at 9:41 AM Jarek Potiuk <
> >>> ja...@potiuk.com>
> >>> > >> wrote:
> >>> > >> > > >
> >>> > >> > > >> I have not counted all of them. There are quite a few too many - and
> >>> > >> > > >> other people closed some of them as well. I did a very rudimentary check
> >>> > >> > > >> and applied the "AI Spam" label to some of the issues:
> >>> > >> > > >>
> >>> > >> > > >> https://github.com/apache/airflow/issues?q=is%3Aissue%20state%3Aclosed%20AI%20label%3A%22AI%20Spam%22
> >>> > >> > > >>
> >>> > >> > > >> -> so we have had at least 25 such issues in the last 12 hours.
> >>> > >> > > >>
> >>> > >> > > >> > we also want to make sure that we don't accidentally close issues that
> >>> > >> > > >> > don't come from a bot, but just a newcomer to the project.
> >>> > >> > > >>
> >>> > >> > > >> Those reports and patterns look very, very human-like - they are
> >>> > >> > > >> reported infrequently (per user), the description and text seem
> >>> > >> > > >> legitimate, but they are wordy, and just reading them and understanding
> >>> > >> > > >> that they are completely useless takes a lot of time. This is part of
> >>> > >> > > >> the problem: it takes a lot of energy and time to determine if those are
> >>> > >> > > >> valid or not - and at such a rate, it's not sustainable just to analyze
> >>> > >> > > >> whether they are good or bad.
> >>> > >> > > >>
> >>> > >> > > >> J.
> >>> > >> > > >>
> >>> > >> > > >>
> >>> > >> > > >>
> >>> > >> > > >> On Wed, Jan 22, 2025 at 9:23 AM Fokko Driesprong <
> >>> > fo...@apache.org
> >>> > >> >
> >>> > >> > > >> wrote:
> >>> > >> > > >>
> >>> > >> > > >>> Hey Jarek,
> >>> > >> > > >>>
> >>> > >> > > >>> Thanks for bringing this to our attention. When you talk about flooding,
> >>> > >> > > >>> how many are we talking about? I see some suspicious issues (eg, here
> >>> > >> > > >>> <https://github.com/apache/iceberg/issues/12039>), but not many. I hope
> >>> > >> > > >>> this will come to a halt soon because it is all additional work, and we
> >>> > >> > > >>> also want to make sure that we don't accidentally close issues that
> >>> > >> > > >>> don't come from a bot, but just a newcomer to the project.
> >>> > >> > > >>>
> >>> > >> > > >>> Kind regards,
> >>> > >> > > >>> Fokko
> >>> > >> > > >>>
> >>> > >> > > >>> On Wed, 22 Jan 2025 at 09:00, Jarek Potiuk <ja...@potiuk.com> wrote:
> >>> > >> > > >>>
> >>> > >> > > >>> > Hey Iceberg community, And Airflow community too.
> >>> > >> > > >>> >
> >>> > >> > > >>> > As of yesterday Airflow repo is literally flooded with a number of
> >>> > >> > > >>> > issues that look almost good, except they are clearly AI generated and
> >>> > >> > > >>> > make no sense or repeat content from other issues. We noticed that the
> >>> > >> > > >>> > users who create a lot of the "spam AI" issues that are created in
> >>> > >> > > >>> > Airflow are also creating similar issues for Iceberg.
> >>> > >> > > >>> >
> >>> > >> > > >>> > We got to the point that we are closing and reporting such issues to
> >>> > >> > > >>> > GitHub and we are blocking all such users without spending too much
> >>> > >> > > >>> > time on it with messages similar to this:
> >>> > >> > > >>> >
> >>> > >> > > >>> > ```
> >>> > >> > > >>> > This looks totally AI-generated. useless issue report that brings no
> >>> > >> > > >>> > value and makes no sense. We are generally blocking users that sends a
> >>> > >> > > >>> > lot of spam AI reports generated by bots.. as of yesterday so we will
> >>> > >> > > >>> > report your account and block it unless:
> >>> > >> > > >>> >
> >>> > >> > > >>> > a) you explain how you generated reports
> >>> > >> > > >>> > b) prove you are human
> >>> > >> > > >>> > c) explain why you created the issue
> >>> > >> > > >>> > ```
> >>> > >> > > >>> >
> >>> > >> > > >>> > My guess is that some company released and is testing an "agentic AI"
> >>> > >> > > >>> > that is "github-targeted" - where people can run the AI agents on their
> >>> > >> > > >>> > behalf. It does not look like regular bot-spam.
> >>> > >> > > >>> > I think we should all generally crowd-source reporting it to GitHub -
> >>> > >> > > >>> > and hopefully they will find a way to battle those without involving
> >>> > >> > > >>> > maintainers.
> >>> > >> > > >>> >
> >>> > >> > > >>> > I hope it will not last too long.
> >>> > >> > > >>> >
> >>> > >> > > >>> > J.
> >>> > >> > > >>> >
> >>> > >> > > >>> >
> >>> > >> > > >>> >
> >>> > >> > > >>> > ---------- Forwarded message ---------
> >>> > >> > > >>> > From: Jarek Potiuk <ja...@potiuk.com>
> >>> > >> > > >>> > Date: Wed, Jan 22, 2025 at 8:12 AM
> >>> > >> > > >>> > Subject: Re: Very strange (AI generated) issues
> >>> > >> > > >>> > To: <d...@airflow.apache.org>
> >>> > >> > > >>> >
> >>> > >> > > >>> >
> >>> > >> > > >>> > You can also report it directly from the issue (... at the top and
> >>> > >> > > >>> > "report content")
> >>> > >> > > >>> >
> >>> > >> > > >>> > On Wed, Jan 22, 2025 at 7:46 AM Amogh Desai <
> >>> > >> > > amoghdesai....@gmail.com>
> >>> > >> > > >>> > wrote:
> >>> > >> > > >>> >
> >>> > >> > > >>> >> Elad, I just managed to report this user.
> >>> > >> > > >>> >>
> >>> > >> > > >>> >> This is how it's done:
> >>> > >> > > >>> >>
> >>> > >> > > >>> >> https://docs.github.com/en/communities/maintaining-your-safety-on-github/reporting-abuse-or-spam#reporting-a-user
> >>> > >> > > >>> >>
> >>> > >> > > >>> >> Thanks & Regards,
> >>> > >> > > >>> >> Amogh Desai
> >>> > >> > > >>> >>
> >>> > >> > > >>> >>
> >>> > >> > > >>> >> On Wed, Jan 22, 2025 at 12:05 PM Elad Kalif <
> >>> > >> elad...@apache.org>
> >>> > >> > > >>> wrote:
> >>> > >> > > >>> >>
> >>> > >> > > >>> >> > There are several reports from this user
> >>> > >> > > >>> >> >
> >>> > >> > > >>> >> > https://github.com/atharv9017
> >>> > >> > > >>> >> >
> >>> > >> > > >>> >> >
> >>> > >> > > >>> >> > I didn't find a way to report the user account to GitHub.
> >>> > >> > > >>> >> >
> >>> > >> > > >>> >> > On Wed, 22 Jan 2025 at 06:41, Pavankumar Gopidesu
> >>> > >> > > >>> >> > <gopidesupa...@gmail.com> wrote:
> >>> > >> > > >>> >> >
> >>> > >> > > >>> >> > > Yes, still issues are coming.
> >>> > >> > > >>> >> > >
> >>> > >> > > >>> >> > > Regards,
> >>> > >> > > >>> >> > > Pavan
> >>> > >> > > >>> >> > >
> >>> > >> > > >>> >> > > On Wed, Jan 22, 2025 at 4:35 AM Amogh Desai <
> >>> > >> > > >>> amoghdesai....@gmail.com
> >>> > >> > > >>> >> >
> >>> > >> > > >>> >> > > wrote:
> >>> > >> > > >>> >> > >
> >>> > >> > > >>> >> > > > I saw a couple of such SPAM issues too.
> >>> > >> > > >>> >> > > >
> >>> > >> > > >>> >> > > > I also recall some SPAM comments on pull requests as well, so if any
> >>> > >> > > >>> >> > > > contributor sees any such SPAM message,
> >>> > >> > > >>> >> > > > please report it on Slack so that we can delete it and report it.
> >>> > >> > > >>> >> > > >
> >>> > >> > > >>> >> > > > Thanks & Regards,
> >>> > >> > > >>> >> > > > Amogh Desai
> >>> > >> > > >>> >> > > >
> >>> > >> > > >>> >> > > >
> >>> > >> > > >>> >> > > > On Wed, Jan 22, 2025 at 8:45 AM Zhe You Liu <
> >>> > >> > > >>> zhu424....@gmail.com>
> >>> > >> > > >>> >> > > wrote:
> >>> > >> > > >>> >> > > >
> >>> > >> > > >>> >> > > > > I came across another strange issue:
> >>> > >> > > >>> >> > > > > https://github.com/apache/airflow/issues/45837. It appears to be a
> >>> > >> > > >>> >> > > > > copy-paste of https://github.com/apache/airflow/issues/45661 with just
> >>> > >> > > >>> >> > > > > the issue title changed.
> >>> > >> > > >>> >> > > > >
> >>> > >> > > >>> >> > > > > On Wed, Jan 22, 2025 at 6:50 AM Jarek Potiuk <
> >>> > >> > > >>> ja...@potiuk.com>
> >>> > >> > > >>> >> > wrote:
> >>> > >> > > >>> >> > > > >
> >>> > >> > > >>> >> > > > > > I even got to this stage:
> >>> > >> > > >>> >> > > > > >
> >>> > >> > > >>> >> > > > > > > We've received a few new tickets from your account recently. If
> >>> > >> > > >>> >> > > > > > > you'd like to add additional information you can add a comment to
> >>> > >> > > >>> >> > > > > > > an existing ticket, or wait a few minutes before opening a new
> >>> > >> > > >>> >> > > > > > > ticket.
> >>> > >> > > >>> >> > > > > >
> >>> > >> > > >>> >> > > > > > On Tue, Jan 21, 2025 at 11:49 PM Jarek
> Potiuk <
> >>> > >> > > >>> ja...@potiuk.com
> >>> > >> > > >>> >> >
> >>> > >> > > >>> >> > > > wrote:
> >>> > >> > > >>> >> > > > > >
> >>> > >> > > >>> >> > > > > > > There are a few more that I still saw after sending it. There is
> >>> > >> > > >>> >> > > > > > > something going on bypassing GitHub filters. I hope they will manage
> >>> > >> > > >>> >> > > > > > > to do something about it
> >>> > >> > > >>> >> > > > > > >
> >>> > >> > > >>> >> > > > > > > Last one is https://github.com/apache/airflow/issues/45867
> >>> > >> > > >>> >> > > > > > >
> >>> > >> > > >>> >> > > > > > > On Tue, Jan 21, 2025 at 11:46 PM Vikram
> Koka
> >>> > >> > > >>> >> > > > > > <vik...@astronomer.io.invalid>
> >>> > >> > > >>> >> > > > > > > wrote:
> >>> > >> > > >>> >> > > > > > >
> >>> > >> > > >>> >> > > > > > >> Agreed.
> >>> > >> > > >>> >> > > > > > >>
> >>> > >> > > >>> >> > > > > > >> Thanks for flagging these Jarek!
> >>> > >> > > >>> >> > > > > > >>
> >>> > >> > > >>> >> > > > > > >>
> >>> > >> > > >>> >> > > > > > >> On Tue, Jan 21, 2025 at 2:34 PM Jarek
> >>> Potiuk <
> >>> > >> > > >>> >> ja...@potiuk.com>
> >>> > >> > > >>> >> > > > > wrote:
> >>> > >> > > >>> >> > > > > > >>
> >>> > >> > > >>> >> > > > > > >> > Seems that we have a flood of AI generated feature requests for
> >>> > >> > > >>> >> > > > > > >> > Airflow. The issues look somewhat legitimate, with somewhat related
> >>> > >> > > >>> >> > > > > > >> > content, but they are wordy and make no sense when you read them.
> >>> > >> > > >>> >> > > > > > >> > Some examples:
> >>> > >> > > >>> >> > > > > > >> >
> >>> > >> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45858
> >>> > >> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45856
> >>> > >> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45854
> >>> > >> > > >>> >> > > > > > >> >
> >>> > >> > > >>> >> > > > > > >> > All of them done by accounts with short history in GH and not much
> >>> > >> > > >>> >> > > > > > >> > activity before
> >>> > >> > > >>> >> > > > > > >> >
> >>> > >> > > >>> >> > > > > > >> > There were quite a few more.
> >>> > >> > > >>> >> > > > > > >> >
> >>> > >> > > >>> >> > > > > > >> > I suggest we close such issues AND report authors to GitHub -
> >>> > >> > > >>> >> > > > > > >> > hopefully we can help to battle the AI-generated traffic flood.
> >>> > >> > > >>> >> > > > > > >> >
> >>> > >> > > >>> >> > > > > > >> > J.
> >>> > >> > > >>> >> > > > > > >> >
> >>> > >> > > >>> >> > > > > > >>
> >>> > >> > > >>> >> > > > > > >
> >>> > >> > > >>> >> > > > > >
> >>> > >> > > >>> >> > > > >
> >>> > >> > > >>> >> > > >
> >>> > >> > > >>> >> > >
> >>> > >> > > >>> >> >
> >>> > >> > > >>> >>
> >>> > >> > > >>> >
> >>> > >> > > >>>
> >>> > >> > > >>
> >>> > >> > >
> >>> > >> >
> >>> > >>
> >>> > >
> >>> >
> >>>
> >>
>
