We keep getting new issues, and more of them are from "new users" whose
accounts were created just an hour or so ago.

Apparently GitHub has a way to temporarily limit interactions with the repo
for new users - see this screenshot:

https://ibb.co/WWsr7RB

And I think I'd be in favor of enabling it. We will need an INFRA ticket for
that, because it is not currently configurable via .asf.yaml - and if
Iceberg would like to do it as well, we can create a single ticket for both.

There is a new framework coming to enable faster implementation and testing
of .asf.yaml features (this was discussed at the latest roundtable), and
we can soon contribute a feature that adds this setting to .asf.yaml, but in
the meantime we might want to ask INFRA to help.
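
For reference, the interaction limit mentioned above can also be set through
GitHub's REST API (the "interaction-limits" endpoint for repositories). Below
is a minimal sketch of what such a request might look like. It only builds the
request and does not send it. The helper name and the token are hypothetical,
and for ASF repos this would still have to be done by someone with admin
rights (INFRA), so treat it as an illustration rather than something we can
run ourselves.

```python
import json
import urllib.request

# Values accepted by the GitHub REST API for repository interaction limits.
VALID_LIMITS = {"existing_users", "contributors_only", "collaborators_only"}
VALID_EXPIRIES = {"one_day", "three_days", "one_week", "one_month", "six_months"}


def build_interaction_limit_request(owner, repo, token,
                                    limit="existing_users", expiry="one_day"):
    """Build (but do not send) a PUT request that sets a temporary limit.

    The function name and arguments are hypothetical, for illustration only.
    "existing_users" restricts recently created accounts, which is the case
    discussed in this thread.
    """
    if limit not in VALID_LIMITS:
        raise ValueError(f"unknown limit: {limit}")
    if expiry not in VALID_EXPIRIES:
        raise ValueError(f"unknown expiry: {expiry}")

    url = f"https://api.github.com/repos/{owner}/{repo}/interaction-limits"
    body = json.dumps({"limit": limit, "expiry": expiry}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Example: prepare a one-day limit for apache/airflow (placeholder token).
request = build_interaction_limit_request("apache", "airflow", "PLACEHOLDER_TOKEN")
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing the request to
urllib.request.urlopen with a real admin token. The limit expires on its own
after the chosen period.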

WDYT? If I hear a few +1 voices and no strong opposition, I will open a
JIRA ticket. I would also love to hear what our Iceberg friends think as
well :)


J.


On Wed, Jan 22, 2025 at 10:36 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Yeah, I just closed this one. The pattern where these arrive at the same
> time as two unrelated issues in both Iceberg and Airflow is very ...
> strange.
>
> On Wed, Jan 22, 2025 at 10:35 AM Elad Kalif <elad...@apache.org> wrote:
>
>> Another one who also opened issues in Airflow and Iceberg
>> https://github.com/apache/iceberg/issues/12034
>> https://github.com/apache/airflow/issues/45920
>>
>> Same "mistake" with the # Title.
>> All of these seem to come from accounts opened months ago, with some minor
>> traffic to their own forks so that they appear legit to GitHub
>>
>> On Wed, Jan 22, 2025 at 11:23 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>> > Yeah. Again - my guess is that those are "Agentic AI" trials, where someone
>> > is deploying fake "agent" accounts acting as "people in the repo would".
>> > That's a bit terrifying if this is not contained.
>> >
>> > On Wed, Jan 22, 2025 at 9:52 AM Fokko Driesprong <fo...@apache.org> wrote:
>> >
>> > > That's quite a few! I also noticed that they sometimes self-close the issue
>> > > (eg here <https://github.com/apache/iceberg/issues/12032>). Closed after 1
>> > > minute, but still flooding my mailbox :D
>> > >
>> > > So you might have more such issues now than you think.
>> > >
>> > >
>> > > Yes, that's probably the case, still going through my mailbox.
>> > >
>> > >
>> > > On Wed, Jan 22, 2025 at 09:49, Jarek Potiuk <ja...@potiuk.com> wrote:
>> > >
>> > > > Example case:
>> > > >
>> > > > * https://github.com/apache/airflow/issues/45904  - airflow
>> > > > * https://github.com/apache/iceberg/issues/12034 - iceberg
>> > > >
>> > > > Both issues are generic and useless and bring 0 value except noise.
>> > > >
>> > > > Interesting thing is that many of those users, if you look at their
>> > > > history, created a similar number of issues in Iceberg and Airflow at
>> > > > about the same time. So you might have more such issues now than you think.
>> > > >
>> > > > J.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Wed, Jan 22, 2025 at 9:41 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>> > > >
>> > > >> I have not counted all of them. There are quite a lot of them, and
>> > > >> other people closed some of them as well. I did a very rudimentary
>> > > >> check and applied the "AI Spam" label to some of the issues:
>> > > >>
>> > > >> https://github.com/apache/airflow/issues?q=is%3Aissue%20state%3Aclosed%20AI%20label%3A%22AI%20Spam%22
>> > > >>
>> > > >> So we have had at least 25 such issues in the last 12 hours.
>> > > >>
>> > > >> > we also want to make sure that we don't accidentally close issues that
>> > > >> > don't come from a bot, but just a newcomer to the project.
>> > > >>
>> > > >> Those reports and patterns look very, very human-like. They are reported
>> > > >> infrequently (per user), and the description and text seem legitimate, but
>> > > >> they are wordy, and just reading them and understanding that they are
>> > > >> completely useless takes a lot of time. This is part of the problem: it
>> > > >> takes a lot of energy and time to determine whether they are valid, and at
>> > > >> such a rate it is not sustainable just to analyze whether they are good or
>> > > >> bad.
>> > > >>
>> > > >> J.
>> > > >>
>> > > >>
>> > > >>
>> > > >> On Wed, Jan 22, 2025 at 9:23 AM Fokko Driesprong <fo...@apache.org> wrote:
>> > > >>
>> > > >>> Hey Jarek,
>> > > >>>
>> > > >>> Thanks for bringing this to our attention. When you talk about flooding,
>> > > >>> how many are we talking about? I see some suspicious issues (eg, here
>> > > >>> <https://github.com/apache/iceberg/issues/12039>), but not many. I
>> > > >>> hope this will come to a halt soon because it is all additional work, and
>> > > >>> we also want to make sure that we don't accidentally close issues that
>> > > >>> don't come from a bot, but just a newcomer to the project.
>> > > >>>
>> > > >>> Kind regards,
>> > > >>> Fokko
>> > > >>>
>> > > >>> On Wed, Jan 22, 2025 at 09:00, Jarek Potiuk <ja...@potiuk.com> wrote:
>> > > >>>
>> > > >>> > Hey Iceberg community, And Airflow community too.
>> > > >>> >
>> > > >>> > As of yesterday, the Airflow repo is literally flooded with issues
>> > > >>> > that look almost good, except they are clearly AI generated and make no
>> > > >>> > sense or repeat content from other issues. We noticed that the users who
>> > > >>> > create a lot of the "spam AI" issues in Airflow are also
>> > > >>> > creating similar issues for Iceberg.
>> > > >>> >
>> > > >>> > We got to the point that we are closing and reporting such issues to
>> > > >>> > GitHub, and we are blocking all such users without spending too much
>> > > >>> > time on it, with messages similar to this:
>> > > >>> >
>> > > >>> > ```
>> > > >>> > This looks like a totally AI-generated, useless issue report that brings
>> > > >>> > no value and makes no sense. As of yesterday we are generally blocking
>> > > >>> > users that send a lot of spam AI reports generated by bots, so we will
>> > > >>> > report your account and block it unless:
>> > > >>> >
>> > > >>> > a) you explain how you generated reports
>> > > >>> > b) prove you are human
>> > > >>> > c) explain why you created the issue
>> > > >>> > ```
>> > > >>> >
>> > > >>> > My guess is that some company released and is testing an "agentic AI"
>> > > >>> > that is "github-targeted" - where people can run the AI agents on their
>> > > >>> > behalf. It does not look like regular bot-spam.
>> > > >>> > I think we should all generally crowd-source reporting it to GitHub, and
>> > > >>> > hopefully they will find a way to battle those without involving
>> > > >>> > maintainers.
>> > > >>> >
>> > > >>> > I hope it will not last too long.
>> > > >>> >
>> > > >>> > J.
>> > > >>> >
>> > > >>> >
>> > > >>> >
>> > > >>> > ---------- Forwarded message ---------
>> > > >>> > From: Jarek Potiuk <ja...@potiuk.com>
>> > > >>> > Date: Wed, Jan 22, 2025 at 8:12 AM
>> > > >>> > Subject: Re: Very strange (AI generated) issues
>> > > >>> > To: <d...@airflow.apache.org>
>> > > >>> >
>> > > >>> >
>> > > >>> > You can also report it directly from the issue (... at the top and
>> > > >>> > "report content")
>> > > >>> >
>> > > >>> > On Wed, Jan 22, 2025 at 7:46 AM Amogh Desai <amoghdesai....@gmail.com> wrote:
>> > > >>> >
>> > > >>> >> Elad, I just managed to report this user.
>> > > >>> >>
>> > > >>> >> This is how it's done:
>> > > >>> >>
>> > > >>> >> https://docs.github.com/en/communities/maintaining-your-safety-on-github/reporting-abuse-or-spam#reporting-a-user
>> > > >>> >>
>> > > >>> >> Thanks & Regards,
>> > > >>> >> Amogh Desai
>> > > >>> >>
>> > > >>> >>
>> > > >>> >> On Wed, Jan 22, 2025 at 12:05 PM Elad Kalif <elad...@apache.org> wrote:
>> > > >>> >>
>> > > >>> >> > There are several reports from this user
>> > > >>> >> >
>> > > >>> >> > https://github.com/atharv9017
>> > > >>> >> >
>> > > >>> >> >
>> > > >>> >> > I didn't find a way to report the user account to GitHub.
>> > > >>> >> >
>> > > >>> >> > On Wed, Jan 22, 2025, 06:41, Pavankumar Gopidesu <gopidesupa...@gmail.com> wrote:
>> > > >>> >> >
>> > > >>> >> > > Yes, issues are still coming.
>> > > >>> >> > >
>> > > >>> >> > > Regards,
>> > > >>> >> > > Pavan
>> > > >>> >> > >
>> > > >>> >> > > On Wed, Jan 22, 2025 at 4:35 AM Amogh Desai <amoghdesai....@gmail.com> wrote:
>> > > >>> >> > >
>> > > >>> >> > > > I saw a couple of such SPAM issues too.
>> > > >>> >> > > >
>> > > >>> >> > > > I also recall some SPAM comments on pull requests, so if any
>> > > >>> >> > > > contributor sees any such SPAM message,
>> > > >>> >> > > > please report it on Slack so that we can delete it and report it.
>> > > >>> >> > > >
>> > > >>> >> > > > Thanks & Regards,
>> > > >>> >> > > > Amogh Desai
>> > > >>> >> > > >
>> > > >>> >> > > >
>> > > >>> >> > > > On Wed, Jan 22, 2025 at 8:45 AM Zhe You Liu <zhu424....@gmail.com> wrote:
>> > > >>> >> > > >
>> > > >>> >> > > > > I came across another strange issue:
>> > > >>> >> > > > > https://github.com/apache/airflow/issues/45837. It appears to be a
>> > > >>> >> > > > > copy-paste of https://github.com/apache/airflow/issues/45661 with
>> > > >>> >> > > > > just the issue title changed.
>> > > >>> >> > > > >
>> > > >>> >> > > > > On Wed, Jan 22, 2025 at 6:50 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>> > > >>> >> > > > >
>> > > >>> >> > > > > > I even got to this stage:
>> > > >>> >> > > > > >
>> > > >>> >> > > > > > > We've received a few new tickets from your account recently. If
>> > > >>> >> > > > > > > you'd like to add additional information you can add a comment to
>> > > >>> >> > > > > > > an existing ticket, or wait a few minutes before opening a new
>> > > >>> >> > > > > > > ticket.
>> > > >>> >> > > > > >
>> > > >>> >> > > > > > On Tue, Jan 21, 2025 at 11:49 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>> > > >>> >> > > > > >
>> > > >>> >> > > > > > > There are a few more that I still saw after sending it. There is
>> > > >>> >> > > > > > > something going on that bypasses GitHub filters. I hope they will
>> > > >>> >> > > > > > > manage to do something about it.
>> > > >>> >> > > > > > >
>> > > >>> >> > > > > > > Last one is https://github.com/apache/airflow/issues/45867
>> > > >>> >> > > > > > >
>> > > >>> >> > > > > > > On Tue, Jan 21, 2025 at 11:46 PM Vikram Koka <vik...@astronomer.io.invalid> wrote:
>> > > >>> >> > > > > > >
>> > > >>> >> > > > > > >> Agreed.
>> > > >>> >> > > > > > >>
>> > > >>> >> > > > > > >> Thanks for flagging these Jarek!
>> > > >>> >> > > > > > >>
>> > > >>> >> > > > > > >>
>> > > >>> >> > > > > > >> On Tue, Jan 21, 2025 at 2:34 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>> > > >>> >> > > > > > >>
>> > > >>> >> > > > > > >> > Seems that we have a flood of AI generated feature requests for
>> > > >>> >> > > > > > >> > Airflow. The issues look somewhat legitimate, with somewhat related
>> > > >>> >> > > > > > >> > content, but they are wordy and make no sense when you read them.
>> > > >>> >> > > > > > >> > Some examples:
>> > > >>> >> > > > > > >> >
>> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45858
>> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45856
>> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45854
>> > > >>> >> > > > > > >> >
>> > > >>> >> > > > > > >> > All of them were created by accounts with a short history on GH and
>> > > >>> >> > > > > > >> > not much activity before.
>> > > >>> >> > > > > > >> > There were quite a few more.
>> > > >>> >> > > > > > >> >
>> > > >>> >> > > > > > >> > I suggest we close such issues AND report authors to GitHub -
>> > > >>> >> > > > > > >> > hopefully we can help to battle the AI-generated traffic flood.
>> > > >>> >> > > > > > >> >
>> > > >>> >> > > > > > >> > J.
>> > > >>> >> > > > > > >> >
>> > > >>> >> > > > > > >>
>> > > >>> >> > > > > > >
>> > > >>> >> > > > > >
>> > > >>> >> > > > >
>> > > >>> >> > > >
>> > > >>> >> > >
>> > > >>> >> >
>> > > >>> >>
>> > > >>> >
>> > > >>>
>> > > >>
>> > >
>> >
>>
>
