Just as a little follow up - I think I have a hypothesis about what happened.
We got one other user creating one issue which was very similar, and from
this comment I gather:
https://github.com/apache/airflow/issues/45940#issuecomment-2608307111

* there is some tool out there that is supposed to make "issue creation"
  easier - with the help of AI
* some test accounts were used to test it (likely there are people who have
  a bunch of fake GitHub accounts they maintain to test new things with AI)
* apparently some "real" people also got their hands on that tool and tried it
* this tool LIKELY used "airflow" and "iceberg" in some documentation or
  default settings as "examples"
* apparently this tool misled people into thinking they were "testing" issue
  creation when it actually created those issues
* I guess whoever has the tool realised their mistake and either stopped it
  or removed some confusion
* I have my own suspicions (which I am exploring) - but I asked the user to
  provide information about what tooling they were using (the user
  apologised and expressed willingness to provide more information, so I
  hope I will get more information soon).

J.

On Thu, Jan 23, 2025 at 8:57 AM Piotr Findeisen <piotr.findei...@gmail.com>
wrote:

> Hi
>
> Thank you Jarek for taking care of this matter!
>
> > Should we react and block new users from interacting with Airflow repo if
> > we see it happening again?
>
> Maintainers' time is not an infinite resource, so "yes!" from me (also for
> Iceberg).
>
> Best
>
> On Wed, 22 Jan 2025 at 15:40, Russell Spitzer <russell.spit...@gmail.com>
> wrote:
>
> > This is pretty disturbing and I hope that any users out there see that
> > using automated tools to submit issues is just adding noise to the project
> > which makes it very hard for real issues to be addressed.
> >
> > On Wed, Jan 22, 2025 at 6:58 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> >> - Iceberg dev to not flood them :) (in bcc:)
> >>
> >> It looks like the flood had been somehow flood-gated - no similar report
> >> for the last 4 hours or so.
> >>
> >> I also started to receive confirmation from GitHub that they are looking
> >> at the reports, so likely we do not have to take any action now, but I
> >> think we can turn it into deciding about "future" reactions when something
> >> like this happens, so that we can potentially react quickly.
> >>
> >> What do others think? Should we react and block new users from
> >> interacting with Airflow repo if we see it happening again? Maybe
> >> temporarily - for a day or two initially - after reporting some initial
> >> reports? Does it sound reasonable?
> >>
> >> J.
> >>
> >> On Wed, Jan 22, 2025 at 11:35 AM Pavankumar Gopidesu <
> >> gopidesupa...@gmail.com> wrote:
> >>
> >>> +1 from me.
> >>>
> >>> It looks like it started yesterday. I feel we may get many of these
> >>> tickets when new users start testing those AI agents.
> >>>
> >>> Regards,
> >>> Pavan Kumar
> >>>
> >>> On Wed, Jan 22, 2025, 10:27 Jarek Potiuk <ja...@potiuk.com> wrote:
> >>>
> >>> > We continue getting new issues - and more of them are by "new users" -
> >>> > created just an hour or so ago.
> >>> >
> >>> > Apparently GitHub has a way to temporarily limit interactions with the
> >>> > repo for new users - see this screenshot:
> >>> >
> >>> > https://ibb.co/WWsr7RB
> >>> >
> >>> > And I think I'd be for enabling it - we will need an INFRA ticket for
> >>> > that, because that's not currently configurable via .asf.yaml - and
> >>> > maybe if Iceberg would like to do it as well, we can create a single
> >>> > ticket for that.
> >>> >
> >>> > There is a new framework coming to enable faster implementation and
> >>> > testing of .asf.yaml features (this was discussed at the latest
> >>> > roundtable) - and we can contribute a feature to add it in .asf.yaml
> >>> > soon, but temporarily we might want to ask INFRA to help.
> >>> >
> >>> > WDYT?
> >>> > If I hear a few voices for +1 and no strong opposition I will open a
> >>> > JIRA ticket (and would love to hear what Iceberg friends of ours think
> >>> > as well :)
> >>> >
> >>> > J.
> >>> >
> >>> > On Wed, Jan 22, 2025 at 10:36 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> >>> >
> >>> > > Yeah, just closed this one. The pattern where those are coming at
> >>> > > the same time as two unrelated issues to both Iceberg and Airflow is
> >>> > > very... strange.
> >>> > >
> >>> > > On Wed, Jan 22, 2025 at 10:35 AM Elad Kalif <elad...@apache.org> wrote:
> >>> > >
> >>> > >> Another one who also opened issues in Airflow and Iceberg:
> >>> > >> https://github.com/apache/iceberg/issues/12034
> >>> > >> https://github.com/apache/airflow/issues/45920
> >>> > >>
> >>> > >> Same "mistake" with the # Title.
> >>> > >> All of these seem to come from accounts opened months ago, with some
> >>> > >> minor traffic to their own forks so they would appear legit to GitHub.
> >>> > >>
> >>> > >> On Wed, Jan 22, 2025 at 11:23 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> >>> > >>
> >>> > >> > Yeah. Again - my guess is that those are "Agentic AI" trials, where
> >>> > >> > someone is deploying fake "agent" accounts acting as "people in the
> >>> > >> > repo would". That's a bit terrifying if this is not contained.
> >>> > >> >
> >>> > >> > On Wed, Jan 22, 2025 at 9:52 AM Fokko Driesprong <fo...@apache.org>
> >>> > >> > wrote:
> >>> > >> >
> >>> > >> > > That's quite a few! I also noticed that they sometimes self-close
> >>> > >> > > the issue (eg here <https://github.com/apache/iceberg/issues/12032>).
> >>> > >> > > Closed after 1 minute, but still flooding my mailbox :D
> >>> > >> > >
> >>> > >> > > So you might have more such issues now than you think.
> >>> > >> > >
> >>> > >> > > Yes, that's probably the case, still going through my mailbox.
> >>> > >> > >
> >>> > >> > > On Wed, 22 Jan 2025 at 09:49, Jarek Potiuk <ja...@potiuk.com> wrote:
> >>> > >> > >
> >>> > >> > > > Example case:
> >>> > >> > > >
> >>> > >> > > > * https://github.com/apache/airflow/issues/45904 - airflow
> >>> > >> > > > * https://github.com/apache/iceberg/issues/12034 - iceberg
> >>> > >> > > >
> >>> > >> > > > Both issues are generic and useless and bring 0 value except noise.
> >>> > >> > > >
> >>> > >> > > > Interesting thing is that many of those users, if you look at
> >>> > >> > > > their history - created a similar number of issues in iceberg and
> >>> > >> > > > airflow at about the same time. So you might have more such
> >>> > >> > > > issues now than you think.
> >>> > >> > > >
> >>> > >> > > > J.
> >>> > >> > > >
> >>> > >> > > > On Wed, Jan 22, 2025 at 9:41 AM Jarek Potiuk <ja...@potiuk.com>
> >>> > >> > > > wrote:
> >>> > >> > > >
> >>> > >> > > >> I have not counted all of them. There are quite a bit too many -
> >>> > >> > > >> and other people closed some of them as well. I did a very
> >>> > >> > > >> rudimentary check and applied the "AI Spam" label to some of the
> >>> > >> > > >> issues:
> >>> > >> > > >> https://github.com/apache/airflow/issues?q=is%3Aissue%20state%3Aclosed%20AI%20label%3A%22AI%20Spam%22
> >>> > >> > > >> -> so we have had at least 25 such issues in the last 12 hours.
> >>> > >> > > >>
> >>> > >> > > >> > we also want to make sure that we don't accidentally close
> >>> > >> > > >> > issues that don't come from a bot, but just a newcomer to the
> >>> > >> > > >> > project.
> >>> > >> > > >>
> >>> > >> > > >> Those reports and patterns look very,
> >>> > >> > > >> very human-like - they are reported infrequently (per user), the
> >>> > >> > > >> description and text seem legitimate, but they are wordy, and
> >>> > >> > > >> just reading and understanding that those are completely useless
> >>> > >> > > >> takes a lot of time. This is part of the problem, that it takes
> >>> > >> > > >> a lot of energy and time to determine if those are valid or not
> >>> > >> > > >> - and with such a rate, it's not sustainable just to analyze
> >>> > >> > > >> whether they are good or bad.
> >>> > >> > > >>
> >>> > >> > > >> J.
> >>> > >> > > >>
> >>> > >> > > >> On Wed, Jan 22, 2025 at 9:23 AM Fokko Driesprong <fo...@apache.org>
> >>> > >> > > >> wrote:
> >>> > >> > > >>
> >>> > >> > > >>> Hey Jarek,
> >>> > >> > > >>>
> >>> > >> > > >>> Thanks for bringing this to our attention. When you talk about
> >>> > >> > > >>> flooding, how many are we talking about? I see some suspicious
> >>> > >> > > >>> issues (eg, here
> >>> > >> > > >>> <https://github.com/apache/iceberg/issues/12039>), but not many.
> >>> > >> > > >>> I hope this will come to a halt soon because it is all
> >>> > >> > > >>> additional work, and we also want to make sure that we don't
> >>> > >> > > >>> accidentally close issues that don't come from a bot, but just
> >>> > >> > > >>> a newcomer to the project.
> >>> > >> > > >>>
> >>> > >> > > >>> Kind regards,
> >>> > >> > > >>> Fokko
> >>> > >> > > >>>
> >>> > >> > > >>> On Wed, 22 Jan 2025 at 09:00, Jarek Potiuk <ja...@potiuk.com>
> >>> > >> > > >>> wrote:
> >>> > >> > > >>>
> >>> > >> > > >>> > Hey Iceberg community, and Airflow community too.
> >>> > >> > > >>> >
> >>> > >> > > >>> > As of yesterday the Airflow repo is literally flooded with a number
> >>> > >> > > >>> > of issues that look almost good, except they are clearly AI
> >>> > >> > > >>> > generated and make no sense or repeat content from other issues.
> >>> > >> > > >>> > We noticed that the users who create a lot of the "spam AI" issues
> >>> > >> > > >>> > in Airflow are also creating similar issues for Iceberg.
> >>> > >> > > >>> >
> >>> > >> > > >>> > We got to the point that we are closing and reporting such issues
> >>> > >> > > >>> > to GitHub and we are blocking all such users without spending too
> >>> > >> > > >>> > much time on it, with messages similar to this:
> >>> > >> > > >>> >
> >>> > >> > > >>> > ```
> >>> > >> > > >>> > This looks like a totally AI-generated, useless issue report that
> >>> > >> > > >>> > brings no value and makes no sense. We are generally blocking users
> >>> > >> > > >>> > that send a lot of spam AI reports generated by bots as of
> >>> > >> > > >>> > yesterday, so we will report your account and block it unless:
> >>> > >> > > >>> >
> >>> > >> > > >>> > a) you explain how you generated reports
> >>> > >> > > >>> > b) prove you are human
> >>> > >> > > >>> > c) explain why you created the issue
> >>> > >> > > >>> > ```
> >>> > >> > > >>> >
> >>> > >> > > >>> > My guess is that some company released and is testing an "agentic
> >>> > >> > > >>> > AI" that is "github-targeted" - where people can run the AI agents
> >>> > >> > > >>> > on their behalf.
> >>> > >> > > >>> > It does not look like regular bot-spam.
> >>> > >> > > >>> > I think we should all generally crowd-source reporting it to
> >>> > >> > > >>> > GitHub - and hopefully they will find a way to battle those
> >>> > >> > > >>> > without involving maintainers.
> >>> > >> > > >>> >
> >>> > >> > > >>> > I hope it will not last too long.
> >>> > >> > > >>> >
> >>> > >> > > >>> > J.
> >>> > >> > > >>> >
> >>> > >> > > >>> > ---------- Forwarded message ---------
> >>> > >> > > >>> > From: Jarek Potiuk <ja...@potiuk.com>
> >>> > >> > > >>> > Date: Wed, Jan 22, 2025 at 8:12 AM
> >>> > >> > > >>> > Subject: Re: Very strange (AI generated) issues
> >>> > >> > > >>> > To: <d...@airflow.apache.org>
> >>> > >> > > >>> >
> >>> > >> > > >>> > You can also report it directly from the issue (... at the top
> >>> > >> > > >>> > and "report content")
> >>> > >> > > >>> >
> >>> > >> > > >>> > On Wed, Jan 22, 2025 at 7:46 AM Amogh Desai
> >>> > >> > > >>> > <amoghdesai....@gmail.com> wrote:
> >>> > >> > > >>> >
> >>> > >> > > >>> >> Elad, I just managed to report this user.
> >>> > >> > > >>> >>
> >>> > >> > > >>> >> This is how it's done:
> >>> > >> > > >>> >>
> >>> > >> > > >>> >> https://docs.github.com/en/communities/maintaining-your-safety-on-github/reporting-abuse-or-spam#reporting-a-user
> >>> > >> > > >>> >>
> >>> > >> > > >>> >> Thanks & Regards,
> >>> > >> > > >>> >> Amogh Desai
> >>> > >> > > >>> >>
> >>> > >> > > >>> >> On Wed, Jan 22, 2025 at 12:05 PM Elad Kalif <elad...@apache.org>
> >>> > >> > > >>> >> wrote:
> >>> > >> > > >>> >>
> >>> > >> > > >>> >> > There are several reports from this user
> >>> > >> > > >>> >> >
> >>> > >> > > >>> >> > https://github.com/atharv9017
> >>> > >> > > >>> >> >
> >>> > >> > > >>> >> > I didn't find a way to report the user account to GitHub.
> >>> > >> > > >>> >> >
> >>> > >> > > >>> >> > On Wed, 22 Jan 2025, 06:41, Pavankumar Gopidesu
> >>> > >> > > >>> >> > <gopidesupa...@gmail.com> wrote:
> >>> > >> > > >>> >> >
> >>> > >> > > >>> >> > > Yes, issues are still coming.
> >>> > >> > > >>> >> > >
> >>> > >> > > >>> >> > > Regards,
> >>> > >> > > >>> >> > > Pavan
> >>> > >> > > >>> >> > >
> >>> > >> > > >>> >> > > On Wed, Jan 22, 2025 at 4:35 AM Amogh Desai
> >>> > >> > > >>> >> > > <amoghdesai....@gmail.com> wrote:
> >>> > >> > > >>> >> > >
> >>> > >> > > >>> >> > > > I saw a couple of such SPAM issues too.
> >>> > >> > > >>> >> > > >
> >>> > >> > > >>> >> > > > I also recall some SPAM comments on pull requests as well,
> >>> > >> > > >>> >> > > > so if any contributor sees any such SPAM message,
> >>> > >> > > >>> >> > > > please report it on Slack so that we can delete it and
> >>> > >> > > >>> >> > > > report it.
> >>> > >> > > >>> >> > > >
> >>> > >> > > >>> >> > > > Thanks & Regards,
> >>> > >> > > >>> >> > > > Amogh Desai
> >>> > >> > > >>> >> > > >
> >>> > >> > > >>> >> > > > On Wed, Jan 22, 2025 at 8:45 AM Zhe You Liu
> >>> > >> > > >>> >> > > > <zhu424....@gmail.com> wrote:
> >>> > >> > > >>> >> > > >
> >>> > >> > > >>> >> > > > > I came across another strange issue:
> >>> > >> > > >>> >> > > > > https://github.com/apache/airflow/issues/45837
> >>> > >> > > >>> >> > > > > It appears to be a copy-paste of
> >>> > >> > > >>> >> > > > > https://github.com/apache/airflow/issues/45661
> >>> > >> > > >>> >> > > > > with just the issue title changed.
> >>> > >> > > >>> >> > > > >
> >>> > >> > > >>> >> > > > > On Wed, Jan 22, 2025 at 6:50 AM Jarek Potiuk
> >>> > >> > > >>> >> > > > > <ja...@potiuk.com> wrote:
> >>> > >> > > >>> >> > > > >
> >>> > >> > > >>> >> > > > > > I even got to this stage:
> >>> > >> > > >>> >> > > > > >
> >>> > >> > > >>> >> > > > > > > We've received a few new tickets from your account
> >>> > >> > > >>> >> > > > > > > recently. If you'd like to add additional information
> >>> > >> > > >>> >> > > > > > > you can add a comment to an existing ticket, or wait a
> >>> > >> > > >>> >> > > > > > > few minutes before opening a new ticket.
> >>> > >> > > >>> >> > > > > >
> >>> > >> > > >>> >> > > > > > On Tue, Jan 21, 2025 at 11:49 PM Jarek Potiuk
> >>> > >> > > >>> >> > > > > > <ja...@potiuk.com> wrote:
> >>> > >> > > >>> >> > > > > >
> >>> > >> > > >>> >> > > > > > > There are a few more that I still saw after sending it.
> >>> > >> > > >>> >> > > > > > > There is something going on bypassing GitHub filters. I
> >>> > >> > > >>> >> > > > > > > hope they will manage to do something about it.
> >>> > >> > > >>> >> > > > > > >
> >>> > >> > > >>> >> > > > > > > Last one is https://github.com/apache/airflow/issues/45867
> >>> > >> > > >>> >> > > > > > >
> >>> > >> > > >>> >> > > > > > > On Tue, Jan 21, 2025 at 11:46 PM Vikram Koka
> >>> > >> > > >>> >> > > > > > > <vik...@astronomer.io.invalid> wrote:
> >>> > >> > > >>> >> > > > > > >
> >>> > >> > > >>> >> > > > > > >> Agreed.
> >>> > >> > > >>> >> > > > > > >>
> >>> > >> > > >>> >> > > > > > >> Thanks for flagging these Jarek!
> >>> > >> > > >>> >> > > > > > >>
> >>> > >> > > >>> >> > > > > > >> On Tue, Jan 21, 2025 at 2:34 PM Jarek Potiuk
> >>> > >> > > >>> >> > > > > > >> <ja...@potiuk.com> wrote:
> >>> > >> > > >>> >> > > > > > >>
> >>> > >> > > >>> >> > > > > > >> > Seems that we have a flood of AI generated feature
> >>> > >> > > >>> >> > > > > > >> > requests for Airflow. The issues look somewhat
> >>> > >> > > >>> >> > > > > > >> > legitimate, with somewhat related content, but they
> >>> > >> > > >>> >> > > > > > >> > are wordy and make no sense when you read them.
> >>> > >> > > >>> >> > > > > > >> > Some examples:
> >>> > >> > > >>> >> > > > > > >> >
> >>> > >> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45858
> >>> > >> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45856
> >>> > >> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45854
> >>> > >> > > >>> >> > > > > > >> >
> >>> > >> > > >>> >> > > > > > >> > All of them done by accounts with short history in GH
> >>> > >> > > >>> >> > > > > > >> > and not much activity before.
> >>> > >> > > >>> >> > > > > > >> >
> >>> > >> > > >>> >> > > > > > >> > There were quite a few more.
> >>> > >> > > >>> >> > > > > > >> >
> >>> > >> > > >>> >> > > > > > >> > I suggest we close such issues AND report authors to
> >>> > >> > > >>> >> > > > > > >> > GitHub - hopefully we can help to battle the
> >>> > >> > > >>> >> > > > > > >> > AI-generated traffic flood.
> >>> > >> > > >>> >> > > > > > >> >
> >>> > >> > > >>> >> > > > > > >> > J.