We continue getting new issues - and more of them are by "new users" - created just an hour or so ago.
Apparently GitHub has a way to temporarily limit interactions with the repo for new users - see this screenshot: https://ibb.co/WWsr7RB

I think I'd be for enabling it. We will need an INFRA ticket for that, because it's not currently configurable via .asf.yaml - and if Iceberg would like to do it as well, we can create a single ticket for both.

There is a new framework coming to enable faster implementation and testing of .asf.yaml features (this was discussed at the latest roundtable), and we can contribute a feature to add it in .asf.yaml soon, but temporarily we might want to ask INFRA to help.

WDYT? If I hear a few +1 voices and no strong opposition, I will open a JIRA ticket (and I would love to hear what our Iceberg friends think as well :)

J.

On Wed, Jan 22, 2025 at 10:36 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Yeah. Just closed this one. The pattern where those are coming at the same time as two unrelated issues to both iceberg and airflow is very ... strange.
>
> On Wed, Jan 22, 2025 at 10:35 AM Elad Kalif <elad...@apache.org> wrote:
>
>> Another one who also opened issues in Airflow and Iceberg:
>> https://github.com/apache/iceberg/issues/12034
>> https://github.com/apache/airflow/issues/45920
>>
>> Same "mistake" with the # Title.
>> All of these seem to come with accounts opened months ago, with some minor traffic to their own forks so they would appear legit to GitHub.
>>
>> On Wed, Jan 22, 2025 at 11:23 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>> > Yeah. Again - my guess is that those are "Agentic AI" trials, where someone is deploying fake "agent" accounts acting as "people in the repo would". That's a bit terrifying if this is not contained.
>> >
>> > On Wed, Jan 22, 2025 at 9:52 AM Fokko Driesprong <fo...@apache.org> wrote:
>> >
>> > > That's quite a few! I also noticed that they sometimes self-close the issue (eg here <https://github.com/apache/iceberg/issues/12032>).
>> > > Closed after 1 minute, but still flooding my mailbox :D
>> > >
>> > > So you might have more such issues now than you think.
>> > >
>> > > Yes, that's probably the case, still going through my mailbox.
>> > >
>> > > On Wed, 22 Jan 2025 at 09:49, Jarek Potiuk <ja...@potiuk.com> wrote:
>> > >
>> > > > Example case:
>> > > >
>> > > > * https://github.com/apache/airflow/issues/45904 - airflow
>> > > > * https://github.com/apache/iceberg/issues/12034 - iceberg
>> > > >
>> > > > Both issues are generic and useless and bring 0 value except noise.
>> > > >
>> > > > Interesting thing is that many of those users, if you look at their history, created a similar number of issues in iceberg and airflow at about the same time. So you might have more such issues now than you think.
>> > > >
>> > > > J.
>> > > >
>> > > > On Wed, Jan 22, 2025 at 9:41 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>> > > >
>> > > >> I have not counted all of them. There are far too many - and other people closed some of them as well. I did a very rudimentary check and applied the "AI Spam" label to some of the issues:
>> > > >>
>> > > >> https://github.com/apache/airflow/issues?q=is%3Aissue%20state%3Aclosed%20AI%20label%3A%22AI%20Spam%22
>> > > >>
>> > > >> -> so we have had at least 25 such issues in the last 12 hours.
>> > > >>
>> > > >> > we also want to make sure that we don't accidentally close issues that don't come from a bot, but just a newcomer to the project.
>> > > >>
>> > > >> Those reports and patterns look very, very human-like - they are reported infrequently (per user), the description and text seem legitimate, but they are wordy, and just reading and understanding that those are completely useless takes a lot of time.
>> > > >> This is part of the problem: it takes a lot of energy and time to determine if those are valid or not - and at such a rate, it's not sustainable just to analyze whether they are good or bad.
>> > > >>
>> > > >> J.
>> > > >>
>> > > >> On Wed, Jan 22, 2025 at 9:23 AM Fokko Driesprong <fo...@apache.org> wrote:
>> > > >>
>> > > >>> Hey Jarek,
>> > > >>>
>> > > >>> Thanks for bringing this to our attention. When you talk about flooding, how many are we talking about? I see some suspicious issues (eg, here <https://github.com/apache/iceberg/issues/12039>), but not many. I hope this will come to a halt soon because it is all additional work, and we also want to make sure that we don't accidentally close issues that don't come from a bot, but just a newcomer to the project.
>> > > >>>
>> > > >>> Kind regards,
>> > > >>> Fokko
>> > > >>>
>> > > >>> On Wed, 22 Jan 2025 at 09:00, Jarek Potiuk <ja...@potiuk.com> wrote:
>> > > >>>
>> > > >>> > Hey Iceberg community, and Airflow community too.
>> > > >>> >
>> > > >>> > As of yesterday the Airflow repo is literally flooded with a number of issues that look almost good, except they are clearly AI generated and make no sense or repeat content from other issues. We noticed that the users who create a lot of the "spam AI" issues in Airflow are also creating similar issues for Iceberg.
>> > > >>> >
>> > > >>> > We got to the point that we are closing and reporting such issues to GitHub, and we are blocking all such users without spending too much time on it, with messages similar to this:
>> > > >>> >
>> > > >>> > ```
>> > > >>> > This looks like a totally AI-generated, useless issue report that brings no
>> > > >>> > value and makes no sense. As of yesterday we are generally blocking users that
>> > > >>> > send a lot of spam AI reports generated by bots, so we will report your
>> > > >>> > account and block it unless:
>> > > >>> >
>> > > >>> > a) you explain how you generated reports
>> > > >>> > b) prove you are human
>> > > >>> > c) explain why you created the issue
>> > > >>> > ```
>> > > >>> >
>> > > >>> > My guess is that some company released and is testing an "agentic AI" that is "github-targeted" - where people can run the AI agents on their behalf. It does not look like regular bot-spam.
>> > > >>> > I think we should all generally crowd-source reporting it to GitHub - and hopefully they will find a way to battle those without involving maintainers.
>> > > >>> >
>> > > >>> > I hope it will not last too long.
>> > > >>> >
>> > > >>> > J.
>> > > >>> >
>> > > >>> > ---------- Forwarded message ---------
>> > > >>> > From: Jarek Potiuk <ja...@potiuk.com>
>> > > >>> > Date: Wed, Jan 22, 2025 at 8:12 AM
>> > > >>> > Subject: Re: Very strange (AI generated) issues
>> > > >>> > To: <d...@airflow.apache.org>
>> > > >>> >
>> > > >>> > You can also report it directly from the issue (... at the top and "report content")
>> > > >>> >
>> > > >>> > On Wed, Jan 22, 2025 at 7:46 AM Amogh Desai <amoghdesai....@gmail.com> wrote:
>> > > >>> >
>> > > >>> >> Elad, I just managed to report this user.
>> > > >>> >>
>> > > >>> >> This is how it's done:
>> > > >>> >>
>> > > >>> >> https://docs.github.com/en/communities/maintaining-your-safety-on-github/reporting-abuse-or-spam#reporting-a-user
>> > > >>> >>
>> > > >>> >> Thanks & Regards,
>> > > >>> >> Amogh Desai
>> > > >>> >>
>> > > >>> >> On Wed, Jan 22, 2025 at 12:05 PM Elad Kalif <elad...@apache.org> wrote:
>> > > >>> >>
>> > > >>> >> > There are several reports from this user:
>> > > >>> >> >
>> > > >>> >> > https://github.com/atharv9017
>> > > >>> >> >
>> > > >>> >> > I didn't find a way to report the user account to GitHub.
>> > > >>> >> >
>> > > >>> >> > On Wed, 22 Jan 2025, 06:41, Pavankumar Gopidesu <gopidesupa...@gmail.com> wrote:
>> > > >>> >> >
>> > > >>> >> > > Yes, still issues are coming.
>> > > >>> >> > >
>> > > >>> >> > > Regards,
>> > > >>> >> > > Pavan
>> > > >>> >> > >
>> > > >>> >> > > On Wed, Jan 22, 2025 at 4:35 AM Amogh Desai <amoghdesai....@gmail.com> wrote:
>> > > >>> >> > >
>> > > >>> >> > > > I saw a couple of such SPAM issues too.
>> > > >>> >> > > >
>> > > >>> >> > > > I also recall some SPAM comments on pull requests as well, so if any contributor sees any such SPAM message, please report it on Slack so that we can delete it and report it.
>> > > >>> >> > > >
>> > > >>> >> > > > Thanks & Regards,
>> > > >>> >> > > > Amogh Desai
>> > > >>> >> > > >
>> > > >>> >> > > > On Wed, Jan 22, 2025 at 8:45 AM Zhe You Liu <zhu424....@gmail.com> wrote:
>> > > >>> >> > > >
>> > > >>> >> > > > > I came across another strange issue:
>> > > >>> >> > > > > https://github.com/apache/airflow/issues/45837.
>> > > >>> >> > > > > It appears to be a copy-paste of https://github.com/apache/airflow/issues/45661 with just the issue title changed.
>> > > >>> >> > > > >
>> > > >>> >> > > > > On Wed, Jan 22, 2025 at 6:50 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>> > > >>> >> > > > >
>> > > >>> >> > > > > > I even got to this stage:
>> > > >>> >> > > > > >
>> > > >>> >> > > > > > > We've received a few new tickets from your account recently. If you'd like to add additional information you can add a comment to an existing ticket, or wait a few minutes before opening a new ticket.
>> > > >>> >> > > > > >
>> > > >>> >> > > > > > On Tue, Jan 21, 2025 at 11:49 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>> > > >>> >> > > > > >
>> > > >>> >> > > > > > > There are a few more that I still saw after sending it. There is something going on bypassing GitHub filters. I hope they will manage to do something about it.
>> > > >>> >> > > > > > >
>> > > >>> >> > > > > > > Last one is https://github.com/apache/airflow/issues/45867
>> > > >>> >> > > > > > >
>> > > >>> >> > > > > > > On Tue, Jan 21, 2025 at 11:46 PM Vikram Koka <vik...@astronomer.io.invalid> wrote:
>> > > >>> >> > > > > > >
>> > > >>> >> > > > > > >> Agreed.
>> > > >>> >> > > > > > >>
>> > > >>> >> > > > > > >> Thanks for flagging these Jarek!
>> > > >>> >> > > > > > >>
>> > > >>> >> > > > > > >> On Tue, Jan 21, 2025 at 2:34 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>> > > >>> >> > > > > > >>
>> > > >>> >> > > > > > >> > Seems that we have a flood of AI generated feature requests for Airflow. The issues look somewhat legitimate, with somewhat related content, but they are wordy and make no sense when you read them. Some examples:
>> > > >>> >> > > > > > >> >
>> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45858
>> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45856
>> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45854
>> > > >>> >> > > > > > >> >
>> > > >>> >> > > > > > >> > All of them were created by accounts with a short history on GitHub and not much activity before.
>> > > >>> >> > > > > > >> >
>> > > >>> >> > > > > > >> > There were quite a few more.
>> > > >>> >> > > > > > >> >
>> > > >>> >> > > > > > >> > I suggest we close such issues AND report authors to GitHub - hopefully we can help to battle the AI-generated traffic flood.
>> > > >>> >> > > > > > >> >
>> > > >>> >> > > > > > >> > J.