Thank you to everyone who responded. We're working with our vendor to see what can be done. I appreciate the responses. -Jon
On Wed, Mar 19, 2025 at 10:55 AM Kev Woolley <[email protected]> wrote:
> Hi Jon,
>
> We use CrowdSec: https://www.crowdsec.net/
>
> It lets you define your own scenarios for making decisions on incoming
> traffic and mitigating it automatically: banning via the firewall or other
> measures, throwing up a CAPTCHA, and more.
>
> Note that CrowdSec doesn't work on any time slice above 48 hours -- all of
> its mitigations are very short-lived. We are combining it with a
> substantial long-term blocklist (implemented as an ipset block in Linux
> iptables) that subsumes the functionality of both geo and provider blocks
> for longer-term mitigations. This is, of course, a labour-heavy endeavour,
> but we've tried several alternatives, and this is what's working best so
> far.
>
> We have scenarios defined to catch useragent traits and block useragents
> that seem bad. After some initial learning ("oh, so this version of MS
> Office says it's MSIE 7.0, so a library just blocked themselves -- oops!"
> and similar situations), it was pretty easy to get most bot traffic caught
> that way. As I get time (and more familiarity with writing the scenarios),
> I'll be designing scenarios that look for specific behaviours (such as
> grabbing the links on a page in order, too quickly) and improving our
> defence that way.
>
> CrowdSec offers reasonably good visualisation and reporting tools. These
> are useful both for keeping track of who's doing what and for spotting
> persistent threats, so you can create entries in the long-term blocklist
> for them.
>
> My observation, even very recently as I've been working on the long-term
> blocklist and not updating it on our servers (working with ~10k rules
> takes a while), is that there really doesn't seem to be a point where one
> can take their eyes off the issue entirely and forget about it -- new
> traffic keeps coming out of the woodwork.
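[Editor's note: the ipset-based long-term blocklist described above can be sketched as a two-step process: build an "ipset restore" file from a plain list of CIDRs, then load it and point an iptables rule at the set. A minimal sketch follows; the set name ("ltb"), file names, and sample prefixes are illustrative assumptions, not details from the thread.]

```shell
# Build an "ipset restore" file from a plain list of CIDR netblocks.
# Names here (set "ltb", blocklist.txt, ltb.restore) and the sample
# prefixes are illustrative, not from the thread.

# Sample blocklist data (documentation prefixes, one CIDR per line).
printf '%s\n' 203.0.113.0/24 198.51.100.0/24 > blocklist.txt

{
  echo "create ltb hash:net"      # hash:net sets store CIDR blocks
  while read -r cidr; do
    echo "add ltb $cidr"          # one add line per netblock
  done < blocklist.txt
} > ltb.restore

cat ltb.restore
```

Loading the result (as root) would then be `ipset -exist restore < ltb.restore` (the -exist flag makes reloads idempotent), with a rule such as `iptables -I INPUT -m set --match-set ltb src -j DROP` to drop matching sources. The restore-file approach is much faster than issuing thousands of individual `ipset add` commands, which matters at the ~10k-rule scale mentioned above.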
> With a substantial enough long-term blocklist, this can reduce the time
> spent to a reasonable amount, but there doesn't seem to be an "okay,
> we're done here" point.
>
> My gut feeling is that 30-50k long-term blocklist rules is where we may
> end up eventually (with some years of building them).
>
> I'm happy to share what I've got in the LTB. It's been built over the
> last several months, based on the attacks we've received.
>
> Resources I've found helpful include:
>
> https://www.qurium.org/ -- their digital forensics and investigations
> pages have a lot of good info on the methods and actors behind some
> types of attacks. We experienced this flavour in particular:
>
> https://www.qurium.org/weaponizing-proxy-and-vpn-providers/fineproxy-rayobyte/
>
> Finding this site helped confirm a lot of information I'd found over the
> previous couple of years, studying these things on my own.
>
> https://www.radb.net/ -- you can query this for free, and it's a good
> way to look up network information without having to bounce around
> between ARIN, RIPE, APNIC, and the other RIRs (Regional Internet
> Registries). You can do advanced queries against it with a whois client
> as well:
>
> whois -h whois.radb.net -- '-i origin AS714'
>
> The above command lists everything originating from one of Apple's ASNs
> (Autonomous System Numbers; these are used to help manage routing).
For example: > > whois -h whois.radb.net -- '-i origin AS55185' > > Gives: > > route: 209.87.62.0/24 > origin: AS55185 > descr: 750 - 555 Seymour Street > Vancouver BC V6B-3H6 > Canada > admin-c: HOSTM458-ARIN > tech-c: NOC33711-ARIN > mnt-by: MNT-BC-Z > created: 2023-12-07T21:58:41Z > last-modified: 2023-12-07T21:58:41Z > source: ARIN > rpki-ov-state: valid > > route6: 2607:f8f0:6a0::/48 > origin: AS55185 > descr: 750 - 555 Seymour Street > Vancouver BC V6B-3H6 > Canada > admin-c: HOSTM458-ARIN > tech-c: NOC33711-ARIN > mnt-by: MNT-BC > created: 2023-12-07T22:00:06Z > last-modified: 2023-12-07T22:00:06Z > source: ARIN > rpki-ov-state: valid > > With a bit of scripting, it's not difficult to pull out the route: and > route6: lines, run them through aggregate (a tool that removes duplication > and shadowing of lists of netblocks, giving you the shortest possible list > of netblocks that cover all of the provided addresses), and output them to > a file for validation and addition to whatever solution you're using. > > It's a huge topic, and I've already babbled long enough. I'm happy to give > info or lend a hand, though. It's a hard problem. > > Thank you, > > Kev > > > -- > Kev Woolley (they/them) > > Gratefully acknowledging that I live and work in the unceded traditional > territories of the Səl̓ílwətaɬ (Tsleil-Waututh) and Sḵwx̱wú7mesh Úxwumixw. > > > > ________________________________________ > From: JonGeorg SageLibrary via Evergreen-general < > [email protected]> > Sent: 19 March 2025 08:52 > To: Evergreen Discussion Group > Cc: JonGeorg SageLibrary > Subject: [Evergreen-general] Bot issues > > We've been dealing with a lot of bots crawling our catalog, and > overwhelming our app servers. > > Are any of you having the same issue, and if so what tools are you using > to remedy the situation? > > We've already implemented geoblocking to limit traffic to the US and > Canada, after being overwhelmed by queries from overseas. 
> I've been looking at Bad Bot Blocker as an option.
> -Jon
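[Editor's note: the "bit of scripting" Kev describes -- pulling route: and route6: lines out of RADB whois output and running them through aggregate -- can be sketched as a short pipeline. The whois query is the one from the message; the saved sample file, output file names, and fallback behaviour are illustrative assumptions.]

```shell
# Extract route:/route6: prefixes from saved RADB whois output.
# Normally you would capture live data with:
#   whois -h whois.radb.net -- '-i origin AS55185' > radb.txt
# A trimmed saved sample is used here so the pipeline runs offline.
cat > radb.txt <<'EOF'
route:          209.87.62.0/24
origin:         AS55185
route6:         2607:f8f0:6a0::/48
origin:         AS55185
EOF

# RADB objects are "key:  value" lines, so the prefix is field 2.
awk '$1 == "route:"  { print $2 }' radb.txt > v4.txt
awk '$1 == "route6:" { print $2 }' radb.txt > v6.txt

# aggregate(1) collapses duplicated and shadowed IPv4 prefixes into the
# shortest covering list; if it isn't installed, the raw list still works.
if command -v aggregate >/dev/null 2>&1; then
  aggregate < v4.txt > v4-aggregated.txt
else
  cp v4.txt v4-aggregated.txt
fi

cat v4-aggregated.txt v6.txt
```

Note that the classic aggregate(1) handles IPv4 only; the route6: list would need a separate pass (e.g. through the companion aggregate6 tool) before being fed into an ipset of family inet6.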
_______________________________________________
Evergreen-general mailing list -- [email protected]
To unsubscribe send an email to [email protected]
