Thank you to everyone who responded. We're working with our vendor to see what can be done. I appreciate the responses. -Jon
On Wed, Mar 19, 2025 at 10:55 AM Kev Woolley <[email protected]> wrote:
> Hi Jon,
>
> We use CrowdSec: https://www.crowdsec.net/
>
> It lets you define your own scenarios for making decisions on incoming
> traffic and mitigating it automatically: banning via the firewall or other
> measures, throwing up a CAPTCHA, and more.
>
> Note that CrowdSec doesn't work on any time slice above 48 hours -- all of
> its mitigations are very short-lived. We are combining it with a
> substantial long-term blocklist (implemented as an ipset block in Linux
> iptables) that subsumes the functionality of both geo and provider blocks
> for longer-term mitigations. This is, of course, a labour-heavy endeavour,
> but we've tried several alternatives, and this is what's working best so
> far.
>
> We have scenarios defined to catch useragent traits and block useragents
> that seem bad. After some initial learning ("oh, so this version of MS
> Office says it's MSIE 7.0, so a library just blocked themselves -- oops!"
> and similar situations), it was pretty easy to get most bot traffic caught
> that way. As I get time (and more familiarity with writing the scenarios),
> I'll be designing scenarios that look for specific behaviours (such as
> grabbing the links on a page in order, too quickly) and improving our
> defence that way.
>
> CrowdSec offers reasonably good visualisation and reporting tools. These
> are useful both for keeping track of who's doing what and for spotting
> persistent threats, so you can create entries in the long-term blocklist
> for them.
>
> My observation, even very recently as I've been working on the long-term
> blocklist and not updating it on our servers (working with ~10k rules
> takes a while), is that there really doesn't seem to be a point where one
> can take their eyes off the issue entirely and forget about it -- new
> traffic keeps coming out of the woodwork.
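[Editor's note: the ipset-based long-term blocklist described above can be sketched as a two-step process: build an "ipset restore" file from a plain list of CIDRs, then load it and point an iptables rule at the set. A minimal sketch follows; the set name ("ltb"), file names, and sample prefixes are illustrative assumptions, not details from the thread.]

```shell
# Build an "ipset restore" file from a plain list of CIDR netblocks.
# Names here (set "ltb", blocklist.txt, ltb.restore) and the sample
# prefixes are illustrative, not from the thread.

# Sample blocklist data (documentation prefixes, one CIDR per line).
printf '%s\n' 203.0.113.0/24 198.51.100.0/24 > blocklist.txt

{
  echo "create ltb hash:net"      # hash:net sets store CIDR blocks
  while read -r cidr; do
    echo "add ltb $cidr"          # one add line per netblock
  done < blocklist.txt
} > ltb.restore

cat ltb.restore
```

Loading the result (as root) would then be `ipset -exist restore < ltb.restore` (the -exist flag makes reloads idempotent), with a rule such as `iptables -I INPUT -m set --match-set ltb src -j DROP` to drop matching sources. The restore-file approach is much faster than issuing thousands of individual `ipset add` commands, which matters at the ~10k-rule scale mentioned above.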
> With a substantial enough long-term blocklist, this can reduce the time
> spent to a reasonable amount, but there doesn't seem to be an "okay,
> we're done here" point.
>
> My gut feeling is that 30-50k long-term blocklist rules is where we may
> end up eventually (with some years of building them).
>
> I'm happy to share what I've got in the LTB. It's been built over the
> last several months, based on the attacks we've received.
>
> Resources I've found helpful include:
>
> https://www.qurium.org/ -- their digital forensics and investigations
> pages have a lot of good info on the methods and actors behind some
> types of attacks. We experienced this flavour in particular:
>
> https://www.qurium.org/weaponizing-proxy-and-vpn-providers/fineproxy-rayobyte/
>
> Finding this site helped confirm a lot of information I'd found over the
> previous couple of years, studying these things on my own.
>
> https://www.radb.net/ -- you can query this for free, and it's a good
> way to look up network information without having to bounce around
> between ARIN, RIPE, APNIC, and the other RIRs (Regional Internet
> Registries). You can do advanced queries against it with a whois client
> as well:
>
> whois -h whois.radb.net -- '-i origin AS714'
>
> The above command lists everything originating from one of Apple's ASNs
> (Autonomous System Numbers; these are used to help manage routing).
For example: > > whois -h whois.radb.net -- '-i origin AS55185' > > Gives: > > route: 209.87.62.0/24 > origin: AS55185 > descr: 750 - 555 Seymour Street > Vancouver BC V6B-3H6 > Canada > admin-c: HOSTM458-ARIN > tech-c: NOC33711-ARIN > mnt-by: MNT-BC-Z > created: 2023-12-07T21:58:41Z > last-modified: 2023-12-07T21:58:41Z > source: ARIN > rpki-ov-state: valid > > route6: 2607:f8f0:6a0::/48 > origin: AS55185 > descr: 750 - 555 Seymour Street > Vancouver BC V6B-3H6 > Canada > admin-c: HOSTM458-ARIN > tech-c: NOC33711-ARIN > mnt-by: MNT-BC > created: 2023-12-07T22:00:06Z > last-modified: 2023-12-07T22:00:06Z > source: ARIN > rpki-ov-state: valid > > With a bit of scripting, it's not difficult to pull out the route: and > route6: lines, run them through aggregate (a tool that removes duplication > and shadowing of lists of netblocks, giving you the shortest possible list > of netblocks that cover all of the provided addresses), and output them to > a file for validation and addition to whatever solution you're using. > > It's a huge topic, and I've already babbled long enough. I'm happy to give > info or lend a hand, though. It's a hard problem. > > Thank you, > > Kev > > > -- > Kev Woolley (they/them) > > Gratefully acknowledging that I live and work in the unceded traditional > territories of the Səl̓ílwətaɬ (Tsleil-Waututh) and Sḵwx̱wú7mesh Úxwumixw. > > > > ________________________________________ > From: JonGeorg SageLibrary via Evergreen-general < > [email protected]> > Sent: 19 March 2025 08:52 > To: Evergreen Discussion Group > Cc: JonGeorg SageLibrary > Subject: [Evergreen-general] Bot issues > > We've been dealing with a lot of bots crawling our catalog, and > overwhelming our app servers. > > Are any of you having the same issue, and if so what tools are you using > to remedy the situation? > > We've already implemented geoblocking to limit traffic to the US and > Canada, after being overwhelmed by queries from overseas. 
> I've been looking at Bad Bot Blocker as an option.
> -Jon
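[Editor's note: the "bit of scripting" Kev describes -- pulling route: and route6: lines out of RADB whois output and running them through aggregate -- can be sketched as a short pipeline. The whois query is the one from the message; the saved sample file, output file names, and fallback behaviour are illustrative assumptions.]

```shell
# Extract route:/route6: prefixes from saved RADB whois output.
# Normally you would capture live data with:
#   whois -h whois.radb.net -- '-i origin AS55185' > radb.txt
# A trimmed saved sample is used here so the pipeline runs offline.
cat > radb.txt <<'EOF'
route:          209.87.62.0/24
origin:         AS55185
route6:         2607:f8f0:6a0::/48
origin:         AS55185
EOF

# RADB objects are "key:  value" lines, so the prefix is field 2.
awk '$1 == "route:"  { print $2 }' radb.txt > v4.txt
awk '$1 == "route6:" { print $2 }' radb.txt > v6.txt

# aggregate(1) collapses duplicated and shadowed IPv4 prefixes into the
# shortest covering list; if it isn't installed, the raw list still works.
if command -v aggregate >/dev/null 2>&1; then
  aggregate < v4.txt > v4-aggregated.txt
else
  cp v4.txt v4-aggregated.txt
fi

cat v4-aggregated.txt v6.txt
```

Note that the classic aggregate(1) handles IPv4 only; the route6: list would need a separate pass (e.g. through the companion aggregate6 tool) before being fed into an ipset of family inet6.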
_______________________________________________
Evergreen-general mailing list -- [email protected]
To unsubscribe send an email to [email protected]
