On Monday, February 6, 2017 at 3:21:45 PM UTC-5, Mikko Ohtamaa wrote:
>
> 30k sessions should not be an issue yet. I had 400k sessions and no issues 
> there. Redis should handle a very large number of keys easily.
>

We don't have a problem now, but will in the future.  Right now we're 
forecasting that 500k "bad" bot sessions will start to be an issue for us.  

Redis can handle millions of keys fine, and could handle 10MM of these 
sessions on their own.  But each "bot session" corresponds to a single page 
view, which has multiple elements that are backed by our Redis caches as 
well (partial html fragments, object caching, etc).  Based on current 
usage, at around 500k "sessions" we will begin to max out Redis in terms of 
data-size (not number of keys).

Since this is a "bad bot", all these requests happen rather quickly and 
through a bunch of distributed IPs (I think it's a proxy-fronted botnet 
that is indexing), and the bot doesn't respect rate limits.  Blocking the 
IPs at the firewall is a lot of work too, and they just pop up again.

We operate Redis in an LRU mode, so valid data will be pushed out to make 
room for bot sessions -- the oldest data will largely be the sessions of 
actual users.  The Redis node with sessions is shared within a cluster -- 
so a botnet hitting 10 application servers at the same time is hitting 
Redis 10x at the same time.  This becomes a really annoying performance 
problem fast.

We don't currently use TTLs in Redis.  They were recently removed for better 
performance and more room -- Redis maintains the TTLs separately, so you 
end up having 2 lookups and payloads within Redis.  This isn't noticeable 
under most conditions, but it causes delays during peak load times (and 
wastes a chunk of space).  We recently shifted the timeout to be entirely 
within the payload (Python), and eliminated the EXPIRE calls.
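
For illustration, a minimal sketch of what keeping the timeout inside the payload can look like. This is not our fork's actual code: the JSON layout and the names dump_session, load_session and SESSION_TIMEOUT are made up for the example.

```python
import json
import time

SESSION_TIMEOUT = 24 * 60 * 60  # seconds; hypothetical value for the example


def dump_session(data, now=None, timeout=SESSION_TIMEOUT):
    """Serialize session data with its expiry timestamp inside the payload."""
    now = time.time() if now is None else now
    return json.dumps({"expires": now + timeout, "data": data})


def load_session(raw, now=None):
    """Return the session data, or None if the key is missing or expired."""
    if raw is None:
        return None
    now = time.time() if now is None else now
    payload = json.loads(raw)
    if payload["expires"] <= now:
        # Expired: treat it as absent. No Redis TTL is involved, so the
        # stale key simply waits to be evicted by the LRU policy.
        return None
    return payload["data"]
```

The expiry check costs one comparison in the application after a single GET, and nothing extra is stored on the Redis side.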

Unfortunately pyramid_redis_sessions can't support a "lazy" create without 
substantial rewriting.  The current design creates a placeholder session in 
Redis when a new session is accessed (or an existing one is invalidated). 
Our fork is in a decent place to get around this, because we already defer 
persist/refresh into callbacks.
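
Roughly, a "lazy" create means nothing is written until the session actually holds data. A minimal sketch of the idea follows; LazySession is a hypothetical class, not the pyramid_redis_sessions API, and `store` stands in for whatever wraps the Redis client.

```python
import uuid


class LazySession(dict):
    """Session that touches the store only if it was modified.

    `store` is any object supporting item assignment (a plain dict here).
    Only __setitem__ is tracked in this sketch; update(), pop() and
    friends would need the same treatment.
    """

    def __init__(self, store):
        super().__init__()
        self.store = store
        self.session_id = None  # no ID and no key until something is saved
        self.dirty = False

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.dirty = True

    def persist(self):
        """Meant to run from a response callback; returns True if it wrote."""
        if not self.dirty:
            return False
        if self.session_id is None:
            self.session_id = uuid.uuid4().hex
        self.store[self.session_id] = dict(self)
        self.dirty = False
        return True
```

A bot request that only reads from the session never sets the dirty flag, so persist() is a no-op and no key is created for it.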

Thinking out loud, it may make sense to support multiple expiration 
policies -- put a very short TTL on unauthenticated sessions, and keep the 
authenticated ones as a strict LRU with 24hr sessions.
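
A rough sketch of how that split policy could look. The "user_id" key and the 5 minute value are assumptions for the example, and `client` is assumed to expose a set(name, value, ex=...) method like redis-py's.

```python
UNAUTHENTICATED_TTL = 5 * 60  # seconds; hypothetical "very short" TTL


def ttl_for(session_data, auth_key="user_id"):
    """Return a TTL in seconds for anonymous sessions, None otherwise."""
    if session_data.get(auth_key) is None:
        return UNAUTHENTICATED_TTL
    # Authenticated: no Redis TTL, so the key lives under plain LRU and
    # the 24hr timeout is enforced by the application.
    return None


def save(client, key, raw, session_data):
    """Write the serialized session, attaching a TTL only when anonymous."""
    client.set(key, raw, ex=ttl_for(session_data))
```

With ex=None the key is written without an expiry, so only the anonymous (mostly bot) sessions pay the cost of Redis-side TTL bookkeeping.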

