Cluster protection, mainly. Swift's rate limiting is based on write requests (e.g. PUT, POST, DELETE) per second per container. Since a large number of object writes to a single container can cause some background processes to back up and stop servicing other requests, limiting the ops/sec to a container limits the impact of one busy resource on the rest of the cluster.
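As a rough sketch of the idea (not Swift's actual middleware, which coordinates through memcache as described further down), give each container a "next allowed write" timestamp and push it forward by 1/rate on every write; a request that arrives early gets told how long to sleep. The class and method names here are made up for illustration:

```python
import time

class ContainerWriteLimiter:
    """Toy per-container write rate limiter. Each container gets a
    'next allowed request' timestamp, advanced by 1/rate per write."""

    def __init__(self, max_writes_per_sec):
        self.interval = 1.0 / max_writes_per_sec
        self.next_allowed = {}  # container name -> earliest allowed time

    def sleep_needed(self, container, now=None):
        """Return how many seconds the caller should sleep before
        performing this write against `container`."""
        now = time.time() if now is None else now
        allowed = max(self.next_allowed.get(container, now), now)
        self.next_allowed[container] = allowed + self.interval
        return max(0.0, allowed - now)
```

In a real deployment the `next_allowed` state would have to live somewhere shared (Swift uses a memcache pool) so every proxy process enforces the same limit.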
As others have mentioned, rate limiting at the load balancing layer is fine too
(assuming the LB can limit with enough granularity to matter). Most Swift
clusters have a bunch of proxy servers behind something like HAProxy. It
balances requests across the proxy servers (any proxy can service any request,
so that helps) and may also be used to terminate TLS to offload that from the
proxies.
Now, the more technical detail.
Here are some background bullet points to help understand the flow:
* stuff in Swift is organized logically as <account>/<container>/<object>
* when an object is created, some metadata about that object (name, size,
etag, etc) is sent to the related container
* the container keeps a list of the objects and some aggregated metadata
* containers are implemented as a SQLite3 DB
* containers are replicated in the cluster, typically 3 times
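To make those bullets concrete, here's a toy version of a container DB in SQLite. The schema is heavily simplified and hypothetical (the real one tracks more columns, deletion markers, and sync points), but each object update a container replica receives amounts to a row upsert like this:

```python
import sqlite3

# Simplified, illustrative container DB: one row of metadata per object.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE object (
                    name TEXT PRIMARY KEY,
                    created_at TEXT,
                    size INTEGER,
                    etag TEXT)""")

def container_update(name, timestamp, size, etag):
    # Each object replica sends one of these updates to a container replica.
    conn.execute("INSERT OR REPLACE INTO object VALUES (?, ?, ?, ?)",
                 (name, timestamp, size, etag))
    conn.commit()

container_update("photo.jpg", "1466000000.00000", 1048576, "d41d8cd9...")
```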
OK, so with that info, here's what happens. Let's assume, for simplicity, that
we have a 3x replica policy for objects. So when an object is written, each
object replica also sends one update to one of the container replicas. Suppose
we have a cluster of 100 servers with 10 drives each. Create an object, and it
will be stored on 3 of those 100 servers. Create another, and it will be stored
on 3 other servers. So basically, you've got 100 choose 3 choices of servers, followed by 10 choose 1 choices of drive on each, for where an object can land. As you keep creating objects, the load is spread out across every drive in the cluster, all 1000 of them.
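Plugging in the numbers from the example above (purely illustrative; real placement is decided deterministically by Swift's ring, not an independent combinatorial draw each time):

```python
from math import comb

servers, drives_per_server, replicas = 100, 10, 3

# 100 choose 3 server choices, then 10 choose 1 drive on each chosen server.
server_choices = comb(servers, replicas)        # 161,700 ways to pick 3 servers
drive_choices = drives_per_server ** replicas   # 1,000 ways to pick the drives
print(server_choices * drive_choices)           # 161,700,000 possible placements
```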
But let's say we're adding a bunch of objects to the same container. The
objects will be nicely spread out across all 1000 drives, but we've only got 3
replicas of the container. So with each object write, you've got the exact same three container replicas to update. Every object write ends up with the same 3 of the 100 servers updating the same three hard drives. Suppose you're trying to write 1000 objects/sec to the cluster; that's 1000 updates per second to each of the three container replicas. If you've just got spinning media, that just isn't possible.
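The arithmetic behind that claim, using an illustrative random-IOPS figure for spinning disks (actual drives and update costs vary):

```python
# Back-of-the-envelope numbers, illustrative rather than measured:
object_writes_per_sec = 1000   # cluster-wide writes to ONE container
container_replicas = 3

# Every object write triggers an update to each container replica,
# so each replica's SQLite DB must absorb all 1000 updates/sec.
updates_per_replica = object_writes_per_sec

# A 7200rpm spinning disk manages on the order of ~150 random IOPS.
spinning_disk_iops = 150

print(updates_per_replica / spinning_disk_iops)  # many times over budget
```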
And 1000 writes/sec is a rather low number, compared to what most Swift
clusters expect.
The above is somewhat simplified, but it gets the point across. The container
DBs can become overwhelmed, and the problem gets worse as the containers get
bigger since SQLite gets slower to update as the DB gets bigger. The easiest
way to mitigate the slowdown is for clients to spread writes across many
containers and to put container DBs on SSDs (more IOPS). But even with that,
you still need to give the operator a way to protect the cluster from this
access pattern.
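A sketch of the client-side mitigation: deterministically hash each object name into one of many container names, so writes spread over many container DBs instead of hammering one. The function and naming scheme here are hypothetical, not a Swift API:

```python
from hashlib import md5

def pick_container(object_name, base="logs", shards=64):
    """Hypothetical client-side sharding: map an object name to one of
    `shards` containers, spreading container updates across many DBs."""
    h = int(md5(object_name.encode("utf-8")).hexdigest(), 16)
    return "%s-%02d" % (base, h % shards)

print(pick_container("server42/2016-06-14/req-abc123"))
```

Because the mapping is deterministic, a client can later find an object by hashing its name the same way; listing "everything" then means listing all the shard containers.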
What problems are we protecting against? More and more hardware resources get consumed by background jobs trying to keep up with the container updates and keep the replicas of the container in sync. Since there's a fixed hardware
budget for requests (i.e. you can't get more IOPS or cycles), other requests
now have to wait. Everything slows down, and eventually the whole cluster can
get so far behind it just won't be able to catch up to the correct view of the
world.
We've spent a lot of time working on this problem in the past. We've vastly improved some parts, and we're still working on improving others. The rate limiting functionality that's in Swift is part of the overall solution for operators who manage large, active clusters.
--John
On 14 Jun 2016, at 16:23, Joshua Harlow wrote:
> Am curious,
>
> Any reason why swift got in the business of ratelimiting in the first place?
>
> -Josh
>
> John Dickinson wrote:
>> Swift does rate limiting across the proxy servers ("api servers" in nova
>> parlance) as described at
>> http://docs.openstack.org/developer/swift/ratelimit.html. It uses a memcache
>> pool to coordinate the rate limiting across proxy processes (local or across
>> machines).
>>
>> Code's at
>> https://github.com/openstack/swift/blob/master/swift/common/middleware/ratelimit.py
>>
>> --John
>>
>>
>>
>> On 14 Jun 2016, at 8:02, Matt Riedemann wrote:
>>
>>> A question came up in the nova IRC channel this morning about the
>>> api_rate_limit config option in nova which was only for the v2 API.
>>>
>>> Sean Dague explained that it never really worked because it was per API
>>> server so if you had more than one API server it was busted. There is no
>>> in-tree replacement in nova.
>>>
>>> So the open question here is, what are people doing as an alternative?
>>>
>>> --
>>>
>>> Thanks,
>>>
>>> Matt Riedemann
>>>
>>>
>>> _______________________________________________
>>> OpenStack-operators mailing list
>>> [email protected]
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
