Re: [Twisted-Python] Sending longer messages in AMP

exarkun Thu, 13 Nov 2014 10:20:47 -0800

On 02:57 pm, ga...@gromper.net wrote:

Hi,


We're using AMP and are starting to hit TooLong errors when scaling
our application. In one respect it's a sign that we should do
something like paging large requests and responses, but that's a lot
more work, and comes with its own problems. We also don't need
particularly large payloads: right now, a limit of ~500kiB would allow
us to scale as far as we need and beyond.

I've put together a fork of Twisted's AMP implementation that uses
32-bit length prefixes everywhere, though it limits the maximum
message size to 2MiB. Every other aspect of it is the same so it's a
drop-in replacement, as long as both ends of a connection use it.
However, there's no negotiation phase so it's completely incompatible
on the wire. The overhead of a few extra bytes is negligible for our
use cases, where the networks are all assumed to be low-latency
high-bandwidth LANs.
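
To illustrate the size limit under discussion, here is a minimal sketch of length-prefixed framing with a 16-bit versus a 32-bit prefix. It is not Twisted's amp.py code, and it is not the fork described above; the function names `encode_value` and `decode_value` are made up for this example.

```python
import struct


def encode_value(value: bytes, prefix_format: str = "!H") -> bytes:
    """Prefix ``value`` with its length as a big-endian unsigned integer.

    "!H" is a 16-bit prefix, "!I" is a 32-bit prefix.  Both helper names
    are hypothetical and exist only for this illustration.
    """
    prefix_size = struct.calcsize(prefix_format)
    limit = 2 ** (8 * prefix_size) - 1
    if len(value) > limit:
        raise ValueError(
            "value of %d bytes exceeds the %d-byte limit" % (len(value), limit)
        )
    return struct.pack(prefix_format, len(value)) + value


def decode_value(data: bytes, prefix_format: str = "!H") -> bytes:
    """Read one length-prefixed value from the start of ``data``."""
    prefix_size = struct.calcsize(prefix_format)
    (length,) = struct.unpack(prefix_format, data[:prefix_size])
    return data[prefix_size:prefix_size + length]
```

With a 16-bit prefix the largest value that fits is 65535 bytes; a 32-bit prefix costs two extra bytes per value and lifts that ceiling, which is why both ends of the connection have to agree on the prefix width.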

Are there any reasons that we shouldn't be doing this? Was there a
good reason for 16-bit length prefixes that still holds? Should we be
doing something else?


The short length limit is in place to encourage two things:

 * messages that can be processed in a cooperative-multitasking-friendly
   way

 * the AMP channel can reliably be used to multiplex multiple operations

The limit encourages the former by limiting the total amount of data
it's possible to receive in a single command. Of course, you can still
do ridiculously complicated work based on a small bit of data so this
doesn't guarantee that no matter what you do you'll be safe. But doing
even something simple on a ridiculously large amount of data is probably
guaranteed to take a while.

The limit encourages the latter by putting a limit on the data that
needs to be transferred to complete any one command (or answer). Again,
this isn't a guarantee of safety (you could always have a
`for i in range(10**10): callRemote(...)` loop and clog up the channel
for ages) but it pushes things a bit more in that direction.

At ClusterHQ we *also* maintained a fork of AMP with this limit raised.
Basically, it worked. It did let us get into the kind of trouble that
the limit was supposed to try to avoid (in particular it let us send
around messages that would take longer and longer to be processed - in a
system where keeping latency down was actually sort of important;
fortunately we had *worse* problems introducing latency so this in
particular never bit us too hard ;).

If I assume that the answers are all no, would someone find this
protocol useful if we submitted it for inclusion in Twisted itself?

There are better solutions to the problem. The trouble is that they're
also more work to implement. ;) I think Twisted should hold out for the
better solutions though, not adopt a
like-AMP-but-with-different-hard-coded-limits solution.

What are the better solutions? Library support for paging, basically.
Or, to consider things more generally, library support for streaming.
The AMP implementation in Twisted (note, not the *protocol*) should be
extended to make it easy to pass arbitrarily large streams of data
around - suitably broken into smaller pieces at the box level.

As of right now, the way I'd do that is by introducing a new argument
type (or two) supporting `IProducer` and `IConsumer`. Pass in an
`IProducer` and the library will take the necessary steps to read data
out of it, chunk it up into <=16kB chunks, and re-assemble them on the
receiving side (as another `IProducer`).
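
The chunk-and-reassemble step can be sketched in plain Python. This is only an illustration of the idea, not a proposed Twisted API: `CHUNK_SIZE`, `split_into_chunks` and `reassemble` are hypothetical names, and a real argument type would also have to deal with ordering, end-of-stream signalling and backpressure.

```python
CHUNK_SIZE = 16 * 1024  # assumed chunk ceiling, taken from the "<=16kB" above


def split_into_chunks(payload: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield consecutive slices of ``payload``, each at most ``chunk_size``
    bytes long.  An empty payload yields no chunks at all."""
    for offset in range(0, len(payload), chunk_size):
        yield payload[offset:offset + chunk_size]


def reassemble(chunks) -> bytes:
    """Join chunks, received in order, back into the original payload."""
    return b"".join(chunks)
```

Each chunk would travel in its own box, so other commands can be interleaved between chunks and the channel stays usable for multiplexing.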

There are two reasons I'm not working on this right now (apart from the
standard reasons of not having time to do so ;):

1) IProducer / IConsumer aren't amenable to this kind of decoupling.
You can register a producer with a consumer but you can't register a
consumer with a producer. By the time you give the IProducer to AMP,
it's too late to tell it you want it to send its data into the AMP
implementation for the necessary handling. We worked around this in
twisted.web.client.Agent by introducing a new IProducer-like interface.
It solves the basic problem but it doesn't go any further to improve the
usability of the interfaces.

2) Tubes. Glyph is working on a replacement for IProducer/IConsumer
that does go a lot further to improve usability. With this promise of a
bright, prosperous future looming, it's hard to get excited about
implementing for AMP a just-barely-good-enough solution like the one
used by Agent (in particular, with the knowledge that the tubes solution
will be API incompatible and we'll most likely want to deprecate the
IProducer/IConsumer thing).
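
For point 1 above, the shape of the Agent-style workaround can be sketched without Twisted: the producer is told about its consumer only when it is started, so whoever receives the producer (here, hypothetically, the AMP implementation) chooses where the data goes. The class names below are invented for this example. The `startProducing(consumer)` method name follows the one I believe `IBodyProducer` uses, but this toy version is synchronous and returns nothing, whereas the real interface is asynchronous.

```python
class ListConsumer:
    """Hypothetical consumer that records every chunk written to it."""

    def __init__(self):
        self.received = []

    def write(self, data: bytes) -> None:
        self.received.append(data)


class BytesBodyProducer:
    """Hypothetical producer that does not know its consumer up front.

    The consumer is supplied by the caller of ``startProducing``, which is
    the direction of registration that IProducer/IConsumer do not offer.
    """

    def __init__(self, payload: bytes, chunk_size: int = 16 * 1024):
        self.payload = payload
        self.chunk_size = chunk_size

    def startProducing(self, consumer) -> None:
        # Synchronous for simplicity; a real implementation would write
        # incrementally and honour pause/stop requests.
        for offset in range(0, len(self.payload), self.chunk_size):
            consumer.write(self.payload[offset:offset + self.chunk_size])
```

A caller would construct the producer, hand it over, and the receiving library would call `startProducing` with a consumer of its own that wraps each chunk in a box.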


Jean-Paul

The code right now is a straight copy of amp.py and test_amp.py with
changes to 32-bit length prefixes everywhere, but for upstreaming we'd
probably propose instead to modify the original to have an optional
negotiation phase, and to make the maximum message size a parameter.

Thanks!

Gavin.

_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python

