You should release a token when the RPC completes, rather than rely solely on the rate limiter. Java's Semaphore class lets you do this. It's usually not a good idea to think in terms of QPS for this kind of use; it's better to think in terms of in-flight RPCs.
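A minimal sketch of that pattern, with the gRPC call replaced by a plain executor task so it runs standalone (MAX_IN_FLIGHT, the fake server pool, and the counter are all illustrative, not from this thread): acquire a permit before each send, and release it only when the call completes.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class InFlightLimiter {
    // Cap on concurrent outstanding RPCs; the exact number is illustrative.
    private static final int MAX_IN_FLIGHT = 100;

    public static void main(String[] args) throws InterruptedException {
        Semaphore permits = new Semaphore(MAX_IN_FLIGHT);
        ExecutorService fakeServer = Executors.newFixedThreadPool(8);
        AtomicInteger completed = new AtomicInteger();

        for (int i = 0; i < 1000; i++) {
            permits.acquire();          // blocks once MAX_IN_FLIGHT calls are outstanding
            fakeServer.submit(() -> {   // stands in for requestObserver.onNext(...)
                try {
                    // simulated server-side work would go here
                } finally {
                    completed.incrementAndGet();
                    permits.release();  // release when the RPC completes, not on a timer
                }
            });
        }
        fakeServer.shutdown();
        fakeServer.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("completed=" + completed.get());
    }
}
```

Unlike a pure RateLimiter, this backpressures automatically: if the server slows down, acquire() blocks and the client stops queueing work.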
On Thursday, November 2, 2017 at 3:29:17 PM UTC-7, [email protected] wrote:

> Hey Eric!
>
>> That doesn't sound right, unless it is parallel across streams. onNext is
>> not thread-safe; you'd need to hold a lock when calling it from multiple
>> threads simultaneously. This isn't the cause of your problem, but it is a
>> problem.
>
> Thanks for pointing that out!
>
>> Is that 1000 QPS total, and all over one stream? Using a single stream
>> can't use much more than a single core of processing (excluding protobuf
>> and the application), so you may want to use more streams. But 1k QPS is
>> really low. We see 750k QPS
>> <https://performance-dot-grpc-testing.appspot.com/explore?dashboard=5652536396611584>
>> ("Streaming secure throughput QPS (8 core client to 8 core server)")
>> between a client and server with 8 cores each. Even with non-streaming
>> RPCs we see 250k QPS.
>
> I'm trying 1000 QPS continuously for a minute, all over one stream. FWIW,
> 500 QPS works fine, and I tried that for up to 5 minutes. For those charts
> you showed, are you able to share the server and client configs/code that
> produced those metrics? Also, I am still using gRPC 1.5, and I know there
> have been performance improvements since then, but I still think 1000 QPS
> is very low for this version.
>
> Anything else you could recommend? Thanks again.
>
> On Thursday, November 2, 2017 at 2:33:50 PM UTC-7, Eric Anderson wrote:
>
>> On Thu, Nov 2, 2017 at 11:26 AM, <[email protected]> wrote:
>>
>>> On my client, I am using Guava's RateLimiter
>>> <https://google.github.io/guava/releases/22.0/api/docs/index.html?com/google/common/util/concurrent/RateLimiter.html>
>>> to send messages in a bi-di stream at 1000 per second (all using a shared
>>> channel and stub). Each message I am sending in a Runnable() just to
>>> parallelize the work.
>>
>> That doesn't sound right, unless it is parallel across streams. onNext is
>> not thread-safe; you'd need to hold a lock when calling it from multiple
>> threads simultaneously. This isn't the cause of your problem, but it is a
>> problem.
>>
>>> Same behavior happens if I just call `onNext` directly without the task
>>> submission step. Code roughly looks like:
>>>
>>>     final long startTime = System.currentTimeMillis();
>>>     final long oneMinute = TimeUnit.MINUTES.toMillis(1);
>>>     final RateLimiter rateLimiter = RateLimiter.create(1000);
>>>     final StreamObserver<TestMessageRequest> requestObserver =
>>>         client.asyncStub.testMessageRpc(client.replyObserver);
>>>
>>>     while (System.currentTimeMillis() - startTime < oneMinute) {
>>>       rateLimiter.acquire(1);
>>>       threadPool.submit(() -> {
>>>         TestMessageRequest request = TestMessageRequest.getDefaultInstance();
>>>         requestObserver.onNext(request);
>>>       });
>>>     }
>>
>> It's not observing outbound flow control, and you're sending messages
>> faster than they are consumed. This causes the buffered messages to
>> consume too much space. You need to use ClientCallStreamObserver.isReady()
>> and setOnReadyHandler(). Your replyObserver will need to implement
>> ClientResponseObserver in order to call setOnReadyHandler() during
>> beforeStart(). If isReady() is false, then pause sending to avoid
>> excessive buffering. When isReady() transitions from false back to true,
>> the handler passed to setOnReadyHandler() will be called.
>>
>> See the ManualFlowControlClient
>> <https://github.com/grpc/grpc-java/blob/master/examples/src/main/java/io/grpc/examples/manualflowcontrol/ManualFlowControlClient.java>,
>> although you don't need to mess with disableAutoInboundFlowControl() and
>> request().
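The isReady()/setOnReadyHandler() advice above can be sketched without a gRPC dependency. In this standalone model a bounded queue stands in for the HTTP/2 flow-control window; ReadyGate, the window size of 16, and drainOne() are all illustrative, while the method names mirror the real ClientCallStreamObserver API.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Models the pattern: only send while "isReady", and let an on-ready
// callback resume sending once the transport drains. The real API is
// ClientCallStreamObserver.isReady() and setOnReadyHandler().
public class ReadyGate {
    private final BlockingQueue<String> window = new ArrayBlockingQueue<>(16);
    private Runnable onReadyHandler = () -> {};

    boolean isReady() { return window.remainingCapacity() > 0; }
    void setOnReadyHandler(Runnable h) { onReadyHandler = h; }
    void onNext(String msg) { window.add(msg); } // throws if the window is overrun
    void drainOne() {                            // "server" consumes one message
        boolean wasFull = !isReady();
        window.poll();
        if (wasFull && isReady()) onReadyHandler.run(); // false -> true transition
    }

    public static void main(String[] args) {
        ReadyGate call = new ReadyGate();
        AtomicInteger next = new AtomicInteger();
        int total = 100;
        // The sender writes only while isReady(); the handler restarts it.
        Runnable sender = () -> {
            while (next.get() < total && call.isReady()) {
                call.onNext("msg-" + next.getAndIncrement());
            }
        };
        call.setOnReadyHandler(sender);
        sender.run();            // initial burst fills the 16-slot window
        while (next.get() < total) {
            call.drainOne();     // each drain may retrigger the sender
        }
        System.out.println("sent=" + next.get());
    }
}
```

All 100 messages are sent without ever overrunning the window: the sender pauses whenever isReady() is false, exactly what the unbounded RateLimiter loop above fails to do.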
>>
>>> So anywhere from the 20-60 second mark my server throws a:
>>>
>>>     SEVERE: Exception while executing runnable
>>>     io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1@48544a03
>>>     [java] java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>
>>> Am I doing something wrong? Is there any way to have the server support
>>> this high load?
>>
>> Is that 1000 QPS total, and all over one stream? Using a single stream
>> can't use much more than a single core of processing (excluding protobuf
>> and the application), so you may want to use more streams. But 1k QPS is
>> really low. We see 750k QPS
>> <https://performance-dot-grpc-testing.appspot.com/explore?dashboard=5652536396611584>
>> ("Streaming secure throughput QPS (8 core client to 8 core server)")
>> between a client and server with 8 cores each. Even with non-streaming
>> RPCs we see 250k QPS.

--
You received this message because you are subscribed to the Google Groups "grpc.io" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/grpc-io.
To view this discussion on the web visit https://groups.google.com/d/msgid/grpc-io/0ed79955-410d-4327-9091-a45785a5dd1c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
