I'll do some further experimentation based on what you mention. 

Thanks


On Monday, February 5, 2018 at 5:01:15 PM UTC-8, Carl Mastrangelo wrote:
>
> Ah, I thought you were trying to measure the latency of a single RPC.  We 
> have two QPS benchmarks: an open-loop and a closed-loop benchmark.  The 
> closed-loop benchmark runs the single-RPC latency benchmark with 200 
> parallel copies, so there are only ever 200 active RPCs at a time.  The 
> latency is recorded, but not published anywhere.
>
> From your description, the open-loop benchmark sounds more like what you 
> are doing.  We have a client with a target QPS that uses an exponentially 
> distributed delay between starting RPCs.  This simulates real traffic 
> better and produces occasional bursts of RPCs.  We use it to measure CPU 
> usage while holding the QPS constant.
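The open-loop pacing described above can be sketched in standalone C++. This is a minimal illustration, not gRPC's actual benchmark code: the function name and fixed seed are mine. Drawing inter-arrival gaps from an exponential distribution makes RPC starts form a Poisson process at the target QPS, independent of how long each RPC takes, which is what produces the occasional bursts.

```cpp
#include <random>
#include <vector>

// Sketch of open-loop pacing: each gap (in seconds) before starting the
// next RPC is drawn from an exponential distribution with rate = target
// QPS, so the mean gap is 1/QPS and bursts occur naturally.
std::vector<double> plan_interarrival_gaps(double target_qps, int n) {
    std::mt19937 rng(42);  // fixed seed for reproducible runs
    std::exponential_distribution<double> gap(target_qps);
    std::vector<double> gaps;
    gaps.reserve(n);
    for (int i = 0; i < n; ++i) gaps.push_back(gap(rng));
    return gaps;
}
```

A driver would sleep for each gap and then start (not wait on) the next RPC; the load stays at the target QPS even when individual RPCs are slow.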
>
>
> Larger payloads making the system faster is odd, and may be explained by 
> your benchmark machine.  For example, if there is no work for gRPC to do, 
> it will go to sleep.  When the amount of work is too low, it spends a lot 
> of time waking up and going back to sleep, lowering overall performance. 
> Strangely, adding more work (with bigger payloads) means the system never 
> goes to sleep and thus accomplishes more real work.  We work around this 
> by trying to keep the machine as close to 100% CPU as possible without 
> going over.  Additionally, we disable CPU frequency scaling to ensure 
> stable results.  (The CPU down-clocks while waiting for network traffic, 
> and doesn't speed back up fast enough when there is data.)
>
>
> We benchmark almost exclusively on Linux.
>
>
>
>
> On Monday, February 5, 2018 at 4:32:55 PM UTC-8, [email protected] wrote:
>>
>> We actually have 8 threads sending bursts of requests simultaneously and 
>> measuring each request individually. We use bursts of requests followed 
>> by a wait to avoid hammering the server with a huge number of requests. 
>> You seem to be describing a single client that sends one request and 
>> waits for the response before sending another. We are not doing that; we 
>> are simulating a kind of QPS approximation and measuring the latency.
>>
>> The behavior I'm seeing is that smaller payloads are slower than bigger 
>> payloads. I was thinking it might have to do with some buffer taking 
>> longer to fill before being sent over the wire.
>>
>> The results you mention, are they from the Windows stack? 
>>
>> Thanks
>>
>> Eduardo
>>
>> On Monday, February 5, 2018 at 4:24:15 PM UTC-8, Carl Mastrangelo wrote:
>>>
>>> By closed loop I mean starting a new RPC upon completion of the 
>>> previous one.  I think that is the same as your option b).  These 
>>> should always be faster with small payloads than with larger payloads, 
>>> which it seems like you are saying is happening?   
>>>
>>>
>>> We have closed-loop latency tests that use a 1-byte payload and measure 
>>> the 50th and 99th percentiles.  We see about 100us per RPC at the 50th 
>>> percentile.
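Reporting the 50th and 99th percentiles from recorded per-RPC latencies can be sketched as follows. This is an illustrative nearest-rank computation, not gRPC's benchmark code; the function name is mine, and it assumes a non-empty sample set.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch: nearest-rank percentile over recorded latencies (microseconds).
// Takes the vector by value so the caller's sample order is untouched.
// Assumes samples is non-empty.
double percentile(std::vector<double> samples, double pct) {
    std::sort(samples.begin(), samples.end());
    std::size_t idx =
        static_cast<std::size_t>(pct / 100.0 * (samples.size() - 1));
    return samples[idx];
}
```

With per-RPC latencies collected during a closed-loop run, `percentile(latencies, 50)` and `percentile(latencies, 99)` give the two figures quoted above.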
>>>
>>>  
>>>
>>> On Monday, February 5, 2018 at 4:16:29 PM UTC-8, [email protected] 
>>> wrote:
>>>>
>>>> With closed loop do you mean 
>>>>
>>>> a) using loopback?
>>>> b) measuring from when the request is made and finish measuring when 
>>>> the response gets back?
>>>>
>>>> In our test, we are not using loopback (two VMs over the network). We 
>>>> start measuring right before calling into ClientAsyncResponseReader 
>>>> and Finish, and we stop measuring when we get the response back and 
>>>> our callback gets called.
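The measurement window described above (start before issuing the call, stop when the callback fires) can be sketched with a monotonic clock. The helper name is mine and the RPC itself is stubbed as a callable; a steady clock avoids wall-clock adjustments skewing the measured latency.

```cpp
#include <chrono>

// Sketch: time an operation with a monotonic clock and return
// the elapsed time in microseconds. The callable is expected to
// issue the (async) RPC and block until the callback completes.
template <typename Fn>
long long time_us(Fn&& issue_and_wait) {
    auto start = std::chrono::steady_clock::now();
    issue_and_wait();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start)
        .count();
}
```

One latency sample per RPC would then be `time_us([&] { /* start RPC, wait for callback */ })`, appended to the sample set for percentile reporting.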
>>>>
>>>> If closed loop means something else please explain further.
>>>>
>>>> I may be able to share the code but before I go through that process do 
>>>> you have any general suggestions that I can try or consider?
>>>>
>>>> Thanks
>>>>
>>>> Eduardo
>>>>
>>>>
>>>> On Monday, February 5, 2018 at 3:43:34 PM UTC-8, Carl Mastrangelo wrote:
>>>>>
>>>>> Are you doing a closed loop latency test like gRPC benchmarking does?  
>>>>>  Also, can you show your code?
>>>>>
>>>>> On Monday, February 5, 2018 at 3:10:03 PM UTC-8, [email protected] 
>>>>> wrote:
>>>>>>
>>>>>> Hi, I'm working on a custom latency test. I'm using payloads of 1 
>>>>>> byte, 200 bytes, 1 KB and 10 KB. The 1-byte tests show a very big 
>>>>>> difference from the rest of the payloads (longer/worse latency).
>>>>>>
>>>>>> I'm working with gRPC for C++ on Windows. I'm guessing this has to 
>>>>>> do with some HTTP/2 packing or optimization logic, meaning it takes 
>>>>>> longer for packets to be sent while a buffer fills.
>>>>>>
>>>>>> Which configuration options should I look at modifying to see if I 
>>>>>> can improve this behavior?
>>>>>>
>>>>>> I've tried looking around in 
>>>>>>
>>>>>> https://github.com/grpc/grpc/blob/master/include/grpc/grpc.h
>>>>>>
>>>>>> and in
>>>>>>
>>>>>>
>>>>>> https://github.com/grpc/grpc/blob/master/include/grpc/impl/codegen/grpc_types.h
>>>>>>
>>>>>> with no luck. What do you suggest?
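For experimenting with the channel arguments defined in grpc_types.h, settings are passed through `grpc::ChannelArguments` at channel creation. The sketch below shows the mechanism; the two specific arguments are real names from grpc_types.h, but treating them as relevant to this small-payload behavior is an assumption to test, not a documented fix.

```cpp
#include <grpcpp/grpcpp.h>

// Hedged sketch: build a channel with a couple of channel arguments that
// are plausible knobs for small-message latency experiments. Whether
// either helps in this particular case is an assumption to verify.
std::shared_ptr<grpc::Channel> MakeTunedChannel(const std::string& target) {
  grpc::ChannelArguments args;
  // Shrink HTTP/2 write buffering for streaming calls (bytes).
  args.SetInt(GRPC_ARG_HTTP2_WRITE_BUFFER_SIZE, 0);
  // Disable BDP probing so flow-control window sizes stay fixed.
  args.SetInt(GRPC_ARG_HTTP2_BDP_PROBE, 0);
  return grpc::CreateCustomChannel(
      target, grpc::InsecureChannelCredentials(), args);
}
```

Comparing 1-byte latency with and without each argument, one at a time, would isolate whether any of them is involved.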
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Eduardo
>>>>>>
>>>>>

-- 
You received this message because you are subscribed to the Google Groups 
"grpc.io" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/grpc-io.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/grpc-io/2b8c6364-44d0-4516-bbe9-a2b1655e2682%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
