Thanks Eric. Indeed, that is the way Titan works during graph traversal: https://groups.google.com/forum/#!topic/aureliusgraphs/IwzRMNB0zzM. Newer Titan versions attempt to batch the requests and thereby reduce the number of network roundtrips. I will give that a shot.
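In case it helps anyone else hitting this, the configuration I intend to try next looks roughly like the sketch below. Note that query.batch and the two consistency-level properties reflect my reading of the Titan configuration reference; I have not verified the exact option names against 0.5.4 yet, so please treat them as assumptions:

import org.apache.commons.configuration.BaseConfiguration;
import com.thinkaurelius.titan.core.TitanFactory;
import com.thinkaurelius.titan.core.TitanGraph;

public class PacketStormConfigSketch {
    public static void main(String[] args) {
        BaseConfiguration conf = new BaseConfiguration();
        conf.setProperty("storage.backend", "cassandrathrift");
        conf.setProperty("storage.hostname", "x.x.x.98"); // C* node in the remote DC
        // Assumption: "query.batch" batches traversal queries against the
        // storage backend, collapsing many tiny multiget_slice round trips.
        conf.setProperty("query.batch", true);
        // Assumption: these properties set the thrift backend's consistency
        // levels; LOCAL_ONE should avoid cross-DC reads in a multi-DC setup.
        conf.setProperty("storage.cassandra.read-consistency-level", "LOCAL_ONE");
        conf.setProperty("storage.cassandra.write-consistency-level", "LOCAL_QUORUM");

        TitanGraph graph = TitanFactory.open(conf);
        // ... run the traversal that previously produced ~2'700 packets ...
        graph.shutdown();
    }
}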
Ralf


> On 26.05.2016, at 18:36, Eric Stevens <migh...@gmail.com> wrote:
> 
> If it's a single node cluster, then it's not consistency level related, as all
> consistencies are essentially the same. This looks instead like a usage
> pattern that's entirely driven by Titan's read pattern, which appears to be
> lots of tiny reads (probably not too surprising for a graph database). In
> that case you probably want to go to the Titan community to see what their
> recommendations are WRT performance.
> 
> On Thu, May 26, 2016 at 1:18 AM Ralf Steppacher <ralf.viva...@gmail.com> wrote:
> Eric,
> 
> thanks for the hint. Titan 0.5.4 uses ONE, not LOCAL_ONE. I can try and patch
> the version. Given that it is a single node cluster for the time being, would
> your remarks apply to that particular setup?
> 
> 
> Thanks again!
> Ralf
> 
> 
>> On 24.05.2016, at 19:18, Eric Stevens <migh...@gmail.com> wrote:
>> 
>> I'm not familiar with Titan's usage patterns for Cassandra, but I wonder if
>> this is because of the consistency level it's querying Cassandra at - i.e.
>> if the CL isn't LOCAL_[something], then this might just be lots of little
>> checksums required to satisfy consistency requirements.
>> 
>> On Mon, May 23, 2016 at 7:22 AM Ralf Steppacher <ralf.viva...@gmail.com> wrote:
>> I remembered that Titan treats edges (and vertices?) as immutable and
>> deletes and re-creates the entity on every change.
>> So I set gc_grace_seconds to 0 for every table in the Titan keyspace and
>> ran a major compaction. However, this made the situation worse. Instead of
>> roughly 2’700 TCP packets per user request before the compaction, the same
>> request now results in 5’400 packets, which is suspiciously close to a
>> factor of 2. But I have no idea what to make of it.
>> 
>> Ralf
>> 
>> 
>> > On 20.05.2016, at 15:11, Ralf Steppacher <ralf.steppac...@vivates.ch> wrote:
>> > 
>> > Hi all,
>> > 
>> > tl;dr
>> > The Titan 0.5.4 cassandrathrift client + C* 2.0.8/2.2.6 create massive
>> > amounts of network packets for multiget_slice queries. Is there a way to
>> > avoid the “packet storm”?
>> > 
>> > 
>> > Details...
>> > 
>> > We are using Titan 0.5.4 with its cassandrathrift storage backend to
>> > connect to a single-node cluster running C* 2.2.6 (we also tried 2.0.8,
>> > which is the version in Titan’s dependencies). When moving to a
>> > multi-datacenter setup with the client in one DC and the C* server in the
>> > other, we ran into the problem that response times from Cassandra/the
>> > graph became unacceptable (>30s vs. 0.2s within the datacenter). Looking at
>> > the network traffic, we saw that the client and server exchange a massive
>> > number of very small packets.
>> > The user action we were tracing yields three packets of type “REPLY
>> > multiget_slice”. Per such a reply we see about 1’000 packet pairs like
>> > this going back and forth between client and server:
>> > 
>> > 968 09:45:55.354613 x.x.x.30 x.x.x.98 TCP 181 54406 → 9160 [PSH, ACK] Seq=53709 Ack=39558 Win=1002 Len=115 TSval=4169130400 TSecr=4169119527
>> > 0000 00 50 56 a7 d6 0d 00 0c 29 d1 a4 5e 08 00 45 00 .PV.....)..^..E.
>> > 0010 00 a7 e3 6d 40 00 40 06 fe 3c ac 13 00 1e ac 13 ...m@.@..<......
>> > 0020 00 62 d4 86 23 c8 2c 30 4e 45 1b 4b 0b 55 80 18 .b..#.,0NE.K.U..
>> > 0030 03 ea 59 40 00 00 01 01 08 0a f8 7f e1 a0 f8 7f ..Y@............
>> > 0040 b7 27 00 00 00 6f 80 01 00 01 00 00 00 0e 6d 75 .'...o........mu
>> > 0050 6c 74 69 67 65 74 5f 73 6c 69 63 65 00 00 3a 38 ltiget_slice..:8
>> > 0060 0f 00 01 0b 00 00 00 01 00 00 00 08 00 00 00 00 ................
>> > 0070 00 00 ab 00 0c 00 02 0b 00 03 00 00 00 09 65 64 ..............ed
>> > 0080 67 65 73 74 6f 72 65 00 0c 00 03 0c 00 02 0b 00 gestore.........
>> > 0090 01 00 00 00 02 72 c0 0b 00 02 00 00 00 02 72 c1 .....r........r.
>> > 00a0 02 00 03 00 08 00 04 7f ff ff ff 00 00 08 00 04 ................
>> > 00b0 00 00 00 01 00 .....
>> > 
>> > 969 09:45:55.354825 x.x.x.98 x.x.x.30 TCP 123 9160 → 54406 [PSH, ACK] Seq=39558 Ack=53824 Win=1540 Len=57 TSval=4169119546 TSecr=4169130400
>> > 0000 00 0c 29 d1 a4 5e 00 50 56 a7 d6 0d 08 00 45 00 ..)..^.PV.....E.
>> > 0010 00 6d 19 dd 40 00 40 06 c8 07 ac 13 00 62 ac 13 .m..@.@......b..
>> > 0020 00 1e 23 c8 d4 86 1b 4b 0b 55 2c 30 4e b8 80 18 ..#....K.U,0N...
>> > 0030 06 04 3b d6 00 00 01 01 08 0a f8 7f b7 3a f8 7f ..;..........:..
>> > 0040 e1 a0 00 00 00 35 80 01 00 02 00 00 00 0e 6d 75 .....5........mu
>> > 0050 6c 74 69 67 65 74 5f 73 6c 69 63 65 00 00 3a 38 ltiget_slice..:8
>> > 0060 0d 00 00 0b 0f 00 00 00 01 00 00 00 08 00 00 00 ................
>> > 0070 00 00 00 ab 00 0c 00 00 00 00 00 ………..
>> > 
>> > With very few exceptions, all packets have the exact same length of 181 and
>> > 123 bytes respectively. The overall response time of the graph query grows
>> > approx. linearly with the network latency.
>> > As even “normal” internet network latencies render the setup useless, I
>> > assume we are doing something wrong. Is there a way to avoid that storm of
>> > small packets by configuration? Or is Titan’s cassandrathrift storage
>> > backend to blame for this?
>> > 
>> > 
>> > Thanks in advance!
>> > Ralf
>> 
> 
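PS: For completeness, the gc_grace_seconds experiment mentioned further up in the thread boiled down to the following, repeated for every table in the Titan keyspace and followed by a major compaction. I am assuming the default keyspace name "titan" here; "edgestore" is the column family visible in the packet dump above:

-- in cqlsh, once per table in the Titan keyspace
ALTER TABLE titan.edgestore WITH gc_grace_seconds = 0;

# on the Cassandra node, trigger the major compaction
nodetool compact titan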