Thanks for sharing this, Steve!
On Tue, Aug 12, 2014 at 11:03 AM, Steve Miller <st...@idrathernotsay.com> wrote:

> I'd seen references to there being a Kafka protocol dissector built
> into wireshark/tshark 1.12, but what I could find on that was a bit light
> on the specifics as to how to get it to do anything -- at least for someone
> (like me) who might use tcpdump a lot but who doesn't use tshark a lot.
>
> I got this working, so I figured I'd post a few pointers here on the
> off-chance that they save someone else a bit of time.
>
> Note that I'm using tshark, not wireshark; this might be easier and/or
> different in wireshark, but I don't feel like moving many gigabytes of data
> to a place where I can use wireshark. (-:
>
> If you're reading traffic live, you'll want to do something like this:
>
>     tshark -V -i eth1 -o 'kafka.tcp.port:9092' -d tcp.port=9092,kafka -f 'dst port 9092' -Y (kafka options)
>
> For example, if you want to see output only for ProduceRequest and
> ProduceResponses, and only for the topic "mytopic", you can do:
>
>     tshark -V -i eth1 -o 'kafka.tcp.port:9092' -d tcp.port=9092,kafka -f 'dst port 9092' -Y 'kafka.topic_name==mytopic && kafka.request_key==0'
>
> You can get a complete list of Kafka-related fields by doing:
>
>     tshark -G fields | grep -i kafka
>
> There is a very significant downside to processing packets live: tshark
> uses dumpcap to generate the actual packets, and unless I'm missing some
> obscure tshark option (which is possible!) it won't toss old data. So if
> you run this for a few hours, you'll end up with a ginormous file.
>
> By default (under Linux, at least) tshark is going to put that file in
> /tmp, so if your /tmp is small and/or a tmpfs that can make things a little
> exciting. You can get around that by doing:
>
>     (export TMPDIR=/big/damn/filesystem ; tshark bla bla bla)
>
> which I figure given typical Kafka data volumes is probably pretty
> important to know, and which doesn't seem to be documented in the tshark
> man pages. It is at least not all that hard to search for.
>
> In theory, you can use the tshark "-b" option to specify a ring buffer
> of files, even for real-time processing, though:
>
> * adding -b anything (e.g., "-b files:1 -b filesize:1024") seems
>   to want to force you to use -w (filename)
>
> * just adding -b and -w to the invocation above gets a warning
>   about display filters not being supported when capturing and saving packets
>
> * changing -Y to -2 -R and/or adding -P doesn't seem to help
>
> (though again someone with more tshark experience might know the magic
> combination of arguments to get this to do what it's told).
>
> So instead, you can capture packets somewhere, e.g.:
>
>     tcpdump -n -s 0 -w /var/tmp/kafka.tcpd -i eth1 'port 9092'
>
> and then decode them later:
>
>     tshark -V -r /var/tmp/kafka.tcpd -o 'kafka.tcp.port:9092' -d tcp.port=9092,kafka -R 'kafka.topic_name==mytopic && kafka.request_key==0' -2
>
> Anyway, if you're seeing protocol-related weirdness, hopefully this
> will be at least of some help to you.
>
> -Steve
> (Yes, the email address is a joke. Just not on you! It does work.)
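
One possible way around the ring-buffer trouble described above is to let tcpdump do the rotation and keep tshark for offline decoding. As far as I know, tcpdump's -C and -W options together give a rotating set of capture files. What follows is only a sketch under assumptions, not something run against the setup above: the interface, port, path, sizes, and the helper names `run`, `capture_kafka`, and `decode_kafka` are all placeholders made up for illustration. With DRY_RUN=1 (the default here) the helpers only print the command they would run.

```shell
#!/bin/sh
# Sketch: bounded capture with tcpdump, offline decode with tshark.
# Every value below is an illustrative assumption; adjust for your hosts.
IFACE="${IFACE:-eth1}"
PORT="${PORT:-9092}"
CAPTURE="${CAPTURE:-/var/tmp/kafka.tcpd}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Capture both directions on the Kafka port into a rotating set of files:
#   -C 100  start a new file after roughly 100 million bytes
#   -W 10   keep at most 10 files, then overwrite from the beginning
capture_kafka() {
    run tcpdump -n -s 0 -i "$IFACE" -C 100 -W 10 -w "$CAPTURE" "port $PORT"
}

# Decode one capture file, showing only produce traffic for one topic.
#   $1 = capture file, $2 = topic name
decode_kafka() {
    run tshark -V -r "$1" -o "kafka.tcp.port:$PORT" \
        -d "tcp.port==$PORT,kafka" \
        -2 -R "kafka.topic_name==$2 && kafka.request_key==0"
}
```

Sourced into a shell, `capture_kafka` prints the tcpdump command and `decode_kafka /var/tmp/kafka.tcpd0 mytopic` prints the tshark one; set DRY_RUN=0 to actually run them. The numbered file name is my assumption about how tcpdump names rotated files, so check the capture directory for the real names before decoding.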