Hi Sharninder and Ashnish,

Thanks for your nice suggestions. I agree that one good solution would be to write a small tool that glues libpcap, Avro and Flume together.
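To make the idea concrete, here is a rough sketch of the "glue" step, not a finished tool. It assumes the packets have already been captured into a classic pcap file (for example with tcpdump), that the link layer is Ethernet, and that the traffic is IPv4. It pulls out the TCP payload of each packet and wraps it in a dict shaped like a Flume event (a string-to-string `headers` map plus a `bytes` body). The `src` and `dst` header names and all function names are my own placeholders, and it does no TCP stream reassembly, so each event is a single segment.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap with microsecond timestamps


def read_pcap_records(data):
    """Yield (ts_sec, ts_usec, frame) tuples from classic pcap bytes."""
    magic = struct.unpack_from("<I", data, 0)[0]
    if magic == PCAP_MAGIC:
        endian = "<"
    elif magic == 0xD4C3B2A1:
        endian = ">"  # file was written on a big-endian machine
    else:
        raise ValueError("not a classic pcap file")
    offset = 24  # the global header is 24 bytes
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset)
        offset += 16
        yield ts_sec, ts_usec, data[offset:offset + incl_len]
        offset += incl_len


def tcp_payload(frame):
    """Return (src, dst, payload) for an Ethernet/IPv4/TCP frame, else None."""
    if len(frame) < 14 + 20 or frame[12:14] != b"\x08\x00":
        return None  # too short, or EtherType is not IPv4
    ip = frame[14:]
    ihl = (ip[0] & 0x0F) * 4
    if ip[0] >> 4 != 4 or ip[9] != 6:
        return None  # not IPv4, or protocol is not TCP
    total_len = struct.unpack_from("!H", ip, 2)[0]
    tcp = ip[ihl:total_len]
    if len(tcp) < 20:
        return None  # truncated TCP header
    src_port, dst_port = struct.unpack_from("!HH", tcp, 0)
    data_offset = (tcp[12] >> 4) * 4
    src = ".".join(str(b) for b in ip[12:16]) + ":" + str(src_port)
    dst = ".".join(str(b) for b in ip[16:20]) + ":" + str(dst_port)
    return src, dst, tcp[data_offset:]


def events_from_pcap(data):
    """Yield Flume-style event dicts for every TCP segment carrying data."""
    for ts_sec, ts_usec, frame in read_pcap_records(data):
        parsed = tcp_payload(frame)
        if parsed is None or not parsed[2]:
            continue  # skip non-TCP frames and bare ACK/SYN/FIN segments
        src, dst, payload = parsed
        yield {
            "headers": {
                # Flume's "timestamp" header is epoch milliseconds as a string
                "timestamp": str(ts_sec * 1000 + ts_usec // 1000),
                "src": src,  # placeholder header name
                "dst": dst,  # placeholder header name
            },
            "body": payload,
        }
```

For on-the-fly collection the file reader would be replaced by a live capture loop through a libpcap binding, and the resulting events would still need to be serialized and sent to the Avro source, which this sketch leaves out.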

2014-08-01 14:27 GMT+08:00 Sharninder <sharnin...@gmail.com>:

> Liu, you first need to figure out what TCP data you want to collect. Is
> there a possibility that this data can be collected at some central
> router/gateway using SNMP?
>
> If not SNMP then you can definitely run something like wireshark or write
> up your own tool using a library like libpcap and collect data passing
> through the network card. I'm not sure if this is what you want.
>
> Once you've decided on the data that you want to collect, it is definitely
> possible to use flume to collect it and the easiest would be to write a
> utility to consume that data and convert it to avro and then use the avro
> source on the flume side.
>
> That's my suggestion. Write your own tool to collect data, bundle it into
> avro events and pass them on to flume.
>
> On Fri, Aug 1, 2014 at 11:25 AM, Liu Blade <hafzc...@gmail.com> wrote:
>
>> Hi folks,
>>
>> Sorry didn't clarify my problem. The problem has two folds: (1) use
>> which way to collect incoming TCP streams from external connections, and it
>> must be made on the fly; (2)use which method as Flume source, e.g.,
>> syslogTcp, Avro.
>>
>> It seems syslog is unable to tap into TCP connections. Look forward to
>> your opinions.
>>
>> Thanks,
>>
>> 2014-08-01 11:17 GMT+08:00 Liu Blade <hafzc...@gmail.com>:
>>
>>> Dear all,
>>>
>>> The scenario is we want to collect data over TCP connection which is
>>> send to backend database server. But it is not possible to use an intrusive
>>> way, which means we would not collect data on servers.
>>>
>>> Is that possible to use libpcap/winpcap to tap into TCP stream, convert
>>> it to Avro/Thrift, and then send to Flume source?
>>>
>>> Very appreciate your suggestions. Please indicate if there are better
>>> options.
>>>
>>> Cheers,
>>> Blade