Nice work, glad Arrow proved useful.

On Mon, Feb 15, 2021 at 11:44 PM Kohei KaiGai <kai...@heterodb.com> wrote:
> Hello,
>
> Let me share my recent work:
> https://github.com/heterodb/pg-strom/wiki/804:-Pcap2Arrow
>
> This standalone command-line tool captures network packets from
> network interface devices, converts them into the Apache Arrow data
> format according to a pre-defined schema for each supported protocol
> (TCP, UDP, ICMP over IPv4 and IPv6), and writes them out to the
> destination files.
>
> It internally uses PF_RING [*1] to support fast network interface
> cards (over 10Gb) and to minimize packet loss by utilizing multi-core
> CPUs.
> Although I confirmed that Pcap2Arrow can write out captured network
> packets at more than 50Gb/s, my test cases used artificial and biased
> traffic patterns. If you can test the software in your environment,
> your feedback would help improve it.
>
> [*1] https://www.ntop.org/products/packet-capture/pf_ring/
>
> As you may know, network traffic data tends to grow very large, so it
> is not easy to import it into database systems for analytics. Once
> the data is converted into Apache Arrow, we don't need to import the
> captured data again; we just map the files prior to analytics.
>
> Best regards,
> --
> HeteroDB, Inc / The PG-Strom Project
> KaiGai Kohei <kai...@heterodb.com>