Hello Apache Kafka folks,

We invite you to join us for the March Apache Kafka Meetup today (Tuesday, March 29) at the San Jose Convention Center, starting at 6pm:
http://www.meetup.com/http-kafka-apache-org/events/229424437/

We have two great talks today:

------

Title: Introduction to Kafka Streams

Abstract (by Guozhang Wang, Confluent): In the past few years Apache Kafka has emerged as the world's most popular backbone for real-time data streaming. In this talk we introduce Kafka Streams, the latest addition to the Apache Kafka project: a new stream processing library natively integrated with Kafka. Kafka Streams has a very low barrier to entry, is easy to operationalize, and offers a natural DSL for writing stream processing applications. As such, it is the most convenient yet scalable option for analyzing, transforming, or otherwise processing data that is backed by Kafka. We will give the audience an overview of Kafka Streams, including its design and API, typical use cases, code examples, and an outlook on its upcoming roadmap. We will also compare Kafka Streams' lightweight library approach with heavier, framework-based tools such as Spark Streaming or Storm, which require you to understand and operate a whole separate infrastructure for processing real-time data in Kafka.

------

Title: Streaming Analytics at 300 Billion Events/Day with Kafka, Samza, and Druid (by Xavier Léauté, Metamarkets)

Abstract: Wonder what it takes to scale Kafka, Samza, and Druid to handle complex analytics workloads at petabyte scale? We will share a high-level overview of the Metamarkets real-time stack, the lessons learned scaling our real-time processing to over 3 million events per second, and how we leverage extensive metric collection to handle heterogeneous processing workloads while keeping operational complexity and cost down. Built entirely on open source, our stack performs streaming joins using Kafka and Samza, feeding into Druid to serve 1 million interactive queries per day.

------

Thanks,

-- Guozhang
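
P.S. If you would like a taste of what the Kafka Streams DSL looks like before the talk, here is a rough word-count sketch. The topic names, application id, and broker address are placeholders for illustration only, and the exact API may differ between Kafka releases:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

import java.util.Arrays;
import java.util.Properties;

public class WordCountSketch {

    public static void main(String[] args) {
        // Placeholder configuration: application id and broker address are illustrative only.
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Read lines of text from a (made-up) input topic, split them into words,
        // count occurrences per word, and write the running counts to an output topic.
        KStream<String, String> lines = builder.stream("text-lines");
        KTable<String, Long> counts = lines
                .flatMapValues(line -> Arrays.asList(line.toLowerCase().split("\\W+")))
                .groupBy((key, word) -> word)
                .count();
        counts.toStream().to("word-counts", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
    }
}

The application runs as a plain Java process alongside your existing services, which is the "library, not framework" point we will expand on in the talk.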