Hi Nick,

IMHO, the following points differentiate Samza from KStreams:
- Stability of local state management. Samza supports durable local state and host-affinity for faster state recovery. 0.10.1 makes further progress in host-affinity to allow a) continuous checkpointing of the state store; b) minimal movement of state stores when the number of containers changes.
- Support for non-Kafka sources and destinations. Samza allows non-Kafka sources and destinations to be added natively. We already have Elasticsearch and HDFS producers supported in open source. An HDFS consumer is coming soon, and LinkedIn has successfully integrated Samza with Databus, Kinesis, and DynamoDB Streams internally.
- Unification with batch jobs. We are actively working on the Samza-on-HDFS project at LinkedIn and have successfully run prototype tests in which a long-running Samza job runs on a secured Hadoop cluster. Development of HDFSSystemConsumer is underway. The goal is to have the *same* Samza job running in both the batch and the stream world, simply by switching the data sources/destinations.
- Async I/O model. We have built an async processing model into Samza as an option, which will significantly improve the performance of jobs bottlenecked on remote I/O. It is in the open source trunk and will be part of the 0.11 release. As far as we know, no other stream processing platform supports an async processing model natively yet.
- Operational support for run-as-a-service. Samza has a long operational history at LinkedIn, and we run it as a hosted service. To better support self-service and multi-tenancy, we have recently added a Samza REST service that allows admin commands via REST calls, and a disk quota that governs the disk usage of jobs in a cluster. Disk quota is in 0.10.1; the REST APIs are deployed at LinkedIn and in review in open source (SAMZA-865).
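To make the async I/O point concrete, here is a minimal sketch of why overlapping remote calls helps a job that is bottlenecked on remote I/O. It is plain Python with asyncio, not Samza code, and it does not use Samza's API. The names `fake_remote_lookup`, `process_sync_style`, `process_async_style`, and `max_in_flight` are hypothetical and exist only for this illustration, and the 50 ms delay is an assumed stand-in for a remote call.

```python
import asyncio
import time


async def fake_remote_lookup(key, delay=0.05):
    """Hypothetical stand-in for a remote call (e.g. a REST or database lookup)."""
    await asyncio.sleep(delay)
    return f"profile-for-{key}"


async def process_sync_style(messages):
    """One message at a time: each remote call delays every message behind it."""
    results = []
    for msg in messages:
        results.append(await fake_remote_lookup(msg))
    return results


async def process_async_style(messages, max_in_flight=4):
    """Up to max_in_flight remote calls overlap; result order matches input order."""
    limit = asyncio.Semaphore(max_in_flight)

    async def handle(msg):
        async with limit:
            return await fake_remote_lookup(msg)

    return await asyncio.gather(*(handle(m) for m in messages))


def timed(coro_fn, messages):
    """Run one of the processing functions and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(messages))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    msgs = [f"member-{i}" for i in range(8)]
    seq_results, seq_secs = timed(process_sync_style, msgs)
    par_results, par_secs = timed(process_async_style, msgs)
    print(f"sequential: {seq_secs:.2f}s, overlapped: {par_secs:.2f}s")
```

Both functions return the same results in the same order; only the waiting time differs, because the overlapped version keeps several remote calls in flight at once while the bound on in-flight calls limits the load placed on the remote service.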

Please see Kartik's response to the exact same question in May for some of the points above as well:
http://mail-archives.apache.org/mod_mbox/samza-dev/201605.mbox/%3CCACsAj_XZZBohSz7Cf9%3DLO5MDOn2vEzfMrDF6Te%3DwrpeMEab1dQ%40mail.gmail.com%3E

Hope that helps.

Regards!

-Yi

On Wed, Aug 3, 2016 at 11:04 AM, Nick Quinn <nqu...@objectivity.com> wrote:

> There has been a lot of talk around town about Confluent's new stream
> processing engine, Kafka Streams. We are currently using Samza and I want
> to get some feedback for myself and other developers on this group list
> about the differences and possible advantages to using Samza when compared
> to Kafka Streams. I would appreciate any and all feedback.
>
> Thanks!
> Nick Quinn
>