Re: kafka producer failed

2015-07-24 Thread Benjamin Black
What are the log messages from the Kafka brokers? These look like client messages indicating a broker problem. On Fri, Jul 24, 2015 at 1:18 PM, Job-Selina Wu wrote: > Hi, Yi: > > I am wondering if the problem can be fixed by the parameter " > max.message.size" at kafka.producer.ProducerCo

Re: Samza and sliding window

2015-06-29 Thread Benjamin Black
Shekar, You won't be creating a partition per application. By using the application name as the partitioning key you ensure all events for a given application are consistently mapped to the same partition. Multiple applications will be mapped to each partition without any need for a priori knowled
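As a rough illustration of the keyed partitioning described above, here is a minimal sketch. It assumes a fixed partition count and uses CRC32 as a stand-in hash; `NUM_PARTITIONS`, `partition_for`, and the application names are hypothetical, and this is not the hash Kafka's own partitioner uses.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count, chosen for illustration


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash (CRC32 is an assumption here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical events, keyed by application name.
events = [("billing", "e1"), ("search", "e2"), ("billing", "e3"),
          ("checkout", "e4"), ("search", "e5")]

# Group events by the partition their application name hashes to.
partitions = defaultdict(list)
for app, payload in events:
    partitions[partition_for(app)].append((app, payload))
```

Every event for a given application lands in the same partition, and with more applications than partitions several applications necessarily share one, without anyone having to enumerate the applications in advance.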

Re: Reprocessing old events no longer in Kafka

2015-05-29 Thread Benjamin Black
Why not run a MapReduce job on the data in HDFS? That is what it was made for. On May 29, 2015 2:13 PM, "Zach Cox" wrote: > Hi - > > Let's say one day a company wants to start doing all of this awesome data > integration/near-real-time stream processing stuff, so they start sending > their user activity

Re: How to deal with bootstrapping

2015-04-16 Thread Benjamin Black
t of offsets, one for > each partition. > > On Thu, Apr 16, 2015 at 4:58 PM, Benjamin Black wrote: > > > If you need to maintain ordering of a sequence of messages, those > messages > > should all be written to the same partition. If you are concerned with > > gl

Re: How to deal with bootstrapping

2015-04-16 Thread Benjamin Black
architecture to them. To do otherwise produces bad times. On Thu, Apr 16, 2015 at 1:51 PM, jeremy p wrote: > Thank you for the response. Does this mean the Old-Rules-Job would need to > maintain a Last-Processed-Old-Rules offset for each partition? > > On Thu, Apr 16, 2015 at 4:47 PM, Be

Re: How to deal with bootstrapping

2015-04-16 Thread Benjamin Black
Offsets are per partition. The alternative would have poor scaling behavior for both brokers and consumers. On Thu, Apr 16, 2015 at 1:01 PM, jeremy p wrote: > Thanks to everybody for the responses! > > Yi : The queue must be processed in order, which means that I cannot use > Ben and Guozhang's
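To make "offsets are per partition" concrete, here is a small sketch that uses an in-memory dictionary as a stand-in for a topic. The names `topic`, `checkpoint`, and `poll` are hypothetical and do not correspond to any Kafka or Samza API; the only point is that a reader keeps one position per partition and advances each independently.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for a topic: partition id -> ordered messages.
topic: Dict[int, List[str]] = {0: ["a0", "a1", "a2"], 1: ["b0", "b1"]}

# One checkpointed offset per partition: the index of the next message to read.
checkpoint: Dict[int, int] = {0: 0, 1: 0}


def poll(partition: int, max_messages: int) -> List[str]:
    """Read up to max_messages from one partition and advance only its offset."""
    start = checkpoint[partition]
    batch = topic[partition][start:start + max_messages]
    checkpoint[partition] = start + len(batch)
    return batch


first = poll(0, 2)   # advances partition 0 only
second = poll(1, 1)  # advances partition 1 only
```

Because each partition has its own counter, brokers and consumers never need to coordinate on a single global sequence number.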

Re: How to deal with bootstrapping

2015-04-15 Thread Benjamin Black
What about this:
1) Add new rule to the classifier task
2) Take note of offset of the first message processed after restart
3) Run a job to process from offset 0 to the offset from #2, after which the job is deleted
I don't know how to do 2 or 3, but perhaps some of the core Samza folk could shed
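One way to picture steps 2 and 3 is a bounded replay: record, for each partition, the first offset the restarted task handled, then apply the new rule to everything before it and stop. The sketch below is an assumption-laden illustration over in-memory lists; `backfill`, `log`, and `cutoffs` are hypothetical names, and none of this is Samza or Kafka API.

```python
from typing import Callable, Dict, List


def backfill(log: Dict[int, List[str]],
             cutoffs: Dict[int, int],
             apply_rule: Callable[[str], str]) -> List[str]:
    """Apply a rule to offsets [0, cutoff) of each partition, then stop."""
    results = []
    for partition in sorted(log):
        # Never read past the noted cutoff or past the end of the partition.
        end = min(cutoffs.get(partition, 0), len(log[partition]))
        for offset in range(end):
            results.append(apply_rule(log[partition][offset]))
    return results


# Hypothetical data: the restarted classifier first saw offset 2 on partition 0
# and offset 1 on partition 1, so only older messages need the new rule.
log = {0: ["m0", "m1", "m2", "m3"], 1: ["n0", "n1"]}
cutoffs = {0: 2, 1: 1}
backfilled = backfill(log, cutoffs, lambda m: m.upper())
```

The cutoff is kept per partition, so the one-off job needs one recorded offset for each partition it replays.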