Re: spark kafka batch integration

2014-12-15 Thread Koert Kuipers
gwen, i thought about it a little more and i feel pretty confident i can make it so that it's deterministic in case of node failure. will push that change out after holidays.
On Mon, Dec 15, 2014 at 12:03 AM, Koert Kuipers wrote:
> hey gwen,
>
> no immediate plans to contribut

Re: spark kafka batch integration

2014-12-14 Thread Koert Kuipers
ark
> App is running? Will the RDD recovery process get the exact same data
> from Kafka as the original? even if we wrote additional data to Kafka
> in the mean time?
>
> Gwen
>
> On Sun, Dec 14, 2014 at 5:22 PM, Koert Kuipers wrote:
> > hello all,
> > we at tre

spark kafka batch integration

2014-12-14 Thread Koert Kuipers
hello all,
we at tresata wrote a library to provide for batch integration between spark and kafka. it supports:
* distributed write of rdd to kafka
* distributed read of rdd from kafka
our main use cases are (in lambda architecture speak):
* periodic appends to the immutable master dataset on hdfs

Re: No longer supporting Java 6, if? when?

2014-11-06 Thread Koert Kuipers
when is java 6 dropped by the hadoop distros? i am still aware of many clusters that are java 6 only at the moment.
On Thu, Nov 6, 2014 at 12:44 PM, Gwen Shapira wrote:
> +1 for dropping Java 6
>
> On Thu, Nov 6, 2014 at 9:31 AM, Steven Schlansker <sschlans...@opentable.com> wrote:
> >