Re: Kinesis receiver & spark streaming partition

2014-09-28 Thread Wei Liu
to understand from a > performance perspective. and this extends beyond kinesis - it's for any > streaming source that supports shards/partitions. > > i need to do a little research into the internals to confirm my theory. > > lemme get back to you! > > -chris > >

Kinesis receiver & spark streaming partition

2014-08-26 Thread Wei Liu
We are exploring using Kinesis and Spark Streaming together. I took a look at the Kinesis receiver code in 1.1.0. I have a question regarding Kinesis partitions & Spark Streaming partitions. It seems to be pretty difficult to align these partitions. Kinesis partitions a stream of data into shards

Re: Multiple column families vs Multiple tables

2014-08-19 Thread Wei Liu
Chutium, thanks for your advice. I will check out your links. I sent the email to the wrong email address! Sorry for the spam. Wei On Tue, Aug 19, 2014 at 4:49 PM, chutium wrote: > ö_ö you should send this message to hbase user list, not spark user > list... > > but i can give you some pers

Multiple column families vs Multiple tables

2014-08-19 Thread Wei Liu
We are doing schema design for our application. One thing we are not so clear about is multiple column families (more than 3, probably 5 - 8) vs multiple tables. In our use case, we will have the same number of rows in all these column families, but some column families may be modified more often t

Re: Data loss - Spark streaming and network receiver

2014-08-18 Thread Wei Liu
ug 18, 2014 at 10:18 PM, Wei Liu wrote: > Thank you all for responding to my question. I am pleasantly surprised by > how many prompt responses I got. It shows the strength of the Spark > community. > > Kafka is still an option for us, I will check out the link provided by > Dib

Re: Data loss - Spark streaming and network receiver

2014-08-18 Thread Wei Liu
make sure no data loss in Spark Streaming, still >> need to improve at some points :) >> >> >> >> Thanks >> >> Jerry >> >> >> >> *From:* Tobias Pfeiffer [mailto:t...@preferred.jp] >> *Sent:* Tuesday, August 19, 2014 10:47 AM >

Data loss - Spark streaming and network receiver

2014-08-18 Thread Wei Liu
We are prototyping an application with Spark Streaming and Kinesis. We use Kinesis to accept incoming txn data, and then process it using Spark Streaming. So far we really like both technologies, and we see both maturing rapidly. We have almost settled on using these two tech