Re: Potential memory leak in rocksdb

2017-02-25 Thread Sabarish Sasidharan
Pierre, do you see keys persisting in RocksDB even after deleting them? Imagine k1 got deleted in the first execution of punctuate and then you see it in the second execution of punctuate as well. Do you see such behaviour? That could explain why the RocksDB size keeps increasing. Regards, Sab On 21 Feb

Kafka SASL and custom LoginModule and Authorizer

2017-02-25 Thread Christian
We have implemented our own LoginModule and Authorizer. The LoginModule authenticates on the client side, obtains a token, and passes that token down to our custom SaslServer, which then verifies that the token is valid. Our Authorizer gets that token and asks another custom service if the

Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-25 Thread Raymond Xie
Thank you very much, Marco. I am a beginner in this area. Could you show me what you think the right script should be to get it executed in a terminal? Sincerely yours, Raymond On Sat, Feb 25, 2017 at 6:00 PM, Marco Mistroni

Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-25 Thread Raymond Xie
That's right, Anahita. However, the class name is not indicated in the original GitHub project, so I don't know what class should be used here. The GitHub page only says: and then run the example `$ bin/spark-submit --jars \ external/kafka-assembly/target/scala-*/spark-streaming-kafka-assembly-*.jar \ exa

Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-25 Thread Raymond Xie
Thank you, it is still not working: [image: Inline image 1] By the way, here is the original source: https://github.com/apache/spark/blob/master/examples/src/main/python/streaming/kafka_wordcount.py Sincerely yours, Raymond On Sat, Feb

Re: Kafka Streams vs Spark Streaming

2017-02-25 Thread Tianji Li
Hi Kohki, Thanks very much for providing your investigation results. Regarding 'append' mode with Kafka Streams, isn't KStream the thing you want? Hi Guozhang, Thanks for the pointers to the two blogs. I read one of them before and just had a look at the other one. What I am hoping to do i

No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-25 Thread Raymond Xie
I am running Spark Streaming on a Hortonworks sandbox and am stuck. Can anyone tell me what's wrong with the following code, why it causes the exception, and how I can fix it? Thank you very much in advance. spark-submit --jars /usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.0-124

Re: kafka streams locking issue in 0.10.20.0

2017-02-25 Thread Ara Ebrahimi
Hi, Thanks for the reply. Let me give you more information: - we have a group by + aggregate. The hopping time window is 10 minutes but the maintainMs is 180 days (we’re trying to reprocess the entire data set of billions of records). - Over time, after just a few hours, the aggregate slows d

Re: Kafka Streams vs Spark Streaming

2017-02-25 Thread Guozhang Wang
Hello Kohki, Thanks for the email. I'd like to learn what your concern is about the size of the state store. From your description it's a bit hard to figure out, but I'd guess you have lots of state stores while each of them is relatively small? Hello Tianji, Regarding your question about maturity a

Re: Kafka Streams vs Spark Streaming

2017-02-25 Thread Kohki Nishio
I did a bit of research on that matter recently. The comparison is between Spark Structured Streaming (SSS) and Kafka Streams. Both are relatively new (~1y) and are trying to solve similar problems. However, if you go with Spark, you have to go with a cluster; if your environment already has a cluster,