Re: Spark Caching Kafka Metadata

2016-02-01 Thread Benjamin Han
Is there another way to create topics from Spark? Is there any reason the above code snippet would still produce this error? I've dumbly inserted waits and retries for testing, but that still doesn't work consistently, even after waiting several minutes. On Fri, Jan 29, 2016 at 8:29 AM, Cody Koeninger …
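[One option discussed in this thread is to create the topic explicitly before the stream starts, rather than relying on broker-side auto-creation. A minimal sketch, assuming a Kafka 0.8.x client on the classpath; the ZooKeeper address, topic name, partition count, and replication factor are placeholders, and the AdminUtils API changed in later Kafka versions:

    import java.util.Properties
    import kafka.admin.AdminUtils
    import kafka.utils.ZKStringSerializer
    import org.I0Itec.zkclient.ZkClient

    // Create the topic up front instead of relying on auto.create.topics.enable.
    // "zkhost:2181" and "events" are placeholder values.
    val zkClient = new ZkClient("zkhost:2181", 10000, 10000, ZKStringSerializer)
    try {
      AdminUtils.createTopic(zkClient, "events", 4, 1, new Properties())
    } finally {
      zkClient.close()
    }

Once the topic exists and its partitions have leaders, starting the stream no longer depends on auto-creation timing.]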

Re: Spark Caching Kafka Metadata

2016-01-29 Thread Cody Koeninger
The Kafka direct stream doesn't do any explicit caching. I haven't looked through the underlying simple consumer code in the Kafka project in detail, but I doubt it does either. Honestly, I'd recommend not using auto-created topics (it makes it too easy to pollute your topics if someone fat-fingers …
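[For context, a minimal sketch of the Spark 1.x direct stream API this thread is about; `sc` is an existing SparkContext, and the broker address and topic name are placeholders:

    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    // Direct stream against an explicitly created topic: no receiver, offsets
    // are tracked by Spark itself. Values below are placeholders.
    val ssc = new StreamingContext(sc, Seconds(5))
    val kafkaParams = Map(
      "metadata.broker.list" -> "broker1:9092",
      "auto.offset.reset"    -> "smallest")
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("events"))]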

Re: Spark caching

2015-03-30 Thread Renato Marroquín Mogrovejo
Thanks Sean! Do you know if there is a way (even manually) to delete these intermediate shuffle results? I just want to test the "expected" behaviour. I know that re-caching is probably a good thing most of the time, but I want to try it without. Renato M. 2015-03-30 12:15 GMT+02:00 Sean Owen …
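[There is no obvious public "delete these shuffle files now" call, but as a hedged sketch of one way to experiment: unpersist the RDD and drop the reference so Spark's ContextCleaner can eventually remove the cached blocks and the associated shuffle files after a GC. Cleanup is asynchronous and version-dependent, and the data set below is made up:

    // `sc` is an existing SparkContext.
    val pairs = sc.parallelize(1 to 1000000).map(i => (i % 100, i))
    var grouped = pairs.groupByKey().cache()
    grouped.count()                     // materialises the cache and the shuffle files

    grouped.unpersist(blocking = true)  // drop the cached blocks
    grouped = null                      // make the RDD (and its shuffle) unreachable
    System.gc()                         // nudge the JVM so the ContextCleaner can run]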

Re: Spark caching

2015-03-30 Thread Sean Owen
I think you get a sort of "silent" caching after shuffles in some cases, since the shuffle files are not removed immediately and can be reused. (This is the flip side of the frequent question/complaint that the shuffle files aren't removed straight away.) On Mon, Mar 30, 2015 at 9:43 AM, Renato Marroquín Mogrovejo …
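[A small illustration of that "silent" caching, as a hedged sketch: running a second action over an already-shuffled RDD typically reuses the existing shuffle output (visible as skipped stages in the Spark UI) instead of recomputing the map side. The input path is a placeholder:

    // `sc` is an existing SparkContext; the path is a placeholder.
    val words  = sc.textFile("hdfs:///data/words.txt").flatMap(_.split("\\s+"))
    val counts = words.map((_, 1)).reduceByKey(_ + _)   // shuffle happens here

    counts.count()   // first action: runs the shuffle and writes shuffle files
    counts.count()   // second action: map stage is usually skipped, shuffle output reused]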

Re: Spark caching questions

2014-09-10 Thread Mayur Rustagi
Cached RDDs do not survive SparkContext deletion (they are scoped to a single SparkContext). I'm not sure what you mean by disk-based cache eviction; if you cache more RDDs than you have disk space for, the result will not be very pretty :) Mayur Rustagi Ph: +1 (760) 203 3257 http://www.sigmoidanalytics.com @
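[A short sketch of that scoping, with made-up data: cached blocks, whether in memory or spilled to local disk, belong to the executors of one SparkContext and disappear when it stops.

    import org.apache.spark.storage.StorageLevel

    // `sc` is an existing SparkContext.
    val data   = sc.parallelize(1 to 1000000)
    val cached = data.persist(StorageLevel.MEMORY_AND_DISK)  // spills partitions to local disk if memory is tight
    cached.count()   // materialises the cache

    sc.stop()        // all cached blocks are released with the context;
                     // a new SparkContext has to recompute or reload the data]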