Re: [VOTE] Apache Spark 2.1.0 (RC2)

2016-12-09 Thread Sean Owen
Sure, it's only an issue insofar as it may be a flaky test. If it's fixable or disable-able for a possible next RC, that could be helpful. On Sat, Dec 10, 2016 at 2:09 AM Shixiong(Ryan) Zhu wrote: > Sean, "stress test for failOnDataLoss=false" is because Kafka consumer > may throw an NPE when a

Document Similarity -Spark Mllib

2016-12-09 Thread satyajit vegesna
Hi all, I am trying to implement an MLlib Spark job to find the similarity between documents (in my case, basically home addresses). I believe I cannot use DIMSUM for my use case, as DIMSUM works well only with matrices that have few columns and many rows. Matrix example format, for my us

Re: [VOTE] Apache Spark 2.1.0 (RC2)

2016-12-09 Thread Cody Koeninger
Agree that frequent topic deletion is not a very Kafka-esque thing to do. On Fri, Dec 9, 2016 at 12:09 PM, Shixiong(Ryan) Zhu wrote: > Sean, "stress test for failOnDataLoss=false" is because Kafka consumer may > throw an NPE when a topic is deleted. I added some logic to retry on such > failure,

Re: [VOTE] Apache Spark 2.1.0 (RC2)

2016-12-09 Thread Shixiong(Ryan) Zhu
Sean, the "stress test for failOnDataLoss=false" failure occurs because the Kafka consumer may throw an NPE when a topic is deleted. I added some logic to retry on such failures; however, it may still fail when topic deletion is too frequent (as in the stress test). Just reopened https://issues.apache.org/jira/browse/SPARK-

Re: Question about SPARK-11374 (skip.header.line.count)

2016-12-09 Thread Dongjoon Hyun
Thank you for the opinion, Dongjin! On Thu, Dec 8, 2016 at 21:56 Dongjin Lee wrote: > +1 For this idea. I need it also. > > Regards, > Dongjin > > On Fri, Dec 9, 2016 at 8:59 AM, Dongjoon Hyun wrote: > > Hi, All. > > > > > > Could you give me some opinion? > > > > > > There is an old SPARK iss

Re: [VOTE] Apache Spark 2.1.0 (RC2)

2016-12-09 Thread Sean Owen
As usual, the sigs / hashes are fine and licenses look fine. I am still seeing some test failures. A few I've seen over time and aren't repeatable, but a few seem persistent. Has anyone else observed these? I'm on Ubuntu 16 / Java 8 building for -Pyarn -Phadoop-2.7 -Phive. If anyone can confirm I'll i

Re: [VOTE] Apache Spark 2.1.0 (RC2)

2016-12-09 Thread Reynold Xin
I uploaded a new one: https://repository.apache.org/content/repositories/orgapachespark-1219/ On Thu, Dec 8, 2016 at 11:42 PM, Prashant Sharma wrote: > I am getting 404 for Link https://repository.apache.org/content/repositories/orgapachespark-1217. > > --Prashant > > > On Fri, Dec 9, 2016

Re: java.lang.IllegalStateException: There is no space for new record

2016-12-09 Thread Liang-Chi Hsieh
Hi Nick, I think it is due to a bug in UnsafeKVExternalSorter. I created a Jira and a PR for this bug: https://issues.apache.org/jira/browse/SPARK-18800 - Liang-Chi Hsieh | @viirya Spark Technology Center