Re: Strange ML pipeline errors from HashingTF using v1.6.1

2016-03-29 Thread Timothy Potter
FWIW, I synchronized access to the transformer and the problem went away, so this looks like some kind of concurrent-access issue when dealing with UDFs. On Tue, Mar 29, 2016 at 9:19 AM, Timothy Potter wrote: > It's a local spark master, no cluster. I'm not sure what you mean > about assembly or p
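
For anyone hitting the same thing, here is a rough sketch of the kind of workaround described above: every call to a shared transformer goes through a single lock. It deliberately does not use Spark. UnsafeTransformer is a hypothetical stand-in for a fitted pipeline model shared across threads, and SynchronizedTransformer and runConcurrently are illustrative names, not Spark APIs. The thread does not establish which part of the pipeline is unsafe, so treat this as one way to serialize access, not as a confirmed fix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class SynchronizedTransformDemo {

    /** Hypothetical stand-in for a shared model whose transform step is not thread-safe. */
    static final class UnsafeTransformer implements Function<String, Integer> {
        private int calls = 0; // unsynchronized mutable state

        @Override
        public Integer apply(String text) {
            calls++; // racy unless callers serialize access
            return text.split("\\s+").length; // token count
        }

        int calls() {
            return calls;
        }
    }

    /** Wraps any transformer so that only one thread at a time can call it. */
    static final class SynchronizedTransformer<I, O> implements Function<I, O> {
        private final Function<I, O> delegate;
        private final Object lock = new Object();

        SynchronizedTransformer(Function<I, O> delegate) {
            this.delegate = delegate;
        }

        @Override
        public O apply(I input) {
            synchronized (lock) {
                return delegate.apply(input);
            }
        }
    }

    /** Starts `threads` workers that each call the guarded transformer `perThread` times. */
    static int runConcurrently(int threads, int perThread) {
        UnsafeTransformer unsafe = new UnsafeTransformer();
        Function<String, Integer> guarded = new SynchronizedTransformer<>(unsafe);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                futures.add(pool.submit(() -> {
                    for (int i = 0; i < perThread; i++) {
                        guarded.apply("a b c");
                    }
                }));
            }
            for (Future<?> f : futures) {
                try {
                    f.get(); // wait for the worker and surface any failure
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
        } finally {
            pool.shutdown();
        }
        return unsafe.calls();
    }

    public static void main(String[] args) {
        // With the lock in place no increments are lost: prints 8000.
        System.out.println(runConcurrently(8, 1000));
    }
}
```

The trade-off is that a single lock serializes every prediction. If throughput matters, giving each thread its own transformer instance may be a better fit, assuming the model is cheap enough to load more than once.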

Re: Strange ML pipeline errors from HashingTF using v1.6.1

2016-03-29 Thread Timothy Potter
It's a local Spark master, no cluster. I'm not sure what you mean by assembly or package. All of the Spark dependencies are on my classpath, and this sometimes works. On Mon, Mar 28, 2016 at 11:45 PM, Jacek Laskowski wrote: > Hi, > > How do you run the pipeline? Do you assembly or package? Is t

Re: Strange ML pipeline errors from HashingTF using v1.6.1

2016-03-28 Thread Jacek Laskowski
Hi, How do you run the pipeline? Do you assemble or package? Is this on local, Spark, or another cluster manager? What's the build configuration? Regards, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark http://bit.ly/mastering-apache-spark Follow me at https://tw

Strange ML pipeline errors from HashingTF using v1.6.1

2016-03-28 Thread Timothy Potter
I'm seeing the following error when trying to generate a prediction from a very simple ML pipeline-based model. I've verified that the raw data sent to the tokenizer is valid (not null). It seems like some sort of classpath or class-loading issue. Any help you can provide in tryi