Re: Filter on Date by comparing

2014-02-24 Thread Andrew Ash
It's in the data serialization section of the tuning guide, here:
http://spark.incubator.apache.org/docs/latest/tuning.html#data-serialization

On Mon, Feb 24, 2014 at 7:44 PM, Soumya Simanta wrote:
> Thanks Andrew. I was expecting this to be the issue.
> Are there any pointers about how to chan

Re: Filter on Date by comparing

2014-02-24 Thread Soumya Simanta
Thanks Andrew. I was expecting this to be the issue. Are there any pointers about how to change the serialization to Kryo?

On Mon, Feb 24, 2014 at 10:17 PM, Andrew Ash wrote:
> This is because Joda's DateTimeFormatter is not serializable (doesn't
> implement the empty Serializable interface)
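For readers following along, a minimal sketch of what switching the serializer can look like, assuming a Spark version that provides SparkConf (0.9 or later) and Spark on the classpath. The master URL and application name are placeholders, not values from this thread. Note that this setting governs how data is serialized, as the tuning guide describes; it may not by itself change how closures are serialized, so treat it as a starting point rather than a confirmed fix for the exception discussed here.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object KryoConfigSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[2]")             // assumption: local mode, for illustration only
      .setAppName("kryo-config-sketch")  // hypothetical application name
      // Ask Spark to use Kryo instead of Java serialization for data
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")

    val sc = new SparkContext(conf)
    try {
      // Read the setting back from the configuration object we built
      println(conf.get("spark.serializer"))
    } finally {
      sc.stop()
    }
  }
}
```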

Re: Filter on Date by comparing

2014-02-24 Thread Ewen Cheslack-Postava
Or use RDD.filterWith to create whatever you need out of serializable parts so you only run it once per partition.

Andrew Ash
February 24, 2014 at 7:17 PM
This is because Joda's DateTimeFormatter is not serializable (doesn't implement the empty Serializable interface) ht
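The "build it once per partition from serializable parts" idea can be sketched without Spark at all. This is an illustration under stated assumptions, not the code from this thread: java.time stands in for Joda-Time so the example runs with no extra dependencies, the dates are strings in a made-up format, and the names keepAfter and PerPartitionFormatterSketch are hypothetical. The function has the Iterator-to-Iterator shape that RDD.mapPartitions accepts, which is used here in place of filterWith.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

object PerPartitionFormatterSketch {
  // Returns a function over one partition's iterator.
  // It captures only a String and a Long, both trivially serializable.
  def keepAfter(pattern: String, startEpochDay: Long): Iterator[String] => Iterator[String] =
    (lines: Iterator[String]) => {
      // The formatter is created inside the function: once per partition,
      // on whichever machine processes that partition, and never shipped.
      val fmt = DateTimeFormatter.ofPattern(pattern)
      lines.filter(s => LocalDate.parse(s, fmt).toEpochDay > startEpochDay)
    }

  def main(args: Array[String]): Unit = {
    // Two in-memory "partitions" standing in for an RDD's partitions
    val partitions = Seq(Seq("2014-02-01", "2014-02-20"), Seq("2014-02-24"))
    val start = LocalDate.of(2014, 2, 10).toEpochDay
    val f = keepAfter("yyyy-MM-dd", start)
    val kept = partitions.flatMap(p => f(p.iterator).toList)
    println(kept)
  }
}
```

With an RDD of strings, the same function could presumably be passed as rdd.mapPartitions(keepAfter("yyyy-MM-dd", start)); with Joda-Time, the formatter line would use DateTimeFormat.forPattern instead.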

Re: Filter on Date by comparing

2014-02-24 Thread Andrew Ash
This is because Joda's DateTimeFormatter is not serializable (doesn't implement the empty Serializable interface):
http://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html

One ugly thing I've done before is to instantiate a new DateTimeFormatter in every line, so like this:

Filter on Date by comparing

2014-02-24 Thread Soumya Simanta
I want to filter an RDD by comparing dates:

myRDD.filter( x => new DateTime(x.getCreatedAt).isAfter(start) ).count

I'm using the Joda-Time library, but I get an exception about a Joda-Time class not being serializable. Is there a way to configure this, or an easier alternative for this problem?

org.apac
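One possible "easier alternative", sketched under assumptions the thread does not confirm: that getCreatedAt returns a java.util.Date, and that the cutoff can be reduced to epoch milliseconds on the driver (for a Joda DateTime, start.getMillis). The predicate then captures only a Long, so no date-library object needs to be serialized. Record and createdAfter are hypothetical names used only for this illustration.

```scala
import java.util.Date

object MillisFilterSketch {
  // Hypothetical record type standing in for the elements of myRDD
  final case class Record(id: Int, createdAt: Date)

  // The returned predicate closes over a Long only
  def createdAfter(startMillis: Long): Record => Boolean =
    r => r.createdAt.getTime > startMillis

  def main(args: Array[String]): Unit = {
    val records = Seq(
      Record(1, new Date(1000L)),
      Record(2, new Date(2000L)),
      Record(3, new Date(3000L)))
    val startMillis = 1500L // in the thread's setting this would come from the cutoff date
    println(records.count(createdAfter(startMillis)))
  }
}
```

On an RDD, the equivalent call would presumably be myRDD.filter(x => x.getCreatedAt.getTime > startMillis).count, again assuming getCreatedAt yields a java.util.Date.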