Re: Lazy casting with Catalyst

2015-03-28 Thread Patrick Woody
So it looks like this was actually a combination of using out-of-date artifacts and further debugging needed on my part. Ripping the logic out and testing in spark-shell works fine, so it is likely something upstream in my application that causes it to take the whole Row. Thanks! -Pat On Sat,

Re: Lazy casting with Catalyst

2015-03-28 Thread Cheng Lian
On 3/29/15 12:26 AM, Patrick Woody wrote: Hey Cheng, I didn't mean that catalyst casting was eager, just that my approaches thus far seem to have been. Maybe I should give a concrete example? I have columns A, B, C where B is saved as a String but I'd like all references to B to go throug

Re: Lazy casting with Catalyst

2015-03-28 Thread Patrick Woody
Hey Cheng, I didn't mean that catalyst casting was eager, just that my approaches thus far seem to have been. Maybe I should give a concrete example? I have columns A, B, C where B is saved as a String but I'd like all references to B to go through a Cast to decimal regardless of the code used o

Re: Lazy casting with Catalyst

2015-03-28 Thread Cheng Lian
Hi Pat, I don't understand what "lazy casting" means here. Why do you think the current Catalyst casting is "eager"? Casting happens at runtime, and doesn't disable column pruning. Cheng On 3/28/15 11:26 PM, Patrick Woody wrote: Hi all, In my application, we take input from Parquet files where