@Justin, it's fixed by https://github.com/apache/spark/pull/12057
On Thu, Feb 11, 2016 at 11:26 AM, Davies Liu wrote:
I had a quick look at your commit and I think it makes sense. Could you
send a PR for it so that we can review it?
In order to support 2), we need to change the serialized Python
function from `f(iter)` to `f(x)`, so that it processes one row at a
time rather than a whole partition. Then we can easily combine them
together:
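
To make the idea concrete, here is a rough sketch in plain Python, not
the actual Spark code. It assumes "combine" means chaining the per-row
functions; the names `compose_row_functions` and `apply_to_partition`
are made up for illustration only.

```python
def compose_row_functions(funcs):
    """Chain per-row functions so each row passes through all of them in order."""
    def combined(x):
        for f in funcs:
            x = f(x)
        return x
    return combined


def apply_to_partition(f, iterator):
    """Lift a per-row function f(x) back to a function over a partition iterator."""
    return (f(x) for x in iterator)


# Two hypothetical per-row functions of the form f(x).
add_one = lambda x: x + 1
double = lambda x: x * 2

# Combine them once, then run the combined function over a "partition".
combined = compose_row_functions([add_one, double])
result = list(apply_to_partition(combined, iter([1, 2, 3])))
print(result)
```

With `f(iter)`, each function consumes and produces a whole iterator, so
the functions can only be nested as generators. With `f(x)`, combining
them is ordinary function composition, and the loop over the partition
is written once.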