Re: RDD filter in for loop gave strange results

Marco Wong Wed, 20 Jan 2021 08:00:13 -0800

Hmm, I think I see what Jingnan means. The lambda function is x != i, and i
is not evaluated when the lambda is defined, only when it is finally called.
So the pipelined rdd is effectively
rdd.filter(lambda x: x != i).filter(lambda x: x != i), with every filter
seeing the latest value of i, rather than having each value of i
substituted. Does that make sense to you, Sean?
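
To illustrate the late binding I mean, here is a minimal sketch in plain
Python. It does not use Spark at all: the built-in lazy filter() stands in
for rdd.filter(), and list() stands in for collect(). The data and loop
values are made up for the example, and I am assuming PySpark behaves the
same way for lambdas defined in a loop.

```python
data = [1, 2, 3]

# Late binding: each lambda looks up i only when it is finally called.
lazy = iter(data)
for i in [1, 2]:
    lazy = filter(lambda x: x != i, lazy)  # nothing is evaluated yet
late_result = list(lazy)  # i is 2 by now, so both filters remove only 2

# Early binding: a default argument captures the value of i per iteration.
bound = iter(data)
for i in [1, 2]:
    bound = filter(lambda x, i=i: x != i, bound)
bound_result = list(bound)  # first filter removes 1, second removes 2

print(late_result)   # [1, 3]
print(bound_result)  # [3]
```

If that is what is happening, binding i as a default argument
(lambda x, i=i: x != i) should give the chained result Sean describes.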


On Wed, 20 Jan 2021 at 15:51, Sean Owen <sro...@gmail.com> wrote:

> No, because the final rdd is really the result of chaining 3 filter
> operations. They should all execute. It _should_ work like
> "rdd.filter(...).filter(..).filter(...)"
>
> On Wed, Jan 20, 2021 at 9:46 AM Zhu Jingnan <jingnanzh...@gmail.com>
> wrote:
>
>> I thought that was the right result.
>>
>> The rdd is evaluated lazily, so every time rdd.collect() is executed, i
>> is read at its latest value, and only one value is filtered out.
>>
>> Regards
>>
>> Jingnan
>>
>>
>>
