[issue40110] multiprocessing.Pool.imap() should be lazy

2020-04-03 Thread Tim Peters
Tim Peters added the comment: Whenever there's parallel processing with communication, there's always the potential for producers to pump out data faster than consumers can process them. But builtin primitives generally don't try to address that directly. They don't - and can't - know enou
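
One way a caller can add such back-pressure without changing the pool is to throttle the input iterable with a semaphore. The following is only a sketch under stated assumptions: `throttled`, `bounded_imap`, `add_one`, `demo`, and `max_in_flight` are hypothetical names, not part of `multiprocessing`, and `ThreadPool` is used so the example runs without pickling or start-method concerns.

```python
import threading
from multiprocessing.pool import ThreadPool


def throttled(iterable, semaphore):
    """Yield items from `iterable`, taking one permit before pulling each item."""
    it = iter(iterable)
    while True:
        semaphore.acquire()  # blocks the pool's task-feeding thread when no permit is free
        try:
            item = next(it)
        except StopIteration:
            return
        yield item


def bounded_imap(pool, func, iterable, max_in_flight):
    """Hypothetical helper: like pool.imap(), but with a cap on unconsumed work."""
    semaphore = threading.Semaphore(max_in_flight)
    for result in pool.imap(func, throttled(iterable, semaphore)):
        semaphore.release()  # a result was received, so one more input may be pulled
        yield result


def add_one(x):
    return x + 1


def demo(n_items=20, max_in_flight=3):
    pulled = []  # records every item taken from the source

    def source():
        for i in range(n_items):
            pulled.append(i)
            yield i

    results, worst_backlog = [], 0
    with ThreadPool(2) as pool:
        stream = bounded_imap(pool, add_one, source(), max_in_flight)
        for received, value in enumerate(stream, start=1):
            results.append(value)
            # items pulled from the source but not yet received as results
            worst_backlog = max(worst_backlog, len(pulled) - received)
    return results, worst_backlog


if __name__ == "__main__":
    results, worst_backlog = demo()
    print(results[:5], worst_backlog)
```

Two caveats apply to this sketch. While the semaphore has no free permit, the thread that feeds tasks to the pool is blocked, so other work submitted to the same pool waits as well. Abandoning the result iterator before it is exhausted may also leave that thread blocked, so the sketch assumes the results are consumed to the end.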

[issue40110] multiprocessing.Pool.imap() should be lazy

2020-04-01 Thread Nick Guenther
Nick Guenther added the comment: Thank you for taking the time to consider my points! Yes, I think you understood exactly what I was getting at. I slept on it and thought about what I'd posted the day after and realized most of the points you raise, especially that serialized next() would me

[issue40110] multiprocessing.Pool.imap() should be lazy

2020-03-31 Thread Tim Peters
Tim Peters added the comment: "Lazy" has several possible aspects, of which Pool.imap() satisfies some:
- Its iterable argument is materialized one object at a time.
- It delivers results one at a time.
So, for example, if `worker` is a function that takes a single int, then pool = mult
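
The two aspects listed above can be observed with a small, independent sketch. It is not a completion of the truncated example in the message: `double`, `numbers`, and `first_results` are hypothetical names, and `ThreadPool` (a subclass of `Pool` that exposes the same `imap()` method) is assumed here only to keep the snippet self-contained.

```python
from multiprocessing.pool import ThreadPool


def double(n):
    return 2 * n


def numbers(limit):
    # A generator: imap() never asks for len() and never needs a list.
    for i in range(limit):
        yield i


def first_results(limit=10, take=3):
    with ThreadPool(2) as pool:
        results = pool.imap(double, numbers(limit))
        is_iterator = iter(results) is results       # results arrive through an iterator
        head = [next(results) for _ in range(take)]  # fetched one at a time, in input order
        tail = list(results)                         # the remainder, still in input order
    return is_iterator, head, tail


if __name__ == "__main__":
    print(first_results())
```

What this does not show is any limit on how far ahead of the consumer the input is read, which is the aspect the issue title is about.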

[issue40110] multiprocessing.Pool.imap() should be lazy

2020-03-31 Thread Raymond Hettinger
Change by Raymond Hettinger: components: +Library (Lib); nosy: +rhettinger; type: -> enhancement; versions: +Python 3.9 -Python 2.7, Python 3.5, Python 3.6, Python 3.7, Python 3.8

[issue40110] multiprocessing.Pool.imap() should be lazy

2020-03-29 Thread Nick Guenther
New submission from Nick Guenther: multiprocessing.Pool.imap() is supposed to be a lazy version of map. But it's not: it submits work to its workers eagerly. As a consequence, in a pipeline, all the work from earlier steps is queued, performed, and finished first, before starting later steps
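
The eager submission described in this report can be observed by counting how many input items have been pulled by the time the first result is read. This is a minimal sketch, assuming `ThreadPool` as a stand-in for a process pool (it inherits `imap()` from `Pool`); `slow_square` and `demo_eager_consumption` are hypothetical names introduced for illustration.

```python
import time
from multiprocessing.pool import ThreadPool


def slow_square(x):
    time.sleep(0.01)  # simulate work so the consumer lags behind the feeder
    return x * x


def demo_eager_consumption(n_items=50):
    pulled = []  # records every item taken from the source

    def source():
        for i in range(n_items):
            pulled.append(i)
            yield i

    with ThreadPool(2) as pool:
        results = pool.imap(slow_square, source())
        first = next(results)
        pulled_after_first = len(pulled)  # how far ahead the pool has already read
        rest = list(results)
    return first, pulled_after_first, [first] + rest, len(pulled)


if __name__ == "__main__":
    first, pulled_after_first, squares, total = demo_eager_consumption()
    print(f"first result: {first}; inputs pulled by then: {pulled_after_first} of {total}")
```

The exact count depends on thread timing, but it is typically well above one, because the input is read by a background thread independently of how quickly results are consumed. With a process-based `Pool` the read-ahead may be limited by the capacity of the underlying pipe rather than being unbounded.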