On Mon, May 27, 2019 at 4:06 AM Grant Edwards <grant.b.edwa...@gmail.com> wrote:
>
> On 2019-05-23, Chris Angelico <ros...@gmail.com> wrote:
> > On Fri, May 24, 2019 at 5:37 AM Bob van der Poel <b...@mellowood.ca> wrote:
> >>
> >> I've got a short script that loops through a number of files and
> >> processes them one at a time. I had a bit of time today and figured
> >> I'd rewrite the script to process the files 4 at a time by using 4
> >> different instances of python. My basic loop is:
> >>
> >> for i in range(0, len(filelist), CPU_COUNT):
> >>     for z in range(i, i+CPU_COUNT):
> >>         doit( filelist[z])
> >>
> >> With the function doit() calling up the program to do the
> >> lifting. Setting CPU_COUNT to 1 or 5 (I have 6 cores) makes no
> >> difference in total speed. I'm processing about 1200 files and my
> >> total duration is around 2 minutes. No matter how many cores I use
> >> the total is within a 5 second range.
> >
> > Where's the part of the code that actually runs them across multiple
> > CPUs? Also, are you spending your time waiting on the disk, the CPU,
> > IPC, or something else?
>
> He said he's using N different Python instances, and he even provided
> the code that runs in each instance, which is obviously processing
> 1/Nth of the files.
>
> It's a pretty good bet that I/O is the limiting factor.
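For what it's worth, the nested loop quoted above never actually runs anything concurrently: doit() is called one file at a time, in order, regardless of CPU_COUNT. A rough sketch of a version that really overlaps the work, assuming doit() spends its time waiting on an external program or on I/O so threads are sufficient (the names filelist, CPU_COUNT, and doit come from the original post; the body of doit here is a stand-in for the real work):

```python
# Sketch only: run doit() on up to CPU_COUNT files at once instead of
# strictly one after another. ThreadPoolExecutor is enough when doit()
# mostly blocks on a subprocess or the disk; for pure-Python CPU work
# you'd want ProcessPoolExecutor instead.
from concurrent.futures import ThreadPoolExecutor

CPU_COUNT = 4
filelist = ["a.txt", "b.txt", "c.txt", "d.txt", "e.txt"]

def doit(name):
    # Stand-in for the real work; the original post shells out to an
    # external program here.
    return name.upper()

with ThreadPoolExecutor(max_workers=CPU_COUNT) as pool:
    # map() hands each file to a free worker and preserves input order,
    # and it also sidesteps the IndexError the quoted loop would hit
    # when len(filelist) isn't a multiple of CPU_COUNT.
    results = list(pool.map(doit, filelist))

print(results)
```

Of course, if the disk rather than the CPU is the bottleneck, a genuinely parallel version will still show little speedup, which is exactly the I/O theory above.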
Sometimes, the "simple" and "obvious" code, the part that clearly has no
bugs in it, is the part that has the problem. :)

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list