I did eventually get the original code to run from the command line but not the interpreter, so the new example does have a similar problem.
Of course it's not as simple as saying I can't run an imported parallelized function from the interpreter because I can, as long as the parallelized function is being invoked directly. But if I caused the parallelized function to be invoked indirectly, for example, by importing a class that uses the parallelized function to set a class variable, it'll hang the interpreter. for now I added the following to any module that uses parallelized functions import __main__ if hasattr(__main__,'__file__'): __SHOULD_MULTITHREAD__ = True else: __SHOULD_MULTITHREAD__ = False and the parallelized functions check this flag to determine wether to run serially or not. This at least lets me import my classes into the interpreter so I can 'play' with them, although they initialize much slower. I'm not sure why Pool needs the __main__ module, except maybe someone sticks centralized process tracking information in there... sigh add it to the fuel of my love/hate with Python. On Jan 27, 2011, at 10:38 AM, Philip Semanchuk wrote: > > On Jan 27, 2011, at 1:12 PM, Craig Yoshioka wrote: > >> The code will be multi-platform. The OSXisms are there as an example, >> though I am developing on OS X machine. >> >> I've distilled my problem down to a simpler case, so hopefully that'll help >> troubleshoot. >> >> I have 2 files: >> >> test.py: >> -------------------------------------------------------------- >> from multiprocessing import Pool >> >> def square(x): >> return x*x >> >> def squares(numbers): >> pool = Pool(12) >> return pool.map(square,numbers) >> >> >> test2.py: >> -------------------------------------------------------------- >> from test import squares >> >> maxvalues = squares(range(3)) >> print maxvalues >> >> >> >> Now if I import squares into the interactive interpreter: >> >> from test import squares >> print squares(range(3)) >> >> I get the correct result, but if I try to import maxvalues from test2 the >> interactive interpreter python hangs. >> if I run the script from bash, though, it seems to run fine. > > The short, complete example is much more useful, but it sounds like it > demonstrates a different problem than you first described. Your first posting > said that your code worked in the interpreter but failed when run from the > command line. This code has the opposite problem. Correct? > >> I think it might have something to do with this note in the docs, though I >> am not sure how to use this information to fix my problem: >> >> Note: Functionality within this package requires that the __main__ method be >> importable by the children. This is covered inProgramming guidelines however >> it is worth pointing out here. This means that some examples, such as >> themultiprocessing.Pool examples will not work in the interactive >> interpreter. > > I suspect this is the problem with the demo above. Your original code ran > fine in the interpreter, though, correct? > > bye > Philip > > >> >> On Jan 27, 2011, at 6:39 AM, Philip Semanchuk wrote: >> >>> >>> On Jan 25, 2011, at 8:19 PM, Craig Yoshioka wrote: >>> >>>> Hi all, >>>> >>>> I could really use some help with a problem I'm having. >>> >>> >>> Hiya Craig, >>> I don't know if I can help, but it's really difficult to do without a full >>> working example. >>> >>> Also, your code has several OS X-isms in it so I guess that's the platform >>> you're on. But in case you're on Windows, note that that platform requires >>> some extra care when using multiprocessing: >>> http://docs.python.org/library/multiprocessing.html#windows >>> >>> >>> Good luck >>> Philip >>> >>> >>>> I wrote a function that can take a pattern of actions and it apply it to >>>> the filesystem. >>>> It takes a list of starting paths, and a pattern like this: >>>> >>>> pattern = { >>>> InGlob('Test/**'):{ >>>> MatchRemove('DS_Store'):[], >>>> NoMatchAdd('(alhpaID_)|(DS_Store)','warnings'):[], >>>> MatchAdd('alphaID_','alpha_found'):[], >>>> InDir('alphaID_'):{ >>>> NoMatchAdd('(betaID_)|(DS_Store)','warnings'):[], >>>> InDir('betaID_'):{ >>>> NoMatchAdd('(gammaID_)|(DS_Store)','warnings'):[], >>>> MatchAdd('gammaID_','gamma_found'):[] }}}} >>>> >>>> so if you run evalFSPattern(['Volumes/**'],pattern) it'll return a >>>> dictionary where: >>>> >>>> dict['gamma_found'] = [list of paths that matched] (i.e. >>>> '/Volumes/HD1/Test/alphaID_3382/betaID_38824/gammaID_848384') >>>> dict['warning'] = [list of paths that failed to match] (ie. >>>> '/Volumes/HD1/Test/alphaID_3382/gammaID_47383') >>>> >>>> Since some of these volumes are on network shares I also wanted to >>>> parallelize this so that it would not block on IO. I started the >>>> parallelization by using multiprocessing.Pool and got it to work if I ran >>>> the fsparser from the interpreter. It ran in *much* less time and >>>> produced correct output that matched the non-parallelized version. The >>>> problem begins if I then try to use the parallelized function from within >>>> the code. >>>> >>>> For example I wrote a class whose instances are created around valid FS >>>> paths, that are cached to reduce expensive FS lookups. >>>> >>>> class Experiment(object): >>>> >>>> SlidePaths = None >>>> >>>> @classmethod >>>> def getSlidePaths(cls): >>>> if cls.SlidePaths == None: >>>> cls.SlidePaths = fsparser(['/Volumes/**'],pattern) >>>> return cls.SlidePaths >>>> >>>> @classmethod >>>> def lookupPathWithGammaID(cls,id): >>>> paths = cls.getSlidePaths() >>>> ... >>>> return paths[selected] >>>> >>>> @classmethod >>>> def fromGamaID(cls,id): >>>> path = cls.lookupPathWithGammaID(id) >>>> return cls(path) >>>> >>>> def __init__(self,path) >>>> self.Path = path >>>> ... >>>> >>>> ... >>>> >>>> If I do the following from the interpreter it works: >>>> >>>>>>> from experiment import Experiment >>>>>>> expt = Experiment.fromGammaID(10102) >>>> >>>> but if I write a script called test.py: >>>> >>>> from experiment import Experiment >>>> expt1 = Experiment.fromGammaID(10102) >>>> expt2 = Experiment.fromGammaID(10103) >>>> comparison = expt1.compareTo(expt2) >>>> >>>> it fails, if I try to import it or run it from bash prompt: >>>> >>>>>>> from test import comparison (hangs forever) >>>> $ python test.py (hangs forever) >>>> >>>> I would really like some help trying to figure this out... I thought it >>>> should work easily since all the spawned processes don't share data or >>>> state (their results are merged in the main thread). The classes used in >>>> the pattern are also simple python objects (use python primitives). >>>> >>>> >>>> These are the main functions: >>>> >>>> def mapAction(pool,paths,action): >>>> merge = {'next':[]} >>>> for result in pool.map(action,paths): >>>> if result == None: >>>> continue >>>> merge = mergeDicts(merge,result) >>>> return merge >>>> >>>> >>>> def mergeDicts(d1,d2): >>>> for key in d2: >>>> if key not in d1: >>>> d1[key] = d2[key] >>>> else: >>>> d1[key] += d2[key] >>>> return d1 >>>> >>>> >>>> def evalFSPattern(paths,pattern): >>>> pool = Pool(10) >>>> results = {} >>>> for action in pattern: >>>> tomerge1 = mapAction(pool,paths,action) >>>> tomerge2 = evalFSPattern(tomerge1['next'],pattern[action]) >>>> del tomerge1['next'] >>>> results = mergeDicts(results,tomerge1) >>>> results = mergeDicts(results,tomerge2) >>>> return results >>>> >>>> the classes used in the pattern (InGlob,NoMatchAdd,etc.) are callable >>>> classes that take a single parameter (a path) and return a dict result or >>>> None which makes them trivial to adapt to Pool.map. >>>> >>>> Note if I change the mapAction function to: >>>> >>>> def mapAction(pool,paths,action): >>>> merge = {'next':[]} >>>> for path in paths: >>>> result = action(path) >>>> if result == None: >>>> continue >>>> merge = mergeDicts(merge,result) >>>> return merge >>>> >>>> everything works just fine. >>>> >>>> >>>> Thanks. >>>> >>>> >>>> -- >>>> http://mail.python.org/mailman/listinfo/python-list >>> >>> -- >>> http://mail.python.org/mailman/listinfo/python-list >> > > -- > http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list