Share as little as possible between your various processes - shared, mutable state is a parallelism tragedy.
If you can avoid sharing an entire dictionary, do so. It would probably be better to dedicate one process to updating your dictionary, and then use a multiprocessing.Queue to pass delta records from your workers to that dictionary-management process. Also, I doubt it will work well to have multiple processes doing I/O on the same socket - you would be better off with a single process that does all the I/O on the socket and, again, one or more multiprocessing.Queues that pass I/O requests and results around. (Rough sketches of both patterns follow the quoted message below.)

On Thu, May 19, 2011 at 6:10 AM, Pietro Abate <pietro.ab...@pps.jussieu.fr> wrote:

> Hi all,
>
> I'm struggling a bit to understand a KeyError raised by the multiprocessing
> library.
>
> My idea is pretty simple. I want to create a server that will spawn a
> number of workers that will share the same socket and handle requests
> independently. The goal is to build a 3-tier structure where all requests
> are handled via an http server and then dispatched to nodes sitting in a
> cluster, and from nodes to workers via the multiprocessing managers...
>
> There is one public server, one node per machine, and x workers on each
> machine depending on the number of cores... I know I could use a more
> sophisticated library, but for such a simple task (I'm just prototyping
> here) I would rather just use the multiprocessing library... Is this
> possible, or should I explore other solutions directly? I feel I'm very
> close to having something working here...
>
> The problem with the code below is that if I run the server as
> `python server.py 1`, that is, using only one process, it works as
> expected.
>
> However, if I spawn two processes (`python server.py 2`) listening for
> connections, I get a nasty error:
>
> $ python client.py ping
> Traceback (most recent call last):
>   File "client.py", line 24, in <module>
>     sys.exit(main(sys.argv))
>   File "client.py", line 21, in main
>     print m.solver(args[1])._getvalue()
>   File "/usr/lib/python2.6/multiprocessing/managers.py", line 637, in temp
>     authkey=self._authkey, exposed=exp
>   File "/usr/lib/python2.6/multiprocessing/managers.py", line 894, in AutoProxy
>     incref=incref)
>   File "/usr/lib/python2.6/multiprocessing/managers.py", line 700, in __init__
>     self._incref()
>   File "/usr/lib/python2.6/multiprocessing/managers.py", line 750, in _incref
>     dispatch(conn, None, 'incref', (self._id,))
>   File "/usr/lib/python2.6/multiprocessing/managers.py", line 79, in dispatch
>     raise convert_to_error(kind, result)
> multiprocessing.managers.RemoteError:
> ---------------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/multiprocessing/managers.py", line 181, in handle_request
>     result = func(c, *args, **kwds)
>   File "/usr/lib/python2.6/multiprocessing/managers.py", line 402, in incref
>     self.id_to_refcount[ident] += 1
> KeyError: '7fb51084c518'
> ---------------------------------------------------------------------------
>
> My understanding is that all processes share the same socket (from the
> Manager). When a client wants to connect, a new connection is created and
> served independently by that process. If you look at the server trace
> (using logging), it actually receives the connection and handles it, but
> fails to communicate back to the client.
>
> Can anybody shed some light on this for me and maybe propose a solution?
>
> thanks
> pietro
>
> ----------------------------------------
>
> Server :
>
> import sys
> from multiprocessing.managers import BaseManager, BaseProxy, Process
>
> def baz(aa) :
>     l = []
>     for i in range(3) :
>         l.append(aa)
>     return l
>
> class SolverManager(BaseManager): pass
>
> class MyProxy(BaseProxy): pass
>
> manager = SolverManager(address=('127.0.0.1', 50000), authkey='mpm')
> manager.register('solver', callable=baz, proxytype=MyProxy)
>
> def serve_forever(server):
>     try :
>         server.serve_forever()
>     except KeyboardInterrupt:
>         pass
>
> def runpool(n):
>     server = manager.get_server()
>     workers = []
>
>     for i in range(int(n)):
>         Process(target=serve_forever, args=(server,)).start()
>
> if __name__ == '__main__':
>     runpool(sys.argv[1])
>
>
> Client :
>
> import sys
> from multiprocessing.managers import BaseManager, BaseProxy
>
> import multiprocessing, logging
>
> class SolverManager(BaseManager): pass
>
> class MyProxy(BaseProxy): pass
>
> def main(args) :
>     SolverManager.register('solver')
>     m = SolverManager(address=('127.0.0.1', 50000), authkey='mpm')
>     m.connect()
>
>     print m.solver(args[1])._getvalue()
>
> if __name__ == '__main__':
>     sys.exit(main(sys.argv))
>
>
> I also tried on Stack Overflow and the python list, but I didn't manage to
> come up with a working solution yet...
>
> --
> ----
> http://en.wikipedia.org/wiki/Posting_style
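For what it's worth, here is a rough, untested sketch of the first pattern I'm suggesting: one process owns the dictionary, and workers only send delta records through a Queue. All the names here (dict_manager, delta_queue, and so on) are mine, and the delta format is just a (key, value) pair for illustration:

from multiprocessing import Process, Queue

def dict_manager(delta_queue):
    # The only process that ever touches the dictionary.
    state = {}
    while True:
        delta = delta_queue.get()
        if delta is None:          # sentinel: shut down
            break
        key, value = delta
        state[key] = value
    print 'final state:', state

def worker(delta_queue, worker_id):
    # Workers never see the dictionary; they only emit delta records.
    for i in range(3):
        delta_queue.put(('worker%d-key%d' % (worker_id, i), i))

if __name__ == '__main__':
    deltas = Queue()
    mgr = Process(target=dict_manager, args=(deltas,))
    mgr.start()
    workers = [Process(target=worker, args=(deltas, n)) for n in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    deltas.put(None)               # all workers done; tell the manager to exit
    mgr.join()

Because only dict_manager ever mutates the dictionary, there is no shared mutable state to synchronize; the Queue does all the coordination.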
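And the same idea applied to the socket: one process owns all the socket I/O, with a request queue going out to the workers and a results queue coming back. Again purely illustrative and untested - the io_process here just fakes three incoming requests instead of reading a real socket:

from multiprocessing import Process, Queue

def io_process(requests, results):
    # Stand-in for the process that owns the socket: in real code this
    # would accept connections and read requests off the wire.
    for n in range(3):
        requests.put('request-%d' % n)
    for n in range(3):
        print 'io process would send back:', results.get()

def worker(requests, results):
    # Workers never touch the socket; they just consume requests and
    # produce results.
    while True:
        req = requests.get()
        if req is None:            # sentinel: shut down
            break
        results.put(req.upper())   # stand-in for real work

if __name__ == '__main__':
    requests = Queue()
    results = Queue()
    io = Process(target=io_process, args=(requests, results))
    workers = [Process(target=worker, args=(requests, results))
               for i in range(2)]
    io.start()
    for w in workers:
        w.start()
    io.join()
    for w in workers:
        requests.put(None)         # one sentinel per worker
    for w in workers:
        w.join()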