Bengt Richter wrote:
> On Sat, 07 May 2005 14:03:34 +1000, Maurice LING <[EMAIL PROTECTED]> wrote:
>
>>John Machin wrote:
>>
>>>On Sat, 07 May 2005 02:29:48 GMT, [EMAIL PROTECTED] (Bengt Richter) wrote:
>>>
>>>>On Sat, 07 May 2005 11:08:31 +1000, Maurice LING <[EMAIL PROTECTED]> wrote:
>>>>
>>>>>It doesn't seem to help. I'm thinking that it might be a SOAPpy
>>>>>problem. The allocation fails when I grab a list of more than 150k
>>>>>elements through SOAP but allocating a 1 million element list is fine in
>>>>>python.
>>>>>
>>>>>Now I have a performance problem...
>>>>>
>>>>>Say I have 3 lists (20K elements, 1G elements, and 0 elements), call
>>>>>them 'a', 'b', and 'c'. I want to filter all that is in 'b' but not in
>>>>>'a' into 'c'...
>>>>>
>>>>> >>> a = range(1, 100000, 5)
>>>>> >>> b = range(0, 1000000)
>>>>> >>> c = []
>>>>> >>> for i in b:
>>>>> ...     if i not in a: c.append(i)
>>>>> ...
>>>>>
>>>>>This takes forever to complete. Is there any way to optimize this?
>>>>>
>>>>
>>>>Checking whether something is in a list may average checking equality with
>>>>each element in half the list. Checking for membership in a set should
>>>>be much faster for any significant size set/list. I.e., just changing to
>>>>
>>>>    a = set(range(1, 100000, 5))
>>>>
>>>>should help. I assume those aren't examples of your real data ;-)
>>>>You must have a lot of memory if you are keeping 1G elements there and
>>>>copying a significant portion of them. Do you need to do this file-to-file,
>>>>keeping a in memory? Perhaps page-file thrashing is part of the time
>>>>problem?
>>>
>>>
>>>Since when was 1000000 == 1G??
>>>
>>>Maurice, is this mucking about with 1M or 1G lists in the same
>>>exercise as the "vm_malloc fails when allocating a 20K-element list"
>>>problem? Again, it might be a good idea if you gave us a little bit
>>>more detail. You haven't even posted the actual *PYTHON* error message
>>>and stack trace that you got from the original problem. In fact,
>>>there's a possible interpretation that the (system?) malloc merely
>>>prints the vm_malloc message and staggers on somehow ...
>>>
>>>Regards,
>>>John
>>
>>This is the exact error message:
>>
>>*** malloc: vm_allocate(size=9203712) failed (error code=3)
>>*** malloc[489]: error: Can't allocate region
>>
>>Nothing else. No stack trace, NOTHING.
>>
>
> 1. Can you post minimal exact code that produces the above exact error
>    message?
> 2. Will you? ;-)
>
> Regards,
> Bengt Richter
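
On the filtering question quoted above, here is a minimal sketch of the set-based approach that was suggested. It assumes a Python where `set` is a builtin (2.4 or later; on Python 2.3 the `sets` module offers `sets.Set` instead), and it uses the same toy ranges as the quoted example rather than real data. The list comprehension is my own choice of phrasing, not something from the thread.

```python
# Sketch only: same toy data as the quoted example.
# A set gives average O(1) membership tests; a list needs an O(n) scan.
a = set(range(1, 100000, 5))   # values to exclude (20,000 of them)
b = range(0, 1000000)          # values to filter (1,000,000 of them)

# Keep every element of b that is not in a, preserving b's order.
c = [i for i in b if i not in a]
```

With `a` as a list, each `i not in a` test may scan thousands of elements, so the loop does on the order of 10^10 comparisons. With `a` as a set, the whole filter is roughly one hash lookup per element of `b`.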

I've re-tried the minimal code mimicking the error in interactive mode and got this:

>>> from SOAPpy import WSDL
>>> serv = WSDL.Proxy('http://eutils.ncbi.nlm.nih.gov/entrez/eutils/soap/v1.1/eutils.wsdl')
>>> result = serv.run_eSearch(db='pubmed', term='mouse', retmax=500000)
*** malloc: vm_allocate(size=9121792) failed (error code=3)
*** malloc[901]: error: Can't allocate region
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/sw/lib/python2.3/site-packages/SOAPpy/Client.py", line 453, in __call__
    return self.__r_call(*args, **kw)
  File "/sw/lib/python2.3/site-packages/SOAPpy/Client.py", line 475, in __r_call
    self.__hd, self.__ma)
  File "/sw/lib/python2.3/site-packages/SOAPpy/Client.py", line 347, in __call
    config = self.config)
  File "/sw/lib/python2.3/site-packages/SOAPpy/Client.py", line 212, in call
    data = r.getfile().read(message_len)
  File "/sw/lib/python2.3/socket.py", line 301, in read
    data = self._sock.recv(recv_size)
MemoryError
>>>

When I changed retmax to 150000, it worked nicely.
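
Since the 150000-result request succeeds and the 500000-result one does not, one possible workaround is to fetch the results in several smaller requests. The sketch below only shows the chunking logic. `fetch_in_chunks` and `fetch_page` are hypothetical names, and `fake_fetch` is a stand-in so the example runs without a network. NCBI's ESearch documents a `retstart` offset alongside `retmax`, but whether the SOAPpy proxy passes `retstart` through to `run_eSearch` is an assumption I have not verified.

```python
def fetch_in_chunks(fetch_page, total, chunk_size=100000):
    """Collect `total` results by calling fetch_page(retstart, retmax) repeatedly.

    fetch_page is a hypothetical callable that returns a list of at most
    `retmax` results starting at offset `retstart`.
    """
    results = []
    for start in range(0, total, chunk_size):
        # The last chunk may be smaller than chunk_size.
        size = min(chunk_size, total - start)
        results.extend(fetch_page(start, size))
    return results


def fake_fetch(retstart, retmax):
    """Stand-in for the real remote call: returns consecutive integers."""
    return list(range(retstart, retstart + retmax))


ids = fetch_in_chunks(fake_fetch, total=500000, chunk_size=100000)
```

In real use, `fetch_page` would wrap the remote call and extract the ID list from each response. Each response is then bounded by `chunk_size`, although the combined list still has to fit in memory.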