really slow gzip decompress, why?
I have one big (6.9 GB) .gz file with text inside it. zcat bigfile.gz > /dev/null does the job in 4 minutes 50 seconds; my Python code has been doing the same job for 25 minutes and still hasn't finished =( The code is the simplest I could imagine:

    def main():
        fh = gzip.open(sys.argv[1])
        all(fh)

As far as I understand, most of the time it should be executing C code, so Python overhead shouldn't be noticeable. Why is it so slow?
-- http://mail.python.org/mailman/listinfo/python-list
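For comparison, here is a minimal sketch that reads the decompressed stream in large blocks instead of iterating line by line. all(fh) makes gzip.GzipFile.readline() split every line in Python-level code, which is usually where the time goes; the 8 MB chunk size below is an arbitrary choice for illustration.

    import sys
    import gzip

    def main():
        # Decompress in big binary chunks; zlib does the work in C and the
        # Python loop runs only once per chunk instead of once per line.
        fh = gzip.open(sys.argv[1], "rb")
        while True:
            chunk = fh.read(8 * 1024 * 1024)
            if not chunk:
                break
        fh.close()

    if __name__ == "__main__":
        main()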
Could you recommend a job scheduling solution?
I have a single 8-way node dedicated to executing long-running tasks. To be able to execute multiple tasks on this node, it should spawn each task in a separate process. At the same time it should accept network connections carrying new tasks, without blocking the client, and put them on a job queue. What is a "task"? Executing an ordinary Python function is enough. If the solution includes a client library that makes task submission easy, that would be great.
-- http://mail.python.org/mailman/listinfo/python-list
Re: Could you recommend a job scheduling solution?
On 11 Feb, 20:26, "bruce" wrote:
> hi...
>
> not sure exactly what you're looking for, but "condor" has a robust job
> scheduling architecture for dealing with grid/distributed setups over
> multiple systems..
>
> give us more information, and there might be other suggestions!

Condor, Globus or any other grid system looks like serious overkill to me. I need some sort of job manager which accepts jobs from clients as pickled Python functions and queues/executes them. The client side should be able to ask for the status of a submitted job, get its return value, ask to cancel the job, etc.
-- http://mail.python.org/mailman/listinfo/python-list
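For illustration, here is a minimal sketch of such a job manager built only on the standard library's multiprocessing.managers (the port, authkey and file names are invented, and error handling, job status and cancellation are left out). One real limitation, similar to the RPyC one mentioned later in the thread: pickle sends functions by reference, so a submitted function must be importable on the worker side.

    # jobserver.py -- holds the shared job and result queues.
    import Queue
    from multiprocessing.managers import BaseManager

    job_queue = Queue.Queue()      # holds (job_id, func, args) tuples
    result_queue = Queue.Queue()   # holds (job_id, return_value) tuples

    class JobManager(BaseManager):
        pass

    JobManager.register("get_jobs", callable=lambda: job_queue)
    JobManager.register("get_results", callable=lambda: result_queue)

    if __name__ == "__main__":
        manager = JobManager(address=("", 50000), authkey="secret")
        manager.get_server().serve_forever()

    # worker.py -- run one per core; pulls jobs and pushes results back.
    from multiprocessing.managers import BaseManager

    class JobManager(BaseManager):
        pass

    JobManager.register("get_jobs")
    JobManager.register("get_results")

    if __name__ == "__main__":
        manager = JobManager(address=("127.0.0.1", 50000), authkey="secret")
        manager.connect()
        jobs, results = manager.get_jobs(), manager.get_results()
        while True:
            job_id, func, args = jobs.get()
            results.put((job_id, func(*args)))

A client registers the same two names, connects the same way, calls get_jobs().put((some_id, some_module_level_function, args)) and later reads (some_id, value) pairs from get_results().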
Re: Could you recommend a job scheduling solution?
> I think parallel python will take care of that for you
> (http://www.parallelpython.com/)

I've found that RPyC (http://rpyc.wikidot.com/) is quite useful for my task. It allows me to build an RPC service which accepts an ordinary Python function from the client and returns the result in a synchronous or asynchronous way. Of course it is able to serve multiple clients, and its forking server helps me solve GIL problems with intense calculations inside Python code. Some limitations are present, like the fact that you can't send for execution any code which uses C extensions that are not present on the RPC server, but I didn't expect that to work at all, so in general RPyC makes me happy.
-- http://mail.python.org/mailman/listinfo/python-list
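For readers who haven't used RPyC, a skeletal sketch of what such a service looks like; the service name, port and method below are made up for illustration and this is not the poster's actual code.

    import rpyc
    from rpyc.utils.server import ForkingServer

    class ComputeService(rpyc.Service):
        # Methods prefixed with exposed_ become callable by clients
        # through conn.root.<name>().
        def exposed_heavy_task(self, n):
            return sum(i * i for i in xrange(n))   # stand-in for real work

    if __name__ == "__main__":
        # ForkingServer handles each connection in its own process, which is
        # what lets CPU-bound requests from different clients run in
        # parallel despite the GIL.
        ForkingServer(ComputeService, port=18861).start()

A client then does conn = rpyc.connect("localhost", 18861) and calls conn.root.heavy_task(10**6).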
Re: cx_Oracle-5.0 Problem
> ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.6/
> lib/python2.6/site-packages/cx_Oracle.so, 2): Symbol not found:
> ___divdi3

You didn't link cx_Oracle.so against all the libraries it uses. Run "ldd -r cx_Oracle.so" and you'll get a list of all the missing symbols. The names of the missing symbols should give you an idea what else cx_Oracle.so needs to be linked with.
-- http://mail.python.org/mailman/listinfo/python-list
Re: something wrong with isinstance
Not really sure, but try to define your class as a new-style one, like:

    class GeoMap(object):
        ...
-- http://mail.python.org/mailman/listinfo/python-list
Re: Break large file down into multiple files
> New to python I have a large file that I need to break up into
> multiple smaller files. I need to break the large file into sections
> where there are 65535 lines and then write those sections to separate
> files.

If your lines are variable-length, then look at the itertools recipes:

    from itertools import izip_longest

    def grouper(n, iterable, fillvalue=None):
        "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
        args = [iter(iterable)] * n
        return izip_longest(fillvalue=fillvalue, *args)

    with open("/file", "r") as f:
        for lines in grouper(65535, f, ""):
            data_to_write = "".join(lines)
            ...
-- http://mail.python.org/mailman/listinfo/python-list
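To complete the picture, a sketch that writes each 65535-line group to its own numbered output file; the input and output file names are made up for illustration.

    from itertools import izip_longest

    def grouper(n, iterable, fillvalue=None):
        args = [iter(iterable)] * n
        return izip_longest(fillvalue=fillvalue, *args)

    with open("bigfile.txt", "r") as f:
        for part, lines in enumerate(grouper(65535, f, "")):
            # Lines from the file keep their trailing '\n'; the "" fill
            # value pads the last group and contributes nothing when joined.
            with open("bigfile.part%04d.txt" % part, "w") as out:
                out.write("".join(lines))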
setuptools - library dependencies
I'm trying to write a setup.py which compiles a C extension (A). The problem is that my C extension depends on another C library (B). I've dug into the setuptools sources a bit and found the "libraries" option for setuptools.setup. Now it compiles library B at the build_clib stage and extension A at the build_ext stage. But it doesn't pack library B into the final egg file, only A. Even worse, it compiles B as a static lib but links it to A as a dynamic one, which leads to undefined symbols in A. How could I either: 1) link A with B statically? 2) put B in the final egg in the same dir as A?
-- http://mail.python.org/mailman/listinfo/python-list
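For what it's worth, a hedged sketch of option 1 (linking A against a static build of B) using the standard Extension keywords; the library names, paths and source files are invented, and this assumes build/libB.a has already been produced (by the build_clib step or an external Makefile) and was compiled with -fPIC so it can be folded into a shared extension.

    from setuptools import setup, Extension

    ext_a = Extension(
        "A",
        sources=["src/a_module.c"],
        include_dirs=["include"],
        # Point the linker at the static archive directly instead of using
        # libraries=["B"], so the B objects end up inside A.so and no
        # separate libB has to ship in the egg.
        extra_objects=["build/libB.a"],
    )

    setup(
        name="A",
        version="0.1",
        ext_modules=[ext_a],
    )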
multiprocessing: queue.get() blocks even if queue.qsize() != 0
I've run into a problem with the queue from multiprocessing. Even when queue.qsize() != 0, queue.get() still blocks and queue.get_nowait() raises the Empty error. I'm unable to cut my big app down to a small test case, because a smaller test case with a similar design to my real app works. Under what conditions is this possible?

    while qresult.qsize():
        result = qresult.get()  # this code blocks!
        doWithResult(result)
-- http://mail.python.org/mailman/listinfo/python-list
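One thing worth knowing here (it may or may not be the actual cause): for multiprocessing queues, qsize() is documented as only approximate, and items are shipped to the underlying pipe by a background feeder thread, so a nonzero qsize() does not guarantee that data can already be read. A hedged sketch of a more robust drain loop, where "expected" and "handle" are placeholders for the caller's item count and result handler:

    from Queue import Empty   # multiprocessing queues raise Queue.Empty

    def drain(qresult, expected, handle):
        # Rely on a known item count (or a sentinel) rather than qsize().
        received = 0
        while received < expected:
            try:
                result = qresult.get(timeout=1)
            except Empty:
                continue        # nothing has arrived yet; keep waiting
            handle(result)
            received += 1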
multiprocessing: Queue.get_nowait() never returns data
I'm stuck with the new multiprocessing module (formerly "processing"). I don't understand why queue.get_nowait() never returns data, but always raises Empty, even when it is guaranteed that the queue is not empty. I've created a small test case, here it is: http://pastebin.ca/1227666 I hope someone can explain what I'm doing wrong. It's written for 2.6 with the multiprocessing module, but it's trivial to convert it to the processing module for 2.5: just replace "multiprocessing" with "processing" and "freeze_support" with "freezeSupport".
-- http://mail.python.org/mailman/listinfo/python-list
Re: multiprocessing: Queue.get_nowait() never returns data
My fault: changing "continue" to "break" solves the problem.
-- http://mail.python.org/mailman/listinfo/python-list
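The pastebin has long since expired, so this is only a guess at the general shape of the fix rather than the poster's actual code: a consumer loop that hits "continue" whenever get_nowait() raises Empty never terminates, whereas "break" lets it stop once the producers are done and the queue has been drained.

    from Queue import Empty

    def consume(queue, handle):
        while True:
            try:
                item = queue.get_nowait()
            except Empty:
                break      # with "continue" here the loop would spin forever
            handle(item)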
Re: multiprocessing eats memory
On 26 Sep, 04:20, Istvan Albert <[EMAIL PROTECTED]> wrote:
> On Sep 25, 8:40 am, "Max Ivanov" <[EMAIL PROTECTED]> wrote:
>
> > At any time in the main process there shouldn't be more than two copies
> > of the data (one original and one result).
>
> From the looks of it you are storing lots of references to various
> copies of your data via the async set.

How could I avoid storing them? I need something to check whether a result is ready or not and to retrieve it if it is. I can't see a way to achieve the same thing without storing the set of asyncs.
-- http://mail.python.org/mailman/listinfo/python-list
Re: multiprocessing eats memory
On 26 Sep, 17:03, MRAB <[EMAIL PROTECTED]> wrote:
> On Sep 26, 9:52 am, redbaron <[EMAIL PROTECTED]> wrote:
>
> > On 26 Sep, 04:20, Istvan Albert <[EMAIL PROTECTED]> wrote:
>
> > > On Sep 25, 8:40 am, "Max Ivanov" <[EMAIL PROTECTED]> wrote:
>
> > > > At any time in the main process there shouldn't be more than two
> > > > copies of the data (one original and one result).
>
> > > From the looks of it you are storing lots of references to various
> > > copies of your data via the async set.
>
> > How could I avoid storing them? I need something to check whether a
> > result is ready or not and to retrieve it if it is. I can't see a way
> > to achieve the same thing without storing the set of asyncs.
>
> You could give each worker process an ID and then have them put the ID
> into a queue to signal to the main process when finished.

And how could I retrieve the result from the worker process without the async?

> BTW, your test case modifies the asyncs set while iterating over it,
> which is a bad idea.

My fault, there was list(asyncs) there originally.
-- http://mail.python.org/mailman/listinfo/python-list
Re: multiprocessing eats memory
> When processing data in parallel you will use up as much memory as
> many datasets you are processing at any given time.

Worker processes eat 2-4 times more memory than the data I pass to them.

> If you need to reduce memory use then you need to start fewer processes
> and use some mechanism to distribute the work to them as they become
> free (see the recommendation that uses Queues).

I don't understand how I could use a Queue here. If a worker process finishes computing, it puts its id into the Queue and the main process retrieves that id, but how do I then retrieve the result from the worker process?
-- http://mail.python.org/mailman/listinfo/python-list
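One way to reconcile the two suggestions is to keep only a small, bounded number of outstanding async results: results are still fetched through AsyncResult.get(), but only a handful of datasets and results are alive at any moment. A rough sketch, with process() and the datasets invented for illustration:

    import multiprocessing

    def process(dataset):
        # Stand-in for the real computation.
        return sum(dataset)

    def bounded_map(pool, func, datasets, max_pending=8):
        """Run func over datasets keeping at most max_pending jobs alive."""
        pending = []
        for data in datasets:
            pending.append(pool.apply_async(func, (data,)))
            while len(pending) >= max_pending:
                # Harvest whatever is done so inputs/results can be freed.
                done = [r for r in pending if r.ready()]
                for r in done:
                    yield r.get()
                    pending.remove(r)
                if not done:
                    pending[0].wait(0.1)
        for r in pending:
            yield r.get()

    if __name__ == "__main__":
        pool = multiprocessing.Pool(8)
        datasets = (range(n, n + 1000) for n in xrange(100))
        for result in bounded_map(pool, process, datasets):
            print result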
Using logging module to log either to screen or a file
Hi, I am a beginner in Python and I am writing a program that does a lot of things. One of the requirements is that the program should generate a log file. I came across the Python logging module and found it very useful, but I have a few problems. Suppose that by giving the option '-v' along with the program, the user can turn off logging to a file and instead display the log on the screen. Since I am using a config file for logging, how do I accomplish this? I tried to define two handlers (file and screen) and added them to my logger, but that logs data to both the screen and the file. I need to log to only one. How do I dynamically remove one of the handlers from the logger based on the user's option? As a precursor, how do I reference the handlers defined in the config file from the code?
-- http://mail.python.org/mailman/listinfo/python-list
Re: Using logging module to log either to screen or a file
On Dec 7, 7:33 pm, Jean-Michel Pichavant wrote:
> RedBaron wrote:
> > How do I dynamically remove one of the handlers from the logger based
> > on the user's option?
>
> your logger has a public 'handlers' attribute.
>
> consoleHandlers = [h for h in logger.handlers
>                    if h.__class__ is logging.StreamHandler]
> # the list of handlers logging to the console
> # (assuming they are instances of the StreamHandler class)
>
> if consoleHandlers:
>     h1 = consoleHandlers[0]
>     h1.filter = lambda x: True   # enable the handler
>     h1.filter = lambda x: False  # disable the handler
>
> JM

Thanks JM, this works like a charm. I had also thought along similar lines, but I was using isinstance(). I have two handlers, logging.handlers.RotatingFileHandler and StreamHandler. isinstance() was weird in the sense that no matter which handler I checked for being a 'StreamHandler', I always got True. Also, instead of setting the filter to false, I was popping from the handlers list... Silly me. Thanks a ton.
-- http://mail.python.org/mailman/listinfo/python-list
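Incidentally, the always-true isinstance() is expected: RotatingFileHandler inherits from StreamHandler (via FileHandler), so isinstance(handler, logging.StreamHandler) is True for both. For readers setting this up from scratch, a common alternative to filtering handlers after the fact is to attach only the handler you want at startup, based on the -v flag; the logger name, file name and format string below are made up for illustration.

    import logging
    import logging.handlers
    import sys

    def make_logger(verbose):
        logger = logging.getLogger("myapp")
        logger.setLevel(logging.DEBUG)
        if verbose:
            # -v: log to the screen only.
            handler = logging.StreamHandler(sys.stderr)
        else:
            # Default: log to a rotating file only.
            handler = logging.handlers.RotatingFileHandler(
                "myapp.log", maxBytes=1024 * 1024, backupCount=3)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
        return logger

    log = make_logger("-v" in sys.argv)
    log.info("starting up")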
Case-sensitive section names in ConfigParser
Is there any way by which ConfigParser's get() function can be made case-insensitive?
-- http://mail.python.org/mailman/listinfo/python-list
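Option names are already case-insensitive by default (ConfigParser lowercases them through optionxform); section names are not. A hedged sketch of one way to get case-insensitive lookups for both, using Python 2's ConfigParser module; the file, section and option names are invented for illustration.

    import ConfigParser

    class CaseInsensitiveConfig(object):
        def __init__(self, filename):
            self._parser = ConfigParser.SafeConfigParser()
            self._parser.read(filename)
            # Map lowercased section name -> real section name.
            self._sections = dict((s.lower(), s)
                                  for s in self._parser.sections())

        def get(self, section, option):
            # Options are lowercased by the parser itself; sections are
            # resolved through our lookup table.
            return self._parser.get(self._sections[section.lower()], option)

    cfg = CaseInsensitiveConfig("settings.ini")
    print cfg.get("Database", "host")   # works for "database", "DATABASE", ...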