Re: problem with multiprocessing and defaultdict

2010-01-12 Thread wiso
Robert Kern wrote:
> On 2010-01-11 17:50 PM, wiso wrote:
>
>> The problem now is this:
>> start reading file r1_200909.log
>> start reading file r1_200910.log
>> readen 488832 lines from file r1_200910.log
>> readen 517247 lines from file r1_200909.log
>

Re: problem with multiprocessing and defaultdict

2010-01-11 Thread wiso
Robert Kern wrote:
> On 2010-01-11 17:15 PM, wiso wrote:
>> I'm using a class to read some data from files:
>>
>> import multiprocessing
>> from collections import defaultdict
>>
>> def SingleContainer():
>>     return list()
>>

problem with multiprocessing and defaultdict

2010-01-11 Thread wiso
I'm using a class to read some data from files:

    import multiprocessing
    from collections import defaultdict

    def SingleContainer():
        return list()

    class Container(defaultdict):
        """
        This class stores odd lines in self["odd"] and even lines in
        self["even"]. It is stupid, but it's only a
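
The snippet above is cut off, so the rest of the class is unknown. As a hedged sketch only: assuming `Container.__init__` takes no arguments and passes `SingleContainer` to `defaultdict` itself, the default pickling that multiprocessing relies on would try to rebuild the object as `Container(SingleContainer)`, which such an `__init__` rejects. One possible workaround is a custom `__reduce__`. The `__init__` and `__reduce__` bodies and the `roundtrip` helper below are assumptions for illustration, not the poster's code, and the sketch is written for Python 3 rather than the Python 2.6 of the thread.

```python
import pickle
from collections import defaultdict


def SingleContainer():
    return list()


class Container(defaultdict):
    # Assumed constructor: no arguments, default factory fixed internally.
    def __init__(self):
        defaultdict.__init__(self, SingleContainer)

    def __reduce__(self):
        # Rebuild with Container() (no arguments), then restore the stored
        # items one by one through the dict-items slot of the reduce tuple.
        return (type(self), (), None, None, iter(self.items()))


def roundtrip(obj):
    # Hypothetical helper: pickle and unpickle, as happens when an object
    # is sent between processes.
    return pickle.loads(pickle.dumps(obj))


if __name__ == "__main__":
    c = Container()
    c["odd"].append("line 1")
    c["even"].append("line 2")
    print(roundtrip(c) == c)
```

Whether this matches the actual problem in the thread cannot be told from the truncated post; it only shows one way a `defaultdict` subclass can be made to survive pickling.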

monitor reading file with thread

2010-01-08 Thread wiso
I'm reading and processing a huge file, so during execution I want to know the state of the processing: how many lines have already been processed, and so on. The first approach is:

    f = open(filename)
    n = 0
    for l in f:
        # report progress every 1000 lines
        if n % 1000 == 0:
            print "Reading %d lines" % n
        do_something(l)
        n += 1

but I want
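
Since the post is cut off, what the poster wanted next is not known. As one possible reading of the subject line, here is a minimal sketch of reporting progress from a separate thread while the main thread does the reading. The names `Progress`, `monitor`, `process_lines`, `interval` and `report` are all hypothetical, and the code is Python 3.

```python
import threading


class Progress:
    """Shared counter: written by the reader, read by the monitor."""

    def __init__(self):
        self.lines = 0


def monitor(progress, stop, interval, report):
    # Event.wait returns False on timeout and True once stop is set,
    # so this reports every `interval` seconds until asked to stop.
    while not stop.wait(interval):
        report("Read %d lines so far" % progress.lines)


def process_lines(lines, do_something, interval=1.0, report=print):
    progress = Progress()
    stop = threading.Event()
    t = threading.Thread(target=monitor,
                         args=(progress, stop, interval, report))
    t.daemon = True  # must be set before start()
    t.start()
    try:
        for line in lines:
            do_something(line)
            progress.lines += 1
    finally:
        stop.set()   # wake the monitor and let it exit
        t.join()
    return progress.lines
```

A call might look like `process_lines(open(filename), do_something)`. Only one thread writes the counter, so the monitor may see a slightly stale value but never a corrupted one.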

Re: Convert month name to month number faster

2010-01-06 Thread wiso
Antoine Pitrou wrote:
> On Wed, 06 Jan 2010 12:03:36 +0100, wiso wrote:
>
>
>> from time import time
>> t = time(); xxx=map(to_dict,l); print time() - t # 0.5
>> t = time(); xxx=map(to_if,l); print time() - t # 1.0
>
> Don't define your own function j

Convert month name to month number faster

2010-01-06 Thread wiso
I'm optimizing the innermost loop of my script. I need to convert a month name to a month number. I'm using Python 2.6 on Linux x64.

    month_dict = {"Jan":1,"Feb":2,"Mar":3,"Apr":4, "May":5, "Jun":6,
                  "Jul":7,"Aug":8,"Sep":9,"Oct":10,"Nov":11,"Dec":12}

    def to_dict(name):
        return month_dic
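
A rough sketch of how such a comparison could be timed with the standard `timeit` module. The body of `to_dict` is assumed to be a plain dictionary lookup (the post is truncated), and `names`, `via_function` and `via_getitem` are hypothetical names introduced here. It compares calling a small wrapper function against passing the dictionary's bound `__getitem__` directly to `map`. Python 3 syntax; no timings are claimed.

```python
import timeit

month_dict = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
              "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}


def to_dict(name):
    # Assumed completion of the truncated function in the post.
    return month_dict[name]


# Hypothetical sample input.
names = ["Jan", "Jun", "Dec"] * 1000


def via_function():
    # One Python-level function call per element.
    return list(map(to_dict, names))


def via_getitem():
    # Hands the dictionary lookup itself to map, skipping the wrapper.
    return list(map(month_dict.__getitem__, names))


if __name__ == "__main__":
    for fn in (via_function, via_getitem):
        print(fn.__name__, timeit.timeit(fn, number=100))
```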

FileInput too slow

2010-01-04 Thread wiso
I'm trying the fileinput module, and I like it, but I don't understand why it's so slow... look:

    from time import time
    from fileinput import FileInput

    file = ['r1_200907.log', 'r1_200908.log', 'r1_200909.log', 'r1_200910.log', 'r1_200911.log']

    def f1():
        n = 0
        for f in file:
            print "new
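
The benchmark in the post is cut off, so the following is only a sketch of the kind of comparison it seems to be making: counting lines with nested `open()` loops versus `fileinput.input()`. The function names and the generated sample files are hypothetical stand-ins for the poster's log files, and the code is Python 3. No claim is made here about which version is faster.

```python
import fileinput
import os
import tempfile
from time import time


def make_sample_files(directory, count=2, lines_each=1000):
    # Hypothetical stand-ins for the r1_*.log files in the post.
    paths = []
    for i in range(count):
        path = os.path.join(directory, "sample_%d.log" % i)
        with open(path, "w") as f:
            f.writelines("line %d\n" % j for j in range(lines_each))
        paths.append(path)
    return paths


def count_plain(paths):
    # Open each file in turn and iterate over it directly.
    n = 0
    for path in paths:
        with open(path) as f:
            for _ in f:
                n += 1
    return n


def count_fileinput(paths):
    # Let fileinput chain the files into one stream of lines.
    n = 0
    with fileinput.input(files=paths) as lines:
        for _ in lines:
            n += 1
    return n


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        paths = make_sample_files(d)
        for fn in (count_plain, count_fileinput):
            t = time()
            fn(paths)
            print(fn.__name__, time() - t)
```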

google tech talk code (threading module)

2009-09-08 Thread wiso
I took a bit of code from a Google Tech Talk. It seems interesting, but it doesn't work:

    import sys, urllib, os, threading, Queue

    q = Queue.Queue()

    class RetrWorker(threading.Thread):
        def run(self):
            self.setDaemon(True)
            def hook(*a): print (fn,a)
            while True:
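
The rest of the code is missing, so the actual cause of the failure cannot be confirmed from this excerpt. One thing that does stand out: the daemon flag of a thread has to be set before `start()`, and changing it from inside `run()` raises `RuntimeError`. Below is a minimal sketch of a queue-fed worker pool with the flag set in the constructor. `Worker`, `run_all` and `handler` are hypothetical names, the download step is replaced by an arbitrary callable, and the code is Python 3 (`queue` instead of `Queue`).

```python
import queue
import threading


class Worker(threading.Thread):
    def __init__(self, tasks, results, handler):
        # daemon must be set before the thread is started
        super().__init__(daemon=True)
        self.tasks = tasks
        self.results = results
        self.handler = handler

    def run(self):
        while True:
            item = self.tasks.get()  # blocks until work is available
            try:
                self.results.append(self.handler(item))
            finally:
                # always mark the item done so Queue.join() can return
                self.tasks.task_done()


def run_all(items, handler, workers=4):
    tasks = queue.Queue()
    results = []
    for _ in range(workers):
        Worker(tasks, results, handler).start()
    for item in items:
        tasks.put(item)
    tasks.join()  # wait until every queued item has been processed
    return results
```

For example, `sorted(run_all(range(5), lambda x: x * x))` collects the squares computed by the workers. Results arrive in completion order, hence the sort. The sketch does not handle a `handler` that raises; in that case the worker thread would exit.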