Student looking for a Scikit-Learn/NumPy tutor with a strong background in data science?
I'm in a Masters program for Data Science and need help, mainly using Python/R. Please forward your background (education, work), teaching experience in stats, linear algebra, programming (Scikit-Learn, Pandas, NumPy), timezone, and rates. -- https://mail.python.org/mailman/listinfo/python-list
Egg deinstallation
Hello everyone, I googled and googled and can't seem to find the definitive answer: how do I *properly* uninstall an egg? Do I just delete the folder and/or the .py and .pyc files from Lib/site-packages? Would that break anything in the Python installation or not? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: Egg deinstallation
Diez B. Roggisch wrote: Thanks, Diez. -- http://mail.python.org/mailman/listinfo/python-list
Compressed vs uncompressed eggs
Hello everyone, Are there *good* reasons to use uncompressed eggs? Is there, say, a performance penalty in using compressed eggs? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: subprocess.Popen stalls
psaff...@googlemail.com wrote:
> p = Popen(cmd, shell=True, bufsize=100, stdout=PIPE, stderr=PIPE)
> output = p.stdout.read()

Better to use the communicate() method:

    standardoutputstr, standarderrorstr = p.communicate(...)

I have never had any problem with subprocesses when using the subprocess module in this manner (well, it's possible that standardoutputstr and/or standarderrorstr fill up the memory, you get the idea). Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
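A minimal sketch of that pattern (illustrative, not from the original post):

    import subprocess

    # communicate() reads both pipes to EOF and waits for the child, which
    # avoids the deadlock p.stdout.read() can hit when the child blocks on
    # a full stderr pipe buffer
    p = subprocess.Popen("ls -l", shell=True,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = p.communicate()
    print "return code:", p.returncode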
Re: subprocess.Popen stalls
psaff...@googlemail.com wrote:
> On 12 Jan, 15:33, mk wrote:
> > Better use communicate() method:
> Oh yes - it's right there in the documentation. That worked perfectly.

What's also in the docs, and what I did not pay attention to before:

    Note: The data read is buffered in memory, so do not use this method if the data size is large or unlimited.

Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
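When the output can be large, one alternative (a sketch, not from the thread) is to stream stdout instead of buffering all of it; here only stdout is piped, so the deadlock that communicate() guards against does not arise:

    import subprocess

    p = subprocess.Popen("du -a /usr", shell=True, stdout=subprocess.PIPE)
    for line in p.stdout:       # consume incrementally, line by line
        pass                    # ...process each line here...
    p.wait()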
Re: Does Python really follow its philosophy of "Readability counts"?
Paul Rubin wrote:
> 1) Parallelism. Commodity desktop computers now have 8 effective CPU cores or maybe even 16 soon (Mac Pro, Intel Core i7), but Python still has the evil GIL that forces all threads to run on one core. Java, Erlang, and Haskell (GHC) all beat Python in this area. By the time Python 4 comes out, we will probably all be using PCs with 32 or more cores, so the current limitations will be intolerable. Even today, since no one doing anything serious uses single-core machines any more, the GIL is a huge pain in the neck which the multiprocessing module helps only slightly. (While we are at it, lightweight threads like Erlang's or GHC's would be very useful.)

+100 for this one

> 2) Native-code compilation. Per the Alioth shootouts, Python is much slower (even on single cores) than Java, Haskell, ML, or even Scheme. PyPy is addressing this but it will be a while before it replaces CPython.

The lack of this already causes some pain at my company. I was flabbergasted to read that optional static typing was dropped by Guido due to "lack of interest in the community", IIRC. Why?! Among other reasons, it could have allowed very easy performance optimization of the small portions of the code that need it! This could have been a huge gain acquired for little effort! Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: Does Python really follow its philosophy of "Readability counts"?
Paul Rubin wrote:
> But, if something is done by convention, then departing from the convention is by definition unconventional. If you do something unconventional in a program, it could be on purpose for a reason, or it could be by accident indicating a bug.

I for one would love to see at least a compiler warning (optionally, an error) for situations where attributes are added to self outside of __init__. I consider it generally evil and hate to see code like that. Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
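For classes where this matters, __slots__ already gives a runtime version of roughly that check (a sketch, not from the thread):

    class Point(object):
        __slots__ = ('x', 'y')     # no instance __dict__; only these names allowed

        def __init__(self, x, y):
            self.x = x
            self.y = y

    p = Point(1, 2)
    p.z = 3    # AttributeError: 'Point' object has no attribute 'z'

It is a memory optimization rather than a style-enforcement tool, but it does make attributes added outside __init__ blow up immediately.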
Re: Integrating awk in Python
Alfons Nonell-Canals wrote:
> At the beginning I thought to "translate" them and program them in Python, but I prefer to avoid it because it means a lot of work and I would have to redo it after each new version of this external stuff. I would like to integrate them into my Python code.

That's kind of like retrofitting a steam engine onto a Mach 2 jet fighter. Well, there's always awk2c... Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Differences between class and function decorator
Hello everyone, I rewrote an example someone posted here recently from:

    >>> def print_method_name(method):
            def new_meth(*args, **kwargs):
                print method.func_name
                return method(*args, **kwargs)
            return new_meth

    >>> @print_method_name
        def f2():
            pass

    >>> f2()
    f2

...to:

    >>> class MyMethod(object):
            def __init__(self, func):
                self.name = func.func_name
                self.func = func
            def __call__(self):
                print self.name
                return self.func

    >>> @MyMethod
        def f():
            pass

    >>> f()
    f
    <function f at 0x...>

Note that the function-decorated call returned None, while the class-decorated call returned a function. Why the difference in behavior? After all, the print_method_name decorator also returns a function (well, it's a new function, but still a function)? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
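For comparison, a class decorator that actually mirrors print_method_name would call the wrapped function in __call__ instead of returning it; a sketch (not from the original post):

    class MyMethod(object):
        def __init__(self, func):
            self.name = func.func_name
            self.func = func
        def __call__(self, *args, **kwargs):
            print self.name
            return self.func(*args, **kwargs)   # call it rather than return it

    @MyMethod
    def f():
        pass

    f()    # prints 'f' and returns None, like the function-decorator version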
Class decorator with argument
Hello, I wrote this class decorator with an argument:

    >>> class ChangeDoc(object):
            def __init__(self, docstring):
                self.docstring = docstring
            def __call__(self, func):
                func.__doc__ = self.docstring
                return func

It seems to work:

    >>> @ChangeDoc("bulba")
        def f():
            pass

    >>> f.__doc__
    'bulba'

Can someone please debug my reasoning if it's incorrect?

1. First, the decorator @ChangeDoc('bulba') instantiates with __init__(self, 'bulba') to some class instance, let's call it _decor.

2. Then _decor's __call__ method is called with function f as its argument, changing the docstring and returning the changed f object, like f = _decor(f).

Am I missing smth important / potentially useful in typical real-world applications in that picture? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
*Advanced* Python book?
Hello everyone, I looked for it, I swear, but just can't find it. Most Python books seem to focus on examples of how to call functions from the standard library. I don't need that; I have the online Python documentation for that. I mean really advanced mental gymnastics, like the gory details of how Python objects operate, how to exploit its dynamic capabilities, dos and don'ts with particular Python objects, advanced tricks, everything from chained decorators to metaprogramming. Dive Into Python comes closest to this ideal of the books I have found, but still doesn't go far enough. Anybody found such a holy grail? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
seeking to improve Python skills
[The beginning of this post, including the code submitted for review, was lost in transit; only the tail of the sample load-average data it read remains:]

    08-11-19 16:22  0.56 0.37 0.30  4/198 17118
    08-11-19 16:23  0.20 0.30 0.28  4/198 17166
    [...]
    08-11-19 16:53  0.22 0.37 0.36  4/206 912
    08-11-19 16:54  0.08 0.30 0.33  5/201 955

Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
DrPython and py2exe
Hello, I'm trying to get DrPython to edit a .py file on double-click on Windows. Sure, I can use a trivial .bat file to open DrPython with the file as an argument. But the irritating thing is that the DOS window stays open until that particular instance of DrPython is closed. py2exe to the rescue. I have modified DrPython's setup.py:

    setup(name='drpython.exe',
          version=MY_VER,
          description=description[0],
          long_description=description[1],
          classifiers=filter(None, classifiers.split('\n')),
          author=AUTHOR,
          author_email=AUTHOR_EMAIL,
          url=URL,
          platforms="any",
          license='GPL',
          packages=[MY_NAME],
          package_dir={MY_NAME: '.'},
          package_data={MY_NAME: DATA},
          scripts=['postinst.py'],
          windows=['drpython.py'],
          )

py2exe builds the application, but when I start it, I get this in the log file:

    Traceback (most recent call last):
      File "drpython.py", line 46, in <module>
      File "wxversion.pyc", line 152, in select
    wxversion.VersionError: Requested version of wxPython not found

I'm guessing this is because py2exe doesn't package wxPython together with the app. Since this topic is interesting for me anyway (i.e. how to turn a wxPython app into a Windows executable using py2exe), would someone please reply on how to do it? Thanks, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: Implementing file reading in C/Python
John Machin wrote: The factor of 30 indeed does not seem right -- I have done somewhat similar stuff (calculating Levenshtein distance [edit distance] on words read from very large files), coded the same algorithm in pure Python and C++ (using linked lists in C++) and Python version was 2.5 times slower. Levenshtein distance using linked lists? That's novel. Care to divulge? I meant: using linked lists to store words that are compared. I found using vectors was slow. Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: DrPython and py2exe
imageguy wrote:
> drPython is probably selecting a specific version of wxpython and py2exe doesn't like it or can't find it. Once you solve that, py2exe will work fine with wxpython.

Thanks, drPython was indeed making use of wxversion.select. What's strange is that it was selecting an apparently correct version:

    import wxversion
    wxversion.select('2.8')

I put this in setup.py:

    import wxversion
    wxversion.select("2.8.9.1")
    import wx

...and it worked. Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: seeking to improve Python skills
Jean-Paul Calderone wrote:
> The most significant thing missing from this code is unit tests. Developing automated tests for your code will help you learn a lot.

Thanks for all the remarks, I'll restructure my code. Probably the biggest mistake was not keeping the "moving average" class and the others focused on their respective jobs only. However, unit tests are a tougher cookie: do you have any resource that's truly worth recommending for learning unit tests? The only resource I found that talks about them at some length is "Dive into Python", and I was wondering if there's anything better out there... Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
What's the business with the asterisk?
Hello everyone, From time to time I spot an asterisk (*) used in Python code _outside_ the usual *args or **kwargs application. E.g. here: http://www.norvig.com/python-lisp.html

    def transpose(m):
        return zip(*m)

    >>> transpose([[1,2,3], [4,5,6]])
    [(1, 4), (2, 5), (3, 6)]

What does *m mean in this example and how does it do the magic here? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
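The * in a call unpacks a sequence into separate positional arguments; a minimal sketch (not from the original post):

    def add3(a, b, c):
        return a + b + c

    args = [1, 2, 3]
    print add3(*args)    # equivalent to add3(1, 2, 3) -> 6

    # zip(*m) therefore calls zip([1, 2, 3], [4, 5, 6]) when
    # m == [[1, 2, 3], [4, 5, 6]], which is exactly a transpose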
unittest, order of test execution
Hello everyone, I've got 2 functions to test: extrfromfile, which returns a list of dictionaries, and extrvalues, which extracts values from that list. Now I can test them both in one test case, like this:

    def test_extrfromfile(self):
        valist = ma.extrfromfile('loadavg_unittest.txt')
        valist_ut = [{'day': '08-11-19', 'time': '12:41', 'val': 0.11},
                     {'day': '08-11-19', 'time': '12:42', 'val': 0.08},
                     {'day': '08-11-19', 'time': '12:43', 'val': 0.57},
                     {'day': '08-11-19', 'time': '12:44', 'val': 0.21},
                     {'day': '08-11-19', 'time': '12:45', 'val': 0.08},
                     {'day': '08-11-19', 'time': '12:46', 'val': 0.66},
                     {'day': '08-11-19', 'time': '12:47', 'val': 0.32},
                     {'day': '08-11-19', 'time': '12:48', 'val': 0.12},
                     {'day': '08-11-19', 'time': '12:49', 'val': 0.47},
                     {'day': '08-11-19', 'time': '12:50', 'val': 0.17}]
        self.assertEqual(valist, valist_ut)
        vlextr_ut = [0.11, 0.08, 0.57, 0.21, 0.08, 0.66, 0.32, 0.12, 0.47, 0.17]
        vlextr = ma.extrvalues(valist)
        self.assertEqual(len(vlextr_ut), len(vlextr))
        for (idx, elem) in enumerate(vlextr_ut):
            self.assertAlmostEqual(elem, vlextr[idx])

But I was wondering, *should* this test be separated into two unit tests, one for each function? On the face of it, that looks like how it should be done. This, however, raises the question: what's the order of test execution in unittest? And how does one pass values between unit tests? Should I modify 'self' in a unit test? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
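For what it's worth, unittest runs the test methods of a TestCase in alphabetical order, but tests should not depend on that or pass values to each other; shared fixtures belong in setUp(), which runs before each test method. A sketch of the split (assuming the same ma module and data file as above):

    import unittest

    class TestExtraction(unittest.TestCase):
        def setUp(self):
            # executed anew before *every* test method, so no state leaks
            self.valist = ma.extrfromfile('loadavg_unittest.txt')

        def test_extrfromfile(self):
            self.assertEqual(self.valist[0],
                             {'day': '08-11-19', 'time': '12:41', 'val': 0.11})

        def test_extrvalues(self):
            vlextr = ma.extrvalues(self.valist)
            self.assertAlmostEqual(vlextr[0], 0.11)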
Building matplotlib from source on windows
Hello everyone, I'm trying to get matplotlib 0.98.5.2 installed on Windows to use with Python 2.6 (dependency packages I need are on that version, long story, etc). When I was trying to build it (python setup.py build), it was finding the VC 9.0 C++ compiler on my machine. However, after adding the necessary packages (zlib, png, etc), it was reporting a missing 'unistd.h'. Clearly, this means it was meant to be built with a GCC for Windows like MinGW? I have uninstalled the VC compiler, installed the GnuWin32 packages and tried using MinGW (passing --compiler=mingw32 to python setup.py build), but now the compilation process fails like this:

    c:\MinGW\bin\g++.exe -mno-cygwin -shared -s
        build\temp.win32-2.6\Release\src\ft2font.o
        build\temp.win32-2.6\Release\src\mplutils.o
        build\temp.win32-2.6\Release\cxx\cxxsupport.o
        build\temp.win32-2.6\Release\cxx\cxx_extensions.o
        build\temp.win32-2.6\Release\cxx\indirectpythoninterface.o
        build\temp.win32-2.6\Release\cxx\cxxextensions.o
        build\temp.win32-2.6\Release\src\ft2font.def
        -LC:\Python26\libs -LC:\Python26\PCbuild
        -lfreetype -lz -lgw32c -lstdc++ -lm -lpython26 -lmsvcr90
        -o build\lib.win32-2.6\matplotlib\ft2font.pyd
    c:\MinGW\bin\..\lib\gcc\mingw32\3.4.5\..\..\..\..\mingw32\bin\ld.exe: cannot find -lgw32c
    collect2: ld returned 1 exit status
    error: command 'g++' failed with exit status 1

What the heck is lgw32c?? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Arguments for map'ped functions
Hello everyone, So I have this function I want to map onto a list of sequences of *several* arguments (while I would want to pass those arguments to each function in the normal fashion). I realize this is contrived; maybe an example will make it clear:

    params = [('comp.lang.python', ['rtfm', 'shut']),
              ('comp.lang.perl', ['rtfm', 'shut'])]
    qurls = map(consqurls, params)

    def consqurls(args):
        ggroup, gkeywords = args
        ...

Since the argument passed to a map'ped function is a single tuple of arguments, I have to unpack them in the function. So the question is how to pass several arguments to a map'ped function here? Code in question:

    def fillurlfmt(args):
        urlfmt, ggroup, gkw = args
        return {'group': ggroup, 'keyword': gkw, 'url': urlfmt % (gkw, ggroup)}

    def consqurls(args):
        ggroup, gkeywords = args
        urlfmt = 'http://groups.google.com/groups/search?as_q=%s&as_epq=&as_oq=&as_eq=&num=10&scoring=&lr=&as_sitesearch=&as_qdr=&as_drrb=b&as_mind=1&as_minm=1&as_miny=1999&as_maxd=1&as_maxm=1&as_maxy=2009&as_ugroup=%s&as_usubject=&as_uauthors=&safe=off'
        qurls = map(fillurlfmt, [(urlfmt, ggroup, gkw) for gkw in gkeywords])
        return qurls

    if __name__ == "__main__":
        gkeywords = ['rtfm', 'shut']
        ggroups = ['comp.lang.python', 'comp.lang.perl']
        params = [(ggroup, gkeywords) for ggroup in ggroups]
        qurls = map(consqurls, params)
        print qurls

Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
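itertools.starmap (or map with a lambda that unpacks) passes each tuple as separate arguments, so the function can keep a normal signature; a sketch (not from the original post):

    from itertools import starmap

    def consqurls(ggroup, gkeywords):          # ordinary two-argument signature
        return [(ggroup, gkw) for gkw in gkeywords]

    params = [('comp.lang.python', ['rtfm', 'shut']),
              ('comp.lang.perl',   ['rtfm', 'shut'])]
    qurls = list(starmap(consqurls, params))   # calls consqurls(*t) for each t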
Flattening lists
Hello everybody, Any better solution than this?

    def flatten(x):
        res = []
        for el in x:
            if isinstance(el, list):
                res.extend(flatten(el))
            else:
                res.append(el)
        return res

    a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
    print flatten(a)

    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: Flattening lists
Brian Allen Vanderburg II wrote:
>> def flatten(x):
>>     res = []
>>     for el in x:
>>         if isinstance(el, list):
>>             res.extend(flatten(el))
>>         else:
>>             res.append(el)
>>     return res
>
> I think it may be just a 'little' more efficient to do this:
>
> def flatten(x, res=None):
>     if res is None:
>         res = []
>     for el in x:
>         if isinstance(el, (tuple, list)):
>             flatten(el, res)
>         else:
>             res.append(el)
>     return res

Hmm, why should it be more efficient? The extend operation should not be very costly? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: Flattening lists
Baolong zhen wrote:
> less list creation.

At the cost of doing this at each 'flatten' call:

    if res is None:
        res = []

The number of times the above code executes is the same as the number of list creations (once per 'flatten' call, obviously). Is list creation really more costly than the above? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Is c.l.py becoming less friendly?
(duck)

    542 comp.lang.python rtfm
    467 comp.lang.python shut+up
    263 comp.lang.perl rtfm
     45 comp.lang.perl shut+up

Code:

    import urllib2
    import re
    import time

    def fillurlfmt(args):
        urlfmt, ggroup, gkw = args
        return {'group': ggroup, 'keyword': gkw, 'url': urlfmt % (gkw, ggroup)}

    def consqurls(args):
        ggroup, gkeywords = args
        urlfmt = 'http://groups.google.com/groups/search?as_q=%s&as_epq=&as_oq=&as_eq=&num=10&scoring=&lr=&as_sitesearch=&as_drrb=q&as_qdr=&as_mind=1&as_minm=1&as_miny=1999&as_maxd=1&as_maxm=1&as_maxy=2009&as_ugroup=%s&as_usubject=&as_uauthors=&safe=off'
        qurls = map(fillurlfmt, [(urlfmt, ggroup, gkw) for gkw in gkeywords])
        return qurls

    def flatten_list(x):
        res = []
        for el in x:
            if isinstance(el, list):
                res.extend(flatten_list(el))
            else:
                res.append(el)
        return res

    def ggsearch(urldict):
        opener = urllib2.build_opener()
        opener.addheaders = [('User-agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.20) Gecko/20081217 (CK-IBM) Firefox/2.0.0.20')]
        time.sleep(0.1)
        urlf = opener.open(urldict['url'])
        resdict = {'result': urlf.read()}
        resdict.update(urldict)
        urlf.close()
        return resdict

    def extrclosure(resregexp, groupno):
        def extrres(resdict):
            txtgr = resregexp.search(resdict['result'])
            resdict['result'] = txtgr.group(groupno)
            return resdict
        return extrres

    def delcomma(x):
        x['result'] = x['result'].replace(',', '')
        return x

    if __name__ == "__main__":
        gkeywords = ['rtfm', 'shut+up']
        ggroups = ['comp.lang.python', 'comp.lang.perl']
        params = [(ggroup, gkeywords) for ggroup in ggroups]
        qurls = map(consqurls, params)
        qurls = flatten_list(qurls)
        gresults = map(ggsearch, qurls)
        resre = re.compile('Results <b>1</b> - <b>.+?</b> of about <b>(.+?)</b>')
        gextrsearchresult = extrclosure(resre, 1)
        gresults = map(gextrsearchresult, gresults)
        gresults = map(delcomma, gresults)
        for el in gresults:
            print el['result'], el['group'], el['keyword']
        print

This was inspired by http://mail.python.org/pipermail/python-list/2002-November/172466.html Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: Flattening lists
Mark Dickinson wrote: I often find myself needing a 'concat' method that turns a list of lists (or iterable of iterables) into a single list; itertools.chain does this quite nicely. But I don't think I've ever encountered a need for the full recursive version. You're most probably right in this; however, my main goal here was finding 'more Pythonic' way of doing this and learning this way rather than the practical purpose of flattening deeply nested lists. Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: Flattening lists
Michele Simionato wrote:
> Looks fine to me. In some situations you may also use hasattr(el, '__iter__') instead of isinstance(el, list) (it depends if you want to flatten generic iterables or only lists).

Thanks! Such stuff is what I'm looking for. Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
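Following that suggestion, a variant that flattens arbitrary iterables while leaving strings alone (a sketch, not from the thread; in Python 2, str has no __iter__ attribute, so strings are treated as atoms):

    def flatten(x):
        res = []
        for el in x:
            if hasattr(el, '__iter__'):    # lists, tuples, generators, ...
                res.extend(flatten(el))
            else:
                res.append(el)
        return res

    print flatten([1, (2, 3), [[4, 5], 'ab']])   # -> [1, 2, 3, 4, 5, 'ab']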
Re: Flattening lists
Brian Allen Vanderburg II wrote:
>> Is list creation really more costly than above?
> Probably not. I wrote a small test program using a list several levels deep, each list containing 5 sublists at each level and finally just a list of numbers. Flattening 1000 times took about 3.9 seconds for the one creating a list at each level, and 3.2 for the one not creating the list at each level.

Hmm, I'm surprised by even that! Apparently list creation is more expensive than I thought: it seems somewhat more expensive than the cost of interpreting the bytecode for "if var is None". Either list creation is somewhat costly, or "if var is None" is really cheap. Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
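Both costs are easy to measure directly with timeit (a sketch; absolute numbers vary by machine):

    import timeit

    # cost of creating an empty list, per million executions
    print timeit.Timer("x = []").timeit()
    # cost of the None test, per million executions
    print timeit.Timer("if x is None: pass", "x = 1").timeit()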
SOAP client
Hi, I'm trying to consume a SOAP web service using Python. So far I have found two libraries: SOAPpy and ZSI. Both of them rely on PyXML which is no longer maintained (and there is no build for 64bit Windows and the setup.py doesn't seem to know how to build it on Windows). Is there a live SOAP library for Python? I'm sort of surprised that a common standard like SOAP is not more actively supported in Python. I'm probably missing something? Thanks, m -- http://mail.python.org/mailman/listinfo/python-list
Re: SOAP client
On Feb 11, 11:20 am, Robin wrote:
> On Feb 11, 3:33 pm, mk wrote:
> > Hi,
> > I'm trying to consume a SOAP web service using Python. So far I have
> > found two libraries: SOAPpy and ZSI. Both of them rely on PyXML which
> > is no longer maintained (and there is no build for 64bit Windows and
> > the setup.py doesn't seem to know how to build it on Windows). Is
> > there a live SOAP library for Python? I'm sort of surprised that a
> > common standard like SOAP is not more actively supported in Python.
> > I'm probably missing something?
> > Thanks,
> > m
>
> For consuming services, I've found suds to be the best client:
> https://fedorahosted.org/suds/
>
> For creating them, I've been using soaplib:
> http://trac.optio.webfactional.com/
>
> HTH,
> Robin

Yes, thank you, Suds worked for me. Kind of weird how it is not in the google results for obvious searches. -- http://mail.python.org/mailman/listinfo/python-list
CSV readers and UTF-8 files
Hello everyone, Is it just me, or do csv reader/DictReader and UTF-8 files not work correctly in Python 2.6.1 (Windows)? That is, when I open a UTF-8 file in a csv reader (after passing a plain file object), I get fields as plain strings ('str'). Since this has been mangled, I can't get the non-ascii characters back. When I do:

    csvfo = codecs.open(csvfname, 'rb', 'utf-8')
    dl = csv.excel
    dl.delimiter = ';'
    #rd = csv.DictReader(csvfo, dialect=dl)
    rd = csv.reader(csvfo, dialect=dl)

...I get plain strings as well (I get <type 'str'> when calling type(field)), on top of this error:

    Traceback (most recent call last):
      File "C:/Python26/converter3.py", line 99, in <module>
        fill_sqla(session, columnlist, rd)
      File "C:/Python26/converter3.py", line 73, in fill_sqla
        for row in rd:
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u0144' in position 74: ordinal not in range(128)

...when doing:

    for row in rd:
        ...

Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
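The csv module in Python 2 operates on byte strings, so the usual workaround (a sketch along the lines of the recipe in the csv module docs; names here are illustrative) is to feed it a plain binary file and decode the fields afterwards, rather than wrapping the file with codecs.open:

    import csv

    def utf8_csv_reader(byte_file, **kwargs):
        # csv parses UTF-8 bytes fine as long as the delimiters are ASCII;
        # decode each cell to unicode only after parsing
        for row in csv.reader(byte_file, **kwargs):
            yield [cell.decode('utf-8') for cell in row]

    # usage:
    # for row in utf8_csv_reader(open('data.csv', 'rb'), delimiter=';'):
    #     print row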
multiprocessing vs thread performance
Hello everyone, After reading http://www.python.org/dev/peps/pep-0371/ I was under the impression that the performance of the multiprocessing package is similar to that of thread / threading. However, to familiarize myself with both packages I wrote my own test of spawning and joining 100,000 empty threads or processes (while maintaining at most 100 active processes / threads at any one time), respectively. The results I got are very different from the benchmark quoted in PEP 371. On a twin Xeon machine the threaded version executed in 5.54 secs, while the multiprocessing version took over 222 secs to complete! Am I doing smth wrong in the code below? Or do I have to use multiprocessing.Pool to get any decent results?

    # multithreaded version
    #!/usr/local/python2.6/bin/python
    import thread
    import time

    class TCalc(object):

        def __init__(self):
            self.tactivnum = 0
            self.reslist = []
            self.tid = 0
            self.tlock = thread.allocate_lock()

        def testth(self, tid):
            if tid % 1000 == 0:
                print "== Thread %d working ==" % tid
            self.tlock.acquire()
            self.reslist.append(tid)
            self.tactivnum -= 1
            self.tlock.release()

        def calc_100thousand(self):
            tid = 1
            while tid <= 100000:
                while self.tactivnum > 99:
                    time.sleep(0.01)
                self.tlock.acquire()
                self.tactivnum += 1
                self.tlock.release()
                t = thread.start_new_thread(self.testth, (tid,))
                tid += 1
            while self.tactivnum > 0:
                time.sleep(0.01)

    if __name__ == "__main__":
        tc = TCalc()
        tstart = time.time()
        tc.calc_100thousand()
        tend = time.time()
        print "Total time: ", tend-tstart

    # multiprocessing version
    #!/usr/local/python2.6/bin/python
    import multiprocessing
    import time

    def testp(pid):
        if pid % 1000 == 0:
            print "== Process %d working ==" % pid

    def palivelistlen(plist):
        pll = 0
        for p in plist:
            if p.is_alive():
                pll += 1
            else:
                plist.remove(p)
                p.join()
        return pll

    def testp_100thousand():
        pid = 1
        proclist = []
        while pid <= 100000:
            while palivelistlen(proclist) > 99:
                time.sleep(0.01)
            p = multiprocessing.Process(target=testp, args=(pid,))
            p.start()
            proclist.append(p)
            pid += 1
        print "=== Main thread waiting for all processes to finish ==="
        for p in proclist:
            p.join()

    if __name__ == "__main__":
        tstart = time.time()
        testp_100thousand()
        tend = time.time()
        print "Total time:", tend - tstart

-- http://mail.python.org/mailman/listinfo/python-list
Re: multiprocessing vs thread performance
janislaw wrote:
> Ah, so there are 100 processes at a time. 200 secs still don't sound strange.

I ran the PEP 371 code on my system (Linux) on Python 2.6.1:

    Linux SLES (9.156.44.174) [15:18]
    root ~/tmp/src # ./run_benchmarks.py empty_func.py
    Importing empty_func
    Starting tests ...
    non_threaded (1 iters)  0.000005 seconds
    threaded (1 threads)    0.000235 seconds
    processes (1 procs)     0.002607 seconds
    non_threaded (2 iters)  0.000006 seconds
    threaded (2 threads)    0.000461 seconds
    processes (2 procs)     0.004514 seconds
    non_threaded (4 iters)  0.000008 seconds
    threaded (4 threads)    0.000897 seconds
    processes (4 procs)     0.008557 seconds
    non_threaded (8 iters)  0.000010 seconds
    threaded (8 threads)    0.001821 seconds
    processes (8 procs)     0.016950 seconds

This is very different from PEP 371. It appears that the PEP 371 code was written on Mac OS X. The conclusion I get from comparing the above costs is that OS X must have a very low cost of creating a process, at least compared to Linux, not that multiprocessing is a viable alternative to the thread / threading module. :-( -- http://mail.python.org/mailman/listinfo/python-list
Re: multiprocessing vs thread performance
Christian Heimes wrote: mk wrote: Am I doing smth wrong in code below? Or do I have to use multiprocessing.Pool to get any decent results? You have missed an important point. A well designed application does neither create so many threads nor processes. Except I was not developing "well designed application" but writing the test the goal of which was measuring the thread / process creation cost. The creation of a thread or forking of a process is an expensive operation. Sure. The point is, how expensive? While still being relatively expensive, it turns out that in Python creating a thread is much, much cheaper than creating a process via multiprocessing on Linux, while this seems to be not necessarily true on Mac OS X. You should use a pool of threads or processes. Probably true, except, again, that was not quite the point of this exercise.. The limiting factor is not the creation time but the communication and synchronization overhead between multiple threads or processes. Which I am probably going to test as well. -- http://mail.python.org/mailman/listinfo/python-list
Re: multiprocessing vs thread performance
Jarkko Torppa wrote:
> On the PEP371 it says "All benchmarks were run using the following: Python 2.5.2 compiled on Gentoo Linux (kernel 2.6.18.6)"

Right... I overlooked that. The tests I quoted above were done on SLES 10, kernel 2.6.5.

> With python2.5 and pyProcessing-0.52
> iTaulu:src torppa$ python2.5 run_benchmarks.py empty_func.py
> Importing empty_func
> Starting tests ...
> non_threaded (1 iters)  0.000003 seconds
> threaded (1 threads)    0.000143 seconds
> processes (1 procs)     0.002794 seconds
> non_threaded (2 iters)  0.000004 seconds
> threaded (2 threads)    0.000277 seconds
> processes (2 procs)     0.004046 seconds
> non_threaded (4 iters)  0.000005 seconds
> threaded (4 threads)    0.000598 seconds
> processes (4 procs)     0.007816 seconds
> non_threaded (8 iters)  0.000008 seconds
> threaded (8 threads)    0.001173 seconds
> processes (8 procs)     0.015504 seconds

There's smth wrong with the numbers posted in the PEP. This is what I got on a 4-socket Xeon (+ HT) with Python 2.6.1 on Debian (Etch), with the kernel upgraded to 2.6.22.14:

    non_threaded (1 iters)  0.000004 seconds
    threaded (1 threads)    0.000159 seconds
    processes (1 procs)     0.001067 seconds
    non_threaded (2 iters)  0.000005 seconds
    threaded (2 threads)    0.000301 seconds
    processes (2 procs)     0.001754 seconds
    non_threaded (4 iters)  0.000006 seconds
    threaded (4 threads)    0.000581 seconds
    processes (4 procs)     0.003906 seconds
    non_threaded (8 iters)  0.000009 seconds
    threaded (8 threads)    0.001148 seconds
    processes (8 procs)     0.008178 seconds

-- http://mail.python.org/mailman/listinfo/python-list
thread, multiprocessing: communication overhead
Hello everyone, This time I decided to test the communication overhead in multithreaded / multiprocess communication. The results are rather disappointing; that is, the communication overhead seems to be very high. In each of the following functions, I send 10,000 numbers to one function / 10 threads / 10 processes, which simply return them in their respective ways:

    Function: notfun          Best: 0.00622 sec   Average: 0.00633 sec
    (simple function)
    Function: threadsemfun    Best: 0.64428 sec   Average: 0.64791 sec
    (10 threads synchronizing using a semaphore)
    Function: threadlockfun   Best: 0.66288 sec   Average: 0.66453 sec
    (10 threads synchronizing using locks)
    Function: procqueuefun    Best: 1.16291 sec   Average: 1.17217 sec
    (10 processes communicating with the main process using queues)
    Function: procpoolfun     Best: 1.18648 sec   Average: 1.19577 sec
    (a pool of 10 processes)

If I'm doing smth wrong in the code below (smth that would make performance suffer), please point it out. Code:

    import threading
    import multiprocessing
    import time
    import timeit

    def time_fun(fun):
        t = timeit.Timer(stmt=fun, setup="from __main__ import " + fun.__name__)
        results = t.repeat(repeat=10, number=1)
        best_result = min(results)
        avg = sum(results) / len(results)
        print "Function: %-15s Best: %5.5f sec Average: %5.5f sec" % (
            fun.__name__, best_result, avg)

    def notfun():
        inputlist = range(0, 10000)
        reslist = []
        for x in range(len(inputlist)):
            reslist.append(inputlist.pop())

    def threadsemfun():
        def tcalc(sem, inputlist, reslist, tid, activitylist):
            while len(inputlist) > 0:
                sem.acquire()
                try:
                    x = inputlist.pop()
                except IndexError:
                    sem.release()
                    return
                #activitylist[tid] += 1
                reslist.append(x)
                sem.release()
        inputlist = range(0, 10000)
        #print "before: ", sum(inputlist)
        reslist = []
        tlist = []
        activitylist = [0 for x in range(0, 10)]
        sem = threading.Semaphore()
        for t in range(0, 10):
            tlist.append(threading.Thread(target=tcalc,
                args=(sem, inputlist, reslist, t, activitylist)))
        for t in tlist:
            t.start()
        for t in tlist:
            t.join()
        #print "after: ", sum(reslist)
        #print "thread action count:", activitylist

    def threadlockfun():
        def tcalc(lock, inputlist, reslist, tid, activitylist):
            while True:
                lock.acquire()
                if len(inputlist) == 0:
                    lock.release()
                    return
                x = inputlist.pop()
                reslist.append(x)
                #activitylist[tid] += 1
                lock.release()
        inputlist = range(0, 10000)
        #print "before: ", sum(inputlist)
        reslist = []
        tlist = []
        activitylist = [0 for x in range(0, 10)]
        sem = threading.Semaphore()
        for t in range(0, 10):
            tlist.append(threading.Thread(target=tcalc,
                args=(sem, inputlist, reslist, t, activitylist)))
        for t in tlist:
            t.start()
        for t in tlist:
            t.join()
        #print "after: ", sum(reslist)
        #print "thread action count:", activitylist

    def pf(x):
        return x

    def procpoolfun():
        pool = multiprocessing.Pool(processes=10)
        inputlist = range(0, 10000)
        reslist = []
        i, j, jmax = 0, 10, len(inputlist)
        #print "before: ", sum(inputlist)
        while j <= jmax:
            res = pool.map_async(pf, inputlist[i:j])
            reslist.extend(res.get())
            i += 10
            j += 10
        #print "after: ", sum(reslist)

    def procqueuefun():
        def pqf(qin, qout):
            pid = multiprocessing.current_process().pid
            while True:
                x = qin.get()
                if x == 'STOP':
                    return
                qout.put((pid, x))
        qin = multiprocessing.Queue()
        qout = multiprocessing.Queue()
        plist = []
        activity = dict()
        for i in range(0, 10):
            p = multiprocessing.Process(target=pqf, args=(qin, qout))
            p.start()
            plist.append(p)
            activity[p.pid] = 0
        inputlist = range(0, 10000)
        reslist = []
        #print "before:", sum(inputlist)
        ilen = len(inputlist)
        x = 0
        while x != ilen:
            for i in range(0, 10):
                qin.put(inputlist[x+i])
            for i in range(0, 10):
                # [the original post was cut off at this point; what follows
                # is a plausible completion, not the poster's exact code]
                pid, res = qout.get()
                reslist.append(res)
                activity[pid] += 1
            x += 10
        for i in range(0, 10):
            qin.put('STOP')
        for p in plist:
            p.join()

-- http://mail.python.org/mailman/listinfo/python-list
Re: thread, multiprocessing: communication overhead
Aaron Brady wrote:
(snips)
> def threadsemfun():
>     sem = threading.Semaphore()
>
> def threadlockfun():
>     sem = threading.Semaphore()
>
> You used a Semaphore for both lock objects here.

Right... I corrected that (simply changed it to threading.Lock() in threadlockfun) and the result is much better, though still an order of magnitude worse than a plain function:

    Function: threadlockfun   Best: 0.08665 sec   Average: 0.08910 sec
    Function: notfun          Best: 0.00987 sec   Average: 0.01003 sec

> 'multiprocessing' is a really high level layer that makes a lot of decisions about trade-offs, has highly redundant communication, and is really easy to use. If you want to save a byte, you'll have to make your own decisions about trade-offs and redundancies (possibly even looking at real result data to make them).

Hmm, do you think that the lower-level 'thread' module might work more efficiently?

> I actually think 'multiprocessing' is really good, and even if I hand-wrote my own IPC, it would be slower! CMIIW, but I believe your timing function includes the time to launch the actual processes and threads, create the synch. objects, etc. You might try it again, creating them first, starting the timer, then loading them.

Except I don't know how to do that using timeit.Timer. :-/ -- http://mail.python.org/mailman/listinfo/python-list
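One way to exclude the creation cost with timeit is to build the expensive objects in the Timer's setup string, which runs outside the timed statement; a sketch of a pool-based variant (illustrative, not code from the thread):

    import timeit

    t = timeit.Timer(
        stmt="reslist = pool.map(pf, inputlist)",
        setup="from __main__ import pf\n"
              "import multiprocessing\n"
              "pool = multiprocessing.Pool(processes=10)\n"
              "inputlist = range(0, 10000)")
    # setup is re-executed once per repetition, but its cost (including
    # pool creation) is never counted in the measured time
    print min(t.repeat(repeat=10, number=1))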
Re: How to Delete a Cookie?
tryg.ol...@gmail.com wrote: Hello - I managed to get a cookie set. Now I want to delete it but it is not working. Why struggle with this manually? Isn't it better to learn a bit of framework like Pylons and have it all done for you (e.g. in Pylons you have response.delete_cookie method)? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: Implementing file reading in C/Python
Johannes Bauer wrote: Which takes about 40 seconds. I want the niceness of Python but a little more speed than I'm getting (I'd settle for factor 2 or 3 slower, but factor 30 is just too much). This probably doesn't contribute much, but have you tried using Python profiler? You might have *something* wrong that eats up a lot of time in the code. The factor of 30 indeed does not seem right -- I have done somewhat similar stuff (calculating Levenshtein distance [edit distance] on words read from very large files), coded the same algorithm in pure Python and C++ (using linked lists in C++) and Python version was 2.5 times slower. Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Naming conventions for regular variables
http://www.python.org/dev/peps/pep-0008/ "Function Names Function names should be lowercase, with words separated by underscores as necessary to improve readability." However, this PEP does not recommend any particular style for naming regular (local) variables. Personally I like "mixedCase", however using both styles looks a bit weird: def some_function_name(someVariableName, anotherVar) Naming both functions and variables using "lowercase with underscore separator between words for readability" leads to confusing function names with variable names - and distinguishing between variables and function names just by looking at them would be nice. Recommendations? -- http://mail.python.org/mailman/listinfo/python-list
(silly?) speed comparisons
Out of curiosity I decided to make some speed comparisons of the same algorithm in Python and C++. Moving slices of lists of strings around seemed like a good test case. Python code:

    def move_slice(list_arg, start, stop, dest):
        frag = list_arg[start:stop]
        if dest > stop:
            idx = dest - (stop - start)
        else:
            idx = dest
        del list_arg[start:stop]
        list_arg[idx:idx] = frag
        return list_arg

    b = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

    >>> import timeit
    >>> t = timeit.Timer("move_slice.move_slice(move_slice.b, 4, 6, 7)", "import move_slice")
    >>> t.timeit()
    3.879252810063849

(Python 2.5, Windows) Implementing the same algorithm in C++:

    #include <iostream>
    #include <vector>
    #include <string>

    using namespace std;

    vector<string> move_slice(vector<string> vec, int start, int stop, int dest)
    {
        int idx = stop - start;
        vector<string> frag;
        // copy a fragment of vector
        for (idx = start; idx < stop; idx++)
            frag.push_back(vec.at(idx));
        if (dest > stop)
            idx = dest - (stop - start);
        else
            idx = dest;
        // delete the corresponding fragment of orig vector
        vec.erase(vec.begin() + start, vec.begin() + stop);
        // insert the frag in proper position
        vec.insert(vec.begin() + idx, frag.begin(), frag.end());
        return vec;
    }

    int main(int argc, char* argv[])
    {
        vector<string> slice;
        string u = "abcdefghij";
        int pos;
        for (pos = 0; pos < u.length(); pos++)
            slice.push_back(u.substr(pos, 1));
        int i;
        for (i = 0; i < 1000000; i++)
            move_slice(slice, 4, 6, 7);
    }

Now this is my first take at vectors in C++, so it's entirely possible that an experienced coder would implement it in a more efficient way. Still, vectors of strings seemed like a fair choice: after all, the Python version is operating on similarly versatile objects. But I was still rather surprised to see that the C++ version took 15% longer to execute!

    (vector<string>, 4, 6, 7)
    $ time slice
    real    0m4.478s
    user    0m0.015s
    sys     0m0.031s

Compiler: MinGW32/gcc 3.4.5, with -O2 optimization (not cygwin's gcc, which for some reason seems to produce sluggish code). When I changed moving the slice closer to the end of the list / vector, the Python version executed even faster:

    >>> t = timeit.Timer("move_slice.move_slice(move_slice.b, 6, 7, 7)", "import move_slice")
    >>> t.timeit()
    1.609766883779912

C++:

    (vector<string>, 6, 7, 7)
    $ time slice.exe
    real    0m3.786s
    user    0m0.015s
    sys     0m0.015s

Now the C++ version took over twice the time to execute in comparison to the Python time! Am I comparing apples to oranges? Should the implementations be different? Or does the MinGW compiler simply suck? Note: it appears that the speed of Python lists falls off quickly the closer to the list beginning one deletes or inserts elements. C++ vectors do not seem to be as heavily position-dependent. -- http://mail.python.org/mailman/listinfo/python-list
Re: (silly?) speed comparisons
Rajanikanth Jammalamadaka wrote:
> Try using a list instead of a vector for the C++ version.

Well, it's even slower:

    $ time slice4
    real    0m4.500s
    user    0m0.015s
    sys     0m0.015s

Time of execution of the vector version (using a reference to a vector):

    $ time slice2
    real    0m2.420s
    user    0m0.015s
    sys     0m0.015s

Still slower than Python! Source, using lists in C++: slice4.c++

    #include <iostream>
    #include <list>
    #include <string>

    using namespace std;

    list<string> move_slice(list<string>& slist, int start, int stop, int dest)
    {
        int idx;
        if (dest > stop)
            idx = dest - (stop - start);
        else
            idx = dest;
        list<string> frag;
        int i;
        list<string>::iterator startiter;
        list<string>::iterator enditer;
        startiter = slist.begin();
        for (i = 0; i < start; i++)
            startiter++;
        enditer = startiter;
        // copy fragment
        for (i = start; i < stop; i++) {
            frag.push_back(*enditer);
            enditer++;
        }
        // delete frag from the slist
        slist.erase(startiter, enditer);
        // insert frag into slist at idx
        startiter = slist.begin();
        for (i = 0; i < idx; i++)
            startiter++;
        slist.insert(startiter, frag.begin(), frag.end());
        /* cout << "frag " << endl;
        for (startiter = frag.begin(); startiter != frag.end(); startiter++)
            cout << *startiter << " ";
        cout << endl;
        cout << "slist " << endl;
        for (startiter = slist.begin(); startiter != slist.end(); startiter++)
            cout << *startiter << " ";
        cout << endl; */
        return slist;
    }

    int main(int argc, char* argv[])
    {
        list<string> slice;
        string u = "abcdefghij";
        int pos;
        for (pos = 0; pos < u.length(); pos++)
            slice.push_back(u.substr(pos, 1));
        int i;
        for (i = 0; i < 1000000; i++)
            move_slice(slice, 6, 7, 7);
    }

Source, using a reference to a vector: slice2.c++

    #include <iostream>
    #include <vector>
    #include <string>

    using namespace std;

    vector<string> move_slice(vector<string>& vec, int start, int stop, int dest)
    {
        int idx = stop - start;
        vector<string> frag;
        for (idx = start; idx < stop; idx++)
            frag.push_back(vec.at(idx));
        if (dest > stop)
            idx = dest - (stop - start);
        else
            idx = dest;
        vec.erase(vec.begin() + start, vec.begin() + stop);
        vec.insert(vec.begin() + idx, frag.begin(), frag.end());
        return vec;
    }

    int main(int argc, char* argv[])
    {
        vector<string> slice;
        string u = "abcdefghij";
        int pos;
        for (pos = 0; pos < u.length(); pos++)
            slice.push_back(u.substr(pos, 1));
        int i;
        for (i = 0; i < 1000000; i++)
            move_slice(slice, 6, 7, 7);
    }

-- http://mail.python.org/mailman/listinfo/python-list
Re: (silly?) speed comparisons
Maric Michaud wrote:
> Le Wednesday 09 July 2008 12:35:10 mk, vous avez écrit :
> > vector<string> move_slice(vector<string>& vec, int start, int stop, int dest)
> I guess the point is to make a vector of references to string if you don't want to copy string objects all around but just a word for an address each time. The signature should be:
> vector<string*> move_slice(vector<string*>& vec, int start, int stop, int dest)
> or
> vector<string*>& move_slice(vector<string*>& vec, int start, int stop, int dest)

That matters too, but I just found that the main culprit was _returning the list instead of returning a reference to the list_. The difference is staggering: some 25 sec vs 0.2 sec:

    $ time slice6
    real    0m0.191s
    user    0m0.015s
    sys     0m0.030s

    #include <iostream>
    #include <list>
    #include <string>

    using namespace std;

    list<string*>& move_slice(list<string*>& slist, int start, int stop, int dest)
    {
        int idx;
        if (dest > stop)
            idx = dest - (stop - start);
        else
            idx = dest;
        int i;
        list<string*>::iterator startiter;
        list<string*>::iterator enditer;
        list<string*>::iterator destiter;
        startiter = slist.begin();
        destiter = slist.begin();
        for (i = 0; i < start; i++)
            startiter++;
        enditer = startiter;
        for (i = start; i < stop; i++)
            enditer++;
        for (i = 0; i < dest; i++)
            destiter++;
        slist.splice(destiter, slist, startiter, enditer);
        /* cout << "frag " << endl;
        for (startiter = frag.begin(); startiter != frag.end(); startiter++)
            cout << *startiter << " ";
        cout << endl; */
        /* cout << " after: ";
        for (startiter = slist.begin(); startiter != slist.end(); startiter++)
            cout << *startiter << " ";
        cout << endl; */
        return slist;
    }

    int main(int argc, char* argv[])
    {
        list<string*> slice;
        string u = "abcdefghijabcdefghijabcdefghijabcdefghijabcdefghijabcdefghijabcdefghijabcdefghijabcdefghijabcdefghij";
        int pos;
        for (pos = 0; pos < u.length(); pos++)
            slice.push_back(new string(u));
        int i;
        //for (i = 0; i < 1000000; i++)
        /* list<string*>::iterator startiter;
        cout << "before: ";
        for (startiter = slice.begin(); startiter != slice.end(); startiter++)
            cout << *startiter << " ";
        cout << endl; */
        for (int i = 0; i < 1000000; i++)
            move_slice(slice, 4, 6, 7);
    }

-- http://mail.python.org/mailman/listinfo/python-list
Re: (silly?) speed comparisons
P.S. Java 1.6 rocks - I wrote equivalent version using ArrayList and it executed in 0.7s. -- http://mail.python.org/mailman/listinfo/python-list
Using SWIG to build C++ extension
Hello, I'm having terrible problems building a C++ extension to Python 2.4 using SWIG. I'd appreciate it if somebody knowledgeable on the subject took a look at it. swig-1.3.29, g++ (GCC) 4.1.1 20070105 (Red Hat 4.1.1-52). I used the following commands to build the C++ extension:

    # swig -c++ -python edit_distance.i
    # c++ -c edit_distance.c edit_distance_wrap.cxx edit_distance.cpp -I. -I/usr/include/python2.4

    Linux RH (9.156.44.105) root ~/tmp # c++ -c edit_distance.c edit_distance_wrap.cxx edit_distance.cpp -I. -I/usr/include/python2.4
    c++: edit_distance.cpp: No such file or directory
    edit_distance_wrap.cxx: In function 'PyObject* _wrap_edit_distance(PyObject*, PyObject*)':
    edit_distance_wrap.cxx:2579: error: 'string' was not declared in this scope
    edit_distance_wrap.cxx:2579: error: 'arg1' was not declared in this scope
    edit_distance_wrap.cxx:2580: error: 'arg2' was not declared in this scope
    edit_distance_wrap.cxx:2597: error: expected type-specifier before 'string'
    edit_distance_wrap.cxx:2597: error: expected `>' before 'string'
    edit_distance_wrap.cxx:2597: error: expected `(' before 'string'
    edit_distance_wrap.cxx:2597: error: expected primary-expression before '>' token
    edit_distance_wrap.cxx:2597: error: expected `)' before ';' token
    edit_distance_wrap.cxx:2605: error: expected type-specifier before 'string'
    edit_distance_wrap.cxx:2605: error: expected `>' before 'string'
    edit_distance_wrap.cxx:2605: error: expected `(' before 'string'
    edit_distance_wrap.cxx:2605: error: expected primary-expression before '>' token
    edit_distance_wrap.cxx:2605: error: expected `)' before ';' token

What's weird is that I _did_ use the std:: namespace prefix carefully in the code:

    #include <string>
    #include <vector>
    #include <algorithm>

    const unsigned int cost_del = 1;
    const unsigned int cost_ins = 1;
    const unsigned int cost_sub = 1;

    unsigned int edit_distance(std::string& s1, std::string& s2)
    {
        const size_t len1 = s1.length(), len2 = s2.length();
        std::vector<std::vector<unsigned int> > d(len1 + 1, std::vector<unsigned int>(len2 + 1));
        for (int i = 1; i <= len1; ++i)
            for (int j = 1; j <= len2; ++j)
                d[i][j] = std::min(d[i - 1][j] + 1,
                                   std::min(d[i][j - 1] + 1,
                                            d[i - 1][j - 1] + (s1[i - 1] == s2[j - 1] ? 0 : 1)));
        return d[len1][len2];
    }

Ok, anyway I fixed it in the generated code (edit_distance_wrap.cxx). It compiled to a .o file fine then. It linked to _edit_distance.so as well:

    # c++ -shared edit_distance_wrap.o -o _edit_distance.so

But now I get an import error in Python!

    Linux RH root ~/tmp # python
    Python 2.4.3 (#1, Dec 11 2006, 11:38:52)
    [GCC 4.1.1 20061130 (Red Hat 4.1.1-43)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import edit_distance
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "edit_distance.py", line 5, in ?
        import _edit_distance
    ImportError: ./_edit_distance.so: undefined symbol: _Z13edit_distanceRSsS_

What did I do to deserve this? :-) edit_distance.i file just in case:

    %module edit_distance
    %{
    #include "edit_distance.h"
    %}

    extern unsigned int edit_distance(string& s1, string& s2);

-- http://mail.python.org/mailman/listinfo/python-list
Re: Using SWIG to build C++ extension
And what's infuriating is that the .o files do contain the necessary symbol: # grep _Z13edit_distanceRSsS_ * Binary file edit_distance.o matches Binary file _edit_distance.so matches Binary file edit_distance_wrap.o matches -- http://mail.python.org/mailman/listinfo/python-list
Re: Using SWIG to build C++ extension
Hello Bas, Thanks, man! Your recipe worked on a Debian system, though not on RedHat, and I still have no idea why. :-) Anyway, I have it working. Thanks again. I took your example files and did the following: changed the #include "edit_distance.h" to #include "edit_distance.c" in the edit_distance.i file. Then I changed the first few lines of your function definition:

    unsigned int edit_distance( const char* c1, const char* c2 )
    {
        std::string s1( c1), s2( c2);

...and also adapted the signature in the edit_distance.i file. Then:

    swig -shadow -c++ -python edit_distance.i
    g++ -c -fpic -I/usr/include/python edit_distance_wrap.cxx
    g++ -shared edit_distance_wrap.o -o _edit_distance.so

-- http://mail.python.org/mailman/listinfo/python-list
Magic?
So I was playing around with properties and wrote this:

    class lstr(str):
        def __init__(self, initval):
            self._s = initval
            self._len = len(self._s)
        def fget_s(self):
            return str(self._s)
        def fset_s(self, val):
            self._s = val
            self._len = len(self._s)
        s = property(fget_s, fset_s)
        def fget_len(self):
            return self._len
        def fset_len(self, val):
            raise AttributeError, "Attribute is read-only."
        len = property(fget_len, fset_len)

I obviously aimed at defining setters and getters for the 's' and 'len' attributes via properties. However, it appears that somehow this object prints the value of the 's' attribute without me setting any specific methods to do that:

    >>> astr = lstr('abcdef')
    >>> astr
    'abcdef'
    >>> astr.swapcase()
    'ABCDEF'

How does it know to do that? I mean, I can understand how it knows to do that since I used property:

    >>> astr.s
    'abcdef'
    >>> vars(astr)
    {'_len': 6, '_s': 'abcdef'}

How does the instance know to use the _s value to return when the instance is called? Is this due to some trick handling of the overridden __init__ method (i.e. it knows to treat the initval argument somehow specially)? Some other way? If so, how? -- http://mail.python.org/mailman/listinfo/python-list
Re: Magic?
> However, it appears that somehow this object prints the value of the 's' attribute without me setting any specific methods to do that:
>     >>> astr = lstr('abcdef')
>     >>> astr
>     'abcdef'
>     >>> astr.swapcase()
>     'ABCDEF'

Correction: it doesn't really get the value from the _s attribute:

    >>> astr = lstr('abcdef')
    >>> astr.s
    'abcdef'
    >>> astr.s = 'xyzzy'
    >>> astr
    'abcdef'
    >>> astr.s
    'xyzzy'

So my updated question is: where does this lstr() instance keep the original value ('abcdef')? It obviously has smth to do with inheriting from the str class, but I don't get the details of the mechanism. -- http://mail.python.org/mailman/listinfo/python-list
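The short answer is that str is immutable, so the displayed value lives in the str part of the instance and is fixed by str.__new__ before __init__ ever runs; a sketch of hooking that step (not from the original post):

    class lstr(str):
        def __new__(cls, initval):
            # the string value itself is set here, once and for all
            return str.__new__(cls, initval)

        def __init__(self, initval):
            # by the time we get here, the str value is already frozen;
            # self._s is just an extra attribute in the instance __dict__
            self._s = initval

    astr = lstr('abcdef')
    astr._s = 'xyzzy'
    print astr       # still 'abcdef': comes from the str part, not from _s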
storing references instead of copies in a dictionary
Hello everyone, I'm storing functions in a dictionary (this is basically for cooking up my own fancy schmancy callback scheme, mainly for learning purposes):

    >>> def f2(arg):
    ...     return "f2 " + arg
    ...
    >>> def f1(arg):
    ...     return "f1" + arg
    ...
    >>> a = {'1': f1, '2': f2}
    >>> [ x[1](x[0]) for x in a.items() ]
    ['f11', 'f2 2']

Well, neat. Except if I change the function definitions now, the old functions are called. And rightly so:

    >>> a
    {'1': <function f1 at 0xb7f0ba04>, '2': <function f2 at ...>}
    >>> f1
    <function f1 at 0xb7f0ba04>
    >>> def f1(arg):
    ...     return "NEW f1 " + arg
    ...
    >>> f1
    <function f1 at 0xb7f0b994>

The address of function f1 has obviously changed on redefinition. Storing value copies in a dictionary on assignment is reasonable default behaviour. However, in this particular case I need to specifically store _references to objects_ (e.g. the f1 function), or should I say _labels_ (leading to objects)? Of course, I can simply update the dictionary with the new function definition. But I wonder, is there not a way _in general_ to specifically store references to functions/variables/first-class objects instead of copies in a dictionary? -- http://mail.python.org/mailman/listinfo/python-list
Re: storing references instead of copies in a dictionary
Calvin Spealman wrote:
> To your actual problem... Why do you wanna do this anyway? If you want to change the function in the dictionary, why don't you simply define the functions you'll want to use, and change the one you have bound to the key in the dictionary when you want to change it? In other words, define them all at once, and then just d['1'] = new_f1. What is wrong with that?

Well, basically nothing, except I need to remember to do that. Suppose one does that frequently in a program. It becomes tedious. I think I will define some helper function then:

    >>> def helper(fundict, newfun):
    ...     fundict[newfun.func_name] = newfun
    ...

_If_ there were some shorter and still "proper" way to do it, I'd use it. If not, no big deal. For completeness:

> def new_f1(arg):
>     return "NEW f1 " + arg
>
> f1.func_code = new_f1.func_code
>
> Don't use that unless you really have to and I nearly promise that you don't.

I promise I won't use it. :-) It seems like a 'wrong thing to do'. -- http://mail.python.org/mailman/listinfo/python-list
Re: storing references instead of copies in a dictionary
Uwe Schmitt wrote: Python stores references in dictionaries and does not copy ! (unless you explicitly use the copy module) ! In your case the entry in the dictionary is a reference to the same object which f1 references, that is the object at 0xb7f0ba04. If you now say "f1=...:" then f1 references a new object at 0xb7f0b994, and the entry in your dictionary still references the "old" object at 0xb7f0ba04. Erm, I theoretically knews that, I guess I suffered temporary insanity when I wrote "copies" of objects. To me it seems that Python actually stores _labels_ referencing _objects_. Here, the problem was that the label in the dictionary led to the "old" object. I do not know any method to automatically update your dictionary as there is no possibility to overload the assignement operator "=". But may be somebody can teach me a new trick :-) Theoretically I could define a class inheriting from dictionary and define a property with a setter (and getter). But I would still have to update the attribute manually, so plain dictionary is just as good. -- http://mail.python.org/mailman/listinfo/python-list
property getter with more than 1 argument?
It seems like the getter is defined in such a way that it is passed only 'self':

    class FunDict(dict):
        def __init__(self):
            self.fundict = dict()
        def fget(self, fun):
            return fundict[fun.func_name]
        def fset(self, newfun):
            self.fundict[newfun.func_name] = newfun
        newfun = property(fget, fset)

    >>> a = FunDict()
    >>> a.newfun = f1
    >>> a.newfun('f1')
    Traceback (most recent call last):
      File "<pyshell#...>", line 1, in <module>
        a.newfun('f1')
    TypeError: fget() takes exactly 2 arguments (1 given)

Is it possible to pass more than one argument to the fget function? I know: I can define a method with the property's name ('newfun' in the example) and call it with more arguments. But then I do not get the benefits of setters and properties in general! -- http://mail.python.org/mailman/listinfo/python-list
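A property getter only ever receives self; for parametrized lookup the usual route is __getitem__/__setitem__ rather than a property. A sketch (not from the original post):

    class FunDict(object):
        def __init__(self):
            self.fundict = {}

        def add(self, newfun):
            # setter-like: register the function under its own name
            self.fundict[newfun.func_name] = newfun

        def __getitem__(self, name):
            # "getter with an argument": d['f1'] instead of a property
            return self.fundict[name]

    def f1(arg):
        return "f1 " + arg

    d = FunDict()
    d.add(f1)
    print d['f1']('hello')    # -> "f1 hello"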
Re: storing references instead of copies in a dictionary
> def f2(arg):
>     return "f2 " + arg
>
> def f1(arg):
>     return "f1 " + arg
>
> a = {"1": "f1", "2": "f2"}
> print [eval(x[1])(x[0]) for x in a.items()]
>
> def f2(arg):
>     return "New f2 " + arg
>
> print [eval(x[1])(x[0]) for x in a.items()]

Neat trick, if probably dangerous in some circumstances. Anyway, thanks, I didn't think of that.

> Don't know if this is any use to you..

At least I learned something. :-) -- http://mail.python.org/mailman/listinfo/python-list
Re: Unusual Exception Behaviour
I have looked through the application for any unusual or bare try/except blocks that don’t actually do anything with the except just in case any of them are causing the issue but can’t seem to see any. Why not capture exceptions themselves to a log file? http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/466332 -- http://mail.python.org/mailman/listinfo/python-list
Re: Unusual Exception Behaviour
Robert Rawlins wrote:
> I certainly like that implementation for logging the exceptions, however, at the moment I don't even know where the exceptions are occurring, or what type they are, could I still use this method to log any and all exceptions raised in the application?

Sure.

> I'm a little confused as to how I can modify that implementation to do so.

Remember, Google is your friend; here's the crux of the method without the fancy schmancy sugar coating: http://linux.byexamples.com/archives/365/python-convey-the-exception-traceback-into-log-file/

    if __name__ == "__main__":
        try:
            main()
        except:
            print "Trigger Exception, traceback info forward to log file."
            traceback.print_exc(file=open("errlog.txt", "a"))
            sys.exit(1)

This will obviously capture every exception happening in the main() function. The drawback of this method is that you cannot capture the error message / object accompanying the exception, as you normally can when capturing a specific exception:

    >>> import sys
    >>> import traceback
    >>> def anindextoofar(alist):
    ...     print alist[10]
    ...
    >>> shortlist = range(1, 10)
    >>> try:
    ...     anindextoofar(shortlist)
    ... except IndexError, err:
    ...     print err
    ...     traceback.print_exc(file=sys.stdout)
    ...
    list index out of range
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
      File "<stdin>", line 2, in anindextoofar
    IndexError: list index out of range

-- http://mail.python.org/mailman/listinfo/python-list
Re: properly delete item during "for item in..."
Gary Herron wrote:
> You could remove the object from the list with del myList[i] if you knew i. HOWEVER, don't do that while looping through the list! Changing a list's length will interact badly with the for loop's indexing through the list, causing the loop to miss the element following the deleted item.

Jumping into the thread: I know how not to do it, but not how to do it properly? Iterating over a copy may _probably_ work:

    >>> t = ['a', 'c', 'b', 'd']
    >>> for el in t[:]:
    ...     del t[t.index(el)]
    ...
    >>> t
    []

However, is it really safe? Defining safe as "works reliably in every corner case for every indexable data type"? Con: suppose the data structure t is really, really big. Then just deleting some items from t temporarily doubles the memory consumption, because of the copy. -- http://mail.python.org/mailman/listinfo/python-list
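For plain filtering, building a new list (or rebinding the contents via a slice) avoids mutating during iteration entirely; a sketch (not from the thread):

    t = ['a', 'c', 'b', 'd']

    # rebinding via a slice keeps other references to the same list valid
    t[:] = [el for el in t if el != 'c']

    # or, if copying is too expensive, delete by index walking backwards,
    # so removals never shift the part of the list not yet visited
    for i in xrange(len(t) - 1, -1, -1):
        if t[i] == 'b':
            del t[i]

    print t    # -> ['a', 'd']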
Re: Unusual Exception Behaviour
http://linux.byexamples.com/archives/365/python-convey-the-exception-traceback-into-log-file/

if __name__=="__main__":
    try:
        main()
    except:
        print "Trigger Exception, traceback info forward to log file."
        traceback.print_exc(file=open("errlog.txt","a"))
        sys.exit(1)

I've just given this solution a shot but I still seem to get the same result: it suddenly starts dumping the log data to the command line, and nothing gets printed in error.txt apart from the keyboard interrupt from when I kill the application. That's seriously weird. What's your Python version and platform? On my Windows and Linux machines, with more recent Python versions, the above trick works flawlessly. Check your environment, namely the PYTHON* variables. There may be something there causing this behaviour. Unset them. Check the first line of your scripts. If you're calling the wrong Python interpreter (there may be more than one on the system for some reason), this may cause it. You could also try setting the PYTHONINSPECT environment variable or running the Python interpreter with the -i option before the program filename, which drops you into an interactive shell upon exception or termination of a program. For some reason the exceptions don't seem to be raised properly, so obviously they aren't being caught at this higher level. So confusing. This behaviour is seriously unusual for Python. Maybe you have some old / buggy version? -- http://mail.python.org/mailman/listinfo/python-list
Re: Unusual Exception Behaviour
Ok, just to add a little interest: when I comment out the configuration line for my logging, like so: #logging.config.fileConfig("/myapp/configuration/logging.conf") it appears to throw the exceptions as normal :-) :-s To tell the truth I have never used the logging module extensively, so I'm not an expert in this area and can't help you there. However, it seems to me that you may have stumbled upon some subtle bug / side effect in the logging module that interferes with exception propagation. Or perhaps it surfaces only in combination with glib? If you discover the root cause, please let us know on this ng; I'm also using Python extensions and bindings to other libraries and this could be of interest at least to me. -- http://mail.python.org/mailman/listinfo/python-list
getting with statement to deal with various exceptions
Hello, I'm trying to learn how the with statement can be used to avoid writing:

prepare()
try:
    something_that_can_raise_SomeException()
except SomeException, err:
    deal_with_SomeException
finally:
    tear_it_down()

Verbose, not very readable. OK, "with" to the rescue? Let's take a textbook example from the PEP:

with open('/etc/passwd', 'r') as f:
    BLOCK

Well, great, it's neat that "with" closes the file after BLOCK has finished execution, but what if the file isn't there and the attempt to open it raises IOError? I'd like to be able to catch it precisely to avoid writing verbose try: .. except: .. finally: .. blocks, which as I understand has been much of the rationale behind creating the "with" statement in the first place. The "with" statement only allows easy dealing with prepare() and tear_it_down(). When I try to get it to deal with exceptions that might happen, it becomes more complicated:

class FileContextManager:
    def __enter__(self, filename, mode):
        f = open(filename, mode)
        return f
    def __exit__(self, etype, evalue, etraceback):
        print "etype", etype, "evalue", evalue, "etraceback", etraceback

>>> with FileContextManager("somefile", "r") as f:
...     a = f.readlines()
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    with FileContextManager("somefile", "r") as f:
TypeError: this constructor takes no arguments

Bummer. Plus, no documentation I've read (on effbot, in PEP 343, etc.) says how to deal with an exception happening in the __enter__ method of a context manager, which is precisely what I'd like to do. This is only natural, isn't it? When e.g. reading a file, the preparation phase is typically checking if it can be opened in the first place. So it falls into the __enter__ method of the context manager. "with" limited only to successful execution of a_statement from "with a_statement" seems like limited benefit to me. I'd like to be able to write smth like:

class FileContextManager:
    def __enter__(self, filename, mode):
        f = open(filename, mode)
        return f
    def __except__(IOError, err):
        do_this
        print err
    def __except__(RuntimeError, err):
        do_that
        print "something bad happened", err
    def __exit__(self, etype, evalue, etraceback):
        print "etype", etype, "evalue", evalue, "etraceback", etraceback

__exit__ deals with exceptions happening in the BLOCK below the "with" statement, not with exceptions raised in "a_statement", when executing:

with a_statement as var:
    BLOCK

In the above way "with" would give me the benefit of more terse, but still understandable and efficient code. Well, I can always do this:

try:
    with open("somefile.txt") as txtfile:
        for line in txtfile:
            print line
except IOError:
    print "No such file."

No such file.

But that's just ugly, nested too many times (flat is better than nested, right?) and not all that more readable. -- http://mail.python.org/mailman/listinfo/python-list
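For what it's worth, the TypeError above is just the class taking no constructor arguments; the conventional shape puts the arguments in __init__ and gives __enter__ only self. A sketch of that shape - which still cannot catch a failure inside its own __enter__; that needs an enclosing handler, as the reply below shows:

class FileContextManager(object):
    def __init__(self, filename, mode):
        self.filename = filename
        self.mode = mode

    def __enter__(self):
        # if open() raises here, __exit__ is never called --
        # the exception propagates straight out of the with statement
        self.f = open(self.filename, self.mode)
        return self.f

    def __exit__(self, etype, evalue, etraceback):
        self.f.close()
        if etype is not None and issubclass(etype, IOError):
            print "IOError:", evalue
            return True    # suppress only this exception

with FileContextManager("somefile", "r") as f:
    a = f.readlines()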
Re: getting with statement to deal with various exceptions
from __future__ import with_statement

class ExceptionManager(object):
    def __enter__(self):
        pass
    def __exit__(self, exc_type, exc_value, tb):
        if exc_type == IOError:
            print 'IOError', exc_value[1]
            return True  # suppress it

with ExceptionManager():
    with open('test.txt') as f:
        f.read()

Neat. I think I'm going to use this approach, since BDFL-in-chief is unlikely to listen to me in any foreseeable future. :-) -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Written in C?
Grant Edwards wrote: Using punch-cards and paper-tape. Real programmers can edit their programs with a pointy stick and some home-made sticky-tape. Wrong! Real programmers can program using only a Turing machine (and something having to do with Post for some reason). I'm sure our brilliant OP[1] could program in both. [1] Troll, really. Don't feed the troll. I wouldn't have posted about him because that only adds to the noise. oops. -- http://mail.python.org/mailman/listinfo/python-list
Seriously, though, about LLVM
http://llvm.org/ This project has gained some publicity. There's IronPython, right, so has anybody thought about implementing Python using LLVM as a backend, as it seems not out of the question at all? -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Written in C?
Who cares what language a language is written in as long as you can be productive - which you certainly can be in Python. Seriously, though, would there be any advantage in re-implementing Python in e.g. C++? Not that the current implementation is bad - anything but - yet if you're not careful, the fact that lists are implemented as C arrays can bite your rear from time to time (it recently bit mine while using lxml). Suppose a C++ re-implementation used some other data structure (like a linked list, possibly with twists like an array of pointers to the first linked-list elements to speed lookups up), which would perhaps be a bit slower on average, but would behave better for deletion? -- http://mail.python.org/mailman/listinfo/python-list
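The stdlib already has a structure that trades some average-case locality for cheap removal at the ends: collections.deque. A rough timing sketch (deletions in the middle are still O(n) for both, so it is not a full linked-list substitute):

from collections import deque
from timeit import timeit

lst = list(range(100000))
dq = deque(range(100000))

# popping from the front: the list shifts 99999 elements, the deque doesn't
print timeit(lambda: lst.insert(0, 0) or lst.pop(0), number=1000)
print timeit(lambda: dq.appendleft(0) or dq.popleft(), number=1000)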
Re: Please recommend a RPC system working with twisted.
Unfortunately, there seems to be no such solution in existence. So maybe I have to give up some requirements. Why not Pyro? Note: I haven't used it. -- http://mail.python.org/mailman/listinfo/python-list
Re: Seriously, though, about LLVM
Fredrik Lundh wrote: mk wrote: This project has gained some publicity. There's IronPython, right, so has anybody thought about implementing Python using LLVM as backend, as it seems not out of question at all? you mean like: http://llvm.org/ProjectsWithLLVM/#pypy ? No, I don't mean "Python written in Python", with whatever backend. -- http://mail.python.org/mailman/listinfo/python-list
Re: calling source command within python
Jie wrote: Hi all, i'm having trouble executing the os.system('source .bashrc') command within Python; it always says that source is not found and stuff. Any clue? It _might_ be that the shell it fires up is /bin/sh, which in turn is not bash. Anyway, it's better to use subprocess / Popen for this sort of operation. -- http://mail.python.org/mailman/listinfo/python-list
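A sketch of the workaround: run bash explicitly, and keep in mind that whatever 'source' sets exists only in that child shell; to get at the result you have to print it and parse it. The env-parsing line below is simplistic and assumes single-line values:

import subprocess

p = subprocess.Popen(['bash', '-c', 'source ~/.bashrc && env'],
                     stdout=subprocess.PIPE)
out, _ = p.communicate()

# crude: breaks on multi-line values, good enough for a quick look
env = dict(line.split('=', 1) for line in out.splitlines() if '=' in line)
print env.get('PATH')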
Re: Python Written in C?
Marc 'BlackJack' Rintsch wrote: An operation that most people avoid because of the penalty of "shifting down" all elements after the deleted one. Pythonistas tend to build new lists without unwanted elements instead. Which is exactly what I have done with my big lxml.etree, from which I needed to delete some elements: constructed a new tree only with the elements I wanted. Sure, that works. There _is_ a minor side effect: nearly doubling memory usage while the operation lasts. 99% of the time it's not a problem, sure. > I can't even remember when I deleted something from a list in the past. Still, doesn't that strike you as... a workaround? I half-got used to it, but it would still be nice not to (practically) have to use it. Enough whining. Gonna eat my quiche and do my Python. :-) -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Written in C?
Actually, all of the compilers I'm familiar with (gcc and a handful of cross compilers for various microprocessors) translate from high-level languages (e.g. C, C++) into assembly, which is then assembled into relocatable object files, which are then linked/loaded to produce machine language. Doesn't g++ translate C++ into C and then compile C? Last I heard, most C++ compilers were doing that. -- http://mail.python.org/mailman/listinfo/python-list
properties setting each other
Hello everyone, I try to set two properties, "value" and "square", in the following code, and arrange it in such a way that setting one property also sets the other one and vice versa. But the code seems to get Python into an infinite loop:

>>> import math
>>> class Squared2(object):
...     def __init__(self, val):
...         self._internalval=val
...         self.square=pow(self._internalval,2)
...     def fgetvalue(self):
...         return self._internalval
...     def fsetvalue(self, val):
...         self._internalval=val
...         self.square=pow(self._internalval,2)
...     value = property(fgetvalue, fsetvalue)
...     def fgetsquare(self):
...         return self.square
...     def fsetsquare(self,s):
...         self.square = s
...         self.value = math.sqrt(self.square)
...     square = property(fgetsquare, fsetsquare)
...
>>> a=Squared2(5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    a=Squared2(5)
  File "<stdin>", line 5, in __init__
    self.square=pow(self._internalval,2)
  File "<stdin>", line 19, in fsetsquare
    self.square = s
  File "<stdin>", line 19, in fsetsquare
    self.square = s
  File "<stdin>", line 19, in fsetsquare
    self.square = s
  ...

Is there a way to achieve this goal of two mutually setting properties? -- http://mail.python.org/mailman/listinfo/python-list
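The loop comes from the setters assigning to the property names themselves; writing to underscored attributes instead breaks the cycle. A minimal sketch:

import math

class Squared2(object):
    def __init__(self, val):
        self.value = val            # runs fsetvalue once, filling both slots

    def fgetvalue(self):
        return self._value

    def fsetvalue(self, val):
        # assign to _value/_square, never to value/square,
        # or the setters would invoke themselves forever
        self._value = val
        self._square = pow(val, 2)

    value = property(fgetvalue, fsetvalue)

    def fgetsquare(self):
        return self._square

    def fsetsquare(self, s):
        self._square = s
        self._value = math.sqrt(s)

    square = property(fgetsquare, fsetsquare)

a = Squared2(5)
print a.square      # 25
a.square = 49
print a.value       # 7.0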
Re: properties setting each other
Thanks to everyone for answers.. *but*, if you want to add more logic in the setters, you could want to add two extra methods:

def _setsquare(self, v):
    # some extra logic here
    self._square = v

def fsetsquare(self, s):
    self._setsquare(s)
    self._setvalue(math.sqrt(s))

def _setvalue(self, val):
    # some extra logic here
    self._internalval = val

def fsetvalue(self, val):
    self._setvalue(val)
    self._setsquare(pow(val, 2))

Thanks for that, I'll keep that in mind. Note that if one property can really be computed from another, this kind of thing could be considered bad design (except if the computation is heavy). Hmm, why? Is the line of thinking smth like: because the variables should be kept to a minimum and they should be calculated at the moment they are needed? -- http://mail.python.org/mailman/listinfo/python-list
sort in Excel
Hello, Does anyone know how to use the Excel sort function in Python? -- http://mail.python.org/mailman/listinfo/python-list
Re: sort in Excel
You do it like this:

from win32com.client import Dispatch

xlApp = Dispatch("Excel.Application")
xlApp.Workbooks.Open(r"D:\python\sortme.csv")
xlApp.Range("A1:C100").Sort(Key1=xlApp.Range("B1"), Order1=2)
xlApp.ActiveWorkbook.Close(SaveChanges=1)
xlApp.Quit()
del xlApp
-- http://mail.python.org/mailman/listinfo/python-list
Threading problem / Paramiko problem ?
Hello everyone, I wrote a "concurrent ssh" client using Paramiko, available here: http://python.domeny.com/cssh.py This program has a function for concurrent remote file/dir copying (class SSHThread, method 'sendfile'). One thread per specified host is started for copying (with a working queue of maximum length, of course). It does have a problem with threading or Paramiko, though:

- If I specify, say, 3 hosts, the 3 threads started start copying onto remote hosts fast (on a virtual machine, 10-15MB/s), using somewhat below 100% of CPU all the time (I wish it were less CPU-consuming, but I'm sending the file portion by portion and it's coded in Python, plus there are other calculations, well..)

- If I specify say 10 hosts, copying is fast and the CPU is under load until there are 2-3 threads left; then CPU load goes down to some 15% and copying gets slow (at some 1MB/s).

It looks as if the CPU time gets divided into more or less even portions for each thread running at the moment when the maximum number of threads is active (10 in this example) *and it stays this way even after some threads are finished and join()ed*. I do join() the finished threads (take a look at the code, someone). Yet the CPU consumption and copying speed go down. Now, it's either that, or Paramiko "maxes out" sending bandwidth per thread to the "total divided by number of senders". I have no idea which, and what's worse, no idea how to test this. I've done profiling, which indicated nothing: basically all function calls except time.sleep take negligible time. Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Any way to use cProfile in a multi-threaded program?
I'm stumped; I read somewhere that one would have to modify the Thread.run() method, but I have never modified Python library methods and wouldn't really want to. Is there any way to start cProfile on each thread and then combine the stats? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
multithreading, performance, again...
Hello everyone, I have figured out (sort of) how to do profiling of multithreaded programs with cProfile; it goes something like this:

#!/usr/local/bin/python

import cProfile
import threading

class TestProf(threading.Thread):
    def __init__(self, ip):
        threading.Thread.__init__(self)
        self.ip = ip
    def run(self):
        prof = cProfile.Profile()
        retval = prof.runcall(self.runmethod)
        prof.dump_stats('tprof' + self.ip)
    def runmethod(self):
        pass

tp = TestProf('10.0.10.10')
tp.start()
tp.join()

The problem is, now that I've done profiling in the actual program (profiled version here: http://python.domeny.com/cssh_profiled.py) with 9 threads and added up stats (using pstats.Stats.add()), the times I get are trivial:

>>> p.strip_dirs().sort_stats('cumulative').print_stats(10)
Wed Dec 30 16:23:59 2009    csshprof9.156.44.113
Wed Dec 30 16:23:59 2009    csshprof9.156.46.243
Wed Dec 30 16:23:59 2009    csshprof9.156.46.89
Wed Dec 30 16:24:00 2009    csshprof9.156.47.125
Wed Dec 30 16:24:00 2009    csshprof9.156.47.17
Wed Dec 30 16:24:00 2009    csshprof9.156.47.29
Wed Dec 30 16:24:01 2009    csshprof9.167.41.241
Wed Dec 30 16:24:02 2009    csshprof9.168.119.15
Wed Dec 30 16:24:02 2009    csshprof9.168.119.218

39123 function calls (38988 primitive calls) in 6.004 CPU seconds

Ordered by: cumulative time
List reduced from 224 to 10 due to restriction <10>

ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     9    0.000    0.000    6.004    0.667  cssh.py:696(runmethod)
   100    0.004    0.000    5.467    0.055  threading.py:389(wait)
    82    0.025    0.000    5.460    0.067  threading.py:228(wait)
   400    5.400    0.013    5.400    0.013  {time.sleep}
     9    0.000    0.000    5.263    0.585  cssh.py:452(ssh_connect)
     9    0.003    0.000    5.262    0.585  client.py:226(connect)
     9    0.001    0.000    2.804    0.312  transport.py:394(start_client)
     9    0.005    0.001    2.254    0.250  client.py:391(_auth)
    18    0.001    0.000    2.115    0.117  transport.py:1169(auth_publickey)
    18    0.001    0.000    2.030    0.113  auth_handler.py:156(wait_for_response)

It's not burning CPU time in the main thread (profiling with cProfile indicated smth similar to the above), and it's not burning it in the individual worker threads - so where the heck is it burning this CPU time? Because 'top' shows heavy CPU load during most of the program's run. help... regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Interesting (?) problem
Hello everyone, I have two lists of IP addresses: hostips = [ 'a', 'b', 'c', 'd', 'e' ] thread_results = [ 'd', 'b', 'c' ] I need to sort thread_results into the same order as hostips. (Obviously, hostips can contain any valid IP addresses as strings; they are sorted alphabetically here just for the sake of example.) Since explicit is better than implicit, I will clarify: thread_results is obviously the result of threads communicating with IPs from hostips, and those can finish at various times, thus returning IPs into thread_results in any order. Sorting would be trivial if thread_results contained all of hostips, but it is only a subset (obviously, for some IPs communication can fail, which excludes them from the result). One approach I can see is constructing a hostips_limited list that would contain only the IPs that are in thread_results, while preserving the order of hostips:

hostips_limited = []
for h in hostips:
    if h in thread_results:
        hostips_limited.append(h)

..and then sorting thread_results accordingly. But maybe there is a more elegant / faster approach? Incidentally, it *seems* that a list comprehension preserves order:

hostips_limited = [ h for h in hostips if h in thread_results ]

Empirically speaking it seems to work (I tested it on real IPs), but please correct me if that's wrong. Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: Interesting (?) problem
mk wrote: Hello everyone, I have two lists of IP addresses: hostips = [ 'a', 'b', 'c', 'd', 'e' ] thread_results = [ 'd', 'b', 'c' ] I need to sort thread_results in the same order as hostips. P.S. One clarification: those lists are actually more complicated (thread_result is a list of tuples (ip, thread)), which is why I need thread_results sorted in order of hostips (instead of just constructing [ h for h in hostips if h in thread_results ] and be done with it). -- http://mail.python.org/mailman/listinfo/python-list
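A sketch of the usual trick for exactly this situation: build an index map from hostips once, then sort by it; it copes with the (ip, thread) tuples too (the 'thr-...' values are made up for the example):

hostips = ['a', 'b', 'c', 'd', 'e']
thread_results = [('d', 'thr-d'), ('b', 'thr-b'), ('c', 'thr-c')]

order = dict((ip, i) for i, ip in enumerate(hostips))   # ip -> position
thread_results.sort(key=lambda item: order[item[0]])
print thread_results   # [('b', 'thr-b'), ('c', 'thr-c'), ('d', 'thr-d')]

As a side note, each "h in thread_results" test in the loop above is a linear scan; converting thread_results (or its IPs) to a set first makes that check constant-time.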
Building static Python binary
Hello, I have trouble building Python static binary (for use with 'freeze.py', as frozen Python programs do not include dynamically linked libs). Anybody? ./configure --disable-shared --with-ldflags=-ldl And yet after compiling the resulting binary is linked with following dynamic libraries: [r...@localhost Python-2.6.2]# ldd python linux-gate.so.1 => (0x0075b000) libpthread.so.0 => /lib/libpthread.so.0 (0x0090) libdl.so.2 => /lib/libdl.so.2 (0x008d1000) libutil.so.1 => /lib/libutil.so.1 (0x03bbe000) libssl.so.6 => /lib/libssl.so.6 (0x002e3000) libcrypto.so.6 => /lib/libcrypto.so.6 (0x001a1000) libm.so.6 => /lib/libm.so.6 (0x008d7000) libc.so.6 => /lib/libc.so.6 (0x0078b000) /lib/ld-linux.so.2 (0x0076d000) libgssapi_krb5.so.2 => /usr/lib/libgssapi_krb5.so.2 (0x005ae000) libkrb5.so.3 => /usr/lib/libkrb5.so.3 (0x004ce000) libcom_err.so.2 => /lib/libcom_err.so.2 (0x00767000) libk5crypto.so.3 => /usr/lib/libk5crypto.so.3 (0x00719000) libresolv.so.2 => /lib/libresolv.so.2 (0x037ab000) libz.so.1 => /usr/lib/libz.so.1 (0x00919000) libkrb5support.so.0 => /usr/lib/libkrb5support.so.0 (0x00741000) libkeyutils.so.1 => /lib/libkeyutils.so.1 (0x03909000) libselinux.so.1 => /lib/libselinux.so.1 (0x00caf000) libsepol.so.1 => /lib/libsepol.so.1 (0x00566000) How can I make the python binary static? -- http://mail.python.org/mailman/listinfo/python-list
Re: Building static Python binary
P.S. If I add -static to LDFLAGS in Makefile, I get this: gcc -pthread -static -Xlinker -export-dynamic -o python \ Modules/python.o \ libpython2.6.a -lpthread -ldl -lutil -L/usr/local/ssl/lib -lssl -lcrypto -lm libpython2.6.a(dynload_shlib.o): In function `_PyImport_GetDynLoadFunc': /var/www/html/tmp/Python-2.6.2/Python/dynload_shlib.c:130: warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking libpython2.6.a(posixmodule.o): In function `posix_tmpnam': /var/www/html/tmp/Python-2.6.2/./Modules/posixmodule.c:7129: warning: the use of `tmpnam_r' is dangerous, better use `mkstemp' libpython2.6.a(posixmodule.o): In function `posix_tempnam': /var/www/html/tmp/Python-2.6.2/./Modules/posixmodule.c:7084: warning: the use of `tempnam' is dangerous, better use `mkstemp' libpython2.6.a(pwdmodule.o): In function `pwd_getpwall': /var/www/html/tmp/Python-2.6.2/./Modules/pwdmodule.c:157: warning: Using 'getpwent' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking libpython2.6.a(pwdmodule.o): In function `pwd_getpwnam': /var/www/html/tmp/Python-2.6.2/./Modules/pwdmodule.c:131: warning: Using 'getpwnam' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking libpython2.6.a(pwdmodule.o): In function `pwd_getpwuid': /var/www/html/tmp/Python-2.6.2/./Modules/pwdmodule.c:110: warning: Using 'getpwuid' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking libpython2.6.a(pwdmodule.o): In function `pwd_getpwall': /var/www/html/tmp/Python-2.6.2/./Modules/pwdmodule.c:156: warning: Using 'setpwent' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking /var/www/html/tmp/Python-2.6.2/./Modules/pwdmodule.c:167: warning: Using 'endpwent' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `get_rc_clockskew': (.text+0xe1): undefined reference to `krb5_rc_default' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `get_rc_clockskew': (.text+0xfc): undefined reference to `krb5_rc_initialize' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `get_rc_clockskew': (.text+0x122): undefined reference to `krb5_rc_get_lifespan' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `get_rc_clockskew': (.text+0x13c): undefined reference to `krb5_rc_destroy' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `kssl_validate_times': (.text+0x174): undefined reference to `krb5_init_context' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `kssl_validate_times': (.text+0x197): undefined reference to `krb5_timeofday' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `kssl_validate_times': (.text+0x1ba): undefined reference to `krb5_free_context' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `kssl_krb5_free_data_contents': (.text+0x241): undefined reference to `krb5_free_data_contents' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `kssl_tgt_is_available': (.text+0x2ba): undefined reference to `krb5_init_context' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): 
In function `kssl_tgt_is_available': (.text+0x2d6): undefined reference to `krb5_free_principal' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `kssl_tgt_is_available': (.text+0x2ec): undefined reference to `krb5_free_principal' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `kssl_tgt_is_available': (.text+0x2fb): undefined reference to `krb5_free_context' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `kssl_tgt_is_available': (.text+0x33b): undefined reference to `krb5_sname_to_principal' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `kssl_tgt_is_available': (.text+0x355): undefined reference to `krb5_cc_default' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `kssl_tgt_is_available': (.text+0x376): undefined reference to `krb5_cc_get_principal' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `kssl_tgt_is_available': (.text+0x3a8): undefined reference to `krb5_get_credentials' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `kssl_keytab_is_available': (.text+0x415): undefined reference to `krb5_init_context' /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../libssl.a(kssl.o): In function `kssl_keytab_is_available': (
Freezing pycrypto
Hello everyone, I'm trying to freeze PyCrypto on Linux (using freeze.py) and having trouble with it, can you help me? PyCrypto is used by paramiko (ssh client module). I have added following in the Modules/Setup while building Python (This has to be done because freeze.py requires that all compiled extension modules are compiled statically into Python, it can't import dynamic libraries into frozen binary): *static* # ... lots of other modules Crypto.Cipher.DES3 DES3.c But this produces: [r...@localhost Python-2.6.2]# make /bin/sh ./Modules/makesetup -c ./Modules/config.c.in \ -s Modules \ Modules/Setup.config \ Modules/Setup.local \ Modules/Setup bad word Crypto.Cipher.DES3 in Crypto.Cipher.DES3 DES3.c make: *** [Makefile] Error 1 If I change that to: DES3 DES3.c Python compiles, but then I can't import this module and neither can PyCrypto in a frozen binary: [r...@localhost tmp2]# ./cssh :6: DeprecationWarning: the sha module is deprecated; use the hashlib module instead :6: DeprecationWarning: the md5 module is deprecated; use hashlib instead Traceback (most recent call last): File "cssh.py", line 7, in import paramiko File "/usr/local/lib/python2.6/site-packages/paramiko/__init__.py", line 69, in from transport import randpool, SecurityOptions, Transport File "/usr/local/lib/python2.6/site-packages/paramiko/transport.py", line 37, in from paramiko.dsskey import DSSKey File "/usr/local/lib/python2.6/site-packages/paramiko/dsskey.py", line 31, in from paramiko.pkey import PKey File "/usr/local/lib/python2.6/site-packages/paramiko/pkey.py", line 28, in from Crypto.Cipher import DES3 ImportError: cannot import name DES3 Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
list comprehension problem
Hello everyone,

print hosts
hosts = [ s.strip() for s in hosts if s is not '' and s is not None and s is not '\n' ]
print hosts

['9.156.44.227\n', '9.156.46.34 \n', '\n']
['9.156.44.227', '9.156.46.34', '']

Why does the hosts list after the list comprehension still contain '' in the last position? I checked that:

print hosts
hosts = [ s.strip() for s in hosts if s != '' and s != None and s != '\n' ]
print hosts

..works as expected:

['9.156.44.227\n', '9.156.46.34 \n', '\n']
['9.156.44.227', '9.156.46.34']

Are there two '\n' strings in the interpreter's memory or smth, so the identity check "s is not '\n'" does not work as expected? This is weird. I expected that at all times there is only one '\n' string in Python's cache or whatever, that all labels meant by the programmer as the '\n' string actually point to. Is that a wrong assumption? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
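The assumption is indeed wrong: whether two equal strings are the same object is an implementation detail (literals in one code object may be shared; strings built at runtime, e.g. read from a file, usually are not). A sketch of the distinction, plus the value-based filter:

a = 'abc'
b = ''.join(['ab', 'c'])   # equal value, built at runtime

print a == b   # True
print a is b   # False: 'is' tests object identity, and only some
               # strings happen to be shared by the implementation

# so filter on value -- or simply on truth after stripping:
hosts = ['9.156.44.227\n', '9.156.46.34 \n', '\n']
hosts = [s.strip() for s in hosts if s.strip()]
print hosts    # ['9.156.44.227', '9.156.46.34']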
Recommended number of threads? (in CPython)
Hello everyone, I wrote a run-of-the-mill program for concurrent execution of an ssh command over a large number of hosts. (Someone may ask why reinvent the wheel when there's pssh and shmux around -- I'm not happy with the working details of, and lack of some options in, either program.) The program has a working queue of threads, so that no more than maxthreads are created and working at a particular time. But this begs the question: what is the recommended number of threads working concurrently? If it's dependent on the task, the task is: open an ssh connection, execute a command (then the main thread loops over the queue and, if a thread is finished, it closes the ssh connection and does .join() on the thread). I found that using more than several hundred threads causes weird exceptions to be thrown *sometimes* (rarely actually, but it happens from time to time). Although that might depend on the modules used in threads (I'm using paramiko, which is claimed to be thread safe). -- http://mail.python.org/mailman/listinfo/python-list
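For what it's worth, the pattern that keeps the thread count flat regardless of host count is a fixed pool of workers pulling from a Queue, rather than one thread per host; run_ssh_command() below is a stand-in for the real paramiko work:

import threading
import Queue

MAXTHREADS = 100

def run_ssh_command(host):      # stand-in for the real paramiko call
    return 'ok'

def worker(jobs, results):
    while True:
        try:
            host = jobs.get_nowait()
        except Queue.Empty:
            return
        results.put((host, run_ssh_command(host)))

def run_all(hosts):
    jobs, results = Queue.Queue(), Queue.Queue()
    for h in hosts:
        jobs.put(h)
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in xrange(min(MAXTHREADS, len(hosts)))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [results.get() for _ in xrange(results.qsize())]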
Re: How to run a repeating timer every n minutes?
Frank Millman wrote:

class Timer(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.event = threading.Event()
    def run(self):
        while not self.event.is_set():
            """ The things I want to do go here. """
            self.event.wait(number_of_seconds_to_wait)
    def stop(self):
        self.event.set()

In your main program: to start the timer, tmr = Timer(); tmr.start() - to stop the timer, tmr.stop(). It is easy to extend this by passing the number_of_seconds_to_wait, or a function name to be executed, as arguments to the Timer. I'm a newbie at threading, so I'm actually asking: shouldn't a method like stop() be surrounded with acquire() and release() of some threading lock? I mean, is it safe to update a running thread's data from the main thread without a lock? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Request for comments - concurrent ssh client
Hello everyone, Since I'm not happy with shmux or pssh, I wrote my own "concurrent ssh" program for parallel execution of SSH commands on multiple hosts. Before I release program to the wild, I would like to hear (constructive) comments on what may be wrong with the program and/or how to fix it. (note: the program requires paramiko ssh client module) #!/usr/local/bin/python -W ignore::DeprecationWarning import time import sys import os import operator import paramiko import threading import subprocess import optparse usage = "Usage: cssh [options] IP1 hostname2 IP3 hostname4 ...\n\n(IPs/hostnames on the commandline are actually optional, they can be specified in the file, see below.)" op = optparse.OptionParser(usage=usage) op.add_option('-c','--cmd',dest='cmd',help="""Command to run. Mutually exclusive with -s.""") op.add_option('-s','--script',dest='script',help="""Script file to run. Mutually exclusive with -c. Script can have its own arguments, specify them in doublequotes, like "script -arg arg".""") op.add_option('-i','--script-dir',dest='scriptdir',help="""The directory where script will be copied and executed. Defaults to /tmp.""") op.add_option('-l','--cleanup',dest='cleanup',action='store_true',help="""Delete the script on remote hosts after executing it.""") op.add_option('-f','--file',dest='file',help="""File with hosts to use, one host per line. Concatenated with list of hosts/IP addresses specified at the end of the commandline. Optionally, in a line of the file you can specify sequence: "Address/Hostname Username Password SSH_Port" separated by spaces (additional parameters can be specified on a subset of lines; where not specified, relevant parameters take default values).""") op.add_option('-d','--dir',dest='dir',help='Directory for storing standard output and standard error of command. If specified, directory will be created, with subdirs named IPs/hostnames and relevant files stored in those subdirs.') op.add_option('-u','--username',dest='username',help="""Username to specify for SSH. Defaults to 'root'.""") op.add_option('-p','--password',dest='password',help="""Password. Password is used first; if connection fails using password, cssh uses SSH key (default or specified).""") op.add_option('-o','--port',dest='port',help="""Default SSH port.""") op.add_option('-k','--key',dest='key',help="""SSH Key file. Defaults to '/root/.ssh/id_dsa'.""") op.add_option('-n','--nokey',dest='nokey',action="store_true", help="""Turns off using SSH key.""") op.add_option('-t','--timeout',dest='timeout',help="""SSH connection timeout. Defaults to 20 seconds.""") op.add_option('-m','--monochromatic',dest='mono',action='store_true',help="""Do not use colors while printing output.""") op.add_option('-r','--maxthreads',dest='maxthreads',help="""Maximum number of threads working concurrently. Default is 100. Exceeding 200 is generally not recommended due to potential exhaustion of address space (each thread can use 10 MB of address space and 32-bit systems have a maximum of 4GB of address space).""") op.add_option('-q','--quiet',dest='quiet',action='store_true',help="""Quiet. Do not print out summaries like IPs for which communication succeeded or failed, etc.""") # add resource file? (opts, args) = op.parse_args() failit = False if opts.cmd == None and opts.script == None: print "You have to specify one of the following: command to run, using -c command or --cmd command, or script to run, using -s scriptfile or --script scriptfile." 
print failit = True if opts.cmd != None and opts.script != None: print "Options command (-c) and script (-s) are mutually exclusive. Specify either one." print failit = True if opts.cmd == None and opts.script != None: try: scriptpath = opts.script.split()[0] scriptfo = open(scriptpath,'r') scriptfo.close() except IOError: print "Could not open script file %s." % opts.script print failit = True if opts.file == None and args == []: print "You have to specify at least one of the following:" print " - list of IPs/hostnames at the end of the command line (after all options)" print " - list of IPs/hostnames stored in file specified after -f or --file option (like: -f hostnames.txt)" print " You can also specify both sources. In that case IP/hostname lists will be concatenated." print failit = True if opts.password == None and opts.nokey: print "Since using key has been turned off using -n option, you have to specify password using -p password or --password password." print failit = True if opts.key is not None and opts.nokey: print "Options -n and -k keyfile are mutually exclusive. Specify either one." print failit = True if failit: sys.exit(0) if opts.scriptdir == None: opts.scriptdir = '/tmp' if opts.cleanup == None: opts.cleanup = False if opts.key == None: opts.key = '/root
checking 'type' programmatically
Disclaimer: this is for exploring and debugging only. Really. I can check type or __class__ in the interactive interpreter:

Python 2.6.2 (r262:71600, Jun 16 2009, 16:49:04)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-44)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import subprocess
>>> p=subprocess.Popen(['/bin/ls'],stdout=subprocess.PIPE,stderr=subprocess.PIPE)
>>> p
<subprocess.Popen object at 0x...>
>>> (so, se) = p.communicate()
>>> so
'abc.txt\nbak\nbox\nbuild\ndead.letter\nDesktop\nhrs\nmbox\nmmultbench\nmmultbench.c\npyinstaller\nscreenlog.0\nshutdown\ntaddm_import.log\nv2\nvm\nworkspace\n'
>>> se
''
>>> so.__class__
<type 'str'>
>>> type(so)
<type 'str'>
>>> type(se)
<type 'str'>

But when I do smth like this in code that is run non-interactively (as a normal program):

req.write('stderr type %s' % type(se))
req.write('stderr class %s' % str(se.__class__))

then I get empty output. WTF? How do I get the type or __class__ into some object that I can display? Why do that: e.g. if documentation is incomplete. E.g. the documentation on Popen.communicate() says "communicate() returns a tuple (stdoutdata, stderrdata)" but doesn't say what the class of stdoutdata and stderrdata is (a file object to read? a string?). Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
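A guess about the "empty output": if req is a mod_python-style request writing into an HTML page, then "<type 'str'>" arrives at the browser looking like an unknown tag and renders as nothing at all. Escaping would make it visible - purely an assumption about the rendering context:

import cgi

# if the output is rendered as HTML, <type 'str'> parses as a tag and
# displays as nothing; escape it first (req/se as in the snippet above)
req.write('stderr type %s' % cgi.escape(str(type(se))))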
Regexp and multiple groups (with repeats)
Hello,

>>> r=re.compile(r'(?:[a-zA-Z]:)([\\/]\w+)+')
>>> r.search(r'c:/tmp/spam/eggs').groups()
('/eggs',)

Obviously, I would like to capture all groups: ('/tmp', '/spam', '/eggs') But it seems that re captures only the last group. Is there any way to capture all groups with a repeat following them, i.e. (...)+ or (...)* ? Even better would be: ('tmp', 'spam', 'eggs') Yes, I know about re.split:

>>> re.split( r'(?:\w:)?[/\\]', r'c:/tmp/spam\\eggs/' )
['', 'tmp', 'spam', '', 'eggs', '']

My interest is more general in this case: how to capture many groups with a repeat? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
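A group with a repetition quantifier keeps only its last match, so the usual answer is to repeat the *matching* with findall instead of repeating the group - a sketch:

import re

print re.findall(r'[\\/](\w+)', r'c:/tmp/spam/eggs')
# ['tmp', 'spam', 'eggs'] -- one capture per match, all matches kept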
pointless musings on performance
#!/usr/local/bin/python

import timeit

def pythonic():
    nonevar = None
    zerovar = 0
    for x in range(100):
        if nonevar:
            pass
        if zerovar:
            pass

def unpythonic():
    nonevar = None
    zerovar = 0
    for x in range(100):
        if nonevar is not None:
            pass
        if zerovar > 0:
            pass

for f in [pythonic, unpythonic]:
    print f.func_name, timeit.timeit(f, number=10)

# ./t.py
pythonic 2.13092803955
unpythonic 2.82064604759

Decidedly counterintuitive: are there special optimizations for "if nonevar:" type of statements in cpython implementation? regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: pointless musings on performance
MRAB wrote: In what way is it counterintuitive? In 'pythonic' the conditions are simpler, less work is being done, therefore it's faster. But the pythonic condition is more general: nonevar or zerovar can be '', 0, or None. So I thought it was more work for the interpreter to compare those, while I thought that "is not None" translates to a single, more low-level and faster operation. Apparently not. As Rob pointed out (thanks):

11          31 LOAD_FAST                0 (nonevar)
            34 JUMP_IF_FALSE            4 (to 41)

I'm no good at the py compiler or implementation internals, and so I have no idea what the bytecode "JUMP_IF_FALSE" is actually doing. Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
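Comparing the two disassemblies side by side makes the difference concrete; on CPython 2.6, as in this thread, the output comes out roughly as the comments indicate (JUMP_IF_FALSE itself performs the truth test on whatever is on top of the stack):

import dis

def truth_test(nonevar):
    if nonevar: pass

def identity_test(nonevar):
    if nonevar is not None: pass

dis.dis(truth_test)
# LOAD_FAST 0 (nonevar); JUMP_IF_FALSE ...
# -> one generic truth test on the object itself

dis.dis(identity_test)
# LOAD_FAST 0 (nonevar); LOAD_CONST (None); COMPARE_OP 'is not';
# JUMP_IF_FALSE ...
# -> an extra constant load and a comparison that produces a bool,
#    which then still has to be truth-tested by the jump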
unable to catch this exception
Exception in thread Thread-9 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/threading.py", line 522, in __bootstrap_inner
  File "/var/www/html/cssh.py", line 875, in run
  File "/var/www/html/cssh.py", line 617, in ssh_connect
<type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'BadAuthenticationType'

This happens on interpreter shutdown, even though I do try to catch the AttributeError exception:

try:
    fake = paramiko.BadAuthenticationType
    try:
        self.conobj.connect(self.ip, username=self.username, key_filename=self.sshprivkey, port=self.port, timeout=opts.timeout)
        loginsuccess = True
    except paramiko.BadAuthenticationType, e:  # this is line 617
        self.conerror = str(e)
    except paramiko.SSHException, e:
        self.conerror = str(e)
    except socket.timeout, e:
        self.conerror = str(e)
    except socket.error, e:
        self.conerror = str(e)
except AttributeError:  # this happens on interpreter shutdown
    self.conerror = 'shutdown'

It's clear what happens: paramiko gets its attributes cleared, or the module perhaps gets unloaded, and as a result the "paramiko" name leads to None, which obviously has no attribute BadAuthenticationType. However, even though this is surrounded by a try .. except AttributeError block, the exception evidently isn't caught. How do I catch that exception? Or at least prevent this message from being displayed? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
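One workaround sketch: the surest fix is joining every worker before the main thread exits, so nothing is still raising during teardown. Failing that, stashing the exception classes as instance attributes at construction time keeps them reachable even after module globals are rebound to None, since the expression in an except clause is only evaluated at raise time:

import threading
import paramiko

class SSHThread(threading.Thread):
    def __init__(self, ip):
        threading.Thread.__init__(self)
        self.ip = ip
        # during shutdown, module globals (the name 'paramiko' included)
        # become None; instance attributes are left alone
        self.BadAuth = paramiko.BadAuthenticationType

    def run(self):
        conobj = paramiko.SSHClient()
        try:
            conobj.connect(self.ip, timeout=20)
        except self.BadAuth, e:     # no module-level lookup needed here
            self.conerror = str(e)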
passing string for editing in input()
Hello, Is there an easy way to get an edit line (readline) in Python that would already contain a string for editing, rather than starting out empty? I googled but found nothing. Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: passing string for editing in input()
Peter Otten wrote: mk wrote: Is there an easy way to get an editing (readline) in Python that would contain string for editing and would not just be empty? http://mail.python.org/pipermail/python-list/2009-June/1209309.html Peter Thanks a lot! Just what I needed. Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
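For the archives, that kind of recipe boils down to readline's pre-input hook (GNU readline required); a sketch along those lines:

import readline

def rlinput(prompt, prefill=''):
    # the hook fires after the prompt is printed, seeding the edit buffer
    readline.set_pre_input_hook(
        lambda: (readline.insert_text(prefill), readline.redisplay()))
    try:
        return raw_input(prompt)
    finally:
        readline.set_pre_input_hook(None)

# edited = rlinput('host: ', '9.156.44.227')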
The best library to create charting application
The application will display (elaborate) financial charts. Pygame? Smth else? dotnet? Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: The best library to create charting application
Phlip wrote: mk wrote: The application will display (elaborate) financial charts. Pygame? Smth else? Back in the day it was Python BLT. Are you on the Web or the Desktop? Desktop, really (there should be some nominal web interface but the main application will be desktop) Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
Re: Python and Ruby
Steve Holden wrote: Jeez, Steve, you're beginning to sound like some kind of fallacy zealot... ;) Death to all those who confuse argumentum ad populum with argumentum ad verecundiam!!! Yeah, what did the zealots ever do for us? They produced Python? . . . Oh Python! Shut up! -- http://mail.python.org/mailman/listinfo/python-list
which
if isinstance(cmd, str):
    self.cmd = cmd.replace(r'${ADDR}', ip)
else:
    self.cmd = cmd

or

self.cmd = cmd
if isinstance(cmd, str):
    self.cmd = cmd.replace(r'${ADDR}', ip)
-- http://mail.python.org/mailman/listinfo/python-list
Re: which
Jean-Michel Pichavant wrote: What is worrying me the most in your code sample is that self.cmd can hold different types (str, and something else). That is usually a bad thing to do (putting None aside). However, my remark could be totally irrelevant of course; that depends on the context. That's a valid criticism - but I do not really know how to handle this otherwise, because the program can be called with a "cmd" to run, or a script to run (or a directory to copy), and in those cases cmd is None. I guess I could use: if cmd: self.cmd = ... But. Suppose that under some circumstances cmd is not a string. What then? I know that isinstance is typically not recommended, but I don't see a better solution here. Regards, mk -- http://mail.python.org/mailman/listinfo/python-list
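A sketch of one way to keep the types explicit without abandoning isinstance: treat None as its own announced case and reject anything else loudly (the error message is made up):

if cmd is None:
    self.cmd = None                      # script/directory mode, no command
elif isinstance(cmd, basestring):        # accepts both str and unicode
    self.cmd = cmd.replace(r'${ADDR}', ip)
else:
    raise TypeError("cmd must be a string or None, got %r" % (cmd,))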