Re: How to use a timer in Python?
Nico Grubert <[EMAIL PROTECTED]> wrote:
>  on a Linux machine running Python 2.3.5. I want to create a file
>  'newfile' in a directory '/tmp' only if there is no file 'transfer.lock'
>  in '/temp'.
>  A cronjob creates a file 'transfer.lock' in '/temp' directory every 15
>  minutes while the cronjob is doing something. This job takes around 30
>  seconds. During these 30 seconds the 'transfer.lock' file is present in
>  the '/temp' directory and I must not create 'newfile'. After the cronjob
>  has been finished, the 'transfer.lock' file is deleted from '/temp' and
>  I can create 'newfile'.

That all sounds very race-y to me!  The cron-job and the other process
need to take the same lock, otherwise the cron-job will start 1ms after
the other process checks for transfer.lock and before it has a chance
to create newfile, and there will be trouble.

Using files as locks isn't brilliant because the operations "read to
see if the lock is there" and "create the file if it isn't" aren't
atomic.  Ie someone can get in there after you read the directory but
before you create the file.

However creating a directory is atomic, so you can take the lock by
os.mkdir("/tmp/lock").  If that succeeded you got the lock, if it
failed (threw OSError) then you didn't.  If it failed then just
time.sleep(1) and try again.

This kind of locking works cross platform too.  You can use it in
shell too, eg "mkdir /tmp/lock || exit 1" in your cronjob.

You could wrap the locking up into a module of course, and I bet
someone already did.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: How to use a timer in Python?
Nico Grubert <[EMAIL PROTECTED]> wrote:
>  There is no cronjob anymore now. I just need to check if there is a lock
>  file. How would you modify my short program to avoid the situation "Ie
>  someone can get in there after you read the directory but before
>  you create the file."?

You can't do it with files.  You can do it with directories though...

Both processes need to be using the same lock, so something like this
(all untested ;-)

  import os
  import time

  WINDOWS_SHARE = 'C:\\Temp'

  lock_file = os.path.join(WINDOWS_SHARE, "transfer.lock")

  gained_lock = False
  while not gained_lock:
      try:
          os.mkdir(lock_file)
          gained_lock = True
      except OSError:
          print "Busy, please wait..."
          time.sleep(10)

  f = open(WINDOWS_SHARE + '/myfile', 'w')
  f.write("test 123")
  f.close()

  os.rmdir(lock_file)

  print "Done!"

You need to do the same thing to the program which currently creates
transfer.lock - so it waits if the transfer.lock is in existence too.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: subprocess and non-blocking IO (again)
Marc Carter <[EMAIL PROTECTED]> wrote:
> import subprocess,select,sys
>
> speakers=[]
> lProc=[]
>
> for machine in ['box1','box2','box3']:
>     p = subprocess.Popen( ('echo '+machine+';sleep 2;echo goodbye;sleep 2;echo cruel;sleep 2;echo world'),
>                           stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
>                           stdin=None, universal_newlines=True )
>     lProc.append( p )
>     speakers.append( p.stdout )
>
> while speakers:
>     speaking = select.select( speakers, [], [], 1000 )[0]
>     for speaker in speaking:
>         speech = speaker.readlines()
>         if speech:
>             for sentence in speech:
>                 print sentence.rstrip('\n')
>             sys.stdout.flush() # sanity check
>         else: # EOF
>             speakers.remove( speaker )
> - SNIP -
> The problem with the above is that the subprocess buffers all its output
> when used like this and, hence, this automation is not informing me of
> much :)

The problem with the above is that you are calling speaker.readlines()
which waits for all the output.  If you replace that with
speaker.readline() or speaker.read(1) you'll see that subprocess hasn't
given you a buffered pipe after all!  In fact you'll get partial reads
of each line - you'll have to wait for a newline before processing the
result, eg

  import subprocess,select,sys

  speakers=[]
  lProc=[]

  for machine in ['box1','box2','box3']:
      p = subprocess.Popen( ('echo '+machine+';sleep 2;echo goodbye;sleep 2;echo cruel;sleep 2;echo world'),
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            stdin=None, universal_newlines=True, shell=True )
      lProc.append( p )
      speakers.append( p.stdout )

  while speakers:
      speaking = select.select( speakers, [], [], 1000 )[0]
      for speaker in speaking:
          speech = speaker.readline()
          if speech:
              for sentence in speech:
                  print sentence.rstrip('\n')
              sys.stdout.flush() # sanity check
          else: # EOF
              speakers.remove( speaker )

gives

  b
  o
  x
  1
  b
  o
  x
  3
  b
  o
  x
  2
  pause...
  g
  o
  o
  d
  b
  y
  e
  etc...

I'm not sure why readline only returns 1 character - the pipe returned
by subprocess really does seem to be only 1 character deep which seems
a little inefficient!  Changing bufsize to the Popen call doesn't seem
to affect it.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
xml.minidom and user defined entities
I'm using xml.minidom to parse some of our XML files.  Some of these
have entities like "&deg;" in, which aren't understood by xml.minidom.

These give this error.

  xml.parsers.expat.ExpatError: undefined entity: line 12, column 1

Does anyone know how to add entities when using xml.minidom?

I've spent some time searching the docs/code/google but I haven't
found the answer to this question!

Thanks
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: xml.minidom and user defined entities
Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> Nick Craig-Wood wrote:
>
> > I'm using xml.minidom to parse some of our XML files.  Some of these
> > have entities like "&deg;" in, which aren't understood by xml.minidom.
>
> &deg; is not a standard entity in XML (see below).

No probably not...

> > These give this error.
> >
> >   xml.parsers.expat.ExpatError: undefined entity: line 12, column 1
> >
> > Does anyone know how to add entities when using xml.minidom?
>
> the document is supposed to contain the necessary entity declarations
> as an inline DTD, or contain a reference to an external DTD. (iirc, mini-
> dom only supports inline DTDs, but that may have been fixed in recent
> versions).

The document doesn't define the entities either internally or
externally.  I don't fancy adding an inline definition either as there
are 100s of documents I need to process!

> if you don't have a DTD, your document is broken (if so, and the set of
> entities is known, you can use re.sub to replace unknown entities with
> the corresponding characters before parsing. let me know if you want
> sample code).

I was kind of hoping I could poke my extra entities into some dict or
other in the guts of xml.minidom...

However the job demands quick and nasty rather than elegant so I'll go
for the regexp solution I think, as the list of entities is well
defined.

Thanks for your help

Nick
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
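For anyone else with the same problem, here is a minimal sketch of the
re.sub approach - the entity table is illustrative only, fill it in
with whatever your documents actually use.  It swaps each known entity
for a numeric character reference, which expat always understands:

  import re
  from xml.dom import minidom

  # name -> code point; these entries are assumptions for illustration
  ENTITIES = {"deg": 0xB0, "plusmn": 0xB1}

  def _subst(m):
      name = m.group(1)
      if name in ENTITIES:
          # numeric character references are always legal XML
          return "&#%d;" % ENTITIES[name]
      return m.group(0)   # leave &amp; &lt; etc alone

  def parse_with_entities(text):
      return minidom.parseString(re.sub(r"&(\w+);", _subst, text))

  doc = parse_with_entities("<temp>90&deg;</temp>")
  print repr(doc.documentElement.firstChild.data)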
Re: gmpy 1.01 rc near... anybody wanna test>
I tested gmpy cvs as of now on Debian/testing/x86 with python2.3.

It compiled perfectly, ran all of its unit tests and also all my test
programs - Well done!

My test program seemed to run at the same speed with both versions
(not surprising really since both are using the same libgmp.so on the
system).

Thanks and look forward to the release

Nick
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: gmpy/decimal interoperation
Alex Martelli <[EMAIL PROTECTED]> wrote:
>  As things stand now (gmpy 1.01), an instance d of decimal.Decimal cannot
>  transparently become an instance of any of gmpy.{mpz, mpq, mpf}, nor
>  vice versa (the conversions are all possible, but a bit laborious, e.g.
>  by explicitly going through string-forms).
>
>  I'm thinking about possible ways to fix this, in whole or in part, but,
>  before I spend more time on this, I was wondering if there's anybody
>  else who'd be interested

I can't ever imagine mixing the two.

I use GMPY when I want fast infinite precision arithmetic.  I'd use
decimal if I wanted to do decimal arithmetic on currency or something
like that (or perhaps if I hadn't discovered GMPY in which case I
wouldn't be mixing with GMPY!)

>  if so, maybe we can discuss which conversions should happen
>  implicitly

None I'd say!  Perhaps make a less laborious manual conversion
function, but I don't think implicit conversion is that useful since
decimal and gmpy are solving quite different problems.

Implicit conversions also open the can of worms - what is the
preferred promotion type?  decimal + mpf == decimal? mpf? mpq?

IMHO of course!
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: How to avoid "f.close" (no parens) bug?
Peter <[EMAIL PROTECTED]> wrote:
>  First off, please explain what you are talking about better next time.
>
>  Second, What on earth are you talking about?
>
>  "f" is a file object, correct?
>
>  Are you trying to close a file by typing f.close or is the file closing
>  when you type f.close?

I would guess that this person is coming to python from perl.  In
perl, you are allowed to miss the parens off a method call if it takes
no arguments, so f.close is equivalent to f.close().

I used to make this mistake all the time.  However it is one pychecker
catches, so install pychecker and run it on all your programs is the
answer!
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
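To make the failure mode concrete, here is a minimal sketch of the bug
(the filename is made up):

  f = open("data.txt", "w")
  f.write("hello")
  f.close    # evaluates the bound method and throws it away - no call,
             # so nothing is flushed or closed here
  f.close()  # what was intended

pychecker flags the bare f.close as a statement with no effect.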
simulating #include in python
We are currently investigating whether to move the data files from our
application into python for ease of maintenance.  Each data item turns
into a class definition with some class data.

The python approach looks great, but there is one feature that we'd
like to have.

Currently the data files can include other data files.  Typically
that's used to import standard definitions from somewhere else (rather
like #include in C - conceptually it's a completely textual import).
We use them something like this

  include("../../standard_definitions")
  include("../shared_definitions")

  class A(base_from_standard_definitions):
      pass

  class B(A):
      pass

  include("more_definitions")

The includes act much more like #include than import - all the symbols
from the file before the include() must be available to the included
file and the included file must export all its symbols back to the
parent.

These can of course be re-arranged to work in a pythonic way using
'from x import *' and putting "../.." on sys.path instead of the
includes.  However there are over 500 of these files so I'd prefer a
more automatic solution which doesn't require re-arrangement in the
interim.  (The re-arrangement is needed because "more_definitions"
above might refer to A, B or anything defined in
standard/shared_definitions leading to mutually recursive imports and
all the pain they cause)

I have implemented a working prototype, but it seems such a horrendous
bodge that there must surely be a better way!  I'd really like to be
able to run an __import__ in the context of the file that's running
the include() but I haven't figured that out.

Here is the code (avert your eyes if you are of a sensitive nature ;-)
Any suggestions for improvement would be greatly appreciated!

  import os
  import sys
  import inspect
  import __builtin__

  def include(path):
      # Add the include directory onto sys.path
      native_path = path.replace("/", os.path.sep)
      directory, module_name = os.path.split(native_path)
      if module_name.endswith(".py"):
          module_name = module_name[:-3]
      old_sys_path = sys.path
      if directory != "":
          sys.path.insert(0, directory)
      # Introspect to find the parent
      # Each record contains a frame object, filename, line number, function
      # name, a list of lines of context, and index within the context.
      up = inspect.stack()[1]
      frame = up[0]
      parent_name = frame.f_globals['__name__']
      parent = sys.modules[parent_name]
      # Poke all the current definitions into __builtin__ so the module
      # uses them without having to import them
      old_builtin = __builtin__.__dict__.copy()
      overridden = {}
      poked = []
      for name in dir(parent):
          if not (name.startswith("__") and name.endswith("__")):
              if hasattr(__builtin__, name):
                  overridden[name] = getattr(__builtin__, name)
              else:
                  poked.append(name)
              setattr(__builtin__, name, getattr(parent, name))
      # import the code
      module = __import__(module_name, parent.__dict__, locals(), [])
      # Undo the modifications to __builtin__
      for name in poked:
          delattr(__builtin__, name)
      for name, value in overridden.items():
          setattr(__builtin__, name, value)
      # check we did it right!  Note __builtin__.__dict__ is read only so
      # can't be over-written
      if old_builtin != __builtin__.__dict__:
          raise AssertionError("Failed to restore __builtin__ properly")
      # Poke the symbols from the import back in
      for name in dir(module):
          if not (name.startswith("__") and name.endswith("__")):
              setattr(parent, name, getattr(module, name))
      # Restore sys.path
      sys.path = old_sys_path
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: simulating #include in python
Peter Otten <[EMAIL PROTECTED]> wrote:
> Nick Craig-Wood wrote:
>
> > I'd really like to be able to run an __import__ in the context of the file
> > thats running the include() but I haven't figured that out.
>
> execfile()?

Yes that's exactly what I was looking for - thank you very much!
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
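For the archive, here is roughly what the execfile()-based include()
can look like - an untested sketch using the same frame introspection
trick as the prototype above:

  import inspect

  def include(path):
      # run the file in the caller's globals, so it sees the caller's
      # definitions and its own definitions land back in the caller
      caller_globals = inspect.stack()[1][0].f_globals
      execfile(path, caller_globals)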
Re: sort the list
Daniel Schüle <[EMAIL PROTECTED]> wrote:
>  lst.sort(lambda x,y: cmp(x[1], y[1]))

Since no-one mentioned it and it's a favourite of mine, you can use the
decorate-sort-undecorate method, or "Schwartzian Transform" eg

  lst = [[1,4],[3,9],[2,5],[3,2]]

  # decorate - ie make a copy of each item with the key(s) first and the
  # actual object last
  L = [ (x[1],x) for x in lst ]

  # sort
  L.sort()

  # undecorate
  L = [ x[-1] for x in L ]

The Schwartzian transform is especially good when making the key is
expensive - it only needs to be done N times, whereas a typical sort
routine will call the cmp function N log N times.  It's expensive in
terms of memory though.

With python 2.4 you can wrap it up into one line if you want

  [ x[-1] for x in sorted([ (x[1],x) for x in lst ]) ]

or even

  [ x[-1] for x in sorted((x[1],x) for x in lst) ]
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: sort the list
Neil Hodgson <[EMAIL PROTECTED]> wrote:
> Nick Craig-Wood:
>
> > Since no-one mentioned it and it's a favourite of mine, you can use the
> > decorate-sort-undecorate method, or "Schwartzian Transform"
>
>    That is what the aforementioned key argument to sort is: a built-in
> decorate-sort-undecorate.

It's also python 2.4 only though :-(

(We are stuck on python 2.3 here)
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
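For the record, with 2.4's built-in decoration the example from the
earlier post collapses to a key function:

  lst.sort(key=lambda x: x[1])          # in-place
  L = sorted(lst, key=lambda x: x[1])   # or building a new list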
Re: Writing pins to the RS232
Richard Brodie <[EMAIL PROTECTED]> wrote:
>  If you just need one or two signals, then it might be practical to use one
>  of the control lines, and PySerial supports this (UPS monitoring software
>  often works this way).

I've done this many times (not with PySerial) for misc sensors.

With PySerial you can read 4 pins (ie 4 inputs)

  getCD(self)
      Read terminal status line: Carrier Detect
  getCTS(self)
      Read terminal status line: Clear To Send
  getDSR(self)
      Read terminal status line: Data Set Ready
  getRI(self)
      Read terminal status line: Ring Indicator

and set two outputs

  setDTR(self, on=1)
      Set terminal status line: Data Terminal Ready
  setRTS(self, on=1)
      Set terminal status line: Request To Send

Other than those 6, all you have is Rx, Tx and Ground, which you can't
use for logic, on a standard 9-way PC serial port.

You need to set the serial port up not to do automatic handshaking
first (eg setDsrDtr() & setRtsCts())

RS232 levels are +/- 12V, though a lot of computers only generate
+/- 5V.  The threshold is +/- 3V IIRC.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
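A minimal sketch of driving those pins with PySerial - the device name
is an assumption (use COM1 or similar on windows):

  import serial

  port = serial.Serial("/dev/ttyS0", rtscts=0)  # no hardware handshaking
  port.setDTR(1)     # drive the DTR output high
  port.setRTS(0)     # and the RTS output low
  print port.getCTS(), port.getDSR(), port.getRI(), port.getCD()
  port.close()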
Re: efficient 'tail' implementation
Gerald Klix <[EMAIL PROTECTED]> wrote:
>  As long as memory mapped files are available, the fastest
>  method is to map the whole file into memory and use the
>  mappings rfind method to search for an end of line.

Excellent idea.

It'll blow up for large >2GB files on a 32bit OS though.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: efficient 'tail' implementation
Gerald Klix <[EMAIL PROTECTED]> wrote:
>  As long as memory mapped files are available, the fastest
>  method is to map the whole file into memory and use the
>  mappings rfind method to search for an end of line.

Actually mmap doesn't appear to have an rfind method :-(

Here is a tested solution using mmap using your code.  Inefficient if
number of lines to be tailed is too big.

  import os
  import sys
  import mmap

  def main(nlines, filename):
      reportFile = open( filename )
      length = os.fstat( reportFile.fileno() ).st_size
      if length == 0:
          # Don't map zero length files, windows will barf
          return
      try:
          mapping = mmap.mmap( reportFile.fileno(), length,
                               mmap.MAP_PRIVATE, mmap.PROT_READ )
      except AttributeError:
          # windows doesn't have MAP_PRIVATE / PROT_READ
          mapping = mmap.mmap( reportFile.fileno(), 0, None, mmap.ACCESS_READ )
      search = 1024
      lines = []
      while 1:
          if search > length:
              search = length
          tail = mapping[length-search:]
          lines = tail.split(os.linesep)
          if len(lines) >= nlines or search == length:
              break
          search *= 2
      lines = lines[-nlines-1:]
      print "\n".join(lines)

  if __name__ == "__main__":
      if len(sys.argv) != 3:
          print "Syntax: %s n file" % sys.argv[0]
      else:
          main(int(sys.argv[1]), sys.argv[2])
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: windows mem leak
Bob Smith <[EMAIL PROTECTED]> wrote:
>  Attached is the code. Run it yourself and see.

This seems to run nmap over series of consecutive IP addresses.  nmap
can do that all by itself.  From its man page::

  Nmap also has a more powerful notation which lets you specify an IP
  address using lists/ranges for each element.  Thus you can scan the
  whole class 'B' network 128.210.*.* by specifying '128.210.*.*' or
  '128.210.0-255.0-255' or even '128.210.1-50,51-255.1,2,3,4,5-255'.
  And of course you can use the mask notation: '128.210.0.0/16'.  These
  are all equivalent.  If you use asterisks ('*'), remember that most
  shells require you to escape them with back slashes or protect them
  with quotes.

This setting might be useful too::

  --max_parallelism <number>
      Specifies the maximum number of scans Nmap is allowed to perform
      in parallel.  Setting this to one means Nmap will never try to
      scan more than 1 port at a time.  It also affects other parallel
      scans such as ping sweep, RPC scan, etc.

[sorry not Python related but may solve your problem!]
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: Refactoring; arbitrary expression in lists
Stephen Thorne <[EMAIL PROTECTED]> wrote:
>  Why not use a regexp based approach.

Good idea...  You could also use sre.Scanner which is supposed to be
fast like this...

  import re, sre

  scanner = sre.Scanner([
      (r"\.php$", "application/x-php"),
      (r"\.(cc|cpp)$", "text/x-c++-src"),
      (r"\.xsl$", "xsl"),
      (r"Makefile", "text/x-makefile"),
      (r".", None),
      ])

  def detectMimeType( filename ):
      t = scanner.scan(filename)[0]
      if len(t) < 1:
          return None # raise NoMimeError
      return t[0]

  for f in ("index.php", "index.php3", "prog.cc", "prog.cpp",
            "flodge.xsl", "Makefile", "myMakefile", "potato.123"):
      print f, detectMimeType(f)

... prints

  index.php application/x-php
  index.php3 None
  prog.cc text/x-c++-src
  prog.cpp text/x-c++-src
  flodge.xsl xsl
  Makefile text/x-makefile
  myMakefile text/x-makefile
  potato.123 None
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: Debian says "Warning! you are running an untested version of Python." on 2.3
Alex Stapleton <[EMAIL PROTECTED]> wrote:
>  Whenever I run python I get
>
>  "Warning! you are running an untested version of Python."
>
>  prepended to the start of any output on stdout.
>
>  This is with Debian and python 2.3 (running the debian 2.1 and 2.2
>  binaries doesn't have this effect)

What version of a) Debian and b) python are you running?  I don't have
that problem here (I'm running testing/sarge)

  $ python2.4 -c 'pass'
  $ python2.3 -c 'pass'
  $ python2.2 -c 'pass'
  $ python2.1 -c 'pass'
  $
  $ dpkg -l python2.1 python2.2 python2.3 python2.4
  ||/ Name           Version        Description
  +++-==============-==============-============================================
  ii  python2.1      2.1.3-25       An interactive high-level object-oriented la
  ii  python2.2      2.2.3-10       An interactive high-level object-oriented la
  ii  python2.3      2.3.4-18       An interactive high-level object-oriented la
  ii  python2.4      2.4-2          An interactive high-level object-oriented la
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: Debian says "Warning! you are running an untested version of Python." on 2.3
Gerhard Haering <[EMAIL PROTECTED]> wrote:
>  ROFL. Are you using testing, sid or experimental? I expect overzealous
>  patching from Debian developers, but this is the worst I've heard
>  of.

I would have thought that is unlikely given the way packages move
(unmodified) from unstable -> testing (and eventually) -> stable.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: rotor replacement
Robin Becker <[EMAIL PROTECTED]> wrote:
> Paul Rubin wrote:
> > "Reed L. O'Brien" <[EMAIL PROTECTED]> writes:
> >
> > > I see rotor was removed for 2.4 and the docs say use an AES module
> > > provided separately... Is there a standard module that works alike or
> > > an AES module that works alike but with better encryption?
> >
> > If you mean a module in the distribution, the answer is no, for
> > political reasons.
>
> I'm also missing the rotor module and regret that something useful
> was warned about and now removed with no plugin replacement.
>
> I had understood that this was because rotor was insecure, but you
> mention politics. Are other useful modules to suffer from politics?
>
> What exactly are/were the political reasons for rotor removal?

Presumably he is talking about crypto-export rules.

In the past strong cryptography has been treated as munitions, and as
such exporting it (especially from the USA) could have got you into
very serious trouble.  However I believe those restrictions have been
lifted (the cat having been let out of the bag somewhat ;-), and it's
easy to do this for open source encryption software.

A wade through

  http://www.bxa.doc.gov/Encryption/enc.htm

might be interesting.

A case in point: the linux 2.6 kernel is chock full of crypto and
comes with implementations of AES, ARC4, Blowfish, Cast5+6, DES,
Serpent, Twofish, TEA, etc.  The linux kernel+source surely goes
everywhere python does so I don't think adding strong crypto modules
to python is a problem now-a-days.

AES in the core python library would be very useful and it would
discourage people from writing their own crypto routines (looks easy
but isn't!)
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: Overloading ctor doesn't work?
Martin Häcker <[EMAIL PROTECTED]> wrote:
>  Now I thought, just overide the ctor of datetime so that year, month and
>  day are static and everything should work as far as I need it.
>
>  That is, it could work - though I seem to be unable to overide the ctor. :(
>
>  Why is that?

It's a bug!

  http://sourceforge.net/tracker/index.php?func=detail&aid=720908&group_id=5470&atid=105470

However it's been fixed in a recent Python 2.3.

(I was bitten by the same thing which used to fail but now works after
an upgrade of python 2.3!)
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
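A quick way to check whether your Python has the fix is to try the
kind of override Martin describes - a sketch with a made-up pinned
date:

  import datetime

  class FixedDay(datetime.datetime):
      # pin year/month/day, let the caller choose the time
      def __new__(cls, hour=0, minute=0):
          return datetime.datetime.__new__(cls, 2005, 1, 1, hour, minute)

  print FixedDay(12, 30)    # 2005-01-01 12:30:00 on a fixed Python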
Re: rotor replacement
Paul Rubin wrote:
>  Here's the message I had in mind:
>
>  http://groups-beta.google.com/group/comp.lang.python/msg/adfbec9f4d7300cc
>
>  It came from someone who follows Python crypto issues as closely as
>  anyone, and refers to a consensus on python-dev.  I'm not on python-dev
>  myself but I feel that the author of that message is credible and is
>  not just "anyone".

And here is the relevant part...

  "A.M. Kuchling" wrote:
  > On Fri, 27 Feb 2004 11:01:08 -0800 Trevor Perrin wrote:
  > > Are you and Paul still looking at adding ciphers to stdlib?  That
  > > would make me really, really happy :-)
  >
  > No, unfortunately; the python-dev consensus was that encryption raised
  > export control issues, and the existing rotor module is now on its way
  > to being removed.

I'm sure that's wrong now-a-days.  Here are some examples of open
source software with strong crypto

  Linux kernel: http://www.kernel.org/
  GNU crypto project: http://www.gnu.org/software/gnu-crypto/
  TrueCrypt: http://truecrypt.sourceforge.net/
  OpenSSL: http://www.openssl.org/
  AEScrypt: http://aescrypt.sourceforge.net/

Note that some of these are being worked on at sourceforge just like
python.  Surely it must be possible to add a few simple crypto modules
to python?

That said a) IANAL b) 'apt-get install python-crypto' works for me ;-)
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: default value in a list
TB <[EMAIL PROTECTED]> wrote:
>  Is there an elegant way to assign to a list from a list of unknown
>  size?  For example, how could you do something like:
>
>  >>> a, b, c = (line.split(':'))
>
>  if line could have less than three fields?

You could use this old trick...

  a, b, c = (line+"::").split(':')[:3]

Or this version if you want something other than "" as the default

  a, b, b = (line.split(':') + 3*[None])[:3]

BTW This is a feature I miss from perl...
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: default value in a list
Alex Martelli <[EMAIL PROTECTED]> wrote:
> Nick Craig-Wood <[EMAIL PROTECTED]> wrote:
>    ...
> > Or this version if you want something other than "" as the default
> >
> >   a, b, b = (line.split(':') + 3*[None])[:3]
>
> Either you mean a, b, c -- or you're being subtler than I'm
> grasping.

Just a typo - I meant c!

> > BTW This is a feature I miss from perl...
>
> Hmmm, I understand missing the ``and all the rest goes here'' feature
> (I'd really love it if the rejected
>     a, b, *c = whatever
> suggestion had gone through, ah well), but I'm not sure what exactly
> you'd like to borrow instead -- blissfully by now I've forgotten a lot
> of the perl I used to know... care to clarify?

I presume your construct above is equivalent to

  my ($a, $b, @c) = split /.../;

which I do indeed miss.

Sometimes I miss the fact that in the below any unused items are set
to undef, rather than an exception being raised

  my ($a, $b, $c) = @array;

However, I do appreciate the fact (for code reliability) that the
python equivalent

  a, b, c = array

will blow up if there aren't exactly 3 elements in array.

So since I obviously can't have my cake and eat it here, I'd leave
python how it is for the second case, and put one of the suggestions
in this thread into my toolbox / the standard library.

BTW I've converted a lot of perl programs to python so I've come
across a lot of little things like this!
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: best way to do a series of regexp checks with groups
Mark Fanty <[EMAIL PROTECTED]> wrote:
>  In perl, I might do (made up example just to illustrate the point):
>
>  if(/add (\d+) (\d+)/) {
>     do_add($1, $2);
>  } elsif (/mult (\d+) (\d+)/) {
>     do_mult($1,$2);
>  } elsif(/help (\w+)/) {
>     show_help($1);
>  }

There was a thread about this recently under the title "regular
expression: perl ==> python"

Here is a different solution...

  class Result:
      def set(self, value):
          self.value = value
          return value

  m = Result()

  if m.set(re.search(r'add (\d+) (\d+)', line)):
      do_add(m.value.group(1), m.value.group(2))
  elif m.set(re.search(r'mult (\d+) (\d+)', line)):
      do_mult(m.value.group(1), m.value.group(2))
  elif m.set(re.search(r'help (\w+)', line)):
      show_help(m.value.group(1))
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: Classical FP problem in python : Hamming problem
Francis Girard <[EMAIL PROTECTED]> wrote:
>  def hamming():
>     def _hamming():
>       yield 1
>       hamming2 = hammingGenerators[0]
>       hamming3 = hammingGenerators[1]
>       hamming5 = hammingGenerators[2]
>       for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
>                       imerge(imap(lambda h: 3*h, iter(hamming3)),
>                              imap(lambda h: 5*h, iter(hamming5)))):
>         yield n
>     hammingGenerators = tee(_hamming(), 4)
>     return hammingGenerators[3]

If you are after readability, you might prefer this...

  def hamming():
      def _hamming():
          yield 1
          for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
                          imerge(imap(lambda h: 3*h, iter(hamming3)),
                                 imap(lambda h: 5*h, iter(hamming5)))):
              yield n
      hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
      return result

PS interesting thread - never heard of Hamming sequences before!
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: Classical FP problem in python : Hamming problem
Francis Girard <[EMAIL PROTECTED]> wrote:
>  The following implementation is even more speaking as it makes
>  self-evident and almost mechanical how to translate algorithms that
>  run after their tail from recursion to "tee" usage :
>
>  *** BEGIN SNAP
>  from itertools import tee, imap
>  import sys
>
>  def imerge(xs, ys):
>     x = xs.next()
>     y = ys.next()
>     while True:
>        if x == y:
>           yield x
>           x = xs.next()
>           y = ys.next()
>        elif x < y:
>           yield x
>           x = xs.next()
>        else:
>           yield y
>           y = ys.next()

Thinking about this some more leads me to believe a general purpose
imerge taking any number of arguments will look neater, eg

  def imerge(*generators):
      values = [ g.next() for g in generators ]
      while True:
          x = min(values)
          yield x
          for i in range(len(values)):
              if values[i] == x:
                  values[i] = generators[i].next()

>  def hamming():
>     def _hamming():
>       yield 1
>       hamming2 = hammingGenerators[0]
>       hamming3 = hammingGenerators[1]
>       hamming5 = hammingGenerators[2]
>       for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
>                       imerge(imap(lambda h: 3*h, iter(hamming3)),
>                              imap(lambda h: 5*h, iter(hamming5)))):
>         yield n
>     hammingGenerators = tee(_hamming(), 4)
>     return hammingGenerators[3]

This means that this can be further simplified thus,

  def hamming():
      def _hamming():
          yield 1
          for n in imerge( imap(lambda h: 2*h, hamming2),
                           imap(lambda h: 3*h, hamming3),
                           imap(lambda h: 5*h, hamming5) ):
              yield n
      hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
      return result

(Note the iter(...) seemed not to be doing anything useful so I
removed them)
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: Classical FP problem in python : Hamming problem
Steven Bethard <[EMAIL PROTECTED]> wrote:
> Nick Craig-Wood wrote:
> > Thinking about this some more leads me to believe a general purpose
> > imerge taking any number of arguments will look neater, eg
> >
> > def imerge(*generators):
> >     values = [ g.next() for g in generators ]
> >     while True:
> >         x = min(values)
> >         yield x
> >         for i in range(len(values)):
> >             if values[i] == x:
> >                 values[i] = generators[i].next()
>
> This kinda looks like it dies after the first generator is exhausted,
> but I'm not certain.

Yes it will stop iterating then (rather like zip() on lists of unequal
size).  Not sure what the specification should be!  It works for the
hamming problem though.

  >>> list(imerge(iter([1, 2]), iter([1, 2, 3]), iter([1, 2, 3, 4])))
  [1, 2]

> An alternate version that doesn't search for 'i':
>
> py> def imerge(*iterables):
> ...     iters = [iter(i) for i in iterables]
> ...     values = [i.next() for i in iters]
> ...     while iters:
> ...         x, i = min((val, i) for i, val in enumerate(values))
> ...         yield x
> ...         try:
> ...             values[i] = iters[i].next()
> ...         except StopIteration:
> ...             del iters[i]
> ...             del values[i]
> ...
> py> list(imerge([1, 4, 7], [2, 5, 8], [3, 6, 9]))
> [1, 2, 3, 4, 5, 6, 7, 8, 9]
> py> list(imerge([3, 6, 9], [1, 4, 7], [2, 5, 8]))
> [1, 2, 3, 4, 5, 6, 7, 8, 9]
> py> list(imerge([1, 4, 7], [3, 6, 9], [2, 5, 8]))
> [1, 2, 3, 4, 5, 6, 7, 8, 9]

This isn't quite right...

  >>> list(imerge([1, 2, 3], [1, 2, 3], [1, 2, 3]))
  [1, 1, 1, 2, 2, 2, 3, 3, 3]

This should produce

  [1, 2, 3]

So I'm afraid the searching *is* necessary - you've got to find all
the generators with the min value and move them on.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: Classical FP problem in python : Hamming problem
Francis Girard <[EMAIL PROTECTED]> wrote:
>  Thank you Nick and Steven for the idea of a more generic imerge.

You are welcome :-)  [It came to me while walking the children to
school!]

[snip]

>  class IteratorDeiterator:
>     def __init__(self, iterator):
>       self._iterator = iterator.__iter__()
>       self._firstVal = None ## Avoid consuming if not requested from outside
>                             ## Works only if iterator itself can't return None

You can use a sentinel here if you want to avoid the "can't return
None" limitation.  For a sentinel you need an object your iterator
couldn't possibly return.  You can make one up, eg

  self._sentinel = object()
  self._firstVal = self._sentinel

Or you could use self (but I'm not 100% sure that your recursive
functions wouldn't return it!)

>     def __iter__(self): return self
>
>     def next(self):
>       valReturn = self._firstVal
>       if valReturn is None:

and

  if valReturn is self._sentinel:

>          valReturn = self._iterator.next()
>       self._firstVal = None

  self._firstVal = self._sentinel

etc..

[snip more code]

Thanks for some more examples of fp-style code.  I find it hard to get
my head round so its been good exercise!
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
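Folding those edits together, the whole wrapper comes out something
like this - only a sketch, since the rest of Francis's class was
snipped above:

  class IteratorDeiterator:
      def __init__(self, iterator):
          self._iterator = iterator.__iter__()
          # nothing the iterator returns can be 'is' this object
          self._sentinel = object()
          self._firstVal = self._sentinel

      def __iter__(self): return self

      def next(self):
          valReturn = self._firstVal
          if valReturn is self._sentinel:
              valReturn = self._iterator.next()
          self._firstVal = self._sentinel
          return valReturn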
Re: Help With Python
Steven Bethard <[EMAIL PROTECTED]> wrote:
> Nick Vargish wrote:
> > # this is one viking making one order repeated 511 times. if you want
> > # 511 vikings making separate orders, you'll have to write a loop.
>
> No need to write a loop:
>
> py> class Viking(object):
> ...     def order(self):
> ...         return 'Spam'
> ...
> py> v = Viking()
> py> orders = [v.order()] * 7
> py> ', '.join(orders)
> 'Spam, Spam, Spam, Spam, Spam, Spam, Spam'
> py> orders = [Viking().order()] * 7
> py> ', '.join(orders)
> 'Spam, Spam, Spam, Spam, Spam, Spam, Spam'

That's still one Viking making 7 orders surely?  Eg

  >>> vikings = [Viking()] * 7
  >>> vikings[0] is vikings[1]
  True

whereas

  >>> vikings = [Viking() for _ in range(7)]
  >>> vikings[0] is vikings[1]
  False

So you want this...

  >>> orders = [ Viking().order() for _ in range(7) ]
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: Inherting from object. Or not.
Nick Coghlan <[EMAIL PROTECTED]> wrote:
>  Exactly. My advice is to use new-style classes unless you have a
>  reason not to (if you're inheriting from a builtin type, then there
>  is no need to inherit from object as well - the builtin types
>  already have the correct basic type).

Except for Exception!  Exception and anything that inherits from it is
an old style class.

I discovered the other day that you can't throw a new style class as
an exception at all, eg

  >>> class MyException(object): pass
  ...
  >>> raise MyException
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  TypeError: exceptions must be classes, instances, or strings (deprecated), not type
  >>>

(not a terribly helpful message - took me a while to work it out!)

whereas old style works fine...

  >>> class MyOldException: pass
  ...
  >>> raise MyOldException
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  __main__.MyOldException: <__main__.MyOldException instance at 0xb7df4cac>
  >>>

After that I recalled a thread on python-dev about it

  http://mail.python.org/pipermail/python-dev/2004-August/046812.html
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: XOR on string
Peter Hansen <[EMAIL PROTECTED]> wrote:
> snacktime wrote:
> > I need to calculate the lrc of a string using an exclusive or on each
> > byte in the string.  How would I do this in python?
>
> lrc == Linear Redundancy Check? or Longitudinal?  Note that
> such terms are not precisely defined... generally just acronyms
> people make up and stick in their user manuals for stuff. :-)
>
> import operator
> lrc = reduce(operator.xor, [ord(c) for c in string])

Or for the full functional programming effect...

  lrc = reduce(operator.xor, map(ord, string))

which is slightly faster and shorter...

  $ python2.4 -m timeit -s'import operator; string = "abcdefghij13123kj12l3k1j23lk12j3l12kj3"' \
      'reduce(operator.xor, [ord(c) for c in string])'
  10000 loops, best of 3: 20.3 usec per loop

  $ python2.4 -m timeit -s'import operator; string = "abcdefghij13123kj12l3k1j23lk12j3l12kj3"' \
      'reduce(operator.xor, map(ord, string))'
  100000 loops, best of 3: 15.6 usec per loop
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: Elliptic Code
Philip Smith <[EMAIL PROTECTED]> wrote:
>  I understand the algorithm quite well but how to code the multiplication
>  stage most efficiently in python eludes me.

You might want to look at

  http://gmpy.sourceforge.net/

It has very fast multiplication up to any size you like!
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
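For a taste of it, a minimal sketch assuming gmpy is installed - the
modulus and values are made up, and mpz values mix freely with
ordinary Python ints in expressions:

  import gmpy

  p = gmpy.mpz(2)**255 - 19      # a large modulus (illustrative)
  x = gmpy.mpz(123456789)
  print (x * x) % p              # multiplication stays fast at any size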
Re: debugging os.spawn*() calls
Martin Franklin <[EMAIL PROTECTED]> wrote:
> Skip Montanaro wrote:
> > I have an os.spawnv call that's failing:
> >
> >     pid = os.spawnv(os.P_NOWAIT, "ssh",
> >                     ["ssh", remote,
> >                      "PATH=%(path)s nice -20 make -C %(pwd)s" % locals()])
> >
> > When I wait for it the status returned is 32512, indicating an exit status
> > of 127.  Unfortunately, I see no way to collect stdout or stderr from the
> > spawned process, so I can't tell what's going wrong.
>
> While not a 'real' answer - I use pexpect to automate my ssh scripts
> these days as I had a few problems using ssh with the os.* family
> perhaps you may find pexpect a wee bit easier...

If using 2.4 the subprocess module is a good solution too.  It lets
you catch stdout/stderr easily.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
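For reference, a sketch of what the subprocess version of Skip's
spawnv call could look like - the remote/path/pwd values are
placeholders:

  import subprocess

  remote = "somehost"          # placeholders - use your own values
  path = "/usr/bin:/bin"
  pwd = "/path/to/build"

  p = subprocess.Popen(["ssh", remote,
                        "PATH=%(path)s nice -20 make -C %(pwd)s" % locals()],
                       stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  out, err = p.communicate()   # collects both streams
  print "exit status:", p.returncode
  print out
  print err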
Re: What's so funny? WAS Re: rotor replacement
Paul Rubin wrote:
>  An AES or DES addition to an existing module that implements just one
>  call:
>     ECB(key, data, direction)
>  would be a huge improvement over what we have now.  A more complete
>  crypto module would have some additional operations, but ECB is the
>  only one that's really essential.

I would hate to see a module which only implemented ECB.  Sure it's
the only operation necessary to build the others out of, but it's the
least secure mode of any block cipher.

If you don't offer users a choice, then they'll use ECB and just that
along with all its pitfalls, meanwhile thinking that they are secure
because they are using AES/DES...

For those people following along at home (I'm sure everyone who has
contributed to this thread knows this already) I tried to find a
simple link to why ECB is bad, this PDF is the best I could come up
with, via Google's Cache.

  http://www.google.com/search?q=cache:U5-RsbkSs0MJ:www.cs.chalmers.se/Cs/Grundutb/Kurser/krypto/lect04_4.pdf
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
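To see the pitfall concretely: under ECB equal plaintext blocks always
encrypt to equal ciphertext blocks, so structure in the input shows
straight through.  A minimal sketch using PyCrypto (the python-crypto
package mentioned elsewhere in this thread - the key is a made-up
constant):

  from Crypto.Cipher import AES

  key = "0123456789abcdef"                     # 16 byte key, illustrative only
  cipher = AES.new(key, AES.MODE_ECB)
  ct = cipher.encrypt("SIXTEEN BYTE MSG" * 2)  # two identical blocks
  print ct[0:16] == ct[16:32]                  # True - the repeat leaks through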
Re: What's so funny? WAS Re: rotor replacement
Skip Montanaro <[EMAIL PROTECTED]> wrote:
>  Fine.  Go build a sumo distribution and track the normal CPython.
>  The problem isn't all that new.  (Take a look at scipy.org for one
>  take on that theme.  Of course Linux distros have been doing their
>  take on this forever.)

If I'm writing code just for fun, I'll be doing it on Debian Linux,
then I can do

  apt-get install python-crypto

and I'm away.

However if I'm writing code for work, it has to work on windows as
well, which introduces a whole extra barrier to using 3rd party
modules.  Something else to compile.  Something else to stick in the
installer.  Basically a whole heap of extra work.

I think one of the special things about Python is its batteries
included approach, and a crypto library would seem to be an obvious
battery to install since it doesn't (or needn't) depend on any other
library or application.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: What's so funny? WAS Re: rotor replacement
Paul Rubin wrote:
> "Martin v. Löwis" <[EMAIL PROTECTED]> writes:
> > Apparently, people disagree on what precisely the API should be. E.g.
> > cryptkit has
> >
> >   obj = aes(key)
> >   obj.encrypt(data)
>
> I don't disagree about the API.  The cryptkit way is better than ECB
> example I gave, but the ECB example shows it's possible to do it in
> one call.

There is a PEP about this...

  API for Block Encryption Algorithms v1.0
  http://www.python.org/peps/pep-0272.html
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
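In PEP 272 style a cipher module exposes a new(key, mode, IV=...)
factory and the returned object does the work.  PyCrypto follows the
PEP, so the pattern looks like this - a sketch with made-up key and IV
constants (never use a fixed IV in real code):

  from Crypto.Cipher import AES

  key = "0123456789abcdef"      # 16 bytes
  IV  = "\0" * 16               # illustrative only!
  enc = AES.new(key, AES.MODE_CBC, IV)
  ct = enc.encrypt("sixteen byte msg")
  dec = AES.new(key, AES.MODE_CBC, IV)
  print dec.decrypt(ct)         # -> 'sixteen byte msg'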
Re: limited python virtual machine
Jack Diederich <[EMAIL PROTECTED]> wrote:
>  The Xen virtual server[1] was recently mentioned on slashdot[2].
>  It is more lightweight and faster than full scale machine emulators
>  because it uses a modified system kernel (so it only works on *nixes
>  it has been ported to).

...it also uses python for its control programs.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: What's so funny? WAS Re: rotor replacement
Paul Rubin wrote:
>  Actually and surprisingly, that's not really true.  Crypto algorithms
>  are pretty straightforward, so if you examine the code and check that
>  it passes a bunch of test vectors, you can be pretty sure it's
>  correct.

I was going to write pretty much the same thing.

If a security flaw is found in a block cipher (say) it won't be
because it has a buffer overflow etc, it will be because the algorithm
is flawed.  You can't patch up crypto algorithms, you have to throw
them away and start again (you can't have two incompatible versions of
DES for instance).
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
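For example, checking an AES implementation against the published
AES-128 vector from FIPS-197 appendix C.1 takes a couple of lines - a
sketch using PyCrypto:

  from Crypto.Cipher import AES

  key = "000102030405060708090a0b0c0d0e0f".decode("hex")
  pt  = "00112233445566778899aabbccddeeff".decode("hex")
  ct  = "69c4e0d86a7b0430d8cdb78070b4c55a".decode("hex")
  assert AES.new(key, AES.MODE_ECB).encrypt(pt) == ct
  print "test vector passed"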
Re: Where are list methods documented?
Tim Peters <[EMAIL PROTECTED]> wrote:
>  You could have gotten to the same place in several ways.
[snip]

Since I'm a unix person, I would have typed

  pydoc -k sort

But it doesn't come up with anything useful :-(

  $ pydoc -k sort
  MySQLdb.stringtimes - Use strings to handle date and time columns as a last resort.
  ...

In fact I've never had any luck with pydoc -k.  I think what it
searches through is too narrow - from its man page

  Run "pydoc -k <keyword>" to search for a keyword in the synopsis
  lines of all available modules.

This is the NAME line that appears in pydoc I think.  If it did a full
text search that would be much more useful, or at least a search
through all the names of the functions & variables that are exported.

...

My top wish for pydoc is to have the rest of the documentation in it!
Each module has lots of nice documentation which is only available in
HTML format - I'd much rather type "pydoc whatever" to see it all,
rather than have to load my browser, and click through several layers
of web pages.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: Which kid's beginners programming - Python or Forth?
Rocco Moretti <[EMAIL PROTECTED]> wrote:
>  So for Math you'd do something like:
>
>  y = b + mx + cx^2
>
>  (Where ^2 is a superscript 2)
>
>  For Python it would be:
>
>  y = b + m*x + c*x**2
>
>  IIRC, for Forth it would be something like (please excuse the mistakes
>  in operator notation):
>
>  x 2 ^ c * m x * + b + 'y' setvar

In FORTH you don't generally use variables unless you really have to -
that is what the stack is for, so you'd write a word like this...

  variable c 10 c !
  variable m -2 m !
  variable b 14 b !

  : quad ( x -- b + m*x + c*x**2 )
      dup dup ( x x x )
      * c @ * swap ( cx**2 x )
      m @ * + ( m*x + c*x**2 )
      b @ + ( b + m*x + c*x**2 )
  ;

And now we test

  7 quad . 490  ok

Was that easy?  Not really!  Compared to python...

  >>> c = 10
  >>> m = -2
  >>> b = 14
  >>> def quad(x): return b + m*x + c*x**2
  ...
  >>> quad(7)
  490

Was it fun?  Well yes it was!  FORTH is much lower level than python
and you learn different things from it.  At each step you have to
worry about what is on the stack - attention to detail which is
important for a programmer.  It's a lot of work to do even the simple
stuff though.  It's much easier to understand how FORTH works, and
even implement your own from scratch.

I learnt FORTH a long time ago, and I haven't used it for many many
years!  Its major pull back then was that it was fast, and easier to
write than assembler.  I don't think that really matters now though,
Python is just as fast thanks to the 3 GHz machine I'm running it on
(rather than the 4 MHz one I ran FORTH on then!)

I think FORTH would be an interesting supplementary language for
anyone to learn though...

*However* I reckon Python would make a much better first language than
FORTH.  The batteries included approach makes a young programmer's
life much, much more fun, rather than starting from almost nothing (in
modern terms) with FORTH.  And like FORTH, Python has the interactive
console which is essential when starting out.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: sending binary files to a 16 micro controller.
Grant Edwards <[EMAIL PROTECTED]> wrote:
>  You have no control over packet size in TCP if you use the
>  normal socket interface.  About the only thing you can do is
>  put delays between calls to send() in hope that the TCP stack
>  will send a packet.

You can set the MTU (maximum transfer unit) for that interface.  You
do this with ifconfig under un*x - I expect windows has an interface
to do it too (perhaps ipconfig?)

For ethernet the MTU is 1500 bytes normally.

>  If you really do want control over packet size, you'll have to
>  use a raw socket and implement TCP yourself.

Actually I'd recommend the OP uses UDP, not TCP.  I've implemented a
few systems which speak UDP directly and it's very easy.  I wouldn't
like to implement TCP though!
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
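Since each UDP sendto() turns into exactly one datagram (fragmentation
aside), the sender controls packet boundaries directly - a minimal
sketch with a made-up address and payload size:

  import socket

  sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
  # each sendto() goes out as a single datagram of exactly this size
  sock.sendto("x" * 512, ("192.168.1.100", 5000))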
Re: Revised PEP 349: Allow str() to return unicode strings
Thomas Heller <[EMAIL PROTECTED]> wrote:
>  I like the fact that currently unicode(x) is guaranteed to return a
>  unicode instance, or raises a UnicodeDecodeError.  Same for str(x),
>  which is guaranteed to return a (byte) string instance or raise an
>  error.

I guess it's analogous to this...

  >>> int(100L)
  100L

> Wouldn't also a new function make the intent clearer?
>
> So I think I'm +1 on the text() built-in, and -0 on changing str.

Couldn't basestring() perform this function?  It's kind of what
basestring is for isn't it?
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: dual processor
Jeremy Jones <[EMAIL PROTECTED]> wrote:
>  One Python process will only saturate one CPU (at a time) because
>  of the GIL (global interpreter lock).

I'm hoping python won't always be like this.

If you look at another well known open source program (the Linux
kernel) you'll see the progression I'm hoping for.

At the moment Python is at the Linux 2.0 level.  It supports multiple
processors, but has a single lock (Python == Global Interpreter Lock,
Linux == Big Kernel Lock).

Linux then took the path of splitting the BKL into smaller and smaller
locks, increasing the scalability over multiple processors.
Eventually by 2.6 we now have a fully preempt-able kernel, lock-less
read-copy-update etc.

Splitting the GIL introduces performance and memory penalties.  It's
been tried before in python (I can't find the link at the moment -
sorry!).  Exactly the same complaint was heard when Linux started
splitting its BKL.  However it's crystal clear now the future is SMP.
Modern chips seem to have hit the GHz barrier, and now the easy meat
for the processor designers is to multiply silicon and make multiple
thread / core processors all in a single chip.

So, I believe Python has got to address the GIL, and soon.

A possible compromise (also used by Linux) would be to have two python
binaries.  One with the GIL which will be faster on uniprocessor
machines, and one with a system of fine grained locking for
multiprocessor machines.  This would be selected at compile time using
C Macro magic.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: dual processor
Scott David Daniels <[EMAIL PROTECTED]> wrote:
> Nick Craig-Wood wrote:
> > Splitting the GIL introduces performance and memory penalties.
> > However it's crystal clear now the future is SMP.  Modern chips seem
> > to have hit the GHz barrier, and now the easy meat for the processor
> > designers is to multiply silicon and make multiple thread / core
> > processors all in a single chip.
> > So, I believe Python has got to address the GIL, and soon.
>
> However, there is no reason to assume that those multiple cores must
> work in the same process.

No of course not.  However if they aren't then you've got the horrors
of IPC to deal with!  Which is difficult to do fast and portably.
Much easier to communicate with another thread, especially with the
lovely python threading primitives.

> One of the biggest issues in running python in multiple
> simultaneously active threads is that the Python opcodes themselves
> are no longer indivisible.  Making a higher level language that
> allows updates work with multiple threads involves lots of
> coordination between threads simply to know when data structures
> are correct and when they are in transition.

Sure!  No one said it was easy.  However I think it can be done to all
of python's native data types, and in a way that is completely
transparent to the user.

> Even processes sharing some memory (in a "raw binary memory" style) are
> easier to write and test.  You'd lose too much processor to coordination
> effort which was likely unnecessary.  The simplest example I can think
> of is decrementing a reference count.  Only one thread can be allowed to
> DECREF at any given time for fear of leaking memory, even though it will
> most often turn out the objects being DECREF'ed by distinct threads are
> themselves distinct.

Yes locking is expensive.  If we placed a lock in every python object
that would bloat memory usage and cpu time grabbing and releasing all
those locks.  However if it meant your threaded program could use 90%
of all 16 CPUs, rather than 100% of one I think it's obvious where the
payoff lies.

Memory is cheap.  Multiple cores (SMP/SMT) are everywhere!

> In short, two Python threads running simultaneously cannot trust
> that any basic Python data structures they access are in a
> consistent state without some form of coordination.

Aye, lots of locking is needed.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: dual processor
Paul Rubin wrote:
> Jorgen Grahn <[EMAIL PROTECTED]> writes:
> > I feel the recent SMP hype (in general, and in Python) is a red
> > herring. Why do I need that extra performance?  What application
> > would use it?
>
> How many mhz does the computer you're using right now have?  When did
> you buy it?  Did you buy it to replace a slower one?  If yes, you must
> have wanted more performance.  Just about everyone wants more
> performance.  That's why mhz keeps going up and people keep buying
> faster and faster cpu's.
>
> CPU makers seem to be running out of ways to increase mhz.  Their next
> avenue to increasing performance is SMP, so they're going to do that
> and people are going to buy those.  Just like other languages, Python
> makes perfectly good use of increasing mhz, so it keeps up with them.
> If the other languages also make good use of SMP and Python doesn't,
> Python will fall back into obscurity.

Just to back your point up, here is a snippet from theregister about
Sun's new server chip.  (This is a rumour piece but theregister
usually gets it right!)

  Sun has positioned Niagara-based systems as low-end to midrange Xeon
  server killers.  This may sound like a familiar pitch - Sun used it
  with the much delayed UltraSPARC IIIi processor.  This time around
  though Sun seems closer to delivering on its promises by shipping an
  8 core/32 thread chip.  It's the most radical multicore design to
  date from a mainstream server processor manufacturer and arrives
  more or less on time.

It goes on later to say "The physical processor has 8 cores and 32
virtual processors" and runs at 1080 MHz.  So fewer GHz but more CPUs
is the future according to Sun.

  http://www.theregister.co.uk/2005/09/07/sun_niagara_details/
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: Premature wakeup of time.sleep()
Erich Schreiber <[EMAIL PROTECTED]> wrote:
>  In the Python Library Reference the explanation of the time.sleep()
>  function reads amongst others:
>
> > The actual suspension time may be less than that requested because
> > any caught signal will terminate the sleep() following execution
> > of that signal's catching routine. Also, the suspension time may
> > be longer than requested by an arbitrary amount because of the
> > scheduling of other activity in the system.
>
>  I don't understand the first part of this passage with the premature
>  wakeup. What signals would that be?

If someone sent your process a signal.  Say you pressed CTRL-C - that
generates the INT signal which python translates to the
KeyboardInterrupt exception - and which does interrupt the sleep()
system call.  This probably isn't happening to you though.

>  In the logs I see about 1% of the wake-up delays being negative
>  from -1ms to about -20ms somewhat correlated with the duration of the
>  sleep. 20 minute sleeps tend to wake-up earlier than sub-second
>  sleeps. Can somebody explain this to me?

Sleep under linux has a granularity of the timer interrupt rate (known
as HZ or jiffies in linux-speak).  Typically this is 100 Hz, which is
a granularity of 10ms.  So I'd expect your sleeps to be no more
accurate than +/- 10ms.  +/- 20ms doesn't seem unreasonable either.

(Linux 2.4 was fond of 100Hz.  It's more configurable in 2.6 so could
be 1000 Hz.  It's likely to be 100 Hz or less in a virtual private
server.)

I'm not sure the best way of finding out your HZ, here is one (enter
it all on one line)

  start=`grep timer /proc/interrupts | awk '{print $2}'`; sleep 1; end=`grep timer /proc/interrupts | awk '{print $2}'`; echo $(($end-$start))

Which prints a number about 1000 on my 2.6 machine.
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
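You can watch the granularity from Python itself with a quick sketch
like this - on a HZ=100 kernel the 1ms requests typically come back as
10-20ms:

  import time

  for _ in range(5):
      start = time.time()
      time.sleep(0.001)       # ask for 1ms
      print "%.1f ms" % ((time.time() - start) * 1000)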
Re: Efficient checksum calculating on lagre files
Ola Natvig <[EMAIL PROTECTED]> wrote:
>  Hi all
>
>  Does anyone know of a fast way to calculate checksums for a large
>  file.  I need a way to generate ETag keys for a webserver, the ETag of
>  large files are not really necessary, but it would be nice if I could
>  do it.  I'm using the python hash function on the dynamic generated
>  strings (like in page content) but on things like images I use the
>  shutil's copyfileobject function and the hash of a fileobject's hash
>  are it's handlers memory address.
>
>  Does anyone know a python utility which is possible to use, perhaps
>  something like the md5sum utility on *nix systems.

Here is an implementation of md5sum in python.  It's the same speed
give or take as md5sum itself.  This isn't surprising since md5sum is
dominated by CPU usage of the MD5 routine (in C in both cases) and/or
io (also in C).

I discarded the first run so both tests ran with large_file in the
cache.

  $ time md5sum large_file
  e7668fdc06b68fbf087a95ba888e8054  large_file

  real    0m1.046s
  user    0m0.946s
  sys     0m0.071s

  $ time python md5sum.py large_file
  e7668fdc06b68fbf087a95ba888e8054  large_file

  real    0m1.033s
  user    0m0.926s
  sys     0m0.108s

  $ ls -l large_file
  -rw-r--r--  1 ncw ncw 115933184 Jul  8  2004 large_file

  """
  Re-implementation of md5sum in python
  """

  import sys
  import md5

  def md5file(filename):
      """Return the hex digest of a file without loading it all into memory"""
      fh = open(filename, "rb")   # binary mode - matters on windows
      digest = md5.new()
      while 1:
          buf = fh.read(4096)
          if buf == "":
              break
          digest.update(buf)
      fh.close()
      return digest.hexdigest()

  def md5sum(files):
      for filename in files:
          try:
              print "%s  %s" % (md5file(filename), filename)
          except IOError, e:
              print >> sys.stderr, "Error on %s: %s" % (filename, e)

  if __name__ == "__main__":
      md5sum(sys.argv[1:])
--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
--
http://mail.python.org/mailman/listinfo/python-list
Re: Java Integer.ParseInt translation to python
jose isaias cabrera <[EMAIL PROTECTED]> wrote:

[java output]
> StringLength = 40
> c1 193 -63
> 7c 124 124
> e1 225 -31
> 86 134 -122
> ab 171 -85
> 94 148 -108
> ee 238 -18
> b0 176 -80
> de 222 -34
> 8a 138 -118
> e3 227 -29
> b5 181 -75
> b7 183 -73
> 51 81 81
> a7 167 -89
> c4 196 -60
> d8 216 -40
> e9 233 -23
> ed 237 -19
> eb 235 -21
> [EMAIL PROTECTED]
>
> But, here is what I have for python,
>
> def PrepareHash(HashStr):
>     while len(HashStr) > 0:
>         byte = HashStr[0:2]
>         print byte,int(byte,16),byte(int(byte,16)) # & 0xff
>         HashStr = HashStr[2:]
>     return byte
>
> def Main():
>     HashStr = "c17ce186ab94eeb0de8ae3b5b751a7c4d8e9edeb"
>     HashStr = PrepareHash(HashStr)
>     print "Prepared HashStr :",HashStr
>
> Main()

When I try your code I get this...

>>> def PrepareHash(HashStr):
...     while len(HashStr) > 0:
...         byte = HashStr[0:2]
...         print byte,int(byte,16),byte(int(byte,16)) # & 0xff
...         HashStr = HashStr[2:]
...     return byte
...
>>> HashStr = "c17ce186ab94eeb0de8ae3b5b751a7c4d8e9edeb"
>>> print PrepareHash(HashStr)
c1 193
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 4, in PrepareHash
TypeError: 'str' object is not callable
>>>

You can't do byte(int(byte,16)) - byte is a string! So you haven't posted the actual code you ran...

Anyway what you want is this

>>> decoded = HashStr.decode('hex')
>>> for i,c in enumerate(decoded): print "%2d %02x" % (i,ord(c))
...
 0 c1
 1 7c
 2 e1
 3 86
 4 ab
 5 94
 6 ee
 7 b0
 8 de
 9 8a
10 e3
11 b5
12 b7
13 51
14 a7
15 c4
16 d8
17 e9
18 ed
19 eb

> and it results to,
>
> mulo 19:32:06-> python test.py
> c1 193 Á
> 7c 124 |
> e1 225 á
> 86 134
> ab 171 «
> 94 148
> ee 238 î
> b0 176 °
> de 222 Þ
> 8a 138
> e3 227 ã
> b5 181 µ
> b7 183 ·
> 51 81 Q
> a7 167 §
> c4 196 Ä
> d8 216 Ø
> e9 233 é
> ed 237 í
> eb 235 ë
>
> which is not even close, and yes, I know that it's not the same
> code.

Actually the hex digits are the same in all 3 cases, so the strings are the same. The reason the characters look different is because you've got a different encoding for the python and java output I would guess. Java bytes are signed also, just to add to the confusion.

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
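If you do want python to print the java-style signed values as well, the conversion is just a subtraction - a minimal sketch reproducing all three columns of the java output:

    HashStr = "c17ce186ab94eeb0de8ae3b5b751a7c4d8e9edeb"

    for i in range(0, len(HashStr), 2):
        pair = HashStr[i:i+2]
        unsigned = int(pair, 16)
        signed = unsigned
        if signed > 127:
            signed -= 256       # java bytes are signed 8 bit quantities
        print "%s %d %d" % (pair, unsigned, signed)

This prints "c1 193 -63" and so on, matching the java output above.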
Re: Is Python as capable as Perl for sysadmin work?
Jeff Epler <[EMAIL PROTECTED]> wrote: > Finally, Python just doesn't respond to threats as well as Perl does. > I have run into many Perl programs that just didn't quite work right > until I wrote '... or die "$!"' in the right places. I find '... or die "You [EMAIL PROTECTED]"' works even better ;-) Thanks for a very amusing post! -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Efficient checksum calculating on lagre files
Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> on my machine, Python's md5+mmap is a little bit faster than
> subprocess+md5sum:
>
>     import os, md5, mmap
>
>     file = open(fn, "r+")
>     size = os.path.getsize(fn)
>     hash = md5.md5(mmap.mmap(file.fileno(), size)).hexdigest()
>
> (I suspect that md5sum also uses mmap, so the difference is
> probably just the subprocess overhead)

But you won't be able to md5sum a file bigger than about 4 Gb if using a 32bit processor (like x86) will you? (I don't know how the kernel / user space VM split works on windows but on linux 3Gb is the maximum possible size you can mmap.)

    $ dd if=/dev/zero of=z count=1 bs=1048576 seek=8192

    $ ls -l z
    -rw-r--r--  1 ncw ncw 8590983168 Feb  9 09:26 z

>>> fn="z"
>>> import os, md5, mmap
>>> file = open(fn, "rb")
>>> size = os.path.getsize(fn)
>>> size
8590983168L
>>> hash = md5.md5(mmap.mmap(file.fileno(), size)).hexdigest()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: memory mapped size is too large (limited by C int)
>>>

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Efficient checksum calculating on lagre files
Thomas Heller <[EMAIL PROTECTED]> wrote: > Nick Craig-Wood <[EMAIL PROTECTED]> writes: > > Here is an implementation of md5sum in python. Its the same speed > > give or take as md5sum itself. This isn't suprising since md5sum is > > dominated by CPU usage of the MD5 routine (in C in both cases) and/or > > io (also in C). > > Your code won't work correctly on Windows, since you have to open files > with mode 'rb'. Yes you are correct (good old Windows ;-) > But there's a perfect working version in the Python distribution already: > tools/Scripts/md5sum.py The above is easier to understand though. -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: probably weird or stupid newbie dictionary question
Diez B. Roggisch <[EMAIL PROTECTED]> wrote:
> But what happens in case of a hash code clash? Then a list of (key, values)
> is stored, and for a passed key, each key in that list is additionally
> compared for being equal to the passed one. So another requirement of
> hashable objects is comparability. In java, this is done using the
> equals method.
>
> So in the end, the actual mapping of key, value looks like this:
>
> hash(key) -> [(key, value), ]

That's called hashing with chaining. See Knuth: Sorting and Searching if you want to know more!

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
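A toy sketch of the idea - nothing like the real C implementation of dict, just chaining made visible:

    class ChainedDict:
        """Toy hash table with chaining: each bucket is a list of (key, value)."""
        def __init__(self, nbuckets=8):
            self.buckets = [[] for i in range(nbuckets)]

        def __setitem__(self, key, value):
            bucket = self.buckets[hash(key) % len(self.buckets)]
            for i in range(len(bucket)):
                if bucket[i][0] == key:     # keys compared for equality
                    bucket[i] = (key, value)
                    return
            bucket.append((key, value))

        def __getitem__(self, key):
            bucket = self.buckets[hash(key) % len(self.buckets)]
            for k, v in bucket:
                if k == key:
                    return v
            raise KeyError(key)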
Re: convert list of tuples into several lists
Cappy2112 <[EMAIL PROTECTED]> wrote:
> What does the leading * do?

It causes the list/tuple following the * to be unpacked into function arguments. Eg

>>> zip(*[(1, 2, 3), (4, 5, 6)])
[(1, 4), (2, 5), (3, 6)]

is the same as

>>> zip((1, 2, 3), (4, 5, 6))
[(1, 4), (2, 5), (3, 6)]

The * should make you think of dereferencing things (eg pointer de-reference in C). It's equivalent to the now deprecated apply function, which does the same thing in a more wordy fashion, ie apply the list as parameters to the function.

>>> apply(zip, [(1, 2, 3), (4, 5, 6)])
[(1, 4), (2, 5), (3, 6)]

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Python and version control
Sergei Organov <[EMAIL PROTECTED]> wrote: > Carl <[EMAIL PROTECTED]> writes: > > [...] > > I am a keen user of Emacs, but version control, which is very simple > > when you are in a Linux environment, for example, is not a > > straightforward in Windows. > > Emacs + CVS (or CVSNT) should work just fine in Windows either. When I have to edit stuff on windows I use emacs. Cvs works fine on windows too. I haven't tried cvs in emacs on windows, but I suspect it will work fine as all emacs does is shell out to the cvs binaries. -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Efficient checksum calculating on lagre files
Christos TZOTZIOY Georgiou <[EMAIL PROTECTED]> wrote: > On 09 Feb 2005 10:31:22 GMT, rumours say that Nick Craig-Wood > <[EMAIL PROTECTED]> might have written: > > >But you won't be able to md5sum a file bigger than about 4 Gb if using > >a 32bit processor (like x86) will you? (I don't know how the kernel / > >user space VM split works on windows but on linux 3Gb is the maximum > >possible size you can mmap.) > > Indeed... but the context was calculating efficiently checksums for large > files > to be /served/ by a webserver. I deduce it's almost certain that the files > won't be larger than 3GiB, but ICBW :) You are certainly right ;-) However I did want to make the point that while mmap is extremely attractive for certain things, it does limit you to files < 4 Gb which is something that people don't always realise. -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: goto, cls, wait commands
Grant Edwards <[EMAIL PROTECTED]> wrote:
> I forgot to mention try/except. When I do use goto in C
> programming it's almost always to impliment what would have
> been a try/except block in Python.

Yes I'd agree with that. No more 'goto out'. There is this also

    for (i = 0; ...) {
        if (something)
            goto found;
    }
    /* do stuff when not found */
    found:;

(yes I hate setting flags in loops ;-)

This translates exactly to the for: else: construct

    for i in xrange(...):
        if something:
            break
    else:
        # do stuff when not found

The last language I saw with this very useful feature was FORTH in about 1984!

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: ANN: pyMinGW support for Python 2.3.5 (final) is available
A.B., Khalid <[EMAIL PROTECTED]> wrote:
> This is to inform those interested in compiling Python in MinGW that
> an updated version of pyMinGW is now available.

Has anyone tried cross compiling python with mingw? At work we compile our software for lots of platforms (including windows) on a linux build host. The windows builds are done with a mingw cross compiler. It would be interesting if we could do this with python + extensions also.

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: ANN: pyMinGW support for Python 2.3.5 (final) is available
Simon John <[EMAIL PROTECTED]> wrote:
> [snip]
> > Has anyone tried cross compiling python with mingw? At work we compile
> > our software for lots of platforms (including windows) on a linux
> > build host. The windows builds are done with a mingw cross compiler.
> > It would be interesting if we could do this with python + extensions
> > also.
>
> Yes, I was thinking of setting up a cross-compiling system, but why
> would you use mingw instead of just gcc on Linux?

...because we cross-compile for Windows under linux. The linux builds are done with plain gcc of course.

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: For American numbers
Peter Hansen <[EMAIL PROTECTED]> wrote: > Only for hard drive manufacturers, perhaps. > > For the rest of the computer world, unless I've missed > a changing of the guard or something, "kilo" is 1024 > and "mega" is 1024*1024 and so forth... Yes. Unless you work in the telcoms industry, where, for example if you order a 2 Mbit/s line you'll get 2 * 1024 * 1000 bits / s ;-) -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: [PATCH] allow partial replace in string.Template
Nick Coghlan <[EMAIL PROTECTED]> wrote: > > a) Patches are more likely to be looked at if placed on the SF patch tracker. > > b) I don't quite see the point, given how easy these are to spell using the > basic safe_substitute. You're replacing one liners with one-liners. c) add a documentation patch d) add a test suite patch -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: low-end persistence strategies?
Paul Rubin wrote:
> The issue with using an rdbms is not with the small amount of code
> needed to connect to it and query it, but in the overhead of
> installing the huge piece of software (the rdbms) itself, and keeping
> the rdbms server running all the time so the infrequently used app can
> connect to it.

I've found SQLObject to be a really good way of poking objects in an SQL database with zero hassle. It can also use SQLite (which I haven't tried), which gets rid of your large rdbms process but also gives you a migration path should the problem expand.

> ZODB is also a big piece of software to install. Is it at least
> 100% Python with no C modules required? Does it need a separate
> server process? If it needs either C modules or a separate server,
> it really can't be called a low-end strategy.

ZODB looks fun. I just wish (being lazy) that there was a separate debian package for just it and not the whole of Zope.

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
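Another genuinely low-end option is the shelve module from the standard library - no server process and no C modules beyond the dbm support python already ships with. A minimal sketch (the file name is made up):

    import shelve

    db = shelve.open("/tmp/appstate.db")
    db["hits"] = db.get("hits", 0) + 1
    print "hit number", db["hits"]
    db.close()

It does nothing about locking though - concurrent writers still need the flock() style locks discussed elsewhere in this thread.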
Re: Why doesn't join() call str() on its arguments?
Nick Vargish <[EMAIL PROTECTED]> wrote:
> I feel pretty much the opposite... If a non-string-type has managed to
> get into my list-of-strings, then something has gone wrong and I would
> like to know about this potential problem.

This is a good argument. Why not have another method to do this? I propose joinany, which will join any type of object together, not just strings. That way it becomes less of a poke in the eye to backwards compatibility too.

> Nick "Explicit is better than Implicit"

Aye!

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
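joinany is a one-liner in the meantime - a sketch (the name is just my proposal above, not an existing method):

    def joinany(separator, items):
        """Like str.join() but calls str() on each item first."""
        return separator.join([str(item) for item in items])

    print joinany(", ", [1, "two", 3.0])    # prints: 1, two, 3.0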
Re: low-end persistence strategies?
Michele Simionato <[EMAIL PROTECTED]> wrote:
> Ok, I have yet another question: what is the difference
> between fcntl.lockf and fcntl.flock? The man page of
> my Linux system says that flock is implemented independently
> of fcntl, however it does not say if I should use it in preference
> over fcntl or not.

flock() and lockf() are two different library calls. With lockf() you can lock parts of a file. I've always used flock().

From man lockf: "On Linux, this call [lockf] is just an interface for fcntl(2). (In general, the relation between lockf and fcntl is unspecified.)"

see man lockf and man flock

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
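For the record, the flock() idiom looks something like this - an untested sketch, lock file name made up:

    import fcntl

    f = open("/tmp/myapp.lock", "w")
    fcntl.flock(f.fileno(), fcntl.LOCK_EX)      # blocks until the lock is free
    try:
        pass    # ... critical section ...
    finally:
        fcntl.flock(f.fileno(), fcntl.LOCK_UN)
        f.close()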
Re: Comm. between Python and PHP
Peter Hansen <[EMAIL PROTECTED]> wrote: > Nils Emil P. Larsen wrote: > > What is the easiest way to make my threaded Python daemon communicate > > with a PHP-script running from Apache2 on localhost? > > "Easiest" of course depends on lots of things, and mostly on > specifics that you haven't told us, and your personal preference. > > Two obvious possibilities that come to mind are "write a file > somewhere from PHP and poll for updates from Python", and > "use sockets (TCP)" or "use UDP". If the data you want to pass is structured then you might consider XML-RPC which is a cross platform way of passing structured data around. XML-RPC is part of the python standard library (SimpleXMLRPCServer and xmlrpclib) and there seem to be several implementations for PHP http://www.google.co.uk/search?q=xml+rpc+php That might be overkill for what you want though! -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
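To give a flavour of how little code the python end of the XML-RPC route needs - a sketch (the function and port number are made up for the example):

    import SimpleXMLRPCServer

    def status():
        """Called by the PHP script over XML-RPC."""
        return {"queue_length": 3, "busy": False}

    server = SimpleXMLRPCServer.SimpleXMLRPCServer(("localhost", 8000))
    server.register_function(status)
    server.serve_forever()

The PHP side then just posts XML-RPC requests to http://localhost:8000/.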
Re: accessor/mutator functions
Michael Spencer <[EMAIL PROTECTED]> wrote: > [EMAIL PROTECTED] wrote: > > When I look at how classes are set up in other languages (e.g. C++), I > > often observe the following patterns: > > 1) for each data member, the class will have an accessor member > > function (a Get function) > > 2) for each data member, the class will have a mutator member function > > (a Set function) > > 3) data members are never referenced directly; they are always > > referenced with the accessor and mutator functions > > > > My questions are: > > a) Are the three things above considered pythonic? > > No > > > b) What are the tradeoffs of using getattr() and setattr() rather than > > creating accessor and mutator functions for each data member? > > Use property descriptors instead: > http://www.python.org/2.2.1/descrintro.html#property > http://users.rcn.com/python/download/Descriptor.htm#properties Actually I would say just access the attribute directly for both get and set, until it needs to do something special in which case use property(). The reason why people fill their code up with boiler plate get/set methods is to give them the flexibility to change the implementation without having to change any of the users. In python you just swap from direct attribute access to using property(). Also note that with property() you can make an attribute read only (by defining only the get method) which is often the encapsulation you really want - and that is something you can't do in C++. -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
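For example, a plain attribute can grow into a read-only computed attribute later without the callers noticing - a sketch:

    class Circle(object):
        def __init__(self, radius):
            self.radius = radius        # plain attribute, no getter needed

        def _get_area(self):
            return 3.14159265358979 * self.radius ** 2
        area = property(_get_area)      # read-only: no setter supplied

    c = Circle(2.0)
    print c.area        # looks exactly like attribute access to the caller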
Re: Flushing print()
Cameron Laird <[EMAIL PROTECTED]> wrote:
> gf, remember to write
>
>     sys.stdout.flush()
>
> rather than
>
>     sys.stdout.flush
>
> That's a mistake that catches many.

Many old perl programmers anyway (me included)! It's also a mistake pychecker catches.

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Explicit or general importing of namespaces?
Paul Rubin wrote: > Peter Hansen <[EMAIL PROTECTED]> writes: > > Ultimately more important than mere "pollution" are the > > latent problems this can cause if any of the names in > > the original module can ever be re-bound. > > You know, this is another reason the compiler really ought to (at > least optionally) check for such shadowing and have a setting to > enforce user declarations, like perl's "use strict". As a (mostly) ex-perl user I wouldn't like to see python go there - it seems like a rather slippery slope (like the warnings subsystem). This is surely a job for pychecker / pylint? -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Decimal, __radd__, and custom numeric types...
Blake T. Garretson <[EMAIL PROTECTED]> wrote:
> I'm having some issues with decimal.Decimal objects playing nice with
> custom data types. I have my own matrix and rational classes which
> implement __add__ and __radd__. They know what to do with Decimal
> objects and react appropriately.
>
> The problem is that they only work with Decimals if the custom type is
> on the left (and therefore __add__ gets called), but NOT if the Decimal
> is on the left. The Decimal immediately throws the usual "TypeError:
> You can interact Decimal only with int, long or Decimal data types."
> without even trying my __radd__ method to see if my custom type can
> handle Decimals.
>
> From the Python docs (specifically sections 3.3.7 and 3.3.8), I thought
> that the left object should try its own __add__, and if it doesn't know
> what to do, THEN try the right object's __radd__ method. I guess
> Decimal objects don't do this? Is there a way to change this behavior?
> If Decimal objects prematurely throw a TypeError before trying the
> __rop__, is Decimal broken, or was it designed this way? I think I'm
> missing something...

It looks like, from reading 3.3.8, that if Decimal returned NotImplemented instead of raising a TypeError then it would work.

    For objects x and y, first x.__op__(y) is tried. If this is not
    implemented or returns NotImplemented, y.__rop__(x) is tried. If
    this is also not implemented or returns NotImplemented, a TypeError
    exception is raised. But see the following exception:

    Exception to the previous item: if the left operand is an instance
    of a built-in type or a new-style class, and the right operand is an
    instance of a proper subclass of that type or class, the right
    operand's __rop__() method is tried before the left operand's
    __op__() method. This is done so that a subclass can completely
    override binary operators. Otherwise, the left operand's __op__
    method would always accept the right operand: when an instance of a
    given class is expected, an instance of a subclass of that class is
    always acceptable.

You could try this with a local copy of decimal.py since it is written in Python.

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
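For anyone implementing this protocol in their own classes, the polite version returns (not raises) NotImplemented for operands it doesn't understand - a minimal sketch:

    class Rational(object):
        def __init__(self, num, den=1):
            self.num, self.den = num, den

        def __add__(self, other):
            if isinstance(other, (int, long)):
                return Rational(self.num + other * self.den, self.den)
            return NotImplemented   # lets python try other.__radd__(self)

        __radd__ = __add__          # addition is commutative here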
Re: accessor/mutator functions
Dan Sommers <[EMAIL PROTECTED]> wrote:
> On 28 Feb 2005 10:30:03 GMT,
> Nick Craig-Wood <[EMAIL PROTECTED]> wrote:
>
> > Actually I would say just access the attribute directly for both get
> > and set, until it needs to do something special in which case use
> > property().
>
> > The reason why people fill their code up with boiler plate get/set
> > methods is to give them the flexibility to change the implementation
> > without having to change any of the users. In python you just swap
> > from direct attribute access to using property().
>
> The reason their code is so inflexible is that they've filled their
> classes with boiler plate get/set methods.

Amen to that! As programmers we abhor code duplication, and boiler plate is just code duplication. Even if your fancy editor adds it for you ;-)

> Why do users of classes need such access anyway? If my class performs
> useful functions and returns useful results, no user of my class should
> care about its attributes. If I "have to" allow access to my attributes
> in order that my users be happy, then I did something else wrong when I
> designed the class and its public interface in the first place.

I would say this is an excellent philosophy for C++ or Java. When I'm writing C++ I try to keep the attributes private. I try not to make accessor methods at all until absolutely necessary. I always think I've failed if I end up writing getBlah and setBlah methods. In C++ it's always in the back of my mind that an inline accessor method will get optimised into exactly the same code as accessing the attribute directly anyway.

> I usually aim for this: if users of the public interface of my class
> can figure out that I changed the implementation, then I've exposed too
> much. Sure there are exceptions, but that's my basic thought process.

However in python, there is no harm in accessing the attributes directly. You can change the implementation whenever you like, and change the attributes into property()s, and the users will never know. And, as another poster pointed out - you are already accessing the instance variables just by calling its methods, so you shouldn't feel too squeamish!

Read-only attributes are easy to understand, unlikely to go wrong and faster than getBlah() accessor methods. Writable attributes I think are good candidates for methods though. Looking inside an object is one thing but changing its internal state is another and should probably be done through a defined interface.

> Sorry about the rant.

I wouldn't call that a rant, it was quite polite compared to some of the threads on c.l.p recently ;-)

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: accessor/mutator functions
Dan Sommers <[EMAIL PROTECTED]> wrote:
> We used to have holy wars over the appropriate level of comments in
> source code.

Well, according to the refactoring book I just read (by Martin Fowler), the appropriate level of comments is None. If you see a comment you should extract the complicated code into a method with a useful name, or add well named intermediate variables, or add an assertion.

It's a point of view... Not 100% sure I agree with it but I see where he is coming from. I like a doc-string per public method so pydoc looks nice myself...

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Regular Expressions: large amount of or's
André Søreng <[EMAIL PROTECTED]> wrote:
> Given a string, I want to find all ocurrences of
> certain predefined words in that string. Problem is, the list of
> words that should be detected can be in the order of thousands.
>
> With the re module, this can be solved something like this:
>
> import re
>
> r = re.compile("word1|word2|word3|...|wordN")
> r.findall(some_string)
>
> Unfortunately, when having more than about 10 000 words in
> the regexp, I get a regular expression runtime error when
> trying to execute the findall function (compile works fine, but
> slow).

I wrote a regexp optimiser for exactly this case. Eg a regexp for all 5 letter words starting with re

    $ grep -c '^re' /usr/share/dict/words
    2727

    $ grep '^re' /usr/share/dict/words | ./words-to-regexp.pl 5
    re|re's|reac[ht]|rea(?:d|d[sy]|l|lm|m|ms|p|ps|r|r[ms])|reb(?:el|u[st])|rec(?:ap|ta|ur)|red|red's|red(?:id|o|s)|ree(?:d|ds|dy|f|fs|k|ks|l|ls|ve)|ref|ref's|refe[dr]|ref(?:it|s)|re(?:gal|hab|(?:ig|i)n|ins|lax|lay|lic|ly|mit|nal|nd|nds|new|nt|nts|p)|rep's|rep(?:ay|el|ly|s)|rer(?:an|un)|res(?:et|in|t|ts)|ret(?:ch|ry)|re(?:use|v)|rev's|rev(?:el|s|ue)

As you can see it's not perfect. Find it in

    http://www.craig-wood.com/nick/pub/words-to-regexp.pl

Yes it's perl and rather kludgy but may give you ideas!

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
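If the patterns really are plain literal words, another approach that sidesteps the regexp engine's limits completely is to pull the tokens out of the string and look each one up in a dict - a sketch (the word list is made up):

    import re

    words = {}
    for w in ["word1", "word2", "word3"]:   # thousands of entries is fine
        words[w] = True

    def findwords(text):
        return [tok for tok in re.findall(r"\w+", text) if tok in words]

    print findwords("word2 and word3 but not word4")

This is O(number of tokens) however many words you are looking for, though unlike a regexp it only matches on \w+ boundaries.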
Re: os.system()
Joerg Schuster <[EMAIL PROTECTED]> wrote:
> os.system(command)
>
> only works for some values of 'command' on my system (Linux). A certain
> shell command (that *does* run on the command line) does not work when
> called with os.system(). Does anyone know a simple and stable way to
> have *any* string executed by the shell?

The command is executed through the shell, eg

>>> os.system("sleep 60 > z")

    $ ps axf
     5121 ?        S      0:00 rxvt
     5123 pts/77   Ss     0:00 \_ bash
     5126 pts/77   S+     0:00     \_ python
     5149 pts/77   S+     0:00         \_ sh -c sleep 60 > z
     5150 pts/77   S+     0:00             \_ sleep 60

Things to check

1) quoting, python vs shell
2) PATH - check PATH is set the same in shell / python
3) check the whole of the environment

Also if you are using 2.4 check the subprocess module

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: shuffle the lines of a large file
Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> >>> from random import random
> >>> out = open('corpus.decorated', 'w')
> >>> for line in open('corpus.uniq'):
>         print >> out, '%.14f %s' % (random(), line),
>
> >>> out.close()
>
> sort corpus.decorated | cut -c 18- > corpus.randomized

Very good solution!

Sort is truly excellent at very large datasets. If you give it a file bigger than memory then it divides it up into temporary files of memory size, sorts each one, then merges all the temporary files back together.

You tune the memory sort uses for in memory sorts with --buffer-size. It's pretty good at auto tuning though. You may want to set --temporary-directory also to save filling up your /tmp.

In a previous job I did a lot of stuff with usenet news and was forever blowing up the server with scripts which used too much memory. sort was always the solution!

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
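If the file does fit in memory you can of course skip sort entirely - a minimal sketch using random.shuffle (file names from the example above):

    import random

    lines = open("corpus.uniq").readlines()
    random.shuffle(lines)
    out = open("corpus.randomized", "w")
    out.writelines(lines)
    out.close()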
Re: An Odd Little Script
Terry Hancock <[EMAIL PROTECTED]> wrote:
> The only problem I see is the "in place" requirement, which seems silly
> unless by "quite large" you mean multiple gigabytes. Surely Perl
> actually makes a copy in the process even though you never see
> it?

If using "perl -i" then it does make a copy

    -i[extension] specifies that files processed by the "<>" construct
    are to be edited in-place. It does this by renaming the input file,
    opening the output file by the original name, and selecting that
    output file as the default for print() statements.

The solution posted previously using mmap actually does it in-place though, which will work very well for files < 4GB. (And not at all for files > 4GB unless you are on a 64 bit machine).

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Python becoming less Lisp-like
Torsten Bronger <[EMAIL PROTECTED]> wrote: > The current snapshot is a transitional Python and thus > with some double features. The numerical types and two kinds of > classes are examples. I'm very surprised about this, because Python > is a production language, but I'm happy, too. As long as python 2.x -> 3.x/3000 isn't like perl 5.x -> perl 6.x I'll be perfectly happy too. "Less is more" is a much better philosophy for a language and having the courage to take things out differentiates python from the crowd. Of course we users will complain about removals, but we'll knuckle down and take our medicine eventually ;-) -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: pre-PEP generic objects
Steven Bethard <[EMAIL PROTECTED]> wrote:
> I promised I'd put together a PEP for a 'generic object' data type for
> Python 2.5 that allows one to replace __getitem__ style access with
> dotted-attribute style access (without declaring another class). Any
> comments would be appreciated!

This sounds very much like this class which I've used to convert perl programs to python

    class Hash:
        def __init__(self, **kwargs):
            for key,value in kwargs.items():
                setattr(self, key, value)
        def __getitem__(self, x):
            return getattr(self, x)
        def __setitem__(self, x, y):
            setattr(self, x, y)

My experience from using this is that whenever I used Hash(), I found that later on in the refinement of the conversion it became its own class.

So my take on the matter is that this encourages perl style programming (just ram it in a hash, and write lots of functions acting on it) rather than creating a specific class for the job which is dead easy in python anyway and to which you can attach methods etc.

YMMV ;-)

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: pre-PEP generic objects
Steven Bethard <[EMAIL PROTECTED]> wrote: > Nick Craig-Wood wrote: > > Steven Bethard <[EMAIL PROTECTED]> wrote: > > > >> I promised I'd put together a PEP for a 'generic object' data type for > >> Python 2.5 that allows one to replace __getitem__ style access with > >> dotted-attribute style access (without declaring another class). Any > >> comments would be appreciated! > > > > My experience from using this is that whenever I used Hash(), I found > > that later on in the refinement of the conversion it became its own > > class. > > This has also generally been my experience, though I'm not sure it's as > true for the XML DOM to Bunch translation. Did you use Hash() in the > same way for hierarchical data? Hash() got nested yes, but not in a general purpose structure like your XML example. > > > So my take on the matter is that this encourages perl style > > programming (just ram it in a hash, and write lots of functions acting > > on it) rather than creating a specific class for the job which is dead > > easy in python anyway and to which you can attach methods etc. > > You'll note that the (pre-)PEP explicitly notes that this object is > intended only for use when no methods are associated with the attributes: > > "When no methods are to be associated with the attribute-value mappings, > declaring a new class can be overkill." > > I do understand your point though -- people might not use Bunch in the > way it's intended. Of course, those same people can already do the same > thing with a dict instead (e.g. write a bunch of functions to handle a > certain type of dict). If someone wants to write Perl in Python, > there's not much we can really do to stop them... No there isn't ;-) The above does make it a lot more convenient though blob['foo'] is rather difficult to type compared to blob.foo! -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: pre-PEP generic objects
Scott David Daniels <[EMAIL PROTECTED]> wrote:
> Nick Craig-Wood wrote:
> > class Hash:
> >     def __init__(self, **kwargs):
> >         for key,value in kwargs.items():
> >             setattr(self, key, value)
> >     def __getitem__(self, x):
> >         return getattr(self, x)
> >     def __setitem__(self, x, y):
> >         setattr(self, x, y)
>
> You can simplify this:
>     class Hash(object):
>         def __init__(self, **kwargs):
>             for key,value in kwargs.items():
>                 setattr(self, key, value)
>         __getitem__ = getattr
>         __setitem__ = setattr

That doesn't work unfortunately...

>>> class Hash(object):
...     def __init__(self, **kwargs):
...         for key,value in kwargs.items():
...             setattr(self, key, value)
...     __getitem__ = getattr
...     __setitem__ = setattr
...
>>> h=Hash(a=1,b=2)
>>> h.a
1
>>> h['a']
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: getattr expected at least 2 arguments, got 1
>>>

I'm not exactly sure why though!

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: pre-PEP generic objects
Peter Otten wrote:
> Functions written in Python have a __get__ attribute while builtin
> functions (implemented in C) don't. Python-coded functions therefore
> automatically act as descriptors while builtins are just another
> attribute.

Jp Calderone <[EMAIL PROTECTED]> wrote:
> When the class object is created, the namespace is scanned for
> instances of <type 'function'>. For those and only those, a
> descriptor is created which will produce bound and unbound methods.
> Instances of other types, such as <type 'builtin_function_or_method'>,
> are ignored, leading to the critical difference in this case:

I think I finally understand now - thank you to you both!

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Galois field
Mikael Olofsson <[EMAIL PROTECTED]> wrote:
> At our department we use Magma (http://magma.maths.usyd.edu.au/)
> for finite field arithmetic and error control codes. Magma has
> nothing to do with Python, instead it is a very mature tool of its
> own, mainly for discrete math. It knows what a permutation group
> is, it knows what GF(81) is, and much more.

I think pari/gp can do arithmetic over general Galois fields. I certainly have a recollection of doing that with it in the past. It's free (GPL) too, and has a library that could be wrapped with SWIG/etc to make it python friendly.

    http://pari.math.u-bordeaux.fr/

* PARI is a C library, allowing fast computations.
* gp is an interactive shell giving access to PARI functions, much easier to use.
* GP is the name of gp's scripting language
* gp2c the GP-to-C compiler makes the best of both worlds

If someone did wrap PARI in python it would certainly be easier to use than GP, which I found very unintuitive!

In fact I just found this which looks like just the job!

    http://www.fermigier.com/fermigier/PariPython/readme.html

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: long number multiplication
I.V. Aprameya Rao <[EMAIL PROTECTED]> wrote:
> i have been wondering, how does python store its very long integers and
> perform aritmetic on it.
>
> i needed to implement this myself and was thinking of storing the digits
> of an integer in a list.
>
> however this would be very slow for operations like division etc.
>
> so if anyone can point me to some links or some method on how to do this i
> would appreciate it

Anyone interested in this subject should read

    SemiNumerical Algorithms, D. E. Knuth, Addison-Wesley

It's the bible for this area of computer science! It's also rather big and expensive so you'll probably want to look at it in the library...

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Recursive list comprehension
Adam DePrince <[EMAIL PROTECTED]> wrote:
> def flatten( i ):
>     try:
>         i = i.__iter__()
>         while 1:
>             j = flatten( i.next() )
>             try:
>                 while 1:
>                     yield j.next()
>             except StopIteration:
>                 pass
>     except AttributeError:
>         yield i

Hmm, there is more to that than meets the eye! I was expecting

    print list(flatten("hello"))

to print

    ['h', 'e', 'l', 'l', 'o']

But it didn't, it printed

    ['hello']

With a little more investigation I see that str has no __iter__ method. However you can call iter() on a str

>>> for c in iter("hello"): print c
...
h
e
l
l
o

Or even

>>> for c in "hello": print c
...
h
e
l
l
o

...and this works because str supports __getitem__ according to the docs. So there is some magic going on here! Is str defined to never have an __iter__ method? I see no reason why it couldn't one day have an __iter__ method though.

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Recursive list comprehension
Peter Otten <[EMAIL PROTECTED]> wrote:
> Adam DePrince wrote:
>
> > def flatten( i ):
> >     try:
> >         i = i.__iter__()
> >         while 1:
> >             j = flatten( i.next() )
> >             try:
> >                 while 1:
> >                     yield j.next()
> >             except StopIteration:
> >                 pass
> >     except AttributeError:
> >         yield i
>
> While trying to break your code with a len() > 1 string I noted that strings
> don't feature an __iter__ attribute. Therefore obj.__iter__() is not
> equivalent to iter(obj) for strings. Do you (plural) know whether this is a
> CPython implementation accident or can be relied upon?

I'd like to know this too!

You can write the above as the shorter (and safer IMHO - it doesn't catch any exceptions it shouldn't)

    def flatten( i ):
        if hasattr(i, "__iter__"):
            for j in i:
                for k in flatten(j):
                    yield k
        else:
            yield i

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
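For what it's worth the hasattr version behaves like this - strings come out whole precisely because str has no __iter__ at the moment:

>>> list(flatten([1, [2, [3, 4]], "hello"]))
[1, 2, 3, 4, 'hello']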
Re: collaborative editing
Robert Kern <[EMAIL PROTECTED]> wrote: > Personally, I loathe writing at any length inside a Web browser and > prefer to use a real editor at all times. Me too! You need mozex... http://mozex.mozdev.org/ Not sure about Mac support though /OT -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: collaborative editing
Nick Craig-Wood <[EMAIL PROTECTED]> wrote:
> Robert Kern <[EMAIL PROTECTED]> wrote:
> > Personally, I loathe writing at any length inside a Web browser and
> > prefer to use a real editor at all times.
>
> Me too! You need mozex...
>
> http://mozex.mozdev.org/

Here is a good page about Wikis (from the creators of spamassassin) including stuff about Moin Moin (which is in Python) - lots of good advice for anyone thinking about a Wiki.

    http://taint.org/2004/09/28/191712a.html

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Python vs. Perl
Keith Dart <[EMAIL PROTECTED]> wrote:
> Oh, I forgot to mention that it also has a more user- and
> programmer-friendly ExitStatus object that processess can return. This
> is directly testable in Python:
>
> proc = proctools.spawn("somecommand")
> exitstatus = proc.wait()
>
> if exitstatus:
>     print "good result (errorlevel of zero)"
> else:
>     print exitstatus # prints message with exit value

This sounds rather like the new subprocess module...

>>> import subprocess
>>> rc = subprocess.call(["ls", "-l"])
total 381896
-rw-r--r--    1 ncw ncw      1542 Oct 12 17:55 1
[snip]
-rw-r--r--    1 ncw ncw       713 Nov 16 08:18 z~
>>> print rc
0

IMHO the new subprocess module is a very well thought out interface...

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: subprocess vs. proctools
Keith Dart <[EMAIL PROTECTED]> wrote:
> Nick Craig-Wood wrote:
> > This sounds rather like the new subprocess module...
> >
> >>>> import subprocess
> >>>> rc = subprocess.call(["ls", "-l"])
> >
> > total 381896
> > -rw-r--r--    1 ncw ncw      1542 Oct 12 17:55 1
> > [snip]
> > -rw-r--r--    1 ncw ncw       713 Nov 16 08:18 z~
> >
> >>>> print rc
> >
> > 0
>
> But this evaluates to False in Python, but True in a shell.

There are many ways for a program to fail (non-zero exit codes) but only one way for it to succeed (zero exit code). Therefore rc should be 0 for success. IMHO Shell semantics are nuts (0 is True - yeah!) - they hurt my head every time I have to use them ;-)

> It also requires an extra check for normal exit, or exit by a
> signal.

>>> import subprocess
>>> subprocess.call(["sleep", "60"])
-11
>>> # I killed the sleep process with a SEGV here from another xterm
>>> subprocess.call(["sleep", "asdfasdf"])
sleep: invalid time interval `asdfasdf'
Try `sleep --help' for more information.
1
>>>

Signals are -ve, exit codes are +ve which seems perfect. Exit codes can only be from 0..255 under linux. Signals go from -1 to -64.

> The proctools ExitStatus object avaluates to True only on a normal
> exit, period. Thus it follows a shell semantics for clarity. You
> cannot do this with the subprocess module:
>
> if rc:
>     print "exited normally"

Actually I think

    if rc == 0:
        print "exited normally"

is exactly equivalent!

[snip]

> It does not work with MS Windows

I like python because I can write stuff on linux and it works on windows without too much effort, and in general I try not to use modules which don't work on both platforms.

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
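So fully decoding a subprocess return code looks like this sketch:

    import subprocess

    rc = subprocess.call(["sleep", "1"])
    if rc == 0:
        print "exited normally"
    elif rc > 0:
        print "exited with error code", rc
    else:
        print "killed by signal", -rc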
Re: Performance (pystone) of python 2.4 lower then python 2.3 ???
Peter Hansen <[EMAIL PROTECTED]> wrote:
> For comparison, I do get a decent speedup. Machine is an
> AMD Athlon XP 2500+ (1.82GHz) running Win XP Pro SP2.
>
> Python 2.3.4: 36393 pystones.
> Python 2.4: 39400 pystones.
>
> ...about an 8% speedup.

On my 2.6 GHz P4 running debian testing I got the following results :-

    $ for p in 2.1 2.2 2.3 2.4; do echo $p; python$p pystone.py 1000000 ; done
    2.1
    Pystone(1.1) time for 1000000 passes = 40.67
    This machine benchmarks at 24588.1 pystones/second
    2.2
    Pystone(1.1) time for 1000000 passes = 39.64
    This machine benchmarks at 25227 pystones/second
    2.3
    Pystone(1.1) time for 1000000 passes = 32.49
    This machine benchmarks at 30778.7 pystones/second
    2.4
    Pystone(1.1) time for 1000000 passes = 29.88
    This machine benchmarks at 33467.2 pystones/second

Showing that 2.4 is the fastest so far! (And is also a good advert for AMD ;-)

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Suggestion for "syntax error": ++i, --i
Richard Brodie <[EMAIL PROTECTED]> wrote:
> "Terry Reedy" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
>
> > You could propose to the author of Pychecker that he include, if possible,
> > an option to check for and warn about '++', '--'.
>
> It does already.

    $ cat plusplus.py
    def main():
        i = 1
        return ++i

    $ pychecker plusplus
    Processing plusplus...

    Warnings...

    plusplus.py:4: Operator (++) doesn't exist, statement has no effect

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: expression form of one-to-many dict?
Steven Bethard <[EMAIL PROTECTED]> wrote:
> map = {}
> for key, value in sequence:
>     map.setdefault(key, []).append(value)

I was thinking about exactly that the other day, when converting some perl to python.

[ digression: In perl, you do

    push @{$map->{$key}}, $value

If $map->{$key} doesn't exist it is autovivified into an array (since it's in array context). Now that's exactly the sort of magic that gives perl a bad name ;-) ]

However, one thing I noticed is that a list is created and destroyed as the second parameter to setdefault whether or not it is used in map, which strikes me as a bit wasteful. You obviously can't use the same list there either. If it was an object with a more expensive constructor then you'd notice it more.

It would be nice if setdefault didn't evaluate the second argument unless it needed it. However I guess it would have to be a language feature to do that.

Here are some timings

    $ /usr/lib/python2.3/timeit.py -s 'sequence=zip(range(1000),range(1000))' 'map = {}
    for key, value in sequence:
        map.setdefault(key, []).append(value)'
    1000 loops, best of 3: 1.42e+03 usec per loop

    $ /usr/lib/python2.3/timeit.py -s 'sequence=zip(range(1000),[ i%11 for i in range(1000)])' 'map = {}
    for key, value in sequence:
        map.setdefault(key, []).append(value)'
    1000 loops, best of 3: 1.57e+03 usec per loop

    $ /usr/lib/python2.3/timeit.py -s 'sequence=zip(range(1000),range(1000))' 'map = {}
    for key, value in sequence:
        if map.has_key(key):
            map[key].append(value)
        else:
            map[key] = [ value ]'
    1000 loops, best of 3: 1.1e+03 usec per loop

    $ /usr/lib/python2.3/timeit.py -s 'sequence=zip(range(1000),[ i%11 for i in range(1000)])' 'map = {}
    for key, value in sequence:
        if map.has_key(key):
            map[key].append(value)
        else:
            map[key] = [ value ]'
    1000 loops, best of 3: 1.11e+03 usec per loop

Not that timing is everything of course ;-)

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
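One way of avoiding building the throwaway list on every iteration is the try/except idiom - same behaviour as the setdefault version (assuming sequence as above), but the empty list is only constructed when the key is actually missing:

    map = {}
    for key, value in sequence:
        try:
            map[key].append(value)
        except KeyError:
            map[key] = [value]

Whether it is faster depends on how often the except branch is taken - exceptions are cheap to set up but expensive to catch.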
Re: Easy "here documents" ??
Jim Hill <[EMAIL PROTECTED]> wrote:
> I've done some Googling around on this and it seems like creating a here
> document is a bit tricky with Python. Trivial via triple-quoted strings
> if there's no need for variable interpolation but requiring a long, long
> formatted arglist via (%s,%s,%s,ad infinitum) if there is. So my
> question is:
>
> Is there a way to produce a very long multiline string of output with
> variables' values inserted without having to resort to this wacky
>
> """v = %s"""%(variable)

I prefer this

>>> amount = 1
>>> cost = 2.0
>>> what = 'potato'
>>> print """\
... I'll have %(amount)s %(what)s
... for $%(cost)s please""" % locals()
I'll have 1 potato
for $2.0 please
>>>

It's almost as neat as perl / shell here documents and emacs parses """ strings properly too ;-)

Note the \ after the triple quote so the first line is flush on the left, and the locals() statement. You can use globals() or a dictionary you might have lying around instead, which is much more flexible than perl. You can even pass self.__dict__ if you are in a class method so you can access attributes.

>>> class A: pass
...
>>> a=A()
>>> a.amount=10
>>> a.what="rutabaga"
>>> a.cost=17.5
>>> print """\
... I'll have %(amount)s %(what)s
... for $%(cost)s please""" % a.__dict__
I'll have 10 rutabaga
for $17.5 please
>>>

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Easy "here documents" ??
Jim Hill <[EMAIL PROTECTED]> wrote: > Nick Craig-Wood wrote: > >I prefer this > > > > ... I'll have %(amount)s %(what)s > > ... for $%(cost)s please""" % locals() > > Looks pretty slick. This might just be what I need. > > >Its almost as neat as perl / shell here documents and emacs parses """ > >strings properly too ;-) > > Mmm...emacs... > > Thanks for the tip. You are welcome! I came to python after many years of perl too... -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: regular expression: perl ==> python
> 1) In perl:
> $line = "The food is under the bar in the barn.";
> if ( $line =~ /foo(.*)bar/ ) { print "got <$1>\n"; }
>
> in python, I don't know how I can do this?
> How does one capture the $1? (I know it is \1 but it is still not clear
> how I can simply print it.)
> thanks

Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> "JZ" <[EMAIL PROTECTED]> wrote:
>
> > import re
> > line = "The food is under the bar in the barn."
> > if re.search(r'foo(.*)bar',line):
> >     print 'got %s\n' % _.group(1)
>
> Traceback (most recent call last):
>   File "jz.py", line 4, in ?
>     print 'got %s\n' % _.group(1)
> NameError: name '_' is not defined

I've found this a slight irritation in python compared to perl - the fact that you need to create a match object (rather than relying on the silver thread of $_ (etc) running through your program ;-)

    import re
    line = "The food is under the bar in the barn."
    m = re.search(r'foo(.*)bar',line)
    if m:
        print 'got %s\n' % m.group(1)

This becomes particularly irritating when using if, elif etc, to match a series of regexps, eg

    line = "123123"
    m = re.search(r'^(\d+)$', line)
    if m:
        print "int",int(m.group(1))
    else:
        m = re.search(r'^(\d*\.\d*)$', line)
        if m:
            print "float",float(m.group(1))
        else:
            print "unknown thing", line

The indentation keeps growing, which looks rather untidy compared to the perl

    $line = "123123";
    if ($line =~ /^(\d+)$/) {
        print "int $1\n";
    } elsif ($line =~ /^(\d*\.\d*)$/) {
        print "float $1\n";
    } else {
        print "unknown thing $line\n";
    }

Is there an easy way round this? AFAIK you can't assign a variable in a compound statement, so you can't use elif at all here, and hence the problem?

I suppose you could use a monstrosity like this, which relies on the fact that list.append() returns None...

    line = "123123"
    m = []
    if m.append(re.search(r'^(\d+)$', line)) or m[-1]:
        print "int",int(m[-1].group(1))
    elif m.append(re.search(r'^(\d*\.\d*)$', line)) or m[-1]:
        print "float",float(m[-1].group(1))
    else:
        print "unknown thing", line

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
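The least ugly workaround I know of is a little helper object which remembers the last match so it can be used in an if/elif chain - a sketch:

    import re

    class Matcher:
        """Remembers the last match object so re fits into if/elif chains."""
        def search(self, pattern, line):
            self.m = re.search(pattern, line)
            return self.m

    line = "123123"
    m = Matcher()
    if m.search(r'^(\d+)$', line):
        print "int", int(m.m.group(1))
    elif m.search(r'^(\d*\.\d*)$', line):
        print "float", float(m.m.group(1))
    else:
        print "unknown thing", line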
Re: regular expression: perl ==> python
Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> that's not a very efficient way to match multiple patterns, though. a
> much better way is to combine the patterns into a single one, and use
> the "lastindex" attribute to figure out which one that matched.

lastindex is useful, yes.

> see
>
> http://effbot.org/zone/xml-scanner.htm
>
> for more on this topic.

I take your point. However I don't find the below very readable - making 5 small regexps into 1 big one, plus a game of count-the-brackets, doesn't strike me as a huge win...

    xml = re.compile(r"""
        <([/?!]?\w+)      # 1. tags
        |&(\#?\w+);       # 2. entities
        |([^<>&'\"=\s]+)  # 3. text strings (no special characters)
        |(\s+)            # 4. whitespace
        |(.)              # 5. special characters
        """, re.VERBOSE)

It's probably faster though, so I give in gracelessly ;-)

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: regular expression: perl ==> python
Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> the undocumented sre.Scanner provides a ready-made mechanism for this
> kind of RE matching; see
>
> http://aspn.activestate.com/ASPN/Mail/Message/python-dev/1614344
>
> for some discussion.
>
> here's (a slight variation of) the code example they're talking about:
>
> def s_ident(scanner, token): return token
> def s_operator(scanner, token): return "op%s" % token
> def s_float(scanner, token): return float(token)
> def s_int(scanner, token): return int(token)
>
> scanner = sre.Scanner([
>     (r"[a-zA-Z_]\w*", s_ident),
>     (r"\d+\.\d*", s_float),
>     (r"\d+", s_int),
>     (r"=|\+|-|\*|/", s_operator),
>     (r"\s+", None),
>     ])
>
> >>> print scanner.scan("sum = 3*foo + 312.50 + bar")
> (['sum', 'op=', 3, 'op*', 'foo', 'op+', 312.5, 'op+', 'bar'], '')

That is very cool - exactly the kind of problem I come across quite often!

I've found the online documentation (using pydoc) for re / sre in general to be a bit lacking. For instance nowhere in

    pydoc sre

does it tell you what methods a match object has (or even what type it is). To find this out you have to look at the HTML documentation. This is probably what Windows people look at by default, but Unix hackers like me expect everything (or at least a hint) to be in the man/pydoc pages.

Just noticed in pydoc2.4 a new section

    MODULE DOCS
        http://www.python.org/doc/current/lib/module-sre.html

Which is at least a hint that you are looking in the wrong place! ...however that page doesn't exist ;-)

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Queue.Queue-like class without the busy-wait
Paul Rubin wrote:
> Antoon Pardon <[EMAIL PROTECTED]> writes:
> > I'm not sure that this would be an acceptable approach. I did the man
> > semop and it indicates this is part of system V IPC. This makes me
> > fear that semaphores will use file descriptors or other resources
> > that are only available in a limited amount. Not usefull if you are
> > talking about thousands of threads.
>
> That would be terrible, if semaphores are as heavy as file descriptors.
> I'd like to hope the OS's are better designed than that.

I believe futex is the thing you want for a modern linux. Not very portable though.

From futex(4):

    The Linux kernel provides futexes ('Fast Userspace muTexes') as a
    building block for fast userspace locking and semaphores. Futexes
    are very basic and lend themselves well for building higher level
    locking abstractions such as POSIX mutexes.

    This page does not set out to document all design decisions but
    restricts itself to issues relevant for application and library
    development. Most programmers will in fact not be using futexes
    directly but instead rely on system libraries built on them, such
    as the NPTL pthreads implementation.

    A futex is identified by a piece of memory which can be shared
    between different processes. In these different processes, it need
    not have identical addresses. In its bare form, a futex has
    semaphore semantics; it is a counter that can be incremented and
    decremented atomically; processes can wait for the value to become
    positive.

    Futex operation is entirely userspace for the non-contended case.
    The kernel is only involved to arbitrate the contended case. As any
    sane design will strive for non-contension, futexes are also
    optimised for this situation.

    In its bare form, a futex is an aligned integer which is only
    touched by atomic assembler instructions. Processes can share this
    integer over mmap, via shared segments or because they share memory
    space, in which case the application is commonly called
    multithreaded.

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Queue.Queue-like class without the busy-wait
Paul Rubin wrote: > Nick Craig-Wood <[EMAIL PROTECTED]> writes: > > I believe futex is the thing you want for a modern linux. Not > > very portable though. > > That's really cool, but I don't see how it can be a pure userspace > operation if the futex has a timeout. The kernel must need to keep > track of the timeouts. However, since futexes can be woken by any > thread, the whole thing can be done with just one futex. In fact the > doc mentions something about using a file descriptor to support > asynchronous wakeups, but it's confusing whether that applies here. No it isn't pure user space, only for the non-contended case which for most locks is the most frequent operation. Futex operation is entirely userspace for the non-contended case. The kernel is only involved to arbitrate the contended case. As any sane design will strive for non-contension, futexes are also optimised for this situation. -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Queue.Queue-like class without the busy-wait
[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > Thinking about cross-platform issues. I found this, from the venerable > Tim Peters to be enlightening for python's choice of design: > > "It's possible to build a better Queue implementation that runs only on > POSIX systems, or only on Windows systems, or only on one of a dozen > other less-popular target platforms. The current implementation works > fine on all of them, although is suboptimal compared to what could be > done in platform-specific Queue implementations. " > > Here is a link: > > http://groups-beta.google.com/group/comp.lang.python/messages/011f680b2dac320c,a03b161980b81d89,1162a30e96ae330a,0db1e52548493843,6b8d593c84ad4fd4,b6293a53f98252ce,82cddc89805b4b56,81c7289cc4cb4441,0906b24cc1534844,3ff6629391074ed4?thread_id=55b80d05e9d54705&mode=thread&noheader=1&q=queue+timeout+python#doc_011f680b2dac320c Interesting thread. How about leaving the current threading alone, but adding a pthreads module for those OSes which can use or emulate posix threads? Which is windows and most unixes? -- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: Compute pi to base 12 using Python?
[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> I'm using GMPY (see code). [snip]

If you are using gmpy you might as well do it like this. gmpy.pi() uses the Brent-Salamin Arithmetic-Geometric Mean formula for pi IIRC. This converges quadratically, and it will calculate you a million places without breaking a sweat.

>>> import gmpy
>>> from math import log
>>> bits = int(3003*log(12)/log(2))
>>> pi=gmpy.pi(bits+100)
>>> gmpy.fdigits(pi, 12, 3003)
'3.184809493b918664573a6211bb151551a05729290a7809a492742140a60a55256a0661a03753a3aa54805646880181a3683083272bbba0a370b12265529a828903b4b256b8403759a71626b8a54687621849b849a8225616b442796a31737b229b2391489853943b8763725616447236b027a421aa17a38b52a18a838b01514a51144a23315a3009a8906b61b8b48a62253a88a50a43ba0944572315933664476b3aabb77583975120683526b75b462060bb03b432551913772729a2147553531793848a0402b999b5058535374465a68806716644039539a8431935198527b9399b112990abb0383b107645424577a51601b3624a88b7a676a3992912121a213887b92873946a61332242217aa7354115357744939112602ba4b18a3269222b528487747839994ab223b65b8762695422822669ba00a586097842a51750362073b5a768363b21bb1a97a4a194447749399804922175a068a46739461990a2065bb0a30bbab7024a585b1a84428195489784a07a331a7b0a1574565b373b05b03a5a80a13ab87857734679985558a5373178a7b28271992a3894a5776085083b9b238b2220542462888641a2bab8b3083ab49659172a312b78518654494a068662586a181835a64440b2970a122813975898815367208905801032881449223841428763329617531239b9a657405584014534390b587625606bb80923795944b43757a431b039556282978a6a49590553490ba1844947175637a908247b50127722464441380a852b0847b5813019bb70a67663b426565434069884476132193344ba55a2128a03838974606b851b2979321a408067225a5aa4b3464a1a17473595333909ab9127079655b3164b68b9b28a9b818a220a025ab0934203995b7a62a7aa739355340539ba3182905b193905603a43b660b9426a92294697144a896a5b2339358bb2b7294bb89635b071a6351211360b820b1882ab8433b54757b87a373284b1ba182a10326476b369a4a6365b58b8018994bb152556765475a704bb94b6b2a39458971a8b90512786b5029404818644323552916170b3abb7363496427b088b68725a68570040617949289077b278069a09b559324b8a66828b40549b0296065b2300330592569a7b76b92ba1293585b6a9b604567a0901362856373b4b56897946256b4172b1b50474351364749a33996a81ba8847347a8411b850b79a03018291672aa0945656a159aa6aa0a845531a592005b8a34366b882257107b190969a846474836a9800750778920ba797297a2791101b0685a86bb704b9baa17b055293679843b35215b0a8b1182b611953b080aa5431b219907a8448a81b1a9493245676b88013b470335240859594158621014216619553246570601967448b470174b9244892444817453865a4003b5aa7176451aab906 [EMAIL PROTECTED]'

-- Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list