Re: Embedding multiple interpreters
Hi Chris (and Michael),

On 06/12/13 15:51, Chris Angelico wrote:
> On Fri, Dec 6, 2013 at 4:16 PM, Michael Torrie wrote:
>> On 12/05/2013 07:34 PM, Garthy wrote:
>>> - My fallback if I can't do this is to implement each instance in a
>>> dedicated *process* rather than per-thread. However, there is a
>>> significant cost to doing this that I would rather not incur.
>>
>> What cost is this? Are you speaking of cost in terms of what you the
>> programmer would have to do, cost in terms of setting things up and
>> communicating with the process, or the cost of creating a process vs a
>> thread? If it's the last, on most modern OS's (particularly Linux),
>> it's really not that expensive. On Linux the cost of threads and
>> processes are nearly the same.
>
> If you want my guess, the cost of going to multiple processes would be
> to do with passing data back and forth between them. Why is Python
> being embedded in another application? Sounds like there's data moving
> from C to Python to C, ergo breaking that into separate processes
> means lots of IPC.

An excellent guess. :)

One characteristic of the application I am looking to embed Python in is that there are a fairly large number of calls from the app into Python, and for each, generally many back to the app. There is a healthy amount of data flowing back and forth each time. An implementation with an inter-process roundtrip each time (with a different scripting language) proved to be too limiting, and needlessly complicated the design of the app. As such, more development effort has gone into making things work better with components that work well running across thread boundaries than process boundaries.

I am confident at this point I could pull things off with a Python one-interpreter-per-process design, but I'd then need to visit the IPC side of things again and put up with the limitations that arise. Additionally, the IPC code has had less attention and isn't as capable. I know roughly how I'd proceed if I went with this approach, but it is the least desirable outcome of the two.

However, if I could manage to get a thread-based solution going, I can put the effort where it is most productive, namely into making sure that the thread-based solution works best. This is my preferred outcome and current goal. :)

Cheers,
Garth
--
https://mail.python.org/mailman/listinfo/python-list
Re: Embedding multiple interpreters
Hi all, A small update here: On 06/12/13 13:04, Garthy wrote: > [1] It presently crashes in Py_EndInterpreter() after running through a > series of tests during the shutdown of the 32nd interpreter I create. I > don't know if this is significant, but the tests pass for the first 31 > interpreters. This turned out to be a red herring, so please ignore this bit. I had a code path that failed to call Py_INCREF on Py_None which was held in a PyObject that was later Py_DECREF'd. This had some interesting consequences, and not surprisingly led to some double-frees. ;) I was able to get much further with this fix, although I'm still having some trouble getting multiple interpreters running together simultaneously. Advice and thoughts still very much welcomed on the rest of the email. :) Cheers, Garth -- https://mail.python.org/mailman/listinfo/python-list
Re: Embedding multiple interpreters
Hi Gregory, On 06/12/13 17:28, Gregory Ewing wrote: > Garthy wrote: >> I am running into problems when using multiple interpreters [1] and I >> am presently trying to track down these issues. Can anyone familiar >> with the process of embedding multiple interpreters have a skim of the >> details below and let me know of any obvious problems? > > As far as I know, multiple interpreters in one process is > not really supported. There *seems* to be partial support for > it in the code, but there is no way to fully isolate them > from each other. That's not good to hear. Is there anything confirming that it's an incomplete API insofar as multiple interpreters are concerned? Wouldn't this carry consequences for say mod_wsgi, which also does this? > Why do you think you need multiple interpreters, as opposed > to one interpreter with multiple threads? If you're trying > to sandbox the threads from each other and/or from the rest > of the system, be aware that it's extremely difficult to > securely sandbox Python code. You'd be much safer to run > each one in its own process and rely on OS-level protections. To allow each script to run in its own environment, with minimal chance of inadvertent interaction between the environments, whilst allowing each script the ability to stall on conditions that will be later met by another thread supplying the information, and to fit in with existing infrastructure. >> - I don't need to share objects between interpreters (if it is even >> possible- I don't know). > > The hard part is *not* sharing objects between interpreters. > If nothing else, all the builtin type objects, constants, etc. > will be shared. I understand. To clarify: I do not need to pass any Python objects I create or receive back and forth between different interpreters. I can imagine some environments would not react well to this. Cheers, Garth PS. Apologies if any of these messages come through more than once. 
Most lists that I've posted to set reply-to meaning a normal reply can be used, but python-list does not seem to. The replies I have sent manually to python-list@python.org instead don't seem to have appeared. I'm not quite sure what is happening- apologies for any blundering around on my part trying to figure it out. -- https://mail.python.org/mailman/listinfo/python-list
Re: Embedding multiple interpreters
On Fri, Dec 6, 2013 at 6:59 PM, Garthy wrote: > Hi Chris (and Michael), Hehe. People often say that to me IRL, addressing me and my brother. But he isn't on python-list, so you clearly mean Michael Torrie, yet my brain still automatically thought you were addressing Michael Angelico :) > To allow each script to run in its own environment, with minimal chance of > inadvertent interaction between the environments, whilst allowing each > script the ability to stall on conditions that will be later met by another > thread supplying the information, and to fit in with existing > infrastructure. Are the scripts written cooperatively, or must you isolate one from another? If you need to isolate them for trust reasons, then there's only one solution, and that's separate processes with completely separate interpreters. But if you're prepared to accept that one thread of execution is capable of mangling another's state, things are a lot easier. You can protect against *inadvertent* interaction much more easily than malicious interference. It may be that you can get away with simply running multiple threads in one interpreter; obviously that would have problems if you need more than one CPU core between them all (hello GIL), but that would really be your first limit. One thread could fiddle with __builtins__ or a standard module and thus harass another thread, but you would know if that's what's going on. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
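To make the last point concrete, here is a minimal sketch of two "scripts" running as threads in one interpreter, where one of them patches a standard module and the other sees the change. The function names and the choice of json as the patched module are assumptions for illustration only; nothing here comes from Garthy's application.

```python
import json
import threading


def rogue_script():
    # Hypothetical badly behaved script: replaces a function on a
    # standard module that every thread in the interpreter shares.
    json.dumps = lambda obj, **kwargs: "<hijacked>"


def well_behaved_script(results):
    # Hypothetical innocent script: just uses the standard module.
    results.append(json.dumps({"a": 1}))


def run_demo():
    original = json.dumps
    results = []
    try:
        # Run the scripts one after another, each in its own thread.
        for target, args in ((rogue_script, ()), (well_behaved_script, (results,))):
            worker = threading.Thread(target=target, args=args)
            worker.start()
            worker.join()
    finally:
        json.dumps = original  # undo the patch so the demo is repeatable
    return results
```

Calling run_demo() shows the second thread getting the patched behaviour, which is the kind of inadvertent interaction that threads in a single interpreter cannot rule out.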
squeeze out some performance
Hi, I try to squeeze out some performance of the code pasted on the link below. http://pastebin.com/gMnqprST The code will be used to continuously analyze sonar sensor data. I set this up to calculate all coordinates in a sonar cone without heavy use of trigonometry (assuming that this way is faster in the end). I optimized as much as I could. Maybe one of you has another bright idea to squeeze out a bit more? Thanks Robert -- https://mail.python.org/mailman/listinfo/python-list
Re: using ffmpeg command line with python's subprocess module
On 06/12/2013 06:23, iMath wrote: Dearest iMath, wouldst thou be kind enough to partake of obtaining some type of email client that dost not sendeth double spaced data into this most illustrious of mailing lists/newsgroups. Thanking thee for thine participation in my most humble of requests. I do remain your most obedient servant. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: Embedding multiple interpreters
On Fri, Dec 6, 2013 at 7:21 PM, Garthy wrote: > PS. Apologies if any of these messages come through more than once. Most > lists that I've posted to set reply-to meaning a normal reply can be used, > but python-list does not seem to. The replies I have sent manually to > python-list@python.org instead don't seem to have appeared. I'm not quite > sure what is happening- apologies for any blundering around on my part > trying to figure it out. They are coming through more than once. If you're subscribed to the list, sending to python-list@python.org should be all you need to do - where else are they going? ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: squeeze out some performance
Robert Voigtländer wrote:
> I try to squeeze out some performance of the code pasted on the link
> below. http://pastebin.com/gMnqprST
>
> The code will be used to continuously analyze sonar sensor data. I set
> this up to calculate all coordinates in a sonar cone without heavy use of
> trigonometry (assuming that this way is faster in the end).
>
> I optimized as much as I could. Maybe one of you has another bright idea
> to squeeze out a bit more?

This sort of code is probably harder to make faster in pure python. You could try profiling it to see where the hot spots are. Perhaps the choice of arrays or sets might have some speed impact.

One idea would be to use something like cython to compile your python code to an extension module, with some hints to the types of the various values.

I would go down the geometry route. If you can restate your problem in terms of geometry, it might be possible to replace all that code with a few numpy array operations.

e.g. for finding pixels in a circle of radius 50

    import numpy as np
    radiussqd = np.fromfunction(lambda y,x: (y-50)**2+(x-50)**2, (100,100) )
    all_y, all_x = np.indices((100,100))
    yvals = all_y[radiussqd < 50**2]

Jeremy
--
https://mail.python.org/mailman/listinfo/python-list
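Extending the circle idea to a cone might look something like the sketch below. It is only an illustration under assumptions: the function name cone_mask and its parameters (grid shape, sensor origin as row/column, heading and half opening angle in degrees, maximum range in cells) are hypothetical and are not taken from the pastebin code. The trigonometry is evaluated once for the cone axis rather than once per cell.

```python
import numpy as np


def cone_mask(shape, origin, heading_deg, half_angle_deg, max_range):
    """Return a boolean mask of the grid cells inside a cone (circular sector)."""
    ys, xs = np.indices(shape)
    dy = ys - origin[0]          # row offset of every cell from the sensor
    dx = xs - origin[1]          # column offset of every cell from the sensor
    dist_sq = dx**2 + dy**2

    # Unit vector along the cone axis (x follows columns, y follows rows).
    heading = np.deg2rad(heading_deg)
    axis_x, axis_y = np.cos(heading), np.sin(heading)

    # Component of each offset along the axis. A cell lies within the opening
    # angle when this is at least |offset| * cos(half_angle).
    along = dx * axis_x + dy * axis_y
    cos_half = np.cos(np.deg2rad(half_angle_deg))
    within_angle = along >= np.sqrt(dist_sq) * cos_half
    within_range = dist_sq <= max_range**2
    return within_angle & within_range
```

The coordinates themselves can then be pulled out with np.nonzero(mask), in the same spirit as the all_y[...] indexing above.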
ANN: eGenix PyRun - One file Python Runtime 1.3.1
ANNOUNCING

eGenix PyRun - One file Python Runtime

Version 1.3.1

An easy-to-use single file relocatable Python run-time - available for Linux, Mac OS X and Unix platforms, with support for Python 2.5, 2.6 and 2.7

This announcement is also available on our web-site for online reading:
http://www.egenix.com/company/news/eGenix-PyRun-1.3.0-GA.html

INTRODUCTION

Our new eGenix PyRun combines a Python interpreter with an almost complete Python standard library into a single easy-to-use executable, that does not require a system wide installation and is fully relocatable.

eGenix PyRun's executable only needs 11MB, but still supports most Python applications and scripts - and it can be further compressed to just 3-4MB using upx.

Compared to a regular Python installation of typically 100MB on disk, this makes eGenix PyRun ideal for applications and scripts that need to be distributed to many target machines, client installations or customers.

It makes "installing" Python on a Unix based system as simple as copying a single file.

We have been using the product internally in our mxODBC Connect Server since 2008 with great success and have now extracted it into a stand-alone open-source product.

We provide both the source archive to build your own eGenix PyRun, as well as pre-compiled binaries for Linux, FreeBSD and Mac OS X, as 32- and 64-bit versions. The binaries can be downloaded manually, or you can let our automatic install script install-pyrun take care of the installation:

    ./install-pyrun dir

and you're done.

Please see the product page for more details:
http://www.egenix.com/products/python/PyRun/

NEWS

This is a new minor release of eGenix PyRun, which comes with updates to the latest Python releases and includes a number of compatibility enhancements.

New Features

* Upgraded eGenix PyRun to work with and use Python 2.7.6 per default.
* Upgraded eGenix PyRun to use Python 2.6.9 as default Python 2.6 version.

install-pyrun Quick Installation Enhancements

Since version 1.1.0, eGenix PyRun includes a shell script called install-pyrun, which greatly simplifies installation of eGenix PyRun. It works much like the virtualenv shell script used for creating new virtual environments (except that there's nothing virtual about PyRun environments).

https://downloads.egenix.com/python/install-pyrun

With the script, an eGenix PyRun installation is as simple as running:

    ./install-pyrun targetdir

We have updated this script since the last release:

* install-pyrun now defaults to installing setuptools 1.4.2 and pip 1.4.1 when looking for local downloads of these tools.

For a complete list of changes, please see the eGenix PyRun Changelog:
http://www.egenix.com/products/python/PyRun/changelog.html

For a list of changes in the 1.3.0 minor release, please read the eGenix PyRun 1.3.0 announcement:
http://www.egenix.com/company/news/eGenix-PyRun-1.3.0-GA.html

Presentation at EuroPython 2012

Marc-André Lemburg, CEO of eGenix, gave a presentation about eGenix PyRun at EuroPython 2012 last year. The talk video as well as the slides are available on our website:
http://www.egenix.com/library/presentations/EuroPython2012-eGenix-PyRun/

LICENSE

eGenix PyRun is distributed under the eGenix.com Public License 1.1.0 which is an Open Source license similar to the Python license. You can use eGenix PyRun in both commercial and non-commercial settings without fee or charge.

Please see our license page for more details:
http://www.egenix.com/products/python/PyRun/license.html

The package comes with full source code.

DOWNLOADS

The download archives and instructions for installing eGenix PyRun can be found at:
http://www.egenix.com/products/python/PyRun/

As always, we are providing pre-built binaries for all common platforms: Windows 32/64-bit, Linux 32/64-bit, FreeBSD 32/64-bit, Mac OS X 32/64-bit. Source code archives are available for installation on other platforms, such as Solaris, AIX, HP-UX, etc.

SUPPORT

Commercial support for this product is available from eGenix.com. Please see http://www.egenix.com/services/support/ for details about our support offerings.

MORE INFORMATION

For more information about eGenix PyRun, licensing and download instructions, please visit our w
Re: Embedding multiple interpreters
Hi Chris, On 06/12/13 19:03, Chris Angelico wrote: > On Fri, Dec 6, 2013 at 6:59 PM, Garthy > wrote: >> Hi Chris (and Michael), > > Hehe. People often say that to me IRL, addressing me and my brother. > But he isn't on python-list, so you clearly mean Michael Torrie, yet > my brain still automatically thought you were addressing Michael > Angelico :) These strange coincidences happen from time to time- it's entertaining when they do. :) >> To allow each script to run in its own environment, with minimal chance of >> inadvertent interaction between the environments, whilst allowing each >> script the ability to stall on conditions that will be later met by another >> thread supplying the information, and to fit in with existing >> infrastructure. > > Are the scripts written cooperatively, or must you isolate one from > another? If you need to isolate them for trust reasons, then there's > only one solution, and that's separate processes with completely > separate interpreters. But if you're prepared to accept that one > thread of execution is capable of mangling another's state, things are > a lot easier. You can protect against *inadvertent* interaction much > more easily than malicious interference. It may be that you can get > away with simply running multiple threads in one interpreter; > obviously that would have problems if you need more than one CPU core > between them all (hello GIL), but that would really be your first > limit. One thread could fiddle with __builtins__ or a standard module > and thus harass another thread, but you would know if that's what's > going on. I think the ideal is completely sandboxed, but it's something that I understand I may need to make compromises on. The bare minimum would be protection against inadvertent interaction. Better yet would be a setup that made such interaction annoyingly difficult, and the ideal would be where it was impossible to interfere. 
My approaching this problem with interpreters was based on an assumption that it might provide a reasonable level of isolation- perhaps not ideal, but hopefully good enough. The closest analogy for understanding would be browser plugins: Scripts from multiple authors who for the most part aren't looking to create deliberate incompatibilities or interference between plugins. The isolation is basic, and some effort is made to make sure that one plugin can't cripple another trivially, but the protection is not exhaustive. Strangely enough, the GIL restriction isn't a big one in this case. For the application, the common case is actually one script running at a time, with other scripts waiting or not running at that time. They do sometimes overlap, but this isn't the common case. If it turned out that only one script could be progressing at a time, it's an annoyance but not a deal-breaker. If it's suboptimal (as seems to be the case), then it's actually not a major issue. With the single interpreter and multiple thread approach suggested, do you know if this will work with threads created externally to Python, ie. if I can create a thread in my application as normal, and then call something like PyGILState_Ensure() to make sure that Python has the internals it needs to work with it, and then use the GIL (or similar) to ensure that accesses to it remain thread-safe? If the answer is yes I can integrate such a thing more easily as an experiment. If it requires calling a dedicated "control" script that feeds out threads then it would need a fair bit more mucking about to integrate- I'd like to avoid this if possible. Cheers, Garth -- https://mail.python.org/mailman/listinfo/python-list
Re: Embedding multiple interpreters
Hi Chris, On 06/12/13 19:57, Chris Angelico wrote: > On Fri, Dec 6, 2013 at 7:21 PM, Garthy > wrote: >> PS. Apologies if any of these messages come through more than once. Most >> lists that I've posted to set reply-to meaning a normal reply can be used, >> but python-list does not seem to. The replies I have sent manually to >> python-list@python.org instead don't seem to have appeared. I'm not quite >> sure what is happening- apologies for any blundering around on my part >> trying to figure it out. > > They are coming through more than once. If you're subscribed to the > list, sending to python-list@python.org should be all you need to do - > where else are they going? I think I've got myself sorted out now. The mailing list settings are a bit different from what I am used to and I just need to reply to messages differently than I normally do. First attempt for three emails each went to the wrong place, second attempt for each appeared to have disappeared into the ether and I assumed non-delivery, but I was incorrect and they all actually arrived along with my third attempt at each. Apologies to all for the inadvertent noise. Cheers, Garth -- https://mail.python.org/mailman/listinfo/python-list
Re: Embedding multiple interpreters
On 06/12/2013 09:27, Chris Angelico wrote: > On Fri, Dec 6, 2013 at 7:21 PM, Garthy > wrote: >> PS. Apologies if any of these messages come through more than once. Most >> lists that I've posted to set reply-to meaning a normal reply can be used, >> but python-list does not seem to. The replies I have sent manually to >> python-list@python.org instead don't seem to have appeared. I'm not quite >> sure what is happening- apologies for any blundering around on my part >> trying to figure it out. > > They are coming through more than once. If you're subscribed to the > list, sending to python-list@python.org should be all you need to do - > where else are they going? I released a batch from the moderation queue from Garthy first thing this [my] morning -- ie about 1.5 hours ago. I'm afraid I didn't check first as to whether they'd already got through to the list some other way. TJG -- https://mail.python.org/mailman/listinfo/python-list
Re: using ffmpeg command line with python's subprocess module
On 12/6/13 4:23 AM, Mark Lawrence wrote: On 06/12/2013 06:23, iMath wrote: Dearest iMath, wouldst thou be kind enough to partake of obtaining some type of email client that dost not sendeth double spaced data into this most illustrious of mailing lists/newsgroups. Thanking thee for thine participation in my most humble of requests. I do remain your most obedient servant. iMath seems to be a native Chinese speaker. I think this message, though amusing, will be baffling and won't have any effect... --Ned. -- https://mail.python.org/mailman/listinfo/python-list
Re: Embedding multiple interpreters
On Fri, Dec 6, 2013 at 8:35 PM, Garthy wrote:
> I think the ideal is completely sandboxed, but it's something that I
> understand I may need to make compromises on. The bare minimum would be
> protection against inadvertent interaction. Better yet would be a setup that
> made such interaction annoyingly difficult, and the ideal would be where it
> was impossible to interfere.

In Python, "impossible to interfere" is a pipe dream. There's no way to stop Python from fiddling around with the file system, and if ctypes is available, with memory in the running program. The only way to engineer that kind of protection is to prevent _the whole process_ from doing those things (using OS features, not Python features), hence the need to split the code out into another process (which might be chrooted, might be running as a user with no privileges, etc).

A setup that makes such interaction "annoyingly difficult" is possible as long as your users don't think Ruby. For instance:

    # script1.py
    import sys
    sys.stdout = open("logfile", "w")
    while True: print("Blah blah")

    # script2.py
    import sys
    sys.stdout = open("otherlogfile", "w")
    while True: print("Bleh bleh")

These two scripts won't play nicely together, because each has modified global state in a different module. So you'd have to set that as a rule. (For this specific example, you probably want to capture stdout/stderr to some sort of global log file anyway, and/or use the logging module, but it makes a simple example.) Most Python scripts aren't going to do this sort of thing, or if they do, will do very little of it. Monkey-patching other people's code is a VERY rare thing in Python.

> The closest analogy for understanding would be browser plugins: Scripts from
> multiple authors who for the most part aren't looking to create deliberate
> incompatibilities or interference between plugins. The isolation is basic,
> and some effort is made to make sure that one plugin can't cripple another
> trivially, but the protection is not exhaustive.

Browser plugins probably need a lot more protection - maybe it's not exhaustive, but any time someone finds a way for one plugin to affect another, the plugin / browser authors are going to treat it as a bug. If I understand you, though, this is more akin to having two forms on one page and having JS validation code for each. It's trivially easy for one to check the other's form objects, but quite simple to avoid too, so for the sake of encapsulation you simply stay safe.

> With the single interpreter and multiple thread approach suggested, do you
> know if this will work with threads created externally to Python, ie. if I
> can create a thread in my application as normal, and then call something
> like PyGILState_Ensure() to make sure that Python has the internals it needs
> to work with it, and then use the GIL (or similar) to ensure that accesses
> to it remain thread-safe?

Now that's something I can't help with. The only time I embedded Python seriously was a one-Python-per-process system (arbitrary number of processes fork()ed from one master, but each process had exactly one Python environment and exactly one database connection, etc), and I ended up being unable to make it secure, so I had to switch to embedding ECMAScript (V8, specifically, as it happens... I'm morbidly curious what my boss plans to do, now that he's fired me; he hinted at rewriting the C++ engine in PHP, and I'd love to be a fly on the wall as he tries to test a PHP extension for V8 and figure out whether or not he can trust arbitrary third-party compiled code). But there'll be someone on this list who's done threads and embedded Python.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: squeeze out some performance
On Fri, Dec 6, 2013 at 8:46 PM, Jeremy Sanders wrote:
> This sort of code is probably harder to make faster in pure python. You
> could try profiling it to see where the hot spots are. Perhaps the choice of
> arrays or sets might have some speed impact.

I'd make this recommendation MUCH stronger.

Rule 1 of optimization: Don't.
Rule 2 (for experts only): Don't yet.

Once you find that your program actually is running too slowly, then AND ONLY THEN do you start looking at tightening something up. You'll be amazed how little you need to change; start with good clean idiomatic code, and then if it takes too long, you tweak just a couple of things and it's fast enough. And when you do come to the tweaking...

Rule 3: Measure twice, cut once.
Rule 4: Actually, measure twenty times, cut once.

Profile your code to find out what's actually slow. This is very important. Here's an example from a real application (not in Python, it's in a semantically-similar language called Pike):

https://github.com/Rosuav/Gypsum/blob/d9907e1507c52189c83ae25f5d7be85235b616fa/window.pike

I noticed that I could saturate one CPU core by typing commands very quickly. Okay. That gets us past the first two rules (it's a MUD client, it should not be able to saturate one core of an i5). The code looks roughly like this:

    paint():
        for line in lines:
            if line_is_visible:
                paint_line(line)

    paint_line():
        for piece_of_text in text:
            if highlighted: draw_highlighted()
            else: draw_not_highlighted()

My first guess was that the actual drawing was taking the time, since that's a whole lot of GTK calls. But no; the actual problem was the iteration across all lines and then finding out if they're visible or not (possibly because it obliterates the CPU caches). Once the scrollback got to a million lines or so, that was prohibitively expensive. I didn't realize that until I actually profiled the code and _measured_ where the time was being spent.

How fast does your code run? How fast do you need it to run? Lots of optimization questions are answered by "Yaknow what, it don't even matter", unless you're running in a tight loop, or on a microcontroller, or something. Halving the time taken sounds great until you see that it's currently taking 0.0001 seconds and happens in response to user action.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
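For the "measure" step in Python, the standard library's cProfile and pstats modules are one way to see where the time goes. The sketch below is only illustrative: visible_lines_slow and visible_lines_fast are hypothetical stand-ins loosely modelled on the scrollback story above, not code from the Pike application, and profile is a small helper invented for this example.

```python
import cProfile
import io
import pstats


def visible_lines_slow(lines, first, last):
    # Hypothetical stand-in: tests every line in the scrollback for visibility.
    return [text for index, text in enumerate(lines) if first <= index < last]


def visible_lines_fast(lines, first, last):
    # Hypothetical stand-in: only touches the lines that are visible.
    return lines[first:last]


def profile(func, *args):
    """Run func under cProfile; return its result and a short text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and print only the five most expensive entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()
```

Printing the report for each variant on a large list gives numbers to compare instead of a guess about which part is slow.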
[newbie] problem trying out simple non object oriented use of Tkinter
I'm trying out Tkinter with the (non object oriented) code fragment below.
It works partially as I expected, but I thought that pressing "1" would
cause the program to quit. However, I get this message:

    TypeError: quit() takes no arguments (1 given)

I tried changing quit to quit() but that makes things even worse. So my
question: can anyone here help me debug this?

    #!/usr/bin/env python
    import Tkinter as tk
    def quit():
        sys.exit()
    root = tk.Tk()
    label = tk.Label(root, text="Hello, world")
    label.pack()
    label.bind("<1>", quit)
    root.mainloop()

p.s. I like the code not object orientated
-- https://mail.python.org/mailman/listinfo/python-list
Re: [newbie] problem trying out simple non object oriented use of Tkinter
- Original Message -
> I'm trying out Tkinter with the (non object oriented) code fragment
> below:
> It works partially as I expected, but I thought that pressing "1" would
> cause the program to quit, however I get this message:
> TypeError: quit() takes no arguments (1 given), I tried changing quit
> to quit() but that makes things even worse. So my question: can anyone
> here help me debug this?
>
> #!/usr/bin/env python
> import Tkinter as tk
> def quit():
>     sys.exit()
> root = tk.Tk()
> label = tk.Label(root, text="Hello, world")
> label.pack()
> label.bind("<1>", quit)
> root.mainloop()
>
> p.s. I like the code not object orientated

The engine is probably passing an argument to your quit callback method. Try:

    def quit(param):
        sys.exit(str(param))

You probably don't even care about the parameter:

    def quit(param):
        sys.exit()

JM
-- https://mail.python.org/mailman/listinfo/python-list
Re: [newbie] problem trying out simple non object oriented use of Tkinter
Hi Jean, On Fri, Dec 06, 2013 at 04:24:59AM -0800, Jean Dubois wrote: > I'm trying out Tkinter with the (non object oriented) code fragment below: > It works partially as I expected, but I thought that pressing "1" would > cause the program to quit, however I get this message: > TypeError: quit() takes no arguments (1 given), I tried changing quit to > quit() > but that makes things even worse. So my question: can anyone here help me > debug this? I don't know the details of the Tkinter library, but you could find out what quit is being passed by modifying it to take a single parameter and printing it out (or using pdb): def quit(param): print(param) sys.exit() Having taken a quick look at the documentation, it looks like event handlers (like your quit function) are passed the event that triggered them. So you can probably just ignore the parameter: def quit(_): sys.exit() Cheers, Dan -- https://mail.python.org/mailman/listinfo/python-list
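The suggestion above can be sketched as follows. This is an illustration under stated assumptions rather than the poster's final program: the handler is renamed quit_app (a hypothetical name, chosen to avoid shadowing the built-in quit), it uses the Python 3 module name tkinter where the thread uses Python 2's Tkinter, and the import sits inside main() so the handler can be tried without a display.

```python
import sys


def quit_app(event=None):
    # Tkinter calls a bound handler with one argument, the event object.
    # The default value lets the same function be called with no arguments too.
    sys.exit()


def main():
    # Imported here so quit_app can be exercised on a machine without a display.
    import tkinter as tk  # assumption: Python 3; the thread uses "Tkinter"

    root = tk.Tk()
    label = tk.Label(root, text="Click here to quit")
    label.pack()
    label.bind("<1>", quit_app)  # "<1>" is shorthand for "<Button-1>"
    root.mainloop()
```

Calling main() opens the window; clicking the label with mouse button 1 then ends the program.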
Re: Managing Google Groups headaches
On Friday, December 6, 2013 1:06:30 PM UTC+5:30, Roy Smith wrote: > Rusi wrote: > > On Thursday, December 5, 2013 6:28:54 AM UTC+5:30, Roy Smith wrote: > > > The real problem with web forums is they conflate transport and > > > presentation into a single opaque blob, and are pretty much universally > > > designed to be a closed system. Mail and usenet were both engineered to > > > make a sharp division between transport and presentation, which meant it > > > was possible to evolve each at their own pace. > > > Mostly that meant people could go off and develop new client > > > applications which interoperated with the existing system. But, it also > > > meant that transport layers could be switched out (as when NNTP > > > gradually, but inexorably, replaced UUCP as the primary usenet transport > > > layer). > > There is a deep assumption hovering round-about the above -- what I > > will call the 'Unix assumption(s)'. > It has nothing to do with Unix. The separation of transport from > presentation is just as valid on Windows, Mac, etc. > > But before that, just a check on > > terminology. By 'presentation' you mean what people normally call > > 'mail-clients': thunderbird, mutt etc. And by 'transport' you mean > > sendmail, exim, qmail etc etc -- what normally are called > > 'mail-servers.' Right?? > Yes. > > Assuming this is the intended meaning of the terminology (yeah its > > clearer terminology than the usual and yeah Im also a 'Unix-guy'), > > here's the 'Unix-assumption': > > - human communication� > > (is not very different from) > > - machine communication� > > (can be done by) > > - text� > > (for which) > > - ASCII is fine� > > (which is just) > > - bytes� > > (inside/between byte-memory-organized) > > - von Neumann computers > > To the extent that these assumptions are invalid, the 'opaque-blob' > > may well be preferable. > I think you're off on the wrong track here. This has nothing to do with > plain text (ascii or otherwise). 
It has to do with divorcing how you > store and transport messages (be they plain text, HTML, or whatever) > from how a user interacts with them. Evidently (and completely inadvertently) this exchange has just illustrated one of the inadmissable assumptions: "unicode as a medium is universal in the same way that ASCII used to be" I wrote a number of ellipsis characters ie codepoint 2026 as in: - human communication… (is not very different from) - machine communication… Somewhere between my sending and your quoting those ellipses became the replacement character FFFD > > - human communication� > > (is not very different from) > > - machine communication� Leaving aside whose fault this is (very likely buggy google groups), this mojibaking cannot happen if the assumption "All text is ASCII" were to uniformly hold. Of course with unicode also this can be made to not happen, but that is fragile and error-prone. And that is because ASCII (not extended) is ONE thing in a way that unicode is hopelessly a motley inconsistent variety. With unicode there are in-memory formats, transportation formats eg UTF-8, strange beasties like FSR (which then hopelessly and inveterately tickle our resident trolls!) multi-layer encodings (in html), BOMS and unnecessary/inconsistent BOMS (in microsoft-notepad). With ASCII, ASCII is ASCII; ie "ABC" is 65,66,67 whether its in-core, in-file, in-pipe or whatever. Ok there are a few wrinkles to this eg. the null-terminator in C-strings. I think this is the exception to the rule that in classic Unix, ASCII is completely inter-operable and therefore a universal data-structure for inter-process or inter-machine communication. It is this universal data structure that makes classic unix pipes and filters possible and easy (of which your separation of presentation and transportation is just one case). Give it up and the composability goes with it. 
Go up from the ASCII -> Unicode level to the plain-text -> hypertext (aka html) level and these composability problems hit with redoubled force. > Take something like Wikipedia (by which, I really mean, MediaWiki, which > is the underlying software package). Most people think of Wikipedia as > a web site. But, there's another layer below that which lets you get > access to the contents of articles, navigate all the rich connections > like category trees, and all sorts of metadata like edit histories. > Which means, if I wanted to (and many examples of this exist), I can > write my own client which presents the same information in different > ways. Not sure whats your point. Html is a universal data-structuring format -- ok for presentation, bad for data-structuring SQL databases (assuming thats the mediawiki backend) is another -- ok for data-structuring bad for presentation. Mediawiki mediates between the two formats. Beyond that I lost you... what are you trying to say?? -- https://mail.python.org/mailman/listinfo/python-list
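A small sketch of how an ellipsis can turn into U+FFFD in transit. The specific route shown (sender encodes as Windows-1252, receiver decodes as UTF-8) is an assumption for illustration; the thread does not establish what actually happened inside Google Groups.

```python
ELLIPSIS = "\u2026"  # HORIZONTAL ELLIPSIS, the character discussed above

# Pure ASCII text has the same bytes in both encodings.
ascii_text = "ABC"
ascii_bytes_match = ascii_text.encode("utf-8") == ascii_text.encode("cp1252")

# The ellipsis does not: three bytes in UTF-8, one byte in Windows-1252.
as_utf8 = ELLIPSIS.encode("utf-8")
as_cp1252 = ELLIPSIS.encode("cp1252")

# Decoding the cp1252 byte as UTF-8 fails; with errors="replace" the
# decoder substitutes U+FFFD, the replacement character.
garbled = as_cp1252.decode("utf-8", errors="replace")

print(ascii_bytes_match, as_utf8, as_cp1252, repr(garbled))
```

The point of the sketch is only that sender and receiver must agree on an encoding once text leaves the ASCII range; which party disagreed in this exchange remains unknown.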
Re: [newbie] problem trying out simple non object oriented use of Tkinter
On Friday, 6 December 2013 13:30:53 UTC+1, Daniel Watkins wrote:
> Hi Jean,
>
> On Fri, Dec 06, 2013 at 04:24:59AM -0800, Jean Dubois wrote:
> > I'm trying out Tkinter with the (non object oriented) code fragment below:
> > It works partially as I expected, but I thought that pressing "1" would
> > cause the program to quit, however I get this message:
> > TypeError: quit() takes no arguments (1 given), I tried changing quit to
> > quit()
> > but that makes things even worse. So my question: can anyone here help me
> > debug this?
>
> I don't know the details of the Tkinter library, but you could find out
> what quit is being passed by modifying it to take a single parameter and
> printing it out (or using pdb):
>
>     def quit(param):
>         print(param)
>         sys.exit()
>
> Having taken a quick look at the documentation, it looks like event
> handlers (like your quit function) are passed the event that triggered
> them. So you can probably just ignore the parameter:
>
>     def quit(_):
>         sys.exit()
>
> Cheers,
>
> Dan

I tried out your suggestions and discovered that I had to add the line
import sys to the program. You can see below what I came up with. It works,
but not all of it is clear to me. Can you tell me what
label.bind("<1>", quit) stands for? What does the <1> mean?

    #!/usr/bin/env python
    import Tkinter as tk
    import sys
    #underscore is necessary in the following line
    def quit(_):
        sys.exit()
    root = tk.Tk()
    label = tk.Label(root, text="Click mouse here to quit")
    label.pack()
    label.bind("<1>", quit)
    root.mainloop()

thanks
jean
-- https://mail.python.org/mailman/listinfo/python-list
Re: Managing Google Groups headaches
On Sat, Dec 7, 2013 at 12:03 AM, rusi wrote: > SQL databases (assuming thats the mediawiki backend) is another -- ok for > data-structuring bad for presentation. No, SQL databases don't store structured text. MediaWiki just stores a single blob (not in the database sense of that word) of text. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Managing Google Groups headaches
On Friday, December 6, 2013 6:49:04 PM UTC+5:30, Chris Angelico wrote: > On Sat, Dec 7, 2013 at 12:03 AM, rusi wrote: > > SQL databases (assuming thats the mediawiki backend) is another -- ok for > > data-structuring bad for presentation. > No, SQL databases don't store structured text. MediaWiki just stores a > single blob (not in the database sense of that word) of text. I guess we are using 'structured' in different ways. All I am saying is that mediawiki which seems to present as html, actually stores its stuff as SQL -- nothing more or less structured than the schemas here: http://www.mediawiki.org/wiki/Manual:MediaWiki_architecture#Database_and_text_storage -- https://mail.python.org/mailman/listinfo/python-list
Re: Managing Google Groups headaches
On Sat, Dec 7, 2013 at 12:32 AM, rusi wrote: > I guess we are using 'structured' in different ways. All I am saying > is that mediawiki which seems to present as html, actually stores its > stuff as SQL -- nothing more or less structured than the schemas here: > http://www.mediawiki.org/wiki/Manual:MediaWiki_architecture#Database_and_text_storage Yeah, but the structure is all about the metadata. Ultimately, there's one single text field containing the entire content as you would see it in the page editor: wiki markup in straight text. MediaWiki uses an SQL database to store that lump of text, but ultimately the relationship is between wikitext and HTML, no SQL involvement. Wiki markup is reasonable for text structuring. (Not for generic data structuring, but it's decent for text.) Same with reStructuredText, used for PEPs. An SQL database is a good way to store mappings of "this key, this tuple of data" and retrieve them conveniently, including (and this is the bit that's more complicated in a straight Python dictionary) using any value out of the tuple as the key, and (and this is where a dict *really* can't hack it) storing/retrieving more data than fits in memory. The two are orthogonal. Your point is better supported by wikitext than by SQL, here, except that there aren't fifty other systems that parse and display wikitext. In fact, what you're suggesting is a good argument for deprecating HTML email in favour of RST email, and using docutils to render the result either as HTML (for webmail users) or as some other format. And I wouldn't be against that :) But good luck convincing the world that Microsoft Outlook is doing the wrong thing. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
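To illustrate the "this key, this tuple of data" description, here is a rough sketch with the standard-library sqlite3 module. The pages table and its columns are hypothetical and much simpler than MediaWiki's real schema; the sketch only shows a lookup by key next to a lookup by another value in the tuple.

```python
import sqlite3

# Hypothetical schema for illustration only; MediaWiki's real tables differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (title TEXT PRIMARY KEY, editor TEXT, wikitext TEXT)"
)
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?)",
    [
        ("Python", "alice", "'''Python''' is a [[programming language]]."),
        ("Pike", "bob", "'''Pike''' is a [[programming language]]."),
        ("Tkinter", "alice", "'''Tkinter''' wraps [[Tk]]."),
    ],
)

# Lookup by the primary key, much as a dict keyed on title would do.
wikitext = conn.execute(
    "SELECT wikitext FROM pages WHERE title = ?", ("Python",)
).fetchone()[0]

# Lookup by a non-key column, which a dict keyed on title cannot do directly.
titles_by_alice = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM pages WHERE editor = ? ORDER BY title", ("alice",)
    )
]

print(wikitext)
print(titles_by_alice)
conn.close()
```

Note that the wikitext column is just an opaque string to the database: the markup inside it is only interpreted later, when something renders it.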
Re: Sharing Python installation between architectures
In article , Paul Smith wrote: >One thing I always liked about Perl was the way you can create a single >installation directory which can be shared between archictures. Say >what you will about the language: the Porters have an enormous amount of >experience and expertise producing portable and flexible interpreter >installations. > >By this I mean, basically, multiple architectures (Linux, Solaris, >MacOSX, even Windows) sharing the same $prefix/lib/python2.7 directory. >The large majority of the contents there are completely portable across >architectures (aren't they?) so why should I have to duplicate many >megabytes worth of files? The solution is of course to replace all duplicates by hard links. A tool for this is useful in a lot of other circumstances too. In a re-installation of the whole or parts, the hard links will be removed, and the actual files are only removed if they aren't needed for any of the installations, so this is transparent for reinstallation. After a lot of reinstallation you want to run the tool again. This is of course only possible on real file systems (probably not on FAT), but your files reside on a server, so chances are they are on a real file system. (The above is partly in jest. It is a real solution to storage problems, but storage problems are unheard of in these days of Tera byte disks. It doesn't help with the clutter, which was probably the main motivation.) Symbolic links are not as transparent, but they may work very well too. Have the common part set apart and replace everything else by symbolic links. There is always one more way to skin a cat. Groetjes Albert -- Albert van der Horst, UTRECHT,THE NETHERLANDS Economic growth -- being exponential -- ultimately falters. albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst -- https://mail.python.org/mailman/listinfo/python-list
Re: Embedding multiple interpreters
Hi Chris, On 06/12/13 22:27, Chris Angelico wrote: > On Fri, Dec 6, 2013 at 8:35 PM, Garthy > wrote: >> I think the ideal is completely sandboxed, but it's something that I >> understand I may need to make compromises on. The bare minimum would be >> protection against inadvertent interaction. Better yet would be a setup that >> made such interaction annoyingly difficult, and the ideal would be where it >> was impossible to interfere. > > In Python, "impossible to interfere" is a pipe dream. There's no way > to stop Python from fiddling around with the file system, and if > ctypes is available, with memory in the running program. The only way > to engineer that kind of protection is to prevent _the whole process_ > from doing those things (using OS features, not Python features), > hence the need to split the code out into another process (which might > be chrooted, might be running as a user with no privileges, etc). Absolutely- it would be an impractical ideal. If it was my highest and only priority, CPython might not be the best place to start. But there are plenty of other factors that make Python very desirable to use regardless. :) Re file and ctype-style functionality, that is something I'm going to have to find a way to limit somewhat. But first things first: I need to see what I can accomplish re initial embedding with a reasonable amount of work. > A setup that makes such interaction "annoyingly difficult" is possible > as long as your users don't think Ruby. For instance: > > # script1.py > import sys > sys.stdout = open("logfile", "w") > while True: print("Blah blah") > > # script2.py > import sys > sys.stdout = open("otherlogfile", "w") > while True: print("Bleh bleh") > > > These two scripts won't play nicely together, because each has > modified global state in a different module. So you'd have to set that > as a rule. 
(For this specific example, you probably want to capture > stdout/stderr to some sort of global log file anyway, and/or use the > logging module, but it makes a simple example.) Thanks for the example. Hopefully I can minimise the cases where this would potentially be a problem. Modifying the basic environment and the source is something I can do readily if needed. Re stdout/stderr, on that subject I actually wrote a replacement log catcher for embedded Python a few years back. I can't remember how on earth I did it now, but I've still got the code that did it somewhere. > Most Python scripts > aren't going to do this sort of thing, or if they do, will do very > little of it. Monkey-patching other people's code is a VERY rare thing > in Python. That's good to hear. :) >> The closest analogy for understanding would be browser plugins: Scripts from >> multiple authors who for the most part aren't looking to create deliberate >> incompatibilities or interference between plugins. The isolation is basic, >> and some effort is made to make sure that one plugin can't cripple another >> trivially, but the protection is not exhaustive. > > Browser plugins probably need a lot more protection - maybe it's not > exhaustive, but any time someone finds a way for one plugin to affect > another, the plugin / browser authors are going to treat it as a bug. > If I understand you, though, this is more akin to having two forms on > one page and having JS validation code for each. It's trivially easy > for one to check the other's form objects, but quite simple to avoid > too, so for the sake of encapsulation you simply stay safe. There have been cases where browser plugins have played funny games to mess with the behaviour of other plugins (eg. one plugin removing entries from the configuration of another). 
It's certainly not ideal, but it comes from the environment being not entirely locked down, and one plugin author being inclined enough to make destructive changes that impact another. I think the right effort/reward ratio will mean I end up in a similar place. I know it's not the best analogy, but it was one that readily came to mind. :) >> With the single interpreter and multiple thread approach suggested, do you >> know if this will work with threads created externally to Python, ie. if I >> can create a thread in my application as normal, and then call something >> like PyGILState_Ensure() to make sure that Python has the internals it needs >> to work with it, and then use the GIL (or similar) to ensure that accesses >> to it remain thread-safe? > > Now that's something I can't help with. The only time I embedded > Python seriously was a one-Python-per-process system (arbitrary number > of processes fork()ed from one master, but each process had exactly > one Python environment and exactly one database connection, etc), and > I ended up being unable to make it secure, so I had to switch to > embedding ECMAScript (V8, specifically, as it happens... I'm morbidly > curious what my boss plans to do, now that he's fired me; he hinted at > rewri
Re: [newbie] problem trying out simple non object oriented use of Tkinter
> I tried out your suggestions and discovered that I had the line
> import sys to the program. So you can see below what I came up with.
> It works but it's not all clear to me. Can you tell me what
> "label.bind("<1>", quit)" is standing for? What's the <1> meaning?
>
> #!/usr/bin/env python
> import Tkinter as tk
> import sys
> #underscore is necessary in the following line
> def quit(_):
>     sys.exit()
> root = tk.Tk()
> label = tk.Label(root, text="Click mouse here to quit")
> label.pack()
> label.bind("<1>", quit)
> root.mainloop()
>
> thanks
> jean

The best thing to do would be to read
http://effbot.org/tkinterbook/tkinter-events-and-bindings.htm

"<1>" is the identifier for your mouse button 1. quit is the callback
called by the label upon receiving the mouse button 1 click event.

Note that the parameter given to your quit callback is the event.

JM
-- https://mail.python.org/mailman/listinfo/python-list
Re: Embedding multiple interpreters
Hi Tim, On 06/12/13 20:47, Tim Golden wrote: On 06/12/2013 09:27, Chris Angelico wrote: On Fri, Dec 6, 2013 at 7:21 PM, Garthy wrote: PS. Apologies if any of these messages come through more than once. Most lists that I've posted to set reply-to meaning a normal reply can be used, but python-list does not seem to. The replies I have sent manually to python-list@python.org instead don't seem to have appeared. I'm not quite sure what is happening- apologies for any blundering around on my part trying to figure it out. They are coming through more than once. If you're subscribed to the list, sending to python-list@python.org should be all you need to do - where else are they going? I released a batch from the moderation queue from Garthy first thing this [my] morning -- ie about 1.5 hours ago. I'm afraid I didn't check first as to whether they'd already got through to the list some other way. I had to make a call between re-sending posts that might have gone missing, or seemingly not responding promptly when people had taken the time to answer my complex query. I made a call to re-send, and it was the wrong one. The fault for the double-posting is entirely mine. Cheers, Garth -- https://mail.python.org/mailman/listinfo/python-list
Re: Why is there no natural syntax for accessing attributes with names not being valid identifiers?
On 2013-12-04, Piotr Dobrogost wrote: > On Wednesday, December 4, 2013 10:41:49 PM UTC+1, Neil Cerutti > wrote: >> not something to do commonly. Your proposed syntax leaves the >> distinction between valid and invalid identifiers a problem >> the programmer has to deal with. It doesn't unify access to >> attributes the way the getattr and setattr do. > > Taking into account that obj.'x' would be equivalent to obj.x > any attribute can be accessed with the new syntax. I don't see > how this is not unified access compared to using getattr > instead dot... I thought of that argument later the next day. Your proposal does unify access if the old obj.x syntax is removed. -- Neil Cerutti -- https://mail.python.org/mailman/listinfo/python-list
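For reference, a short sketch of the existing getattr/setattr route that the thread compares the proposed obj.'x' syntax against. The Record class and the attribute names are made up for illustration.

```python
class Record:
    """Plain container used only for this illustration."""


rec = Record()

# A name that is a valid identifier works with both spellings.
setattr(rec, "x", 1)
same_value = rec.x == getattr(rec, "x")

# A name that is not a valid identifier is reachable only through
# getattr/setattr (or the instance __dict__), never through dot syntax.
setattr(rec, "content-type", "text/plain")
content_type = getattr(rec, "content-type")

print(same_value, content_type)
```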
Re: Managing Google Groups headaches
On Friday, December 6, 2013 7:18:19 PM UTC+5:30, Chris Angelico wrote: > On Sat, Dec 7, 2013 at 12:32 AM, rusi wrote: > > I guess we are using 'structured' in different ways. All I am saying > > is that mediawiki which seems to present as html, actually stores its > > stuff as SQL -- nothing more or less structured than the schemas here: > > http://www.mediawiki.org/wiki/Manual:MediaWiki_architecture#Database_and_text_storage > Yeah, but the structure is all about the metadata. Ok (I'd drop the 'all') > Ultimately, there's one single text field containing the entire content Right > as you would see it in the page editor: wiki markup in straight text. Aha! There you are! Its 'page editor' here and not the html which 'display source' (control-u) which a browser would show. And wikimedia is the software that mediates. The usual direction (seen by users of wikipedia) is that wikimedia takes this text, along with the other unrelated (metadata?) seen around -- sidebar, tabs etc, css settings and munges it all into html The other direction (seen by editors of wikipedia) is that you edit a page and that page and history etc will show the changes, reflecting the fact that the SQL content has changed. > MediaWiki uses an SQL database to store that lump of text, but > ultimately the relationship is between wikitext and HTML, no SQL > involvement. Dunno what you mean. Every time someone browses wikipedia, things are getting pulled out of the SQL and munged into the html (s)he sees. -- https://mail.python.org/mailman/listinfo/python-list
Re: Managing Google Groups headaches
On Sat, Dec 7, 2013 at 1:11 AM, rusi wrote: > Aha! There you are! Its 'page editor' here and not the html which > 'display source' (control-u) which a browser would show. And wikimedia > is the software that mediates. > > The usual direction (seen by users of wikipedia) is that wikimedia > takes this text, along with the other unrelated (metadata?) seen > around -- sidebar, tabs etc, css settings and munges it all into html > > The other direction (seen by editors of wikipedia) is that you edit a > page and that page and history etc will show the changes, > reflecting the fact that the SQL content has changed. MediaWiki is fundamentally very similar to a structure that I'm trying to deploy for a community web site that I host, approximately thus: * A git repository stores a bunch of RST files * A script auto-generates index files based on the presence of certain file names, and renders via rst2html * The HTML pages are served as static content MediaWiki is like this: * Each page has a history, represented by a series of state snapshots of wikitext * On display, the wikitext is converted to HTML and served. The main difference is that MediaWiki is optimized for rapid and constant editing, where what I'm pushing for is optimized for less common edits that might span multiple files. (MW has no facility for atomically changing multiple pages, and atomically reverting those changes, and so on. Each page stands alone.) They're still broadly doing the same thing: storing marked-up text and rendering HTML. The fact that one uses an SQL database and the other uses a git repository is actually quite insignificant - it's as significant as the choice of whether to store your data on a hard disk or an SSD. The system is no different. >> MediaWiki uses an SQL database to store that lump of text, but >> ultimately the relationship is between wikitext and HTML, no SQL >> involvement. > > Dunno what you mean. 
Every time someone browses wikipedia, things are > getting pulled out of the SQL and munged into the html (s)he sees. Yes, but that's just mechanics. The fact that the PHP scripts to operate Wikipedia are being pulled off a file system doesn't mean that MediaWiki is an ext3-to-HTML renderer. It's a wikitext-to-HTML renderer. Anyway. As I said, your point is still mostly there, as long as you use wikitext rather than SQL. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: using ffmpeg command line with python's subprocess module
在 2013年12月6日星期五UTC+8下午5时23分59秒,Mark Lawrence写道: > On 06/12/2013 06:23, iMath wrote: > > > > Dearest iMath, wouldst thou be kind enough to partake of obtaining some > > type of email client that dost not sendeth double spaced data into this > > most illustrious of mailing lists/newsgroups. Thanking thee for thine > > participation in my most humble of requests. I do remain your most > > obedient servant. > > > > -- > > My fellow Pythonistas, ask not what our language can do for you, ask > > what you can do for our language. > > > > Mark Lawrence yes ,I am a native Chinese speaker.I always post question by Google Group not through email ,is there something wrong with it ? your english is a little strange to me . -- https://mail.python.org/mailman/listinfo/python-list
Re: using ffmpeg command line with python's subprocess module
On Wednesday, December 4, 2013 at 6:51:49 PM UTC+8, Chris Angelico wrote:
> On Wed, Dec 4, 2013 at 8:38 PM, Andreas Perstinger wrote:
> > "fp" is a file object, but subprocess expects a list of strings as
> > its first argument.
>
> More fundamentally: The subprocess's arguments must include the *name*
> of the file. This means you can't use TemporaryFile at all, as it's
> not guaranteed to return an object that actually has a file name.
>
> There's another problem, too, and that's that you're not closing the
> file before expecting the subprocess to open it. And once you do that,
> you'll find that the file no longer exists once it's been closed. In
> fact, you'll need to research the tempfile module a bit to be able to
> do what you want here; rather than spoon-feed you an exact solution,
> I'll just say that there is one, and it can be found here:
>
> http://docs.python.org/3.3/library/tempfile.html
>
> ChrisA

I think you mean I should create a temporary file with NamedTemporaryFile().
After trying it many times, I found there is almost no convenience in
creating a temporary file rather than a persistent one here: we can't use
the temporary file while it has not been closed, so we can't rely on the
temporary file deleting itself automatically on close, and we have to delete
it later with os.remove() after it has been used in that command line.

The code without the with statement is here, but it is wrong; it shows this
line:

    c:\docume~1\admini~1\locals~1\temp\tmp0d8959: Invalid data found when processing input

    fp=tempfile.NamedTemporaryFile(delete=False)
    fp.write(("file '"+fileName1+"'\n").encode('utf-8'))
    fp.write(("file '"+fileName2+"'\n").encode('utf-8'))
    subprocess.call(['ffmpeg', '-f', 'concat','-i',fp.name, '-c', 'copy', fileName])
    fp.close()
-- https://mail.python.org/mailman/listinfo/python-list
Re: using ffmpeg command line with python's subprocess module
On Sat, Dec 7, 2013 at 1:54 AM, iMath wrote: > fp=tempfile.NamedTemporaryFile(delete=False) > fp.write(("file '"+fileName1+"'\n").encode('utf-8')) > fp.write(("file '"+fileName2+"'\n").encode('utf-8')) > > > subprocess.call(['ffmpeg', '-f', 'concat','-i',fp.name, '-c', 'copy', > fileName]) > fp.close() You need to close the file before getting the other process to use it. Otherwise, it may not be able to open the file at all, and even if it can, you might find that not all the data has been written. But congrats! You have successfully found the points I was directing you to. Yes, I was hinting that you need NamedTemporaryFile, the .name attribute, and delete=False. Good job! ChrisA -- https://mail.python.org/mailman/listinfo/python-list
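The ordering described above (write, close, run the other process, then delete) might look like the sketch below. The helper name run_with_list_file and its build_command parameter are hypothetical; to keep the example runnable without ffmpeg, the demonstration hands the list file to a second Python process that simply prints it back.

```python
import os
import subprocess
import sys
import tempfile


def run_with_list_file(entries, build_command):
    """Write a concat-style list file, close it, then run a command on it.

    build_command is a hypothetical callback that receives the list file's
    path and returns the argument list to execute.
    """
    fp = tempfile.NamedTemporaryFile(
        mode="w", suffix=".txt", delete=False, encoding="utf-8"
    )
    try:
        with fp:
            for name in entries:
                fp.write("file '%s'\n" % name)
        # The file is flushed and closed here, before the child opens it.
        return subprocess.check_output(build_command(fp.name))
    finally:
        # delete=False means cleanup is our job.
        os.remove(fp.name)


def echo_command(list_path):
    # Stand-in for ffmpeg: a child Python process that prints the file.
    script = "import sys; print(open(sys.argv[1], encoding='utf-8').read(), end='')"
    return [sys.executable, "-c", script, list_path]


output = run_with_list_file(["a.mp4", "b.mp4"], echo_command)
print(output.decode("utf-8"))
```

For the case in the thread, build_command would instead return something like the list from the original post, ['ffmpeg', '-f', 'concat', '-i', list_path, '-c', 'copy', fileName], with fileName supplied by the caller.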
Re: using ffmpeg command line with python's subprocess module
On 06/12/2013 14:52, iMath wrote: 在 2013年12月6日星期五UTC+8下午5时23分59秒,Mark Lawrence写道: On 06/12/2013 06:23, iMath wrote: Dearest iMath, wouldst thou be kind enough to partake of obtaining some type of email client that dost not sendeth double spaced data into this most illustrious of mailing lists/newsgroups. Thanking thee for thine participation in my most humble of requests. I do remain your most obedient servant. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence yes ,I am a native Chinese speaker.I always post question by Google Group not through email ,is there something wrong with it ? your english is a little strange to me . You can see the extra lines inserted by google groups above. It's not too bad in one and only one message, but when a message has been backwards and forwards several times it's extremely irritating, or worse still effectively unreadable. Work arounds have been posted on this list, but I'd recommend using any decent email client. The English I used was archaic, please ignore it :) -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: using ffmpeg command line with python's subprocess module
On Friday, December 6, 2013 8:22:48 PM UTC+5:30, iMath wrote:
> On Friday, December 6, 2013 at 5:23:59 PM UTC+8, Mark Lawrence wrote:
> > On 06/12/2013 06:23, iMath wrote:
> > Dearest iMath, wouldst thou be kind enough to partake of obtaining some
> > type of email client that dost not sendeth double spaced data into this
> > most illustrious of mailing lists/newsgroups. Thanking thee for thine
> > participation in my most humble of requests. I do remain your most
> > obedient servant.
> yes ,I am a native Chinese speaker.I always post question by Google Group not
> through email ,is there something wrong with it ?

Yes, but it's easily correctable. I recently answered this question for another poster here:
https://groups.google.com/forum/#!searchin/comp.lang.python/rusi$20google$20groups|sort:date/comp.lang.python/C51hEvi-KbY/KSeaMFoHtcIJ
Re: using ffmpeg command line with python's subprocess module
On Friday, December 6, 2013 8:42:02 PM UTC+5:30, Mark Lawrence wrote: > The English I used was archaic, please ignore it :) "Archaic" is almost archaic "Old" is ever-young :D -- https://mail.python.org/mailman/listinfo/python-list
Re: using ffmpeg command line with python's subprocess module
On Fri, 06 Dec 2013 06:52:48 -0800, iMath wrote:
> yes ,I am a native Chinese speaker.I always post question by Google
> Group not through email ,is there something wrong with it ? your
> english is a little strange to me .

Mark is writing in fake old-English style, the way people think English was spoken a thousand years ago. I don't know why he did that. Perhaps he thought it was amusing.

There are many problems with Google Groups. If you pay attention to this forum, you will see dozens of posts about "Managing Google Groups headaches" and other complaints:

- Google Groups double-spaces replies, so text which should appear like:

    line one
    line two
    line three
    line four

  turns into:

    line one
    blank line
    line two
    blank line
    line three
    blank line
    line four

- Google Groups often starts sending HTML code instead of plain text

- it often mangles indentation, which is terrible for Python code

- sometimes it automatically sets the reply address for posts to go to Google Groups, instead of the mailing list it should go to

- almost all of the spam on this forum comes from Google Groups, so many people automatically filter everything from Google Groups straight to the trash.

There are alternatives to Google Groups:

- the mailing list, python-list@python.org

- Usenet, comp.lang.python

- the Gmane mirror: http://gmane.org/find.php?list=python-list%40python.org

and possibly others. You will maximise the number of people reading your posts if you avoid Google Groups. If for some reason you cannot use any of the alternatives, please take the time to fix some of the problems with Google Groups. If you search the archives, you should find some posts by Rusi defending Google Groups and explaining what he does to make it more presentable, and (if I remember correctly) I think Mark also sometimes posts a link to managing Google Groups.

--
Steven
Re: Packaging a proprietary Python library for multiple OSs
On 12/5/13, 10:50 AM, Michael Herrmann wrote: On Thursday, December 5, 2013 4:26:40 PM UTC+1, Kevin Walzer wrote: On 12/5/13, 5:14 AM, Michael Herrmann wrote: If your library and their dependencies are simply .pyc files, then I don't see why a zip collated via py2exe wouldn't work on other platforms. Obviously this point is moot if your library includes true compiled (C-based) extensions. As I said, I need to make my *build* platform-independent. Giving this further thought, I'm wondering how hard it would be to roll your own using modulefinder, Python's zip tools, and some custom code. Just sayin'. --Kevin -- Kevin Walzer Code by Kevin/Mobile Code by Kevin http://www.codebykevin.com http://www.wtmobilesoftware.com -- https://mail.python.org/mailman/listinfo/python-list
Re: using ffmpeg command line with python's subprocess module
On 06/12/2013 15:34, Steven D'Aprano wrote: (if I remember correctly) I think Mark also sometimes posts a link to managing Google Groups. You do, and here it is https://wiki.python.org/moin/GoogleGroupsPython -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: using ffmpeg command line with python's subprocess module
On Friday, December 6, 2013 9:23:47 PM UTC+5:30, Mark Lawrence wrote: > On 06/12/2013 15:34, Steven D'Aprano wrote: > > (if I remember correctly) I think Mark also > > > sometimes posts a link to managing Google Groups. > > > > > You do, and here it is https://wiki.python.org/moin/GoogleGroupsPython That link needs updating. Even if my almost-automatic correction methods are not considered kosher for some reason or other, the thing that needs to go in there is that GG has TWO problems 1. Blank lines 2. Long lines That link only describes 1. Roy's yesterday's post in "Packaging a proprietary python library" says: > I, and Rusi, know enough, and take the effort, to overcome its > shortcomings doesn't change that. But in fact his post takes care of 1 not 2. In all fairness I did not know that 2 is a problem until rurpy pointed it out recently and was not correcting it. In fact, I'd take the trouble to make the lines long assuming that clients were intelligent enough to fit it properly into whatever was the current window!!! So someone please update that page! -- https://mail.python.org/mailman/listinfo/python-list
Re: using ffmpeg command line with python's subprocess module
On 06/12/2013 16:19, rusi wrote: On Friday, December 6, 2013 9:23:47 PM UTC+5:30, Mark Lawrence wrote: On 06/12/2013 15:34, Steven D'Aprano wrote: (if I remember correctly) I think Mark also sometimes posts a link to managing Google Groups. You do, and here it is https://wiki.python.org/moin/GoogleGroupsPython That link needs updating. Even if my almost-automatic correction methods are not considered kosher for some reason or other, the thing that needs to go in there is that GG has TWO problems 1. Blank lines 2. Long lines That link only describes 1. Roy's yesterday's post in "Packaging a proprietary python library" says: I, and Rusi, know enough, and take the effort, to overcome its shortcomings doesn't change that. But in fact his post takes care of 1 not 2. In all fairness I did not know that 2 is a problem until rurpy pointed it out recently and was not correcting it. In fact, I'd take the trouble to make the lines long assuming that clients were intelligent enough to fit it properly into whatever was the current window!!! So someone please update that page! This is a community so why don't you? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: squeeze out some performance
Thanks for your replies. I already did some basic profiling and optimized a lot. Especially with help of a goof python performance tips list I found. I think I'll follow the cython path. The geometry approach also sound good. But it's way above my math/geometry knowledge. Thanks for your input! -- https://mail.python.org/mailman/listinfo/python-list
Re: squeeze out some performance
On 06/12/2013 16:29, Robert Voigtländer wrote: Thanks for your replies. I already did some basic profiling and optimized a lot. Especially > with help of a goof python performance tips list I found. Wonderful typo -^ :) I think I'll follow the cython path. The geometry approach also sound good. But it's way above my math/geometry knowledge. Thanks for your input! -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: using ffmpeg command line with python's subprocess module
On 06/12/2013 15:34, Steven D'Aprano wrote: On Fri, 06 Dec 2013 06:52:48 -0800, iMath wrote: yes ,I am a native Chinese speaker.I always post question by Google Group not through email ,is there something wrong with it ? your english is a little strange to me . Mark is writing in fake old-English style, the way people think English was spoken a thousand years ago. I don't know why he did that. Perhaps he thought it was amusing. [snip] You're exaggerating. It's more like 500 years ago. :-) -- https://mail.python.org/mailman/listinfo/python-list
Re: squeeze out some performance
On Friday, 6 December 2013 17:36:03 UTC+1, Mark Lawrence wrote:
> > I already did some basic profiling and optimized a lot. Especially with
> > help of a goof python performance tips list I found.
>
> Wonderful typo -^ :)

Oh well :-) ... it was a good one. Just had a quick look at Cython. Looks great. Thanks for the tip.
Re: using ffmpeg command line with python's subprocess module
On Friday, December 6, 2013 9:55:54 PM UTC+5:30, Mark Lawrence wrote:
> On 06/12/2013 16:19, rusi wrote:
> > So someone please update that page!
> This is a community so why don't you?

OK, done (at least a first draft). I was under the impression that not just anyone could edit it.
Re: Why is there no natural syntax for accessing attributes with names not being valid identifiers?
On Friday, December 6, 2013 3:07:51 PM UTC+1, Neil Cerutti wrote: > On 2013-12-04, Piotr Dobrogost > > wrote: > > > On Wednesday, December 4, 2013 10:41:49 PM UTC+1, Neil Cerutti > > wrote: > > >> not something to do commonly. Your proposed syntax leaves the > >> distinction between valid and invalid identifiers a problem > >> the programmer has to deal with. It doesn't unify access to > > >> attributes the way the getattr and setattr do. > > > > > > Taking into account that obj.'x' would be equivalent to obj.x > > any attribute can be accessed with the new syntax. I don't see > > how this is not unified access compared to using getattr > > instead dot... > > I thought of that argument later the next day. Your proposal does > unify access if the old obj.x syntax is removed. As long as obj.x is a very concise way to get attribute named 'x' from object obj it's somehow odd that identifier x is treated not like identifier but like string literal 'x'. If it were treated like an identifier then we would get attribute with name being value of x instead attribute named 'x'. Making it possible to use string literals in the form obj.'x' as proposed this would make getattr basically needless as long as we use only variable not expression to denote attribute's name. This is just casual remark. Regards, Piotr -- https://mail.python.org/mailman/listinfo/python-list
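For readers following the thread, a small sketch of what is already possible today: `getattr` and `setattr` accept any string as an attribute name, including ones that are not valid identifiers, and they also cover the "name held in a variable" case that dot syntax cannot express. The `Record` class and the attribute names are placeholders chosen for illustration.

```python
class Record:
    """Plain container; instances keep their attributes in __dict__."""


def demo():
    obj = Record()

    # A valid identifier: dot syntax and getattr agree.
    obj.x = 1
    same = getattr(obj, "x") == obj.x

    # Not a valid identifier: only reachable through getattr/setattr.
    setattr(obj, "not an identifier", 2)
    odd = getattr(obj, "not an identifier")

    # Name held in a variable: the attribute is chosen at run time.
    name = "x"
    dynamic = getattr(obj, name)

    # getattr also takes a default for missing names.
    missing = getattr(obj, "no-such-name", None)
    return same, odd, dynamic, missing
```

The proposed `obj.'x'` form would cover the second case, but the third (a name computed at run time) would still need `getattr`, which matches the remark above about variables versus expressions.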
Re: squeeze out some performance
On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote: > I try to squeeze out some performance of the code pasted on the link below. > http://pastebin.com/gMnqprST Several comments: 1) I find this program to be very difficult to read, largely because there's a whole LOT of duplicated code. Look at lines 53-80, and lines 108-287, and lines 294-311. It makes it harder to see what this algorithm actually does. Is there a way to refactor some of this code to use some shared function calls? 2) I looked up the "Bresenham algorithm", and found two references which may be relevant. The original algorithm was one which computed good raster approximations to straight lines. The second algorithm described may be more pertinent to you, because it draws arcs of circles. http://en.wikipedia.org/wiki/Bresenham's_line_algorithm http://en.wikipedia.org/wiki/Midpoint_circle_algorithm Both of these algorithms are old, from the 1960's, and can be implemented using very simple CPU register operations and minimal memory. Both of the web pages I referenced have extensive example code and pseudocode, and discuss optimization. If you need speed, is this really a job for Python? 3) I THINK that I see some code -- those duplicated parts -- which might benefit from the use of multiprocessing (assuming that you have a multi-core CPU). But I would have to read more deeply to be sure. I need to understand the algorithm more completely, and exactly how you have modified it for your needs. -- https://mail.python.org/mailman/listinfo/python-list
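For reference, a compact sketch of the integer-only line algorithm mentioned in point 2. This is the common error-accumulator formulation that handles all octants; it is not taken from the pastebin code, and the function name `bresenham_line` is an assumption.

```python
def bresenham_line(x0, y0, x1, y1):
    """Return the grid points on a raster line from (x0, y0) to (x1, y1).

    Uses only integer additions and comparisons.
    """
    points = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    step_x = 1 if x0 < x1 else -1
    step_y = 1 if y0 < y1 else -1
    error = dx + dy  # tracks how far the raster path has drifted from the true line

    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        doubled = 2 * error
        if doubled >= dy:   # step along x
            error += dy
            x0 += step_x
        if doubled <= dx:   # step along y
            error += dx
            y0 += step_y
    return points
```

A single function like this, called once per ray, is the kind of shared helper that could replace the duplicated per-direction blocks mentioned in point 1.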
Re: using ffmpeg command line with python's subprocess module
On Friday, December 6, 2013 10:11:04 PM UTC+5:30, MRAB wrote: > On 06/12/2013 15:34, Steven D'Aprano wrote: > > On Fri, 06 Dec 2013 06:52:48 -0800, iMath wrote: > >> yes ,I am a native Chinese speaker.I always post question by Google > >> Group not through email ,is there something wrong with it ? your > >> english is a little strange to me . > > Mark is writing in fake old-English style, the way people think English > > was spoken a thousand years ago. I don't know why he did that. Perhaps he > > thought it was amusing. > [snip] > You're exaggerating. It's more like 500 years ago. :-) I was going to say the same until I noticed the "the way people think English was spoken..." That makes it unarguable -- surely there are some people who (wrongly) think so? -- https://mail.python.org/mailman/listinfo/python-list
interactive help on the base object
Is it just me, or is this basically useless?

>>> help(object)
Help on class object in module builtins:

class object
 |  The most base type

>>>

Surely a few more words, or a pointer to this http://docs.python.org/3/library/functions.html#object, would be better?

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.
Mark Lawrence
Re: [newbie] problem trying out simple non object oriented use of Tkinter
On 06.12.13 14:12, Jean Dubois wrote:
It works but it's not all clear to me. Can you tell me what "label.bind("<1>", quit)" is standing for? What's the <1> meaning?

"bind" connects events sent to the label with a handler. The <1> is the event description; in this case, it means a click with the left mouse button. The mouse buttons are numbered 1,2,3 for left,middle,right, respectively (with right and middle switched on OSX, confusingly). It is actually short for

Binding to the key "1" would look like this

The event syntax is rather complex, for example it is possible to add modifiers to bind to a Shift-key + right click like this

It is described in detail at the bind man page of Tk.
http://www.tcl.tk/man/tcl8.6/TkCmd/bind.htm

The event object passed to the handler contains additional information, for instance the position of the mouse pointer on the screen.

In practice, for large parts of the interface you do not mess with the keyboard and mouse events directly, but use the corresponding widgets. In your program, the label works as a simple pushbutton, and therefore a button should be used.

    #!/usr/bin/env python
    import Tkinter as tk
    import ttk  # for modern widgets
    import sys

    # no underscore - nothing gets passed
    def quit():
        sys.exit()

    root = tk.Tk()
    button = ttk.Button(root, text="Click mouse here to quit", command=quit)
    button.pack()
    root.mainloop()

Note that

1) nothing gets passed, so we could have left out changing quit(). This is because a button command usually does not care about details of the mouse click. It just reacts as the user expects.

2) I use ttk widgets, which provide native look&feel. If possible, use those. Good examples on ttk usage are shown at http://www.tkdocs.com/tutorial/index.html

HTH,
Christia
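A hedged sketch of the event patterns discussed in the message above. It is not the poster's own code: it uses the Python 3 module name `tkinter` (the message targets Python 2's `Tkinter`), and the constant and function names are made up for illustration. The long forms follow the Tk bind manual page, where `<1>` is documented as shorthand for `<ButtonPress-1>`.

```python
# Event patterns in their long form, as described in the Tk "bind" manual page.
LEFT_CLICK = "<ButtonPress-1>"               # "<1>" and "<Button-1>" are shorthands
KEY_ONE = "<KeyPress-1>"                     # the "1" key on the keyboard
SHIFT_RIGHT_CLICK = "<Shift-ButtonPress-3>"  # Shift held while pressing button 3


def describe(event):
    """Format the pointer position fields carried by a Tk event object."""
    return "widget (%d, %d), screen (%d, %d)" % (
        event.x, event.y, event.x_root, event.y_root)


def main():
    # Imported here so the definitions above load even without a display.
    import tkinter as tk

    root = tk.Tk()
    label = tk.Label(root, text="Click, press 1, or Shift+right-click to quit")
    label.pack()

    # Handlers bound with bind() receive one argument: the event object.
    label.bind(LEFT_CLICK, lambda event: print("click at", describe(event)))
    root.bind(KEY_ONE, lambda event: print("key 1 pressed"))
    label.bind(SHIFT_RIGHT_CLICK, lambda event: root.destroy())
    root.mainloop()
```

Calling `main()` opens the window. The key binding goes on `root` rather than the label because a plain label does not normally hold keyboard focus.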
Does Python optimize low-power functions?
The following two functions return the same result: x**2 x*x But they may be computed in different ways. The first choice can accommodate non-integer powers and so it would logically proceed by taking a logarithm, multiplying by the power (in this case, 2), and then taking the anti-logarithm. But for a trivial value for the power like 2, this is clearly a wasteful choice. Just multiply x by itself, and skip the expensive log and anti-log steps. My question is, what do Python interpreters do with power operators where the power is a small constant, like 2? Do they know to take the shortcut? -- https://mail.python.org/mailman/listinfo/python-list
Re: Does Python optimize low-power functions?
----- Original Message -----
> The following two functions return the same result:
>
> x**2
> x*x
>
> But they may be computed in different ways. The first choice can
> accommodate non-integer powers and so it would logically proceed by
> taking a logarithm, multiplying by the power (in this case, 2), and
> then taking the anti-logarithm. But for a trivial value for the
> power like 2, this is clearly a wasteful choice. Just multiply x by
> itself, and skip the expensive log and anti-log steps.
>
> My question is, what do Python interpreters do with power operators
> where the power is a small constant, like 2? Do they know to take
> the shortcut?

It is probably specific to the interpreter implementation (CPython, Jython, IronPython, etc.). You'd better optimize it yourself should you really care about this. An alternative is to use numpy functions, like numpy.power; they are optimized versions of most mathematical functions.

JM
Re: Does Python optimize low-power functions?
On 2013-12-06, John Ladasky wrote:
> The following two functions return the same result:
>
> x**2
> x*x
>
> But they may be computed in different ways. The first choice
> can accommodate non-integer powers and so it would logically
> proceed by taking a logarithm, multiplying by the power (in
> this case, 2), and then taking the anti-logarithm. But for a
> trivial value for the power like 2, this is clearly a wasteful
> choice. Just multiply x by itself, and skip the expensive log
> and anti-log steps.
>
> My question is, what do Python interpreters do with power
> operators where the power is a small constant, like 2? Do they
> know to take the shortcut?

It uses a couple of fast algorithms for computing powers. Here's the excerpt with the comments identifying the algorithms used.

From longobject.c:

    if (Py_SIZE(b) <= FIVEARY_CUTOFF) {
        /* Left-to-right binary exponentiation (HAC Algorithm 14.79) */
        /* http://www.cacr.math.uwaterloo.ca/hac/about/chap14.pdf */
    ...
    else {
        /* Left-to-right 5-ary exponentiation (HAC Algorithm 14.82) */

The only outright optimization of the style I think you're describing that I can see is that it quickly returns zero when the modulus is one. I'm not a skilled or experienced CPython source reader, though.

--
Neil Cerutti
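To make the first comment in that excerpt concrete, here is a rough pure-Python sketch of left-to-right binary exponentiation. It illustrates the general technique only; it is not a transcription of CPython's C code, and the name `pow_left_to_right` is made up.

```python
def pow_left_to_right(base, exponent):
    """Compute base**exponent for a non-negative integer exponent.

    The exponent's bits are scanned from most to least significant:
    square the accumulator for every bit, and multiply by the base
    whenever the bit is 1.
    """
    if exponent < 0:
        raise ValueError("exponent must be a non-negative integer")
    result = 1
    for bit in bin(exponent)[2:]:   # e.g. 13 -> "1101"
        result *= result            # squaring step
        if bit == "1":
            result *= base          # multiply step
    return result
```

For an exponent of 2 (bits "10") the loop ends up doing one multiplication by the base and one squaring, so even the general algorithm reduces to roughly the work of `x*x`, plus loop overhead.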
ASCII and Unicode [was Re: Managing Google Groups headaches]
On Fri, 06 Dec 2013 05:03:57 -0800, rusi wrote: > Evidently (and completely inadvertently) this exchange has just > illustrated one of the inadmissable assumptions: > > "unicode as a medium is universal in the same way that ASCII used to be" Ironically, your post was not Unicode. Seriously. I am 100% serious. Your post was sent using a legacy encoding, Windows-1252, also known as CP-1252, which is most certainly *not* Unicode. Whatever software you used to send the message correctly flagged it with a charset header: Content-Type: text/plain; charset=windows-1252 Alas, the software Roy Smith uses, MT-NewsWatcher, does not handle encodings correctly (or at all!), it screws up the encoding then sends a reply with no charset line at all. This is one bug that cannot be blamed on Google Groups -- or on Unicode. > I wrote a number of ellipsis characters ie codepoint 2026 as in: Actually you didn't. You wrote a number of ellipsis characters, hex byte \x85 (decimal 133), in the CP1252 charset. That happens to be mapped to code point U+2026 in Unicode, but the two are as distinct as ASCII and EBCDIC. > Somewhere between my sending and your quoting those ellipses became the > replacement character FFFD Yes, it appears that MT-NewsWatcher is *deeply, deeply* confused about encodings and character sets. It doesn't just assume things are ASCII, but makes a half-hearted attempt to be charset-aware, but badly. I can only imagine that it was written back in the Dark Ages where there were a lot of different charsets in use but no conventions for specifying which charset was in use. Or perhaps the author was smoking crack while coding. > Leaving aside whose fault this is (very likely buggy google groups), > this mojibaking cannot happen if the assumption "All text is ASCII" were > to uniformly hold. This is incorrect. People forget that ASCII has evolved since the first version of the standard in 1963. 
There have actually been five versions of the ASCII standard, plus one unpublished version. (And that's not including the things which are frequently called ASCII but aren't.) ASCII-1963 didn't even include lowercase letters. It is also missing some graphic characters like braces, and included at least two characters no longer used, the up-arrow and left-arrow. The control characters were also significantly different from today. ASCII-1965 was unpublished and unused. I don't know the details of what it changed. ASCII-1967 is a lot closer to the ASCII in use today. It made considerable changes to the control characters, moving, adding, removing, or renaming at least half a dozen control characters. It officially added lowercase letters, braces, and some others. It replaced the up-arrow character with the caret and the left-arrow with the underscore. It was ambiguous, allowing variations and substitutions, e.g.: - character 33 was permitted to be either the exclamation mark ! or the logical OR symbol | - consequently character 124 (vertical bar) was always displayed as a broken bar ¦, which explains why even today many keyboards show it that way - character 35 was permitted to be either the number sign # or the pound sign £ - character 94 could be either a caret ^ or a logical NOT ¬ Even the humble comma could be pressed into service as a cedilla. ASCII-1968 didn't change any characters, but allowed the use of LF on its own. Previously, you had to use either LF/CR or CR/LF as newline. ASCII-1977 removed the ambiguities from the 1967 standard. The most recent version is ASCII-1986 (also known as ANSI X3.4-1986). Unfortunately I haven't been able to find out what changes were made -- I presume they were minor, and didn't affect the character set. So as you can see, even with actual ASCII, you can have mojibake. It's just not normally called that. 
But if you are given an arbitrary ASCII file of unknown age, containing code 94, how can you be sure it was intended as a caret rather than a logical NOT symbol? You can't. Then there are at least 30 official variations of ASCII, strictly speaking part of ISO-646. These 7-bit codes were commonly called "ASCII" by their users, despite the differences, e.g. replacing the dollar sign $ with the international currency sign ¤, or replacing the left brace { with the letter s with caron š. One consequence of this is that the MIME type for ASCII text is called "US ASCII", despite the redundancy, because many people expect "ASCII" alone to mean whatever national variation they are used to. But it gets worse: there are proprietary variations on ASCII which are commonly called "ASCII" but aren't, including dozens of 8-bit so-called "extended ASCII" character sets, which is where the problems *really* pile up. Invariably back in the 1980s and early 1990s people used to call these "ASCII" no matter that they used 8-bits and contained anything up to 256 characters. Just because somebody
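The earlier point in this message about byte \x85 and code point U+2026 can be checked directly in Python 3. This is only an illustrative sketch; the function name `decode_three_ways` is an assumption, and the codecs used are the standard ones shipped with Python.

```python
def decode_three_ways(raw):
    """Decode the same bytes under three different assumptions."""
    return {
        "cp1252": raw.decode("cp1252"),
        "latin-1": raw.decode("latin-1"),
        # Undecodable bytes become U+FFFD, the replacement character.
        "utf-8": raw.decode("utf-8", errors="replace"),
    }


results = decode_three_ways(b"\x85")

# In Windows-1252, byte 0x85 is the horizontal ellipsis, U+2026.
ellipsis_char = results["cp1252"]

# The same character re-encoded as UTF-8 is three bytes, not one.
utf8_bytes = ellipsis_char.encode("utf-8")
```

Here `results["utf-8"]` is the replacement character, which is consistent with the FFFD that rusi reported seeing: the byte is meaningful in CP1252 but is not valid UTF-8 on its own.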
Re: Does Python optimize low-power functions?
On 2013-12-06 19:01, Neil Cerutti wrote:
On 2013-12-06, John Ladasky wrote:

The following two functions return the same result:

    x**2
    x*x

But they may be computed in different ways. The first choice can accommodate non-integer powers and so it would logically proceed by taking a logarithm, multiplying by the power (in this case, 2), and then taking the anti-logarithm. But for a trivial value for the power like 2, this is clearly a wasteful choice. Just multiply x by itself, and skip the expensive log and anti-log steps.

My question is, what do Python interpreters do with power operators where the power is a small constant, like 2? Do they know to take the shortcut?

It uses a couple of fast algorithms for computing powers. Here's the excerpt with the comments identifying the algorithms used.

From longobject.c:

    if (Py_SIZE(b) <= FIVEARY_CUTOFF) {
        /* Left-to-right binary exponentiation (HAC Algorithm 14.79) */
        /* http://www.cacr.math.uwaterloo.ca/hac/about/chap14.pdf */
    ...
    else {
        /* Left-to-right 5-ary exponentiation (HAC Algorithm 14.82) */

It's worth noting that the *interpreter* per se is not doing this. The implementation of the `long` object does this in its implementation of the `__pow__` method, which the interpreter invokes. Other objects may implement this differently and use whatever optimizations they like. They may even (ab)use the syntax for things other than numerical exponentiation where `x**2` is not equivalent to `x*x`. Since objects are free to do so, the interpreter itself cannot choose to optimize that exponentiation down to multiplication.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
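A toy illustration of that last point, with a hypothetical class `Chain` invented for this sketch: it gives `*` and `**` unrelated meanings, so rewriting `x**2` as `x*x` behind the programmer's back would change the result.

```python
class Chain:
    """Hypothetical sequence type that (ab)uses the arithmetic operators."""

    def __init__(self, items):
        self.items = list(items)

    def __mul__(self, other):
        # "Multiplication" here means concatenation.
        return Chain(self.items + other.items)

    def __pow__(self, count):
        # "Exponentiation" here means repeating each element in place.
        return Chain([item for item in self.items for _ in range(count)])


x = Chain("ab")
product = (x * x).items   # concatenation of x with itself
power = (x ** 2).items    # every element doubled in place
```

With this class, `product` is `['a', 'b', 'a', 'b']` while `power` is `['a', 'a', 'b', 'b']`. The compiler cannot know at compile time whether `x` will be an `int` or something like `Chain`, so it has to emit the generic power operation.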
RE: Does Python optimize low-power functions?
>My question is, what do Python interpreters do with power operators where the
>power is a small constant, like 2? Do they know to take the shortcut?

Nope:

Python 3.3.0 (default, Sep 25 2013, 19:28:08)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> dis.dis(lambda x: x*x)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_FAST                0 (x)
              6 BINARY_MULTIPLY
              7 RETURN_VALUE
>>> dis.dis(lambda x: x**2)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (2)
              6 BINARY_POWER
              7 RETURN_VALUE

The reasons why have already been answered, I just wanted to point out that Python makes it extremely easy to check these sorts of things for yourself.
Re: ASCII and Unicode [was Re: Managing Google Groups headaches]
On Friday 06 December 2013 14:30:06 Steven D'Aprano did opine: > On Fri, 06 Dec 2013 05:03:57 -0800, rusi wrote: > > Evidently (and completely inadvertently) this exchange has just > > illustrated one of the inadmissable assumptions: > > > > "unicode as a medium is universal in the same way that ASCII used to > > be" > > Ironically, your post was not Unicode. > > Seriously. I am 100% serious. > > Your post was sent using a legacy encoding, Windows-1252, also known as > CP-1252, which is most certainly *not* Unicode. Whatever software you > used to send the message correctly flagged it with a charset header: > > Content-Type: text/plain; charset=windows-1252 > > Alas, the software Roy Smith uses, MT-NewsWatcher, does not handle > encodings correctly (or at all!), it screws up the encoding then sends a > reply with no charset line at all. This is one bug that cannot be blamed > on Google Groups -- or on Unicode. > > > I wrote a number of ellipsis characters ie codepoint 2026 as in: > Actually you didn't. You wrote a number of ellipsis characters, hex byte > \x85 (decimal 133), in the CP1252 charset. That happens to be mapped to > code point U+2026 in Unicode, but the two are as distinct as ASCII and > EBCDIC. > > > Somewhere between my sending and your quoting those ellipses became > > the replacement character FFFD > > Yes, it appears that MT-NewsWatcher is *deeply, deeply* confused about > encodings and character sets. It doesn't just assume things are ASCII, > but makes a half-hearted attempt to be charset-aware, but badly. I can > only imagine that it was written back in the Dark Ages where there were > a lot of different charsets in use but no conventions for specifying > which charset was in use. Or perhaps the author was smoking crack while > coding. > > > Leaving aside whose fault this is (very likely buggy google groups), > > this mojibaking cannot happen if the assumption "All text is ASCII" > > were to uniformly hold. > > This is incorrect. 
People forget that ASCII has evolved since the first > version of the standard in 1963. There have actually been five versions > of the ASCII standard, plus one unpublished version. (And that's not > including the things which are frequently called ASCII but aren't.) > > ASCII-1963 didn't even include lowercase letters. It is also missing > some graphic characters like braces, and included at least two > characters no longer used, the up-arrow and left-arrow. The control > characters were also significantly different from today. > > ASCII-1965 was unpublished and unused. I don't know the details of what > it changed. > > ASCII-1967 is a lot closer to the ASCII in use today. It made > considerable changes to the control characters, moving, adding, > removing, or renaming at least half a dozen control characters. It > officially added lowercase letters, braces, and some others. It > replaced the up-arrow character with the caret and the left-arrow with > the underscore. It was ambiguous, allowing variations and > substitutions, e.g.: > > - character 33 was permitted to be either the exclamation > mark ! or the logical OR symbol | > > - consequently character 124 (vertical bar) was always > displayed as a broken bar ¦, which explains why even today > many keyboards show it that way > > - character 35 was permitted to be either the number sign # or > the pound sign £ > > - character 94 could be either a caret ^ or a logical NOT ¬ > > Even the humble comma could be pressed into service as a cedilla. > > ASCII-1968 didn't change any characters, but allowed the use of LF on > its own. Previously, you had to use either LF/CR or CR/LF as newline. > > ASCII-1977 removed the ambiguities from the 1967 standard. > > The most recent version is ASCII-1986 (also known as ANSI X3.4-1986). > Unfortunately I haven't been able to find out what changes were made -- > I presume they were minor, and didn't affect the character set.
> > So as you can see, even with actual ASCII, you can have mojibake. It's > just not normally called that. But if you are given an arbitrary ASCII > file of unknown age, containing code 94, how can you be sure it was > intended as a caret rather than a logical NOT symbol? You can't. > > Then there are at least 30 official variations of ASCII, strictly > speaking part of ISO-646. These 7-bit codes were commonly called "ASCII" > by their users, despite the differences, e.g. replacing the dollar sign > $ with the international currency sign ¤, or replacing the left brace > { with the letter s with caron š. > > One consequence of this is that the MIME type for ASCII text is called > "US ASCII", despite the redundancy, because many people expect "ASCII" > alone to mean whatever national variation they are used to. > > But it gets worse: there are proprietary variations on ASCII which are > commonly called "ASCII" but aren't, including dozens of 8-bit so-called > "extended ASCII" character
Re: Does Python optimize low-power functions?
On Friday, December 6, 2013 11:32:00 AM UTC-8, Nick Cash wrote: > The reasons why have already been answered, I just wanted to point out that > Python makes it extremely easy to check these sorts of things for yourself. Thanks for the heads-up on the dis module, Nick. I haven't played with that one yet. -- https://mail.python.org/mailman/listinfo/python-list
Re: Does Python optimize low-power functions?
On 6 December 2013 18:16, John Ladasky wrote: > The following two functions return the same result: > > x**2 > x*x > > But they may be computed in different ways. The first choice can accommodate > non-integer powers and so it would logically proceed by taking a logarithm, > multiplying by the power (in this case, 2), and then taking the > anti-logarithm. But for a trivial value for the power like 2, this is > clearly a wasteful choice. Just multiply x by itself, and skip the expensive > log and anti-log steps. > > My question is, what do Python interpreters do with power operators where the > power is a small constant, like 2? Do they know to take the shortcut? As mentioned this will depend on the interpreter and on the type of x. Python's integer arithmetic is exact and unbounded so switching to floating point and using approximate logarithms is a no go if x is an int object. For CPython specifically, you can see here: http://hg.python.org/cpython/file/07ef52e751f3/Objects/floatobject.c#l741 that for floats x**2 will be equivalent to x**2.0 and will be handled by the pow function from the underlying C math library. If you read the comments around that line you'll see that different inconsistent math libraries can do things very differently leading to all kinds of different problems. For CPython if x is an int (long) then as mentioned before it is handled by the HAC algorithm: http://hg.python.org/cpython/file/07ef52e751f3/Objects/longobject.c#l3934 For CPython if x is a complex then it is handled roughly as you say: for x**n if n is between -100 and 100 then multiplication is performed using the "bit-mask exponentiation" algorithm. Otherwise it is computed by converting to polar exponential form and using logs (see also the two functions above this one): http://hg.python.org/cpython/file/07ef52e751f3/Objects/complexobject.c#l151 Oscar -- https://mail.python.org/mailman/listinfo/python-list
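A rough way to check the int-versus-float point above for yourself. This is only a sketch: the value `big` and the helper `square` are made up for illustration, and the opcode names that `dis` reports differ between CPython versions, so none are shown here.

```python
import dis
import math

big = 10**20 + 1

# Integer power is exact, so it has to agree with plain multiplication.
exact_pow = big ** 2
exact_mul = big * big

# Forcing the computation through floating point loses the low-order digits.
approx = int(math.pow(big, 2))


def square(x):
    return x ** 2


# Collect the opcode names CPython generated for square().
opnames = [ins.opname for ins in dis.get_instructions(square)]

print(exact_pow == exact_mul)   # exact integer arithmetic agrees
print(approx == exact_pow)      # the float route does not
print(opnames)
```

Printing `opnames` on your own interpreter shows whether the power operator was compiled to a dedicated instruction or a generic binary-operation one.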
Re: ASCII and Unicode [was Re: Managing Google Groups headaches]
Steven D'Aprano pearwood.info> writes: > Yes, it appears that MT-NewsWatcher is *deeply, deeply* confused about > encodings and character sets. It doesn't just assume things are ASCII, > but makes a half-hearted attempt to be charset-aware, but badly. I can > only imagine that it was written back in the Dark Ages Indeed. The basic codebase probably goes back 20 years. I'm posting this from gmane, just so people don't think I'm a total luddite. > When transmitting ASCII characters, the networking protocol could include > various start and stop bits and parity codes. A single 7-bit ASCII > character might be anything up to 12 bits in length on the wire. Not to mention that some really old hardware used 1.5 stop bits! -- https://mail.python.org/mailman/listinfo/python-list
Python 2.8 release schedule
My apologies if you've seen this before but here is the official schedule http://www.python.org/dev/peps/pep-0404/ -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: Embedding multiple interpreters
Garthy wrote: To allow each script to run in its own environment, with minimal chance of inadvertent interaction between the environments, whilst allowing each script the ability to stall on conditions that will be later met by another thread supplying the information, and to fit in with existing infrastructure. The last time I remember this being discussed was in the context of allowing free threading. Multiple interpreters don't solve that problem, because there's still only one GIL and some objects are shared. But if all you want is for each plugin to have its own version of sys.modules, etc., and you're not concerned about malicious code, then it may be good enough. It seems to be good enough for mod_wsgi, because presumably all the people with the ability to install code on a given web server trust each other. -- Greg -- https://mail.python.org/mailman/listinfo/python-list
Re: squeeze out some performance
On Fri, Dec 6, 2013 at 11:52 AM, John Ladasky wrote: > On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote: > > > I try to squeeze out some performance of the code pasted on the link > below. > > http://pastebin.com/gMnqprST > Not that this will speed up your code but you have this: if not clockwise: s = start start = end end = s Python people would write: end, start = start, end You have quite a few if statements that involve multiple comparisons of the same variable. Did you know you can do things like this in python: >>> x = 4 >>> 2 < x < 7 True >>> x = 55 >>> 2 < x < 7 False > Several comments: > > 1) I find this program to be very difficult to read, largely because > there's a whole LOT of duplicated code. Look at lines 53-80, and lines > 108-287, and lines 294-311. It makes it harder to see what this algorithm > actually does. Is there a way to refactor some of this code to use some > shared function calls? > > 2) I looked up the "Bresenham algorithm", and found two references which > may be relevant. The original algorithm was one which computed good raster > approximations to straight lines. The second algorithm described may be > more pertinent to you, because it draws arcs of circles. > > http://en.wikipedia.org/wiki/Bresenham's_line_algorithm > http://en.wikipedia.org/wiki/Midpoint_circle_algorithm > > Both of these algorithms are old, from the 1960's, and can be implemented > using very simple CPU register operations and minimal memory. Both of the > web pages I referenced have extensive example code and pseudocode, and > discuss optimization. If you need speed, is this really a job for Python? > > 3) I THINK that I see some code -- those duplicated parts -- which might > benefit from the use of multiprocessing (assuming that you have a > multi-core CPU). But I would have to read more deeply to be sure. I need > to understand the algorithm more completely, and exactly how you have > modified it for your needs. 
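The two idioms mentioned above (tuple swap and chained comparison) can be sketched as small functions. The function names here are hypothetical and are not taken from the pastebin code.

```python
def ordered_endpoints(start, end, clockwise):
    """Return (start, end), swapped when the direction is not clockwise."""
    if not clockwise:
        # Tuple assignment swaps the two names without a temporary.
        start, end = end, start
    return start, end


def in_open_range(x, low, high):
    """Chained comparison, equivalent to (low < x) and (x < high)."""
    return low < x < high


print(ordered_endpoints(10, 350, clockwise=False))
print(in_open_range(4, 2, 7), in_open_range(55, 2, 7))
```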
-- Joel Goldstick http://joelgoldstick.com -- https://mail.python.org/mailman/listinfo/python-list
Re: Embedding multiple interpreters
Garthy wrote: The bare minimum would be protection against inadvertent interaction. Better yet would be a setup that made such interaction annoyingly difficult, and the ideal would be where it was impossible to interfere. To give you an idea of the kind of interference that's possible, consider: 1) You can find all the subclasses of a given class object using its __subclasses__() method. 2) Every class ultimately derives from class object. 3) All built-in class objects are shared between interpreters. So, starting from object.__subclasses__(), code in any interpreter could find any class defined by any other interpreter and mutate it. This is not something that is likely to happen by accident. Whether it's "annoyingly difficult" enough is something you'll have to decide. Also keep in mind that it's fairly easy for Python code to chew up large amounts of memory and/or CPU time in an uninterruptible way, e.g. by evaluating 5**1. So even a thread that's keeping its hands entirely to itself can still cause trouble. -- Greg -- https://mail.python.org/mailman/listinfo/python-list
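A minimal sketch of the discovery mechanism described in points 1 and 2. It runs inside a single interpreter, so it does not demonstrate sharing across interpreters; it only shows how little code it takes to reach a class you were never handed. `PluginState` and `find_class` are invented names.

```python
class PluginState:
    """Stands in for a class defined by some unrelated piece of code."""
    limit = 10


def find_class(name):
    # Walk the direct subclasses of object looking for a matching name.
    for cls in object.__subclasses__():
        if cls.__name__ == name:
            return cls
    return None


found = find_class("PluginState")
found.limit = 99   # the change is visible to everyone using the class

print(found is PluginState, PluginState.limit)
```

A real search would recurse through `__subclasses__()` of each result to reach classes that do not inherit from `object` directly.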
Re: squeeze out some performance
On 06/12/2013 16:52, John Ladasky wrote: On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote: I try to squeeze out some performance of the code pasted on the link below. http://pastebin.com/gMnqprST Several comments: 1) I find this program to be very difficult to read, largely because there's a whole LOT of duplicated code. Look at lines 53-80, and lines 108-287, and lines 294-311. It makes it harder to see what this algorithm actually does. Is there a way to refactor some of this code to use some shared function calls? A handy tool for detecting duplicated code here http://clonedigger.sourceforge.net/ for anyone who's interested. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: squeeze out some performance
On Fri, Dec 6, 2013 at 2:38 PM, Mark Lawrence wrote: > On 06/12/2013 16:52, John Ladasky wrote: > >> On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote: >> >> I try to squeeze out some performance of the code pasted on the link >>> below. >>> http://pastebin.com/gMnqprST >>> >> >> Several comments: >> >> 1) I find this program to be very difficult to read, largely because >> there's a whole LOT of duplicated code. Look at lines 53-80, and lines >> 108-287, and lines 294-311. It makes it harder to see what this algorithm >> actually does. Is there a way to refactor some of this code to use some >> shared function calls? >> >> > A handy tool for detecting duplicated code here > http://clonedigger.sourceforge.net/ for anyone who's interested. > Pylint does this too... -- https://mail.python.org/mailman/listinfo/python-list
Eliminate "extra" variable
Hi, ALL, I have following code: def MyFunc(self, originalData): data = {} dateStrs = [] for i in xrange(0, len(originalData)): dateStr, freq, source = originalData[i] data[str(dateStr)] = {source: freq} dateStrs.append(dateStr) for i in xrange(0, len(dateStrs) - 1): currDateStr = str(dateStrs[i]) nextDateStrs = str(dateStrs[i + 1]) It seems very strange that I need the dateStrs list just for the purpose of looping thru the dictionary keys. Can I get rid of the "dateStrs" variable? Thank you. -- https://mail.python.org/mailman/listinfo/python-list
Re: Managing Google Groups headaches
rusi wrote: On Friday, December 6, 2013 1:06:30 PM UTC+5:30, Roy Smith wrote: Which means, if I wanted to (and many examples of this exist), I can write my own client which presents the same information in different ways. Not sure whats your point. The point is the existence of an alternative interface that's designed for use by other programs rather than humans. This is what web forums are missing. If it existed, one could easily create an alternative client with a newsreader-like interface. Without it, such a client would have to be a monstrosity that worked by screen-scraping the html. It's not about the format of the messages themselves -- that could be text, or html, or reST, or bbcode or whatever. It's about the *framing* of the messages, and being able to query them by their metadata. -- Greg -- https://mail.python.org/mailman/listinfo/python-list
Re: Eliminate "extra" variable
On 12/06/2013 11:37 AM, Igor Korot wrote: Hi, ALL, I have following code: def MyFunc(self, originalData): data = {} dateStrs = [] for i in xrange(0, len(originalData)): dateStr, freq, source = originalData[i] data[str(dateStr)] = {source: freq} dateStrs.append(dateStr) for i in xrange(0, len(dateStrs) - 1): currDateStr = str(dateStrs[i]) nextDateStrs = str(dateStrs[i + 1]) It seems very strange that I need the dateStrs list just for the purpose of looping thru the dictionary keys. Can I get rid of the "dateStrs" variable? Thank you. You want to build a list, but you don't want to give that list a name? Why not? And how would you refer to that list in the second loop if it didn't have a name? And concerning that second loop: What are you trying to do there? It looks like a complete waste of time. In fact, with what you've shown us, you can eliminate the variable dateStrs, and both loops and be no worse off. Perhaps there is more to your code than you've shown to us ... Gary Herron -- https://mail.python.org/mailman/listinfo/python-list
Re: Eliminate "extra" variable
On Fri, Dec 6, 2013 at 2:37 PM, Igor Korot wrote: > Hi, ALL, > I have following code: > > def MyFunc(self, originalData): > data = {} > dateStrs = [] > for i in xrange(0, len(originalData)): >dateStr, freq, source = originalData[i] >data[str(dateStr)] = {source: freq} > # above line confuses me! >dateStrs.append(dateStr) > for i in xrange(0, len(dateStrs) - 1): > currDateStr = str(dateStrs[i]) > nextDateStrs = str(dateStrs[i + 1]) > > Python lets you iterate over a list directly, so : for d in originalData: dateStr, freq, source = d data[source] = freq Your code looks like you come from a c background. Python idioms are different I'm not sure what you are trying to do in the second for loop, but I think you are trying to iterate thru a dictionary in a certain order, and you can't depend on the order > > It seems very strange that I need the dateStrs list just for the > purpose of looping thru the dictionary keys. > Can I get rid of the "dateStrs" variable? > > Thank you. > -- > https://mail.python.org/mailman/listinfo/python-list > -- Joel Goldstick http://joelgoldstick.com -- https://mail.python.org/mailman/listinfo/python-list
Re: ASCII and Unicode [was Re: Managing Google Groups headaches]
On Sat, Dec 7, 2013 at 6:00 AM, Steven D'Aprano wrote: > - character 33 was permitted to be either the exclamation > mark ! or the logical OR symbol | > > - consequently character 124 (vertical bar) was always > displayed as a broken bar ¦, which explains why even today > many keyboards show it that way > > - character 35 was permitted to be either the number sign # or > the pound sign £ > > - character 94 could be either a caret ^ or a logical NOT ¬ Yeah, good fun stuff. I first met several of these ambiguities in the OS/2 REXX documentation, which detailed the language's operators by specifying their byte values as well as their characters - for instance, this quote from the docs (yeah, I still have it all here): """ Note: Depending upon your Personal System keyboard and the code page you are using, you may not have the solid vertical bar to select. For this reason, REXX also recognizes the use of the split vertical bar as a logical OR symbol. Some keyboards may have both characters. If so, they are not interchangeable; only the character that is equal to the ASCII value of 124 works as the logical OR. This type of mismatch can also cause the character on your screen to be different from the character on your keyboard. """ (The front material on the docs says "(C) Copyright IBM Corp. 1987, 1994. All Rights Reserved.") It says "ASCII value" where on this list we would be more likely to call it "byte value", and I'd prefer to say "represented by" rather than "equal to", but nonetheless, this is still clearly distinguishing characters and bytes. The language spec is on characters, but ultimately the interpreter is going to be looking at bytes, so when there's a problem, it's byte 124 that's the one defined as logical OR. Oh, and note the copyright date. The byte/char distinction isn't new. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
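To make the byte/character distinction concrete, here is a short sketch. The use of UTF-8 and Latin-1 is my own choice for illustration and has nothing to do with the code pages REXX ran under.

```python
# Byte value 124 decodes to the vertical bar in ASCII.
bar = bytes([124]).decode("ascii")

# The broken bar is a different character (U+00A6) and has no ASCII encoding.
broken_bar = "\u00a6"
try:
    broken_bar.encode("ascii")
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False

# Same bytes, different assumed encoding, different characters on screen.
raw = broken_bar.encode("utf-8")
misread = raw.decode("latin-1")

print(repr(bar), fits_in_ascii, raw, repr(misread))
```

The last step is the usual recipe for mojibake: the bytes are fine, but the reader guessed the wrong table for turning them back into characters.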
One liners
Does anyone else feel like Python is being dragged too far in the direction of long, complex, multiline one-liners? Or avoiding temporary variables with descriptive names? Or using regex's for everything under the sun? What happened to using classes? What happened to the beautiful emphasis on readability? What happened to debuggability (which is always harder than writing things in the first place)? And what happened to string methods? I'm pleased to see Python getting more popular, but it feels like a lot of newcomers are trying their best to turn Python into Perl or something, culturally speaking. -- https://mail.python.org/mailman/listinfo/python-list
Re: using ffmpeg command line with python's subprocess module
rusi wrote: On Friday, December 6, 2013 10:11:04 PM UTC+5:30, MRAB wrote: You're exaggerating. It's more like 500 years ago. :-) I was going to say the same until I noticed the "the way people think English was spoken..." That makes it unarguable -- surely there are some people who (wrongly) think so? Probably. They're surprisingly far off, though. Here's a sample of actual 1000-year-old English: http://answers.yahoo.com/question/index?qid=20100314001840AAygUaq -- Greg -- https://mail.python.org/mailman/listinfo/python-list
Re: One liners
On 12/6/13 6:54 PM, Dan Stromberg wrote: Does anyone else feel like Python is being dragged too far in the direction of long, complex, multiline one-liners? Or avoiding temporary variables with descriptive names? Or using regex's for everything under the sun? What happened to using classes? What happened to the beautiful emphasis on readability? What happened to debuggability (which is always harder than writing things in the first place)? And what happened to string methods? I'm pleased to see Python getting more popular, but it feels like a lot of newcomers are trying their best to turn Python into Perl or something, culturally speaking. I agree with you that those trends would be bad. But I'm not sure how you are judging that "Python" is being dragged in that direction? It's a huge community. Sure some people are obsessed with fewer lines, and micro-optimizations, and other newb mistakes, but there are good people too! --Ned, ever the optimist. -- https://mail.python.org/mailman/listinfo/python-list
Re: One liners
On 12/06/2013 04:54 PM, Dan Stromberg wrote: > Does anyone else feel like Python is being dragged too far in the direction > of long, complex, multiline one-liners? Or avoiding temporary variables > with descriptive names? Or using regex's for everything under the sun? > > What happened to using classes? What happened to the beautiful emphasis on > readability? What happened to debuggability (which is always harder than > writing things in the first place)? And what happened to string methods? > > I'm pleased to see Python getting more popular, but it feels like a lot of > newcomers are trying their best to turn Python into Perl or something, > culturally speaking. I have not seen any evidence that this trend of yours is widespread. The Python code I come across seems pretty normal to me. Expressive and readable. Haven't seen any attempt to turn Python into Perl or that sort of thing. And I don't see that culture expressed on the list. Maybe I'm just blind... -- https://mail.python.org/mailman/listinfo/python-list
Re: One liners
On Fri, Dec 6, 2013 at 4:10 PM, Michael Torrie wrote: > On 12/06/2013 04:54 PM, Dan Stromberg wrote: > > Does anyone else feel like Python is being dragged too far in the > direction > > of long, complex, multiline one-liners? Or avoiding temporary variables > > with descriptive names? Or using regex's for everything under the sun? > > > > What happened to using classes? What happened to the beautiful emphasis > on > > readability? What happened to debuggability (which is always harder than > > writing things in the first place)? And what happened to string methods? > > > > I'm pleased to see Python getting more popular, but it feels like a lot > of > > newcomers are trying their best to turn Python into Perl or something, > > culturally speaking. > > I have not seen any evidence that this trend of yours is widespread. > The Python code I come across seems pretty normal to me. Expressive and > readable. Haven't seen any attempt to turn Python into Perl or that > sort of thing. And I don't see that culture expressed on the list. > Maybe I'm just blind... I'm thinking mostly of stackoverflow, but here's an example I ran into (a lot of) on a job: somevar = some_complicated_thing(somevar) if some_other_complicated_thing(somevar) else somevar Would it really be so bad to just use an if statement? Why are we assigning somevar to itself? This sort of thing was strewn across 3 or 4 physical lines at a time. -- https://mail.python.org/mailman/listinfo/python-list
Re: Eliminate "extra" variable
In article , Joel Goldstick wrote: > Python lets you iterate over a list directly, so : > > for d in originalData: > dateStr, freq, source = d > data[source] = freq I would make it even simpler: > for dateStr, freq, source in originalData: > data[source] = freq -- https://mail.python.org/mailman/listinfo/python-list
Re: One liners
On 12/06/2013 05:14 PM, Dan Stromberg wrote: > I'm thinking mostly of stackoverflow, but here's an example I ran into (a > lot of) on a job: > > somevar = some_complicated_thing(somevar) if > some_other_complicated_thing(somevar) else somevar > > Would it really be so bad to just use an if statement? Why are we > assigning somevar to itself? This sort of thing was strewn across 3 or 4 > physical lines at a time. You're right that a conventional "if" block is not only more readable, but also faster and more efficient code. Sorry you have to deal with code written like that! That'd frustrate any sane programmer. It might bother me enough to write code to reformat the program to convert that style to something sane! There are times when the ternary (did I get that right?) operator is useful and clear. -- https://mail.python.org/mailman/listinfo/python-list
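For comparison, the two styles side by side. The string operations are placeholders standing in for the "complicated things" in the example above; this says nothing about relative speed, only about how each version reads.

```python
def normalize_ternary(value):
    # Conditional expression: the else branch assigns the name to itself.
    value = value.strip() if value.startswith(" ") else value
    return value


def normalize_if(value):
    # Plain if statement: the do-nothing branch simply disappears.
    if value.startswith(" "):
        value = value.strip()
    return value


for sample in ("  padded  ", "clean"):
    print(repr(normalize_ternary(sample)), repr(normalize_if(sample)))
```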
Re: Embedding multiple interpreters
Hi Gregory, On 07/12/13 08:53, Gregory Ewing wrote: > Garthy wrote: >> The bare minimum would be protection against inadvertent interaction. >> Better yet would be a setup that made such interaction annoyingly >> difficult, and the ideal would be where it was impossible to interfere. > > To give you an idea of the kind of interference that's > possible, consider: > > 1) You can find all the subclasses of a given class > object using its __subclasses__() method. > > 2) Every class ultimately derives from class object. > > 3) All built-in class objects are shared between > interpreters. > > So, starting from object.__subclasses__(), code in any > interpreter could find any class defined by any other > interpreter and mutate it. Many thanks for the excellent example. It was not clear to me how readily such a small and critical bit of shared state could potentially be abused across interpreter boundaries. I am guessing this would be the first in a chain of potential problems I may run into. > This is not something that is likely to happen by > accident. Whether it's "annoyingly difficult" enough > is something you'll have to decide. I think it'd fall under "protection against inadvertent modification"- down the scale somewhat. It doesn't sound like it would be too difficult to achieve if the author was so inclined. > Also keep in mind that it's fairly easy for Python > code to chew up large amounts of memory and/or CPU > time in an uninterruptible way, e.g. by > evaluating 5**1. So even a thread that's > keeping its hands entirely to itself can still > cause trouble. Thanks for the tip. The potential for deliberate resource exhaustion is unfortunately something that I am likely going to have to put up with in order to keep things in the same process. Cheers, Garth -- https://mail.python.org/mailman/listinfo/python-list
Re: Eliminate "extra" variable
On Fri, Dec 6, 2013 at 7:16 PM, Roy Smith wrote: > In article , > Joel Goldstick wrote: > > > Python lets you iterate over a list directly, so : > > > > for d in originalData: > > dateStr, freq, source = d > > data[source] = freq > > I would make it even simpler: > > > for dateStr, freq, source in originalData: > > data[source] = freq > +1 --- I agree To the OP: Could you add a docstring to your function to explain what is supposed to happen, describe the input and output? If you do that I'm sure you could get some more complete help with your code. > -- > https://mail.python.org/mailman/listinfo/python-list > -- Joel Goldstick http://joelgoldstick.com -- https://mail.python.org/mailman/listinfo/python-list
Re: Eliminate "extra" variable
On 2013-12-06 11:37, Igor Korot wrote: > def MyFunc(self, originalData): > data = {} > for i in xrange(0, len(originalData)): >dateStr, freq, source = originalData[i] >data[str(dateStr)] = {source: freq} this can be more cleanly/pythonically written as def my_func(self, original_data): data = {} for date, freq, source in original_data: data[str(date)] = {source: freq} or even just data = dict( (str(date), {source: freq}) for date, freq, source in original_data ) You're calling it a "dateStr", which suggests that it's already a string, so I'm not sure why you're str()'ing it. So I'd either just call it "date", or skip the str(date) bit if it's already a string. That said, do you even need to convert it to a string (as datetime.date objects can be used as keys in dictionaries)? > for i in xrange(0, len(dateStrs) - 1): > currDateStr = str(dateStrs[i]) > nextDateStrs = str(dateStrs[i + 1]) > > It seems very strange that I need the dateStrs list just for the > purpose of looping thru the dictionary keys. > Can I get rid of the "dateStrs" variable? Your code isn't actually using the data-dict at this point. If you were doing something with it, it might help to know what you want to do. Well, you can iterate over the original data, zipping them together: for (cur, _, _), (next, _, _) in zip( original_data[:-1], original_data[1:] ): do_something(cur, next) If your purpose for the "data" dict is to merely look up stats from the next one, the whole batch of your original code can be replaced with: for ( (cur_dt, cur_freq, cur_source), (next_dt, next_freq, next_source) ) in zip(original_data[:-1], original_data[1:]): # might need to do str(cur_dt) and str(next_dt) instead? do_things_with(cur_dt, cur_freq, cur_source, next_dt, next_freq, next_source) That eliminates the dict *and* the extra variable name. :-) -tkc -- https://mail.python.org/mailman/listinfo/python-list
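Laid out as a runnable Python 3 sketch, the suggestions above look roughly like this. The function names and the sample rows are invented, since the original post does not say what the data looks like or what the second loop is for.

```python
def build_index(original_data):
    """Map str(date) -> {source: freq} without any index variable."""
    return {str(date): {source: freq} for date, freq, source in original_data}


def consecutive_dates(original_data):
    """Yield (current, next) date pairs without a separate list of dates."""
    # zip() stops at the shorter input, so the last row is never a "current".
    for (cur, _, _), (nxt, _, _) in zip(original_data, original_data[1:]):
        yield str(cur), str(nxt)


rows = [
    ("2013-12-01", 3, "a"),
    ("2013-12-02", 5, "b"),
    ("2013-12-03", 1, "a"),
]

print(build_index(rows))
print(list(consecutive_dates(rows)))
```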
Re: One liners
On Fri, Dec 6, 2013 at 7:20 PM, Michael Torrie wrote: > On 12/06/2013 05:14 PM, Dan Stromberg wrote: > > I'm thinking mostly of stackoverflow, but here's an example I ran into (a > > lot of) on a job: > > > > somevar = some_complicated_thing(somevar) if > > some_other_complicated_thing(somevar) else somevar > > > > Would it really be so bad to just use an if statement? Why are we > > assigning somevar to itself? This sort of thing was strewn across 3 or 4 > > physical lines at a time. > > You're right that a conventional "if" block is not only more readable, > but also faster and more efficient code. Sorry you have to deal with > code written like that! That'd frustrate any sane programmer. It might > bother me enough to write code to reformat the program to convert that > style to something sane! There are times when the ternary (did I get > that right?) operator is useful and clear. > -- > https://mail.python.org/mailman/listinfo/python-list > While it seems to be a higher status in the team to write new code as compared to fixing old code, so much can be learned by having to plough through old code. To learn others coding style, pick up new understanding, and most importantly totally disabuse yourself of trying to be cute with code. Code is read by the machine and by the programmer. The programmer is the one who should be deferred to, imo. You buy the machine, you rent the programmer by the hour! Aside from django urls, I am not sure I ever wrote regexes in python. For some reason they must seem awfully sexy to quite a few people. Back to my point above -- ever try to figure out a complicated regex written by someone else? -- Joel Goldstick http://joelgoldstick.com -- https://mail.python.org/mailman/listinfo/python-list
Re: Why is there no natural syntax for accessing attributes with names not being valid identifiers?
On 06/12/2013 16:51, Piotr Dobrogost wrote: [...] I thought of that argument later the next day. Your proposal does unify access if the old obj.x syntax is removed. As long as obj.x is a very concise way to get attribute named 'x' from object obj it's somehow odd that identifier x is treated not like identifier but like string literal 'x'. If it were treated like an identifier then we would get attribute with name being value of x instead attribute named 'x'. Making it possible to use string literals in the form obj.'x' as proposed this would make getattr basically needless as long as we use only variable not expression to denote attribute's name. But then every time you wanted to get an attribute with a name known at compile time you'd need to write obj.'x' instead of obj.x, thereby requiring two additional keystrokes. Given that the large majority of attribute access Python code uses dot syntax rather than getattr, this seems like it would massively outweigh the eleven keystrokes one saves by writing obj.'x' instead of getattr(obj,'x'). -- https://mail.python.org/mailman/listinfo/python-list
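For reference, this is what the existing spelling looks like for a name that is not a valid identifier. A small sketch; the `Record` class and the attribute names are made up.

```python
class Record:
    pass


rec = Record()

# setattr/getattr take the name as a string, so it need not be an identifier.
setattr(rec, "content-type", "text/plain")
value = getattr(rec, "content-type")

# The three-argument form supplies a default instead of raising AttributeError.
fallback = getattr(rec, "missing name", None)

print(value, fallback, vars(rec))
```

Dot syntax cannot reach such an attribute at all, which is the gap the proposed obj.'x' form is aimed at.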
Re: Eliminate "extra" variable
On 12/06/2013 03:38 PM, Joel Goldstick wrote: On Fri, Dec 6, 2013 at 2:37 PM, Igor Korot wrote: def MyFunc(self, originalData): data = {} dateStrs = [] for i in xrange(0, len(originalData)): dateStr, freq, source = originalData[i] data[str(dateStr)] = {source: freq} # above line confuses me! dateStrs.append(dateStr) for i in xrange(0, len(dateStrs) - 1): currDateStr = str(dateStrs[i]) nextDateStrs = str(dateStrs[i + 1]) Python lets you iterate over a list directly, so : for d in originalData: dateStr, freq, source = d data[source] = freq You could shorten that to for dateStr, freq, source in originalData: and if dateStr is already a string: data[dateStr] = {source: freq} Your code looks like you come from a c background. Python idioms are different Agreed. I'm not sure what you are trying to do in the second for loop, but I think you are trying to iterate thru a dictionary in a certain order, and you can't depend on the order The second loop is iterating over the list dateStrs. -- ~Ethan~ -- https://mail.python.org/mailman/listinfo/python-list
Re: Embedding multiple interpreters
Hi Gregory, On 07/12/13 08:39, Gregory Ewing wrote: > Garthy wrote: >> To allow each script to run in its own environment, with minimal >> chance of inadvertent interaction between the environments, whilst >> allowing each script the ability to stall on conditions that will be >> later met by another thread supplying the information, and to fit in >> with existing infrastructure. > > The last time I remember this being discussed was in the context > of allowing free threading. Multiple interpreters don't solve > that problem, because there's still only one GIL and some > objects are shared. I am fortunate in my case as the normal impact of the GIL would be much reduced. The common case is only one script actively progressing at a time- with the others either not running or waiting for external input to continue. But as you point out in your other reply, there are still potential concerns that arise from the smaller set of shared objects even across interpreters. > But if all you want is for each plugin to have its own version > of sys.modules, etc., and you're not concerned about malicious > code, then it may be good enough. I wouldn't say that I wasn't concerned about it entirely, but on the other hand it is not a hard requirement to which all other concerns are secondary. Cheers, Garth -- https://mail.python.org/mailman/listinfo/python-list
Re: One liners
In article , Joel Goldstick wrote: > Aside from django urls, I am not sure I ever wrote regexes in python. For > some reason they must seem awfully sexy to quite a few people. Back to my > point above -- ever try to figure out a complicated regex written by > someone else? Regex has a bad rap in the Python community. To be sure, you can abuse them, and write horrible monstrosities. On the other hand, stuff like this (slightly reformatted for posting): pattern = re.compile( r'haproxy\[(?P\d+)]: ' r'(?P(\d{1,3}\.){3}\d{1,3}):' r'(?P\d{1,5}) ' r'\[(?P\d{2}/\w{3}/\d{4}(:\d{2}){3}\.\d{3})] ' r'(?P\S+) ' r'(?P\S+)/' r'(?P\S+) ' r'(?P(-1|\d+))/' r'(?P(-1|\d+))/' r'(?P(-1|\d+))/' r'(?P(-1|\d+))/' r'(?P\+?\d+) ' r'(?P\d{3}) ' r'(?P\d+) ' r'(?P\S+) ' r'(?P\S+) ' r'(?P[\w-]{4}) ' r'(?P\d+)/' r'(?P\d+)/' r'(?P\d+)/' r'(?P\d+)/' r'(?P\d+) ' r'(?P\d+)/' r'(?P\d+) ' r'(\{(?P.*?)\} )?' # Comment out for stock haproxy r'(\{(?P.*?)\} )?' r'(\{(?P.*?)\} )?' r'"(?P.+)"' ) while intimidating at first glance, really isn't that hard to understand. Python's raw string literals, adjacent string literal catenation, and automatic line continuation team up to eliminate a lot of extra fluff. -- https://mail.python.org/mailman/listinfo/python-list
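A cut-down sketch of the same technique: named groups, raw strings, and adjacent string literals joined inside the parentheses. The group names (`ip`, `port`, `status`) and the sample log line are assumptions for illustration, not the ones from the pattern above.

```python
import re

pattern = re.compile(
    r'(?P<ip>(\d{1,3}\.){3}\d{1,3}):'   # dotted-quad address
    r'(?P<port>\d{1,5}) '               # port number
    r'(?P<status>\d{3})'                # three-digit status code
)

match = pattern.search("haproxy[123]: 10.0.0.7:51234 200")
fields = match.groupdict()   # only the named groups appear here

print(fields)
```

Each fragment gets its own line and comment, which is most of what keeps a long pattern like this readable.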
Re: interactive help on the base object
On 12/6/2013 12:03 PM, Mark Lawrence wrote:
> Is it just me, or is this basically useless?
>
> >>> help(object)
> Help on class object in module builtins:
>
> class object
>  |  The most base type

Given that this can be interpreted as 'least desirable', it could definitely be improved.

> Surely a few more words,

How about something like:

'''The default top superclass for all Python classes.

Its methods are inherited by all classes unless overridden.
'''

When you have 1 or more concrete suggestions for the docstring, open a tracker issue.

> or a pointer to this
> http://docs.python.org/3/library/functions.html#object, would be better?

URLs don't belong in docstrings. People should know how to find things in the manual index.

--
Terry Jan Reedy
--
https://mail.python.org/mailman/listinfo/python-list
Re: Python 2.8 release schedule
On 12/6/2013 4:26 PM, Mark Lawrence wrote:
> My apologies if you've seen this before but here is the official
> schedule http://www.python.org/dev/peps/pep-0404/

The PEP number is not an accident ;-).

--
Terry Jan Reedy
--
https://mail.python.org/mailman/listinfo/python-list
Re: Python 2.8 release schedule
On 07/12/2013 01:39, Terry Reedy wrote:
> On 12/6/2013 4:26 PM, Mark Lawrence wrote:
>> My apologies if you've seen this before but here is the official
>> schedule http://www.python.org/dev/peps/pep-0404/
>
> The PEP number is not an accident ;-).

Sorry but I don't get it :)

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
--
https://mail.python.org/mailman/listinfo/python-list
Re: Python 2.8 release schedule
On Sat, Dec 7, 2013 at 12:48 PM, Mark Lawrence wrote:
> On 07/12/2013 01:39, Terry Reedy wrote:
>> On 12/6/2013 4:26 PM, Mark Lawrence wrote:
>>> My apologies if you've seen this before but here is the official
>>> schedule http://www.python.org/dev/peps/pep-0404/
>>
>> The PEP number is not an accident ;-).
>
> Sorry but I don't get it :)

HTTP error 404 "Not Found", probably the most famous (though not the most common) HTTP return code. You asked for Python 2.8? Sorry, not found... it's 404.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: Python 2.8 release schedule
On 07/12/2013 01:54, Chris Angelico wrote:
> On Sat, Dec 7, 2013 at 12:48 PM, Mark Lawrence wrote:
>> On 07/12/2013 01:39, Terry Reedy wrote:
>>> On 12/6/2013 4:26 PM, Mark Lawrence wrote:
>>>> My apologies if you've seen this before but here is the official
>>>> schedule http://www.python.org/dev/peps/pep-0404/
>>>
>>> The PEP number is not an accident ;-).
>>
>> Sorry but I don't get it :)
>
> HTTP error 404 "Not Found", probably the most famous (though not the
> most common) HTTP return code. You asked for Python 2.8? Sorry, not
> found... it's 404.
>
> ChrisA

Clearly that went straight over your head.

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
--
https://mail.python.org/mailman/listinfo/python-list
Re: Python 2.8 release schedule
On Sat, Dec 7, 2013 at 1:00 PM, Mark Lawrence wrote:
> On 07/12/2013 01:54, Chris Angelico wrote:
>> On Sat, Dec 7, 2013 at 12:48 PM, Mark Lawrence wrote:
>>> Sorry but I don't get it :)
>>
>> [explained the joke]
>
> Clearly that went straight over your head.

*facepalm* Yep, it did. Completely missed what you said there. Doh. I see what you did there... now.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: One liners
On Fri, 06 Dec 2013 15:54:22 -0800, Dan Stromberg wrote:
> Does anyone else feel like Python is being dragged too far in the
> direction of long, complex, multiline one-liners? Or avoiding temporary
> variables with descriptive names? Or using regex's for everything under
> the sun?

All those things are stylistic issues, not language issues. Yes, I see far too many people trying to squeeze three lines of code into one, but that's their choice, not the language leading them that way.

On the other hand, Python code style is influenced strongly by functional languages like Lisp, Scheme and Haskell (despite the radically different syntax). Python has even been described approvingly as "Lisp without the brackets". To somebody coming from a C or Pascal procedural background, or a Java OOP background, such functional-style code might seem too concise and/or weird. But frankly, I think that such programmers would write better code with a more functional approach. I refuse to apologise for writing the one-liner:

    result = [func(item) for item in sequence]

instead of four:

    result = []
    for i in range(len(sequence)):
        item = sequence[i]
        result.append(func(item))

> What happened to using classes? What happened to the beautiful emphasis
> on readability? What happened to debuggability (which is always harder
> than writing things in the first place)? And what happened to string
> methods?

What about string methods?

As far as classes go, I find that they're nearly always overkill. Most of the time, a handful of pre-written standard classes, like dict, list, namedtuple and the like, get me 90% of the way to where I need to go.

The beauty of Python is that it is a multi-paradigm language. You can write imperative, procedural, functional, OOP, or pipelining style (and probably more). The bad thing about Python is that if you're reading other people's code you *need* to be familiar with all those styles.

> I'm pleased to see Python getting more popular, but it feels like a lot
> of newcomers are trying their best to turn Python into Perl or
> something, culturally speaking.

They're probably writing code using the idioms they are used to from whatever language they have come from. Newcomers nearly always do this. The more newcomers you get, the less Pythonic the code you're going to see from them.

--
Steven
--
https://mail.python.org/mailman/listinfo/python-list
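For readers who want to check that the one-liner and the four-line loop in the post really are interchangeable, here is a small runnable sketch. The `func` and `sequence` definitions are placeholders invented for the example (the post leaves them unspecified), and the third variant, iterating directly without an index, is an addition rather than something from the post.

```python
def func(item):
    # Hypothetical transformation, used only so the example can run.
    return item * 2

sequence = [1, 2, 3]  # placeholder input data

# The one-liner from the post.
result_comprehension = [func(item) for item in sequence]

# The four-line, index-based loop from the post.
result_loop = []
for i in range(len(sequence)):
    item = sequence[i]
    result_loop.append(func(item))

# A middle ground (not in the post): an explicit loop without indexing.
result_direct = []
for item in sequence:
    result_direct.append(func(item))

print(result_comprehension == result_loop == result_direct)
```

All three build the same list; the comprehension simply states the intent ("apply func to every item") without the bookkeeping.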
Re: Managing Google Groups headaches
On 12/6/13 8:03 AM, rusi wrote:
>> I think you're off on the wrong track here. This has nothing to do with
>> plain text (ascii or otherwise). It has to do with divorcing how you
>> store and transport messages (be they plain text, HTML, or whatever)
>> from how a user interacts with them.
>
> Evidently (and completely inadvertently) this exchange has just
> illustrated one of the inadmissible assumptions:
>
> "unicode as a medium is universal in the same way that ASCII used to be"
>
> I wrote a number of ellipsis characters ie codepoint 2026 as in:
>
> - human communication…
> (is not very different from)
> - machine communication…
>
> Somewhere between my sending and your quoting those ellipses became the
> replacement character FFFD
>
>>> - human communication�
>>> (is not very different from)
>>> - machine communication�
>
> Leaving aside whose fault this is (very likely buggy google groups),
> this mojibaking cannot happen if the assumption "All text is ASCII" were
> to uniformly hold.
>
> Of course with unicode also this can be made to not happen, but that is
> fragile and error-prone. And that is because ASCII (not extended) is ONE
> thing in a way that unicode is hopelessly a motley inconsistent variety.

You seem to be suggesting that we should stick to ASCII. There are of course languages that need more than just the Latin alphabet. How would you suggest we support them? Or maybe I don't understand?

--Ned.
--
https://mail.python.org/mailman/listinfo/python-list
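The substitution of U+FFFD for U+2026 described in this message can be reproduced in a few lines. This is only a sketch of one plausible mechanism, under the assumption that the ellipsis travelled as a single byte in a legacy encoding such as Windows-1252 and was then decoded by software expecting something else. Which program actually did what in this exchange is not something the code can establish.

```python
ellipsis = '\u2026'  # HORIZONTAL ELLIPSIS

# In Windows-1252 the ellipsis is the single byte 0x85;
# in UTF-8 it takes three bytes.
cp1252_bytes = ellipsis.encode('cp1252')
utf8_bytes = ellipsis.encode('utf-8')

# Decoding with the encoding that was actually used round-trips cleanly.
right = cp1252_bytes.decode('cp1252')

# Assumption for illustration: a receiver that guesses UTF-8 or ASCII
# cannot interpret the byte 0x85 and substitutes U+FFFD instead.
wrong_utf8 = cp1252_bytes.decode('utf-8', errors='replace')
wrong_ascii = cp1252_bytes.decode('ascii', errors='replace')

print(cp1252_bytes, utf8_bytes)
print(ascii(right), ascii(wrong_utf8), ascii(wrong_ascii))
```

The bytes on the wire are the same in every case; only the receiver's choice of decoder differs, which is why a correct charset declaration matters.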
Re: One liners
On Fri, 06 Dec 2013 17:20:27 -0700, Michael Torrie wrote:
> On 12/06/2013 05:14 PM, Dan Stromberg wrote:
>> I'm thinking mostly of stackoverflow, but here's an example I ran into
>> (a lot of) on a job:
>>
>> somevar = some_complicated_thing(somevar) if
>> some_other_complicated_thing(somevar) else somevar
>>
>> Would it really be so bad to just use an if statement? Why are we
>> assigning somevar to itself? This sort of thing was strewn across 3 or
>> 4 physical lines at a time.

Unless you're embedding it in another statement, there's no advantage to using the ternary if operator if the clauses are so large you have to split the line over two or more lines in the first place. I agree that:

    result = (spam(x) + eggs(x) + toast(x)
              if x and condition(x) or another_condition(x)
              else foo(x) + bar(x) + foobar(x))

is probably better written as:

    if x and condition(x) or another_condition(x):
        result = spam(x) + eggs(x) + toast(x)
    else:
        result = foo(x) + bar(x) + foobar(x)

The ternary if is slightly unusual and unfamiliar, and is best left for when you need an expression:

    ingredients = [spam, eggs, cheese, toast if flag else bread, tomato]

As for your second complaint, "why are we assigning somevar to itself", I see nothing wrong with that. Better that than a plethora of variables used only once:

    # Screw this for a game of soldiers.
    def function(arg, param_as_list_or_string):
        if isinstance(param_as_list_or_string, str):
            param = param_as_list_or_string.split()
        else:
            param = param_as_list_or_string

    # Better.
    def function(arg, param):
        if isinstance(param, str):
            param = param.split()

"Replace x with a transformed version of x" is a perfectly legitimate technique, and not one which ought to be too hard to follow.

> You're right that a conventional "if" block is not only more readable,
> but also faster and more efficient code.

Really? I don't think so. This is using Python 2.7:

    [steve@ando ~]$ python -m timeit --setup="flag = 0" \
    > "if flag: y=1
    > else: y=2"
    1000 loops, best of 3: 0.0836 usec per loop
    [steve@ando ~]$ python -m timeit --setup="flag = 0" "y = 1 if flag else 2"
    1000 loops, best of 3: 0.0813 usec per loop

There's practically nothing between the two, but the ternary if operator is marginally faster.

As for readability, I accept that ternary if is unusual compared to other languages, but it's still quite readable in small doses. If you start chaining them:

    result = a if condition else b if flag else c if predicate else d

you probably shouldn't.

--
Steven
--
https://mail.python.org/mailman/listinfo/python-list
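The same comparison can be run from a script with the standard `timeit` module instead of the command line. This is a rough sketch rather than a rigorous benchmark: the repeat and loop counts are arbitrary choices made for the example, and the absolute numbers will vary with the interpreter version and the machine, so no expected timings are shown.

```python
import timeit

# The two forms being compared, as source strings for timeit.
statement_form = "if flag: y = 1\nelse: y = 2"
expression_form = "y = 1 if flag else 2"

# Arbitrary counts chosen for illustration; raise them for steadier numbers.
REPEAT = 3
NUMBER = 100000

# Take the best of several runs, as the timeit command-line tool does.
t_statement = min(timeit.repeat(statement_form, setup="flag = 0",
                                repeat=REPEAT, number=NUMBER))
t_expression = min(timeit.repeat(expression_form, setup="flag = 0",
                                 repeat=REPEAT, number=NUMBER))

# Convert total seconds for NUMBER loops into microseconds per loop.
print("if statement:  %.4f usec per loop" % (t_statement / NUMBER * 1e6))
print("ternary if:    %.4f usec per loop" % (t_expression / NUMBER * 1e6))
```

Differences this small are usually within measurement noise, so the choice between the two forms is better made on readability grounds.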
Re: ASCII and Unicode [was Re: Managing Google Groups headaches]
On Saturday, December 7, 2013 12:30:18 AM UTC+5:30, Steven D'Aprano wrote:
> On Fri, 06 Dec 2013 05:03:57 -0800, rusi wrote:
> > Evidently (and completely inadvertently) this exchange has just
> > illustrated one of the inadmissible assumptions:
> > "unicode as a medium is universal in the same way that ASCII used to be"
>
> Ironically, your post was not Unicode.
>
> Seriously. I am 100% serious.
>
> Your post was sent using a legacy encoding, Windows-1252, also known as
> CP-1252, which is most certainly *not* Unicode. Whatever software you
> used to send the message correctly flagged it with a charset header:
>
> Content-Type: text/plain; charset=windows-1252
>
> Alas, the software Roy Smith uses, MT-NewsWatcher, does not handle
> encodings correctly (or at all!), it screws up the encoding then sends a
> reply with no charset line at all. This is one bug that cannot be blamed
> on Google Groups -- or on Unicode.
>
> > I wrote a number of ellipsis characters ie codepoint 2026 as in:
>
> Actually you didn't. You wrote a number of ellipsis characters, hex byte
> \x85 (decimal 133), in the CP1252 charset. That happens to be mapped to
> code point U+2026 in Unicode, but the two are as distinct as ASCII and
> EBCDIC.
>
> > Somewhere between my sending and your quoting those ellipses became the
> > replacement character FFFD
>
> Yes, it appears that MT-NewsWatcher is *deeply, deeply* confused about
> encodings and character sets. It doesn't just assume things are ASCII,
> but makes a half-hearted attempt to be charset-aware, but badly. I can
> only imagine that it was written back in the Dark Ages where there were a
> lot of different charsets in use but no conventions for specifying which
> charset was in use. Or perhaps the author was smoking crack while coding.
>
> > Leaving aside whose fault this is (very likely buggy google groups),
> > this mojibaking cannot happen if the assumption "All text is ASCII" were
> > to uniformly hold.
>
> This is incorrect. People forget that ASCII has evolved since the first
> version of the standard in 1963. There have actually been five versions
> of the ASCII standard, plus one unpublished version. (And that's not
> including the things which are frequently called ASCII but aren't.)
>
> ASCII-1963 didn't even include lowercase letters. It is also missing some
> graphic characters like braces, and included at least two characters no
> longer used, the up-arrow and left-arrow. The control characters were
> also significantly different from today.
>
> ASCII-1965 was unpublished and unused. I don't know the details of what
> it changed.
>
> ASCII-1967 is a lot closer to the ASCII in use today. It made
> considerable changes to the control characters, moving, adding, removing,
> or renaming at least half a dozen control characters. It officially added
> lowercase letters, braces, and some others. It replaced the up-arrow
> character with the caret and the left-arrow with the underscore. It was
> ambiguous, allowing variations and substitutions, e.g.:
>
> - character 33 was permitted to be either the exclamation
>   mark ! or the logical OR symbol |
>
> - consequently character 124 (vertical bar) was always
>   displayed as a broken bar ¦, which explains why even today
>   many keyboards show it that way
>
> - character 35 was permitted to be either the number sign # or
>   the pound sign £
>
> - character 94 could be either a caret ^ or a logical NOT ¬
>
> Even the humble comma could be pressed into service as a cedilla.
>
> ASCII-1968 didn't change any characters, but allowed the use of LF on its
> own. Previously, you had to use either LF/CR or CR/LF as newline.
>
> ASCII-1977 removed the ambiguities from the 1967 standard.
>
> The most recent version is ASCII-1986 (also known as ANSI X3.4-1986).
> Unfortunately I haven't been able to find out what changes were made -- I
> presume they were minor, and didn't affect the character set.
>
> So as you can see, even with actual ASCII, you can have mojibake. It's
> just not normally called that. But if you are given an arbitrary ASCII
> file of unknown age, containing code 94, how can you be sure it was
> intended as a caret rather than a logical NOT symbol? You can't.
>
> Then there are at least 30 official variations of ASCII, strictly
> speaking part of ISO-646. These 7-bit codes were commonly called "ASCII"
> by their users, despite the differences, e.g. replacing the dollar sign $
> with the international currency sign ¤, or replacing the left brace
> { with the letter s with caron š.
>
> One consequence of this is that the MIME type for ASCII text is called
> "US ASCII", despite the redundancy, because many people expect "ASCII"
> alone to mean whatever national variation they are used to.
>
> But it gets worse: there are proprietary variations on ASCII which are
> commonly called "ASCII" but aren't, including dozens of 8-bit so-called
> "extended ASCII" character sets, which i