Re: logging time format millisecond precision decimalsign
"Alex van der Spek" writes: > I use this formatter in logging: > > formatter = logging.Formatter(fmt='%(asctime)s \t %(name)s \t %(levelname)s > \t %(message)s') > > Sample output: > > 2012-07-19 21:34:58,382 root INFO Removed - C:\Users\ZDoor\Documents > > The time stamp has millisecond precision but the decimal separator is a > comma. > > Can I change the comma (,) into a period (.) and if so how? I do it by: 1. Replacing the default date format string to exclude ms. 2. Including %(msecs)03d in the format string where appropriate. Using 'd' instead of s truncates rather than shows the full float value. So in your case, I believe that changing your formatter creation to: formatter = logging.Formatter(fmt='%(asctime)s.%(msecs)03d \t %(name)s \t %(levelname)s \t %(message)s', '%Y-%m-%d %H:%M:%S') should work. This uses the same date format as the default, but without ms, though of course you could also opt to make any other date format you prefer. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Migrate from Access 2010 / VBA
kgard writes: > I am the lone developer of db apps at a company of 350+ > employees. Everything is done in MS Access 2010 and VBA. I'm > frustrated with the limitations of this platform and have been > considering switching to Python. I've been experimenting with the > language for a year or so, and feel comfortable with the basics. (...) > Has anyone here made this transition successfully? If so, could you > pass along your suggestions about how to do this as quickly and > painlessly as possible? I went through a very similar transition a few years ago from standalone Access databases (with GUI forms, queries and reports, as well as replication) to a pure web application with full reporting (albeit centrally designed and not a report designer for users). I suppose my best overall suggestion is to migrate the data first and independently of any other activities. Unless your uses for Access in terms of GUI or reporting are extremely limited, don't try to replace your current system in one swoop, and in particular, be willing to continue allowing Access as long as necessary for GUI/reports until you're sure you've matched any current capabilities with an alternate approach. For all its warts, as a database GUI and reporting tool, Access has a lot going for it, and it can be more complex than you may think to replicate elsewhere. So the first thing I would suggest is to plan and implement a migration of the data itself. In my case I migrated the data from Access into PostgreSQL. That process itself took some planning and testing in terms of moving the data, and then correcting various bits of the schemas and data types (if I recall, booleans didn't round-trip properly at first), so was actually a series of conversions until I was happy, during which time everyone was using Access as usual. To support the migration, I created a mirror Access database to the production version, but instead of local Jet tables, I linked all the tables to the PostgreSQL server. All other aspects of the Access database (e.g., forms, reports, queries) remained the same, just now working off of the remote data. This needed testing too - for example, some multi-level joining in Access queries can be an issue. In some cases it was easier for me to migrate selected Access query logic into a database view and then replace the query in Access to use the view. You also need to (painfully) set any UI aspects of the table definitions manually since the linking process doesn't set that up, for which I used the original Access db as a model. I ended up doing that multiple times as I evolved the linked database, and I'll admit that was seriously tedious. While not required, I also wrapped up my new linked Access database into a simple installer (InnoSetup based in my case). Prior to this everyone was just copying the mdb file around, but afterwards I had an installer they just ran to be sure they had the latest version. If you do this part carefully, for your end users, aside from installing the new database, they see absolutely no difference, but you now have easy central access to the data, and most importantly can write other applications and tools against it without touching the Access side. It turns Access into just your GUI and reporting tool. If you have power users that make local changes they can continue to design additional queries or reports in their own local mdb against the linked tables. 
They'll need some extra support for updates though, either instructions to re-link, or instructions on exporting and importing their local changes into a newly installed version of your master mdb. Having done this, you are then free to start implementing, for example, a web based application to start taking over functionality. The nice thing is that you need not replicate everything at once, you can start slow or with the most desirable features, letting Access continue to handle the less common or more grungy legacy stuff at first. There are innumerable discussions on best web and application frameworks, so probably not worth getting into too much. In my case I'm using a CherryPy/Genshi/SQLAlchemy/psycopg2 stack. As long as you still have Access around, you'll have to take it into consideration with schema changes, but that's not really that much harder than any other schema migration management. It's just another client to the database you can run in parallel as long as you wish. If you do change the schema, when done, just load your master Access database, update the links, and rebuild/redistribute the installer to your users. Many changes (e.g., new columns with defaults) can be backwards compatible and avoid forced upgrades. You can operate both systems in parallel for a while even for similar functionality (for testing if nothing else), but can then retire functionality from Access as the web app supports it. Ideally this will be organic by your users preferring the web. Selecting when to drop Access entirely c
Re: Sandboxed Python: memory limits?
Chris Angelico writes: >So I'm hoping to restrict the script's ability to > consume all of memory, without (preferably) ulimit/rlimiting the > entire process (which does other things as well). But if it can't be, > it can't be. Just wondering, but rather than spending the energy to cap Python's allocations internally, could similar effort instead be directed at separating the "other things" the same process is doing? How tightly coupled is it? If you could split off just the piece you need to limit into its own process, then you get all the OS tools at your disposal to restrict the resources of that process. Depending on what the "other" things are, it might not be too hard to split apart, even if you have to utilize some IPC mechanism to coordinate among the two pieces. Certainly might be of the same order of magnitude of tweaking Python to limit memory internally. -- David -- http://mail.python.org/mailman/listinfo/python-list
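[As a rough sketch of that split-process idea - Unix-only, untested, and the worker function is entirely made up - the restricted piece can be forked off and capped with resource.setrlimit() so the limit never touches the main process:

    import os
    import resource

    def run_limited(worker, limit_bytes):
        # Fork a child, cap only its address space, and report back
        # whether the worker stayed within the limit.
        pid = os.fork()
        if pid == 0:
            resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))
            try:
                worker()
                os._exit(0)
            except MemoryError:
                os._exit(1)
        pid, status = os.waitpid(pid, 0)
        return os.WEXITSTATUS(status) == 0

    def untrusted_script():
        data = 'x' * (256 * 1024 * 1024)    # deliberately large allocation

    print run_limited(untrusted_script, 64 * 1024 * 1024)   # -> False

Any results the child needs to pass back would go over a pipe or similar IPC mechanism, as mentioned above.]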
Re: Replacing utf-8 characters
Mike <[EMAIL PROTECTED]> writes:

> What you and I typed was ascii. The value of link came from importing
> that utf-8 web page into that variable. That is why I think it is not
> working. But not sure what the solution is.

Are you sure you're asking what you think you are asking?

Both the ampersand character (&) and the characters within the ampersand entity character reference (&amp;) are ASCII. As it turns out they are also legal UTF-8, but I would not call a web page UTF-8 just because I saw the sequence of characters "&amp;" within the stream. (That's not to say it isn't UTF-8 encoded, just that I don't think that's the issue)

I'm just guessing, but you do realize that legal HTML should quote all uses of the ampersand character with an entity reference, since the ampersand itself is reserved for use in such references. This includes URL references whether inside attributes or in the body of the text. So when you see something in a browser in a web page that shows a URL that includes "&" such as for separating parameters, internally that page is (or should be) stored with "&amp;" for that character. Thus if you retrieve the page in code, that's what you'll find. It's the browser processing that entity reference that turns it back into the "&" for presentation.

Note that whether or not the page in question is encoded as UTF-8 is a completely distinct question - whatever encoding the page is in would be used to encode the characters in the entity reference (namely "&amp;").

I'm assuming that in scraping the page you want to reverse the process (e.g., perform the interpretation of the entity references much as a browser would) before using that URL for other purposes. If so, the string replacement you tried should handle the replacement just fine, at least within the value of the URL as managed by your code.

You then mention it being the same when you view the contents of the link, which isn't quite clear to me, but if that means retrieving another copy of the link as embedded in an HTML page then yes, it'll get quoted again since, as initially, you have to quote an ampersand as an entity reference within HTML. What did you mean by "view the contents link"?

-- David
-- http://mail.python.org/mailman/listinfo/python-list
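[A small illustration of the reversal described above - untested, and the link value is just a made-up example of what might be scraped out of the HTML:

    from xml.sax.saxutils import unescape

    link = 'http://example.com/page?a=1&amp;b=2'   # as stored in the HTML source
    url = link.replace('&amp;', '&')               # simple targeted replacement
    url = unescape(link)                           # or: handles &amp;, &lt;, &gt; together
    print url                                      # http://example.com/page?a=1&b=2
]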
Re: Securing a future for anonymous functions in Python
Scott David Daniels <[EMAIL PROTECTED]> writes:

> David Bolen wrote:
> > So for example, an asynchronous sequence of operations might be like:
> >     d = some_deferred_function()
> >     d.addCallback(lambda x: next_function())
> >     d.addCallback(lambda blah: third_function(otherargs, blah))
> >     d.addCallback(lambda x: last_function())
> > which to me is more readable (in terms of seeing the sequence of
> > operations being performed in their proper order), then something like:
> >     def cb_next(x):
> >         return next_function()
> >     def cb_third(blah, otherargs):
> >         return third_function(otherargs, blah)
> >     def cb_last(x):
> >         return last_function()
> >     d = some_deferred_function()
> >     d.addCallback(cb_next)
> >     d.addCallback(cb_third, otherargs)
> >     d.addCallback(cb_next)
> > which has an extra layer of naming (the callback functions), and
> > requires more effort to follow the flow of what is really just a simple
> > sequence of three functions being called.
>
> But this sequence contains an error of the same form as the "fat":

"this" being which of the two scenarios you quote above?

>     while test() != False:
>         ...code...

I'm not sure I follow the "error" in this snippet...

> The right sequence using lambda is:
>     d = some_deferred_function()
>     d.addCallback(next_function)
>     d.addCallback(lambda blah: third_function(otherargs, blah))
>     d.addCallback(last_function)

By what metric are you judging "right"?

In my scenario, the functions next_function and last_function are not written to expect any arguments, so they can't be passed straight into addCallback because any deferred callback will automatically receive the result of the prior deferred callback in the chain (this is how Twisted handles asynchronous callbacks for pending operations). Someone has to absorb that argument (either the lambda, or next_function itself, which if it is an existing function, needs to be handled by a wrapper, ala my second example).

Your "right" sequence simply isn't equivalent to what I wrote. Whether or not next_function is fixable to be used this way is a separate point, but then you're discussing two different scenarios, and not two ways to write one scenario.

-- David
-- http://mail.python.org/mailman/listinfo/python-list
Re: Another PythonWin Excel question
"It's me" <[EMAIL PROTECTED]> writes: > Yes, I read about that but unfortunately I have no experience with VBA *at > all*. :=( You don't really have to know VBA, but if you're going to try to interact with COM objects from Python, you'll find it much smoother if you at least use any available reference information for the COM object model and interfaces you are using. In the Excel case, that means understanding - or at least knowing how to look in a reference - its object model, since that will tell you exactly what parameters an Add method on a worksheet object will take and how they work. For excel, online documentation can be found in a VBAXL9.CHM help file (the "9" may differ based on Excel release), but it might not always be installed depending on what options were selected on your system. In my English, Office 2000 installation, for example, the files are located in: c:\Program Files\Microsoft Office\Office\1033 You can load that file directly, or Excel itself will reference it from within the script editor help (Tools->Macro->Visual Basic Editor, then F1 for help). If you methods or classes and have the help installed it'll bring in the reference. You can also find it on MSDN on the web, although it can be tricky to navigate down to the right section - the top of the Office 2000 object documentation should be available at: http://msdn.microsoft.com/library/en-us/odeomg/html/deovrobjectmodelguide.asp This is mostly reference information, but there are some higher level discussions of overall objects (e.g., worksheets, workbooks, cells, etc...) too. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: extension module, thread safety?
Torsten Mohr <[EMAIL PROTECTED]> writes: > The question came up if this is by itself thread safe, > if some two or more threads try to change these data types, > are the C functions by themselves are "atomic" or can they > be interrupted be the perl interpreter and then (data types > are in some inconsistent half-changed state) another function > that works on these data is called? I presume you mean "Python" and not "perl"... If the threads under discussion are all Python threads, then by default yes, the extension module C functions will appear to be atomic from the perspective of the Python code. When the Python code calls into the extension module, the GIL (global interpreter lock) is still being held. Unless the extension module code explicitly releases the GIL, no other Python threads can execute (even though those threads are in fact implemented as native platform threads). So in general, if you write an extension module where none of its functions ever release the GIL, there's no way for two of its functions to be run from different Python threads simultaneously. Note that this restriction won't necessarily hold if there are other ways (at the C level, or from other extension modules) to trigger code in the extension module, since that's outside of the control of the Python GIL. Nor will it necessarily hold true if your extension module calls back out into Python (as a callback, or whatever) since once the interpreter is back in Python code the interpreter itself will periodically release the GIL, or some other extension code that the callback code runs may release it. To the extent possible, it's considered good practice to release the GIL in an extension module whenever you are doing lengthy processing so as to permit other Python threads (that may have nothing to do with using your extension module) to execute. For short routines this really isn't an issue, but if your extension module will be spending some time managing its data, you may wish to add some internal thread protection around that data, so that you can use your own locks rather than depending on the GIL. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: extension module, thread safety?
Nick Coghlan <[EMAIL PROTECTED]> writes: > Pierre Barbier de Reuille wrote: > > Ok, I wondered why I didn't know these functions, but they are new > > to Python 2.4 ( and I didn't take the time to look closely at Python > > 2.4 as some modules I'm working with are still not available for > > Python 2.4). But if it really allows to call Python code outside a > > Python thread ... then I'll surely use that as soon as I can use > > Python 2.4 :) Thanks for the hint :) > > The Python 2.4 docs claim the functions were added in Python 2.3, even > though they aren't documented in the 2.3.4 docs. > > The 2.3 release PEP (PEP 283) confirms that PEP 311 (which added these > functions) went in. And even before that it was certainly possible to call into the Python interpreter from a native thread using existing functions, albeit the newer functions are more convenient (and perhaps more robust, I don't know). My earliest interaction with Python (~1999, while writing a module that extended and embedded Python 1.5.2) used PyEval_AcquireThread() and PyEval_ReleaseThread() to get access to a thread state from a native C application thread (not initiated by the Python interpreter) to allow me to call safely into an executing Python script upon asynchronous data reception by the C code. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: lambda
Antoon Pardon <[EMAIL PROTECTED]> writes: > Op 2005-01-18, Simon Brunning schreef <[EMAIL PROTECTED]>: > > On 18 Jan 2005 07:51:00 GMT, Antoon Pardon <[EMAIL PROTECTED]> wrote: > >> 3 mutating an item in a sorted list *does* *always* cause problems > > > > No, it doesn't. It might cause the list no longer to be sorted, but > > that might or might no be a problem. > > Than in the same vain I can say that mutating a key in a dictionary > doesn't always cause problems either. Sure it may probably make a > key unaccessible directly, but that might or might not be a problem. Well, I'd definitely consider an inaccessible key as constituting a problem, but I don't think that's a good analogy to the list case. With the dictionary, the change can (though I do agree it does not have to) interfere with proper operation of the dictionary, while a list that is no longer sorted still functions perfectly well as a list. That is, I feel "problems" are more guaranteed with a dictionary since we have affected base object behavior, whereas sorted is not an inherent attribute of the base list type but something the application is imposing at a higher level. For example, I may choose to have an object type that is mutable (and not worthy for use as a dictionary key) but maintains a logical ordering so is sortable. I see no problem with sorting a list of such objects, and then walking that list to perform some mutation to each of the objects, even if along the way the mutation I am doing results in the items so touched no longer being in sorted order. The act of sorting was to provide me with a particular sequence of objects, but aside from that fact, the list continues to perform perfectly well as a list even after the mutations - just no longer delivering objects in sorted order. -- David -- http://mail.python.org/mailman/listinfo/python-list
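[A tiny illustration of that distinction, with made-up names:

    class Job:
        def __init__(self, priority):
            self.priority = priority
        def __cmp__(self, other):
            return cmp(self.priority, other.priority)

    jobs = [Job(3), Job(1), Job(2)]
    jobs.sort()                        # walk them in priority order...
    for job in jobs:
        job.priority = -job.priority   # ...mutating each as we go

    # 'jobs' is no longer sorted, but it is still a perfectly good list:
    # nothing has become inaccessible, unlike a mutated dictionary key.
    print [job.priority for job in jobs]
]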
Re: Redirecting stdout/err under win32 platform
Pierre Barbier de Reuille <[EMAIL PROTECTED]> writes: > AFAIK, there is no working bidirectionnal pipes on Windows ! The > functions exists in order for them to claim being POSIX, but they're > not working properly. (...) Can you clarify what you believe doesn't work properly? The os.popen* functions under Windows use native CreateProcess calls to create the child process and connect stdin/out/err handles to that child process, so should behave properly. (Subject of course to the same risk of deadlocks and what not due to buffering or queued up data that any system would have with these calls) -- David -- http://mail.python.org/mailman/listinfo/python-list
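[A minimal check of that - an untested sketch, using 'sort' only because it is a convenient command on both Windows and Unix that reads stdin and writes stdout; the usual buffering/deadlock caveats apply to larger volumes of data:

    import os

    w, r, e = os.popen3('sort')
    w.write('pear\napple\nbanana\n')
    w.close()                  # let the child see EOF and finish
    print r.read()             # apple / banana / pear
    r.close()
    e.close()
]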
Re:
[EMAIL PROTECTED] (Roy Smith) writes: (...) > We've got code coveage tools. This is a testing tool. You keep > running tests and it keeps track of which lines of code are executed > (i.e. which logic branches are taken). One theory of testing says you > should keep writing test cases until you've exercised every branch. I > don't see any reason such a tool wouldn't be useful in a big Python > project, but I'm not aware of any. The coverage.py module (http://www.garethrees.org/2001/12/04/python-coverage) has worked pretty well for us. Just run your unit test suite under its control. There's also http://www.softwareverify.com/pythonCoverageValidator which is commercial. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: debugger?
Qiangning Hong <[EMAIL PROTECTED]> writes: (...) > However, while I use pdb or inserting "print" statement to debug my > apps, sometimes it is a pain. I think I need a good GUI debugger to > help me. The debugger should meet _most_ of the following > requirements: > > 1. can debug wxPython applications (and other GUI lib). > 2. an intuitive way to set/clear/enable/disable breakpoints. > 3. can set conditional breakpoints (i.e. break when some condition satisfied). > 4. variable watch list, namescope watching (local, global) > 5. evaluate expression, change variable values, etc within debugging. > 6. change the running routine, (i.e. go directly to a statement, skip > some statements, etc) > 7. clever way to express objects, not just a string returned by repr() > 8. perform profiling > 9. a clear interface. > 10. cross-platform. > 11. free, or better, open source. Although we typically use unit tests and 'print' debugging, I settled on Wing IDE as having the best debugger for the times when something more was needed. It's not free (pretty reasonable cost for an IDE though), but otherwise I think would meet your other points, except perhaps for profiling. It's easy enough to grab an evaluation version to try out (http://www.wingide.com). For us, a big point was wxPython debugging, and being able to stop at exceptions within wxPython event handlers. Interestingly enough, that's seems to be a tough requirement for many of the existing debuggers because the exceptions occur in code that has been called out to from within a C++ layer, and thus have to be caught before the C++ layer gets a chance to clear the exception. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Bug on Python2.3.4 [FreeBSD]?
Uwe Mayer <[EMAIL PROTECTED]> writes: > AFAICT there seems to be a bug on FreeBSD's Python 2.3.4 open function. The > documentation states: > > > Modes 'r+', 'w+' and 'a+' open the file for updating (note that 'w+' > > truncates the file). Append 'b' to the mode to open the file in binary > > mode, on systems that differentiate between binary and text files (else it > > is ignored). If the file cannot be opened, IOError is raised. > > Consider: > > $ cat test > lalala > > $ python2.3 > Python 2.3.4 (#2, Jan 4 2005, 04:42:43) > [GCC 2.95.4 20020320 [FreeBSD]] on freebsd4 > Type "help", "copyright", "credits" or "license" for more information. > >>> f = open('test', 'r+') > >>> f.read() > 'lalala\n' > >>> f.write('testing') > >>> f.close() > >>> > [1]+ Stopped python2.3 > $ cat test > lalala > > -> write did not work; ok Strange, I tried this with Python 2.3.3 and 2.3.5 on two FreeBSD 4.10 systems and it seemed to append to the file properly in both cases. Going back further, it also worked with Python 2.2.2 on a FreeBSD 4.7 system. I don't see happen to have a 2.3.4 installation, but can't see any changes to the source for the file object between 2.3.4 and 2.3.5, for example. ~> python Python 2.3.5 (#2, May 5 2005, 11:11:17) [GCC 2.95.4 20020320 [FreeBSD]] on freebsd4 Type "help", "copyright", "credits" or "license" for more information. >>> f = open('test','r+') >>> f.read() 'lalala\n' >>> f.write('testing') >>> f.close() >>> ~> cat test lalala testing # no newline was present Which version of FreeBSD are you running? I thought it might be a dependency on needing to seek between reads and writes on a duplex stream (which is ANSI), but FreeBSD doesn't require that, at least back as far as a 4.7 system I have, and I assume much earlier than that. One dumb question - are you absolutely sure it wasn't appending? As written, there's no trailing newline on the file, so your final "cat test" would produce output where the "testing" was on the same line as your next command prompt, and can sometimes be missed visually. > Can anyone confirm that? Is there any other way of opening a file for > appending instead of a+? Well, if you really just want appending, I'd just use "a". It creates the file if necessary but always appends to the end. Of course, it's not set up for reading, but you wouldn't need that for appending. -- David -- http://mail.python.org/mailman/listinfo/python-list
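[For what it's worth, the portable ANSI way to mix reading and writing on an update stream is to seek in between, so a quick untested test along these lines on the affected box might be informative:

    f = open('test', 'r+')
    f.read()
    f.seek(0, 2)          # explicit positioning between the read and the write
    f.write('testing')
    f.close()
]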
Re: win32 service and sockets
Tom Brown <[EMAIL PROTECTED]> writes: > Well, I have found that it works if I launch the client on the same > machine as the service. It will not work from a remote machine. Any > ideas? Sounds like it might be an issue at the network layer rather than in your code - perhaps a routing or filtering problem between your two machines. Have you verified that you do in fact have network connectivity between the machines (such as with ping), and that you can reach your server's port from the client (perhaps try telnetting to the port). Since you mentioned Xp, could any of it's built-in firewall support be enabled, and perhaps blocking access to your server's port? -- David -- http://mail.python.org/mailman/listinfo/python-list
print and str subclass with tab in value
I ran into this strange behavior when noticing some missing spaces in some debugging output. It seems that somewhere in the print processing, there is special handling for string contents that isn't affected by changing how a string is represented when printed (overriding __str__).

For example, given a class like:

    class mystr(str):

        def __new__(cls, value):
            return str.__new__(cls, value)

        def __str__(self):
            return 'Test'

you get the following behavior:

    >>> x = strtest.mystr('foo')
    >>> print x,1
    Test 1
    >>> print repr(x),1
    'foo' 1

    >>> x = strtest.mystr('foo\t')
    >>> print x,1
    Test1
    >>> print repr(x),1
    'foo\t' 1

Note the lack of a space if the string value ends in a tab, even if that tab has nothing to do with the printed representation of a string.

It looks like it's part of basic string output since with a plain old string literal the tab gets output (I've replaced the literal tab with [TAB] in the output below) but no following space:

    >>> x = 'testing\t'
    >>> print x,1
    testing[TAB]1
    >>> x = str('testing\t')
    >>> print x,1
    testing[TAB]1

so I'm guessing it's part of some optimization of tab handling in print output, although a quick perusal of the Python source didn't have anything jump out at me. It seems to me that this is probably a buglet since I would expect print and its softspace handling to depend on what was actually written and not internal values - has anyone else ever run into this?

-- David
-- http://mail.python.org/mailman/listinfo/python-list
Re: cannot open file in write mode, no such file or directory
[EMAIL PROTECTED] writes: > I'm having a problem where when trying to open a file in write mode, I > get an IOError stating no such file or directory. I'm calling an > external program which takes an input file and produces an output file > repeatedly, simulating the input file separately for each replicate. > The error occurs when trying to open the input file to write out the > new data. The problem is difficult to reproduce since it only shows up > once every few thousand replicates. I've tried using both os.system > and os.popen to invoke the external program. Originally I was running > this on cygwin, but also tried under windows. You might be hitting a race condition where the OS is still considering the file to be in use when you get around to rewriting it, even if the using application has just exited. I've run into similar problems when trying to rename temporary files under NT based systems. The problem can be obscured because some of the Win32-specific IO errors can turn into more generic IOError exceptions at the Python level due to incomplete mappings available for all Win32 errors. In particular, a lot of Win32-layer failures turn into EINVAL errno's at the C RTL level, which Python in turn translates to ENOENT (which is the file not found). So the IOError exception at the Python level can be misleading. Since it sounds like you can reproduce the problem relatively easily (just run your application several thousand times), a quick check for this condition would be to trap the IOError, delay a few seconds (say 5-10 to be absolutely sure, although in the cases I've run into 2-3 is generally more than enough), and retry the operation. If that succeeds, then this might be the issue you're hitting. -- David -- http://mail.python.org/mailman/listinfo/python-list
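[Something along these lines would both work around the problem and help confirm the diagnosis - an untested sketch, with the filename, data and retry/delay values as placeholders:

    import time

    def open_with_retry(filename, mode, retries=5, delay=5):
        # Retry what may be a transient sharing/timing failure before
        # giving up for real.
        for attempt in range(retries):
            try:
                return open(filename, mode)
            except IOError:
                if attempt == retries - 1:
                    raise
                time.sleep(delay)

    f = open_with_retry('input.dat', 'w')
    f.write('...new replicate data...')
    f.close()

If the retried open succeeds where the immediate one failed, the race condition described above is the likely culprit.]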
Re: Gordon McMillan installer and Python 2.4
[EMAIL PROTECTED] (Svein Brekke) writes: > Has anyone else succeded in using McMillans installer on 2.4? > Thanks for any help. I have a feeling that it may be related to the fact that the starter modules (run*.exe) were built with VC6, which matches with Python builds up to 2.3, but not 2.4 (which is built with VC7). So you're probably running into a mismatch between C runtime libraries. Since it sounds like you have VS 2003 available to you, you might try rebuilding the appropriate run*.exe (the source is in the installer tree beneath the source directory). VS 2003 should auto-convert the .dsw/.dsp files but there's at least one dependency (zlib) that you might have to build separately and handle manually (e.g., update the project to locate your particular zlib library). -- David -- http://mail.python.org/mailman/listinfo/python-list
email.Message.set_charset and Content-Transfer-Encoding
I've noticed that using set_charset() on an email.Message instance will not replace any existing Content-Transfer-Encoding header but will install one if it isn't yet present. Thus, if you initially create a message without a charset, it defaults to us-ascii, and creates both Content-Type and Content-Transfer-Encoding headers (the latter of which defaults to "7bit"). If you then later attempt to change the charset (say, to "iso-8859-1") with set_charset(), it adjusts the Content-Type header, but leaves the Content-Transfer-Encoding header alone, which I would think is no longer accurate, since it is the new charset's body encoding that will eventually be used when flattening the message, which would then no longer match the encoding header. It's also different than if you had passed in an iso-8859-1 charset originally when constructing the message instance, in which case the encoding would have been selected as quoted-printable. The documentation for set_charset seemed to imply (at least to me) that eventual headers generated by a generator would be affected by the change in charset, so having it stay at 7bit was confusing. Is anyone aware of a reason why the encoding shouldn't adjust in response to a set_charset call similar to how a supplied charset initially establishes it at message creation time? -- David -- http://mail.python.org/mailman/listinfo/python-list
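[A short demonstration of the behavior being described - untested, and the exact header values may vary a little between email package versions:

    from email.MIMEText import MIMEText

    msg = MIMEText('caf\xe9')                # no charset given: us-ascii default
    print msg['Content-Type']                # text/plain; charset="us-ascii"
    print msg['Content-Transfer-Encoding']   # 7bit

    msg.set_charset('iso-8859-1')
    print msg['Content-Type']                # charset portion is updated...
    print msg['Content-Transfer-Encoding']   # ...but this stays 7bit
]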
Re: Number of colors in an image
Christos "TZOTZIOY" Georgiou <[EMAIL PROTECTED]> writes: > A set seems more appropriate in this case: > > color_count = len(set(izip(r, g, b))) # untested code Well, while potentially premature optimization, I was trying for performance in this case. In Python 2.3, the sets module is coded in Python, and just wraps a dictionary, and when handed an iterable, ends up looping (in Python) with individual dictionary key assignments. Although I didn't specifically test sets, when I did a loop like that myself, it was 5-6 times slower than directly building the dictionary. That might change in 2.4 with the built-in set - it's still a wrapper around dict but knows it's just directly setting items to a true value so can avoid dealing with the tuples that dict does (not to mention I don't have to build the extra tuple). Although I expect the direct support in PIL 1.1.5 that Fredrik posted about will be best. > >For a greyscale, or single banded image, it should be faster just to > >use the built-in PIL "histogram" method and take the length of the > >resulting list. > > More like the count of non-zero elements in the histogram; I believe the > length of the resulting list will be constant (ie 256). Oops, definitely yes. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: subclassing extension type and assignment to __class__
gregory lielens <[EMAIL PROTECTED]> writes: (...) > Yes I was affraid this would be the conclusion: embedding instead of > inheritance...But this means that not only the method that need to be > modified need to be rewritten in python, all the other ones also just > to delegate to the embedded instance... > This is not too practical, even if such writing can be more or less > automatized... Unless I'm misunderstanding, couldn't one of __getattr__ or __getattribute__ make mapping other methods you don't override very simple and practical, not to mention fully automated with 2-3 lines of code? In particular, __getattr__ would seem good for your use since it is only called for attributes that couldn't already be located. I've had code that wrapped underlying objects in a number of cases, and always found that to be a pretty robust mechanism. -- David -- http://mail.python.org/mailman/listinfo/python-list
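[A bare-bones sketch of that delegation, with made-up names:

    class Wrapper(object):
        """Wrap an existing extension-type instance, overriding selectively."""

        def __init__(self, wrapped):
            self._wrapped = wrapped

        def solve(self, *args, **kw):
            # the one method whose behavior you actually want to change
            print 'about to solve'
            return self._wrapped.solve(*args, **kw)

        def __getattr__(self, name):
            # only called when normal lookup fails, i.e. for every
            # attribute/method you did not explicitly override
            return getattr(self._wrapped, name)
]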
Re: module imports and memory usage
Brad Tilley <[EMAIL PROTECTED]> writes: > When memory usage is a concern, is it better to do: > > from X import Y > > or > > import X Depending on "Y", the latter can technically use less memory, but it's likely to be fairly small and depends on how many symbols from that module you want to have local bindings. In general, it's not worth worrying about. For example, if module X has 1000 bound names in it, then doing an "import X" loads the module and gives your current namespace a single reference to the entire module. If however, you do a "from X import *" you still load the X module, but also get local names bound for each of the names in the X module. So you have double the references to the objects created within X. Those references take up space (for the textual name and the pointer) in your current namespace's dictionary. But if you just do a "from X import Y" where Y is a single symbol, then it's a wash because you get Y bound as a local namespace name, but you don't have a reference to the module itself. If you were to import multiple symbols, then you'd have a few extra references locally. Sometimes though it's not even the memory you can affect, but performance. One place where I did find this to be noticeable, for example, was with wxPython, where older releases typically often did a "from wxPython.wx import *" since all the wxPython names already had a "wx" prefix. But that yielded thousands of extra name bindings in the local namespace. Switching to "from wxPython import wx" and then using "wx.XXX" instead of just "XXX" actually made a fairly dramatic decrease in load time. It did also drop memory because I had a bunch of plugin modules, all of which were burning up a few thousand name bindings for the same wxPython symbols. Switching them to just use the module reference was a noticeable savings in that case. > Also, is there a way to load and unload modules as they are needed. I > have some scripts that sleep for extended periods during a while loop > and I need to be as memory friendly as possible. I can post a detailed > script that currently uses ~ 10MB of memory if anyone is interested. You can play tricks by manually deleting a module out of sys.modules, but there's no guarantee that the memory will actually be released (either by Python or the OS). Unless you're working under very specific external resources, I'd generally leave this to the OS. It will figure out when some of your working set is unused for extended periods and generally it should end up in swap space if you actually need the memory for something else. -- David -- http://mail.python.org/mailman/listinfo/python-list
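[A quick, rough way to see the difference in bindings - illustrative only, using logging as an arbitrary stand-in for module X:

    import sys

    import logging                 # binds just one local name: 'logging'
    before = len(dir())
    from logging import *          # additionally binds each exported name
    print len(dir()) - before, 'roughly the number of extra local bindings'

    # A loaded module can be dropped by hand, though the memory is not
    # guaranteed to be returned to the OS:
    del sys.modules['logging']
]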
Re: non blocking read()
Jp Calderone <[EMAIL PROTECTED]> writes:

>     def nonBlockingReadAll(fileObj):
>         bytes = []
>         while True:
>             b = fileObj.read(1024)
>             bytes.append(b)
>             if len(b) < 1024:
>                 break
>         return ''.join(bytes)

Wouldn't this still block if the input just happened to end at a multiple of the read size (1024)?

-- David
-- http://mail.python.org/mailman/listinfo/python-list
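[One way to avoid that - a Unix-flavored, untested sketch - is to work at the descriptor level and only read when select() reports data available, since os.read() returns whatever is on hand rather than waiting for a full buffer:

    import os
    import select

    def nonBlockingReadAll(fd):
        chunks = []
        while True:
            ready, _, _ = select.select([fd], [], [], 0)
            if not ready:
                break                  # nothing more available right now
            data = os.read(fd, 1024)   # returns whatever is available, may be < 1024
            if not data:
                break                  # EOF
            chunks.append(data)
        return ''.join(chunks)
]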
Re: Time zones
Timothy Hume <[EMAIL PROTECTED]> writes: > I want to ensure that all my time calculations are done in UTC. This is > easy with Python on UNIX machines. I simply set the TZ environment > variable to "UTC", and it ensures that the time functions use UTC. > > My question is, how do I get similar functionality using Python on > Windows? If you're just talking about time module functions, I think you should be able to do the same thing. The time functions are overlays for the C library versions which do use tzset() and look at the TZ environment variable as under Unix. > py24 Python 2.4 (#60, Nov 30 2004, 11:49:19) [MSC v.1310 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import time >>> print time.tzname ('Eastern Standard Time', 'Eastern Daylight Time') >>> print time.ctime() Wed Dec 01 17:42:08 2004 > TZ=UTC py24 Python 2.4 (#60, Nov 30 2004, 11:49:19) [MSC v.1310 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import time >>> print time.tzname ('UTC', '') >>> print time.ctime() Wed Dec 01 22:42:21 2004 (I've used 2.4 here, but the same results work back until 1.5.2) -- David -- http://mail.python.org/mailman/listinfo/python-list
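[Another way to sidestep TZ entirely is to ask for UTC explicitly, which behaves the same on Windows and Unix:

    import calendar
    import time

    utc_now = time.gmtime()                              # struct_time in UTC
    stamp = time.strftime('%Y-%m-%d %H:%M:%S', utc_now)
    seconds = calendar.timegm(utc_now)                   # inverse of gmtime
]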
Re: installing 2.4
"Jive" <[EMAIL PROTECTED]> writes: > It's only getting worse. I went to Add/remove programs and removed 2.4. > Now Python 2.4 numarray and Python 2.4 pywin extensions are still listed as > installed, but I cannot remove them. You mentioned in your first post about "copying your site package" ... did you actually make a copy or did you perhaps "move" your site-packages directory from beneath 2.3 to under 2.4. If so, then the uninstall entry in the registry is not going to find the files to be able to uninstall them. Worst case you should be able to reinstall Python 2.3, and your extension packages from their installer images. Don't worry about the uninstall list in Add/Remove programs as reinstalling the packages will just update their entries. That will refresh the files in your Python 2.3 tree, and providing you don't disable the option, should re-establish file associations and what not back to a 2.3 installation. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: pythonwin broke
Trent Mick <[EMAIL PROTECTED]> writes: > It is also possible that there is some little installer bug or detail > on your environment that is causing the problem. You could try > ActivePython. I regularly install and uninstall ActivePython 2.3 and > 2.4 installers and both installs are still working fine. Just as another data point, I have all of Python 1.5.2, 2.0.1, 2.1.3, 2.2.3, 2.3.4 and 2.4 installed side by side on my Windows box, as installed by their standard installers, without any problems. And that includes uninstall/reinstall cycles for patch releases of versions older than the most recent (e.g., putting on 2.2.3 after a 2.3 variant was already installed). The only real restriction is as you noted - only one can own the file associations (or be associated with the COM support for pywin32). In case it matters, I do install everything as administrator for all users and this is under 2K (my NT box has everything but 2.4). -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: New versions breaking extensions, etc.
"Martin v. Löwis" <[EMAIL PROTECTED]> writes: > Can you elaborate? To me, that problem only originates from > the OS lack of support for deleting open files. If you could > delete a shared libary that is still in use (as you can on > Unix), the put the new version of the DLL in the place, (...) Note that at least on NT-based systems, you can at least rename the existing file out of the way while it is in use in order to install a new version. I do think however, that until more recent changes (not sure whether in 2K versus XP) in how DLL searches work (e.g., permitting local versions), even with that operation, if a named DLL was available loaded into memory, it would be used by a subsequent process attempting to reference it regardless of the state of the filesystem. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Winge IDE Issue - an suggestions?
Mike Thompson writes: (...) > WingIDE bug seemed the only explanation, although it was puzzling me > that something so obvious could make it through their QA. Thanks again. I haven't used ElementTree, but if it includes an extension module (likely for performance), it's important to realize that WingIDE's debugger specifically catches exceptions that occur on the "far" side of an extension module. So even if that extension module would normally suppress the exception, thus hiding it from the original Python code that called the extension module, Wing's debugger will stop at it, which is different than normal runtime. This can actually be very helpful, since for example, debuggers that can't do this can't stop on exceptions in cases such as wxPython event handlers, since they occur from within a C extension too. Wing has a bunch of default locations that it ignores (that would otherwise trigger via normal standard library calls), but for your own applications or libraries, you need to teach it a bit, by asking it to ignore locations you know not to be relevent to your code. Once you mark such a location, it is remembered in your project so it won't bother you again. This was discussed a bit more in depth recently in the "False Exceptions" thread on this group. See: http://groups-beta.google.com/group/comp.lang.python/browse_frm/thread/f996d6554334e350/e581bea434d3d248 -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Socket being garbage collected too early
Scott Robinson <[EMAIL PROTECTED]> writes: > I have been having trouble with the garbage collector and sockets. Are you actually getting errors or is this just theoretical? > Unfortunately, google keeps telling me that the problem is the garbage > collector ignoring dead (closed?) sockets instead of removing live > ones. My problem is > > > x.sock=socket.socket(socket.AF_INET,socket.SOCK_STREAM) > do_stuff(x.sock) > > > def do_stuff(sock): > sock_list.append(sock) > > once do_stuff finishes, x.sock disappears, and I can only believe it > is being garbage collected. Can you clarify this? What do you mean by "x.sock" disappears? Are you getting a NameError later when trying to use "x.sock"? x.sock is just a name binding, so it is not really involved in garbage collection (GC applies to the objects to which names are bound). In this case, you need to include much more in the way of code (a fully running, but smallest possible, snippet of code would be best), since the above can be interpreted many ways. At the least, it's very important to include information about the namespace within which those two code snippets run if anyone is likely to be able to give you a good answer. Also, being very precise about the error condition you are experiencing (including actual error messages, tracebacks, etc...) is crucial. Is 'x' referencing a local or global object, and does that socket code occur within a method, a function, or what? Also, in do_stuff, where is sock_list defined? Is it local, global? If, as written, sock_list is a local name to do_stuff, then that binding is going to disappear when do_stuff completes, thus, the list to which it is bound will be destroyed, including all references to objects that the list may contain. So at that point, when you return from do_stuff, the only reference to the socket object will be in x.sock. But if 'x' is also local to the function/method where the call to do_stuff is, the name binding will be removed when the function/method returns, at which point there will be no references to the socket object, and yes, it will be destroyed. But if sock_list is global, and continues to exist when do_stuff completes, then the reference it contains to the socket will keep the socket object alive even if you remove the x.sock binding. -- David -- http://mail.python.org/mailman/listinfo/python-list
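[To make the reference-counting point concrete, a small self-contained illustration with invented names:

    import socket

    sock_list = []                      # module-level: survives function calls

    def do_stuff(sock):
        sock_list.append(sock)          # this reference keeps the socket alive

    class Holder:
        pass

    def connect():
        x = Holder()
        x.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        do_stuff(x.sock)
        # x (and the x.sock binding) go away when we return, but the socket
        # object itself lives on through sock_list.  With a *local* sock_list
        # it would be collected here instead.

    connect()
    print len(sock_list)                # -> 1; the socket was not collected
]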
Re: A completely silly question
Mike Meyer <[EMAIL PROTECTED]> writes:

> Steven Bethard <[EMAIL PROTECTED]> writes:
>
> > Amir Dekel wrote:
> >> What I need from the program is to wait for a single character
> >> input, something like while(getchar()) in C. All those Python
> >> modules don't make much sence to me...
> >
> > sys.stdin.read(1)
>
> That doesn't do what he wants, because it doesn't return until you hit
> a newline.

Well, but that's true as well for getchar() (at least in many cases of interactive input and line buffering), so in that respect I do think it's a fairly direct replacement, depending on how the OP was going to use getchar() in the application.

For example, compare:

    >>> import sys
    >>> while 1:
    ...     c = sys.stdin.read(1)
    ...     print ord(c),
    ...

with:

    #include <stdio.h>

    main()
    {
        while (1) {
            int ch = getchar();
            printf("%d ", ch);
        }
    }

When run, both produce (at least for me):

    0123456789          (hit Enter here)
    48 49 50 51 52 53 54 55 56 57 10

under both Unix (at least FreeBSD/Linux in my quick tests) and Windows (whether MSVC or Cygwin/gcc). (I don't include any output buffer flushing, since it shouldn't be needed on an interactive terminal, but you could add that to ensure that it isn't the output part that is being buffered - I did try it just to be sure on the Unix side)

> The answer is system dependent. Or you can use massive overkill and
> get curses, but if you're on windows you'll have to use a third party
> curses package, and maybe wrap it

If you want to guarantee you'll get the next console character without any waiting under Windows there's an msvcrt module that contains functions like kbhit() and getch[e] that would probably serve.

-- David
-- http://mail.python.org/mailman/listinfo/python-list
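[For the Windows case, that looks roughly like the following untested sketch, with 'q' chosen arbitrarily as a quit key:

    import msvcrt

    while True:
        ch = msvcrt.getch()    # returns as soon as a key is pressed, no Enter needed
        print ord(ch),
        if ch == 'q':
            break
]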
Re: uptime for Win XP?
Andrey Ivanov <[EMAIL PROTECTED]> writes: (...) > Writting this script was harder than I initially thought due to > a lack of documentation for win32all. And I still don't know what > that bizzare_int value stands for (an error/status code?). The pywin32 documentation tends not to duplicate information already available via MSDN (whether in a local installation or at msdn.microsoft.com) on the underlying Win32 API, so when in doubt, that's where to look. Then, the pywin32 documentation will sometimes qualify how the Python interface maps that function. But in particular, a general rule (as has already been posted) is that any out parameters are aggregated along with the overall result code into a result tuple. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: threading priority
Peter Hansen <[EMAIL PROTECTED]> writes: > [EMAIL PROTECTED] wrote: > > I googled as suggested, and the answer isn't crystal clear. My > > impression is that the problem is that a python thread must acquire the > > GIL in order to execute, and the strategy for deciding which thread > > should get the GIL when multiple threads are waiting for it is not > > based on priority. Is that correct? > > That's basically correct. I don't actually know what > the strategy is, though I suspect it's either not > formally documented or explicitly not defined, though > for a given platform there may be some non-arbitrary > pattern... > (...) I expect the Python interpreter has little to say over thread prioritization and choice of execution, although it does impose some granularity on the rate of switching. The GIL itself is implemented on the lower layer lock implementation, which is taken from the native threading implementation for the platform. Therefore, when multiple Python threads are waiting for the GIL, which one is going to get released will depend on when the underlying OS satisfies the lock request from the threads, which should be based on the OS thread scheduling system and have nothing to do with Python per-se. I do believe you are correct in that the Python GIL prevents thread pre-emption by the OS (because all other Python threads are waiting on the GIL and not in a running state), but the actual act of switching threads at a switching point (sys.setcheckinterval()) would be an OS only decision, and subject to whatever standard platform thread scheduling rules were in place. So if you were to use a platform specific method to control thread priority, that method should be honored by the Python threads (subject to the granularity of the system check interval for context switches). 
For example, here's a Windows approach that fiddles with the thread priority:

- - - - - - - - - - - - - - - - - - - - - - - - -

    import threading
    import ctypes
    import time

    w32 = ctypes.windll.kernel32

    THREAD_SET_INFORMATION = 0x20
    THREAD_PRIORITY_ABOVE_NORMAL = 1

    class DummyThread(threading.Thread):

        def __init__(self, begin, name, iterations):
            threading.Thread.__init__(self)
            self.begin = begin
            self.tid = None
            self.iterations = iterations
            self.setName(name)

        def setPriority(self, priority):
            if not self.isAlive():
                print 'Unable to set priority of stopped thread'
            handle = w32.OpenThread(THREAD_SET_INFORMATION, False, self.tid)
            result = w32.SetThreadPriority(handle, priority)
            w32.CloseHandle(handle)
            if not result:
                print 'Failed to set priority of thread', w32.GetLastError()

        def run(self):
            self.tid = w32.GetCurrentThreadId()
            name = self.getName()
            self.begin.wait()
            while self.iterations:
                print name, 'running'
                start = time.time()
                while time.time() - start < 1:
                    pass
                self.iterations -= 1

    if __name__ == "__main__":
        start = threading.Event()
        normal = DummyThread(start, 'normal', 10)
        high = DummyThread(start, 'high', 10)

        normal.start()
        high.start()

        # XXX - This line adjusts priority - XXX
        high.setPriority(THREAD_PRIORITY_ABOVE_NORMAL)

        # Trigger thread execution
        start.set()

- - - - - - - - - - - - - - - - - - - - - - - - -

And the results of running this with and without the setPriority call:

    Without:             With:
    normal running       high running
    high running         high running
    normal running       high running
    high running         high running
    normal running       normal running
    high running         high running
    normal running       high running
    high running         high running
    normal running       high running
    high running         normal running
    normal running       high running
    high running         high running
    normal running       normal running
    high running         normal running
    normal running       normal running
    high running         normal running
    normal running       normal running
    high running         normal running
    normal running       normal running
    high running         normal running

I'm not entirely positive why the normal thread gets occasionally executed before the high thread is done. It might be that the interpreter is actually releasing the GIL in the code I've written for the thread's run() (maybe during the I/O) which opens up an opportunity, or it may be that Windows is boosting the other thread occasionally to avoid starvation. So I expect the normal thread is getting occasional bursts of bytecode execution (the syscheckinterval). But clearly the OS level prioritizati
Re: A completely silly question
"Fredrik Lundh" <[EMAIL PROTECTED]> writes: > >> Well, but that's true as well for getchar() (at least in many cases of > >> interactive input and line buffering), so in that respect I do think > >> it's a fairly direct replacement, depending on how the OP was going to > >> use getchar() in the application. > > > > The OP said "wait for a single character input". sys.stdin.read(1) > > waits for a newline. > > in the same same sentence, the OP also said that he wanted something like > C's getchar(). if you guys are going to read only parts of the original post, > you could at least try to read an entire sentence, before you start arguing... Not even sure what's there to argue about - getchar() does do single character input, so the OPs (full) original sentence seems plausible to me, and his example was using it in a while loop which I took to represent processing some input one character at a time. In any event - I also gave a way (Windows-specific) to truly obtain the single next character without any buffering, so just ignore any controversy in the first part of the response if desired. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: where's "import" in the C sources?
Torsten Mohr <[EMAIL PROTECTED]> writes:

> i tried to find the file and line in the C sources of python
> where the command "import" is implemented. Can anybody give
> me some hint on this?

Well, there are several levels, depending on what you are looking for.

The literal "import" syntax in a source module is translated by the Python compiler to various IMPORT_* bytecodes, which are processed in the main interpreter loop (see ceval.c). They all basically bubble down to making use of the builtin __import__ method, which is obtained from the builtin module defined in bltinmodule.c. That in turn makes use of the import processing module whose code can be found in import.c - which is the same source that also implements the "imp" module to provide lower layer access to the import internals.

Now, when it comes to physically loading in a module, Python source and compiled modules are handled by import (well, not the compiling part), but dynamically loaded extension modules are OS specific. You can find the handling of such extension modules in OS-specific source files dynload_*.c (e.g., dynload_win.c for Windows).

All of these files can be found in the dist/src/Python directory in the Python source tree.

-- David
-- http://mail.python.org/mailman/listinfo/python-list
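[Seen from the Python side, those layers stack up roughly like this - illustrative only, with textwrap chosen as an arbitrary pure-Python module:

    # The statement "import textwrap" compiles to bytecode that ends up calling:
    textwrap = __import__('textwrap')

    # The machinery in import.c is also exposed directly through the imp module:
    import imp
    fobj, path, description = imp.find_module('textwrap')
    textwrap = imp.load_module('textwrap', fobj, path, description)
    if fobj:
        fobj.close()
]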
Re: Problem in threading
"It's me" <[EMAIL PROTECTED]> writes: > It depends on what "help" means to you. Both Windows and Unix (and it's > variances) are considered "thread-weak" OSes. So, using thread will come > with some cost. The long gone IBM OS/2 is a classic example of a > "thread-strong" OS. (...) Interesting - can you clarify what you perceive as the differences between a thread-weak and thread-strong OS? If given the choice, I would probably refer to Windows (at least NT based systems, let's ignore 9x) as thread-strong, and yes, often think of Windows as preferring thread based solutions, while Unix would often prefer process based. Windows is far more efficient at handling large numbers of threads than it is processes, with much less overhead and there is lots of flexibility in terms of managing threads and their resources. Threads are first class OS objects at the kernel and scheduler level (waitable and manageable). I can't think of anything offhand specific that OS/2 did with respect to threads that isn't as well supported by current Win32 systems. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Is there a better way of listing Windows shares other than using "os.listdir"
[EMAIL PROTECTED] writes:

> I'm currently using "os.listdir" to obtain the contents of some slow Windows
> shares. I think I've seen another way of doing this using the win32 library
> but I can't find the example anymore.

Do you want the list of files on the shares or the list of shares itself? If the files, you can use something like FindFiles, but I don't expect it to be that much faster just to obtain directory names (likely the overhead is on the network). If you just want a list of shares, you could use NetUseEnum, which should be pretty speedy. (FindFiles is wrapped by win32api, and NetUseEnum by win32net, both parts of the pywin32 package)

Here's a short example of displaying equivalent output to the "net use" command:

- - - - - - - - - - - - - - - - - - - - - - - - -

    import win32net

    status = {0 : 'Ok',
              1 : 'Paused',
              2 : 'Disconnected',
              3 : 'Network Error',
              4 : 'Connected',
              5 : 'Reconnected'}

    resume = 0
    while 1:
        (results, total, resume) = win32net.NetUseEnum(None, 1, resume)
        for use in results:
            print '%-15s %-5s %s' % (status.get(use['status'], 'Unknown'),
                                     use['local'],
                                     use['remote'])
        if not resume:
            break

- - - - - - - - - - - - - - - - - - - - - - - - -

Details on the arguments to NetUseEnum can be found in MSDN (with any pywin32 specifics in the pywin32 documentation).

> My main problem with using "os.listdir" is that it hangs my gui application.
> The tread running the "os.listdir" appears to block all other threads when
> it calls this function.

Yes, for a GUI you need to keep your main GUI thread always responsive (e.g., don't do any blocking operations). There are a number of alternatives to handling a long processing task in a GUI application, dependent on both the operation and toolkit in use. For wxPython, http://wiki.wxpython.org/index.cgi/LongRunningTasks covers several of the options (and the theory behind them is generally portable to other toolkits although implementation will change).

-- David
-- http://mail.python.org/mailman/listinfo/python-list
Re: Securing a future for anonymous functions in Python
Ian Bicking <[EMAIL PROTECTED]> writes: > The one motivation I can see for function expressions is > callback-oriented programming, like: > >get_web_page(url, > when_retrieved={page | >give_page_to_other_object(munge_page(page))})

This is my primary use case for lambdas nowadays as well - typically just to provide a way to convert the input to a callback into a call to some other routine. I do a lot of Twisted stuff, whose deferred objects make heavy use of single parameter callbacks, and often you just want to call the next method in sequence, with some minor change to (or ignoring) the last result.

So for example, an asynchronous sequence of operations might be like:

    d = some_deferred_function()
    d.addCallback(lambda x: next_function())
    d.addCallback(lambda blah: third_function(otherargs, blah))
    d.addCallback(lambda x: last_function())

which to me is more readable (in terms of seeing the sequence of operations being performed in their proper order) than something like:

    def cb_next(x):
        return next_function()

    def cb_third(blah, otherargs):
        return third_function(otherargs, blah)

    def cb_last(x):
        return last_function()

    d = some_deferred_function()
    d.addCallback(cb_next)
    d.addCallback(cb_third, otherargs)
    d.addCallback(cb_last)

which has an extra layer of naming (the callback functions), and requires more effort to follow the flow of what is really just a simple sequence of three functions being called.

> I think this specific use case -- defining callbacks -- should be > addressed, rather than proposing a solution to something that isn't > necessary. (...)

I'd be interested in this approach too, especially if it made it simpler to handle simple manipulation of callback arguments (e.g., since I often ignore a successful prior result in a callback in order to just move on to the next function in sequence).

-- David -- http://mail.python.org/mailman/listinfo/python-list
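One way to trim the boilerplate for the "ignore the prior result" case is a tiny wrapper of your own (just a sketch, not a Twisted API):

    def ignoring_result(func, *args, **kwargs):
        # Adapt func so it can be used as a callback that discards the
        # result value passed in by the deferred.
        def callback(_ignored):
            return func(*args, **kwargs)
        return callback

    # d.addCallback(ignoring_result(next_function))
    # d.addCallback(ignoring_result(last_function))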
Re: Event-Driven Woes: making wxPython and Twisted work together
Daniel Bickett <[EMAIL PROTECTED]> writes: > My initial solution was, naturally, the wxPython support inside of the > twisted framework. However, it has been documented by the author that > the support is unstable at this time, and should not be used in > full-scale applications.

Rather than the wx reactor, there's an alternate recipe that just cranks the twisted event loop from within a timer at the wx level that we've used very successfully. It does have some caveats (such as a potentially higher latency in servicing the network based on your timer interval), but so far for our applications it hasn't been an issue at all, so it might be something to try. The code was based on http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/181780

The advantage of this approach is that no threads are necessary, and thus there's no problem issuing wxPython calls from within twisted callbacks or twisted calls from within wxPython event handlers.

Just include the following bits into your application object (note that the use of "installSignalHandlers" might not be needed on all systems):

    class MyApp(wx.wxApp):
        (...)
        def OnInit(self):
            # Twisted Reactor code
            reactor.startRunning(installSignalHandlers=0)
            wx.EVT_TIMER(self, 99, self.OnTimer)
            self.timer = wx.wxTimer(self, 99)
            self.timer.Start(150, False)
            (...)

        def OnTimer(self, event):
            reactor.runUntilCurrent()
            reactor.doIteration(0)

        def __del__(self):
            self.timer.Stop()
            reactor.stop()
            wx.wxApp.__del__(self)

and you can try adjusting the timer interval for the best mix of CPU load versus latency.

-- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Queue.Queue-like class without the busy-wait
"Paul L. Du Bois" <[EMAIL PROTECTED]> writes: > Has anyone written a Queue.Queue replacement that avoids busy-waiting? > It doesn't matter if it uses os-specific APIs (eg > WaitForMultipleObjects). I did some googling around and haven't found > anything so far. This isn't a Queue.Queue replacement, but it implements a buffer intended for inter-thread transmission, so it could be adjusted to mimic Queue semantics fairly easily. In fact, internally it actually keeps write chunks in a list until read for better performance, so just removing the coalesce process would be the first step. It was written specifically to minimize latency (which is a significant issue with the polling loop in the normal Python Queue implementation) and CPU usage in support of a higher level Win32-specific serial I/O class, so it uses Win32 events to handle the signaling for the key events when waiting. The fundamental issue with the native Python lock is that to be minimalistic in what it requires from each OS, it doesn't impose a model of being able to wait on an event signal - that's the key thing you need to have (a timed blocking wait on some signalable construct) to be most efficient for these operations - which is what I use the Win32 Event for. -- David - - - - - - - - - - - - - - - - - - - - - - - - - import thread import win32event as we class Buffer: """A thread safe unidirectional data buffer used to represent data traveling to or from the application and serial port handling threads. This class is used as an underlying implementation mechanism by SerialIO. Application code should not typically need to access this directly, but can handle I/O through SerialIO. Note that we use Windows event objects rather than Python's because Python's OS-independent versions are not very efficient with timed waits, imposing internal latencies and CPU usage due to looping around a basic non-blocking construct. We also use the lower layer thread lock rather than threading's to minimize overhead. """ def __init__(self, notify=None): self.lock = thread.allocate_lock() self.has_data = we.CreateEvent(None,1,0,None) self.clear() self.notify = notify def _coalesce(self): if self.buflist: self.buffer += ''.join(self.buflist) self.buflist = [] def __len__(self): self.lock.acquire() self._coalesce() result = len(self.buffer) self.lock.release() return result def clear(self): self.lock.acquire() self.buffer = '' self.buflist = [] self.lock.release() def get(self, size=0, timeout=None): """Retrieve data from the buffer, up to 'size' bytes (unlimited if 0), but potentially less based on what is available. 
If no data is currently available, it will wait up to 'timeout' seconds (forever if None, no blocking if 0) for some data to arrive""" self.lock.acquire() self._coalesce() if not self.buffer: # Nothing buffered, wait until something shows up (timeout # rules match that of threading.Event) self.lock.release() if timeout is None: win_timeout = we.INFINITE else: win_timeout = int(timeout * 1000) rc = we.WaitForSingleObject(self.has_data, win_timeout) self.lock.acquire() self._coalesce() if not size: size = len(self.buffer) result_len = min(size,len(self.buffer)) result = self.buffer[:result_len] self.buffer = self.buffer[result_len:] we.ResetEvent(self.has_data) self.lock.release() return result def put_back(self,data): self.lock.acquire() self.buffer = data + self.buffer self.lock.release() we.SetEvent(self.has_data) if self.notify: self.notify() def put(self, data): self.lock.acquire() self.buflist.append(data) self.lock.release() we.SetEvent(self.has_data) if self.notify: self.notify() -- http://mail.python.org/mailman/listinfo/python-list
Re: email: Content-Disposition and linebreaks with long filenames
Martin Körner <[EMAIL PROTECTED]> writes: > I am using email module for creating mails with attachment (and then > sending via smtplib). > > If the name of the attachment file is longer than about 60 characters > the filename is wrapped in the Content-Disposition header: > > Content-Disposition: attachment; > filename="This is a sample file with a very long filename > 0123456789.zip" > > > This leads to a wrong attachment filename in email clients - the space > after "filename" is not shown or the client displays a special > character (the linbreak or tab before 0123456789.zip).

Yes, it would appear that the default Generator used by the Message object to create a textual version of a message explicitly uses tab (\t) as a continuation character rather than space - probably because it looks a little nicer when printed. Interestingly enough, the default Header continuation character is just a plain space which would work fine here.

I should point out that I believe this header format could be considered correct, although I find RFC2822 a bit ambiguous on this point. It talks about runs of FWS (folding white space) in an unfolding operation as being considered a single space (section 3.2.3). However, I suppose someone might argue if "runs" includes a single character. I think it should, but obviously some e-mail clients disagree :-) (...)

> Is it possible to prevent the linebreak?

Should be - two approaches I can think of (msg below is the email.Message):

1) Create your own Header object for the specific header line rather than just storing it as a string via add_header. For that specific header you can then override the default maximum line length. Something like:

    from email.Header import Header

    cd_header = Header('attachment; filename="."', maxlinelen=998)
    msg['Content-Disposition'] = cd_header

Note that because Header defaults to a space continuation character, you could also leave maxlinelen alone and let it break the line, but since it would break with a single space it would work right in clients.

2) Use your own Generator object to generate the textual version of the message (which is when the wrapping is occurring), and during the flattening process, disable (or set a longer value for) header wrapping. Something like, assuming "fp" is an output File-like object:

    from email.Generator import Generator

    g = Generator(fp, maxheaderlen=998)
    g.flatten(msg)

(or maxheaderlen=0 to disable wrapping)

-- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Python FTP timeout value not effective
John Nagle writes: > Here's the relevant code:
>
> TIMEOUTSECS = 60    ## give up waiting for server after 60 seconds
> ...
> def urlopen(url,timeout=TIMEOUTSECS) :
>     if url.endswith(".gz") :    # gzipped file, must decompress first
>         nd = urllib2.urlopen(url,timeout=timeout)    # get connection
>         ...    # (NOT .gz FILE, DOESN'T TAKE THIS PATH)
>     else :
>         return(urllib2.urlopen(url,timeout=timeout))    # (OPEN FAILS)
>
> TIMEOUTSECS used to be 20 seconds, and I increased it to 60. It didn't > help.

I apologize if it's an obvious question, but is there any possibility that the default value to urlopen is not being used, but some other timeout is being supplied? Or that somehow TIMEOUTSECS is being redefined before being used by the urlopen definition? Can you (or have you) verified the actual timeout parameter value being supplied to urllib2.urlopen?

The fact that you seem to still be timing out very close to the prior 20s timeout value seems a little curious, since there's no timeout by default (barring an application level global socket default), so it feels like a value being supplied.

Not sure which 2.7 you're using, but I tried the below with both 2.7.3 and 2.7.5 on Linux since they were handy, and the timeout parameter seems to be working properly at least in a case I can simulate (xxx is a firewalled host so the connection attempt just gets black-holed until the timeout):

    >>> import time, urllib2
    >>> def test(timeout):
    ...     print time.ctime()
    ...     try:
    ...         urllib2.urlopen('ftp://xxx', timeout=timeout)
    ...     except:
    ...         print 'Error'
    ...     print time.ctime()
    ...
    >>> test(5)
    Mon Sep 2 17:36:15 2013
    Error
    Mon Sep 2 17:36:20 2013
    >>> test(20)
    Mon Sep 2 17:36:23 2013
    Error
    Mon Sep 2 17:36:44 2013
    >>> test(60)
    Mon Sep 2 17:36:50 2013
    Error
    Mon Sep 2 17:37:50 2013

It's tougher to simulate a host that artificially delays the connection attempt but then succeeds, so maybe it's an issue related specifically to that implementation. Depending on how the delay is implemented (delaying SYN response versus accepting the connection but just delaying the welcome banner, for example), I suppose it may be tickling some very specific bug. Since all communication essentially boils down to I/O over the socket, it seems to me likely that those cases should still fail over time periods related to the timeout supplied, unlike your actual results, which makes me wonder about the actual urllib2.urlopen timeout parameter.

-- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Linux users: please run gui tests
Terry Reedy writes: > and report here python version, linux system, and result. > Alteration of environment and locale is a known issue, skip that.

Using source builds on my slave (bolen-ubuntu):

    Linux buildbot-ubuntu 4.1.0-x86_64-linode59 #1 SMP Mon Jun 22 10:39:23 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux

NOTE: This is a 32-bit userspace system, just with a 64-bit kernel

    Python 3.6.0a0 (default:e56893df8e76, Aug 7 2015, 16:36:30) [GCC 4.8.4] on linux
    [1/3] test_tk
    [2/3] test_ttk_guionly
    [3/3] test_idle
    All 3 tests OK.

    Python 3.5.0b4+ (3.5:b9a0165a3de8, Aug 7 2015, 16:21:51) [GCC 4.8.4] on linux
    [1/3] test_tk
    [2/3] test_ttk_guionly
    [3/3] test_idle
    All 3 tests OK.

    Python 3.4.3+ (3.4:f5069e6e4229, Aug 7 2015, 16:38:53) [GCC 4.8.4] on linux
    [1/3] test_tk
    [2/3] test_ttk_guionly
    [3/3] test_idle
    All 3 tests OK.

I have also adjusted the slave to run under Xvfb so the tests should be included going forward.

-- David -- https://mail.python.org/mailman/listinfo/python-list
Re: How properly manage memory of this PyObject* array?? (C extension)
"[EMAIL PROTECTED]" <[EMAIL PROTECTED]> writes: > > *WRONG*. The object exists in and of itself. There may be one *or more* > > references to it, via pointers, scattered about in memory; they are > > *NOT* components of the object. A reference count is maintained inside > > the object and manipulated by Py_INCREF etc. The Python garbage > > collector knows *nothing* about the memory occupied by those pointers; > > John - Thanks again, > I think I just learned something that is important. If I free memory > associated > with pointers to Python objects it will NOT erase the Python objects!!! > I can merrily follow your orders to free(my_array); without worrying > about > nuking the Python objects too early! Thanks for pointing out that > reference count is maintained inside the Python object itself. This may be clear from the thread, but since you don't mention it, you must also have issued the Py_DECREF for each object reference in my_array at the point that you're planning on freeing it. Otherwise, while you will have freed up your own local memory allocation and no longer make use of the object references (pointers) previously there, the Python memory manager doesn't know that - it still has a ref count for your prior references - and the objects themselves will never be freed. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: py2exe and library.zip
Peter Hansen <[EMAIL PROTECTED]> writes: > Do you know that Subversion has (as I understand it) a fairly > intelligent binary file comparison routine, and it will (again, as I > understand it) not transmit the entire contents of the zip file but > would actually send only the portions that have changed? At least, > that's if the file isn't compressed in some way that prevents this > algorithm from working well. (Note to self: check if zip files that > can be in sys.path can be compressed, and if py2exe compresses them.) Even if the files were compressed, which has a net result similar to randomizing the contents and will certainly extend the portion that appears "changed", the worst that would happen is that subversion (which does use a binary delta algorithm) would end up downloading the single file portion of the zip file rather than the smaller change within the file. It should still be efficient. But to be honest, for something like the OPs purpose, it's not clear that an SCM is needed, since all he's trying to accomplish is bring a remote copy up to date with the central one. For that you could just publish a location containing the necessary files and have the users use something like rsync directly (which is just as efficient in terms of a binary delta) to update their own local version. Of course, if the Subversion server is already in place so it's a convenient server, or if more of the user base already has the client in place, it should work just about as well. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: py2exe and library.zip
Peter Hansen <[EMAIL PROTECTED]> writes: > Good point. When I wrote that I was picturing the form of compression > that a .tar.gz file would have, not what is actually used inside a > .zip file which is -- quite logically now that you point it out -- > done on a file-by-file basis. (Clearly to do otherwise would risk > your data and make changing compressed zips highly inefficient.)

Right, and yes, .tar.gz files are very problematic for such algorithms as rsync. In fact, there was a patch made available for gzip (never made it into the actual package I believe) that permitted resetting the compression engine at selected block boundaries - thus effectively bounding the "noise" generated by a single change. The output would grow a bit since resetting the engine dropped overall efficiency, but you got a tremendous gain back in terms of "rsyncability" of the file.

-- David -- http://mail.python.org/mailman/listinfo/python-list
Re: py2exe and library.zip
Timothy Smith <[EMAIL PROTECTED]> writes: > I've got this working now, and fyi it downloads the entire zip every > time. and svn appears to be very slow at it to. Hmm, not what I would have expected, and certainly unfortunate for your desired use case. I just tried some experiments with rsync (easier to test locally than Subversion), and found that taking an existing zip, unpacking it and then repacking it with some rearrangement was in fact sending everything, even though the source files were unchanged. Since py2exe is effectively rebuilding that library.zip on each run, that probably is a fair representation of the generation process. I'm not familiar enough with zip file compression, but perhaps it includes the use of something that is file specific to seed the compression engine, which would mean that making a new zip file even with the same files in it might not yield precisely the same internal compressed storage. Both versions would be proper and decompressible, just not binary identical even for unchanged sources. If I disabled compression for the zip files (just did a store only), and rebuilt the zip even with a rearranged file order, rsync was able to detect just the changes. So you might want to try ensuring that your py2exe generated file is not compressing the individual modules (a verbose zip listing of the library.zip should just show that they were "Stored"). Your library.zip will get larger, but it should become more efficient to transfer - hopefully as well with Subversion as I was seeing with rsync. (In fact, I remember doing just something like this with a project of mine that I was using py2exe with, and then using rsync to push out the resultant files to remote sites - I had originally compressed the library.zip but rsync was pushing the whole thing out, so I stopped using the compression) -- David -- http://mail.python.org/mailman/listinfo/python-list
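A quick way (just a sketch) to verify what py2exe produced is to list the entries in library.zip and check whether they are stored or deflated - I believe py2exe also has a "compressed" option in its setup configuration that controls this, so that would be the place to change it:

    import zipfile

    zf = zipfile.ZipFile('library.zip')
    for zinfo in zf.infolist():
        if zinfo.compress_type == zipfile.ZIP_STORED:
            kind = 'Stored'
        else:
            kind = 'Deflated'
        print '%-8s %s' % (kind, zinfo.filename)
    zf.close()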
Re: py2exe + svn - the final drama
Timothy Smith <[EMAIL PROTECTED]> writes: > Timothy Smith wrote: (...) > >zipimport.ZipImportError: bad local file header in Z:\temp\library.zip > > > > not that once i have finished client.update(''), it has successfully > > updated the zipfile, i open a dialoge box saying "click ok and > > restart program" AFTER i click on the above error pops up and my app > > shuts down as intended. > > > >ideas? > > > ok i have done some digging and i cound this
>
> /* Check to make sure the local file header is correct */
> fseek(fp, file_offset, 0);
> l = PyMarshal_ReadLongFromFile(fp);
> if (l != 0x04034B50) {
>     /* Bad: Local File Header */
>     PyErr_Format(ZipImportError,
>                  "bad local file header in %s",
>                  archive);
>     fclose(fp);
>
> can anyone explain to me about zip file headers and why it would be > different/incorrect and give me this error?

Are you perhaps trying to update the zip file in-place while it is still being used by the application? I'm not sure that's a safe operation.

A quick peek at the same module where I think you found the above code shows that when a zip importer instance is associated with a zip file, the directory for that zip file is read in and cached. So the importer is holding onto offset information for each file based on the contents of the zip directory at initialization time. If you then change the file contents (such as updating it with svn), those offsets will no longer be valid. I then expect that during your process exit, some bit of code is performing an extra import, which accesses the wrong (based on the new file contents) portion of the zip file, and the above safety check prevents it from loading an erroneous set of bytes thinking it's a valid module.

I expect you need to work on a mechanism to update the file independently of the running copy, and then arrange to have it moved into place for a subsequent execution. Or find some way to have the zip importer refresh its directory information or make a new importer instance once the zip file is updated.

One (untested) thought ... before the update, make a copy of your current library.zip as some other name, and adjust your sys.path to reference that name (rather than the default pointer to the main library.zip that py2exe initializes things with). That should force any future imports to access the old copy of the zip file and not the one that svn will be updating. Since you need to leave that zip file copy in place through the exit (to satisfy any trailing imports), arrange for your application to check for that copy on startup and remove it if present.

Or, after looking through import.c handling for zip file imports, there might be a simpler way. ZIP imports are handled by a zipimporter installed in sys.path_hooks, and once a specific path element has a path hook instantiated for it (based on the sys.path element name) it is cached in sys.path_importer_cache. So, simply clearing out the sys.path_importer_cache entry for your main library.zip file should cause the next import attempt to re-create a new zipimporter instance and thus re-open the file and re-load the directory information.

I don't know if py2exe installs the library.zip into sys.path just as "library.zip" or with some path information, but try checking out the keys in sys.path_importer_cache from your application when it is running. You should find an entry (probably the only one unless you explicitly augment sys.path yourself) for library.zip - clear out that key after the update and see how it works.
Heck, since the efficiency hit is likely not an issue in your case, just flush all of sys.path_importer_cache and don't even worry about the actual key name for library.zip. So a simple:

    sys.path_importer_cache.clear()

call after your update completes may do the trick.

-- David

PS: In the same way that updating the library.zip under the running application is tricky, you might run into issues if you end up trying to update one of the extension modules. svn might not be able to update it (depending on how it deals with "in use" files).

-- http://mail.python.org/mailman/listinfo/python-list
Re: py2exe + svn - the final drama
Just <[EMAIL PROTECTED]> writes: > the zipimport module has an attr called _zip_directory_cache, which is a > dict you can .clear(). Still, reloading modules is hairy at best, its > probably easiest to relaunch your app when the .zip file has changed. Except that he's getting an error during the process exit of the current execution, which is needed to restart. And if he updates to a different copy, there's the bootstrap problem of how to get it back into the standard location for the next restart since his application will need to have it to restart in the first place. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: py2exe + svn - the final drama
Timothy Smith <[EMAIL PROTECTED]> writes: > what i do is as soon as the update is complete i close the app, but it > still gives the error, i tried clear() after update and before it, it > still got the same error. it's be nice to not have to fiddle around > with the zip file, i really think making py2exe create a dir instead > of a zip will be much better Well, you'd still potentially have a problem if the update changed a file in that directory that hadn't been imported yet, but now depended on other updated files that your application had already loaded old versions for. That's a general problem of updating modules beneath the executing application, and not really specific to the zip file, although you're getting a zip importer specific error related to that in this case. > here what i do anyway > > if (os.name == 'nt') or (os.name == 'win32'): > client = pysvn.Client() > #get current revision number > CurrentRev = client.info('').revision.number > Check = client.update('') > sys.path_importer_cache.clear() > if Check.number > CurrentRev: > self.Popup('Update installed, click ok and restart > ','Update installed') > self.Destroy() > else: > InfoMsg.Update(3,'No Updates needed') Ah, it's more devious than I thought. Just pointed out the other missing piece in his response. Apparently there are two levels of caching that you've got to defeat if you change the underlying zip: 1. A global file set of file directory cache information for any opened zip file (for all files in the zip). This is held in the zipimport module global _zip_directory_cache. 2. Individual file cached information within the zipimporter instance that is kept in the path importer cache (sys.path_importer_cache). Technically these are just references to the same individual entries being held in the dictionary from (1). So when you cleared out (2), it still found the cached directory at the zipimport module level and re-used that information. But if you only clear out (1), then the reference in (2) to the directory entries for currently imported modules remains and still gets used. I tried testing this with a small zip file that I first built with normal compression on the entries, then imported one from a running interpreter, and then rebuilt the zip without compression. I couldn't seem to get the precise error you were getting, but doing this gave me a decompression error upon an attempted reload of an imported module, since the cached information still thought it was compressed. After clearing both sys.path_importer_cache and zipimport._zip_directory_cache, the reload went fine. It's sort of unfortunate that you have to cheat with the "private" cache clearing in this case. It might be worth an enhancement request to see if zipimport could know to update itself if the timestamp on the zip file changes, but this is sort of a very specialized scenario. Although maybe just a public way to cleanly flush import cache information would be useful. -- David -- http://mail.python.org/mailman/listinfo/python-list
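Putting the two flushes together, the post-update sequence would be something like this (a sketch; _zip_directory_cache is the private module attribute mentioned above, so this is the "cheating" part):

    import sys
    import zipimport

    # After client.update('') completes and before any further imports:
    sys.path_importer_cache.clear()
    zipimport._zip_directory_cache.clear()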
Re: Unable to read large files from zip
Nick Craig-Wood <[EMAIL PROTECTED]> writes: > Kevin Ar18 <[EMAIL PROTECTED]> wrote: >> >> I posted this on the forum, but nobody seems to know the solution: >> http://python-forum.org/py/viewtopic.php?t=5230 >> >> I have a zip file that is several GB in size, and one of the files inside >> of it is several GB in size. When it comes time to read the 5+GB file from >> inside the zip file, it fails with the following error: >> File "...\zipfile.py", line 491, in read bytes = >> self.fp.read(zinfo.compress_size) >> OverflowError: long it too large to convert to int > > That will be an number which is bigger than 2**31 == 2 GB which can't > be converted to an int. > > It would be explained if zinfo.compress_size is > 2GB, eg > > >>> f=open("z") > >>> f.read(2**31) > Traceback (most recent call last): > File "", line 1, in ? > OverflowError: long int too large to convert to int > > However it would seem nuts that zipfile is trying to read > 2GB into > memory at once!

Perhaps, but that's what the read(name) method does - returns a string containing the contents of the selected file. So I think this runs into a basic issue of the maximum length of Python strings (at least in 32bit builds, not sure about 64bit) as much as it does an issue with the zipfile module. Of course, the fact that the only "read" method zipfile has is to return the entire file as a string might be considered a design flaw.

For the OP, if you know you are going to be dealing with very large files, you might want to implement your own individual file extraction, since I'm guessing you don't actually need all 5+GB of the problematic file loaded into memory in a single I/O operation, particularly if you're just going to write it out again, which is what your original forum code was doing.

I'd probably suggest just using the getinfo(name) method to return the ZipInfo object for the file in question, then process the appropriate section of the zip file directly. E.g., just seek to the proper offset, then read the data incrementally up to the full size from the ZipInfo compress_size attribute. If the files are compressed, you can incrementally hand their data to the decompressor prior to other processing.

E.g., instead of your original:

    fileData = dataObj.read(i)
    fileHndl = file(fileName,"wb")
    fileHndl.write(fileData)
    fileHndl.close()

something like (untested):

    CHUNK = 65536    # I/O chunk size

    fileHndl = file(fileName,"wb")
    zinfo = dataObj.getinfo(i)
    compressed = (zinfo.compress_type == zipfile.ZIP_DEFLATED)
    if compressed:
        dc = zlib.decompressobj(-15)

    dataObj.fp.seek(zinfo.header_offset+30)
    remain = zinfo.compress_size
    while remain:
        bytes = dataObj.fp.read(min(remain, CHUNK))
        remain -= len(bytes)
        if compressed:
            bytes = dc.decompress(bytes)
        fileHndl.write(bytes)

    if compressed:
        bytes = dc.decompress('Z') + dc.flush()
        if bytes:
            fileHndl.write(bytes)

    fileHndl.close()

Note the above assumes you are only reading from the zip file as it doesn't maintain the current read() method invariant of leaving the file pointer position unchanged, but you could add that too. You could also verify the file CRC along the way if you wanted to.

Might be even better if you turned the above into a generator, perhaps as a new method on a local ZipFile subclass. Use the above as a read_gen method with the write() calls replaced with "yield bytes", and your outer code could look like:

    fileHndl = file(fileName,"wb")
    for bytes in dataObj.read_gen(i):
        fileHndl.write(bytes)
    fileHndl.close()

-- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Creating a multi-tier client/server application
Jeff <[EMAIL PROTECTED]> writes: > reasons, not the least of which is that I've been working almost > entirely on web apps for the past few years, and I am getting mighty > sick of it. A lot of that is due to the language (PHP, which I have > since grown to hate) I had to use. I've worked on a site for my self > in Python (using Pylons, actually--which is excellent) which was > vastly easier and more fun. But I'd really like to try something > different. To contribute a data point against your original question - I've a similar (structurally, not functionality) system I completed recently. Without trying to get too mired in the thick client v. web application debate, there were a handful of points that decided me in favor of the thick client: * Needed to automate QuickTime viewer for video previews and extraction of selected frames to serve as thumbnails on web approval page. * Needed to control transfers to server of multiple very large files (hundreds of MBs to GBs at a shot) But assuming a thick client, in terms of your original question of components to use, here's what I've got. My primary networking component is Twisted. The pieces are: Client (OS X Cocoa application): * PyObjC based. Twisted for networking, Twisted's PB for the primary management channel, with an independent direct network connections for bulk file transfers. (I needed to go Mac native for clean integration of QuickTime UI elements including frame extraction to thumbnails) Server: * Twisted for networking. PB and raw connections for clients, web server through twisted.web. Genshi for web templating, with Mochikit (might move to JQuery) for client-side JS/AJAX. Twisted for email transmission (email construction using normal Python email package). Small UI front-end module (Cocoa/PyObjC). The client accesses server-based objects through Twisted PB, which for some of the server objects also control session change lifetime (transactions). So at least in my case, having a stateful connection from the client worked out well, particularly since I needed to coordinate both database changes as well as filesystem changes through independent file uploads, each of which can fail independently. Right now a single server application contains all support for client connections as well as the web application, but I could fracture that (so the web server was independent for example) if needed. For the client, I package it using py2app, and put into an normal Mac installer, and distribute as a dmg. If it were a Windows client, I'd probably wrap with py2exe, then Inno Setup. The server's web server has a restricted URL that provides access to the DMG. The client has a Help menu item taking users to that URL. Clients are versioned and accepted/rejected by the server during initial connection - from the server side I can "retire" old client versions, at which point users get a message at signon with a button to take them to the download page for the latest DMG. So right now upgrades are executed manually by the user, and I can support older clients during any transition period. I may provide built-in support for automatically pulling down the new image and executing its installer, but haven't found it a hardship yet. I probably won't bother trying to automate smaller levels of updates. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Unzip: Memory Error
mcl <[EMAIL PROTECTED]> writes: > I am trying to unzip an 18mb zip containing just a single 200mb file > and I get a Memory Error. When I run the code on a smaller file 1mb > zip, 11mb file, it works fine. (...) > def unzip_file_into_dir(file, dir): > #os.mkdir(dir, 0777) > zfobj = zipfile.ZipFile(file) > for name in zfobj.namelist(): > if name.endswith('/'): > os.mkdir(os.path.join(dir, name)) > else: > outfile = open(os.path.join(dir, name), 'wb') > outfile.write(zfobj.read(name)) > outfile.close() The "zfobj.read(name)" call is reading the entire file out of the zip into a string in memory. It sounds like it's exceeding the resources you have available (whether overall or because the Apache runtime environment has stricter limits). You may want to peek at a recent message from me in the "Unable to read large files from zip" thread, as the suggestion there may also be suitable for your purposes. http://groups.google.com/group/comp.lang.python/msg/de04105c170fc805?dmode=source -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Unzip: Memory Error
mcl <[EMAIL PROTECTED]> writes: > pseudo code > > zfhdl = zopen(zip,filename) # Open File in Zip Archive for > Reading > > while True: > ln = zfhdl.readline()# Get nextline of file > if not ln: # if EOF file > break > dealwithline(ln) # do whatever is necessary with > file > zfhdl.close > > That is probably over simplified, and probably wrong but you may get > the idea of what I am trying to achieve. Do you have to process the file as a textual line-by-line file? Your original post showed code that just dumped the file to the filesystem. If you could back up one step further and describe the final operation you need to perform it might be helpful. If you are going to read the file data incrementally from the zip file (which is what my other post provided) you'll prevent the huge memory allocations and risk of running out of resource, but would have to implement your own line ending support if you then needed to process that data in a line-by-line mode. Not terribly hard, but more complicated than my prior sample which just returned raw data chunks. Depending on your application need, it may still be simpler to just perform an extraction of the file to temporary filesystem space (using my prior code for example) and then open it normally. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Creating a multi-tier client/server application
Jeff <[EMAIL PROTECTED]> writes: > David: Sounds like a pretty interesting app. Thanks for the in-depth > description. I went and checked out Twisted PB, and it seems > awesome. I may very well go with that. How was writing code with > it? I may also end up using py2app, but I'm also going to have to > support Windows, (p2exe, then), and possibly Linux. Well, maybe not > Linux, but I'll probably be doing most of the development in Linux, so > I guess that counts. I find PB very easy, but it's important to first become familiar with Twisted (in particular Deferred's), which can have a steep, but worth it IMO, learning curve. PB is a thin, transparent system, so it doesn't try to hide the fact that you are working remotely. Being thin, there also isn't very much to have to learn. For packaging, you don't have to use a single system if you are multi-platform. Your codebase can be common, and just have separate setup files using py2app on OS X and py2exe on Windows. A makefile or equivalent can handle final distribution packaging (e.g,. hdiutil for dmg on OS X, Inno Setup, NSIS, etc... on Windows). You'll spend some platform-specific time getting the initial stuff setup, but then new builds should be easy. For Linux, depending on the level of your users you can either just directly ship something like eggs (generated through a setup) or look into pyInstaller, which was the old Installer package that also supports single-exe generation for Linux. pyInstaller also does Windows, so if you have to support them both you could try using pyInstaller rather than both it and py2exe. But if you're just developing in Linux, final packaging probably isn't very important. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Unzip: Memory Error
David Bolen <[EMAIL PROTECTED]> writes: > If you are going to read the file data incrementally from the zip file > (which is what my other post provided) you'll prevent the huge memory > allocations and risk of running out of resource, but would have to > implement your own line ending support if you then needed to process > that data in a line-by-line mode. Not terribly hard, but more > complicated than my prior sample which just returned raw data chunks.

Here's a small example of a ZipFile subclass (tested a bit this time) that implements two generator methods:

    read_generator      Yields raw data from the file
    readline_generator  Yields "lines" from the file (per splitlines)

It also corrects my prior code posting which didn't really skip over the file header properly (due to the variable sized name/extra fields). Needs Python 2.3+ for generator support (or 2.2 with __future__ import).

Peak memory use is set "roughly" by the optional chunk parameter. It's rough since that's a compressed chunk, so it will grow in memory during decompression. And the readline generator adds further copies for the data split into lines.

For your file processing by line, it could be used as in:

    zipf = ZipFileGen('somefile.zip')
    g = zipf.readline_generator('somefilename.txt')
    for line in g:
        dealwithline(line)
    zipf.close()

Even if not a perfect match, it should point you further in the right direction.

-- David

- - - - - - - - - - - - - - - - - - - - - - - - -
import zipfile
import zlib
import struct

class ZipFileGen(zipfile.ZipFile):

    def read_generator(self, name, chunk=65536):
        """Return a generator that yields file bytes for name incrementally.
        The optional chunk parameter controls the chunk size read from the
        underlying zip file.  For compressed files, the data length returned
        by the generator will be larger, as it is the decompressed version
        of a chunk.

        Note that unlike read(), this method does not preserve the internal
        file pointer and should not be mixed with write operations.  Nor
        does it verify that the ZipFile is still opened and for reading.

        Multiple generators returned by this function are not designed to
        be used simultaneously (they do not re-seek the underlying file for
        each request.)"""

        zinfo = self.getinfo(name)
        compressed = (zinfo.compress_type == zipfile.ZIP_DEFLATED)
        if compressed:
            dc = zlib.decompressobj(-15)

        self.fp.seek(zinfo.header_offset)

        # Skip the file header (from zipfile.ZipFile.read())
        fheader = self.fp.read(30)
        if fheader[0:4] != zipfile.stringFileHeader:
            raise zipfile.BadZipfile, "Bad magic number for file header"

        fheader = struct.unpack(zipfile.structFileHeader, fheader)
        fname = self.fp.read(fheader[zipfile._FH_FILENAME_LENGTH])
        if fheader[zipfile._FH_EXTRA_FIELD_LENGTH]:
            self.fp.read(fheader[zipfile._FH_EXTRA_FIELD_LENGTH])

        # Process the file incrementally
        remain = zinfo.compress_size
        while remain:
            bytes = self.fp.read(min(remain, chunk))
            remain -= len(bytes)
            if compressed:
                bytes = dc.decompress(bytes)
            yield bytes

        if compressed:
            bytes = dc.decompress('Z') + dc.flush()
            if bytes:
                yield bytes

    def readline_generator(self, name, chunk=65536):
        """Return a generator that yields lines from a file within the zip
        incrementally.  Line ending detection based on splitlines(), and
        like file.readline(), the returned line does not include the line
        ending.  Efficiency not guaranteed if used with non-textual files.

        Uses a read_generator() generator to retrieve file data
        incrementally, so it inherits the limitations of that method as
        well, and the optional chunk parameter is passed to read_generator
        unchanged."""

        partial = ''
        g = self.read_generator(name, chunk=chunk)

        for bytes in g:
            # Break current chunk into lines
            lines = bytes.splitlines()

            # Add any prior partial line to first line
            if partial:
                lines[0] = partial + lines[0]

            # If the current chunk didn't happen to break on a line ending,
            # save the partial line for next time
            if bytes[-1] not in ('\n', '\r'):
                partial = lines.pop()

            # Then yield the lines we've identified so far
            for curline in lines:
                yield curline

        # Return any trailing data (if file didn't end in a line ending)
        if partial:
            yield partial

-- http://mail.python.org/mailman/listinfo/python-list
Re: Unzip: Memory Error
I wrote: > Here's a small example of a ZipFile subclass (tested a bit this time) > that implements two generator methods:

Argh, not quite tested enough - one fix needed, change:

            if bytes[-1] not in ('\n', '\r'):
                partial = lines.pop()

to:

            if bytes[-1] not in ('\n', '\r'):
                partial = lines.pop()
            else:
                partial = ''

(add the extra two lines)

-- David -- http://mail.python.org/mailman/listinfo/python-list
Re: HowTo Use Cython on a Windows XP Box?
David Lees <[EMAIL PROTECTED]> writes: > Yes, you are correct in understanding my question. I thought my post > was clear, but I guess not. I will go try the pyrex list.

You might also try looking for references to distutils support for non-MS compilers, since Pyrex (and presumably Cython) uses distutils under the covers to build the final extension. I'm pretty sure there is support in recent Python releases for using mingw rather than MSVC for most extensions (there may be problems with using certain Python APIs that depend on specific C RTL structures like files).

As to using VC, yes, it does have to be VC 7.1, i.e., Visual Studio 2003. You can't use 2005, as MS didn't maintain runtime compatibility. I'm sure there are a number of threads about that also available.

If I recall correctly, VC 7.1 began to be used in the 2.4 timeframe - although it was getting discussed back when 2.3 was getting released, based on an offer Microsoft had made to provide copies to core developers. The discussions are archived, but VC 6 was definitely long in the tooth at that point. As the development tools aren't free, they haven't been upgraded past that point to date. It's unfortunate that when MS changed the main runtime DLL with VC 7 (for the first time in a pretty long time), they then did so immediately again (and incompatibly) with VC 8. At the time, there were also efforts with some success to use the free toolkit MS made available (although I think it was sans optimizer), but then I think that got pulled and/or it became more difficult to find/use, but my memory is fuzzy.

You mention having VS 2005 - if so, do you also have an MSDN subscription? I believe you should still be able to get VS 2003 via that route if you first started with 2005 and thus never had 2003. If not, the mingw approach may be your best bet.

-- David -- http://mail.python.org/mailman/listinfo/python-list
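As a minimal sketch of the distutils side (module and file names here are made up), building with mingw rather than MSVC is normally just a matter of the --compiler switch, assuming gcc is on your PATH:

    # setup.py
    from distutils.core import setup
    from distutils.extension import Extension

    setup(name='example',
          ext_modules=[Extension('example', ['example.c'])])

Then build with:

    python setup.py build_ext --compiler=mingw32

For Pyrex/Cython generated sources the same approach applies once the .c file has been produced (or via their own distutils build_ext replacements).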
Re: Co-developers wanted: document markup language
Roy Smith <[EMAIL PROTECTED]> writes: > Anybody remember Scribe?

(raising hand) OT, but I still have a bunch of Scribe source documents from college. Of course, as I attended CMU where it originated I suppose that's not unusual.

Definitely pre-WYSIWYG, but one of the first to separate presentation markup from structure (very much in line with later stuff like SGML from IBM although I don't recall the precise timing relation of the two), including the use of styles. I personally liked it a lot (I think the markup syntax is easier on the eyes than the *ML family).

If I remember correctly, for a while there, it was reasonably common to see Scribe-like markup in newsgroups (e.g., "@begin(flame)" and "@end(flame)" or "@b[emphasis]") before SGML/XML/HTML became much more common ("<b> ... </b>").

-- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Low-overhead GUI toolkit for Linux w/o X11?
Grant Edwards <[EMAIL PROTECTED]> writes: > I'm looking for GUI toolkits that work with directly with the > Linux frambuffer (no X11). It's an embedded device with > limited resources, and getting X out of the picture would be a > big plus. Sounds like a reasonably modern "embedded" system since traditionally neither X (nor Python) would likely have even been plausible in such environments. Depending on the higher level GUI functionality you require and how tight the resources really are, you might want to consider investigating pure drawing libraries and then implement any missing GUI elements (widgets and mouse handling) you need yourself. When I was looking for an embedded graphics library for a prior platform (ELAN 486, 2MB flash, 6MB RAM) under DOS, we took a look at these: * GRX (http://grx.gnu.de/index.html) * Allegro (http://alleg.sourceforge.net/) We ended up using GRX, primarily because it was the simplest to develop a custom video driver for to match our platform, along with having a simpler core. We were under DOS but also used it with a later generation of the platform under Linux. Both libraries support operation over the framebuffer in Linux. Our app was in C++ (Python wasn't an option), and we implemented our own buttons and text widgets (in our case we never needed any scrolling widgets). There aren't any Python wrappers for GRX, but the library is straight C which should be easy to wrap (manually or with something like SWIG). No built-in widget support at all (some sample button processing code in a demo module), but easy enough to implement your own if your needs are modest. Although we didn't end up using it, Allegro is more fully featured (actually with some non-GUI cruft too since it targets games), and also appears to have two work-in-progress Python bindings. Some basic widget support in dialog processing routines. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Low-overhead GUI toolkit for Linux w/o X11?
David Bolen <[EMAIL PROTECTED]> writes: > When I was looking for an embedded graphics library for a prior > platform (ELAN 486, 2MB flash, 6MB RAM) under DOS, we took a look at > these: > > * GRX (http://grx.gnu.de/index.html) (...) > There aren't any Python wrappers for GRX, but the library is straight > C which should be easy to wrap (manually or with something like SWIG). > No built-in widget support at all (some sample button processing code > in a demo module), but easy enough to implement your own if your needs > are modest. I had forgotten, since we didn't use it, but there is an external mGui library (http://web.tiscalinet.it/morello/MGui/index.html) that can layer on top of GRX to provide higher level functionality. Of course, it would also have to be wrapped for use from Python. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: is there enough information?
Dennis Lee Bieber <[EMAIL PROTECTED]> writes: > On Mon, 3 Mar 2008 08:11:43 -0500, Jean-Paul Calderone > <[EMAIL PROTECTED]> declaimed the following in comp.lang.python: > >> I'm not sure, but you seem to be implying that the only way to use Windows' >> asynchronous I/O APIs is with threads. Actually, it is possible (and Twisted >> allows you) to use these as well without writing a threaded application. >> > I only pointed out that, on Windows, one can not use the common > /select()/ function with files. And one rarely sees examples of coding a > Twisted-style (emphasis on style) asynchronous callback system mixing > files and network sockes using the Windows-specific API. > > If using threads, the Windows asynchronous I/O isn't needed... let > the thread block until the I/O completes, then transfer the data (or a > message that the data is available) back to the main processing > thread... You're probably right that it's rare, but when needed, using the Windows asynchronous/overlapping API can provide a better solution than blocking threads depending on the needs at hand, and without involving any callbacks or Twisted-style programming. An example of mine is high performance serial port handling as part of a custom FHSS wireless adapter with a serial port interface to the PC. In this case, minimizing I/O latency was crucial since delays could mean missing a broadcast timeslot (about 15ms) on the wireless network. A serial port isn't a disk file, but certainly a "file" in the context of Windows handles. Early implementations used independent threads for reading/writing to the serial port and blocking during such operations, but that turned out to have an undesirable amount of latency, and was also difficult to interrupt when the threads were in a blocked condition. Instead I created a single thread that had a loop using overlapped I/O simultaneously in each direction as well as native Windows event objects for aborting or signaling that there was additional data to be written (the pending read I/O handled the read case). The main loop was just a WaitForMultipleObjects to handle any of the I/O completion indications, requests for more I/O or aborts. It was very high performing (low latency) with low CPU usage - measurably better than a multi-threaded version. Communication with the rest of the application was through a thread-safe bi-directional buffer object, also using native Win32 event objects. It worked similar to a queue, but by using the native event objects I didn't have the performance inefficiencies for reads with timeouts of the Python objects. The underlying Python primitives don't have the timeout capability built in, so reads with timeouts get implemented through checks for data interspersed with increasing sleeps, which adds unnecessary latency. Anyway, it worked extremely well, and was a much better fit for my needs than a multi-threaded version with blocking I/O, without it having to be Twisted-style. -- David -- http://mail.python.org/mailman/listinfo/python-list
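The core of such a loop, as a rough pywin32 sketch (just the event dispatch shape; the real version also had the overlapped ReadFile/WriteFile completion handles for the serial port in the wait list):

    import win32event

    abort_evt = win32event.CreateEvent(None, 0, 0, None)     # auto-reset
    more_tx_evt = win32event.CreateEvent(None, 0, 0, None)

    def io_loop(handlers):
        # handlers: list of (win32 handle, callable) pairs to dispatch on
        handles = [h for (h, func) in handlers]
        while True:
            rc = win32event.WaitForMultipleObjects(handles, 0,
                                                   win32event.INFINITE)
            index = rc - win32event.WAIT_OBJECT_0
            if handles[index] is abort_evt:
                break
            handlers[index][1]()    # run whichever handler was signaled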
Re: Implementing file reading in C/Python
Johannes Bauer writes: > Yup, I changed the Python code to behave the same way the C code did - > however overall it's not much of an improvement: Takes about 15 minutes > to execute (still factor 23).

Not sure this is completely fair if you're only looking for a pure Python solution, but to be honest, looping through a gazillion individual bytes of information sort of begs for trying to offload that into a library that can execute faster, while maintaining the convenience of Python outside of the pure number crunching.

I'd assume numeric/numpy might have applicable functions, but I don't use those libraries much, whereas I've been using OpenCV recently for a lot of image processing work, and it has matrix/histogram support, which seems to be a good match for your needs.

For example, assuming the OpenCV library and ctypes-opencv wrapper, add the following before the file I/O loop:

    from opencv import *

    # Histogram for each file chunk
    hist = cvCreateHist([256], CV_HIST_ARRAY, [(0,256)])

then, replace (using one of your posted methods as a sample):

    datamap = { }
    for i in data:
        datamap[i] = datamap.get(i, 0) + 1
    array = sorted([(b, a) for (a, b) in datamap.items()], reverse=True)
    most = ord(array[0][1])

with:

    matrix = cvMat(1, len(data), CV_8UC1, data)
    cvCalcHist([matrix], hist)
    most = cvGetMinMaxHistValue(hist,
                                min_val = False, max_val = False,
                                min_idx = False, max_idx = True)

should give you your results in a fraction of the time. I didn't run with a full size data file, but for a smaller one using smaller chunks the OpenCV variant ran in about 1/10 of the time, and that was while leaving all the other remaining Python code in place.

Note that the results may not be identical to some of your other methods in the case of multiple values with the same counts, as the OpenCV histogram min/max call will always pick the lower value in such cases, whereas some of your code (such as above) will pick the upper value, or your original code depended on the order of information returned by dict.items.

This sort of small dedicated high performance choke point is probably also perfect for something like Pyrex/Cython, although that would require a compiler to build the extension for the histogram code.

-- David -- http://mail.python.org/mailman/listinfo/python-list
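For the numpy route mentioned above, a rough sketch (untested against the original data, and assuming "data" is the byte string for one chunk read from the file):

    import numpy

    arr = numpy.frombuffer(data, dtype=numpy.uint8)
    counts = numpy.bincount(arr)    # occurrence count for each byte value present
    most = int(counts.argmax())     # most frequently occurring byte value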
Re: Python Crashes
koranthala writes: > Could anyone guide me on this? I have been facing this issue for a > day, and cannot seem to solve it.

We had a scheduling system with a similar "once in a long while hard Windows-process crash"; after a bunch of work trying to track down the source, the most robust solution was just to trap the failure and restart, as the system ran off a persistent state that was already engineered to be robust in the case of a hardware crash (power outage, etc...)

While I agree with others that it's most likely a fault in an extension that gets tickled over time, as was likely in our case, we needed all our extensions and were using latest versions at the time. So if your application is such that just restarting it is practical, it may be a sufficient temporary (or not so temporary - our system ran for years this way) workaround for you.

What you can do is execute your script from beneath control of another script, and trap process failures, restarting the script on non-standard exits. This can be in addition to any top level exception handling of the child script itself, where it can provide more graceful support for internal failures.

The trick, under Windows, is to ensure that you disable any pop-up windows that may occur during the crash, otherwise the monitoring task never gets a chance to get control and restart things. With the pywin32 extension, something like:

    import win32api, win32con

    old_mode = win32api.SetErrorMode(win32con.SEM_FAILCRITICALERRORS |
                                     win32con.SEM_NOGPFAULTERRORBOX |
                                     win32con.SEM_NOOPENFILEERRORBOX)

Or with ctypes:

    import ctypes

    SEM_FAILCRITICALERRORS = 1
    SEM_NOGPFAULTERRORBOX = 2
    SEM_NOOPENFILEERRORBOX = 0x8000

    old_mode = ctypes.windll.kernel32.SetErrorMode(SEM_FAILCRITICALERRORS |
                                                   SEM_NOGPFAULTERRORBOX |
                                                   SEM_NOOPENFILEERRORBOX)

at any point prior to starting the child process will ensure that hard process errors will silently terminate the process and return control to the parent, as well as not popping up any dialog boxes that require intervention by a person.

Should the process exit harshly, the exit code should be fairly clear (I forget, but I think it's in the 0xC0000000 range, maybe 0xC0000005 for a typical GPF), and you can decide on restarting the task as opposed to just exiting normally. This will also prevent any pop-ups in the main monitoring process. You can restore old behavior there after starting the child by making another call to SetErrorMode using old_mode as the argument.

-- David -- http://mail.python.org/mailman/listinfo/python-list
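The monitoring side can be pretty minimal - a sketch along these lines (the child script name here is made up, and the exit code handling is illustrative only):

    import sys
    import subprocess

    def run_forever(script):
        while True:
            rc = subprocess.call([sys.executable, script])
            if rc == 0:
                break    # clean exit from the child - stop restarting
            # Anything else (including the large/negative codes produced by
            # a hard crash) - just loop around and restart the child

    run_forever('myapp.py')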
Re: is python Object oriented??
thmpsn@gmail.com writes: > I don't know how you would do it in C# (or Java for that matter). > > In C++ you can play with pointers to "get at" some memory location > somewhere in the object. The only portable way to know the exact > location between the beginning of the object and the desired member is > the offsetof() macro, but as I understand it this only works for POD > types, which means that it won't work for classes such as:
>
> class NonPOD
> {
> private:
>     int a;
>     int b;
> public:
>     NonPOD();
>     ~NonPOD();
>     int C();
> };
>
> (I haven't ever actually tried it, so I'm not sure.)
>
> Nevertheless, you can play and hope for the best. For example, if the > member you want to get at is 'b', then you can do:
>
> NonPOD obj;
> std::cout << "obj.b = " << *(int*) ((unsigned char*) &obj + sizeof(int)) << std::endl;
>
> and hope that the compiler didn't leave a hole between the 'a' member > and the 'b' member.

Probably moving off topic, but I don't think you have to get anywhere near that extreme in terms of pointers, unless you're trying to deal with instances for which you have no source but only opaque pointers.

I haven't gotten stuck having to do this myself yet, but I believe one common "hack" for the sort of class you show above is to just "#define private public" before including the header file containing the class definition. No fiddling with pointers, offsets, or whatever, just normal object access syntax past that point. Of course, I believe such a redefinition violates the letter of the C++ standard, but most preprocessors do it anyway. Also, it won't handle the case where the "private:" is not used, but the members are just declared prior to any other definition, since a class is private by default.

But even then, if you had to, just make a copy of the class definition (or heck, just define a structure if it's just data elements), ensure the private portions are public, and then cast a pointer to the old class instance to a pointer to your new class. Assuming you're building everything in a single compiler, the layouts should match just fine. Again, normal object member access, no casting or pointers needed (beyond the initial overall object pointer cast).

-- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Using clock() in threading on Windows
"Martin v. Löwis" writes: > As a consequence, the half-busy loops could go away, at least > on systems where lock timeouts can be given to the system. I know that in some cases in the past I've had to bypass a Queue's use of threading objects for waiting for a queue to unblock because of the increased overhead (and latency as the timer increases) of the busy loop. On windows, replacing it with an implementation using WaitForObject calls with the same timeouts I would have used with the Queue performed much better, not unexpectedly, but was non-portable. The current interface to the lowest level locks in Python are certainly generic enough to cross lots of platforms, but it would definitely be useful if they could implement timeouts without busy loops on those platforms where they were supported. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Python 3.0 automatic decoding of UTF16
Johannes Bauer <[EMAIL PROTECTED]> writes: > This is very strange - when using "utf16", endianness should be detected > automatically. When I simply truncate the trailing zero byte, I receive: Any chance that whatever you used to "simply truncate the trailing zero byte" also removed the BOM at the start of the file? Without it, utf16 wouldn't be able to detect endianness and would, I believe, fall back to native order. -- David -- http://mail.python.org/mailman/listinfo/python-list
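A small illustration of that BOM dependence (Python 3 syntax since that's the thread's context; the output comments assume a little-endian machine):

data = 'abc'.encode('utf-16')          # the encoder writes a BOM first
print(data)                            # b'\xff\xfea\x00b\x00c\x00' here
print(data.decode('utf-16'))           # BOM present, endianness detected -> 'abc'

no_bom = 'abc'.encode('utf-16-be')     # big-endian data with no BOM
print(no_bom.decode('utf-16'))         # no BOM: falls back to native order,
                                       # so on a little-endian box this is garbled
print(no_bom.decode('utf-16-be'))      # specifying the byte order decodes correctly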
Re: wxPython fast and slow
iu2 writes: > Indeed, but I don't think the CallAfter is necessary. I could just as > well remove the time.sleep in the original code. I could also make a > tight loop to replace time.sleep > for i in range(100): pass > and tune it to fit the speed I need. Except that CallAfter passes control back through the event loop, which is crucial for your GUI to appear responsive in other ways. > I haven't mention this, but I actually want something to be the same > speed on different PC-s. So a timer seems to fit in. Then even a time.sleep() or plain loop isn't sufficient since each may have additional latencies depending on load. You will probably need to query a system clock of some type to verify when your interval has passed. > I just can't make it work. > Using wx.Timer is too slow. > Using time.sleep is fast with PyScripter active, and slow when it is > closed. I have to admit to thinking that perhaps you're trying to operate too quickly if you need better resolution than wx.Timer. Most screen updates don't have to happen that frequently to still appear smooth, but that's your call. Of course, even wx.Timer may be subject to other latencies if the system or your application is busy with other events, so it depends on how precise your timing needs to be. You might also try an idle event, implementing your own timer (using whatever call gives you the best resolution on your platform), and just ignoring idle events that occur more frequently than the timing you want. Just remember to always request a new event. You could do the same thing with CallAfter as well, just rescheduling a new one if the current one arrives faster than your preferred interval. -- David -- http://mail.python.org/mailman/listinfo/python-list
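Not part of the original reply, but a minimal sketch of that idle-event timer idea (classic wxPython; the names are purely illustrative):

import time
import wx

class IdleTicker(object):
    """Call step() from idle events, but no more often than `interval` seconds."""
    def __init__(self, window, interval, step):
        self.interval = interval
        self.step = step
        self.last = time.time()
        window.Bind(wx.EVT_IDLE, self.on_idle)

    def on_idle(self, event):
        now = time.time()
        if now - self.last >= self.interval:
            self.last = now
            self.step()
        event.RequestMore()    # ask for another idle event so the ticker keeps running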
Re: wxPython fast and slow
iu2 writes: > A question about CallAfter: As I understand, this function is intended > to be used from within threads, where it queues the operation to be > performed in the GUI queue. I agree with the second half of the sentence but not the first. CallAfter is intended to queue up a delayed call (via the GUI queue), but it can be used anywhere you wish that behavior. Yes, it's also one of the very few functions that can be called from a thread other than the GUI thread, but it works just as well from the GUI thread. Or to quote its docstring: Call the specified function after the current and pending event handlers have been completed. This is also good for making GUI method calls from non-GUI threads. Any extra positional or keyword args are passed on to the callable when it is called. > How does it work in this situation? Does it queue the opreation for > some idle time or does it perform it right away? You can actually see the source in _core.py in your wx installation. It always executes via a wx.PostEvent call. > And another question, if I may, I used to make tight loops in windows > API, planting inside them a command that processes messages from the > GUI queue and returns when no more messages exists. Something like > this: > > loop { > operations > process_gui_messages > } > > The loop ran quickly and the GUI remained responsive during the loop. > I did it on window API using a function I defined similar to this one: I don't think there's much difference in the above and doing your operations during one of the events. In both cases "operations" is going to block any further event processing so cannot be lengthy or the GUI will feel unresponsive. "Lengthy" varies but I'd certainly put it in the neighborhood of small fractions of a second. Your original code took almost 2 seconds for the "operations" part (before getting back to processing GUI messages through the main loop), which certainly seems too long. > void ProcessMessages() > { > while (PeekMessage()) { > TranslateMessage(..); > DispatchMessage(..); > } > } Not quite positive, but if you're talking about implementing this as a nested dispatch loop (e.g., called from within an existing event), you can do that via wxYield. Of course, as with any nested event loop processing, you have to be aware of possible reentrancy issues. > This technique is not good for long loops, where the user may activate > other long GUI opreations during the tight loop and make a mess. > But it carries out the job well where during the time of the loop the > user may only access to certain features, such as pressing a button to > cancel the operation, operating the menu to exit the program, etc. > This scheme saves some state-machine code that is required when using > event-based programming. Maybe - for my own part, I'm not completely convinced and tend to far prefer avoiding nested event loop dispatching. There are some times when it might be unavoidable, but I tend to find it indicative that I might want to re-examine what I am doing. It seems to me that as long as you have to keep the "operations" step of your loop small enough, you have to be able to divide it up. So you'll need some state no matter what to be able to work through each stage of the overall "operations" in between calls to process the GUI. At that point, whether it's a local variable within the scope of the looping code, or just some instance variables in the object handling the event loop seems about the same amount of state management. 
For example, in your original code you could probably consider the generator and/or 'x' your local state. But the current step in the movement could just as easily be an instance variable. > Does wxPython have something like ProcessMessages? If you just mean a way to process pending messages wxYield may be sufficient. If you want to take over the primary dispatch loop for the application, normally that has been handed off to wxWidgets via wxApp.MainLoop. However, I believe you can build your own main dispatch loop if you want, as there are functions in wxApp like ProcessPendingEvents, Pending, Dispatch and so on. You may need to explicitly continue to support Idle events in your own loop if desired. If you need to get into more details, it's probably better dealt with on the wxPython mailing list. -- David -- http://mail.python.org/mailman/listinfo/python-list
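A rough sketch of the CallAfter variant mentioned above (illustrative only; the "steps" are whatever small pieces the overall operation divides into):

import time
import wx

class ChunkedWork(object):
    """Run a sequence of small work steps via wx.CallAfter between GUI events."""
    def __init__(self, steps, interval=0.05):
        self.steps = iter(steps)         # each item is a callable doing one small piece
        self.interval = interval
        self.last = 0.0
        wx.CallAfter(self.do_step)

    def do_step(self):
        now = time.time()
        if now - self.last < self.interval:
            wx.CallAfter(self.do_step)   # too soon - just reschedule
            return
        self.last = now
        try:
            next(self.steps)()
        except StopIteration:
            return                       # all work done
        wx.CallAfter(self.do_step)       # queue the next step behind pending events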
Re: finally successful in ods with python, just one help needed.
Krishnakant writes: > However when I apply the same elements and attributes to the one I am > creating with odfpy, I get "attribute not allowed " errors. > If some one is interested to look at the code, please let me know, I can > send an attachment off the list so that others are not forced to > download some thing they are not concerned about. I just tried this myself and the following creates a 3x3 spreadsheet with the first row spanning all three columns (no special formatting like centering or anything), using odfpy 0.8: import sys from odf.opendocument import OpenDocumentSpreadsheet from odf.style import Style, TableColumnProperties from odf.table import Table, TableRow, TableColumn, \ TableCell, CoveredTableCell from odf.text import P def make_ods(): ods = OpenDocumentSpreadsheet() col = Style(name='col', family='table-column') col.addElement(TableColumnProperties(columnwidth='1in')) table = Table() table.addElement(TableColumn(numbercolumnsrepeated=3, stylename=col)) ods.spreadsheet.addElement(table) # Add first row with cell spanning columns A-C tr = TableRow() table.addElement(tr) tc = TableCell(numbercolumnsspanned=3) tc.addElement(P(text="ABC1")) tr.addElement(tc) # Uncomment this to more accurately match native file ##tc = CoveredTableCell(numbercolumnsrepeated=2) ##tr.addElement(tc) # Add two more rows with non-spanning cells for r in (2,3): tr = TableRow() table.addElement(tr) for c in ('A','B','C'): tc = TableCell() tc.addElement(P(text='%s%d' % (c, r))) tr.addElement(tc) ods.save("ods-test.ods") Maybe that will give you a hint as to what is happening in your case. Note that it appears creating such a spreadsheet directly in Calc also adds covered table cells for those cells beneath the spanned cell, but Calc loads a file fine without those and still lets you later split the merge and edit the underlying cells. So I'm not sure how required that is as opposed to just how Calc manages its own internal structure. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: finally successful in ods with python, just one help needed.
Krishnakant writes: > based on your code snippid I added a couple of lines to actually center > align text in the merged cell in first row. Sorry, guess I should have verified handling all the requirements :-) I think there's two issues: * I neglected to add the style I created to the document, so even in my first example, columns had a default style (not the 1in) style I thought I was creating. * I don't think you want a paragraph style applied to the paragraph text within the cell, but to the cell as a whole. I think if you just try to associate it with the text.P() element the "width" of the paragraph is probably just the text itself so there's nothing to center, although that's just a guess. I've attached an adjusted version that does center the spanned cell for me. Note that I'll be the first to admit I don't necessarily understand all the ODF style rules. In particular, I got into a lot of trouble trying to add my styles to the overall document styles (e.g., ods.styles) which I think can then be edited afterwards rather than the automatic styles (ods.automaticstyles). The former goes into the styles.xml file whereas the latter is included right in contents.xml. For some reason using ods.styles kept causing OpenOffice to crash trying to load the document, so I finally just went with the flow and used automaticstyles. It's closer to how OO itself creates the spreadsheet anyway. -- David from odf.opendocument import OpenDocumentSpreadsheet from odf.style import Style, TableColumnProperties, ParagraphProperties from odf.table import Table, TableRow, TableColumn, \ TableCell, CoveredTableCell from odf.text import P def make_ods(): ods = OpenDocumentSpreadsheet() col = Style(name='col', family='table-column') col.addElement(TableColumnProperties(columnwidth='1in')) centered = Style(name='centered', family='table-cell') centered.addElement(ParagraphProperties(textalign='center')) ods.automaticstyles.addElement(col) ods.automaticstyles.addElement(centered) table = Table() table.addElement(TableColumn(numbercolumnsrepeated=3, stylename=col)) ods.spreadsheet.addElement(table) # Add first row with cell spanning columns A-C tr = TableRow() table.addElement(tr) tc = TableCell(numbercolumnsspanned=3, stylename=centered) tc.addElement(P(text="ABC1")) tr.addElement(tc) # Add two more rows with non-spanning cells for r in (2,3): tr = TableRow() table.addElement(tr) for c in ('A','B','C'): tc = TableCell() tc.addElement(P(text='%s%d' % (c, r))) tr.addElement(tc) ods.save("ods-test.ods") if __name__ == "__main__": make_ods() -- http://mail.python.org/mailman/listinfo/python-list
Re: global name 'self' is not defined - noob trying to learn
mark.sea...@gmail.com writes: > class myclass(object): > # > # def __new__(class_, init_val, size, reg_info): > def __init__(self, init_val, size, reg_info): > > # self = object.__new__(class_) > self.reg_info = reg_info > print self.reg_info.message > self.val = self Note that here you assign self.val to be the object itself. Are you sure you didn't mean "self.val = init_val"? > (...) > def __int__(self): > return self.val Instead of an integer, you return the current class instance as set up in __init__. The __int__ method ought to return an integer. > def __long__(self): > return long(self.val) And this will be infinite recursion, since long() will try to call the __long__ method on self.val - which is the object itself - so you just keep recursing on the __long__ method. You can see this more clearly with: >>> cat = myclass(0x55, 32, my_reg) >>> int(cat) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: __int__ returned non-int (type myclass) >>> I won't post the traceback for long(cat), as it's, well, "long" ... -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Thread-killing, round 666 (was Re: Lisp mentality vs. Python mentality)
Vsevolod writes: > "This should be used with caution: it is implementation-defined > whether the thread runs cleanup forms or releases its locks first." > This doesn't mean deprecated. It means: implementation-dependent. For > example in SBCL: "Terminate the thread identified by thread, by > causing it to run sb-ext:quit - the usual cleanup forms will be > evaluated". And it works fine. I'm curious - do you know what happens if threading is implemented as a native OS thread and it's stuck in an I/O operation that is blocked? How does the Lisp interpreter/runtime gain control again in order to execute the specified function? I guess on many POSIX-ish environments, internally generating a SIGALRM to interrupt a system operation might work, but it would likely have portability problems. Or is that combination (native OS thread and/or externally blocking I/O) prevented by the runtime somehow (perhaps by internally polling what appears to code as blocking I/O)? But surely if there's access to OS routines, the risk of blocking must be present? That scenario is really the only rational use case I've run into for wanting to kill a thread, since in other cases the thread can be monitoring for an application-defined way to shut down. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Thread-killing, round 666 (was Re: Lisp mentality vs. Python mentality)
Vsevolod writes: > On Apr 27, 11:31 pm, David Bolen wrote: >> I'm curious - do you know what happens if threading is implemented as >> a native OS thread and it's stuck in an I/O operation that is blocked? >> How does the Lisp interpreter/runtime gain control again in order to >> execute the specified function? I guess on many POSIX-ish >> environments, internally generating a SIGALRM to interrupt a system >> operation might work, but it would likely have portability problems. > > We're arguing to the old argument, who knows better, what the > programmer wants: language implementor or the programmer himself. > AFAIK, Python community is on former side, while Lisp one -- on the > later. As always, there's no right answer. Note I wasn't trying to argue anything; I was actually interested in how the behavior is handled in Lisp. Do you know how the Lisp implementation of threads you spoke about handles this case? E.g., can the Lisp implementation you are familiar with actually kill such a thread blocked on an arbitrary external system or library call? -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: A fast way to read last line of gzip archive ?
"Barak, Ron" writes: > I thought maybe someone has a way to unzip just the end portion of > the archive (instead of the whole archive), as only the last part is > needed for reading the last line. The problem is that gzip compressed output has no reliable intermediate break points that you can jump to and just start decompressing without having worked through the prior data. In your specific code, using readlines() is probably not ideal as it will create the full list containing all of the decoded file contents in memory only to let you pick the last one. So a small optimization would be to just iterate through the file (directly or by calling readline()) until you reach the last line. However, since you don't care about the bulk of the file, but only need to work with the final line in Python, this is an activity that could be handled more efficiently handled with external tools, as you need not involve much intepreter time to actually decompress/discard the bulk of the file. For example, on my system, comparing these two cases: # last.py import gzip import sys in_file = gzip.open(sys.argv[1],'r') for line in in_file: pass print 'Last:', line # last-popen.py import sys from subprocess import Popen, PIPE # Implement gzip -dc | tail -1 gzip = Popen(['gzip', '-dc', sys.argv[1]], stdout=PIPE) tail = Popen(['tail', '-1'], stdin=gzip.stdout, stdout=PIPE) line = tail.communicate()[0] print 'Last:', line with an ~80MB log file compressed to about 8MB resulted in last.py taking about 26 seconds, while last-popen took about 1.7s. Both resulted in the same value in "line". As long as you have local binaries for gzip/tail (such as Cygwin or MingW or equivalent) this works fine on Windows systems too. If you really want to keep everything in Python, then I'd suggest working to optimize the "skip" portion of the task, trying to decompress the bulk of the file as quickly as possible. For example, one possibility would be something like: # last-chunk.py import gzip import sys from cStringIO import StringIO in_file = gzip.open(sys.argv[1],'r') chunks = ['', ''] while 1: chunk = in_file.read(1024*1024) if not chunk: break del chunks[0] chunks.append(chunk) data = StringIO(''.join(chunks)) for line in data: pass print 'Last:', line with the idea that you decode about a MB at a time, holding onto the final two chunks (in case the actual final chunk turns out to be smaller than one of your lines), and then only process those for lines. There's probably some room for tweaking the mechanism for holding onto just the last two chunks, but I'm not sure it will make a major difference in performance. In the same environment of mine as the earlier tests, the above took about 2.7s. So still much slower than the external utilities in percentage terms, but in absolute terms, a second or so may not be critical for you compared to pure Python. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: A fast way to read last line of gzip archive ?
"Barak, Ron" writes: > I couldn't really go with the shell utilities approach, as I have no > say in my user environment, and thus cannot assume which binaries > are install on the user's machine. I suppose if you knew your target you could just supply the external binaries to go with your application, but I agree that would probably be more of a pain than its worth for the performance gain in real world time. > I'll try and implement your last suggestion, and see if the > performance is acceptable to (human) users. In terms of tuning the third option a bit, I'd play with the tracking of the final two chunk (as mentioned in my first response), perhaps shrinking the chunk size or only processing a smaller chunk of it for lines (assuming a reasonable line size) to minimize the final loop. You could also try using splitlines() on the final buffer rather than a StringIO wrapper, although that'll have a memory hit for the constructed list but doing a small portion of the buffer would minimize that. I was curious what I could actually achieve, so here are three variants that I came up with. First, this just fine tunes slightly tracking the chunks and then only processes enough final data based on anticipated maximum length length (so if the final line is longer than that you'll only get the final MAX_LINE bytes of that line). I also found I got better performance using a smaller 1024 chunk size with GZipFile.read() than a MB - not entirely sure why although it perhaps matches the internal buffer size better: # last-chunk-2.py import gzip import sys CHUNK_SIZE = 1024 MAX_LINE = 255 in_file = gzip.open(sys.argv[1],'r') chunk = prior_chunk = '' while 1: prior_chunk = chunk # Note that CHUNK_SIZE here is in terms of decompressed data chunk = in_file.read(CHUNK_SIZE) if len(chunk) < CHUNK_SIZE: break if len(chunk) < MAX_LINE: chunk = prior_chunk + chunk line = chunk.splitlines(True)[-1] print 'Last:', line On the same test set as my last post, this reduced the last-chunk timing from about 2.7s to about 2.3s. Now, if you're willing to play a little looser with the gzip module, you can gain quite a bit more. If you directly call the internal _read() method you can bypass some of the unnecessary processing read() does, and go back to larger I/O chunks: # last-gzip.py import gzip import sys CHUNK_SIZE = 1024*1024 MAX_LINE = 255 in_file = gzip.open(sys.argv[1],'r') chunk = prior_chunk = '' while 1: try: # Note that CHUNK_SIZE here is raw data size, not decompressed in_file._read(CHUNK_SIZE) except EOFError: if in_file.extrasize < MAX_LINE: chunk = chunk + in_file.extrabuf else: chunk = in_file.extrabuf break chunk = in_file.extrabuf in_file.extrabuf = '' in_file.extrasize = 0 line = chunk[-MAX_LINE:].splitlines(True)[-1] print 'Last:', line Note that in this case since I was able to bump up CHUNK_SIZE, I take a slice to limit the work splitlines() has to do and the size of the resulting list. Using the larger CHUNK_SIZE (and it being raw size) will use more memory, so could be tuned down if necessary. Of course, the risk here is that you are dependent on the _read() method, and the internal use of the extrabuf/extrasize attributes, which is where _read() places the decompressed data. In looking back I'm pretty sure this code is safe at least for Python 2.4 through 3.0, but you'd have to accept some risk in the future. This approach got me down to 1.48s. 
Then, just for the fun of it, once you're playing a little looser with the gzip module, it's also doing work to compute the crc of the original data for comparison with the decompressed data. If you don't mind so much about that (depends on what you're using the line for) you can just do your own raw decompression with the zlib module, as in the following code, although I still start with a GzipFile() object to avoid having to rewrite the header processing: # last-decompress.py import gzip import sys import zlib CHUNK_SIZE = 1024*1024 MAX_LINE = 255 decompress = zlib.decompressobj(-zlib.MAX_WBITS) in_file = gzip.open(sys.argv[1],'r') in_file._read_gzip_header() chunk = prior_chunk = '' while 1: buf = in_file.fileobj.read(CHUNK_SIZE) if not buf: break d_buf = decompress.decompress(buf) # We might not have been at EOF in the read() but still have no # decompressed data if the only remaining data was not original data if d_buf: prior_chunk = chunk chunk = d_buf if len(chunk) < MAX_LINE: chunk = prior_chunk + chunk line = chunk[-MAX_LINE:].splitlines(True)[-1] print 'Last:', line This version got me down to 1.15s. So in summar
Re: AOPython Question
Roastie writes: > I installed the AOPython module: > >% easy_install aopython > > That left an aopython-1.0.3-py2.6.egg at > C:\mystuff\python\python_2.6.2\Lib\site-packages. An egg is basically a ZIP file with a specific structure (you can inspect it with common ZIP tools). Depending on the package easy_install is installing, it may be considered safe to install as a single file (which Python does support importing files from). I tend to prefer to have an actual unpacked tree myself. If you use the "-Z" option to easy_install, you can force it to always unpack any eggs when installing them. Alternatively, if you've already got the single egg, you can always unzip it yourself. Just rename it temporarily and unzip it into a directory named exactly the same as the single egg file was. -- David -- http://mail.python.org/mailman/listinfo/python-list
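If you'd rather do the rename-and-unzip step from Python itself, something along these lines should work (the path matches the one mentioned above; adjust as needed):

import os
import zipfile

def unpack_egg(egg_path):
    """Replace a zipped egg file with a directory of the same name."""
    tmp = egg_path + '.zip'
    os.rename(egg_path, tmp)           # move the zipped egg aside
    zf = zipfile.ZipFile(tmp)
    try:
        zf.extractall(egg_path)        # recreate the original name as a directory
    finally:
        zf.close()
    os.remove(tmp)

unpack_egg(r'C:\mystuff\python\python_2.6.2\Lib\site-packages\aopython-1.0.3-py2.6.egg')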
Re: Ah, ctypes
Nick Craig-Wood writes: > ctypes could potentially note that function types don't have enough > references to them when passed in as arguments to C functions? It > might slow it down microscopically but it would fix this problem. Except that ctypes can't know the lifetime needed for the callbacks. If the callbacks are only used while the called function is executing (say, perhaps for a progress indicator or internal completion callback) then it's safe to create the function wrapper just within the function call. -- David -- http://mail.python.org/mailman/listinfo/python-list
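To make the lifetime distinction concrete, here's a small sketch (some_lib, do_work and register_progress are hypothetical names, not a real API):

import ctypes

PROGRESS_FUNC = ctypes.CFUNCTYPE(None, ctypes.c_int)

def progress(percent):
    print('%d%% done' % percent)

# Fine if the library only uses the callback during this one call:
#   some_lib.do_work(PROGRESS_FUNC(progress))

# Needed if the library stores the pointer and calls it later - keep a
# reference alive for as long as the C side might use it:
progress_cb = PROGRESS_FUNC(progress)
#   some_lib.register_progress(progress_cb)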
Re: [py2exe] What to download when updating?
Gilles Ganault <[EMAIL PROTECTED]> writes: > Hello > > Out of curiosity, if I recompile a Python (wxPython) app with > py2exe, can I have customers just download the latest .exe, or are > there dependencies that require downloading the whole thing again? It will depend on what you changed in your application. The most likely file that will change is your library.zip file since it has all of your Python modules. I believe that with py2exe the main exe is typically a standard stub, so it need not change, but it can if the top level script is named differently since it has to execute it. The other files are binary dependencies, so you may add or remove them during any given build process depending on what modules you may newly import (or have removed the use of). In the end, simply comparing the prior version's distribution tree to the new version's is probably simplest. But then you'd need to package up an installer that did the right thing on the target system. To be honest, just packaging it up as a new version and putting it into a standard installer (as with InnoSetup or NSIS) and letting the installer keep track of what to do when installing the new version on top of an existing version is generally simplest overall, albeit larger. But during internal development or other special cases, I've definitely just distributed updated library.zip files without any problem. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Python PIL and Vista/Windows 7 .. show() not working ...
Esmail writes: > I dug around in the docs and found a named parameter that I can set > when I > call show. > > Definition: im.show(self, title=None, command=None) > > I installed irfanview and specified it/its path in the parameter, > but that didn't work either. It's really quite puzzling in the > case of Vista since that's been around for quite a few years now. But I thought everyone was sticking their fingers in their ears and humming to try to forget Vista had been released, particularly now that Windows 7 is out :-) Perhaps there's an issue with the temporary file location. I don't have a Vista system to test on, but the show() operation writes the image to a temporary file as returned by tempfile.mktemp(), and then passes the name on to the external viewer. The viewing command is handed to os.system() with the filename embedded without any special quoting. So if, for example, the temporary location has spaces or "interesting" characters, it probably won't get parsed properly. One easy debugging step is probably to add a print just before the os.system() call that views the image (bottom of _showxv function in Image.py in my copy of 1.1.6). That way at least you'll know the exact command being used. If that's the issue, there are various ways around it. You could patch PIL itself (same function) to quote the filename when it is constructing the command. Alternatively, the tempfile module has a tempdir global you could set to some other temporary directory before using the show() function (or any other code using tempfile). -- David -- http://mail.python.org/mailman/listinfo/python-list
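If the temporary directory does turn out to be the culprit, a workaround sketch (the directory and image names here are just examples; the directory needs to exist and be writable):

import tempfile
tempfile.tempdir = r'C:\Temp'    # somewhere without spaces or odd characters

import Image                     # PIL 1.1.6 style import
im = Image.open('picture.jpg')   # placeholder image name
im.show()                        # the temporary file now lands in C:\Temp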
Re: Which version of MSVC?90.DLL's to distribute with Python 2.6 based Py2exe executables?
Jonathan Hartley writes: > I guess I really need an installer. Oh well. This need not be that much of a hurdle. Several solutions exist such as Inno Setup (my personal preference), NSIS, etc... which are not hard to create a solid installer with. I suspect your end users will appreciate it too since your application (even if trivial) will install/uninstall just like other standard applications. Combining py2exe with such an installer works well for deployment under Windows. It could also help you over time since you'll have better control if needed over how future versions handle updates, and can manage menus, shortcuts, etc. Even if a start menu shortcut just opens up a console window with your text based application, it's probably easier for users than telling them to open such a window manually, switch to the right directory, and start your script. You can arrange to have the redist installer run from within your installation script, so it's a one-time hit rather than each time your application starts. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: A "terminators' club" for clp
Terry Reedy writes: > r wrote: >> On Nov 14, 4:59 am, kj wrote: >>> But, as I already showed, I'm out of my depth here, >>> so I'd better shut up. >> >> Don't give up so easy! The idea is great, what Paul is saying is that >> most people who read this group use newsreaders and that has nothing >> to do with google groups. These guy's have kill filters for just this >> sort of thing but either way the emails are on their puters so they >> have to deal with them on an individual basis. It would be nice >> however to clean up the Google group version and rid it of the plagues >> of spam infestations. > > Anyone with a newsreader can, like me, read gmane.comp.python.general, > which mirrors python-list, which now filters out much/most of the spam > on c.l.p from G.g. The same is true on some (not sure if it qualifies for many) Usenet servers. I use news.individual.net for example (for a modest yearly fee as of a few years ago) and in my experience it does a great job at filtering spam. I'm sure there are other services that do as well. I don't have to manage any special filters and don't seem to see any of the stuff in this group, for example, mentioned in this thread. I do use gmane for a lot of other lists (including python-dev) that aren't operated as a Usenet newsgroups and it's an excellent service. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: python gui builders
Simon Hibbs writes: > I've had this problem for a few years. I've tried PythonCard, > WxWidgets with WxDesigner, BoaConstructor, etc. None of them come > anywhere close to PyQT/QTDesigner. For me, the killer feature missing from all of the wx-based designers is that they require sizer based designs at all stages, not even permitting a fixed layout up front as a first draft. Either that, or in any case I've found that did permit a fixed layout, it didn't permit easily turning that into a sizer-based layout later. From an overall design perspective, that was the feature I found most intriguing in QTDesigner. I could randomly drop stuff around the window while doing an initial layout, which is especially helpful when you aren't quite sure yet how you want the layout to look. Then you can select groups of objects and apply the containers to provide for flexible layout. I absolutely prefer sizer-based layouts for a final implementation, but early in the design stages find it more helpful, and freeing, not to be as tied to the containers. With that said, for various reasons I still prefer wxPython to Qt, and at the moment, find wxFormBuilder the best fit for my own designs (even before the direct Python support, just using XRC). -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Executing Commands From Windows Service
T writes: > I have a script, which runs as a Windows service under the LocalSystem > account, that I wish to have execute some commands. Specifically, the > program will call plink.exe to create a reverse SSH tunnel. Right now > I'm using subprocess.Popen to do so. When I run it interactively via > an admin account, all is well. However, when I'm running it via > service, no luck. I'm assuming this is to do with the fact that it's > trying to run under the LocalSystem account, which is failing. What > would be the best way around this? Thanks! The LocalSystem account is not, if I recall correctly, permitted to access the network. You'll have to install the service to run under some other account that has appropriate access to the network. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Executing Commands From Windows Service
T writes: > The more testing I do, I think you may be right..I was able to get it > to work under a local admin account, and it worked under debug mode > (which would also have been running as this user). I'm a bit > surprised though - I was under the assumption that LocalSystem had > rights to access the network? Not from my past experience - the system account (LocalSystem for services) can be surprising, in that it's pretty much unlimited access to all local resources, but severely limited in a handful of cases, one of which is any attempt to access the network. I can't recall for sure if it's an absolute block, or if in some cases you can configure around it (e.g., it might use a null session for remote shares which can be enabled through the registry on the target machine). I've basically stuck "LocalSystem = no network" in my head from past experience. So you can either install your service to run under your existing account, or create an account specifically for running your service, granting that account just the rights it needs. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Executing Commands From Windows Service
David Bolen writes: > Not from my past experience - the system account (LocalSystem for > services) can be surprising, in that it's pretty much unlimited access > to all local resources, but severely limited in a handful of cases, > one of which is any attempt to access the network. I can't recall for > sure if it's an absolute block, or if in some cases you can configure > around it (e.g., it might use a null session for remote shares which > can be enabled through the registry on the target machine). I've > basically stuck "LocalSystem = no network" in my head from past > experience. Given it's been a few years, I decided to try some tests, and the above is too simplistic. The LocalSystem account runs without any local Windows credentials (e.g., not like a logged in user), which has several consequences. One is that you can't access any network resources that require such credentials (like shares). However, there's no sort of firewall filtering or anything, so plain old TCP/IP connections are fine. Unless, of course, the client being used also has other needs for local Windows credentials, independent or as a pre-requisite to the network operations. So backing up a bit, the TCP/IP connection that plink is making is not inherently disabled by running under LocalSystem, but it's certainly possible that plink is trying to identify the user under which it is operating to perhaps identify ssh keys or other local resources it needs to operate. You might be able to cover this with command line options (e.g., plink supports "-i" to specify a key file to use), but you'll also need to ensure that the file you are referencing is readable by the LocalSystem account. One of the other responders had a very good point about locating plink in the first place too. Services run beneath an environment that is inherited from the service control manager process, and won't include various settings that are applied to your user when logged in, especially things like local path changes, and working directories. Should you change the system path (via the environment settings), you'll need to reboot for the service control manager to notice - I don't think you can restart it without a reboot. So it's generally safer to be very clear, and absolute when possible, in a service for paths to external resources. The prior advice of running the service as an identified user (e.g., with local credentials) is still good as it does remove most of these issues since if you can run the script manually under that user you know it'll work under service. But it's not a hard requirement. If your script is dying such that a top level exception is being raised you should be able to find it in the application event log. So that might give further information on what about the different environment is problematic. You can also use the win32traceutil module to help with grabbing debug output on the fly. Import the module in your service, which will implicitly redirect stdout/stderr to a trace buffer. Run the same win32traceutil module from the command line in another window. Then start the service. Any stdout/stderr will be reflected in the other window. Can't catch everything (suppressed exceptions, or I/O that doesn't flow through the script's stdout/stderr), but again might help point in the right direction. -- David -- http://mail.python.org/mailman/listinfo/python-list
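As an illustration of the "be explicit about paths" advice (all of the paths, key file and tunnel details below are made up for illustration; adjust to the real setup):

import subprocess

PLINK = r'C:\Program Files\PuTTY\plink.exe'   # absolute path - services don't share your PATH
KEY = r'C:\Tunnel\tunnel-key.ppk'             # key file readable by the service's account

# Reverse tunnel: remote port 2222 back to local port 22 (example values only)
proc = subprocess.Popen(
    [PLINK, '-N', '-i', KEY, '-R', '2222:localhost:22', 'user@remote.example.com'],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# Later, proc.poll() can be checked and proc.stderr read to see why plink
# exited if the tunnel drops unexpectedly.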
Re: listing existing windows services with python
alex23 writes: > News123 wrote: >> What is the best way with python to get a list of all windows services. >> >> As a start I would be glad to receive only the service names. >> >> However it would be nicer if I could get all the properties of a service >> as well. > > I highly recommend Tim Golden's fantastic WMI module[1]. Another alternative is the win32service module from the pywin32 package (which IMO you'll almost certainly want available when doing any significant Windows-specific operations) which wraps the native win32 libraries for enumerating, querying and controlling services. A simple loop could use EnumServicesStatus to iterate through the services, OpenService with the SERVICE_QUERY_CONFIG flag to get a handle to each service, and then QueryServiceConfig to retrieve configuration information. Since pywin32 is a relatively thin wrapper over the win32 libraries, pure MSDN documentation can be used for help with the calls, augmented by any Python-related information contained in the pywin32 documentation. -- David -- http://mail.python.org/mailman/listinfo/python-list
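A rough sketch of that loop (pywin32 assumed; the tuple returned by QueryServiceConfig follows the MSDN structure, with the binary path at index 3):

import pywintypes
import win32service

scm = win32service.OpenSCManager(None, None,
                                 win32service.SC_MANAGER_ENUMERATE_SERVICE)
try:
    for name, display, status in win32service.EnumServicesStatus(scm):
        try:
            svc = win32service.OpenService(scm, name,
                                           win32service.SERVICE_QUERY_CONFIG)
        except pywintypes.error:
            continue        # not allowed to open this one - skip it
        try:
            config = win32service.QueryServiceConfig(svc)
            print('%s (%s): %s' % (name, display, config[3]))
        finally:
            win32service.CloseServiceHandle(svc)
finally:
    win32service.CloseServiceHandle(scm)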
Re: Generic singleton
Duncan Booth writes: > It is also *everywhere* in the Python world. Unlike Java and C++, Python > even has its own built-in type for singletons. > > If you want a singleton in Python use a module. > > So the OP's original examples become: > > --- file singleton.py --- > foo = {} > bar = [] > > --- other.py --- > from singleton import foo as s1 > from singleton import foo as s2 > from singleton import bar as s3 > from singleton import bar as s4 > > ... and then use them as you wish. In the event you do use a module as a singleton container, I would advocate sticking with fully qualified names, avoiding the use of "from" imports or any other local namespace caching of references. Other code sharing the module may not update things as expected, e.g.: import singleton singleton.foo = {} at which point you've got two objects around - one in the singleton.py module namespace, and the s1/s2 referenced object in other.py. If you're confident of the usage pattern of all the using code, it may not be critical. But consistently using "singleton.foo" (or an import alias like s.foo) is a bit more robust, sticking with only one namespace to reach the singleton. -- David -- http://mail.python.org/mailman/listinfo/python-list
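For instance, building on the thread's singleton.py example, the from-import binding and the module attribute can quietly diverge:

# other.py
from singleton import foo as s1   # s1 now names the original dict object
import singleton

singleton.foo = {'fresh': True}   # rebinds the attribute in the singleton module
print(s1 is singleton.foo)        # False - s1 still refers to the old dict
print(singleton.foo)              # only code using the qualified name sees the change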
Re: Recommend Commercial graphing library
AlienBaby writes: > I'd be grateful for any suggestions / pointers to something useful, Ignoring the commercial vs. open source discussion, although it was a few years ago, I found Chart Director (http://www.advsofteng.com/) to work very well, with plenty of platform and language support, including Python. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Impersonating a Different Logon
Kevin Holleran writes: > Thanks, I was able to connect to the remote machine. However, how do > I query for a very specific key value? I have to scan hundreds of > machines and need want to reduce what I am querying. I would like to > be able to scan a very specific key and report on its value. Any remote machine connection should automatically use any cached credentials for that machine, since Windows always uses the same credentials for a given target machine. So if you were to access a share with the appropriate credentials, using _winreg after that point should work. I normally use \\machine\ipc$ (even from the command line) which should always exist. You can use the wrappers in the PyWin32 library (win32net) to access and then release the share with NetUseAdd and NetUseDel. Of course, the extra step of accessing the share might or might not be any faster than WMI, but it would have a small advantage of not needing WMI support on the target machine - though that may be a non-issue nowadays. -- David -- http://mail.python.org/mailman/listinfo/python-list
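Roughly, that flow might look like the following sketch (machine name, credentials and the registry key queried are all made up for illustration):

import win32net
import _winreg

machine = 'SOMEPC'
share = r'\\%s\ipc$' % machine

# Establish credentials for the target machine via its IPC$ share
win32net.NetUseAdd(None, 2, {'remote': share,
                             'username': 'admin',
                             'password': 'secret',
                             'asg_type': 3})     # 3 = USE_IPC
try:
    hive = _winreg.ConnectRegistry(r'\\%s' % machine, _winreg.HKEY_LOCAL_MACHINE)
    key = _winreg.OpenKey(hive, r'SOFTWARE\Microsoft\Windows NT\CurrentVersion')
    value, vtype = _winreg.QueryValueEx(key, 'ProductName')
    print('%s: %s' % (machine, value))
finally:
    win32net.NetUseDel(None, share, 0)           # 0 = USE_NOFORCE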
Re: Extract a bordered, skewed rectangle from an image
"Paul Hemans" writes: > I am wondering whether there are any people here that have experience with > openCV and Python. If so, could you either give me some pointers on how to > approach this, or if you feel so inclined, bid on the project. There are 2 > problems: Can't offer actual services, but I've done image tracking and object identification in Python with OpenCV so can suggest some approaches. You might also try the OpenCV mailing list, though it's sometimes varies wildly in terms of S/N ratio. And for OpenCV specifically, I definitely recommend the book "Learning OpenCV" by O'Reilly. It's really hard to grasp the concepts and applications of the raw OpenCV calls from the API documentation, and I found the book (albeit not cheap) helped me out tremendously and was well worth it. I'll flip the two questions since the second is quicker to answer. > How to do this through Python into openCV? I am a newbie to Python, not > strong in Maths and ignorant of the usage of openCV. After trying a few wrappers, the bulk of my experience is with the ctypes-opencv wrapper and OpenCV 1.x (either 1.0 or 1.1pre). Things change a lot with the recent 2.x (which needs C++ wrappers), and I'm not sure the various wrappers are as stable yet. So if you don't have a hard requirement for 2.x, I might suggest at least starting with 1.x and ctypes-opencv, which is very robust, though I'm a little biased as I've contributed code to the wrapper. > How do I get openCV to ignore the contents of the label and just focus on > the border? There's likely no single answer, since multiple mechanisms for identifying features in an image exist, and you can also derive additional heuristics based on your own knowledge of the domain space (your own images). Without knowing exactly what the border design to make it easy to detect is, it's hard to say anything definitive. But in broad strokes, you'll often: 1. Normalize the image in some way. This can be to adjust for brightness from various scans to make later processing more consistent, or to switch spaces (to make color matching more effective) or even to remove color altogether if it just complicates matters. You may also mask of entire portions of the image if you have information that says they can't possibly be part of what you are looking for. 2. Attempt to remove noise. Even when portions of an image looks like a solid color, at the pixel level there can be may different variations in pixel values. Operations such as blurring or smoothing help to average out those values and simplify matching entire regions. 3. Attempt to identify the regions or features of interest. Here's where a ton of algorithms may apply due to your needs, but the simplest form to start with is basic color matching. For edge detection (like of your label) convolutions (such as gradient detection) might also ideal. 4. Process identified regions to attempt to clean them up, if possible weakening regions likely to be extraneous, and strengthening those more likely to be correct. Morphology operations are one class of processing likely to help here. 5. Select among features (if more than one) to identify the best match, using any knowledge you may have that can be used to rank them (e.g., size, position in image, etc...) My own processing is ball tracking in motion video, so I have some additional data in terms of adjacent frames that helps me remove static background information and minimize the regions under consideration for step 3, but a single image probably won't have that. 
But given that you have scanned documents, there may be other simplifying rules you can use, like eliminating anything too white or too black (depending on label color). My own flow works like: 1. Normalize each frame 1. Blur the frame (cvSmooth with CV_BLUR, 5x5 matrix). This smooths out the pixel values, improving the color conversion. 2. Balance brightness (in RGB space). I ended up just offsetting the image a fixed (x,x,x) value to maximize the RGB values. Found it worked better doing it in RGB before Lab conversion. 3. Convert the image to the "Lab" color space. I used Lab because the conversion process was fastest, but when frame rate isn't critical, HLS is likely better since hue/saturation are completely separate from lightness which makes for easier color matching. 2. Identify uninteresting regions in the current frame. This may not apply to you, but here is where I mask out static information from prior background frames, based on difference calculations with the current frame, or very dark areas that I knew couldn't include what I was interested in. In your case, for example, if you know the label is going to show up fairly saturated (say it's a solid red or something), you could probably eliminate everything that is b
Re: What's the matter with docs.python.org?
Christian Mertes writes: > On Mi, 2010-05-19 at 16:42 -0700, Aahz wrote: >> Also, I think you need to pass the host HTTP header to access >> docs.python.org > > Look, I don't really want to read Python docs via telnet. I basically > wanted to point out that there is strange behaviour and someone might > feel responsible and look into it. I think the point is that if you are going to use telnet as a diagnostic tool you need to more accurately represent the browser. I just tried and using the Host header renders a completely different response than not (presumably because the server is using virtual hosting). With an appropriate "Host: docs.python.org" you get the actual documentation home page, without it you get the "page has moved" text you saw. It may or may not have anything to do with the original problem, but it probably does explain the response you got when you tried to use telnet as a test tool. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Python 2.7 released
Martineau writes: > Some clarification. I meant installed 2.7 on top of 2.6.x. Doing so > would have interfered with the currently installed version because I > always install Python in the same directory, one named just "Python", > to minimize the number of changes I have to make to to other parts of > the system. That's fine, you're just making a conscious choice to only support (yourself) a single version installed at a time. I tend to need multiple versions around when developing, so I keep a bunch of versions all installed in separate directories as \Python\x.y (so I only have a single root directory). With 2.7, my current box has 6 Python interpreters (2.4-3.1) installed at the moment. I use Cygwin (wouldn't try to work on a Windows system without it), so just use bash aliases to execute the right interpreter, but a batch file could be used with the cmd interpreter, and you could link GUI shortcuts to that batch file. Not sure there's a good solution to your help file link, other than the existing Start menu links installed per Python version. Even with local links you'd probably want separate links per version anyway since they're different documents. Of course, since this started by just considering installing it to get at a single file (which I know was since solved), it's probably an acceptable use case for violating your standard policy and picking a different directory name just in this case, and then blowing it away later. :-) >I also believe the Windows installer makes registry > changes that also involve paths to the currently installed version, > which again, is something I wanted to avoid until I'm actually ready > to commit to upgrading. The path information installed in the registry (Software\Python\PythonCore under HKLM or HKCU depending on installation options) is structured according to major.minor release (e.g., 2.6 vs. 2.7 are distinct), but you're right that Windows only supports one file extension mapping, so the last Python to be installed gets associated with .py/.pyw etc... by default. But you can optionally disable this during installation: on the customize screen shown during installation, de-select the "Register Extensions" option, and the active install won't change any existing mappings and thus have no impact on your current default installation. > If there are better ways on Windows to accomplish this, I'd like to > hear about them. I suppose I could use hardlinks or junctions but > they're not well supported on most versions of Windows. If you're still using the basic Windows command prompt or GUI links then a batch file is the simplest way to go. With something like Cygwin (which I personally would never do without), then you have a variety of techniques available including links, shell aliases, etc... -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: About problems that I have with learning wxPython in Macintosh
"ata.jaf" writes: > import wx > > class MainWindow(wx.Frame) : > def __init__(self, parent, title) : >wx.Frame.__init__(self, parent, title=title, size=(200, 100)) >self.control = wx.TextCtrl(self, style=wx.TE_MULTILINE) >self.CreateStatusBar() > >filemenu = wx.Menu() > >filemenu.Append(wx.ID_ABOUT, '&About', ' Information about this > program.') >filemenu.AppendSeparator() >filemenu.Append(wx.ID_EXIT, 'E&xit', ' Terminate the program') > >menuBar = wx.MenuBar() >menuBar.Append(filemenu, '&File') >self.SetMenuBar(menuBar) >self.Show(True) > > app = wx.App(False) > frame = MainWindow(None, 'Sample editor') > app.MainLoop() > > The menus doesn't appear in the product. > Can anyone help me to find a tutorial that is for using wxPython on a > Mac? I think the menus are actually working as designed, and they are present, but just not perhaps what or where you expected. That's because some of the standard IDs (e.g., wx.ID_ABOUT) and some names (e.g., "E&xit") are adjusted under OSX to conform to that platform's menu standard. This is actually to your benefit, as you can use the same wxPython code to get menus on each platform to which users on that platform will be familiar. So for example, ID_ABOUT and ID_EXIT are always under the Application menu (and E&xit becomes &Quit) which is where Mac users expect them to be. Mac users would be quite confused if your application exited with Command-x rather than Command-Q. See http://wiki.wxpython.org/Optimizing%20for%20Mac%20OS%20X for a little more information. There are also a series of methods on wxApp if you want finer control over this (such as SetMacAboutMenuItemId, SetMacExitMenuItemId, SetMacPreferencesMenuItemId) but using the standard ID_* names does it automatically. If you're looking for your own specific menus, I'd just switch away from the standard ids and names. For example, if you switched the wx.ID_* in the above to -1, you'd see them show up under the File menu rather than relocated to the Application menu. Although "E&xit" would still get replaced with "&Quit". But if you are in fact setting up a menu for an application exit, I'd let wxPython do what it's doing, as your application will appear "normal" to users on the Mac. I'd also suggest moving over to the wxPython mailing list for followup questions as there are more folks there familiar with wxPython. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: time between now and the next 2:30 am?
Neil Cerutti writes: > On 2010-07-23, Jim wrote: >> How can I calculate how much time is between now and the next >> 2:30 am? Naturally I want the system to worry about leap >> years, etc. > > You need the datetime module. Specifically, a datetime and > timedelta object. Although it sounds like the question is to derive the timedelta value, so it's not known up front. That's a little trickier since you'd need to construct the datetime object for "next 2:30 am" to subtract "now" from to get the delta. But that requires knowing when the next day is, thus dealing with month endings. Could probably use the built-in calendar module to help with that though. For the OP, you might also take a peek at the dateutil third party module, and its relativedelta support, which can simplify the creation of the "next 2:30 am" datetime object. Your case could be handled by something like: from datetime import datetime from dateutil.relativedelta import relativedelta target = datetime.now() + relativedelta(days=+1, hour=2, minute=30, second=0, microsecond=0) remaining = target - datetime.now() This ends up with target being a datetime instance for the next day at 2:30am, and remaining being a timedelta object representing the time remaining, at least as of the moment of its calculation. Note that relativedelta leaves fields alone that aren't specified, so since datetime.now() includes down to microseconds, I clear those explicitly). Since you really only need the date, you could also use datetime.date.today() instead as the basis of the calculation and then not need second/microsecond parameters to relativedelta. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Off-topic: Usenet archiving history
Dennis Lee Bieber writes: > Either way -- it was still a change from "expiration at some > date"... Though since (Netcom/Mindspring)Earthlink seems to have > subcontracted NNTP service to Giganews (or some such) it wouldn't > surprise me to learn that service also keeps a mammoth archive... I'm not sure it's really a change, or if it is, it certainly isn't a change from how things were originally. "Expiration at some date" was never any sort of global policy for Usenet - just an aspect of a individual news server. Some servers held messages for long periods, particularly for the big seven groups - it's true that alt.* and in particular the binaries, might expire quickly. I know I certainly ran some servers that didn't bother expiring - or had expiration times in years - of the big seven. My experience post-dates the great renaming, so I can't speak to before that, but don't think behavior was very different. Individual messages could include an Expires: header if they wished, but even that was just a suggestion. Any actual expiration was due to local configuration on each news server, which while it could take Expires: headers into account, was just as often driven by local storage availability or the whims of the local news admin :-) I think Deja News was providing web access to their archive from the mid-90s on (so quite a while before Google even existed) so certainly by that point everyone had access to a rather complete archive even if messages had expired on their local server. I think Deja was also the first to introduce X-No-Archive. But other archives certainly existed pre-Deja, which I'm sure is, in large part, how Google was able to locate and incorporate the older messages into their system after their acquisition of the Deja archive. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Off-topic: Usenet archiving history
Ben Finney writes: > David Bolen writes: > >> Individual messages could include an Expires: header if they wished, > > Since we're already well off-topic: NNTP, HTTP, and email, and probably > other protocols as well, all deal with messages. They are all consistent > in defining a message [0] as having *exactly one* header. Heh, I'm not sure it's quite as consistent as you may think, particularly with older RFCs, which are relevant in this discussion since we're talking about historical artifacts. For example, while more recent mail RFCs like 2822 may specifically talk about header fields as the "header" (singular) of the message, the older RFC 822 instead refers to a "headers" (plural) section. > Every time you call a field from the header “a header”, or refer to > the plural “headers of a message”, the IETF kills a kitten. You > don't want to hurt a kitten, do you? Heaven forbid - though I'd think I could hold my own with the IETF. My reference to "header" was in lieu of "header line", something that the Usenet RFCs (1036, and the older 850) do extensively themselves. But I'll be more careful in the future - need to ensure kitten safety! -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: Suppressing Implicit Chained Exceptions (Python 3.0)
"andrew cooke" writes: > However, when printed via format_exc(), this new exception still has the > old exception attached via the mechanism described at > http://www.python.org/dev/peps/pep-3134/ (this is Python 3.0). If you're in control of the format_exc() call, I think the new chain keyword parameter can disable this and restore the old behavior. If you're not in control of the traceback display, I'm not sure there's an easy way to prevent it, given that displaying chained exceptions is the default mode. -- David -- http://mail.python.org/mailman/listinfo/python-list
Re: PDF: finding a blank image
DrLeif writes:

> What I would like to do is have python detect a "blank" pages in a PDF
> file and remove it. Any suggestions?

The odds are good that even a blank page is being "rendered" within the PDF as having some small bits of data, due to scanner resolution, imperfections on the page, etc. So I suspect you won't be able to just look for a well-defined pattern in the resulting PDF or anything.

Unless you're using OCR, the odds are good that the scanner is rendering the PDF as an embedded image. What I'd probably do is extract the image of each page, and then use image processing on it to try to identify blank pages.

I haven't had the need to do this myself, and tool availability would depend on platform, but for example, I'd probably try ImageMagick's convert operation to turn the PDF into images (like PNGs). I think Gimp can also do a similar conversion, but you'd probably have to script it yourself.

Once you have an image of a page, you could then use something like OpenCV to process the page (perhaps a morphology operation to remove small noise areas, then a threshold or non-zero counter to judge "blankness"), or probably just something like PIL, depending on the complexity of the processing needed.

Once you identify a blank page, removing it could either be done with pure Python (there have been other posts recently about PDF libraries) or with external tools (such as pdftk under Linux, for example).

-- David
-- http://mail.python.org/mailman/listinfo/python-list
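A rough sketch of that pipeline, assuming ImageMagick and PIL are available and sticking to PIL rather than OpenCV; the input.pdf name, the 150 dpi density, and both thresholds are guesses you'd need to tune against real scanner output:

    import glob
    import subprocess
    from PIL import Image

    # Render each PDF page to its own PNG via ImageMagick's convert.
    subprocess.check_call(["convert", "-density", "150",
                           "input.pdf", "page-%03d.png"])

    for name in sorted(glob.glob("page-*.png")):
        img = Image.open(name).convert("L")       # force grayscale
        hist = img.histogram()                    # 256 buckets for "L" mode
        dark = sum(hist[:200])                    # pixels darker than ~200/255
        total = img.size[0] * img.size[1]
        if dark / float(total) < 0.005:           # under 0.5% "ink" -> call it blank
            print("%s looks blank" % name)

Pages flagged this way could then be dropped from the original with something like pdftk's "cat" operation over the remaining page numbers.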
Re: Why not enforce four space indentations in version 3.x?
John Nagle writes:

> Python 3 enforces the rule that you can't mix tabs and spaces
> for indentation in the same file. That (finally) guarantees that
> the indentation you see is what the Python parser sees. That's
> enough to prevent non-visible indentation errors.

Are you sure? It seems to restrict them in the same block, but not in the entire file. At least I was able to use both space and tab indented blocks in the same file with Python 3.0 and 3.1.

I suspect precluding any mixture at all at the file level would be more intrusive, for example, when trying to combine multiple code sources in a single file. Not that this really changes your final point, since the major risk of a mismatch between the parser vs. visual display is within a single block.

> It also means that the Python parser no longer has to have
> any concept of how many spaces equal a tab. So the problem
> is now essentially solved.

"has to have" being a future possibility at this point, since I'm fairly sure the 3.x parser does technically still have the concept of a tab size of 8, though now it can be an internal implementation detail.

-- David
-- http://mail.python.org/mailman/listinfo/python-list
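One way to see the block-versus-file distinction for yourself - a small sketch using compile() so the tabs can be written as explicit "\t" escapes (the function names are arbitrary):

    # Tabs in one block and spaces in another, separate, block: accepted.
    ok = "def f():\n\tpass\n\ndef g():\n    pass\n"
    compile(ok, "<ok>", "exec")

    # A tab and spaces mixed within the *same* block: Python 3 rejects it.
    bad = "def h():\n\tx = 1\n        return x\n"
    try:
        compile(bad, "<bad>", "exec")
    except TabError as exc:
        print("rejected:", exc)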
Re: Why not enforce four space indentations in version 3.x?
Miles Kaufmann writes:

> On Jul 14, 2009, at 5:06 PM, David Bolen wrote:
>> Are you sure? It seems to restrict them in the same block, but not in
>> the entire file. At least I was able to use both space and tab
>> indented blocks in the same file with Python 3.0 and 3.1.
>
> It seems to me that, within an indented block, Python 3.1 requires
> that you are consistent in your use of indentation characters *for
> that indentation level*. For example, the following code seems to be
> allowed:

Um, right - in other words, what I said :-)

-- David
-- http://mail.python.org/mailman/listinfo/python-list
Re: Why not enforce four space indentations in version 3.x?
Nobody writes:

> On Thu, 16 Jul 2009 09:18:47 -0500, Tim Chase wrote:
>
>> Yes, the dictatorial "a tab always equals 8 spaces"
>
> Saying "always" is incorrect; it is more accurate to say that tab stops
> are every 8 columns unless proven otherwise, with the burden of proof
> falling on whoever wants to use something different.

I suspect Tim was referring to the Python tokenizer. Internally, barring the existence of one of a few Emacs/vi tab setting commands in the file, Python always assigns the logical indentation level for a tab to align with the next multiple-of-8 column. This is unrelated to how someone might choose to display such a file.

So mixing tabs and spaces, and using a visual display setting of something other than 8 for the tab size (other than one consistent with an instruction embedded in the file), can yield a discrepancy between what is shown on the screen and how the same code is perceived by the Python compiler. This in turn may cause errors, or code that executes at different indent levels than expected.

Thus, in general, such a mixture is a bad idea, and as per this thread, it is no longer permitted in a single block in Python 3.x.

-- David
-- http://mail.python.org/mailman/listinfo/python-list
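A sketch of the kind of display-versus-parser discrepancy described above, again using compile() with an explicit "\t" escape (a, foo and bar are just placeholders and the snippet is only compiled, never run):

    # To the tokenizer the tab puts bar() at column 8, i.e. in the same block
    # as foo(), so it runs only when a is true. An editor showing tabs as 4
    # columns displays bar() at a shallower indent than foo(), as if the two
    # statements were in different blocks.
    src = ("if a:\n"
           "        foo()\n"   # eight spaces
           "\tbar()\n")        # one tab = column 8 to the tokenizer
    compile(src, "<demo>", "exec")   # accepted by Python 2; Python 3 raises TabError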