In gluon/streamer.py there are a number of places were open() call is made on files. These are within a try block and exceptions are caught and the specific reason for the error is determined and different HTTP error returned. For example on such case from that file is:
try: open(static_file) except IOError, e: if e[0] == 21: raise HTTP(403, error_message, web2py_error='file is a directory') elif e[0] == 13: raise HTTP(403, error_message, web2py_error='inaccessible file') else: raise HTTP(404, error_message, web2py_error='invalid file') Any book on good programming style will tell you that you shouldn't use magic constants like above as it reduces maintainability and understandability. I am also not entirely sure that the POSIX error codes are guaranteed to have the same values on different platforms which don't share a common ABI. This is why one should always use the symbolic names for the constants instead. For Python such symbolic names are in the 'errno' module. I fully realise that you may save an ever so small amount of time by using hard wired constants rather than the symbolic names due to the latter needing a lookup, but this is generally a poor trade off when it comes to maintainability. It is not even on the main code path anyway, but an error condition. Suggest therefore you use instead: try: open(static_file) except IOError, e: if e[0] == errno.EISDIR: raise HTTP(403, error_message, web2py_error='file is a directory') elif e[0] == errno.EACCES: raise HTTP(403, error_message, web2py_error='inaccessible file') else: raise HTTP(404, error_message, web2py_error='invalid file') Do similar thing for other places in that file where that sort of check is done. For the case above, that code is somewhat questionable anyway. What the section of code does that starts with that bit of code is: 1. Try and open() the file just to see if it is possible. The resulting file object is then promptly ignored, not even closed. Yes it will close when scope left, but always better to be explicit and close as early as possible. 2. Perform a stat() on the file based on the path name to the file. This is done to determine file size and modification time. 3. Open the file yet again, this time so contents can be streamed back. This is a pretty wasteful way of going about things. First off, the EACCES could have been obtained from failure of the stat () call. Second, the EISDIR could have been determined from data contained in the structure returned by the stat() call. Thus the whole first open() could have been avoided. Third, a stat() call is going to be more expensive than a fstat() call after having opened the file for actual reading as fstat() can refer direct to the already open record for the file. Thus, the most efficient way would be to open() the file as if was going to use it for streaming. If that fails, can return the 403/404 based on that. Having got the file object, then perform a fstat() on the file object to get size and modification time. Then use the file handle for streaming. Yes there is other stuff happening in the stream_file_or_304_or_206() function, but can't see that it will be hard to fit in with that sort of change. On top of that, there is always an ever so small risk that the file could be written to between time stat is performed and file completely stream back. Imagine it was a log file for example that was being written to all the time. For normal case where whole file is to be returned, streamer() isn't passed bytes argument, ie., number of bytes to return and which was set in response Content-Length. This means that streamer() could actually return more bytes than Content-Length specified. For Apache/mod_wsgi that will not matter as it will truncate any returned data longer than Content-Length, but other WSGI servers may not do so. Thus suggest that bytes for length always be supplied to streamer. This will just protect against chance that send back more data than expected. If this were to occur, could stuff up where client was using keep alive on the connection. If it turns out less data was sent back because file got truncated, then Content-Length already sent and can't do anything about it. In that case client will timeout and close connection without reusing it for keep alive, so at least can't stuff up following request over same connection. One final thing, as already mentioned in prior mail, I think the 1MB chunk size should be reviewed as possibly performs worse than 64KB for some WSGI hosting mechanism. Some testing on what gives best results for a range of WSGI hosting mechanisms and platforms should be performed. The actual chunk size then perhaps should be configurable. The comments in the web2py book which seem to highlight this 1MB chunk size for some reason would need to be changed. Graham --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "web2py-users" group. To post to this group, send email to web2py@googlegroups.com To unsubscribe from this group, send email to web2py+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/web2py?hl=en -~----------~----~----~----~------~----~------~--~---