Thank you Armin, I am passing this to the list and will address them asap.
Begin forwarded message: > From: Armin Ronacher <armin.ronac...@active-4.com> > Date: August 2, 2009 1:10:29 PM CDT > To: "DiPierro, Massimo" <mdipie...@cs.depaul.edu> > Subject: My thoughts on web2py > > Hi Massimo, > > First a word of warning. I send you this mail in private because I > want > to share my thoughts on the web2py code and because I hope I can help > with that somehow. You're free to publish this mail if you want. > > As you might remember I came in contract with web2py a long ago when > you > were evaluating pygments as a syntax highlighter for the web2py > documentation or something and you decided to implement your own > because > you were unable to "ship" pygments as part of web2py. As far as I > remember py2app or py2exe were not able to pick up the on-demand > imports > in the lexer module. > > I have been watching the development ever since because (And please > don't think bad of me because of that) it appeared to me that web2py > was > a joke. I remember reading some very early revisions of your PDF > documentation where I remember a code that looks something like that: > > db.define_table('something', db.Field('meh', 'date', > default=date.today())) > > That line freaked me out. This line alone showed me that web2py is > doing something very, very weird. From this line I could see that it > means one of the following two things: > > - either the code has a bug for long running applications (in this > case > when the application code runs longer than a day in which case the > default would have to change dynamically) > > - or the global scope has different semantics and is reevaluated each > request. > > Turns out the latter is what happens in web2py. There still is a race > condition but only a small one. The bigger problem with this > however is > that this means the (not so) static definitions are reevaluated each > request. > > Especially for the database table definitions (which cause many > function > calls) this means an enormous per-request penality if you have many of > them. I even remember reading a case like this in the web2py > newsgroup > about someone who had twenty table definitions or something and the > majority of the request time was spent setting up the tables. > > However that also means another thing. Standard Python > optimizations no > longer apply. In Python usually modules are cached in an interpreter > bound dictionary for reusing. This of course does not apply for > web2py. > However the second problem here is that only the local scope gets > certain optimizations in cpython such as fast-local lookup. Instead > of > storing the variables in a dictionary they are instead stored in an an > object where items are looked up by index (fast locals) rather than > by a > string. > > Now of course one can argue that this is the way it's intended. > That's > nice but for that, the documentation is lacking. There is no > explanation of what implications this has on the code. And in > comparison with "enterprise ready" that gives me a strange feeling > about > the project. > > I'm not even exactly sure what kills web2py for me in the end, but I'm > pretty sure that part of the reason is the way the project is > advertised. > > web2py does not have worse code than many popular libraries out there. > BeautifulSoup comes to mind. That library does some really ugly > things > there including monkeypatching of the standard library. It does > different things depending on the version of Python it's running on > and > in the code are some really ugly idioms like a unicode subclass that > has > a reference to the parsed tree that results in memory leaks very > easily > because it's nearly impossible to get rid of the reference (actually > just by exploiting some weird Python internals that are mostly > undocumented). But the author of BeautifulSoup does not advertise > it as > a backwards-compatible, enterprise ready HTML parser. > > The web2py code on the other hand has some really problematic idioms > as > well. The execution of models, views and whatnot in a made-up > namespace > comes to mind. Due to the decision of being backwards compatible it > will be nearly impossible to "fix" those things. However there are so > many implications nobody yet knows about which is terribel for large > projects. > > For example the database table redfinition problem cited above. Of > course that particular problem can be worked around with making the > definition code internally lazy or something but there can be more > nobody yet knows about. > > Especially side-effects of code execution can do some terrible damage. > Decorators that would register code would instantly leak for example. > And some libraries provide auto-registration that you would not > instantly know about. > > Also libraries like pickle do not support weird situations like this. > > So I guess in the end I'm not (only) complaining about the web2py code > or decisions there, but mainly how web2py is advertised as enterprise > ready when nobody knows how these idioms scale or behave. > > Another part is that web2py (and this is something web2py has in > common > with so many other web frameworks) is the craze to reimplement > everything. I know you don't want to pull depedencies in but from > what > I can see you're providing one-click installers for the framework > which > could ship external dependencies. Both py2exe and py2app provide > support for external dependencies and I can't see what "not > depending on > webob/werkzeug" gives you there. > > The reason why I'm addressing this is that there are so many > limitations, problems and weird aspects in WSGI and HTTP that are hard > to tackle and many people get wrong. I'm not saying you're getting it > wrong, I haven't even studied the code in question, but I'm pretty > sure > it would make sense to collaborate on that. > > About the code quality itself, not the (IMO) broken concepts I picked > some of the snippets I found quickly searching with the help of ack: > > - Missing `close()` calls with files (ignored matches from > compileapp): > > gluon/fileutils.py:160: f.write(open(tarname, 'rb').read()) > gluon/fileutils.py:246: data = open(filename, 'rb').read() > gluon/highlight.py:313: data = open(sys.argv[1]).read() > gluon/languages.py:37: lang_text = open(filename, 'r').read()... > gluon/languages.py:153: data = open(file, 'r').read() > gluon/main.py:107:web2py_version = open(os.path.join(web2py_path, ... > gluon/rewrite.py:26: exec open('routes.py', 'r').read() ... > gluon/template.py:104: text = open(os.path.join(path, ... > gluon/template.py:118: parent = open(t, 'rb').read() > gluon/template.py:134: child = open(t, 'rb').read() > gluon/tools.py:1934: return urllib.urlopen(url).read() > gluon/widget.py:55:ProgramVersion = open('VERSION', ... > > - Same in streamer.py for checking for file existence > (open(static_file)) > > - Template engine seems to be missing proper escapes. Having ''' in > the > template should be enough to break it. > > - Whitespace between "else" and colon in the template engine is > unlikely > to work with the current parsing rules as far as I can see. > > - xmlrpc system sends text/xml content type which is with a missing > charset property set to us-ascii. No charset is set so it has to be > specified explicitly (proper header would be application/xml; > charset=utf-8) > > - Fileutils has a listdir method that does very inefficient deleting > from the list. Current code. > > 50 for dir in dirs[:]: > 51 if dir.startswith('.'): > 52 dirs.remove(dir) > > That creates a shallow copy, for each item it performs an O(n) lookup > for the item and the remove is a O(n) operation as well. Even > unecessary because you have the position if you enumerate over the > list. > > However you can do that with a list comprehension: > > 50 dirs[:] = [x for x in dirs if not x.startswith('.')] > > - Having classes in local scope (fileutils.py for the tarfile) makes > it > incredible hard to reuse code. Not a good idea, move that into the > global scope instead. Will also help performance. > > - There is no bug in shutil.copyfileobj. That function works perfetly > and does what the documentation says it should do. Maybe you were > copying from the wsgi input stream. That of course won't work that > easy because there is no end-of-file marker in the input stream. See > werkzeug's LimitedStream for ways how to fix that problem. A similar > class also exists in WebOb but I don't remember the name. > > - You should not inherit from BaseException unless you are dealing > with > special exceptions that should not be catched by an "except > Exception" > > - there are realtive imports all over the place. This is a deprecated > unfeature, switch to "from gluon.html import blah" instead of the > current "from html import blah". > > - the email code in tools is vulnerable to header injections if the > input is unfiltered. If this is intentional it should be documented > in the code. > > - There is no proper undecoding of the path info in main.py but spaces > are transformed into underscores. However a plus is converted into a > space as well which is against the RFC. Plus is only an alias for > %20 > in the querystring of a URL. > > - request.vars is either a dict of lists or a dict of plain strings. > This means the user has to perform instance checks each access, not > very clever. Other systems fix that by providing multi-value dicts. > > - time.strftime should not be used for header-date formatting because > it's affected by locales. That means if a user is using locales it > will break all headers sent that deal with dates. > > This list was created in ~15 minutes from quickly checking some files. > If one does a check on the code more things will come up. Some things > are also underspecified or unclear in the code. For example how the > code is supposed to deal with unicode (it does that in very different > ways) and how headers are quoted/unquoted, how URLs are decoded etc. > Also the code architecture is suboptimal in places. Hardcoded support > for Google appengine in places where you wouldn't expect it (like > validators etc.) > > For an "enterprise" framework this is a bit too much smelly code > that is > found in no time. > > What would I fix? Stop advertising the framework as enterprise ready > until it matured more. And do less advertising in general. And if > you > can, fix some of the things in the code. Those that are caused by > design choices won't be fixed anyway. > > > Regards, > Armin > --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "web2py-users" group. To post to this group, send email to web2py@googlegroups.com To unsubscribe from this group, send email to web2py+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/web2py?hl=en -~----------~----~----~----~------~----~------~--~---