I am about to catch a plane so I will be short. We can talk more about this in a couple of days.
> > First a word of warning. I send you this mail in private because I > > want > > to share my thoughts on the web2py code and because I hope I can help > > with that somehow. You're free to publish this mail if you want. This is constructive criticism so I appreciate it. > > As you might remember I came in contract with web2py a long ago when > > you > > were evaluating pygments as a syntax highlighter for the web2py > > documentation or something and you decided to implement your own > > because > > you were unable to "ship" pygments as part of web2py. As far as I > > remember py2app or py2exe were not able to pick up the on-demand > > imports > > in the lexer module. > > > I have been watching the development ever since because (And please > > don't think bad of me because of that) it appeared to me that web2py > > was > > a joke. I remember reading some very early revisions of your PDF > > documentation where I remember a code that looks something like that: > > > db.define_table('something', db.Field('meh', 'date', > > default=date.today())) > > > That line freaked me out. This line alone showed me that web2py is > > doing something very, very weird. From this line I could see that it > > means one of the following two things: > > > - either the code has a bug for long running applications (in this > > case > > when the application code runs longer than a day in which case the > > default would have to change dynamically) > > > - or the global scope has different semantics and is reevaluated each > > request. > > > Turns out the latter is what happens in web2py. Correct. > There still is a race > > condition but only a small one. No there is not. Why do you think there is a race condition? > The bigger problem with this > > however is > > that this means the (not so) static definitions are reevaluated each > > request. Correct. A small price to pay for a lot of flexibility. We are working on lazy evaluation of tables. There is also a speedup if you press the "compile" button in admin. > > Especially for the database table definitions (which cause many > > function > > calls) this means an enormous per-request penality if you have many of > > them. I even remember reading a case like this in the web2py > > newsgroup > > about someone who had twenty table definitions or something and the > > majority of the request time was spent setting up the tables. There was a slow down problem that has been fixed by Alexey. I agree there is still room for improvement. > > However that also means another thing. Standard Python > > optimizations no > > longer apply. In Python usually modules are cached in an interpreter > > bound dictionary for reusing. This of course does not apply for > > web2py. > > However the second problem here is that only the local scope gets > > certain optimizations in cpython such as fast-local lookup. Instead > > of > > storing the variables in a dictionary they are instead stored in an an > > object where items are looked up by index (fast locals) rather than > > by a > > string. > > > Now of course one can argue that this is the way it's intended. > > That's > > nice but for that, the documentation is lacking. Yes it is true that documentation is lacking. > There is no > > explanation of what implications this has on the code. And in > > comparison with "enterprise ready" that gives me a strange feeling > > about > > the project. > > > I'm not even exactly sure what kills web2py for me in the end, but I'm > > pretty sure that part of the reason is the way the project is > > advertised. We prefer to talk about "advocating" not "advertising" since we do not sell it. Mistakes were made on my side. > > web2py does not have worse code than many popular libraries out there. Thanks. I take is as a compliment. > > BeautifulSoup comes to mind. That library does some really ugly > > things > > there including monkeypatching of the standard library. It does > > different things depending on the version of Python it's running on > > and > > in the code are some really ugly idioms like a unicode subclass that > > has > > a reference to the parsed tree that results in memory leaks very > > easily > > because it's nearly impossible to get rid of the reference (actually > > just by exploiting some weird Python internals that are mostly > > undocumented). But the author of BeautifulSoup does not advertise > > it as > > a backwards-compatible, enterprise ready HTML parser. > > > The web2py code on the other hand has some really problematic idioms > > as > > well. The execution of models, views and whatnot in a made-up > > namespace > > comes to mind. Yes. It was a design decision to be closer to Rails than Django in this respect. I think this is what is making web2py popular, more than anything else. > Due to the decision of being backwards compatible it > > will be nearly impossible to "fix" those things. We think of them as features, not bugs. > However there are so > > many implications nobody yet knows about which is terribel for large > > projects. > > > For example the database table redfinition problem cited above. Of > > course that particular problem can be worked around with making the > > definition code internally lazy or something but there can be more > > nobody yet knows about. Alexey has already done this. It achieves a 10-20% speedup. I am rewriting the DAL completely to make it easier to port on other backends and faster. The new one may support lazy evaluations of tables. Time permitting, > > Especially side-effects of code execution can do some terrible damage. > > Decorators that would register code would instantly leak for example. > > And some libraries provide auto-registration that you would not > > instantly know about. Can you provide an example of what you mean? I am pretty sure there is no leak. > > Also libraries like pickle do not support weird situations like this. > > So I guess in the end I'm not (only) complaining about the web2py code > > or decisions there, but mainly how web2py is advertised as enterprise > > ready when nobody knows how these idioms scale or behave. I disagree with "nobody knows how these idioms scale or behave"? We have been doing this for 2 years. We know very well how it works and scales. It actually works very well because as soon as a request is completed the environment is deleted, this is clean, avoids conflicts when multiple apps are running, and prevents memory leaks. > > Another part is that web2py (and this is something web2py has in > > common > > with so many other web frameworks) is the craze to reimplement > > everything. I know you don't want to pull depedencies in but from > > what > > I can see you're providing one-click installers for the framework > > which > > could ship external dependencies. Both py2exe and py2app provide > > support for external dependencies and I can't see what "not > > depending on > > webob/werkzeug" gives you there. web2py started as a teaching tool. It was important (and still is) that I can go over every line of code with my students know exactly why it is there. > > The reason why I'm addressing this is that there are so many > > limitations, problems and weird aspects in WSGI and HTTP that are hard > > to tackle and many people get wrong. I agree. > I'm not saying you're getting it > > wrong, I haven't even studied the code in question, but I'm pretty > > sure > > it would make sense to collaborate on that. I like the idea. Some conditions have to be met: 1) we have a new contributor license agreement on web2py-developers 2) patches must not break backward compatibility 3) patches must either make the code smaller/faster, or add a needed functionality. > > About the code quality itself, not the (IMO) broken concepts I picked > > some of the snippets I found quickly searching with the help of ack: > > > - Missing `close()` calls with files (ignored matches from > > compileapp): > > > gluon/fileutils.py:160: f.write(open(tarname, 'rb').read()) > > gluon/fileutils.py:246: data = open(filename, 'rb').read() > > gluon/highlight.py:313: data = open(sys.argv[1]).read() > > gluon/languages.py:37: lang_text = open(filename, 'r').read()... > > gluon/languages.py:153: data = open(file, 'r').read() > > gluon/main.py:107:web2py_version = open(os.path.join(web2py_path, ... > > gluon/rewrite.py:26: exec open('routes.py', 'r').read() ... > > gluon/template.py:104: text = open(os.path.join(path, ... > > gluon/template.py:118: parent = open(t, 'rb').read() > > gluon/template.py:134: child = open(t, 'rb').read() > > gluon/tools.py:1934: return urllib.urlopen(url).read() > > gluon/widget.py:55:ProgramVersion = open('VERSION', ... > > > - Same in streamer.py for checking for file existence > > (open(static_file)) I do not see the problem. Files are automatically closed when the file object goes out of scope. Can you provide an example when it does not? > > - Template engine seems to be missing proper escapes. Having ''' in > > the > > template should be enough to break it. I do not believe that is the case but I may be wrong. Can you provide an example of what you say? > > - Whitespace between "else" and colon in the template engine is > > unlikely > > to work with the current parsing rules as far as I can see. I agree it will not work. > > - xmlrpc system sends text/xml content type which is with a missing > > charset property set to us-ascii. No charset is set so it has to be > > specified explicitly (proper header would be application/xml; > > charset=utf-8) This has been fixed by user Mateusz. The patch will be incorporated soon. > > - Fileutils has a listdir method that does very inefficient deleting > > from the list. Current code. > > > 50 for dir in dirs[:]: > > 51 if dir.startswith('.'): > > 52 dirs.remove(dir) > > > That creates a shallow copy, for each item it performs an O(n) lookup > > for the item and the remove is a O(n) operation as well. Even > > unecessary because you have the position if you enumerate over the > > list. > > However you can do that with a list comprehension: > > > 50 dirs[:] = [x for x in dirs if not x.startswith('.')] Thanks. We will do so. > > - Having classes in local scope (fileutils.py for the tarfile) makes > > it > > incredible hard to reuse code. Not a good idea, move that into the > > global scope instead. Will also help performance. Thanks. We will do so. > > - There is no bug in shutil.copyfileobj. That function works perfetly > > and does what the documentation says it should do. Maybe you were > > copying from the wsgi input stream. That of course won't work that > > easy because there is no end-of-file marker in the input stream. See > > werkzeug's LimitedStream for ways how to fix that problem. A similar > > class also exists in WebOb but I don't remember the name. I am not sure what you are referring to here. Is this about a comment in the code? > > - You should not inherit from BaseException unless you are dealing > > with > > special exceptions that should not be catched by an "except > > Exception" > > > - there are realtive imports all over the place. This is a deprecated > > unfeature, switch to "from gluon.html import blah" instead of the > > current "from html import blah". This is an aesthetic issue. The current code works well with all supported python interpreters. Anyway, I do not oppose changing it, but it is tedious to do. > > - the email code in tools is vulnerable to header injections if the > > input is unfiltered. If this is intentional it should be documented > > in the code. Can you provide an example? If this is the case, it is not intentional. > > - There is no proper undecoding of the path info in main.py but spaces > > are transformed into underscores. However a plus is converted into a > > space as well which is against the RFC. Plus is only an alias for > > %20 > > in the querystring of a URL. web2py does not allow spaces in incoming urls. It just replaces them with _. It should also replace + with _ . Are you saying that is not the case? Perhaps you see a problem that I do not see. can you provide an example of when this would be a problem? > > - request.vars is either a dict of lists or a dict of plain strings. > > This means the user has to perform instance checks each access, not > > very clever. Other systems fix that by providing multi-value dicts. This cannot be changed because would break backward compatibility. I think it is fine as it is. This is how the cgi module parses vars. > > - time.strftime should not be used for header-date formatting because > > it's affected by locales. That means if a user is using locales it > > will break all headers sent that deal with dates. We do specify the format string. How do you suggest we change it? > > This list was created in ~15 minutes from quickly checking some files. > > If one does a check on the code more things will come up. Some things > > are also underspecified or unclear in the code. For example how the > > code is supposed to deal with unicode (it does that in very different > > ways) and how headers are quoted/unquoted, how URLs are decoded etc. > > Also the code architecture is suboptimal in places. Hardcoded support > > for Google appengine in places where you wouldn't expect it (like > > validators etc.) Unfortunately that is hard to avoid. GAE does things differently. > > For an "enterprise" framework this is a bit too much smelly code > > that is > > found in no time. > > What would I fix? Stop advertising the framework as enterprise ready > > until it matured more. And do less advertising in general. And if > > you > > can, fix some of the things in the code. Those that are caused by > > design choices won't be fixed anyway. As I mentioned we say "advocating" not "advertising". The "enterprise" feature of web2py is not that it is perfect (and I do not know of any enterprise application that is perfect, what would perfect mean anyway). By "enterprise" we mean: - our target users are small and medium businesses and non-profits (the term enterprise includes both) - we understand the needs of users by guaranteeing backward compatibility thus protecting their long term investment in web2py (this is not negotiable) - we care about usability and thus we try make web2py as easy as possible. - we fix bugs regularly, usually within 24 of when they are reported. "enterprise" does not mean that code cannot be improved. There is no code that cannot be improved. I will look into your comments more in details and I encourage other users to do so too. I appreciate them and they will help us improve. web2py was originally copyrighted as "Enterprise Web Framework" so this stuck with the name. The name changed to Gluon because there is a company called "EWF". The name changed again to web2py because there is a company that owns the "Gluon" trademark. It is makes you feel better feel free to read "Enterprise" as the "Star Trek Starship USS Enterprise NCC-1701". Perhaps we should change the tagline. web2py: to boldly go where no web framework has been before". ;-) Thanks Armin, Massimo --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "web2py-users" group. To post to this group, send email to web2py@googlegroups.com To unsubscribe from this group, send email to web2py+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/web2py?hl=en -~----------~----~----~----~------~----~------~--~---