At the last Sage Days, some people were working on making Sage start up faster. I think it would be nice to have a short piece in the documentation about "best practices for importing", so that all Sage developers have a reference point for dealing with these difficulties. Before I write such a piece, I would like to hear the opinions of several Sage developers. In particular, it would be nice to hear from everyone who thought hard about these problems at the last Sage Days: what causes of slowdown did you find, and how did you work around them? The more general the case the better, of course. Also feel free to give your opinion if you weren't at the last Sage Days!

I will start by giving my opinion on one of the causes of slowdown. I heard from William that there is a huge amount of filesystem access during Sage startup, which makes startup very slow when Sage is started from a slow filesystem. He complained that there were a lot of useless checks for whether certain files existed before they were loaded (a lot of looking in different places and finding nothing). I think the reason for this is that we use way too many absolute imports. Importing from the top level, as in

    import sage.some_module.some_submodule

will search for the module in a lot of different places (this is also what William observed, IIRC).

A quote from http://docs.python.org/tutorial/modules.html#intra-package-references:

    The submodules often need to refer to each other. For example, the
    surround module might use the echo module. In fact, such references
    are so common that the import
    <http://docs.python.org/reference/simple_stmts.html#import> statement
    first looks in the containing package before looking in the standard
    module search path.

So it seems to me we should make more use of this, and also of the explicit relative import statements described there:

    from . import echo
    from .. import formats
    from ..filters import equalizer

I also have the feeling that part of the slowness comes from our custom all.py import construction, as opposed to using the `__all__` construction in __init__.py described at http://docs.python.org/tutorial/modules.html#importing-from-a-package, although I still have to test whether this really has something to do with it.

To give you a feeling for how many non-relative imports occur at the top level of files, and to show that this is really an issue, see the output of the following commands in the $SAGE_ROOT/devel/sage directory:

    egrep -r "^from sage." ./ | nl

gives 5312 absolute sage imports! And:

    egrep -r "^import sage." ./ | nl

gives another 893.
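
A rough Python equivalent of the two egrep counts could look like the sketch below. It is only an illustration: the function name count_absolute_imports and the sample text are made up here, and the regular expression is a little stricter than the egrep patterns (it requires "sage" to be a complete package name). Besides the total, it reports how many distinct modules are involved.

```python
import re
from collections import Counter

# Matches unindented (top-level) absolute imports of the sage package, e.g.
#   from sage.rings.integer import Integer
#   import sage.misc.misc
# Relative imports ("from . import x") and indented imports are ignored.
ABSOLUTE_SAGE_IMPORT = re.compile(
    r"^(?:from|import)\s+(sage(?:\.\w+)*)\b", re.MULTILINE
)


def count_absolute_imports(source_text):
    """Map each absolutely imported sage module to how often it is imported."""
    return Counter(ABSOLUTE_SAGE_IMPORT.findall(source_text))


# Made-up sample standing in for the contents of some source files.
sample = """\
from sage.rings.integer import Integer
import sage.misc.misc
from sage.rings.integer import Integer
from . import echo
    import sage.indented.not_top_level
"""

counts = count_absolute_imports(sample)
print(sum(counts.values()), "absolute imports,", len(counts), "distinct modules")
# prints: 3 absolute imports, 2 distinct modules
```

To scan a real tree, one would feed the text of every source file under the directory to count_absolute_imports and add up the resulting counters.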

Some of these are of course imported multiple times, so not all of them cause trouble, but it at least shows that this is really an issue.

Two other things that I think should be avoided as much as possible are:

1. Code execution in modules. A module should mainly contain function and class declarations; executing code at import time makes startup longer. Explicit examples of where this happened, and of how a workaround was found, are appreciated. Maybe we should make a lazy "decorator decorator" (decorating a function also causes code execution), so that the execution of decorators with an expensive initialization can be delayed. Whether this is worth the effort depends on how much time is spent in decoration right now.

2. Using

       from some_module import some_function, some_class

   This should mainly be avoided because of problems with circular imports: when doing imports you cannot be sure that some_module has been initialized yet, so this could give problems. For the same reason the following should be avoided:

       import some_module
       some_module.some_function()

   since you are not sure whether some_module is initialized yet. Even if you are sure that some_module will be initialized, it makes the whole Sage library harder to maintain, since removing "unused" import statements from other files might change the way modules are initialized and give unexpected import errors. The following, however, is OK:

       from module import sub_module

   since this will never lead to circular import problems on its own. (If sub_module was not initialized yet, the above statement will initialize it; however, if you use this form to import an explicit class or function, it might lead to problems. See for example http://effbot.org/zone/import-confusion.htm for a better explanation.)

PS. Maybe we should also change our custom all.py initialization method and use the __init__.py method. This would make it possible in the future to import only certain submodules of Sage.
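
As a small illustration of what "importing only certain submodules" means in plain Python, here is a sketch with a throwaway package. Everything in it is hypothetical: toy_pkg and its modules base, rings and graphs are invented for the example and have nothing to do with how Sage is actually laid out. The package has an empty __init__.py, and rings pulls in its one dependency with an explicit relative import.

```python
import importlib
import os
import sys
import tempfile


def build_toy_package(root):
    """Write the hypothetical package 'toy_pkg' into the directory root."""
    pkg_dir = os.path.join(root, "toy_pkg")
    os.mkdir(pkg_dir)
    files = {
        "__init__.py": "",  # deliberately imports nothing
        "base.py": "NAME = 'base'\n",
        "rings.py": "from . import base\nNAME = 'rings'\n",  # depends on base
        "graphs.py": "NAME = 'graphs'\n",  # unrelated to rings
    }
    for filename, content in files.items():
        with open(os.path.join(pkg_dir, filename), "w") as handle:
            handle.write(content)


def loaded_toy_modules():
    """Return the sorted names of all toy_pkg modules loaded so far."""
    return sorted(name for name in sys.modules if name.startswith("toy_pkg"))


# The temporary directory is not cleaned up in this sketch.
tmp_root = tempfile.mkdtemp()
build_toy_package(tmp_root)
sys.path.insert(0, tmp_root)
importlib.invalidate_caches()

# Import just one submodule: only it, its parent package, and its
# dependency get loaded; graphs is never touched.
importlib.import_module("toy_pkg.rings")
print(loaded_toy_modules())
# prints: ['toy_pkg', 'toy_pkg.base', 'toy_pkg.rings']
```

This only works because the package's __init__.py does not import everything itself; a package whose __init__.py eagerly imports all submodules would load graphs as well.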

For example, if I were bothered by Sage's startup time and only wanted to do stuff with rings, this would make it possible to add an optional argument to sage such that

    sage -import rings

would load only the rings submodule and its dependencies. Of course we could also do that right now with our all.py files, but it would be good if we obeyed the Python standards. (Does anybody know why we use all.py and not __init__.py for initialization?)

OK, this became a longer post than I anticipated, but I still hope for feedback from others :)

Thanks,
Maarten