At the last Sage Days some people were working on making Sage start up 
faster. I think it would be nice to have a short piece in the documentation 
about "best practices for importing", so that all Sage developers have a 
reference point for dealing with these difficulties. 
Before I can write such a piece, I would like to hear the opinions of 
several Sage developers. In particular, it would be nice to hear from the 
people who thought very hard about these problems at the last Sage Days: 
what causes of slowdown did you find, and how did you work around them? The 
more general the case the better, of course. 
Also feel free to give your opinion if you weren't at the last Sage Days!

I will start by giving my opinion on one of the causes of slowdown. I 
heard from William that there is a huge amount of filesystem access during 
Sage startup, which makes startup very slow if you start Sage from a slow 
filesystem. He complained that there were a lot of useless checks for 
whether certain files existed before they were loaded (a lot of looking in 
different places and finding nothing). I think the reason for this is that 
we use way too many absolute imports. Importing from the top level, as in

    import sage.some_module.some_submodule

will search in a lot of different places (this is also what William 
observed, IIRC).

A quote from: 
http://docs.python.org/tutorial/modules.html#intra-package-references

The submodules often need to refer to each other. For example, the surround 
module might use the echo module. In fact, such references are so common 
that the import <http://docs.python.org/reference/simple_stmts.html#import> 
statement first looks in the containing package before looking in the 
standard module search path.

So it seems to me we should make more use of this, and also of the explicit 
relative import statements described there:

    from . import echo
    from .. import formats
    from ..filters import equalizer
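
To make the explicit relative form concrete, here is a minimal sketch that 
builds a throwaway package in a temporary directory and imports it. The 
package name demo_sound and its contents are made up for illustration 
(loosely following the tutorial's echo/surround example); nothing here comes 
from the Sage tree.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout, created on the fly:
#   demo_sound/__init__.py
#   demo_sound/effects/__init__.py
#   demo_sound/effects/echo.py
#   demo_sound/effects/surround.py   <- uses an explicit relative import
root = tempfile.mkdtemp()
effects_dir = os.path.join(root, "demo_sound", "effects")
os.makedirs(effects_dir)


def write(path, text):
    with open(path, "w") as fh:
        fh.write(text)


write(os.path.join(root, "demo_sound", "__init__.py"), "")
write(os.path.join(effects_dir, "__init__.py"), "")
write(os.path.join(effects_dir, "echo.py"),
      "def echo(x):\n    return [x, x]\n")
write(os.path.join(effects_dir, "surround.py"),
      "from . import echo  # sibling module, resolved inside the package\n"
      "\n"
      "def surround(x):\n"
      "    return echo.echo(x) + ['surround']\n")

# Make the temporary package importable and load the module that uses
# the relative import.
sys.path.insert(0, root)
importlib.invalidate_caches()
surround = importlib.import_module("demo_sound.effects.surround")
result = surround.surround("la")
```

The relative import names the sibling module directly, so surround.py does 
not need to spell out the full demo_sound.effects.echo path.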

I also have the feeling that part of the slowness comes from our custom 
all.py import construction, as opposed to using the `__all__` construction 
in __init__.py described at 

http://docs.python.org/tutorial/modules.html#importing-from-a-package

although I still have to test that explicitly to check whether it has 
anything to do with it.

To give you a feeling for how many non-relative imports occur at the top 
level of files, and to show that this is really an issue, see the output of 
the following commands in the $SAGE_ROOT/devel/sage directory:

    egrep -r "^from sage." ./ | nl

gives 5312 absolute sage imports!

and:

    egrep -r "^import sage." ./ | nl

another 893. Some of these are of course imported multiple times, so not all 
of them cause trouble but it at least shows that this is really an issue.
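
Since the raw egrep totals count repeated imports of the same module, a 
per-module tally may be more informative. The following is only a sketch: 
the function name count_absolute_imports, the regular expression and the 
choice of .py/.pyx files are my assumptions, and the small demo file is 
made up so the block can run without a Sage checkout.

```python
import os
import re
import tempfile
from collections import Counter

# Matches top-level "from sage.x.y import ..." and "import sage.x.y" lines
# and captures the dotted module name.
ABSOLUTE_IMPORT = re.compile(r"^(?:from|import)\s+(sage(?:\.\w+)*)\b")


def count_absolute_imports(root):
    """Return a Counter mapping each absolutely imported sage module to
    the number of import lines that mention it under `root`."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith((".py", ".pyx")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, errors="replace") as fh:
                for line in fh:
                    match = ABSOLUTE_IMPORT.match(line)
                    if match:
                        counts[match.group(1)] += 1
    return counts


# Tiny made-up input; the last line is a relative import and is ignored.
demo_root = tempfile.mkdtemp()
with open(os.path.join(demo_root, "example.py"), "w") as fh:
    fh.write("from sage.rings.all import ZZ\n"
             "import sage.misc.misc\n"
             "from sage.rings.all import QQ\n"
             "from . import local\n")
demo_counts = count_absolute_imports(demo_root)
```

Pointing count_absolute_imports at $SAGE_ROOT/devel/sage and looking at 
len(counts) versus sum(counts.values()) would show how many distinct 
modules sit behind the totals above.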

Two other things that I think should be avoided as much as possible are:
1. Code execution in modules. A module should mainly contain function and 
class declarations; executing code at import time makes startup longer. 
Explicit examples of where this happened, and how a workaround was found, 
are appreciated. Maybe we should make a lazy "decorator decorator" 
(decorating a function also causes code execution), so that decorators 
with an expensive initialization can be delayed (whether it's worth the 
effort depends on how much time is spent in decoration right now).

2. Using 

    from some_module import some_function, some_class

This should mainly be avoided because of problems with circular imports. 
When doing imports you cannot be sure whether some_module has been 
initialized yet, so this could cause problems. For the same reason the 
following should be avoided at the top level of a module:

    import some_module
    some_module.some_function()

since you cannot be sure whether some_module is initialized yet. Even if 
you are sure that some_module will be initialized, it makes the whole Sage 
library harder to maintain, since removing "unused" import statements from 
other files might change the order in which modules are initialized and 
give unexpected import errors. The following, however, is OK to do:

    from module import sub_module

This will never lead to circular import problems on its own (if sub_module 
was not initialized yet, the above statement will initialize it). However, 
if you use the same form to import an explicit class or function it might 
lead to problems; see for example 
http://effbot.org/zone/import-confusion.htm for a better explanation.
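
The difference between the two styles in point 2 can be reproduced with a 
small throwaway package. This is only a sketch: the package circ_demo and 
its modules are hypothetical, and the behaviour shown assumes a recent 
Python 3 (3.5 or later), which is more forgiving about circular 
"from package import submodule" statements than older interpreters.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "circ_demo")
os.makedirs(pkg_dir)

sources = {
    "__init__.py": "",
    # Pair 1: each module imports a *name* from the other at top level.
    "a_names.py": "from circ_demo.b_names import g\n\n"
                  "def f():\n    return 'f'\n",
    "b_names.py": "from circ_demo.a_names import f\n\n"
                  "def g():\n    return 'g'\n",
    # Pair 2: each module imports the other *module* and only looks up
    # its attributes later, inside a function body.
    "a_mods.py": "from circ_demo import b_mods\n\n"
                 "def f():\n    return 'f'\n",
    "b_mods.py": "from circ_demo import a_mods\n\n"
                 "def g():\n    return 'g calls ' + a_mods.f()\n",
}
for name, text in sources.items():
    with open(os.path.join(pkg_dir, name), "w") as fh:
        fh.write(text)

sys.path.insert(0, root)
importlib.invalidate_caches()

# Pair 1: b_names asks for f while a_names is only half initialized.
try:
    importlib.import_module("circ_demo.a_names")
    names_ok = True
except ImportError:
    names_ok = False

# Pair 2: the cycle is harmless because nothing is looked up at import time.
a_mods = importlib.import_module("circ_demo.a_mods")
mods_result = a_mods.b_mods.g()
```

With these assumptions the first pair fails with an ImportError, while the 
second pair imports cleanly and a_mods.b_mods.g() can call back into 
a_mods.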

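For point 1, a lazy "decorator decorator" could look roughly like the 
sketch below. The names lazy_decorator, expensive and square are made up 
for illustration, and this version is not thread-safe; it only shows the 
idea of postponing the decoration work until the first call.

```python
import functools


def lazy_decorator(decorator):
    """Turn `decorator` into one that is applied on first call of the
    decorated function instead of at import time."""
    def lazy(func):
        decorated = []  # holds the really decorated function once built

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not decorated:
                # First call: run the (possibly expensive) decorator now.
                decorated.append(decorator(func))
            return decorated[0](*args, **kwargs)

        return wrapper

    return lazy


calls = []  # records when the expensive decoration actually runs


@lazy_decorator
def expensive(func):
    calls.append("decorated " + func.__name__)  # stands in for costly set-up
    return func


@expensive
def square(x):
    return x * x


before = list(calls)  # decoration has not run yet
value = square(3)     # first call triggers the decoration
square(4)             # later calls reuse the decorated function
after = list(calls)
```

The price is one extra function call per invocation, so this only pays off 
for decorators whose set-up cost is noticeable at startup.
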
PS. Maybe we should also change our custom all.py initialization method and 
use the __init__.py method. This would make it possible in the future to 
import only certain submodules of Sage. For example, if I were bothered by 
the startup time of Sage and only wanted to work with rings, we could add 
an optional argument to sage such that

    sage -import rings

would only load the rings submodule and its dependencies.
Of course we could also do that right now with our all.py files, but it 
would be good if we followed the Python standards. (Does anybody know why we 
use all.py and not __init__.py for initialization?)

OK, this became a longer post than I anticipated, but I still hope for 
feedback from others :)

Thanks,
Maarten
