On Mon, Aug 29, 2011 at 4:21 PM, William Stein <wst...@gmail.com> wrote:
> On Mon, Aug 29, 2011 at 2:56 PM, Maarten Derickx
> <m.derickx.stud...@gmail.com> wrote:
>> At the last sage-days there were some people working on making sage start up
>> faster. I think it would be nice to have a short piece in the documentation
>> about "best practices for importing" so that all sage developers have a
>> reference point on how to deal with these difficulties.
>> Before I write such a piece I would like to hear the opinions
>> of several sage developers on this. In particular it would be nice to
>> hear from all the people who thought very hard about these problems
>> at the last sage days: what causes of slowdown they found and how to
>> work around them, the more general the case the better of course.
>> Also feel free to give your opinion if you weren't at the last sage days!
>> I will start by giving my opinion on one of the causes of slowdown. I
>> heard from William that during sage startup there is a huge amount of
>> filesystem access going on, which causes very bad startup times if
>> you start sage from a slow filesystem. I heard him complaining that there
>> were a lot of useless checks for whether certain files existed before they
>> were loaded (a lot of looking in different places and finding nothing). I think
>> the reason for this is that we use way too many absolute imports. Importing
>> stuff from the top level as:
>> import sage.some_module.some_submodule
>> will search for stuff in a lot of different places (this is also what
>> William observed IIRC).
>> A quote from:
>> http://docs.python.org/tutorial/modules.html#intra-package-references
>>
>> The submodules often need to refer to each other. For example,
>> the surround module might use the echo module. In fact, such references are
>> so common that the import statement first looks in the containing package
>> before looking in the standard module search path.
>>
>> So it seems to me we should make more use of this. And also more use of the
>> explicit relative import statements described there.
>>
>> from . import echo
>> from .. import formats
>> from ..filters import equalizer
>>
>> I also have the feeling that a part of the slowness comes from our custom
>> all.py import construction and not using the `__all__` construction in
>> __init__.py as described on
>> http://docs.python.org/tutorial/modules.html#importing-from-a-package
>>
>> although I have to explicitly test that to check whether this has something to
>> do with it.
>> To give you a feeling for how many non-relative imports occur at the top
>> level of files, and to show that this is really an issue, see the output of
>> the following commands in the $SAGE_ROOT/devel/sage directory
>>     egrep -r "^from sage." ./ | nl
>>
>> gives 5312 absolute sage imports!
>> and:
>>     egrep -r "^import sage." ./ | nl
>> another 893. Some of these are of course imported multiple times, so not all
>> of them cause trouble but it at least shows that this is really an issue.
>
> I don't think avoiding absolute imports is the right solution to this
> problem at all.   Much better is the code that Volker Braun wrote that
> caches import locations between Sage runs.  Basically, his code
> uniformly solves the above problem once and for all without requiring
> any unnatural changes at all to the Sage library.

+1. This might help some, but is a lot more work for a lot less
benefit than the "cached import locations" hack. Also I find absolute
imports easier to read in many cases.
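
To illustrate the general idea only (this is not Volker's code, and the
details of his hack are not shown in this thread): a minimal sketch that
resolves each module name to its file location once and serializes the
mapping so that a later run could reuse it. The helper name
build_location_cache and the example module names are hypothetical, and
importlib.util.find_spec is from modern Python, which postdates this thread.

```python
import importlib.util
import json


def build_location_cache(module_names):
    """Map each importable module name to the location the import system found."""
    cache = {}
    for name in module_names:
        # Note: for a dotted name, find_spec imports the parent package first.
        spec = importlib.util.find_spec(name)
        if spec is not None and spec.origin is not None:
            cache[name] = spec.origin
    return cache


# The last name is deliberately bogus and is simply left out of the cache.
cache = build_location_cache(["json", "email.mime.text", "no_such_module_xyz"])

# A real implementation would write this to disk and consult it on the next run.
serialized = json.dumps(cache)
```

How such a cache would be hooked back into the import machinery is left
out here, since that is exactly the part that depends on the actual hack.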

>> Another two things I think that should be avoided as much as possible are:
>> 1. Code execution in modules: a module should contain mainly function and
>> class declarations; executing code makes the startup time longer. Explicit
>> examples of where this happened, and how a workaround was found, are
>> appreciated.

+1

>> Maybe we should make a lazy decorator decorator (decorating a
>> function also causes code execution) so that the execution of decorators
>> which have an expensive initialization can be delayed (it depends on how
>> much time is spent in decoration right now if it's worth the effort).

Decorators are usually quite cheap, so I don't think this is a concern
here. On that note, however, the cached_method/function decorator
should be pointed out as an alternative to pre-computing things in the
module global space.
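
A minimal sketch of that pattern, using functools.lru_cache from the
standard library as a stand-in for Sage's cached_function so that it runs
without Sage. The function small_primes is a hypothetical example, not
something from the Sage library.

```python
from functools import lru_cache

# Instead of a module-level  SMALL_PRIMES = <expensive computation>  that runs
# at import time, wrap the computation in a cached function.


@lru_cache(maxsize=None)
def small_primes(bound=1000):
    """Return all primes <= bound; the sieve runs on the first call only."""
    sieve = [True] * (bound + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(bound ** 0.5) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting at i*i.
            sieve[i * i :: i] = [False] * len(range(i * i, bound + 1, i))
    return tuple(n for n, is_prime in enumerate(sieve) if is_prime)


primes = small_primes(30)        # computed here, on first use
primes_again = small_primes(30)  # served from the cache
```

Importing the module now costs only the function definition; the work is
paid for by the first caller that actually needs the result.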

> I remember trying to referee one purported example of this, which
> involved elliptic curve isogenies and big precomputed polynomials...
>
>> 2. using
>>     from some_module import some_function, some_class
>> this should mainly be avoided because of problems with circular imports.
>> When doing imports you are not sure whether some_module has been initialized
>> yet, so this could give problems. For the same reason the following should be
>> avoided:
>>     import some_module
>>     some_module.some_function()
>> since you are not sure whether some_module is initialized yet.  Even if you
>> are sure that some_module will be initialized, it will make the whole sage
>> library harder to maintain, since removing "unused" import statements from
>> other files might change the way modules are initialized and give unexpected
>> import errors. The following however is ok to do:
>>     from module import sub_module
>> since this will never lead to circular import problems on its own (if
>> sub_module was not initialized yet the above statement will initialize
>> sub_module; however, if you use it to import an explicit class or function it
>> might lead to problems; read for example
>> http://effbot.org/zone/import-confusion.htm for a better explanation).

There are pros and cons to this. In particular, if you have "import
sage.rings.all" and then in your code you use "sage.rings.all.ZZ" you
are forcing 4 lookups per access. It also makes it much harder to track
down when the module in question is actually imported. Of course
there are valid use cases (e.g. when circular imports are necessary).
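
The lookup count can be made visible with the dis module. A rough sketch,
assuming CPython; the nested SimpleNamespace objects are a hypothetical
stand-in for the real sage.rings.all module chain, so this runs without
Sage installed.

```python
import dis
from types import SimpleNamespace

# Hypothetical stand-in for the package hierarchy, not the real Sage modules.
sage = SimpleNamespace(rings=SimpleNamespace(all=SimpleNamespace(ZZ=int)))

# Bind the name once, as "from sage.rings.all import ZZ" would.
ZZ = sage.rings.all.ZZ


def via_full_path():
    return sage.rings.all.ZZ  # one global lookup plus three attribute lookups


def via_bound_name():
    return ZZ  # a single global lookup


def count_lookups(func):
    """Count name and attribute lookup instructions in func's bytecode."""
    return sum(
        1
        for ins in dis.get_instructions(func)
        if ins.opname in ("LOAD_GLOBAL", "LOAD_ATTR", "LOAD_NAME")
    )


print(count_lookups(via_full_path), count_lookups(via_bound_name))
```

Both functions return the same object; the difference is only in how many
dictionary lookups each access costs.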

>> Ps. maybe we should also change our custom all.py initialization method and
>> use the __init__.py method. This will make it possible in the future to only
>> import certain sub modules of sage.
>
> It is unfortunately naive to think that replacing all.py's by
> __init__.py will magically "make it possible to only import certain
> sub modules of sage".

I've been looking into using __init__.py files to clean things up, but
not necessarily (at least not at this point) getting rid of the all.py
files. In particular, having __init__.py import * from all.py prevents
one from cherry-picking submodules without proper initialization of
their context, which is one of the reasons sage startup and changing
imports is so fragile. Though it wouldn't help with startup time per
se, it would I think make things much easier to follow and debug (e.g.
as part of a greater strategy to both simplify and streamline things).

>>  For example if I was bothered by the
>> startup time of sage, and I only wanted to do stuff with rings, this would
>> make it possible to add an optional argument to sage such that
>>     sage -import rings
>> would only load the rings submodule and its dependencies.
>> Of course we could also do that right now with our all.py files, but it would
>> be good if we obeyed the python standards. (Does anybody know why we use
>> all.py and not __init__.py for initialization?)
>> Ok. this became a longer post than I anticipated, but I still hope for
>> feedback from others :)
>
> It's great that you've raised this issue for discussion, and I hope
> this thread turns out to be long.
>
> Even after banging on startup time, etc., a lot last week, I
> personally do not know what the "best practices" should be for imports
> in Sage.  There are pros and cons with several things you suggest
> above.
>
>  -- William
>
> --
> To post to this group, send an email to sage-devel@googlegroups.com
> To unsubscribe from this group, send an email to 
> sage-devel+unsubscr...@googlegroups.com
> For more options, visit this group at 
> http://groups.google.com/group/sage-devel
> URL: http://www.sagemath.org
>
