* William Stein <wst...@gmail.com> [2011-08-12 11:30:21 -0700]:
> On Fri, Aug 12, 2011 at 10:05 AM, Julian Rüth <julian.ru...@gmail.com> wrote:
> > I'm not sure if this discussion has been picked up in a different thread
> > since February but the problem with zipimport seems to be that it can't load
> > .so files. Since we have plenty of them, just loading the .py files from a
> > zip file and the .so files from the file system might not be such a big
> > improvement anymore.
> > To address the problem with high latency filesystems why don't we unzip a
> > file containing all of site-packages/sage to some directory in /tmp and
> > import everything from there?
> 
> That might be interesting to try.  Can you give it a shot and report back?

Starting sage from a (relatively responsive) nfs filesystem here takes about
2 seconds:

$ time ./sage --startuptime > /dev/null
real    0m2.012s
user    0m0.741s
sys     0m0.283s

Unzipping a zip-file (200M) containing site-packages/sage to /tmp/ already
takes a while:

$ time unzip sage.zip -d /tmp/sage_zip > /dev/null
real    0m1.236s
user    0m0.881s
sys     0m0.316s

Starting sage using the site-packages directory in /tmp/ is then
somewhat faster than before:

$ time ./sage --startuptime > /dev/null
real    0m1.223s
user    0m0.759s
sys     0m0.178s

In total this is no improvement if I unzip on every start of sage:
unzipping (1.24s) plus starting from /tmp/ (1.22s) comes to about 2.46s,
compared with 2.01s when starting from nfs directly.
I'm not sure how much latency some people are experiencing; the nfs I
have here seems to be quite responsive anyway.
Maybe someone with an nfs that is causing more trouble could give this a
try?
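
The unzip cost would only have to be paid once if the extracted copy were
reused on later starts. A minimal sketch of that idea, which I have not
tried with sage itself: the function name ensure_extracted, the
"sage_zip_" directory prefix and the ".complete" marker file are all made
up for illustration.

```python
import os
import tempfile
import zipfile


def ensure_extracted(zip_path, cache_root=None):
    """Extract zip_path into a cache directory once and return that directory."""
    if cache_root is None:
        cache_root = tempfile.gettempdir()
    st = os.stat(zip_path)
    # Key the directory on size and mtime so a rebuilt zip gets a fresh copy.
    key = "sage_zip_%d_%d" % (st.st_size, int(st.st_mtime))
    target = os.path.join(cache_root, key)
    marker = os.path.join(target, ".complete")
    if not os.path.exists(marker):
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(target)
        # Written last, so an interrupted extraction is not taken for a full one.
        with open(marker, "w") as f:
            f.write("ok\n")
    return target
```

The returned directory would then go to the front of sys.path, e.g.
sys.path.insert(0, ensure_extracted('sage.zip')). On a machine with
several users the cache directory should probably be per-user rather than
a shared path in /tmp/.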


Btw, moving the whole sage installation to /tmp/ doesn't make that much of
a difference compared to moving just site-packages/sage:

$ cd /tmp/sage
$ time ./sage --startuptime > /dev/null
real    0m1.028s
user    0m0.829s
sys     0m0.151s
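
To compare filesystems independently of sage, something like the following
could give a rough number for how expensive it is to touch many small
files. It is only a sketch: time_stats is a made-up helper, and the
directory walk itself may already warm the attribute cache, so the first
run on a cold cache is the interesting one.

```python
import os
import time


def time_stats(root):
    """Return (number of files, seconds spent) for one lstat per file under root."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    start = time.time()
    for p in paths:
        # lstat does not follow symlinks, so broken links do not raise.
        os.lstat(p)
    return len(paths), time.time() - start
```

Running it once on the nfs copy of site-packages/sage and once on the copy
in /tmp/ should show whether latency per file is the bottleneck.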

cheers,
julian

> 
>  -- William
> 
> >
> > julian
> >
> > On Wednesday, February 23, 2011 10:47:14 PM UTC+1, jason wrote:
> > On 2/23/11 3:06 PM, Robert Bradshaw wrote:
> >> On Wed, Feb 23, 2011 at 11:34 AM, William Stein<wst...@gmail.com> wrote:
> >>> On Wed, Feb 23, 2011 at 10:57 AM, Jason Grout
> >>> <jason...@creativetrax.com> wrote:
> >>>> On 2/23/11 12:28 PM, William Stein wrote:
> >>>>>
> >>>>> At lunch yesterday Robert Bradshaw made the interesting suggestion to
> >>>>> read the docs for importlib
> >>>>> (http://docs.python.org/dev/library/importlib.html) and write a
> >>>>> customized import hook, so that every time during Sage startup that a
> >>>>> module is imported, the import is done from a single big in-memory zip
> >>>>> file instead of done using the filesystem. If this can be made to
> >>>>> work, it would be a huge win for slow filesystems. The basic problem
> >>>>> is that some filesystems are fast but have huge *latency*.
> >>>>
> >>>> Is it a big win primarily because the zip file contents can be read in
> >>>> and
> >>>> cached by us? I'm just trying to understand it better.
> >>>
> >>> Which would you rather do on a high latency filesystem:
> >>>
> >>> (1) Read/stat 20,000 little files, or
> >>> (2) Read exactly one 40MB file.
> >>>
> >>>> Is this the same idea as Jar files in java?
> >>>
> >>> I don't know.
> >>
> >> Yep. In that case the "high latency file system" was a webserver.
> >>
> >>>> You mean like http://docs.python.org/library/zipimport.html ?
> >>>
> >>> Cool.
> >>
> >> Note that this should just involve putting the zip file first in the
> >> python path.
> >>
> >>> I don't know for a fact that Robert Bradshaw's suggestion will be a
> >>> big win, since nobody has tried this yet. But I'm optimistic. The
> >>> idea would be to make a zip archive of
> >>> $SAGE_ROOT/local/lib/python/site-packages (say), and do *all* imports
> >>> using that massive zip archive.
> >>
> >> I'm optimistic too. This would, of course, make more sense for
> >> system-wide installs than development versions, but the former are
> >> more likely to be on a non-local filesystem anyways.
> >
> >
> > Sounds like it is time for a trial!
> >
> > I created a directory of 2000 .py files and an __init__.py file to make
> > it a module
> >
> > for i in range(2000):
> >     with open('importtest/test_%s.py'%i,'w') as f:
> >         f.write("VALUE=%s\n"%i)
> > with open('importtest/__init__.py','w') as f:
> >     f.write(' ')
> >
> > Then I imported each of these so that .pyc files were created.
> >
> > for i in range(2000):
> >     exec 'import importtest.test_%s'%i
> >
> >
> > Okay, then I copied the directory and zipped it up (in the shell now):
> >
> > $ cp -r importtest zipimporttest
> > $ zip -r tmp.zip zipimporttest
> > $ rm -rf zipimporttest
> >
> > One nice side effect is that the zip file is less than one MB, while the
> > directory of python files is around 16M.
> >
> > Now for the test. Here are my two scripts. One imports each module in
> > the directory and adds up the VALUE in each module:
> >
> > % cat mytest.py
> > s=0
> > for i in range(2000):
> >     exec 'import importtest.test_%s as tt'%i
> >     s+=tt.VALUE
> > print s
> >
> >
> > The other first adds the zip to the front of sys.path and then does the
> > same imports and summing, but using the zipped module:
> >
> > % cat mytestzip.py
> > import sys
> > sys.path.insert(0,'./tmp.zip')
> > s=0
> > for i in range(2000):
> >     exec 'import zipimporttest.test_%s as tt'%i
> >     s+=tt.VALUE
> > print s
> >
> >
> > And now for the timings:
> >
> > % time sage -python mytest.py
> > Detected SAGE64 flag
> > Building Sage on OS X in 64-bit mode
> > 1999000
> > sage -python mytest.py 0.26s user 1.47s system 75% cpu 2.282 total
> >
> >
> > % time sage -python mytestzip.py
> > Detected SAGE64 flag
> > Building Sage on OS X in 64-bit mode
> > 1999000
> > sage -python mytestzip.py 0.21s user 0.11s system 99% cpu 0.327 total
> >
> >
> > It looks like the zip is a clear winner in this case. And this is with
> > the directory presumably in the FS cache.
> >
> > Thanks,
> >
> > Jason
> >
> > --
> > To post to this group, send an email to sage-devel@googlegroups.com
> > To unsubscribe from this group, send an email to
> > sage-devel+unsubscr...@googlegroups.com
> > For more options, visit this group at
> > http://groups.google.com/group/sage-devel
> > URL: http://www.sagemath.org
> >
> 
> 
> 
> -- 
> William Stein
> Professor of Mathematics
> University of Washington
> http://wstein.org
> 
