* William Stein <wst...@gmail.com> [2011-08-12 11:30:21 -0700]:
> On Fri, Aug 12, 2011 at 10:05 AM, Julian Rüth <julian.ru...@gmail.com> wrote:
> > I'm not sure if this discussion has been picked up in a different thread
> > since February but the problem with zipimport seems to be that it can't load
> > .so files. Since we have plenty of them, just loading the .py files from a
> > zip file and the .so files from the file system might not be such a big
> > improvement anymore.
> > To address the problem with high latency filesystems why don't we unzip a
> > file containing all of site-packages/sage to some directory in /tmp and
> > import everything from there?
>
> That might be interesting to try. Can you give it a shot and report back?

Starting sage from a (relatively responsive) nfs filesystem here takes about
2 seconds:

$ time ./sage --startuptime > /dev/null
real 0m2.012s
user 0m0.741s
sys 0m0.283s

Unzipping a zip-file (200M) containing site-packages/sage to /tmp/ already
takes a while:

$ time unzip sage.zip -d /tmp/sage_zip > /dev/null
real 0m1.236s
user 0m0.881s
sys 0m0.316s

Starting sage using the site-packages directory in /tmp/ is then somewhat
faster than before:

$ time ./sage --startuptime > /dev/null
real 0m1.223s
user 0m0.759s
sys 0m0.178s

In total it is no improvement if I do the unzip on every start of sage. I'm
not sure how much latency some people are experiencing - the nfs I have here
seems to be quite responsive anyway. Maybe someone with an nfs that is
causing more trouble could give this a try?

Btw moving all sage to /tmp/ doesn't make that much of a difference:

$ cd /tmp/sage
$ time ./sage --startuptime > /dev/null
real 0m1.028s
user 0m0.829s
sys 0m0.151s

cheers,
julian

>
> -- William
>
> >
> > julian
> >
> > On Wednesday, February 23, 2011 10:47:14 PM UTC+1, jason wrote:
> > On 2/23/11 3:06 PM, Robert Bradshaw wrote:
> >> On Wed, Feb 23, 2011 at 11:34 AM, William Stein<wst...@gmail.com> wrote:
> >>> On Wed, Feb 23, 2011 at 10:57 AM, Jason Grout
> >>> <jason...@creativetrax.com> wrote:
> >>>> On 2/23/11 12:28 PM, William Stein wrote:
> >>>>>
> >>>>> At lunch yesterday Robert Bradshaw made the interesting suggestion to
> >>>>> read the docs for importlib
> >>>>> (http://docs.python.org/dev/library/importlib.html) and write a
> >>>>> customized import hook, so that every time during Sage startup that a
> >>>>> module is imported, the import is done from a single big in-memory zip
> >>>>> file instead of done using the filesystem. If this can be made to
> >>>>> work, it would be a huge win for slow filesystems. The basic problem
> >>>>> is that some filesystems are fast but have huge *latency*.
> >>>>
> >>>> Is it a big win primarily because the zip file contents can be read in
> >>>> and cached by us? I'm just trying to understand it better.
> >>>
> >>> Which would you rather do on a high latency filesystem:
> >>>
> >>> (1) Read/stat 20,000 little files, or
> >>> (2) Read exactly one 40MB file.
> >>>
> >>>> Is this the same idea as Jar files in java?
> >>>
> >>> I don't know.
> >>
> >> Yep. In that case the "high latency file system" was a webserver.
> >>
> >>>> You mean like http://docs.python.org/library/zipimport.html ?
> >>>
> >>> Cool.
> >>
> >> Note that this should just involve putting the zip file first in the
> >> python path.
> >>
> >>> I don't know for a fact that Robert Bradshaw's suggestion will be a
> >>> big win, since nobody has tried this yet. But I'm optimistic. The
> >>> idea would be to make a zip archive of
> >>> $SAGE_ROOT/local/lib/python/site-packages (say), and do *all* imports
> >>> using that massive zip archive.
> >>
> >> I'm optimistic too. This would, of course, make more sense for
> >> system-wide installs than development versions, but the former are
> >> more likely to be on a non-local filesystem anyways.
> >
> >
> > Sounds like it is time for a trial!
> >
> > I created a directory of 2000 .py files and an __init__.py file to make
> > it a module
> >
> > for i in range(2000):
> >     with open('importtest/test_%s.py'%i,'w') as f:
> >         f.write("VALUE=%s\n"%i)
> > with open('importtest/__init__.py','w') as f:
> >     f.write(' ')
> >
> > Then I imported each of these so that .pyc files were created.
> >
> > for i in range(2000):
> >     exec 'import importtest.test_%s'%i
> >
> >
> > Okay, then I copied the directory and zipped it up (in the shell now):
> >
> > $ cp -r importtest zipimporttest
> > $ zip -r tmp.zip zipimporttest
> > $ rm -rf zipimporttest
> >
> > One nice side effect is that the zip file is less than one MB, while the
> > directory of python files is around 16M.
> >
> > Now for the test. Here are my two scripts.
> > One imports each module in the directory and adds up the VALUE in each
> > module:
> >
> > % cat mytest.py
> > s=0
> > for i in range(2000):
> >     exec 'import importtest.test_%s as tt'%i
> >     s+=tt.VALUE
> > print s
> >
> >
> > The other first adds the zip to the front of sys.path and then does the
> > same imports and summing, but using the zipped module:
> >
> > % cat mytestzip.py
> > import sys
> > sys.path.insert(0,'./tmp.zip')
> > s=0
> > for i in range(2000):
> >     exec 'import zipimporttest.test_%s as tt'%i
> >     s+=tt.VALUE
> > print s
> >
> >
> > And now for the timings:
> >
> > % time sage -python mytest.py
> > Detected SAGE64 flag
> > Building Sage on OS X in 64-bit mode
> > 1999000
> > sage -python mytest.py 0.26s user 1.47s system 75% cpu 2.282 total
> >
> >
> > % time sage -python mytestzip.py
> > Detected SAGE64 flag
> > Building Sage on OS X in 64-bit mode
> > 1999000
> > sage -python mytestzip.py 0.21s user 0.11s system 99% cpu 0.327 total
> >
> >
> > It looks like the zip is a clear winner in this case. And this is with
> > the directory presumably in the FS cache.
> >
> > Thanks,
> >
> > Jason
> >
> > --
> > To post to this group, send an email to sage-devel@googlegroups.com
> > To unsubscribe from this group, send an email to
> > sage-devel+unsubscr...@googlegroups.com
> > For more options, visit this group at
> > http://groups.google.com/group/sage-devel
> > URL: http://www.sagemath.org
>
> --
> William Stein
> Professor of Mathematics
> University of Washington
> http://wstein.org
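
For reference, the "unzip to a local directory and import from there" idea discussed in this thread could be sketched roughly as below. This is only an illustration under assumptions, not code from Sage: the helper name import_from_extracted_zip, the sage_zip_ directory prefix, and the tiny demo_pkg archive are all hypothetical, and the sketch is written for Python 3 (unlike the Python 2 snippets quoted above).

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_extracted_zip(zip_path, module_name, target_dir=None):
    """Extract zip_path into a local directory and import module_name from it.

    Hypothetical helper, not part of Sage. Extracting first, instead of
    putting the zip itself on sys.path, means extension modules (.so) can
    be loaded as well, which zipimport does not support.
    """
    if target_dir is None:
        # Assumed location: a fresh directory under the system temp dir.
        target_dir = tempfile.mkdtemp(prefix="sage_zip_")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
    # Search the extracted copy before any other (possibly slow) location.
    sys.path.insert(0, target_dir)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


if __name__ == "__main__":
    # Build a tiny stand-in archive; a real trial would use the zipped
    # site-packages/sage directory instead.
    work = tempfile.mkdtemp()
    zip_path = os.path.join(work, "demo.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("demo_pkg/__init__.py", "")
        archive.writestr("demo_pkg/test_0.py", "VALUE = 0\n")
    mod = import_from_extracted_zip(zip_path, "demo_pkg.test_0")
    print(mod.VALUE, mod.__file__)
```

Note that the extraction cost is paid on every call unless the target directory is reused between runs, which is consistent with the timings reported above: the unzip step alone took about as long as the startup time it saved.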