On Fri, Aug 12, 2011 at 10:05 AM, Julian Rüth <julian.ru...@gmail.com> wrote:
> I'm not sure if this discussion has been picked up in a different thread
> since February but the problem with zipimport seems to be that it can't load
> .so files. Since we have plenty of them, just loading the .py files from a
> zip file and the .so files from the file system might not be such a big
> improvement anymore.
> To address the problem with high latency filesystems why don't we unzip a
> file containing all of site-packages/sage to some directory in /tmp and
> import everything from there?

That might be interesting to try. Can you give it a shot and report back?

 -- William

>
> julian
>
> On Wednesday, February 23, 2011 10:47:14 PM UTC+1, jason wrote:
> On 2/23/11 3:06 PM, Robert Bradshaw wrote:
>> On Wed, Feb 23, 2011 at 11:34 AM, William Stein<wst...@gmail.com> wrote:
>>> On Wed, Feb 23, 2011 at 10:57 AM, Jason Grout
>>> <jason...@creativetrax.com> wrote:
>>>> On 2/23/11 12:28 PM, William Stein wrote:
>>>>>
>>>>> At lunch yesterday Robert Bradshaw made the interesting suggestion to
>>>>> read the docs for importlib
>>>>> (http://docs.python.org/dev/library/importlib.html) and write a
>>>>> customized import hook, so that every time during Sage startup that a
>>>>> module is imported, the import is done from a single big in-memory zip
>>>>> file instead of done using the filesystem. If this can be made to
>>>>> work, it would be a huge win for slow filesystems. The basic problem
>>>>> is that some filesystems are fast but have huge *latency*.
>>>>
>>>> Is it a big win primarily because the zip file contents can be read in and
>>>> cached by us? I'm just trying to understand it better.
>>>
>>> Which would you rather do on a high latency filesystem:
>>>
>>> (1) Read/stat 20,000 little files, or
>>> (2) Read exactly one 40MB file.
>>>
>>>> Is this the same idea as Jar files in java?
>>>
>>> I don't know.
>>
>> Yep. In that case the "high latency file system" was a webserver.
>>
>>>> You mean like http://docs.python.org/library/zipimport.html ?
>>>
>>> Cool.
>>
>> Note that this should just involve putting the zip file first in the
>> python path.
>>
>>> I don't know for a fact that Robert Bradshaw's suggestion will be a
>>> big win, since nobody has tried this yet. But I'm optimistic. The
>>> idea would be to make a zip archive of
>>> $SAGE_ROOT/local/lib/python/site-packages (say), and do *all* imports
>>> using that massive zip archive.
>>
>> I'm optimistic too.
>> This would, of course, make more sense for
>> system-wide installs than development versions, but the former are
>> more likely to be on a non-local filesystem anyways.
>
>
> Sounds like it is time for a trial!
>
> I created a directory of 2000 .py files and an __init__.py file to make
> it a module
>
> for i in range(2000):
>     with open('importtest/test_%s.py'%i,'w') as f:
>         f.write("VALUE=%s\n"%i)
> with open('importtest/__init__.py','w') as f:
>     f.write(' ')
>
> Then I imported each of these so that .pyc files were created.
>
> for i in range(2000):
>     exec 'import importtest.test_%s'%i
>
>
> Okay, then I copied the directory and zipped it up (in the shell now):
>
> $ cp -r importtest zipimporttest
> $ zip -r tmp.zip zipimporttest
> $ rm -rf zipimporttest
>
> One nice side effect is that the zip file is less than one MB, while the
> directory of python files is around 16M.
>
> Now for the test. Here are my two scripts. One imports each module in
> the directory and adds up the VALUE in each module:
>
> % cat mytest.py
> s=0
> for i in range(2000):
>     exec 'import importtest.test_%s as tt'%i
>     s+=tt.VALUE
> print s
>
>
> The other first adds the zip to the front of sys.path and then does the
> same imports and summing, but using the zipped module:
>
> % cat mytestzip.py
> import sys
> sys.path.insert(0,'./tmp.zip')
> s=0
> for i in range(2000):
>     exec 'import zipimporttest.test_%s as tt'%i
>     s+=tt.VALUE
> print s
>
>
> And now for the timings:
>
> % time sage -python mytest.py
> Detected SAGE64 flag
> Building Sage on OS X in 64-bit mode
> 1999000
> sage -python mytest.py  0.26s user 1.47s system 75% cpu 2.282 total
>
>
> % time sage -python mytestzip.py
> Detected SAGE64 flag
> Building Sage on OS X in 64-bit mode
> 1999000
> sage -python mytestzip.py  0.21s user 0.11s system 99% cpu 0.327 total
>
>
> It looks like the zip is a clear winner in this case. And this is with
> the directory presumably in the FS cache.
>
> Thanks,
>
> Jason
>
> --
> To post to this group, send an email to sage-devel@googlegroups.com
> To unsubscribe from this group, send an email to
> sage-devel+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/sage-devel
> URL: http://www.sagemath.org
>

--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org
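
For anyone who wants to try the idea from the top of this message (unpack one archive into a local temporary directory and import from there, rather than importing straight from the zip), here is a rough sketch. It is not something anyone in the thread has run against Sage. The names `build_demo_archive`, `import_from_extracted` and `demo_pkg` are made up for illustration, the tiny generated package only stands in for site-packages/sage, and the code is written for Python 3 rather than the Python 2 used in the scripts quoted above.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_demo_archive(archive_path, package="demo_pkg", count=5):
    """Create a small zip holding a package of trivial modules.

    Hypothetical stand-in for an archive of site-packages/sage.
    """
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(package + "/__init__.py", "")
        for i in range(count):
            zf.writestr("%s/test_%d.py" % (package, i), "VALUE = %d\n" % i)


def import_from_extracted(archive_path, package="demo_pkg", count=5):
    """Unpack the archive to a local temp dir and import the modules from there."""
    # The archive is read once from the (possibly slow) filesystem;
    # every later stat/open hits the local temporary directory instead.
    target = tempfile.mkdtemp(prefix="unzipped_imports_")
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(target)

    # Put the extracted tree first on the path, as with the zip in the thread.
    sys.path.insert(0, target)
    importlib.invalidate_caches()

    total = 0
    for i in range(count):
        mod = importlib.import_module("%s.test_%d" % (package, i))
        total += mod.VALUE
    return target, total


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    archive = os.path.join(workdir, "tmp.zip")
    build_demo_archive(archive)
    target, total = import_from_extracted(archive)
    print(target, total)
```

Because the files end up as ordinary files on disk, extension modules (.so) in the archive would be importable the usual way, which is the limitation of zipimport that motivated the suggestion. Whether the extraction cost is repaid on a real Sage install is exactly what would need to be timed; the sketch also leaves the temporary directory in place, so a real version would need some cleanup or reuse policy.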