On 2/23/11 3:56 PM, Robert Bradshaw wrote:
On Wed, Feb 23, 2011 at 1:47 PM, Jason Grout
<jason-s...@creativetrax.com> wrote:
On 2/23/11 3:06 PM, Robert Bradshaw wrote:
On Wed, Feb 23, 2011 at 11:34 AM, William Stein<wst...@gmail.com> wrote:
On Wed, Feb 23, 2011 at 10:57 AM, Jason Grout
<jason-s...@creativetrax.com> wrote:
On 2/23/11 12:28 PM, William Stein wrote:
At lunch yesterday Robert Bradshaw made the interesting suggestion to
read the docs for importlib
(http://docs.python.org/dev/library/importlib.html) and write a
customized import hook, so that every time during Sage startup that a
module is imported, the import is done from a single big in-memory zip
file instead of done using the filesystem. If this can be made to
work, it would be a huge win for slow filesystems. The basic problem
is that some filesystems are fast but have huge *latency*.
Is it a big win primarily because the zip file contents can be read in and
cached by us? I'm just trying to understand it better.
Which would you rather do on a high latency filesystem:
(1) Read/stat 20,000 little files, or
(2) Read exactly one 40MB file.
Is this the same idea as Jar files in java?
I don't know.
Yep. In that case the "high latency file system" was a webserver.
You mean like http://docs.python.org/library/zipimport.html ?
Cool.
Note that this should just involve putting the zip file first in the
python path.
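A minimal sketch of that idea, assuming nothing beyond the standard library: it builds a one-module archive and puts it first on sys.path. The names bundle.zip and demo_mod are made up for the illustration and have nothing to do with Sage.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a tiny archive containing one hypothetical module, demo_mod.py.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_mod.py", "VALUE = 42\n")

# With the archive first on sys.path, the built-in zipimport machinery
# looks inside it before any directory on the path is searched.
sys.path.insert(0, archive)
importlib.invalidate_caches()

import demo_mod

print(demo_mod.VALUE)
print(demo_mod.__file__)  # a path that points inside bundle.zip
```

No custom import hook is needed for this simple case; the interpreter treats a zip file on sys.path much like a directory.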
I don't know for a fact that Robert Bradshaw's suggestion will be a
big win, since nobody has tried this yet. But I'm optimistic. The
idea would be to make a zip archive of
$SAGE_ROOT/local/lib/python/site-packages (say), and do *all* imports
using that massive zip archive.
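One way such an archive might be produced is sketched below. zip_tree is a hypothetical helper, not anything that exists in Sage, and a temporary directory stands in for site-packages so the example runs anywhere.

```python
import os
import tempfile
import zipfile


def zip_tree(src_dir, archive_path, compression=zipfile.ZIP_DEFLATED):
    """Hypothetical helper: archive every file under src_dir.

    Entry names are stored relative to src_dir, so packages directly
    inside src_dir become top-level packages inside the archive.
    Returns the number of files written.
    """
    count = 0
    with zipfile.ZipFile(archive_path, "w", compression) as zf:
        for dirpath, _dirnames, filenames in os.walk(src_dir):
            for name in filenames:
                full = os.path.join(dirpath, name)
                zf.write(full, os.path.relpath(full, src_dir))
                count += 1
    return count


# Stand-in for site-packages: a temp directory holding one small package.
site = tempfile.mkdtemp()
os.mkdir(os.path.join(site, "pkg"))
for fname, body in [("__init__.py", ""), ("a.py", "VALUE = 1\n")]:
    with open(os.path.join(site, "pkg", fname), "w") as f:
        f.write(body)

archive = os.path.join(tempfile.mkdtemp(), "site-packages.zip")
n_files = zip_tree(site, archive)
with zipfile.ZipFile(archive) as zf:
    names = sorted(zf.namelist())
print(n_files, names)
```

One caveat: zipimport only loads .py and .pyc files. Compiled extension modules (.so) cannot be imported from an archive, so those would presumably have to stay on the filesystem.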
I'm optimistic too. This would, of course, make more sense for
system-wide installs than development versions, but the former are
more likely to be on a non-local filesystem anyways.
Sounds like it is time for a trial!
I created a directory of 2000 .py files and an __init__.py file to make it a
package:
for i in range(2000):
    with open('importtest/test_%s.py'%i,'w') as f:
        f.write("VALUE=%s\n"%i)
with open('importtest/__init__.py','w') as f:
    f.write(' ')
Then I imported each of these so that .pyc files were created.
for i in range(2000):
    exec 'import importtest.test_%s'%i
Okay, then I copied the directory and zipped it up (in the shell now):
$ cp -r importtest zipimporttest
$ zip -r tmp.zip zipimporttest
$ rm -rf zipimporttest
One nice side effect is that the zip file is less than one MB, while the
directory of python files is around 16M.
Now for the test. Here are my two scripts. One imports each module in the
directory and adds up the VALUE in each module:
% cat mytest.py
s=0
for i in range(2000):
    exec 'import importtest.test_%s as tt'%i
    s+=tt.VALUE
print s
The other first adds the zip to the front of sys.path and then does the same
imports and summing, but using the zipped module:
% cat mytestzip.py
import sys
sys.path.insert(0,'./tmp.zip')
s=0
for i in range(2000):
    exec 'import zipimporttest.test_%s as tt'%i
    s+=tt.VALUE
print s
And now for the timings:
% time sage -python mytest.py
Detected SAGE64 flag
Building Sage on OS X in 64-bit mode
1999000
sage -python mytest.py 0.26s user 1.47s system 75% cpu 2.282 total
% time sage -python mytestzip.py
Detected SAGE64 flag
Building Sage on OS X in 64-bit mode
1999000
sage -python mytestzip.py 0.21s user 0.11s system 99% cpu 0.327 total
It looks like the zip is a clear winner in this case. And this is with the
directory presumably in the FS cache.
Cool. Given the CPU was pegged at 99%, have you tried using an
uncompressed zip file? It'd have more data to read, but less to do
with it once it's read.
In my case, using zip -0 (no compression) gives:
% time sage -python mytestzip.py
Detected SAGE64 flag
Building Sage on OS X in 64-bit mode
1999000
sage -python mytestzip.py 0.20s user 0.10s system 99% cpu 0.309 total
So just a slight savings.
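For anyone who wants to repeat the compressed vs. uncompressed comparison without the shell steps, here is a rough self-contained sketch. It is not the script used above: it is written for Python 3, uses only 200 padded modules, and the package names pkg_stored and pkg_deflated are invented for the example. The archives hold only .py sources, so the timings include compiling each module, and the numbers will vary by machine.

```python
import importlib
import os
import sys
import tempfile
import time
import zipfile

N = 200  # far fewer modules than the 2000 used in the thread
BODY = "# padding line\n" * 100 + "VALUE = %d\n"


def build(archive, package, compression):
    """Write N small modules under `package` into a new zip archive."""
    with zipfile.ZipFile(archive, "w", compression) as zf:
        zf.writestr(package + "/__init__.py", "")
        for i in range(N):
            zf.writestr("%s/test_%d.py" % (package, i), BODY % i)


def timed_sum(archive, package):
    """Import every module from the archive; return (sum of VALUE, seconds)."""
    sys.path.insert(0, archive)
    importlib.invalidate_caches()
    start = time.perf_counter()
    total = 0
    for i in range(N):
        mod = importlib.import_module("%s.test_%d" % (package, i))
        total += mod.VALUE
    elapsed = time.perf_counter() - start
    sys.path.remove(archive)
    return total, elapsed


tmp = tempfile.mkdtemp()
results = {}
for label, method in [("stored", zipfile.ZIP_STORED),
                      ("deflated", zipfile.ZIP_DEFLATED)]:
    path = os.path.join(tmp, label + ".zip")
    build(path, "pkg_" + label, method)
    total, secs = timed_sum(path, "pkg_" + label)
    results[label] = (os.path.getsize(path), total, secs)
    print("%-8s %8d bytes  sum=%d  %.3fs" % (label, *results[label]))
```

The stored archive is the equivalent of zip -0. It is larger on disk but skips decompression at import time, which is the trade-off Robert asked about.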
Jason