On Fri, Aug 12, 2011 at 10:05 AM, Julian Rüth <julian.ru...@gmail.com> wrote:
> I'm not sure if this discussion has been picked up in a different thread
> since February, but the problem with zipimport seems to be that it can't
> load .so files. Since we have plenty of them, loading just the .py files
> from a zip file and the .so files from the file system might not be such a
> big improvement anymore.
> To address the problem with high-latency filesystems, why don't we unzip a
> file containing all of site-packages/sage to some directory in /tmp and
> import everything from there?

That might be interesting to try.  Can you give it a shot and report back?
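
In case it helps as a starting point, here is a minimal sketch of that idea. It assumes the whole tree has already been packed into a single archive. The function name `import_from_unpacked_zip`, the temp-directory prefix, and the choice of Python 3's standard `zipfile` and `tempfile` modules are assumptions for illustration, not anything Sage ships. Because the files end up on a real (local) filesystem, .so files go through the normal import machinery rather than zipimport.

```python
import importlib
import sys
import tempfile
import zipfile


def import_from_unpacked_zip(archive_path, target_dir=None):
    """Unpack archive_path into a local directory and put it first on sys.path.

    Hypothetical helper: returns the directory that was added so the caller
    can remove it again when the process is done.
    """
    if target_dir is None:
        # mkdtemp() creates a fresh directory under the system temp location.
        target_dir = tempfile.mkdtemp(prefix="sage-site-packages-")
    with zipfile.ZipFile(archive_path) as archive:
        # One sequential read of the archive instead of many small reads
        # against the slow filesystem.
        archive.extractall(target_dir)
    sys.path.insert(0, target_dir)
    # Make sure the import system notices the newly written files.
    importlib.invalidate_caches()
    return target_dir
```

Whether the one-time unpacking cost is smaller than the latency it saves is exactly what would need to be measured.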

 -- William

>
> julian
>
> On Wednesday, February 23, 2011 10:47:14 PM UTC+1, jason wrote:
> On 2/23/11 3:06 PM, Robert Bradshaw wrote:
>> On Wed, Feb 23, 2011 at 11:34 AM, William Stein<wst...@gmail.com> wrote:
>>> On Wed, Feb 23, 2011 at 10:57 AM, Jason Grout
>>> <jason...@creativetrax.com> wrote:
>>>> On 2/23/11 12:28 PM, William Stein wrote:
>>>>>
>>>>> At lunch yesterday Robert Bradshaw made the interesting suggestion to
>>>>> read the docs for importlib
>>>>> (http://docs.python.org/dev/library/importlib.html) and write a
>>>>> customized import hook, so that every time during Sage startup that a
>>>>> module is imported, the import is done from a single big in-memory zip
>>>>> file instead of done using the filesystem. If this can be made to
>>>>> work, it would be a huge win for slow filesystems. The basic problem
>>>>> is that some filesystems are fast but have huge *latency*.
>>>>
>>>> Is it a big win primarily because the zip file contents can be read in
>>>> and cached by us? I'm just trying to understand it better.
>>>
>>> Which would you rather do on a high latency filesystem:
>>>
>>> (1) Read/stat 20,000 little files, or
>>> (2) Read exactly one 40MB file.
>>>
>>>> Is this the same idea as Jar files in java?
>>>
>>> I don't know.
>>
>> Yep. In that case the "high latency file system" was a webserver.
>>
>>>> You mean like http://docs.python.org/library/zipimport.html ?
>>>
>>> Cool.
>>
>> Note that this should just involve putting the zip file first in the
>> python path.
>>
>>> I don't know for a fact that Robert Bradshaw's suggestion will be a
>>> big win, since nobody has tried this yet. But I'm optimistic. The
>>> idea would be to make a zip archive of
>>> $SAGE_ROOT/local/lib/python/site-packages (say), and do *all* imports
>>> using that massive zip archive.
>>
>> I'm optimistic too. This would, of course, make more sense for
>> system-wide installs than development versions, but the former are
>> more likely to be on a non-local filesystem anyways.
>
>
> Sounds like it is time for a trial!
>
> I created a directory of 2000 .py files and an __init__.py file to make
> it a package:
>
> for i in range(2000):
>     with open('importtest/test_%s.py'%i,'w') as f:
>         f.write("VALUE=%s\n"%i)
> with open('importtest/__init__.py','w') as f:
>     f.write(' ')
>
> Then I imported each of these so that .pyc files were created.
>
> for i in range(2000):
>     exec 'import importtest.test_%s'%i
>
>
> Okay, then I copied the directory and zipped it up (in the shell now):
>
> $ cp -r importtest zipimporttest
> $ zip -r tmp.zip zipimporttest
> $ rm -rf zipimporttest
>
> One nice side effect is that the zip file is less than 1 MB, while the
> directory of Python files is around 16 MB.
>
> Now for the test. Here are my two scripts. One imports each module in
> the directory and adds up the VALUE in each module:
>
> % cat mytest.py
> s=0
> for i in range(2000):
>     exec 'import importtest.test_%s as tt'%i
>     s+=tt.VALUE
> print s
>
>
> The other first adds the zip to the front of sys.path and then does the
> same imports and summing, but using the zipped module:
>
> % cat mytestzip.py
> import sys
> sys.path.insert(0,'./tmp.zip')
> s=0
> for i in range(2000):
>     exec 'import zipimporttest.test_%s as tt'%i
>     s+=tt.VALUE
> print s
>
>
> And now for the timings:
>
> % time sage -python mytest.py
> Detected SAGE64 flag
> Building Sage on OS X in 64-bit mode
> 1999000
> sage -python mytest.py 0.26s user 1.47s system 75% cpu 2.282 total
>
>
> % time sage -python mytestzip.py
> Detected SAGE64 flag
> Building Sage on OS X in 64-bit mode
> 1999000
> sage -python mytestzip.py 0.21s user 0.11s system 99% cpu 0.327 total
>
>
> It looks like the zip is a clear winner in this case: 0.327 s total versus
> 2.282 s, roughly seven times faster. And this is with the directory
> presumably already in the FS cache.
>
> Thanks,
>
> Jason
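
For anyone repeating this on Python 3, where `exec 'import ...'` is no longer valid syntax, here is a rough, self-contained sketch of the same comparison. The helper names (`build_fixtures`, `timed_sum`, `run`) and the use of a temporary directory are assumptions made for illustration. Unlike the test above, it does not precompile .pyc files and it times only the import loop inside a single process, so the numbers it prints are not directly comparable to the timings above.

```python
import importlib
import os
import sys
import tempfile
import time
import zipfile

N_MODULES = 2000  # same module count as in the test above


def build_fixtures(root, n=N_MODULES):
    """Create package 'importtest' on disk and 'zipimporttest' inside tmp.zip."""
    pkg_dir = os.path.join(root, "importtest")
    os.mkdir(pkg_dir)
    zip_path = os.path.join(root, "tmp.zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
            f.write("")
        archive.writestr("zipimporttest/__init__.py", "")
        for i in range(n):
            source = "VALUE=%s\n" % i
            with open(os.path.join(pkg_dir, "test_%s.py" % i), "w") as f:
                f.write(source)
            archive.writestr("zipimporttest/test_%s.py" % i, source)
    return zip_path


def timed_sum(package, n=N_MODULES):
    """Import package.test_0 .. package.test_(n-1); return (sum of VALUE, seconds)."""
    start = time.perf_counter()
    total = 0
    for i in range(n):
        total += importlib.import_module("%s.test_%s" % (package, i)).VALUE
    return total, time.perf_counter() - start


def run(n=N_MODULES):
    """Build both fixtures, then time the directory import and the zip import."""
    root = tempfile.mkdtemp()
    zip_path = build_fixtures(root, n)
    sys.path.insert(0, root)      # plain directory import
    sys.path.insert(0, zip_path)  # zipimport, as in mytestzip.py
    importlib.invalidate_caches()
    dir_total, dir_secs = timed_sum("importtest", n)
    zip_total, zip_secs = timed_sum("zipimporttest", n)
    return dir_total, dir_secs, zip_total, zip_secs


if __name__ == "__main__":
    dir_total, dir_secs, zip_total, zip_secs = run()
    print("directory: sum=%d in %.3fs" % (dir_total, dir_secs))
    print("zip:       sum=%d in %.3fs" % (zip_total, zip_secs))
```

Both sums should agree with each other; the interesting part is how the two timings compare when the temporary directory lives on a high-latency filesystem.
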
>
> --
> To post to this group, send an email to sage-devel@googlegroups.com
> To unsubscribe from this group, send an email to
> sage-devel+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/sage-devel
> URL: http://www.sagemath.org
>



-- 
William Stein
Professor of Mathematics
University of Washington
http://wstein.org

