On Jan 15, 11:51 pm, Emil Widmann <emil.widm...@gmail.com> wrote:
> Stripping Sage Binaries II
> --------------------------
>
> By hardlinking duplicate files and stripping executables, a size
> reduction of 438 MB (-26%) was achieved. Further reduction involves
> moving directories, which breaks sage -testall. The goal is to produce
> a binary package of Sage with adequate functionality and reduced
> size, so I couldn't resist pushing on and accepting failing tests to see
> the overall potential.
>
> Preliminary results:
> -----------------------------------------------
> Original Directory tree (~1900 MB)
> sage-binary (775 MB / squashed FS 218 MB)
> That is: -60% / - 88 % compared to original size !!!
> sage-dev (529 MB)
> sage-doc (222 MB)
>
> The resulting binaries seem to work in a superficial test. The
> stripped binaries can be tested as a live ISO or in a virtual machine
> image (an easy frugal install into an existing Linux desktop is also
> possible).
>
> Download the ISO image (400 MB, that's the Live CD base distro + stripped
> binaries): http://boxen.math.washington.edu/home/emil/sagelithe/
>
> Stripping Procedure:
> --------------------
> It started with building a binary distribution on sagelive-511-46-r3
> (Live CD release) - 1) (see Footnote). The resulting directory tree of
> this build was manually split into 3 directories:
> sage-binary
> sage-devel
> sage-doc
>
> The bulk of what was moved out of the original binary tree consists of
> the following directories 2):
> SAGE_ROOT/devel/sage (190 MB)
> SAGE_ROOT/devel/sage-main/build/lib.linux-i686-2.6/sage (78 MB)
> SAGE_ROOT/devel/sage-main/build/temp.linux-i686-2.6/sage (104 MB)
> SAGE_ROOT/devel/sagenb-main/dist (14 MB)
> SAGE_ROOT/devel/sagenb-main/build/lib/sagenb (35 MB)
> hidden directories:
> SAGE_ROOT/devel/sage-main/.hg (50 MB)
> SAGE_ROOT/devel/sage-main/.hg (39 MB)
>
> After that, the stripping procedure from the first attempt (hardlink
> duplicate file instances, strip binaries) was applied to the binary
> directory 3).
>
> The produced binary package worked for me in a brief test 4) in a
> fresh install of the base distribution. However sage -testall is not
> working anymore, so it is not easy to give confirmation about which
> parts of sage might be broken. Tracebacks seemed to work, because all
> the python source code stayed in the remaining directories.
>
> To investigate further possibilities for reduction, I also checked
> the size of the source files still present in the binary tree 5):
>
> Total file size of Python source is: 83506845 Bytes
> Total file size of lisp source is 14091345 Bytes
> Total file size of C source is 5444780 Bytes
> Total file size of C++ source is 163105 Bytes
> Total file size of C headers is 3779884 Bytes
> --------------------
> Total size of source code found: 106985959 Bytes
>
> So removing the source files would gain another 100 MB.
> As I understand it, the ability to show tracebacks on errors would be
> lost, and right now I fear that it would break Sage completely. Another
> aspect: there are lots of comments. An educated guess for the size
> of the comments in the Python code is about 40 MB. This estimate
> assumes that the original line numbering is preserved, so tracebacks
> would still yield the right line numbers. If one assumes that the C code
> could also be moved out, a reduction of about 50 MB would be
> possible (I don't know whether Maxima's Lisp code can be moved
> out).
>
> Regarding binaries, there would be the possibility to use upx
> compression. In the Live CD this is not needed, because files are
> already in a squashed FS. But for distributions which use uncompressed
> Filesystems this could give further substantial reduction.
>
> There was no prior knowledge of the structure of the sage package. So
> it might be possible that the split is not correct and some essential
> files are missing in the binaries. There is also the possibility that
> many files and directories could still be omitted in the binary
> package and shifted to one of the others.
>
> For further work I would be grateful for any input regarding the
> following:
> Testing of the binaries: any suggestions on how to implement a working
> "sage -testall" for similar binaries?
> Feedback on the quality of the split: which files and directories
> were missed, or are now in the wrong place?
> Information about the doc tree: which files are needed to make
> the ? command in the CLI work?
> Testing of the development abilities: how does it behave if the
> development package is loaded? Can --strip-unneeded binaries be used
> for development? (Otherwise it would be possible to fall back to
> --strip-debug for libraries.)
>
> Summary
> -------
> A substantial reduction in the size of the Sage binaries was achieved
> using a combined approach of manual splitting, hardlinking duplicate
> files and stripping executables. The binary package was reduced to a
> size of 792 MB compared to a size of over 1900 MB of the original
> directory tree. This is a reduction of 60%. Size reduction in the
> squashed package was from 438 MB to 222 MB (-49 %).  "sage -testall"
> does not work any more in the reduced binary, so there is further
> testing needed to confirm the functionality of the created binary
> package.
>
> Footnotes:
> ---------
> 1)
>  #!/bin/sh
>  # build sage binaries for sagelive, be sure that Tcltk is installed
>  export SAGE_MATPLOTLIB_GUI="yes"
>  export SAGE_FAT_BINARY="yes"
>  make
>  ./sage -bdist sagelive-511-4.6.1-r4-fat
>
> comment:
> In my opinion it is important that as many features of the Sage
> components as possible are available. There is access to plotting from
> R and pylab (TCL backend) in a standard way. It has not been possible to
> integrate other matplotlib backends so far; I wonder how much
> they would add to the total size?
>
> Are there any additional environment variables that should be set to
> generate the binaries? The idea is to use the sage Components and
> libraries as core of the distribution and to integrate it tightly.
> What do other components need (e.g. maxima) to "work out of a box"?
>
> 2)
> Text files with du -ch output for the packages are available here:
> http://boxen.math.washington.edu/home/emil/sagelithe
> The doc and dev packages can be loaded as packages into the live
> version.
> (coming soon ...)
>
> 3)
> This is the procedure to hardlink multi-file instances and strip
> binaries
>
> #!/bin/sh
> # Script to reduce the size of a directory tree and its binaries. Uses the
> # package fslint (http://www.pixelbeat.org/fslint/).
> # Be sure to have the fslint scripts in your PATH, or edit the findup
> # line below so that findup is found.
> cd SAGE_ROOT
> # replace duplicate files with hardlinks
> findup -m .
> # strip executables
> find . | xargs file | grep "executable" | grep ELF | cut -f 1 -d : |
>     xargs strip --strip-unneeded 2> /dev/null
> # Level 1 stripping for shared libraries (comment/uncomment to switch)
> # find . | xargs file | grep "shared object" | grep ELF | cut -f 1 -d : |
> #     xargs strip --strip-debug 2> /dev/null
> # Level 2 stripping for shared libraries (comment/uncomment to switch)
> find . | xargs file | grep "shared object" | grep ELF | cut -f 1 -d : |
>     xargs strip --strip-unneeded 2> /dev/null
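
The pipeline in footnote 3 splits file names on whitespace and greps the text printed by file, so paths containing spaces or the word "executable" can be mishandled. A minimal sketch of one alternative is shown here. It is an assumption, not the procedure used for the numbers above. The helper name list_elf is hypothetical, and it detects ELF files by their four-byte magic number instead of calling file.

```shell
#!/bin/sh
# list_elf DIR: print every regular file under DIR that starts with the
# ELF magic number (0x7f 'E' 'L' 'F'). Hypothetical helper, not part of
# Sage or fslint. File names containing spaces are handled because find
# passes them to the inner shell as separate arguments.
list_elf() {
    find "$1" -type f -exec sh -c '
        for f in "$@"; do
            # od prints the first four bytes as: 177 E L F
            magic=$(od -An -c -N 4 "$f" | tr -d " \n")
            if [ "$magic" = "177ELF" ]; then
                printf "%s\n" "$f"
            fi
        done
    ' sh {} +
}
```

The output could then be fed to strip one file at a time, for example with `list_elf . | while IFS= read -r f; do strip --strip-unneeded "$f"; done`. Unlike the original script, this sketch does not distinguish executables from shared objects.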
>
> 4)
> sage starts up ok in console and in the notebook.
> some quick plotting and easy equation solving works in the notebook
> without flaws.
>
> sage -sh
> R
> demo(graphics)
> works, produces R demo plottings.
>
> sage -python
> from pylab import *
> plot ([1,2],[2,1])
> show()
>
> produced a plot
> (I compiled with TclTk and have this dependency included in sagelive)
>
> Built-in help (docstrings) doesn't work in the console! I.e. plot ? gives
> just a short description and then
> Docstring:
> < no docstring >
>
> same command in the notebook works well.
>
> 5)
> just a quick copy paste hack:
>
> #!/bin/sh
> # calculates size of source files in directory tree
> tsum=0
> sum=0
> # check python
> for k in `find . -name '*.py' -exec ls -l {} \+ | awk '{print $5}'`
> do
>    sum=$((sum+k))
> done
> echo "Total file size of Python source is: $sum Bytes"
> tsum=$((tsum+sum))
> sum=0
> # check lisp
> for k in `find . -name '*.lisp' -exec ls -l {} \+ | awk '{print $5}'`
> ...SNIP
> etc ...
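
The size count in footnote 5 can also be done without parsing ls output. The following is only a sketch under assumptions: source_bytes is a hypothetical helper name, and it counts bytes by concatenating the matching files, which gives the same total as summing the file sizes for regular files.

```shell
#!/bin/sh
# source_bytes DIR PATTERN: print the total size in bytes of all regular
# files under DIR whose name matches PATTERN (a find -name glob).
# Hypothetical helper name; the pattern must be quoted by the caller so
# the shell does not expand it before find sees it.
source_bytes() {
    # wc -c pads its output with spaces on some systems, so strip them
    find "$1" -type f -name "$2" -exec cat {} + | wc -c | tr -d ' '
}
```

For example, `source_bytes . '*.py'` and `source_bytes . '*.lisp'` would print the Python and Lisp totals for the current directory tree.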

I always have to laugh when I do, say or write something silly, but at
least it's a good learning experience. At least now I am beginning to
grasp what docstrings are and how they work. I also checked the sage -t
command.
I used
sage

So basically the tests for the stripped binary almost work. Four failures
remain:

        sage -t  "devel/sage/build/sage/misc/preparser.py"
        sage -t  "devel/sage/build/sage/misc/sagedoc.py"
        sage -t  "devel/sage/build/sage/misc/sageinspect.py"
        sage -t  "devel/sagenb/sagenb/misc/sageinspect.py"

Of those, the failure of sagedoc.py is clear to me. It should be
fixed once the doc package is loaded.
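
To recheck only the remaining failures after loading the doc package, the files can be rerun one at a time. This is a rough sketch, not a tested recipe: rerun_tests is a hypothetical helper, SAGE is an assumed variable naming the sage launcher, and the loop assumes that sage -t exits with a non-zero status when a file fails.

```shell
#!/bin/sh
# SAGE names the sage launcher; ./sage is an assumed default.
SAGE=${SAGE:-./sage}

# rerun_tests FILE...: run "sage -t" on each file and report the result.
# Returns success only if every file passes.
rerun_tests() {
    failed=0
    for f in "$@"; do
        if "$SAGE" -t "$f" >/dev/null 2>&1; then
            echo "PASS $f"
        else
            echo "FAIL $f"
            failed=$((failed + 1))
        fi
    done
    echo "$failed of $# still failing"
    [ "$failed" -eq 0 ]
}
```

Called from SAGE_ROOT, this would be `rerun_tests devel/sage/build/sage/misc/preparser.py devel/sage/build/sage/misc/sagedoc.py` and so on for the four files listed above.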


