Hi Devs,

Firstly, thanks for making the effort to get SAGE this far. At the moment I'm an MMA (Mathematica) user tracking its progress, and I have some 'user-level' questions about SAGE's development, specifically related to parallel calculations. I'll also indulge in some observations that may be a little off-topic (?).
Context: I'm a member of the Business faculty within Sydney Uni - home of the Magma black hole, it seems, though I was totally unaware of that software :) My use case is not computer algebra research, so this may all be off topic. I have used MMA's computer algebra engine, usually as one step on the way to substituting a data observation for a symbol rather than as an end in itself - though I have done that too. I have also 'played' with writing some MPI-based C code, with lots of frustration (no parallel debugger) and no 'real' success - again, no parallel debugger :)

The questions: [remember, non-mathematician, so be gentle and try not to laugh out loud :)]

On Feb 8 2007, 1:29 pm, "William Stein" <[EMAIL PROTECTED]> wrote:
> On Wed, 07 Feb 2007 18:07:32 -0700, didier deshommes <[EMAIL PROTECTED]> wrote:
> > On 2/2/07, David Harvey <[EMAIL PROTECTED]> wrote:
> > > Hi everyone,
> > > It would be good if someone who was at the parallel computing workshop
> > > could volunteer to give a talk about parallelism issues at SAGE Days 3,
> > > as they might relate to SAGE. I'm sure some of the other SAGE
> > > developers would be interested to hear about it, and we all probably
> > > will have some thoughts after letting it percolate around for two
> > > weeks.
> >
> > Here are some questions that I would love to see answered:
> > * OpenMP vs MPI: which one would be better for SAGE and why?
> > * Is there a clear winner in the python world for parallel
> > applications? googling for either "open mp python" or "mpi python"
> > shows several modules.
>
> MPI. It's industry strength, very mature, openmpi is an excellent
> implementation, python supports it very well, as does IPython,
> and it's widely used. OpenMP is barely deployed in comparison.
> SAGE is about using today's technology, not tomorrow's. That said,
> I bet most of the implementation of parallel algorithms in SAGE
> will not require MPI at all (i.e., they'll use either pthreads
> for low-level things, ipython for mid-level, and dsage for
> task farming).
>
> > * Are there any plans to rewrite basic algorithms so that they are
> > parallel-friendly?

Q1) Is the focus on performing symbolic calculations in parallel, or are numerically oriented calculations currently being considered?

> Yes. But this won't happen on a grand scale until at least a dozen
> sample applications get written to use parallelism (not re-written --
> all sequential algorithms will stay), and we learn from the outcome
> of this.

Q2) Have the MPI-based PETSc and TAO libraries been considered for inclusion? Or considered and rejected?

PETSc: http://www-unix.mcs.anl.gov/petsc/petsc-as/ (and http://acts.nersc.gov/petsc/)
TAO: http://www-unix.mcs.anl.gov/tao/

Rather than being an extra wheel on the SAGE 'car', they might represent the addition of a couple of extra cylinders to the SAGE car engine :)

> > * Does anyone know of computer algebra systems that take
> > advantage of more than 1 CPU?
>
> Maple, Mathematica, MATLAB, and ParallelGAP. But with the first
> three who knows how or what they do. The last is dated.

Wolfram has gridMathematica and its Personal Grid Edition, but I've not used them - for reasons I mention below in some observations. I was never certain what this meant for their symbolic calculations.

> > * How much of an issue is thread-safety in current libraries that SAGE
> > uses? Are there ways around it?
>
> Python is not thread safe. This stops a number of (probably bad,
> in retrospect!) ideas in their tracks.
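(An aside on the "python supports MPI very well" point above: from the user side, the sort of MPI-from-Python I've been playing with looks roughly like the sketch below. I'm assuming the mpi4py bindings purely for illustration - I don't know which binding, if any, the devs have in mind - and the sum-of-squares is just a stand-in for a real calculation.)

    # Minimal "MPI from Python" sketch, assuming the mpi4py bindings.
    # Run with something like:  mpiexec -n 4 python sum_of_squares.py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    # Rank 0 prepares one chunk of (fake) data per process...
    if rank == 0:
        chunks = [range(i * 1000, (i + 1) * 1000) for i in range(size)]
    else:
        chunks = None

    # ...each process receives its chunk and does a partial calculation...
    chunk = comm.scatter(chunks, root=0)
    partial = sum(x * x for x in chunk)

    # ...and the partial results are combined back on rank 0.
    total = comm.reduce(partial, op=MPI.SUM, root=0)
    if rank == 0:
        print("sum of squares: %s" % total)

Even at this toy scale I found myself wanting a parallel debugger, which is one of the observations below.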
> I doubt threaded techniques are going to be used much, if at all,
> in the core SAGE library. Probably IPython will be used a lot
> in the core library, and dsage will be used a lot by end users.

Some observations. Disclaimer: these observations relate to a particular use case and aren't general, and nothing mentioned here is intended to criticize or invalidate the SAGE effort.

- With parallel programming it seems that a good (any?) parallel debugger becomes essential. I found things getting very tricky when writing for MPI.

- The SAGE notebook interface is an excellent idea; it seems similar to an MMA notebook in some respects. I wonder whether, for the same reasons, SAGE users will come to need something like the Wolfram Workbench? I found that application (Eclipse-based) to be very useful when writing some MMA scripts. I guess this might be the case for some SAGE (non-math-research) users.

- Putting the above two observations together, it seems that a SAGE IDE might lie ahead. Continuing the 'build a car, not another wheel' philosophy, I thought it worth pointing out that the Eclipse project has a Parallel Tools Platform, as well as a Dynamic Languages Toolkit and Pydev. Tying these together as a SAGE workbench might be something the Eclipse Foundation, or one of its members, would be interested in sponsoring as a SAGE IDE effort? (Disclaimer: I use Netbeans (primarily) _and_ Eclipse.)

  Eclipse PTP: http://www.eclipse.org/ptp/
  Eclipse DLTK: http://www.eclipse.org/dltk/
  Pydev: http://pydev.sourceforge.net/

- MPI vs threading: this is very application-specific. One potential use of SAGE is to do very sophisticated calculations on very large data sets. I don't wish to depreciate the importance of 'best-speed' implementations - their benefits accrue whether you use one CPU or many. Nonetheless I have found that there quickly comes a point where data-related resource demands, latencies and 'issues' dominate calculation time. In those cases an MPI-based algorithm that I can throw 100 computers' CPUs, memory, hard disks and network bandwidth at beats a single-computer (multiple-core) multi-threaded implementation. I'd be very happy with single-threaded but genuinely (beyond one machine) scalable algorithms.

- Please think of a data point as a symbol in a computer algebra system :) That is, if at all possible, treat data points as first-class citizens. My MMA gripe, and the reason for not using gridMathematica, is that it is atrocious at handling data. I quickly realized that writing to use gridMMA with my data would be more painful and take longer than switching to learn R and using Condor - calling MMA only when desperately in need of some arbitrary-precision calculation.

- If you aren't already familiar with Amazon's web services, I'd strongly encourage taking an hour or two to explore them, specifically the Amazon machine images. To my mind this is a seismic shift in the availability of computing power, and it can change your mindset when thinking about SAGE applications. Given that 1TB of storage will become available on demand, I do think the 'sophisticated calculation on massive datasets' scenario will become a much more common use case. It will also shift what you consider to be a 'standard' use case. Example: for USD 80.00 I can employ 100 machines, each with 8 CPUs, for one hour - i.e. 800 CPUs and 100 x 8 GB of memory.
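(To make that concrete: spinning up such instances programmatically is roughly the sketch below. I'm assuming the boto library here just for illustration, and the AMI id, keypair name, instance type and credentials are all placeholders, not real values.)

    # Rough sketch of launching a batch of EC2 instances, assuming boto.
    import boto

    conn = boto.connect_ec2(aws_access_key_id='MY_ACCESS_KEY',
                            aws_secret_access_key='MY_SECRET_KEY')

    # Ask for 100 instances of a (placeholder) machine image in one call.
    reservation = conn.run_instances('ami-00000000',
                                     min_count=100,
                                     max_count=100,
                                     key_name='my-keypair',
                                     instance_type='m1.xlarge')

    # Refresh and report each instance's state ('pending' -> 'running').
    for instance in reservation.instances:
        instance.update()
        print("%s %s" % (instance.id, instance.state))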
For some more $'s, each instance could have up to 1TB (or probably higher in a year or so) of storage :)

Currently I use an Amazon machine image as a (Gnome) desktop machine. I know one user who considered using an 8-CPU instance to compile their code more rapidly, and this might help some SAGE devs too?

Again, thanks for all the exceptional efforts; I'll continue to watch in anticipation. (Currently my open-source efforts are spent on a Ruby ORM - Sequel.)

Regards
Mark