RE: MiDataSets for MiBench to enable more realistic benchmarking and better tuning of the GCC optimization heuristic

Grigori Fursin Mon, 19 Mar 2007 08:18:26 -0800

Yes, you are right that without a good analysis part it is semi-useless.
Actually, it can even make life harder, since instead of one dataset
you now have to try many ;) ...

We did some preliminary analysis of compiler optimizations for programs
with multiple datasets in our HiPEAC'07 paper using the PathScale compiler,
and we are now working on better program characterization with multiple
inputs and on how it should be used to improve compiler heuristics.

So, at the moment it is still more for research purposes. We had some requests
to make these datasets public after the HiPEAC conference and the HiPEAC GCC
tutorial so that researchers and engineers can start looking at this issue.
There was interest from some companies that develop embedded systems: they use
GCC more and more, and they often use MediaBench and MiBench for
benchmarking, but find that using only two datasets may not be representative
enough ...

I now have a few projects where we use GCC and MiDataSets,
so whenever we have more practical results, I will post them here!

Hope it will be of some use,
Grigori

=====================================================
Grigori Fursin, PhD
INRIA Futurs, France
http://fursin.net/research

On 3/19/07, Grigori Fursin <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> In case someone is interested, we are developing a set of inputs
> (MiDataSets) for the MiBench benchmark. Iterative optimization is now 
> a popular technique to obtain performance or code size improvements 
> over the default settings in a compiler. However, in most of the 
> research projects, the best configuration is found for one arbitrary 
> dataset and it is assumed that this configuration will work well with 
> any other dataset that a program uses.
> We created 20 different datasets per program for the free MiBench
> benchmark to evaluate this assumption and analyze the behavior of 
> various programs with multiple datasets. We hope that this will enable 
> more realistic benchmarking, practical iterative optimizations 
> (iterative compilation), and can help to automatically improve GCC 
> optimization heuristic.
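
A minimal sketch of how the assumption described above could be checked: pick
the best configuration on one dataset, then see how much is lost on the others
compared with each dataset's own best configuration. The timings are made-up
placeholders, not MiBench or MiDataSets measurements, and the function names
are hypothetical; only the GCC flags themselves are real.

```python
# Hypothetical run times in seconds (illustrative only, not measured results).
# timings[config][d] is the run time of the program built with `config`
# when executed on dataset number d.
timings = {
    "-O2": [1.00, 1.10, 0.95, 1.30],
    "-O3": [0.90, 1.20, 0.97, 1.10],
    "-O3 -funroll-loops": [0.85, 1.25, 1.05, 1.00],
}


def best_config(timings, dataset):
    """Return the configuration with the lowest run time on one dataset."""
    return min(timings, key=lambda cfg: timings[cfg][dataset])


def slowdown_vs_per_dataset_best(timings, tuned_on):
    """Tune on a single dataset, then compare against per-dataset optima.

    Returns the chosen configuration and, for every dataset, the ratio of
    its run time to the best run time any configuration achieves on that
    dataset (1.0 means nothing is lost by reusing the chosen configuration).
    """
    chosen = best_config(timings, tuned_on)
    n_datasets = len(timings[chosen])
    ratios = []
    for d in range(n_datasets):
        best_time = min(times[d] for times in timings.values())
        ratios.append(timings[chosen][d] / best_time)
    return chosen, ratios


if __name__ == "__main__":
    chosen, ratios = slowdown_vs_per_dataset_best(timings, tuned_on=0)
    print("tuned on dataset 0, chosen:", chosen)
    for d, r in enumerate(ratios):
        print(f"dataset {d}: {r:.3f}x the per-dataset best")
```

With real measurements in place of the placeholder table, ratios well above
1.0 on some datasets would indicate that a configuration tuned on one input
does not carry over to the others.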

I think this is nice but semi-useless unless you also look into why stuff is
better.
The analysis part is really the hard part, but it is also the most useful part
for figuring out why GCC is failing to produce good code.

An example of this: I was working on a patch which speeds up most code (and
reduces code size there) but slows down some code (and increases the code size
too), and I found that scheduling and block-reordering decisions would change,
which caused the code to become slower/larger. This analysis was necessary to
figure out that my patch/pass was not directly causing the slower/larger code.
The same kind of analysis is needed for any kind of heuristic tuning.

Thanks,
Andrew Pinski
