Hi David,

Thanks a lot for the good question - I completely forgot to discuss that.

The current workloads in CK are only there to test our collaborative optimization prototype. They are even a bit outdated (benchmarks and codelets from the MILEPOST project).

However, our point is to make an open system where the community can
add any workload via GitHub with some meta information in JSON format
to be able to participate in collaborative tuning. This meta information exposes
data sets used, command lines, input/output files, etc. This helps add
multiple data sets for a given benchmark or even reuse already shared ones.
Finally, this meta information makes it relatively straightforward to apply
predictive analytics to find correlations between workloads and optimizations.
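
As a rough illustration of the kind of meta information described above, here is a minimal Python sketch. The key names (name, datasets, run_cmd and so on) and the command_lines helper are hypothetical and only meant to show the idea; they are not the actual CK meta format.

```python
import json

# Hypothetical workload description. The key names are illustrative
# only and are NOT taken from the actual CK meta format.
workload_meta = {
    "name": "example-benchmark",
    "source_files": ["main.c"],
    "datasets": [
        {"name": "small", "input_files": ["small.dat"]},
        {"name": "large", "input_files": ["large.dat"]},
    ],
    "run_cmd": "./a.out {input_file}",
    "output_files": ["result.txt"],
}


def command_lines(meta):
    """Expand the command-line template for every data set and input file."""
    return [
        meta["run_cmd"].format(input_file=input_file)
        for dataset in meta["datasets"]
        for input_file in dataset["input_files"]
    ]


# Serialize to JSON (the form that would be shared) and read it back.
meta_json = json.dumps(workload_meta, indent=2)
print(meta_json)
print(command_lines(json.loads(meta_json)))
```

Because data sets are listed separately from the command-line template, adding another data set to a benchmark only means appending one more entry to the list.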

Our hope is to eventually build a large and diverse pool of public workloads.
In that case, users themselves can derive representative workloads
for their requirements (performance, code size, energy, resource constraints, etc.)
and target hardware. Furthermore, since optimization spaces are huge and it is
infeasible for one user, or even one data center, to explore them,
our approach allows all shared workloads to continuously participate
in crowdtuning, i.e. searching for good optimizations across diverse platforms
while recording "unexpected behavior".

Actually, adding more workloads to CK (while making this process more user-friendly)
and tuning them could be a GSoC project - we can help with that ...

You can find more about our view here:
* http://arxiv.org/abs/1506.06256
* https://hal.inria.fr/hal-01054763

Hope it makes sense and take care,

On 05/03/2016 16:16, David Edelsohn wrote:
On Sat, Mar 5, 2016 at 9:13 AM, Grigori Fursin <grig...@dividiti.com> wrote:
Dear colleagues,

If it's of interest, we have released a new version of our open-source
framework to share compiler optimization knowledge across diverse workloads
and hardware. We would like to thank all the volunteers who ran this
framework and shared some results for GCC 4.9 .. 6.0 in the public
repository here: http://cTuning.org/crowdtuning-results-gcc

Here is a brief note on how this framework for crowdtuning compiler
optimization heuristics works (for more details, please see
https://github.com/ctuning/ck/wiki/Crowdsource_Experiments): you just
install a small Android app or the Python-based Collective Knowledge
framework (http://github.com/ctuning/ck). This program sends system properties to a
public server. The server compiles a random shared workload using some flag
combinations that have been found to work well on similar machines, as well
as some new random ones. The client executes the compiled workload several
times to account for variability etc., and sends the results back to the server.

If a combination of compiler flags is found that improves performance over
the combinations found so far, it gets reduced (by removing flags that do
not affect the performance) and uploaded to a public repository.
Importantly, if a combination significantly degrades performance for a
particular workload, it gets recorded as well. This potentially points to a
problem with optimization heuristics for a particular target, which may be
worth investigating and improving.
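
The reduction step described above (removing flags that do not affect performance) can be sketched roughly as follows. This is only one simple greedy way to do it, under stated assumptions, and not necessarily how the framework implements it: reduce_flags, the tolerance parameter and toy_measure are hypothetical names, and toy_measure is a synthetic cost model standing in for actually compiling and running a workload.

```python
def reduce_flags(flags, measure, tolerance=0.02):
    """Greedily drop flags whose removal does not slow the workload down.

    flags:     list of compiler flag strings
    measure:   callable taking a list of flags and returning a run time
               (lower is better)
    tolerance: relative slowdown still treated as "no effect"
    """
    best = list(flags)
    best_time = measure(best)
    changed = True
    while changed:  # repeat until a full pass removes nothing
        changed = False
        for flag in list(best):
            candidate = [f for f in best if f != flag]
            t = measure(candidate)
            if t <= best_time * (1 + tolerance):
                # The flag made no measurable difference, so drop it.
                best = candidate
                best_time = min(best_time, t)
                changed = True
    return best


def toy_measure(flags):
    """Synthetic cost model (assumption): only two flags matter here."""
    time = 10.0
    if "-O3" in flags:
        time -= 3.0
    if "-funroll-loops" in flags:
        time -= 1.0
    return time


reduced = reduce_flags(
    ["-O3", "-fno-inline", "-funroll-loops", "-fomit-frame-pointer"],
    toy_measure,
)
print(reduced)  # ['-O3', '-funroll-loops'] under the toy cost model
```

With real measurements, measure would need to run the workload several times and return a robust statistic such as the median, since run-to-run variability can otherwise hide or fake the effect of a single flag.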

At the moment, only global GCC compiler flags are exposed for collaborative
optimization. Longer term, it can be useful to cover finer-grain
transformation decisions (vectorization, unrolling, etc) via plugin
interface. Please note that this is a prototype framework and much more can
be done! Please get in touch if you are interested to know more or

Thanks for creating and sharing this interesting framework.

I think a central issue is the "random shared workload" because the
optimal optimizations and optimization pipeline are
application-dependent.  The proposed changes to the heuristics may
benefit the particular set of workloads that the framework tests,
but why are those workloads and particular implementations of the
workloads representative of applications of interest to end users of
GCC?   GCC is tuned for an arbitrary set of workloads, but why are
the workloads from cTuning any better?

Thanks, David
