On Mon, May 16, 2016 at 4:30 AM, James Greenhalgh
<james.greenha...@arm.com> wrote:
> As this change will change code generation for all cores (except
> Exynos-M1), I'd like to hear from those with more detailed knowledge of
> ThunderX, X-Gene and qdf24xx before I take this patch.

It looks like a slight loss on qdf24xx on SPEC CPU2006 at -O3. I see about a 0.37% loss on the integer benchmarks, and no significant change on the FP benchmarks. The integer loss is mainly due to 458.sjeng, which drops 2%.

We had tried various values for max_case_values earlier and didn't see any performance improvement from setting it, so we are using the default value. We've been tracking changes to the FSF tree and adjusting our tuning structure as necessary, so I'm not too concerned about this. We will just set the max_case_values field in the tuning structure to get the result we want.

What I am slightly concerned about is that the max_case_values field is only used at -O3 and above, which limits its usefulness. If a port has specified a value, it probably should be used for all non-size optimization, which means we should check for optimize_size first, then check for a CPU-specific value, then use the default. If you do that, then you don't need to change the default to get better generic/a53 code; you can change it in the generic and/or a53 tuning tables.

Though I see that the original patch from Samsung that added the max_case_values field has the -O3 check, so there was apparently some reason why they wanted it to work that way. The value that the exynos-m1 is using, 48, looks pretty large, so maybe they thought that the code size expansion from that is only OK at -O3 and above. Worst case, we might need two max_case_values fields, one to use at -O1/-O2, and one to use at -O3.

Jim
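
For illustration, the check order suggested above (optimize_size first, then a CPU-specific value, then the default) might look roughly like the sketch below. This is not the actual aarch64 code: the struct, the function name, and the two default parameters are hypothetical stand-ins, and a max_case_values of zero is assumed to mean that the port has not specified a value.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a per-CPU tuning table.  Only the field
   discussed in this thread is modelled; 0 is assumed to mean
   "no CPU-specific value".  */
struct tuning_sketch
{
  unsigned int max_case_values;
};

/* Sketch of the proposed order: size optimization first, then the
   CPU-specific value, then the default.  The two defaults are passed
   in as parameters so that no real default is assumed here.  */
unsigned int
case_values_threshold_sketch (bool optimize_size,
                              const struct tuning_sketch *tune,
                              unsigned int size_default,
                              unsigned int speed_default)
{
  /* When optimizing for size, ignore any CPU-specific tuning.  */
  if (optimize_size)
    return size_default;

  /* Otherwise honor the port's value at every optimization level,
     not only at -O3 and above.  */
  if (tune->max_case_values != 0)
    return tune->max_case_values;

  /* No CPU-specific value: fall back to the default.  */
  return speed_default;
}
```

With this ordering, a tuning table that sets the field to 48 would get 48 at any non-size optimization level, and a table that leaves it at zero keeps the default.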