From: Soumya AR <soum...@nvidia.com>

Hi,

This RFC and the subsequent patch series introduce support for printing and
parsing aarch64 tuning parameters in the form of JSON.

It is important to note that this mechanism is specifically intended for power
users to experiment with tuning parameters. This proposal does not suggest the
use of JSON tuning files in production. Additionally, the JSON format should not
be considered stable and may change as GCC evolves.

[1] Introduction

Currently, the aarch64 backend in GCC (15) stores the tuning parameters of all
supported CPUs under gcc/config/aarch64/tuning_models/. Since these parameters
are hardcoded for each CPU, this RFC proposes a technique to support the
adjustment of these parameters at runtime. This allows easier experimentation
with more aggressive parameters to find optimal numbers.

The tuning data is fed to the compiler in JSON format, which was primarily 
chosen for the following reasons:

* JSON can represent hierarchical data. This is useful for incorporating the
nested nature of the tuning structures.
* JSON supports integers, strings, booleans, and arrays.
* GCC already has support for parsing and printing JSON, removing the need for
writing APIs to read and write the JSON files.
 
Thus, if we take the following example of some tuning parameters:

static struct cpu_addrcost_table generic_armv9_a_addrcost_table =
{
    {
      1, /* hi  */
      0, /* si  */
      0, /* di  */
      1, /* ti  */
    },
  0, /* pre_modify  */
  0, /* post_modify  */
  2, /* post_modify_ld3_st3  */
  2, /* post_modify_ld4_st4  */
};

static cpu_prefetch_tune generic_armv9a_prefetch_tune =
{
  0,                    /* num_slots  */
  -1,                   /* l1_cache_size  */
  64,                   /* l1_cache_line_size  */
  -1,                   /* l2_cache_size  */
  true,                 /* prefetch_dynamic_strides */
};

static struct tune_params neoversev3_tunings =
{
  &generic_armv9_a_addrcost_table,
  10, /* issue_rate  */
  AARCH64_FUSE_NEOVERSE_BASE, /* fusible_ops  */
  "32:16",      /* function_align.  */
  &generic_armv9a_prefetch_tune,
  AARCH64_LDP_STP_POLICY_ALWAYS,   /* ldp_policy_model.  */
};

We can represent them in JSON as:

{
  "tune_params": {
    "addr_cost": {
      "addr_scale_costs": { "hi": 1, "si": 0, "di": 0, "ti": 1 },
      "pre_modify": 0,
      "post_modify": 0,
      "post_modify_ld3_st3": 2,
      "post_modify_ld4_st4": 2
    },
    "issue_rate": 10,
    "fusible_ops": 1584,
    "function_align": "32:16",
    "prefetch": {
      "num_slots": 0,
      "l1_cache_size": -1,
      "l1_cache_line_size": 64,
      "l2_cache_size": -1,
      "prefetch_dynamic_strides": true
    },
    "ldp_policy_model": "AARCH64_LDP_STP_POLICY_ALWAYS"
  }
}

---

[2] Methodology 

Before the internal tuning parameters are overridden with user-provided ones, we
must ensure the validity of the provided data.

This is done using a "base" JSON schema, which contains information about the 
tune_params data structure used by the aarch64 backend.

Example:

{
  "tune_params": {
    "addr_cost": {
      "addr_scale_costs": {
        "hi": "int",
        "si": "int",
        "di": "int",
        "ti": "int"
      },
      "pre_modify": "int",
      "post_modify": "int",
      "post_modify_ld3_st3": "int",
      "post_modify_ld4_st4": "int"
    },
    "issue_rate": "int",
    "fusible_ops": "uint",
    "function_align": "string",
    "prefetch": {
      "num_slots": "int",
      "l1_cache_size": "int",
      "l1_cache_line_size": "int",
      "l2_cache_size": "int",
      "prefetch_dynamic_strides": "boolean"
    },
    "ldp_policy_model": "string"
  }
}

Using this schema, we can:
        * Verify that the correct datatypes have been used.
        * Verify that each user-provided "key" (tuning parameter) exists.
        * Allow the user to specify only the required fields (in nested
        fashion), eliminating the need to list every single parameter if they
        only wish to experiment with some.
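
The three checks above can be sketched as follows. This is only an
illustration, not the code in the patch: it assumes the nested schema and the
user input have already been flattened into dotted paths, and the names Leaf,
Schema, UserData and validate are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins: a parsed JSON leaf value and flattened lookups.
using Leaf = std::variant<long long, bool, std::string>;
using Schema = std::map<std::string, std::string>; // path -> type name
using UserData = std::map<std::string, Leaf>;      // path -> user value

// Check one value against a schema type name ("int", "uint", ...).
static bool
leaf_matches (const Leaf &v, const std::string &type)
{
  if (type == "int")
    return std::holds_alternative<long long> (v);
  if (type == "uint")
    return std::holds_alternative<long long> (v)
	   && std::get<long long> (v) >= 0;
  if (type == "boolean")
    return std::holds_alternative<bool> (v);
  if (type == "string")
    return std::holds_alternative<std::string> (v);
  return false;
}

// Return one message per problem; an empty result means the input is valid.
// Keys missing from USER are not an error, so partial files are accepted.
std::vector<std::string>
validate (const Schema &schema, const UserData &user)
{
  std::vector<std::string> errors;
  for (const auto &[path, value] : user)
    {
      auto it = schema.find (path);
      if (it == schema.end ())
	errors.push_back ("unknown key: " + path);
      else if (!leaf_matches (value, it->second))
	errors.push_back ("expected " + it->second + " for " + path);
    }
  return errors;
}
```

Iterating over the user data rather than over the schema is what makes
partial files valid: a key that the user leaves out is never looked at.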
        
The schema is currently stored as a raw JSON string in
config/aarch64/aarch64-json-schema.h.

1: Parsing User Input and Overriding aarch64_tune_params

Once validated, the data can be extracted and stored into aarch64_tune_params,
overriding the default tunings. 

Thus, if -muser-provided-CPU=<json_file> is specified, we can call the following
function in aarch64.cc to override the default tuning parameters:

void
aarch64_load_tuning_params_from_json (const char *data_filename,
                                      struct tune_params *tune);
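
As a rough sketch of the override step (not the parser from the patch), the
idea is to overwrite only the fields the user named and leave the rest at
their defaults. The struct prefetch_ints is a simplified, hypothetical
stand-in for cpu_prefetch_tune, and the input is assumed to be already
validated and reduced to a key -> int map. Assigning in place like this needs
non-const members, which is what the first patch in the series provides.

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified stand-in for cpu_prefetch_tune; only int fields are modelled.
struct prefetch_ints
{
  int num_slots;
  int l1_cache_size;
  int l1_cache_line_size;
  int l2_cache_size;
};

// Hypothetical pre-validated user input: key -> integer value.
using IntOverrides = std::map<std::string, int>;

// Overwrite only the fields named in OVERRIDES; others keep their defaults.
void
apply_overrides (prefetch_ints &tune, const IntOverrides &overrides)
{
  // Table mapping each JSON key to the struct member it controls.
  const std::map<std::string, int prefetch_ints::*> fields = {
    {"num_slots", &prefetch_ints::num_slots},
    {"l1_cache_size", &prefetch_ints::l1_cache_size},
    {"l1_cache_line_size", &prefetch_ints::l1_cache_line_size},
    {"l2_cache_size", &prefetch_ints::l2_cache_size},
  };
  for (const auto &[key, value] : overrides)
    {
      auto it = fields.find (key);
      if (it != fields.end ())
	tune.*(it->second) = value;
    }
}
```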

2: Dumping Back the Tuning Data (in JSON)

If needed, the user can choose to print back the tuning data used during
runtime. This is helpful for debugging and getting access to a "starter" tuning
file template, which can be then modified and re-fed to the compiler.

Thus, if
-muser-provided-CPU=<json_file> is specified, we can call the following function
in aarch64.cc, after the final tuning structure has been populated:

void
aarch64_print_tune_params (const tune_params &params, const char *filename);
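
For illustration only, a printer for one nested structure could look like the
sketch below. It builds the text by hand with std::ostringstream instead of
GCC's JSON classes, and prefetch_sketch and prefetch_to_json are hypothetical
names, not part of the patch.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Simplified stand-in for cpu_prefetch_tune, mirroring the example above.
struct prefetch_sketch
{
  int num_slots;
  int l1_cache_size;
  int l1_cache_line_size;
  int l2_cache_size;
  bool prefetch_dynamic_strides;
};

// Serialize the structure as a single-line JSON object.
std::string
prefetch_to_json (const prefetch_sketch &p)
{
  std::ostringstream os;
  os << "{ \"num_slots\": " << p.num_slots
     << ", \"l1_cache_size\": " << p.l1_cache_size
     << ", \"l1_cache_line_size\": " << p.l1_cache_line_size
     << ", \"l2_cache_size\": " << p.l2_cache_size
     << ", \"prefetch_dynamic_strides\": "
     << (p.prefetch_dynamic_strides ? "true" : "false")
     << " }";
  return os.str ();
}
```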

---

[3] Testing 

To test out the functionality for this change, we have to ensure the following
things are happening correctly:

1. The JSON tunings printer is able to print back the correct values,
especially when it comes to trickier datatypes like enums.
2. The error handling works as expected, especially in the case of incorrect
JSON syntax, incorrect datatypes, and incorrect tuning data structure.
3. During GCC invocation, the values from JSON are correctly loaded into
aarch64_tune_params.

To test these, we make use of a combination of regression tests (in
gcc.target/aarch64/aarch64-json-tunings/) as well as self-tests to check the
contents of aarch64_tune_params during the GCC build.

---

[4] Limitations:

Lack of comments in JSON:
        * JSON does not have the ability to store comments, which leads to the 
        loss of useful information that is provided in the form of comments in 
        the header files. A workaround is to have a dummy "comment" key and
        ignore it when parsing. (e.g., "comment": "parameter description")

No enum support in JSON:
        * The current workaround for this is to use strings instead of enums,
        but we lose the ability to pass enum values directly, as well as to
        perform bitwise operations on them, something used quite frequently
        for some parameters.
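
A minimal sketch of the string workaround is a lookup table that both the
parser and the printer can share. The enum below is a local stand-in rather
than the backend's own type. Only AARCH64_LDP_STP_POLICY_ALWAYS appears in the
example earlier in this mail; the other spellings are assumptions.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Local stand-in for the backend's LDP/STP policy enum.
enum policy_sketch
{
  SKETCH_POLICY_DEFAULT,
  SKETCH_POLICY_ALIGNED,
  SKETCH_POLICY_NEVER,
  SKETCH_POLICY_ALWAYS
};

struct policy_name
{
  const char *name;
  policy_sketch value;
};

// One table used in both directions.  Names other than ..._ALWAYS are
// assumed for the purpose of this sketch.
static const policy_name policy_names[] = {
  {"AARCH64_LDP_STP_POLICY_DEFAULT", SKETCH_POLICY_DEFAULT},
  {"AARCH64_LDP_STP_POLICY_ALIGNED", SKETCH_POLICY_ALIGNED},
  {"AARCH64_LDP_STP_POLICY_NEVER", SKETCH_POLICY_NEVER},
  {"AARCH64_LDP_STP_POLICY_ALWAYS", SKETCH_POLICY_ALWAYS},
};

// Parser side: string -> enum, or nullopt for an unrecognised name.
std::optional<policy_sketch>
parse_policy (const std::string &s)
{
  for (const policy_name &e : policy_names)
    if (s == e.name)
      return e.value;
  return std::nullopt;
}

// Printer side: enum -> string.
const char *
policy_to_string (policy_sketch v)
{
  for (const policy_name &e : policy_names)
    if (e.value == v)
      return e.name;
  return "unknown";
}
```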

No type distinction in JSON:
        * JSON uses the "number" type which allows signed and unsigned integers
        as well as floats but provides no distinction between them.
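
One way to recover the distinction is to let the schema type ("int" or "uint")
drive a range check on each number. The sketch below assumes a parser that
hands every JSON number back as a double, which may not match what GCC's JSON
classes do. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <optional>

// Accept V only if it is a whole number that fits in a signed int.
std::optional<int>
number_to_int (double v)
{
  if (!std::isfinite (v) || std::trunc (v) != v)
    return std::nullopt;
  if (v < std::numeric_limits<int>::min ()
      || v > std::numeric_limits<int>::max ())
    return std::nullopt;
  return static_cast<int> (v);
}

// Accept V only if it is a whole, non-negative number that fits in unsigned.
std::optional<unsigned>
number_to_uint (double v)
{
  if (!std::isfinite (v) || std::trunc (v) != v)
    return std::nullopt;
  if (v < 0 || v > std::numeric_limits<unsigned>::max ())
    return std::nullopt;
  return static_cast<unsigned> (v);
}
```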

Storing the JSON schema:
        * The JSON schema is currently stored as a raw JSON string in
        aarch64-json-schema.h. This is helpful in exposing the file to the
        testing framework, but is not the cleanest solution.
        
        * Theoretically, the schema could be stored in the installation
        directory, but this interferes with the idea of having self-tests for
        the JSON parser.
        
Maintaining the printer/parser routines and JSON schema:
        * Any change in the aarch64 tuning format will result in the need for 
        manual changes to be made to the routines for the JSON tunings printer,
        parser, and schema.     
        
---

[5] Follow-Up Ideas:

JSON to C++ File Conversion:
        * Once the user has a JSON file with tuning values they are satisfied
        with, they have to manually translate the file back to C++ header files
        using the correct structure formats. This can be automated using a
        script that reads the JSON data and generates the appropriate header
        file.
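
As a sketch of what such a generator might emit, the function below turns a
list of already-extracted (field name, value) pairs into an initializer in
the same layout as the tuning headers quoted earlier. It is written in C++
here to stay in one language; the names Field and emit_initializer are
hypothetical, and a real script would also have to handle nested structures
and enum values.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One struct member: field name and its value, already rendered as text.
using Field = std::pair<std::string, std::string>;

// Emit "static TYPE NAME = { value, /* field  */ ... };" as header text.
std::string
emit_initializer (const std::string &type, const std::string &name,
		  const std::vector<Field> &fields)
{
  std::ostringstream os;
  os << "static " << type << " " << name << " =\n{\n";
  for (const auto &[field, value] : fields)
    os << "  " << value << ", /* " << field << "  */\n";
  os << "};\n";
  return os.str ();
}
```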

Soumya AR (5):
  aarch64 + arm: Remove const keyword from tune_params members and
    nested members
  aarch64: Enable dumping of AArch64 CPU tuning parameters to JSON
  json: Add get_map() method to JSON object class
  aarch64: Enable parsing of user-provided AArch64 CPU tuning parameters
  aarch64: Regression tests for parsing of user-provided AArch64 CPU
    tuning parameters

 gcc/config.gcc                                |   2 +-
 gcc/config/aarch64/aarch64-cost-tables.h      |  18 +-
 gcc/config/aarch64/aarch64-json-schema.h      | 261 ++++++
 .../aarch64/aarch64-json-tunings-parser.cc    | 837 ++++++++++++++++++
 .../aarch64/aarch64-json-tunings-parser.h     |  29 +
 .../aarch64/aarch64-json-tunings-printer.cc   | 517 +++++++++++
 .../aarch64/aarch64-json-tunings-printer.h    |  28 +
 gcc/config/aarch64/aarch64-protos.h           | 182 ++--
 gcc/config/aarch64/aarch64.cc                 |  45 +-
 gcc/config/aarch64/aarch64.opt                |   8 +
 gcc/config/aarch64/t-aarch64                  |  19 +
 gcc/config/aarch64/tuning_models/a64fx.h      |  14 +-
 gcc/config/aarch64/tuning_models/ampere1.h    |   8 +-
 gcc/config/aarch64/tuning_models/ampere1a.h   |   2 +-
 gcc/config/aarch64/tuning_models/ampere1b.h   |   8 +-
 gcc/config/aarch64/tuning_models/cortexa35.h  |   2 +-
 gcc/config/aarch64/tuning_models/cortexa53.h  |   4 +-
 gcc/config/aarch64/tuning_models/cortexa57.h  |   8 +-
 gcc/config/aarch64/tuning_models/cortexa72.h  |   2 +-
 gcc/config/aarch64/tuning_models/cortexa73.h  |   2 +-
 gcc/config/aarch64/tuning_models/cortexx925.h |  18 +-
 gcc/config/aarch64/tuning_models/emag.h       |   2 +-
 gcc/config/aarch64/tuning_models/exynosm1.h   |  14 +-
 .../aarch64/tuning_models/fujitsu_monaka.h    |   2 +-
 gcc/config/aarch64/tuning_models/generic.h    |  18 +-
 .../aarch64/tuning_models/generic_armv8_a.h   |  18 +-
 .../aarch64/tuning_models/generic_armv9_a.h   |  22 +-
 .../aarch64/tuning_models/neoverse512tvb.h    |  10 +-
 gcc/config/aarch64/tuning_models/neoversen1.h |   2 +-
 gcc/config/aarch64/tuning_models/neoversen2.h |  18 +-
 gcc/config/aarch64/tuning_models/neoversen3.h |  18 +-
 gcc/config/aarch64/tuning_models/neoversev1.h |  20 +-
 gcc/config/aarch64/tuning_models/neoversev2.h |  18 +-
 gcc/config/aarch64/tuning_models/neoversev3.h |  18 +-
 .../aarch64/tuning_models/neoversev3ae.h      |  18 +-
 gcc/config/aarch64/tuning_models/qdf24xx.h    |  12 +-
 gcc/config/aarch64/tuning_models/saphira.h    |   2 +-
 gcc/config/aarch64/tuning_models/thunderx.h   |  10 +-
 .../aarch64/tuning_models/thunderx2t99.h      |  12 +-
 .../aarch64/tuning_models/thunderx3t110.h     |  12 +-
 .../aarch64/tuning_models/thunderxt88.h       |   4 +-
 gcc/config/aarch64/tuning_models/tsv110.h     |  12 +-
 gcc/config/aarch64/tuning_models/xgene1.h     |  14 +-
 gcc/config/arm/aarch-common-protos.h          | 128 +--
 gcc/config/arm/aarch-cost-tables.h            |  12 +-
 gcc/config/arm/arm-protos.h                   |   2 +-
 gcc/config/arm/arm.cc                         |  20 +-
 gcc/json.h                                    |  21 +-
 gcc/selftest-run-tests.cc                     |   1 +
 gcc/selftest.h                                |   1 +
 .../aarch64-json-tunings.exp                  |  35 +
 .../aarch64/aarch64-json-tunings/boolean-1.c  |   6 +
 .../aarch64-json-tunings/boolean-1.json       |   9 +
 .../aarch64/aarch64-json-tunings/boolean-2.c  |   7 +
 .../aarch64-json-tunings/boolean-2.json       |   9 +
 .../aarch64-json-tunings/empty-brackets.c     |   6 +
 .../aarch64-json-tunings/empty-brackets.json  |   1 +
 .../aarch64/aarch64-json-tunings/empty.c      |   6 +
 .../aarch64/aarch64-json-tunings/empty.json   |   0
 .../aarch64/aarch64-json-tunings/enum-1.c     |   8 +
 .../aarch64/aarch64-json-tunings/enum-1.json  |   7 +
 .../aarch64/aarch64-json-tunings/enum-2.c     |   7 +
 .../aarch64/aarch64-json-tunings/enum-2.json  |   7 +
 .../aarch64/aarch64-json-tunings/integer-1.c  |   7 +
 .../aarch64-json-tunings/integer-1.json       |   6 +
 .../aarch64/aarch64-json-tunings/integer-2.c  |   7 +
 .../aarch64-json-tunings/integer-2.json       |   6 +
 .../aarch64/aarch64-json-tunings/integer-3.c  |   7 +
 .../aarch64-json-tunings/integer-3.json       |   5 +
 .../aarch64/aarch64-json-tunings/integer-4.c  |   6 +
 .../aarch64-json-tunings/integer-4.json       |   5 +
 .../aarch64/aarch64-json-tunings/string-1.c   |   8 +
 .../aarch64-json-tunings/string-1.json        |   7 +
 .../aarch64/aarch64-json-tunings/string-2.c   |   7 +
 .../aarch64-json-tunings/string-2.json        |   5 +
 .../aarch64-json-tunings/unidentified-key.c   |   6 +
 .../unidentified-key.json                     |   5 +
 77 files changed, 2289 insertions(+), 381 deletions(-)
 create mode 100644 gcc/config/aarch64/aarch64-json-schema.h
 create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.cc
 create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.h
 create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.cc
 create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.h
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/aarch64-json-tunings.exp
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-1.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-2.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty-brackets.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty-brackets.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-1.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-2.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-1.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-2.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-3.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-4.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-1.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-2.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/unidentified-key.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/unidentified-key.json

-- 
2.44.0
