From: Soumya AR <soum...@nvidia.com>

Hi,

This RFC and the subsequent patch series introduce support for printing
and parsing aarch64 tuning parameters in the form of JSON.

It is important to note that this mechanism is specifically intended for
power users to experiment with tuning parameters. This proposal does not
suggest the use of JSON tuning files in production. Additionally, the JSON
format should not be considered stable and may change as GCC evolves.

[1] Introduction

Currently, the aarch64 backend in GCC (15) stores the tuning parameters of
all CPUs under gcc/config/aarch64/tuning_models/. Since these parameters
are hardcoded for each CPU, this RFC proposes a technique to support the
adjustment of these parameters at runtime. This allows easier
experimentation with more aggressive parameters to find optimal numbers.

The tuning data is fed to the compiler in JSON format, which was primarily
chosen for the following reasons:

* JSON can represent hierarchical data. This is useful for incorporating
  the nested nature of the tuning structures.
* JSON supports integers, strings, booleans, and arrays.
* GCC already has support for parsing and printing JSON, removing the need
  for writing APIs to read and write the JSON files.

Thus, if we take the following example of some tuning parameters:

  static struct cpu_addrcost_table generic_armv9_a_addrcost_table =
  {
    {
      1, /* hi */
      0, /* si */
      0, /* di */
      1, /* ti */
    },
    0, /* pre_modify */
    0, /* post_modify */
    2, /* post_modify_ld3_st3 */
    2, /* post_modify_ld4_st4 */
  };

  static cpu_prefetch_tune generic_armv9a_prefetch_tune =
  {
    0, /* num_slots */
    -1, /* l1_cache_size */
    64, /* l1_cache_line_size */
    -1, /* l2_cache_size */
    true, /* prefetch_dynamic_strides */
  };

  static struct tune_params neoversev3_tunings =
  {
    &generic_armv9_a_addrcost_table,
    10, /* issue_rate */
    AARCH64_FUSE_NEOVERSE_BASE, /* fusible_ops */
    "32:16", /* function_align. */
    &generic_armv9a_prefetch_tune,
    AARCH64_LDP_STP_POLICY_ALWAYS, /* ldp_policy_model. */
  };

We can represent them in JSON as:

  {
    "tune_params": {
      "addr_cost": {
        "addr_scale_costs": { "hi": 1, "si": 0, "di": 0, "ti": 1 },
        "pre_modify": 0,
        "post_modify": 0,
        "post_modify_ld3_st3": 2,
        "post_modify_ld4_st4": 2
      },
      "issue_rate": 10,
      "fusible_ops": 1584,
      "function_align": "32:16",
      "prefetch": {
        "num_slots": 0,
        "l1_cache_size": -1,
        "l1_cache_line_size": 64,
        "l2_cache_size": -1,
        "prefetch_dynamic_strides": true
      },
      "ldp_policy_model": "AARCH64_LDP_STP_POLICY_ALWAYS"
    }
  }

---

[2] Methodology

Before the internal tuning parameters are overridden with user-provided
ones, we must ensure the validity of the provided data. This is done using
a "base" JSON schema, which contains information about the tune_params
data structure used by the aarch64 backend.

Example:

  {
    "tune_params": {
      "addr_cost": {
        "addr_scale_costs": {
          "hi": "int",
          "si": "int",
          "di": "int",
          "ti": "int"
        },
        "pre_modify": "int",
        "post_modify": "int",
        "post_modify_ld3_st3": "int",
        "post_modify_ld4_st4": "int"
      },
      "issue_rate": "int",
      "fusible_ops": "uint",
      "function_align": "string",
      "prefetch": {
        "num_slots": "int",
        "l1_cache_size": "int",
        "l1_cache_line_size": "int",
        "l2_cache_size": "int",
        "prefetch_dynamic_strides": "boolean"
      },
      "ldp_policy_model": "string"
    }
  }

Using this schema, we can:

* Verify that the correct datatypes have been used.
* Verify that each user-provided "key" (tuning parameter) exists.
* Allow the user to specify only the required fields (in nested fashion),
  eliminating the need to list every single parameter if they only wish to
  experiment with some.

The schema is currently stored as a raw JSON string in
config/aarch64/aarch64-json-schema.h.

1: Parsing User Input and Overriding aarch64_tune_params

Once validated, the data can be extracted and stored into
aarch64_tune_params, overriding the default tunings.
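
The schema checks described above (known keys, matching datatypes, and
optional fields) can be illustrated with a minimal sketch. This is not the
code from the patch series. It assumes, purely for brevity, that the nested
schema has been flattened to dotted key paths, and the names TuningValue,
FlatSchema, value_matches_type and validate_tunings are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <variant>
#include <vector>

// A scalar tuning value as it might appear in a user's JSON file.
using TuningValue = std::variant<long long, bool, std::string>;

// Hypothetical flattened schema: dotted key path -> expected type name.
// The type names ("int", "uint", "boolean", "string") follow the RFC's schema.
using FlatSchema = std::map<std::string, std::string>;

// Return true when VALUE is acceptable for the schema type TYPE.
static bool value_matches_type(const TuningValue &value,
                               const std::string &type) {
  if (type == "int")
    return std::holds_alternative<long long>(value);
  if (type == "uint")
    return std::holds_alternative<long long>(value)
           && std::get<long long>(value) >= 0;
  if (type == "boolean")
    return std::holds_alternative<bool>(value);
  if (type == "string")
    return std::holds_alternative<std::string>(value);
  return false;  // Unknown schema type: reject rather than guess.
}

// Check every user-supplied key against the schema and collect error
// messages.  Keys that the user omits are never visited, so a partial
// tuning file is accepted and the remaining fields keep their defaults.
std::vector<std::string>
validate_tunings(const FlatSchema &schema,
                 const std::map<std::string, TuningValue> &user) {
  std::vector<std::string> errors;
  for (const auto &[key, value] : user) {
    auto it = schema.find(key);
    if (it == schema.end())
      errors.push_back("unknown key: " + key);
    else if (!value_matches_type(value, it->second))
      errors.push_back("expected " + it->second + " for key: " + key);
  }
  return errors;
}
```

An empty result means the user data passed all checks. A negative value for
a "uint" field such as fusible_ops, or a key that the schema does not list,
each produce one error entry.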

If -muser-provided-CPU=<json_file> is specified, we can call the following
function in aarch64.cc to override the default tuning parameters:

  void aarch64_load_tuning_params_from_json (const char *data_filename,
                                             struct tune_params *tune);

2: Dumping Back the Tuning Data (in JSON)

If needed, the user can choose to print back the tuning data used during
runtime. This is helpful for debugging and for obtaining a "starter"
tuning file template, which can then be modified and fed back to the
compiler.

Thus, if -muser-provided-CPU=<json_file> is specified, we can call the
following function in aarch64.cc, after the final tuning structure has
been populated:

  void aarch64_print_tune_params (const tune_params &params,
                                  const char *filename);

---

[3] Testing

To test the functionality of this change, we have to ensure that the
following things happen correctly:

1. The JSON tunings printer is able to print back the correct values,
   especially when it comes to trickier datatypes like enums.
2. The error handling works as expected, especially in the case of
   incorrect JSON syntax, incorrect datatypes, and incorrect tuning data
   structure.
3. During GCC invocation, the values from JSON are correctly loaded into
   aarch64_tune_params.

To test these, we make use of a combination of regression tests (in
gcc.target/aarch64/aarch64-json-tunings/) as well as self-tests to check
the contents of aarch64_tune_params during the GCC build.

---

[4] Limitations:

Lack of comments in JSON:
* JSON does not have the ability to store comments, which leads to the
  loss of useful information that is provided in the form of comments in
  the header files. A workaround is to have a dummy "comment" key and
  ignore it when parsing (e.g., "comment": "parameter description").

No enum support in JSON:
* The current workaround for this is to use strings instead of enums, but
  we lose the ability to pass enums as values, as well as the ability to
  do bitwise operations on the enums, something used quite frequently for
  some parameters.

No type distinction in JSON:
* JSON uses the "number" type, which allows signed and unsigned integers
  as well as floats but provides no distinction between them.

Storing the JSON schema:
* The JSON schema is currently stored as a raw JSON string in
  aarch64-json-schema.h. This is helpful in exposing the file to the
  testing framework, but is not the cleanest solution.
* Theoretically, the schema could be stored in the installation directory,
  but this interferes with the idea of having self-tests for the JSON
  parser.

Maintaining the printer/parser routines and JSON schema:
* Any change in the aarch64 tuning format will require manual changes to
  the routines for the JSON tunings printer, parser, and schema.

---

[5] Follow-Up Ideas:

JSON to C++ File Conversion:
* Once the user has a JSON file with tuning values they are satisfied
  with, they have to manually translate the file back to C++ header files
  using the correct structure formats. This can be automated using a
  script that reads the JSON data and generates the appropriate header
  file.
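
As a rough illustration of the conversion step in the follow-up idea, the
sketch below renders already-parsed values as a header-style initializer.
It is an assumption-laden example, not part of the patch series: it covers
only the five cpu_prefetch_tune fields shown in the example in [1], skips
JSON parsing entirely, and the names PrefetchValues and emit_prefetch_tune
are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Values for one prefetch entry, as they might be read from the "prefetch"
// object of a tuning JSON file.  Field names follow the RFC's example.
struct PrefetchValues {
  int num_slots;
  int l1_cache_size;
  int l1_cache_line_size;
  int l2_cache_size;
  bool prefetch_dynamic_strides;
};

// Render V as a C++ initializer in the style of the tuning_models headers.
// VAR_NAME is the name given to the generated variable.
std::string emit_prefetch_tune(const std::string &var_name,
                               const PrefetchValues &v) {
  std::ostringstream out;
  out << "static cpu_prefetch_tune " << var_name << " =\n"
      << "{\n"
      << "  " << v.num_slots << ", /* num_slots */\n"
      << "  " << v.l1_cache_size << ", /* l1_cache_size */\n"
      << "  " << v.l1_cache_line_size << ", /* l1_cache_line_size */\n"
      << "  " << v.l2_cache_size << ", /* l2_cache_size */\n"
      << "  " << (v.prefetch_dynamic_strides ? "true" : "false")
      << ", /* prefetch_dynamic_strides */\n"
      << "};\n";
  return out.str();
}
```

Calling emit_prefetch_tune ("generic_armv9a_prefetch_tune", values) with
the values from the JSON example in [1] returns text that mirrors the
corresponding initializer shown there. A real converter would also have to
map string-encoded enums back to their identifiers.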

Soumya AR (5):
  aarch64 + arm: Remove const keyword from tune_params members and
    nested members
  aarch64: Enable dumping of AArch64 CPU tuning parameters to JSON
  json: Add get_map() method to JSON object class
  aarch64: Enable parsing of user-provided AArch64 CPU tuning parameters
  aarch64: Regression tests for parsing of user-provided AArch64 CPU
    tuning parameters

 gcc/config.gcc | 2 +-
 gcc/config/aarch64/aarch64-cost-tables.h | 18 +-
 gcc/config/aarch64/aarch64-json-schema.h | 261 ++++++
 .../aarch64/aarch64-json-tunings-parser.cc | 837 ++++++++++++++++++
 .../aarch64/aarch64-json-tunings-parser.h | 29 +
 .../aarch64/aarch64-json-tunings-printer.cc | 517 +++++++++++
 .../aarch64/aarch64-json-tunings-printer.h | 28 +
 gcc/config/aarch64/aarch64-protos.h | 182 ++--
 gcc/config/aarch64/aarch64.cc | 45 +-
 gcc/config/aarch64/aarch64.opt | 8 +
 gcc/config/aarch64/t-aarch64 | 19 +
 gcc/config/aarch64/tuning_models/a64fx.h | 14 +-
 gcc/config/aarch64/tuning_models/ampere1.h | 8 +-
 gcc/config/aarch64/tuning_models/ampere1a.h | 2 +-
 gcc/config/aarch64/tuning_models/ampere1b.h | 8 +-
 gcc/config/aarch64/tuning_models/cortexa35.h | 2 +-
 gcc/config/aarch64/tuning_models/cortexa53.h | 4 +-
 gcc/config/aarch64/tuning_models/cortexa57.h | 8 +-
 gcc/config/aarch64/tuning_models/cortexa72.h | 2 +-
 gcc/config/aarch64/tuning_models/cortexa73.h | 2 +-
 gcc/config/aarch64/tuning_models/cortexx925.h | 18 +-
 gcc/config/aarch64/tuning_models/emag.h | 2 +-
 gcc/config/aarch64/tuning_models/exynosm1.h | 14 +-
 .../aarch64/tuning_models/fujitsu_monaka.h | 2 +-
 gcc/config/aarch64/tuning_models/generic.h | 18 +-
 .../aarch64/tuning_models/generic_armv8_a.h | 18 +-
 .../aarch64/tuning_models/generic_armv9_a.h | 22 +-
 .../aarch64/tuning_models/neoverse512tvb.h | 10 +-
 gcc/config/aarch64/tuning_models/neoversen1.h | 2 +-
 gcc/config/aarch64/tuning_models/neoversen2.h | 18 +-
 gcc/config/aarch64/tuning_models/neoversen3.h | 18 +-
 gcc/config/aarch64/tuning_models/neoversev1.h | 20 +-
 gcc/config/aarch64/tuning_models/neoversev2.h | 18 +-
 gcc/config/aarch64/tuning_models/neoversev3.h | 18 +-
 .../aarch64/tuning_models/neoversev3ae.h | 18 +-
 gcc/config/aarch64/tuning_models/qdf24xx.h | 12 +-
 gcc/config/aarch64/tuning_models/saphira.h | 2 +-
 gcc/config/aarch64/tuning_models/thunderx.h | 10 +-
 .../aarch64/tuning_models/thunderx2t99.h | 12 +-
 .../aarch64/tuning_models/thunderx3t110.h | 12 +-
 .../aarch64/tuning_models/thunderxt88.h | 4 +-
 gcc/config/aarch64/tuning_models/tsv110.h | 12 +-
 gcc/config/aarch64/tuning_models/xgene1.h | 14 +-
 gcc/config/arm/aarch-common-protos.h | 128 +--
 gcc/config/arm/aarch-cost-tables.h | 12 +-
 gcc/config/arm/arm-protos.h | 2 +-
 gcc/config/arm/arm.cc | 20 +-
 gcc/json.h | 21 +-
 gcc/selftest-run-tests.cc | 1 +
 gcc/selftest.h | 1 +
 .../aarch64-json-tunings.exp | 35 +
 .../aarch64/aarch64-json-tunings/boolean-1.c | 6 +
 .../aarch64-json-tunings/boolean-1.json | 9 +
 .../aarch64/aarch64-json-tunings/boolean-2.c | 7 +
 .../aarch64-json-tunings/boolean-2.json | 9 +
 .../aarch64-json-tunings/empty-brackets.c | 6 +
 .../aarch64-json-tunings/empty-brackets.json | 1 +
 .../aarch64/aarch64-json-tunings/empty.c | 6 +
 .../aarch64/aarch64-json-tunings/empty.json | 0
 .../aarch64/aarch64-json-tunings/enum-1.c | 8 +
 .../aarch64/aarch64-json-tunings/enum-1.json | 7 +
 .../aarch64/aarch64-json-tunings/enum-2.c | 7 +
 .../aarch64/aarch64-json-tunings/enum-2.json | 7 +
 .../aarch64/aarch64-json-tunings/integer-1.c | 7 +
 .../aarch64-json-tunings/integer-1.json | 6 +
 .../aarch64/aarch64-json-tunings/integer-2.c | 7 +
 .../aarch64-json-tunings/integer-2.json | 6 +
 .../aarch64/aarch64-json-tunings/integer-3.c | 7 +
 .../aarch64-json-tunings/integer-3.json | 5 +
 .../aarch64/aarch64-json-tunings/integer-4.c | 6 +
 .../aarch64-json-tunings/integer-4.json | 5 +
 .../aarch64/aarch64-json-tunings/string-1.c | 8 +
 .../aarch64-json-tunings/string-1.json | 7 +
 .../aarch64/aarch64-json-tunings/string-2.c | 7 +
 .../aarch64-json-tunings/string-2.json | 5 +
 .../aarch64-json-tunings/unidentified-key.c | 6 +
 .../unidentified-key.json | 5 +
 77 files changed, 2289 insertions(+), 381 deletions(-)
 create mode 100644 gcc/config/aarch64/aarch64-json-schema.h
 create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.cc
 create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.h
 create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.cc
 create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.h
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/aarch64-json-tunings.exp
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-1.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-2.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty-brackets.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty-brackets.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-1.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-2.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-1.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-2.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-3.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-4.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-1.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-2.json
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/unidentified-key.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/unidentified-key.json

--
2.44.0