> -----Original Message----- > From: soum...@nvidia.com <soum...@nvidia.com> > Sent: Monday, August 25, 2025 12:00 PM > To: gcc-patches@gcc.gnu.org > Cc: ktkac...@nvidia.com; rdsandif...@googlemail.com; pins...@gmail.com > Subject: [v3 PATCH 0/6] aarch64: Support for user-defined aarch64 tuning > > From: Soumya AR <soum...@nvidia.com> > > Hi, > > This is a follow-up to suggestions given on v2 of the aarch64 > user-defined tunings. > > https://gcc.gnu.org/pipermail/gcc-patches/2025-July/689893.html > Attaching an updated patch series here. > > Regarding some questions on this thread: > > I've made the changes suggested by Andrew and Richard. Thanks a lot for taking > a look! > > > Maybe you could do: > > ``` > > template<typename T> > > using serialize_func_type = std::unique_ptr<json::object> > > (*serialize_func) (const typename std::remove_pointer<T>::type &); > > > > serialize_object_helper ( > > const T &member, serialize_func_type<T> serialize_func) > > ``` > > Since std::remove_pointer<T>::type will T if T is not a pointer. > > Right, this is much better. I've updated both serialize_object_helper and > parse_object_helper to do this. > > > > + function_map = {} > > > + for path_key, (path, obj_schema) in all_objects_with_paths.items(): > > > + filtered_path = [p for p in path if p != "tune_params"] > > > + if filtered_path: > > > + full_name = "_".join(filtered_path) > > > + function_map[path_key] = f"{operation}_{full_name}" > > > + else: > > > + function_map[path_key] = f"{operation}_{path_key}" > > > > Are the 6 lines above equivalent to: > > > > if len(path) > 1: > > full_name = "_".join(path[1:]) > > function_map[path_key] = f"{operation}_{full_name}" > > else: > > function_map[path_key] = f"{operation}_{path_key}" > > > > ? If so, that might be easier to follow, since it emphasises that > > we're structurally ignoring path[0] for all cases. It looked at first > > like the filtering might remove something other than the first element. > > I actually had this check to avoid the case for when we encounter the > "tune_params" key. On a second look, I realized I was taking care of that case > earlier anyway, so the check isn't needed at all. Now it's just: > > for path_key, (path, obj_schema) in all_objects_with_paths.items(): > if path: > full_name = "_".join(path) > function_map[path_key] = f"{operation}_{full_name}" > else: > function_map[path_key] = f"{operation}_{path_key}" > > > > How is memory management handled (both here and for pointers to > > structures)? Do we just allocate and never free? If so, that's ok for > > command-line options, but I wonder whether we'll ewventually be using > > this on a per-function basis. (Hopefully not though...) > > About this, I addressed it earlier in a separate reply: > https://gcc.gnu.org/pipermail/gcc-patches/2025-July/689902.html > > Pasting the relevant information from that thread. > > Possible solutions: > 1. Simply don't bother freeing up the memory and let the memory leak be :) > 2. Use the GCC garbage collector > 3. Remove the pointers and have all the nested structures be embedded since > I'm not too sure why this is inconsistent in tune_params. > 4. (The current approach) Have each temporary structure be a static local > variable. This works, but operates under the assumption that nested structures > in tune_params will always have unique types, since the variable declaration > is > a template that is resolved by the type of the nested structures. There is an > extra check added to ensure we don't overwrite accidentally, in the case we > ever end up having structure members of the same type. This is still a bit > ugly > though, so I would appreciate if there are other suggestions on how to handle > this. > > > I agree with Andrew that it would be nice in principle to be able > > to autogenerate this. I'm personally happy without that while it > > remains this small, but I think it would at least be worth sharing > > this between the parsing and serialising code, rather than having > > separate lists for each. > > Auto-generating this would not be a problem, but my main agenda with the > script > was to have a setup that was only dependent on the JSON schema and not have > any > actual references to the tuning parameters. Essentially, make the script > as agnostic of the tuning parameters as possible, so anyone modifying the > tuning > paramters does not need to touch the script (and only modify the schema). > > Extending that philosophy to the enums, I now have the following setup: > > I've centralized the tuning enum definitions into a new > aarch64-tuning-enums.def > file. The enums (autoprefetch_model and ldp_stp_policy) are now defined once > using macros and consumed by both the core definitions (aarch64-opts.h, > aarch64-protos.h) and the generated parser/printer code. This ensures that if > someone wants to add a new enum value, they only need to make modifications in > the .def file. The codegen from the script refers to the same enums. > > If this is excessive, I can just have the script directly generate the enum to > string maps. Let me know what you think. > > > Could you explain the reason for Negative(fdump-tuning-model=) and ToLower? > > ToLower is incorrect here, I've removed it. We should accept any filename for > the user provided files. Thanks for pointing this out. > > As I understand it, Negative() will prune all earlier instances of the option, > yes? I suppose in any case, even if I don't specify Negative(), then the > second > fdump-tuning-model= would override the first one? At that point, it seems > better > to not expose the earlier option at all..? > > Similar logic for Negative(muser-provided-CPU=), where we would only like to > parse one file. > > > The option should be documented in invoke.texi, or marked Undocumented if > > we really don't want to put it in the manual. > > Not too sure about this one. I've marked it Undocumented for now but I can see > intances of options that are marked Undocumented but still specified in > invoke.texi. Should we document its usage anyway in invoke.texi? > Also, should we have something similar to -moverride, where we specify that > this > feature is only for power users? > > CCing Richard's personal email here (in case you would still like to give > feedback on this). Thank you for all your help with this feature, as well as > all > of my other work! :) >
Hi Soumya, Sorry I have not been following this review at all. I'll start going over the review History but will take a week or so. Just in case it helps, I'm not against this change at all and infact quite for it, Just trying to map out the current situation. Thanks, Tamar > Best, > Soumya > > Soumya AR (6): > aarch64 + arm: Remove const keyword from tune_params members and > nested members > aarch64: Enable dumping of AArch64 CPU tuning parameters to JSON > json: Add get_map() method to JSON object class > aarch64: Enable parsing of user-provided AArch64 CPU tuning parameters > aarch64: Regression tests for parsing of user-provided AArch64 CPU > tuning parameters > aarch64: Script to auto generate JSON tuning routines > > gcc/config.gcc | 2 +- > .../aarch64-generate-json-tuning-routines.py | 374 +++++++++++++ > gcc/config/aarch64/aarch64-json-schema.h | 261 +++++++++ > .../aarch64-json-tunings-parser-generated.inc | 355 ++++++++++++ > .../aarch64/aarch64-json-tunings-parser.cc | 527 ++++++++++++++++++ > .../aarch64/aarch64-json-tunings-parser.h | 29 + > ...aarch64-json-tunings-printer-generated.inc | 439 +++++++++++++++ > .../aarch64/aarch64-json-tunings-printer.cc | 137 +++++ > .../aarch64/aarch64-json-tunings-printer.h | 28 + > gcc/config/aarch64/aarch64-opts.h | 6 +- > gcc/config/aarch64/aarch64-protos.h | 169 +++--- > gcc/config/aarch64/aarch64-tuning-enums.def | 37 ++ > gcc/config/aarch64/aarch64.cc | 20 + > gcc/config/aarch64/aarch64.opt | 8 + > gcc/config/aarch64/t-aarch64 | 21 + > gcc/config/arm/aarch-common-protos.h | 128 ++--- > gcc/json.h | 26 +- > gcc/selftest-run-tests.cc | 1 + > gcc/selftest.h | 1 + > .../aarch64-json-tunings.exp | 35 ++ > .../aarch64/aarch64-json-tunings/boolean-1.c | 6 + > .../aarch64-json-tunings/boolean-1.json | 9 + > .../aarch64/aarch64-json-tunings/boolean-2.c | 7 + > .../aarch64-json-tunings/boolean-2.json | 9 + > .../aarch64-json-tunings/empty-brackets.c | 6 + > .../aarch64-json-tunings/empty-brackets.json | 1 + > .../aarch64/aarch64-json-tunings/empty.c | 6 + > .../aarch64/aarch64-json-tunings/empty.json | 0 > .../aarch64/aarch64-json-tunings/enum-1.c | 8 + > .../aarch64/aarch64-json-tunings/enum-1.json | 7 + > .../aarch64/aarch64-json-tunings/enum-2.c | 7 + > .../aarch64/aarch64-json-tunings/enum-2.json | 7 + > .../aarch64/aarch64-json-tunings/integer-1.c | 7 + > .../aarch64-json-tunings/integer-1.json | 6 + > .../aarch64/aarch64-json-tunings/integer-2.c | 7 + > .../aarch64-json-tunings/integer-2.json | 6 + > .../aarch64/aarch64-json-tunings/integer-3.c | 7 + > .../aarch64-json-tunings/integer-3.json | 5 + > .../aarch64/aarch64-json-tunings/integer-4.c | 6 + > .../aarch64-json-tunings/integer-4.json | 5 + > .../aarch64/aarch64-json-tunings/string-1.c | 8 + > .../aarch64-json-tunings/string-1.json | 7 + > .../aarch64/aarch64-json-tunings/string-2.c | 7 + > .../aarch64-json-tunings/string-2.json | 5 + > .../aarch64/aarch64-json-tunings/test-all.c | 58 ++ > .../aarch64-json-tunings/test-all.json | 39 ++ > .../aarch64-json-tunings/unidentified-key.c | 6 + > .../unidentified-key.json | 5 + > 48 files changed, 2705 insertions(+), 156 deletions(-) > create mode 100644 gcc/config/aarch64/aarch64-generate-json-tuning- > routines.py > create mode 100644 gcc/config/aarch64/aarch64-json-schema.h > create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser- > generated.inc > create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.cc > create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.h > create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer- > generated.inc > create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.cc > create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.h > create mode 100644 gcc/config/aarch64/aarch64-tuning-enums.def > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/aarch64-json-tunings.exp > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/boolean-1.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/boolean-1.json > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/boolean-2.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/boolean-2.json > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/empty-brackets.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/empty-brackets.json > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/empty.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/empty.json > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/enum-1.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/enum-1.json > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/enum-2.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/enum-2.json > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/integer-1.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/integer-1.json > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/integer-2.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/integer-2.json > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/integer-3.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/integer-3.json > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/integer-4.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/integer-4.json > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/string-1.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/string-1.json > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/string-2.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/string-2.json > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/test-all.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/test-all.json > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/unidentified-key.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json- > tunings/unidentified-key.json > > -- > 2.44.0