> -----Original Message-----
> From: soum...@nvidia.com <soum...@nvidia.com>
> Sent: Monday, August 25, 2025 12:00 PM
> To: gcc-patches@gcc.gnu.org
> Cc: ktkac...@nvidia.com; rdsandif...@googlemail.com; pins...@gmail.com
> Subject: [v3 PATCH 0/6] aarch64: Support for user-defined aarch64 tuning
> 
> From: Soumya AR <soum...@nvidia.com>
> 
> Hi,
> 
> This is a follow-up to suggestions given on v2 of the aarch64
> user-defined tunings.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2025-July/689893.html
> Attaching an updated patch series here.
> 
> Regarding some questions on this thread:
> 
> I've made the changes suggested by Andrew and Richard. Thanks a lot for taking
> a look!
> 
> > Maybe you could do:
> > ```
> > template<typename T>
> > using serialize_func_type = std::unique_ptr<json::object>
> > (*serialize_func) (const typename std::remove_pointer<T>::type &);
> >
> > serialize_object_helper (
> >  const T &member, serialize_func_type<T>  serialize_func)
> > ```
> > Since std::remove_pointer<T>::type will T if T is not a pointer.
> 
> Right, this is much better. I've updated both serialize_object_helper and
> parse_object_helper to do this.
> 
> > > +    function_map = {}
> > > +    for path_key, (path, obj_schema) in all_objects_with_paths.items():
> > > +        filtered_path = [p for p in path if p != "tune_params"]
> > > +        if filtered_path:
> > > +            full_name = "_".join(filtered_path)
> > > +            function_map[path_key] = f"{operation}_{full_name}"
> > > +        else:
> > > +            function_map[path_key] = f"{operation}_{path_key}"
> >
> > Are the 6 lines above equivalent to:
> >
> >    if len(path) > 1:
> >        full_name = "_".join(path[1:])
> >        function_map[path_key] = f"{operation}_{full_name}"
> >    else:
> >        function_map[path_key] = f"{operation}_{path_key}"
> >
> > ?  If so, that might be easier to follow, since it emphasises that
> > we're structurally ignoring path[0] for all cases.  It looked at first
> > like the filtering might remove something other than the first element.
> 
> I actually had this check to avoid the case for when we encounter the
> "tune_params" key. On a second look, I realized I was taking care of that case
> earlier anyway, so the check isn't needed at all. Now it's just:
> 
>     for path_key, (path, obj_schema) in all_objects_with_paths.items():
>         if path:
>             full_name = "_".join(path)
>             function_map[path_key] = f"{operation}_{full_name}"
>         else:
>             function_map[path_key] = f"{operation}_{path_key}"
> 
> 
> > How is memory management handled (both here and for pointers to
> > structures)?  Do we just allocate and never free?  If so, that's ok for
> > command-line options, but I wonder whether we'll ewventually be using
> > this on a per-function basis.  (Hopefully not though...)
> 
> About this, I addressed it earlier in a separate reply:
> https://gcc.gnu.org/pipermail/gcc-patches/2025-July/689902.html
> 
> Pasting the relevant information from that thread.
> 
> Possible solutions:
> 1. Simply don't bother freeing up the memory and let the memory leak be :)
> 2. Use the GCC garbage collector
> 3. Remove the pointers and have all the nested structures be embedded since
> I'm not too sure why this is inconsistent in tune_params.
> 4. (The current approach) Have each temporary structure be a static local
> variable. This works, but operates under the assumption that nested structures
> in tune_params will always have unique types, since the variable declaration 
> is
> a template that is resolved by the type of the nested structures. There is an
> extra check added to ensure we don't overwrite accidentally, in the case we
> ever end up having structure members of the same type. This is still a bit 
> ugly
> though, so I would appreciate if there are other suggestions on how to handle
> this.
> 
> > I agree with Andrew that it would be nice in principle to be able
> > to autogenerate this.  I'm personally happy without that while it
> > remains this small, but I think it would at least be worth sharing
> > this between the parsing and serialising code, rather than having
> > separate lists for each.
> 
> Auto-generating this would not be a problem, but my main agenda with the 
> script
> was to have a setup that was only dependent on the JSON schema and not have
> any
> actual references to the tuning parameters. Essentially, make the script
> as agnostic of the tuning parameters as possible, so anyone modifying the 
> tuning
> paramters does not need to touch the script (and only modify the schema).
> 
> Extending that philosophy to the enums, I now have the following setup:
> 
> I've centralized the tuning enum definitions into a new 
> aarch64-tuning-enums.def
> file. The enums (autoprefetch_model and ldp_stp_policy) are now defined once
> using macros and consumed by both the core definitions (aarch64-opts.h,
> aarch64-protos.h) and the generated parser/printer code. This ensures that if
> someone wants to add a new enum value, they only need to make modifications in
> the .def file. The codegen from the script refers to the same enums.
> 
> If this is excessive, I can just have the script directly generate the enum to
> string maps. Let me know what you think.
> 
> > Could you explain the reason for Negative(fdump-tuning-model=) and ToLower?
> 
> ToLower is incorrect here, I've removed it. We should accept any filename for
> the user provided files. Thanks for pointing this out.
> 
> As I understand it, Negative() will prune all earlier instances of the option,
> yes? I suppose in any case, even if I don't specify Negative(), then the 
> second
> fdump-tuning-model= would override the first one? At that point, it seems 
> better
> to not expose the earlier option at all..?
> 
> Similar logic for Negative(muser-provided-CPU=), where we would only like to
> parse one file.
> 
> > The option should be documented in invoke.texi, or marked Undocumented if
> > we really don't want to put it in the manual.
> 
> Not too sure about this one. I've marked it Undocumented for now but I can see
> intances of options that are marked Undocumented but still specified in
> invoke.texi. Should we document its usage anyway in invoke.texi?
> Also, should we have something similar to -moverride, where we specify that 
> this
> feature is only for power users?
> 
> CCing Richard's personal email here (in case you would still like to give
> feedback on this). Thank you for all your help with this feature, as well as 
> all
> of my other work! :)
> 

Hi Soumya,

Sorry I have not been following this review at all. I'll start going over the 
review
History but will take a week or so.  Just in case it helps, I'm not against 
this change
at all and infact quite for it, Just trying to map out the current situation.

Thanks,
Tamar

> Best,
> Soumya
> 
> Soumya AR (6):
>   aarch64 + arm: Remove const keyword from tune_params members and
>     nested members
>   aarch64: Enable dumping of AArch64 CPU tuning parameters to JSON
>   json: Add get_map() method to JSON object class
>   aarch64: Enable parsing of user-provided AArch64 CPU tuning parameters
>   aarch64: Regression tests for parsing of user-provided AArch64 CPU
>     tuning parameters
>   aarch64: Script to auto generate JSON tuning routines
> 
>  gcc/config.gcc                                |   2 +-
>  .../aarch64-generate-json-tuning-routines.py  | 374 +++++++++++++
>  gcc/config/aarch64/aarch64-json-schema.h      | 261 +++++++++
>  .../aarch64-json-tunings-parser-generated.inc | 355 ++++++++++++
>  .../aarch64/aarch64-json-tunings-parser.cc    | 527 ++++++++++++++++++
>  .../aarch64/aarch64-json-tunings-parser.h     |  29 +
>  ...aarch64-json-tunings-printer-generated.inc | 439 +++++++++++++++
>  .../aarch64/aarch64-json-tunings-printer.cc   | 137 +++++
>  .../aarch64/aarch64-json-tunings-printer.h    |  28 +
>  gcc/config/aarch64/aarch64-opts.h             |   6 +-
>  gcc/config/aarch64/aarch64-protos.h           | 169 +++---
>  gcc/config/aarch64/aarch64-tuning-enums.def   |  37 ++
>  gcc/config/aarch64/aarch64.cc                 |  20 +
>  gcc/config/aarch64/aarch64.opt                |   8 +
>  gcc/config/aarch64/t-aarch64                  |  21 +
>  gcc/config/arm/aarch-common-protos.h          | 128 ++---
>  gcc/json.h                                    |  26 +-
>  gcc/selftest-run-tests.cc                     |   1 +
>  gcc/selftest.h                                |   1 +
>  .../aarch64-json-tunings.exp                  |  35 ++
>  .../aarch64/aarch64-json-tunings/boolean-1.c  |   6 +
>  .../aarch64-json-tunings/boolean-1.json       |   9 +
>  .../aarch64/aarch64-json-tunings/boolean-2.c  |   7 +
>  .../aarch64-json-tunings/boolean-2.json       |   9 +
>  .../aarch64-json-tunings/empty-brackets.c     |   6 +
>  .../aarch64-json-tunings/empty-brackets.json  |   1 +
>  .../aarch64/aarch64-json-tunings/empty.c      |   6 +
>  .../aarch64/aarch64-json-tunings/empty.json   |   0
>  .../aarch64/aarch64-json-tunings/enum-1.c     |   8 +
>  .../aarch64/aarch64-json-tunings/enum-1.json  |   7 +
>  .../aarch64/aarch64-json-tunings/enum-2.c     |   7 +
>  .../aarch64/aarch64-json-tunings/enum-2.json  |   7 +
>  .../aarch64/aarch64-json-tunings/integer-1.c  |   7 +
>  .../aarch64-json-tunings/integer-1.json       |   6 +
>  .../aarch64/aarch64-json-tunings/integer-2.c  |   7 +
>  .../aarch64-json-tunings/integer-2.json       |   6 +
>  .../aarch64/aarch64-json-tunings/integer-3.c  |   7 +
>  .../aarch64-json-tunings/integer-3.json       |   5 +
>  .../aarch64/aarch64-json-tunings/integer-4.c  |   6 +
>  .../aarch64-json-tunings/integer-4.json       |   5 +
>  .../aarch64/aarch64-json-tunings/string-1.c   |   8 +
>  .../aarch64-json-tunings/string-1.json        |   7 +
>  .../aarch64/aarch64-json-tunings/string-2.c   |   7 +
>  .../aarch64-json-tunings/string-2.json        |   5 +
>  .../aarch64/aarch64-json-tunings/test-all.c   |  58 ++
>  .../aarch64-json-tunings/test-all.json        |  39 ++
>  .../aarch64-json-tunings/unidentified-key.c   |   6 +
>  .../unidentified-key.json                     |   5 +
>  48 files changed, 2705 insertions(+), 156 deletions(-)
>  create mode 100644 gcc/config/aarch64/aarch64-generate-json-tuning-
> routines.py
>  create mode 100644 gcc/config/aarch64/aarch64-json-schema.h
>  create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser-
> generated.inc
>  create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.cc
>  create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.h
>  create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer-
> generated.inc
>  create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.cc
>  create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.h
>  create mode 100644 gcc/config/aarch64/aarch64-tuning-enums.def
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/aarch64-json-tunings.exp
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/boolean-1.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/boolean-1.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/boolean-2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/boolean-2.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/empty-brackets.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/empty-brackets.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/empty.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/empty.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/enum-1.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/enum-1.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/enum-2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/enum-2.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-1.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-1.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-2.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-3.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-3.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-4.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/string-1.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/string-1.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/string-2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/string-2.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/test-all.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/test-all.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/unidentified-key.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/unidentified-key.json
> 
> --
> 2.44.0

Reply via email to