Szelethus created this revision. Szelethus added reviewers: NoQ, vsavchenko, martong, balazske, steakhal, xazax.hun. Szelethus added a project: clang. Herald added subscribers: manas, dang, ASDenysPetrov, gamesh411, donat.nagy, mikhail.ramalho, a.sidorin, rnkovacs, szepet, baloghadamsoftware, whisperity, mgorny. Szelethus requested review of this revision. Herald added a subscriber: cfe-commits.
As of now, analyzer configs are specified in the preprocessor file `clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def`, something like this: #define ANALYZER_OPTION(TYPE, NAME, CMDFLAG, DESC, DEFAULT_VAL) ANALYZER_OPTION(bool, ShouldIncludeImplicitDtorsInCFG, "cfg-implicit-dtors", "Whether or not implicit destructors for C++ objects " "should be included in the CFG.", true) ANALYZER_OPTION( unsigned, AlwaysInlineSize, "ipa-always-inline-size", "The size of the functions (in basic blocks), which should be considered " "to be small enough to always inline.", 3) ANALYZER_OPTION( StringRef, RawSilencedCheckersAndPackages, "silence-checkers", "A semicolon separated list of checker and package names to silence. " "Silenced checkers will not emit reports, but the modeling remain enabled.", "") ANALYZER_OPTION( StringRef, CXXMemberInliningMode, "c++-inlining", "Controls which C++ member functions will be considered for inlining. " "Value: \"constructors\", \"destructors\", \"methods\".", "destructors") This patch proposes to move to a TableGen format, seen in the diff below. The file generated from it (`build/tools/clang/include/clang/StaticAnalyzer/Core/Config.inc`) looks like this: // This file is automatically generated. Do not edit this file by hand. #ifdef BOOLEAN_OPTIONS BOOLEAN_OPTION(bool, ShouldIncludeImplicitDtorsInCFG, "cfg-implicit-dtors", "Whether or not implicit destructors for C++ objects should be " "included in the CFG.", true) #endif // BOOLEAN_OPTIONS #ifdef INTEGER_OPTIONS INTEGER_OPTION(int, AlwaysInlineSize, "ipa-always-inline-size", "The size of the functions (in basic blocks), which should be " "considered to be small enough to always inline.", 3) #endif // INTEGER_OPTIONS #ifdef STRING_OPTIONS STRING_OPTION( StringRef, RawSilencedCheckersAndPackages, "silence-checkers", "A semicolon separated list of checker and package names to silence. " "Silenced checkers will not emit reports, but the modeling remain enabled.", "") #endif // STRING_OPTIONS #ifdef ENUM_OPTIONS ENUM_OPTION( CXXMemberInliningModeKind, CXXMemberInliningMode, "c++-inlining", "Controls which C++ member functions will be considered for inlining.", "destructors", "none," "constructors," "destructors," "methods,") #endif // ENUM_OPTIONS #ifdef ENUMS enum class CXXMemberInliningModeKind { CIMK_None = 0, CIMK_Constructors = 1, CIMK_Destructors = 2, CIMK_Memberfunctions = 3, } #endif // ENUMS --- So, we're changing from //writing// a preprocessor file to //generating// it -- it'd be a stretch to say that all the trouble synonymous with the preprocessor are now gone. Its also rather easy and intuitive to add a new config to the existing format, and TableGen isn't universally liked in the analyzer. So, here is my pros and cons for why this is would be a good change: Pros: - Any changes to the current analyzer config format is miserable. - Might affect every macro, or needs manual `#define` shenanigans. Making changes to the TableGen emitter file are rather easy and not as invasive. - We may want to add features like what we're already seen in the checker TableGen file: development status (whether its in alpha, developer, or are okay for the end user to tinker with) and documentation uri. - As arguments aren't strongly typed (there are 3 string arguments one after the other), its easy to make a mistake in them. TableGen doesn't entirely negate this issue -- care still needs to be taken when using the preprocessor file it generates, but leading up to that point is far safer. - As there are a lot of entries in that file already, unless you compile with a single job, compilation errors can become really long and confusing, and thats true even if you do compile on a single job. TableGen again doesn't negate this entirely, but gets rid of them at least to the point of writing the configs. Errors are caught while generating the file, and easy to digest errors are emitted. - There are in fact 2 `.def` files for configuring the analyzer: the other is `clang/include/clang/StaticAnalyzer/Core/Analyses.def`. The latter contains values acceptable values for some of the string configs, for example: #ifndef ANALYSIS_STORE #define ANALYSIS_STORE(NAME, CMDFLAG, DESC, CREATFN) #endif ANALYSIS_STORE(RegionStore, "region", "Use region-based analyzer store", CreateRegionStoreManager) - (cont.) In general, we don't have a good handle on string option accepting only a set of values. In the help text, its redundantly hardcoded in the text. We can't just generate it into the help text like this: #define ANALYSIS_STORE(NAME, CMDFLAG, DESC, CREATFN) #CMDFLAG "," ANALYZER_OPTION(StringRef, AnalyzerStoreStr, "analyzer-store", "The Store implementation of the analyzer. The store is " "a mapping from variables or regions to values. Note: The " "analyzer was designed with its store being a replaceable " "component, but over the years it became increasingly unlikely " "that another kind of it will be developed. Value:" #include "clang/StaticAnalyzer/Core/Analyses.def" , "region") #undef ANALYSIS_STORE - (cont.) because such an `#include` is not portable, not to mention that it is rather ugly. So that redundancy is unavoidable, but thats not an issue if we generate that with TableGen. - TableGen allows for more thorough, and arguably less cheesy ways to validate whether the specified analyzer config is valid. - Unless you use `/*comments=*/`, arguments aren't as obvious. I argue that this is far more readable in the attached tablegen file. Cons: - Making changes to the TableGen emitter file might be okay, but still requires TableGen specific knowledge. However, making changes to the `.td` file itself is a nightmare unless you have some experience with it. In fact, it can be a bit troublesome even if you know a bit of it! Its usually only a few lines of class definition code, but even that can be a bit of a struggle. - TableGen helps a lot on the issues faced by working with the preprocessor, but it doesn't negate them. At the end of the day, the generated file still needs to be included in the same, or similar way as before. - Likely a bit slower to compile. Not that much, but its something. It is my personal opinion that the pros outweigh the cons. The inspiration came from that I wanted to move some options (like `-analyzer_constraints`) from frontend flags one level lower into -analyzer-configs (in a backward compatible manner), and realized I'd need to redundantly hardcore the acceptable values for each of the enum-like configs. --- So, about this patch. This serves as little more than a demonstration of how the TableGen file looks like, and what it could generate. The biggest item to highlight is how we can easily define enum configs: - We can put all the acceptable values in the help text AND generate the enum class WITHOUT any redundancy. - While not demonstrated, it'd be easy to add help texts to the enum values as well. - You can see that in the generated file, all the values are squashed into a single string, and would need a string split -- thats just the way it goes, it seems like, as other tablegen files seem to do that as well. Maybe I could use variadic macro arguments... - Automatic initialization and validation of the user input can be done by marshalling <https://clang.llvm.org/docs/InternalsManual.html#option-marshalling-infrastructure>, but its so strongly tied to regular llvm/clang flags, I'd rather use it as a source of inspiration. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D104439 Files: clang/include/clang/CMakeLists.txt clang/include/clang/StaticAnalyzer/AnalyzerFlagsBase.td clang/include/clang/StaticAnalyzer/CMakeLists.txt clang/include/clang/StaticAnalyzer/Checkers/CheckerBase.td clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.td clang/include/clang/StaticAnalyzer/Core/CMakeLists.txt clang/utils/TableGen/CMakeLists.txt clang/utils/TableGen/ClangSAConfigsEmitter.cpp clang/utils/TableGen/TableGen.cpp clang/utils/TableGen/TableGenBackends.h
Index: clang/utils/TableGen/TableGenBackends.h =================================================================== --- clang/utils/TableGen/TableGenBackends.h +++ clang/utils/TableGen/TableGenBackends.h @@ -70,6 +70,8 @@ void EmitClangSACheckers(llvm::RecordKeeper &Records, llvm::raw_ostream &OS); +void EmitClangSAConfigs(llvm::RecordKeeper &Records, llvm::raw_ostream &OS); + void EmitClangCommentHTMLTags(llvm::RecordKeeper &Records, llvm::raw_ostream &OS); void EmitClangCommentHTMLTagsProperties(llvm::RecordKeeper &Records, Index: clang/utils/TableGen/TableGen.cpp =================================================================== --- clang/utils/TableGen/TableGen.cpp +++ clang/utils/TableGen/TableGen.cpp @@ -55,6 +55,7 @@ GenClangTypeWriter, GenClangOpcodes, GenClangSACheckers, + GenClangSAConfigs, GenClangSyntaxNodeList, GenClangSyntaxNodeClasses, GenClangCommentHTMLTags, @@ -172,6 +173,8 @@ "Generate Clang constexpr interpreter opcodes"), clEnumValN(GenClangSACheckers, "gen-clang-sa-checkers", "Generate Clang Static Analyzer checkers"), + clEnumValN(GenClangSAConfigs, "gen-clang-sa-configs", + "Generate Clang Static Analyzer -analyzer-configs"), clEnumValN(GenClangSyntaxNodeList, "gen-clang-syntax-node-list", "Generate list of Clang Syntax Tree node types"), clEnumValN(GenClangSyntaxNodeClasses, "gen-clang-syntax-node-classes", @@ -356,6 +359,9 @@ case GenClangSACheckers: EmitClangSACheckers(Records, OS); break; + case GenClangSAConfigs: + EmitClangSAConfigs(Records, OS); + break; case GenClangCommentHTMLTags: EmitClangCommentHTMLTags(Records, OS); break; Index: clang/utils/TableGen/ClangSAConfigsEmitter.cpp =================================================================== --- /dev/null +++ clang/utils/TableGen/ClangSAConfigsEmitter.cpp @@ -0,0 +1,173 @@ +//==- ClangSAConfigsEmitter.cpp - Generate Clang SA config tables -*- C++ -*- +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This tablegen backend emits Clang Static Analyzer -analyzer-config tables. +// +//===----------------------------------------------------------------------===// + +#include "TableGenBackends.h" +#include "llvm/ADT/StringMap.h" +#include "llvm/Support/raw_ostream.h" +#include "llvm/TableGen/Error.h" +#include "llvm/TableGen/Record.h" +#include "llvm/TableGen/TableGenBackend.h" +#include <map> +#include <string> + +using namespace llvm; + +using SortedRecords = llvm::StringMap<const Record *>; + +static void printOptionFields(const Record *R, raw_ostream &OS, + bool IsDefaultValString) { + OS << R->getName() << "," + << "\""; + OS.write_escaped(R->getValueAsString("Name")) << "\",\""; + OS.write_escaped(R->getValueAsString("HelpText")) << "\","; + StringRef DefaultVal = R->getValueAsString("DefaultVal"); + + if (IsDefaultValString) { + if (DefaultVal == "") + OS << "\"\""; + else + OS << "\"" << DefaultVal << "\""; + } else + OS << DefaultVal; +} + +static SortedRecords getSortedDerivedDefinitions(RecordKeeper &Records, + StringRef ClassName) { + SortedRecords Ret; + for (const Record *R : Records.getAllDerivedDefinitions(ClassName)) + Ret[R->getValueAsString("Name")] = R; + return Ret; +} + +static std::string getEnumName(const Record *R) { + return R->getNameInitAsString() + "Kind"; +} + +void clang::EmitClangSAConfigs(RecordKeeper &Records, raw_ostream &OS) { + + OS << "// This file is automatically generated. Do not edit this file by " + "hand.\n"; + + // Emit boolean options. + // + // BOOLEAN_OPTION(TYPE, FIELD_NAME, CMDFLAG, HELPTEXT, DEFAULTVAL) + // - TYPE: Type of the option in class AnalyzerOptions. + // - FIELD_NAME: Name of the field in class AnalyzerOptions. + // - CMDFLAG: Name of the flag to be specified on the command line. For + // specifics, check clang -cc1 -analyzer-config-help. + // - HELPTEXT: Text displayed under -analyzer-config-help. + // - DEFAULTVAL: Default value of the option if its not specified in the + // command line. + OS << "\n" + "#ifdef BOOLEAN_OPTIONS\n"; + for (const auto &Pair : + getSortedDerivedDefinitions(Records, "BooleanConfig")) { + const Record *R = Pair.second; + OS << "BOOLEAN_OPTION(bool,"; + printOptionFields(R, OS, /*IsDefaultValString*/false); + OS << ")\n"; + } + OS << "#endif // BOOLEAN_OPTIONS\n" + "\n"; + + // Emit integer options. + // + // INTEGER_OPTIONS(TYPE, FIELD_NAME, CMDFLAG, HELPTEXT, DEFAULTVAL) + // Check the comments around BOOLEAN_OPTION for description of the parameters. + OS << "\n" + "#ifdef INTEGER_OPTIONS\n"; + for (const auto &Pair : + getSortedDerivedDefinitions(Records, "IntegerConfig")) { + const Record *R = Pair.second; + OS << "INTEGER_OPTION(int,"; + printOptionFields(R, OS, /*IsDefaultValString*/false); + OS << ")\n"; + } + OS << "#endif // INTEGER_OPTIONS\n" + "\n"; + + // Emit string options. + // + // STRING_OPTIONS(TYPE, FIELD_NAME, CMDFLAG, HELPTEXT, DEFAULTVAL) + // Check the comments around BOOLEAN_OPTION for description of the parameters. + OS << "\n" + "#ifdef STRING_OPTIONS\n"; + for (const auto &Pair : + getSortedDerivedDefinitions(Records, "StringConfig")) { + const Record *R = Pair.second; + OS << "STRING_OPTION(StringRef,"; + printOptionFields(R, OS, /*IsDefaultValString*/true); + OS << ")\n"; + } + OS << "#endif // STRING_OPTIONS\n" + "\n"; + + // Emit enum options. These are string options can only take a select fe + // values, and are parsed into enums. + // + // ENUM_OPTIONS( + // TYPE, FIELD_NAME, CMDFLAG, HELPTEXT, DEFAULTVAL, VALUES) + // + // Check the comments around BOOLEAN_OPTION for description of the first few + // parameters. + // - VALUES: A string that lists all options that can be specified on the + // command line. Each value is separated with a comma (its still one + // big string!). They correspond with enum values with the same + // index (see ENUMS below). + OS << "\n" + "#ifdef ENUM_OPTIONS\n"; + for (const auto &Pair : getSortedDerivedDefinitions(Records, "EnumConfig")) { + const Record *R = Pair.second; + OS << "ENUM_OPTION(" << getEnumName(R) << ","; + printOptionFields(R, OS, /*IsDefaultValString*/true); + OS << ','; + if (!R->isValueUnset("Values")) { + for (const Record *EnumVal : R->getValueAsListOfDefs("Values")) { + OS << "\""; + OS.write_escaped(EnumVal->getValueAsString("CmdFlag")); + OS << ",\""; + } + } + OS << ")\n"; + } + OS << "#endif // ENUM_OPTIONS\n" + "\n"; + + // Emit enum classes corresponding with the enum options. + // + // enum class <option name>Kind { + // <first enum value> = 0, + // ... + // <nth enum value> = n - 1 + // }; + // + // Enum values are equal with the index of the command line option in VALUES + // in ENUM_OPTIONS. + OS << "\n" + "#ifdef ENUMS\n"; + for (const auto &Pair : getSortedDerivedDefinitions(Records, "EnumConfig")) { + + const Record *R = Pair.second; + OS << "enum class " << getEnumName(R) << "{\n"; + int Counter = 0; + for (const Record *EnumVal : R->getValueAsListOfDefs("Values")) { + OS << " "; + OS.write_escaped(R->getValueAsString("EnumPrefix")) << '_'; + OS.write_escaped(EnumVal->getValueAsString("EnumName")) + << " = " << Counter++; + OS << ",\n"; + } + OS << "}\n\n"; + } + OS << "#endif // ENUMS\n" + "\n"; +} Index: clang/utils/TableGen/CMakeLists.txt =================================================================== --- clang/utils/TableGen/CMakeLists.txt +++ clang/utils/TableGen/CMakeLists.txt @@ -14,6 +14,7 @@ ClangOpenCLBuiltinEmitter.cpp ClangOptionDocEmitter.cpp ClangSACheckersEmitter.cpp + ClangSAConfigsEmitter.cpp ClangSyntaxEmitter.cpp ClangTypeNodesEmitter.cpp MveEmitter.cpp Index: clang/include/clang/StaticAnalyzer/Core/CMakeLists.txt =================================================================== --- /dev/null +++ clang/include/clang/StaticAnalyzer/Core/CMakeLists.txt @@ -0,0 +1,3 @@ +clang_tablegen(Configs.inc -gen-clang-sa-configs + SOURCE AnalyzerOptions.td + TARGET ClangSAConfigs) Index: clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.td =================================================================== --- /dev/null +++ clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.td @@ -0,0 +1,94 @@ +//===--- AnalyzerOptions.td - Analyzer configs ----------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file defines the analyzer options avaible with -analyzer-config. +// TODO: Copy documentation from AnalyzerOptions.def before deleting it. +// +//===----------------------------------------------------------------------===// + +include "../AnalyzerFlagsBase.td" + + +class Config<string name, CmdLineOptionTypeEnum type> { + string Name = name; + bits<2> Type = type.Type; + string HelpText; + string DefaultVal; +} + +class DefaultVal<string val> { string DefaultVal = val; } +//===----------------------------------------------------------------------===// + +class BooleanConfig<string name> : Config<name, Boolean>; +//===----------------------------------------------------------------------===// + +class IntegerConfig<string name> : Config<name, Integer> { + list<int> Values = ?; +} + +class IntValues<list<int> Vals> { + list<int> Values = Vals; +} +//===----------------------------------------------------------------------===// + +class StringConfig<string name> : Config<name, String> { + bit NeedsNoValidation = 0; +} +//===----------------------------------------------------------------------===// + +class FileNameConfig<string name> : Config<name, String>; +//===----------------------------------------------------------------------===// + +class EnumVal<string cmdFlag, string enumName> { + string CmdFlag = cmdFlag; + string EnumName = enumName; +} + +class EnumConfig<string name> : Config<name, String> { + list<EnumVal> Values; + string EnumPrefix; +} + +class EnumValues<list<EnumVal> Vals> { + list<EnumVal> Values = Vals; +} + +class EnumPrefix<string enumPrefix> { + string EnumPrefix = enumPrefix; +} +//===----------------------------------------------------------------------===// + + +def ShouldIncludeImplicitDtorsInCFG : BooleanConfig<"cfg-implicit-dtors">, + HelpText<"Whether or not implicit destructors for C++ objects should be " + "included in the CFG.">, + DefaultVal<"true">; + +def AlwaysInlineSize : IntegerConfig<"ipa-always-inline-size">, + HelpText<"The size of the functions (in basic blocks), which should be " + "considered to be small enough to always inline.">, + DefaultVal<"3">; + +def RawSilencedCheckersAndPackages : StringConfig<"silence-checkers">, + HelpText<"A semicolon separated list of checker and package names to " + "silence. Silenced checkers will not emit reports, but the modeling " + "remain enabled.">, + DefaultVal<"">; + +def CXXMemberInliningMode : EnumConfig<"c++-inlining">, + HelpText<"Controls which C++ member functions will be considered for " + "inlining.">, + EnumPrefix<"CIMK">, + EnumValues<[ + EnumVal<"none", "None">, + EnumVal<"constructors", "Constructors">, + EnumVal<"destructors", "Destructors">, + EnumVal<"methods", "Memberfunctions"> + ]>, + DefaultVal<"destructors">; + Index: clang/include/clang/StaticAnalyzer/Checkers/CheckerBase.td =================================================================== --- clang/include/clang/StaticAnalyzer/Checkers/CheckerBase.td +++ clang/include/clang/StaticAnalyzer/Checkers/CheckerBase.td @@ -10,16 +10,7 @@ // //===----------------------------------------------------------------------===// -/// Describes a checker or package option type. This is important for validating -/// user supplied inputs. -/// New option types can be added by modifying this enum. Note that this -/// requires changes in the TableGen emitter file ClangSACheckersEmitter.cpp. -class CmdLineOptionTypeEnum<bits<2> val> { - bits<2> Type = val; -} -def Integer : CmdLineOptionTypeEnum<0>; -def String : CmdLineOptionTypeEnum<1>; -def Boolean : CmdLineOptionTypeEnum<2>; +include "../AnalyzerFlagsBase.td" /// Describes the state of the entry. We wouldn't like to display, for example, /// developer only entries for a list meant for end users. @@ -81,10 +72,6 @@ /// def CoreBuiltin : Package<"builtin">, ParentPackage<Core>; class ParentPackage<Package P> { Package ParentPackage = P; } -/// A description. May be displayed to the user when clang is invoked with -/// a '-help'-like command line option. -class HelpText<string text> { string HelpText = text; } - /// Describes what kind of documentation exists for the checker. class DocumentationEnum<bits<2> val> { bits<2> Documentation = val; Index: clang/include/clang/StaticAnalyzer/CMakeLists.txt =================================================================== --- /dev/null +++ clang/include/clang/StaticAnalyzer/CMakeLists.txt @@ -0,0 +1,2 @@ +add_subdirectory(Checkers) +add_subdirectory(Core) Index: clang/include/clang/StaticAnalyzer/AnalyzerFlagsBase.td =================================================================== --- /dev/null +++ clang/include/clang/StaticAnalyzer/AnalyzerFlagsBase.td @@ -0,0 +1,23 @@ +//===-- AnalyzerFlagBase.td - Static Analyzer Flags TableGen base classes -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +/// Describes ain option type. This is important for validating user supplied +/// inputs. +/// New option types can be added by modifying this enum. Note that this +/// requires changes in the TableGen emitter files found in +/// clang/utils/TableGen. +class CmdLineOptionTypeEnum<bits<2> val> { + bits<2> Type = val; +} +def Integer : CmdLineOptionTypeEnum<0>; +def String : CmdLineOptionTypeEnum<1>; +def Boolean : CmdLineOptionTypeEnum<2>; + +/// A description. May be displayed to the user when clang is invoked with +/// a '-help'-like command line option. +class HelpText<string text> { string HelpText = text; } Index: clang/include/clang/CMakeLists.txt =================================================================== --- clang/include/clang/CMakeLists.txt +++ clang/include/clang/CMakeLists.txt @@ -4,5 +4,5 @@ add_subdirectory(Parse) add_subdirectory(Sema) add_subdirectory(Serialization) -add_subdirectory(StaticAnalyzer/Checkers) +add_subdirectory(StaticAnalyzer) add_subdirectory(Tooling/Syntax)
_______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits