jonathan.protzenko created this revision. jonathan.protzenko added a reviewer: aaron.ballman. Herald added a project: clang. Herald added a subscriber: cfe-commits. jonathan.protzenko requested review of this revision.
This is a preliminary patch that I hope can serve as a basis for discussion. The design summary below is copy/pasted from https://bugs.llvm.org/show_bug.cgi?id=46446#c7 (I'm intentionally posting a work-in-progress so that I can make sure that things are roughly looking good; I'd like to defer details (clang-format'ing my changes, naming conventions, tests) to later once the design is settled.) The patch essentially adds the ability for a plugin to handle the parsing of a CXX11 attribute's arguments. This means that with this patch, plugin-defined CXX11 attributes are no longer limited to constant attributes that take no arguments. The patch intentionally gives the plugin a fair amount of freedom, to enable interesting use-cases (see below). The patch does not (yet) tackle the ability for out-of-tree plugins to extend Attr.td. I'd be happy to work on that once the present patch makes progress. Design ====== I chose to extend the ParsedAttrInfo class. This allows reusing the logic that searches for an attribute plugin that has registered a matching spelling, and reusing the existing registry of plugins. (I contemplated a separate registry, but thought it would duplicate the spelling-search logic and would also lead to code duplication in the plugin itself.) By default, ParsedAttrInfo's (most of which do *not* come from plugins) are happy to let the built-in clang parsing logic take effect, enforcing tweaking knobs such as NumArgs and OptArgs. Plugins can, if they are so inclined, elect to bypass clang's parsing logic (which, as we saw on Bugzilla, ignores CXX11 attribute arguments when they're not known to clang). In that case, plugins are on their own; they are expected to handle all of the parsing, as well as pushing the resulting ParsedAttr into the ParsedAttributes. They then report whether the arguments are syntactically valid or not and the rest of the parsing code handles error reporting. (Side-note: this means that it's a three-state return value, and I'm currently reusing the existing AttrHandling enum, but it could be improved with either better names or a separate enum.) The plugin receives, in addition to location information, two key parameters: - the parser, obviously - a possibly-null Declarator representing the partially parsed declaration that the attribute is attached to, if any. The second parameter may be the most surprising part of this patch, so let me try to provide some motivation and context. My goal is to enable plugins to perform advanced static analysis and code generation, relying on rich CXX11 attributes capturing deep semantic information relating, say, a function's parameters, or a struct's fields. You can easily imagine someone authoring a static analyzer that recognizes: typedef struct { size_t len; uint8_t *chars; [[ my_analyzer::invariant(strlen(chars) == len) ]]; } managed_string; The point here being to leverage a nice CXX11 attribute that benefits from typo-checking, proper name and macro resolution, etc. rather than using some unreliable syntax embedded in comments like so many other tools are doing, or worse, a custom C++ parser. In our case, we're auto-generating parsers and serializers from a struct type definition (see https://www.usenix.org/system/files/sec19-ramananandro_0.pdf for more info), so I'd like to author: typedef struct { uint32_t min_version; uint32_t max_version [[ everparse::constraint (max_version >= min_version && min_version >= 2 && max_version <= 4) ]]; } my_network_format; ` Of course, our actual network formats are much more complicated, but this is just for the sake of example. My point is that allowing plugins a substantial degree of freedom is likely to enable a flurry of novel and exciting use-cases based on clang. Back to the design of the patch, receiving the Declarator allows me to do things like this (in the plugin): Sema &S = P->getActions(); TypeSourceInfo *T = S.GetTypeForDeclarator(*D, P->getCurScope()); QualType R = T->getType(); VarDecl *VD = VarDecl::Create(S.getASTContext(), S.CurContext, D->getBeginLoc(), D->getIdentifierLoc(), D->getIdentifier(), R, T, SC_None); // Doing this makes sure Sema is aware of the new scope entry, meaning this name // will be in scope when parsing the expression. (Parsing and scope // resolution are intertwined.) VD->setImplicit(true); S.PushOnScopeChains(VD, P->getCurScope()); This allows my plugin to successfully parse and recognize complex arguments that refer to the current declarator, in effect letting my plugin define its own extended syntax and scoping rules. I hope this helps, I'll keep working on the patch (this is the first version that seems to do something useful) but thought it'd be helpful to get the discussion started in the meanwhile. Details (optional reading) ========================== - I was able to refactor the lookup-by-spelling logic in ParsedAttrInfo::get -- I'm curious to know if the two overloaded methods are conforming to clang's style - I had to switch ParsedAttrInfo from a struct to a class because of a warning related to Microsoft ABI standards violations - I'm wondering if the Declarator that the plugin receives should instead be a more general class (with a Kind()) that covers the union of all the "things currently being parsed that an attribute may be attached to". But, since I couldn't come up with other compelling examples, I've left it as-is until I find a strong argument in favor of introducing extra complexity. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D86169 Files: clang/include/clang/Parse/Parser.h clang/include/clang/Sema/ParsedAttr.h clang/lib/Parse/ParseDeclCXX.cpp clang/lib/Sema/ParsedAttr.cpp
Index: clang/lib/Sema/ParsedAttr.cpp =================================================================== --- clang/lib/Sema/ParsedAttr.cpp +++ clang/lib/Sema/ParsedAttr.cpp @@ -109,18 +109,10 @@ } // namespace -const ParsedAttrInfo &ParsedAttrInfo::get(const AttributeCommonInfo &A) { - // If we have a ParsedAttrInfo for this ParsedAttr then return that. - if ((size_t)A.getParsedKind() < llvm::array_lengthof(AttrInfoMap)) - return *AttrInfoMap[A.getParsedKind()]; - - // If this is an ignored attribute then return an appropriate ParsedAttrInfo. - static const ParsedAttrInfo IgnoredParsedAttrInfo( - AttributeCommonInfo::IgnoredAttribute); - if (A.getParsedKind() == AttributeCommonInfo::IgnoredAttribute) - return IgnoredParsedAttrInfo; - - // Otherwise this may be an attribute defined by a plugin. First instantiate +const ParsedAttrInfo & +ParsedAttrInfo::get(std::string FullName, + AttributeCommonInfo::Syntax SyntaxUsed) { + // This may be an attribute defined by a plugin. First instantiate // all plugin attributes if we haven't already done so. static llvm::ManagedStatic<std::list<std::unique_ptr<ParsedAttrInfo>>> PluginAttrInstances; @@ -128,14 +120,11 @@ for (auto It : ParsedAttrInfoRegistry::entries()) PluginAttrInstances->emplace_back(It.instantiate()); - // Search for a ParsedAttrInfo whose name and syntax match. - std::string FullName = A.getNormalizedFullName(); - AttributeCommonInfo::Syntax SyntaxUsed = A.getSyntax(); if (SyntaxUsed == AttributeCommonInfo::AS_ContextSensitiveKeyword) SyntaxUsed = AttributeCommonInfo::AS_Keyword; for (auto &Ptr : *PluginAttrInstances) - for (auto &S : Ptr->Spellings) + for (const auto &S : Ptr->Spellings) if (S.Syntax == SyntaxUsed && S.NormalizedFullName == FullName) return *Ptr; @@ -145,6 +134,21 @@ return DefaultParsedAttrInfo; } +const ParsedAttrInfo &ParsedAttrInfo::get(const AttributeCommonInfo &A) { + // If we have a ParsedAttrInfo for this ParsedAttr then return that. + if ((size_t)A.getParsedKind() < llvm::array_lengthof(AttrInfoMap)) + return *AttrInfoMap[A.getParsedKind()]; + + // If this is an ignored attribute then return an appropriate ParsedAttrInfo. + static const ParsedAttrInfo IgnoredParsedAttrInfo( + AttributeCommonInfo::IgnoredAttribute); + if (A.getParsedKind() == AttributeCommonInfo::IgnoredAttribute) + return IgnoredParsedAttrInfo; + + // Search for a ParsedAttrInfo whose name and syntax match. + return get(A.getNormalizedFullName(), A.getSyntax()); +} + unsigned ParsedAttr::getMinArgs() const { return getInfo().NumArgs; } unsigned ParsedAttr::getMaxArgs() const { Index: clang/lib/Parse/ParseDeclCXX.cpp =================================================================== --- clang/lib/Parse/ParseDeclCXX.cpp +++ clang/lib/Parse/ParseDeclCXX.cpp @@ -4041,18 +4041,36 @@ ParsedAttributes &Attrs, SourceLocation *EndLoc, IdentifierInfo *ScopeName, - SourceLocation ScopeLoc) { + SourceLocation ScopeLoc, Declarator *D) { assert(Tok.is(tok::l_paren) && "Not a C++11 attribute argument list"); SourceLocation LParenLoc = Tok.getLocation(); const LangOptions &LO = getLangOpts(); ParsedAttr::Syntax Syntax = LO.CPlusPlus ? ParsedAttr::AS_CXX11 : ParsedAttr::AS_C2x; - // If the attribute isn't known, we will not attempt to parse any - // arguments. if (!hasAttribute(LO.CPlusPlus ? AttrSyntax::CXX : AttrSyntax::C, ScopeName, AttrName, getTargetInfo(), getLangOpts())) { - // Eat the left paren, then skip to the ending right paren. + std::string FullName; + FullName += ScopeName->getNameStart(); + FullName += "::"; + FullName += AttrName->getNameStart(); + const ParsedAttrInfo &PluginInfo = ParsedAttrInfo::get(FullName, Syntax); + if (PluginInfo.AttrKind != AttributeCommonInfo::UnknownAttribute) { + // A plugin exists for this attribute. See if the plugin wants to opt into + // custom parsing. + ParsedAttrInfo::AttrHandling H = PluginInfo.parseAttributePayload( + this, Attrs, D, AttrName, AttrNameLoc, EndLoc, ScopeName, ScopeLoc); + if (H == ParsedAttrInfo::AttributeApplied) + return true; + else if (H == ParsedAttrInfo::AttributeNotApplied) + return false; + else + ; // fall-through + } + + // Unknown attribute, or plugin declined to handle it. We do not attempt to + // parse its arguments. Eat the left paren, then skip to the ending right + // paren. ConsumeParen(); SkipUntil(tok::r_paren); return false; @@ -4126,7 +4144,8 @@ /// [C++11] attribute-namespace: /// identifier void Parser::ParseCXX11AttributeSpecifier(ParsedAttributes &attrs, - SourceLocation *endLoc) { + SourceLocation *endLoc, + Declarator *D) { if (Tok.is(tok::kw_alignas)) { Diag(Tok.getLocation(), diag::warn_cxx98_compat_alignas); ParseAlignmentSpecifier(attrs, endLoc); @@ -4207,7 +4226,7 @@ // Parse attribute arguments if (Tok.is(tok::l_paren)) AttrParsed = ParseCXX11AttributeArgs(AttrName, AttrLoc, attrs, endLoc, - ScopeName, ScopeLoc); + ScopeName, ScopeLoc, D); if (!AttrParsed) attrs.addNew( @@ -4234,7 +4253,7 @@ /// attribute-specifier-seq: /// attribute-specifier-seq[opt] attribute-specifier void Parser::ParseCXX11Attributes(ParsedAttributesWithRange &attrs, - SourceLocation *endLoc) { + SourceLocation *endLoc, Declarator *D) { assert(standardAttributesAllowed()); SourceLocation StartLoc = Tok.getLocation(), Loc; @@ -4242,7 +4261,7 @@ endLoc = &Loc; do { - ParseCXX11AttributeSpecifier(attrs, endLoc); + ParseCXX11AttributeSpecifier(attrs, endLoc, D); } while (isCXX11AttributeSpecifier()); attrs.Range = SourceRange(StartLoc, *endLoc); Index: clang/include/clang/Sema/ParsedAttr.h =================================================================== --- clang/include/clang/Sema/ParsedAttr.h +++ clang/include/clang/Sema/ParsedAttr.h @@ -17,6 +17,7 @@ #include "clang/Basic/AttrSubjectMatchRules.h" #include "clang/Basic/AttributeCommonInfo.h" #include "clang/Basic/Diagnostic.h" +#include "clang/Basic/IdentifierTable.h" #include "clang/Basic/SourceLocation.h" #include "clang/Sema/Ownership.h" #include "llvm/ADT/PointerUnion.h" @@ -34,14 +35,18 @@ class ASTContext; class Decl; +class Declarator; class Expr; class IdentifierInfo; class LangOptions; class ParsedAttr; +class ParsedAttributes; +class Parser; class Sema; class TargetInfo; -struct ParsedAttrInfo { +class ParsedAttrInfo { +public: /// Corresponds to the Kind enum. unsigned AttrKind : 16; /// The number of required arguments of this attribute. @@ -75,6 +80,27 @@ virtual ~ParsedAttrInfo() = default; + enum AttrHandling { NotHandled, AttributeApplied, AttributeNotApplied }; + + /// Override the default parsing logic and perform parsing of the attribute + /// payload here. This is only meaningful for plugins that register new + /// attributes. Flags such as HasCustomParsing, NumArgs, OptArgs + /// are not honored, and the plugin has to do all of the validation by hand. + /// Plugin receives an possibly-null Declarator D that provides, if available, + /// information on the in-progress Decl that the attribute is attached to. + /// + /// NotHandled defers to default parsing (accept expression + /// arguments for GNU syntax, ignore arguments for CXX11 syntax); + /// AttributeNotApplied indicates a parsing failure and AttributeApplied a + /// success. + virtual AttrHandling + parseAttributePayload(Parser *P, ParsedAttributes &Attrs, Declarator *D, + IdentifierInfo *AttrName, SourceLocation AttrNameLoc, + SourceLocation *EndLoc, IdentifierInfo *ScopeName, + SourceLocation ScopeLoc) const { + return NotHandled; + } + /// Check if this attribute appertains to D, and issue a diagnostic if not. virtual bool diagAppertainsToDecl(Sema &S, const ParsedAttr &Attr, const Decl *D) const { @@ -99,11 +125,6 @@ llvm::SmallVectorImpl<std::pair<attr::SubjectMatchRule, bool>> &Rules, const LangOptions &LangOpts) const { } - enum AttrHandling { - NotHandled, - AttributeApplied, - AttributeNotApplied - }; /// If this ParsedAttrInfo knows how to handle this ParsedAttr applied to this /// Decl then do so and return either AttributeApplied if it was applied or /// AttributeNotApplied if it wasn't. Otherwise return NotHandled. @@ -112,6 +133,14 @@ return NotHandled; } + /// Search amongst registered plugins for a ParsedAttrInfo whose name and + /// syntax match, and return a default value otherwise. + static const ParsedAttrInfo &get(std::string FullName, + AttributeCommonInfo::Syntax SyntaxUsed); + + /// Once parsing has been performed, an AttributeCommonInfo (e.g. ParsedAttr) + /// most likely has a ParsedAttrInfo attached to it. Perform ParsedAttrInfo + /// resolution via A, falling back on get above otherwise. static const ParsedAttrInfo &get(const AttributeCommonInfo &A); }; Index: clang/include/clang/Parse/Parser.h =================================================================== --- clang/include/clang/Parse/Parser.h +++ clang/include/clang/Parse/Parser.h @@ -26,6 +26,7 @@ #include "llvm/Frontend/OpenMP/OMPContext.h" #include "llvm/Support/Compiler.h" #include "llvm/Support/PrettyStackTrace.h" +#include "llvm/Support/Registry.h" #include "llvm/Support/SaveAndRestore.h" #include <memory> #include <stack> @@ -2658,7 +2659,7 @@ if (standardAttributesAllowed() && isCXX11AttributeSpecifier()) { ParsedAttributesWithRange attrs(AttrFactory); SourceLocation endLoc; - ParseCXX11Attributes(attrs, &endLoc); + ParseCXX11Attributes(attrs, &endLoc, &D); D.takeAttributes(attrs, endLoc); } } @@ -2681,16 +2682,19 @@ } void ParseCXX11AttributeSpecifier(ParsedAttributes &attrs, - SourceLocation *EndLoc = nullptr); + SourceLocation *EndLoc = nullptr, + Declarator *D = nullptr); void ParseCXX11Attributes(ParsedAttributesWithRange &attrs, - SourceLocation *EndLoc = nullptr); + SourceLocation *EndLoc = nullptr, + Declarator *D = nullptr); /// Parses a C++11 (or C2x)-style attribute argument list. Returns true /// if this results in adding an attribute to the ParsedAttributes list. bool ParseCXX11AttributeArgs(IdentifierInfo *AttrName, SourceLocation AttrNameLoc, ParsedAttributes &Attrs, SourceLocation *EndLoc, IdentifierInfo *ScopeName, - SourceLocation ScopeLoc); + SourceLocation ScopeLoc, + Declarator *D = nullptr); IdentifierInfo *TryParseCXX11AttributeIdentifier(SourceLocation &Loc);
_______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits