jonathan.protzenko created this revision.
jonathan.protzenko added a reviewer: aaron.ballman.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.
jonathan.protzenko requested review of this revision.

This is a preliminary patch that I hope can serve as a basis for discussion. The
design summary below is copy/pasted from 
https://bugs.llvm.org/show_bug.cgi?id=46446#c7

(I'm intentionally posting a work-in-progress so that I can make sure that 
things
are roughly looking good; I'd like to defer details (clang-format'ing my
changes, naming conventions, tests) to later once the design is settled.)

The patch essentially adds the ability for a plugin to handle the parsing of a
CXX11 attribute's arguments. This means that with this patch, plugin-defined
CXX11 attributes are no longer limited to constant attributes that take no
arguments.

The patch intentionally gives the plugin a fair amount of freedom, to enable
interesting use-cases (see below).

The patch does not (yet) tackle the ability for out-of-tree plugins to extend
Attr.td. I'd be happy to work on that once the present patch makes progress.

Design
======

I chose to extend the ParsedAttrInfo class. This allows reusing the logic that
searches for an attribute plugin that has registered a matching spelling, and
reusing the existing registry of plugins. (I contemplated a separate registry,
but thought it would duplicate the spelling-search logic and would also lead to
code duplication in the plugin itself.)

By default, ParsedAttrInfo's (most of which do *not* come from plugins) are
happy to let the built-in clang parsing logic take effect, enforcing tweaking
knobs such as NumArgs and OptArgs. Plugins can, if they are so inclined, elect
to bypass clang's parsing logic (which, as we saw on Bugzilla, ignores CXX11
attribute arguments when they're not known to clang). In that case, plugins are
on their own; they are expected to handle all of the parsing, as well as pushing
the resulting ParsedAttr into the ParsedAttributes. They then report whether the
arguments are syntactically valid or not and the rest of the parsing code
handles error reporting.

(Side-note: this means that it's a three-state return value, and I'm currently
reusing the existing AttrHandling enum, but it could be improved with either
better names or a separate enum.)

The plugin receives, in addition to location information, two key parameters:

- the parser, obviously
- a possibly-null Declarator representing the partially parsed declaration that 
the attribute is attached to, if any.

The second parameter may be the most surprising part of this patch, so let me
try to provide some motivation and context.

My goal is to enable plugins to perform advanced static analysis and code
generation, relying on rich CXX11 attributes capturing deep semantic information
relating, say, a function's parameters, or a struct's fields.

You can easily imagine someone authoring a static analyzer that recognizes:

  typedef struct
  {
    size_t len;
    uint8_t *chars;
      [[ my_analyzer::invariant(strlen(chars) == len) ]];
  }
  managed_string;

The point here being to leverage a nice CXX11 attribute that benefits from
typo-checking, proper name and macro resolution, etc. rather than using some
unreliable syntax embedded in comments like so many other tools are doing, or
worse, a custom C++ parser.

In our case, we're auto-generating parsers and serializers from a struct type
definition (see https://www.usenix.org/system/files/sec19-ramananandro_0.pdf for
more info), so I'd like to author:

  typedef struct {
    uint32_t min_version;
    uint32_t max_version
      [[ everparse::constraint (max_version >= min_version && min_version >= 2 
&& max_version <= 4) ]];
  } my_network_format;
  `

Of course, our actual network formats are much more complicated, but this is
just for the sake of example. My point is that allowing plugins a substantial
degree of freedom is likely to enable a flurry of novel and exciting use-cases
based on clang.

Back to the design of the patch, receiving the Declarator allows me to do things
like this (in the plugin):

  Sema &S = P->getActions();
  TypeSourceInfo *T = S.GetTypeForDeclarator(*D, P->getCurScope());
  QualType R = T->getType();
  VarDecl *VD = VarDecl::Create(S.getASTContext(), S.CurContext,
      D->getBeginLoc(), D->getIdentifierLoc(), D->getIdentifier(), R,
      T,
      SC_None);
  // Doing this makes sure Sema is aware of the new scope entry, meaning this 
name
  // will be in scope when parsing the expression. (Parsing and scope
  // resolution are intertwined.)
  VD->setImplicit(true);
  S.PushOnScopeChains(VD, P->getCurScope());

This allows my plugin to successfully parse and recognize complex arguments that
refer to the current declarator, in effect letting my plugin define its own
extended syntax and scoping rules.

I hope this helps, I'll keep working on the patch (this is the first version
that seems to do something useful) but thought it'd be helpful to get the
discussion started in the meanwhile.

Details (optional reading)
==========================

- I was able to refactor the lookup-by-spelling logic in ParsedAttrInfo::get -- 
I'm curious to know if the two overloaded methods are conforming to clang's 
style
- I had to switch ParsedAttrInfo from a struct to a class because of a warning 
related to Microsoft ABI standards violations
- I'm wondering if the Declarator that the plugin receives should instead be a 
more general class (with a Kind()) that covers the union of all the "things 
currently being parsed that an attribute may be attached to". But, since I 
couldn't come up with other compelling examples, I've left it as-is until I 
find a strong argument in favor of introducing extra complexity.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D86169

Files:
  clang/include/clang/Parse/Parser.h
  clang/include/clang/Sema/ParsedAttr.h
  clang/lib/Parse/ParseDeclCXX.cpp
  clang/lib/Sema/ParsedAttr.cpp

Index: clang/lib/Sema/ParsedAttr.cpp
===================================================================
--- clang/lib/Sema/ParsedAttr.cpp
+++ clang/lib/Sema/ParsedAttr.cpp
@@ -109,18 +109,10 @@
 
 } // namespace
 
-const ParsedAttrInfo &ParsedAttrInfo::get(const AttributeCommonInfo &A) {
-  // If we have a ParsedAttrInfo for this ParsedAttr then return that.
-  if ((size_t)A.getParsedKind() < llvm::array_lengthof(AttrInfoMap))
-    return *AttrInfoMap[A.getParsedKind()];
-
-  // If this is an ignored attribute then return an appropriate ParsedAttrInfo.
-  static const ParsedAttrInfo IgnoredParsedAttrInfo(
-      AttributeCommonInfo::IgnoredAttribute);
-  if (A.getParsedKind() == AttributeCommonInfo::IgnoredAttribute)
-    return IgnoredParsedAttrInfo;
-
-  // Otherwise this may be an attribute defined by a plugin. First instantiate
+const ParsedAttrInfo &
+ParsedAttrInfo::get(std::string FullName,
+                    AttributeCommonInfo::Syntax SyntaxUsed) {
+  // This may be an attribute defined by a plugin. First instantiate
   // all plugin attributes if we haven't already done so.
   static llvm::ManagedStatic<std::list<std::unique_ptr<ParsedAttrInfo>>>
       PluginAttrInstances;
@@ -128,14 +120,11 @@
     for (auto It : ParsedAttrInfoRegistry::entries())
       PluginAttrInstances->emplace_back(It.instantiate());
 
-  // Search for a ParsedAttrInfo whose name and syntax match.
-  std::string FullName = A.getNormalizedFullName();
-  AttributeCommonInfo::Syntax SyntaxUsed = A.getSyntax();
   if (SyntaxUsed == AttributeCommonInfo::AS_ContextSensitiveKeyword)
     SyntaxUsed = AttributeCommonInfo::AS_Keyword;
 
   for (auto &Ptr : *PluginAttrInstances)
-    for (auto &S : Ptr->Spellings)
+    for (const auto &S : Ptr->Spellings)
       if (S.Syntax == SyntaxUsed && S.NormalizedFullName == FullName)
         return *Ptr;
 
@@ -145,6 +134,21 @@
   return DefaultParsedAttrInfo;
 }
 
+const ParsedAttrInfo &ParsedAttrInfo::get(const AttributeCommonInfo &A) {
+  // If we have a ParsedAttrInfo for this ParsedAttr then return that.
+  if ((size_t)A.getParsedKind() < llvm::array_lengthof(AttrInfoMap))
+    return *AttrInfoMap[A.getParsedKind()];
+
+  // If this is an ignored attribute then return an appropriate ParsedAttrInfo.
+  static const ParsedAttrInfo IgnoredParsedAttrInfo(
+      AttributeCommonInfo::IgnoredAttribute);
+  if (A.getParsedKind() == AttributeCommonInfo::IgnoredAttribute)
+    return IgnoredParsedAttrInfo;
+
+  // Search for a ParsedAttrInfo whose name and syntax match.
+  return get(A.getNormalizedFullName(), A.getSyntax());
+}
+
 unsigned ParsedAttr::getMinArgs() const { return getInfo().NumArgs; }
 
 unsigned ParsedAttr::getMaxArgs() const {
Index: clang/lib/Parse/ParseDeclCXX.cpp
===================================================================
--- clang/lib/Parse/ParseDeclCXX.cpp
+++ clang/lib/Parse/ParseDeclCXX.cpp
@@ -4041,18 +4041,36 @@
                                      ParsedAttributes &Attrs,
                                      SourceLocation *EndLoc,
                                      IdentifierInfo *ScopeName,
-                                     SourceLocation ScopeLoc) {
+                                     SourceLocation ScopeLoc, Declarator *D) {
   assert(Tok.is(tok::l_paren) && "Not a C++11 attribute argument list");
   SourceLocation LParenLoc = Tok.getLocation();
   const LangOptions &LO = getLangOpts();
   ParsedAttr::Syntax Syntax =
       LO.CPlusPlus ? ParsedAttr::AS_CXX11 : ParsedAttr::AS_C2x;
 
-  // If the attribute isn't known, we will not attempt to parse any
-  // arguments.
   if (!hasAttribute(LO.CPlusPlus ? AttrSyntax::CXX : AttrSyntax::C, ScopeName,
                     AttrName, getTargetInfo(), getLangOpts())) {
-    // Eat the left paren, then skip to the ending right paren.
+    std::string FullName;
+    FullName += ScopeName->getNameStart();
+    FullName += "::";
+    FullName += AttrName->getNameStart();
+    const ParsedAttrInfo &PluginInfo = ParsedAttrInfo::get(FullName, Syntax);
+    if (PluginInfo.AttrKind != AttributeCommonInfo::UnknownAttribute) {
+      // A plugin exists for this attribute. See if the plugin wants to opt into
+      // custom parsing.
+      ParsedAttrInfo::AttrHandling H = PluginInfo.parseAttributePayload(
+          this, Attrs, D, AttrName, AttrNameLoc, EndLoc, ScopeName, ScopeLoc);
+      if (H == ParsedAttrInfo::AttributeApplied)
+        return true;
+      else if (H == ParsedAttrInfo::AttributeNotApplied)
+        return false;
+      else
+        ; // fall-through
+    }
+
+    // Unknown attribute, or plugin declined to handle it. We do not attempt to
+    // parse its arguments. Eat the left paren, then skip to the ending right
+    // paren.
     ConsumeParen();
     SkipUntil(tok::r_paren);
     return false;
@@ -4126,7 +4144,8 @@
 /// [C++11] attribute-namespace:
 ///         identifier
 void Parser::ParseCXX11AttributeSpecifier(ParsedAttributes &attrs,
-                                          SourceLocation *endLoc) {
+                                          SourceLocation *endLoc,
+                                          Declarator *D) {
   if (Tok.is(tok::kw_alignas)) {
     Diag(Tok.getLocation(), diag::warn_cxx98_compat_alignas);
     ParseAlignmentSpecifier(attrs, endLoc);
@@ -4207,7 +4226,7 @@
     // Parse attribute arguments
     if (Tok.is(tok::l_paren))
       AttrParsed = ParseCXX11AttributeArgs(AttrName, AttrLoc, attrs, endLoc,
-                                           ScopeName, ScopeLoc);
+                                           ScopeName, ScopeLoc, D);
 
     if (!AttrParsed)
       attrs.addNew(
@@ -4234,7 +4253,7 @@
 /// attribute-specifier-seq:
 ///       attribute-specifier-seq[opt] attribute-specifier
 void Parser::ParseCXX11Attributes(ParsedAttributesWithRange &attrs,
-                                  SourceLocation *endLoc) {
+                                  SourceLocation *endLoc, Declarator *D) {
   assert(standardAttributesAllowed());
 
   SourceLocation StartLoc = Tok.getLocation(), Loc;
@@ -4242,7 +4261,7 @@
     endLoc = &Loc;
 
   do {
-    ParseCXX11AttributeSpecifier(attrs, endLoc);
+    ParseCXX11AttributeSpecifier(attrs, endLoc, D);
   } while (isCXX11AttributeSpecifier());
 
   attrs.Range = SourceRange(StartLoc, *endLoc);
Index: clang/include/clang/Sema/ParsedAttr.h
===================================================================
--- clang/include/clang/Sema/ParsedAttr.h
+++ clang/include/clang/Sema/ParsedAttr.h
@@ -17,6 +17,7 @@
 #include "clang/Basic/AttrSubjectMatchRules.h"
 #include "clang/Basic/AttributeCommonInfo.h"
 #include "clang/Basic/Diagnostic.h"
+#include "clang/Basic/IdentifierTable.h"
 #include "clang/Basic/SourceLocation.h"
 #include "clang/Sema/Ownership.h"
 #include "llvm/ADT/PointerUnion.h"
@@ -34,14 +35,18 @@
 
 class ASTContext;
 class Decl;
+class Declarator;
 class Expr;
 class IdentifierInfo;
 class LangOptions;
 class ParsedAttr;
+class ParsedAttributes;
+class Parser;
 class Sema;
 class TargetInfo;
 
-struct ParsedAttrInfo {
+class ParsedAttrInfo {
+public:
   /// Corresponds to the Kind enum.
   unsigned AttrKind : 16;
   /// The number of required arguments of this attribute.
@@ -75,6 +80,27 @@
 
   virtual ~ParsedAttrInfo() = default;
 
+  enum AttrHandling { NotHandled, AttributeApplied, AttributeNotApplied };
+
+  /// Override the default parsing logic and perform parsing of the attribute
+  /// payload here. This is only meaningful for plugins that register new
+  /// attributes. Flags such as HasCustomParsing, NumArgs, OptArgs
+  /// are not honored, and the plugin has to do all of the validation by hand.
+  /// Plugin receives an possibly-null Declarator D that provides, if available,
+  /// information on the in-progress Decl that the attribute is attached to.
+  ///
+  /// NotHandled defers to default parsing (accept expression
+  /// arguments for GNU syntax, ignore arguments for CXX11 syntax);
+  /// AttributeNotApplied indicates a parsing failure and AttributeApplied a
+  /// success.
+  virtual AttrHandling
+  parseAttributePayload(Parser *P, ParsedAttributes &Attrs, Declarator *D,
+                        IdentifierInfo *AttrName, SourceLocation AttrNameLoc,
+                        SourceLocation *EndLoc, IdentifierInfo *ScopeName,
+                        SourceLocation ScopeLoc) const {
+    return NotHandled;
+  }
+
   /// Check if this attribute appertains to D, and issue a diagnostic if not.
   virtual bool diagAppertainsToDecl(Sema &S, const ParsedAttr &Attr,
                                     const Decl *D) const {
@@ -99,11 +125,6 @@
       llvm::SmallVectorImpl<std::pair<attr::SubjectMatchRule, bool>> &Rules,
       const LangOptions &LangOpts) const {
   }
-  enum AttrHandling {
-    NotHandled,
-    AttributeApplied,
-    AttributeNotApplied
-  };
   /// If this ParsedAttrInfo knows how to handle this ParsedAttr applied to this
   /// Decl then do so and return either AttributeApplied if it was applied or
   /// AttributeNotApplied if it wasn't. Otherwise return NotHandled.
@@ -112,6 +133,14 @@
     return NotHandled;
   }
 
+  /// Search amongst registered plugins for a ParsedAttrInfo whose name and
+  /// syntax match, and return a default value otherwise.
+  static const ParsedAttrInfo &get(std::string FullName,
+                                   AttributeCommonInfo::Syntax SyntaxUsed);
+
+  /// Once parsing has been performed, an AttributeCommonInfo (e.g. ParsedAttr)
+  /// most likely has a ParsedAttrInfo attached to it. Perform ParsedAttrInfo
+  /// resolution via A, falling back on get above otherwise.
   static const ParsedAttrInfo &get(const AttributeCommonInfo &A);
 };
 
Index: clang/include/clang/Parse/Parser.h
===================================================================
--- clang/include/clang/Parse/Parser.h
+++ clang/include/clang/Parse/Parser.h
@@ -26,6 +26,7 @@
 #include "llvm/Frontend/OpenMP/OMPContext.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/PrettyStackTrace.h"
+#include "llvm/Support/Registry.h"
 #include "llvm/Support/SaveAndRestore.h"
 #include <memory>
 #include <stack>
@@ -2658,7 +2659,7 @@
     if (standardAttributesAllowed() && isCXX11AttributeSpecifier()) {
       ParsedAttributesWithRange attrs(AttrFactory);
       SourceLocation endLoc;
-      ParseCXX11Attributes(attrs, &endLoc);
+      ParseCXX11Attributes(attrs, &endLoc, &D);
       D.takeAttributes(attrs, endLoc);
     }
   }
@@ -2681,16 +2682,19 @@
   }
 
   void ParseCXX11AttributeSpecifier(ParsedAttributes &attrs,
-                                    SourceLocation *EndLoc = nullptr);
+                                    SourceLocation *EndLoc = nullptr,
+                                    Declarator *D = nullptr);
   void ParseCXX11Attributes(ParsedAttributesWithRange &attrs,
-                            SourceLocation *EndLoc = nullptr);
+                            SourceLocation *EndLoc = nullptr,
+                            Declarator *D = nullptr);
   /// Parses a C++11 (or C2x)-style attribute argument list. Returns true
   /// if this results in adding an attribute to the ParsedAttributes list.
   bool ParseCXX11AttributeArgs(IdentifierInfo *AttrName,
                                SourceLocation AttrNameLoc,
                                ParsedAttributes &Attrs, SourceLocation *EndLoc,
                                IdentifierInfo *ScopeName,
-                               SourceLocation ScopeLoc);
+                               SourceLocation ScopeLoc,
+                               Declarator *D = nullptr);
 
   IdentifierInfo *TryParseCXX11AttributeIdentifier(SourceLocation &Loc);
 
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to