Hi,

In the commit [1] one side change was to fix mixed up usage of
BlockNumber and Buffer variables. This was a one-off manual effort, and
indeed it hardly seems possible to get compilation warnings for such
code, as long as the underlying types could be converted in a standard
conforming way. As far as I see such conversions are acceptable and
used every now and then in the project, but they may hide some subtle
issues.

One interesting way I found to address this was to benefit from clang
plugin system [2]. A clang plugin allows you to run some frontend
actions when compiling code with clang, and this approach is used in
some large open source projects (e.g. I've got my inspiration largely
from libreoffice [3]). After a little experimenting I could teach such a
plugin to warn about situations similar to the one described above. What
it does is:

* It visits all the function call expressions
* Verify if the function arguments are defined via typedef
* Compare the actual argument with the function declaration
* Consult with the suppression rules to minimize false positives

In the end the plugin catches the error mentioned above, and we get
something like this:

    ginget.c:143:41: warning: Typedef check: Expected 'BlockNumber' (aka 
'unsigned int'),
            got 'Buffer' (aka 'int') in PredicateLockPage 
PredicateLockPage(btree->index, stack->buffer, snapshot);

Of course the plugin produces more warning, and I haven't checked all of
them yet -- some probably would have to be ignored as well. But in the
long run I'm pretty confident it's possible to fine tune this logic and
suppression rules to get minimum noise.

As I see it, there are advantages of using plugins in such way:

* Helps automatically detect some issues
* Other type of plugins could be useful to support large undertakings,
  where a lot of code transformation is involved.

And of course disadvantages as well:

* It has to be fine-tuned to be useful
* It's compiler dependent, not clear how to always exercise it

I would like to get your opinion, folks. Does it sound interesting
enough for the community, is it worth it to pursue the idea?

Some final notes about infrastructure bits. Never mind cmake in there --
it was just for a quick start, I'm going to convert it to something
else. There are some changes needed to tell the compiler to actually
load the plugin, those of course could be done much better as well. I
didn't do anything with meson here, because it turns out meson tends to
enclose the plugin file with '-Wl,--start-group ... -Wl,--end-group' and
it breaks the plugin discovery.

[1]: 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=126552c85c1cfb6ce6445159b8024cfa5631f33e
[2]: https://clang.llvm.org/docs/ClangPlugins.html
[3]: https://git.libreoffice.org/core/+/refs/heads/master/compilerplugins/clang/
>From 01e645f456f9476c60943f12d794465567f2d265 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Thu, 3 Aug 2023 15:51:26 +0200
Subject: [PATCH] Typedef-check clang plugin

Add a clang plugin to warn about suspicious type casting for types
defined via typedef, e.g. BlockNumber and Buffer.
---
 configure                                     |  36 ++++
 configure.ac                                  |  12 ++
 src/tools/clang/typedef_check/CMakeLists.txt  |  31 +++
 .../clang/typedef_check/TypedefCheck.cpp      | 183 ++++++++++++++++++
 4 files changed, 262 insertions(+)
 create mode 100644 src/tools/clang/typedef_check/CMakeLists.txt
 create mode 100644 src/tools/clang/typedef_check/TypedefCheck.cpp

diff --git a/configure b/configure
index 2e518c8007d..71da30779de 100755
--- a/configure
+++ b/configure
@@ -767,6 +767,7 @@ enable_coverage
 GENHTML
 LCOV
 GCOV
+enable_typedef_check
 enable_debug
 enable_rpath
 default_port
@@ -835,6 +836,7 @@ enable_rpath
 enable_spinlocks
 enable_atomics
 enable_debug
+enable_typedef_check
 enable_profiling
 enable_coverage
 enable_dtrace
@@ -1528,6 +1530,7 @@ Optional Features:
   --disable-spinlocks     do not use spinlocks
   --disable-atomics       do not use atomic operations
   --enable-debug          build with debugging symbols (-g)
+  --enable-typedef-check  build with typedef-check plugin
   --enable-profiling      build with profiling enabled
   --enable-coverage       build with coverage testing instrumentation
   --enable-dtrace         build with DTrace support
@@ -3344,6 +3347,34 @@ fi
 
 
 
+#
+# --enable-typedef-check adds clang typedef-check plugin
+#
+
+
+# Check whether --enable-typedef-check was given.
+if test "${enable_typedef_check+set}" = set; then :
+  enableval=$enable_typedef_check;
+  case $enableval in
+    yes)
+      :
+      ;;
+    no)
+      :
+      ;;
+    *)
+      as_fn_error $? "no argument expected for --enable-typedef-check option" "$LINENO" 5
+      ;;
+  esac
+
+else
+  enable_typedef_check=no
+
+fi
+
+
+
+
 #
 # --enable-profiling enables gcc profiling
 #
@@ -7828,6 +7859,11 @@ if test "$enable_debug" = yes && test "$ac_cv_prog_cxx_g" = yes; then
   CXXFLAGS="$CXXFLAGS -g"
 fi
 
+if test "$enable_typedef_check" = yes; then
+  abs_path=`readlink -f $srcdir`
+  CFLAGS="$CFLAGS -Xclang -load -Xclang $abs_path/src/tools/clang/typedef_check/build/lib/TypedefCheck.so -Xclang -add-plugin -Xclang typedef-check"
+fi
+
 # enable code coverage if --enable-coverage
 if test "$enable_coverage" = yes; then
   if test "$GCC" = yes; then
diff --git a/configure.ac b/configure.ac
index 3ebe1a796dc..3b90bf9d1c5 100644
--- a/configure.ac
+++ b/configure.ac
@@ -206,6 +206,13 @@ PGAC_ARG_BOOL(enable, debug, no,
               [build with debugging symbols (-g)])
 AC_SUBST(enable_debug)
 
+#
+# --enable-typedef-check adds clang typedef-check plugin
+#
+PGAC_ARG_BOOL(enable, typedef-check, no,
+              [build with typedef-check plugin])
+AC_SUBST(enable_typedef_check)
+
 #
 # --enable-profiling enables gcc profiling
 #
@@ -683,6 +690,11 @@ if test "$enable_debug" = yes && test "$ac_cv_prog_cxx_g" = yes; then
   CXXFLAGS="$CXXFLAGS -g"
 fi
 
+if test "$enable_typedef_check" = yes; then
+  abs_path=`readlink -f $srcdir`
+  CFLAGS="$CFLAGS -Xclang -load -Xclang $abs_path/src/tools/clang/typedef_check/build/lib/TypedefCheck.so -Xclang -add-plugin -Xclang typedef-check"
+fi
+
 # enable code coverage if --enable-coverage
 if test "$enable_coverage" = yes; then
   if test "$GCC" = yes; then
diff --git a/src/tools/clang/typedef_check/CMakeLists.txt b/src/tools/clang/typedef_check/CMakeLists.txt
new file mode 100644
index 00000000000..7508d9906f1
--- /dev/null
+++ b/src/tools/clang/typedef_check/CMakeLists.txt
@@ -0,0 +1,31 @@
+cmake_minimum_required(VERSION 3.20.0)
+project(Typedef)
+
+execute_process(COMMAND llvm-config --cmakedir
+                OUTPUT_VARIABLE LLVM_CMAKEDIR
+                OUTPUT_STRIP_TRAILING_WHITESPACE)
+
+execute_process(COMMAND llvm-config --includedir
+                OUTPUT_VARIABLE LLVM_INCLUDEDIR
+                OUTPUT_STRIP_TRAILING_WHITESPACE)
+
+message(STATUS ${LLVM_CMAKEDIR})
+message(STATUS ${LLVM_INCLUDEDIR})
+
+list(APPEND CMAKE_MODULE_PATH ${LLVM_CMAKEDIR})
+
+message(STATUS ${CMAKE_MODULE_PATH})
+
+set(LLVM_RUNTIME_OUTPUT_INTDIR ${CMAKE_BINARY_DIR}/${CMAKE_CFG_INTDIR}/bin)
+set(LLVM_LIBRARY_OUTPUT_INTDIR ${CMAKE_BINARY_DIR}/${CMAKE_CFG_INTDIR}/lib)
+set(LLVM_ENABLE_PLUGINS true)
+
+include(HandleLLVMOptions) # important: matches compiler flags to LLVM/Clang build
+
+include(AddLLVM)
+
+add_llvm_library(TypedefCheck MODULE TypedefCheck.cpp PLUGIN_TOOL clang)
+
+target_include_directories(TypedefCheck PUBLIC ${LLVM_INCLUDEDIR})
+
+target_compile_features(TypedefCheck PRIVATE cxx_std_17)
diff --git a/src/tools/clang/typedef_check/TypedefCheck.cpp b/src/tools/clang/typedef_check/TypedefCheck.cpp
new file mode 100644
index 00000000000..f41216ee7c3
--- /dev/null
+++ b/src/tools/clang/typedef_check/TypedefCheck.cpp
@@ -0,0 +1,183 @@
+/*
+ * Clang plugin for PostgreSQL to warn about suspicious misuse of types defined
+ * via typedef, e.g. BlockNumber and Buffer. Such validation could not be done
+ * without false positives, thus the plugin has to be tuned via suppression
+ * rules for types, pair of types and functions.
+ */
+
+#include "clang/Frontend/FrontendPluginRegistry.h"
+#include "clang/AST/AST.h"
+#include "clang/AST/ASTConsumer.h"
+#include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/Frontend/CompilerInstance.h"
+#include "llvm/Support/raw_ostream.h"
+
+using namespace clang;
+
+namespace {
+
+class TypedefCheckConsumer : public ASTConsumer {
+  CompilerInstance &Instance;
+
+public:
+  TypedefCheckConsumer(CompilerInstance &Instance) : Instance(Instance) {}
+
+  void HandleTranslationUnit(ASTContext& context) override {
+    struct Visitor : public RecursiveASTVisitor<Visitor> {
+      std::set<StringRef> IgnoredTypes = {"size_t", "uintptr_t",
+                                          "float8", "uint8", "int8",
+                                          "uint16", "int16",
+                                          "uint32", "int32",
+                                          "uint64", "int64",
+                                          "__m128i", "Vector8",
+                                          "__off_t", "__off64_t",
+                                          "__uid_t", "__mode_t",
+                                          "__gid_t", "key_t", "__pid_t",
+                                          "__time_t", "Datum",
+                                          "yy_size_t", "pg_wchar",
+                                          "wchar_t", "fmStringInfo"};
+
+      std::set<std::string> IgnoredFunctions = {"FunctionalCall1Coll", "FunctionalCall2Coll",
+                                                "fmgr_info_cxt", "fmgr_info",
+                                                "check_amproc_signature", "format_procedure",
+                                                "check_amoptsproc_signature",
+                                                "check_hash_func_signature"};
+
+      std::set<std::pair<StringRef, StringRef>> IgnoredPairs = {
+        std::pair<StringRef, StringRef>("RegProcedure", "Oid"),
+        std::pair<StringRef, StringRef>("Oid", "RegProcedure"),
+        std::pair<StringRef, StringRef>("Timestamp", "TimestampTz"),
+        std::pair<StringRef, StringRef>("TimestampTz", "Timestamp"),
+      };
+
+      Visitor(DiagnosticsEngine &Diags) : Diags(Diags) {}
+
+      bool VisitCallExpr(CallExpr *expr) {
+        const auto funcDecl = expr->getDirectCallee();
+        unsigned DiagID = Diags.getCustomDiagID(DiagnosticsEngine::Warning,
+                                            "Typedef check: Expected %0, got %1 in %2");
+
+        if (funcDecl == nullptr)
+            return true;
+
+        for (unsigned int i = 0; i != expr->getNumArgs(); ++i) {
+            /* Declared argument */
+            ParmVarDecl const* argDecl;
+            QualType argDeclT;
+            QualType argDeclTypedefT;
+            TypedefNameDecl const* argDeclTypedef;
+
+            /* Actually provided value */
+            Expr *arg;
+            QualType argT;
+            QualType argTypedefT;
+            TypedefNameDecl const* argTypedef;
+
+            arg = expr->getArg(i)->IgnoreParenImpCasts();
+            argT = arg->getType();
+
+            if (i >= funcDecl->getNumParams())
+                continue;
+
+            argDecl = funcDecl->getParamDecl(i);
+            argDeclT = argDecl->getOriginalType();
+
+            if (argT.isNull())
+                continue;
+
+            if (argDeclT.isNull())
+                continue;
+
+            /* Verify if we deal with a typedef */
+            if (auto const t = argT.getTypePtr()->getAs<TypedefType>()) {
+                argTypedef = t->getDecl();
+                argTypedefT = t->desugar();
+            }
+            else
+                continue;
+
+            if (auto const t = argDeclT.getTypePtr()->getAs<TypedefType>()) {
+                argDeclTypedef = t->getDecl();
+                argDeclTypedefT = t->desugar();
+            }
+            else
+                continue;
+
+            /*
+             * If both types (actual argument type and declared one) are not
+             * builtin and could be desugarized into the same type, ignore it.
+             */
+            if (!argDeclTypedefT->isBuiltinType() &&
+                !argTypedefT->isBuiltinType() &&
+                (argDeclTypedefT.getAsString() == argTypedefT.getAsString()))
+                continue;
+
+            /*
+             * If one desugarized type (either the declared or actual) is the
+             * same as not desugarized another one, ignore it.
+             */
+            if (argDeclTypedefT.getAsString() == argTypedef->getName())
+                continue;
+
+            if (argTypedefT.getAsString() == argDeclTypedef->getName())
+                continue;
+
+            /* If it is one of the suppressed functions, ignore it. */
+            if (IgnoredFunctions.find(funcDecl->getNameInfo().getAsString()) != IgnoredFunctions.end())
+                continue;
+
+            /* If it is one of the suppressed types, ignore it. */
+            if (IgnoredTypes.find(argTypedef->getName()) != IgnoredTypes.end())
+                continue;
+
+            if (IgnoredTypes.find(argDeclTypedef->getName()) != IgnoredTypes.end())
+                continue;
+
+            /* If it is one of the suppressed pair of types, ignore it. */
+            if (IgnoredPairs.find(std::pair<StringRef, StringRef>(argTypedef->getName(), argDeclTypedef->getName())) != IgnoredPairs.end())
+                continue;
+
+            if (argTypedef->getName() != argDeclTypedef->getName())
+            {
+                Diags.Report(arg->getExprLoc(), DiagID)
+                    << argDecl->getOriginalType()
+                    << arg->getType()
+                    << funcDecl->getNameInfo().getAsString();
+
+            }
+        }
+
+        return true;
+      }
+
+      std::set<FunctionDecl*> LateParsedDecls;
+      DiagnosticsEngine &Diags;
+
+    } v(Instance.getDiagnostics());
+    v.TraverseDecl(context.getTranslationUnitDecl());
+  }
+};
+
+class TypedefCheckAction : public PluginASTAction {
+protected:
+  std::unique_ptr<ASTConsumer> CreateASTConsumer(CompilerInstance &CI,
+                                                 llvm::StringRef) override {
+    return std::make_unique<TypedefCheckConsumer>(CI);
+  }
+
+  bool ParseArgs(const CompilerInstance &CI,
+                 const std::vector<std::string> &args) override {
+      return true;
+  }
+
+  void PrintHelp(llvm::raw_ostream& ros) {
+    ros << "TypedefCheck plugin for PostgreSQL warns about suspicious "
+           "misuse of types defined via typedef.\n";
+  }
+
+};
+
+}
+
+static FrontendPluginRegistry::Add<TypedefCheckAction>
+X("typedef-check", "check typedefs");

base-commit: 74a2dfee2255a1bace9b0053d014c4efa2823f4d
-- 
2.32.0

Reply via email to