================
@@ -51,28 +57,71 @@ class StringToOffsetTable {
     return II->second;
   }
 
-  // Emit the string using string literal concatenation, for better readability
-  // and searchability.
-  void EmitStringLiteralDef(raw_ostream &OS, const Twine &Decl,
-                            const Twine &Indent = "  ") const {
+  // Emit a string table definition with the provided name and indent.
+  //
+  // When possible, this uses string-literal concatenation to emit the string
+  // contents in a readable and searchable way. However, for (very) large 
string
+  // tables MSVC cannot reliably use string literals and so there we use a 
large
+  // character array. We still use a line oriented emission and add comments to
+  // provide searchability even in this case.
+  //
+  // The string table, and its input string contents, are always emitted as 
both
+  // `static` and `constexpr`. Both `Name` and (`Name` + "Storage") must be
+  // valid identifiers to declare.
+  void EmitStringTableDef(raw_ostream &OS, const Twine &Name,
+                          const Twine &Indent = "") const {
     OS << formatv(R"(
 #ifdef __GNUC__
 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Woverlength-strings"
 #endif
-{0}{1} = )",
-                  Indent, Decl);
+{0}static constexpr char {1}Storage[] = )",
+                  Indent, Name);
+
+    // MSVC silently miscompiles string literals longer than 64k in some
+    // circumstances. When the string table is longer, emit it as an array of
+    // character literals.
+    bool UseChars = AggregateString.size() > (64 * 1024);
----------------
s-barannikov wrote:

> I don't know the history of why we have two offset table emission systems.
> I can try to consolidate the divergence that his grown between these two if 
> necessary, but it's a pretty big undertaking.

`SequenceToOffsetTable` supports arbitrary sequences (e.g. register numbers). 
I'm not sure if consolidating them is possible or necessary.

> To me, it seems a simpler model to just unconditionally emit tables larger 
> than 64k using the awkward syntax rather than add more tablegen flags that 
> change behavior. This lets the vast majority of cases use the simpler form.

This is just two LoC change (declaring the option and using it in the `if`) 
that will avoid compile time penalty if the compiler is not MSVC and make the 
[large] generated tables readable/searchable. Not a big win probably, so LGTM 
as is.


https://github.com/llvm/llvm-project/pull/123548
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to