abhina.sreeskantharajan marked 8 inline comments as done.
abhina.sreeskantharajan added inline comments.


================
Comment at: clang/include/clang/Lex/LiteralSupport.h:244
   bool Pascal;
+  ConversionState TranslationState;
+
----------------
tahonermann wrote:
> abhina.sreeskantharajan wrote:
> > tahonermann wrote:
> > > Same concern here with respect to persisting the conversion state as a 
> > > data member.
> > If this member is removed in StringLiteralParser, we will need to pass the 
> > State to multiple functions in StringLiteralParser like init(). Would this 
> > solution be preferable to keeping a data member?
> I think so, yes.  Data members should be used to reflect the state of the 
> object, not as a convenient mechanism to avoid passing arguments.
Thanks, I've removed this member.


================
Comment at: clang/lib/Lex/LiteralSupport.cpp:1322-1323
+  TranslationState = translationState;
+  if (Kind == tok::wide_string_literal)
+    TranslationState = TranslateToSystemCharset;
+  else if (isUTFLiteral(Kind))
----------------
tahonermann wrote:
> abhina.sreeskantharajan wrote:
> > tahonermann wrote:
> > > Converting wide character literals to the system encoding doesn't seem 
> > > right to me.  For z/OS, this should presumably convert to the wide EBCDIC 
> > > encoding, but for all other supported platforms, the wide execution 
> > > character set is either UTF-16 or UTF-32 depending on the size of 
> > > `wchar_t` (which may be influenced by the `-fshort-wchar` option).
> > Since we don't implement -fwide-exec-charset yet, what do you think should 
> > be the default behaviour for the interim?
> Perhaps an Internal compiler error to indicate that appropriate support is 
> not yet in place?
Thanks for the suggestion. I've added assertions for wide character translation 
before we do any translation.


================
Comment at: clang/lib/Lex/LiteralSupport.cpp:1594-1595
+  ConversionState State = TranslationState;
+  if (Kind == tok::wide_string_literal)
+    State = TranslateToSystemCharset;
+  else if (isUTFLiteral(Kind))
----------------
tahonermann wrote:
> Converting wide string literals to the system encoding doesn't seem right to 
> me.  For z/OS, this should presumably convert to the wide EBCDIC encoding, 
> but for all other supported platforms, the wide execution character set is 
> either UTF-16 or UTF-32 depending on the size of `wchar_t` (which may be 
> influenced by the `-fshort-wchar` option).
I've now added an assertion when translating wide characters.


================
Comment at: clang/test/CodeGen/systemz-charset.cpp:1-25
+// RUN: %clang %s -std=c++17 -emit-llvm -S -target s390x-ibm-zos -o - | 
FileCheck %s
+
+const char *RawString = R"(Hello\n)";
+//CHECK: c"\C8\85\93\93\96\E0\95\00"
+
+char UnicodeChar8 = u8'1';
+//CHECK: i8 49
----------------
tahonermann wrote:
> This is good.  I suggest adding escape sequences and UCNs to validate that 
> they are not converted to IBM-1047.
Good idea, I added those testcases as per your suggestion.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93031/new/

https://reviews.llvm.org/D93031

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to