zturner requested changes to this revision.
zturner added a comment.
This revision now requires changes to proceed.

Overall I like the idea and I'm happy to get Unicode paths and strings working 
properly.  I'll look at the rest in more detail later, but I have some starting 
comments for now.

One thing I'm I'm not real excited about is how ugly this makes the code, but 
then again there's not a great way to handle this in an elegant fashion.  But I 
do have one suggestion.  Suppose we had a class like this:

  class WinUtfString
  {
  public:
      explicit WinUtfString(const char *utf8)
          : m_utf8(utf8), m_utf16(nullptr)
      { }
      explicit WinUtfString(const UTF16 *utf16)
          : m_utf8(nullptr), m_utf16(utf16)
      { }
      // Probably would want to add more useful constructors here that take 
StringRefs, std::strings, ConstStrings, and whatever else.
  
      const char *GetUtf8()
      {
          if (!m_utf8)
              ConvertToUtf8();
          return m_utf8;
      }
      const UTF16 *GetUtf16()
      {
          if (!m_utf16)
              ConvertToUtf16();
          return m_utf16;
      }
  
  private:
      void ConvertToUtf8();  // initialize m_utf8 from m_utf16
      void ConvertToUtf16();  // initialize m_utf16 from m_utf8
  
      llvm::SmallVector<UTF8, 128> m_utf8;
      llvm::SmallVector<UTF16, 128> m_utf16;
  };

The point of all this?  Much of the code becomes cleaner and easier to read.  
For example, this:

  llvm::SmallVector<UTF16, 128> srcBuffer, dstBuffer;
  if (!llvm::convertUTF8ToUTF16String(src.GetCString(), srcBuffer))
      error.SetError(1, lldb::eErrorTypeGeneric);
  else if (!llvm::convertUTF8ToUTF16String(dst.GetCString(), dstBuffer))
      error.SetError(1, lldb::eErrorTypeGeneric);
  else if (!::CreateHardLinkW((LPCWSTR)srcBuffer.data(), 
(LPCWSTR)dstBuffer.data(), nullptr))
      error.SetError(::GetLastError(), lldb::eErrorTypeWin32);

Becomes this:

  WinUtfString utfSrc(src.GetCString());
  WinUtfString utfDest(dest.GetCString());
  if (!CreateHardLinkW(utfSrc.GetUtf16(), utfDest.GetUtf16(), nullptr))
      error.SetError(::GetLastError(), lldb::eErrorTypeWin32);

and this:

  llvm::SmallVector<UTF16, 128> wexecutable;
  llvm::convertUTF8ToUTF16String(executable, wexecutable);
  
  launch_info.GetArguments().GetQuotedCommandString(commandLine);
  llvm::SmallVector<UTF16, 64> wcommandLine;
  llvm::convertUTF8ToUTF16String(commandLine, wcommandLine);
  
  llvm::SmallVector<UTF16, 128> wworkingDirectory;
  
llvm::convertUTF8ToUTF16String(launch_info.GetWorkingDirectory().GetCString(), 
wworkingDirectory);
  
  BOOL result = ::CreateProcessW((LPCWSTR)wexecutable.data(), 
(LPWSTR)wcommandLine.data(), NULL, NULL, TRUE, flags,
                                 env_block, (LPCWSTR)wworkingDirectory.data(), 
&startupinfo, &pi);

becomes this:

  launch_info.GetArguments().GetQuotedCommandString(commandLine);
  WinUtfString utfExecutable(executable);
  WinUtfString utfCommandLine(commandLine);
  WinUtfString 
utfWorkingDirectory(launch_info.GetWorkingDirectory().GetCString());
  BOOL result = ::CreateProcessW(utfExecutable.GetUtf16(), 
utfCommandLine.GetUtf16(), NULL, NULL, TRUE, flags,
                                 env_block, utfWorkingDirectory.GetUtf16(), 
&startupinfo, &pi);

etc.

I know this adds more work for you, but this is the type of thing that if it's 
not done right hte first time, it will stagnate and be gross forever.  So we 
should think about how to do this in the least-bad way possible.


================
Comment at: cfe/trunk/tools/libclang/CIndexer.cpp:57
@@ -55,3 +56,3 @@
   MEMORY_BASIC_INFORMATION mbi;
-  char path[MAX_PATH];
+  wchar_t wpath[MAX_PATH];
   VirtualQuery((void *)(uintptr_t)clang_createTranslationUnit, &mbi,
----------------
You'll probably need to submit this as a separate patch to cfe-commits, since 
it goes in a separate repo.  You can then commit them independently.

================
Comment at: lldb/trunk/source/Commands/CommandCompletions.cpp:171
@@ -168,1 +170,3 @@
         {
+#ifdef _WIN32
+            llvm::SmallVector<UTF16, 128> wpartial_name_copy;
----------------
I'm not fond of this approach of sprinkling #ifdefs around in every location 
where we call into a file system API.  We have `lldb/Host/FileSystem`, probably 
some of these methods should be moved over there, and we should provide a 
windows implementation and a non-windows implementation.  This way in places 
like this, you simply write:

    isa_directory = FileSystem::GetPermissions(partial_name_copy) & 
FileSystem::eDirectory);

and all this ugliness is hidden.

================
Comment at: lldb/trunk/source/Core/ConnectionSharedMemory.cpp:138
@@ -134,10 +137,3 @@
 #ifdef _WIN32
-    HANDLE handle;
-    if (create) {
-        handle = CreateFileMapping(
-            INVALID_HANDLE_VALUE,
-            NULL,
-            PAGE_READWRITE,
-            llvm::Hi_32(size),
-            llvm::Lo_32(size),
-            name);
+    HANDLE handle = INVALID_HANDLE_VALUE;
+    llvm::SmallVector<UTF16, 128> wname;
----------------
Same thing applies here, we should probably have a `Host/SharedMemory.h` that 
exposes a `Create()` method.  Although, in this particular case the `#ifdef` 
was already here so I can't entirely complain, but it would be nice to improve 
the code as we go and get ifdefs like this out of generic code.

================
Comment at: lldb/trunk/source/Core/Disassembler.cpp:881
@@ -878,3 +880,3 @@
     }
-        
-    FILE *test_file = fopen (file_name, "r");
+    FILE *test_file = nullptr;
+#if _WIN32
----------------
Would LLDB's `File` class work here instead of using raw APIs?  I see you fixed 
the code in `lldb_private::File`, so perhaps simply using that class here 
instead of a `FILE*` would allow you to reuse the fix from that class.

================
Comment at: lldb/trunk/source/Host/common/File.cpp:353
@@ +352,3 @@
+        int stat_result;
+#ifdef _WIN32
+        llvm::SmallVector<UTF16, 128> wpath;
----------------
Ahh, here's that method I was looking for earlier when I mentioend 
`FileSystem`.  So I guess you don't have to implement it, it's already here.  
So you could just use this method earlier in the part that gets file 
permissions by manually calling stat.

================
Comment at: lldb/trunk/source/Host/common/FileSpec.cpp:108
@@ +107,3 @@
+        if (stat_result == 0)
+            *stats_ptr = *reinterpret_cast<struct stat *>(&file_stats);
+        return stat_result == 0;
----------------
I don't think this line is correct.  The source and destination struct types do 
not always have the same layout, so I think you may need to copy the values out 
one by one.

================
Comment at: lldb/trunk/source/Host/common/FileSpec.cpp:237
@@ +236,3 @@
+    int stat_result;
+#ifdef _WIN32
+    llvm::SmallVector<UTF16, 128> wpath;
----------------
Maybe you can just call `GetStats()` here

================
Comment at: lldb/trunk/source/Host/common/FileSpec.cpp:829
@@ -799,2 +828,3 @@
 #ifdef _WIN32
-    auto attrs = ::GetFileAttributes (resolved_path);
+    llvm::SmallVector<UTF16, 128> wpath;
+    if (!llvm::convertUTF8ToUTF16String (resolved_path, wpath))
----------------
I think a `FileSystem::IsSymbolicLink()` here would be better, and then have 
`File` call `FileSpec::IsSymbolicLink()`

There's currently some confusion and overlap about what goes in `FileSpec` and 
what goes in `FileSystem`, but my general opinion is that wherever possible, 
anything that queries the filesystem should be a static method in 
`Host/FileSystem`, and any other classes who want to do the same thing can call 
the methods in `FileSystem`.  This way you don't have to, for example, 
construct a `FileSpec` from a string just to check if something is a symbolic 
link since you could just call the low level method directly.

================
Comment at: lldb/trunk/source/Host/common/FileSpec.cpp:1173-1180
@@ -1134,4 +1172,10 @@
 
-            char child_path[MAX_PATH];
-            const int child_path_len = ::snprintf (child_path, 
sizeof(child_path), "%s\\%s", dir_path, ffd.cFileName);
+            std::string fileName;
+            if (!llvm::convertUTF16ToUTF8String((const UTF16*)ffd.cFileName, 
fileName))
+            {
+                continue;
+            }
+
+            char child_path[PATH_MAX];
+            const int child_path_len = ::snprintf (child_path, 
sizeof(child_path), "%s\\%s", dir_path, fileName.c_str());
             if (child_path_len < (int)(sizeof(child_path) - 1))
----------------
sprintf'ing parent and child paths together to concatenate them doesn't seem 
like the best idea.  Can we use `llvm::sys::path::append()` here instead?


Repository:
  rL LLVM

http://reviews.llvm.org/D17107



_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to