| Field | Value |
|---|---|
| Issue | 143968 |
| Summary | LLVM/Clang does not build on Windows 11 from git source if cloned under a directory that contains non-ASCII Unicode characters |
| Labels | clang |
| Assignees | |
| Reporter | juj |

I am building LLVM/Clang as part of the Emscripten WebAssembly toolchain.
In the Finnish language, the characters `ä` and `ö` are common in first and last names, so they can easily end up in directory paths.
I tested on a Windows 11 system with steps that are essentially:
```
mkdir "ä ö"
cd "ä ö"
git clone https://github.com/llvm/llvm-project.git
cd llvm-project
mkdir build
cd build
cmake -G "Visual Studio 17" -DCMAKE_BUILD_TYPE=Release -A x64 -Thost=x64 -DLLVM_TARGETS_TO_BUILD=WebAssembly;X86 -DLLVM_ENABLE_PROJECTS=clang;lld ..
cmake --build .
```
The build fails with errors like:
```
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/AST/CommentHTMLTags.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Basic/arm_cde.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Sema/../Basic/Attr.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Basic/arm_sve.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Basic/arm_mve.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Parse/../Basic/Attr.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Basic/arm_mve.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Sema/../Basic/Attr.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Basic/arm_cde.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/AST/CommentCommands.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Basic/arm_mve.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Basic/Attr.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Basic/Diagnostic.td': No mapping for the Unicode character exists in the target multi-byte code page.
The system cannot find the batch label specified - VCEnd
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/AST/../Basic/CommentNodes.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Basic/Diagnostic.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Basic/arm_sme.td': No mapping for the Unicode character exists in the target multi-byte code page.
The system cannot find the batch label specified - VCEnd
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Basic/arm_neon.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Basic/Diagnostic.td': No mapping for the Unicode character exists in the target multi-byte code page.
The system cannot find the batch label specified - VCEnd
The system cannot find the batch label specified - VCEnd
C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\Microsoft.CppCommon.targets(237,5): error MSB8066: Custom build for 'C:\ä ö\emsdk\llvm\git\build_main_vs2022_64\CMakeFiles\dcd65da04696a235ed2c63adf3f0bec7\arm_mve_builtin_sema.inc.rule' exited with code 1. [C:\ä ö\emsdk\llvm\git\build_main_vs2022_64\tools\clang\include\clang\Basic\ClangARMMveBuiltinSema.vcxproj]
C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\Microsoft.CppCommon.targets(237,5): error MSB8066: Custom build for 'C:\ä ö\emsdk\llvm\git\build_main_vs2022_64\CMakeFiles\23ddb6ff97ed6a4ff0554c67e584edbb\CommentHTMLTagsProperties.inc.rule' exited with code 1. [C:\ä ö\emsdk\llvm\git\build_main_vs2022_64\tools\clang\include\clang\AST\ClangCommentHTMLTagsProperties.vcxproj]
The system cannot find the batch label specified - VCEnd
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Basic/Attr.td': No mapping for the Unicode character exists in the target multi-byte code page.
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/Basic/arm_sme.td': No mapping for the Unicode character exists in the target multi-byte code page.
C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\Microsoft.CppCommon.targets(237,5): error MSB8066: Custom build for 'C:\ä ö\emsdk\llvm\git\build_main_vs2022_64\CMakeFiles\dcd65da04696a235ed2c63adf3f0bec7\arm_cde_builtin_aliases.inc.rule' exited with code 1. [C:\ä ö\emsdk\llvm\git\build_main_vs2022_64\tools\clang\include\clang\Basic\ClangARMCdeBuiltinAliases.vcxproj]
..\..\..\..\..\Release\bin\clang-tblgen.exe: Could not open input file 'C:/Σ ÷/emsdk/llvm/git/src/clang/include/clang/AST/CommentHTMLTags.td': No mapping for the Unicode character exists in the target multi-byte code page.
1>Generating export list for LLVM-C
Building Custom Rule C:/ä ö/emsdk/llvm/git/src/llvm/tools/lli/CMakeLists.txt
C:\├ñ ├╢\emsdk\llvm\git\build_main_vs2022_64\Release\bin\llvm-nm.exe : error : C:/├â┬ñ ├â┬╢/emsdk/llvm/git/build_main_vs2022_64/Release/lib/LLVMDemangle.lib: no such file or directory [C:\ä ö\emsdk\llvm\git\build_main_vs2022_64\tools\llvm-shlib\LLVM-C_exports.vcxproj]
Traceback (most recent call last):
  File "C:\Σ ÷\emsdk\llvm\git\src\llvm\tools\llvm-shlib\gen-msvc-exports.py", line 124, in <module>
    main()
  File "C:\Σ ÷\emsdk\llvm\git\src\llvm\tools\llvm-shlib\gen-msvc-exports.py", line 120, in main
    gen_llvm_c_export(ns.output, ns.underscore, libs, ns.nm)
  File "C:\Σ ÷\emsdk\llvm\git\src\llvm\tools\llvm-shlib\gen-msvc-exports.py", line 72, in gen_llvm_c_export
    check_call([nm, "-g", lib], stdout=dumpout_f)
  File "C:\python\Lib\subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['C:/Σ ÷/emsdk/llvm/git/build_main_vs2022_64/Release/bin/llvm-nm.exe', '-g', 'C:/├ñ ├╢/emsdk/llvm/git/build_main_vs2022_64/Release/lib/LLVMDemangle.lib']' returned non-zero exit status 1.
The system cannot find the batch label specified - VCEnd
C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\Microsoft.CppCommon.targets(237,5): error MSB8066: Custom build for 'C:\ä ö\emsdk\llvm\git\build_main_vs2022_64\CMakeFiles\c3aaee8fb191e6d43e345c3ec948c0ed\libllvm-c.exports.rule' exited with code 1. [C:\ä ö\emsdk\llvm\git\build_main_vs2022_64\tools\llvm-shlib\LLVM-C_exports.vcxproj]
```
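The three garbled spellings of `ä ö` in the log (`Σ ÷`, `├ñ ├╢` and `├â┬ñ ├â┬╢`) look like the same path passing through different code pages. The Python sketch below reproduces them under two assumptions that were not verified on the failing machine: that the console displays bytes using OEM code page 437, and that the ANSI code page is Windows-1252. It is only meant to help narrow down where the path is re-encoded, not to pinpoint the faulty component.

```python
# Reproduce the garbled path spellings seen in the build log.
# Assumptions (not confirmed on the failing system): the console renders
# bytes as code page 437, and the ANSI code page is Windows-1252.
path = "ä ö"

# ANSI (Windows-1252) bytes rendered by a CP437 console.
ansi_shown_as_oem = path.encode("cp1252").decode("cp437")

# UTF-8 bytes rendered by a CP437 console.
utf8_shown_as_oem = path.encode("utf-8").decode("cp437")

# UTF-8 bytes misread as Windows-1252, encoded to UTF-8 a second time,
# and then rendered by a CP437 console.
double_encoded = (
    path.encode("utf-8").decode("cp1252").encode("utf-8").decode("cp437")
)

print(ansi_shown_as_oem)
print(utf8_shown_as_oem)
print(double_encoded)
```

If these assumptions hold, the `Σ ÷` form corresponds to a path passed around as ANSI bytes, while the `├â┬ñ ├â┬╢` form suggests that a path which was already UTF-8 got converted to UTF-8 a second time somewhere between the Python script and `llvm-nm.exe`.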
I tried running `chcp 65001` to switch the command prompt to the UTF-8 code page before running `git clone`, but that had no effect.
(In the log output above, the test prefix path `ä ö` also contains a space. I verified that the space is not the issue: building in a subdirectory named `a b` worked fine.)
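One possible reason `chcp 65001` makes no difference is that it only changes the console code page, whereas the process-wide ANSI and OEM code pages are separate settings. The sketch below is a hypothetical diagnostic helper (the name `active_code_pages` is made up for illustration) that prints all four values through the Win32 functions `GetACP`, `GetOEMCP`, `GetConsoleCP` and `GetConsoleOutputCP`, so they can be compared before and after running `chcp`.

```python
import ctypes
import sys


def active_code_pages():
    """Return the ANSI, OEM and console code pages, or None off Windows."""
    if sys.platform != "win32":
        return None  # the Win32 calls below only exist on Windows
    kernel32 = ctypes.windll.kernel32
    return {
        "ansi": kernel32.GetACP(),             # used by the "A" Win32 APIs
        "oem": kernel32.GetOEMCP(),            # legacy console default
        "console_input": kernel32.GetConsoleCP(),
        "console_output": kernel32.GetConsoleOutputCP(),  # set by chcp
    }


if __name__ == "__main__":
    print(active_code_pages())
```

On a system where `chcp 65001` has been run, the console values would be expected to read 65001 while `ansi` and `oem` keep their system defaults.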