The easiest fix would be to use a perl built against the W (wide) string functions rather than the A (ANSI) ones that ActivePerl uses, and then to recompile libwin32.
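Short of rebuilding anything, the wide buffer can at least be taken apart by hand once you have the raw bytes. Below is a minimal sketch in plain Perl (core Encode only), not something I have run against the real API. The helper name wide_buffer_to_string and the sample file name are made up for illustration, and the buffer is built with pack as a stand-in for whatever FindFirstFileW would fill in. The template assumes the documented WIN32_FIND_DATAW layout: cFileName is WCHAR[260] (520 bytes) and cAlternateFileName is WCHAR[14] (28 bytes), 592 bytes in total.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: decode a fixed-size WCHAR buffer (UTF-16LE bytes),
# stopping at the first 16-bit NUL rather than at the first NUL byte.
sub wide_buffer_to_string {
    my ($bytes) = @_;
    my @kept;
    for my $unit (unpack 'v*', $bytes) {    # 16-bit little-endian code units
        last if $unit == 0;                 # 16-bit terminator
        push @kept, $unit;
    }
    return decode('UTF-16LE', pack('v*', @kept));
}

# WIN32_FIND_DATAW as a pack/unpack template:
# DWORD attributes, 3 x FILETIME (2 DWORDs each), 4 DWORDs,
# WCHAR[260] (520 bytes), WCHAR[14] (28 bytes).
my $TEMPLATE = 'V V2 V2 V2 V V V V a520 a28';

# Stand-in for a buffer filled in by the API; the file name is invented.
my $name = "\x{65E5}\x{672C}_HostID.xls";
my $raw  = pack $TEMPLATE, 0x20, (0) x 6, 0, 1024, 0, 0,
                encode('UTF-16LE', $name), '';

my @fields   = unpack $TEMPLATE, $raw;
my $filename = wide_buffer_to_string($fields[11]);   # field 11 is cFileName

binmode STDOUT, ':encoding(UTF-8)';
printf "struct size: %d bytes\n", length $raw;
printf "first NUL byte at offset %d\n", index($fields[11], "\0");
printf "decoded: %s (%d characters)\n", $filename, length $filename;
```

The offset of the first NUL byte shows where anything that treats the buffer as a plain C string would cut it off, which matches the truncated names described below. This does not fix Win32::API::Struct itself; it only shows that the bytes are recoverable if the whole 520-byte field reaches Perl intact.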
I'm also fighting with the same issue in Win32::GUI (Japanese and proper umlauts) with UCS-2 encoded strings.

2006/1/25, Dewey Allen <[EMAIL PROTECTED]>:
> I previously posted the following to the perl-win32-users list (early
> yesterday). This has a few updates.
>
> I have a Win32 Perl coding challenge to search a directory (WinXP, NTFS)
> with a file specification pattern where the directory may contain files
> with unicode / wide filenames in addition to ANSI filenames. Through
> trial, error, and searches against Perl mailing list archives, it seems
> apparent that Win32 Perl's builtin directory functions do not support /
> return Win32 unicode / wide filenames. More specifically, the builtin
> functions return filenames like "??????_HostID_2006-01-19_213218.xls"
> when the filename contains unicode / wide characters (the same as the
> DOS "dir" command). The problem is that you can't pass these filenames
> to something like stat or via OLE to ask Excel to open it.
>
> I saw other postings to various Perl lists that referenced the Win32 API
> directory search functions FindFirstFile, FindNextFile, and FindClose.
> I also noticed that these functions are not packaged in
> "Win32API::File" -- the title of which is "Low-level access to Win32
> system API calls for files/dirs" (hmm, why are these low level Win32 API
> directory functions not included in Win32API::File?).
>
> I was able to code a test script that used the ANSI versions of these
> Win32API calls but it produced the same results as Perl's builtin
> directory functions: it returned filenames like
> "??????_HostID_2006-01-19_213218.xls" when the filename contains unicode
> / wide characters. Which is to be expected.
>
> I then found a posting from Jan Dubois (9 Dec 2005, perl5-porters) that
> suggested the solution to this problem was to use Win32::OLE and the
> Scripting.FileSystemObject.
> I was able to successfully implement this
> method with one shortcoming: the Scripting.FileSystemObject does not
> support directory searches -- only directory listings (as best I can
> tell). THANK YOU JAN DUBOIS!
>
> Being stubborn and curious, I went back to fiddling with the Win32 API
> directory search functions to see if I could make the "wide" version of
> these calls work (starting with example code posted to various Perl
> lists by others -- whom I thank). I think I'm close to getting these
> functions to work... but I'm a novice at implementing Win32 API
> functions in Perl -- and using Perl's Unicode functions.
>
> The following example script runs (and sometimes crashes) but for
> filenames with only ANSI characters, $FileInfo->{cFileName} seems to
> contain only the first character. And for filenames that start with
> unicode / wide characters, $FileInfo->{cFileName} seems to contain
> only the leading unicode / wide characters. In the data dump output, the
> buffer seems to show the full 16-bit unicode file name (e.g.,
> "t^!ki<eQR_ H o s t I D _ 2 0 0 6 - 0 1 - 1 9 _ 2 1 4 3 5 8 . x l s").
> I suspect that the spaces between the ANSI characters are null
> (\0) characters. And I suspect that the Perl Win32 interface layer
> treats these as null terminated C strings -- as opposed to 16 bit
> unicode characters -- and therefore terminates the string at the first
> null byte it encounters.
>
> Various Perl unicode documents indicate that the Win32 API unicode
> format is UTF-16LE. But decoding "$FileInfo->{cFileName}" using
> UTF-16LE doesn't seem to work any way that I've tried it.
>
> I'm also not sure of the proper array dimension for cFileName (and
> cAlternateFileName) in the WIN32_FIND_DATAW struct. In the ANSI version
> of this structure, cFileName is a TCHAR of dimension 260 (MAX_PATH)
> where TCHAR is a single byte (according to Win32::API::Type->sizeof).
> In the WIDE version (WIN32_FIND_DATAW), it's a WCHAR of the same dimension
> -- but WCHAR is 2 bytes. When I make cFileName and cAlternateFileName
> TCHARs of dimension 260 and 14 respectively, Perl crashes. And it also
> crashes when I make them WCHARs of the same dimension. Only when I
> double the dimensions to 520 and 28 does the script run without crashing
> - using either TCHAR or WCHAR.
>
> Any ideas on how to make these functions work correctly would be greatly
> appreciated. And if I'm missing something obvious or doing something
> dumb, please don't hesitate to point that out :-).
>
> Regards,
>
> ... Dewey
>
>
>
> use strict;
> use Win32::API;
> use Data::Dumper; $Data::Dumper::Indent=1; $Data::Dumper::Sortkeys=1;
> use Encode qw(encode decode);
> use Unicode::String;
> use Devel::Peek;
> use English;
>
> $OUTPUT_AUTOFLUSH=1;
>
> $Win32::API::DEBUG = 0;
>
> binmode(STDOUT, ":utf8");
>
>
> use constant ERROR_NO_MORE_FILES => 18;
> use constant INVALID_HANDLE_VALUE => -1;
>
> print "tchar is known: ", Win32::API::Type->is_known("TCHAR"), "\n";
> print "wchar is known: ", Win32::API::Type->is_known("WCHAR"), "\n";
> print "sizeof tchar is: ", Win32::API::Type->sizeof("TCHAR"), "\n";
> print "sizeof wchar is: ", Win32::API::Type->sizeof("WCHAR"), "\n";
>
>
>
> Win32::API::Struct-> typedef('FILETIME', qw(
>   DWORD dwLowDateTime;
>   DWORD dwHighDateTime;
> )); # 8 bytes
>
> use constant FILE_ATTRIBUTE_READONLY => 0x00000001;
> use constant FILE_ATTRIBUTE_HIDDEN => 0x00000002;
> use constant FILE_ATTRIBUTE_SYSTEM => 0x00000004;
> use constant FILE_ATTRIBUTE_DIRECTORY => 0x00000010;
> use constant FILE_ATTRIBUTE_ARCHIVE => 0x00000020;
> use constant FILE_ATTRIBUTE_NORMAL => 0x00000080;
> use constant FILE_ATTRIBUTE_TEMPORARY => 0x00000100;
> use constant FILE_ATTRIBUTE_COMPRESSED => 0x00000800;
> use constant MAX_PATH => 260;
>
> Win32::API::Struct-> typedef('WIN32_FIND_DATAW', qw(
>   DWORD dwFileAttributes;
>   FILETIME ftCreationTime;
>   FILETIME ftLastAccessTime;
>   FILETIME ftLastWriteTime;
>   DWORD nFileSizeHigh;
>   DWORD nFileSizeLow;
>   DWORD dwReserved0;
>   DWORD dwReserved1;
>   WCHAR cFileName[520];
>   WCHAR cAlternateFileName[28];
> ));
>
>
> my $FindFirstFile = Win32::API->new('kernel32.dll', 'FindFirstFileW',
>     'PS', 'N') or die "FindFirstFile: $^E";
> my $FindNextFile = Win32::API->new('kernel32.dll', 'FindNextFileW',
>     'NS', 'I') or die "FindNextFile $^E";
> my $FindClose = Win32::API->new('kernel32.dll', 'FindClose', 'N',
>     'I') or die "FindClose $^E";
>
>
> my $FileSpec = "//?/C:/My Documents/Tool/*.xls\0";
>
> my $FileInfo = Win32::API::Struct-> new('WIN32_FIND_DATAW');
> #print Data::Dumper-> Dump([$FileSpec, $FileInfo], [qw($FileSpec $FileInfo)]);
>
> my $uFileSpec = Unicode::String->new;
> $uFileSpec->utf8($FileSpec);
> print "FileSpec = ", $uFileSpec->as_string, "\n";
>
> my $handle = $FindFirstFile-> Call($uFileSpec->utf16le, $FileInfo);
> #my $handle = $FindFirstFile-> Call(encode("UTF-16LE", $FileSpec), $FileInfo);
>
> if ($handle == INVALID_HANDLE_VALUE) {
>   printf "Error is %d - %s\n", Win32::GetLastError (),
>     Win32::FormatMessage (Win32::GetLastError ());
>   exit(1);
> } else {
>   print "FindFirstFile worked\n";
>
>   Dump $FileInfo->{cFileName};
>   #print Data::Dumper-> Dump([$FileInfo], [qw($FileInfo)]);
>
>   my $ufn = Unicode::String->new;
>   $ufn->utf16le($FileInfo->{cFileName});
>
>   print "first filename = ", $ufn->as_string, "\n";
>   print "first filename = '", $FileInfo->{cFileName}, "'\n";
>   #print "first filename = ", decode("UTF-16LE", $FileInfo->{cFileName}), "\n";
>   while (my $result = $FindNextFile->Call($handle,$FileInfo)) {
>     Dump $FileInfo->{cFileName};
>     $ufn->utf16le($FileInfo->{cFileName});
>     print "next filename = ", $ufn->as_string, "\n";
>     print "next filename = '", $FileInfo->{cFileName}, "'\n";
>     #print "next filename = ", decode("UTF-16LE", $FileInfo->{cFileName}), "\n";
>     #print Data::Dumper-> Dump([$FileInfo], [qw($FileInfo)]);
>
>   }
> }
>
> $FindClose->Call($handle) or die "FindClose $^E";

--
Reini Urban
http://phpwiki.org/ http://spacemovie.mur.at/