You're right, and if you need to deal with Unicode in file names, you can do
something like this:
 
use Win32::OLE qw(in CP_UTF8);
use Win32::OLE::Const;
 
Win32::OLE->Option(CP=>CP_UTF8);
 
use Unicode::String qw/utf8/;
 
my $oshell = Win32::OLE->new('Shell.Application') or die "$@";
my $f = $oshell->NameSpace(Win32::GetCwd());
print "[$f]";
my $fi = $f->Items;
print $fi->Count;
print "\n";
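for (0 .. $fi->Count-1) {
  my $item = $fi->Item($_);
  my $name = $item->Name;
  # hex dump of the name, one "U+XXXX" per character
  my $u=utf8($name);
  my $s = $u->hex;
  # turn U+00xx back into the plain character when it is safe in a file name
  $s=~s/U\+00(\w\w)/my($r,$p)=((pack 'H*',$1),$&);if($r=~m(^[()\w .;\-+!]$)){$r}else{$p}/eg;
  # put the remaining code points in parentheses
  $s=~s/(U\+[\da-f][\da-f][\da-f][\da-f])/($1)/ig;
  # rename only if the name has characters above U+00FF
  my $ren=0;
  $ren=1 if $s=~/U\+(?!00)/;
  # drop the separating spaces and the "+" signs
  $s=~s/[ +]//g;
  print "$ren|$s\n";
  if($ren){$item->{Name}=$s}
}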

But maybe using the wide functions is simpler and better.
 
I am sure the people at www.perlmonks.org <http://www.perlmonks.org> know how
to deal with this and will happily provide advice.
 
Best regards,
Vadim.

-----Original Message-----
From: D D Allen [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 26, 2006 4:05 PM
To: Konovalov, Vadim
Cc: libwin32@perl.org
Subject: RE: Win32 API, Directories with Unicode / Wide Filenames,
FindFirstFileW, FindNextFileW



Thank you for the response.   I tried that but it doesn't work... because
the actual filename returned by the builtin functions, using the example, is
"??????_HostID_2006-01-19_213218.xls".   It's not just a STDOUT display
issue where the leading UTF8 characters are being displayed as question mark
characters.  The builtin directory functions actually return the question
mark characters.   

In the workaround solution suggested by Jan Dubois using Win32::OLE and
Scripting.FileSystemObject, I do enable Unicode / UTF8 binding, which then
returns filenames with the UTF8 characters -- which can be passed to Excel,
which finds and opens the file.  But the builtin directory functions return
question mark characters in place of the UTF8 characters.  And passing
filenames with question mark characters to Excel does no good -- even with
Win32::OLE UTF8 bindings enabled.
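
For reference, a minimal sketch of that kind of workaround, under stated
assumptions: the sub names (matching_names, fso_file_names) and the pattern
are made up for illustration and are not Jan's actual code.  It uses the same
CP_UTF8 option shown elsewhere in this thread and only touches OLE when run
on Windows.

```perl
use strict;
use warnings;

# Keep only the names that match a pattern (hypothetical helper).
sub matching_names {
    my ($pattern, @names) = @_;
    return grep { $_ =~ $pattern } @names;
}

# List the file names in $dir through Scripting.FileSystemObject.
# Windows only: Win32::OLE is loaded at run time so the rest of the
# script still compiles on other systems.
sub fso_file_names {
    my ($dir) = @_;
    require Win32::OLE;
    Win32::OLE->import(qw(in CP_UTF8));
    Win32::OLE->Option(CP => CP_UTF8());
    my $fso = Win32::OLE->new('Scripting.FileSystemObject')
        or die "Cannot create Scripting.FileSystemObject\n";
    my $folder = $fso->GetFolder($dir);
    return map { $_->Name } in($folder->Files);
}

if ($^O eq 'MSWin32') {
    # The pattern is illustrative; adjust it to the real file specification.
    print "$_\n" for matching_names(qr/_HostID_.*\.xls$/, fso_file_names('.'));
}
```

The names that come back this way can be matched against a pattern and then
handed on to Excel through OLE, which is the part the builtin functions
cannot do.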

At one point, it looks like you could set a perl variable that enabled
"wide" Win32 API directory calls in the Perl source.  For example, Perl's
"win32.c" source has calls to a function "USING_WIDE()" that if it returned
true, would have called the wide versions of the Win32 API FindFirstFile,
FindNextFile calls -- which should have returned filenames in UTF8 /
UTF-16LE format.   But according to the following, this USING_WIDE
functionality had been turned off for some time in the Perl code and has now
been stripped out. 

http://dev.perl.org/perl5/list-summaries/2005/20051107.html 

This seems to guarantee that the Win32 Perl 5.8.+ source calls the ANSI
versions of the Win32 API directory functions -- which explains why the
builtin directory functions (e.g., readdir) return filenames with question
mark characters in place of UTF8 / UTF-16LE characters. 
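
The effect can be imitated in plain Perl, as a rough illustration only.  The
sketch below assumes code page cp1252 and a made-up file name (six katakana
characters plus an ASCII tail), and it uses the Encode module rather than the
real Win32 conversion, which may map some characters differently.  The sub
name to_ansi is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Convert a wide (Unicode) name to a single-byte code page, roughly what an
# ANSI-only API forces.  Characters with no mapping in the code page become
# "?".  cp1252 is an assumption; the real ANSI code page depends on the
# system locale.
sub to_ansi {
    my ($wide_name, $code_page) = @_;
    $code_page ||= 'cp1252';
    return encode($code_page, $wide_name);
}

# Hypothetical name: six katakana characters plus an ASCII tail.
my $wide_name = "\x{30C6}\x{30B9}\x{30C8}\x{30D5}\x{30A1}\x{30A4}"
              . "_HostID_2006-01-19_213218.xls";

print to_ansi($wide_name), "\n";   # ??????_HostID_2006-01-19_213218.xls
```

Once the name has been through this conversion, the original characters are
gone, so no later decoding step can recover a name that stat or Excel would
accept.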

Regards,

... Dewey




"Konovalov, Vadim" <[EMAIL PROTECTED]> 


01/26/2006 04:25 AM

To: D D Allen/Fairfax/[EMAIL PROTECTED], libwin32@perl.org
cc:
Subject: RE: Win32 API, Directories with Unicode / Wide Filenames,
FindFirstFileW, FindNextFileW

        




> I have a Win32 Perl coding challenge to search a directory (WinXP, NTFS)
> with a file specification pattern where the directory may contain files
> with unicode / wide filenames in addition to ANSI filenames.  Through
> trial, error, and searches against Perl mailing list archives, it seems
> apparent that Win32 Perl's builtin directory functions do not support /
> return Win32 unicode / wide filenames.  More specifically, the builtin
> functions return filenames like "??????_HostID_2006-01-19_213218.xls"
> when the filename contains unicode / wide characters (the same as the
> DOS "dir" command).  The problem is that you can't pass these filenames
> to something like stat or via OLE to ask Excel to open it.

You need to activate Unicode in the OLE binding:


use Win32::OLE qw(in CP_UTF8);
Win32::OLE->Option(CP=>CP_UTF8);



