Chas. Owens wrote:
> On Wed, Jan 7, 2009 at 07:41, Anže Vidmar <anz...@gmail.com> wrote:
>> hello!
>>
>> I have some nasty, non-ascii character in some files that contain php code
>> (actually somewhere in my SVN branch). What I want to do here is to
>> recursively find all the files that contain a specific non-ascii character
>> in the file. And most importantly - i need to know the name of the files
>> containing it.
>>
>> So far, I found a script that looks into a file for non-ascii characters and
>> prints these characters in hex:
>>
>> while (<>) {
>>     s/([\x80-\xff])/sprintf "\\x{%02x}",ord($1)/eg;
>>     print;
>> }
>>
>> Ok, this is good, the non-ascii character (in hex) that I'm looking for is:
>>
>> x{ef}\\x{bb}\\x{bf}
>>
>> The problem here is that I can't get this script to run recursively and I
>> don't get the name of the file that actually contains these characters.
>>
>> I've tried with bash, but since it's standard output, I can't get any
>> result on this. Here is what I've tried:
>>
>> find |xargs /usr/local/bin/check_for_non-ascii_characters.sh |grep -l 'x{ef}\\x{bb}\\x{bf}'
>>
>> So, I need a way to recursively find non-ascii characters (a specific
>> pattern, mentioned before) in all files and I need the name of the files
>> containing it.
>>
>> It would be enough if I were only able to see what file contains this
>> character set.
>>
>> Thanks
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> use File::Find;
>
> File::Find::find(
>     sub {
>         return unless -f;
>         #refine further with a return unless /\.php$/ if desired
>         open my $fh, "<", $_
>             or die "could not open $_";
>         while (<$fh>) {
>             my $offset = 0;
>             for my $char (split //) {
>                 if (ord $char > 127) {
>                     printf "non-ascii char (%04x) in file %s on line %d position %d:\n%s\n",
>                         ord($char), $File::Find::name, $., $offset, $_;
>                 }
>                 $offset++;
>             }
>         }
>     },
>     @ARGV
> );
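
The bytes EF BB BF are the UTF-8 encoding of U+FEFF, the byte order mark. If only the names of the files containing that exact sequence are wanted, a narrower search than the per-character loop above may be enough. What follows is a minimal sketch, not a tested replacement: the helper name files_with_bom is made up for illustration, and it assumes the files are small enough to read into memory whole.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use File::Find;

# Hypothetical helper: return the paths of all plain files below the
# given directories that contain the bytes EF BB BF anywhere.
sub files_with_bom {
    my @dirs = @_;
    my @hits;
    find(
        sub {
            return unless -f;
            # Read raw bytes so that nothing is decoded before comparing.
            open my $fh, '<:raw', $_
                or do { warn "could not open $File::Find::name: $!\n"; return };
            local $/;    # slurp mode (assumes modest file sizes)
            my $data = <$fh>;
            close $fh;
            push @hits, $File::Find::name
                if defined $data && index($data, "\xEF\xBB\xBF") >= 0;
        },
        @dirs
    );
    return @hits;
}

# Print one matching file name per line; search "." if no directory is given.
print "$_\n" for files_with_bom(@ARGV ? @ARGV : '.');
```

Saved under a name of your choosing and run with one or more directories as arguments, this prints only the file names. The same "return unless /\.php$/" refinement mentioned in the quoted script would apply here too.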
File::Find exports find() by default. It is better either to use the imported name and call find() unqualified, or to prevent the import altogether with

    use File::Find ();

in the first place.

Rob
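
For illustration, the two consistent styles might look like the sketch below. The package names WithImport and WithoutImport and the count_files subs are placeholders invented for this example; only File::Find and its find() function come from the module itself.

```perl
use strict;
use warnings;

package WithImport;
use File::Find;       # default import: find() and finddepth() are now local names

# Count plain files below the given directories, calling find() unqualified.
sub count_files {
    my $n = 0;
    find( sub { $n++ if -f }, @_ );
    return $n;
}

package WithoutImport;
use File::Find ();    # empty import list: nothing is imported

# Same count, but the function has to be called by its full name.
sub count_files {
    my $n = 0;
    File::Find::find( sub { $n++ if -f }, @_ );
    return $n;
}

package main;

my @dirs = @ARGV ? @ARGV : ('.');
printf "with import: %d files, without import: %d files\n",
    WithImport::count_files(@dirs), WithoutImport::count_files(@dirs);
```

Both subs do the same work. The difference is only whether the name find is added to the calling package, which is why mixing "use File::Find;" with a fully qualified File::Find::find() call imports a name that is never used.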