Hmmm - http://search.cpan.org/~dankogai/Encode-2.39/lib/Encode/Guess.pm
It says right at the bottom that the method below won't work to guess the
encoding. :(

Cheers,
Parag

On Sun, Jan 3, 2010 at 10:23 PM, Parag Kalra <paragka...@gmail.com> wrote:

> Thanks a bunch Shlomi.
>
> Using your snippet, I am now able to create even a 1 GB file. Previously it
> was throwing an 'Out of Memory' message. :)
>
> OK, coming to the UTF discussion, will the following work:
>
> use Encode;
> my @all_encodings = Encode->encodings(":all");
> use Encode::Guess @all_encodings;
>
>
> while(<$sample_file_fh>){
>
>     # Encoding into utf data
>     $utf_internal = decode("Guess",$_);
>     $utf_data = encode("utf8", $utf_internal);
>
>     $data_string = $data_string.$utf_data;
> }
>
>
> And then the snippet suggested by Shlomi.
>
> Cheers,
> Parag
>
>
> On Sun, Jan 3, 2010 at 9:12 PM, Shlomi Fish <shlo...@iglu.org.il> wrote:
>
>> On Sunday 03 Jan 2010 16:25:09 Parag Kalra wrote:
>> > I am curious to know more on UTF and understand related issues that may
>> > creep into my algorithm. Could someone please shed some light on it.
>> >
>> > Can I use the following:
>> >
>> > use Encode;
>> >
>>
>> Make sure you add "use strict;" and "use warnings;".
>>
>> > while(<$sample_file_fh>){
>> >
>> >     # Encoding into utf data
>> >     $utf_data = encode("utf8", $_);
>> >     $data_string = $data_string.$utf_data;
>> > }
>> >
>> >
>> > # Checking the current length of the string
>> > while(length($data_string)<$total_size){
>> >     $data_string = $data_string.$data_string;
>> > }
>>
>> This snippet:
>>
>> 1. Will double the size of $data_string on each iteration (exponential
>> growth).
>>
>> 2. Will create a very large buffer in memory.
>>
>> 3. Can be better written as "$data_string .= $data_string;"
>>
>> A better snippet would be (untested):
>>
>> <<<<<<<<<<<<
>> {
>>     open my $out_fh, ">", $out_filename
>>         or die "Could not open $out_filename - $!";
>>
>>     my $length_so_far = 0;
>>
>>     while ($length_so_far < $total_size)
>>     {
>>         print {$out_fh} $data_string;
>>
>>         $length_so_far += length($data_string);
>>     }
>>
>>     close($out_fh);
>> }
>> >>>>>>>>>>>>
>>
>> Regards,
>>
>> Shlomi Fish
>>
>> --
>> -----------------------------------------------------------------
>> Shlomi Fish       http://www.shlomifish.org/
>> Funny Anti-Terrorism Story - http://shlom.in/enemy
>>
>> Bzr is slower than Subversion in combination with Sourceforge.
>>     ( By: http://dazjorz.com/ )
>>
>
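
Regarding the encoding question at the top of this thread: as far as I read
the Encode::Guess page, the trouble is handing it every known encoding as a
suspect. A rough sketch of what might work instead is to guess with only a
short suspect list and then stream the result out, as in Shlomi's loop. The
helper names to_utf8_bytes and fill_file, the sample data and the 1024-byte
target are my own assumptions for illustration, not anything from the module
or from the posts above.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Encode::Guess;    # exports guess_encoding(); default suspects are ascii and utf8
use File::Temp qw(tempfile);

# Hypothetical helper: guess the encoding of $bytes from a SHORT suspect
# list (added to the defaults) and return the text as UTF-8 encoded bytes.
sub to_utf8_bytes {
    my ($bytes, @suspects) = @_;
    my $decoder = guess_encoding($bytes, @suspects);

    # On failure guess_encoding() returns an error string, not an object.
    die "Cannot guess encoding: $decoder\n" unless ref $decoder;

    return encode('UTF-8', $decoder->decode($bytes));
}

# Hypothetical helper: write $chunk (a byte string) over and over until
# exactly $total_size bytes are in the file. Only one chunk is held in
# memory. Note: the final piece is cut at a byte boundary, so it may end
# in the middle of a multi-byte UTF-8 character.
sub fill_file {
    my ($out_filename, $chunk, $total_size) = @_;
    die "Empty chunk, nothing to write\n" unless length $chunk;

    open my $out_fh, '>:raw', $out_filename
        or die "Could not open $out_filename - $!";

    my $written = 0;
    while ($written < $total_size) {
        my $remaining = $total_size - $written;
        my $piece = length($chunk) <= $remaining
            ? $chunk
            : substr($chunk, 0, $remaining);
        print {$out_fh} $piece
            or die "Could not write to $out_filename - $!";
        $written += length $piece;
    }

    close($out_fh) or die "Could not close $out_filename - $!";
    return $written;
}

# Demo with made-up data: "cafe" with an e-acute, already UTF-8 encoded.
my $sample = "caf\xC3\xA9\n";
my $chunk  = to_utf8_bytes($sample);

my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
close $tmp_fh;

my $written = fill_file($tmp_name, $chunk, 1024);
print "Wrote $written bytes to $tmp_name\n";
```

If the input could be in some legacy encoding, it would go in as an extra
suspect, e.g. to_utf8_bytes($bytes, 'euc-jp'). With short input or several
similar suspects the guess can still come back ambiguous, which is why the
sketch dies with the message instead of carrying on.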