Thanks a bunch, Shlomi. Using your snippet I am now able to create even a 1 GB file. Previously it was throwing an 'Out of Memory' message. :)
Ok, coming to the UTF discussion, will the following work:

use Encode;

my @all_encodings = Encode->encodings(":all");
use Encode::Guess @all_encodings;

while(<$sample_file_fh>){
    # Encoding into utf data
    $utf_internal = decode("Guess", $_);
    $utf_data = encode("utf8", $utf_internal);
    $data_string = $data_string . $utf_data;
}

And then the snippet suggested by Shlomi.

Cheers,
Parag

On Sun, Jan 3, 2010 at 9:12 PM, Shlomi Fish <shlo...@iglu.org.il> wrote:
> On Sunday 03 Jan 2010 16:25:09 Parag Kalra wrote:
> > I am curious to know more on UTF and understand related issues that may
> > creep in my algorithm. Could someone please shed some light on it.
> >
> > Can I use following:
> >
> > use Encode;
> >
>
> Make sure you add "use strict;" and "use warnings;".
>
> > while(<$sample_file_fh>){
> >
> >     # Encoding into utf data
> >     $utf_data = encode("utf8", $_);
> >     $data_string = $data_string.$utf_data;
> > }
> >
> >
> > # Checking the current length of the string
> > while(length($data_string)<$total_size){
> >     $data_string = $data_string.$data_string;
> > }
>
> This snippet:
>
> 1. Will grow the size of $data_string twice each time (exponentially).
>
> 2. Will create a very large buffer in memory.
>
> 3. Can be better written as "$data_string .= $data_string;"
>
> A better snippet would be (untested):
>
> <<<<<<<<<<<<
> {
>     open my $out_fh, ">", $out_filename
>         or die "Could not open $out_filename - $!";
>
>     my $length_so_far = 0;
>
>     while ($length_so_far < $total_size)
>     {
>         print {$out_fh} $data_string;
>
>         $length_so_far += length($data_string);
>     }
>
>     close($out_fh);
> }
> >>>>>>>>>>>>
>
> Regards,
>
>     Shlomi Fish
>
> --
> -----------------------------------------------------------------
> Shlomi Fish       http://www.shlomifish.org/
> Funny Anti-Terrorism Story - http://shlom.in/enemy
>
> Bzr is slower than Subversion in combination with Sourceforge.
> ( By: http://dazjorz.com/ )
>
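P.S. For reference, here is a rough, untested sketch of how I picture the two pieces fitting together. I suspect "use Encode::Guess @all_encodings;" above would not do what I want anyway, since "use" runs at compile time, before @all_encodings is populated, and Encode::Guess seems to prefer a short, explicit list of suspect encodings. The suspect list, file names and target size below are just placeholders:

use strict;
use warnings;

use Encode qw(encode);
use Encode::Guess;    # guess_encoding() comes from here

# Placeholder values - adjust as needed.
my @suspects        = qw(latin1 cp1252);   # short, explicit suspect list
my $sample_filename = 'sample.txt';
my $out_filename    = 'big_file.txt';
my $total_size      = 1024 * 1024 * 1024;  # 1 GB

# Build the seed string: decode each sample line from its guessed
# encoding and re-encode it as UTF-8 octets.
my $data_string = '';
open my $sample_file_fh, '<', $sample_filename
    or die "Could not open $sample_filename - $!";
while (my $line = <$sample_file_fh>) {
    my $enc = guess_encoding($line, @suspects);
    ref($enc) or die "Cannot guess encoding: $enc";
    $data_string .= encode('utf8', $enc->decode($line));
}
close($sample_file_fh);

# Then write the seed string repeatedly until the target size is
# reached, as in Shlomi's snippet, instead of doubling a huge buffer
# in memory.
open my $out_fh, '>', $out_filename
    or die "Could not open $out_filename - $!";
my $length_so_far = 0;
while ($length_so_far < $total_size) {
    print {$out_fh} $data_string;
    $length_so_far += length($data_string);
}
close($out_fh);

Of course, guessing line by line may give ambiguous or inconsistent answers; if the sample file's encoding is actually known in advance, a plain decode with that encoding would be simpler.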