On Jun 29, 2012, at 9:31 AM, lina wrote:

> On Sat, Jun 30, 2012 at 12:21 AM, John W. Krahn <jwkr...@shaw.ca> wrote:
>> lina wrote:
>>> 
>> 
>> 
>> 
>> $ echo "0.35 1.32 3
>> 
>> 0.35 4.35 2
>> 0.36 0.36 1
>> 0.36 1.32 1
>> 0.36 1.45 1
>> 0.36 1.46 1" | perl -e'
>> 
>> my ( %columns, %data );
>> while ( <> ) {
>>    my ( $row, $col, $val ) = split;
>>    $data{ $row }{ $col } = $val;
>>    $columns{ $col } = 1;
> 
> This is smart.
> 
> But the output is weird for the real file (see the below link), not
> the sample file.
> 
> https://docs.google.com/open?id=0B_oe_t9o_2c3eFZsQ1NYMjJwZGc

That file (with 84393 records) has 722 unique values in column 1 (rows) and 474 
unique values in column 2 (columns), so you are trying to print a 722 x 474 
matrix with 342,228 entries. Sure it is going to look pretty weird if you print 
out the whole thing in one go. You need to break it up into sections, something 
like this, which prints out 10 columns at a time (note that the // (defined-or) 
operator is only available in Perl 5.10 and later):


#!/usr/local/bin/perl
use strict;
use warnings;

my ( %columns, %data );
my $dups = 0;
open( my $in, '<', "lina.dat") or die($!);
while ( <$in> ) {
   my ( $row, $col, $val ) = split;
   if( exists $data{$row}{$col} ) {
        $dups++;
   }
   $data{ $row }{ $col } = $val;
   $columns{ $col } = 1;
}
print scalar keys %columns, " columns found\n";
print scalar keys %data, " rows found\n";
print " in $. lines with $dups duplicate entries\n";
close($in) or warn($!);

my @columns = sort { $a <=> $b } keys %columns;
my @rows = sort { $a <=> $b } keys %data;

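my $row_batch = 20;    # not used below: all rows are printed for each column batch
my $col_batch = 10;    # number of columns printed per section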
for( my $start_col = 0; $start_col < @columns; $start_col += $col_batch ) {
        my $end_col = $start_col + $col_batch - 1;
        $end_col = $#columns if $end_col > $#columns;
        print "\n Columns:  ";
        printf("%5.2f ",$_) for @columns[$start_col .. $end_col];
        print "\n";
        for( my $row = 0; $row < @rows; $row++ ) {
                printf("    %5s: ", $rows[$row]);
                for( my $col = $start_col; $col <= $end_col; $col++) {
                        my $val = $data{$rows[$row]}{$columns[$col]} // '-';
                        printf("%5s ",$val);
                }
                print "\n";
        }
}
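
If your Perl is older than 5.10 and does not have the // operator, the same
"print a dash for a missing cell" behaviour can probably be had with an
explicit defined() test. The following is only a minimal sketch: the helper
name cell_or_dash and the tiny %data hash are made up for illustration and are
not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the script above): returns the cell value,
# or '-' when the cell is undefined. Intended to behave like  $val // '-'
# without needing the defined-or operator.
sub cell_or_dash {
    my ($val) = @_;
    return defined $val ? $val : '-';
}

# Tiny made-up data set, just to exercise the helper.
my %data = ( '0.35' => { '1.32' => 3, '4.35' => 2 } );

for my $col ( '1.32', '1.45', '4.35' ) {
    # exists() guards the inner lookup so nothing is created in %data
    my $val = exists $data{'0.35'}{$col} ? $data{'0.35'}{$col} : undef;
    printf( "%5s ", cell_or_dash($val) );
}
print "\n";
```

Using defined() rather than || matters here because a stored value of 0 is
false in Perl, so  $val || '-'  would wrongly turn a real 0 into a dash.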






