Hello Folks,

This is my first post here.

I am trying to emulate the Linux 'sort' command in Perl. I found the
following code on the Internet to sort a text file:

# cat sort.pl
my $column_number = 2; # Sorting by 3rd column since 0-origin based
my $prev = "";
for (
  map { $_->[0] }
  sort { $a->[1] cmp $b->[1] }
  map { [$_, (split)[$column_number]] }
  <>
) {
  print unless $_ eq $prev;
  $prev = $_;
}

Suppose I want to sort a text file with the following rows and
columns:

# cat test.out
jhvXgF    U13GWt    3OvMCf    VMkAWj
4ewejk    pFnjd4    ie0hZF    pPipQJ
4ewejk    4sqprx    ie0hZF    cqtexi
FT9mWp    d4fgMB    gvZRJU    XRRu0N
hnzI2c    GXAXWF    6xKH7A    3dLh18

When I sort it by the 3rd column using the 'sort' command, I get the
following output:

# sort -u -k 3 test.out
jhvXgF    U13GWt    3OvMCf    VMkAWj
hnzI2c    GXAXWF    6xKH7A    3dLh18
FT9mWp    d4fgMB    gvZRJU    XRRu0N
4ewejk    4sqprx    ie0hZF    cqtexi
4ewejk    pFnjd4    ie0hZF    pPipQJ

However, when I sort the same text file by the 3rd column using the Perl
code above, I get the following:
jhvXgF    U13GWt    3OvMCf    VMkAWj
hnzI2c    GXAXWF    6xKH7A    3dLh18
FT9mWp    d4fgMB    gvZRJU    XRRu0N
4ewejk    pFnjd4    ie0hZF    pPipQJ
4ewejk    4sqprx    ie0hZF    cqtexi

The difference is in the last two rows: their 2nd-column values ('4sqprx'
and 'pFnjd4') appear in the opposite order.

I think the reason is that 'ie0hZF' appears twice in the 3rd column and
the corresponding values in the 1st column are also the same ('4ewejk'),
so the discrepancy shows up in the 2nd column.
Can anybody help me fix the bug in the above code?

Also, as I am a beginner in Perl, I couldn't understand the code
completely, so once the bug is fixed, I would be grateful if someone
could explain it line by line.
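
One likely cause, offered as a sketch rather than a confirmed diagnosis:
'sort -k 3' uses a key that runs from the 3rd field to the end of the
line, whereas the script compares the 3rd field only. When two rows tie
on 'ie0hZF', 'sort' goes on to compare the 4th field ('cqtexi' before
'pPipQJ'), while the script leaves the tied rows in input order. The
version below builds its key from the chosen column through the last
column. The helper name sort_from_column and the inline @rows are made up
for illustration. It also assumes plain byte-wise comparison and
whitespace-separated fields, so it will not match 'sort' in every locale
or for every spacing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): sort lines by a
# key that runs from field $col (0-based) to the end of the line.
# The map/sort/map chain is read bottom-up.
sub sort_from_column {
    my ($col, @lines) = @_;
    return
        map  { $_->[0] }                  # 3. keep only the original line
        sort { $a->[1] cmp $b->[1] }      # 2. compare the keys as strings
        map  {
            my @fields = split ' ', $_;   # 1. split on runs of whitespace
            # Pair each line with its key: fields $col..last, re-joined.
            [ $_, join ' ', @fields[ $col .. $#fields ] ];
        } @lines;
}

# Sample rows copied from test.out in this post.
my @rows = (
    "jhvXgF    U13GWt    3OvMCf    VMkAWj\n",
    "4ewejk    pFnjd4    ie0hZF    pPipQJ\n",
    "4ewejk    4sqprx    ie0hZF    cqtexi\n",
    "FT9mWp    d4fgMB    gvZRJU    XRRu0N\n",
    "hnzI2c    GXAXWF    6xKH7A    3dLh18\n",
);

# Print the sorted lines, skipping a line identical to the previous one.
my $prev = '';
for my $line (sort_from_column(2, @rows)) {
    print $line unless $line eq $prev;
    $prev = $line;
}
```

To sort a file instead, the call could be sort_from_column(2, <>), which
reads every line from the files named on the command line. Note that
'sort -u' drops lines whose keys are equal, while the loop above only
drops whole lines that are identical, so the two can still differ on
other inputs.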

Cheers,
Parag
