Hi,
On 13.02.21 21:28, Leonard Janis Robert König wrote:
On Sat, 2021-02-13 at 21:15 +0100, Erik Auerswald wrote:
On 13.02.21 19:29, Leonard Janis Robert König wrote:
[...]
That being said, I don't see this exact distinction reflected in the
code, so perhaps I just misunderstood.
What is missing is disabling "Tabification" only when "-s" is active.
That omission resulted in the 2007 bug. Applying the needed special
treatment unconditionally fixed the 2007 bug, but broke your use case.
That some special treatment is needed and intended can be gleaned
from the following comment (with line numbers from pr.c in the
current master branch @ 2de30c7350a77b091afa1eb284acdf082c0f6aa5):
1031 /* It's rather pointless to define a TAB separator with column
1032 alignment */
The code after that comment does not disable alignment, but changes
the separator from a TAB to a space.
My patch adds the special treatment, since it works both for the 2007
bug and this bug (bug#46422).
The attached version 4 of my patch does that in a way that more
clearly shows the intent. I think this is a better fix for the
2007 bug than commit 553d347d3e08e00ee4f9df520b37c964c3f26e28.
Expanding TABs on input is enabled unless a single TAB is
used as column separator. This conforms better to POSIX and
does not introduce the regression that causes the current bug
(bug#46422).
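
To make the condition concrete, here is a minimal stand-alone sketch of
the check. The helper name should_untabify_input is made up for
illustration and does not exist in pr.c; it only models the test added
in the hunk below, not the rest of init_parameters or any other option
that may enable TAB expansion.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of pr.c): decide whether TABs in the
   input should be expanded, given the column separator in use.
   Expansion is skipped only when the separator is exactly one TAB.  */
bool
should_untabify_input (const char *sep, size_t sep_len)
{
  bool single_tab_separator = (sep_len == 1 && sep[0] == '\t');
  return !single_tab_separator;
}
```

With this sketch, should_untabify_input ("\t", 1) is false, while
should_untabify_input (" ", 1) and should_untabify_input ("\t\t", 2)
are both true.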
I have added more test cases, because manual testing showed that
the options "-s" and "-s$'\t'" were treated differently by pr.
Using "-s" to activate the default TAB separator should result
in the same output as using "-s$'\t'" to specify one TAB character
as separator, i.e., the default, explicitly.
[...] with the patch my rather obscure (and complex)
use case of printing thousands of lines of code works properly now!
Thanks for testing!
Thanks to all of you
May I ask you to test the new patch (v4) as well?
Thanks,
Erik
diff --git a/src/pr.c b/src/pr.c
index 22d032ba3..5b003cb9a 100644
--- a/src/pr.c
+++ b/src/pr.c
@@ -1237,6 +1237,8 @@ init_parameters (int number_of_files)
col_sep_string = column_separator;
truncate_lines = true;
+ if (! (col_sep_length == 1 && *col_sep_string == '\t'))
+ untabify_input = true;
tabify_output = true;
}
else
diff --git a/tests/pr/pr-tests.pl b/tests/pr/pr-tests.pl
index b7d868cf8..d0ac40520 100755
--- a/tests/pr/pr-tests.pl
+++ b/tests/pr/pr-tests.pl
@@ -466,6 +466,27 @@ push @Tests,
{IN=>{2=>"m\tn\to\n"}},
{IN=>{3=>"x\ty\tz\n"}},
{OUT=>join("\t", qw(a b c m n o x y z)) . "\n"} ];
+# -s and -s$'\t' use different code paths
+push @Tests,
+ ['merge-w-tabs-sepstr', "-m -s'\t' -t",
+ {IN=>{1=>"a\tb\tc\n"}},
+ {IN=>{2=>"m\tn\to\n"}},
+ {IN=>{3=>"x\ty\tz\n"}},
+ {OUT=>join("\t", qw(a b c m n o x y z)) . "\n"} ];
+
+# Exercise a variant of the bug with pr -m -s (commit 553d347)
+# test 2 files, too (merging 3 files automatically aligns columns on TAB stops)
+push @Tests,
+ ['merge-2-w-tabs', '-m -s -t',
+ {IN=>{1=>"a\tb\tc\n"}},
+ {IN=>{2=>"m\tn\to\n"}},
+ {OUT=>join("\t", qw(a b c m n o)) . "\n"} ];
+# -s and -s$'\t' use different code paths
+push @Tests,
+ ['merge-2-w-tabs-sepstr', "-m -s'\t' -t",
+ {IN=>{1=>"a\tb\tc\n"}},
+ {IN=>{2=>"m\tn\to\n"}},
+ {OUT=>join("\t", qw(a b c m n o)) . "\n"} ];
# This resulted in reading invalid memory before coreutils-8.26
push @Tests,
@@ -474,6 +495,23 @@ push @Tests,
{IN=>{2=>"a\n"}},
{OUT=>"a\t\t\t\t \t\t\ta\n"} ];
+# Exercise a bug with pr -t -2 (bug #46422)
+push @Tests,
+ ['mcol-w-tabs', '-t -2',
+ {IN=>"x\tx\tx\tx\tx\nx\tx\tx\tx\tx\n"},
+ {OUT=>"x\tx\tx\tx\tx x\t x\t x\t x\t x\n"} ];
+
+# generalize case from commit 553d347 (problem results from -s, not -m)
+push @Tests,
+ ['mcol-w-tabs-w-tabsep', '-t -2 -s',
+ {IN=>"x\tx\tx\tx\tx\nx\tx\tx\tx\tx\n"},
+ {OUT=>"x\tx\tx\tx\tx\tx\tx\tx\tx\tx\n"} ];
+# -s and -s$'\t' use different code paths
+push @Tests,
+ ['mcol-w-tabs-w-tabsep-sepstr', "-t -2 -s'\t'",
+ {IN=>"x\tx\tx\tx\tx\nx\tx\tx\tx\tx\n"},
+ {OUT=>"x\tx\tx\tx\tx\tx\tx\tx\tx\tx\n"} ];
+
@Tests = triple_test \@Tests;
my $save_temps = $ENV{DEBUG};