On Tue, Apr 17, 2007 at 06:47:14PM +0200, Agustin Martin wrote:
> Hi, David and Sano,
> 
> On Sat, Apr 14, 2007 at 02:07:45PM -0700, David Lawyer wrote:
> > Package: linuxdoc-tools
> > Version: 0.9.21-0.5
> > 
> > Please merge bug 175575 into this bug report since it's a subset of
> > the bug (and proposed fix) I'm now reporting.
> > 
> > When I use sgml2txt I get both escape sequences and overstrikes which
> > plain text output shouldn't normally have.  There is an -f option to
> > sgml2txt to eliminate the overstrikes.
> 
> which by the way is buggy, and quite often does not remove all escapes,

I don't think the -f option is supposed to remove escapes; it only
removes overstrikes.  A few years ago the escape problem didn't exist.
I think it was caused by a change in the grotty program that made
output with escapes the default.
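
To show the difference between the two kinds of markup, here is a rough
shell sketch.  The helper names strip_overstrikes and strip_sgr are made
up for illustration and are not part of linuxdoc-tools; I'm assuming the
escapes in question are the usual SGR sequences (ESC [ ... m) and that
overstrikes are a character followed by a backspace.

```shell
#!/bin/sh
# Illustrative only: these helpers are hypothetical, not linuxdoc-tools code.

# Remove overstrikes: any character followed by a backspace
# (e.g. "b\bb" for bold or "_\bu" for underline).
strip_overstrikes() {
    bs=$(printf '\b')
    sed "s/.${bs}//g"
}

# Remove SGR escape sequences of the form ESC [ <digits and ;> m.
strip_sgr() {
    esc=$(printf '\033')
    sed "s/${esc}\[[0-9;]*m//g"
}

# Old-style bold: each letter is struck twice.
printf 'b\bbo\bol\bld\bd\n' | strip_overstrikes
# Newer-style bold: the word is wrapped in SGR escapes.
printf '\033[1mbold\033[0m\n' | strip_sgr
```

A filter that only handles the first kind, which is how I understand -f,
would leave the second kind untouched.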

> > It is very important to keep the
> > use of sgml2txt as simple as possible since the main advantage of
> > linuxdoc format over docbook is that it's simple and using the
> > linuxdoc-tools should also be simple.  The escape sequences are only
> > for vt100 terminals (and the like) and will not display if one uses an
> > editor (like vim) or pager (like less or most) to read the file.
> > Overstrikes don't usually get displayed right either although some
> > pagers can deal with them for some cases (such as underline).
> > 
> > So the default for conversion to text should (in my opinion) be just
> > plain text.  
> > 
> > The documentation for linuxdoc-tools fails to explain how to get
> > various types of text outputs using sgml2txt.  It should.  The way to
> > get plain text is to pass options to the grotty program from the sgml2txt
> > command line.  Like this: sgml2txt --pass="-P-bcou".  See "man grotty"
> > for how these 4 options (bcou) work together.  To make this the
> > default, one could modify: /usr/share/linuxdoc-tools/dist/fmt_txt.pl
> > For example, this seems to work although I've never studied Perl:
> > 
> >       create_temp("$global->{tmpbase}.txt.1");
> > #next line added by DL (David Lawyer)
> >       $global->{pass} = "-P-cbou" if $global->{pass} eq "";
> >       $outfile = new FileHandle
> >       "|$main::progs->{GROFF} $global->{pass} -T $global->{charset} -t $main::progs->{GROFFMACRO} >\"$global->{tmpbase}.txt.1\"";
> 
> Based on your proposed fix, I think something like in this diff
> 
> ----------------------------------------------------------------
> @@ -329,6 +323,7 @@
>  {
>    my $infile = shift;
>    my ($outfile, $groffout);
> +  my $txtfilter = $txt->{filter} ? "-P-cbou" : "";
>  
>    if ($txt->{manpage})
>      {
> @@ -338,7 +333,7 @@
>      {
>        create_temp("$global->{tmpbase}.txt.1");
>        $outfile = new FileHandle 
> -         "|$main::progs->{GROFF} $global->{pass} -T $global->{charset} -t $main::progs->{GROFFMACRO} >\"$global->{tmpbase}.txt.1\"";
> +         "|$main::progs->{GROFF} $global->{pass} $txtfilter -T $global->{charset} -t $main::progs->{GROFFMACRO} >\"$global->{tmpbase}.txt.1\"";
>      }
>  
>    #
> ------------------------------------------------------------------
> 
> can be used to make the -f option work as expected.
But -f is not expected to remove escapes.  Also, you would need to
delete the old code that removes overstrikes from the output when -f
is used (it uses Perl's s/// substitution operator).
Suppose someone uses: sgml2txt -f --pass="-P-cu"
Then the -f will pass the -cbou options to grotty while the user only
wanted to pass -cu.  My proposed patch would do just what the user
specified with --pass but then -f would act as a filter and filter out
overstrikes.  So in your solution -f and --pass both give options to
grotty and these options may conflict.  -f is no longer a filter since
it just passes options to grotty.
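
To make the conflict concrete, here is a rough shell analogy of the two
approaches.  The real logic is Perl in fmt_txt.pl; the function names
below are hypothetical, "pass" stands for the --pass value and "filter"
for whether -f was given.

```shell
#!/bin/sh
# Illustrative shell analogy only; the actual code is Perl in fmt_txt.pl.

# My approach: supply the plain-text defaults only when --pass is empty,
# so whatever the user passes is used unchanged.
opts_default_if_empty() {
    pass=$1
    printf '%s\n' "${pass:--P-cbou}"
}

# The approach in the diff: when -f is given, -P-cbou is always added
# after whatever --pass contains.
opts_append_on_filter() {
    pass=$1
    filter=$2
    if [ "$filter" = yes ]; then
        printf '%s\n' "$pass -P-cbou"
    else
        printf '%s\n' "$pass"
    fi
}

opts_default_if_empty "-P-cu"        # user's choice is kept as is
opts_append_on_filter "-P-cu" yes    # user's choice plus -P-cbou
```

In the second case groff gets both -P-cu and -P-cbou, which is the
mixing of -f and --pass that I'd like to avoid.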

One solution would be to use my "patch" and then have -f do nothing
except print a message that -f is no longer needed.  Eventually -f
could be eliminated, or kept as a no-op for backwards compatibility.

> 
> I am generally not in favour of changing long-standing behaviors.
I don't think many people outside of LDP are using sgml2txt.  And I
suspect those who are already use the -f option.
> However, in this case, escaped characters are of such limited use that
> it might be worth considering.
And they were introduced by a change in grotty.  So then one would use
-f to get plain text and without -f one would get overstrikes.
> 
> I think a middle point is possible, making -f default for sgml2txt,
> but not for linuxdoc -B txt. This way, escaped chars can easily be
> obtained if really required (directly calling linuxdoc without the
> -f option), but plain text is obtained from calls to sgml2txt (that
> would be trivial to implement), with no option for the opposite
> behavior here. If we are flamed for this, we could reconsider the
> change. What do you think? 

I don't think that's ideal, since the two commands would then behave
differently.  But if you don't want -B txt to produce plain text, it's
still better than doing nothing.  sgml2txt is the older command while
linuxdoc -B txt is the newer one.  What about displaying a message when
txt output is used, to let people know of the change?  It already
displays a short message, so this could just be added to it.

> 
> > Instead of hard-coding -cbou options into the code as I've done
> > above, one could create a new variable, GROTTYOPTS, and set it
> > equal to -cbou in the main program:
> > /usr/share/linuxdoc-tools/LinuxDocTools.pm.  I'm willing to do
> > some more work on this and create patches (I've never done Linux
> > patches before) provided of course that it's agreed that sgml2txt
> > should generate plain text by default.
> 
> I am relatively new to perl, and far from being a perl guru. I might
> be missing a lot of important things here, but linuxdoc-tools perl
> seems to me extremely ancient, and there are a lot of things I do
> not understand why are done that way. As a matter of fact I am
> changing some things to what is IMHO more readable, and found no
> drawbacks yet.
> 
> I mean that is probably not the kind of perl a new person will
> enjoy, but you are of course welcome.
> 
> As I mentioned elsewhere, although I am not the maintainer of this
> package, I plan to keep improving it when possible, so I am happy to
> receive your feedback through the Debian BTS.
> 
> Thanks for your help and suggestions,
And thanks for your quick response and effort on this.
> 
> -- Agustin
> 
                        David Lawyer

