On Tue, Mar 22, 2022, at 9:49 PM, 201009-suckl...@planhack.com wrote:
> sed is the canonical paragraph mangler. It's worth spending a bit to 
> grok how that is true.
>
>     tr -d '\r' | sed '/^$/!{H;d;};p;x;s/\n/ /g;'
>
> Gutenberg lines are CRLF-terminated so `tr` is needed.

Right, I forgot to mention that I had to run
  tr -d '\r'
first.  Thanks for pointing that out.

Close, but no cigar.  That sed command introduces extra blank lines, so its 
output is incorrect: on the same input it yields 115 more lines than ssam.  
ssam reigns supreme!

  tr -d '\r' < 2488-0.txt | ssam -e 'x/\n+/ v/\n\n+/ c/ /' | wc -l
7667
  tr -d '\r' < 2488-0.txt | sed '/^$/!{H;d;};p;x;s/\n/ /g;' | wc -l
7782
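
The extra blank lines can be reproduced without the Gutenberg file.  What 
follows is only a minimal sketch: the sample text and the helper names 
`sample` and `join_sed` are made up for illustration, and the sed program is 
the one quoted above, unchanged.  As far as I can tell, each blank input line 
is printed once by `p` and then the swapped-in hold space is printed as well, 
so a second consecutive blank line comes out twice.

```shell
# Hypothetical sample: two paragraphs separated by TWO blank lines,
# followed by one trailing blank line.
sample() { printf 'a\nb\n\n\nc\nd\n\n'; }

# The sed program from the quoted message, wrapped in a function.
join_sed() { sed '/^$/!{H;d;};p;x;s/\n/ /g;'; }

# Count blank lines before and after.
sample | grep -c '^$'              # prints 3
sample | join_sed | grep -c '^$'   # prints 4

# Show the joined text itself; note the leading space on each paragraph.
sample | join_sed
```

With this sample the paragraphs come out joined as " a b" and " c d", but one 
blank line goes in and two come out for every blank line after the first in a 
run.  Also note that a final paragraph not followed by a blank line stays in 
the hold space and is never printed.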
