On Tue, Mar 22, 2022, at 9:49 PM, 201009-suckl...@planhack.com wrote:
> sed is the canonical paragraph mangler. It's worth spending a bit to
> grok how that is true.
>
> tr -d '\r' | sed '/^$/!{H;d;};p;x;s/\n/ /g;'
>
> Gutenberg lines are CRLF-terminated so `tr` is needed.

Right, I forgot to mention that I had to tr -d '\r' first. Thanks for
mentioning that.

Close, but no cigar. That sed command introduces extra blank lines. It
is incorrect. ssam reigns supreme!

tr -d '\r' < 2488-0.txt | ssam -e 'x/\n+/ v/\n\n+/ c/ /' | wc -l
7667
tr -d '\r' < 2488-0.txt | sed '/^$/!{H;d;};p;x;s/\n/ /g;' | wc -l
7782
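
A note on where the extra lines seem to come from, plus a line-oriented sketch for anyone without ssam. As far as I can tell, on every blank line the sed script first runs `p` (printing the blank line) and then prints whatever `x` swapped in from the hold space. When blank lines are consecutive the hold space is empty, so one blank input line comes out as two. The awk function below is only a minimal sketch under my own assumptions: `join_paragraphs` is a made-up name, it has not been run against 2488-0.txt, and unlike the ssam command it always ends the output with a newline, so its `wc -l` need not match 7667 exactly.

```shell
# Hypothetical helper (not from the thread): join the lines of each
# paragraph with single spaces and pass blank lines through unchanged.
join_paragraphs() {
  awk '
    # Blank line: flush the paragraph collected so far, then echo the blank.
    /^$/ { if (buf != "") { print buf; buf = "" } print ""; next }
    # Non-blank line: append to the current paragraph, space-separated.
    { buf = (buf == "" ? $0 : buf " " $0) }
    # End of input: flush a final paragraph that had no blank line after it.
    END { if (buf != "") print buf }
  '
}

# Small demo: two blank lines in, two blank lines out.
printf 'one\ntwo\n\n\nthree\n' | join_paragraphs
# prints "one two", two empty lines, then "three"
```

To try it on the Gutenberg text, it would slot in after the same `tr -d '\r'` step, e.g. `tr -d '\r' < 2488-0.txt | join_paragraphs | wc -l`.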