On 2024-03-26 09:40, Bruno Haible wrote:
> Other methods described in [3], such as counters maintained in an 'awk'
> or 'perl' process, or the 'unique' program that is part of the 'john' package
> [4], can be ignored, because they need O(N) space and are thus not usable for
> inputs as large as 40 GB [5
Pádraig Brady wrote:
> A documentation patch would be very useful.
> If you search for "Decorate Sort Undecorate" in coreutils.texi
> you can see an existing example of this DSU pattern.
> I would also mention DSU if adding another example.
OK, I'll provide doc input for this (post 9.5).
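
For reference, a minimal sketch of how the DSU pattern could be applied to
this problem: decorate each line with its line number, sort on the content,
drop repeats, sort back into the original order, and strip the decoration.
This is an illustration only, not necessarily the example that ends up in
coreutils.texi. The function name dedup_keep_order is made up for this
sketch, and it assumes GNU 'cat -n' output (line number, TAB, line).

```shell
#!/bin/sh
# Hypothetical helper: remove non-adjacent duplicate lines from stdin,
# keeping the first occurrence of each line in its original position.
dedup_keep_order() {
  tab=$(printf '\t')
  cat -n |                             # decorate: "<number>TAB<line>"
    LC_ALL=C sort -t "$tab" -s -k2 |   # stable sort on the line content
    LC_ALL=C uniq -f 1 |               # drop repeats, skipping the number field
    LC_ALL=C sort -t "$tab" -k1,1n |   # restore the original line order
    cut -f 2-                          # undecorate: strip the number field
}

printf 'b\na\nb\nc\na\n' | dedup_keep_order
# prints:
# b
# a
# c
```

The stable sort ('-s') keeps equal lines in input order, so 'uniq' retains
the earliest occurrence. Since 'sort' spills to temporary files for large
inputs, the pipeline does not need to hold all distinct lines in memory.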
> Re add
On 26/03/2024 16:40, Bruno Haible wrote:
The documentation of 'uniq' [1] says:
"The input need not be sorted, but repeated input lines are
detected only if they are adjacent. If you want to discard
non-adjacent duplicate lines, perhaps you want to use 'sort -u'."
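
The behaviour described in that paragraph can be seen with a three-line
input (a small illustration, with made-up sample data):

```shell
#!/bin/sh
# 'uniq' only collapses repeats that are adjacent, so the second "a" survives.
adjacent_only=$(printf 'a\nb\na\n' | uniq)
# 'sort -u' removes all duplicates, but the original order is lost.
sorted_unique=$(printf 'a\nb\na\n' | LC_ALL=C sort -u)

printf '%s\n' "$adjacent_only"   # prints a, b, a (one per line)
printf '%s\n' "$sorted_unique"   # prints a, b (one per line)
```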
When I wrote gnulib-