GNU Datamash 1.8 released [stable]

2022-07-22 Tim Rice

This is to announce datamash-1.8, a new release.

Datamash is a command-line program which performs basic numeric, textual and
statistical operations on input textual data.



This is the first release for new maintainer Tim Rice, with much appreciation
to Shawn Wagner and Erik Auerswald for their help. See the AUTHORS and THANKS
files for additional credits and acknowledgements.



GNU Datamash home page:
   https://www.gnu.org/software/datamash/

Please report any problem you may experience to the bug-datam...@gnu.org
mailing list.

Happy Hacking!
- Tim Rice

==

Here are the compressed sources and a GPG detached signature[*]:
https://ftp.gnu.org/gnu/datamash/datamash-1.8.tar.gz
https://ftp.gnu.org/gnu/datamash/datamash-1.8.tar.gz.sig

Use a mirror for higher download bandwidth:
https://ftpmirror.gnu.org/datamash/datamash-1.8.tar.gz
https://ftpmirror.gnu.org/datamash/datamash-1.8.tar.gz.sig

[*] Use a .sig file to verify that the corresponding file (without the
.sig suffix) is intact.  For instructions about how to do this, please
refer to https://ftp.gnu.org/README.  (In particular you will need to
retrieve the GNU keyring rather than using any keyservers.)

==

The checksums of the archive are:

$ sha1sum datamash-1.8.tar.gz
e77e15ed2c6b17b4045251fd87f16430c3bf2166  datamash-1.8.tar.gz

$ sha256sum datamash-1.8.tar.gz
94a4e11819ad259aa3745b7eca392e385e3a676d276e8cbb616269dbbb17fe6d  datamash-1.8.tar.gz

$ b2sum datamash-1.8.tar.gz
dfe4060ea65ea46a1796e01463fd9b0e55c2d633d06da153f585a3a569acf3e9211a14cb3905daf8ecae347358daa04db940d557b909f0ce5ebbba2f57d3a410  datamash-1.8.tar.gz
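
As a convenience, the comparison against a published digest can be scripted.
The following is a minimal sketch, assuming GNU coreutils' sha256sum is
available; `verify_sha256` is a hypothetical helper name, not part of
datamash or coreutils.

```shell
# verify_sha256 FILE EXPECTED_HEX   (hypothetical helper)
# Succeeds only if FILE's SHA-256 digest equals EXPECTED_HEX.
verify_sha256() {
    # sha256sum --check reads "DIGEST  FILENAME" lines (two spaces)
    # from standard input when given "-"; --status reports the result
    # through the exit code only.
    printf '%s  %s\n' "$2" "$1" | sha256sum --check --status -
}

# Example for this release (run in the directory holding the tarball):
# verify_sha256 datamash-1.8.tar.gz \
#   94a4e11819ad259aa3745b7eca392e385e3a676d276e8cbb616269dbbb17fe6d
```

A checksum only detects corruption; the .sig file is still needed to check
authenticity.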

==

NEWS

* Noteworthy changes in release 1.8 (2022-07-23) [stable]

** Changes in Behavior

  Schedule -f/--full combined with non-linewise operations for deprecation.
  In a future release, -f/--full will only be usable with operations where
  it makes sense. For now, datamash prints a warning to stderr when
  -f/--full is used with non-linewise operations, and such usage is no
  longer supported.

  The bin operation now uses more intuitive bins. Previously, a command
  such as `datamash bin 1 <<< -0` would output -100; and -100 did not fall
  in its own bin. We now require all bins to take the form `[nx,(n+1)x)`
  with integer n and bin width x. We discard the sign on -0 and gate such
  inputs into the [0,x) bin.
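
The bin rule described above amounts to mapping each value v to the lower
edge n*x of its bin, with n = floor(v/x). The following awk sketch
illustrates that arithmetic only; it is not datamash's implementation, and
`bin_floor` is a hypothetical helper name.

```shell
# bin_floor VALUE WIDTH   (hypothetical helper)
# Prints the lower edge of the bin [n*WIDTH, (n+1)*WIDTH) containing VALUE.
bin_floor() {
    awk -v v="$1" -v x="$2" 'BEGIN {
        n = int(v / x)        # int() truncates toward zero
        if (n * x > v) n--    # step down so negatives get floor()
        r = n * x
        if (r == 0) r = 0     # discard the sign of a negative zero
        print r
    }'
}

bin_floor 150 100    # prints 100
bin_floor -1 100     # prints -100
bin_floor -100 100   # prints -100
bin_floor -0 100     # prints 0
```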

  Operations taking more than one argument now provide more complete output
  with --header-out. Previously, an operation such as `pcov x:y` would
  produce an output header like `pcov(y)`, discarding the `x`. The new
  behavior will output header `pcov(x,y)`.

  datamash(1) no longer ignores --output-delimiter with the rmdup operation.

** New Features

  New datamash option --sort-cmd specifies the program used by the -s
  option to sort input. This release also enhances the security and
  portability of building sort command lines.

  New datamash option -c/--collapse-delimiter=X uses character X instead
  of a comma between values in collapse and unique lists.

  New datamash operations: mean square (ms) and root mean square (rms).
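
For cross-checking results, here is a minimal awk sketch of the two
quantities, assuming the standard definitions (mean of the squared values,
and its square root). The helper names `ms_awk` and `rms_awk` are
hypothetical and read one number per line from standard input.

```shell
# Mean square: (x1^2 + ... + xn^2) / n
ms_awk()  { awk '{ s += $1 * $1; n++ } END { if (n) print s / n }'; }

# Root mean square: square root of the mean square
rms_awk() { awk '{ s += $1 * $1; n++ } END { if (n) print sqrt(s / n) }'; }

printf '1\n7\n' | ms_awk    # prints 25
printf '1\n7\n' | rms_awk   # prints 5
```

The corresponding datamash invocations on the first field would be
`datamash ms 1` and `datamash rms 1`.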

  Decorate now supports sorting IP addresses of both versions 4 and 6
  together. IPv4 addresses are logically converted to IPv6 addresses,
  either as IPv4-Mapped (ipv6v4map) or IPv4-Compatible (ipv6v4comp)
  addresses.
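
To illustrate the two conversions: an IPv4-Mapped address places the 32-bit
IPv4 address after the prefix ::ffff:, while an IPv4-Compatible address
places it after an all-zero prefix. The sketch below prints the fully
expanded form, which sorts as plain text alongside expanded IPv6 addresses.
The helper names and the expanded rendering are assumptions made for
illustration and do not show decorate's internal key format.

```shell
# ipv4_to_ipv6 GROUP6 A.B.C.D   (hypothetical helper)
# Expands a dotted-quad IPv4 address to a full 8-group IPv6 string.
# Octets are assumed to be plain decimal without leading zeros.
ipv4_to_ipv6() {
    prefix=$1                 # sixth 16-bit group: ffff or 0000
    old_ifs=$IFS; IFS=.
    set -- $2                 # split a.b.c.d into $1..$4
    IFS=$old_ifs
    printf '0000:0000:0000:0000:0000:%s:%02x%02x:%02x%02x\n' \
        "$prefix" "$1" "$2" "$3" "$4"
}

ipv4_mapped()     { ipv4_to_ipv6 ffff "$1"; }   # ::ffff:a.b.c.d
ipv4_compatible() { ipv4_to_ipv6 0000 "$1"; }   # ::a.b.c.d

ipv4_mapped 192.0.2.1       # prints 0000:0000:0000:0000:0000:ffff:c000:0201
ipv4_compatible 192.0.2.1   # prints 0000:0000:0000:0000:0000:0000:c000:0201
```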

  Add two command aliases:
    'echo' may now be used instead of 'cut'.
    'uniq' may now be used instead of 'unique'.

** Improvements

  Updated the bash completion script to reflect recent additions.

** Bug Fixes

  Datamash now passes the -z/--zero-terminated flag to the sort(1) child
  process when used with "--sort --zero-terminated". Additionally,
  if the system's sort(1) does not support -z, datamash reports the error
  and exits. Previously it would omit the "-z" when running sort(1),
  resulting in incorrect results.

  Documentation fixes and spelling corrections.

  Fixed an incorrect format in a decorate(1) error that broke compilation
  on some systems.

  datamash(1), decorate(1): Fix some minor memory leaks.

  datamash(1) no longer crashes when the unique or countunique operations
  are used with input data containing NUL bytes.  The problem was reported
  in https://lists.gnu.org/archive/html/bug-datamash/2020-11/msg1.html
  by Catalin Patulea.

  datamash(1) no longer crashes when crosstab with --header-in is called
  by field name instead of index. I.e. `datamash --header-in ct x,y` now
  works as expected.



GNU Datamash 1.9 released

2025-04-04 Tim Rice

This is to announce GNU Datamash 1.9, a stable release.

Home page: https://www.gnu.org/software/datamash
Announcement link: https://savannah.gnu.org/news/?id=10746

GNU Datamash is a command-line program which performs basic numeric,
textual and statistical operations on input textual data files.

It is designed to be portable and reliable, and aid researchers
to easily automate analysis pipelines, without writing code or even
short scripts. It is very friendly to GNU Bash and GNU Make pipelines.

There have been 52 commits by 5 people in the 141 weeks since 1.8.

See the NEWS below for a brief summary.

The following people contributed changes to this release:

  Dima Kogan (1)
  Erik Auerswald (14)
  Georg Sauthoff (4)
  Shawn Wagner (6)
  Timothy Rice (27)

Thanks to everyone who has contributed!

Please report any problem you may experience to the bug-datam...@gnu.org
mailing list.

Happy Hacking!
- Tim

==

Here is the GNU datamash home page:
  https://gnu.org/s/datamash/

Here are the compressed sources and a GPG detached signature:
  https://ftp.gnu.org/gnu/datamash/datamash-1.9.tar.gz
  https://ftp.gnu.org/gnu/datamash/datamash-1.9.tar.gz.sig

Use a mirror for higher download bandwidth:
  https://ftpmirror.gnu.org/datamash

More about GNU mirrors is at:
  https://www.gnu.org/order/ftp.html

Here are the SHA1 and SHA256 checksums:

  File: datamash-1.9.tar.gz
  SHA1 sum:   935c9f24a925ce34927189ef9f86798a6303ec78
  SHA256 sum: f382ebda03650dd679161f758f9c0a6cc9293213438d4a77a8eda325aacb87d2

Use a .sig file to verify that the corresponding file (without the
.sig suffix) is intact.  First, be sure to download both the .sig file
and the corresponding tarball.  Then, run a command like this:

  gpg --verify datamash-1.9.tar.gz.sig

The signature should match the fingerprint of the following key:

  pub   ed25519 2022-04-05 [SC]
        3338 2C8D 6201 7A10 12A0  5B35 BDB7 2EC3 D3F8 7EE6
  uid   Timothy Rice (Yubikey 5 Nano 13139911)

If that command fails because you don't have the required public key,
or that public key has expired, try the following command to retrieve
or refresh it, and then rerun the 'gpg --verify' command.

  wget -q https://ftp.gnu.org/gnu/gnu-keyring.gpg
  gpg --keyring gnu-keyring.gpg --verify datamash-1.9.tar.gz.sig

This release is based on the datamash git repository, available as

  git clone https://git.savannah.gnu.org/git/datamash.git

with commit 39101c367a07f2c1aea8f3b540fc490735596e6a tagged as v1.9.

For a summary of changes and contributors, see:

  https://git.sv.gnu.org/gitweb/?p=datamash.git;a=shortlog;h=v1.9

or run this command from a git-cloned datamash directory:

  git shortlog v1.8..v1.9

This release was bootstrapped with the following tools:
  Autoconf 2.72
  Automake 1.17
  Gnulib 2025-03-27 54fc57c23dcd833819a7adbdfcc3bd1c805103a8

NEWS

* Noteworthy changes in release 1.9 (2025-04-05) [stable]

** Changes in Behavior

  datamash(1), decorate(1): Add short options -h and -V for --help and --version
  respectively.

  datamash(1): the rand operation now uses getrandom(2) for generating a random
  seed, instead of relying on date/time/pid mixing.

** New Features

  datamash(1): add operation dotprod for calculating the scalar product of two
  columns.
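
The scalar product of two columns is the sum of the row-wise products. A
minimal awk sketch for cross-checking, assuming the two columns are fields 1
and 2 of whitespace- or tab-separated input; `dotprod_awk` is a hypothetical
helper name and does not show the datamash syntax.

```shell
# Sum of field1 * field2 over all input lines.
dotprod_awk() { awk '{ s += $1 * $2 } END { print s + 0 }'; }

# 1*4 + 2*5 + 3*6
printf '1\t4\n2\t5\n3\t6\n' | dotprod_awk   # prints 32
```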

  datamash(1): Add option -S/--seed to set a specific seed for pseudo-random
  number generation.

  datamash(1): Add option --vnlog to enable experimental support for the vnlog
  format. More about vnlog is at https://github.com/dkogan/vnlog.

  datamash(1): -g/groupby now accepts ranges of columns (e.g. 1-4).

** Bug Fixes

  datamash(1) now correctly calculates the "antimode" for a sequence
  of numbers.  Problem reported by Kingsley G. Morse Jr.

  When using the locale's decimal separator as field separator, numeric
  datamash(1) operations now work correctly.  Problem reported by Jérémie
  Roquet and by Jeroen Hoek.

  datamash(1): The "getnum" operation now stays inside the specified field.