On Tue, Jan 13, 2009 at 6:42 PM, Matthew Woehlke <[email protected]> wrote:
>
> Vitali Lovich wrote:
>>
>> Perhaps - but for sort, at least from my thinking of how I would
>> implement this, the additional logic (at least to behave correctly on
>> all inputs) would be somewhat complicated. Can you please explain why
>> you believe this belongs in sort and wouldn't be better served by
>> pre-processing the text before sort & post-processing it after as
>> necessary?
>
> I'd like to point out that, if you're going to require that, you've
> defeated the purpose of sort understanding human-readable numbers in
> the first place. If I have to write something more convoluted than
> 'du -sh * | sort -h', I might as well write 'sdu -s *'. (Which, in
> fact, I did. 'sdu' is a script that expects normal everything-in-bytes
> output, does a plain old 'sort -n' on it, and then uses awk to make
> the sizes human-readable.)

You are correct. However, the implementation I posted is specifically
designed to handle du & df, as I explained in the original design
assumptions: 'du -sh * | sort -h' works perfectly, as does sorting the
various output columns of 'df -h'.

My question was strictly about parsing the longer versions of those
suffixes (e.g. MiB & MB): does it make sense to support them? When
things settle down for me and I get time, I'll post my implementation
for this so people can determine whether or not it makes sense to do
this in the code.
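To make the pre-processing idea concrete, here is a minimal sketch of a filter that shortens long suffixes before the text reaches the proposed 'sort -h'. It is not the implementation discussed in this thread. The function name `shorten_suffix` and the pattern are hypothetical, and the sketch assumes du-style input where the size is the first field on each line. It also assumes that collapsing both MiB and MB to M is acceptable, which discards the 1024 vs. 1000 distinction.

```python
import re
import sys

# Hypothetical pattern: a leading size (optionally with a decimal part),
# an optional single whitespace character, then a long suffix such as
# "MiB" or "MB". Only the first field is touched so that file names
# later on the line are never rewritten.
_LEADING_SIZE = re.compile(r'^(\s*\d+(?:\.\d+)?)\s?([kKMGTPEZY])i?B\b')


def shorten_suffix(line: str) -> str:
    """Rewrite a leading '1.5MiB' or '20 GB' as '1.5M' or '20G'.

    Lines that do not start with a long-suffixed size are returned
    unchanged, so plain byte counts and short suffixes pass through.
    """
    return _LEADING_SIZE.sub(
        lambda m: m.group(1) + m.group(2).upper(), line, count=1
    )


if __name__ == "__main__":
    # Intended use as a filter in front of sort, reading stdin.
    for raw in sys.stdin:
        sys.stdout.write(shorten_suffix(raw))
```

Saved under an assumed name such as shorten.py, it would sit in a pipeline as 'some-tool | python3 shorten.py | sort -h'. Restricting the rewrite to the first field is a deliberate choice: a name like disk1MB.img elsewhere on the line stays intact.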
>
> IOW, if you make people format the output anyway, you might as well just
> forget -h and make the post-formatting be what converts from raw integer
> sizes to human-friendly sizes. At least, that's my $0.02.

That's a valid point about the post-formatting. Perhaps another tool
that formats & converts numbers within its input would be useful. A
syntax might have to be invented (or you could just have primitive
switches for common conversions). Even so, the -h flag in sort would
still be useful, if only because sort is a far more popular tool (if
your post-formatting tool gained popularity, an argument could then be
made for deprecating the -h flag). Such a tool may also have issues of
its own: it would have to convert the string into an actual number
first, which leads to overflow & precision problems, the question of
whether an 'E' exponent should indicate a power-of-10 multiplier, etc.

And again, the pre-formatting/post-formatting was only a suggestion for
how to support the longer suffixes. A script to do that is far easier
to create than the equivalent scripts needed to sort du output
correctly (and most of the time those scripts are tool-specific, so
porting them to sort the output of df, for instance, is non-trivial).
sort -h is also a far more generic solution than any custom wrapper
script (at least the implementation is far more straightforward).
Likewise, the equivalent sed scripts for the longer suffixes are far
easier to write than the additional code in C, because sed is meant
for manipulating text.

Vitali

_______________________________________________
Bug-coreutils mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-coreutils
