On Mon, 02 Aug 2010 05:56:31 -0700 Bill <bi...@uniserve.com> wrote:

> I'm not sure if this is a bug, a question or a feature request,
> but there is a problem with the cut command, specifically with
> it's delimiter option '-d'.
>
> In older times disk space was scarce and every byte was
> conserved. Fields in data files were delimited with a single
> character such as ':'. This practise continues today. But
> sometimes it does not and fields in some files are separated
> with multiple characters. Space is no longer precious.
>
> Suppose I wish to import information about a disk partition
> into my backup script. I want to assign the type of filesystem
> to a variable. Compare the output of these two commands.
>
> cat /etc/fstab |grep home | cut -d ' ' -f3
> yields a blank output line
>
> cat /etc/fstab |grep opt | awk -F " " '{print $3}'
> yields the desired output - reiserfs.
>
> The problem is that the cut command can't handle multiple
> instances of the same delimiter. It's designed to handle
> a single character like ':', but can't cope with repeating
> characters like '::' or a series of spaces as in /etc/fstab.
>
> So my question is shouldn't the cut delimiter handle
> multiple instances of the same character internally or
> failing that, shouldn't there be some way of specifying a
> series of single delimiter characters such as -d':'+ ?

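To make the difference concrete, here is a minimal sketch using a made-up fstab-style line. The device name, mount options and the variable names are assumptions for illustration, not taken from the original message, and the sketch assumes the columns are padded with spaces rather than tabs. The last command shows one common workaround, squeezing repeated spaces with tr -s before handing the line to cut.

```shell
# Hypothetical fstab-style line; columns are padded with runs of spaces.
line='/dev/sda3   /home   reiserfs   defaults   1 2'

# cut treats every single space as a delimiter, so field 3 is one of the
# empty fields between '/dev/sda3' and '/home'.
cut_out=$(printf '%s\n' "$line" | cut -d ' ' -f3)

# awk's default field splitting treats a run of blanks as one separator.
awk_out=$(printf '%s\n' "$line" | awk '{print $3}')

# Squeeze repeated spaces first, then cut sees one delimiter per gap.
squeezed_out=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f3)

printf 'cut:      [%s]\n' "$cut_out"
printf 'awk:      [%s]\n' "$awk_out"
printf 'tr | cut: [%s]\n' "$squeezed_out"
```

If the real file uses tabs or a mix of tabs and spaces, the tr step would need adjusting (for example, translating tabs to spaces first); the awk form handles both without changes.
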
cut is required by POSIX to treat every separator character as delimiting a field:

  "Output fields shall be separated by a single occurrence of the
  field delimiter character."

However, what you suggest could be implemented as an extension that the user would have to enable explicitly (although I wouldn't bet that the maintainers think this is a good idea; I may be wrong).

On a side note, you mention awk, which happens to work fine in your specific example of a space as separator. However, that case is special-cased in awk; with any other single-character separator, awk works exactly like cut:

  echo 'a::b:c' | awk -F':' '{print "-"$1"--"$2"--"$3"--"$4"-"}'
  -a----b--c-

Note the empty second field. But of course in awk, unlike cut, you can say -F ':+' and get the behavior you want.

-- 
D.