On Thu, Aug 30, 2012 at 04:37:19PM -0700, daniel jimenez wrote:
> Hello all,
> I need some help fixing the format of some pretty strangely compressed
> data files. An example would be like this:
>
> 2883
> 452
> 0 7
> 1 6
> 2
> 4
> 6
> 10 7
> Parsing rules:
> The first two lines should be ignored.
> The first column is the 'index', the second column being the 'counter'.
> If there is no second number (ex. index=2), then the second number should
> be set to '1'.
> If the index skips (ex. from index=2 to index=4), then the indexes
> which were skipped should be set to '0'.
> Max index is 1024.
> That is it. I'd like to be guided to an app (scripting language? awk? sed?
> I haven't used those so I really don't know where to start) that can help
> me do that effectively.
> The command, with the script possibly as the argument, is to be included
> in a bash script right before a fortran program is executed as the fortran
> program expects the file to be uncompressed and it doesn't seem intuitive
> to do it from fortran. Although it would be nice for a guru to let me know
> how to handle it from within...
> In the end, any solution would be a great help.
> Thanks.
> --
> Daniel Jimenez

Hi Daniel,

Here's my awk solution:

NR > 2 {                              # Ignore lines 1 & 2
    if (NF < 2){                      # If number of fields is less than two...
        counter=1                     # Set variable counter to one
    } else {
        counter=$2                    # Otherwise set counter to 2nd field
    }
    difference = $1 - last_index      # Subtract last index to find gaps
    if (difference > 1){              # If gaps exist...
        for (i=1; i<difference; i++){
            arr[i+last_index]=0       # Add skipped indices to array w/ zero value
        }
    }
    arr[$1]=counter                   # Add index to array with value counter
    last_index=$1                     # Remember this index for the next line
}
END {
    for (j=0; j<=last_index; j++){
        print j, arr[j]+0             # Print all indices; "+0" turns an unset entry into 0
    }
}
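
Since the command is meant to sit in a bash script right before the Fortran program runs, here is a rough sketch of how that could look. It is a shorter variant of the script above, not a drop-in copy of it. It assumes that "Max index is 1024" means the output should always list indices 0 through 1024, with every index missing from the input printed as 0. If the output should instead stop at the last index found in the file, the script above already does that. The function name expand_counts and the file names in the usage note are made up for illustration.

```shell
#!/bin/bash
# expand_counts (hypothetical name): read the compressed format on stdin and
# write one "index counter" line for every index from 0 to max on stdout.
# The optional first argument overrides max (assumed default: 1024).
expand_counts() {
    awk -v max="${1:-1024}" '
        NR > 2 && NF > 0 {                  # skip the two header lines and any blank line
            count[$1] = (NF < 2) ? 1 : $2   # a missing counter means 1
        }
        END {
            for (i = 0; i <= max; i++)
                print i, count[i] + 0       # an index never seen prints as 0
        }
    '
}
```

In the wrapper script this could be called as "expand_counts < compressed.dat > expanded.dat" on the line before the Fortran program is started. Because every index that was not seen is printed as 0 in the END block, there is no need to track gaps while reading.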