I think that I may have misunderstood what the OP wanted. The awk script you
gave and the perl one that I gave produce different output on the first line of
my modified file. What I envisioned the OP wanting was: "Find the first
instance of CD in the given string. Find all the characters following it, up
to and including the first QR substring. Replace everything from that CD
through that QR with "junkt"." What my perl regexp matches and what your awk
regexp matches are not the same segment. I don't know now which one the OP
wanted. My modified file places a Q in column 11 of the first line, moving all
the following characters right one. It also removes the D which was originally
in column 27.
$ cat test.txt
QQQQABCDEFQGNOPQRXXXPPPPABCEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
$ awk 'sub(/CD[^Q]*QR/,"junkt")' test.txt
QQQQABCDEFQGNOPQRXXXPPPPABCEFGNOPQRYYYOOOOABjunktZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
$ perl -np -e 's/CD.*?QR/junkt/' test.txt
QQQQABjunktXXXPPPPABCEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
All the lines in "test.txt", other than the first, are identical. The first
line has a Q inserted after the first F, and the second D removed from between
the C and the E. My regexp matches from the first CD (column 7) to the first QR
(column 16). Your awk matches from the CD in column 45 to the QR in column 53.
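
To make the difference concrete, here is a minimal sketch, assuming a POSIX
awk and a Bourne-style shell. The function names show_match and
first_cd_to_first_qr are hypothetical, made up for illustration. The first
reports the column where CD[^Q]*QR matches. The second emulates the non-greedy
perl substitution with index() and substr(), since POSIX awk regular
expressions have no non-greedy quantifier.

```shell
# The modified first line of test.txt from this message.
line='QQQQABCDEFQGNOPQRXXXPPPPABCEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ'

# Report where CD[^Q]*QR matches: match() sets RSTART (1-based column)
# and RLENGTH (length of the matched text).
show_match() {
    awk '{
        if (match($0, /CD[^Q]*QR/)) print RSTART, RLENGTH
        else print "no match"
    }'
}

# Emulate s/CD.*?QR/junkt/ without a non-greedy quantifier:
# find the first CD, then the first QR that follows it, and
# replace everything from that CD through that QR.
first_cd_to_first_qr() {
    awk '{
        s = index($0, "CD")
        if (s > 0) {
            rest = substr($0, s + 2)      # text after the first CD
            e = index(rest, "QR")         # first QR after that CD
            if (e > 0)
                $0 = substr($0, 1, s - 1) "junkt" substr(rest, e + 2)
        }
        print
    }'
}

printf '%s\n' "$line" | show_match
printf '%s\n' "$line" | first_cd_to_first_qr
```

On the modified first line, show_match should report column 45 with a length
of 10, and first_cd_to_first_qr should give the same result as the perl
one-liner.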
--
John McKown
Systems Engineer IV
IT
Administrative Services Group
HealthMarkets(r)
9151 Boulevard 26 * N. Richland Hills * TX 76010
(817) 255-3225 phone *
[email protected] * www.HealthMarkets.com
<snip>
> >>
> >> try this:
> >>
> >> awk 'sub(/CD[^Q]*QR/,"junkt")'
> >>
> >> or this:
> >>
> >> sed -e 's/CD[^Q]*QR/junkt/'
> >>
> >> Bill
> >
> >Will work on that specific example. But won't if a Q appears
> >with some other character after it, before the first QR.
> >
>
> Did you try it?
>
> Where a Q appears with some other character after it, before
> the first QR? I did. It skips to the one that has the first
> QR, as it should.
>
> echo
> "QQQQABCDEFGNOPQSXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ" |
> awk 'sub(/CD[^Q]*QR/,"junkt")'
>
> QQQQABCDEFGNOPQSXXXPPPPABjunktYYYOOOOABCDEFGNOPQRZZZ
>
> I see that Ken has added to the problem description since my
> earlier reply.
>
> Bill
>
> ----------------------------------------------------------------------
> For IBM-MAIN subscribe / signoff / archive access instructions,
> send email to [email protected] with the message: INFO IBM-MAIN