Keenan, Greg John (Greg)** CTR ** wrote:
> -----Original Message-----
>> From: Wagner, David --- Senior Programmer Analyst --- WGO
>> [mailto:[EMAIL PROTECTED]
>> Sent: Friday, 19 August 2005 3:21 AM
>> To: Keenan, Greg John (Greg)** CTR **; beginners@perl.org
>> Subject: RE: regex - no field seperator
>>
>> Keenan, Greg John (Greg)** CTR ** wrote:
>>> Hi,
>>>
>>> I have the following data that I'm trying to parse into an array.
>>> There are 19 fields but with hosts 5 & 6 fields 6 & 7 do not have
>>> any space between them. This is how I get it from the OS and have
>>> no control over it.
>>>
>>> The maximum length for field 6 is 7 chars and field 7 is 6 chars.
>>>
>>> 200508171648 host1.dom.com 0 0 14 2166 623 8 4 12 0 0 0 35 131 14 0 0 100
>>> 200508171648 host2.dom.com 0 0 0 265 7563 5 3 8 0 0 0 34 66 7 0 0 100
>>> 200508171648 host3.dom.com 0 0 0 461 8112 4 0 6 0 0 0 53 84 9 0 0 100
>>> 200508171648 host4.dom.com 0 0 0 46 9468 5 3 9 0 0 0 39 75 8 0 2 98
>>> 200508171648 host5.dom.com 0 1 0 7008342480 3 0 0 0 0 0 0 41 8 0 2 98
>>> 200508171648 host6.dom.com 0 1 0 8936445548 3 0 0 0 0 0 0 14 5 0 0 100
>>>
>>> I have tried the following, and several other combos, with no luck.
>>> It matches the first 4 lines but fails for the last 2 because they
>>> appear to have only 18 fields I assume.
>>> @oput = /(\d+) (.+\..+\..+) (\d+) (\d+) (\d+) (\d{2,7}) (\d{2,6})
>>> (\d+) (\d+) (\d+) (\d+) (\d+) (\d+) (\d+) (\d+) (\d+) (\d+) (\d+)
>>> (\d+)/;
>>>
>> You are working much too hard to capture the data. Use split like:
>>
>> @oput = split (/\s+/,$_);
>>
>> You say it is a total of 13 characters, but in this case you have 10
>> characters. How do you identify which field is full? Once you do that,
>> the data can be extracted. But first you have to work out how to tell,
>> for these 10 characters, what the proper split is.
>
> Fields 6 & 7 could be a minimum of 2 chars or a maximum of 7 & 6 chars
> respectively, but the only time fields 6 & 7 merge is if field 7 has
> reached its maximum length of 6 chars.

Here is a shot:

#!perl

use strict;
use warnings;

my @oput = ();

while ( <DATA> ) {
    chomp;
    @oput = split (/\s+/,$_);
    if ( scalar(@oput) != 19 ) {
        # missing a field, so make one assumption: fields 6 and 7 are merged
        if ( length($oput[5]) > 6 ) {
            # two fields combined; this only happens when field 7 has 6 characters
            my $MyLen = length($oput[5]) - 6;        # length of the real field 6
            my $MyWrkFld = substr($oput[5],$MyLen);  # copy the last 6 chars (field 7)
            substr($oput[5],$MyLen,6) = '';          # delete those 6 chars from field 6
            splice(@oput,6,0,$MyWrkFld);             # insert field 7 into the array
        }
    }
    printf "<%s>\n", join(';', @oput);
}
__DATA__
200508171648 host1.dom.com 0 0 14 2166 623 8 4 12 0 0 0 35 131 14 0 0 100
200508171648 host2.dom.com 0 0 0 265 7563 5 3 8 0 0 0 34 66 7 0 0 100
200508171648 host3.dom.com 0 0 0 461 8112 4 0 6 0 0 0 53 84 9 0 0 100
200508171648 host4.dom.com 0 0 0 46 9468 5 3 9 0 0 0 39 75 8 0 2 98
200508171648 host5.dom.com 0 1 0 7008342480 3 0 0 0 0 0 0 41 8 0 2 98
200508171648 host6.dom.com 0 1 0 8936445548 3 0 0 0 0 0 0 14 5 0 0 100

Output:
<200508171648;host1.dom.com;0;0;14;2166;623;8;4;12;0;0;0;35;131;14;0;0;100>
<200508171648;host2.dom.com;0;0;0;265;7563;5;3;8;0;0;0;34;66;7;0;0;100>
<200508171648;host3.dom.com;0;0;0;461;8112;4;0;6;0;0;0;53;84;9;0;0;100>
<200508171648;host4.dom.com;0;0;0;46;9468;5;3;9;0;0;0;39;75;8;0;2;98>
<200508171648;host5.dom.com;0;1;0;7008;342480;3;0;0;0;0;0;0;41;8;0;2;98>
<200508171648;host6.dom.com;0;1;0;8936;445548;3;0;0;0;0;0;0;14;5;0;0;100>
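
For comparison, the same idea can be written with a single anchored regex doing the un-merging instead of substr. This is only a sketch, not something from the thread: the sub name split_record and the @samples array are made up for illustration, and it relies on the rule stated above that fields 6 & 7 only run together when field 7 is at its full 6 digits (so a merged value is 8 to 13 digits long).

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption, not from the thread):
# split one record into fields, un-merging fields 6 and 7 if needed.
sub split_record {
    my ($line) = @_;
    my @f = split ' ', $line;    # split on runs of whitespace

    # 18 fields means fields 6 and 7 arrived glued together. Per the
    # stated rule, field 7 is then exactly 6 digits, so the last 6
    # digits are field 7 and the 2 to 7 digits before them are field 6.
    if ( @f == 18 && $f[5] =~ /^(\d{2,7})(\d{6})$/ ) {
        splice @f, 5, 1, $1, $2;    # swap the merged value for its two halves
    }
    return @f;
}

# Two sample records copied from the data in the original post.
my @samples = (
    '200508171648 host4.dom.com 0 0 0 46 9468 5 3 9 0 0 0 39 75 8 0 2 98',
    '200508171648 host5.dom.com 0 1 0 7008342480 3 0 0 0 0 0 0 41 8 0 2 98',
);

for my $line (@samples) {
    my @f = split_record($line);
    warn "unexpected field count in: $line\n" unless @f == 19;
    printf "<%s>\n", join( ';', @f );
}
```

Because the pattern is anchored at both ends, a record that has 18 fields for some other reason is left alone and triggers the warning rather than being split in the wrong place.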