> -----Original Message-----
> From: Dermot Paikkos [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, November 14, 2001 6:22 AM
> To: [EMAIL PROTECTED]
> Subject: REGEX or Parse a text file
>
>
> Hi Perlgurus,
> I am having trouble getting the data I want out of a text file.
> The file has this structure:
>
> File Name: m:\a\a084099.jpg
> Width x Height: 2480 x 2062
> Number of Colours: True Colour (24 bits)
> Dots per inch: 300 x 300
> Image size (inches): 8.27 x 6.87
> Raw size: 15341280 Actual size: 1203289 (Compression ratio: 12.8:1)
>
> There are a thousand or so entries. Each entry is separated by a
> blank line. I want to get just the file name and the last line. My
> trouble is I can't manage to parse it correctly. The best I have
> been able to get is:
>
> m:\a\a084100.jpg Uncompressed Size: 15341280 Actual size: 1203289
> m:\a\a084100.jpg Uncompressed Size: 15341280 Actual size: 1203289
> m:\a\a084100.jpg Uncompressed Size: 15341280 Actual size: 1203289
> m:\a\a084100.jpg Uncompressed Size: 15341280 Actual size: 1203289
> m:\a\a084100.jpg Uncompressed Size: 15341280 Actual size: 1203289
> m:\a\a084100.jpg Uncompressed Size: 15006480 Actual size: 1251205
> m:\a\a084100.jpg Uncompressed Size: 15006480 Actual size: 1251205
>
> The lines get repeated 7 times before getting the next entry. I used:
>
> while (<REPORT>) {
>     if (/File/gc) {   # I tried with or without gc but it made no difference.
>         $name = $';
>         chomp($name);
>         ($file = $name) =~ s/ Name: //;
>     }
>     if (/Raw/) {
>         $size = $';
>         chomp($size);
>         ($foo = $size) =~ s/\((Compression...*)//;
>         ($bar = $foo) =~ s/size/Uncompressed Size/;
>     }
>
>     print OUTPUT "$file $bar \n";

Since you have blank lines between each entry, you can have Perl read
each entry "paragraph" as a single string by setting $/ to "". Then
just use a regex to grab the items you need. Give this a try:

   $/ = "";     # read "paragraphs"
   while (<REPORT>) {
      s/\s+$//;
      my ($f, $r, $a) = /File Name: (.*?)\n.*Raw.*?(\d+).*Actual.*?(\d+)/s;
      ...
   }
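
For completeness, here is a minimal, self-contained sketch of the same
paragraph-mode idea that you can run as-is. The sub names (parse_entry,
summarize), the in-memory sample data and the whole second entry are my
own assumptions for illustration; only the first entry and the regex
come from this thread. Note that the quoted loop prints once per input
line rather than once per entry, which appears to be why each result
showed up 7 times; here one line is produced per paragraph instead.

```perl
use strict;
use warnings;

# Pull the file name, raw size and actual size out of one entry.
# Returns an empty list if the entry does not match.
sub parse_entry {
    my ($entry) = @_;
    $entry =~ s/\s+$//;    # drop the trailing newlines
    my ($file, $raw, $actual) =
        $entry =~ /File Name: (.*?)\n.*Raw.*?(\d+).*Actual.*?(\d+)/s;
    return defined $file ? ($file, $raw, $actual) : ();
}

# Read every entry from a filehandle in paragraph mode and return
# one formatted line per entry.
sub summarize {
    my ($fh) = @_;
    local $/ = "";    # paragraph mode, restored when the sub returns
    my @lines;
    while (my $entry = <$fh>) {
        my ($file, $raw, $actual) = parse_entry($entry) or next;
        push @lines, "$file Uncompressed Size: $raw Actual size: $actual";
    }
    return @lines;
}

# Sample input. The first entry is the one quoted in this thread;
# the second entry is hypothetical, made up to show a second record.
my $sample = <<'END';
File Name: m:\a\a084099.jpg
Width x Height: 2480 x 2062
Number of Colours: True Colour (24 bits)
Dots per inch: 300 x 300
Image size (inches): 8.27 x 6.87
Raw size: 15341280 Actual size: 1203289 (Compression ratio: 12.8:1)

File Name: m:\a\a084100.jpg
Width x Height: 2480 x 2017
Number of Colours: True Colour (24 bits)
Dots per inch: 300 x 300
Image size (inches): 8.27 x 6.72
Raw size: 15006480 Actual size: 1251205 (Compression ratio: 12.0:1)
END

# Read the sample through an in-memory filehandle instead of REPORT.
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
print "$_\n" for summarize($fh);
close $fh;
```

With the sample above this prints one line per entry:

   m:\a\a084099.jpg Uncompressed Size: 15341280 Actual size: 1203289
   m:\a\a084100.jpg Uncompressed Size: 15006480 Actual size: 1251205

To use it on the real report, open the file and pass that handle to
summarize() in place of the in-memory one.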