Use LWP to fetch web data rather than shelling out to lynx and the like, unless you can't help it. I prefer Web::Scraper for parsing HTML, but either way it's probably best not to parse HTML with a regex (see Stack Overflow and similar for discussions of why).

On Feb 23, 2014 8:13 AM, "Wernher Eksteen" <crypt...@gmail.com> wrote:
>
> Hi,
>
> Thanks, but how do I assign the value found by the regex to a variable so that the "1.2.4" from 6 file names in the array @fileList are print only once, and if there are other versions found say 1.2.5 and 1.2.6 to print the unique values from all.
>
> This is my script thus far. The aim of this script is to connect to the site, remove all html tags and obtain only the file names I need.
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> # initiating package names to be used later
> my @getList;
> my @fileList;
>
> # get files using lynx and parse through it
> my $url = "http://mathias-kettner.com/download";
> open my $in, "lynx -dump $url |" or die $!;
>
> # get the bits we need and push it to an array to further filter what we need
> while(<$in>){
>     chomp;
>     if( /\[(\d+)\](.+)/ ){
>         next if $1 == 1;
>         push @getList, "$2\n";
>     }
> }
>
> # filter only the files we need into final array
> foreach my $i (@getList) {
>     my @list = split /\s+/, $i;
>     push @fileList, "$list[0]\n", if $i =~ /rpm|tar/ && $i !~ /[0-9][a-z]/;
> }
>
> # print the list
> print "\nList of files to be retrieved from $url:\n\n @fileList\n";
>
> The output is then:
>
> List of files to be retrieved from http://mathias-kettner.com/download:
>
>
> check_mk-1.2.4.tar.gz
> check_mk-agent-1.2.4-1.noarch.rpm
> check_mk-agent-logwatch-1.2.4-1.noarch.rpm
> check_mk-agent-oracle-1.2.4-1.noarch.rpm
> mk-livestatus-1.2.4.tar.gz
> mkeventd-1.2.4.tar.gz
>
> From that I want to get the value 1.2.4 and assign it to a variable, if there are more than one value such as 1.2.5 and 1.2.6 as well, it should print them too, but only the unique values.
>
> My attempt shown below to print only the value 1.2.4 is as follow, but it prints out "1.2.41.2.41.2.41.2.41.2.41.2.4" next to each other, if I pass a newline to $i such as "$i\n" it then prints "111111" ?
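>
> foreach my $i (@fileList) {
>     print $i =~ /\b(\d+\.\d+\.\d+)\b/;
> }
>

The 1s are the return values of the match: each one is just true (one successful match), not the text that was captured. You want the match as a condition instead, i.e. print "$i\n" if (foo). After a successful match the captured version is in $1.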
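
For the "assign it to a variable and print only the unique values" part, here is a minimal sketch rather than a drop-in fix. It assumes @fileList holds names like the ones in your output. The sub name unique_versions and the hard-coded sample list are mine, purely for illustration. It relies on a match with a capture group returning the captured text in list context, and on a %seen hash to skip versions that have already been collected.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data copied from the output quoted above (assumption: the real
# script fills @fileList from the download page instead).
my @fileList = qw(
    check_mk-1.2.4.tar.gz
    check_mk-agent-1.2.4-1.noarch.rpm
    check_mk-agent-logwatch-1.2.4-1.noarch.rpm
    check_mk-agent-oracle-1.2.4-1.noarch.rpm
    mk-livestatus-1.2.4.tar.gz
    mkeventd-1.2.4.tar.gz
);

# Hypothetical helper: returns each distinct x.y.z version found in the
# given names, in the order it was first seen.
sub unique_versions {
    my @names = @_;
    my %seen;
    my @versions;
    for my $name (@names) {
        # List-context match: the capture lands in $version;
        # names without a version are skipped.
        my ($version) = $name =~ /\b(\d+\.\d+\.\d+)\b/ or next;
        push @versions, $version unless $seen{$version}++;
    }
    return @versions;
}

my @versions = unique_versions(@fileList);
print "$_\n" for @versions;    # one line per distinct version
```

With the six names above this prints 1.2.4 once. If 1.2.5 or 1.2.6 files show up later, they would each be printed once as well, and @versions keeps them available for the rest of the script.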