The lookup hash looks like this:
%clean=(
Heating Engineer => (?:Heating.*?Engineer)\b.*?
HGV Driver => (?=\A(?:(?!tech|mech).)*$)(?:HGV|LGV|Class.?1|Class.?2).?(?:1|2|3|)(?:.+Driver|).*?
HGV Mechanic => (?:(?:HGV|LGV|Lorry).+(?:Mech?anics?|technicians?))\b.*?
Highway Engineer => (?:(?:Highway.?) (?:Engineer.?))\b.*?
Highway Technician => (?:Highway.?) (?:Technician.?)\b.*?
)
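
As a runnable sketch, the table might be written as below. The `pattern` and `replace` field names come from the loop that uses the hash; the rest of the layout, the choice to precompile with qr//, and the sample title are assumptions for illustration only.

```perl
use strict;
use warnings;

# Assumed layout: key => { replace => clean title, pattern => compiled regex }.
# Only two of the entries are shown.
my %clean = (
    'Heating Engineer' => {
        replace => 'Heating Engineer',
        pattern => qr/(?:Heating.*?Engineer)\b.*?/i,
    },
    'HGV Mechanic' => {
        replace => 'HGV Mechanic',
        pattern => qr/(?:(?:HGV|LGV|Lorry).+(?:Mech?anics?|technicians?))\b.*?/i,
    },
);

# Quick check of each entry against a made-up title.
my $title = 'LGV Mechanic wanted';
for my $name (sort keys %clean) {
    print "$name\n" if $title =~ $clean{$name}{pattern};
}
```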

This is then used as follows:

foreach my $list (sort keys %jtitles) {
    no warnings 'uninitialized';

    # Try every pattern against this title; the last match (in key order) wins.
    foreach my $clean (sort keys %clean) {
        if ($list =~ /$clean{$clean}->{pattern}/i) {
            $jtitles{$list}->{JobClean} = $clean{$clean}->{replace};
        }
    }
}

(%jtitles is built from a larger listing of unformatted text.)
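
One variation that might be worth timing is to compile each pattern and sort the rule names once, before the big loop, so that neither happens again for every title. The sketch below assumes the patterns are stored as plain strings; the sample titles and the @rules array are made up for illustration, and I have not measured whether this helps on the real data.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the real %jtitles and %clean.
my %jtitles = (
    'Senior Heating Engineer, Leeds' => {},
    'LGV Mechanic wanted'            => {},
    'Receptionist'                   => {},
);
my %clean = (
    'Heating Engineer' => {
        replace => 'Heating Engineer',
        pattern => '(?:Heating.*?Engineer)\b.*?',
    },
    'HGV Mechanic' => {
        replace => 'HGV Mechanic',
        pattern => '(?:(?:HGV|LGV|Lorry).+(?:Mech?anics?|technicians?))\b.*?',
    },
);

# Compile every pattern once, in sorted key order, before the main loop.
my @rules = map { [ qr/$clean{$_}{pattern}/i, $clean{$_}{replace} ] }
            sort keys %clean;

for my $title (keys %jtitles) {
    for my $rule (@rules) {
        my ($re, $replace) = @$rule;
        # As in the original loop, a later matching rule overwrites an earlier one.
        $jtitles{$title}{JobClean} = $replace if $title =~ $re;
    }
}
```

The outer sort is dropped here because the order in which titles are visited does not change what ends up in JobClean.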


On 19 January 2015 at 13:53, Brandon McCaig <bamcc...@gmail.com> wrote:

> Mike:
>
> On Mon, Jan 19, 2015 at 01:25:56PM +0000, Mike Martin wrote:
> > Hi
>
> Hello,
>
> > I am looking for the most performant way to achieve this
> >
> > I have a big list of text (470000+ lines in a hash) I then run
> > a hash ref consisting of replacement text - pattern to search -
> > optional 3rd param for grouping matches
> >
> > So I loop through the text and then loop the regex hash against
> > each record
> >
> > This works but takes about 20 minutes
> >
> > Any suggestions
>
> It might help to post some actual code. I wonder how those 500k
> lines are getting loaded into the first hash in the first place,
> what keys are being used, and then how the second hash is related
> to it. This sounds a bit like an XY problem[1], but you'll likely
> want to solve the problem with a more ideal algorithm and that
> might require a more ideal data structure. Not that I'm any
> authority on this problem. :\
>
> [1] http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem
>
> Regards,
>
>
> --
> Brandon McCaig <bamcc...@gmail.com> <bamcc...@castopulence.org>
> Castopulence Software <https://www.castopulence.org/>
> Blog <http://www.bambams.ca/>
> perl -E '$_=q{V zrna gur orfg jvgu jung V fnl. }.
> q{Vg qbrfa'\''g nyjnlf fbhaq gung jnl.};
> tr/A-Ma-mN-Zn-z/N-Zn-zA-Ma-m/;say'
>
>
