> On Tuesday 29 March 2005 at 17:52, Orton, Yves wrote:
> > >
> > > Any other name ideas or comments about the module and its
> > > interface?
> >
> > I started working on a project like this but never got around to
> > finishing it. I called it "Generic Record Processing System", i.e.
> > GRPS. The point being that this isn't a facility related to parsing
> > log files, it's a facility relating to processing any file of
> > parsable records in a mechanical way.
>
> This is one reason why I didn't like the Log:: prefix. This also
> implies that my Regexp::Log suite of modules is badly named as well.
>
> > Interesting rules would be stuff like set membership "is field x in
> > set Y" (which would be implemented as a hash lookup etc). Also
> > interesting would be a framework for specifying which ruleset to
> > apply based on filename/directory conventions. Additionally stuff
> > like record transformation, prefix matching and logical evaluation
> > would be cool too.
>
> Set operations? Good idea. If the field is a string and the set is a
> set of strings, computing a regular expression with Regexp::Assemble
> might be a good option as well.
Well, actually, if the set is a set of strings a hash lookup will be faster. If it's a set of regexes then yes, and if it's a prefix match you will probably want to use something like Regexp::Assemble (RA) for it in current Perl releases.
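
To make the distinction concrete, here is a minimal sketch using only core Perl. The data and the names in_set and has_prefix are made up for illustration, and the hand-built alternation only stands in for the kind of pattern a module like Regexp::Assemble would construct; it is not that module's output.

```perl
use strict;
use warnings;

# Hypothetical data: a set of known hosts and a few path prefixes.
my @hosts    = qw(alpha beta gamma);
my @prefixes = qw(/img/ /css/ /js/);

# Exact membership: build the hash once, then each test is one lookup.
my %is_host = map { $_ => 1 } @hosts;

sub in_set { return exists $is_host{ $_[0] } ? 1 : 0 }

# Prefix match: a plain hash lookup cannot answer this, so build a single
# anchored alternation. quotemeta() keeps the strings literal.
my $alt = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } @prefixes;
my $prefix_re = qr/^(?:$alt)/;

sub has_prefix { return $_[0] =~ $prefix_re ? 1 : 0 }

print in_set('beta'), in_set('delta'),
      has_prefix('/img/logo.png'), has_prefix('/html/index'), "\n";
```

The hash answers "is field x in set Y" for exact strings; the regex is only needed once the question becomes "does field x start with something in set Y".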
<shameless plug>
But David and the other Regexp authors need to update their code to take advantage of the innate TRIE optimisation in 5.9.2 and later. They still have room to optimise the patterns they build, but they will need to build fairly different-looking patterns to really harness the TRIE regop.
</shameless plug>
> And then, caching the generated code (e.g. because Regexp::Assemble takes some time to run) could be a bonus.
What I did was generate a Perl subroutine, write it to disk, then "do FILE" it into existence. That way the debugger can find the source text, which means you can use it to debug the generated code, and you also get useful error messages when it blows up. You could then just return a reference to the code block.
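
Roughly, the idea looks like the sketch below. This is not my original code: compile_rule, the temporary cache directory and the set-membership rule are all invented for illustration, and it assumes the field name is a simple identifier.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical cache directory; a real tool would use a stable location.
my $cache_dir = tempdir(CLEANUP => 1);

# Generate the source of an anonymous sub, write it to disk, and load it
# with do FILE. The last statement in the file is the sub, so do returns
# a code reference.
sub compile_rule {
    my ($name, $field, @members) = @_;
    my $file = File::Spec->rel2abs(
        File::Spec->catfile($cache_dir, "$name.pl"));

    # Serialise the set as a Perl hash literal.
    local $Data::Dumper::Terse    = 1;
    local $Data::Dumper::Indent   = 0;
    local $Data::Dumper::Sortkeys = 1;
    my $dumped = Dumper({ map { $_ => 1 } @members });

    # Assumes ${field} is a simple identifier (no quotes).
    my $source = <<"END_SRC";
# Generated rule ${name}: is field ${field} in the set?
use strict;
use warnings;
my \$set = $dumped;
sub {
    my (\$record) = \@_;
    return exists \$set->{ \$record->{'${field}'} } ? 1 : 0;
};
END_SRC

    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} $source;
    close $fh or die "Cannot close $file: $!";

    my $code = do $file;
    die "Cannot compile $file: $@" if $@;
    die "Cannot read $file: $!" unless defined $code;
    return $code;
}

my $rule = compile_rule('known_host', 'host', qw(alpha beta gamma));
print $rule->({ host => 'beta' }), $rule->({ host => 'delta' }), "\n";
```

Because the code comes from a real file, errors and warnings report that file name and line number, and the debugger can show the source. A caching layer could skip regeneration when the file already exists and is newer than the rule definition.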
Actually, on thinking about it, maybe the Inline framework could be harnessed along with Parse::RecDescent. Then it's just an Inline module, which would handle all the details of caching and directories and stuff...
Anyway, I'm not saying you should do it like I was planning to, just sharing my thoughts on the subject. I really look forward to seeing what you come up with.
Cheers,
Yves