On Fri, Oct 25, 2002 at 01:49:19AM -0500, _brian_d_foy wrote:
> In article <20021025063502.GA22390@heat>, Jeffrey Baker <[EMAIL PROTECTED]> wrote:
>
> > On Fri, Oct 25, 2002 at 01:07:08AM -0500, _brian_d_foy wrote:
>
> > Well, that's the trick. You tell it what company you are
> > looking for (e.g. O'Reilly and Associates), and the module
> > finds instances of that company in the free text (e.g.
> > O'Reilly, O'Reilly and Assoc., ORA, etc.)
>
> in that case i don't think this belongs in Finance. it doesn't
> have anything to do with Finance (or as much to do with Finance
> as Marketing::*, Sales::*, etc).

The guys here are OK with Lingua::EN.

> > The main point is that the code understands permutations
> > applied to company names in the English language, such as
> > contraction, abbreviation, embellishment, and so forth.
>
> i think this might live under Lingua:: or something similar,
> then.

OK.

> > As usual with CPAN, we hope to put it in the section where
> > people will actually find it. Finance or Business is the
> > most suitable first word, but unfortunately Finance is in
> > Chapter 23 "Also Ran", and this module is about text. So
> > either would be appropriate.
>
> > How about Text::ExtractCompanyNames?
> > Business::ExtractCompanyNames?
>
> even those names have problems. right now you only handle english,
> but someone may want to handle another language. your namespace
> should be able to accommodate that.

Anybody wanting to handle another language is going to start
from scratch. None of the techniques we use is applicable
to anything other than English.

> beyond that, you want to separate parts of the names.
> "ExtractCompanyName" has too much in one part of the namespace,
> so the "Extract" should be separated from the "CompanyName".
>
> i'm thinking something like:
>
>     Lingua::EN::CompanyName
>
> that can easily accommodate other languages. then, you can provide
> another module to extract the possible permutations that the Lingua::
> modules return (and that module probably lives in another namespace).
> separate the permutation logic from the extraction logic and you will
> have much better flexibility.

I don't think it is possible to separate the two as you suggest,
but the namespace works for us anyway.

Regards,
Jeffrey