You say that named entity recognition is not generalised beyond Mail, but the support library is there for anyone to use. See for example https://developer.apple.com/documentation/foundation/nslinguistictagger/identifying_people_places_and_organizations
In Python, you can use NLTK to do roughly the same. There's no real point in reimplementing this stuff in Pharo. Just set up a separate process, send text to it, and receive results back.

On Thu, 7 Mar 2019 at 22:53, Cédrick Béler <cdric...@gmail.com> wrote:

> Hi all,
>
> I’ve often got the need to analyse some random unstructured text to
> discover (structured) information (in email for instance), to extract :
> - emails
> - telephone numbers
> - addresses
> - events
> - person names (according to a list of known persons),
> - etc…
>
> Apple do it in email for instance (strangely, this is not generalized).
>
> So my questions are :
> - do we have something equivalent in Smalltalk/Pharo ? (I didn’t find)
> - if not, what strategy would you use ?
> => I do really stupid text analysis (substrings, finding @, …, parsing
> according to the text structure when there is… kind of Soup parsing…)
> => I feel this is a job for PetitParser ? And would be a nice feet to the
> new GToolkit.
>
> All ideas or suggestions are welcome ;-)
>
> TIA,
>
> Cédrick
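
To make the separate-process suggestion a bit more concrete, here is a rough sketch of what the Python side could look like. Everything in it is an assumption for illustration: the line-based JSON protocol, the names extract_entities, handle_request and serve, and the two regular expressions, which are deliberately naive. It uses only the standard library and covers just emails, phone-like numbers, and names from a known list; real named entity recognition would go through something like NLTK in place of the lookup.

```python
import json
import re

# Deliberately simple patterns (assumptions for this sketch);
# real e-mail and phone formats are messier than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def extract_entities(text, known_persons=()):
    """Find e-mail addresses, phone-like numbers and known person names."""
    lowered = text.lower()
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": PHONE_RE.findall(text),
        # Plain case-insensitive substring lookup against a list of known names.
        "persons": [name for name in known_persons if name.lower() in lowered],
    }


def handle_request(line):
    """Turn one JSON request line into one JSON reply line (hypothetical protocol)."""
    request = json.loads(line)
    reply = extract_entities(request["text"], request.get("known_persons", ()))
    return json.dumps(reply)


def serve(stdin, stdout):
    """Answer requests until the caller closes its end of the pipe."""
    for line in stdin:
        if line.strip():
            stdout.write(handle_request(line) + "\n")
            stdout.flush()  # the caller is waiting for exactly one line back
```

Running it as a worker would mean calling serve(sys.stdin, sys.stdout) at the bottom of the script. The Pharo image would then start the script with whatever subprocess facility it has available, write one JSON object per line, and read one line back per request.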