Re: [Haskell-cafe] Parsing unstructured data

2007-12-05 Thread Olivier Boudry
On Nov 29, 2007 5:31 AM, Reinier Lamers <[EMAIL PROTECTED]> wrote:
> Especially in the fuzzy cases like this one, NLP often turns to machine
> learning models. One could try to train a hidden Markov model or support
> vector machines to label parts of the string as "name", "street",
> "number", "c
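For anyone who wants to see what the hidden Markov model suggestion could look like in practice, here is a minimal sketch of Viterbi decoding over address tokens. Everything in it is an assumption made for illustration: the three labels, the coarse observation classes in `features`, and all the probabilities are picked by hand, whereas a real model would estimate them from labelled addresses.

```python
import math

# Hypothetical labels and hand-picked probabilities, for illustration only.
STATES = ["name", "number", "street"]

START = {"name": 0.8, "number": 0.1, "street": 0.1}
TRANS = {
    "name":   {"name": 0.50, "number": 0.40, "street": 0.10},
    "number": {"name": 0.05, "number": 0.05, "street": 0.90},
    "street": {"name": 0.05, "number": 0.15, "street": 0.80},
}
EMIT = {
    "name":   {"digits": 0.02, "street_word": 0.03, "word": 0.95},
    "number": {"digits": 0.96, "street_word": 0.02, "word": 0.02},
    "street": {"digits": 0.05, "street_word": 0.45, "word": 0.50},
}
STREET_WORDS = {"st", "street", "rd", "road", "ave", "avenue"}


def features(token):
    """Map a token to one of three coarse observation classes."""
    if token.isdigit():
        return "digits"
    if token.lower().rstrip(".") in STREET_WORDS:
        return "street_word"
    return "word"


def viterbi(tokens):
    """Return [(token, label)] for the most probable label sequence."""
    if not tokens:
        return []
    obs = [features(t) for t in tokens]
    # Each column maps state -> (best log-probability, previous state).
    columns = [{
        s: (math.log(START[s]) + math.log(EMIT[s][obs[0]]), None)
        for s in STATES
    }]
    for o in obs[1:]:
        prev = columns[-1]
        col = {}
        for s in STATES:
            score, best_prev = max(
                (prev[p][0] + math.log(TRANS[p][s]), p) for p in STATES
            )
            col[s] = (score + math.log(EMIT[s][o]), best_prev)
        columns.append(col)
    # Backtrack from the best final state to the first token.
    state = max(STATES, key=lambda s: columns[-1][s][0])
    path = [state]
    for col in reversed(columns[1:]):
        state = col[state][1]
        path.append(state)
    path.reverse()
    return list(zip(tokens, path))


labelled = viterbi("John Smith 12 Main Street".split())
```

With these made-up numbers the transition table is what pulls "Main" into the street part: on its own it is just a word, but it follows a number. That context sensitivity is the main thing this kind of model adds over token-by-token rules.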

Re: [Haskell-cafe] Parsing unstructured data

2007-12-03 Thread Olivier Boudry
On 12/2/07, Steven Fodstad <[EMAIL PROTECTED]> wrote:
>
> Sorry for not responding earlier. The haskell-cafe list is hard to keep
> up with.
>
> The process of finding geographic (lat/long) coordinates from a text
> address is called geocoding. Obviously extracting the parts of an
> address is pa

Re: [Haskell-cafe] Parsing unstructured data

2007-11-29 Thread Reinier Lamers
Olivier Boudry wrote:
On 11/28/07, *Grzegorz Chrupala* <[EMAIL PROTECTED]> wrote:
You may have better luck checking out methods used in parsing natural language. In order to use statistical parsing techniques such as Probabilistic Context Free Grammars ([

Re: [Haskell-cafe] Parsing unstructured data

2007-11-28 Thread Olivier Boudry
On 11/28/07, Grzegorz Chrupala <[EMAIL PROTECTED]> wrote:
>
> You may have better luck checking out methods used in parsing natural
> language. In order to use statistical parsing techniques such as
> Probabilistic Context Free Grammars ([1],[2]) the standard approach is to
> extract rule probabil
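As a rough illustration of how rule probabilities for a PCFG can be obtained, the sketch below counts productions in a few hand-annotated trees and normalises the counts per left-hand side (relative-frequency estimation). The quoted message is cut off in this archive view, so treating that as the approach meant is an assumption, and the tiny treebank and its labels (`ADDR`, `NAME`, `STREET`, `NUM`, `WORD`) are invented for the example.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical annotated addresses. A tree is (label, child, ...);
# a leaf is a plain string.
TREEBANK = [
    ("ADDR", ("NAME", "Acme"), ("STREET", ("NUM", "12"), ("WORD", "Main"))),
    ("ADDR", ("NAME", "Bolt"), ("STREET", ("WORD", "Quay"))),
    ("ADDR", ("STREET", ("NUM", "7"), ("WORD", "Elm"))),
]


def rules(tree):
    """Yield one (lhs, rhs) production for every internal node of a tree."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield label, rhs
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)


def estimate(treebank):
    """Relative frequency: count(lhs -> rhs) / count(lhs)."""
    counts = Counter(rule for tree in treebank for rule in rules(tree))
    totals = Counter()
    for (lhs, _), n in counts.items():
        totals[lhs] += n
    return {rule: Fraction(n, totals[rule[0]]) for rule, n in counts.items()}


probs = estimate(TREEBANK)
for (lhs, rhs), p in sorted(probs.items()):
    print(f"{lhs} -> {' '.join(rhs)}  [{p}]")
```

The catch for address cleaning is that somebody has to annotate the example trees first, and the estimates are only as good as that sample.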

Re: [Haskell-cafe] Parsing unstructured data

2007-11-28 Thread Grzegorz Chrupala
Olivier Boudry wrote:
>
> Hi all,
>
> This e-mail may be a bit off topic. My question is more about methods and
> algorithms than Haskell. I'm looking for links to methods or tools for
> parsing unstructured data.
>
> I'm currently working on data cleaning of a Customer Addresses database.
> A

Re: [Haskell-cafe] Parsing unstructured data

2007-11-28 Thread Olivier Boudry
On 11/28/07, Hans van Thiel <[EMAIL PROTECTED]> wrote:
>
> Have you looked at the Java Rule Engine (I believe JSR 94) and in
> particular Jess?
> http://herzberg.ca.sandia.gov/
>
> I have no experience with it myself, though, just heard of it.
>
> Regards,
>
> Hans van Thiel

Hi Hans,

Never heard

Re: [Haskell-cafe] Parsing unstructured data

2007-11-28 Thread Hans van Thiel
On Wed, 2007-11-28 at 12:58 -0500, Olivier Boudry wrote:
> Hi all,
>
> This e-mail may be a bit off topic. My question is more about methods
> and algorithms than Haskell. I'm looking for links to methods or tools
> for parsing unstructured data.
>
> I'm currently working on data cleaning of a Cu

[Haskell-cafe] Parsing unstructured data

2007-11-28 Thread Olivier Boudry
Hi all,

This e-mail may be a bit off topic. My question is more about methods and algorithms than Haskell. I'm looking for links to methods or tools for parsing unstructured data.

I'm currently working on data cleaning of a Customer Addresses database. Addresses are stored as 3 lines of text with