Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores

2009-10-22 Thread Alvaro Herrera
Euler Taveira de Oliveira wrote:
> Robert Haas wrote:
> > I'm not real familiar with ts_parse(), but I'm thinking that it
> > doesn't have any special casing for email addresses and is just
> > intended to parse text for full-text-search - in which case splitting
> > on _ is a pretty good alg

Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores

2009-10-22 Thread Dan O'Hara
I agree that it isn't easy to determine if given text is a valid email address. As I couldn't use ts_parse, I ended up using a regex, which worked substantially better at pulling out the emails from the text stream. I haven't looked at the code, but perhaps it is possible to do the same thing her
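For illustration, a regex-based extraction of the kind described above might look like the sketch below. The regex actually used is not shown in the thread, so the pattern, the `extract_emails` helper, and the sample addresses are all assumptions. The pattern is deliberately simplified and is not a full RFC 5322 matcher; its only point is that underscores are accepted in the local part.

```python
import re

# Hypothetical, simplified pattern (NOT the regex from the thread and not
# RFC 5322 complete): a local part that may contain underscores, an "@",
# then a domain made of at least two dot-separated labels.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Return every substring of text that matches EMAIL_RE, in order."""
    return EMAIL_RE.findall(text)


if __name__ == "__main__":
    sample = "Contact first_last@example.com or bob@test.org."
    print(extract_emails(sample))
```

Because the domain part requires characters after each dot, a sentence-ending period directly after an address is not swallowed into the match.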

Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores

2009-10-22 Thread Euler Taveira de Oliveira
Robert Haas wrote:
> I'm not real familiar with ts_parse(), but I'm thinking that it
> doesn't have any special casing for email addresses and is just
> intended to parse text for full-text-search - in which case splitting
> on _ is a pretty good algorithm.
>
It is a bug. The tsearch claims to

Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores

2009-10-22 Thread Dan O'Hara
Thanks for having a look at this bug. According to section 12.8.2 of the postgres manual, ts_parse is supposed to recognize different types of data, one of which (#4) is an email address. The list of recognized data formats for parse can be selected via this query: SELECT * FROM ts_token_type('

Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores

2009-10-22 Thread Robert Haas
On Fri, Aug 28, 2009 at 9:59 AM, Dan O'Hara wrote:
>
> The following bug has been logged online:
>
> Bug reference:      5021
> Logged by:          Dan O'Hara
> Email address:      danarasoftw...@gmail.com
> PostgreSQL version: 8.3.7
> Operating system:   win32
> Description:        ts_parse doesn

[BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores

2009-08-28 Thread Dan O'Hara
The following bug has been logged online:
Bug reference: 5021
Logged by: Dan O'Hara
Email address: danarasoftw...@gmail.com
PostgreSQL version: 8.3.7
Operating system: win32
Description: ts_parse doesn't recognize email addresses with underscores
Details: In the foll