Re: [HACKERS] Html parsing and inline elements

2016-05-01 Thread Ryan Pedela
On Wed, Apr 13, 2016 at 9:57 AM, Marcelo Zabani wrote:
> Hi, Tom,
>
> You're right, I don't think one can argue that the default parser should know HTML.
> How about your suggestion of there being an HTML parser, is it feasible? I ask this because I think that a lot of people store HTML docum

Re: [HACKERS] Html parsing and inline elements

2016-04-30 Thread Oleg Bartunov
On Wed, Apr 13, 2016 at 6:57 PM, Marcelo Zabani wrote:
> Hi, Tom,
>
> You're right, I don't think one can argue that the default parser should know HTML.
> How about your suggestion of there being an HTML parser, is it feasible? I ask this because I think that a lot of people store HTML docum

Re: [HACKERS] Html parsing and inline elements

2016-04-29 Thread David G. Johnston
On Fri, Apr 29, 2016 at 1:47 PM, Bruce Momjian wrote:
> On Wed, Apr 13, 2016 at 12:57:19PM -0300, Marcelo Zabani wrote:
> > Hi, Tom,
> >
> > You're right, I don't think one can argue that the default parser should know HTML.
> > How about your suggestion of there being an HTML parser, is it

Re: [HACKERS] Html parsing and inline elements

2016-04-29 Thread Bruce Momjian
On Wed, Apr 13, 2016 at 12:57:19PM -0300, Marcelo Zabani wrote:
> Hi, Tom,
>
> You're right, I don't think one can argue that the default parser should know HTML.
> How about your suggestion of there being an HTML parser, is it feasible? I ask this because I think that a lot of people store HT

Re: [HACKERS] Html parsing and inline elements

2016-04-13 Thread Marcelo Zabani
Hi, Tom,

You're right, I don't think one can argue that the default parser should know HTML. How about your suggestion of there being an HTML parser, is it feasible? I ask this because I think that a lot of people store HTML documents these days, and although there probably aren't lots of HTML wit

Re: [HACKERS] Html parsing and inline elements

2016-04-13 Thread Tom Lane
Marcelo Zabani writes:
> I was here wondering whether HTML parsing should separate tokens that are not separated by spaces in the original text, but are separated by an inline element. Let me show you an example:
> *SELECT to_tsvector('english', 'Helloneighbor, you are nice')*
> *Results:**

[HACKERS] Html parsing and inline elements

2016-04-13 Thread Marcelo Zabani
Hi everyone,

I was here wondering whether HTML parsing should separate tokens that are not separated by spaces in the original text, but are separated by an inline element. Let me show you an example:
*SELECT to_tsvector('english', 'Helloneighbor, you are nice')*
*Results:** "'ce':7 'hello':1 'n'
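
For readers following the thread, one way to get the behavior asked about here is to flatten the HTML on the client side before handing the text to to_tsvector. What follows is only a rough sketch of that idea in Python, not anything proposed in the thread or done by PostgreSQL itself. The function name html_to_text, the INLINE_TAGS set, and the markup in the sample string are all hypothetical (the tags in the archived example above did not survive). The sketch treats any tag outside INLINE_TAGS as a word boundary and lets inline tags join the text on either side.

```python
from html.parser import HTMLParser

# Illustrative and deliberately incomplete set of inline elements.
# This list is an assumption for the sketch, not taken from any standard.
INLINE_TAGS = {"a", "abbr", "b", "em", "i", "span", "strong", "u"}


class TextExtractor(HTMLParser):
    """Collects text, adding a space around every non-inline tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def _boundary(self, tag):
        # Only non-inline (block-like) tags separate tokens.
        if tag not in INLINE_TAGS:
            self.parts.append(" ")

    def handle_starttag(self, tag, attrs):
        self._boundary(tag)

    def handle_endtag(self, tag):
        self._boundary(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> need a single boundary.
        self._boundary(tag)

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Hypothetical helper: strip tags, then collapse runs of whitespace."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    # Hypothetical markup: a block element after "Hello", an inline one in "nice".
    print(html_to_text("Hello<div>neighbor</div>, you are <span>n</span>ice"))
```

The flattened string could then be passed as the second argument to to_tsvector, so that "Hello" and "neighbor" index as separate words while "nice" stays whole.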