On Mon, Mar 7, 2016 at 10:46 PM, Artur Zakirov <a.zaki...@postgrespro.ru> wrote:
> Hello,
>
> On 07.03.2016 23:55, Dmitrii Golub wrote:
>
>> Hello,
>>
>> Should we added tests for this case?
>
> I think we should. I have added tests for teo...@123-stack.net and
> 1...@stack.net emails.
>
>> 123_reg.ro <http://123_reg.ro> is not valid domain name, bacause of
>> symbol "_"
>>
>> https://tools.ietf.org/html/rfc1035 page 8.
>>
>> Dmitrii Golub
>
> Thank you for the information. Fixed.

Hm... now that doesn't look all that consistent to me (after applying
the patch):

=# select ts_debug('simple', 'a...@123-yyy.zzz');
                                 ts_debug
---------------------------------------------------------------------------
 (email,"Email address",a...@123-yyy.zzz,{simple},simple,{a...@123-yyy.zzz})
(1 row)

But:

=# select ts_debug('simple', 'aaa@123_yyy.zzz');
                        ts_debug
---------------------------------------------------------
 (asciiword,"Word, all ASCII",aaa,{simple},simple,{aaa})
 (blank,"Space symbols",@,{},,)
 (uint,"Unsigned integer",123,{simple},simple,{123})
 (blank,"Space symbols",_,{},,)
 (host,Host,yyy.zzz,{simple},simple,{yyy.zzz})
(5 rows)

One can also see that if we only keep the domain name, the result is
similar:

=# select ts_debug('simple', '123-yyy.zzz');
                       ts_debug
-------------------------------------------------------
 (host,Host,123-yyy.zzz,{simple},simple,{123-yyy.zzz})
(1 row)

=# select ts_debug('simple', '123_yyy.zzz');
                      ts_debug
-----------------------------------------------------
 (uint,"Unsigned integer",123,{simple},simple,{123})
 (blank,"Space symbols",_,{},,)
 (host,Host,yyy.zzz,{simple},simple,{yyy.zzz})
(3 rows)

But, this only has to do with 123 being recognized as a number, not
with the underscore:

=# select ts_debug('simple', 'abc_yyy.zzz');
                       ts_debug
-------------------------------------------------------
 (host,Host,abc_yyy.zzz,{simple},simple,{abc_yyy.zzz})
(1 row)

=# select ts_debug('simple', '1abc_yyy.zzz');
                        ts_debug
-------------------------------------------------------
 (host,Host,1abc_yyy.zzz,{simple},simple,{1abc_yyy.zzz})
(1 row)

In fact, the 123-yyy.zzz domain is not valid either according to the
RFC (subdomain can't start with a digit), but since we already allow
it, should we not allow 123_yyy.zzz to be recognized as a Host? Then
why not recognize aaa@123_yyy.zzz as an email address?

Another option is to prohibit underscore in recognized host names, but
this has more breakage potential IMO.

--
Alex
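
P.S. To make the RFC point concrete, here is a minimal sketch of the strict RFC 1035 label grammar (a label starts with a letter, ends with a letter or digit, and contains only letters, digits, and hyphens in between, up to 63 characters). It is written in Python purely for illustration and is not how the text search parser works; the names `is_rfc1035_label` and `is_rfc1035_host` are made up for this example, and the overall 255-character name limit is not checked.

```python
import re

# Strict RFC 1035 <label>: a letter, then optionally letters/digits/hyphens
# ending in a letter or digit; 63 characters at most.
LABEL_RE = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_rfc1035_label(label: str) -> bool:
    """Return True if `label` matches the strict RFC 1035 label grammar."""
    return LABEL_RE.fullmatch(label) is not None


def is_rfc1035_host(name: str) -> bool:
    """Return True if every dot-separated label of `name` is a valid label."""
    return all(is_rfc1035_label(part) for part in name.split("."))


# The host names discussed in this thread, plus one plainly valid name.
for host in ("123-yyy.zzz", "123_yyy.zzz", "abc_yyy.zzz",
             "1abc_yyy.zzz", "yyy.zzz"):
    print(host, is_rfc1035_host(host))
```

Under this strict reading only yyy.zzz passes: the names starting with a digit fail on the first character, and abc_yyy.zzz fails on the underscore. So the parser is already more permissive than the RFC in both respects, which is why treating the digit-leading and underscore cases differently looks arbitrary.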