2016-03-14 16:22 GMT+03:00 Shulgin, Oleksandr <oleksandr.shul...@zalando.de> :
> On Mon, Mar 7, 2016 at 10:46 PM, Artur Zakirov <a.zaki...@postgrespro.ru>
> wrote:
>
>> Hello,
>>
>> On 07.03.2016 23:55, Dmitrii Golub wrote:
>>
>>> Hello,
>>>
>>> Should we add tests for this case?
>>
>> I think we should. I have added tests for teo...@123-stack.net and
>> 1...@stack.net emails.
>>
>>> 123_reg.ro <http://123_reg.ro> is not a valid domain name, because of
>>> the symbol "_"
>>>
>>> https://tools.ietf.org/html/rfc1035 page 8.
>>>
>>> Dmitrii Golub
>>
>> Thank you for the information. Fixed.
>
> Hm... now that doesn't look all that consistent to me (after applying the
> patch):
>
> =# select ts_debug('simple', 'a...@123-yyy.zzz');
>                                  ts_debug
> ---------------------------------------------------------------------------
>  (email,"Email address",a...@123-yyy.zzz,{simple},simple,{a...@123-yyy.zzz})
> (1 row)
>
> But:
>
> =# select ts_debug('simple', 'aaa@123_yyy.zzz');
>                         ts_debug
> ---------------------------------------------------------
>  (asciiword,"Word, all ASCII",aaa,{simple},simple,{aaa})
>  (blank,"Space symbols",@,{},,)
>  (uint,"Unsigned integer",123,{simple},simple,{123})
>  (blank,"Space symbols",_,{},,)
>  (host,Host,yyy.zzz,{simple},simple,{yyy.zzz})
> (5 rows)
>
> One can also see that if we only keep the domain name, the result is
> similar:
>
> =# select ts_debug('simple', '123-yyy.zzz');
>                        ts_debug
> -------------------------------------------------------
>  (host,Host,123-yyy.zzz,{simple},simple,{123-yyy.zzz})
> (1 row)
>
> =# select ts_debug('simple', '123_yyy.zzz');
>                       ts_debug
> -----------------------------------------------------
>  (uint,"Unsigned integer",123,{simple},simple,{123})
>  (blank,"Space symbols",_,{},,)
>  (host,Host,yyy.zzz,{simple},simple,{yyy.zzz})
> (3 rows)
>
> But, this only has to do with 123 being recognized as a number, not with
> the underscore:
>
> =# select ts_debug('simple', 'abc_yyy.zzz');
>                        ts_debug
> -------------------------------------------------------
>  (host,Host,abc_yyy.zzz,{simple},simple,{abc_yyy.zzz})
> (1 row)
>
> =# select ts_debug('simple', '1abc_yyy.zzz');
>                        ts_debug
> -------------------------------------------------------
>  (host,Host,1abc_yyy.zzz,{simple},simple,{1abc_yyy.zzz})
> (1 row)
>
> In fact, the 123-yyy.zzz domain is not valid either according to the RFC
> (a subdomain can't start with a digit), but since we already allow it,
> should we not allow 123_yyy.zzz to be recognized as a Host? Then why not
> recognize aaa@123_yyy.zzz as an email address?
>
> Another option is to prohibit underscores in recognized host names, but
> this has more breakage potential IMO.
>
> --
> Alex
>

Alex, actually a subdomain can start with a digit; try it.
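
For reference, here is a minimal sketch of the two label rules being argued about. It assumes the RFC 1035 grammar (a label starts with a letter) and the RFC 1123 relaxation (the first character may be a letter or a digit). The names `labels_ok` and `classify` are made up for illustration, and this is not how the PostgreSQL text search parser decides on token types; it only checks the host names from this thread against the two grammars.

```python
import re

# RFC 1035, section 2.3.1: <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
RFC1035_LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")
# RFC 1123, section 2.1: the first character may also be a digit.
RFC1123_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def labels_ok(host, pattern):
    """Return True if every dot-separated label of host fully matches pattern.

    Labels longer than 63 characters are rejected as well.
    """
    return all(
        len(label) <= 63 and pattern.fullmatch(label) is not None
        for label in host.split(".")
    )


def classify(host):
    """Return (valid under RFC 1035, valid under RFC 1123) for a host name."""
    return (labels_ok(host, RFC1035_LABEL), labels_ok(host, RFC1123_LABEL))


if __name__ == "__main__":
    for host in ("123-yyy.zzz", "123_yyy.zzz", "abc_yyy.zzz", "1abc_yyy.zzz"):
        print(host, classify(host))
```

Under these assumptions a leading digit is rejected only by the older grammar, while the underscore is rejected by both, which is the distinction between the two points made above.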