Ping.

Thanks,
Xuelei

On 8/7/2013 11:17 PM, Xuelei Fan wrote:
> Please review the new update:
>
> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/
>
> With this update, "com." is valid (returns "com."); "." and
> "example..com" are invalid. An IAE will be thrown for an invalid IDN.
>
> Thanks,
> Xuelei
>
> On 8/7/2013 10:18 PM, Michael McMahon wrote:
>> On 07/08/13 15:13, Xuelei Fan wrote:
>>> On 8/7/2013 10:05 PM, Michael McMahon wrote:
>>>> Resolvers seem to accept queries using trailing dots.
>>>>
>>>> eg nslookup www.oracle.com.
>>>>
>>>> or InetAddress.getByName("www.oracle.com.");
>>>>
>>>> The part of RFC3490 quoted below seems to me to be saying
>>>> that the empty label implied by the trailing dot is not regarded
>>>> as a label so that you don't end up calling toAscii() or toUnicode()
>>>> with an empty string. I don't think it's saying the trailing dot can't
>>>> be there.
>>>>
>>> It makes sense.
>>>
>>> What's your preference to return for IDN.toASCII("www.oracle.com."),
>>> "www.oracle.com." or "www.oracle.com"? The current returned value is
>>> "www.oracle.com". I would like to preserve the behavior in this update.
>>
>> My opinion is to keep it as at present ie. "www.oracle.com."
>>
>> Michael
>>
>>> I think we will be on the same page soon.
>>>
>>> Thanks,
>>> Xuelei
>>>
>>>> Michael
>>>>
>>>> On 07/08/13 13:44, Xuelei Fan wrote:
>>>>> On 8/7/2013 12:06 AM, Matthew Hall wrote:
>>>>>> Trailing dots are allowed in plain DNS (thus almost surely in IDN),
>>>>>> and the single dot represents the root zone. So you have to be
>>>>>> careful making this sort of change to check the DNS RFCs first.
>>>>> That's the first question we need to answer: whether IDN allows trailing
>>>>> dots ("com."), a zero-length root label ("."), and zero-length labels
>>>>> ("", for example "example..com").
>>>>>
>>>>> Per the specification of IDN.toASCII():
>>>>> =======================================
>>>>> "ToASCII operation can fail. ToASCII fails if any step of it fails. If
>>>>> ToASCII operation fails, an IllegalArgumentException will be thrown. In
>>>>> this case, the input string should not be used in an internationalized
>>>>> domain name.
>>>>>
>>>>> A label is an individual part of a domain name. The original ToASCII
>>>>> operation, as defined in RFC 3490, only operates on a single label. This
>>>>> method can handle both label and entire domain name, by assuming that
>>>>> labels in a domain name are always separated by dots. ...
>>>>>
>>>>> Throws IllegalArgumentException - if the input string doesn't conform to
>>>>> RFC 3490 specification"
>>>>>
>>>>> Per the specification of RFC 3490:
>>>>> ==================================
>>>>> [section 2]
>>>>> "A label is an individual part of a domain name. Labels are usually
>>>>> shown separated by dots; for example, the domain name
>>>>> "www.example.com" is composed of three labels: "www", "example", and
>>>>> "com". (The zero-length root label described in [STD13], which can
>>>>> be explicit as in "www.example.com." or implicit as in
>>>>> "www.example.com", is not considered a label in this
>>>>> specification.)"
>>>>>
>>>>> "An "internationalized label" is a label to which the ToASCII
>>>>> operation (see section 4) can be applied without failing (with the
>>>>> UseSTD3ASCIIRules flag unset). ...
>>>>> Although most Unicode characters can appear in
>>>>> internationalized labels, ToASCII will fail for some input strings,
>>>>> and such strings are not valid internationalized labels."
>>>>>
>>>>> "An "internationalized domain name" (IDN) is a domain name in which
>>>>> every label is an internationalized label."
>>>>>
>>>>> [Section 4.1]
>>>>> "ToASCII consists of the following steps:
>>>>>
>>>>> ...
>>>>>
>>>>> 8. Verify that the number of code points is in the range 1 to 63
>>>>> inclusive."
>>>>>
>>>>>
>>>>> Here are the questions:
>>>>> 1. Is "example..com" a valid IDN?
>>>>>    As the dot is used as the label separator, there are three labels:
>>>>>    "example", "", "com". Per RFC 3490, "" is not a valid label. Hence,
>>>>>    "example..com" is not a valid IDN.
>>>>>
>>>>>    We need to address the issue in IDN.
>>>>>
>>>>> 2. Is "xyz." a valid IDN?
>>>>>    It's a gray area, I think. We can treat the trailing "." as the root
>>>>>    label, or as a label separator.
>>>>>    If the trailing "." is treated as a label separator, "xyz." is invalid
>>>>>    per RFC 3490.
>>>>>    If the trailing "." is treated as the root label, what's the expected
>>>>>    return value of IDN.toASCII("xyz.")? I think the return value can be
>>>>>    either "xyz." or "xyz". The current implementation returns "xyz".
>>>>>
>>>>>    We may not need to update the implementation if the trailing "." is
>>>>>    treated as the root label.
>>>>>
>>>>> 3. Is "." a valid IDN?
>>>>>    It's a gray area again, I think.
>>>>>    As above, if the trailing "." is treated as the root label, I think the
>>>>>    return value can be either "." or "". The current implementation throws
>>>>>    a StringIndexOutOfBoundsException.
>>>>>
>>>>>    However, what does an empty domain name ("") really mean? I would
>>>>>    prefer to return "." for "." instead.
>>>>>
>>>>>    We need to address the issue in IDN.
>>>>>
>>>>>
>>>>> Here comes the solution. IDN.toASCII() returns:
>>>>> 1. "." for ".";
>>>>> 2. "xyz" for "xyz.";
>>>>> 3. IAE for "example..com".
>>>>>
>>>>> Does it make sense?
>>>>>
>>>>> Thanks,
>>>>> Xuelei
>>>>>
>>>>>
>>>>> On 8/7/2013 1:35 AM, Michael McMahon wrote:
>>>>>> I don't really understand the reason for the restriction in SNIHostName.
>>>>>> But, I guess that is where it should be enforced if it is required.
>>>>>>
>>>>>> Michael.
>>>>>>
>>>>>> On 06/08/13 17:43, Dmitry Samersoff wrote:
>>>>>>> Xuelei,
>>>>>>>
>>>>>>> . (dot) is a perfectly valid domain name and it means the root domain,
>>>>>>> so com. is a valid domain name as well.
>>>>>>>
>>>>>>> It seems to me that in the context of the methods you change we should
>>>>>>> ignore trailing dots, rather than throw an exception.
>>>>>>>
>>>>>>> -Dmitry
>>>>>>>
>>>>>>>
>>>>>>> On 2013-08-06 15:44, Xuelei Fan wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Please review the bug fix to tighten the illegal input checking in
>>>>>>>> IDN.
>>>>>>>>
>>>>>>>> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>>>>>>>>
>>>>>>>> Here are two test cases, which are expected to get IAE.
>>>>>>>>
>>>>>>>> Case 1:
>>>>>>>> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
>>>>>>>> Exception in thread "main"
>>>>>>>> java.lang.StringIndexOutOfBoundsException:
>>>>>>>> String index out of range: 0
>>>>>>>>     at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>>>>>>>>     at java.net.IDN.toASCIIInternal(IDN.java:279)
>>>>>>>>     at java.net.IDN.toASCII(IDN.java:118)
>>>>>>>>
>>>>>>>> Case 2:
>>>>>>>> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Xuelei
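
To compare the proposals in this thread against whatever JDK build is at hand, a small probe along these lines may help. It is only a sketch: the class name IdnTrailingDotProbe and its probe() helper are made up for illustration and are not part of either webrev; only IDN.toASCII and IDN.USE_STD3_ASCII_RULES come from the thread. It prints the converted name for each input discussed above, or the name of the exception thrown, without assuming which outcome a given build produces (the thread itself proposes different results for "." at different points).

```java
import java.net.IDN;

// Hypothetical probe class for illustration; not part of the webrev under review.
final class IdnTrailingDotProbe {

    // Returns the converted name, or the simple class name of the
    // runtime exception that IDN.toASCII threw for this input.
    static String probe(String input) {
        try {
            return IDN.toASCII(input, IDN.USE_STD3_ASCII_RULES);
        } catch (RuntimeException e) {
            // IllegalArgumentException is the documented failure mode; the
            // build in Case 1 threw StringIndexOutOfBoundsException for ".".
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Inputs taken from the discussion: root label, trailing dot,
        // fully qualified name, and an empty label in the middle.
        String[] inputs = { ".", "com.", "www.oracle.com.", "example..com" };
        for (String input : inputs) {
            System.out.println("\"" + input + "\" -> " + probe(input));
        }
    }
}
```

Catching RuntimeException rather than only IllegalArgumentException is deliberate here, so that a build with the Case 1 defect reports the StringIndexOutOfBoundsException instead of aborting the run.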