Xuelei,

119         p = q + 1;
120         if (p < input.length() || q == (input.length() - 1)) {

Could be simplified to:

    q <= input.length() - 1

-Dmitry

On 2013-08-09 04:41, Xuelei Fan wrote:
> Ping.
>
> Thanks,
> Xuelei
>
> On 8/7/2013 11:17 PM, Xuelei Fan wrote:
>> Please review the new update:
>>
>> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/
>>
>> With this update, "com." is valid (returns "com."); "." and
>> "example..com" are invalid. An IAE will be thrown for an invalid IDN.
>>
>> Thanks,
>> Xuelei
>>
>> On 8/7/2013 10:18 PM, Michael McMahon wrote:
>>> On 07/08/13 15:13, Xuelei Fan wrote:
>>>> On 8/7/2013 10:05 PM, Michael McMahon wrote:
>>>>> Resolvers seem to accept queries using trailing dots.
>>>>>
>>>>> e.g. nslookup www.oracle.com.
>>>>>
>>>>> or InetAddress.getByName("www.oracle.com.");
>>>>>
>>>>> The part of RFC 3490 quoted below seems to me to be saying
>>>>> that the empty label implied by the trailing dot is not regarded
>>>>> as a label, so that you don't end up calling toAscii() or toUnicode()
>>>>> with an empty string. I don't think it's saying the trailing dot can't
>>>>> be there.
>>>>>
>>>> It makes sense.
>>>>
>>>> What's your preference to return for IDN.toASCII("www.oracle.com."),
>>>> "www.oracle.com." or "www.oracle.com"? The current returned value is
>>>> "www.oracle.com". I would like to preserve the behavior in this update.
>>>
>>> My opinion is to keep it as at present, i.e. "www.oracle.com."
>>>
>>> Michael
>>>
>>>> I think we will be on the same page soon.
>>>>
>>>> Thanks,
>>>> Xuelei
>>>>
>>>>> Michael
>>>>>
>>>>> On 07/08/13 13:44, Xuelei Fan wrote:
>>>>>> On 8/7/2013 12:06 AM, Matthew Hall wrote:
>>>>>>> Trailing dots are allowed in plain DNS (thus almost surely in IDN),
>>>>>>> and the single dot represents the root zone. So you have to be
>>>>>>> careful making this sort of change to check the DNS RFCs first.
>>>>>> That's the first question we need to answer: does IDN allow trailing
>>>>>> dots ("com."), a zero-length root label ("."), and zero-length labels
>>>>>> ("", for example in "example..com")?
>>>>>>
>>>>>> Per the specification of IDN.toASCII():
>>>>>> =======================================
>>>>>> "ToASCII operation can fail. ToASCII fails if any step of it fails. If
>>>>>> ToASCII operation fails, an IllegalArgumentException will be thrown. In
>>>>>> this case, the input string should not be used in an internationalized
>>>>>> domain name.
>>>>>>
>>>>>> A label is an individual part of a domain name. The original ToASCII
>>>>>> operation, as defined in RFC 3490, only operates on a single label.
>>>>>> This method can handle both label and entire domain name, by assuming
>>>>>> that labels in a domain name are always separated by dots. ...
>>>>>>
>>>>>> Throws IllegalArgumentException - if the input string doesn't
>>>>>> conform to RFC 3490 specification"
>>>>>>
>>>>>> Per the specification of RFC 3490:
>>>>>> ==================================
>>>>>> [Section 2]
>>>>>> "A label is an individual part of a domain name. Labels are usually
>>>>>> shown separated by dots; for example, the domain name
>>>>>> "www.example.com" is composed of three labels: "www", "example", and
>>>>>> "com". (The zero-length root label described in [STD13], which can
>>>>>> be explicit as in "www.example.com." or implicit as in
>>>>>> "www.example.com", is not considered a label in this
>>>>>> specification.)"
>>>>>>
>>>>>> "An "internationalized label" is a label to which the ToASCII
>>>>>> operation (see section 4) can be applied without failing (with the
>>>>>> UseSTD3ASCIIRules flag unset). ...
>>>>>> Although most Unicode characters can appear in
>>>>>> internationalized labels, ToASCII will fail for some input strings,
>>>>>> and such strings are not valid internationalized labels."
>>>>>>
>>>>>> "An "internationalized domain name" (IDN) is a domain name in which
>>>>>> every label is an internationalized label."
>>>>>>
>>>>>> [Section 4.1]
>>>>>> "ToASCII consists of the following steps:
>>>>>>
>>>>>>    ...
>>>>>>
>>>>>>    8. Verify that the number of code points is in the range 1 to 63
>>>>>>       inclusive."
>>>>>>
>>>>>>
>>>>>> Here are the questions:
>>>>>> 1. Is "example..com" a valid IDN?
>>>>>>    As dots are used as label separators, there are three labels:
>>>>>>    "example", "", and "com". Per RFC 3490, "" is not a valid label.
>>>>>>    Hence, "example..com" is not a valid IDN.
>>>>>>
>>>>>>    We need to address the issue in IDN.
>>>>>>
>>>>>> 2. Is "xyz." a valid IDN?
>>>>>>    It's a gray area, I think. We can treat the trailing "." as the
>>>>>>    root label, or as a label separator.
>>>>>>    If the trailing "." is treated as a label separator, "xyz." is
>>>>>>    invalid per RFC 3490.
>>>>>>    If the trailing "." is treated as the root label, what's the
>>>>>>    expected return value of IDN.toASCII("xyz.")? I think the return
>>>>>>    value can be either "xyz." or "xyz". The current implementation
>>>>>>    returns "xyz".
>>>>>>
>>>>>>    We may not need to update the implementation if the trailing "."
>>>>>>    is treated as the root label.
>>>>>>
>>>>>> 3. Is "." a valid IDN?
>>>>>>    It's a gray area again, I think.
>>>>>>    As above, if the trailing "." is treated as the root label, I think
>>>>>>    the return value can be either "." or "". The current
>>>>>>    implementation throws a StringIndexOutOfBoundsException.
>>>>>>
>>>>>>    However, what does an empty domain name ("") really mean? I would
>>>>>>    prefer to return "." for "." instead.
>>>>>>
>>>>>>    We need to address the issue in IDN.
>>>>>>
>>>>>>
>>>>>> Here comes the solution. IDN.toASCII() returns:
>>>>>> 1. "." for ".";
>>>>>> 2. "xyz" for "xyz.";
>>>>>> 3. IAE for "example..com".
>>>>>>
>>>>>> Does it make sense?
>>>>>>
>>>>>> Thanks,
>>>>>> Xuelei
>>>>>>
>>>>>>
>>>>>> On 8/7/2013 1:35 AM, Michael McMahon wrote:
>>>>>>> I don't really understand the reason for the restriction in
>>>>>>> SNIHostName. But I guess that is where it should be enforced, if
>>>>>>> it is required.
>>>>>>>
>>>>>>> Michael.
>>>>>>>
>>>>>>> On 06/08/13 17:43, Dmitry Samersoff wrote:
>>>>>>>> Xuelei,
>>>>>>>>
>>>>>>>> . (dot) is a perfectly valid domain name and it means the root
>>>>>>>> domain, so com. is a valid domain name as well.
>>>>>>>>
>>>>>>>> It seems to me that in the context of the methods you change, we
>>>>>>>> should ignore trailing dots rather than throw an exception.
>>>>>>>>
>>>>>>>> -Dmitry
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2013-08-06 15:44, Xuelei Fan wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Please review the bug fix to tighten the illegal input checking
>>>>>>>>> in IDN.
>>>>>>>>>
>>>>>>>>> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>>>>>>>>>
>>>>>>>>> Here are two test cases, which are expected to get IAE.
>>>>>>>>>
>>>>>>>>> Case 1:
>>>>>>>>> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
>>>>>>>>> Exception in thread "main"
>>>>>>>>> java.lang.StringIndexOutOfBoundsException:
>>>>>>>>> String index out of range: 0
>>>>>>>>>     at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>>>>>>>>>     at java.net.IDN.toASCIIInternal(IDN.java:279)
>>>>>>>>>     at java.net.IDN.toASCII(IDN.java:118)
>>>>>>>>>
>>>>>>>>> Case 2:
>>>>>>>>> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Xuelei
>>>>>>>>>

--
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the source code.
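
For readers following the thread, here is a minimal sketch of the label rules described for webrev.01 ("com." is accepted, "." and "example..com" are rejected with an IAE), together with a brute-force check that the simplified condition agrees with the original one when p = q + 1. The class DomainLabelSketch and its methods splitLabels and conditionsAgree are hypothetical names made up for this illustration. This is not the code in the webrev and not the java.net.IDN implementation, and it only looks at label boundaries and lengths, not at the rest of ToASCII. Rejecting the empty string is an assumption of the sketch; the thread does not settle that case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not the webrev code and not
// the JDK's java.net.IDN implementation.
final class DomainLabelSketch {

    // Splits a domain name into labels and rejects zero-length labels.
    // A single trailing dot is read as the explicit root label and accepted,
    // so "com." yields [com], while "." and "example..com" throw.
    static List<String> splitLabels(String input) {
        if (input.isEmpty()) {
            // Assumption: the thread does not discuss "", so reject it here.
            throw new IllegalArgumentException("empty domain name");
        }
        List<String> labels = new ArrayList<>();
        int p = 0; // start of the current label
        while (p < input.length()) {
            int q = input.indexOf('.', p); // next separator, or -1
            if (q < 0) {
                q = input.length();
            }
            int len = q - p;
            // RFC 3490 step 8 asks for 1 to 63 code points after encoding;
            // counting chars of the raw label is a simplification.
            if (len < 1 || len > 63) {
                throw new IllegalArgumentException(
                        "label length " + len + " at index " + p);
            }
            labels.add(input.substring(p, q));
            p = q + 1;
        }
        return labels;
    }

    // Checks, for one input length, that the original condition from the
    // review and the suggested simplification give the same answer.
    static boolean conditionsAgree(int length) {
        for (int q = -1; q <= length + 1; q++) {
            int p = q + 1;
            boolean original = p < length || q == (length - 1);
            boolean simplified = q <= length - 1;
            if (original != simplified) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(splitLabels("www.example.com."));
        System.out.println(conditionsAgree(16));
    }
}
```

The scan accepts a trailing dot simply because the loop stops once p reaches the end of the input, so no empty label is ever examined after the last separator. A leading dot or two adjacent dots still produce a zero-length label and are rejected.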