On Aug 9, 2013, at 14:08, Dmitry Samersoff <dmitry.samers...@oracle.com> wrote:
> Xuelei,
>
> 119             p = q + 1;
> 120             if (p < input.length() || q == (input.length() - 1)) {
>
> Could be simplified to:
>
> q <= input.length()-1
>
It's cool!

Xuelei

> -Dmitry
>
> On 2013-08-09 04:41, Xuelei Fan wrote:
>> Ping.
>>
>> Thanks,
>> Xuelei
>>
>> On 8/7/2013 11:17 PM, Xuelei Fan wrote:
>>> Please review the new update:
>>>
>>> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/
>>>
>>> With this update, "com." is valid (return "com."); "." and
>>> "example..com" are invalid. And IAE will be thrown for invalid IDN.
>>>
>>> Thanks,
>>> Xuelei
>>>
>>> On 8/7/2013 10:18 PM, Michael McMahon wrote:
>>>> On 07/08/13 15:13, Xuelei Fan wrote:
>>>>> On 8/7/2013 10:05 PM, Michael McMahon wrote:
>>>>>> Resolvers seem to accept queries using trailing dots.
>>>>>>
>>>>>> eg nslookup www.oracle.com.
>>>>>>
>>>>>> or InetAddress.getByName("www.oracle.com.");
>>>>>>
>>>>>> The part of RFC3490 quoted below seems to me to be saying
>>>>>> that the empty label implied by the trailing dot is not regarded
>>>>>> as a label so that you don't end up calling toAscii() or toUnicode()
>>>>>> with an empty string. I don't think it's saying the trailing dot can't
>>>>>> be there.
>>>>> It makes sense.
>>>>>
>>>>> What's your preference to return for IDN.toASCII("www.oracle.com."),
>>>>> "www.oracle.com." or "www.oracle.com"? The current returned value is
>>>>> "www.oracle.com". I would like to preserve the behavior in this update.
>>>>
>>>> My opinion is to keep it as at present ie. "www.oracle.com."
>>>>
>>>> Michael
>>>>
>>>>> I think we will be on the same page soon.
>>>>>
>>>>> Thanks,
>>>>> Xuelei
>>>>>
>>>>>> Michael
>>>>>>
>>>>>> On 07/08/13 13:44, Xuelei Fan wrote:
>>>>>>> On 8/7/2013 12:06 AM, Matthew Hall wrote:
>>>>>>>> Trailing dots are allowed in plain DNS (thus almost surely in IDN),
>>>>>>>> and the single dot represents the root zone. So you have to be
>>>>>>>> careful making this sort of change to check the DNS RFCs first.
>>>>>>> That's the first question we need to answer: does IDN allow trailing
>>>>>>> dots ("com."), a zero-length root label ("."), and zero-length labels
>>>>>>> ("", for example "example..com")?
>>>>>>>
>>>>>>> Per the specification of IDN.toASCII():
>>>>>>> =======================================
>>>>>>> "ToASCII operation can fail. ToASCII fails if any step of it fails. If
>>>>>>> ToASCII operation fails, an IllegalArgumentException will be thrown. In
>>>>>>> this case, the input string should not be used in an internationalized
>>>>>>> domain name.
>>>>>>>
>>>>>>> A label is an individual part of a domain name. The original ToASCII
>>>>>>> operation, as defined in RFC 3490, only operates on a single label.
>>>>>>> This method can handle both label and entire domain name, by assuming
>>>>>>> that labels in a domain name are always separated by dots. ...
>>>>>>>
>>>>>>> Throws IllegalArgumentException - if the input string doesn't
>>>>>>> conform to RFC 3490 specification"
>>>>>>>
>>>>>>> Per the specification of RFC 3490:
>>>>>>> ==================================
>>>>>>> [section 2]
>>>>>>> "A label is an individual part of a domain name. Labels are usually
>>>>>>> shown separated by dots; for example, the domain name
>>>>>>> "www.example.com" is composed of three labels: "www", "example", and
>>>>>>> "com". (The zero-length root label described in [STD13], which can
>>>>>>> be explicit as in "www.example.com." or implicit as in
>>>>>>> "www.example.com", is not considered a label in this
>>>>>>> specification.)"
>>>>>>>
>>>>>>> "An "internationalized label" is a label to which the ToASCII
>>>>>>> operation (see section 4) can be applied without failing (with the
>>>>>>> UseSTD3ASCIIRules flag unset). ...
>>>>>>> Although most Unicode characters can appear in
>>>>>>> internationalized labels, ToASCII will fail for some input strings,
>>>>>>> and such strings are not valid internationalized labels."
>>>>>>>
>>>>>>> "An "internationalized domain name" (IDN) is a domain name in which
>>>>>>> every label is an internationalized label."
>>>>>>>
>>>>>>> [Section 4.1]
>>>>>>> "ToASCII consists of the following steps:
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>> 8. Verify that the number of code points is in the range 1 to 63
>>>>>>> inclusive."
>>>>>>>
>>>>>>>
>>>>>>> Here are the questions:
>>>>>>> 1. Is "example..com" a valid IDN?
>>>>>>>    As the dot is used as the label separator, there are three labels:
>>>>>>>    "example", "", "com". Per RFC 3490, "" is not a valid label. Hence,
>>>>>>>    "example..com" is not a valid IDN.
>>>>>>>
>>>>>>>    We need to address the issue in IDN.
>>>>>>>
>>>>>>> 2. Is "xyz." a valid IDN?
>>>>>>>    It's a gray area, I think. We can treat the trailing "." as the
>>>>>>>    root label, or as a label separator.
>>>>>>>    If the trailing "." is treated as a label separator, "xyz." is
>>>>>>>    invalid per RFC 3490.
>>>>>>>    If the trailing "." is treated as the root label, what's the expected
>>>>>>>    return value of IDN.toASCII("xyz.")? I think the return value can be
>>>>>>>    either "xyz." or "xyz". The current implementation returns "xyz".
>>>>>>>
>>>>>>>    We may not need to update the implementation if the trailing "." is
>>>>>>>    treated as the root label.
>>>>>>>
>>>>>>> 3. Is "." a valid IDN?
>>>>>>>    It's a gray area again, I think.
>>>>>>>    As above, if the trailing "." is treated as the root label, I think
>>>>>>>    the return value can be either "." or "". The current implementation
>>>>>>>    throws a StringIndexOutOfBoundsException.
>>>>>>>
>>>>>>>    However, what does an empty domain name ("") really mean? I would
>>>>>>>    prefer to return "." for "." instead.
>>>>>>>
>>>>>>>    We need to address the issue in IDN.
>>>>>>>
>>>>>>>
>>>>>>> Here comes the solution. IDN.toASCII() returns:
>>>>>>> 1. "." for ".";
>>>>>>> 2. "xyz" for "xyz.";
>>>>>>> 3. IAE for "example..com".
>>>>>>>
>>>>>>> Does it make sense?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Xuelei
>>>>>>>
>>>>>>>
>>>>>>> On 8/7/2013 1:35 AM, Michael McMahon wrote:
>>>>>>>> I don't really understand the reason for the restriction in
>>>>>>>> SNIHostName.
>>>>>>>> But, I guess that is where it should be enforced if it is required.
>>>>>>>>
>>>>>>>> Michael.
>>>>>>>>
>>>>>>>> On 06/08/13 17:43, Dmitry Samersoff wrote:
>>>>>>>>> Xuelei,
>>>>>>>>>
>>>>>>>>> . (dot) is a perfectly valid domain name and it means the root
>>>>>>>>> domain, so com. is a valid domain name as well.
>>>>>>>>>
>>>>>>>>> It seems to me that, in the context of the methods you change, we
>>>>>>>>> should ignore trailing dots rather than throw an exception.
>>>>>>>>>
>>>>>>>>> -Dmitry
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2013-08-06 15:44, Xuelei Fan wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Please review the bug fix to tighten the illegal input checking
>>>>>>>>>> in IDN.
>>>>>>>>>>
>>>>>>>>>> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>>>>>>>>>>
>>>>>>>>>> Here are two test cases, which are expected to get IAE.
>>>>>>>>>>
>>>>>>>>>> Case 1:
>>>>>>>>>> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
>>>>>>>>>> Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 0
>>>>>>>>>>     at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>>>>>>>>>>     at java.net.IDN.toASCIIInternal(IDN.java:279)
>>>>>>>>>>     at java.net.IDN.toASCII(IDN.java:118)
>>>>>>>>>>
>>>>>>>>>> Case 2:
>>>>>>>>>> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Xuelei
>
>
> --
> Dmitry Samersoff
> Oracle Java development team, Saint Petersburg, Russia
> * I would love to change the world, but they won't give me the source code.
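
For reference, the outcome described for webrev.01 in this thread ("com." is accepted and returned unchanged; "." and "example..com" are rejected with IAE) can be sketched as a plain label scan. This is only an illustration and not the code in the webrev: the class and method names are hypothetical, the input is assumed to be ASCII already, and the real ToASCII steps (nameprep, Punycode) are left out. The loop condition is the simplified bounds check suggested above, written in terms of p, the start of the next label.

```java
final class DomainLabelSketch {

    // Hypothetical helper, NOT the JDK implementation of IDN.toASCII().
    // Assumes the input is already ASCII; only label lengths are checked.
    static String checkLabels(String input) {
        int p = 0; // start index of the current label
        while (p < input.length()) {
            int q = input.indexOf('.', p); // next separator
            if (q < 0) {
                q = input.length(); // last label, no trailing dot
            }
            int len = q - p;
            // RFC 3490, section 4.1, step 8: 1 to 63 code points per label.
            // A zero-length label here means "." or "example..com".
            if (len < 1 || len > 63) {
                throw new IllegalArgumentException(
                        "Invalid label length " + len + " in \"" + input + "\"");
            }
            p = q + 1; // skip the separator
        }
        // A single trailing dot ends the loop with p == input.length(),
        // so the implied root label is never checked as a label.
        // The empty string is not discussed in the thread; this sketch
        // passes it through unchanged.
        return input;
    }

    public static void main(String[] args) {
        System.out.println(checkLabels("com."));
        System.out.println(checkLabels("www.oracle.com"));
        try {
            checkLabels("example..com");
        } catch (IllegalArgumentException e) {
            System.out.println("IAE: " + e.getMessage());
        }
    }
}
```

The trailing dot is kept in the return value, matching the "com." behavior stated for the update; dropping it instead would be a one-line change at the return.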