Dear Horiguchi-san, Fujii-san,

Perfect work. Thank you for replying and analyzing!
> A. "^-?[0-9]+.*" : returns valid padding. p goes after the last digit.
> B. "^[^0-9-].*"  : padding = 0, p doesn't advance.
> C. "^-[^0-9].*"  : padding = 0, p advances by 1 byte.
> D. "^-"          : padding = 0, p advances by 1 byte.
>    (if *p == 0 then breaks)

I confirmed them, and your patterns are correct.

> If we want to make the behaviors C and D same with the current, the
> else clause should be like the follows, but I don't think we need to
> do that.
>
>     else
>     {
>         padding = 0;
>         if (*p == '-')
>             p++;
>     }

This handling is not complex, so I would like to add it if possible.

> One possible cause of a difference in behavior is character class
> handling including multibyte characters of isdigit and strtol. If
> isdigit accepts '一' as a digit (some platforms might do this), and
> strtol doesn't (I believe it is universal behavior), '%一0p' is
> converted to '%' and the pointer moves onto '一'. But I don't think we
> need to do something for such a crazy specification.

Does isdigit() handle multibyte characters correctly?
The argument of isdigit() is just a single unsigned char value, which is 1 byte.
Hence I thought that it cannot recognize '一'.

Actually, I was considering another thing. Maybe isdigit() just checks whether
the value of its argument is between 48 and 57 ('0' to '9'), which would mean
that the first byte of some multibyte characters may be accepted as a digit in
some locales. But of course I agree that this is a crazy case.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
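
For reference, the strtol()-based handling discussed above, including the quoted else clause, could look roughly like the following minimal sketch. The helper name parse_padding and its signature are hypothetical and are not taken from the actual patch. Note also that strtol() itself skips leading whitespace and accepts '+', which this sketch does not try to prevent.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not the actual PostgreSQL function): parse an
 * optional padding value at p, store it in *padding, and return the
 * position where processing of the format string should continue.
 */
static const char *
parse_padding(const char *p, int *padding)
{
	char	   *endptr;
	long		val;

	assert(p != NULL && padding != NULL);

	val = strtol(p, &endptr, 10);
	if (endptr != p)
	{
		/* Pattern A: digits were consumed, so continue after the last one. */
		*padding = (int) val;
		return endptr;
	}

	/* Patterns B, C and D: nothing was converted, so there is no padding. */
	*padding = 0;

	/* Patterns C and D: step over a lone '-', as in the quoted else clause. */
	if (*p == '-')
		p++;

	return p;
}
```

With this sketch, "-12p" gives padding -12 and continues at 'p', while "-p" gives padding 0 and also continues at 'p'.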