On 03/27/2015 02:48 AM, Markus Armbruster wrote: > Eric Blake <ebl...@redhat.com> writes: > >> Previous commits demonstrated that the generator overlooked various >> bad naming situations: >> - types, commands, and events need a valid name >> - union and alternate branches cannot be marked optional >> >> The set of valid names includes [a-zA-Z0-9._-] (where '.' is >> useful only in downstream extensions). >>
>> +valid_characters = set(string.ascii_letters + string.digits + '.' + '-' + >> '_') > > strings.ascii_letters depends on the locale... https://docs.python.org/2/library/string.html string.ascii_letters The concatenation of the ascii_lowercase and ascii_uppercase constants described below. This value is not locale-dependent. You are thinking of string.letters, which IS locale-dependent. I intentionally used ascii_letters. > >> +def check_name(expr_info, source, name, allow_optional = False): >> + membername = name >> + >> + if not isinstance(name, str): >> + raise QAPIExprError(expr_info, >> + "%s requires a string name" % source) >> + if name == '**': >> + return > > Doesn't this permit '**' anywhere, not just as pseudo-type in command > arguments and results? Yes, on the grounds that check_type then filters it appropriately. But worthy of a comment (probably both in the commit message AND in the code base). Grounds for a v6 respin. > >> + if name.startswith('*'): >> + membername = name[1:] >> + if not allow_optional: >> + raise QAPIExprError(expr_info, >> + "%s does not allow optional name '%s'" >> + % (source, name)) >> + if not set(membername) <= valid_characters: > > ... so this check would break if we called locale.setlocale() in this > program. While I don't think we need to worry about it, I think you > could just as well use something like > > valid_name = re.compile(r"^[A-Za-z0-9-._]+$") > > if not valid_name.match(membername): regex is slightly slower than string matching _if the regex is precompiled_, and MUCH slower than string matching if the regex is compiled every time. In turn, string matching is slower than open-coding things, but has the benefit of being more compact and maintainable (open-coded loops are the worst on that front). Here's where I got my inspiration: https://stackoverflow.com/questions/1323364/in-python-how-to-check-if-a-string-only-contains-certain-characters But I may just go with the regex after all (I don't know how efficient python is about reusing a regex when a function is called multiple times, rather than recompiling the regex every time. Personal side note: back in 2009 or so, I was able to make 'm4' significantly faster in the context of 'autoconf' when I taught it to cache the compilation of the 8 most-recently-encountered regex, rather than recompiling every time; and then made 'autoconf' even faster when I taught it to do actions that didn't require regex use from 'm4' in the first place.) The nice thing, though, is that I factored things so that the implementation of this one function can change without having to hunt down all call-sites, if I keep the contract the same. >> discriminator_type = base_fields.get(discriminator) >> if not discriminator_type: >> raise QAPIExprError(expr_info, > > What happens when I try 'discriminator': '**'? No clue. Good thing for me to add somewhere in my series. However, I did manage to have this series at least think about a base type with '*switch':'Enum', then use 'discriminator':'*switch', which got past the generator (who knows what the C code would have done if have_switch was false?), so I plugged that hole; but in the process of testing it, I noticed that '*switch':'Enum' paired with 'discriminator':'switch' complained that 'switch' was not a member of the base class (which is a lie; it is present in the base class, but as an optional member). Proof that the generator is a bunch of hacks strung together :) -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
signature.asc
Description: OpenPGP digital signature