Eric Blake <ebl...@redhat.com> writes: > On 11/05/2015 09:01 AM, Daniel P. Berrange wrote: >> On Thu, Nov 05, 2015 at 04:30:00PM +0100, Markus Armbruster wrote: >>> QAPI names needn't be valid C identifiers, so we mangle them with >>> c_name(). Except for enumeration constants, which we mangle with >>> camel_to_upper(). >>> >>> c_name() is easy enough to understand: replace '.' and '-' by '_', >>> prefix certain ticklish identifiers with 'q_'. >>> >>> camel_to_upper() is a hairball of heuristics, and guessing how it'll >>> mangle interesting input could serve as a (nerdy) game. Despite some >>> tweaking (commit 5d371f4), it's still inadqeuate for some QAPI names >>> (commit 351d36e). > > One of the issues at hand is whether we want to (eventually) teach QMP > to be case-insensitive. Right now, our c_name() mangling preserves case > (you can have a struct with members 'a' and 'A'), although (hopefully) > no one is relying on it. But camel_to_upper() is case-insensitive ('a' > and 'A' would result in the same enum constant). > > In order to (later) support case-insensitive QMP, we need to decide up > front that we will not allow any qapi member names to collide > case-insensitively (outlaw 'a' and 'A' in the same struct; although the > C code is still case-preserving); and now that this series is adding a > single check_clash() function, it's very easy to do. In fact, I'll add > that to my series for 2.5 (it's always easier to reserve something now, > especially if no one was using it, and then relax later; than it is to > try and restrict things later but run into counter-cases).
I doubt QMP should be made case-insensitive. JSON isn't, C isn't. Our use of case is actually fairly consistent: event names are ALL_CAPS, everything else is in lower case. Complete list of exceptions found in result of query-qmp-schema: * struct UuidInfo member UUID * struct CpuInfo members CPU and PC * enum ACPISlotType member DIMM * enum InputButton members Left, Middle, Right, WheelUp, WheelDown * enum InputAxis members X, Y That said, an interface where names differ only in case is a badly designed interface. I'd be fine with rejecting such abuse. Oddballs not related to case: * enum BlkdebugEvent uses '.' in member names * enum QKeyCode uses member names starting with a digit For me, the one argument for some kind of insensitivity is our idiotic habit to sometimes string words together with '_' instead of '-', which has led to an unholy mess. The offenders are * commands block_passwd, block_resize, block_set_io_throttle, client_migrate_info, device_del, expire_password, migrate_cancel, migrate_set_downtime, migrate_set_speed, netdev_add, netdev_del, set_link, set_password, system_powerdown, system_reset, system_wakeup * enum types BlkdebugEvent, BlockdevDriver, QKeyCode * object types BlkdebugSetStateOptions, BlockDeviceInfo, BlockDeviceInfo, BlockDeviceStats, BlockInfo, CpuInfo, PciBusInfo, PciDeviceInfo, PciMemoryRegion, VncClientInfo >>> Having two separate manglings complicates this. Enumeration constants >>> must be distinct after mangling with camel_to_upper(). But as soon as >>> you use the enumeration as union tag, they must *also* be distinct >>> after mangling with c_name(). > > But this should already be the case - can you come up with a pair of > valid enum values that won't clash under camel_to_upper() but would > result in in a c_name() value collision? One glance at camel_to_upper() should make it obvious why I'd prefer not to have to know. But since you asked... I guess it can't happen, because camel_to_upper() first mangles with c_name(), then mangles some more. If c_name() clashes, further mangling can't make it clash less. > The converse is not true - > there ARE pairs of c_name() values that are distinct, but which map to > the same mangling with camel_to_upper(). Yes. Example: 'GotCha' and 'got-cha' both map to 'GOT_CHA'. > But if we insist that names do > not collide case-insensitively, then that issue goes away - having > ALL_CAP enum constants won't cause any collisions even when those > constants are derived from member names, because member names are > already forbidden from case-insensitive clash. > > There is still the question of whether we can get rid of the spurious > collision with 'max', by putting the enum sentinel out of the namespace > generated for other values. I wanted to get out an RFC quickly, so I didn't try. > But even with ALL_CAPS, that is possible: > > BLOCK_DEVICE_IO_STATUS_OK = 0, > BLOCK_DEVICE_IO_STATUS_FAILED = 1, > BLOCK_DEVICE_IO_STATUS_NOSPACE = 2, > BLOCK_DEVICE_IO_STATUSMAX = 3, Running shouted words together like STATUSMAX is even less legible than StatusMax. There's a reason why scriptio continua was abandoned in the middle ages. > Or insist that no enum value can start with a lone _ (double __ is okay > for downstream extensions, though): > > BLOCK_DEVICE_IO_STATUS_OK = 0, > BLOCK_DEVICE_IO_STATUS_FAILED = 1, > BLOCK_DEVICE_IO_STATUS_NOSPACE = 2, > BLOCK_DEVICE_IO_STATUS__MAX = 3, Workable. With the separate mangling dropped, we could do typedef enum BlockDeviceIoStatus { BlockDeviceIoStatus_ok = 0, BlockDeviceIoStatus_failed = 1, BlockDeviceIoStatus_nospace = 2, BlockDeviceIoStatusMAX = 3, } BlockDeviceIoStatus; which I think is a least ugly solution given our convention to use CamelCase for type names and my proposal to use the enum name as enum constant prefix. I'll reply to Dan separately. [...]