Oliver Ford <ojf...@gmail.com> writes: > Attached is v2 of src, tests and docs. Doc patch is unchanged from v1.
I took a quick look at this patch. The roman_to_int function itself seems fairly reasonable, except that I don't like anything about the error reporting. ERRCODE_INVALID_PARAMETER_VALUE doesn't really seem like the right thing for bogus data; probably ERRCODE_INVALID_TEXT_REPRESENTATION is the closest thing we've got. More generally, reporting a character position in a string you're not showing to the user is neither helpful nor per project style. It'd be much better to just report the whole input string. I'm tempted to suggest that all the error reports could be errmsg("incorrect Roman numeral: \"%s\"", numerals) I'm not sure it's worth breaking it down more finely than that, and even if it is, several of these messages aren't too intelligible as-is. An angle that might be worth considering is whether we really want to reject non-canonical Roman input. As is, the function rejects "IIII", insisting that you must spell it "IV", but is that helpful or just pedantic? Likewise I'm not that excited about expending code and a separate translatable message to insist that people have to write "X" rather than "VV". However, what I really don't like is the way that you plugged roman_to_int into the existing to_number infrastructure, which is to say that you didn't. Putting in an "if roman then <x> else <all the existing code>" test is not the way to make this work. As an example, that fails to handle literal text in the format. If this works: regression=# select to_char(123, 'foo FMRN'); to_char ------------ foo CXXIII (1 row) then this should, but it fails: regression=# select to_number('foo CXXIII', 'foo FMRN'); ERROR: invalid character " " I think you need to be calling roman_to_int from the NUM_RN/NUM_rn switch cases in NUM_processor, not trying to bypass NUM_processor entirely. Another issue is that on the input side, we really need to reject formats like 'RN 999', since it's very unclear how we'd combine the input values. There already is some logic that rejects 'RN EEEE', but it needs extension. On the other hand, I do not see any really good reason to reject 'RN 999' for output; it would result in duplicative output, but if the user asked for it why not do it? (I see the existing code fails to handle that case for some reason.) So some refactoring of the existing roman-numeral code seems needed anyway. regards, tom lane