There's just one thing left to do to make composite types useful as table columns: we have to support I/O of composite values. (Without this, pg_dump would fail to work on such columns, rendering them not very useful in the real world.) This means we have to hammer out a definition for what the external representation is. Here are my thoughts on the subject.
Textual representation: I am inclined to define this similarly to the representation for arrays; however, we need to allow for NULLs. I suggest

    {item,item,item}

The separator is always comma (it can't be type-specific since the items might have different types). Backslashes and double quotes can be used in the usual ways to quote characters in the item strings. If an item string is completely empty it is taken as NULL; to write an actual empty-string value, you must write "". There is an ambiguity whether '{}' represents a zero-column row or a one-column row containing a NULL, but I don't think this is a problem since the input converter will always know how many columns it is expecting.

There are a couple of fine points of the array I/O behavior that I think we should not emulate. One is that leading whitespace in an item string is discarded. This seems inconsistent, mainly because trailing whitespace isn't discarded. In the cases where it really makes sense to discard whitespace (namely numeric datatypes), the underlying datatype's input converter can do that just fine, and so I suggest that the record converter itself should not discard whitespace. It seems OK to ignore whitespace before and after the outer braces, however.

The other fine point has to do with double quoting. In the array code, {a"b""c"d} is legal input representing an item 'abcd'. I think it would be more consistent with usual SQL conventions to treat it as meaning 'ab"cd', that is, a doubled double quote within double quotes should represent a double quote, not nothing. Anyone have a strong feeling one way or the other? (In the long run we might want to think about making these same changes in array_in, but that's a can of worms I don't wish to open today.)

Binary representation: This seems relatively easy. I propose we send number of fields (int4) followed by, for each field: type oid (sizeof(Oid)), data length (int4), data according to the binary representation of the field datatype.
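
As a rough illustration of that layout, here is a minimal C sketch that packs already-serialized fields into one buffer. The names FieldImage, put_u32 and record_image_write are hypothetical, not existing backend routines. The sketch assumes a 4-byte Oid and most-significant-byte-first integers, neither of which is stated above, and it does not handle NULL fields because the layout as proposed does not say how to encode them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one field that is already in binary form. */
typedef struct
{
    uint32_t    type_oid;           /* OID of the field's datatype */
    int32_t     len;                /* number of bytes in data */
    const unsigned char *data;      /* the field datatype's own binary form */
} FieldImage;

/* Store a 32-bit value most-significant byte first (assumed byte order). */
void
put_u32(unsigned char *dst, uint32_t v)
{
    dst[0] = (unsigned char) (v >> 24);
    dst[1] = (unsigned char) (v >> 16);
    dst[2] = (unsigned char) (v >> 8);
    dst[3] = (unsigned char) v;
}

/*
 * Write: field count, then for each field its type OID, data length, data.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t
record_image_write(const FieldImage *fields, int32_t nfields,
                   unsigned char *buf, size_t bufsize)
{
    size_t      pos = 0;
    int32_t     i;

    assert(nfields >= 0);
    if (bufsize < 4)
        return 0;
    put_u32(buf + pos, (uint32_t) nfields);
    pos += 4;

    for (i = 0; i < nfields; i++)
    {
        size_t      need;

        assert(fields[i].len >= 0);
        need = 8 + (size_t) fields[i].len;  /* OID + length + data */
        if (bufsize - pos < need)
            return 0;

        put_u32(buf + pos, fields[i].type_oid);
        pos += 4;
        put_u32(buf + pos, (uint32_t) fields[i].len);
        pos += 4;
        if (fields[i].len > 0)
            memcpy(buf + pos, fields[i].data, (size_t) fields[i].len);
        pos += (size_t) fields[i].len;
    }
    return pos;
}
```

A reader of this format can compare the leading count and each per-field OID against the row type it expects before touching any field data.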
The field count and type oids are not strictly necessary but seem like a good idea for error-checking purposes.

Infrastructure changes: record_out/record_send can extract the needed type info right from the Datum, but record_in/record_recv really need to be told what data type to expect, and the current call conventions for input converters don't pass them any useful information. I propose that we adjust the present definitions so that the second argument passed to I/O conversion routines, rather than being always pg_type.typelem, is defined as "if pg_type.typtype is 'c' then pg_type.oid else pg_type.typelem". That is, for composite types we'll pass the type's own OID in place of typelem.

This does not affect I/O routines for user-defined types, since there are no user-defined I/O routines for composite types. It could break any user-written code that calls I/O routines, if it's been hard-wired to pass typelem instead of using one of the support routines like getTypeInputInfo() or get_type_io_data() to collect the parameters to pass. By my count there are about a dozen places in the backend code that will need to be fixed to use one of these routines instead of having a hard-wired typelem reference.

An alternative definition that might be more useful in the long run is to define the second parameter as "if pg_type.typelem is not zero then pg_type.typelem else pg_type.oid". In other words, for everything *except* arrays we'd pass the type OID. This would allow I/O routines to be written to support multiple datatypes. However there seems a larger chance of breaking things if we do this, and I'm also fuzzy on which OID to pass for domain types. So I'm inclined to keep it conservative for now, and change the behavior only for composite types.

Comments, objections, better ideas?
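
To make the two candidate rules for the second argument concrete, here is a small C sketch. TypeRow is a hypothetical cut-down view of a pg_type row, and io_param_conservative and io_param_alternative are illustrative names only; nothing here is existing backend code, and the Oid typedef is a stand-in assumed to be 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;           /* stand-in for the backend's Oid type */

/* Hypothetical subset of a pg_type row: only the columns the rules use. */
typedef struct
{
    Oid         oid;            /* the type's own OID */
    char        typtype;        /* 'c' for a composite type */
    Oid         typelem;        /* element type for arrays, else zero */
} TypeRow;

/* Conservative rule: composite types get their own OID, others typelem. */
Oid
io_param_conservative(const TypeRow *t)
{
    assert(t != NULL);
    return (t->typtype == 'c') ? t->oid : t->typelem;
}

/* Alternative rule: typelem if nonzero, otherwise the type's own OID. */
Oid
io_param_alternative(const TypeRow *t)
{
    assert(t != NULL);
    return (t->typelem != 0) ? t->typelem : t->oid;
}
```

The two rules agree for arrays and for composite types; they differ only for ordinary non-array types, where the conservative rule keeps passing zero and the alternative passes the type's own OID.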
regards, tom lane