On 9-Oct-06, at 12:28 PM, Sara Golemon wrote:
(C) Add a UConverter *encoding_conv; element to pdo_dbh and
pdo_stmt objects, and an INI setting: pdo.default_encoding. When
passing data to/from a stmt object, the statement object's encoder
is used if available (set during prepare); if not available, the
driver's converter is used (set by factory); otherwise
pdo.default_encoding is used as a fallback. Data exchanges
with the dbh object are handled similarly, though (obviously)
skipping the stmt step.
Pros: Keeps character set conversion work out of the driver layer.
Reduces the amount of #ifdef work for multiple version
support.
Recognizes that some drivers (SQLITE) use a single encoding
universally, while others allow different tables to use different
encodings.
Cons: Doesn't solve the "do()" problem of encoding to different
charsets when inserting to tables of a driver which allows
different charsets per table.
Doesn't provide an indicator which says "This came from a
unicode string and was converted by ICU so is reliably in the
correct encoding" versus "This was handed to me by the user as a
binary string and may contain anything". Though this is also
"fixable" by either changing the handler proto or by burying a
state flag in the dbh/stmt objects.
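
The lookup order described in option (C) can be sketched roughly as
follows. This is only an illustration under stated assumptions, not
PDO's actual code: the sketch_* types and functions are hypothetical,
sketch_converter stands in for ICU's opaque UConverter so the example
compiles without ICU headers, and the "UTF-8" default is a placeholder
for whatever pdo.default_encoding would be set to.

```c
#include <stddef.h>

/* Stand-in for ICU's opaque UConverter (assumption: a real build
 * would include <unicode/ucnv.h> and use UConverter instead). */
typedef struct sketch_converter {
    const char *name;
} sketch_converter;

/* Hypothetical, pared-down versions of the PDO handle structures. */
typedef struct sketch_dbh {
    sketch_converter *encoding_conv;  /* set by the driver factory, may be NULL */
} sketch_dbh;

typedef struct sketch_stmt {
    sketch_dbh *dbh;                  /* owning connection handle */
    sketch_converter *encoding_conv;  /* set during prepare, may be NULL */
} sketch_stmt;

/* Hypothetical converter built from the pdo.default_encoding INI value. */
static sketch_converter sketch_default_conv = { "UTF-8" };

/* dbh-level lookup: driver's converter first, then the INI default. */
static sketch_converter *sketch_dbh_converter(const sketch_dbh *dbh)
{
    if (dbh != NULL && dbh->encoding_conv != NULL) {
        return dbh->encoding_conv;
    }
    return &sketch_default_conv;
}

/* stmt-level lookup: statement's converter first, then the dbh chain. */
static sketch_converter *sketch_stmt_converter(const sketch_stmt *stmt)
{
    if (stmt != NULL && stmt->encoding_conv != NULL) {
        return stmt->encoding_conv;
    }
    return sketch_dbh_converter(stmt != NULL ? stmt->dbh : NULL);
}
```

Keeping the dbh lookup as its own function mirrors the proposal's note
that exchanges with the dbh object follow the same chain minus the
stmt step.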

From what you propose, I think option C is the most reasonable
solution, but I'd like to offer a few revisions.
PDO already has an API for setting attributes via setAttribute().
An attribute can be set on a connection (as the default) and
overridden on a per-statement basis via the same method. Attributes
can also be passed as parameters, which lets the user decide what
charset to send to the database. In some cases there is a neat cheat
that can be applied: set the connection charset to UTF-8 or even
UTF-16 and let the database (assuming it supports this) do the
up/down conversion of the data as needed.
Ilia Alshanetsky
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php