On 10/01/2012 02:45 PM, Michael Stahl wrote:
On 01/10/12 14:23, Noel Grandin wrote:
On 2012-10-01 13:58, Michael Stahl wrote:
The only problem with a change there is our ABI - which explicitly
exposes the encoding of that.
the right time to do it is for LO4. sadly nobody has signed up for that
yet :( ... (while there are volunteers for far sillier proposals, like
getting rid of com.sun.star...)
Perhaps we need to split out some preparatory tasks?
For example
- fix code that directly accesses the underlying buffer
- create an external iterator class (which would currently be a thin
wrapper around int) for looping over the buffer and indexing into it
- fix code that indexes into an OUString to use the new external iterator
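A rough sketch of what such an external iterator could look like, purely as an illustration. The names StringPos and StringView16 are hypothetical and do not exist in sal/rtl; std::u16string stands in for the OUString buffer. The position type wraps a plain index today, but callers cannot do arithmetic on it, so the representation could change later without touching call sites:

```cpp
#include <cstddef>
#include <string>

// Hypothetical opaque position. Currently just an index into UTF-16 code
// units, but callers only see comparison, not the underlying integer.
class StringPos
{
public:
    explicit StringPos(std::size_t unit = 0) : m_unit(unit) {}
    bool operator==(const StringPos& o) const { return m_unit == o.m_unit; }
    bool operator!=(const StringPos& o) const { return m_unit != o.m_unit; }

private:
    friend class StringView16;
    std::size_t m_unit;
};

// Hypothetical stand-in for the string class; owns a copy of the buffer.
class StringView16
{
public:
    explicit StringView16(const std::u16string& s) : m_s(s) {}
    StringPos begin() const { return StringPos(0); }
    StringPos end() const { return StringPos(m_s.size()); }
    char16_t unitAt(StringPos p) const { return m_s[p.m_unit]; }
    StringPos next(StringPos p) const { return StringPos(p.m_unit + 1); }

private:
    std::u16string m_s;
};

// Example call site: counts occurrences of c without ever indexing by int.
inline std::size_t countChar(const StringView16& v, char16_t c)
{
    std::size_t n = 0;
    for (StringPos p = v.begin(); p != v.end(); p = v.next(p))
        if (v.unitAt(p) == c)
            ++n;
    return n;
}
```

Call sites written like countChar would not need to change if next() later had to step over a variable number of code units.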
there already exists a method iterateCodePoints, which uses a pointer to the
index of the next UTF-16 code unit as the iterator (note that this interface
depends on immutability of the buffer):
    inline sal_uInt32 iterateCodePoints(
        sal_Int32 * indexUtf16, sal_Int32 incrementCodePoints = 1) const
problem is, nobody is using it...
guess you could comment out operator[], that should find lots of
convertible call sites :)
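For illustration, here is a self-contained sketch of the iteration pattern that such an interface implies. It does not use the real OUString: nextCodePoint is a hypothetical stand-in that only supports a forward step of one code point, and std::u16string plays the role of the immutable buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a forward-only iterateCodePoints: reads the code
// point starting at *index and advances *index past it (by one or two UTF-16
// code units).
inline std::uint32_t nextCodePoint(const std::u16string& s, std::size_t* index)
{
    assert(index != nullptr && *index < s.size());
    const char16_t hi = s[(*index)++];
    const bool isHigh = hi >= 0xD800 && hi <= 0xDBFF;
    if (isHigh && *index < s.size())
    {
        const char16_t lo = s[*index];
        if (lo >= 0xDC00 && lo <= 0xDFFF)
        {
            ++*index; // consume the low surrogate as well
            return 0x10000 + ((std::uint32_t(hi) - 0xD800) << 10)
                   + (std::uint32_t(lo) - 0xDC00);
        }
    }
    return hi; // BMP character, or an unpaired surrogate passed through as-is
}

// Collects all code points of s by repeatedly calling nextCodePoint.
inline std::vector<std::uint32_t> codePoints(const std::u16string& s)
{
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i < s.size();)
        out.push_back(nextCodePoint(s, &i));
    return out;
}
```

A loop of this shape is what a call site that currently indexes with operator[] would have to become wherever it may encounter characters outside the BMP.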
Note that in the common case of accessing (i.e., searching for, etc.)
7-bit ASCII content in a string, regardless of whether it is internally
represented as UTF-8 or UTF-16, going via an operator[] interface that
operates directly on the string object's innards might be more efficient
than going via an iterator interface (which is, of course, necessary
when potentially accessing non-ASCII content).
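A small sketch of the ASCII case, assuming std::string holds UTF-8 and std::u16string holds UTF-16 (indexOfAscii is a made-up helper, not an existing API). In both encodings a code unit below 0x80 can only be an ASCII character in its own right, never part of a multi-unit sequence, so a plain scan over code units is correct:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the code-unit index of the first occurrence of the 7-bit ASCII
// character `ascii` in s, or -1 if it does not occur. Works unchanged for
// UTF-8 (CharT = char) and UTF-16 (CharT = char16_t) buffers.
template <typename CharT>
std::ptrdiff_t indexOfAscii(const std::basic_string<CharT>& s, char ascii)
{
    assert(static_cast<unsigned char>(ascii) < 0x80);
    const CharT target = static_cast<CharT>(static_cast<unsigned char>(ascii));
    for (std::size_t i = 0; i < s.size(); ++i)
        if (s[i] == target)
            return static_cast<std::ptrdiff_t>(i);
    return -1;
}
```

Note that the returned index counts code units, so the same text gives different indices in the two encodings once non-ASCII characters precede the match.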
What an ideal string abstraction would look like is not clear to me at all.
Stephan
_______________________________________________
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice