On 02/22/2012 11:25 AM, Michael Meeks wrote:
        Great ! :-) incidentally, I had one minor point around the ASCII vs.
UTF-8 side; the rtl_string2UString (cf. sal/rtl/source/string.cxx) does
a typically slower UTF-8 length counting loop; I suggest that we could
do better performance wise (and we do create a biggish scad of these
strings) by sticking with ascii, and doing a single, simple copy/expand
of the string. Perhaps in a new rtl_uString_newFromAsciiL method.
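
For illustration, the single copy/expand pass could look roughly like the sketch below. It is only a sketch under assumptions: newFromAsciiL is a hypothetical stand-in for the proposed rtl_uString_newFromAsciiL, and it returns a std::u16string instead of allocating an rtl_uString.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the proposed rtl_uString_newFromAsciiL.
// Assumption: every input byte is 7-bit ASCII, so each byte maps to exactly
// one UTF-16 code unit and no separate length-counting pass is needed.
std::u16string newFromAsciiL(char const * s, std::size_t len) {
    std::u16string out;
    out.reserve(len); // output length equals input length
    for (std::size_t i = 0; i != len; ++i) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        assert(c < 0x80); // ASCII-only by contract
        out.push_back(static_cast<char16_t>(c));
    }
    return out;
}
```

Because the output length is known up front, the function allocates once and widens byte by byte, which is where the saving over a general UTF-8 conversion would come from.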

Thinking about it again, the restriction to ASCII could become a hindrance in the longer run. C++11 has provision for UTF-8 string literals (u8"..."), but they still have type char const[N], so they cannot be distinguished from traditional plain "..." literals via function overloading. So if we ever wanted to extend the new facilities to also support UTF-8 string literals while keeping the performance benefit for the ASCII-only case, we could not offer the same simple syntax

  rtl::OUString("foo");
  rtl::OUString(u8"I\u2764C++");

for both.  One solution might be to go via an indirection

  template<std::size_t N> struct A { char const s[N]; };
  template<std::size_t N> struct U { char const s[N]; };

that encodes whether the given string literal is ASCII or UTF-8, and have the rtl::OUString ctors overloaded on those. Of course, this would bring back ugly warts into client code

  rtl::OUString(rtl::A("foo"));
  rtl::OUString(rtl::U(u8"I\u2764C++"));

And of course it would also work to syntactically optimize the ASCII case (as we would do now) and add the indirection only for the UTF-8 case (at the expense of some ugly asymmetry).
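
To make the indirection a bit more concrete, here is a minimal sketch of how the overloading could work. Everything in it is an assumption for illustration: Str is a toy stand-in for rtl::OUString backed by std::u16string, the tags hold a pointer plus length rather than a char array, the helper functions ascii() and utf8() are hypothetical (they exist only so that N gets deduced from the literal), and the UTF-8 decoder assumes well-formed input and does no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Tag types recording what the caller promises about the literal's encoding.
struct A { char const * s; std::size_t n; };
struct U { char const * s; std::size_t n; };

// Hypothetical helpers: deduce the literal's length, dropping the final '\0'.
template<std::size_t N> A ascii(char const (&s)[N]) { return A{s, N - 1}; }
template<std::size_t N> U utf8(char const (&s)[N]) { return U{s, N - 1}; }

// Toy stand-in for rtl::OUString.
class Str {
public:
    // ASCII fast path: one pass, each byte widens to one UTF-16 code unit.
    explicit Str(A a) {
        data_.reserve(a.n);
        for (std::size_t i = 0; i != a.n; ++i) {
            unsigned char c = static_cast<unsigned char>(a.s[i]);
            assert(c < 0x80); // ASCII-only by contract
            data_.push_back(static_cast<char16_t>(c));
        }
    }

    // UTF-8 slow path: decode each sequence, then encode it as UTF-16.
    // Assumes well-formed UTF-8; malformed input is not detected.
    explicit Str(U u) {
        for (std::size_t i = 0; i != u.n;) {
            unsigned char b = static_cast<unsigned char>(u.s[i]);
            // Number of continuation bytes implied by the lead byte.
            int extra = b < 0x80 ? 0 : b < 0xE0 ? 1 : b < 0xF0 ? 2 : 3;
            // Payload bits of the lead byte (masks 0x1F, 0x0F, 0x07).
            char32_t c = extra == 0 ? b : b & (0x3F >> extra);
            ++i;
            for (; extra != 0 && i != u.n; --extra, ++i) {
                c = (c << 6) | (static_cast<unsigned char>(u.s[i]) & 0x3F);
            }
            if (c < 0x10000) {
                data_.push_back(static_cast<char16_t>(c));
            } else { // outside the BMP: emit a surrogate pair
                c -= 0x10000;
                data_.push_back(static_cast<char16_t>(0xD800 + (c >> 10)));
                data_.push_back(static_cast<char16_t>(0xDC00 + (c & 0x3FF)));
            }
        }
    }

    std::u16string const & get() const { return data_; }

private:
    std::u16string data_;
};
```

Client code would then read Str(ascii("foo")) and Str(utf8("I\xE2\x9D\xA4" "C++")), where the UTF-8 bytes of U+2764 are spelled out by hand so that the sketch does not depend on how the compiler types u8 literals. The asymmetric variant would keep a plain char const (&)[N] ctor for the ASCII case and require only the U wrapper.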

Just some thoughts,
Stephan
