Re: RTL_CONSTASCII_USTRINGPARAM: cleanup wanted?

Stephan Bergmann Wed, 22 Feb 2012 04:43:03 -0800

On 02/22/2012 11:25 AM, Michael Meeks wrote:

        Great ! :-) incidentally, I had one minor point around the ASCII vs.
UTF-8 side; the rtl_string2UString (cf. sal/rtl/source/string.cxx) does
a typically slower UTF-8 length counting loop; I suggest that we could
do better performance wise (and we do create a biggish scad of these
strings) by sticking with ascii, and doing a single, simple copy/expand
of the string. Perhaps in a new rtl_uString_newFromAsciiL method.
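
To make the cost difference concrete, here is a minimal sketch. It is not the actual sal/rtl code; asciiToUtf16 and utf16LengthOfUtf8 are hypothetical helpers, and std::u16string stands in for rtl_uString. For ASCII the result length equals the input length, so one pass suffices. For UTF-8 the number of UTF-16 units has to be counted (or the buffer grown) before the copy. The counting loop assumes well-formed UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: widen pure ASCII to UTF-16. The output length is
// known up front, so there is one allocation and one simple loop.
inline std::u16string asciiToUtf16(char const * s, std::size_t n) {
    std::u16string out(n, u'\0');
    for (std::size_t i = 0; i != n; ++i) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        assert(c < 0x80); // precondition: ASCII only
        out[i] = static_cast<char16_t>(c);
    }
    return out;
}

// Hypothetical helper: count the UTF-16 code units that well-formed UTF-8
// input would need. This is the extra pass the ASCII path avoids.
inline std::size_t utf16LengthOfUtf8(char const * s, std::size_t n) {
    std::size_t units = 0;
    for (std::size_t i = 0; i != n; ++i) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        if ((c & 0xC0) == 0x80) {
            continue; // continuation byte, belongs to a previous lead byte
        }
        units += (c >= 0xF0) ? 2 : 1; // 4-byte sequences need a surrogate pair
    }
    return units;
}
```

For example, the seven bytes of "I\xE2\x9D\xA4" "C++" (the UTF-8 encoding of I, U+2764, C++) count as five UTF-16 units, whereas "foo" is simply three bytes in and three units out.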

Thinking about it again, the restriction to ASCII could become a
hindrance in the longer run.  C++11 has provision for UTF-8 string
literals (u8"..."), but they still have type char const[], so are not
distinguishable from traditional plain "..." literals via function
overloading.  So, if we ever wanted to extend the new facilities to also
support UTF-8 string literals, but would want to keep the performance
benefit for the ASCII-only case, we could not offer the same simple syntax


  rtl::OUString("foo");
  rtl::OUString(u8"I\u2764C++");

for both.  One solution might be to go via an indirection

  template<std::size_t N> struct A { char const s[N]; };
  template<std::size_t N> struct U { char const s[N]; };

that encodes the knowledge whether the given string literal is ASCII or
UTF-8, and have rtl::OUString ctors overloaded on those.  Of course,
this would bring back ugly warts into client code


  rtl::OUString(rtl::A("foo"));
  rtl::OUString(rtl::U(u8"I\u2764C++"));

And of course it would also work to syntactically optimize the ASCII
case (as we would do now) and add the indirection only for the UTF-8
case (at the expense of some ugly asymmetry).
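
A compilable sketch of that indirection, under several assumptions: Str is a hypothetical stand-in for rtl::OUString (backed by std::u16string), the wrappers hold a reference to the literal instead of a copy, and A()/U() are helper functions so that N is deduced at the call site. The UTF-8 decoder assumes well-formed input. The example passes the UTF-8 bytes as a plain literal with escapes, because since C++20 a u8"..." literal has type char8_t const[] rather than char const[].

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical tag wrappers: same payload, but the encoding is in the type.
template<std::size_t N> struct AsciiLit { char const (&s)[N]; };
template<std::size_t N> struct Utf8Lit { char const (&s)[N]; };

template<std::size_t N> AsciiLit<N> A(char const (&s)[N]) { return AsciiLit<N>{s}; }
template<std::size_t N> Utf8Lit<N> U(char const (&s)[N]) { return Utf8Lit<N>{s}; }

// Hypothetical stand-in for rtl::OUString.
class Str {
public:
    // Asymmetric convenience: a bare literal is treated as ASCII.
    template<std::size_t N> Str(char const (&s)[N]) : Str(A(s)) {}

    // ASCII path: length known up front, plain widening copy.
    template<std::size_t N> Str(AsciiLit<N> a) : data_(N - 1, u'\0') {
        for (std::size_t i = 0; i + 1 < N; ++i) {
            unsigned char c = static_cast<unsigned char>(a.s[i]);
            assert(c < 0x80); // precondition: ASCII only
            data_[i] = static_cast<char16_t>(c);
        }
    }

    // UTF-8 path: decode each sequence, emit one or two UTF-16 units.
    template<std::size_t N> Str(Utf8Lit<N> u) {
        for (std::size_t i = 0; i + 1 < N; ) {
            unsigned char c = static_cast<unsigned char>(u.s[i]);
            std::size_t len = c < 0x80 ? 1 : c < 0xE0 ? 2 : c < 0xF0 ? 3 : 4;
            char32_t cp = len == 1 ? c : c & (0x7F >> len);
            for (std::size_t k = 1; k < len && i + k + 1 < N; ++k) {
                cp = (cp << 6) | (static_cast<unsigned char>(u.s[i + k]) & 0x3F);
            }
            i += len;
            if (cp < 0x10000) {
                data_.push_back(static_cast<char16_t>(cp));
            } else {
                cp -= 0x10000; // encode as a surrogate pair
                data_.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
                data_.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
            }
        }
    }

    std::u16string const & data() const { return data_; }

private:
    std::u16string data_;
};
```

With this, Str("foo") and Str(A("foo")) both take the cheap ASCII path, while Str(U("I\xE2\x9D\xA4" "C++")) goes through the decoder. Which constructor runs is decided at compile time by the wrapper type, not by inspecting the bytes.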


Just some thoughts,
Stephan
_______________________________________________
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice
