On 03.12.2013 09:13, Andre Fischer wrote:
A developer who apparently wants to remain anonymous has added the
function isEmpty() to the rtl::OUString class. See
main/sal/inc/rtl/ustring.hxx for not much more information.
Sorry for being too short. The full semantics of isEmpty() are:
"The method isEmpty() returns true if the string is empty. If the length
of the string is one or two or three or any number bigger than zero then
isEmpty() returns false."
I added isEmpty() to make it possible to cleanly express the check for
an empty string. In our codebase there were quite a few constructs such as
if( aString) {}
which were intended to mean
if( aString.isEmpty()) {}
What's funny is that the old construct compiled but it did the wrong
thing: The string was implicitly converted to a pointer to its elements
and that pointer was then compared against NULL. For our OUString that
pointer was always non-NULL though.
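The pitfall described above can be reproduced with a small stand-in class. This is only a sketch: DemoString and oldStyleCheck() are hypothetical names made up for illustration, and std::u16string is assumed as the storage instead of the real OUString internals.

```cpp
#include <cassert>
#include <string>

// DemoString is a hypothetical stand-in for rtl::OUString, not the real class.
class DemoString
{
public:
    DemoString() {}
    explicit DemoString(const std::u16string& rText) : maText(rText) {}

    // The dangerous implicit conversion to a pointer to the elements.
    operator const char16_t*() const { return maText.c_str(); }

    bool isEmpty() const { return maText.empty(); }

private:
    std::u16string maText;
};

// Mimics the old construct: the string converts to a pointer and the
// pointer is tested against null. c_str() never returns null.
inline bool oldStyleCheck(const DemoString& rString)
{
    if (rString)
        return true;
    return false;
}

inline void demoOldStyleCheck()
{
    const DemoString aEmpty;
    const DemoString aText(u"text");
    assert(oldStyleCheck(aEmpty));   // true even though the string is empty
    assert(oldStyleCheck(aText));
    assert(aEmpty.isEmpty());        // the explicit check tells them apart
    assert(!aText.isEmpty());
}
```

The pointer test gives the same answer for both strings, while isEmpty() distinguishes them.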
Please see issue 123068 for further problems caused by the implicit
conversion of the OUString to a pointer to its elements. This dangerous
conversion is now disabled. By making the conversion operator private,
all such problems will be found and prevented by the compiler. When we're
confident that all of them have been found, the operator can be removed
completely.
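How a private conversion operator lets the compiler catch these uses can be sketched as follows. GuardedString is a hypothetical stand-in, not the real OUString, and the static_asserts assume a C++11 compiler; they only confirm that outside code can no longer rely on the conversion.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// GuardedString is a hypothetical stand-in for rtl::OUString.
class GuardedString
{
public:
    GuardedString() {}
    explicit GuardedString(const std::u16string& rText) : maText(rText) {}

    bool isEmpty() const { return maText.empty(); }
    const char16_t* getStr() const { return maText.c_str(); }

private:
    // Private: code outside the class that relies on it no longer compiles.
    operator const char16_t*() const { return maText.c_str(); }

    std::u16string maText;
};

// Outside the class the conversion is inaccessible, so these hold.
static_assert(!std::is_convertible<GuardedString, const char16_t*>::value,
              "no implicit conversion to an element pointer");
static_assert(!std::is_convertible<GuardedString, bool>::value,
              "no route to bool through the element pointer either");

inline void demoGuardedString()
{
    const GuardedString aEmpty;
    const GuardedString aText(u"text");
    assert(aEmpty.isEmpty());
    assert(!aText.isEmpty());
    assert(aText.getStr()[0] == u't');   // explicit access still works
}
```

Callers then have to spell out what they mean, either isEmpty() or getStr().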
This in itself may not yet be very exciting but I hope that it is the
first of several improvements to one of our most frequently used
classes. Sadly, we missed the opportunity to make some more substantial
but incompatible changes for the 4.0 release. However, some changes that
make OUString more accessible to new (and old) developers might include:
- Make construction from string literal more straightforward. At the
moment you have to write
::rtl::OUString("text", sizeof("text")-1, RTL_TEXTENCODING_ASCII_US)
or slightly shorter and safer
::rtl::OUString::createFromAscii("text")
Allocating heap space, transcoding a literal string to this memory and
deallocating it later when the string is deleted are quite wasteful
operations, especially considering that the literal string is
already there. It would be great if constructs such as
OUString( L"hello")
used the pointer to the UTF-16 literal directly instead of copying its
contents around. The same applies to OString. The 'L' prefix only yields
UTF-16 on Windows, where wchar_t is 16 bits wide, but C++11 offers
portable alternatives with its u"..." and u8"..." Unicode string literals.
Also we shouldn't burden our main string classes with non-Unicode
support. External tooling for converting from/to other encodings
would still be needed, though.
Looking over our string processing I'm confident that we could get along
great with UTF-8 strings. A conversion to UTF-16 would only be needed
when interfacing with other APIs.
And if we were using UTF-8 byte strings we could base them directly on
the standard std::string.
- Conversion back to char* is not much better
::rtl::OUStringToOString(sOUStringVariable,
RTL_TEXTENCODING_ASCII_US).getStr()
This awful construct could be made much simpler if our strings were
always Unicode (UTF-8/UTF-16/UTF-32).
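To make the two points above more concrete, here is a minimal sketch of a C++11 UTF-16 literal and of a conversion helper that lives outside the string class. The names aHelloLiteral and utf16ToUtf8() are hypothetical, nothing like them exists in our code base, and the helper assumes well-formed UTF-16 input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A C++11 UTF-16 literal: it sits in static memory, no transcoding needed.
static const char16_t aHelloLiteral[] = u"hello";

// Hypothetical external helper: converts UTF-16 to a UTF-8 std::string.
// Unpaired surrogates are not rejected, they are encoded as three bytes.
inline std::string utf16ToUtf8(const std::u16string& rIn)
{
    std::string aOut;
    for (std::size_t i = 0; i < rIn.size(); ++i)
    {
        char32_t c = rIn[i];
        // Combine a high/low surrogate pair into one code point.
        if (c >= 0xD800 && c <= 0xDBFF && i + 1 < rIn.size()
            && rIn[i + 1] >= 0xDC00 && rIn[i + 1] <= 0xDFFF)
        {
            c = 0x10000 + ((c - 0xD800) << 10) + (rIn[i + 1] - 0xDC00);
            ++i;
        }
        if (c < 0x80)
        {
            aOut += static_cast<char>(c);
        }
        else if (c < 0x800)
        {
            aOut += static_cast<char>(0xC0 | (c >> 6));
            aOut += static_cast<char>(0x80 | (c & 0x3F));
        }
        else if (c < 0x10000)
        {
            aOut += static_cast<char>(0xE0 | (c >> 12));
            aOut += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            aOut += static_cast<char>(0x80 | (c & 0x3F));
        }
        else
        {
            aOut += static_cast<char>(0xF0 | (c >> 18));
            aOut += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
            aOut += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            aOut += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return aOut;
}

inline void demoUtf16ToUtf8()
{
    // Five characters plus the terminating zero.
    assert(sizeof(aHelloLiteral) / sizeof(aHelloLiteral[0]) == 6);
    assert(utf16ToUtf8(aHelloLiteral) == "hello");
    assert(utf16ToUtf8(u"\u00FC") == "\xC3\xBC");              // two bytes
    assert(utf16ToUtf8(u"\U0001F600") == "\xF0\x9F\x98\x80");  // surrogate pair
}
```

With UTF-8 std::string as the working type, getting a char* is just a call to c_str() on the result.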
Do you have more ideas?
We could use ideas from languages such as Python/Perl/Java for convenient
and powerful string processing to replace the awkward string handling that
is too often seen in our code base. E.g. having regexp-enabled match()
or search() methods would be a great start.
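As a rough idea of what such helpers could look like, here is a sketch built on the C++11 <regex> library and plain std::string. The free functions match() and search() are hypothetical, and note that std::regex works on bytes here, so it does not treat multi-byte UTF-8 sequences as single characters.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if the whole text matches the pattern.
inline bool match(const std::string& rText, const std::string& rPattern)
{
    return std::regex_match(rText, std::regex(rPattern));
}

// Hypothetical helper: returns the first substring that matches the
// pattern, or an empty string if there is none.
inline std::string search(const std::string& rText, const std::string& rPattern)
{
    const std::regex aRegex(rPattern);
    std::smatch aResult;
    if (std::regex_search(rText, aResult, aRegex))
        return aResult.str();
    return std::string();
}

inline void demoRegexHelpers()
{
    assert(match("foo123", "[a-z]+[0-9]+"));
    assert(!match("foo123bar", "[a-z]+[0-9]+"));   // must cover the whole text
    assert(search("abc 42 def", "[0-9]+") == "42");
    assert(search("no digits", "[0-9]+").empty());
}
```

The same two operations as methods on the string class would cover a lot of the hand-written scanning code.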
Herbert