On Jun 15, 2018, at 12:55 PM, Christian Stimming <christ...@cstimming.de>
wrote:
Am Freitag, 15. Juni 2018, 10:05:09 schrieb John Ralls:
A very good catch indeed. But pre-constructing the string in
qofbook.cpp only saves two string constructions per invocation as the
vector still has to make its own copies. I guess that its much worse
for you because the ancient gcc on Ubuntu 14.04 (gcc4.8) doesn't do
small-string optimization.
Is there any reason we cant use std::string& in the vector? Or do we
think that we might lose references there?
In this instance the string is static so it would be safe to use const
std::string&, but while there's an is_volatile type trait there's no
is_static so we can't insert a static assert to catch cases where the
actual std::string is on the stack. That could lead to some ugly bugs.
The std containers are specified in a way so that they make only sense
to
contain the values by-value (ok, this might be formulated a bit too
short, but
not too long ago this was a good rule of thumb). It is up to the
implementation to make optimizations (copy-on-write and such), but the
interface requires the programmer to think of the stored type as
"by-value".
Nevertheless I think my change made good use of the std::string's
already
existing optimizations. In particular, the saving was so surprisingly
large
here that I think gcc's std::string itself implements some sort of
copy-on-
write optimization, and this means its content isn't deep-copied in
these
calls.
Some numbers from the valgrind/callgrind evaluation. I got the following
number of calls, starting from qof_query_run and skipping some
uninterestings
calls in between:
- 6 qof_query_run
- 6 g_list_sort_with_data
- 360,000 (approx.) sort_func - due to the number of splits and txn in
my file
- 360,000 xaccSplitOrder
The most expensive part in this function is this next callee:
- 360,000 qof_book_use_split_action_for_num_field
- 360,000 (approx.) qof_book_get_property
This function in turn has approx. 1,100,000 calls to std::basic_string
constructor and destructor. The change in my commit is this:
Before my commit, these calls also caused 1,100,000 calls to
std::string::_S_create and std::string::_M_destroy i.e. the by-value
constructor and destructor. After my commit, I see a mere 22,000 calls
to
std::string::_S_create and std::string::_M_destroy.
Besides, using strings is still leaving performance on the table: A
string
comparison is n times longer than an int comparison where n is the
number
of consecutive leading characters that are the same in the two strings.
I
got a huge speedup on loading a few months ago because GncGUID's
operator==() was doing a character-wise comparison of the ascii
representation instead of comparing the two int64_ts of the actual GUID.
Sounds good. The above numbers seem to indicate that std::string already
optimizes away a long comparison as long as the string memory itself is
identical. In my opinion this is good enough of an optimization.
KVP paths aren't really a good use for std::string anyway: It's a lot of
weight for something that's used as a human-readable index. Even static
char[] is heavy for this purpose. We could get a huge boost if we use
something like Glib's quarks [1].
Want to have a go at that?
Err... this would mean a major refactoring of any access to the KVP
system.
No, I don't feel motivated to work into that direction. Also, if I
understnad
the above valgrind evaluation correctly, we can already achieve almost
optimum
(i.e. pointer comparison) by ensuring a set of std::string constants to
be
used. I think this is the most efficient way to proceed here, and I
would just
push this commit after some more cleanup.
Christian,
Hardly major, particularly considering that Aaron has already completely
rewritten KVP into C++ and separately changed the access from '/'-delimited
string paths to std::vector<std::string> of the tokens.
All that's required is lifting Glib's GQuark and converting it to C++,
changing KVP frame's std::vector<std::string> to std::vector<GncQuark>, and
using g_quark_intern_from_static_string instead of static std::string.
The situation you've tested is a bit of a special case because all of the
uses of the KVP_OPTIONS keys are in one file. Any keys used in more than
one file will have different static std::strings and pointer comparison
won't work. Examples include anything accessed with qof_instance_set/get.
That said, the mallocs/deallocs are 99% or more of the performance hit. If
you're willing and have time to do it your way and have time to get it done
quickly then by all means go ahead.
Regards,
John Ralls
_______________________________________________
gnucash-devel mailing list
gnucash-devel@gnucash.org
https://lists.gnucash.org/mailman/listinfo/gnucash-devel