On 5/17/19 10:24 PM, Jonathan Wakely wrote:
On 17/05/19 18:19 +0200, François Dumont wrote:
Hi
I got tired of '__n' being used in _Hashtable for many different
purposes: node, bucket, bucket count, bucket hint. It makes the code
difficult to read. This code makes sure that __n is a node except is
some very limited use cases where the method name is clear enough to
tell what __n means.
So I'd like to commit this patch which only change that and some
comments before moving forward to more serious stuff. The only code
change is a use of auto return type on _M_allocate_node.
My main concern is the ChangeLog entry. Is the following entry ok ?
Rename variables and cleanup comments.
* include/bits/hashtable_policy.h
* include/bits/hashtable.h
Tested under Linux x86_64 (even if it can't be otherwise)
François
@@ -350,24 +347,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_M_base_alloc() { return *this; }
__bucket_type*
- _M_allocate_buckets(size_type __n)
+ _M_allocate_buckets(size_type __bkt_count)
This is a much more helpful name, thanks.
@@ -439,30 +436,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{ }
explicit
- _Hashtable(size_type __n,
+ _Hashtable(size_type __bkt_hint,
This seems less helpful. Would __num_bkts_hint be clearer?
Or for consistency, __bkt_count_hint?
I thought also about a longer name like this one but then I considered
that I didn't want to make it too long and that '__bkt_hint' was enough
know that this parameter was related to the buckets. But I can use
__bkt_count_hint if you prefer.
@@ -1415,9 +1414,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
-> iterator
{
__hash_code __code = this->_M_hash_code(__k);
- std::size_t __n = _M_bucket_index(__k, __code);
- __node_type* __p = _M_find_node(__n, __k, __code);
- return __p ? iterator(__p) : end();
+ std::size_t __bkt = _M_bucket_index(__k, __code);
+ __node_type* __n = _M_find_node(__bkt, __k, __code);
+ return __n ? iterator(__n) : end();
Is __n really an improvement over __p here?
Outside any context no but within _Hashtable implementation __n is
already used most of the time to indicate a node. This is patch just try
to fix this 'most of the time' part.
If you're changing it, __node or __ptr might be an improvement, but
changing __p to __n seems like unnecessary churn.
I'm not convinced that __n is a big enough improvement over __p to
bother changing dozens of lines, for not much benefit. All those
changes will make it slower to use git blame to track down when thigns
changed, and will make it harder to review diffs between trunk and
older branches.
Yes, this is why I wanted to commit it outside of any real change so
that this commit can be ignore from any git log or blame more easily.
So I can limit the patch to renaming __n occurences into __bkt,
__bkt_count_hint, __bkt_count when possible and leave other __n but also
__p & al untouch.
@@ -1479,17 +1478,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
-> pair<iterator, iterator>
{
__hash_code __code = this->_M_hash_code(__k);
- std::size_t __n = _M_bucket_index(__k, __code);
- __node_type* __p = _M_find_node(__n, __k, __code);
+ std::size_t __bkt = _M_bucket_index(__k, __code);
+ __node_type* __n = _M_find_node(__bkt, __k, __code);
- if (__p)
+ if (__n)
{
- __node_type* __p1 = __p->_M_next();
- while (__p1 && _M_bucket_index(__p1) == __n
- && this->_M_equals(__k, __code, __p1))
- __p1 = __p1->_M_next();
+ __node_type* __n1 = __n->_M_next();
__p1 is not a good name, but __n1 is no better.
At least with __p the second pointer could be __q, which is a fairly
idiomatic pairing of letters :-)
How about __first and __last? Or __n and __next? Even __n1 and __n2
seems better than __n and __n1. Those pointers end up being used for
the 'first' and 'second' members of a pair, so __n1 and __n2 makes
more sense than setting 'first' from __n and 'second' from __n1.
But I don't feel strongly about it, so if it's just me who dislikes
__n and __n1 then it doesn't matter.
diff --git a/libstdc++-v3/include/bits/hashtable_policy.h
b/libstdc++-v3/include/bits/hashtable_policy.h
index a4d2a97f4f3..bb2e7b762ff 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -181,7 +181,7 @@ namespace __detail
* @tparam _Cache_hash_code Boolean value. True if the value of
* the hash function is stored along with the value. This is a
* time-space tradeoff. Storing it may improve lookup speed by
- * reducing the number of times we need to call the _Equal
+ * reducing the number of times we need to call the _Hash
Doesn't it reduce both?
In _M_equals we don't bother calling the _Equal predicate if the
cached hash code doesn't match the one for the key we're comparing.
Sure it reduces both, I'll just add _Hash (but first)