Hi Martin,

On 5/5/20 8:26 PM, Martin Buchholz wrote:
See related:
https://guava.dev/releases/23.0/api/docs/com/google/common/collect/Maps.html#newHashMapWithExpectedSize-int-


This is basically the same calculation (or at least gives same result) as Naoto did (without the max part):

Naoto: (int)(expectedSize / 0.75f) + 1

Guava: (int) ((float) expectedSize / 0.75F + 1.0F)

but in case expectedSize is a multiple of 3, it gives the result which is 1 more than needed. If what is needed is also a power of 2, then twice the needed space is allocated in the HashMap backing table.


Regards, Peter



On Tue, May 5, 2020 at 11:03 AM <naoto.s...@oracle.com <mailto:naoto.s...@oracle.com>> wrote:

    And here is the fix. Please review.

    http://cr.openjdk.java.net/~naoto/8244459/webrev.00/

    Naoto

    On 5/5/20 10:25 AM, naoto.s...@oracle.com
    <mailto:naoto.s...@oracle.com> wrote:
    > Hi Peter,
    >
    > You are correct. Thanks. I'll remove that initial value of 16.
    >
    > Naoto
    >
    > On 5/5/20 9:37 AM, Peter Levart wrote:
    >> Hi Naoto,
    >>
    >> On 4/30/20 12:18 AM, naoto.s...@oracle.com
    <mailto:naoto.s...@oracle.com> wrote:
    >>> Hello,
    >>>
    >>> Please review this small fix to the following issue:
    >>>
    >>> https://bugs.openjdk.java.net/browse/JDK-8244152
    >>>
    >>> The proposed changeset is located at:
    >>>
    >>> https://cr.openjdk.java.net/~naoto/8244152/webrev.00/
    >>>
    >>> The hash map used there didn't have initial capacity, even
    though the
    >>> exact numbers are known.
    >>
    >>
    >> Well, it has to be calculated 1st (countTokens), but I guess
    this pays
    >> off when HashSet (the backing HashMap) does not have to be
    rehashed then.
    >>
    >> The expression you use:
    >>
    >>      Math.max((int)(tokens.countTokens() / 0.75f) + 1, 16)
    >>
    >> ...has a minimum value of 16. Why is that? 16 is just HashMap's
    >> default initialCapacity if not specified explicitly. But if you
    only
    >> want to store say 1 entry in the map, you can specify 2 as
    >> initialCapacity and HashMap will happily work for such case
    without
    >> resizing.
    >>
    >>
    >> So you could just use:
    >>
    >>      (int)(tokens.countTokens() / 0.75f) + 1
    >>
    >> And even this expression is sometimes overshooting the minimal
    >> required value by 1 (when # of tokens is "exact" multiple of
    0.75f,
    >> say 6). I think the following could be used to optimally
    pre-size the
    >> HashMap with default load factor 0.75:
    >>
    >>      (tokens.countTokens() * 4 + 2) / 3
    >>
    >>
    >> Regards, Peter
    >>
    >>>
    >>> Naoto

Reply via email to