+1 on setting the initial capacity only when we have a good expectation of the collection size.

Thank you~

Xintong Song


On Thu, Aug 1, 2019 at 2:32 PM Andrey Zagrebin <and...@ververica.com> wrote:

> Hi all,
>
> As you probably already noticed, Stephan has triggered a discussion thread
> about a code style guide for Flink [1]. Recently we were discussing
> internally some smaller concerns and I would like to start separate threads
> for them.
>
> This thread is about always creating collections with an initial capacity.
> As you might have seen, some parts of our code base always initialise
> collections with some non-default capacity. You can even activate a check
> in IntelliJ IDEA that monitors and highlights the creation of collections
> without an initial capacity.
>
> Pros:
> - performance gain if there is good reasoning about the initial capacity
> - the capacity is always deterministic and does not depend on any changes
>   of its default value in Java
> - easy to follow: always initialise, has IDE support for detection
>
> Cons (for initialising w/o good reasoning):
> - We are trying to outsmart the JVM. When there is no good reasoning about
>   the initial capacity, we can rely on the JVM default value.
> - It is even confusing, e.g. for hash maps, as the real size depends on the
>   load factor.
> - It would only add a minor performance gain.
> - A bit more code, which increases the maintenance burden.
>
> The conclusion is the following at the moment:
> Only set the initial capacity if you have a good idea about the expected
> size.
>
> Please feel free to share your thoughts.
>
> Best,
> Andrey
>
> [1]
> http://mail-archives.apache.org/mod_mbox/flink-dev/201906.mbox/%3ced91df4b-7cab-4547-a430-85bc710fd...@apache.org%3E
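
To make the hash map point from the quoted list concrete, here is a minimal Java sketch of the rule "only set the initial capacity if you have a good idea about the expected size". It assumes HashMap's documented default load factor of 0.75. The class and method names (InitialCapacityExample, hashMapCapacityFor, indexByName, collectMatching) are hypothetical, made up for illustration, and not part of Flink.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; none of these names exist in Flink.
final class InitialCapacityExample {

    // HashMap's documented default load factor.
    static final double DEFAULT_LOAD_FACTOR = 0.75;

    // Capacity to request so that expectedSize entries stay at or below
    // capacity * loadFactor, i.e. no resize is triggered while filling the map.
    static int hashMapCapacityFor(int expectedSize) {
        return (int) Math.ceil(expectedSize / DEFAULT_LOAD_FACTOR);
    }

    // Size is known up front: exactly one entry per (distinct) name,
    // so an explicit capacity is well reasoned here.
    static Map<String, Integer> indexByName(List<String> names) {
        Map<String, Integer> index = new HashMap<>(hashMapCapacityFor(names.size()));
        for (int i = 0; i < names.size(); i++) {
            index.put(names.get(i), i);
        }
        return index;
    }

    // Size is not known up front: how many names match depends on the data,
    // so we rely on the default capacity instead of guessing.
    static List<String> collectMatching(List<String> names, String prefix) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (name.startsWith(prefix)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Requesting "100" is not enough for 100 entries at load factor 0.75;
        // the helper asks for 134 instead.
        System.out.println(hashMapCapacityFor(100));
    }
}
```

This is why passing the expected element count straight into new HashMap<>(n) is easy to get wrong: the map grows once the entry count exceeds capacity times load factor, so the requested capacity has to be larger than n. For ArrayList the capacity is the element count, which makes that case simpler.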