Hi Ivan P.,

I configured it for both PDS (Indexing) and PDS 4 (the latter was requested by
Nikita Tolstunov). It totally worked, not a single 137 since then. The
occasional exit code 130 will be fixed in [1], it has a different problem
behind it.
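In case it helps whoever ends up editing the suites: a throwaway check along
the lines below (the class name is purely illustrative, it is not part of the
codebase) can be run from a test to confirm that the variable actually reaches
the forked JVMs on an agent, and it prints the worst-case 8 * cores * 64 MB
estimate discussed further down the thread.

    /** Illustrative only: prints the glibc arena limit the forked JVM would run with. */
    public class MallocArenaCheck {
        public static void main(String[] args) {
            String arenaMax = System.getenv("MALLOC_ARENA_MAX");
            int cores = Runtime.getRuntime().availableProcessors();

            // glibc default on 64-bit: up to 8 arenas per core, each reserving ~64 MB of virtual space.
            long arenas = arenaMax == null ? 8L * cores : Long.parseLong(arenaMax.trim());

            System.out.println("MALLOC_ARENA_MAX=" + arenaMax + ", cores=" + cores
                + ", worst case ~" + (arenas * 64) + " MB of arenas");
        }
    }

On a 32-core agent with the variable unset this reports roughly 16 GB, which
matches the numbers in my original mail quoted below.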
Now I'm trying to find someone who knows TC configuration better and will be
able to propagate the setting to all suites. Also, I don't have access to the
agents, so "jemalloc" is definitely not an option for me specifically.

[1] https://issues.apache.org/jira/browse/IGNITE-13266

Sun, Jul 26, 2020 at 17:36, Ivan Pavlukhin <vololo...@gmail.com>:

> Ivan B.,
>
> I noticed that you were able to configure environment variables for
> PDS (Indexing). Do field experiments show that the suggested approach
> fixes the problem?
>
> Interesting stuff with jemalloc. It might be useful to file a ticket.
>
> 2020-07-23 16:07 GMT+03:00, Ivan Daschinsky <ivanda...@gmail.com>:
> >>
> >> About "jemalloc" - it's also an option, but it also requires
> >> reconfiguring suites on TC, maybe in a more complicated way. It
> >> requires additional installation, right?
> >> Can we stick to the solution that I already tested or should we
> >> update the TC agents? :)
> >
> > Yes, if you want to use jemalloc, you should install it and configure
> > a specific env variable.
> > This is just an option to consider, nothing more. I suppose that your
> > approach is maybe the best variant right now.
> >
> > Thu, Jul 23, 2020 at 15:28, Ivan Bessonov <bessonov...@gmail.com>:
> >
> >> > glibc allocator uses arenas to minimize contention between threads
> >>
> >> I understand it the same way. I did my testing by running the
> >> Indexing suite locally and periodically executing "pmap <pid>"; it
> >> showed that the number of 64 MB arenas grows constantly and never
> >> shrinks. By the middle of the suite the amount of virtual memory was
> >> close to 50 GB and the used physical memory was at least 6-7 GB, if I
> >> recall correctly. I have only 8 cores BTW, so it should be worse on
> >> TC. It means that there is enough contention somewhere in the tests.
> >>
> >> About "jemalloc" - it's also an option, but it also requires
> >> reconfiguring suites on TC, maybe in a more complicated way. It
> >> requires additional installation, right?
> >> Can we stick to the solution that I already tested or should we
> >> update the TC agents? :)
> >>
> >> Thu, Jul 23, 2020 at 15:02, Ivan Daschinsky <ivanda...@gmail.com>:
> >>
> >> > AFAIK, the glibc allocator uses arenas to minimize contention
> >> > between threads when they try to access or free a preallocated bit
> >> > of memory. But it seems that we use -XX:+AlwaysPreTouch, so the heap
> >> > is allocated and committed at start time. We allocate memory for
> >> > durable memory in one thread.
> >> > So I think there will not be much contention between threads for
> >> > native memory pools.
> >> >
> >> > Also, there is another approach -- try to use jemalloc.
> >> > This allocator shows better results than the default glibc malloc
> >> > in our scenarios (memory consumption) [1]
> >> >
> >> > [1] --
> >> > http://ithare.com/testing-memory-allocators-ptmalloc2-tcmalloc-hoard-jemalloc-while-trying-to-simulate-real-world-loads/
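A side note on the "pmap <pid>" observation quoted above: the arenas are easy
to spot because glibc reserves each of them as a ~64 MB anonymous mapping. A
rough throwaway counter along these lines is enough to watch the number grow.
It is purely illustrative (not in the codebase), assumes Linux, and the size
strings depend on how glibc splits each reservation.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.lang.management.ManagementFactory;

    /** Illustrative only: counts ~64 MB anonymous regions of the current JVM via pmap. */
    public class ArenaCounter {
        public static void main(String[] args) throws Exception {
            // HotSpot's runtime name is "pid@host", so this yields the JVM's own pid.
            String pid = ManagementFactory.getRuntimeMXBean().getName().split("@")[0];

            Process pmap = new ProcessBuilder("pmap", pid).start();

            int arenas = 0;

            try (BufferedReader r = new BufferedReader(new InputStreamReader(pmap.getInputStream()))) {
                for (String line; (line = r.readLine()) != null; ) {
                    // A glibc arena usually shows up as a single 65536K region, or as a
                    // large 65404K reserved piece next to a small committed one.
                    if (line.contains("65536K") || line.contains("65404K"))
                        arenas++;
                }
            }

            System.out.println("~64 MB regions: " + arenas);
        }
    }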
> >> > Thu, Jul 23, 2020 at 14:19, Ivan Bessonov <bessonov...@gmail.com>:
> >> >
> >> > > Hello Ivan,
> >> > >
> >> > > It feels like the problem is more about newly started threads rather
> >> > > than the allocation of offheap regions. Plus, I'd like to see results
> >> > > soon, and your proposal is a major change for Ignite that can't be
> >> > > implemented fast enough.
> >> > >
> >> > > Anyway, I think this makes sense, considering that one day Unsafe
> >> > > will be removed. But I wouldn't think about it right now, maybe as
> >> > > a separate proposal...
> >> > >
> >> > > Thu, Jul 23, 2020 at 13:40, Ivan Daschinsky <ivanda...@gmail.com>:
> >> > >
> >> > > > Ivan, I think that we should use mmap/munmap to allocate huge
> >> > > > chunks of memory.
> >> > > >
> >> > > > I've experimented with JNA, invoking mmap/munmap through it, and
> >> > > > it works fine.
> >> > > > Maybe we can create a module (similar to direct-io) that uses
> >> > > > mmap/munmap on platforms that support them and falls back to
> >> > > > Unsafe if not?
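For readers who haven't used JNA, here is a minimal sketch of what the
mmap/munmap experiment mentioned above could look like. The class and
interface names are made up for illustration, the constants are Linux x86-64
values, and JNA 5's Native.load is assumed; this is not the proposed module.

    import com.sun.jna.Library;
    import com.sun.jna.Native;
    import com.sun.jna.Pointer;

    /** Illustrative only: allocating and releasing a large chunk via mmap/munmap through JNA. */
    public class MmapSketch {
        /** The subset of libc we need. */
        public interface CLib extends Library {
            CLib INSTANCE = Native.load("c", CLib.class);

            Pointer mmap(Pointer addr, long length, int prot, int flags, int fd, long offset);

            int munmap(Pointer addr, long length);
        }

        // Linux x86-64 constants.
        private static final int PROT_READ = 0x1;
        private static final int PROT_WRITE = 0x2;
        private static final int MAP_PRIVATE = 0x02;
        private static final int MAP_ANONYMOUS = 0x20;

        public static void main(String[] args) {
            long size = 1L << 30; // A 1 GB region, e.g. a durable memory segment.

            Pointer region = CLib.INSTANCE.mmap(Pointer.NULL, size,
                PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

            if (Pointer.nativeValue(region) == -1L)
                throw new IllegalStateException("mmap failed");

            region.setLong(0, 42); // Touch the region to make sure it is usable.

            // Unlike free(), munmap() returns the pages to the OS immediately,
            // so nothing lingers in allocator arenas after the test finishes.
            if (CLib.INSTANCE.munmap(region, size) != 0)
                throw new IllegalStateException("munmap failed");
        }
    }

The practical difference from malloc/free is in the last comment: munmapped
pages go back to the OS right away instead of accumulating in per-thread
arenas, which is exactly the behavior the thread is after.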
> >> > > > Thu, Jul 23, 2020 at 13:31, Ivan Bessonov <bessonov...@gmail.com>:
> >> > > >
> >> > > > > Hello Igniters,
> >> > > > >
> >> > > > > I'd like to discuss the current issue with "out of memory"
> >> > > > > failures on TeamCity, particularly suites [1] and [2]; they have
> >> > > > > quite a lot of "Exit code 137" failures.
> >> > > > >
> >> > > > > I investigated the "PDS (Indexing)" suite under [3]. There's
> >> > > > > another similar issue as well: [4]. I came to the conclusion that
> >> > > > > the main problem is inside the default memory allocator (malloc).
> >> > > > > Let me explain the way I see it right now:
> >> > > > >
> >> > > > > "malloc" is allowed to allocate (for internal usage) up to
> >> > > > > 8 * (number of cores) blocks called arenas, 64 MB each. This may
> >> > > > > happen when a program creates/stops threads frequently and
> >> > > > > allocates a lot of memory all the time, which is exactly what our
> >> > > > > tests do. Given that TC agents have 32 cores, 8 * 32 * 64 MB
> >> > > > > gives 16 gigabytes, which is about the whole amount of RAM on a
> >> > > > > single agent.
> >> > > > >
> >> > > > > The total amount of arenas can be manually lowered by setting the
> >> > > > > MALLOC_ARENA_MAX environment variable to 4 (or another small
> >> > > > > value). I tried it locally and in the PDS (Indexing) suite
> >> > > > > settings on TC, and the results look very promising: [5]
> >> > > > >
> >> > > > > It is said that changing this variable may lead to some
> >> > > > > performance degradation, but it's hard to tell whether we have it
> >> > > > > or not, because the suite usually failed before it was completed.
> >> > > > >
> >> > > > > So, I have two questions right now:
> >> > > > >
> >> > > > > - can those of you who are into hardcore Linux and C confirm that
> >> > > > > the solution can help us? Experiments show that it completely
> >> > > > > solves the problem.
> >> > > > > - can you please point me to a person who usually does TC
> >> > > > > maintenance? I'm not entirely sure that I can propagate this
> >> > > > > environment variable to all suites by myself, which is necessary
> >> > > > > to avoid occasional error 137 (resulting from the same problem) in
> >> > > > > the future. I just don't know all the details about the suites'
> >> > > > > structure.
> >> > > > >
> >> > > > > Thank you!
> >> > > > >
> >> > > > > [1] https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing&tab=buildTypeHistoryList&state=failed&branch_IgniteTests24Java8=%3Cdefault%3E
> >> > > > > [2] https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_Pds4&tab=buildTypeHistoryList&branch_IgniteTests24Java8=%3Cdefault%3E&state=failed
> >> > > > > [3] https://issues.apache.org/jira/browse/IGNITE-13266
> >> > > > > [4] https://issues.apache.org/jira/browse/IGNITE-13263
> >> > > > > [5] https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing&tab=buildTypeHistoryList&branch_IgniteTests24Java8=pull%2F8051%2Fhead
> >> > > > >
> >> > > > > --
> >> > > > > Sincerely yours,
> >> > > > > Ivan Bessonov
> >> > > >
> >> > > > --
> >> > > > Sincerely yours, Ivan Daschinskiy
> >> > >
> >> > > --
> >> > > Sincerely yours,
> >> > > Ivan Bessonov
> >> >
> >> > --
> >> > Sincerely yours, Ivan Daschinskiy
> >>
> >> --
> >> Sincerely yours,
> >> Ivan Bessonov
> >
> > --
> > Sincerely yours, Ivan Daschinskiy
>
> --
> Best regards,
> Ivan Pavlukhin

--
Sincerely yours,
Ivan Bessonov