> About "jemalloc" - it's also an option, but it also requires reconfiguring
> suites on TC, maybe in a more complicated way. It requires additional
> installation, right?
> Can we stick to the solution that I already tested or should we update TC
> agents? :)

Yes, if you want to use jemalloc, you have to install it and set a specific
environment variable. This is just an option to consider, nothing more.
I think your approach may be the best option right now.

Thu, 23 Jul 2020 at 15:28, Ivan Bessonov <bessonov...@gmail.com>:

> > glibc allocator uses arenas to minimize contention between threads
>
> I understand it the same way. I tested this by running the Indexing suite
> locally and periodically executing "pmap <pid>". It showed that the number
> of 64 MB arenas grows constantly and never shrinks. By the middle of the
> suite the amount of virtual memory was close to 50 GB and used physical
> memory was at least 6-7 GB, if I recall correctly. I have only 8 cores BTW,
> so it should be worse on TC.
> It means that there is enough contention somewhere in the tests.
>
> About "jemalloc" - it's also an option, but it also requires reconfiguring
> suites on TC, maybe in a more complicated way. It requires additional
> installation, right?
> Can we stick to the solution that I already tested or should we update TC
> agents? :)
>
> Thu, 23 Jul 2020 at 15:02, Ivan Daschinsky <ivanda...@gmail.com>:
>
> > AFAIK, the glibc allocator uses arenas to minimize contention between
> > threads when they try to access or free a preallocated bit of memory.
> > But it seems that we use -XX:+AlwaysPreTouch, so the heap is allocated
> > and committed at start time. We allocate memory for durable memory in
> > one thread. So I think there will not be much contention between threads
> > for native memory pools.
> >
> > Also, there is another approach -- try to use jemalloc.
> > This allocator shows better results (memory consumption) than the
> > default glibc malloc in our scenarios [1].
> >
> > [1] http://ithare.com/testing-memory-allocators-ptmalloc2-tcmalloc-hoard-jemalloc-while-trying-to-simulate-real-world-loads/
> >
> > Thu, 23 Jul 2020 at 14:19, Ivan Bessonov <bessonov...@gmail.com>:
> >
> > > Hello Ivan,
> > >
> > > It feels like the problem is more about newly started threads than
> > > about the allocation of offheap regions. Plus I'd like to see results
> > > soon, and your proposal is a major change for Ignite that can't be
> > > implemented fast enough.
> > >
> > > Anyway, I think this makes sense, considering that one day Unsafe will
> > > be removed. But I wouldn't think about it right now, maybe as a
> > > separate proposal...
> > >
> > > Thu, 23 Jul 2020 at 13:40, Ivan Daschinsky <ivanda...@gmail.com>:
> > >
> > > > Ivan, I think that we should use mmap/munmap to allocate huge chunks
> > > > of memory.
> > > >
> > > > I've experimented with JNA, invoking mmap/munmap through it, and it
> > > > works fine.
> > > > Maybe we can create a module (similar to direct-io) that uses
> > > > mmap/munmap on platforms that support them and falls back to Unsafe
> > > > otherwise?
> > > >
> > > > Thu, 23 Jul 2020 at 13:31, Ivan Bessonov <bessonov...@gmail.com>:
> > > >
> > > > > Hello Igniters,
> > > > >
> > > > > I'd like to discuss the current issue with "out of memory" failures
> > > > > on TeamCity. In particular, suites [1] and [2] have quite a lot of
> > > > > "Exit code 137" failures.
> > > > >
> > > > > I investigated the "PDS (Indexing)" suite under [3]. There's
> > > > > another similar issue as well: [4].
> > > > > I came to the conclusion that the main problem is inside the
> > > > > default memory allocator (malloc).
> > > > > Let me explain the way I see it right now:
> > > > >
> > > > > "malloc" is allowed to allocate (for internal use) up to
> > > > > 8 * (number of cores) blocks called ARENA, 64 MB each. This may
> > > > > happen when a program creates/stops threads frequently and
> > > > > allocates a lot of memory all the time, which is exactly what our
> > > > > tests do.
> > > > > Given that TC agents have 32 cores, 8 * 32 * 64 MB gives 16 GB,
> > > > > which is about the whole amount of RAM on a single agent.
> > > > >
> > > > > The total number of arenas can be manually lowered by setting the
> > > > > MALLOC_ARENA_MAX environment variable to 4 (or another small
> > > > > value). I tried it locally and in the PDS (Indexing) suite settings
> > > > > on TC, and the results look very promising: [5]
> > > > >
> > > > > It is said that changing this variable may lead to some performance
> > > > > degradation, but it's hard to tell whether we have it or not,
> > > > > because the suite usually failed before it was completed.
> > > > >
> > > > > So, I have two questions right now:
> > > > >
> > > > > - can those of you who are into hardcore Linux and C confirm that
> > > > >   the solution can help us? Experiments show that it completely
> > > > >   solves the problem.
> > > > > - can you please point me to a person who usually does TC
> > > > >   maintenance? I'm not entirely sure that I can propagate this
> > > > >   environment variable to all suites by myself, which is necessary
> > > > >   to avoid occasional error 137 (resulting from the same problem)
> > > > >   in the future. I just don't know all the details about the suite
> > > > >   structure.
> > > > >
> > > > > Thank you!
> > > > >
> > > > > [1] https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing&tab=buildTypeHistoryList&state=failed&branch_IgniteTests24Java8=%3Cdefault%3E
> > > > > [2] https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_Pds4&tab=buildTypeHistoryList&branch_IgniteTests24Java8=%3Cdefault%3E&state=failed
> > > > > [3] https://issues.apache.org/jira/browse/IGNITE-13266
> > > > > [4] https://issues.apache.org/jira/browse/IGNITE-13263
> > > > > [5] https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing&tab=buildTypeHistoryList&branch_IgniteTests24Java8=pull%2F8051%2Fhead
> > > > >
> > > > > --
> > > > > Sincerely yours,
> > > > > Ivan Bessonov
> > > >
> > > > --
> > > > Sincerely yours, Ivan Daschinskiy
> > >
> > > --
> > > Sincerely yours,
> > > Ivan Bessonov
> >
> > --
> > Sincerely yours, Ivan Daschinskiy
>
> --
> Sincerely yours,
> Ivan Bessonov

--
Sincerely yours, Ivan Daschinskiy
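
For reference, the arena arithmetic from the first message of this thread can be written out as a small sketch. It only restates the figures quoted there (up to 8 arenas per core, 64 MB each, MALLOC_ARENA_MAX as the cap); the function names arena_ceiling_mb and env_with_arena_cap are hypothetical helpers invented for illustration, and the result is an upper bound on what the arenas may reserve, not a measurement.

```python
import os

ARENA_SIZE_MB = 64    # per-arena size, as quoted in the thread
ARENAS_PER_CORE = 8   # default arena multiplier, as quoted in the thread


def arena_ceiling_mb(cores, arena_max=None):
    """Upper bound (in MB) on memory that malloc arenas may reserve.

    Without a cap the arena count is ARENAS_PER_CORE * cores;
    with a cap (MALLOC_ARENA_MAX) it is the cap itself.
    """
    arenas = ARENAS_PER_CORE * cores if arena_max is None else arena_max
    return arenas * ARENA_SIZE_MB


def env_with_arena_cap(arena_max, base_env=None):
    """Return a copy of the environment with MALLOC_ARENA_MAX set.

    The copy can be passed as env= when launching a child process,
    so the current process environment is left untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MALLOC_ARENA_MAX"] = str(arena_max)
    return env


if __name__ == "__main__":
    # 32-core TC agent, default settings: 8 * 32 * 64 MB
    print(arena_ceiling_mb(32))
    # Same agent with MALLOC_ARENA_MAX=4: 4 * 64 MB
    print(arena_ceiling_mb(32, arena_max=4))
```

With the thread's numbers, the uncapped bound for a 32-core agent is 16384 MB (the 16 GB mentioned above), while a cap of 4 arenas brings it down to 256 MB. The environment helper is one possible way to try the setting on a locally launched test process before touching TC suite settings.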