imay opened a new issue #1776: create chunk allocator for memory pool
URL: https://github.com/apache/incubator-doris/issues/1776

## Motivation

In high-concurrency testing, many threads wait on memory allocation and release, and a large part of that waiting comes from the allocation and release of Chunks in MemPool. One reason is that MemPool is used everywhere in the code. Another is that these Chunks are relatively large, from 4K to 512K. Allocations of this size easily make TCMalloc exceed the free memory reserved for each thread, so it has to request memory from the central memory.

Therefore, I implemented a demo ChunkAllocator that keeps released Chunks, avoiding frequent allocation from and release to TCMalloc. Testing the same high-concurrency case with this demo, the throughput more than doubled, from 280 QPS to 650 QPS.

Based on this, I want to implement a ChunkAllocator to reduce the Chunk allocation and release operations that reach the system allocator, thus improving system performance.

## Design

How to manage free Chunks?

The size of a Chunk is a power of two, so we can maintain a separate free Chunk list for each size. When a Chunk is no longer used, it is placed in the free list of the corresponding size. When allocating a new Chunk, we first try to find one in the free list of the corresponding size. If none is found, we allocate a new Chunk from the system allocator.

To keep lock contention in the Chunk Allocator from hurting system performance, we need to reduce the contention domain. The idea is to maintain a Chunk Arena for each CPU core. When allocating, try to allocate memory from the Arena of the current core.

For memory limits, there are two options. One is to limit the total amount of memory that can be allocated; the other is to limit the maximum amount of free memory that is reserved.
To stay compatible with the current system behavior, I intend to limit only the total amount of reserved free memory. With this choice, allocation only fails when system memory is completely exhausted, which is consistent with the current behavior. A larger reserved free memory limit gives a better cache hit rate, but it also holds more free memory, which makes it harder for other modules to allocate memory.

What system allocator is used? malloc vs mmap?

Currently, malloc is used. If we switch to mmap without changing the system parameter `vm.max_map_count`, memory allocation may fail even when memory is available. We can implement both system allocators and add a configuration option to choose between them, with malloc as the default.

Future work: all large memory allocations in the system could go through the Chunk Allocator, so that the Chunk Allocator can switch from a limit on reserved memory to a limit on allocated memory.

## Structure

```
struct Chunk {
    uint8_t* data;
    size_t size;
    // core id from which this chunk was allocated
    int core_id;
};

// Keep free chunks for each CPU core
class ChunkArena {
public:
    // Pop a free chunk from the corresponding free list.
    // Return true on success, with a valid chunk saved in "chunk".
    bool pop_free_chunk(size_t size, Chunk* chunk);

    // Push a free chunk into this arena for later use.
    void push_free_chunk(const Chunk& chunk);
};

class ChunkAllocator {
public:
    // Allocate memory of the given size; size must be a power of two.
    // Return Status::OK() on success, with the allocated chunk info saved in "chunk".
    Status allocate(size_t size, Chunk* chunk);

    void free(const Chunk& chunk);
};
```

Allocate process:

1. Get the current core_id.
2. Try to get a free Chunk from the corresponding Arena. If successful, return that Chunk.
3. Try to get a free Chunk from the Arenas corresponding to other cores. If successful, return that Chunk.
4. Allocate a Chunk from the system allocator.

Release process:

1. Determine whether there is enough cache capacity, and if so, place the Chunk in the free list of the corresponding Arena.
2. Otherwise, call the system release function to release the memory.
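
The allocate and release processes above could be sketched roughly as follows. This is only an illustration under several assumptions, not the proposed implementation: `allocate` takes `core_id` as an explicit parameter and returns `bool` instead of `Status` so the sketch stays self-contained, the `size_class` helper, the `reserved_bytes` accessor, and the `reserve_limit` constructor argument are hypothetical names, a chunk taken from another core's Arena is re-tagged with the requesting core, and malloc is used as the system allocator.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <mutex>
#include <vector>

struct Chunk {
    uint8_t* data = nullptr;
    size_t size = 0;
    int core_id = -1;  // arena this chunk goes back to when released
};

// Hypothetical helper: free-list index for a power-of-two size (log2 of size).
inline int size_class(size_t size) {
    assert(size != 0 && (size & (size - 1)) == 0);
    int idx = 0;
    while (size > 1) {
        size >>= 1;
        ++idx;
    }
    return idx;
}

class ChunkArena {
public:
    ChunkArena() : _free_lists(64) {}
    ~ChunkArena() {
        // Give every cached chunk back to the system allocator.
        for (auto& list : _free_lists)
            for (uint8_t* p : list) std::free(p);
    }
    bool pop_free_chunk(size_t size, Chunk* chunk) {
        std::lock_guard<std::mutex> guard(_lock);
        auto& list = _free_lists[size_class(size)];
        if (list.empty()) return false;
        chunk->data = list.back();
        chunk->size = size;
        list.pop_back();
        return true;
    }
    void push_free_chunk(const Chunk& chunk) {
        std::lock_guard<std::mutex> guard(_lock);
        _free_lists[size_class(chunk.size)].push_back(chunk.data);
    }

private:
    std::mutex _lock;  // one lock per arena keeps contention per core
    std::vector<std::vector<uint8_t*>> _free_lists;  // one list per size
};

class ChunkAllocator {
public:
    ChunkAllocator(int num_cores, size_t reserve_limit)
        : _arenas(num_cores), _reserve_limit(reserve_limit), _reserved_bytes(0) {}

    // Assumption: the caller passes the current core id (step 1).
    bool allocate(int core_id, size_t size, Chunk* chunk) {
        const int n = static_cast<int>(_arenas.size());
        // Steps 2 and 3: own arena first (i == 0), then the other cores.
        for (int i = 0; i < n; ++i) {
            if (_arenas[(core_id + i) % n].pop_free_chunk(size, chunk)) {
                _reserved_bytes -= size;
                chunk->core_id = core_id;
                return true;
            }
        }
        // Step 4: fall back to the system allocator.
        chunk->data = static_cast<uint8_t*>(std::malloc(size));
        if (chunk->data == nullptr) return false;
        chunk->size = size;
        chunk->core_id = core_id;
        return true;
    }

    void free(const Chunk& chunk) {
        // Release step 1: cache the chunk only if it fits under the limit.
        size_t old = _reserved_bytes.load();
        do {
            if (old + chunk.size > _reserve_limit) {
                std::free(chunk.data);  // Release step 2: back to the system.
                return;
            }
        } while (!_reserved_bytes.compare_exchange_weak(old, old + chunk.size));
        _arenas[chunk.core_id].push_free_chunk(chunk);
    }

    size_t reserved_bytes() const { return _reserved_bytes.load(); }

private:
    std::vector<ChunkArena> _arenas;
    size_t _reserve_limit;
    std::atomic<size_t> _reserved_bytes;
};
```

In this sketch only the reserved (cached) bytes are counted, which matches the choice above to limit reserved memory rather than total allocated memory. On Linux, the core id could come from something like `sched_getcpu()`; it is passed in here to keep the example portable.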