HappenLee opened a new issue #3926:
URL: https://github.com/apache/incubator-doris/issues/3926
## Motivation
At present, Doris queries often hit the ```mem_limit``` bottleneck, which
prevents many queries from completing. Raising the ```mem_limit``` of a query
can help, but in scenarios where memory itself is the bottleneck this is
futile. Disk capacity is usually about 100 times that of memory, so spilling
the data that exceeds the memory limit to disk would solve the problem almost
perfectly. The trade-off is that disk is much slower than memory, so queries
that spill will take longer to execute.
### It can bring us the following benefits:
1. In memory-constrained scenarios, more memory becomes available at the
expense of query execution time. This is necessary in some scenarios.
2. Doris can handle larger queries without being constrained by memory.
## Implementation
1. ```BufferedBlockMgr2``` and ```DiskIOMgr``` already support spilling
in-memory data to disk. We need to use them to write data to a temporary work
area on disk. The default location of this work area is ```doris-scratch```.
When an operation completes, the data is removed from the disk.
2. There are 3 versions of ```BufferedTupleStream```, which is confusing.
We need to unify the abstraction of this important component so that it can
properly support spilling to disk.
3. Implement spilling to disk for the following execution nodes, one after
another:
* Sort
* Aggregation
* Analytic function
* Join
4. Remove redundant code, such as ```BufferTupleStream```, ```HashTable```
and so on.
5. Optimizations for spilling to disk:
* Size limit for temporary files
* IO rate limit for spilling to disk
* Using the IO capability of SSDs
* Compression and decompression of spilled data
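
To make the idea in items 1 and 5 concrete, here is a minimal Python sketch of a buffer that keeps rows in memory up to a limit, spills compressed batches to a scratch directory, enforces a scratch size limit, and removes its files when the operation completes. It is only an illustration and not how ```BufferedBlockMgr2``` or ```DiskIOMgr``` are implemented: the class ```SpillableRowBuffer```, its parameters, and the use of ```pickle``` and ```zlib``` are all assumptions made for the example. IO rate limiting is not shown.

```python
import os
import pickle
import tempfile
import zlib


class SpillLimitExceeded(Exception):
    """Raised when spilled data would exceed the scratch size limit."""


class SpillableRowBuffer:
    """Hypothetical buffer: keeps rows in memory up to mem_limit bytes,
    then writes batches to files in a scratch directory."""

    def __init__(self, scratch_dir, mem_limit, scratch_limit, compress=True):
        self.scratch_dir = scratch_dir
        self.mem_limit = mem_limit          # bytes of serialized rows kept in memory
        self.scratch_limit = scratch_limit  # max bytes written to the scratch dir
        self.compress = compress
        self.rows = []                      # rows currently held in memory
        self.mem_used = 0
        self.spill_files = []               # spilled batch files, in write order
        self.scratch_used = 0
        os.makedirs(scratch_dir, exist_ok=True)

    def add(self, row):
        size = len(pickle.dumps(row))
        # Spill the current batch before it would exceed the memory limit.
        if self.rows and self.mem_used + size > self.mem_limit:
            self._spill()
        self.rows.append(row)
        self.mem_used += size

    def _spill(self):
        payload = pickle.dumps(self.rows)
        if self.compress:
            payload = zlib.compress(payload)
        if self.scratch_used + len(payload) > self.scratch_limit:
            raise SpillLimitExceeded("scratch size limit reached")
        fd, path = tempfile.mkstemp(dir=self.scratch_dir, suffix=".spill")
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        self.spill_files.append(path)
        self.scratch_used += len(payload)
        self.rows = []
        self.mem_used = 0

    def __iter__(self):
        # Read spilled batches back in write order, then the in-memory tail.
        for path in self.spill_files:
            with open(path, "rb") as f:
                payload = f.read()
            if self.compress:
                payload = zlib.decompress(payload)
            yield from pickle.loads(payload)
        yield from self.rows

    def close(self):
        # Remove spilled data once the operation completes.
        for path in self.spill_files:
            os.remove(path)
        self.spill_files = []
        self.scratch_used = 0
        self.rows = []
        self.mem_used = 0


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        scratch = os.path.join(tmp, "doris-scratch")
        buf = SpillableRowBuffer(scratch, mem_limit=256, scratch_limit=1 << 20)
        for i in range(100):
            buf.add((i, "value-%d" % i))
        spilled = len(buf.spill_files)
        rows = list(buf)
        buf.close()
        leftover = os.listdir(scratch)
        return spilled, rows, leftover
```

In ```demo()``` the small ```mem_limit``` forces several spills, the rows read back in insertion order, and the scratch directory is empty after ```close()```. A real execution node (sort, aggregation, join) would additionally need to merge or repartition the spilled batches rather than simply replay them.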