morningman commented on issue #4656: URL: https://github.com/apache/incubator-doris/issues/4656#issuecomment-725778532
There are two main problems here.

1. The plan process is inefficient.

   In the plan phase, when multiple UNION combinations are encountered, multiple rounds of iteration increase the plan time. In my test, it can take 2-3 seconds to analyze and plan a SQL statement with 30 UNIONs.

2. Too many fragments.

   A single SQL statement may generate hundreds or thousands of fragments, and each fragment is currently sent to its corresponding BE separately. The fields common to these fragments are serialized and deserialized repeatedly, which results in a very high RPC processing time. In my test, a SQL statement with 30 UNIONs generated 300+ fragments with a total size of over 100MB, mainly occupied by `DescriptorTable`.

Optimization strategy:

1. For problem 1, we need to find the hot spots in the corresponding code with a Java profiler and then optimize the analysis process. (TODO)
2. For problem 2, multiple fragments sent to the same BE can be merged and sent in a batch, reducing both the number of RPCs and the amount of data sent. (WIP)
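To illustrate the batching idea for problem 2, here is a minimal sketch in Python. It is not the Doris implementation: `Fragment`, `BatchRequest`, `batch_by_backend`, and the `be-N` addresses are hypothetical names made up for this example. It only shows the general shape of the approach, which is to group fragments by destination BE and attach the shared descriptor table once per batch instead of once per fragment.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Fragment:
    """Hypothetical stand-in for one plan fragment."""
    fragment_id: int
    backend: str  # hypothetical BE address this fragment is sent to
    plan: str     # stand-in for the fragment-specific payload


@dataclass
class BatchRequest:
    """Hypothetical batched RPC payload for a single BE."""
    backend: str
    descriptor_table: str  # shared fields, carried once per batch
    plans: list = field(default_factory=list)


def batch_by_backend(fragments, descriptor_table):
    """Group fragments by destination BE so each BE gets one request."""
    batches = {}
    for frag in fragments:
        batch = batches.get(frag.backend)
        if batch is None:
            # First fragment for this BE: create its batch and attach
            # the shared descriptor table exactly once.
            batch = BatchRequest(frag.backend, descriptor_table)
            batches[frag.backend] = batch
        batch.plans.append((frag.fragment_id, frag.plan))
    return list(batches.values())


# Assumed workload: 300 fragments spread evenly over 3 BEs.
fragments = [Fragment(i, f"be-{i % 3}", f"plan-{i}") for i in range(300)]
batches = batch_by_backend(fragments, descriptor_table="<descriptor table>")

rpcs_before = len(fragments)  # one RPC per fragment
rpcs_after = len(batches)     # one RPC per BE
print(f"RPCs: {rpcs_before} -> {rpcs_after}")
```

Under these assumptions the shared table is serialized once per BE rather than once per fragment, so both the RPC count and the bytes on the wire shrink roughly in proportion to the number of fragments per BE.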