You can add me on WeChat (ID: xiaoqiangnk) to send the detailed information.

On Sat, Apr 12, 2025 at 3:28 PM 吴晓东 <a799767...@126.com> wrote:
> Thank you for your answers.
>
> I apologize that some of my questions were not described clearly and took
> up your time.
>
> Question 2: Thank you for the explanation; it cleared up my confusion.
>
> Question 3: Could you suggest some optimizations for improving performance
> when reading large Hive tables?
>
> I currently have 45 machines (128c, 750G each), and the large Hive table is
> 30T. With the configuration below, a SQL query currently takes 10-20
> minutes (single-table query, no join). Increasing
> parallel_pipeline_task_num further does not seem to bring any more
> improvement. Are there any other approaches? (I tried file_cache, but
> because the data is so large, performance was even worse once block write
> time was added, and disk I/O was overloaded.)
>
> query_timeout = 60000;
> parallel_pipeline_task_num = 150;
> exec_mem_limit = 250G;
> remote_storage_read_buffer_mb=150;
>
> doris_scanner_row_num = 1000000;
> doris_scanner_row_bytes = 419430400
>
> Question 5: Occasionally the BE reports the following error, with no other
> hints. The BE process has not crashed, and the port is reachable.
>
> "ERROR 1105 (HY000) at line 5: errCode = 2, detailMessage =
> (10.xxxx)[CANCELLED]failed to send brpc when exchange, error=Host is
> down, error_text=[E110]Fail to connect Socket{id=688 addr=10.xxxxx:8060}
> (0x0x7fafa37df8c0): Connection timed out [R1][E112]Not connected to
> 10.xxxxxx:8060 yet, server_id=688 [R2][E112]Not connected to 10.xxxxxx:8060
> yet, server_id=688 [R3][E112]Not connected to 10xxxxxx:8060 yet,
> server_id=688 [R4][E112]Not connected to 10xxxxxx:8060 yet, server_id=688
> [R5][E112]Not connected to 10"
>
> Sorry to trouble you again; your guidance solved a real problem for me.
>
> At 2025-04-10 15:20:34, "Yongqiang YANG" <dataroar...@gmail.com> wrote:
> >Q1: When you read tables in Hive, it contributes to
> >REMOTE_SCAN_BYTES_PER_SECOND.
> >Q2: The thread pool size is determined when Doris starts. When it is -1,
> >Doris actually uses std::max(512, CpuInfo::num_cores() * 10);
> >Q3: Please describe your problem more explicitly.
> >Q4:
> >https://doris.apache.org/docs/query-acceleration/performance-tuning-overview/analysis-tools?_highlight=profile#doris-profile
> >Q5: I did not receive your image.
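> >
> >On Thu, Apr 10, 2025 at 11:00 AM 吴晓东 <a799767...@126.com> wrote:
> >
> >> Hello,
> >> I would like to ask a few questions (Doris 3.0, integrated
> >> storage-compute mode).
> >> Question 1: Why does REMOTE_SCAN_BYTES_PER_SECOND still show several
> >> thousand G when no tasks are currently running, while sar shows only 1G
> >> per second on the network card? Why is that, and how is this metric
> >> calculated? (We currently use Doris to connect to Hive; the Hive tables
> >> hold tens of T of data.)
> >> Question 2: For doris_max_remote_scanner_thread_pool_thread_num, the
> >> official website gives a default of 512, but querying it via SQL shows
> >> a default of -1. If the BE has 128 cores, what formula is actually used
> >> to compute it? By default, does it change dynamically with
> >> parallel_pipeline_task_num?
> >>
> >> Question 3: Could you suggest some optimizations for improving
> >> performance when reading large Hive tables?
> >> Question 4: Is there a detailed explanation of the metrics in the
> >> profile file?
> >> Question 5: Occasionally the BE reports the following error, with no
> >> other hints, and the BE process has not crashed.
> >>
> >> I have quite a few questions. We are using Doris in production and are
> >> very interested in it. I hope you can offer some guidance; I would be
> >> very grateful!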
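
For reference, the Q2 rule quoted in this thread, std::max(512, CpuInfo::num_cores() * 10), can be worked through for a 128-core BE. The snippet below is only a minimal sketch of that arithmetic in Python; the function name default_remote_scanner_threads is made up for illustration and is not a Doris API, and it covers only the -1 (default) case described in the reply.

```python
def default_remote_scanner_threads(num_cores: int) -> int:
    """Mirror std::max(512, CpuInfo::num_cores() * 10) from the Q2 reply.

    Hypothetical helper for illustration only; it models just the case where
    doris_max_remote_scanner_thread_pool_thread_num is left at -1.
    """
    return max(512, num_cores * 10)


if __name__ == "__main__":
    # Print the resulting pool size for a few core counts, including 128.
    for cores in (16, 64, 128):
        print(cores, default_remote_scanner_threads(cores))
```

By this rule a 128-core BE would get 1280 remote scanner threads, and the value is fixed at startup rather than following parallel_pipeline_task_num.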