Hi GaoXin:

I think you are talking about two things, although they are related. One is the "Resource Queue", which is used to define different resource groups for different workloads. The other is "Submit SQL Job", which is used to submit arbitrary SQL (select, insert, etc.) asynchronously. A "SQL job" can then be submitted to a "Resource Queue".
I am more interested in "Submit SQL Job". I would like to make Doris "All-in-SQL", meaning that every kind of job can be described by a SQL statement. For example, a "Broker Load" statement could become "submit sql job(insert into dest_table select * from hdfs(file1));", and an "Export" job could become "submit sql job(insert into s3(bucket1) select * from source_table);".

Looking forward to this feature. Could you please write a DSIP for it? You can provide your wiki account, and I can open the write permission for you.

--
Best Regards
陈明雨 Mingyu Chen
Email: morning...@apache.org

At 2022-11-23 20:54:41, "高鑫" <helloxi...@foxmail.com> wrote:
>Hi,
>
>Doris users report that when the database load reaches a certain level, queries compete for CPU and memory resources, resulting in low overall query performance. We therefore need a resource queue to limit the concurrency of large queries and ensure stable performance for small queries. Support for resource queues is motivated by the following points:
>
>Large queries or jobs preempt cluster resources, so small queries cannot be completed quickly;
>
>The submission of large queries cannot be limited. Many parallel large queries lead to problems such as BE OOM or full preemption of cluster resources.
>
>Resource Queue: users can specify the number of concurrent queries that the database can run and the number of queries that may be queued, according to their own business needs.
>This ensures that the expected system resources are available when executing a query, so that the expected query performance is obtained.
>
>Related Work
>
>Creating resource queues for queries is common in various database products.
>Aliyun AnalyticDB lets users create a resource queue to specify the number of concurrent queries that the database can run, the memory size that each query can use, and the CPU resources that can be used;
>
>When using the cloud data warehouse PostgreSQL, a single complex query may consume too many resources and affect other users' queries or calculations. When it is necessary to limit the system resources consumed by a single user or query statement, Tencent Cloud uses resource queues to enforce the limit.
>
>Create Resource Queue
>
>The resource queue stores two types of information:
>
>Queue configuration: describes the resource limits available for this queue, such as concurrency, CPU, memory, scan rows, etc.
>
>Matching policy: after a query (such as select/insert) is submitted, it is matched to a queue according to the job information. Matching rules can be based on username, IP, database name and table name.
>
>SQL syntax:
>
>// create resource queue
>CREATE RESOURCE QUEUE queue_name
>[WITH RESOURCE (
>    "max_concurrency" = "1",   // Limit the number of queries running simultaneously in the queue
>    "max_queue_size" = "10"    // Limit the number of queries queued in the queue
>)]
>[WITH MATCHING POLICY (
>    "user" = "rd_group*",      // Match the prefix of user name
>    "ip" = "192.10.1.*"        // Match the prefix of IP
>)];
>
>// drop resource queue
>DROP RESOURCE QUEUE queue_name;
>
>// show resource queues
>SHOW RESOURCE QUEUES;
>
>// show specified queue: queueId, type, pendingNum, runningNum, queueConfig, matchingPolicy
>SHOW RESOURCE QUEUE queue_name;
>
>Submit Asynchronous Job:
>
>SUBMIT SQL JOB [WITH LABEL label_name](
>    sql_stmt
>)
>[PROPERTIES(
>    "wait_timeout_ms" = "-1",   // Max time of waiting in the queue, -1 means waiting all the time
>    "query_timeout_ms" = "-1"   // Max time of running, -1 means consistent with the system
>)]
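
To make the intended queue semantics concrete, here is a rough Python sketch of how "max_concurrency", "max_queue_size" and the prefix matching policy might interact. It is only an illustration, not Doris code: the class `ResourceQueue`, the function `route`, and the state names are hypothetical. It also assumes two things the proposal does not state: that a job is rejected once the queue is full, and that the first matching queue wins. Timeouts ("wait_timeout_ms", "query_timeout_ms") are left out.

```python
from collections import deque
from fnmatch import fnmatchcase


class ResourceQueue:
    """Hypothetical in-memory model of the proposed resource queue."""

    def __init__(self, name, max_concurrency, max_queue_size, user="*", ip="*"):
        self.name = name
        self.max_concurrency = max_concurrency  # jobs allowed to run at once
        self.max_queue_size = max_queue_size    # jobs allowed to wait
        self.user = user                        # pattern such as "rd_group*"
        self.ip = ip                            # pattern such as "192.10.1.*"
        self.running = set()
        self.pending = deque()

    def matches(self, user, ip):
        # Shell-style wildcard match, so "rd_group*" acts as a prefix match.
        return fnmatchcase(user, self.user) and fnmatchcase(ip, self.ip)

    def submit(self, job_id):
        # Run immediately if a slot is free, otherwise wait in the queue.
        if len(self.running) < self.max_concurrency:
            self.running.add(job_id)
            return "RUNNING"
        if len(self.pending) < self.max_queue_size:
            self.pending.append(job_id)
            return "PENDING"
        # Assumption: a full queue rejects the job; the proposal is silent here.
        return "REJECTED"

    def finish(self, job_id):
        # Release the slot and promote the oldest pending job, if any.
        self.running.discard(job_id)
        if self.pending and len(self.running) < self.max_concurrency:
            next_job = self.pending.popleft()
            self.running.add(next_job)
            return next_job
        return None


def route(queues, user, ip):
    # Assumption: the first queue whose matching policy fits is chosen.
    for queue in queues:
        if queue.matches(user, ip):
            return queue
    return None
```

With a queue created as `ResourceQueue("q1", 1, 10, user="rd_group*", ip="192.10.1.*")`, a job from user `rd_group_a` at `192.10.1.7` would be routed to `q1`; the first such job runs and later ones wait until `finish` frees the slot.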