liaoxin01 opened a new pull request, #63561:
URL: https://github.com/apache/doris/pull/63561

   ## Summary
   
   The BE-side `GroupCommitBlockSinkOperatorX::init` does **not** consume 
`TOlapTableSink.location` or `slave_location` (it only reads `tuple_id` / 
`schema` / `db_id` / `table_id` / `partition` / `group_commit_mode` / `load_id` 
/ `max_filter_ratio`). However, FE still runs `createLocation`, which iterates 
`O(partitions * indexes * tablets * replicas)` and, for every replica, takes 
the `CloudSystemInfoService` RW read lock via 
`CloudReplica.getCurrentClusterId`.
   
   Under high-concurrency group commit stream load on wide-partition tables 
(3000+ partitions in a real production incident), CAS contention on the RW 
lock's `state` cache line saturated all FE CPUs. The cluster could not recover 
even after scaling out, because more cores mean more CAS contenders and 
therefore worse contention.
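
To make the cost concrete, here is a minimal sketch of the enumeration shape described above. It is not the Doris code: `LocationCostSketch`, `currentClusterId`, and `enumerate` are hypothetical stand-ins, and a plain `ReentrantReadWriteLock` stands in for the `CloudSystemInfoService` lock. It only counts how many read-lock acquisitions a single request would perform.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the per-replica work done during location creation.
final class LocationCostSketch {
    // Stand-in for the shared RW lock; every request thread would hit the same lock.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Stand-in for a per-replica lookup that is guarded by the read lock.
    private String currentClusterId() {
        lock.readLock().lock();
        try {
            return "cluster-0";
        } finally {
            lock.readLock().unlock();
        }
    }

    // Walks partitions * indexes * tablets * replicas and returns the number
    // of read-lock acquisitions performed for one request.
    long enumerate(int partitions, int indexes, int tablets, int replicas) {
        long acquisitions = 0;
        for (int p = 0; p < partitions; p++) {
            for (int i = 0; i < indexes; i++) {
                for (int t = 0; t < tablets; t++) {
                    for (int r = 0; r < replicas; r++) {
                        currentClusterId();
                        acquisitions++;
                    }
                }
            }
        }
        return acquisitions;
    }
}
```

With assumed values of 3000 partitions, 1 index, 4 tablets per partition, and 3 replicas, `new LocationCostSketch().enumerate(3000, 1, 4, 3)` performs 36000 read-lock acquisitions for one request, and every concurrent request repeats that against the same lock.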
   
   ## Change
   
   - Introduce a `protected initLocationParams(TOlapTableSink)` hook on 
`OlapTableSink`. Default behavior delegates to `createLocation`, so 
non-group-commit sinks are unaffected.
   - Route both `init(...)` overloads in `OlapTableSink` through the hook.
   - `GroupCommitBlockSink` overrides the hook to return empty placeholder 
`TOlapTableLocationParam` objects. `TOlapTableSink.location` is a required 
thrift field, so we still set non-null placeholders, but no tablet/replica 
enumeration happens.
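
The hook can be pictured roughly as follows. This is a simplified sketch, not the actual patch: `BaseSink`, `GroupCommitSink`, and `LocationParam` are hypothetical stand-ins for `OlapTableSink`, `GroupCommitBlockSink`, and `TOlapTableLocationParam`, and the hook here takes no argument and returns a list, whereas the real `initLocationParams` takes a `TOlapTableSink`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for TOlapTableLocationParam.
final class LocationParam {
    final List<Long> tablets = new ArrayList<>();
}

// Hypothetical stand-in for OlapTableSink.
class BaseSink {
    int createLocationCalls = 0;

    // Stand-in for the expensive tablet/replica enumeration.
    List<LocationParam> createLocation() {
        createLocationCalls++;
        LocationParam location = new LocationParam();
        LocationParam slaveLocation = new LocationParam();
        location.tablets.add(10001L);
        slaveLocation.tablets.add(10001L);
        return Arrays.asList(location, slaveLocation);
    }

    // The hook: by default it delegates to createLocation, so existing
    // sinks behave exactly as before.
    protected List<LocationParam> initLocationParams() {
        return createLocation();
    }

    // Every init path goes through the hook instead of calling
    // createLocation directly.
    final List<LocationParam> init() {
        return initLocationParams();
    }
}

// Hypothetical stand-in for GroupCommitBlockSink.
class GroupCommitSink extends BaseSink {
    // Returns non-null but empty placeholders; no enumeration happens.
    @Override
    protected List<LocationParam> initLocationParams() {
        return Arrays.asList(new LocationParam(), new LocationParam());
    }
}
```

In this sketch, `new GroupCommitSink().init()` yields two placeholders with empty tablet lists and never calls `createLocation`, while `new BaseSink().init()` still runs the full enumeration.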
   
   Effect on the group-commit path:
   - Per-request FE CPU: `O(partitions * indexes * tablets * replicas)` → `O(1)`
   - `CloudSystemInfoService` RW lock acquisitions per request: one per replica 
(hundreds of concurrent CAS spinners under load) → 0
   
   ## Test plan
   
   - [x] Added `GroupCommitBlockSinkTest` covering:
     - `initLocationParams` returns 2 placeholders with empty tablet lists 
(verifies the override is what runs, not `createLocation`).
     - `parseGroupCommit` parses `async_mode` / `sync_mode` / `off_mode` 
(case-insensitive) and returns null for unknown values.
   - [ ] Existing regression tests for stream load with `group_commit=true` 
still pass.
   - [ ] Manual high-concurrency stream load run on a wide-partition table to 
confirm FE CPU is no longer dominated by `CloudSystemInfoService` lock 
contention.
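
For reference, the parsing behavior that the first test item checks could look roughly like this. It is a sketch under assumptions: `GroupCommitModeSketch` and `GroupCommitParserSketch` are hypothetical names, and the real `parseGroupCommit` may differ in signature and return type.

```java
import java.util.Locale;

// Hypothetical enum standing in for the group commit mode type.
enum GroupCommitModeSketch { ASYNC_MODE, SYNC_MODE, OFF_MODE }

final class GroupCommitParserSketch {
    // Case-insensitive parse; returns null for null or unknown values.
    static GroupCommitModeSketch parseGroupCommit(String value) {
        if (value == null) {
            return null;
        }
        switch (value.toLowerCase(Locale.ROOT)) {
            case "async_mode":
                return GroupCommitModeSketch.ASYNC_MODE;
            case "sync_mode":
                return GroupCommitModeSketch.SYNC_MODE;
            case "off_mode":
                return GroupCommitModeSketch.OFF_MODE;
            default:
                return null;
        }
    }
}
```

Lower-casing with `Locale.ROOT` keeps the comparison independent of the JVM's default locale.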


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

