kaka11chen opened a new pull request, #63861:
URL: https://github.com/apache/doris/pull/63861
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: The ANN index writer used to reserve
ann_index_build_chunk_size * dimension floats during init. For high-dimensional
vectors such as 3072 dimensions, the default 1,000,000-row chunk eagerly
requests multi-GB memory (about 12.3 GB, assuming 4-byte floats) before any
rows are added. This change removes the init-time reserve and adds
ann_index_build_chunk_bytes as a target byte bound when choosing the effective
build chunk. The effective chunk still holds at least one row and never drops
below the minimum number of training rows the index requires, so IVF/PQ/SQ
training is not split below FAISS requirements.
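
The chunk selection described above can be sketched as follows. This is an illustrative model only, not the code in this PR: the function names, the 4-byte float size, the 1 GiB byte bound, and the example minimum training row counts are all assumptions made for the example.

```python
FLOAT_BYTES = 4  # assumption: each vector component is a 32-bit float


def eager_reserve_bytes(chunk_size_rows: int, dimension: int) -> int:
    """Bytes the old init-time reserve would request (hypothetical helper)."""
    return chunk_size_rows * dimension * FLOAT_BYTES


def effective_chunk_rows(chunk_size_rows: int, chunk_bytes: int,
                         dimension: int, min_train_rows: int) -> int:
    """Pick the number of rows to buffer per build chunk (hypothetical helper).

    The row-count limit is capped by the byte target, then raised back to
    the index's minimum training rows, and finally to at least one row.
    """
    row_bytes = dimension * FLOAT_BYTES
    rows_by_bytes = chunk_bytes // row_bytes
    rows = min(chunk_size_rows, rows_by_bytes)
    return max(rows, min_train_rows, 1)


# Old behavior: 1,000,000 rows of 3072 dimensions reserved up front.
old_bytes = eager_reserve_bytes(1_000_000, 3072)

# New behavior with an assumed 1 GiB byte target and no training minimum.
bounded_rows = effective_chunk_rows(1_000_000, 1 << 30, 3072, 0)

# An assumed training minimum of 100,000 rows overrides the byte target.
trained_rows = effective_chunk_rows(1_000_000, 1 << 30, 3072, 100_000)

# A byte target smaller than one row still yields a one-row chunk.
tiny_rows = effective_chunk_rows(1_000_000, 1, 3072, 0)
```

Under these assumptions the byte target is a soft bound: the training minimum takes precedence over it, so a chunk can exceed ann_index_build_chunk_bytes when the index needs more rows to train.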
### Release note
Add BE config ann_index_build_chunk_bytes to bound ANN index build chunk
buffering by bytes.
### Check List (For Author)
- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason? -->
- Behavior changed:
    - [ ] No.
    - [x] Yes. ANN index build buffering now respects
      ann_index_build_chunk_bytes and no longer preallocates the full
      row-count chunk in init.
- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg:
      https://github.com/apache/doris-website/pull/1214 -->
### Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should
merge into -->