eldenmoon opened a new pull request, #64641:
URL: https://github.com/apache/doris/pull/64641

   ### What problem does this PR solve?
   
   Issue Number: DORIS-26505
   
   Related PR: N/A
   
   Problem Summary:
   
   On master, variant compaction can materialize the empty JSON key path as a regular subcolumn when `default_variant_max_subcolumns_count = 0` and sparse hash sharding is enabled. The empty path also represents the variant root path, so after cumulative compaction the values from `Tags['']` can be lost and read back as NULL.
   
   This PR keeps empty paths in the sparse path set, instead of materializing them as subcolumns, in all compaction path selection helpers. It also adds a regression test that reproduces the sparse-bucket empty-key case.
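
The selection rule described above can be sketched as follows. This is a minimal illustration, not the actual BE code: `select_subcolumn_paths`, its `limit` parameter, and the frequency-based ranking are hypothetical names and assumptions introduced here. Only the rule itself, that the empty path always stays in the sparse set, comes from this PR.

```python
from typing import Dict, List, Optional, Tuple


def select_subcolumn_paths(
    path_counts: Dict[str, int], limit: Optional[int]
) -> Tuple[List[str], List[str]]:
    """Split paths into (materialized subcolumns, sparse paths).

    Hypothetical helper: `limit=None` stands for "no cap on subcolumns".
    Ranking by frequency is an assumption made for this sketch.
    """
    materialized: List[str] = []
    sparse: List[str] = []
    # Most frequent paths first; ties broken by name for determinism.
    ranked = sorted(path_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for path, _count in ranked:
        if path == "":
            # The empty path doubles as the variant root path, so it is
            # never materialized as a regular subcolumn.
            sparse.append(path)
        elif limit is None or len(materialized) < limit:
            materialized.append(path)
        else:
            sparse.append(path)
    return materialized, sparse


subcolumns, sparse_paths = select_subcolumn_paths({"": 5, "a": 3, "b": 1}, limit=None)
print(subcolumns, sparse_paths)
```

Even with no cap on subcolumns, the empty key ends up in the sparse set, so it cannot collide with the root column during compaction.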
   
   ### Release note
   
   None
   
   ### Check List (For Author)
   
   - Test
       - [x] Regression test
       - [ ] Unit Test
       - [x] Manual test (add detailed scripts or steps below)
       - [ ] No need to test or manual test. Explain why:
           - [ ] This is a refactor/code format and no logic has been changed.
           - [ ] Previous test can cover this change.
           - [ ] No code files have been changed.
           - [ ] Other reason
   
   Manual test / local verification:
   
   - Built master BE/FE with `BUILD_TYPE=ASAN USE_MEM_TRACKER=ON bash build.sh --be --fe`.
   - Reproduced on unmodified master with `default_variant_max_subcolumns_count = 0`, `default_variant_enable_doc_mode = false`, `use_v3_storage_format = false`, `default_variant_enable_typed_paths_to_sparse = false`, and `default_variant_sparse_hash_shard_count = 3`. Before compaction, `Tags['']` returned the inserted empty-key values; after cumulative compaction, all rows read as NULL.
   - Rebuilt BE after the fix with `BUILD_TYPE=ASAN USE_MEM_TRACKER=ON bash build.sh --be`.
   - Verified that the same manual repro preserves the empty-key values after cumulative compaction.
   - `./run-regression-test.sh --run --conf tmp/regression-conf.auto.groovy -d variant_p0 -s test_variant_empty_key_sparse_bucket -forceGenOut`
   - `./run-regression-test.sh --run --conf tmp/regression-conf.auto.groovy -d variant_p0 -s test_variant_empty_key_sparse_bucket`
   - `./run-regression-test.sh --run --conf tmp/regression-conf.auto.groovy -d variant_p0 -s regression_test_variant_column_name`
   - `./run-regression-test.sh --run --conf tmp/regression-conf.auto.groovy -d variant_p0 -s test_variant_compaction_empty_path_bug`
   
   - Behavior changed:
       - [x] No.
       - [ ] Yes.
   
   - Does this need documentation?
       - [x] No.
       - [ ] Yes.
   
   ### Check List (For Reviewer who merge this PR)
   
   - [ ] Confirm the release note
   - [ ] Confirm test cases
   - [ ] Confirm document
   - [ ] Add branch pick label
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

