github-actions[bot] commented on code in PR #64094:
URL: https://github.com/apache/doris/pull/64094#discussion_r3354189201


##########
be/src/exprs/function/cast/cast_to_variant.h:
##########
@@ -32,45 +32,60 @@ inline Status cast_from_variant_impl(FunctionContext* context, Block& block,
     auto& col_with_type_and_name = block.get_by_position(arguments[0]);
     auto& col_from = col_with_type_and_name.column;
     const IColumn* variant_column = col_from.get();
-    if (const auto* nullable = check_and_get_column<ColumnNullable>(*variant_column)) {
+    const auto* nullable = check_and_get_column<ColumnNullable>(*variant_column);
+    if (nullable != nullptr) {
         variant_column = &nullable->get_nested_column();
     }
+    const auto* variant = assert_cast<const ColumnVariant*>(variant_column);
+    ColumnPtr col_to = data_type_to->create_column();
 
-    if (!assert_cast<const ColumnVariant&>(*variant_column).is_finalized()) {
-        // ColumnVariant should be finalized before parsing, finalize maybe modify original column structure
-        auto mutable_column = IColumn::mutate(std::move(col_with_type_and_name.column));
-        if (auto* nullable = check_and_get_column<ColumnNullable>(*mutable_column)) {
-            const auto& const_nullable = *nullable;
-            auto nested_column = IColumn::mutate(const_nullable.get_nested_column_ptr());
-            assert_cast<ColumnVariant&>(*nested_column).finalize();
-            ColumnPtr nested_column_ptr = std::move(nested_column);
-            nullable->change_nested_column(nested_column_ptr);
+    DCHECK_LE(input_rows_count, col_from->size());
+    ColumnPtr finalized_input_column;
+    if (!variant->is_finalized()) {
+        // Local exchange can share the same input block across multiple downstream tasks.
+        // Finalize a private copy so variant casts never mutate shared input columns.
+        auto finalized_variant = variant->clone_finalized();
+        variant = assert_cast<const ColumnVariant*>(finalized_variant.get());
+        DCHECK_LE(input_rows_count, finalized_variant->size());
+        if (input_rows_count < finalized_variant->size()) {
+            finalized_variant = finalized_variant->clone_resized(input_rows_count);
+            variant = assert_cast<const ColumnVariant*>(finalized_variant.get());
+        }
+        if (nullable != nullptr) {
+            auto cloned_null_map =
+                    nullable->get_null_map_column_ptr()->clone_resized(input_rows_count);
+            finalized_input_column = ColumnNullable::create(std::move(finalized_variant),
+                                                            std::move(cloned_null_map));
         } else {
-            assert_cast<ColumnVariant&>(*mutable_column).finalize();
+            finalized_input_column = std::move(finalized_variant);
         }
-        col_with_type_and_name.column = std::move(mutable_column);
-    }
-
-    variant_column = col_with_type_and_name.column.get();
-    if (const auto* nullable = check_and_get_column<ColumnNullable>(*variant_column)) {
-        variant_column = &nullable->get_nested_column();

Review Comment:
   When a caller executes a prefix with `input_rows_count == 0` against a
   non-empty, unfinalized input column, this resize produces a zero-row
   `ColumnVariant`. Immediately after this block, the code calls
   `variant->is_scalar_variant()` and potentially
   `variant->only_have_default_values()`. Both helpers index
   `serialized_*_offsets()[num_rows - 1]`, so the zero-row clone can read out
   of bounds before the string/jsonb executor gets a chance to run. The new
   tests already exercise prefix execution with a count smaller than the
   physical column (count 1), so count 0 should be safe as well. Please either
   return an empty result before calling these helpers, or handle the zero-row
   case in a way that avoids calling Variant helpers that assume
   `num_rows > 0`.
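
   For illustration only, here is a minimal sketch of the early-return option.
   It does not use the Doris API: `ToyVariantColumn`, `CastPath`, and
   `choose_cast_path` are hypothetical stand-ins, and the toy `clone_resized`
   only truncates. The sketch assumes the helper reads the last offset entry,
   as described above, and shows the zero-row guard running before that read.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a column that keeps one serialized offset per row.
// This is NOT the Doris ColumnVariant class.
struct ToyVariantColumn {
    std::vector<std::size_t> offsets; // offsets[i] = end of row i's serialized data

    std::size_t size() const { return offsets.size(); }

    // Reads offsets[num_rows - 1], so it is only valid when num_rows > 0.
    // The assert documents that precondition.
    bool only_default_values_unchecked() const {
        assert(!offsets.empty());
        return offsets[offsets.size() - 1] == 0;
    }

    // Returns a copy holding at most the first n rows (truncation only).
    ToyVariantColumn clone_resized(std::size_t n) const {
        ToyVariantColumn out;
        const std::size_t keep = std::min(n, offsets.size());
        out.offsets.assign(offsets.begin(), offsets.begin() + keep);
        return out;
    }
};

enum class CastPath { Empty, DefaultsOnly, General };

// Chooses how to cast a prefix of `input`. The zero-row guard runs first, so
// the helper that indexes the last offset never sees an empty column.
CastPath choose_cast_path(const ToyVariantColumn& input, std::size_t input_rows_count) {
    if (input_rows_count == 0) {
        return CastPath::Empty; // nothing to cast, return an empty result
    }
    const ToyVariantColumn prefix = input.clone_resized(input_rows_count);
    return prefix.only_default_values_unchecked() ? CastPath::DefaultsOnly
                                                  : CastPath::General;
}
```

   In the real function the guard would presumably return an empty column of
   the target type; that detail is left to the author.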



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

