[
https://issues.apache.org/jira/browse/CALCITE-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17645458#comment-17645458
]
Julian Hyde commented on CALCITE-5426:
--------------------------------------
Thanks for logging this, [~dmsysolyatin]. It's an important correctness issue
and we should fix it.
{quote}
Are there any performance tests that prove that this optimization significantly
improves performance?
{quote}
No, because there are no performance tests. But if we removed common
subexpression elimination - and say moved an array initializer from a static
initializer to inside a for loop - I'm sure there would be significant
degradation.
Let's provide a way for the caller of the API a way to say that the object is
mutable. If the object is of an immutable type, say {{String}} or {{Double}} or
{{ImmutableList}} then it would be an error if the mutable flag is true. If the
object is of a possibly-mutable type, say {{List}} or {{ArrayList}} or an
array, then we have to take the caller at their word.
Maybe there are different kinds of mutable. Perhaps a list is populated at the
start of each execution but doesn't change from one row to the next. Keep an
eye out for those cases; we might need more information than boolean mutable vs
immutable, or we might be able to safely treat them as immutable.
> BlockBuilder should not optimize expressions related to mutable objects to
> variable
> -----------------------------------------------------------------------------------
>
> Key: CALCITE-5426
> URL: https://issues.apache.org/jira/browse/CALCITE-5426
> Project: Calcite
> Issue Type: Bug
> Components: linq4j
> Affects Versions: 1.32.0
> Reporter: Dmitry Sysolyatin
> Priority: Major
>
> Inside BlockBuilder there is an optimization that replaces an expression with
> a variable if the expressions are equal and have the final modifier.
> But this optimization can cause problems when used with a mutable objects
> (One of the problems has been found in CALCITE-5388):
> For example [1]:
> {code:java}
> @Test void testReuseCollectionExpression() throws NoSuchMethodException {
> Method putMethod = HashMap.class.getMethod("put", Object.class,
> Object.class);
> Method sizeMethod = HashMap.class.getMethod("size");
> Expression multiMapParent = b.append("multiMap",
> Expressions.new_(Types.of(HashMap.class)));
> b.add(Expressions.statement(
> Expressions.call(multiMapParent, putMethod, Expressions.box(ONE),
> Expressions.box(ONE))));
> BlockBuilder nested = new BlockBuilder(true, b);
> Expression multiMapNested = nested.append("multiMap",
> Expressions.new_(Types.of(HashMap.class)));
> nested.add(Expressions.statement(
> Expressions.call(multiMapNested, putMethod, Expressions.box(TWO),
> Expressions.box(TWO))));
> nested.add(Expressions.call(multiMapNested, sizeMethod));
> b.add(nested.toBlock());
> b.append(Expressions.call(multiMapParent, sizeMethod));
> // It is wrong output. Map should be reused
> assertEquals(
> "{\n"
> + " final java.util.HashMap multiMap = new
> java.util.HashMap();\n"
> + " multiMap.put(Integer.valueOf(1), Integer.valueOf(1));\n"
> + " {\n"
> + " multiMap.put(Integer.valueOf(2), Integer.valueOf(2));\n"
> + " return multiMap.size();\n"
> + " }\n"
> + " return multiMap.size();\n"
> + "}\n",
> b.toBlock().toString());
> }
> {code}
> Are there any performance tests that prove that this optimization
> significantly improves performance?
> [1]
> https://github.com/dssysolyatin/calcite/commit/626a5f48ef9e69e543aeec277a4f38000a190b10
--
This message was sent by Atlassian Jira
(v8.20.10#820010)