> [...] this trivial evaluator can resolve 1 (big decimal) + 1 (big decimal) but then fails when the result is used as an argument for substring (given it requires ints).

I think this is really a symptom of the fact that the validator does not make all the implicit casts explicit. This is a problem with many function type-checking rules. I consider this a bug, but it can be construed as a feature when a function is truly polymorphic. (SUBSTRING is polymorphic in its string argument, which can also be a BINARY, but not in its position/length arguments, which should always be INT.)

Mihai

________________________________
From: Gonzalo Ortiz Jaureguizar <golthir...@gmail.com>
Sent: Friday, July 4, 2025 5:46 AM
To: dev@calcite.apache.org <dev@calcite.apache.org>
Subject: Re: Reduce class loading during query optimizing

Thanks for your answers,

> Another thing you could try is to build a RexExecutor which does nothing and use that in the RexSimplify class. That may be simpler.

That would have the side effect of not simplifying casts whose input is a literal, right? That may be acceptable for Calcite (we can simplify that later), but I think it would not be ideal.

> The alternative to that is interpretation but there is no such implementation currently in Calcite

Following Mihai's idea, I'm trying to implement my own RexExecutor that calls RexInterpreter but, as its javadoc indicates, it doesn't support all functions. From my inexperience, it looks like we would need to call something like:

```
BuiltInMethod builtInMethod = BuiltInMethod.valueOf(call.getOperator().getName());
return (Comparable) builtInMethod.method.invoke(null, values.toArray());
```

This seems good enough for trivial cases, but it is not correct. I'm using RexExecutorTest.testBinarySubstring as an example, and this trivial evaluator can resolve 1 (big decimal) + 1 (big decimal) but then fails when the result is used as an argument for substring (given it requires ints). I'm sure there are other things I'm missing (perhaps including canonicalizing the function names, custom functions, etc.), given my limited knowledge of Calcite's internal components.

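A minimal sketch of the mismatch described above, using only the JDK and no Calcite classes: a reflective call rejects BigDecimal arguments for int parameters unless the implicit cast is applied first. The class `CoercingInvoker`, its `substring` stand-in, and the helpers `coerce`, `invokeCoerced`, and `evalSubstring` are all hypothetical names made up for illustration; they are not part of Calcite, and the coercion covers only the numeric-to-int case.

```java
import java.lang.reflect.Method;
import java.math.BigDecimal;

public class CoercingInvoker {
  // Hypothetical stand-in for a built-in SUBSTRING whose position/length are ints.
  public static String substring(String s, int from, int length) {
    int start = from - 1; // SQL positions are 1-based
    return s.substring(start, Math.min(s.length(), start + length));
  }

  // Applies the implicit cast that the reflective call needs: Number -> int.
  // Only the int case is handled; every other value is passed through unchanged.
  public static Object coerce(Object value, Class<?> target) {
    boolean wantsInt = target == int.class || target == Integer.class;
    if (wantsInt && value instanceof BigDecimal) {
      // intValueExact throws ArithmeticException instead of silently truncating.
      return ((BigDecimal) value).intValueExact();
    }
    if (wantsInt && value instanceof Number) {
      return ((Number) value).intValue();
    }
    return value;
  }

  // Invokes a static method after coercing each argument to its parameter type.
  public static Object invokeCoerced(Method method, Object... values) {
    Class<?>[] params = method.getParameterTypes();
    if (params.length != values.length) {
      throw new IllegalArgumentException("expected " + params.length + " arguments");
    }
    Object[] args = new Object[values.length];
    for (int i = 0; i < values.length; i++) {
      args[i] = coerce(values[i], params[i]);
    }
    try {
      return method.invoke(null, args);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  // Convenience wrapper that looks up the stand-in substring method.
  public static Object evalSubstring(Object... values) {
    try {
      Method m = CoercingInvoker.class.getMethod(
          "substring", String.class, int.class, int.class);
      return invokeCoerced(m, values);
    } catch (NoSuchMethodException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) throws Exception {
    Method m = CoercingInvoker.class.getMethod(
        "substring", String.class, int.class, int.class);
    BigDecimal two = BigDecimal.ONE.add(BigDecimal.ONE); // 1 + 1 as big decimals
    try {
      // Without coercion, reflection refuses BigDecimal for an int parameter.
      m.invoke(null, "calcite", two, BigDecimal.valueOf(3));
    } catch (IllegalArgumentException e) {
      System.out.println("uncoerced call rejected");
    }
    System.out.println(evalSubstring("calcite", two, BigDecimal.valueOf(3))); // prints "alc"
  }
}
```

In a real evaluator the target types would presumably come from the operator's type-checking rules rather than from the Java method signature, which is the point Mihai raises about the validator not making implicit casts explicit.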
Still, so far, it has been a good experience for understanding it. I used RexImpTable as my main inspiration.

The refactor that Stamatis proposes makes sense to me. Perhaps it's my inexperience with the framework, or maybe it's the fact that the code has evolved organically over the years (we have the same issue in Pinot). However, sometimes when I review the code, I get the feeling that some modules are too tightly coupled. I also think this Rex interpreter would be handy in cases where Calcite is used to analyze a query instead of running it.

My agenda for July is a bit tight, and I will be unavailable for the next couple of weeks, but I would like to contribute to these features.

Best,
Gonzalo

On Fri, Jul 4, 2025 at 11:22, Stamatis Zampetakis (<zabe...@gmail.com>) wrote:

> Here are a few thoughts that could lead to some useful contributions to Calcite as well.
>
> The fact that RexSimplify calls RexExecutor is not ideal. Personally, I feel that they should be completely independent of each other. Most often (e.g., in the ReduceExpressionsRule) before we apply the expression simplification (RexSimplify) we do constant reduction/folding (RexExecutor), so in that sense we don't need to embed the executor inside the simplifier. At the moment, the RexSimplify class is using RexExecutor only in a very specific case (simplifying CAST expressions), so it may be a good time to completely dissociate the two components before the coupling increases.
>
> The current implementation of the RexExecutor relies on code generation/compilation to evaluate expressions (RexNode). The alternative to that is interpretation but there is no such implementation currently in Calcite. There are always pros and cons between compilation and interpretation and the choice always depends on the use-case. In general, I feel that it would be useful if Calcite provided an interpreter for row expressions (RexNode).
> Concretely, I think this would mean implementing the ScalarCompiler [2] interface using interpretation instead of code generation. This contribution would be valuable for RexExecutor but also for the Bindable convention of Calcite that relies on interpretation.
>
> The overhead of compilation and class loading has also been observed in the past and for this reason we have added a cache layer [3] that can avoid this for frequently appearing queries. I guess the caching does not take effect for RexExecutor but it could make sense if there are frequently reappearing expressions.
>
> All in all, I see at least three contribution areas that could help with the use-case encountered in Apache Pinot and at the same time improve Calcite and potentially other projects as well. It would be nice to see some of them land in the main Calcite repo :)
>
> Best,
> Stamatis
>
> [1] https://github.com/apache/calcite/blob/e536d3949674cbbc0fd1162b3fccb211ca75789b/core/src/main/java/org/apache/calcite/rel/rules/ReduceExpressionsRule.java
> [2] http://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/interpreter/Interpreter.java#L496
> [3] https://github.com/apache/calcite/blob/e536d3949674cbbc0fd1162b3fccb211ca75789b/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableInterpretable.java#L100
>
> On Thu, Jul 3, 2025 at 7:36 PM Mihai Budiu <mbu...@gmail.com> wrote:
> >
> > Another thing you could try is to build a RexExecutor which does nothing and use that in the RexSimplify class. That may be simpler.
> >
> > Mihai
> >
> > ________________________________
> > From: Gonzalo Ortiz Jaureguizar <golthir...@gmail.com>
> > Sent: Thursday, July 3, 2025 4:42 AM
> > To: dev@calcite.apache.org <dev@calcite.apache.org>
> > Subject: Reduce class loading during query optimizing
> >
> > Hi there,
> >
> > Here at Apache Pinot, we utilize Apache Calcite for query optimization.
> > Once the query is optimized at the logical level (pushing predicates, simplifying expressions, etc.), we transform the query tree into our own nodes and distribute the execution between different nodes.
> >
> > But we are having some issues in cases where we have 1000 queries per second. Specifically, we found that in these cases, we may end up with >80000 instances of org.codehaus.commons.compiler.util.reflect.ByteArrayClassLoader and, even worse, several query threads have to block for the lock of jdk.internal.loader.BuiltinClassLoader.loadClassOrNull.
> >
> > In these two cases, the stack trace points to Janino. Specifically, Janino is being used to compile trivial expressions when queries are simplified. Is that necessary or even desirable? We may be doing something wrong (our Calcite knowledge is not as good as we would like), but it sounds expensive to generate bytecode to run code that will be executed only once. Is there a way to disable Janino for these cases?
> > > > For the context, > > > > Stack trace of the thread allocating a new classloader: > > ``` > > > > <init>:57, ByteArrayClassLoader > (org.codehaus.commons.compiler.util.reflect) > > run:357, SimpleCompiler$2 (org.codehaus.janino) > > run:351, SimpleCompiler$2 (org.codehaus.janino) > > doPrivileged:74, AccessController (java.security) > > getClassLoader2:351, SimpleCompiler (org.codehaus.janino) > > getClassLoader:341, SimpleCompiler (org.codehaus.janino) > > cook:308, ClassBodyEvaluator (org.codehaus.janino) > > cook2:297, ClassBodyEvaluator (org.codehaus.janino) > > cook:273, ClassBodyEvaluator (org.codehaus.janino) > > compile:64, RexExecutable (org.apache.calcite.rex) > > <init>:53, RexExecutable (org.apache.calcite.rex) > > reduce:144, RexExecutorImpl (org.apache.calcite.rex) > > simplifyCast:2304, RexSimplify (org.apache.calcite.rex) > > simplify:293, RexSimplify (org.apache.calcite.rex) > > lambda$simplifyList$2:682, RexSimplify (org.apache.calcite.rex) > > apply:-1, RexSimplify$$Lambda/0x00007faed7bfa218 (org.apache.calcite.rex) > > replaceAllRange:1803, ArrayList (java.util) > > replaceAll:1793, ArrayList (java.util) > > simplifyList:682, RexSimplify (org.apache.calcite.rex) > > simplifyComparison:523, RexSimplify (org.apache.calcite.rex) > > simplifyComparison:515, RexSimplify (org.apache.calcite.rex) > > simplify:311, RexSimplify (org.apache.calcite.rex) > > lambda$simplifyList$2:682, RexSimplify (org.apache.calcite.rex) > > apply:-1, RexSimplify$$Lambda/0x00007faed7bfa218 (org.apache.calcite.rex) > > replaceAllRange:1803, ArrayList (java.util) > > replaceAll:1793, ArrayList (java.util) > > simplifyList:682, RexSimplify (org.apache.calcite.rex) > > simplifyAnd:1529, RexSimplify (org.apache.calcite.rex) > > simplify:282, RexSimplify (org.apache.calcite.rex) > > simplifyUnknownAs:251, RexSimplify (org.apache.calcite.rex) > > simplifyUnknownAsFalse:240, RexSimplify (org.apache.calcite.rex) > > simplifyFilterPredicates:2898, RexSimplify 
(org.apache.calcite.rex) > > filter:1921, RelBuilder (org.apache.calcite.tools) > > filter:1886, RelBuilder (org.apache.calcite.tools) > > convertProject:393, PushProjector (org.apache.calcite.rel.rules) > > onMatch:179, ProjectFilterTransposeRule (org.apache.calcite.rel.rules) > > fireRule:350, AbstractRelOptPlanner (org.apache.calcite.plan) > > applyRule:541, HepPlanner (org.apache.calcite.plan.hep) > > depthFirstApply:370, HepPlanner (org.apache.calcite.plan.hep) > > depthFirstApply:384, HepPlanner (org.apache.calcite.plan.hep) > > depthFirstApply:384, HepPlanner (org.apache.calcite.plan.hep) > > depthFirstApply:384, HepPlanner (org.apache.calcite.plan.hep) > > applyRules:436, HepPlanner (org.apache.calcite.plan.hep) > > executeRuleCollection:285, HepPlanner (org.apache.calcite.plan.hep) > > execute:105, HepInstruction$RuleCollection$State > (org.apache.calcite.plan.hep) > > lambda$executeProgram$0:210, HepPlanner (org.apache.calcite.plan.hep) > > accept:-1, HepPlanner$$Lambda/0x00007faed7c20f90 > (org.apache.calcite.plan.hep) > > forEach:423, ImmutableList (com.google.common.collect) > > executeProgram:209, HepPlanner (org.apache.calcite.plan.hep) > > execute:118, HepProgram$State (org.apache.calcite.plan.hep) > > executeProgram:204, HepPlanner (org.apache.calcite.plan.hep) > > findBestExp:190, HepPlanner (org.apache.calcite.plan.hep) > > optimize:463, QueryEnvironment (org.apache.pinot.query) > > compileQuery:356, QueryEnvironment (org.apache.pinot.query) > > compile:283, QueryEnvironment (org.apache.pinot.query) > > compile:261, QueryEnvironment (org.apache.pinot.query) > > getTableNames:225, PinotQueryResource > > (org.apache.pinot.controller.api.resources) > > ... 
> > > > ``` > > > > Stack trace of the threads being blocked: > > > > ``` > > ## Thread has lock > > "multi-stage-query-compile-executor-2-thread-1" #374 [369] prio=5 > os_prio=0 > > cpu=4267041.58ms elapsed=114126.21s tid=0x00007f7a3d1f5810 nid=369 > runnable > > [0x00007f79c517b000] > > java.lang.Thread.State: RUNNABLE > > at java.lang.ClassLoader.findBootstrapClass(java.base@21.0.7 > /Native > > Method) > > at > java.lang.ClassLoader.findBootstrapClassOrNull(java.base@21.0.7 > > /ClassLoader.java:1277) > > at java.lang.System$2.findBootstrapClassOrNull(java.base@21.0.7 > > /System.java:2397) > > at > > > jdk.internal.loader.ClassLoaders$BootClassLoader.loadClassOrNull(java.base@21.0.7 > > /ClassLoaders.java:140) > > at > > jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(java.base@21.0.7 > > /BuiltinClassLoader.java:700) > > at > > jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(java.base@21.0.7 > > /BuiltinClassLoader.java:676) > > - locked <0x000000030a1d87f8> (a java.lang.Object) > > at > > jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(java.base@21.0.7 > > /BuiltinClassLoader.java:700) > > at > > jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(java.base@21.0.7 > > /BuiltinClassLoader.java:676) > > - locked <0x000000030ab04140> (a java.lang.Object) > > at > jdk.internal.loader.BuiltinClassLoader.loadClass(java.base@21.0.7 > > /BuiltinClassLoader.java:639) > > at > > > jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(java.base@21.0.7 > > /ClassLoaders.java:188) > > at java.lang.ClassLoader.loadClass(java.base@21.0.7 > > /ClassLoader.java:526) > > at > > > org.codehaus.janino.ClassLoaderIClassLoader.findIClass(ClassLoaderIClassLoader.java:75) > > at > > org.codehaus.janino.IClassLoader.loadIClass(IClassLoader.java:317) > > - locked <0x00000007f780cc68> (a > > org.codehaus.janino.ClassLoaderIClassLoader) > > at > > org.codehaus.janino.UnitCompiler.findTypeByName(UnitCompiler.java:9074) > > at > > > 
org.codehaus.janino.UnitCompiler.getRawReferenceType(UnitCompiler.java:7242) > > at > > org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:7142) > > at > > org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:7023) > > at > org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:6994) > > at > > org.codehaus.janino.UnitCompiler.access$14900(UnitCompiler.java:240) > > at > > > org.codehaus.janino.UnitCompiler$24.visitReferenceType(UnitCompiler.java:6891) > > at > > > org.codehaus.janino.UnitCompiler$24.visitReferenceType(UnitCompiler.java:6888) > > > > ...... > > > > at > > org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:281) > > at > > > org.apache.calcite.rex.RexSimplify.simplifyUnknownAs(RexSimplify.java:250) > > at > > > org.apache.calcite.rex.RexSimplify.simplifyUnknownAsFalse(RexSimplify.java:239) > > at > > > org.apache.calcite.rex.RexSimplify.simplifyFilterPredicates(RexSimplify.java:2810) > > at > org.apache.calcite.tools.RelBuilder.filter(RelBuilder.java:1809) > > at > org.apache.calcite.tools.RelBuilder.filter(RelBuilder.java:1774) > > at > > > org.apache.calcite.rel.rules.PushProjector.convertProject(PushProjector.java:394) > > at > > > org.apache.calcite.rel.rules.ProjectFilterTransposeRule.onMatch(ProjectFilterTransposeRule.java:179) > > at > > > org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:337) > > at > > org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556) > > at > > > org.apache.calcite.plan.hep.HepPlanner.depthFirstApply(HepPlanner.java:371) > > at > > > org.apache.calcite.plan.hep.HepPlanner.depthFirstApply(HepPlanner.java:385) > > at > > > org.apache.calcite.plan.hep.HepPlanner.depthFirstApply(HepPlanner.java:385) > > at > > > org.apache.calcite.plan.hep.HepPlanner.depthFirstApply(HepPlanner.java:385) > > at > > org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:437) > > at > > > 
org.apache.calcite.plan.hep.HepPlanner.executeRuleCollection(HepPlanner.java:286) > > at > > > org.apache.calcite.plan.hep.HepInstruction$RuleCollection$State.execute(HepInstruction.java:105) > > at > > > org.apache.calcite.plan.hep.HepPlanner.lambda$executeProgram$0(HepPlanner.java:211) > > at > > > org.apache.calcite.plan.hep.HepPlanner$$Lambda/0x00007f79fbf0a360.accept(Unknown > > Source) > > at > > org.apache.pinot.shaded.com > .google.common.collect.ImmutableList.forEach(ImmutableList.java:423) > > at > > > org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:210) > > at > > org.apache.calcite.plan.hep.HepProgram$State.execute(HepProgram.java:118) > > at > > > org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:205) > > at > > org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:191) > > at > > > org.apache.pinot.query.QueryEnvironment.optimize(QueryEnvironment.java:394) > > at > > > org.apache.pinot.query.QueryEnvironment.compileQuery(QueryEnvironment.java:347) > > at > > > org.apache.pinot.query.QueryEnvironment.planQuery(QueryEnvironment.java:187) > > at > > > org.apache.pinot.broker.requesthandler.MultiStageBrokerRequestHandler.lambda$handleRequest$2(MultiStageBrokerRequestHandler.java:222) > > > > > > > > > > > > Waiting threads: > > "multi-stage-query-compile-executor-2-thread-3" #376 [371] prio=5 > os_prio=0 > > cpu=4261107.12ms elapsed=114126.12s tid=0x00007f79c7d8b010 nid=371 > waiting > > for monitor entry [0x00007f79c48fb000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > at > > jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(java.base@21.0.7 > > /BuiltinClassLoader.java:651) > > - waiting to lock <0x000000030ab04140> (a java.lang.Object) > > at > jdk.internal.loader.BuiltinClassLoader.loadClass(java.base@21.0.7 > > /BuiltinClassLoader.java:639) > > at > > > jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(java.base@21.0.7 > > /ClassLoaders.java:188) > > at 
java.lang.ClassLoader.loadClass(java.base@21.0.7 > > /ClassLoader.java:526) > > at > > > org.codehaus.janino.ClassLoaderIClassLoader.findIClass(ClassLoaderIClassLoader.java:75) > > at > > org.codehaus.janino.IClassLoader.loadIClass(IClassLoader.java:317) > > - locked <0x00000007f70a2398> (a > > org.codehaus.janino.ClassLoaderIClassLoader) > > at > > org.codehaus.janino.UnitCompiler.findTypeByName(UnitCompiler.java:9074) > > at > > > org.codehaus.janino.UnitCompiler.getRawReferenceType(UnitCompiler.java:7242) > > at > > org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:7142) > > at > > org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:7023) > > at > org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:6994) > > at > > org.codehaus.janino.UnitCompiler.access$14900(UnitCompiler.java:240) > > at > > > org.codehaus.janino.UnitCompiler$24.visitReferenceType(UnitCompiler.java:6891) > > at > > > org.codehaus.janino.UnitCompiler$24.visitReferenceType(UnitCompiler.java:6888) > > at org.codehaus.janino.Java$ReferenceType.accept(Java.java:4289) > > at > org.codehaus.janino.UnitCompiler.getType(UnitCompiler.java:6888) > > at > > org.codehaus.janino.UnitCompiler.getRawType(UnitCompiler.java:6884) > > at > org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:7366) > > > > "multi-stage-query-compile-executor-2-thread-4" #396 [391] prio=5 > os_prio=0 > > cpu=4264527.65ms elapsed=114125.23s tid=0x00007f79cdbc2010 nid=391 > waiting > > for monitor entry [0x00007f79c12fb000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > at > > jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(java.base@21.0.7 > > /BuiltinClassLoader.java:651) > > - waiting to lock <0x000000030ab04140> (a java.lang.Object) > > at > jdk.internal.loader.BuiltinClassLoader.loadClass(java.base@21.0.7 > > /BuiltinClassLoader.java:639) > > at > > > jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(java.base@21.0.7 > > /ClassLoaders.java:188) > > at 
java.lang.ClassLoader.loadClass(java.base@21.0.7 > > /ClassLoader.java:526) > > at > > > org.codehaus.janino.ClassLoaderIClassLoader.findIClass(ClassLoaderIClassLoader.java:75) > > at > > org.codehaus.janino.IClassLoader.loadIClass(IClassLoader.java:317) > > - locked <0x00000007f880ca18> (a > > org.codehaus.janino.ClassLoaderIClassLoader) > > at > > org.codehaus.janino.UnitCompiler.findTypeByName(UnitCompiler.java:9074) > > at > > > org.codehaus.janino.UnitCompiler.getRawReferenceType(UnitCompiler.java:7242) > > at > > org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:7142) > > at > > org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:7023) > > at > org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:6994) > > at > > org.codehaus.janino.UnitCompiler.access$14900(UnitCompiler.java:240) > > at > > > org.codehaus.janino.UnitCompiler$24.visitReferenceType(UnitCompiler.java:6891) > > at > > > org.codehaus.janino.UnitCompiler$24.visitReferenceType(UnitCompiler.java:6888) > > at org.codehaus.janino.Java$ReferenceType.accept(Java.java:4289) > > at > org.codehaus.janino.UnitCompiler.getType(UnitCompiler.java:6888) > > at > > org.codehaus.janino.UnitCompiler.getRawType(UnitCompiler.java:6884) > > at > org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:7366) > > ``` >