This sounds like a very useful transformation. Are you considering contributing this in some way as a utility function?
Mihai

________________________________
From: Ian Bertolacci <ian.bertola...@workday.com.INVALID>
Sent: Tuesday, March 4, 2025 2:50 PM
To: dev@calcite.apache.org <dev@calcite.apache.org>
Subject: Re: RelFieldTrimmer not optimally trimming after filters under joins?

I just hacked together an override where it will build a redundant project on each side if necessary. That should eliminate any overhead of invoking any planners or rules. (For our needs, additional projects have no performance implications.)

-Ian

From: Ian Bertolacci <ian.bertola...@workday.com.INVALID>
Reply-To: "dev@calcite.apache.org" <dev@calcite.apache.org>
Date: Tuesday, March 4, 2025 at 14:25
To: "dev@calcite.apache.org" <dev@calcite.apache.org>
Subject: Re: RelFieldTrimmer not optimally trimming after filters under joins?

> I think you could work around this by always inserting trivial projects over
> every node in the tree before trimming, and then clean up with
> ProjectRemoveRule.

This is pretty much exactly what I was doing. Good to know that I’m not wildly off-track.

Thanks!
-Ian

From: Steven Phillips <ste...@dremio.com.INVALID>
Reply-To: "dev@calcite.apache.org" <dev@calcite.apache.org>
Date: Tuesday, March 4, 2025 at 13:55
To: "dev@calcite.apache.org" <dev@calcite.apache.org>
Subject: Re: RelFieldTrimmer not optimally trimming after filters under joins?

I think this is a current limitation of FieldTrimmer.
The Join and Filter nodes can't drop columns (since they don't carry column selection information), and the trimmer doesn't add Project nodes (currently). I have worked around this limitation by using HepPlanner with various ProjectTranspose rules.

I think you could work around this by always inserting trivial projects over every node in the tree before trimming, and then clean up with ProjectRemoveRule.

On Tue, Mar 4, 2025 at 1:33 PM Ian Bertolacci <ian.bertola...@workday.com.invalid> wrote:

> I’m looking at using RelFieldTrimmer, and I’m noticing that if a side of a
> join has unnecessary fields after a filter, there is no trim-fields project
> on that side to reduce the width of the row.
> Is this expected, or is there a configuration or pre-processing step that
> I am missing?
>
> For example, starting with this tree (these all look better in monospace,
> hopefully the formatting comes through)
> 4:Project(C5633_14509=[$4], C5633_486=[$8])
> └── 3:Join(condition=[=($1, $6)], joinType=[inner])
> ....├── 1:Filter(condition=[<($2, 10)])
> ....│...└── 0:TableScan(table=[T902], Schema=[...6 fields...])
> ....└── 2:TableScan(table=[T895], Schema=[...64 fields...])
>
> The result of RelFieldTrimmer is this:
> 9:Project(C5633_14509=[$2], C5633_486=[$4])
> └── 8:Join(condition=[=($0, $3)], joinType=[inner])
> ....├── 6:Filter(condition=[<($1, 10)])
> ....│...└── 5:Project(C5633_14505=[$1], C5633_14506=[$2], C5633_14509=[$4])
> ....│.......└── 0:TableScan(table=[T902], Schema=[...6 fields...])
> ....└── 7:Project(ID=[$0], C5633_486=[$2])
> ........└── 2:TableScan(table=[T895], Schema=[...64 fields...])
>
> Notice: $1 on the LHS of the join is not used *after* the filter, so a
> projection of only the $0 and $2 fields would reduce the width of the
> row before the join.
>
> However, I can force the insertion of a projection which is simply the
> identity (i.e., projecting all fields of the input row with no additions or
> subtractions):
> 5:Project(C5633_14509=[$4], C5633_486=[$8])
> └── 4:Join(condition=[=($1, $6)], joinType=[inner])
> ....├── 2:Project(...Identity mapping, 6 fields...)
> ....│...└── 1:Filter(condition=[<($2, 10)])
> ....│.......└── 0:TableScan(table=[T902], Schema=[...6 fields...])
> ....└── 3:TableScan(table=[T895], Schema=[...64 fields...])
>
> And the result is a projection which only has the 2 fields necessary after
> the filter.
> 11:Project(C5633_14509=[$1], C5633_486=[$3])
> └── 10:Join(condition=[=($0, $2)], joinType=[inner])
> ....├── 8:Project(C5633_14505=[$0], C5633_14509=[$2]) <- trimmed
> ....│...└── 7:Filter(condition=[<($1, 10)])
> ....│.......└── 6:Project(C5633_14505=[$1], C5633_14506=[$2], C5633_14509=[$4])
> ....│...........└── 0:TableScan(table=[T902], Schema=[...6 fields...])
> ....└── 9:Project(ID=[$0], C5633_486=[$2])
> ........└── 3:TableScan(table=[T895], Schema=[...64 fields...])
>
> Thanks!
> -Ian
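
The behaviour discussed in this thread can be illustrated with a toy model in plain Java. This is only a sketch of the reasoning and does not use Calcite's classes: `TrimSketch`, `Node`, `Scan`, `Filter`, `Project` and `keep` are hypothetical names, and `f0`..`f5` are made-up field names standing in for the six columns of T902 (`f1` as the join key, `f2` as the filter column, `f4` as the projected column). It assumes, as described above, that a filter passes its input row through unchanged and that the trimmer narrows existing projects but never adds one above a filter.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of field trimming. Hypothetical classes, NOT Calcite's API. */
final class TrimSketch {

  interface Node {
    /** Names of the fields this node emits. */
    List<String> output();

    /** Rebuilds this node so that it emits as few fields outside {@code needed} as it can. */
    Node trim(Set<String> needed);
  }

  static final class Scan implements Node {
    final List<String> fields;

    Scan(List<String> fields) { this.fields = fields; }

    public List<String> output() { return fields; }

    public Node trim(Set<String> needed) {
      // A scan is narrowed by stacking a project on top of it.
      return needed.containsAll(fields) ? this : new Project(this, keep(fields, needed));
    }
  }

  static final class Filter implements Node {
    final Node input;
    final String conditionField;

    Filter(Node input, String conditionField) {
      this.input = input;
      this.conditionField = conditionField;
    }

    // A filter cannot drop columns: its row type is its input's row type.
    public List<String> output() { return input.output(); }

    public Node trim(Set<String> needed) {
      // The input must also supply the field that the condition reads.
      Set<String> below = new LinkedHashSet<>(needed);
      below.add(conditionField);
      return new Filter(input.trim(below), conditionField);
    }
  }

  static final class Project implements Node {
    final Node input;
    final List<String> fields;

    Project(Node input, List<String> fields) {
      this.input = input;
      this.fields = fields;
    }

    public List<String> output() { return fields; }

    public Node trim(Set<String> needed) {
      // A project can narrow itself, then ask its input only for what it kept.
      List<String> kept = keep(fields, needed);
      return new Project(input.trim(new LinkedHashSet<>(kept)), kept);
    }
  }

  /** Returns the members of {@code fields} that are in {@code needed}, in original order. */
  static List<String> keep(List<String> fields, Set<String> needed) {
    List<String> kept = new ArrayList<>();
    for (String f : fields) {
      if (needed.contains(f)) {
        kept.add(f);
      }
    }
    return kept;
  }

  /** Left side of the join in the example: a filter on f2 over a six-field scan. */
  static Node filteredScan() {
    return new Filter(new Scan(List.of("f0", "f1", "f2", "f3", "f4", "f5")), "f2");
  }

  public static void main(String[] args) {
    Set<String> neededByJoin = Set.of("f1", "f4");

    // Without a project above the filter, the filter column survives up to the join.
    Node plain = filteredScan().trim(neededByJoin);
    System.out.println(plain.output()); // prints [f1, f2, f4]

    // With an identity project above the filter, the trimmer has a node it can narrow.
    Node identity = new Project(filteredScan(), filteredScan().output());
    System.out.println(identity.trim(neededByJoin).output()); // prints [f1, f4]
  }
}
```

The two printed row types correspond to the three-field and two-field left inputs in the trees above. In Calcite itself, the steps would be the ones named in the thread: insert trivial projects, run RelFieldTrimmer, then clean up with ProjectRemoveRule.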