I don’t think I understand this conversation. RelFieldTrimmer is intended to be invoked on the whole tree. Each node, when invoking the trimmer on its input (child), tells the trimmer which of the fields of that input it actually uses. Now ‘which fields it actually uses’ is based on the fields that its consumer (parent) said that it was using.
If fields are not being trimmed as expected, look for one node that is wrongly telling its input ‘I need all of your fields’.

Julian

> On Mar 4, 2025, at 2:50 PM, Ian Bertolacci
> <ian.bertola...@workday.com.invalid> wrote:
>
> I just hacked together an override where it will build a redundant project on
> each side if necessary.
> That should eliminate any overhead of invoking any planners or rules.
> (For our needs, additional projects have no performance implications.)
> -Ian
>
> From: Ian Bertolacci <ian.bertola...@workday.com.INVALID>
> Reply-To: "dev@calcite.apache.org" <dev@calcite.apache.org>
> Date: Tuesday, March 4, 2025 at 14:25
> To: "dev@calcite.apache.org" <dev@calcite.apache.org>
> Subject: Re: RelFieldTrimmer not optimally trimming after filters under joins?
>
>> I think you could work around this by always inserting trivial projects over
>> every node in the tree before trimming, and then clean up with
>> ProjectRemoveRule.
>
> This is pretty much exactly what I was doing.
> Good to know that I’m not wildly off-track.
> Thanks!
> -Ian
>
> > From: Steven Phillips <ste...@dremio.com.INVALID>
> > Reply-To: "dev@calcite.apache.org" <dev@calcite.apache.org>
> > Date: Tuesday, March 4, 2025 at 13:55
> > To: "dev@calcite.apache.org" <dev@calcite.apache.org>
> > Subject: Re: RelFieldTrimmer not optimally trimming after filters under joins?
> >
> > I think this is a current limitation of FieldTrimmer. The Join and Filter
> > nodes can't drop columns (since they don't carry column selection
> > information), and the trimmer doesn't add Project nodes (currently). I have
> > worked around this limitation by using HepPlanner with various
> > ProjectTranspose rules.
> >
> > I think you could work around this by always inserting trivial projects
> > over every node in the tree before trimming, and then clean up with
> > ProjectRemoveRule.
> >
> > On Tue, Mar 4, 2025 at 1:33 PM Ian Bertolacci
> > <ian.bertola...@workday.com.invalid> wrote:
> >
> >> I’m looking at using RelFieldTrimmer, and I’m noticing that if a side of a
> >> join has unnecessary fields after a filter, there is no trim-fields project
> >> on that side to reduce the width of the row.
> >> Is this expected, or is there a configuration or pre-processing step that
> >> I am missing?
> >>
> >> For example, starting with this tree (these all look better in monospace,
> >> hopefully the formatting comes through):
> >> 4:Project(C5633_14509=[$4], C5633_486=[$8])
> >> └── 3:Join(condition=[=($1, $6)], joinType=[inner])
> >> ....├── 1:Filter(condition=[<($2, 10)])
> >> ....│...└── 0:TableScan(table=[T902], Schema=[...6 fields...])
> >> ....└── 2:TableScan(table=[T895], Schema=[...64 fields...])
> >>
> >> The result of RelFieldTrimmer is this:
> >> 9:Project(C5633_14509=[$2], C5633_486=[$4])
> >> └── 8:Join(condition=[=($0, $3)], joinType=[inner])
> >> ....├── 6:Filter(condition=[<($1, 10)])
> >> ....│...└── 5:Project(C5633_14505=[$1], C5633_14506=[$2], C5633_14509=[$4])
> >> ....│.......└── 0:TableScan(table=[T902], Schema=[...6 fields...])
> >> ....└── 7:Project(ID=[$0], C5633_486=[$2])
> >> ........└── 2:TableScan(table=[T895], Schema=[...64 fields...])
> >>
> >> Notice: $1 on the LHS of the node is not used *after* the filter, so a
> >> projection of only the $0 and $2 fields would reduce the width of the
> >> row before the join.
> >>
> >> However, I can force the insertion of a projection which is simply the
> >> identity (i.e., projecting all fields of the input row with no additions or
> >> subtractions):
> >> 5:Project(C5633_14509=[$4], C5633_486=[$8])
> >> └── 4:Join(condition=[=($1, $6)], joinType=[inner])
> >> ....├── 2:Project(...Identity mapping, 6 fields...)
> >> ....│...└── 1:Filter(condition=[<($2, 10)])
> >> ....│.......└── 0:TableScan(table=[T902], Schema=[...6 fields...])
> >> ....└── 3:TableScan(table=[T895], Schema=[...64 fields...])
> >>
> >> And the result is a projection which only has the 2 fields necessary after
> >> the filter:
> >> 11:Project(C5633_14509=[$1], C5633_486=[$3])
> >> └── 10:Join(condition=[=($0, $2)], joinType=[inner])
> >> ....├── 8:Project(C5633_14505=[$0], C5633_14509=[$2]) <- trimmed
> >> ....│...└── 7:Filter(condition=[<($1, 10)])
> >> ....│.......└── 6:Project(C5633_14505=[$1], C5633_14506=[$2], C5633_14509=[$4])
> >> ....│...........└── 0:TableScan(table=[T902], Schema=[...6 fields...])
> >> ....└── 9:Project(ID=[$0], C5633_486=[$2])
> >> ........└── 3:TableScan(table=[T895], Schema=[...64 fields...])
> >>
> >> Thanks!
> >> -Ian
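
For readers following the thread, here is a small toy model of the top-down "which of your fields do I use" propagation, written in Python. It is not Calcite code and uses none of Calcite's API: the Node class, the trim function and the field numbers are all hypothetical, chosen only to mirror the T902 side of the plans quoted above. It assumes, as described in the thread, that a filter passes its input row through unchanged and that only a project can narrow a row.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Set, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a relational operator (not a Calcite class)."""
    kind: str                            # "scan", "filter" or "project"
    width: int                           # number of output fields
    child: Optional["Node"] = None
    cond: FrozenSet[int] = frozenset()   # filter: input fields the condition reads
    exprs: Tuple[int, ...] = ()          # project: input field behind each output


def trim(node: Node, used: Set[int]) -> Tuple[Node, Dict[int, int]]:
    """Return the trimmed node and a map from old to new output positions."""
    if node.kind == "scan":
        keep = sorted(used)
        if len(keep) == node.width:
            return node, {i: i for i in keep}
        # The toy scan cannot narrow itself, so a project is placed over it.
        proj = Node("project", len(keep), node, exprs=tuple(keep))
        return proj, {old: new for new, old in enumerate(keep)}
    if node.kind == "filter":
        # A filter asks its input for the parent's fields plus the fields its
        # own condition reads, and then emits all of them unchanged.
        child, mapping = trim(node.child, set(used) | set(node.cond))
        cond = frozenset(mapping[i] for i in node.cond)
        return Node("filter", child.width, child, cond=cond), mapping
    if node.kind == "project":
        # A project keeps only the outputs its parent uses.
        keep = sorted(used)
        child, mapping = trim(node.child, {node.exprs[i] for i in keep})
        exprs = tuple(mapping[node.exprs[i]] for i in keep)
        out = {old: new for new, old in enumerate(keep)}
        return Node("project", len(keep), child, exprs=exprs), out
    raise ValueError(f"unknown node kind: {node.kind}")


# T902 side of the join: 6 fields, filter on $2, join consumes $1 and $4.
scan = Node("scan", 6)
filt = Node("filter", 6, scan, cond=frozenset({2}))

# Without a project over the filter, the filter column stays in the row.
plain, _ = trim(filt, {1, 4})

# With an identity project over the filter, the extra column is dropped.
identity = Node("project", 6, filt, exprs=tuple(range(6)))
wrapped, _ = trim(identity, {1, 4})

print(plain.kind, plain.width)
print(wrapped.kind, wrapped.width, wrapped.exprs)
```

In this sketch the filter column survives above the filter only because nothing above it is able to drop it; wrapping the filter in an identity project gives the trimmer a node that can.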