adriangb commented on PR #19404:
URL: https://github.com/apache/datafusion/pull/19404#issuecomment-3728564354

   > > @discord9 I can't make a PR to your fork as far as far as I can tell, 
but can you check the diff in 
[7ee2bfb](https://github.com/apache/datafusion/commit/7ee2bfb353c30764f848491298f2124421f617c7)
 and see what you think? Sorry some of it is in 
[22981aa](https://github.com/apache/datafusion/commit/22981aa3c2b3d9f789e608e5dcccfd8f2b52cba8)
 as well.
   > 
   > I have some doubt about leave it unchanged, say if in this case:
   > 
   > ```
   > Sort filter=a@0>800
   >   <Some strange Extension node that went unhandled>(say something cause a 
unknown column(a new a@0 in this case) to appear>
   >     Projection some other columns, old a@0 not included(filter should be 
unknown column at this stage)
   >       <do something with old a@0>
   >       Projection b@0 as a@0
   >         TableScan filter= DynamicFilter [ b@0 > 800](should be unknown 
column, but left unchange confuse it with existing columns)
   > ```
   > 
   > I guess my point is that replace not found column with `UnknownColumn` 
will allow downstream user to better debug their problem when `DynamicFilter` 
is involved(Since evaluate `UnknownColumn` return errors, and downstream custom 
TableScan could use this to easily determine whether he can use this dynamic 
filter, and even downstream user miss to handle Extension node's gather 
pushdown, it's easier to spot due to `UnknownColumn`), and for datafusion 
itself, in normal case without external extension logical plan, projection 
shouldn't cause unknown columns in filter, so it's ok? But if we left unknown 
columns unchanged, certain query(with extension plan) might confuse those 
unchanged columns with something else and cause hidden bug which is much hard 
to discover or debug, so having `UnknownColumn` is more of a defensive guard 
thing? And it seems better to have a explict way to see if it went wrong does 
no harm and is beneficial? Altough if you insist I can cherry pick 
[7ee2bfb](http
 
s://github.com/apache/datafusion/commit/7ee2bfb353c30764f848491298f2124421f617c7)
 Edit: Maybe a middle ground is just refuse to pushdown filters when encounter 
unknown columns? Edit: done
   
   I just don't think this will happen. If a plan is introducing an 
`UnknownColumn` indeed we can't support projection pushdown.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to