Omega359 commented on issue #14563:
URL: https://github.com/apache/datafusion/issues/14563#issuecomment-2767224780
This should now be resolved with the changes from
https://github.com/apache/datafusion/pull/14653
--
This is an automated message from the Apache Git Service.
To respond to t
Omega359 closed issue #14563: Perf: Dataframe with_column and
with_column_renamed are slow
URL: https://github.com/apache/datafusion/issues/14563
Omega359 commented on issue #14563:
URL: https://github.com/apache/datafusion/issues/14563#issuecomment-2654798363
I'll be honest - I'm pretty out of my element with these changes. I don't
know what is 'correct behaviour' and what isn't here. My thinking for the
changes in my current branch
blaginin commented on issue #14563:
URL: https://github.com/apache/datafusion/issues/14563#issuecomment-2654717456
I really like that idea, Bruce! I tried to break your branch, but everything
seems to work 🙂 I think the issue was that on every rename, we tried to
recursively normalize _ever
Omega359 commented on issue #14563:
URL: https://github.com/apache/datafusion/issues/14563#issuecomment-2652215840
Interesting. I tried a somewhat different approach -
https://github.com/apache/datafusion/compare/main...Omega359:arrow-datafusion:with_column_updates
It is much much fas
blaginin commented on issue #14563:
URL: https://github.com/apache/datafusion/issues/14563#issuecomment-2652180854
Okay, so I think the issue is that with every `.with_column_renamed` /
`.with_column` we add a new projection - that creates a lot of layers and each
time adding a new one is m
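
To illustrate the layering described in the comment above, here is a minimal toy model. It is only a sketch: `Plan`, `depth`, and `rename_all` are hypothetical names invented for this example and are not DataFusion's actual `LogicalPlan` types or API. It shows only that wrapping the plan in a fresh projection on every rename makes the nesting depth grow by one per call.

```rust
// Toy logical plan. Hypothetical types for illustration only;
// this is NOT DataFusion's `LogicalPlan`.
#[derive(Debug, Clone)]
enum Plan {
    Scan { columns: Vec<String> },
    Projection { input: Box<Plan>, columns: Vec<String> },
}

impl Plan {
    // Output column names of this node.
    fn columns(&self) -> &[String] {
        match self {
            Plan::Scan { columns } | Plan::Projection { columns, .. } => columns,
        }
    }

    // Number of projection layers stacked on top of the scan.
    fn depth(&self) -> usize {
        match self {
            Plan::Scan { .. } => 0,
            Plan::Projection { input, .. } => 1 + input.depth(),
        }
    }

    // Naive rename: always wraps the current plan in a new projection.
    fn with_column_renamed(self, old: &str, new: &str) -> Plan {
        let columns: Vec<String> = self
            .columns()
            .iter()
            .map(|c| if c == old { new.to_string() } else { c.clone() })
            .collect();
        Plan::Projection { input: Box::new(self), columns }
    }
}

// Rename every column c0..c{n-1} to r0..r{n-1}, one call per column.
fn rename_all(n: usize) -> Plan {
    let columns = (0..n).map(|i| format!("c{i}")).collect();
    let mut plan = Plan::Scan { columns };
    for i in 0..n {
        plan = plan.with_column_renamed(&format!("c{i}"), &format!("r{i}"));
    }
    plan
}

fn main() {
    let plan = rename_all(100);
    // One projection layer per rename call.
    println!("columns: {}, projection layers: {}", plan.columns().len(), plan.depth());
}
```

In this toy model, any pass that walks the whole plan after each call visits one more layer every time, so the total work over n calls grows roughly quadratically in the number of layers. Whether that matches the real cost profile in DataFusion is an assumption here, not something the thread snippet confirms.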
blaginin commented on issue #14563:
URL: https://github.com/apache/datafusion/issues/14563#issuecomment-2652057779
Stacktrace may also help
https://github.com/user-attachments/assets/83ea287f-5312-4624-bc70-3824fb55c203
blaginin commented on issue #14563:
URL: https://github.com/apache/datafusion/issues/14563#issuecomment-2652036656
A lot of `TreeNodeRecursion::visit_sibling`... may be related to
https://github.com/apache/datafusion/issues/13748 ?

for the benchmark
Omega359 opened a new issue, #14563:
URL: https://github.com/apache/datafusion/issues/14563
### Describe the bug
DataFrame functions `.with_column` and `.with_column_renamed` (and possibly
others) are slow. One can really see this in DataFrames with many columns
where a .with_c