Copilot commented on code in PR #37731:
URL: https://github.com/apache/superset/pull/37731#discussion_r2775461460
##########
superset/utils/pandas_postprocessing/histogram.py:
##########
@@ -48,6 +48,9 @@ def histogram(
if groupby is None:
groupby = []
+ # Create an explicit copy to avoid SettingWithCopyWarning
+ df = df.copy()
+
Review Comment:
Copying the full input DataFrame before `dropna` can be unnecessarily
expensive for large query results (it duplicates all columns/rows, even though
the next line discards rows and the function only needs `column` + `groupby`).
Consider moving the copy after `dropna` and/or copying only the required
columns to reduce memory/CPU while still avoiding `SettingWithCopyWarning`.
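
A minimal sketch of that suggestion, not the actual patch. It assumes the function only reads `column` and `groupby`, as described above; `prepare_histogram_input` is a hypothetical helper name and is not part of Superset.

```python
import pandas as pd


def prepare_histogram_input(df, column, groupby=None):
    """Hypothetical helper: narrow the frame first, then copy what is left."""
    groupby = list(groupby or [])
    # Keep only the columns the histogram needs, so unused columns are never copied.
    needed = df[groupby + [column]]
    # Drop rows with a missing value in the target column before copying.
    needed = needed.dropna(subset=[column])
    # Explicit copy as the last step: later assignments touch an independent
    # frame, so the caller's data is not modified and no chained-assignment
    # warning is triggered by writes to the result.
    return needed.copy()
```

The copy is deliberately the final step, so it only duplicates the rows and columns that survive the filtering.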
##########
superset/utils/pandas_postprocessing/histogram.py:
##########
@@ -48,6 +48,9 @@ def histogram(
if groupby is None:
groupby = []
+ # Create an explicit copy to avoid SettingWithCopyWarning
+ df = df.copy()
+
Review Comment:
There are unit tests for `histogram`, but none that exercise the specific
regression this PR fixes (a `SettingWithCopyWarning` when the input `df` is a
slice/view). Adding a test that passes a sliced DataFrame and asserts no
`SettingWithCopyWarning` is emitted would prevent this from resurfacing.
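
A rough sketch of such a regression test. `histogram_like` is a hypothetical stand-in for the real `histogram` function so that the snippet is self-contained; in the actual test suite the real function would be called instead. The warning is matched by class name, on the assumption that its import path differs between pandas versions.

```python
import warnings

import numpy as np
import pandas as pd


def histogram_like(df, column):
    """Hypothetical stand-in: copies the input, then assigns to a column."""
    df = df.copy()
    df[column] = pd.to_numeric(df[column], errors="coerce")
    return df


def setting_with_copy_warnings(func, *args):
    """Run func and return any SettingWithCopyWarning records it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func(*args)
    # Match on the class name to stay independent of the pandas import path.
    return [w for w in caught if w.category.__name__ == "SettingWithCopyWarning"]


def test_histogram_sliced_input_emits_no_warning():
    full = pd.DataFrame({"a": [1.0, 2.0, np.nan, 4.0], "b": list("wxyz")})
    sliced = full[full["b"] != "z"]  # boolean-mask slice of a larger frame
    assert setting_with_copy_warnings(histogram_like, sliced, "a") == []
```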
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]