Copilot commented on code in PR #2109:
URL: https://github.com/apache/sedona/pull/2109#discussion_r2212399742
##########
python/tests/geopandas/test_match_geopandas_series.py:
##########
@@ -619,7 +619,15 @@ def test_unary_union(self):
pass
def test_union_all(self):
- pass
+ if parse_version(gpd.__version__) < parse_version("1.1.0"):
+ return
Review Comment:
Use pytest.skip with a reason here instead of an early return so that the
test runner records this as an explicit skip.
```suggestion
pytest.skip("Test requires geopandas version 1.1.0 or higher.")
```
##########
python/tests/geopandas/test_match_geopandas_series.py:
##########
@@ -619,7 +619,15 @@ def test_unary_union(self):
pass
def test_union_all(self):
- pass
+ if parse_version(gpd.__version__) < parse_version("1.1.0"):
+ return
+
+ # Union all the valid geometries
+ # Neither our or geopandas' implementation supports invalid geometries
Review Comment:
[nitpick] Grammar: change 'Neither our or geopandas'' to 'Neither our nor
geopandas'' to correctly pair 'neither/ nor'.
```suggestion
# Neither our nor geopandas' implementation supports invalid
geometries
```
##########
python/sedona/geopandas/geoseries.py:
##########
@@ -499,14 +508,23 @@ def _query_geometry_column(
query = f"{query} as `{rename}`"
- # We always select NATURAL_ORDER_COLUMN_NAME, to avoid having to
regenerate it in the result
- # We always select SPARK_DEFAULT_INDEX_NAME, to retain series index
info
- sdf = df.selectExpr(query, SPARK_DEFAULT_INDEX_NAME,
NATURAL_ORDER_COLUMN_NAME)
+ exprs = [query]
+
+ index_spark_columns = []
+ if not is_aggr:
+ # We always select NATURAL_ORDER_COLUMN_NAME, to avoid having to
regenerate it in the result
+ # We always select SPARK_DEFAULT_INDEX_NAME, to retain series
index info
+ exprs.append(SPARK_DEFAULT_INDEX_NAME)
+ exprs.append(NATURAL_ORDER_COLUMN_NAME)
+ index_spark_columns = [scol_for(df, SPARK_DEFAULT_INDEX_NAME)]
+ # else if is_aggr, we don't select the index columns
+
+ sdf = df.selectExpr(*exprs)
internal = self._internal.copy(
spark_frame=sdf,
index_fields=[self._internal.index_fields[0]],
Review Comment:
When `is_aggr=True`, `index_fields` is still set while `index_spark_columns`
is cleared, causing a mismatch; consider clearing `index_fields` for
aggregation queries as well.
```suggestion
index_fields=[] if is_aggr else [self._internal.index_fields[0]],
```
##########
python/sedona/geopandas/geoseries.py:
##########
@@ -1319,9 +1337,51 @@ def unary_union(self):
# Implementation of the abstract method
raise NotImplementedError("This method is not implemented yet.")
- def union_all(self, method="unary", grid_size=None):
- # Implementation of the abstract method
- raise NotImplementedError("This method is not implemented yet.")
+ def union_all(self, method="unary", grid_size=None) -> BaseGeometry:
Review Comment:
Add an explicit check for an empty `GeoSeries` (e.g., `if len(self) == 0`)
to avoid indexing errors when calling `take([0])` on an empty result.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]