Re: [PR] [GH-2407] Auto-detect raster columns in SedonaUtils.display_image [sedona]

via GitHub Sun, 08 Feb 2026 23:53:39 -0800


Copilot commented on code in PR #2633:
URL: https://github.com/apache/sedona/pull/2633#discussion_r2781122146



##########
docs/tutorial/raster.md:
##########
@@ -472,6 +472,17 @@ The output looks like this:
 
 ![Output](../image/DisplayImage.png)
 
+!!!tip
+    In a Jupyter notebook, use `SedonaUtils.display_image` to render rasters 
directly — no need to call RS_AsImage manually:

Review Comment:
   The admonition syntax appears incorrect for MkDocs-style admonitions: it 
should be `!!! tip` (note the space) and typically uses a lowercase type for 
consistency. As written (`!!!tip`), this may render as plain text instead of a 
tip block. Please update this (and the other newly added `!!!tip` blocks in the 
docs) to the correct admonition syntax.



##########
python/sedona/spark/raster_utils/SedonaUtils.py:
##########
@@ -21,7 +21,39 @@
 class SedonaUtils:
     @classmethod
     def display_image(cls, df):
+        """Display raster images in a Jupyter notebook.
+
+        Accepts DataFrames with either:
+        - A raster column (GridCoverage2D) — auto-applies RS_AsImage
+        - An HTML image column from RS_AsImage() — renders directly
+
+        Falls back to the SedonaMapUtils HTML table path for other DataFrames.
+        """
         from IPython.display import HTML, display
 
+        schema = df.schema
+
+        # Detect raster UDT columns and auto-apply RS_AsImage.
+        # Without this, passing a raw raster DataFrame to the fallback path
+        # causes __convert_to_gdf_or_pdf__ to Arrow-serialize the full raster
+        # grid, which hangs for large rasters (e.g., 1400x800).
+        raster_cols = [
+            f.name
+            for f in schema.fields
+            if hasattr(f.dataType, "typeName") and f.dataType.typeName() == "rastertype"
+        ]
+        if raster_cols:
+            # Replace each raster column with its RS_AsImage() HTML representation,
+            # preserving all other columns in the DataFrame.
+            select_exprs = [
+                (
+                    f"RS_AsImage(`{f.name}`) as `{f.name}`"
+                    if f.name in raster_cols
+                    else f"`{f.name}`"
+                )
+                for f in schema.fields
+            ]
+            df = df.selectExpr(*select_exprs)

Review Comment:
   The new raster-column auto-detection and projection path isn't covered by 
automated tests. The existing `test_display_image` still passes, which suggests 
it exercises only the path where the DataFrame already holds `RS_AsImage` HTML 
output. Please add a test that passes a DataFrame with a raster-typed column to 
`display_image()` and asserts that the raster column is converted via 
`RS_AsImage` and that non-raster columns are preserved.
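
   As a starting point, the detection step can be checked without a Spark 
session. The sketch below is only an illustration: `FakeType` and 
`find_raster_columns` are hypothetical names that mirror the list comprehension 
in the diff, and it does not exercise `RS_AsImage` itself, so an integration 
test against a real raster DataFrame would still be needed.

```python
from types import SimpleNamespace


class FakeType:
    """Stand-in for a Spark DataType; only typeName() is modelled (assumption)."""

    def __init__(self, name):
        self._name = name

    def typeName(self):
        return self._name


def find_raster_columns(fields):
    """Hypothetical helper mirroring the detection logic in the diff:
    return the names of fields whose type name is 'rastertype'."""
    return [
        f.name
        for f in fields
        if hasattr(f.dataType, "typeName") and f.dataType.typeName() == "rastertype"
    ]


def test_find_raster_columns():
    fields = [
        SimpleNamespace(name="id", dataType=FakeType("integer")),
        SimpleNamespace(name="rast", dataType=FakeType("rastertype")),
        # A type object without typeName() must be skipped, not raise.
        SimpleNamespace(name="opaque", dataType=object()),
    ]
    assert find_raster_columns(fields) == ["rast"]
```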



##########
python/sedona/spark/raster_utils/SedonaUtils.py:
##########
@@ -21,7 +21,39 @@
 class SedonaUtils:
     @classmethod
     def display_image(cls, df):
+        """Display raster images in a Jupyter notebook.
+
+        Accepts DataFrames with either:
+        - A raster column (GridCoverage2D) — auto-applies RS_AsImage
+        - An HTML image column from RS_AsImage() — renders directly
+
+        Falls back to the SedonaMapUtils HTML table path for other DataFrames.
+        """
         from IPython.display import HTML, display
 
+        schema = df.schema
+
+        # Detect raster UDT columns and auto-apply RS_AsImage.
+        # Without this, passing a raw raster DataFrame to the fallback path
+        # causes __convert_to_gdf_or_pdf__ to Arrow-serialize the full raster
+        # grid, which hangs for large rasters (e.g., 1400x800).
+        raster_cols = [
+            f.name
+            for f in schema.fields
+            if hasattr(f.dataType, "typeName") and f.dataType.typeName() == "rastertype"
+        ]
+        if raster_cols:
+            # Replace each raster column with its RS_AsImage() HTML representation,
+            # preserving all other columns in the DataFrame.
+            select_exprs = [
+                (
+                    f"RS_AsImage(`{f.name}`) as `{f.name}`"
+                    if f.name in raster_cols
+                    else f"`{f.name}`"
+                )
+                for f in schema.fields
+            ]

Review Comment:
   Column names are interpolated into SQL strings for `selectExpr` and only 
wrapped in backticks. If a column name itself contains a backtick, Spark SQL 
parsing can break and (depending on context) this can become an injection 
vector. Please escape backticks in `f.name` (Spark uses doubled backticks) 
before embedding them in the expression strings.
   ```suggestion
               select_exprs = []
               for f in schema.fields:
                   col_name_escaped = f.name.replace("`", "``")
                   if f.name in raster_cols:
                       select_exprs.append(
                           f"RS_AsImage(`{col_name_escaped}`) as `{col_name_escaped}`"
                       )
                   else:
                       select_exprs.append(f"`{col_name_escaped}`")
   ```
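
   The same escaping can be factored into small pure functions, which makes it 
easy to unit test without Spark. This is a sketch only: `quote_identifier` and 
`build_select_exprs` are hypothetical names, not existing Sedona or PySpark 
APIs, and the resulting strings are assumed to be passed to `df.selectExpr` as 
in the diff.

```python
def quote_identifier(name):
    """Wrap a column name in backticks, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_select_exprs(field_names, raster_cols):
    """Hypothetical helper: build one selectExpr string per column,
    wrapping raster columns in RS_AsImage and passing others through."""
    raster_set = set(raster_cols)
    exprs = []
    for name in field_names:
        quoted = quote_identifier(name)
        if name in raster_set:
            exprs.append(f"RS_AsImage({quoted}) as {quoted}")
        else:
            exprs.append(quoted)
    return exprs
```

For example, `build_select_exprs(["id", "rast"], ["rast"])` yields one plain 
quoted column and one `RS_AsImage(...)` expression, in the original column 
order.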



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
